
Towards polylithic synapse

More performance and better pings on my Matrix server

Synapse is a Matrix homeserver implementation written in Python. Here I am showing you my setup and configuration using workers. Workers are a feature of synapse that allows you to distribute load across multiple processes and cores on your machine, which usually results in much better performance.

Prerequisites

A worker-enabled installation is a bit more complex than a normal, monolithic one. It is a good idea to have a stable and fully configured monolithic installation before you start using workers. See the general installation instructions for help with that. A docker-compose.yaml can be found here: https://github.com/matrix-org/synapse/tree/develop/contrib/docker

Deploy synapse with workers

See the Synapse worker documentation for more detailed information.

Components

Redis

Message broker for the communication between the main synapse process and the workers.

Synapse

The main synapse service, which handles all endpoints that are not routed to a worker.

Worker 1-n

Each worker is responsible for a set of endpoints. The number of workers per task type is variable. I found that a decent number of federation senders can improve ping performance a bit, but this needs some further research.

Worker 1
  • Type: generic_worker
  • All incoming client-requests
  • The endpoints can also be distributed on more than one worker
Worker 2
  • Type: generic_worker
  • All incoming federation-requests
  • The endpoints can also be distributed on more than one worker
Worker 6, 7, 8, 9, 10, 11
  • Type: federation_sender
  • All outgoing federation-traffic
  • No need for endpoints, because federation senders do not receive any incoming traffic

These components can be glued together using a docker-compose.yaml. Make sure to always use the same version of synapse on all workers. I use a Docker environment variable named TAG, which can be defined in one central place (the .env file).
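
For reference, a minimal .env file next to the docker-compose.yaml could look like this (the version tag below is just a placeholder; use the synapse release you actually run):

TAG=v1.70.1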

version: "3.4"
services:
  synapse:
    restart: always
    hostname: synapse
    image: matrixdotorg/synapse:${TAG}
    expose:
      - 8008
    ports:
      - 8008
    volumes:
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    environment:
      SYNAPSE_SERVER_NAME: "keks.club"
      SYNAPSE_REPORT_STATS: "yes"
    depends_on:
      - db
      - lighttpd
      - redis
    labels:
      - "traefik.enable=true"
      - >
        traefik.http.routers.synapse.rule=Host(`matrix.keks.club`) && (
        PathPrefix(`/_matrix/`) ||
        PathPrefix(`/_synapse/`) ||
        PathPrefix(`/health/`)
        )
      - "traefik.http.routers.synapse.entrypoints=websecure"
      - "traefik.http.routers.synapse.tls.certresolver=myresolver"
    networks:
      - matrix
      - traefik

  worker1:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker1-client
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker1-client.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.generic_worker
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix
      - traefik
    labels:
      - "traefik.enable=true"
      - >
        traefik.http.routers.worker1.rule=Host(`matrix.keks.club`) && (
        PathPrefix(`/_matrix/client/{p1:r0|v3}/sync`)||
        PathPrefix(`/_matrix/client/{p2:v1|r0|v3}/events`) ||
        PathPrefix(`/_matrix/client/{p3:v1|r0|v3}/initialSync`) ||
        PathPrefix(`/_matrix/client/{p4:v1|r0|v3}/rooms/{p5:[^/]+}/initialSync`)||
        PathPrefix(`/_matrix/client/{p6:v1|r0|v3|unstable}/createRoom`)||
        PathPrefix(`/_matrix/client/{p7:v1|r0|v3|unstable}/publicRooms`) ||
        PathPrefix(`/_matrix/client/{p8:v1|r0|v3|unstable}/rooms/{p9:.*}/joined_members`) ||
        PathPrefix(`/_matrix/client/{p9:v1|r0|v3|unstable}/rooms/{p10:.*}/context/{p11:.*}`) ||
        PathPrefix(`/_matrix/client/{p12:v1|r0|v3|unstable}/rooms/{p13:.*}/members`) ||
        PathPrefix(`/_matrix/client/{p14:v1|r0|v3|unstable}/rooms/{p15:.*}/state`) ||
        PathPrefix(`/_matrix/client/v1/rooms/{p16:.*}/hierarchy`) ||
        PathPrefix(`/_matrix/client/unstable/org.matrix.msc2716/rooms/{p17:.*}/batch_send`) ||
        PathPrefix(`/_matrix/client/unstable/im.nheko.summary/rooms/{p18:.*}/summary`) ||
        PathPrefix(`/_matrix/client/{p19:r0|v3|unstable}/devices`) ||
        PathPrefix(`/_matrix/client/versions`) ||
        PathPrefix(`/_matrix/client/{p20:v1|r0|v3|unstable}/voip/turnServer`) ||
        PathPrefix(`/_matrix/client/{p21:v1|r0|v3|unstable}/rooms/{p22:.*}/event/`) ||
        PathPrefix(`/_matrix/client/{p23:v1|r0|v3|unstable}/joined_rooms`) ||
        PathPrefix(`/_matrix/client/{p24:v1|r0|v3|unstable}/search`)||
        PathPrefix(`/_matrix/client/{p25:r0|v3|unstable}/keys/query`)||
        PathPrefix(`/_matrix/client/{p26:r0|v3|unstable}/keys/changes`)||
        PathPrefix(`/_matrix/client/{p27:r0|v3|unstable}/keys/claim`)||
        PathPrefix(`/_matrix/client/{p28:r0|v3|unstable}/room_keys/`)||
        PathPrefix(`/_matrix/client/{p29:v1|r0|v3|unstable}/login`)||
        PathPrefix(`/_matrix/client/{p30:r0|v3|unstable}/register`)||
        PathPrefix(`/_matrix/client/v1/register/m.login.registration_token/validity`)||
        PathPrefix(`/_matrix/client/{p31:r0|v3|unstable}/rooms/{p32:.*}/receipt`)||
        PathPrefix(`/_matrix/client/{p33:r0|v3|unstable}/rooms/{p34:.*}/read_markers`)||
        PathPrefix(`/_matrix/client/{p35:v1|r0|v3|unstable}/presence/`)||
        PathPrefix(`/_matrix/client/{p36:r0|v3|unstable}/user_directory/search`)||
        PathPrefix(`/_matrix/client/{p37:v1|r0|v3|unstable}/rooms/{p38:.*}/redact`)||
        PathPrefix(`/_matrix/client/{p39:v1|r0|v3|unstable}/rooms/{p40:.*}/send`)||
        PathPrefix(`/_matrix/client/{p41:v1|r0|v3|unstable}/rooms/{p42:.*}/state/`)||
        PathPrefix(`/_matrix/client/{p43:v1|r0|v3|unstable}/rooms/{p44:.*}/{p45:join|invite|leave|ban|unban|kick}`)||
        PathPrefix(`/_matrix/client/{p46:v1|r0|v3|unstable}/join/`)||
        PathPrefix(`/_matrix/client/{p47:v1|r0|v3|unstable}/profile/`) ||
        PathPrefix(`/_matrix/client/{p48:v1|r0|v3|unstable}/rooms/{p49:.*}/typing/`)
        )
      - "traefik.http.routers.worker1.entrypoints=websecure"
      - "traefik.http.routers.worker1.tls.certresolver=myresolver"
      - "traefik.http.services.worker1.loadbalancer.server.port=8008"

  worker2:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker2-federation
    restart: "unless-stopped"
    command:
      - 'run'  
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker2-federation.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.generic_worker
      - TZ=Europe/Berlin
    depends_on: 
      - synapse
    volumes:
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix
      - traefik
    labels:
      - "traefik.enable=true"
      - >
        traefik.http.routers.worker2.rule=Host(`matrix.keks.club`) && (
        PathPrefix(`/_matrix/federation/v1/event/`)||
        PathPrefix(`/_matrix/federation/v1/state/`)||
        PathPrefix(`/_matrix/federation/v1/state_ids/`)||
        PathPrefix(`/_matrix/federation/v1/backfill/`)||
        PathPrefix(`/_matrix/federation/v1/get_missing_events/`)||
        PathPrefix(`/_matrix/federation/v1/publicRooms`)||
        PathPrefix(`/_matrix/federation/v1/query/`)||
        PathPrefix(`/_matrix/federation/v1/make_join/`)||
        PathPrefix(`/_matrix/federation/v1/make_leave/`)||
        PathPrefix(`/_matrix/federation/{p50:v1|v2}/send_join/`)||
        PathPrefix(`/_matrix/federation/{p51:v1|v2}/send_leave/`)||
        PathPrefix(`/_matrix/federation/{p52:v1|v2}/invite/`)||
        PathPrefix(`/_matrix/federation/v1/event_auth/`)||
        PathPrefix(`/_matrix/federation/v1/exchange_third_party_invite/`)||
        PathPrefix(`/_matrix/federation/v1/user/devices/`)||
        PathPrefix(`/_matrix/key/v2/query`)||
        PathPrefix(`/_matrix/federation/v1/hierarchy/`) ||
        PathPrefix(`/_matrix/federation/v1/send/`)
        )
      - "traefik.http.routers.worker2.entrypoints=websecure"
      - "traefik.http.routers.worker2.tls.certresolver=myresolver"
      - "traefik.http.services.worker2.loadbalancer.server.port=8008"

  worker6:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker6
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker6.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.federation_sender
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix

  worker7:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker7
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker7.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.federation_sender
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix

  worker8:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker8
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker8.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.federation_sender
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix

  worker9:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker9
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker9.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.federation_sender
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix

  worker10:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker10
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker10.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.federation_sender
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix

  worker11:
    image: matrixdotorg/synapse:${TAG}
    container_name: worker11
    restart: "unless-stopped"
    command:
      - 'run'
      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker11.yaml'
    environment:
      - SYNAPSE_REPORT_STATS=yes
      - SYNAPSE_SERVER_NAME=keks.club
      - SYNAPSE_WORKER=synapse.app.federation_sender
      - TZ=Europe/Berlin
    depends_on:
      - synapse
    volumes:   
      - /data/synapse/data:/data
      - /data/synapse/media_store:/media_store
    networks:
      - matrix
      
  db:
    image: postgres:11.13
    restart: always
    expose:
      - 5432
    volumes:
      - /data/synapse-postgres:/var/lib/postgresql/data
    networks:
      - matrix
    environment:
      POSTGRES_PASSWORD: "SECURE_PASSWORD"
      POSTGRES_USER: "synapse_user"
      POSTGRES_DB: "synapse"
      POSTGRES_INITDB_ARGS: "--encoding='UTF8' --lc-collate='C' --lc-ctype='C'"

  lighttpd:
    image: sebp/lighttpd
    restart: always
    volumes:
      - /data/synapse-lighttpd/content:/var/www/localhost/htdocs
      - /data/synapse-lighttpd/config:/etc/lighttpd
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.matrix-well-known.rule=Host(`keks.club`)&&(Path(`/.well-known/matrix/server`)||Path(`/.well-known/matrix/client`))"
      - "traefik.http.routers.matrix-well-known.entrypoints=websecure"
      - "traefik.http.routers.matrix-well-known.tls.certresolver=myresolver"
      - "traefik.http.routers.matrix-well-known.tls=true"
    networks:
      - traefik

  redis:
    image: "redis:latest"
    restart: "unless-stopped"
    networks:
      - matrix


networks:
  traefik:
    external: true
  matrix:
    external: true
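
A quick note on the lighttpd container: it only exists to serve the two .well-known delegation files for keks.club. The exact directory layout under /data/synapse-lighttpd/content depends on how you configured lighttpd, but assuming the standard Matrix delegation format, the contents of the two files could look like this:

.well-known/matrix/server:

{ "m.server": "matrix.keks.club:443" }

.well-known/matrix/client:

{ "m.homeserver": { "base_url": "https://matrix.keks.club" } }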

Configuration

Having all those components defined in a docker-compose.yaml file is only part of the solution. Each of them still needs to be configured. You may have noticed the two configuration paths that are passed to each worker service in the docker-compose.yaml file:

      - '--config-path=/data/homeserver.yaml'
      - '--config-path=/data/workers/worker1-client.yaml'

Each worker has a shared and an individual configuration.

Shared configuration

The shared configuration is the homeserver.yaml of the original synapse service. Some adjustments are required here for the polylithic setup.

Add an HTTP replication listener:

listeners:
  ...
  # The HTTP replication port
  - port: 9093
    tls: false
    type: http
    resources:
      - names: [replication]

Add a worker section:

## Workers ##

# Controls sending of outbound federation transactions on the main process.
# Set to false if using a federation sender worker.
send_federation: false

# It is possible to run multiple federation sender workers,
# in which case the work is balanced across them.
# Use this setting to list the senders.
federation_sender_instances:
  - worker6
  - worker7
  - worker8
  - worker9
  - worker10
  - worker11

# When using workers this should be a map from worker name
# to the HTTP replication listener of the worker, if configured.
instance_map:
  # Sync requests
  # Client API requests
  # Encryption requests
  # Registration/login requests
  # Receipts requests
  # Presence requests
  # User directory search requests
  # Event sending requests
  worker1-client:
    host: worker1-client
    port: 8034

  # Federation requests
  worker2-federation:
    host: worker2-federation
    port: 8034

  # Federation senders
  worker6:
    host: worker6
    port: 8034
  worker7:
    host: worker7
    port: 8034
  worker8:
    host: worker8
    port: 8034
  worker9:
    host: worker9
    port: 8034
  worker10:
    host: worker10
    port: 8034
  worker11:
    host: worker11
    port: 8034

# Experimental:
# When using workers you can define which workers should handle event
# persistence and typing notifications. Any worker specified here must
# also be in the instance_map.
stream_writers:
  events: worker1-client
  typing: worker1-client
  receipts: worker1-client
  presence: worker1-client

redis:
  enabled: true
  host: redis
  port: 6379

Worker-specific configuration

worker1-client.yaml example (incoming client-requests)

worker_app: synapse.app.generic_worker
worker_name: worker1-client
worker_replication_host: synapse # Container name of synapse service
worker_replication_http_port: 9093
worker_listeners:
  - type: http
    port: 8008
    resources:
      - names: [client]
  - type: http
    port: 8034
    resources:
      - names: [replication]
worker_log_config: /data/keks.club.log.config

worker2-federation.yaml example (incoming federation-requests)

worker_app: synapse.app.generic_worker
worker_name: worker2-federation
worker_replication_host: synapse
worker_replication_http_port: 9093
worker_listeners:
  - type: http
    port: 8008
    resources:
      - names: [federation]
  - type: http
    port: 8034
    resources:
      - names: [replication]
worker_log_config: /data/keks.club.log.config

worker6.yaml example (outgoing federation-traffic)

worker_app: synapse.app.federation_sender
worker_name: worker6
worker_replication_host: synapse
worker_replication_http_port: 9093
worker_listeners:
  - type: http
    port: 8034
    resources:
      - names: [replication]

worker_log_config: /data/keks.club.log.config
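
All workers reference the same worker_log_config file. This is a standard Python logging configuration in YAML; my actual file is not shown here, but a minimal sketch that simply logs to the console could look like this:

version: 1
formatters:
  precise:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    formatter: precise
root:
  level: INFO
  handlers: [console]

With the compose file and all worker configurations in place, the stack can be pulled and started as usual, and the routing can be checked with a request to an endpoint that is handled by worker1:

docker-compose pull
docker-compose up -d
docker-compose logs -f worker1
curl https://matrix.keks.club/_matrix/client/versions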

The end

I'd be glad if this article helped you to set up your very own polylithic synapse. If you have questions or if I missed something important here, please reach out to me via @valentin:keks.club