1
0
Files
synapse/docker/conf-workers
Eric Eastwood a1e9abc7df Add Prometheus HTTP service discovery endpoint for easy discovery of all workers in Docker image (#19336)
Add Prometheus [HTTP service discovery](https://prometheus.io/docs/prometheus/latest/http_sd/)
endpoint for easy discovery of all workers in Docker image.

Follow-up to https://github.com/element-hq/synapse/pull/19324

Spawning from wanting to [run a load
test](https://github.com/element-hq/synapse-rust-apps/pull/397) against
the Complement Docker image of Synapse and see metrics from the
homeserver.


`GET http://<synapse_container>:9469/metrics/service_discovery`
```json5
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
```

The metrics from each worker can also be accessed via
`http://<synapse_container>:9469/metrics/worker/<worker_name>` which is
what the service discovery response points to behind the scenes. This
way, you only need to expose a single port (9469) to access all metrics.

<details>
<summary>Real HTTP service discovery response</summary>

```json5
[
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "event_persister",
            "index": "1",
            "__metrics_path__": "/metrics/worker/event_persister1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "event_persister",
            "index": "2",
            "__metrics_path__": "/metrics/worker/event_persister2"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "background_worker",
            "index": "1",
            "__metrics_path__": "/metrics/worker/background_worker1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "event_creator",
            "index": "1",
            "__metrics_path__": "/metrics/worker/event_creator1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "user_dir",
            "index": "1",
            "__metrics_path__": "/metrics/worker/user_dir1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "media_repository",
            "index": "1",
            "__metrics_path__": "/metrics/worker/media_repository1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "federation_inbound",
            "index": "1",
            "__metrics_path__": "/metrics/worker/federation_inbound1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "federation_reader",
            "index": "1",
            "__metrics_path__": "/metrics/worker/federation_reader1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "federation_sender",
            "index": "1",
            "__metrics_path__": "/metrics/worker/federation_sender1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "synchrotron",
            "index": "1",
            "__metrics_path__": "/metrics/worker/synchrotron1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "client_reader",
            "index": "1",
            "__metrics_path__": "/metrics/worker/client_reader1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "appservice",
            "index": "1",
            "__metrics_path__": "/metrics/worker/appservice1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "pusher",
            "index": "1",
            "__metrics_path__": "/metrics/worker/pusher1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "device_lists",
            "index": "1",
            "__metrics_path__": "/metrics/worker/device_lists1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "device_lists",
            "index": "2",
            "__metrics_path__": "/metrics/worker/device_lists2"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "stream_writers",
            "index": "1",
            "__metrics_path__": "/metrics/worker/stream_writers1"
        }
    },
    {
        "targets": [
            "localhost:9469"
        ],
        "labels": {
            "job": "main",
            "index": "1",
            "__metrics_path__": "/metrics/worker/main"
        }
    }
]
```

</details>


And how it ends up as targets in Prometheus
(http://localhost:9090/targets):

(image)


### Testing strategy

1. Make sure your firewall allows the Docker containers to communicate
to the host (`host.docker.internal`) so they can access exposed ports of
other Docker containers. We want to allow Synapse to access the
Prometheus container and Grafana to access to the Prometheus container.
- `sudo ufw allow in on docker0 comment "Allow traffic from the default
Docker network to the host machine (host.docker.internal)"`
- `sudo ufw allow in on br-+ comment "(from Matrix Complement testing)
Allow traffic from custom Docker networks to the host machine
(host.docker.internal)"`
- [Complement firewall
docs](ee6acd9154/README.md (potential-conflict-with-firewall-software))
1. Build the Docker image for Synapse: `docker build -t
matrixdotorg/synapse -f docker/Dockerfile . && docker build -t
matrixdotorg/synapse-workers -f docker/Dockerfile-workers .`
([docs](7a24fafbc3/docker/README-testing.md (building-and-running-the-images-manually)))
 1. Start Synapse:
     ```
    docker run -d --name synapse \
        --mount type=volume,src=synapse-data,dst=/data \
        -e SYNAPSE_SERVER_NAME=my.docker.synapse.server \
        -e SYNAPSE_REPORT_STATS=no \
        -e SYNAPSE_ENABLE_METRICS=1 \
        -p 8008:8008 \
        -p 9469:9469 \
        matrixdotorg/synapse-workers:latest
    ```
    - Also try with workers:
       ```
      docker run -d --name synapse \
          --mount type=volume,src=synapse-data,dst=/data \
          -e SYNAPSE_SERVER_NAME=my.docker.synapse.server \
          -e SYNAPSE_REPORT_STATS=no \
          -e SYNAPSE_ENABLE_METRICS=1 \
          -e SYNAPSE_WORKER_TYPES="\
              event_persister:2, \
              background_worker, \
              event_creator, \
              user_dir, \
              media_repository, \
              federation_inbound, \
              federation_reader, \
              federation_sender, \
              synchrotron, \
              client_reader, \
              appservice, \
              pusher, \
              device_lists:2, \
stream_writers=account_data+presence+receipts+to_device+typing" \
          -p 8008:8008 \
          -p 9469:9469 \
          matrixdotorg/synapse-workers:latest
      ```
1. You should be able to see Prometheus service discovery endpoint at
http://localhost:9469/metrics/service_discovery
 1. Create a Prometheus config (`prometheus.yml`)
    ```yaml
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
      evaluation_interval: 15s
    
    scrape_configs:
      - job_name: synapse
        scrape_interval: 15s
        metrics_path: /_synapse/metrics
        scheme: http
# We set `honor_labels` so that each service can set their own `job`
label
        #
# > honor_labels controls how Prometheus handles conflicts between
labels that are
# > already present in scraped data and labels that Prometheus would
attach
# > server-side ("job" and "instance" labels, manually configured target
# > labels, and labels generated by service discovery implementations).
        # >
# > *--
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config*
        honor_labels: true
        # Use HTTP service discovery
        #
        # Reference:
        #  - https://prometheus.io/docs/prometheus/latest/http_sd/
# -
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config
        http_sd_configs:
          - url: 'http://localhost:9469/metrics/service_discovery'
    ```
1. Start Prometheus (update the volume bind mount to the config you just
saved somewhere):
    ```
    docker run \
        --detach \
        --name=prometheus \
        --add-host host.docker.internal:host-gateway \
        -p 9090:9090 \
-v
~/Documents/code/random/prometheus-config/prometheus.yml:/etc/prometheus/prometheus.yml
\
        prom/prometheus
    ```
1. Make sure you're seeing some data in Prometheus. On
http://localhost:9090/query, search for `synapse_build_info`
 1. Start [Grafana](https://hub.docker.com/r/grafana/grafana)
    ```
docker run -d --name=grafana --add-host
host.docker.internal:host-gateway -p 3000:3000 grafana/grafana
    ```
1. Visit the Grafana dashboard, http://localhost:3000/ (Credentials:
`admin`/`admin`)
1. **Connections** -> **Data Sources** -> **Add data source** ->
**Prometheus**
     - Prometheus server URL: `http://host.docker.internal:9090`
1. Import the Synapse dashboard:
https://github.com/element-hq/synapse/blob/develop/contrib/grafana/synapse.json
2026-01-14 18:02:55 -06:00
..