Add a Prometheus [HTTP service discovery](https://prometheus.io/docs/prometheus/latest/http_sd/) endpoint to the Docker image for easy discovery of all workers. Follow-up to https://github.com/element-hq/synapse/pull/19324

This spawned from wanting to [run a load test](https://github.com/element-hq/synapse-rust-apps/pull/397) against the Complement Docker image of Synapse and see metrics from the homeserver.

`GET http://<synapse_container>:9469/metrics/service_discovery`

```json5
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
```

The metrics from each worker can also be accessed via `http://<synapse_container>:9469/metrics/worker/<worker_name>`, which is what the service discovery response points to behind the scenes. This way, you only need to expose a single port (9469) to access all metrics.

<details>
<summary>Real HTTP service discovery response</summary>

```json5
[
  { "targets": [ "localhost:9469" ], "labels": { "job": "event_persister", "index": "1", "__metrics_path__": "/metrics/worker/event_persister1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "event_persister", "index": "2", "__metrics_path__": "/metrics/worker/event_persister2" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "background_worker", "index": "1", "__metrics_path__": "/metrics/worker/background_worker1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "event_creator", "index": "1", "__metrics_path__": "/metrics/worker/event_creator1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "user_dir", "index": "1", "__metrics_path__": "/metrics/worker/user_dir1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "media_repository", "index": "1", "__metrics_path__": "/metrics/worker/media_repository1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "federation_inbound", "index": "1", "__metrics_path__": "/metrics/worker/federation_inbound1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "federation_reader", "index": "1", "__metrics_path__": "/metrics/worker/federation_reader1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "federation_sender", "index": "1", "__metrics_path__": "/metrics/worker/federation_sender1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "synchrotron", "index": "1", "__metrics_path__": "/metrics/worker/synchrotron1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "client_reader", "index": "1", "__metrics_path__": "/metrics/worker/client_reader1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "appservice", "index": "1", "__metrics_path__": "/metrics/worker/appservice1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "pusher", "index": "1", "__metrics_path__": "/metrics/worker/pusher1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "device_lists", "index": "1", "__metrics_path__": "/metrics/worker/device_lists1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "device_lists", "index": "2", "__metrics_path__": "/metrics/worker/device_lists2" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "stream_writers", "index": "1", "__metrics_path__": "/metrics/worker/stream_writers1" } },
  { "targets": [ "localhost:9469" ], "labels": { "job": "main", "index": "1", "__metrics_path__": "/metrics/worker/main" } }
]
```

</details>

And how it ends up as targets in Prometheus (http://localhost:9090/targets):

(image)

### Testing strategy

1. Make sure your firewall allows the Docker containers to communicate with the host (`host.docker.internal`) so they can access exposed ports of other Docker containers. We want to allow Synapse to access the Prometheus container, and Grafana to access the Prometheus container.
    - `sudo ufw allow in on docker0 comment "Allow traffic from the default Docker network to the host machine (host.docker.internal)"`
    - `sudo ufw allow in on br-+ comment "(from Matrix Complement testing) Allow traffic from custom Docker networks to the host machine (host.docker.internal)"`
    - [Complement firewall docs](ee6acd9154/README.md (potential-conflict-with-firewall-software))
1. Build the Docker images for Synapse: `docker build -t matrixdotorg/synapse -f docker/Dockerfile . && docker build -t matrixdotorg/synapse-workers -f docker/Dockerfile-workers .` ([docs](7a24fafbc3/docker/README-testing.md (building-and-running-the-images-manually)))
1. Start Synapse:

   ```
   docker run -d --name synapse \
     --mount type=volume,src=synapse-data,dst=/data \
     -e SYNAPSE_SERVER_NAME=my.docker.synapse.server \
     -e SYNAPSE_REPORT_STATS=no \
     -e SYNAPSE_ENABLE_METRICS=1 \
     -p 8008:8008 \
     -p 9469:9469 \
     matrixdotorg/synapse-workers:latest
   ```

   - Also try with workers:

     ```
     docker run -d --name synapse \
       --mount type=volume,src=synapse-data,dst=/data \
       -e SYNAPSE_SERVER_NAME=my.docker.synapse.server \
       -e SYNAPSE_REPORT_STATS=no \
       -e SYNAPSE_ENABLE_METRICS=1 \
       -e SYNAPSE_WORKER_TYPES="\
         event_persister:2, \
         background_worker, \
         event_creator, \
         user_dir, \
         media_repository, \
         federation_inbound, \
         federation_reader, \
         federation_sender, \
         synchrotron, \
         client_reader, \
         appservice, \
         pusher, \
         device_lists:2, \
         stream_writers=account_data+presence+receipts+to_device+typing" \
       -p 8008:8008 \
       -p 9469:9469 \
       matrixdotorg/synapse-workers:latest
     ```

1. You should be able to see the Prometheus service discovery endpoint at http://localhost:9469/metrics/service_discovery
1.
Create a Prometheus config (`prometheus.yml`):

   ```yaml
   global:
     scrape_interval: 15s
     scrape_timeout: 15s
     evaluation_interval: 15s

   scrape_configs:
     - job_name: synapse
       scrape_interval: 15s
       metrics_path: /_synapse/metrics
       scheme: http
       # We set `honor_labels` so that each service can set their own `job` label
       #
       # > honor_labels controls how Prometheus handles conflicts between labels that are
       # > already present in scraped data and labels that Prometheus would attach
       # > server-side ("job" and "instance" labels, manually configured target
       # > labels, and labels generated by service discovery implementations).
       # >
       # > *-- https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config*
       honor_labels: true
       # Use HTTP service discovery
       #
       # Reference:
       #  - https://prometheus.io/docs/prometheus/latest/http_sd/
       #  - https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config
       http_sd_configs:
         - url: 'http://localhost:9469/metrics/service_discovery'
   ```

1. Start Prometheus (update the volume bind mount to point at the config you just saved):

   ```
   docker run \
     --detach \
     --name=prometheus \
     --add-host host.docker.internal:host-gateway \
     -p 9090:9090 \
     -v ~/Documents/code/random/prometheus-config/prometheus.yml:/etc/prometheus/prometheus.yml \
     prom/prometheus
   ```

1. Make sure you're seeing some data in Prometheus. On http://localhost:9090/query, search for `synapse_build_info`
1. Start [Grafana](https://hub.docker.com/r/grafana/grafana):

   ```
   docker run -d --name=grafana --add-host host.docker.internal:host-gateway -p 3000:3000 grafana/grafana
   ```

1. Visit the Grafana dashboard, http://localhost:3000/ (Credentials: `admin`/`admin`)
1. **Connections** -> **Data Sources** -> **Add data source** -> **Prometheus**
    - Prometheus server URL: `http://host.docker.internal:9090`
1. Import the Synapse dashboard: https://github.com/element-hq/synapse/blob/develop/contrib/grafana/synapse.json
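As a sanity check of the service discovery format, here is a small Python sketch of how a consumer turns the target groups into concrete scrape URLs. The helper name `scrape_urls` and the trimmed sample data are illustrative only (not part of Synapse or Prometheus), but the mechanism shown — the `__metrics_path__` label overriding the scrape config's `metrics_path` — is what makes the single-port setup work:

```python
import json

# Two entries in the shape returned by /metrics/service_discovery
# (trimmed from the real response shown above).
SAMPLE_SD_RESPONSE = json.loads("""
[
  {
    "targets": ["localhost:9469"],
    "labels": {
      "job": "event_persister",
      "index": "1",
      "__metrics_path__": "/metrics/worker/event_persister1"
    }
  },
  {
    "targets": ["localhost:9469"],
    "labels": {
      "job": "main",
      "index": "1",
      "__metrics_path__": "/metrics/worker/main"
    }
  }
]
""")


def scrape_urls(sd_response, default_path="/metrics"):
    """Turn HTTP SD target groups into concrete scrape URLs.

    Simplified version of how Prometheus combines each target with the
    `__metrics_path__` label: when present, it overrides the
    `metrics_path` from the scrape config.
    """
    urls = []
    for group in sd_response:
        path = group["labels"].get("__metrics_path__", default_path)
        for target in group["targets"]:
            urls.append(f"http://{target}{path}")
    return urls


print(scrape_urls(SAMPLE_SD_RESPONSE))
# ['http://localhost:9469/metrics/worker/event_persister1',
#  'http://localhost:9469/metrics/worker/main']
```

Every group points at the same `localhost:9469` target; only the path differs per worker, which is why exposing port 9469 alone is enough.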
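For reference, the `SYNAPSE_WORKER_TYPES` string used in the worker test run above follows a `name`, optional `:count`, and (for stream writers) `name=stream+stream` shape. A simplified, illustrative Python parser of that shape — this is not the image's actual `configure_workers_and_start.py` logic:

```python
def parse_worker_types(spec: str):
    """Parse a SYNAPSE_WORKER_TYPES-style string into (name, count, streams)
    tuples. Illustrative sketch of the syntax only.
    """
    workers = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # `name=stream1+stream2` lists the streams a writer handles.
        streams = []
        if "=" in entry:
            entry, stream_spec = entry.split("=", 1)
            streams = stream_spec.split("+")
        # `name:N` asks for N instances of that worker type.
        count = 1
        if ":" in entry:
            entry, count_str = entry.split(":", 1)
            count = int(count_str)
        workers.append((entry, count, streams))
    return workers


spec = "event_persister:2, background_worker, stream_writers=account_data+presence+typing"
print(parse_worker_types(spec))
# [('event_persister', 2, []), ('background_worker', 1, []),
#  ('stream_writers', 1, ['account_data', 'presence', 'typing'])]
```

Each resulting instance is what shows up as a separate `job`/`index` pair in the service discovery response.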
```dockerfile
# syntax=docker/dockerfile:1

# Dockerfile to build the matrixdotorg/synapse docker images.
#
# Note that it uses features which are only available in BuildKit - see
# https://docs.docker.com/go/buildkit/ for more information.
#
# To build the image, run `docker build` command from the root of the
# synapse repository:
#
#    DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile .
#
# There is an optional PYTHON_VERSION build argument which sets the
# version of python to build against: for example:
#
#    DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile --build-arg PYTHON_VERSION=3.10 .
#

# Irritatingly, there is no blessed guide on how to distribute an application with its
# poetry-managed environment in a docker image. We have opted for
# `poetry export | pip install -r /dev/stdin`, but beware: we have experienced bugs
# in `poetry export` in the past.

ARG DEBIAN_VERSION=trixie
ARG PYTHON_VERSION=3.13
ARG POETRY_VERSION=2.1.1

###
### Stage 0: generate requirements.txt
###
### This stage is platform-agnostic, so we can use the build platform in case of cross-compilation.
###
FROM --platform=$BUILDPLATFORM ghcr.io/astral-sh/uv:python${PYTHON_VERSION}-${DEBIAN_VERSION} AS requirements

WORKDIR /synapse

# Copy just what we need to run `poetry export`...
COPY pyproject.toml poetry.lock /synapse/

# If specified, we won't verify the hashes of dependencies.
# This is only needed if the hashes of dependencies cannot be checked for some
# reason, such as when a git repository is used directly as a dependency.
ARG TEST_ONLY_SKIP_DEP_HASH_VERIFICATION

# If specified, we won't use the Poetry lockfile.
# Instead, we'll just install what a regular `pip install` would from PyPI.
ARG TEST_ONLY_IGNORE_POETRY_LOCKFILE

# This silences a warning as uv isn't able to do hardlinks between its cache
# (mounted as --mount=type=cache) and the target directory.
ENV UV_LINK_MODE=copy

# Export the dependencies, but only if we're actually going to use the Poetry lockfile.
# Otherwise, just create an empty requirements file so that the Dockerfile can
# proceed.
ARG POETRY_VERSION
RUN --mount=type=cache,target=/root/.cache/uv \
    if [ -z "$TEST_ONLY_IGNORE_POETRY_LOCKFILE" ]; then \
      uvx --with poetry-plugin-export==1.9.0 \
        poetry@${POETRY_VERSION} export --extras all -o /synapse/requirements.txt ${TEST_ONLY_SKIP_DEP_HASH_VERIFICATION:+--without-hashes}; \
    else \
      touch /synapse/requirements.txt; \
    fi

###
### Stage 1: builder
###
FROM ghcr.io/astral-sh/uv:python${PYTHON_VERSION}-${DEBIAN_VERSION} AS builder

# This silences a warning as uv isn't able to do hardlinks between its cache
# (mounted as --mount=type=cache) and the target directory.
ENV UV_LINK_MODE=copy

# Install rust and ensure it's in the PATH
ENV RUSTUP_HOME=/rust
ENV CARGO_HOME=/cargo
ENV PATH=/cargo/bin:/rust/bin:$PATH
RUN mkdir /rust /cargo

RUN curl -sSf https://sh.rustup.rs | sh -s -- -y --no-modify-path --default-toolchain stable --profile minimal

# arm64 builds consume a lot of memory if `CARGO_NET_GIT_FETCH_WITH_CLI` is not
# set to true, so we expose it as a build-arg.
ARG CARGO_NET_GIT_FETCH_WITH_CLI=false
ENV CARGO_NET_GIT_FETCH_WITH_CLI=$CARGO_NET_GIT_FETCH_WITH_CLI

# To speed up rebuilds, install all of the dependencies before we copy over
# the whole synapse project, so that this layer in the Docker cache can be
# used while you develop on the source
#
# This is aiming at installing the `[tool.poetry.dependencies]` from pyproject.toml.
COPY --from=requirements /synapse/requirements.txt /synapse/
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --prefix="/install" --no-deps -r /synapse/requirements.txt

# Copy over the rest of the synapse source code.
COPY synapse /synapse/synapse/
COPY rust /synapse/rust/
# ... and what we need to `pip install`.
COPY pyproject.toml README.rst build_rust.py Cargo.toml Cargo.lock /synapse/

# Repeat of earlier build argument declaration, as this is a new build stage.
ARG TEST_ONLY_IGNORE_POETRY_LOCKFILE

# Install the synapse package itself.
# If we have populated requirements.txt, we don't install any dependencies
# as we should already have those from the previous `pip install` step.
RUN \
    --mount=type=cache,target=/root/.cache/uv \
    --mount=type=cache,target=/synapse/target,sharing=locked \
    --mount=type=cache,target=${CARGO_HOME}/registry,sharing=locked \
    if [ -z "$TEST_ONLY_IGNORE_POETRY_LOCKFILE" ]; then \
      uv pip install --prefix="/install" --no-deps /synapse[all]; \
    else \
      uv pip install --prefix="/install" /synapse[all]; \
    fi

###
### Stage 2: runtime dependencies download for ARM64 and AMD64
###
FROM --platform=$BUILDPLATFORM docker.io/library/debian:${DEBIAN_VERSION} AS runtime-deps

# Tell apt to keep downloaded package files, as we're using cache mounts.
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

# Add both target architectures
RUN dpkg --add-architecture arm64
RUN dpkg --add-architecture amd64

# Fetch the runtime dependencies debs for both architectures
# We do that by building a recursive list of packages we need to download with `apt-cache depends`
# and then downloading them with `apt-get download`.
RUN \
    --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update -qq && \
    apt-cache depends --recurse --no-recommends --no-suggests --no-conflicts --no-breaks --no-replaces --no-enhances --no-pre-depends \
      curl \
      gosu \
      libjpeg62-turbo \
      libpq5 \
      libwebp7 \
      xmlsec1 \
      libjemalloc2 \
      | grep '^\w' > /tmp/pkg-list && \
    for arch in arm64 amd64; do \
      mkdir -p /tmp/debs-${arch} && \
      chown _apt:root /tmp/debs-${arch} && \
      cd /tmp/debs-${arch} && \
      apt-get -o APT::Architecture="${arch}" download $(cat /tmp/pkg-list); \
    done

# Extract the debs for each architecture
RUN \
    for arch in arm64 amd64; do \
      mkdir -p /install-${arch}/var/lib/dpkg/status.d/ && \
      for deb in /tmp/debs-${arch}/*.deb; do \
        package_name=$(dpkg-deb -I ${deb} | awk '/^ Package: .*$/ {print $2}'); \
        echo "Extracting: ${package_name}"; \
        dpkg --ctrl-tarfile $deb | tar -Ox ./control > /install-${arch}/var/lib/dpkg/status.d/${package_name}; \
        dpkg --extract $deb /install-${arch}; \
      done; \
    done

###
### Stage 3: runtime
###

FROM docker.io/library/python:${PYTHON_VERSION}-slim-${DEBIAN_VERSION}

ARG TARGETARCH

LABEL org.opencontainers.image.url='https://github.com/element-hq/synapse'
LABEL org.opencontainers.image.documentation='https://element-hq.github.io/synapse/latest/'
LABEL org.opencontainers.image.source='https://github.com/element-hq/synapse.git'
LABEL org.opencontainers.image.licenses='AGPL-3.0-or-later OR LicenseRef-Element-Commercial'

COPY --from=runtime-deps /install-${TARGETARCH}/etc /etc
COPY --from=runtime-deps /install-${TARGETARCH}/usr /usr
COPY --from=runtime-deps /install-${TARGETARCH}/var /var

# Copy the installed python packages from the builder stage.
#
# uv will generate a `.lock` file when installing packages, which we don't want
# to copy to the final image.
COPY --from=builder --exclude=.lock /install /usr/local
COPY ./docker/start.py /start.py
COPY ./docker/conf /conf

# 8008: CS Matrix API port from Synapse
# 8448: SS Matrix API port from Synapse
EXPOSE 8008/tcp 8448/tcp
# 19090: Metrics listener port for the main process (metrics must be enabled with
# SYNAPSE_ENABLE_METRICS=1).
EXPOSE 19090/tcp

ENTRYPOINT ["/start.py"]

HEALTHCHECK --start-period=5s --interval=15s --timeout=5s \
    CMD curl -fSs http://localhost:8008/health || exit 1
```
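The runtime-deps stage builds its download list by piping `apt-cache depends --recurse …` through `grep '^\w'`, which keeps only the unindented package-name lines and drops the indented relation lines (`Depends:`, `PreDepends:`, etc.). A small Python sketch of that filter, using made-up sample output rather than real `apt-cache` output:

```python
import re

# Illustrative snippet of `apt-cache depends --recurse` output:
# package names start at column 0, relation lines are indented.
sample_output = """\
gosu
  Depends: libc6
libc6
  Depends: libgcc-s1
  PreDepends: <awaiting>
libjemalloc2
"""

# Equivalent of `grep '^\w'`: keep lines whose first character is a
# word character, i.e. the package-name lines.
pkg_list = [line for line in sample_output.splitlines() if re.match(r"^\w", line)]
print(pkg_list)
# ['gosu', 'libc6', 'libjemalloc2']
```

The resulting flat list is what `apt-get download` consumes once per architecture in the loop that follows.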