Fix remote images being chosen over the local ones we just built with
Complement in CI (any Docker environment using the `containerd` image
store). This problem means that Complement jobs in CI don't actually
test against the code from the PR (since 2026-02-10).
This PR approaches the problem the same way that @AndrewFerr proposed in
https://github.com/element-hq/synapse/pull/18210. This is better than
the alternative listed below as we can just make our code compatible
with whatever image store is being used.
### Problem
Spawning from
https://github.com/element-hq/synapse/pull/19460#discussion_r2818760635
where we found that our Complement jobs in CI don't actually test
against the code from the PR at the moment.
This is caused by a change in Docker Engine 29.0.0:
> `containerd` image store is now the default for **fresh installs**.
This doesn't apply to daemons configured with `userns-remap` (see
[moby#47377](https://github.com/moby/moby/issues/47377)).
>
> *-- 29.0.0 (2025-11-10),
https://docs.docker.com/engine/release-notes/29/#2900*
And our `ubuntu-latest` GitHub runner (`Current runner version:
'2.331.0'`)
[points](https://github.com/actions/runner-images/blob/ubuntu24/20260209.23/images/ubuntu/Ubuntu2404-Readme.md)
to using Docker client/server `29.1.5` 🎯
This Docker version bump happened on
416418df15
(2026-02-10) (`28.0.4` -> `29.1.5`). Specific PR:
https://github.com/actions/runner-images/pull/13633
---
I found this because I reviewed and remembered
https://github.com/element-hq/synapse/pull/18210 was a thing that
@AndrewFerr ran into. And then running `dockers system prune` also
revealed the problematic `containerd` in CI. Checking the Docker
changelogs, I found the new default culprit and then could trace down
where the GitHub runners made the dependency update.
---------
Co-authored-by: Andrew Ferrazzutti <andrewf@element.io>
Spawning from wanting to [run a load
test](https://github.com/element-hq/synapse-rust-apps/pull/397) against
the Complement Docker image of Synapse and see metrics from the
homeserver.
### Why not just provide your own homeserver config?
Probably possible but it gets tricky when you try to use the workers
variant of the Docker image (`docker/Dockerfile-workers`). The way to
workaround it would probably be to `yq` edit everything in a script and
change `/data/homeserver.yaml` and `/conf/workers/*.yaml` to add the
`metrics` listener. And then modify `/conf/workers/shared.yaml` to add
`enable_metrics: true`. Doesn't spark much joy.
Fixesmatrix-org/complement#330 (or it will, once we remove the old files).
It's not quite a lift-and-shift: I've also taken the opportunity to get rid of the custom CA that we used to use to sign the TLS certs, which has been superceded by the CA exposed by Complement.
This PR adds a Dockerfile and some supporting files to the `docker/` directory. The Dockerfile's intention is to spin up a container with:
* A Synapse main process.
* Any desired worker processes, defined by a `SYNAPSE_WORKERS` environment variable supplied at runtime.
* A redis for worker communication.
* A nginx for routing traffic.
* A supervisord to start all worker processes and monitor them if any go down.
Note that **this is not currently intended to be used in production**. If you'd like to use Synapse workers with Docker, instead make use of the official image, with one worker per container. The purpose of this dockerfile is currently to allow testing Synapse in worker mode with the [Complement](https://github.com/matrix-org/complement/) test suite.
`configure_workers_and_start.py` is where most of the magic happens in this PR. It reads from environment variables (documented in the file) and creates all necessary config files for the processes. It is the entrypoint of the Dockerfile, and thus is run any time the docker container is spun up, recreating all config files in case you want to use a different set of workers. One can specify which workers they'd like to use by setting the `SYNAPSE_WORKERS` environment variable (as a comma-separated list of arbitrary worker names) or by setting it to `*` for all worker processes. We will be using the latter in CI.
Huge thanks to @MatMaul for helping get this all working 🎉 This PR is paired with its equivalent on the Complement side: https://github.com/matrix-org/complement/pull/62.
Note, for the purpose of testing this PR before it's merged: You'll need to (re)build the base Synapse docker image for everything to work (`matrixdotorg/synapse:latest`). Then build the worker-based docker image on top (`matrixdotorg/synapse:workers`).