Merge 'release-v1.136' into 'master'

2025-08-12 15:36:51 +01:00
parent 4054d956f7
commit 283ade8e33
402 changed files with 8961 additions and 2567 deletions
@@ -16,6 +16,10 @@ jobs:
        with:
          project-url: "https://github.com/orgs/matrix-org/projects/67"
          github-token: ${{ secrets.ELEMENT_BOT_TOKEN }}
+        # This action will error if the issue already exists on the project, which is
+        # common as `X-Needs-Info` will often be added to issues that are already in
+        # the triage queue. Prevent the whole job from failing in this case.
+        continue-on-error: true
      - name: Set status
        env:
          GITHUB_TOKEN: ${{ secrets.ELEMENT_BOT_TOKEN }}
@@ -1,3 +1,12 @@
+# Synapse 1.136.0 (2025-08-12)
+
+Note: This release includes the security fixes from `1.135.2` and `1.136.0rc2`, detailed below.
+
+### Bugfixes
+
+- Fix bug introduced in 1.135.2 and 1.136.0rc2 where the [Make Room Admin API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#make-room-admin-api) would not treat a room v12's creator power level as the highest in the room. ([\#18805](https://github.com/element-hq/synapse/issues/18805))
+
+
 # Synapse 1.135.2 (2025-08-11)

 This is the Synapse portion of the [Matrix coordinated security release](https://matrix.org/blog/2025/07/security-predisclosure/). This release includes support for [room version](https://spec.matrix.org/v1.15/rooms/) 12 which fixes a number of security vulnerabilities, including [CVE-2025-49090](https://www.cve.org/CVERecord?id=CVE-2025-49090).
@@ -23,7 +32,77 @@ Two patched Synapse releases are now available:
 - Speed up upgrading a room with large numbers of banned users. ([\#18574](https://github.com/element-hq/synapse/issues/18574))


+# Synapse 1.136.0rc2 (2025-08-11)

+- Update MSC4293 redaction logic for room v12. ([\#80](https://github.com/element-hq/synapse/issues/80))
+
+### Internal Changes
+
+- Add a parameter to `upgrade_rooms(..)` to allow auto join local users. ([\#83](https://github.com/element-hq/synapse/issues/83))
+
+
+# Synapse 1.136.0rc1 (2025-08-05)
+
+Please check [the relevant section in the upgrade notes](https://github.com/element-hq/synapse/blob/develop/docs/upgrade.md#upgrading-to-v11360) as this release contains changes to MAS support, metrics labels and the module API which may require your attention when upgrading.
+
+### Features
+
+- Add configurable rate limiting for the creation of rooms. ([\#18514](https://github.com/element-hq/synapse/issues/18514))
+- Add support for [MSC4293](https://github.com/matrix-org/matrix-spec-proposals/pull/4293) - Redact on Kick/Ban. ([\#18540](https://github.com/element-hq/synapse/issues/18540))
+- When admins enable themselves to see soft-failed events, they will also see if the cause is due to the policy server flagging them as spam via `unsigned`. ([\#18585](https://github.com/element-hq/synapse/issues/18585))
+- Add ability to configure forward/outbound proxy via homeserver config instead of environment variables. See `http_proxy`, `https_proxy`, `no_proxy_hosts`. ([\#18686](https://github.com/element-hq/synapse/issues/18686))
+- Advertise experimental support for [MSC4306](https://github.com/matrix-org/matrix-spec-proposals/pull/4306) (Thread Subscriptions) through `/_matrix/client/versions` if enabled. ([\#18722](https://github.com/element-hq/synapse/issues/18722))
+- Stabilise support for delegating authentication to [Matrix Authentication Service](https://github.com/element-hq/matrix-authentication-service/). ([\#18759](https://github.com/element-hq/synapse/issues/18759))
+- Implement the push rules for experimental [MSC4306: Thread Subscriptions](https://github.com/matrix-org/matrix-doc/issues/4306). ([\#18762](https://github.com/element-hq/synapse/issues/18762))
+
+### Bugfixes
+
+- Allow return code 403 (allowed by C2S Spec since v1.2) when fetching profiles via federation. ([\#18696](https://github.com/element-hq/synapse/issues/18696))
+- Register the MSC4306 (Thread Subscriptions) endpoints in the CS API when the experimental feature is enabled. ([\#18726](https://github.com/element-hq/synapse/issues/18726))
+- Fix a long-standing bug where suspended users could not have server notices sent to them (a 403 was returned to the admin). ([\#18750](https://github.com/element-hq/synapse/issues/18750))
+- Fix an issue that could cause logcontexts to be lost on rate-limited requests. Found by @realtyem. ([\#18763](https://github.com/element-hq/synapse/issues/18763))
+- Fix invalidation of storage cache that was broken in 1.135.0. ([\#18786](https://github.com/element-hq/synapse/issues/18786))
+
+### Improved Documentation
+
+- Minor improvements to README. ([\#18700](https://github.com/element-hq/synapse/issues/18700))
+- Document that there can be multiple workers handling the `receipts` stream. ([\#18760](https://github.com/element-hq/synapse/issues/18760))
+- Improve worker documentation for some device paths. ([\#18761](https://github.com/element-hq/synapse/issues/18761))
+
+### Deprecations and Removals
+
+- Deprecate `run_as_background_process` exported as part of the module API interface in favor of `ModuleApi.run_as_background_process`. See [the relevant section in the upgrade notes](https://github.com/element-hq/synapse/blob/develop/docs/upgrade.md#upgrading-to-v11360) for more information. ([\#18737](https://github.com/element-hq/synapse/issues/18737))
+
+### Internal Changes
+
+- Add debug logging for HMAC digest verification failures when using the admin API to register users. ([\#18474](https://github.com/element-hq/synapse/issues/18474))
+- Speed up upgrading a room with large numbers of banned users. ([\#18574](https://github.com/element-hq/synapse/issues/18574))
+- Fix config documentation generation script on Windows by enforcing UTF-8. ([\#18580](https://github.com/element-hq/synapse/issues/18580))
+- Refactor cache, background process, `Counter`, `LaterGauge`, `GaugeBucketCollector`, `Histogram`, and `Gauge` metrics to be homeserver-scoped. ([\#18656](https://github.com/element-hq/synapse/issues/18656), [\#18714](https://github.com/element-hq/synapse/issues/18714), [\#18715](https://github.com/element-hq/synapse/issues/18715), [\#18724](https://github.com/element-hq/synapse/issues/18724), [\#18753](https://github.com/element-hq/synapse/issues/18753), [\#18725](https://github.com/element-hq/synapse/issues/18725), [\#18670](https://github.com/element-hq/synapse/issues/18670), [\#18748](https://github.com/element-hq/synapse/issues/18748), [\#18751](https://github.com/element-hq/synapse/issues/18751))
+- Reduce database usage in Sliding Sync by not querying for background update completion after the update is known to be complete. ([\#18718](https://github.com/element-hq/synapse/issues/18718))
+- Improve order of validation and ratelimiting in room creation. ([\#18723](https://github.com/element-hq/synapse/issues/18723))
+- Bump minimum version bound on Twisted to 21.2.0. ([\#18727](https://github.com/element-hq/synapse/issues/18727), [\#18729](https://github.com/element-hq/synapse/issues/18729))
+- Use `twisted.internet.testing` module in tests instead of deprecated `twisted.test.proto_helpers`. ([\#18728](https://github.com/element-hq/synapse/issues/18728))
+- Remove obsolete `/send_event` replication endpoint. ([\#18730](https://github.com/element-hq/synapse/issues/18730))
+- Update metrics linting to be able to handle custom metrics. ([\#18733](https://github.com/element-hq/synapse/issues/18733))
+- Work around `twisted.protocols.amp.TooLong` error by reducing logging in some tests. ([\#18736](https://github.com/element-hq/synapse/issues/18736))
+- Prevent "Move labelled issues to correct projects" GitHub Actions workflow from failing when an issue is already on the project board. ([\#18755](https://github.com/element-hq/synapse/issues/18755))
+- Bump minimum supported Rust version (MSRV) to 1.82.0. Missed in [#18553](https://github.com/element-hq/synapse/pull/18553) (released in Synapse 1.134.0). ([\#18757](https://github.com/element-hq/synapse/issues/18757))
+- Make `Clock.sleep(...)` return a coroutine, so that mypy can catch places where we don't await on it. ([\#18772](https://github.com/element-hq/synapse/issues/18772))
+- Update implementation of [MSC4306: Thread Subscriptions](https://github.com/matrix-org/matrix-doc/issues/4306) to include automatic subscription conflict prevention as introduced in later drafts. ([\#18756](https://github.com/element-hq/synapse/issues/18756))
+
+
+
+### Updates to locked dependencies
+
+* Bump gitpython from 3.1.44 to 3.1.45. ([\#18743](https://github.com/element-hq/synapse/issues/18743))
+* Bump mypy-zope from 1.0.12 to 1.0.13. ([\#18744](https://github.com/element-hq/synapse/issues/18744))
+* Bump phonenumbers from 9.0.9 to 9.0.10. ([\#18741](https://github.com/element-hq/synapse/issues/18741))
+* Bump ruff from 0.12.4 to 0.12.5. ([\#18742](https://github.com/element-hq/synapse/issues/18742))
+* Bump sentry-sdk from 2.32.0 to 2.33.2. ([\#18745](https://github.com/element-hq/synapse/issues/18745))
+* Bump tokio from 1.46.1 to 1.47.0. ([\#18740](https://github.com/element-hq/synapse/issues/18740))
+* Bump types-jsonschema from 4.24.0.20250708 to 4.25.0.20250720. ([\#18703](https://github.com/element-hq/synapse/issues/18703))
+* Bump types-psycopg2 from 2.9.21.20250516 to 2.9.21.20250718. ([\#18706](https://github.com/element-hq/synapse/issues/18706))

 # Synapse 1.135.0 (2025-08-01)

@@ -8,7 +8,7 @@
 Synapse is an open source `Matrix <https://matrix.org>`__ homeserver
 implementation, written and maintained by `Element <https://element.io>`_.
 `Matrix <https://github.com/matrix-org>`__ is the open standard for
-secure and interoperable real time communications. You can directly run
+secure and interoperable real-time communications. You can directly run
 and manage the source code in this repository, available under an AGPL
 license (or alternatively under a commercial license from Element).
 There is no support provided by Element unless you have a
@@ -23,13 +23,13 @@ ESS builds on Synapse to offer a complete Matrix-based backend including the ful
 `Admin Console product <https://element.io/enterprise-functionality/admin-console>`_,
 giving admins the power to easily manage an organization-wide
 deployment. It includes advanced identity management, auditing,
-moderation and data retention options as well as Long Term Support and
-SLAs. ESS can be used to support any Matrix-based frontend client.
+moderation and data retention options as well as Long-Term Support and
+SLAs. ESS supports any Matrix-compatible client.

 .. contents::

-🛠️ Installing and configuration
-===============================
+🛠️ Installation and configuration
+==================================

 The Synapse documentation describes `how to install Synapse <https://element-hq.github.io/synapse/latest/setup/installation.html>`_. We recommend using
 `Docker images <https://element-hq.github.io/synapse/latest/setup/installation.html#docker-images-and-ansible-playbooks>`_ or `Debian packages from Matrix.org
@@ -133,7 +133,7 @@ connect from a client: see
 An easy way to get started is to login or register via Element at
 https://app.element.io/#/login or https://app.element.io/#/register respectively.
 You will need to change the server you are logging into from ``matrix.org``
-and instead specify a Homeserver URL of ``https://<server_name>:8448``
+and instead specify a homeserver URL of ``https://<server_name>:8448``
 (or just ``https://<server_name>`` if you are using a reverse proxy).
 If you prefer to use another client, refer to our
 `client breakdown <https://matrix.org/ecosystem/clients/>`_.
@@ -162,16 +162,15 @@ the public internet. Without it, anyone can freely register accounts on your hom
 This can be exploited by attackers to create spambots targeting the rest of the Matrix
 federation.

-Your new user name will be formed partly from the ``server_name``, and partly
-from a localpart you specify when you create the account. Your name will take
-the form of::
+Your new Matrix ID will be formed partly from the ``server_name``, and partly
+from a localpart you specify when you create the account in the form of::

    @localpart:my.domain.name

 (pronounced "at localpart on my dot domain dot name").

 As when logging in, you will need to specify a "Custom server".  Specify your
-desired ``localpart`` in the 'User name' box.
+desired ``localpart`` in the 'Username' box.

 🎯 Troubleshooting and support
 ==============================
@@ -209,10 +208,10 @@ Identity servers have the job of mapping email addresses and other 3rd Party
 IDs (3PIDs) to Matrix user IDs, as well as verifying the ownership of 3PIDs
 before creating that mapping.

-**They are not where accounts or credentials are stored - these live on home
-servers. Identity Servers are just for mapping 3rd party IDs to matrix IDs.**
+**Identity servers do not store accounts or credentials - these are stored and managed on homeservers.
+Identity servers are just for mapping 3rd Party IDs to Matrix IDs.**

-This process is very security-sensitive, as there is obvious risk of spam if it
+This process is highly security-sensitive, as there is an obvious risk of spam if it
 is too easy to sign up for Matrix accounts or harvest 3PID data. In the longer
 term, we hope to create a decentralised system to manage it (`matrix-doc #712
 <https://github.com/matrix-org/matrix-doc/issues/712>`_), but in the meantime,
@@ -238,9 +237,9 @@ email address.
 We welcome contributions to Synapse from the community!
 The best place to get started is our
 `guide for contributors <https://element-hq.github.io/synapse/latest/development/contributing_guide.html>`_.
-This is part of our larger `documentation <https://element-hq.github.io/synapse/latest>`_, which includes
-
+This is part of our broader `documentation <https://element-hq.github.io/synapse/latest>`_, which includes
 information for Synapse developers as well as Synapse administrators.
+
 Developers might be particularly interested in:

 * `Synapse's database schema <https://element-hq.github.io/synapse/latest/development/database_schema.html>`_,
@@ -19,17 +19,17 @@ def build(setup_kwargs: Dict[str, Any]) -> None:
        # This flag is a no-op in the latest versions. Instead, we need to
        # specify this in the `bdist_wheel` config below.
        py_limited_api=True,
-        # We force always building in release mode, as we can't tell the
-        # difference between using `poetry` in development vs production.
+        # We always build in release mode, as we can't distinguish
+        # between using `poetry` in development vs production.
        debug=False,
    )
    setup_kwargs.setdefault("rust_extensions", []).append(extension)
    setup_kwargs["zip_safe"] = False

-    # We lookup the minimum supported python version by looking at
-    # `python_requires` (e.g. ">=3.9.0,<4.0.0") and finding the first python
+    # We look up the minimum supported Python version by reading
+    # `python_requires` (e.g. ">=3.9.0,<4.0.0") and finding the first Python
    # version that matches. We then convert that into the `py_limited_api` form,
-    # e.g. cp39 for python 3.9.
+    # e.g. cp39 for Python 3.9.
    py_limited_api: str
    python_bounds = SpecifierSet(setup_kwargs["python_requires"])
    for minor_version in itertools.count(start=8):
@@ -4396,7 +4396,7 @@
              "exemplar": false,
              "expr": "(time() - max without (job, index, host) (avg_over_time(synapse_federation_last_received_pdu_time[10m]))) / 60",
              "instant": false,
-              "legendFormat": "{{server_name}} ",
+              "legendFormat": "{{origin_server_name}} ",
              "range": true,
              "refId": "A"
            }
@@ -4518,7 +4518,7 @@
              "exemplar": false,
              "expr": "(time() - max without (job, index, host) (avg_over_time(synapse_federation_last_sent_pdu_time[10m]))) / 60",
              "instant": false,
-              "legendFormat": "{{server_name}}",
+              "legendFormat": "{{destination_server_name}}",
              "range": true,
              "refId": "A"
            }
@@ -1,3 +1,21 @@
+matrix-synapse-py3 (1.136.0) stable; urgency=medium
+
+  * New Synapse release 1.136.0.
+
+ -- Synapse Packaging team <packages@matrix.org>  Tue, 12 Aug 2025 13:18:03 +0100
+
+matrix-synapse-py3 (1.136.0~rc2) stable; urgency=medium
+
+  * New Synapse release 1.136.0rc2.
+
+ -- Synapse Packaging team <packages@matrix.org>  Mon, 11 Aug 2025 12:18:52 -0600
+
+matrix-synapse-py3 (1.136.0~rc1) stable; urgency=medium
+
+  * New Synapse release 1.136.0rc1.
+
+ -- Synapse Packaging team <packages@matrix.org>  Tue, 05 Aug 2025 08:13:30 -0600
+
 matrix-synapse-py3 (1.135.2) stable; urgency=medium

  * New Synapse release 1.135.2.
@@ -98,6 +98,10 @@ rc_delayed_event_mgmt:
  per_second: 9999
  burst_count: 9999

+rc_room_creation:
+  per_second: 9999
+  burst_count: 9999
+
 federation_rr_transactions_per_room_per_second: 9999

 allow_device_name_lookup_over_federation: true
@@ -22,4 +22,46 @@ To receive soft failed events in APIs like `/sync` and `/messages`, set `return_
 to `true` in the admin client config. When `false`, the normal behaviour of these endpoints is to
 exclude soft failed events.

+**Note**: If the policy server flagged the event as spam and that caused soft failure, that will be indicated
+in the event's `unsigned` content like so:
+
+```json
+{
+  "type": "m.room.message",
+  "other": "event_fields_go_here",
+  "unsigned": {
+    "io.element.synapse.soft_failed": true,
+    "io.element.synapse.policy_server_spammy": true
+  }
+}
+```
+
 Default: `false`
+
+## See events marked spammy by policy servers
+
+Learn more about policy servers from [MSC4284](https://github.com/matrix-org/matrix-spec-proposals/pull/4284).
+
+Similar to `return_soft_failed_events`, clients logged in with admin accounts can see events which were
+flagged by the policy server as spammy (and thus soft failed) by setting `return_policy_server_spammy_events`
+to `true`.
+
+`return_policy_server_spammy_events` may be `true` while `return_soft_failed_events` is `false` to only see
+policy server-flagged events. When `return_soft_failed_events` is `true` however, `return_policy_server_spammy_events`
+is always `true`.
+
+Events flagged by the policy server are marked with `io.element.synapse.policy_server_spammy` in the
+event's `unsigned` content, like so:
+
+```json
+{
+  "type": "m.room.message",
+  "other": "event_fields_go_here",
+  "unsigned": {
+    "io.element.synapse.soft_failed": true,
+    "io.element.synapse.policy_server_spammy": true
+  }
+}
+```
+
+Default: `true` if `return_soft_failed_events` is `true`, otherwise `false`
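
As a rough illustration of the rules above, the sketch below resolves the effective value of `return_policy_server_spammy_events` and labels an event by its `unsigned` flags. The function names and the returned labels are hypothetical. Only the two `io.element.synapse.*` keys and the defaulting rule come from the documentation above.

```python
from typing import Any, Dict, Optional

SOFT_FAILED_KEY = "io.element.synapse.soft_failed"
SPAMMY_KEY = "io.element.synapse.policy_server_spammy"


def effective_spammy_setting(
    return_soft_failed_events: bool,
    return_policy_server_spammy_events: Optional[bool] = None,
) -> bool:
    """Resolve the documented rule: forced on whenever soft-failed events
    are returned, otherwise the explicit value (defaulting to False)."""
    if return_soft_failed_events:
        return True
    return bool(return_policy_server_spammy_events)


def classify_event(event: Dict[str, Any]) -> str:
    """Return a hypothetical label describing how the server flagged an event."""
    unsigned = event.get("unsigned") or {}
    if unsigned.get(SPAMMY_KEY) is True:
        return "policy_server_spammy"
    if unsigned.get(SOFT_FAILED_KEY) is True:
        return "soft_failed"
    return "normal"
```

An admin client could use `classify_event` to visually separate spam-flagged events from events that were soft failed for other reasons.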
@@ -7,8 +7,23 @@ proxy is supported, not SOCKS proxy or anything else.

 ## Configure

-The `http_proxy`, `https_proxy`, `no_proxy` environment variables are used to
-specify proxy settings. The environment variable is not case sensitive.
+The proxy settings can be configured in the homeserver configuration file via
+[`http_proxy`](../usage/configuration/config_documentation.md#http_proxy),
+[`https_proxy`](../usage/configuration/config_documentation.md#https_proxy), and
+[`no_proxy_hosts`](../usage/configuration/config_documentation.md#no_proxy_hosts).
+
+`homeserver.yaml` example:
+```yaml
+http_proxy: http://USERNAME:PASSWORD@10.0.1.1:8080/
+https_proxy: http://USERNAME:PASSWORD@proxy.example.com:8080/
+no_proxy_hosts:
+  - master.hostname.example.com
+  - 10.1.0.0/16
+  - 172.30.0.0/16
+```
+
+The proxy settings can also be configured via the `http_proxy`, `https_proxy`, and
+`no_proxy` environment variables. The variable names are not case-sensitive.
 - `http_proxy`: Proxy server to use for HTTP requests.
 - `https_proxy`: Proxy server to use for HTTPS requests.
 - `no_proxy`: Comma-separated list of hosts, IP addresses, or IP ranges in CIDR
@@ -44,7 +59,7 @@ The proxy will be **used** for:
 - phone-home stats
 - recaptcha validation
 - CAS auth validation
- OpenID Connect
+- OpenID Connect (OIDC)
 - Outbound federation
 - Federation (checking public key revocation)
 - Fetching public keys of other servers
@@ -53,7 +68,7 @@ The proxy will be **used** for:
 It will **not be used** for:

 - Application Services
- Identity servers
+- Matrix Identity servers
 - In worker configurations
  - connections between workers
  - connections from workers to Redis
@@ -117,6 +117,77 @@ each upgrade are complete before moving on to the next upgrade, to avoid
 stacking them up. You can monitor the currently running background updates with
 [the Admin API](usage/administration/admin_api/background_updates.html#status).

+# Upgrading to v1.136.0
+
+## Deprecate `run_as_background_process` exported as part of the module API interface in favor of `ModuleApi.run_as_background_process`
+
+The `run_as_background_process` function is now a method of the `ModuleApi` class. If
+you were using the function directly from the module API, it will continue to work fine
+but the background process metrics will not include an accurate `server_name` label.
+This kind of metric labeling isn't relevant for many use cases and is used to
+differentiate Synapse instances running in the same Python process (relevant to Synapse
+Pro: Small Hosts). We recommend updating your usage to use the new
+`ModuleApi.run_as_background_process` method to stay on top of future changes.
+
+<details>
+<summary>Example <code>run_as_background_process</code> upgrade</summary>
+
+Before:
+```python
+class MyModule:
+    def __init__(self, module_api: ModuleApi) -> None:
+        run_as_background_process(__name__ + ":setup_database", self.setup_database)
+```
+
+After:
+```python
+class MyModule:
+    def __init__(self, module_api: ModuleApi) -> None:
+        module_api.run_as_background_process(__name__ + ":setup_database", self.setup_database)
+```
+
+</details>
+
+## Metric labels have changed on `synapse_federation_last_received_pdu_time` and `synapse_federation_last_sent_pdu_time`
+
+Previously, the `synapse_federation_last_received_pdu_time` and
+`synapse_federation_last_sent_pdu_time` metrics both used the `server_name` label to
+differentiate between different servers that we send and receive events from.
+
+Since we're now using the `server_name` label to differentiate between different Synapse
+homeserver instances running in the same process, these metrics have been changed as follows:
+
+ - `synapse_federation_last_received_pdu_time` now uses the `origin_server_name` label
+ - `synapse_federation_last_sent_pdu_time` now uses the `destination_server_name` label
+
+The Grafana dashboard JSON in `contrib/grafana/synapse.json` has been updated to reflect
+this change but you will need to manually update your own existing Grafana dashboards
+using these metrics.
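
For custom dashboards, the legend update could be scripted. The following is a minimal sketch, assuming a dashboard exported as JSON where each query target carries `expr` and `legendFormat` fields (as in `contrib/grafana/synapse.json`). The helper name `update_legends` is hypothetical, and it deliberately leaves label matchers inside `expr` untouched so they can be reviewed by hand.

```python
from typing import Any

# Metric name -> (old label, new label), as described in the upgrade notes.
RENAMES = {
    "synapse_federation_last_received_pdu_time": ("server_name", "origin_server_name"),
    "synapse_federation_last_sent_pdu_time": ("server_name", "destination_server_name"),
}


def update_legends(node: Any) -> int:
    """Recursively rewrite `legendFormat` on targets that query the renamed metrics.

    Returns the number of targets changed.
    """
    changed = 0
    if isinstance(node, dict):
        expr = node.get("expr")
        legend = node.get("legendFormat")
        if isinstance(expr, str) and isinstance(legend, str):
            for metric, (old, new) in RENAMES.items():
                old_token = "{{" + old + "}}"
                if metric in expr and old_token in legend:
                    node["legendFormat"] = legend.replace(old_token, "{{" + new + "}}")
                    changed += 1
                    break
        # Descend into nested panels, rows and targets.
        for value in node.values():
            changed += update_legends(value)
    elif isinstance(node, list):
        for item in node:
            changed += update_legends(item)
    return changed
```

A typical use would be to load the dashboard with `json.load`, call `update_legends` on the result, and write it back with `json.dump` after checking the reported count.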
+
+## Stable integration with Matrix Authentication Service
+
+Support for [Matrix Authentication Service (MAS)](https://github.com/element-hq/matrix-authentication-service) is now stable, with a simplified configuration.
+This stable integration requires MAS 0.20.0 or later.
+
+The existing `experimental_features.msc3861` configuration option is now deprecated and will be removed in Synapse v1.137.0.
+
+Synapse deployments already using MAS should now use the new configuration options:
+
+```yaml
+matrix_authentication_service:
+  # Enable the MAS integration
+  enabled: true
+  # The base URL where Synapse will contact MAS
+  endpoint: http://localhost:8080
+  # The shared secret used to authenticate MAS requests, must be the same as `matrix.secret` in the MAS configuration
+  # See https://element-hq.github.io/matrix-authentication-service/reference/configuration.html#matrix
+  secret: "asecurerandomsecretstring"
+```
+
+They must remove the `experimental_features.msc3861` configuration option from their configuration.
+
+They can also remove the client previously used by Synapse [in the MAS configuration](https://element-hq.github.io/matrix-authentication-service/reference/configuration.html#clients) as it is no longer in use.
+
 # Upgrading to v1.135.0

 ## `on_user_registration` module API callback may now run on any worker
@@ -137,10 +208,10 @@ native ICU library on your system is no longer required.
 ## Documented endpoint which can be delegated to a federation worker

 The endpoint `^/_matrix/federation/v1/version$` can be delegated to a federation
-worker. This is not new behaviour, but had not been documented yet. The 
-[list of delegatable endpoints](workers.md#synapseappgeneric_worker) has 
+worker. This is not new behaviour, but had not been documented yet. The
+[list of delegatable endpoints](workers.md#synapseappgeneric_worker) has
 been updated to include it. Make sure to check your reverse proxy rules if you
-are using workers. 
+are using workers.

 # Upgrading to v1.126.0

@@ -610,6 +610,61 @@ manhole_settings:
  ssh_pub_key_path: CONFDIR/id_rsa.pub
 ```
 ---
+### `http_proxy`
+
+*(string|null)* Proxy server to use for HTTP requests.
+For more details, see the [forward proxy documentation](../../setup/forward_proxy.md). There is no default for this option.
+
+Example configuration:
+```yaml
+http_proxy: http://USERNAME:PASSWORD@10.0.1.1:8080/
+```
+---
+### `https_proxy`
+
+*(string|null)* Proxy server to use for HTTPS requests.
+For more details, see the [forward proxy documentation](../../setup/forward_proxy.md). There is no default for this option.
+
+Example configuration:
+```yaml
+https_proxy: http://USERNAME:PASSWORD@proxy.example.com:8080/
+```
+---
+### `no_proxy_hosts`
+
+*(array)* List of hosts, IP addresses, or IP ranges in CIDR format which should not use the proxy. Synapse will directly connect to these hosts.
+For more details, see the [forward proxy documentation](../../setup/forward_proxy.md). There is no default for this option.
+
+Example configuration:
+```yaml
+no_proxy_hosts:
+- master.hostname.example.com
+- 10.1.0.0/16
+- 172.30.0.0/16
+```
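
To make the three kinds of entry concrete, here is a small sketch of how a host could be checked against such a list using Python's standard `ipaddress` module. It is an illustration only, not Synapse's actual matching logic: the function name `bypasses_proxy` is hypothetical, and hostnames are compared by exact, case-insensitive match, which may differ from what Synapse does.

```python
import ipaddress
from typing import Iterable


def bypasses_proxy(host: str, no_proxy_hosts: Iterable[str]) -> bool:
    """Return True if `host` matches an entry (hostname, IP address, or CIDR range)."""
    try:
        host_ip = ipaddress.ip_address(host)
    except ValueError:
        host_ip = None  # `host` is a hostname, not an IP literal

    for entry in no_proxy_hosts:
        if host_ip is not None:
            try:
                # strict=False also accepts a bare IP address as a single-host network.
                if host_ip in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # The entry is a hostname, so it cannot match an IP literal.
        elif host.lower() == entry.lower():
            return True
    return False
```

With the example list above, `10.1.2.3` falls inside `10.1.0.0/16` and would be contacted directly, while `10.2.0.1` would go through the proxy.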
+---
+### `matrix_authentication_service`
+
+*(object)* The `matrix_authentication_service` setting configures integration with [Matrix Authentication Service (MAS)](https://github.com/element-hq/matrix-authentication-service).
+
+This setting has the following sub-options:
+
+* `enabled` (boolean): Whether or not to enable the MAS integration. If this is set to `false`, Synapse will use its legacy internal authentication API. Defaults to `false`.
+
+* `endpoint` (string): The URL where Synapse can reach MAS. This *must* have the `discovery` and `oauth` resources mounted. Defaults to `"http://localhost:8080"`.
+
+* `secret` (string|null): A shared secret that will be used to authenticate requests from and to MAS.
+
+* `secret_path` (string|null): Alternative to `secret`, reading the shared secret from a file. The file should be a plain text file, containing only the secret. Synapse reads the secret from the given file once at startup.
+
+Example configuration:
+```yaml
+matrix_authentication_service:
+  enabled: true
+  secret: someverysecuresecret
+  endpoint: http://localhost:8080
+```
+---
 ### `dummy_events_threshold`

 *(integer)* Forward extremities can build up in a room due to networking delays between homeservers. Once this happens in a large room, calculation of the state of that room can become quite expensive. To mitigate this, once the number of forward extremities reaches a given threshold, Synapse will send an `org.matrix.dummy_event` event, which will reduce the forward extremities in the room.
@@ -1963,6 +2018,31 @@ rc_reports:
  burst_count: 20.0
 ```
 ---
+### `rc_room_creation`
+
+*(object)* Sets rate limits for how often users are able to create rooms.
+
+This setting has the following sub-options:
+
+* `per_second` (number): Maximum number of requests a client can send per second.
+
+* `burst_count` (number): Maximum number of requests a client can send before being throttled.
+
+Default configuration:
+```yaml
+rc_room_creation:
+  per_second: 0.016
+  burst_count: 10.0
+```
+
+Example configuration:
+```yaml
+rc_room_creation:
+  per_second: 1.0
+  burst_count: 5.0
+```
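
The interaction between `per_second` and `burst_count` can be pictured with a generic token bucket. The sketch below is an assumption-laden illustration, not Synapse's rate limiter: the class name `RoomCreationLimiter` is hypothetical, and the caller passes the current time explicitly so the behaviour is easy to follow.

```python
class RoomCreationLimiter:
    """Generic token bucket: `burst_count` is the capacity, `per_second` the refill rate."""

    def __init__(self, per_second: float, burst_count: float) -> None:
        self.per_second = per_second
        self.burst_count = burst_count
        self._tokens = burst_count  # start with a full bucket
        self._last = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` (in seconds) is permitted."""
        elapsed = max(0.0, now - self._last)
        self._last = now
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(self.burst_count, self._tokens + elapsed * self.per_second)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

Under this model, the example configuration (`per_second: 1.0`, `burst_count: 5.0`) permits five room creations back to back, then one more per second. A `per_second` of `0.016` corresponds to roughly one request per minute once the burst is used up.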
+---
 ### `federation_rr_transactions_per_room_per_second`

 *(integer)* Sets outgoing federation transaction frequency for sending read-receipts, per-room.
@@ -260,7 +260,7 @@ information.
    ^/_matrix/client/(r0|v3|unstable)/keys/claim$
    ^/_matrix/client/(r0|v3|unstable)/room_keys/
    ^/_matrix/client/(r0|v3|unstable)/keys/upload
-    ^/_matrix/client/(api/v1|r0|v3|unstable/keys/device_signing/upload$
+    ^/_matrix/client/(api/v1|r0|v3|unstable)/keys/device_signing/upload$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/keys/signatures/upload$

    # Registration/login requests
@@ -532,8 +532,9 @@ the stream writer for the `account_data` stream:

 ##### The `receipts` stream

-The following endpoints should be routed directly to the worker configured as
-the stream writer for the `receipts` stream:
+The `receipts` stream supports multiple writers. The following endpoints
+can be handled by any worker, but should be routed directly to one of the workers
+configured as stream writer for the `receipts` stream:

    ^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt
    ^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers
@@ -555,13 +556,13 @@ the stream writer for the `push_rules` stream:
 ##### The `device_lists` stream

 The `device_lists` stream supports multiple writers. The following endpoints
-can be handled by any worker, but should be routed directly one of the workers
+can be handled by any worker, but should be routed directly to one of the workers
 configured as stream writer for the `device_lists` stream:

    ^/_matrix/client/(r0|v3)/delete_devices$
-    ^/_matrix/client/(api/v1|r0|v3|unstable)/devices/
+    ^/_matrix/client/(api/v1|r0|v3|unstable)/devices(/|$)
    ^/_matrix/client/(r0|v3|unstable)/keys/upload
-    ^/_matrix/client/(api/v1|r0|v3|unstable/keys/device_signing/upload$
+    ^/_matrix/client/(api/v1|r0|v3|unstable)/keys/device_signing/upload$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/keys/signatures/upload$
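
The effect of the two pattern corrections above can be checked with a short script. This sketch uses Python's `re` module purely for illustration; reverse proxies use their own regex engines, which may report the unbalanced parenthesis differently. The helper names are hypothetical.

```python
import re

OLD_DEVICES = r"^/_matrix/client/(api/v1|r0|v3|unstable)/devices/"
NEW_DEVICES = r"^/_matrix/client/(api/v1|r0|v3|unstable)/devices(/|$)"
BROKEN_UPLOAD = r"^/_matrix/client/(api/v1|r0|v3|unstable/keys/device_signing/upload$"
FIXED_UPLOAD = r"^/_matrix/client/(api/v1|r0|v3|unstable)/keys/device_signing/upload$"


def compiles(pattern: str) -> bool:
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def matches(pattern: str, path: str) -> bool:
    """Return True if the request path matches the routing pattern."""
    return re.search(pattern, path) is not None
```

The old `devices/` pattern misses the bare `/devices` listing endpoint, whereas `devices(/|$)` matches both `/devices` and `/devices/<device_id>` without also matching unrelated paths that merely start with `devices`.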

 #### Restrict outbound federation traffic to a specific set of workers
@@ -1,6 +1,17 @@
 [mypy]
 namespace_packages = True
-plugins = pydantic.mypy, mypy_zope:plugin, scripts-dev/mypy_synapse_plugin.py
+# Our custom mypy plugin should remain first in this list.
+#
+# mypy has a limitation where it only chooses the first plugin that returns a non-None
+# value for each hook (known-limitation, c.f.
+# https://github.com/python/mypy/issues/19524). We work around this by putting our custom
+# plugin first in the plugin order and then manually calling any other conflicting
+# plugin hooks in our own plugin followed by our own checks.
+#
+# If you add a new plugin, make sure to check whether the hooks being used conflict with
+# our custom plugin hooks and if so, manually call the other plugin's hooks in our
+# custom plugin. (This also applies if an existing plugin is updated in the future.)
+plugins = scripts-dev/mypy_synapse_plugin.py, pydantic.mypy, mypy_zope:plugin
 follow_imports = normal
 show_error_codes = True
 show_traceback = True
@@ -99,3 +110,6 @@ ignore_missing_imports = True

 [mypy-multipart.*]
 ignore_missing_imports = True
+
+[mypy-mypy_zope.*]
+ignore_missing_imports = True
@@ -504,18 +504,19 @@ smmap = ">=3.0.1,<6"

 [[package]]
 name = "gitpython"
-version = "3.1.44"
+version = "3.1.45"
 description = "GitPython is a Python library used to interact with Git repositories"
 optional = false
 python-versions = ">=3.7"
 groups = ["dev"]
 files = [
-    {file = "GitPython-3.1.44-py3-none-any.whl", hash = "sha256:9e0e10cda9bed1ee64bc9a6de50e7e38a9c9943241cd7f585f6df3ed28011110"},
-    {file = "gitpython-3.1.44.tar.gz", hash = "sha256:c87e30b26253bf5418b01b0660f818967f3c503193838337fe5e573331249269"},
+    {file = "gitpython-3.1.45-py3-none-any.whl", hash = "sha256:8908cb2e02fb3b93b7eb0f2827125cb699869470432cc885f019b8fd0fccff77"},
+    {file = "gitpython-3.1.45.tar.gz", hash = "sha256:85b0ee964ceddf211c41b9f27a49086010a190fd8132a24e21f362a4b36a791c"},
 ]

 [package.dependencies]
 gitdb = ">=4.0.1,<5"
+typing-extensions = {version = ">=3.10.0.2", markers = "python_version < \"3.10\""}

 [package.extras]
 doc = ["sphinx (>=7.1.2,<7.2)", "sphinx-autodoc-typehints", "sphinx_rtd_theme"]
@@ -1453,18 +1454,18 @@ files = [

 [[package]]
 name = "mypy-zope"
-version = "1.0.12"
+version = "1.0.13"
 description = "Plugin for mypy to support zope interfaces"
 optional = false
 python-versions = "*"
 groups = ["dev"]
 files = [
-    {file = "mypy_zope-1.0.12-py3-none-any.whl", hash = "sha256:f2ecf169f886fbc266e9339db0c2f3818528a7536b9bb4f5ece1d5854dc2f27c"},
-    {file = "mypy_zope-1.0.12.tar.gz", hash = "sha256:d6f8f99eb5644885553b4ec7afc8d68f5daf412c9bf238ec3c36b65d97df6cbe"},
+    {file = "mypy_zope-1.0.13-py3-none-any.whl", hash = "sha256:13740c4cbc910cca2c143c6709e1c483c991abeeeb7b629ad6f73d8ac1edad15"},
+    {file = "mypy_zope-1.0.13.tar.gz", hash = "sha256:63fb4d035ea874baf280dc69e714dcde4bd2a4a4837a0fd8d90ce91bea510f99"},
 ]

 [package.dependencies]
-mypy = ">=1.0.0,<1.17.0"
+mypy = ">=1.0.0,<1.18.0"
 "zope.interface" = "*"
 "zope.schema" = "*"

@@ -1542,14 +1543,14 @@ files = [

 [[package]]
 name = "phonenumbers"
-version = "9.0.9"
+version = "9.0.10"
 description = "Python version of Google's common library for parsing, formatting, storing and validating international phone numbers."
 optional = false
 python-versions = "*"
 groups = ["main"]
 files = [
-    {file = "phonenumbers-9.0.9-py2.py3-none-any.whl", hash = "sha256:13b91aa153f87675902829b38a556bad54824f9c121b89588bbb5fa8550d97ef"},
-    {file = "phonenumbers-9.0.9.tar.gz", hash = "sha256:c640545019a07e68b0bea57a5fede6eef45c7391165d28935f45615f9a567a5b"},
+    {file = "phonenumbers-9.0.10-py2.py3-none-any.whl", hash = "sha256:13b12d269be1f2b363c9bc2868656a7e2e8b50f1a1cef629c75005da6c374c6b"},
+    {file = "phonenumbers-9.0.10.tar.gz", hash = "sha256:c2d15a6a9d0534b14a7764f51246ada99563e263f65b80b0251d1a760ac4a1ba"},
 ]

 [[package]]
@@ -2408,30 +2409,30 @@ files = [

 [[package]]
 name = "ruff"
-version = "0.12.4"
+version = "0.12.7"
 description = "An extremely fast Python linter and code formatter, written in Rust."
 optional = false
 python-versions = ">=3.7"
 groups = ["dev"]
 files = [
-    {file = "ruff-0.12.4-py3-none-linux_armv6l.whl", hash = "sha256:cb0d261dac457ab939aeb247e804125a5d521b21adf27e721895b0d3f83a0d0a"},
-    {file = "ruff-0.12.4-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:55c0f4ca9769408d9b9bac530c30d3e66490bd2beb2d3dae3e4128a1f05c7442"},
-    {file = "ruff-0.12.4-py3-none-macosx_11_0_arm64.whl", hash = "sha256:a8224cc3722c9ad9044da7f89c4c1ec452aef2cfe3904365025dd2f51daeae0e"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e9949d01d64fa3672449a51ddb5d7548b33e130240ad418884ee6efa7a229586"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:be0593c69df9ad1465e8a2d10e3defd111fdb62dcd5be23ae2c06da77e8fcffb"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a7dea966bcb55d4ecc4cc3270bccb6f87a337326c9dcd3c07d5b97000dbff41c"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:afcfa3ab5ab5dd0e1c39bf286d829e042a15e966b3726eea79528e2e24d8371a"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c057ce464b1413c926cdb203a0f858cd52f3e73dcb3270a3318d1630f6395bb3"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e64b90d1122dc2713330350626b10d60818930819623abbb56535c6466cce045"},
-    {file = "ruff-0.12.4-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2abc48f3d9667fdc74022380b5c745873499ff827393a636f7a59da1515e7c57"},
-    {file = "ruff-0.12.4-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:2b2449dc0c138d877d629bea151bee8c0ae3b8e9c43f5fcaafcd0c0d0726b184"},
-    {file = "ruff-0.12.4-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:56e45bb11f625db55f9b70477062e6a1a04d53628eda7784dce6e0f55fd549eb"},
-    {file = "ruff-0.12.4-py3-none-musllinux_1_2_i686.whl", hash = "sha256:478fccdb82ca148a98a9ff43658944f7ab5ec41c3c49d77cd99d44da019371a1"},
-    {file = "ruff-0.12.4-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:0fc426bec2e4e5f4c4f182b9d2ce6a75c85ba9bcdbe5c6f2a74fcb8df437df4b"},
-    {file = "ruff-0.12.4-py3-none-win32.whl", hash = "sha256:4de27977827893cdfb1211d42d84bc180fceb7b72471104671c59be37041cf93"},
-    {file = "ruff-0.12.4-py3-none-win_amd64.whl", hash = "sha256:fe0b9e9eb23736b453143d72d2ceca5db323963330d5b7859d60d101147d461a"},
-    {file = "ruff-0.12.4-py3-none-win_arm64.whl", hash = "sha256:0618ec4442a83ab545e5b71202a5c0ed7791e8471435b94e655b570a5031a98e"},
-    {file = "ruff-0.12.4.tar.gz", hash = "sha256:13efa16df6c6eeb7d0f091abae50f58e9522f3843edb40d56ad52a5a4a4b6873"},
+    {file = "ruff-0.12.7-py3-none-linux_armv6l.whl", hash = "sha256:76e4f31529899b8c434c3c1dede98c4483b89590e15fb49f2d46183801565303"},
+    {file = "ruff-0.12.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:789b7a03e72507c54fb3ba6209e4bb36517b90f1a3569ea17084e3fd295500fb"},
+    {file = "ruff-0.12.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:2e1c2a3b8626339bb6369116e7030a4cf194ea48f49b64bb505732a7fce4f4e3"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:32dec41817623d388e645612ec70d5757a6d9c035f3744a52c7b195a57e03860"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:47ef751f722053a5df5fa48d412dbb54d41ab9b17875c6840a58ec63ff0c247c"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a828a5fc25a3efd3e1ff7b241fd392686c9386f20e5ac90aa9234a5faa12c423"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:5726f59b171111fa6a69d82aef48f00b56598b03a22f0f4170664ff4d8298efb"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:74e6f5c04c4dd4aba223f4fe6e7104f79e0eebf7d307e4f9b18c18362124bccd"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5d0bfe4e77fba61bf2ccadf8cf005d6133e3ce08793bbe870dd1c734f2699a3e"},
+    {file = "ruff-0.12.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:06bfb01e1623bf7f59ea749a841da56f8f653d641bfd046edee32ede7ff6c606"},
+    {file = "ruff-0.12.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e41df94a957d50083fd09b916d6e89e497246698c3f3d5c681c8b3e7b9bb4ac8"},
+    {file = "ruff-0.12.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:4000623300563c709458d0ce170c3d0d788c23a058912f28bbadc6f905d67afa"},
+    {file = "ruff-0.12.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:69ffe0e5f9b2cf2b8e289a3f8945b402a1b19eff24ec389f45f23c42a3dd6fb5"},
+    {file = "ruff-0.12.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a07a5c8ffa2611a52732bdc67bf88e243abd84fe2d7f6daef3826b59abbfeda4"},
+    {file = "ruff-0.12.7-py3-none-win32.whl", hash = "sha256:c928f1b2ec59fb77dfdf70e0419408898b63998789cc98197e15f560b9e77f77"},
+    {file = "ruff-0.12.7-py3-none-win_amd64.whl", hash = "sha256:9c18f3d707ee9edf89da76131956aba1270c6348bfee8f6c647de841eac7194f"},
+    {file = "ruff-0.12.7-py3-none-win_arm64.whl", hash = "sha256:dfce05101dbd11833a0776716d5d1578641b7fddb537fe7fa956ab85d1769b69"},
+    {file = "ruff-0.12.7.tar.gz", hash = "sha256:1fc3193f238bc2d7968772c82831a4ff69252f673be371fb49663f0068b7ec71"},
 ]

 [[package]]
@@ -2469,15 +2470,15 @@ doc = ["Sphinx", "sphinx-rtd-theme"]

 [[package]]
 name = "sentry-sdk"
-version = "2.32.0"
+version = "2.34.1"
 description = "Python client for Sentry (https://sentry.io)"
 optional = true
 python-versions = ">=3.6"
 groups = ["main"]
 markers = "extra == \"all\" or extra == \"sentry\""
 files = [
-    {file = "sentry_sdk-2.32.0-py2.py3-none-any.whl", hash = "sha256:6cf51521b099562d7ce3606da928c473643abe99b00ce4cb5626ea735f4ec345"},
-    {file = "sentry_sdk-2.32.0.tar.gz", hash = "sha256:9016c75d9316b0f6921ac14c8cd4fb938f26002430ac5be9945ab280f78bec6b"},
+    {file = "sentry_sdk-2.34.1-py2.py3-none-any.whl", hash = "sha256:b7a072e1cdc5abc48101d5146e1ae680fa81fe886d8d95aaa25a0b450c818d32"},
+    {file = "sentry_sdk-2.34.1.tar.gz", hash = "sha256:69274eb8c5c38562a544c3e9f68b5be0a43be4b697f5fd385bf98e4fbe672687"},
 ]

 [package.dependencies]
@@ -2931,14 +2932,14 @@ files = [

 [[package]]
 name = "types-jsonschema"
-version = "4.24.0.20250708"
+version = "4.25.0.20250720"
 description = "Typing stubs for jsonschema"
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "types_jsonschema-4.24.0.20250708-py3-none-any.whl", hash = "sha256:d574aa3421d178a8435cc898cf4cf5e5e8c8f37b949c8e3ceeff06da433a18bf"},
-    {file = "types_jsonschema-4.24.0.20250708.tar.gz", hash = "sha256:a910e4944681cbb1b18a93ffb502e09910db788314312fc763df08d8ac2aadb7"},
+    {file = "types_jsonschema-4.25.0.20250720-py3-none-any.whl", hash = "sha256:7d7897c715310d8bf9ae27a2cedba78bbb09e4cad83ce06d2aa79b73a88941df"},
+    {file = "types_jsonschema-4.25.0.20250720.tar.gz", hash = "sha256:765a3b6144798fe3161fd8cbe570a756ed3e8c0e5adb7c09693eb49faad39dbd"},
 ]

 [package.dependencies]
@@ -2982,14 +2983,14 @@ files = [

 [[package]]
 name = "types-psycopg2"
-version = "2.9.21.20250516"
+version = "2.9.21.20250718"
 description = "Typing stubs for psycopg2"
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "types_psycopg2-2.9.21.20250516-py3-none-any.whl", hash = "sha256:2a9212d1e5e507017b31486ce8147634d06b85d652769d7a2d91d53cb4edbd41"},
-    {file = "types_psycopg2-2.9.21.20250516.tar.gz", hash = "sha256:6721018279175cce10b9582202e2a2b4a0da667857ccf82a97691bdb5ecd610f"},
+    {file = "types_psycopg2-2.9.21.20250718-py3-none-any.whl", hash = "sha256:bcf085d4293bda48f5943a46dadf0389b2f98f7e8007722f7e1c12ee0f541858"},
+    {file = "types_psycopg2-2.9.21.20250718.tar.gz", hash = "sha256:dc09a97272ef67e739e57b9f4740b761208f4514257e311c0b05c8c7a37d04b4"},
 ]

 [[package]]
@@ -3352,4 +3353,4 @@ url-preview = ["lxml"]
 [metadata]
 lock-version = "2.1"
 python-versions = "^3.9.0"
-content-hash = "b1a0f4708465fd597d0bc7ebb09443ce0e2613cd58a33387a28036249f26856b"
+content-hash = "600a349d08dde732df251583094a121b5385eb43ae0c6ceff10dcf9749359446"
@@ -101,7 +101,7 @@ module-name = "synapse.synapse_rust"

 [tool.poetry]
 name = "matrix-synapse"
-version = "1.135.2"
+version = "1.136.0"
 description = "Homeserver for the Matrix decentralised comms protocol"
 authors = ["Matrix.org Team and Contributors <packages@matrix.org>"]
 license = "AGPL-3.0-or-later"
@@ -178,8 +178,13 @@ signedjson = "^1.1.0"
 service-identity = ">=18.1.0"
 # Twisted 18.9 introduces some logger improvements that the structured
 # logger utilises
-Twisted = {extras = ["tls"], version = ">=18.9.0"}
-treq = ">=15.1"
+# Twisted 19.7.0 moves test helpers to a new module and deprecates the old location.
+# Twisted 21.2.0 introduces contextvar support.
+# We could likely bump this to 22.1 without making distro packagers'
+# lives hard (as of 2025-07, distro support is Ubuntu LTS: 22.1, Debian stable: 22.4,
+# RHEL 9: 22.10)
+Twisted = {extras = ["tls"], version = ">=21.2.0"}
+treq = ">=21.5.0"
 # Twisted has required pyopenssl 16.0 since about Twisted 16.6.
 pyOpenSSL = ">=16.0.0"
 PyYAML = ">=5.3"
@@ -319,7 +324,7 @@ all = [
 # failing on new releases. Keeping lower bounds loose here means that dependabot
 # can bump versions without having to update the content-hash in the lockfile.
 # This helps prevent merge conflicts when running a batch of dependabot updates.
-ruff = "0.12.4"
+ruff = "0.12.7"
 # Type checking only works with the pydantic.v1 compat module from pydantic v2
 pydantic = "^2"

@@ -7,7 +7,7 @@ name = "synapse"
 version = "0.1.0"

 edition = "2021"
-rust-version = "1.81.0"
+rust-version = "1.82.0"

 [lib]
 name = "synapse"
@@ -61,6 +61,7 @@ fn bench_match_exact(b: &mut Bencher) {
        vec![],
        false,
        false,
+        false,
    )
    .unwrap();

@@ -71,10 +72,10 @@ fn bench_match_exact(b: &mut Bencher) {
        },
    ));

-    let matched = eval.match_condition(&condition, None, None).unwrap();
+    let matched = eval.match_condition(&condition, None, None, None).unwrap();
    assert!(matched, "Didn't match");

-    b.iter(|| eval.match_condition(&condition, None, None).unwrap());
+    b.iter(|| eval.match_condition(&condition, None, None, None).unwrap());
 }

 #[bench]
@@ -107,6 +108,7 @@ fn bench_match_word(b: &mut Bencher) {
        vec![],
        false,
        false,
+        false,
    )
    .unwrap();

@@ -117,10 +119,10 @@ fn bench_match_word(b: &mut Bencher) {
        },
    ));

-    let matched = eval.match_condition(&condition, None, None).unwrap();
+    let matched = eval.match_condition(&condition, None, None, None).unwrap();
    assert!(matched, "Didn't match");

-    b.iter(|| eval.match_condition(&condition, None, None).unwrap());
+    b.iter(|| eval.match_condition(&condition, None, None, None).unwrap());
 }

 #[bench]
@@ -153,6 +155,7 @@ fn bench_match_word_miss(b: &mut Bencher) {
        vec![],
        false,
        false,
+        false,
    )
    .unwrap();

@@ -163,10 +166,10 @@ fn bench_match_word_miss(b: &mut Bencher) {
        },
    ));

-    let matched = eval.match_condition(&condition, None, None).unwrap();
+    let matched = eval.match_condition(&condition, None, None, None).unwrap();
    assert!(!matched, "Unexpectedly matched");

-    b.iter(|| eval.match_condition(&condition, None, None).unwrap());
+    b.iter(|| eval.match_condition(&condition, None, None, None).unwrap());
 }

 #[bench]
@@ -199,6 +202,7 @@ fn bench_eval_message(b: &mut Bencher) {
        vec![],
        false,
        false,
+        false,
    )
    .unwrap();

@@ -210,7 +214,8 @@ fn bench_eval_message(b: &mut Bencher) {
        false,
        false,
        false,
+        false,
    );

-    b.iter(|| eval.run(&rules, Some("bob"), Some("person")));
+    b.iter(|| eval.run(&rules, Some("bob"), Some("person"), None));
 }
@@ -54,6 +54,7 @@ enum EventInternalMetadataData {
    RecheckRedaction(bool),
    SoftFailed(bool),
    ProactivelySend(bool),
+    PolicyServerSpammy(bool),
    Redacted(bool),
    TxnId(Box<str>),
    TokenId(i64),
@@ -96,6 +97,13 @@ impl EventInternalMetadataData {
                    .to_owned()
                    .into_any(),
            ),
+            EventInternalMetadataData::PolicyServerSpammy(o) => (
+                pyo3::intern!(py, "policy_server_spammy"),
+                o.into_pyobject(py)
+                    .unwrap_infallible()
+                    .to_owned()
+                    .into_any(),
+            ),
            EventInternalMetadataData::Redacted(o) => (
                pyo3::intern!(py, "redacted"),
                o.into_pyobject(py)
@@ -155,6 +163,11 @@ impl EventInternalMetadataData {
                    .extract()
                    .with_context(|| format!("'{key_str}' has invalid type"))?,
            ),
+            "policy_server_spammy" => EventInternalMetadataData::PolicyServerSpammy(
+                value
+                    .extract()
+                    .with_context(|| format!("'{key_str}' has invalid type"))?,
+            ),
            "redacted" => EventInternalMetadataData::Redacted(
                value
                    .extract()
@@ -427,6 +440,17 @@ impl EventInternalMetadata {
        set_property!(self, ProactivelySend, obj);
    }

+    #[getter]
+    fn get_policy_server_spammy(&self) -> PyResult<bool> {
+        Ok(get_property_opt!(self, PolicyServerSpammy)
+            .copied()
+            .unwrap_or(false))
+    }
+    #[setter]
+    fn set_policy_server_spammy(&mut self, obj: bool) {
+        set_property!(self, PolicyServerSpammy, obj);
+    }
+
    #[getter]
    fn get_redacted(&self) -> PyResult<bool> {
        let bool = get_property!(self, Redacted)?;
@@ -290,6 +290,26 @@ pub const BASE_APPEND_CONTENT_RULES: &[PushRule] = &[PushRule {
 }];

 pub const BASE_APPEND_UNDERRIDE_RULES: &[PushRule] = &[
+    PushRule {
+        rule_id: Cow::Borrowed("global/content/.io.element.msc4306.rule.unsubscribed_thread"),
+        priority_class: 1,
+        conditions: Cow::Borrowed(&[Condition::Known(
+            KnownCondition::Msc4306ThreadSubscription { subscribed: false },
+        )]),
+        actions: Cow::Borrowed(&[]),
+        default: true,
+        default_enabled: true,
+    },
+    PushRule {
+        rule_id: Cow::Borrowed("global/content/.io.element.msc4306.rule.subscribed_thread"),
+        priority_class: 1,
+        conditions: Cow::Borrowed(&[Condition::Known(
+            KnownCondition::Msc4306ThreadSubscription { subscribed: true },
+        )]),
+        actions: Cow::Borrowed(&[Action::Notify, SOUND_ACTION]),
+        default: true,
+        default_enabled: true,
+    },
    PushRule {
        rule_id: Cow::Borrowed("global/underride/.m.rule.call"),
        priority_class: 1,
@@ -106,8 +106,11 @@ pub struct PushRuleEvaluator {
    /// flag as MSC1767 (extensible events core).
    msc3931_enabled: bool,

-    // If MSC4210 (remove legacy mentions) is enabled.
+    /// If MSC4210 (remove legacy mentions) is enabled.
    msc4210_enabled: bool,
+
+    /// If MSC4306 (thread subscriptions) is enabled.
+    msc4306_enabled: bool,
 }

 #[pymethods]
@@ -126,6 +129,7 @@ impl PushRuleEvaluator {
        room_version_feature_flags,
        msc3931_enabled,
        msc4210_enabled,
+        msc4306_enabled,
    ))]
    pub fn py_new(
        flattened_keys: BTreeMap<String, JsonValue>,
@@ -138,6 +142,7 @@ impl PushRuleEvaluator {
        room_version_feature_flags: Vec<String>,
        msc3931_enabled: bool,
        msc4210_enabled: bool,
+        msc4306_enabled: bool,
    ) -> Result<Self, Error> {
        let body = match flattened_keys.get("content.body") {
            Some(JsonValue::Value(SimpleJsonValue::Str(s))) => s.clone().into_owned(),
@@ -156,6 +161,7 @@ impl PushRuleEvaluator {
            room_version_feature_flags,
            msc3931_enabled,
            msc4210_enabled,
+            msc4306_enabled,
        })
    }

@@ -167,12 +173,19 @@ impl PushRuleEvaluator {
    ///
    /// Returns the set of actions, if any, that match (filtering out any
    /// `dont_notify` and `coalesce` actions).
-    #[pyo3(signature = (push_rules, user_id=None, display_name=None))]
+    ///
+    /// msc4306_thread_subscription_state: (Only populated if MSC4306 is enabled)
+    /// The thread subscription state corresponding to the thread containing this event.
+    /// - `None` if the event is not in a thread, or if MSC4306 is disabled.
+    /// - `Some(true)` if the event is in a thread and the user has a subscription for that thread
+    /// - `Some(false)` if the event is in a thread and the user does NOT have a subscription for that thread
+    #[pyo3(signature = (push_rules, user_id=None, display_name=None, msc4306_thread_subscription_state=None))]
    pub fn run(
        &self,
        push_rules: &FilteredPushRules,
        user_id: Option<&str>,
        display_name: Option<&str>,
+        msc4306_thread_subscription_state: Option<bool>,
    ) -> Vec<Action> {
        'outer: for (push_rule, enabled) in push_rules.iter() {
            if !enabled {
@@ -204,7 +217,12 @@ impl PushRuleEvaluator {
                    Condition::Known(KnownCondition::RoomVersionSupports { feature: _ }),
                );

-                match self.match_condition(condition, user_id, display_name) {
+                match self.match_condition(
+                    condition,
+                    user_id,
+                    display_name,
+                    msc4306_thread_subscription_state,
+                ) {
                    Ok(true) => {}
                    Ok(false) => continue 'outer,
                    Err(err) => {
@@ -237,14 +255,20 @@ impl PushRuleEvaluator {
    }

    /// Check if the given condition matches.
-    #[pyo3(signature = (condition, user_id=None, display_name=None))]
+    #[pyo3(signature = (condition, user_id=None, display_name=None, msc4306_thread_subscription_state=None))]
    fn matches(
        &self,
        condition: Condition,
        user_id: Option<&str>,
        display_name: Option<&str>,
+        msc4306_thread_subscription_state: Option<bool>,
    ) -> bool {
-        match self.match_condition(&condition, user_id, display_name) {
+        match self.match_condition(
+            &condition,
+            user_id,
+            display_name,
+            msc4306_thread_subscription_state,
+        ) {
            Ok(true) => true,
            Ok(false) => false,
            Err(err) => {
@@ -262,6 +286,7 @@ impl PushRuleEvaluator {
        condition: &Condition,
        user_id: Option<&str>,
        display_name: Option<&str>,
+        msc4306_thread_subscription_state: Option<bool>,
    ) -> Result<bool, Error> {
        let known_condition = match condition {
            Condition::Known(known) => known,
@@ -393,6 +418,13 @@ impl PushRuleEvaluator {
                        && self.room_version_feature_flags.contains(&flag)
                }
            }
+            KnownCondition::Msc4306ThreadSubscription { subscribed } => {
+                if !self.msc4306_enabled {
+                    false
+                } else {
+                    msc4306_thread_subscription_state == Some(*subscribed)
+                }
+            }
        };

        Ok(result)
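The three-valued subscription state handled by the `Msc4306ThreadSubscription` arm can be restated as a small truth table. The following Python sketch mirrors the Rust logic shown in this hunk; the function name `matches_thread_subscription` is hypothetical and exists only for illustration.

```python
from typing import Optional


def matches_thread_subscription(
    msc4306_enabled: bool,
    subscribed: bool,
    thread_subscription_state: Optional[bool],
) -> bool:
    """Sketch of the condition check: `subscribed` comes from the push rule,
    `thread_subscription_state` from the event being evaluated."""
    if not msc4306_enabled:
        # With the feature disabled the condition never matches.
        return False
    # `None` (event not in a thread) equals neither True nor False, so such
    # events match neither the subscribed nor the unsubscribed rule.
    return thread_subscription_state is not None and thread_subscription_state == subscribed
```

Under this reading, an event outside any thread falls through both MSC4306 rules and is handled by the remaining underride rules.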
@@ -536,10 +568,11 @@ fn push_rule_evaluator() {
        vec![],
        true,
        false,
+        false,
    )
    .unwrap();

-    let result = evaluator.run(&FilteredPushRules::default(), None, Some("bob"));
+    let result = evaluator.run(&FilteredPushRules::default(), None, Some("bob"), None);
    assert_eq!(result.len(), 3);
 }

@@ -566,6 +599,7 @@ fn test_requires_room_version_supports_condition() {
        flags,
        true,
        false,
+        false,
    )
    .unwrap();

@@ -575,6 +609,7 @@ fn test_requires_room_version_supports_condition() {
        &FilteredPushRules::default(),
        Some("@bob:example.org"),
        None,
+        None,
    );
    assert_eq!(result.len(), 3);

@@ -593,7 +628,17 @@ fn test_requires_room_version_supports_condition() {
    };
    let rules = PushRules::new(vec![custom_rule]);
    result = evaluator.run(
-        &FilteredPushRules::py_new(rules, BTreeMap::new(), true, false, true, false, false),
+        &FilteredPushRules::py_new(
+            rules,
+            BTreeMap::new(),
+            true,
+            false,
+            true,
+            false,
+            false,
+            false,
+        ),
+        None,
        None,
        None,
    );
@@ -369,6 +369,10 @@ pub enum KnownCondition {
    RoomVersionSupports {
        feature: Cow<'static, str>,
    },
+    #[serde(rename = "io.element.msc4306.thread_subscription")]
+    Msc4306ThreadSubscription {
+        subscribed: bool,
+    },
 }

 impl<'source> IntoPyObject<'source> for Condition {
@@ -547,11 +551,13 @@ pub struct FilteredPushRules {
    msc3664_enabled: bool,
    msc4028_push_encrypted_events: bool,
    msc4210_enabled: bool,
+    msc4306_enabled: bool,
 }

 #[pymethods]
 impl FilteredPushRules {
    #[new]
+    #[allow(clippy::too_many_arguments)]
    pub fn py_new(
        push_rules: PushRules,
        enabled_map: BTreeMap<String, bool>,
@@ -560,6 +566,7 @@ impl FilteredPushRules {
        msc3664_enabled: bool,
        msc4028_push_encrypted_events: bool,
        msc4210_enabled: bool,
+        msc4306_enabled: bool,
    ) -> Self {
        Self {
            push_rules,
@@ -569,6 +576,7 @@ impl FilteredPushRules {
            msc3664_enabled,
            msc4028_push_encrypted_events,
            msc4210_enabled,
+            msc4306_enabled,
        }
    }

@@ -619,6 +627,10 @@ impl FilteredPushRules {
                    return false;
                }

+                if !self.msc4306_enabled && rule.rule_id.contains("/.io.element.msc4306.rule.") {
+                    return false;
+                }
+
                true
            })
            .map(|r| {
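The filter added to `FilteredPushRules` hides the MSC4306 base rules whenever the feature flag is off, using a substring match on the rule ID. A minimal Python sketch of that idea follows; `filter_rule_ids` and `MSC4306_RULE_MARKER` are hypothetical names, and the rule IDs are the ones that appear elsewhere in this diff.

```python
from typing import Iterable, List

# Substring that identifies the experimental MSC4306 rules in this diff.
MSC4306_RULE_MARKER = "/.io.element.msc4306.rule."


def filter_rule_ids(rule_ids: Iterable[str], msc4306_enabled: bool) -> List[str]:
    """Drop MSC4306 rules unless the feature is enabled; keep everything else."""
    return [
        rule_id
        for rule_id in rule_ids
        if msc4306_enabled or MSC4306_RULE_MARKER not in rule_id
    ]


rules = [
    "global/content/.io.element.msc4306.rule.unsubscribed_thread",
    "global/content/.io.element.msc4306.rule.subscribed_thread",
    "global/underride/.m.rule.call",
]
```

Filtering at this level means that with the flag off the rules are never evaluated at all, independently of the `msc4306_enabled` check inside the evaluator.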
@@ -1,5 +1,5 @@
 $schema: https://element-hq.github.io/synapse/latest/schema/v1/meta.schema.json
-$id: https://element-hq.github.io/synapse/schema/synapse/v1.135/synapse-config.schema.json
+$id: https://element-hq.github.io/synapse/schema/synapse/v1.136/synapse-config.schema.json
 type: object
 properties:
  modules:
@@ -629,6 +629,70 @@ properties:
        password: mypassword
        ssh_priv_key_path: CONFDIR/id_rsa
        ssh_pub_key_path: CONFDIR/id_rsa.pub
+  http_proxy:
+    type: ["string", "null"]
+    description: >-
+      Proxy server to use for HTTP requests.
+
+      For more details, see the [forward proxy documentation](../../setup/forward_proxy.md).
+    examples:
+      - "http://USERNAME:PASSWORD@10.0.1.1:8080/"
+  https_proxy:
+    type: ["string", "null"]
+    description: >-
+      Proxy server to use for HTTPS requests.
+
+      For more details, see the [forward proxy documentation](../../setup/forward_proxy.md).
+    examples:
+      - "http://USERNAME:PASSWORD@proxy.example.com:8080/"
+  no_proxy_hosts:
+    type: array
+    description: >-
+      List of hosts, IP addresses, or IP ranges in CIDR format which should not use the
+      proxy. Synapse will directly connect to these hosts.
+
+      For more details, see the [forward proxy documentation](../../setup/forward_proxy.md).
+    examples:
+      - - master.hostname.example.com
+        - 10.1.0.0/16
+        - 172.30.0.0/16
+  matrix_authentication_service:
+    type: object
+    description: >-
+      The `matrix_authentication_service` setting configures integration with
+      [Matrix Authentication Service (MAS)](https://github.com/element-hq/matrix-authentication-service).
+    properties:
+      enabled:
+        type: boolean
+        description: >-
+          Whether or not to enable the MAS integration. If this is set to
+          `false`, Synapse will use its legacy internal authentication API.
+        default: false
+
+      endpoint:
+        type: string
+        format: uri
+        description: >-
+          The URL where Synapse can reach MAS. This *must* have the `discovery`
+          and `oauth` resources mounted.
+        default: http://localhost:8080
+
+      secret:
+        type: ["string", "null"]
+        description: >-
+          A shared secret that will be used to authenticate requests from and to MAS.
+
+      secret_path:
+        type: ["string", "null"]
+        description: >-
+          Alternative to `secret`, reading the shared secret from a file.
+          The file should be a plain text file, containing only the secret.
+          Synapse reads the secret from the given file once at startup.
+
+    examples:
+      - enabled: true
+        secret: someverysecuresecret
+        endpoint: http://localhost:8080
  dummy_events_threshold:
    type: integer
    description: >-
@@ -2201,6 +2265,17 @@ properties:
    examples:
      - per_second: 2.0
        burst_count: 20.0
+  rc_room_creation:
+    $ref: "#/$defs/rc"
+    description: >-
+      Sets rate limits for how often users are able to create rooms.
+    default:
+      per_user:
+        per_second: 0.016
+        burst_count: 10.0
+    examples:
+      - per_second: 1.0
+        burst_count: 5.0
  federation_rr_transactions_per_room_per_second:
    type: integer
    description: >-
@@ -473,6 +473,10 @@ def section(prop: str, values: dict) -> str:


 def main() -> None:
+    # For Windows: reconfigure the terminal to be UTF-8 for `print()` calls.
+    if sys.platform == "win32":
+        sys.stdout.reconfigure(encoding="utf-8")
+
    def usage(err_msg: str) -> int:
        script_name = (sys.argv[:1] or ["__main__.py"])[0]
        print(err_msg, file=sys.stderr)
@@ -485,7 +489,10 @@ def main() -> None:
            exit(usage("Too many arguments."))
        if not (filepath := (sys.argv[1:] or [""])[0]):
            exit(usage("No schema file provided."))
-        with open(filepath) as f:
+        with open(filepath, "r", encoding="utf-8") as f:
+            # Note: on Windows we must specify the encoding explicitly; otherwise
+            # Python falls back to the locale's code page (e.g. CP-1251), which can
+            # fail to decode the file.
+            # See https://github.com/yaml/pyyaml/issues/123 for more info.
            return yaml.safe_load(f)

    schema = read_json_file_arg()
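The two Windows-related changes in this script both pin an encoding instead of relying on the platform default. The sketch below shows the same pattern using only the standard library. The helper names (`ensure_utf8_stdout`, `read_text_utf8`, `demo`) and the sample file contents are assumptions made for illustration, and plain text is read instead of YAML so the example has no third-party dependency.

```python
import os
import sys
import tempfile


def ensure_utf8_stdout() -> None:
    """On Windows, switch stdout to UTF-8 if the stream supports it."""
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if sys.platform == "win32" and callable(reconfigure):
        reconfigure(encoding="utf-8")


def read_text_utf8(path: str) -> str:
    """Read a file as UTF-8 regardless of the locale's default encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def demo() -> str:
    """Round-trip a line containing non-ASCII characters through a temp file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "schema.yaml")
        with open(path, "w", encoding="utf-8") as f:
            f.write("description: caf\u00e9 \u2192 ok\n")
        return read_text_utf8(path)
```

Without the explicit `encoding` argument, `open()` uses the locale's preferred encoding, so the same file can decode differently (or fail to decode) from one machine to another.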
@@ -23,28 +23,195 @@
 can crop up, e.g. the cache descriptors.
 """

-from typing import Callable, Optional, Tuple, Type, Union
+import enum
+from typing import Callable, Mapping, Optional, Tuple, Type, Union

+import attr
 import mypy.types
 from mypy.erasetype import remove_instance_last_known_values
 from mypy.errorcodes import ErrorCode
-from mypy.nodes import ARG_NAMED_OPT, TempNode, Var
-from mypy.plugin import FunctionSigContext, MethodSigContext, Plugin
+from mypy.nodes import ARG_NAMED_OPT, ListExpr, NameExpr, TempNode, TupleExpr, Var
+from mypy.plugin import (
+    ClassDefContext,
+    Context,
+    FunctionLike,
+    FunctionSigContext,
+    MethodSigContext,
+    MypyFile,
+    Plugin,
+)
 from mypy.typeops import bind_self
 from mypy.types import (
    AnyType,
    CallableType,
    Instance,
    NoneType,
+    Options,
    TupleType,
    TypeAliasType,
    TypeVarType,
    UninhabitedType,
    UnionType,
 )
+from mypy_zope import plugin as mypy_zope_plugin
+from pydantic.mypy import plugin as mypy_pydantic_plugin
+
+PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL = ErrorCode(
+    "missing-server-name-label",
+    "`SERVER_NAME_LABEL` required in metric",
+    category="per-homeserver-tenant-metrics",
+)
+
+PROMETHEUS_METRIC_MISSING_FROM_LIST_TO_CHECK = ErrorCode(
+    "metric-type-missing-from-list",
+    "Every Prometheus metric type must be included in the `prometheus_metric_fullname_to_label_arg_map`.",
+    category="per-homeserver-tenant-metrics",
+)
+
+
+class Sentinel(enum.Enum):
+    # defining a sentinel in this way allows mypy to correctly handle the
+    # type of a dictionary lookup and subsequent type narrowing.
+    UNSET_SENTINEL = object()
+
+
+@attr.s(auto_attribs=True)
+class ArgLocation:
+    keyword_name: str
+    """
+    The keyword argument name for this argument
+    """
+    position: int
+    """
+    The 0-based positional index of this argument
+    """
+
+
+prometheus_metric_fullname_to_label_arg_map: Mapping[str, Optional[ArgLocation]] = {
+    # `Collector` subclasses:
+    "prometheus_client.metrics.MetricWrapperBase": ArgLocation("labelnames", 2),
+    "prometheus_client.metrics.Counter": ArgLocation("labelnames", 2),
+    "prometheus_client.metrics.Histogram": ArgLocation("labelnames", 2),
+    "prometheus_client.metrics.Gauge": ArgLocation("labelnames", 2),
+    "prometheus_client.metrics.Summary": ArgLocation("labelnames", 2),
+    "prometheus_client.metrics.Info": ArgLocation("labelnames", 2),
+    "prometheus_client.metrics.Enum": ArgLocation("labelnames", 2),
+    "synapse.metrics.LaterGauge": ArgLocation("labelnames", 2),
+    "synapse.metrics.InFlightGauge": ArgLocation("labels", 2),
+    "synapse.metrics.GaugeBucketCollector": ArgLocation("labelnames", 2),
+    "prometheus_client.registry.Collector": None,
+    "prometheus_client.registry._EmptyCollector": None,
+    "prometheus_client.registry.CollectorRegistry": None,
+    "prometheus_client.process_collector.ProcessCollector": None,
+    "prometheus_client.platform_collector.PlatformCollector": None,
+    "prometheus_client.gc_collector.GCCollector": None,
+    "synapse.metrics._gc.GCCounts": None,
+    "synapse.metrics._gc.PyPyGCStats": None,
+    "synapse.metrics._reactor_metrics.ReactorLastSeenMetric": None,
+    "synapse.metrics.CPUMetrics": None,
+    "synapse.metrics.jemalloc.JemallocCollector": None,
+    "synapse.util.metrics.DynamicCollectorRegistry": None,
+    "synapse.metrics.background_process_metrics._Collector": None,
+    #
+    # `Metric` subclasses:
+    "prometheus_client.metrics_core.Metric": None,
+    "prometheus_client.metrics_core.UnknownMetricFamily": ArgLocation("labels", 3),
+    "prometheus_client.metrics_core.CounterMetricFamily": ArgLocation("labels", 3),
+    "prometheus_client.metrics_core.GaugeMetricFamily": ArgLocation("labels", 3),
+    "prometheus_client.metrics_core.SummaryMetricFamily": ArgLocation("labels", 3),
+    "prometheus_client.metrics_core.InfoMetricFamily": ArgLocation("labels", 3),
+    "prometheus_client.metrics_core.HistogramMetricFamily": ArgLocation("labels", 3),
+    "prometheus_client.metrics_core.GaugeHistogramMetricFamily": ArgLocation(
+        "labels", 4
+    ),
+    "prometheus_client.metrics_core.StateSetMetricFamily": ArgLocation("labels", 3),
+    "synapse.metrics.GaugeHistogramMetricFamilyWithLabels": ArgLocation(
+        "labelnames", 4
+    ),
+}
+"""
+Map from the fullname of the Prometheus `Metric`/`Collector` classes to the keyword
+argument name and positional index of the label names. This map is useful because
+different metrics have different signatures for passing in label names and we just need
+to know where to look.
+
+This map should include any metrics that we collect with Prometheus, which corresponds
+to anything that inherits from `prometheus_client.registry.Collector`
+(`synapse.metrics._types.Collector`) or `prometheus_client.metrics_core.Metric`. The
+exhaustiveness of this list is enforced by `analyze_prometheus_metric_classes`.
+
+The entries with `None` always fail the lint because they don't have a `labelnames`
+argument (therefore, no `SERVER_NAME_LABEL`), but we include them here so that people
+can notice and manually allow them via a type ignore comment, as the source of truth
+should be in the source code.
+"""
+
+# Unbound at this point because we don't know the mypy version yet.
+# This is set in the `plugin(...)` function below.
+MypyPydanticPluginClass: Type[Plugin]
+MypyZopePluginClass: Type[Plugin]


 class SynapsePlugin(Plugin):
+    def __init__(self, options: Options):
+        super().__init__(options)
+        self.mypy_pydantic_plugin = MypyPydanticPluginClass(options)
+        self.mypy_zope_plugin = MypyZopePluginClass(options)
+
+    def set_modules(self, modules: dict[str, MypyFile]) -> None:
+        """
+        This is called by mypy internals. We have to override this to ensure it's also
+        called for any other plugins that we're manually handling.
+
+        Here is how mypy describes it:
+
+        > [`self._modules`] can't be set in `__init__` because it is executed too soon
+        > in `build.py`. Therefore, `build.py` *must* set it later before graph processing
+        > starts by calling `set_modules()`.
+        """
+        super().set_modules(modules)
+        self.mypy_pydantic_plugin.set_modules(modules)
+        self.mypy_zope_plugin.set_modules(modules)
+
+    def get_base_class_hook(
+        self, fullname: str
+    ) -> Optional[Callable[[ClassDefContext], None]]:
+        def _get_base_class_hook(ctx: ClassDefContext) -> None:
+            # Run any `get_base_class_hook` checks from other plugins first.
+            #
+            # Unfortunately, mypy only chooses the first plugin that returns a
+            # non-None value (known limitation, cf.
+            # https://github.com/python/mypy/issues/19524). We work around this by
+            # putting our custom plugin first in the plugin order and then calling the
+            # other plugins' hooks manually, followed by our own checks.
+            if callback := self.mypy_pydantic_plugin.get_base_class_hook(fullname):
+                callback(ctx)
+            if callback := self.mypy_zope_plugin.get_base_class_hook(fullname):
+                callback(ctx)
+
+            # Now run our own checks
+            analyze_prometheus_metric_classes(ctx)
+
+        return _get_base_class_hook
+
+    def get_function_signature_hook(
+        self, fullname: str
+    ) -> Optional[Callable[[FunctionSigContext], FunctionLike]]:
+        # Strip off the unique identifier for classes that are dynamically created inside
+        # functions. ex. `synapse.metrics.jemalloc.JemallocCollector@185` (this is the line
+        # number)
+        if "@" in fullname:
+            fullname = fullname.split("@", 1)[0]
+
+        # Look for any Prometheus metrics to make sure they have the `SERVER_NAME_LABEL`
+        # label.
+        if fullname in prometheus_metric_fullname_to_label_arg_map.keys():
+            # Because it's difficult to determine the `fullname` of the function in the
+            # callback, let's just pass it in while we have it.
+            return lambda ctx: check_prometheus_metric_instantiation(ctx, fullname)
+
+        return None
+
    def get_method_signature_hook(
        self, fullname: str
    ) -> Optional[Callable[[MethodSigContext], CallableType]]:
@@ -65,6 +232,157 @@ class SynapsePlugin(Plugin):
        return None


+def analyze_prometheus_metric_classes(ctx: ClassDefContext) -> None:
+    """
+    Cross-check the list of Prometheus metric classes against the
+    `prometheus_metric_fullname_to_label_arg_map` to ensure the list is exhaustive and
+    up-to-date.
+    """
+
+    fullname = ctx.cls.fullname
+    # Strip off the unique identifier for classes that are dynamically created inside
+    # functions. ex. `synapse.metrics.jemalloc.JemallocCollector@185` (this is the line
+    # number)
+    if "@" in fullname:
+        fullname = fullname.split("@", 1)[0]
+
+    if any(
+        ancestor_type.fullname
+        in (
+            # All of the Prometheus metric classes inherit from the `Collector`.
+            "prometheus_client.registry.Collector",
+            "synapse.metrics._types.Collector",
+            # And custom metrics that inherit from `Metric`.
+            "prometheus_client.metrics_core.Metric",
+        )
+        for ancestor_type in ctx.cls.info.mro
+    ):
+        if fullname not in prometheus_metric_fullname_to_label_arg_map:
+            ctx.api.fail(
+                f"Expected {fullname} to be in `prometheus_metric_fullname_to_label_arg_map`, "
+                f"but it was not found. This is a problem with our custom mypy plugin. "
+                f"Please add it to the map.",
+                Context(),
+                code=PROMETHEUS_METRIC_MISSING_FROM_LIST_TO_CHECK,
+            )
+
+
+def check_prometheus_metric_instantiation(
+    ctx: FunctionSigContext, fullname: str
+) -> CallableType:
+    """
+    Ensure that the `prometheus_client` metrics include the `SERVER_NAME_LABEL` label
+    when instantiated.
+
+    This is important because we support multiple Synapse instances running in the same
+    process, where all metrics share a single global `REGISTRY`. The `server_name` label
+    ensures metrics are correctly separated by homeserver.
+
+    There are also some metrics that apply at the process level, such as CPU usage,
+    Python garbage collection, and Twisted reactor tick time, which shouldn't have the
+    `SERVER_NAME_LABEL`. In those cases, use a type ignore comment to disable the
+    check, e.g. `# type: ignore[missing-server-name-label]`.
+
+    Args:
+        ctx: The `FunctionSigContext` from mypy.
+        fullname: The fully qualified name of the function being called,
+            e.g. `"prometheus_client.metrics.Counter"`
+    """
+    # The true signature. It isn't modified here, so this is what will be returned.
+    signature = ctx.default_signature
+
+    # Find where the label names argument is in the function signature.
+    arg_location = prometheus_metric_fullname_to_label_arg_map.get(
+        fullname, Sentinel.UNSET_SENTINEL
+    )
+    assert arg_location is not Sentinel.UNSET_SENTINEL, (
+        f"Expected to find {fullname} in `prometheus_metric_fullname_to_label_arg_map`, "
+        f"but it was not found. This is a problem with our custom mypy plugin. "
+        f"Please add it to the map. Context: {ctx.context}"
+    )
+    # People should be using `# type: ignore[missing-server-name-label]` for
+    # process-level metrics that should not have the `SERVER_NAME_LABEL`.
+    if arg_location is None:
+        ctx.api.fail(
+            f"{signature.name} does not have a `labelnames`/`labels` argument "
+            "(if this is untrue, update `prometheus_metric_fullname_to_label_arg_map` "
+            "in our custom mypy plugin) and should probably have a type ignore comment, "
+            "e.g. `# type: ignore[missing-server-name-label]`. The reason we don't "
+            "automatically ignore this is that the source of truth should be in the source code.",
+            ctx.context,
+            code=PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL,
+        )
+        return signature
+
+    # Sanity check the arguments are still as expected in this version of
+    # `prometheus_client`. ex. `Counter(name, documentation, labelnames, ...)`
+    #
+    # `signature.arg_names` should be: ["name", "documentation", "labelnames", ...]
+    if (
+        len(signature.arg_names) < (arg_location.position + 1)
+        or signature.arg_names[arg_location.position] != arg_location.keyword_name
+    ):
+        ctx.api.fail(
+            f"Expected argument number {arg_location.position + 1} of {signature.name} to be `{arg_location.keyword_name}`, "
+            f"but the signature's arguments are {signature.arg_names}",
+            ctx.context,
+        )
+        return signature
+
+    # Ensure mypy is passing the correct number of arguments because we are doing some
+    # dirty indexing into `ctx.args` later on.
+    assert len(ctx.args) == len(signature.arg_names), (
+        f"Expected the list of arguments in the {signature.name} signature ({len(signature.arg_names)}) "
+        f"to match the number of arguments from the function signature context ({len(ctx.args)})"
+    )
+
+    # Check if the `labelnames` argument includes `SERVER_NAME_LABEL`
+    #
+    # `ctx.args` should look like this:
+    # ```
+    # [
+    #     [StrExpr("name")],
+    #     [StrExpr("documentation")],
+    #     [ListExpr([StrExpr("label1"), StrExpr("label2")])],
+    #     ...
+    # ]
+    # ```
+    labelnames_arg_expression = (
+        ctx.args[arg_location.position][0]
+        if len(ctx.args[arg_location.position]) > 0
+        else None
+    )
+    if isinstance(labelnames_arg_expression, (ListExpr, TupleExpr)):
+        # Check if the `labelnames` argument includes the `server_name` label (`SERVER_NAME_LABEL`).
+        for labelname_expression in labelnames_arg_expression.items:
+            if (
+                isinstance(labelname_expression, NameExpr)
+                and labelname_expression.fullname == "synapse.metrics.SERVER_NAME_LABEL"
+            ):
+                # Found the `SERVER_NAME_LABEL`, all good!
+                break
+        else:
+            ctx.api.fail(
+                f"Expected {signature.name} to include `SERVER_NAME_LABEL` in the list of labels. "
+                "If this is a process-level metric (vs homeserver-level), use a type ignore comment "
+                "to disable this check.",
+                ctx.context,
+                code=PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL,
+            )
+    else:
+        ctx.api.fail(
+            f"Expected the `labelnames` argument of {signature.name} to be a list of label names "
+            f"(including `SERVER_NAME_LABEL`), but got {labelnames_arg_expression}. "
+            "If this is a process-level metric (vs homeserver-level), use a type ignore comment "
+            "to disable this check.",
+            ctx.context,
+            code=PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL,
+        )
+        return signature
+
+    return signature
+
+
 def _get_true_return_type(signature: CallableType) -> mypy.types.Type:
    """
    Get the "final" return type of a callable which might return an Awaitable/Deferred.
@@ -372,10 +690,13 @@ def is_cacheable(


 def plugin(version: str) -> Type[SynapsePlugin]:
+    global MypyPydanticPluginClass, MypyZopePluginClass
    # This is the entry point of the plugin, and lets us deal with the fact
    # that the mypy plugin interface is *not* stable by looking at the version
    # string.
    #
    # However, since we pin the version of mypy Synapse uses in CI, we don't
    # really care.
+    MypyPydanticPluginClass = mypy_pydantic_plugin(version)
+    MypyZopePluginClass = mypy_zope_plugin(version)
    return SynapsePlugin
@@ -45,16 +45,6 @@ if py_version < (3, 9):

 # Allow using the asyncio reactor via env var.
 if strtobool(os.environ.get("SYNAPSE_ASYNC_IO_REACTOR", "0")):
-    from incremental import Version
-
-    import twisted
-
-    # We need a bugfix that is included in Twisted 21.2.0:
-    # https://twistedmatrix.com/trac/ticket/9787
-    if twisted.version < Version("Twisted", 21, 2, 0):
-        print("Using asyncio reactor requires Twisted>=21.2.0")
-        sys.exit(1)
-
    import asyncio

    from twisted.internet import asyncioreactor
@@ -34,9 +34,11 @@ HAS_PYDANTIC_V2: bool = Version(pydantic_version).major == 2

 if TYPE_CHECKING or HAS_PYDANTIC_V2:
    from pydantic.v1 import (
+        AnyHttpUrl,
        BaseModel,
        Extra,
        Field,
+        FilePath,
        MissingError,
        PydanticValueError,
        StrictBool,
@@ -55,9 +57,11 @@ if TYPE_CHECKING or HAS_PYDANTIC_V2:
    from pydantic.v1.typing import get_args
 else:
    from pydantic import (
+        AnyHttpUrl,
        BaseModel,
        Extra,
        Field,
+        FilePath,
        MissingError,
        PydanticValueError,
        StrictBool,
@@ -77,6 +81,7 @@ else:

 __all__ = (
    "HAS_PYDANTIC_V2",
+    "AnyHttpUrl",
    "BaseModel",
    "constr",
    "conbytes",
@@ -85,6 +90,7 @@ __all__ = (
    "ErrorWrapper",
    "Extra",
    "Field",
+    "FilePath",
    "get_args",
    "MissingError",
    "parse_obj_as",
@@ -29,19 +29,21 @@ import attr

 from synapse.config._base import (
    Config,
+    ConfigError,
    RootConfig,
    find_config_files,
    read_config_files,
 )
 from synapse.config.database import DatabaseConfig
+from synapse.config.server import ServerConfig
 from synapse.storage.database import DatabasePool, LoggingTransaction, make_conn
 from synapse.storage.engines import create_engine


 class ReviewConfig(RootConfig):
-    "A config class that just pulls out the database config"
+    "A config class that just pulls out the server and database config"

-    config_classes = [DatabaseConfig]
+    config_classes = [ServerConfig, DatabaseConfig]


@attr.s(auto_attribs=True)
@@ -148,6 +150,10 @@ def main() -> None:
    config_dict = read_config_files(config_files)
    config.parse_config_dict(config_dict, "", "")

+    server_name = config.server.server_name
+    if not isinstance(server_name, str):
+        raise ConfigError("Must be a string", ("server_name",))
+
    since_ms = time.time() * 1000 - Config.parse_duration(config_args.since)
    exclude_users_with_email = config_args.exclude_emails
    exclude_users_with_appservice = config_args.exclude_app_service
@@ -159,7 +165,12 @@ def main() -> None:

    engine = create_engine(database_config.config)

-    with make_conn(database_config, engine, "review_recent_signups") as db_conn:
+    with make_conn(
+        db_config=database_config,
+        engine=engine,
+        default_txn_name="review_recent_signups",
+        server_name=server_name,
+    ) as db_conn:
        # This generates a type of Cursor, not LoggingTransaction.
        user_infos = get_recent_users(
            db_conn.cursor(),
@@ -672,8 +672,14 @@ class Porter:
        engine = create_engine(db_config.config)

        hs = MockHomeserver(self.hs_config)
+        server_name = hs.hostname

-        with make_conn(db_config, engine, "portdb") as db_conn:
+        with make_conn(
+            db_config=db_config,
+            engine=engine,
+            default_txn_name="portdb",
+            server_name=server_name,
+        ) as db_conn:
            engine.check_database(
                db_conn, allow_outdated_version=allow_outdated_version
            )
@@ -53,6 +53,7 @@ class MockHomeserver(HomeServer):


 def run_background_updates(hs: HomeServer) -> None:
+    server_name = hs.hostname
    main = hs.get_datastores().main
    state = hs.get_datastores().state

@@ -66,7 +67,11 @@ def run_background_updates(hs: HomeServer) -> None:
    def run() -> None:
        # Apply all background updates on the database.
        defer.ensureDeferred(
-            run_as_background_process("background_updates", run_background_updates)
+            run_as_background_process(
+                "background_updates",
+                server_name,
+                run_background_updates,
+            )
        )

    reactor.callWhenRunning(run)
@@ -20,10 +20,13 @@
 #
 from typing import TYPE_CHECKING, Optional, Protocol, Tuple

+from prometheus_client import Histogram
+
 from twisted.web.server import Request

 from synapse.appservice import ApplicationService
 from synapse.http.site import SynapseRequest
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import Requester

 if TYPE_CHECKING:
@@ -33,6 +36,13 @@ if TYPE_CHECKING:
 GUEST_DEVICE_ID = "guest_device"


+introspection_response_timer = Histogram(
+    "synapse_api_auth_delegated_introspection_response",
+    "Time taken to get a response for an introspection request",
+    labelnames=["code", SERVER_NAME_LABEL],
+)
+
+
 class Auth(Protocol):
    """The interface that an auth provider must implement."""

@@ -0,0 +1,432 @@
+#
+# This file is licensed under the Affero General Public License (AGPL) version 3.
+#
+# Copyright (C) 2025 New Vector, Ltd
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Affero General Public License as
+# published by the Free Software Foundation, either version 3 of the
+# License, or (at your option) any later version.
+#
+# See the GNU Affero General Public License for more details:
+# <https://www.gnu.org/licenses/agpl-3.0.html>.
+#
+#
+import logging
+from typing import TYPE_CHECKING, Optional
+from urllib.parse import urlencode
+
+from synapse._pydantic_compat import (
+    BaseModel,
+    Extra,
+    StrictBool,
+    StrictInt,
+    StrictStr,
+    ValidationError,
+)
+from synapse.api.auth.base import BaseAuth
+from synapse.api.errors import (
+    AuthError,
+    HttpResponseException,
+    InvalidClientTokenError,
+    SynapseError,
+    UnrecognizedRequestError,
+)
+from synapse.http.site import SynapseRequest
+from synapse.logging.context import PreserveLoggingContext
+from synapse.logging.opentracing import (
+    active_span,
+    force_tracing,
+    inject_request_headers,
+    start_active_span,
+)
+from synapse.metrics import SERVER_NAME_LABEL
+from synapse.synapse_rust.http_client import HttpClient
+from synapse.types import JsonDict, Requester, UserID, create_requester
+from synapse.util import json_decoder
+from synapse.util.caches.cached_call import RetryOnExceptionCachedCall
+from synapse.util.caches.response_cache import ResponseCache, ResponseCacheContext
+
+from . import introspection_response_timer
+
+if TYPE_CHECKING:
+    from synapse.rest.admin.experimental_features import ExperimentalFeature
+    from synapse.server import HomeServer
+
+logger = logging.getLogger(__name__)
+
+# Scope as defined by MSC2967
+# https://github.com/matrix-org/matrix-spec-proposals/pull/2967
+SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
+SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
+
+
+class ServerMetadata(BaseModel):
+    class Config:
+        extra = Extra.allow
+
+    issuer: StrictStr
+    account_management_uri: StrictStr
+
+
+class IntrospectionResponse(BaseModel):
+    retrieved_at_ms: StrictInt
+    active: StrictBool
+    scope: Optional[StrictStr]
+    username: Optional[StrictStr]
+    sub: Optional[StrictStr]
+    device_id: Optional[StrictStr]
+    expires_in: Optional[StrictInt]
+
+    class Config:
+        extra = Extra.allow
+
+    def get_scope_set(self) -> set[str]:
+        if not self.scope:
+            return set()
+
+        return {token for token in self.scope.split(" ") if token}
+
+    def is_active(self, now_ms: int) -> bool:
+        if not self.active:
+            return False
+
+        # Compatibility tokens don't expire and don't have an 'expires_in' field
+        if self.expires_in is None:
+            return True
+
+        absolute_expiry_ms = self.expires_in * 1000 + self.retrieved_at_ms
+        return now_ms < absolute_expiry_ms
+
+
+class MasDelegatedAuth(BaseAuth):
+    def __init__(self, hs: "HomeServer"):
+        super().__init__(hs)
+
+        self.server_name = hs.hostname
+        self._clock = hs.get_clock()
+        self._config = hs.config.mas
+
+        self._http_client = hs.get_proxied_http_client()
+        self._rust_http_client = HttpClient(
+            reactor=hs.get_reactor(),
+            user_agent=self._http_client.user_agent.decode("utf8"),
+        )
+        self._server_metadata = RetryOnExceptionCachedCall[ServerMetadata](
+            self._load_metadata
+        )
+        self._force_tracing_for_users = hs.config.tracing.force_tracing_for_users
+
+        # # Token Introspection Cache
+        # This remembers what users/devices are represented by which access tokens,
+        # in order to reduce overall system load:
+        # - on Synapse (as requests are relatively expensive)
+        # - on the network
+        # - on MAS
+        #
+        # Since there is no invalidation mechanism currently,
+        # the entries expire after 2 minutes.
+        # This does mean tokens can be treated as valid by Synapse
+        # for longer than reality.
+        #
+        # Ideally, tokens should logically be invalidated in the following circumstances:
+        # - If a session logout happens.
+        #   In this case, MAS will delete the device within Synapse
+        #   anyway and this is good enough as an invalidation.
+        # - If the client refreshes their token in MAS.
+        #   In this case, the device still exists and it's not the end of the world for
+        #   the old access token to continue working for a short time.
+        self._introspection_cache: ResponseCache[str] = ResponseCache(
+            clock=self._clock,
+            name="mas_token_introspection",
+            server_name=self.server_name,
+            timeout_ms=120_000,
+            # don't log because the keys are access tokens
+            enable_logging=False,
+        )
+
+    @property
+    def _metadata_url(self) -> str:
+        return f"{self._config.endpoint.rstrip('/')}/.well-known/openid-configuration"
+
+    @property
+    def _introspection_endpoint(self) -> str:
+        return f"{self._config.endpoint.rstrip('/')}/oauth2/introspect"
+
+    async def _load_metadata(self) -> ServerMetadata:
+        response = await self._http_client.get_json(self._metadata_url)
+        metadata = ServerMetadata(**response)
+        return metadata
+
+    async def issuer(self) -> str:
+        metadata = await self._server_metadata.get()
+        return metadata.issuer
+
+    async def account_management_url(self) -> str:
+        metadata = await self._server_metadata.get()
+        return metadata.account_management_uri
+
+    async def auth_metadata(self) -> JsonDict:
+        metadata = await self._server_metadata.get()
+        return metadata.dict()
+
+    def is_request_using_the_shared_secret(self, request: SynapseRequest) -> bool:
+        """
+        Check if the request is using the shared secret.
+
+        Args:
+            request: The request to check.
+
+        Returns:
+            True if the request is using the shared secret, False otherwise.
+        """
+        access_token = self.get_access_token_from_request(request)
+        shared_secret = self._config.secret()
+        if not shared_secret:
+            return False
+
+        return access_token == shared_secret
+
+    async def _introspect_token(
+        self, token: str, cache_context: ResponseCacheContext[str]
+    ) -> IntrospectionResponse:
+        """
+        Send a token to the introspection endpoint and return the introspection response.
+
+        Parameters:
+            token: The token to introspect
+
+        Raises:
+            HttpResponseException: If the introspection endpoint returns a non-2xx response
+            ValueError: If the introspection endpoint returns an invalid JSON response
+            JSONDecodeError: If the introspection endpoint returns a non-JSON response
+            Exception: If the HTTP request fails
+
+        Returns:
+            The introspection response
+        """
+
+        # By default, we shouldn't cache the result unless we know it's valid
+        cache_context.should_cache = False
+        raw_headers: dict[str, str] = {
+            "Content-Type": "application/x-www-form-urlencoded",
+            "Accept": "application/json",
+            "Authorization": f"Bearer {self._config.secret()}",
+            # Tell MAS that we support reading the device ID as an explicit
+            # value, not encoded in the scope. This is supported by MAS 0.15+
+            "X-MAS-Supports-Device-Id": "1",
+        }
+
+        args = {"token": token, "token_type_hint": "access_token"}
+        body = urlencode(args, True)
+
+        # Do the actual request
+
+        logger.debug("Fetching token from MAS")
+        start_time = self._clock.time()
+        try:
+            with start_active_span("mas-introspect-token"):
+                inject_request_headers(raw_headers)
+                with PreserveLoggingContext():
+                    resp_body = await self._rust_http_client.post(
+                        url=self._introspection_endpoint,
+                        response_limit=1 * 1024 * 1024,
+                        headers=raw_headers,
+                        request_body=body,
+                    )
+        except HttpResponseException as e:
+            end_time = self._clock.time()
+            introspection_response_timer.labels(
+                code=e.code, **{SERVER_NAME_LABEL: self.server_name}
+            ).observe(end_time - start_time)
+            raise
+        except Exception:
+            end_time = self._clock.time()
+            introspection_response_timer.labels(
+                code="ERR", **{SERVER_NAME_LABEL: self.server_name}
+            ).observe(end_time - start_time)
+            raise
+
+        logger.debug("Fetched token from MAS")
+
+        end_time = self._clock.time()
+        introspection_response_timer.labels(
+            code=200, **{SERVER_NAME_LABEL: self.server_name}
+        ).observe(end_time - start_time)
+
+        raw_response = json_decoder.decode(resp_body.decode("utf-8"))
+        try:
+            response = IntrospectionResponse(
+                retrieved_at_ms=self._clock.time_msec(),
+                **raw_response,
+            )
+        except ValidationError as e:
+            raise ValueError(
+                "The introspection endpoint returned an invalid JSON response"
+            ) from e
+
+        # We had a valid response, so we can cache it
+        cache_context.should_cache = True
+        return response
+
+    async def is_server_admin(self, requester: Requester) -> bool:
+        return "urn:synapse:admin:*" in requester.scope
+
+    async def get_user_by_req(
+        self,
+        request: SynapseRequest,
+        allow_guest: bool = False,
+        allow_expired: bool = False,
+        allow_locked: bool = False,
+    ) -> Requester:
+        parent_span = active_span()
+        with start_active_span("get_user_by_req"):
+            access_token = self.get_access_token_from_request(request)
+
+            requester = await self.get_appservice_user(request, access_token)
+            if not requester:
+                requester = await self.get_user_by_access_token(
+                    token=access_token,
+                    allow_expired=allow_expired,
+                )
+
+            await self._record_request(request, requester)
+
+            request.requester = requester
+
+            if parent_span:
+                if requester.authenticated_entity in self._force_tracing_for_users:
+                    # request tracing is enabled for this user, so we need to force
+                    # tracing on for the parent span (which will be the servlet span).
+                    #
+                    # It's too late for the get_user_by_req span to inherit the setting,
+                    # so we also force it on for that.
+                    force_tracing()
+                    force_tracing(parent_span)
+                parent_span.set_tag(
+                    "authenticated_entity", requester.authenticated_entity
+                )
+                parent_span.set_tag("user_id", requester.user.to_string())
+                if requester.device_id is not None:
+                    parent_span.set_tag("device_id", requester.device_id)
+                if requester.app_service is not None:
+                    parent_span.set_tag("appservice_id", requester.app_service.id)
+            return requester
+
+    async def get_user_by_access_token(
+        self,
+        token: str,
+        allow_expired: bool = False,
+    ) -> Requester:
+        try:
+            introspection_result = await self._introspection_cache.wrap(
+                token, self._introspect_token, token, cache_context=True
+            )
+        except Exception:
+            logger.exception("Failed to introspect token")
+            raise SynapseError(503, "Unable to introspect the access token")
+
+        logger.debug("Introspection result: %r", introspection_result)
+        if not introspection_result.is_active(self._clock.time_msec()):
+            raise InvalidClientTokenError("Token is not active")
+
+        # Let's look at the scope
+        scope = introspection_result.get_scope_set()
+
+        # Determine type of user based on presence of particular scopes
+        if SCOPE_MATRIX_API not in scope:
+            raise InvalidClientTokenError(
+                "Token doesn't grant access to the Matrix C-S API"
+            )
+
+        if introspection_result.username is None:
+            raise AuthError(
+                500,
+                "Invalid username claim in the introspection result",
+            )
+
+        user_id = UserID(
+            localpart=introspection_result.username,
+            domain=self.server_name,
+        )
+
+        # Try to find a user from the username claim
+        user_info = await self.store.get_user_by_id(user_id=user_id.to_string())
+        if user_info is None:
+            raise AuthError(
+                500,
+                "User not found",
+            )
+
+        # MAS will give us the device ID as an explicit value for *compatibility* sessions.
+        # If present, we get it from here; if not, we get it from the scope for next-gen sessions.
+        device_id = introspection_result.device_id
+        if device_id is None:
+            # Find device_ids in scope
+            # We only allow a single device_id in the scope, so we find them all in the
+            # scope list, and raise if there is more than one. The OIDC server should be
+            # the one enforcing valid scopes, so we raise a 500 if we find an invalid scope.
+            device_ids = [
+                tok[len(SCOPE_MATRIX_DEVICE_PREFIX) :]
+                for tok in scope
+                if tok.startswith(SCOPE_MATRIX_DEVICE_PREFIX)
+            ]
+
+            if len(device_ids) > 1:
+                raise AuthError(
+                    500,
+                    "Multiple device IDs in scope",
+                )
+
+            device_id = device_ids[0] if device_ids else None
+
+        if device_id is not None:
+            # Sanity check the device_id
+            if len(device_id) > 255 or len(device_id) < 1:
+                raise AuthError(
+                    500,
+                    "Invalid device ID in introspection result",
+                )
+
+            # Make sure the device exists. This helps with introspection cache
+            # invalidation: if we log out, the device gets deleted by MAS
+            device = await self.store.get_device(
+                user_id=user_id.to_string(),
+                device_id=device_id,
+            )
+            if device is None:
+                # Invalidate the introspection cache, the device was deleted
+                self._introspection_cache.unset(token)
+                raise InvalidClientTokenError("Token is not active")
+
+        return create_requester(
+            user_id=user_id,
+            device_id=device_id,
+            scope=scope,
+        )
+
+    async def get_user_by_req_experimental_feature(
+        self,
+        request: SynapseRequest,
+        feature: "ExperimentalFeature",
+        allow_guest: bool = False,
+        allow_expired: bool = False,
+        allow_locked: bool = False,
+    ) -> Requester:
+        try:
+            requester = await self.get_user_by_req(
+                request,
+                allow_guest=allow_guest,
+                allow_expired=allow_expired,
+                allow_locked=allow_locked,
+            )
+            if await self.store.is_feature_enabled(requester.user.to_string(), feature):
+                return requester
+
+            raise UnrecognizedRequestError(code=404)
+        except (AuthError, InvalidClientTokenError):
+            if feature.is_globally_enabled(self.hs.config):
+                # If it's globally enabled then return the auth error
+                raise
+
+            raise UnrecognizedRequestError(code=404)
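
Note on the scope handling in the new file above: when the introspection result carries no explicit device ID, the device ID is taken from the token scope. The following is a minimal standalone sketch of that step, not the code from this commit. The function name `device_id_from_scope` and the use of `ValueError` are hypothetical, and the value of `SCOPE_MATRIX_DEVICE_PREFIX` is an assumption, since its definition is not shown in this diff.

```python
from typing import Optional, Set

# Assumed prefix for device scopes (MSC2967 style); the real constant is
# defined in Synapse and is not shown in this diff.
SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"


def device_id_from_scope(scope: Set[str]) -> Optional[str]:
    """Return the single device ID carried in `scope`, or None if there is none.

    Hypothetical helper: raises ValueError where the diff raises AuthError(500).
    """
    # Collect every scope token that carries a device ID.
    device_ids = [
        tok[len(SCOPE_MATRIX_DEVICE_PREFIX):]
        for tok in scope
        if tok.startswith(SCOPE_MATRIX_DEVICE_PREFIX)
    ]

    # Only a single device ID is allowed per token.
    if len(device_ids) > 1:
        raise ValueError("Multiple device IDs in scope")

    device_id = device_ids[0] if device_ids else None

    # Sanity check the length, mirroring the 1..255 bound in the diff.
    if device_id is not None and not (1 <= len(device_id) <= 255):
        raise ValueError("Invalid device ID in introspection result")

    return device_id
```

A token whose scope contains only the API scope yields `None`, which matches the diff's behaviour of creating a requester without a device.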
@@ -28,7 +28,6 @@ from authlib.oauth2.auth import encode_client_secret_basic, encode_client_secret
 from authlib.oauth2.rfc7523 import ClientSecretJWT, PrivateKeyJWT, private_key_jwt_sign
 from authlib.oauth2.rfc7662 import IntrospectionToken
 from authlib.oidc.discovery import OpenIDProviderMetadata, get_well_known_url
-from prometheus_client import Histogram

 from synapse.api.auth.base import BaseAuth
 from synapse.api.errors import (
@@ -47,25 +46,21 @@ from synapse.logging.opentracing import (
    inject_request_headers,
    start_active_span,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.synapse_rust.http_client import HttpClient
 from synapse.types import Requester, UserID, create_requester
 from synapse.util import json_decoder
 from synapse.util.caches.cached_call import RetryOnExceptionCachedCall
 from synapse.util.caches.response_cache import ResponseCache, ResponseCacheContext

+from . import introspection_response_timer
+
 if TYPE_CHECKING:
    from synapse.rest.admin.experimental_features import ExperimentalFeature
    from synapse.server import HomeServer

 logger = logging.getLogger(__name__)

-introspection_response_timer = Histogram(
-    "synapse_api_auth_delegated_introspection_response",
-    "Time taken to get a response for an introspection request",
-    ["code"],
-)
-
-
 # Scope as defined by MSC2967
 # https://github.com/matrix-org/matrix-spec-proposals/pull/2967
 SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
@@ -341,17 +336,23 @@ class MSC3861DelegatedAuth(BaseAuth):
                    )
        except HttpResponseException as e:
            end_time = self._clock.time()
-            introspection_response_timer.labels(e.code).observe(end_time - start_time)
+            introspection_response_timer.labels(
+                code=e.code, **{SERVER_NAME_LABEL: self.server_name}
+            ).observe(end_time - start_time)
            raise
        except Exception:
            end_time = self._clock.time()
-            introspection_response_timer.labels("ERR").observe(end_time - start_time)
+            introspection_response_timer.labels(
+                code="ERR", **{SERVER_NAME_LABEL: self.server_name}
+            ).observe(end_time - start_time)
            raise

        logger.debug("Fetched token from MAS")

        end_time = self._clock.time()
-        introspection_response_timer.labels(200).observe(end_time - start_time)
+        introspection_response_timer.labels(
+            code=200, **{SERVER_NAME_LABEL: self.server_name}
+        ).observe(end_time - start_time)

        resp = json_decoder.decode(resp_body.decode("utf-8"))

@@ -140,6 +140,12 @@ class Codes(str, Enum):
    # Part of MSC4155
    INVITE_BLOCKED = "ORG.MATRIX.MSC4155.M_INVITE_BLOCKED"

+    # Part of MSC4306: Thread Subscriptions
+    MSC4306_CONFLICTING_UNSUBSCRIPTION = (
+        "IO.ELEMENT.MSC4306.M_CONFLICTING_UNSUBSCRIPTION"
+    )
+    MSC4306_NOT_IN_THREAD = "IO.ELEMENT.MSC4306.M_NOT_IN_THREAD"
+

 class CodeMessageException(RuntimeError):
    """An exception with integer code, a message string attributes and optional headers.
@@ -75,7 +75,7 @@ from synapse.http.site import SynapseSite
 from synapse.logging.context import PreserveLoggingContext
 from synapse.logging.opentracing import init_tracer
 from synapse.metrics import install_gc_manager, register_threadpool
-from synapse.metrics.background_process_metrics import wrap_as_background_process
+from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.metrics.jemalloc import setup_jemalloc_stats
 from synapse.module_api.callbacks.spamchecker_callbacks import load_legacy_spam_checkers
 from synapse.module_api.callbacks.third_party_event_rules_callbacks import (
@@ -512,6 +512,7 @@ async def start(hs: "HomeServer") -> None:
    Args:
        hs: homeserver instance
    """
+    server_name = hs.hostname
    reactor = hs.get_reactor()

    # We want to use a separate thread pool for the resolver so that large
@@ -524,22 +525,34 @@ async def start(hs: "HomeServer") -> None:
    )

    # Register the threadpools with our metrics.
-    register_threadpool("default", reactor.getThreadPool())
-    register_threadpool("gai_resolver", resolver_threadpool)
+    register_threadpool(
+        name="default", server_name=server_name, threadpool=reactor.getThreadPool()
+    )
+    register_threadpool(
+        name="gai_resolver", server_name=server_name, threadpool=resolver_threadpool
+    )

    # Set up the SIGHUP machinery.
    if hasattr(signal, "SIGHUP"):

-        @wrap_as_background_process("sighup")
-        async def handle_sighup(*args: Any, **kwargs: Any) -> None:
-            # Tell systemd our state, if we're using it. This will silently fail if
-            # we're not using systemd.
-            sdnotify(b"RELOADING=1")
+        def handle_sighup(*args: Any, **kwargs: Any) -> "defer.Deferred[None]":
+            async def _handle_sighup(*args: Any, **kwargs: Any) -> None:
+                # Tell systemd our state, if we're using it. This will silently fail if
+                # we're not using systemd.
+                sdnotify(b"RELOADING=1")

-            for i, args, kwargs in _sighup_callbacks:
-                i(*args, **kwargs)
+                for i, args, kwargs in _sighup_callbacks:
+                    i(*args, **kwargs)

-            sdnotify(b"READY=1")
+                sdnotify(b"READY=1")
+
+            return run_as_background_process(
+                "sighup",
+                server_name,
+                _handle_sighup,
+                *args,
+                **kwargs,
+            )

        # We defer running the sighup handlers until next reactor tick. This
        # is so that we're in a sane state, e.g. flushing the logs may fail
@@ -26,7 +26,12 @@ from typing import TYPE_CHECKING, List, Mapping, Sized, Tuple

 from prometheus_client import Gauge

-from synapse.metrics.background_process_metrics import wrap_as_background_process
+from twisted.internet import defer
+
+from synapse.metrics import SERVER_NAME_LABEL
+from synapse.metrics.background_process_metrics import (
+    run_as_background_process,
+)
 from synapse.types import JsonDict
 from synapse.util.constants import ONE_HOUR_SECONDS, ONE_MINUTE_SECONDS

@@ -53,138 +58,158 @@ Phone home stats are sent every 3 hours
 _stats_process: List[Tuple[int, "resource.struct_rusage"]] = []

 # Gauges to expose monthly active user control metrics
-current_mau_gauge = Gauge("synapse_admin_mau_current", "Current MAU")
+current_mau_gauge = Gauge(
+    "synapse_admin_mau_current",
+    "Current MAU",
+    labelnames=[SERVER_NAME_LABEL],
+)
 current_mau_by_service_gauge = Gauge(
    "synapse_admin_mau_current_mau_by_service",
    "Current MAU by service",
-    ["app_service"],
+    labelnames=["app_service", SERVER_NAME_LABEL],
+)
+max_mau_gauge = Gauge(
+    "synapse_admin_mau_max",
+    "MAU Limit",
+    labelnames=[SERVER_NAME_LABEL],
 )
-max_mau_gauge = Gauge("synapse_admin_mau_max", "MAU Limit")
 registered_reserved_users_mau_gauge = Gauge(
    "synapse_admin_mau_registered_reserved_users",
    "Registered users with reserved threepids",
+    labelnames=[SERVER_NAME_LABEL],
 )


-@wrap_as_background_process("phone_stats_home")
-async def phone_stats_home(
+def phone_stats_home(
    hs: "HomeServer",
    stats: JsonDict,
    stats_process: List[Tuple[int, "resource.struct_rusage"]] = _stats_process,
-) -> None:
-    """Collect usage statistics and send them to the configured endpoint.
+) -> "defer.Deferred[None]":
+    server_name = hs.hostname

-    Args:
-        hs: the HomeServer object to use for gathering usage data.
-        stats: the dict in which to store the statistics sent to the configured
-            endpoint. Mostly used in tests to figure out the data that is supposed to
-            be sent.
-        stats_process: statistics about resource usage of the process.
-    """
+    async def _phone_stats_home(
+        hs: "HomeServer",
+        stats: JsonDict,
+        stats_process: List[Tuple[int, "resource.struct_rusage"]] = _stats_process,
+    ) -> None:
+        """Collect usage statistics and send them to the configured endpoint.

-    logger.info("Gathering stats for reporting")
-    now = int(hs.get_clock().time())
-    # Ensure the homeserver has started.
-    assert hs.start_time is not None
-    uptime = int(now - hs.start_time)
-    if uptime < 0:
-        uptime = 0
+        Args:
+            hs: the HomeServer object to use for gathering usage data.
+            stats: the dict in which to store the statistics sent to the configured
+                endpoint. Mostly used in tests to figure out the data that is supposed to
+                be sent.
+            stats_process: statistics about resource usage of the process.
+        """

-    #
-    # Performance statistics. Keep this early in the function to maintain reliability of `test_performance_100` test.
-    #
-    old = stats_process[0]
-    new = (now, resource.getrusage(resource.RUSAGE_SELF))
-    stats_process[0] = new
+        logger.info("Gathering stats for reporting")
+        now = int(hs.get_clock().time())
+        # Ensure the homeserver has started.
+        assert hs.start_time is not None
+        uptime = int(now - hs.start_time)
+        if uptime < 0:
+            uptime = 0

-    # Get RSS in bytes
-    stats["memory_rss"] = new[1].ru_maxrss
+        #
+        # Performance statistics. Keep this early in the function to maintain reliability of `test_performance_100` test.
+        #
+        old = stats_process[0]
+        new = (now, resource.getrusage(resource.RUSAGE_SELF))
+        stats_process[0] = new

-    # Get CPU time in % of a single core, not % of all cores
-    used_cpu_time = (new[1].ru_utime + new[1].ru_stime) - (
-        old[1].ru_utime + old[1].ru_stime
-    )
-    if used_cpu_time == 0 or new[0] == old[0]:
-        stats["cpu_average"] = 0
-    else:
-        stats["cpu_average"] = math.floor(used_cpu_time / (new[0] - old[0]) * 100)
+        # Get RSS in bytes
+        stats["memory_rss"] = new[1].ru_maxrss

-    #
-    # General statistics
-    #
-
-    store = hs.get_datastores().main
-    common_metrics = await hs.get_common_usage_metrics_manager().get_metrics()
-
-    stats["homeserver"] = hs.config.server.server_name
-    stats["server_context"] = hs.config.server.server_context
-    stats["timestamp"] = now
-    stats["uptime_seconds"] = uptime
-    version = sys.version_info
-    stats["python_version"] = "{}.{}.{}".format(
-        version.major, version.minor, version.micro
-    )
-    stats["total_users"] = await store.count_all_users()
-
-    total_nonbridged_users = await store.count_nonbridged_users()
-    stats["total_nonbridged_users"] = total_nonbridged_users
-
-    daily_user_type_results = await store.count_daily_user_type()
-    for name, count in daily_user_type_results.items():
-        stats["daily_user_type_" + name] = count
-
-    room_count = await store.get_room_count()
-    stats["total_room_count"] = room_count
-
-    stats["daily_active_users"] = common_metrics.daily_active_users
-    stats["monthly_active_users"] = await store.count_monthly_users()
-    daily_active_e2ee_rooms = await store.count_daily_active_e2ee_rooms()
-    stats["daily_active_e2ee_rooms"] = daily_active_e2ee_rooms
-    stats["daily_e2ee_messages"] = await store.count_daily_e2ee_messages()
-    daily_sent_e2ee_messages = await store.count_daily_sent_e2ee_messages()
-    stats["daily_sent_e2ee_messages"] = daily_sent_e2ee_messages
-    stats["daily_active_rooms"] = await store.count_daily_active_rooms()
-    stats["daily_messages"] = await store.count_daily_messages()
-    daily_sent_messages = await store.count_daily_sent_messages()
-    stats["daily_sent_messages"] = daily_sent_messages
-
-    r30v2_results = await store.count_r30v2_users()
-    for name, count in r30v2_results.items():
-        stats["r30v2_users_" + name] = count
-
-    stats["cache_factor"] = hs.config.caches.global_factor
-    stats["event_cache_size"] = hs.config.caches.event_cache_size
-
-    #
-    # Database version
-    #
-
-    # This only reports info about the *main* database.
-    stats["database_engine"] = store.db_pool.engine.module.__name__
-    stats["database_server_version"] = store.db_pool.engine.server_version
-
-    #
-    # Logging configuration
-    #
-    synapse_logger = logging.getLogger("synapse")
-    log_level = synapse_logger.getEffectiveLevel()
-    stats["log_level"] = logging.getLevelName(log_level)
-
-    logger.info(
-        "Reporting stats to %s: %s", hs.config.metrics.report_stats_endpoint, stats
-    )
-    try:
-        await hs.get_proxied_http_client().put_json(
-            hs.config.metrics.report_stats_endpoint, stats
+        # Get CPU time in % of a single core, not % of all cores
+        used_cpu_time = (new[1].ru_utime + new[1].ru_stime) - (
+            old[1].ru_utime + old[1].ru_stime
        )
-    except Exception as e:
-        logger.warning("Error reporting stats: %s", e)
+        if used_cpu_time == 0 or new[0] == old[0]:
+            stats["cpu_average"] = 0
+        else:
+            stats["cpu_average"] = math.floor(used_cpu_time / (new[0] - old[0]) * 100)
+
+        #
+        # General statistics
+        #
+
+        store = hs.get_datastores().main
+        common_metrics = await hs.get_common_usage_metrics_manager().get_metrics()
+
+        stats["homeserver"] = hs.config.server.server_name
+        stats["server_context"] = hs.config.server.server_context
+        stats["timestamp"] = now
+        stats["uptime_seconds"] = uptime
+        version = sys.version_info
+        stats["python_version"] = "{}.{}.{}".format(
+            version.major, version.minor, version.micro
+        )
+        stats["total_users"] = await store.count_all_users()
+
+        total_nonbridged_users = await store.count_nonbridged_users()
+        stats["total_nonbridged_users"] = total_nonbridged_users
+
+        daily_user_type_results = await store.count_daily_user_type()
+        for name, count in daily_user_type_results.items():
+            stats["daily_user_type_" + name] = count
+
+        room_count = await store.get_room_count()
+        stats["total_room_count"] = room_count
+
+        stats["daily_active_users"] = common_metrics.daily_active_users
+        stats["monthly_active_users"] = await store.count_monthly_users()
+        daily_active_e2ee_rooms = await store.count_daily_active_e2ee_rooms()
+        stats["daily_active_e2ee_rooms"] = daily_active_e2ee_rooms
+        stats["daily_e2ee_messages"] = await store.count_daily_e2ee_messages()
+        daily_sent_e2ee_messages = await store.count_daily_sent_e2ee_messages()
+        stats["daily_sent_e2ee_messages"] = daily_sent_e2ee_messages
+        stats["daily_active_rooms"] = await store.count_daily_active_rooms()
+        stats["daily_messages"] = await store.count_daily_messages()
+        daily_sent_messages = await store.count_daily_sent_messages()
+        stats["daily_sent_messages"] = daily_sent_messages
+
+        r30v2_results = await store.count_r30v2_users()
+        for name, count in r30v2_results.items():
+            stats["r30v2_users_" + name] = count
+
+        stats["cache_factor"] = hs.config.caches.global_factor
+        stats["event_cache_size"] = hs.config.caches.event_cache_size
+
+        #
+        # Database version
+        #
+
+        # This only reports info about the *main* database.
+        stats["database_engine"] = store.db_pool.engine.module.__name__
+        stats["database_server_version"] = store.db_pool.engine.server_version
+
+        #
+        # Logging configuration
+        #
+        synapse_logger = logging.getLogger("synapse")
+        log_level = synapse_logger.getEffectiveLevel()
+        stats["log_level"] = logging.getLevelName(log_level)
+
+        logger.info(
+            "Reporting stats to %s: %s", hs.config.metrics.report_stats_endpoint, stats
+        )
+        try:
+            await hs.get_proxied_http_client().put_json(
+                hs.config.metrics.report_stats_endpoint, stats
+            )
+        except Exception as e:
+            logger.warning("Error reporting stats: %s", e)
+
+    return run_as_background_process(
+        "phone_stats_home", server_name, _phone_stats_home, hs, stats, stats_process
+    )


 def start_phone_stats_home(hs: "HomeServer") -> None:
    """
    Start the background tasks which report phone home stats.
    """
+    server_name = hs.hostname
    clock = hs.get_clock()

    stats: JsonDict = {}
@@ -210,25 +235,39 @@ def start_phone_stats_home(hs: "HomeServer") -> None:
    )
    hs.get_datastores().main.reap_monthly_active_users()

-    @wrap_as_background_process("generate_monthly_active_users")
-    async def generate_monthly_active_users() -> None:
-        current_mau_count = 0
-        current_mau_count_by_service: Mapping[str, int] = {}
-        reserved_users: Sized = ()
-        store = hs.get_datastores().main
-        if hs.config.server.limit_usage_by_mau or hs.config.server.mau_stats_only:
-            current_mau_count = await store.get_monthly_active_count()
-            current_mau_count_by_service = (
-                await store.get_monthly_active_count_by_service()
+    def generate_monthly_active_users() -> "defer.Deferred[None]":
+        async def _generate_monthly_active_users() -> None:
+            current_mau_count = 0
+            current_mau_count_by_service: Mapping[str, int] = {}
+            reserved_users: Sized = ()
+            store = hs.get_datastores().main
+            if hs.config.server.limit_usage_by_mau or hs.config.server.mau_stats_only:
+                current_mau_count = await store.get_monthly_active_count()
+                current_mau_count_by_service = (
+                    await store.get_monthly_active_count_by_service()
+                )
+                reserved_users = await store.get_registered_reserved_users()
+            current_mau_gauge.labels(**{SERVER_NAME_LABEL: server_name}).set(
+                float(current_mau_count)
            )
-            reserved_users = await store.get_registered_reserved_users()
-        current_mau_gauge.set(float(current_mau_count))

-        for app_service, count in current_mau_count_by_service.items():
-            current_mau_by_service_gauge.labels(app_service).set(float(count))
+            for app_service, count in current_mau_count_by_service.items():
+                current_mau_by_service_gauge.labels(
+                    app_service=app_service, **{SERVER_NAME_LABEL: server_name}
+                ).set(float(count))

-        registered_reserved_users_mau_gauge.set(float(len(reserved_users)))
-        max_mau_gauge.set(float(hs.config.server.max_mau_value))
+            registered_reserved_users_mau_gauge.labels(
+                **{SERVER_NAME_LABEL: server_name}
+            ).set(float(len(reserved_users)))
+            max_mau_gauge.labels(**{SERVER_NAME_LABEL: server_name}).set(
+                float(hs.config.server.max_mau_value)
+            )
+
+        return run_as_background_process(
+            "generate_monthly_active_users",
+            server_name,
+            _generate_monthly_active_users,
+        )

    if hs.config.server.limit_usage_by_mau or hs.config.server.mau_stats_only:
        generate_monthly_active_users()
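
Note on the pattern used in the hunks above: each `@wrap_as_background_process` decorator is replaced by a synchronous outer function that captures `server_name` and hands an inner `async` function to `run_as_background_process`. The sketch below illustrates the shape of that pattern using `asyncio` as a stand-in for Twisted. `run_in_background_sketch`, `generate_report` and the `started` list are hypothetical names for illustration only, and the real `run_as_background_process` returns a Deferred rather than an asyncio future.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Tuple

# Records (description, server_name) for every process started.
# This stands in for the per-server metrics labels in the real code.
started: List[Tuple[str, str]] = []


def run_in_background_sketch(
    desc: str,
    server_name: str,
    func: Callable[..., Awaitable[Any]],
    *args: Any,
    **kwargs: Any,
) -> "asyncio.Future[Any]":
    """Hypothetical asyncio stand-in for `run_as_background_process`."""
    started.append((desc, server_name))
    # Schedule the coroutine on the running loop and return a handle to it.
    return asyncio.ensure_future(func(*args, **kwargs))


def generate_report(server_name: str, values: List[int]) -> "asyncio.Future[int]":
    """Synchronous entry point that schedules the async body and returns its future."""

    async def _generate_report() -> int:
        await asyncio.sleep(0)  # yield to the loop, as real I/O would
        return sum(values)

    return run_in_background_sketch("generate_report", server_name, _generate_report)


async def main() -> int:
    # Callers can await the returned future, or ignore it (fire and forget).
    return await generate_report("example.org", [1, 2, 3])
```

The outer function exists so that the server name is available at call time rather than at decoration time, which a module-level decorator cannot know.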
@@ -48,6 +48,7 @@ from synapse.events import EventBase
 from synapse.events.utils import SerializeEventConfig, serialize_event
 from synapse.http.client import SimpleHttpClient, is_unknown_endpoint
 from synapse.logging import opentracing
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import DeviceListUpdates, JsonDict, JsonMapping, ThirdPartyInstanceID
 from synapse.util.caches.response_cache import ResponseCache

@@ -59,29 +60,31 @@ logger = logging.getLogger(__name__)
 sent_transactions_counter = Counter(
    "synapse_appservice_api_sent_transactions",
    "Number of /transactions/ requests sent",
-    ["service"],
+    labelnames=["service", SERVER_NAME_LABEL],
 )

 failed_transactions_counter = Counter(
    "synapse_appservice_api_failed_transactions",
    "Number of /transactions/ requests that failed to send",
-    ["service"],
+    labelnames=["service", SERVER_NAME_LABEL],
 )

 sent_events_counter = Counter(
-    "synapse_appservice_api_sent_events", "Number of events sent to the AS", ["service"]
+    "synapse_appservice_api_sent_events",
+    "Number of events sent to the AS",
+    labelnames=["service", SERVER_NAME_LABEL],
 )

 sent_ephemeral_counter = Counter(
    "synapse_appservice_api_sent_ephemeral",
    "Number of ephemeral events sent to the AS",
-    ["service"],
+    labelnames=["service", SERVER_NAME_LABEL],
 )

 sent_todevice_counter = Counter(
    "synapse_appservice_api_sent_todevice",
    "Number of todevice messages sent to the AS",
-    ["service"],
+    labelnames=["service", SERVER_NAME_LABEL],
 )

 HOUR_IN_MS = 60 * 60 * 1000
@@ -382,6 +385,7 @@ class ApplicationServiceApi(SimpleHttpClient):
                    "left": list(device_list_summary.left),
                }

+        labels = {"service": service.id, SERVER_NAME_LABEL: self.server_name}
        try:
            args = None
            if self.config.use_appservice_legacy_authorization:
@@ -399,10 +403,10 @@ class ApplicationServiceApi(SimpleHttpClient):
                    service.url,
                    [event.get("event_id") for event in events],
                )
-            sent_transactions_counter.labels(service.id).inc()
-            sent_events_counter.labels(service.id).inc(len(serialized_events))
-            sent_ephemeral_counter.labels(service.id).inc(len(ephemeral))
-            sent_todevice_counter.labels(service.id).inc(len(to_device_messages))
+            sent_transactions_counter.labels(**labels).inc()
+            sent_events_counter.labels(**labels).inc(len(serialized_events))
+            sent_ephemeral_counter.labels(**labels).inc(len(ephemeral))
+            sent_todevice_counter.labels(**labels).inc(len(to_device_messages))
            return True
        except CodeMessageException as e:
            logger.warning(
@@ -421,7 +425,7 @@ class ApplicationServiceApi(SimpleHttpClient):
                ex.args,
                exc_info=logger.isEnabledFor(logging.DEBUG),
            )
-        failed_transactions_counter.labels(service.id).inc()
+        failed_transactions_counter.labels(**labels).inc()
        return False

    async def claim_client_keys(
@@ -103,18 +103,16 @@ MAX_TO_DEVICE_MESSAGES_PER_TRANSACTION = 100


 class ApplicationServiceScheduler:
-    """Public facing API for this module. Does the required DI to tie the
-    components together. This also serves as the "event_pool", which in this
+    """
+    Public facing API for this module. Does the required dependency injection (DI) to
+    tie the components together. This also serves as the "event_pool", which in this
    case is a simple array.
    """

    def __init__(self, hs: "HomeServer"):
-        self.clock = hs.get_clock()
+        self.txn_ctrl = _TransactionController(hs)
        self.store = hs.get_datastores().main
-        self.as_api = hs.get_application_service_api()
-
-        self.txn_ctrl = _TransactionController(self.clock, self.store, self.as_api)
-        self.queuer = _ServiceQueuer(self.txn_ctrl, self.clock, hs)
+        self.queuer = _ServiceQueuer(self.txn_ctrl, hs)

    async def start(self) -> None:
        logger.info("Starting appservice scheduler")
@@ -184,9 +182,7 @@ class _ServiceQueuer:
    appservice at a given time.
    """

-    def __init__(
-        self, txn_ctrl: "_TransactionController", clock: Clock, hs: "HomeServer"
-    ):
+    def __init__(self, txn_ctrl: "_TransactionController", hs: "HomeServer"):
        # dict of {service_id: [events]}
        self.queued_events: Dict[str, List[EventBase]] = {}
        # dict of {service_id: [events]}
@@ -199,10 +195,11 @@ class _ServiceQueuer:
        # the appservices which currently have a transaction in flight
        self.requests_in_flight: Set[str] = set()
        self.txn_ctrl = txn_ctrl
-        self.clock = clock
        self._msc3202_transaction_extensions_enabled: bool = (
            hs.config.experimental.msc3202_transaction_extensions
        )
+        self.server_name = hs.hostname
+        self.clock = hs.get_clock()
        self._store = hs.get_datastores().main

    def start_background_request(self, service: ApplicationService) -> None:
@@ -210,7 +207,9 @@ class _ServiceQueuer:
        if service.id in self.requests_in_flight:
            return

-        run_as_background_process("as-sender", self._send_request, service)
+        run_as_background_process(
+            "as-sender", self.server_name, self._send_request, service
+        )

    async def _send_request(self, service: ApplicationService) -> None:
        # sanity-check: we shouldn't get here if this service already has a sender
@@ -359,10 +358,11 @@ class _TransactionController:
    (Note we only have one of these in the homeserver.)
    """

-    def __init__(self, clock: Clock, store: DataStore, as_api: ApplicationServiceApi):
-        self.clock = clock
-        self.store = store
-        self.as_api = as_api
+    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
+        self.clock = hs.get_clock()
+        self.store = hs.get_datastores().main
+        self.as_api = hs.get_application_service_api()

        # map from service id to recoverer instance
        self.recoverers: Dict[str, "_Recoverer"] = {}
@@ -446,7 +446,12 @@ class _TransactionController:
        logger.info("Starting recoverer for AS ID %s", service.id)
        assert service.id not in self.recoverers
        recoverer = self.RECOVERER_CLASS(
-            self.clock, self.store, self.as_api, service, self.on_recovered
+            self.server_name,
+            self.clock,
+            self.store,
+            self.as_api,
+            service,
+            self.on_recovered,
        )
        self.recoverers[service.id] = recoverer
        recoverer.recover()
@@ -477,21 +482,24 @@ class _Recoverer:
    We have one of these for each appservice which is currently considered DOWN.

    Args:
-        clock (synapse.util.Clock):
-        store (synapse.storage.DataStore):
-        as_api (synapse.appservice.api.ApplicationServiceApi):
-        service (synapse.appservice.ApplicationService): the service we are managing
-        callback (callable[_Recoverer]): called once the service recovers.
+        server_name: the homeserver name, used to label metrics (this should be `hs.hostname`).
+        clock:
+        store:
+        as_api:
+        service: the service we are managing
+        callback: called once the service recovers.
    """

    def __init__(
        self,
+        server_name: str,
        clock: Clock,
        store: DataStore,
        as_api: ApplicationServiceApi,
        service: ApplicationService,
        callback: Callable[["_Recoverer"], Awaitable[None]],
    ):
+        self.server_name = server_name
        self.clock = clock
        self.store = store
        self.as_api = as_api
@@ -504,7 +512,11 @@ class _Recoverer:
        delay = 2**self.backoff_counter
        logger.info("Scheduling retries on %s in %fs", self.service.id, delay)
        self.scheduled_recovery = self.clock.call_later(
-            delay, run_as_background_process, "as-recoverer", self.retry
+            delay,
+            run_as_background_process,
+            "as-recoverer",
+            self.server_name,
+            self.retry,
        )

    def _backoff(self) -> None:
@@ -525,6 +537,7 @@ class _Recoverer:
        # Run a retry, which will reschedule a recovery if it fails.
        run_as_background_process(
            "retry",
+            self.server_name,
            self.retry,
        )

@@ -36,6 +36,7 @@ from synapse.config import (  # noqa: F401
    jwt,
    key,
    logger,
+    mas,
    metrics,
    modules,
    oembed,
@@ -124,6 +125,7 @@ class RootConfig:
    background_updates: background_updates.BackgroundUpdateConfig
    auto_accept_invites: auto_accept_invites.AutoAcceptInvitesConfig
    user_types: user_types.UserTypesConfig
+    mas: mas.MasConfig

    config_classes: List[Type["Config"]] = ...
    config_files: List[str]
@@ -36,13 +36,14 @@ class AuthConfig(Config):
        if password_config is None:
            password_config = {}

-        # The default value of password_config.enabled is True, unless msc3861 is enabled.
-        msc3861_enabled = (
-            (config.get("experimental_features") or {})
-            .get("msc3861", {})
-            .get("enabled", False)
-        )
-        passwords_enabled = password_config.get("enabled", not msc3861_enabled)
+        auth_delegated = (config.get("experimental_features") or {}).get(
+            "msc3861", {}
+        ).get("enabled", False) or (
+            config.get("matrix_authentication_service") or {}
+        ).get("enabled", False)
+
+        # The default value of password_config.enabled is True, unless auth is delegated
+        passwords_enabled = password_config.get("enabled", not auth_delegated)

        # 'only_for_reauth' allows users who have previously set a password to use it,
        # even though passwords would otherwise be disabled.
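
Note on the hunk above: the default for `password_config.enabled` now depends on whether authentication is delegated, either through the experimental `msc3861` option or through the new `matrix_authentication_service` section. The following is a minimal sketch of that derivation against a plain config dict. The function name `passwords_enabled_value` is hypothetical and this is not the `AuthConfig` code itself.

```python
from typing import Any, Dict


def passwords_enabled_value(config: Dict[str, Any]) -> Any:
    """Return the effective `password_config.enabled` value for a raw config dict.

    Hypothetical helper mirroring the logic in the diff: the default is True
    unless auth is delegated, and an explicit setting always wins.
    """
    password_config = config.get("password_config") or {}

    # Auth is delegated if either the experimental MSC3861 option or the
    # `matrix_authentication_service` section is enabled.
    msc3861_enabled = (
        (config.get("experimental_features") or {})
        .get("msc3861", {})
        .get("enabled", False)
    )
    mas_enabled = (config.get("matrix_authentication_service") or {}).get(
        "enabled", False
    )
    auth_delegated = msc3861_enabled or mas_enabled

    return password_config.get("enabled", not auth_delegated)
```

For example, an empty config gives `True`, while a config with only `matrix_authentication_service: {enabled: true}` gives `False`.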
@@ -582,6 +582,9 @@ class ExperimentalConfig(Config):
        # MSC4155: Invite filtering
        self.msc4155_enabled: bool = experimental.get("msc4155_enabled", False)

+        # MSC4293: Redact on Kick/Ban
+        self.msc4293_enabled: bool = experimental.get("msc4293_enabled", False)
+
        # MSC4306: Thread Subscriptions
        # (and MSC4308: sliding sync extension for thread subscriptions)
        self.msc4306_enabled: bool = experimental.get("msc4306_enabled", False)
@@ -36,6 +36,7 @@ from .federation import FederationConfig
 from .jwt import JWTConfig
 from .key import KeyConfig
 from .logger import LoggingConfig
+from .mas import MasConfig
 from .metrics import MetricsConfig
 from .modules import ModulesConfig
 from .oembed import OembedConfig
@@ -109,4 +110,6 @@ class HomeServerConfig(RootConfig):
        BackgroundUpdateConfig,
        AutoAcceptInvitesConfig,
        UserTypesConfig,
+        # This must be last, as it checks for conflicts with other config options.
+        MasConfig,
    ]
@@ -0,0 +1,192 @@
+#
+# This file is licensed under the Affero General Public License (AGPL) version 3.
+#
+# Copyright (C) 2025 New Vector, Ltd
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Affero General Public License as
+# published by the Free Software Foundation, either version 3 of the
+# License, or (at your option) any later version.
+#
+# See the GNU Affero General Public License for more details:
+# <https://www.gnu.org/licenses/agpl-3.0.html>.
+#
+#
+
+from typing import Any, Optional
+
+from synapse._pydantic_compat import (
+    AnyHttpUrl,
+    Field,
+    FilePath,
+    StrictBool,
+    StrictStr,
+    ValidationError,
+    validator,
+)
+from synapse.config.experimental import read_secret_from_file_once
+from synapse.types import JsonDict
+from synapse.util.pydantic_models import ParseModel
+
+from ._base import Config, ConfigError, RootConfig
+
+
+class MasConfigModel(ParseModel):
+    enabled: StrictBool = False
+    endpoint: AnyHttpUrl = Field(default="http://localhost:8080")
+    secret: Optional[StrictStr] = Field(default=None)
+    secret_path: Optional[FilePath] = Field(default=None)
+
+    @validator("secret")
+    def validate_secret_is_set_if_enabled(cls, v: Any, values: dict) -> Any:
+        if values.get("enabled", False) and not values.get("secret_path") and not v:
+            raise ValueError(
+                "You must set a `secret` or `secret_path` when enabling Matrix Authentication Service integration."
+            )
+
+        return v
+
+    @validator("secret_path")
+    def validate_secret_path_is_set_if_enabled(cls, v: Any, values: dict) -> Any:
+        if values.get("secret"):
+            raise ValueError(
+                "`secret` and `secret_path` cannot be set at the same time."
+            )
+
+        return v
+
+
+class MasConfig(Config):
+    section = "mas"
+
+    def read_config(
+        self, config: JsonDict, allow_secrets_in_config: bool, **kwargs: Any
+    ) -> None:
+        mas_config = config.get("matrix_authentication_service", {})
+        if mas_config is None:
+            mas_config = {}
+
+        try:
+            parsed = MasConfigModel(**mas_config)
+        except ValidationError as e:
+            raise ConfigError(
+                "Could not validate Matrix Authentication Service configuration",
+                path=("matrix_authentication_service",),
+            ) from e
+
+        if parsed.secret and not allow_secrets_in_config:
+            raise ConfigError(
+                "Config options that expect an in-line secret as value are disabled",
+                ("matrix_authentication_service", "secret"),
+            )
+
+        self.enabled = parsed.enabled
+        self.endpoint = parsed.endpoint
+        self._secret = parsed.secret
+        self._secret_path = parsed.secret_path
+
+        self.check_config_conflicts(self.root)
+
+    def check_config_conflicts(
+        self,
+        root: RootConfig,
+    ) -> None:
+        """Checks for any configuration conflicts with other parts of Synapse.
+
+        Raises:
+            ConfigError: If there are any configuration conflicts.
+        """
+
+        if not self.enabled:
+            return
+
+        if root.experimental.msc3861.enabled:
+            raise ConfigError(
+                "Experimental MSC3861 was replaced by Matrix Authentication Service. "
+                "Please disable MSC3861 or disable Matrix Authentication Service.",
+                ("experimental", "msc3861"),
+            )
+
+        if (
+            root.auth.password_enabled_for_reauth
+            or root.auth.password_enabled_for_login
+        ):
+            raise ConfigError(
+                "Password auth cannot be enabled when OAuth delegation is enabled",
+                ("password_config", "enabled"),
+            )
+
+        if root.registration.enable_registration:
+            raise ConfigError(
+                "Registration cannot be enabled when OAuth delegation is enabled",
+                ("enable_registration",),
+            )
+
+        # We only need to test the user consent version, as it must be set if the user_consent section is present in the config
+        if root.consent.user_consent_version is not None:
+            raise ConfigError(
+                "User consent cannot be enabled when OAuth delegation is enabled",
+                ("user_consent",),
+            )
+
+        if (
+            root.oidc.oidc_enabled
+            or root.saml2.saml2_enabled
+            or root.cas.cas_enabled
+            or root.jwt.jwt_enabled
+        ):
+            raise ConfigError("SSO cannot be enabled when OAuth delegation is enabled")
+
+        if bool(root.authproviders.password_providers):
+            raise ConfigError(
+                "Password auth providers cannot be enabled when OAuth delegation is enabled"
+            )
+
+        if root.captcha.enable_registration_captcha:
+            raise ConfigError(
+                "CAPTCHA cannot be enabled when OAuth delegation is enabled",
+                ("captcha", "enable_registration_captcha"),
+            )
+
+        if root.auth.login_via_existing_enabled:
+            raise ConfigError(
+                "Login via existing session cannot be enabled when OAuth delegation is enabled",
+                ("login_via_existing_session", "enabled"),
+            )
+
+        if root.registration.refresh_token_lifetime:
+            raise ConfigError(
+                "refresh_token_lifetime cannot be set when OAuth delegation is enabled",
+                ("refresh_token_lifetime",),
+            )
+
+        if root.registration.nonrefreshable_access_token_lifetime:
+            raise ConfigError(
+                "nonrefreshable_access_token_lifetime cannot be set when OAuth delegation is enabled",
+                ("nonrefreshable_access_token_lifetime",),
+            )
+
+        if root.registration.session_lifetime:
+            raise ConfigError(
+                "session_lifetime cannot be set when OAuth delegation is enabled",
+                ("session_lifetime",),
+            )
+
+        if root.registration.enable_3pid_changes:
+            raise ConfigError(
+                "enable_3pid_changes cannot be enabled when OAuth delegation is enabled",
+                ("enable_3pid_changes",),
+            )
+
+    def secret(self) -> str:
+        if self._secret is not None:
+            return self._secret
+        elif self._secret_path is not None:
+            return read_secret_from_file_once(
+                str(self._secret_path),
+                ("matrix_authentication_service", "secret_path"),
+            )
+        else:
+            raise RuntimeError(
+                "Neither `secret` nor `secret_path` are set, this is a bug.",
+            )
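Reviewer sketch (not part of the patch): the new `mas.py` accepts the shared secret either in-line (`secret`) or from a file (`secret_path`), but not both, and requires one of them when the integration is enabled. The function below restates those rules with the standard library only. `resolve_mas_secret` is a hypothetical name, and stripping surrounding whitespace from the file contents is an assumption about what `read_secret_from_file_once` does.

```python
from pathlib import Path
from typing import Optional


def resolve_mas_secret(
    enabled: bool,
    secret: Optional[str] = None,
    secret_path: Optional[str] = None,
) -> Optional[str]:
    """Return the shared secret, or None when the integration is disabled.

    Raises ValueError for the two invalid combinations the patch rejects.
    """
    if secret and secret_path:
        raise ValueError("`secret` and `secret_path` cannot be set at the same time.")
    if not enabled:
        return None
    if secret:
        return secret
    if secret_path:
        # Assumption: surrounding whitespace (e.g. a trailing newline) is dropped.
        return Path(secret_path).read_text(encoding="utf-8").strip()
    raise ValueError("You must set a `secret` or `secret_path` when enabling MAS.")
```

Keeping the two sources mutually exclusive avoids any ambiguity about which value is in effect when both are present.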
@@ -241,6 +241,12 @@ class RatelimitConfig(Config):
            defaults={"per_second": 1, "burst_count": 5},
        )

+        self.rc_room_creation = RatelimitSettings.parse(
+            config,
+            "rc_room_creation",
+            defaults={"per_second": 0.016, "burst_count": 10},
+        )
+
        self.rc_reports = RatelimitSettings.parse(
            config,
            "rc_reports",
@@ -148,15 +148,14 @@ class RegistrationConfig(Config):
        self.enable_set_displayname = config.get("enable_set_displayname", True)
        self.enable_set_avatar_url = config.get("enable_set_avatar_url", True)

+        auth_delegated = (config.get("experimental_features") or {}).get(
+            "msc3861", {}
+        ).get("enabled", False) or (
+            config.get("matrix_authentication_service") or {}
+        ).get("enabled", False)
+
        # The default value of enable_3pid_changes is True, unless msc3861 is enabled.
-        msc3861_enabled = (
-            (config.get("experimental_features") or {})
-            .get("msc3861", {})
-            .get("enabled", False)
-        )
-        self.enable_3pid_changes = config.get(
-            "enable_3pid_changes", not msc3861_enabled
-        )
+        self.enable_3pid_changes = config.get("enable_3pid_changes", not auth_delegated)

        self.disable_msisdn_registration = config.get(
            "disable_msisdn_registration", False
@@ -22,11 +22,10 @@
 import logging
 import os
 from typing import Any, Dict, List, Tuple
-from urllib.request import getproxies_environment

 import attr

-from synapse.config.server import generate_ip_set
+from synapse.config.server import generate_ip_set, parse_proxy_config
 from synapse.types import JsonDict
 from synapse.util.check_dependencies import check_requirements
 from synapse.util.module_loader import load_module
@@ -61,7 +60,7 @@ THUMBNAIL_SUPPORTED_MEDIA_FORMAT_MAP = {
    "image/png": "png",
 }

-HTTP_PROXY_SET_WARNING = """\
+URL_PREVIEW_BLACKLIST_IGNORED_BECAUSE_HTTP_PROXY_SET_WARNING = """\
 The Synapse config url_preview_ip_range_blacklist will be ignored as an HTTP(s) proxy is configured."""


@@ -234,17 +233,25 @@ class ContentRepositoryConfig(Config):
        if self.url_preview_enabled:
            check_requirements("url-preview")

-            proxy_env = getproxies_environment()
-            if "url_preview_ip_range_blacklist" not in config:
-                if "http" not in proxy_env or "https" not in proxy_env:
+            proxy_config = parse_proxy_config(config)
+            is_proxy_configured = (
+                proxy_config.http_proxy is not None
+                or proxy_config.https_proxy is not None
+            )
+            if "url_preview_ip_range_blacklist" in config:
+                if is_proxy_configured:
+                    logger.warning(
+                        "".join(
+                            URL_PREVIEW_BLACKLIST_IGNORED_BECAUSE_HTTP_PROXY_SET_WARNING
+                        )
+                    )
+            else:
+                if not is_proxy_configured:
                    raise ConfigError(
                        "For security, you must specify an explicit target IP address "
                        "blacklist in url_preview_ip_range_blacklist for url previewing "
                        "to work"
                    )
-            else:
-                if "http" in proxy_env or "https" in proxy_env:
-                    logger.warning("".join(HTTP_PROXY_SET_WARNING))

            # we always block '0.0.0.0' and '::', which are supposed to be
            # unroutable addresses.
@@ -25,11 +25,13 @@ import logging
 import os.path
 import urllib.parse
 from textwrap import indent
-from typing import Any, Dict, Iterable, List, Optional, Set, Tuple, Union
+from typing import Any, Dict, Iterable, List, Optional, Set, Tuple, TypedDict, Union
+from urllib.request import getproxies_environment

 import attr
 import yaml
 from netaddr import AddrFormatError, IPNetwork, IPSet
+from typing_extensions import TypeGuard

 from twisted.conch.ssh.keys import Key

@@ -43,6 +45,21 @@ from ._util import validate_config

 logger = logging.getLogger(__name__)

+
+# Adapted from the Python typing documentation:
+# https://typing.python.org/en/latest/spec/narrowing.html#typeguard
+def is_str_list(val: Any, allow_empty: bool) -> TypeGuard[list[str]]:
+    """
+    Type-narrow a value to a list of strings (compatible with mypy).
+    """
+    if not isinstance(val, list):
+        return False
+
+    if len(val) == 0:
+        return allow_empty
+    return all(isinstance(x, str) for x in val)
+
+
 DIRECT_TCP_ERROR = """
 Using direct TCP replication for workers is no longer supported.

@@ -291,6 +308,102 @@ class LimitRemoteRoomsConfig:
    )


+class ProxyConfigDictionary(TypedDict):
+    """
+    Dictionary of proxy settings suitable for interacting with `urllib.request` APIs
+    """
+
+    http: Optional[str]
+    """
+    Proxy server to use for HTTP requests.
+    """
+    https: Optional[str]
+    """
+    Proxy server to use for HTTPS requests.
+    """
+    no: str
+    """
+    Comma-separated list of hosts, IP addresses, or IP ranges in CIDR format which
+    should not use the proxy.
+
+    Empty string means no hosts should be excluded from the proxy.
+    """
+
+
+@attr.s(slots=True, frozen=True, auto_attribs=True)
+class ProxyConfig:
+    """
+    Synapse configuration for HTTP proxy settings.
+    """
+
+    http_proxy: Optional[str]
+    """
+    Proxy server to use for HTTP requests.
+    """
+    https_proxy: Optional[str]
+    """
+    Proxy server to use for HTTPS requests.
+    """
+    no_proxy_hosts: Optional[List[str]]
+    """
+    List of hosts, IP addresses, or IP ranges in CIDR format which should not use the
+    proxy. Synapse will directly connect to these hosts.
+    """
+
+    def get_proxies_dictionary(self) -> ProxyConfigDictionary:
+        """
+        Returns a dictionary of proxy settings suitable for interacting with
+        `urllib.request` APIs (e.g. `urllib.request.proxy_bypass_environment`)
+
+        The keys are `"http"`, `"https"`, and `"no"`.
+        """
+        return ProxyConfigDictionary(
+            http=self.http_proxy,
+            https=self.https_proxy,
+            no=",".join(self.no_proxy_hosts) if self.no_proxy_hosts else "",
+        )
+
+
+def parse_proxy_config(config: JsonDict) -> ProxyConfig:
+    """
+    Figure out forward proxy config for outgoing HTTP requests.
+
+    Prefer values from the given config over the environment variables (`http_proxy`,
+    `https_proxy`, `no_proxy`, not case-sensitive).
+
+    Args:
+        config: The top-level homeserver configuration dictionary.
+    """
+    proxies_from_env = getproxies_environment()
+    http_proxy = config.get("http_proxy", proxies_from_env.get("http"))
+    if http_proxy is not None and not isinstance(http_proxy, str):
+        raise ConfigError("'http_proxy' must be a string", ("http_proxy",))
+
+    https_proxy = config.get("https_proxy", proxies_from_env.get("https"))
+    if https_proxy is not None and not isinstance(https_proxy, str):
+        raise ConfigError("'https_proxy' must be a string", ("https_proxy",))
+
+    # List of hosts which should not use the proxy. Synapse will directly connect to
+    # these hosts.
+    no_proxy_hosts = config.get("no_proxy_hosts")
+    # The `no_proxy` environment variable should be a comma-separated list of hosts,
+    # IP addresses, or IP ranges in CIDR format
+    no_proxy_from_env = proxies_from_env.get("no")
+    if no_proxy_hosts is None and no_proxy_from_env is not None:
+        no_proxy_hosts = no_proxy_from_env.split(",")
+
+    if no_proxy_hosts is not None and not is_str_list(no_proxy_hosts, allow_empty=True):
+        raise ConfigError(
+            "'no_proxy_hosts' must be a list of strings", ("no_proxy_hosts",)
+        )
+
+    return ProxyConfig(
+        http_proxy=http_proxy,
+        https_proxy=https_proxy,
+        no_proxy_hosts=no_proxy_hosts,
+    )
+
+
 class ServerConfig(Config):
    section = "server"

@@ -718,6 +831,17 @@ class ServerConfig(Config):
                )
            )

+        # Figure out forward proxy config for outgoing HTTP requests.
+        #
+        # Prefer values from the file config over the environment variables
+        self.proxy_config = parse_proxy_config(config)
+        logger.debug(
+            "Using proxy settings: http_proxy=%s, https_proxy=%s, no_proxy=%s",
+            self.proxy_config.http_proxy,
+            self.proxy_config.https_proxy,
+            self.proxy_config.no_proxy_hosts,
+        )
+
        self.cleanup_extremities_with_dummy_events = config.get(
            "cleanup_extremities_with_dummy_events", True
        )
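Reviewer sketch (not part of the patch): `parse_proxy_config` prefers values from the homeserver config over the `http_proxy`, `https_proxy` and `no_proxy` environment variables. The simplified function below shows that precedence without the type validation. `resolve_proxy_settings` and its `env` parameter are hypothetical, added so the behaviour can be exercised without touching the real environment.

```python
from typing import Any, Dict, List, Optional, Tuple
from urllib.request import getproxies_environment


def resolve_proxy_settings(
    config: Dict[str, Any],
    env: Optional[Dict[str, str]] = None,
) -> Tuple[Optional[str], Optional[str], Optional[List[str]]]:
    """Return (http_proxy, https_proxy, no_proxy_hosts), config taking priority."""
    if env is None:
        # Same stdlib call the patch uses to read the *_proxy variables.
        env = getproxies_environment()

    # A key present in the config wins, even if its value is None.
    http_proxy = config.get("http_proxy", env.get("http"))
    https_proxy = config.get("https_proxy", env.get("https"))

    no_proxy_hosts = config.get("no_proxy_hosts")
    no_proxy_from_env = env.get("no")
    if no_proxy_hosts is None and no_proxy_from_env is not None:
        # The environment variable is a comma-separated list.
        no_proxy_hosts = no_proxy_from_env.split(",")

    return http_proxy, https_proxy, no_proxy_hosts
```

An empty `no_proxy_hosts` list in the config is not `None`, so it overrides the environment and means that no hosts bypass the proxy.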
@@ -152,6 +152,8 @@ class Keyring:
    def __init__(
        self, hs: "HomeServer", key_fetchers: "Optional[Iterable[KeyFetcher]]" = None
    ):
+        self.server_name = hs.hostname
+
        if key_fetchers is None:
            # Always fetch keys from the database.
            mutable_key_fetchers: List[KeyFetcher] = [StoreKeyFetcher(hs)]
@@ -169,7 +171,8 @@ class Keyring:
        self._fetch_keys_queue: BatchingQueue[
            _FetchKeyRequest, Dict[str, Dict[str, FetchKeyResult]]
        ] = BatchingQueue(
-            "keyring_server",
+            name="keyring_server",
+            server_name=self.server_name,
            clock=hs.get_clock(),
            # The method called to fetch each key
            process_batch_callback=self._inner_fetch_key_requests,
@@ -473,8 +476,12 @@ class Keyring:

 class KeyFetcher(metaclass=abc.ABCMeta):
    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
        self._queue = BatchingQueue(
-            self.__class__.__name__, hs.get_clock(), self._fetch_keys
+            name=self.__class__.__name__,
+            server_name=self.server_name,
+            clock=hs.get_clock(),
+            process_batch_callback=self._fetch_keys,
        )

    async def get_keys(
@@ -34,6 +34,7 @@ class InviteAutoAccepter:
    def __init__(self, config: AutoAcceptInvitesConfig, api: ModuleApi):
        # Keep a reference to the Module API.
        self._api = api
+        self.server_name = api.server_name
        self._config = config

        if not self._config.enabled:
@@ -545,8 +545,11 @@ def serialize_event(
            d["content"] = dict(d["content"])
            d["content"]["redacts"] = e.redacts

-    if config.include_admin_metadata and e.internal_metadata.is_soft_failed():
-        d["unsigned"]["io.element.synapse.soft_failed"] = True
+    if config.include_admin_metadata:
+        if e.internal_metadata.is_soft_failed():
+            d["unsigned"]["io.element.synapse.soft_failed"] = True
+        if e.internal_metadata.policy_server_spammy:
+            d["unsigned"]["io.element.synapse.policy_server_spammy"] = True

    only_event_fields = config.only_event_fields
    if only_event_fields:
@@ -174,6 +174,7 @@ class FederationBase:
                "Event not allowed by policy server, soft-failing %s", pdu.event_id
            )
            pdu.internal_metadata.soft_failed = True
+            pdu.internal_metadata.policy_server_spammy = True
            # Note: we don't redact the event so admins can inspect the event after the
            # fact. Other processes may redact the event, but that won't be applied to
            # the database copy of the event until the server's config requires it.
@@ -74,6 +74,7 @@ from synapse.federation.transport.client import SendJoinResponse
 from synapse.http.client import is_unknown_endpoint
 from synapse.http.types import QueryParams
 from synapse.logging.opentracing import SynapseTags, log_kv, set_tag, tag_args, trace
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import JsonDict, StrCollection, UserID, get_domain_from_id
 from synapse.types.handlers.policy_server import RECOMMENDATION_OK, RECOMMENDATION_SPAM
 from synapse.util.async_helpers import concurrently_execute
@@ -85,7 +86,9 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

-sent_queries_counter = Counter("synapse_federation_client_sent_queries", "", ["type"])
+sent_queries_counter = Counter(
+    "synapse_federation_client_sent_queries", "", labelnames=["type", SERVER_NAME_LABEL]
+)


 PDU_RETRY_TIME_MS = 1 * 60 * 1000
@@ -209,7 +212,10 @@ class FederationClient(FederationBase):
        Returns:
            The JSON object from the response
        """
-        sent_queries_counter.labels(query_type).inc()
+        sent_queries_counter.labels(
+            type=query_type,
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).inc()

        return await self.transport_layer.make_query(
            destination,
@@ -231,7 +237,10 @@ class FederationClient(FederationBase):
        Returns:
            The JSON object from the response
        """
-        sent_queries_counter.labels("client_device_keys").inc()
+        sent_queries_counter.labels(
+            type="client_device_keys",
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).inc()
        return await self.transport_layer.query_client_keys(
            destination, content, timeout
        )
@@ -242,7 +251,10 @@ class FederationClient(FederationBase):
        """Query the device keys for a list of user ids hosted on a remote
        server.
        """
-        sent_queries_counter.labels("user_devices").inc()
+        sent_queries_counter.labels(
+            type="user_devices",
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).inc()
        return await self.transport_layer.query_user_devices(
            destination, user_id, timeout
        )
@@ -264,7 +276,10 @@ class FederationClient(FederationBase):
        Returns:
            The JSON object from the response
        """
-        sent_queries_counter.labels("client_one_time_keys").inc()
+        sent_queries_counter.labels(
+            type="client_one_time_keys",
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).inc()

        # Convert the query with counts into a stable and unstable query and check
        # if attempting to claim more than 1 OTK.
@@ -82,6 +82,7 @@ from synapse.logging.opentracing import (
    tag_args,
    trace,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics.background_process_metrics import wrap_as_background_process
 from synapse.replication.http.federation import (
    ReplicationFederationSendEduRestServlet,
@@ -104,23 +105,30 @@ TRANSACTION_CONCURRENCY_LIMIT = 10

 logger = logging.getLogger(__name__)

-received_pdus_counter = Counter("synapse_federation_server_received_pdus", "")
+received_pdus_counter = Counter(
+    "synapse_federation_server_received_pdus", "", labelnames=[SERVER_NAME_LABEL]
+)

-received_edus_counter = Counter("synapse_federation_server_received_edus", "")
+received_edus_counter = Counter(
+    "synapse_federation_server_received_edus", "", labelnames=[SERVER_NAME_LABEL]
+)

 received_queries_counter = Counter(
-    "synapse_federation_server_received_queries", "", ["type"]
+    "synapse_federation_server_received_queries",
+    "",
+    labelnames=["type", SERVER_NAME_LABEL],
 )

 pdu_process_time = Histogram(
    "synapse_federation_server_pdu_process_time",
    "Time taken to process an event",
+    labelnames=[SERVER_NAME_LABEL],
 )

 last_pdu_ts_metric = Gauge(
    "synapse_federation_last_received_pdu_time",
    "The timestamp of the last PDU which was successfully received from the given domain",
-    labelnames=("server_name",),
+    labelnames=("origin_server_name", SERVER_NAME_LABEL),
 )


@@ -434,7 +442,9 @@ class FederationServer(FederationBase):
            report back to the sending server.
        """

-        received_pdus_counter.inc(len(transaction.pdus))
+        received_pdus_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc(
+            len(transaction.pdus)
+        )

        origin_host, _ = parse_server_name(origin)

@@ -545,7 +555,9 @@ class FederationServer(FederationBase):
        )

        if newest_pdu_ts and origin in self._federation_metrics_domains:
-            last_pdu_ts_metric.labels(server_name=origin).set(newest_pdu_ts / 1000)
+            last_pdu_ts_metric.labels(
+                origin_server_name=origin, **{SERVER_NAME_LABEL: self.server_name}
+            ).set(newest_pdu_ts / 1000)

        return pdu_results

@@ -553,7 +565,7 @@ class FederationServer(FederationBase):
        """Process the EDUs in a received transaction."""

        async def _process_edu(edu_dict: JsonDict) -> None:
-            received_edus_counter.inc()
+            received_edus_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc()

            edu = Edu(
                origin=origin,
@@ -668,7 +680,10 @@ class FederationServer(FederationBase):
    async def on_query_request(
        self, query_type: str, args: Dict[str, str]
    ) -> Tuple[int, Dict[str, Any]]:
-        received_queries_counter.labels(query_type).inc()
+        received_queries_counter.labels(
+            type=query_type,
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).inc()
        resp = await self.registry.on_query(query_type, args)
        return 200, resp

@@ -1310,9 +1325,9 @@ class FederationServer(FederationBase):
                    origin, event.event_id
                )
                if received_ts is not None:
-                    pdu_process_time.observe(
-                        (self._clock.time_msec() - received_ts) / 1000
-                    )
+                    pdu_process_time.labels(
+                        **{SERVER_NAME_LABEL: self.server_name}
+                    ).observe((self._clock.time_msec() - received_ts) / 1000)

            next = await self._get_next_nonspam_staged_event_for_room(
                room_id, room_version
@@ -54,7 +54,7 @@ from sortedcontainers import SortedDict

 from synapse.api.presence import UserPresenceState
 from synapse.federation.sender import AbstractFederationSender, FederationSender
-from synapse.metrics import LaterGauge
+from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
 from synapse.replication.tcp.streams.federation import FederationStream
 from synapse.types import JsonDict, ReadReceipt, RoomStreamToken, StrCollection
 from synapse.util.metrics import Measure
@@ -113,10 +113,10 @@ class FederationRemoteSendQueue(AbstractFederationSender):
        # changes. ARGH.
        def register(name: str, queue: Sized) -> None:
            LaterGauge(
-                "synapse_federation_send_queue_%s_size" % (queue_name,),
-                "",
-                [],
-                lambda: len(queue),
+                name="synapse_federation_send_queue_%s_size" % (queue_name,),
+                desc="",
+                labelnames=[SERVER_NAME_LABEL],
+                caller=lambda: {(self.server_name,): len(queue)},
            )

        for queue_name in [
@@ -160,6 +160,7 @@ from synapse.federation.sender.transaction_manager import TransactionManager
 from synapse.federation.units import Edu
 from synapse.logging.context import make_deferred_yieldable, run_in_background
 from synapse.metrics import (
+    SERVER_NAME_LABEL,
    LaterGauge,
    event_processing_loop_counter,
    event_processing_loop_room_count,
@@ -189,11 +190,13 @@ logger = logging.getLogger(__name__)
 sent_pdus_destination_dist_count = Counter(
    "synapse_federation_client_sent_pdu_destinations_count",
    "Number of PDUs queued for sending to one or more destinations",
+    labelnames=[SERVER_NAME_LABEL],
 )

 sent_pdus_destination_dist_total = Counter(
    "synapse_federation_client_sent_pdu_destinations",
    "Total number of PDUs queued for sending across all destinations",
+    labelnames=[SERVER_NAME_LABEL],
 )

 # Time (in s) to wait before trying to wake up destinations that have
@@ -296,6 +299,7 @@ class _DestinationWakeupQueue:

    Staggers waking up of per destination queues to ensure that we don't attempt
    to start TLS connections with many hosts all at once, leading to pinned CPU.
+
    """

    # The maximum duration in seconds between queuing up a destination and it
@@ -303,6 +307,10 @@ class _DestinationWakeupQueue:
    _MAX_TIME_IN_QUEUE = 30.0

    sender: "FederationSender" = attr.ib()
+    server_name: str = attr.ib()
+    """
+    Our homeserver name (used to label metrics) (`hs.hostname`).
+    """
    clock: Clock = attr.ib()
    max_delay_s: int = attr.ib()

@@ -391,31 +399,37 @@ class FederationSender(AbstractFederationSender):
        self._per_destination_queues: Dict[str, PerDestinationQueue] = {}

        LaterGauge(
-            "synapse_federation_transaction_queue_pending_destinations",
-            "",
-            [],
-            lambda: sum(
-                1
-                for d in self._per_destination_queues.values()
-                if d.transmission_loop_running
-            ),
+            name="synapse_federation_transaction_queue_pending_destinations",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {
+                (self.server_name,): sum(
+                    1
+                    for d in self._per_destination_queues.values()
+                    if d.transmission_loop_running
+                )
+            },
        )

        LaterGauge(
-            "synapse_federation_transaction_queue_pending_pdus",
-            "",
-            [],
-            lambda: sum(
-                d.pending_pdu_count() for d in self._per_destination_queues.values()
-            ),
+            name="synapse_federation_transaction_queue_pending_pdus",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {
+                (self.server_name,): sum(
+                    d.pending_pdu_count() for d in self._per_destination_queues.values()
+                )
+            },
        )
        LaterGauge(
-            "synapse_federation_transaction_queue_pending_edus",
-            "",
-            [],
-            lambda: sum(
-                d.pending_edu_count() for d in self._per_destination_queues.values()
-            ),
+            name="synapse_federation_transaction_queue_pending_edus",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {
+                (self.server_name,): sum(
+                    d.pending_edu_count() for d in self._per_destination_queues.values()
+                )
+            },
        )

        self._is_processing = False
@@ -427,7 +441,7 @@ class FederationSender(AbstractFederationSender):
            1.0 / hs.config.ratelimiting.federation_rr_transactions_per_room_per_second
        )
        self._destination_wakeup_queue = _DestinationWakeupQueue(
-            self, self.clock, max_delay_s=rr_txn_interval_per_room_s
+            self, self.server_name, self.clock, max_delay_s=rr_txn_interval_per_room_s
        )

        # Regularly wake up destinations that have outstanding PDUs to be caught up
@@ -435,6 +449,7 @@ class FederationSender(AbstractFederationSender):
            run_as_background_process,
            WAKEUP_RETRY_PERIOD_SEC * 1000.0,
            "wake_destinations_needing_catchup",
+            self.server_name,
            self._wake_destinations_needing_catchup,
        )

@@ -477,7 +492,9 @@ class FederationSender(AbstractFederationSender):

        # fire off a processing loop in the background
        run_as_background_process(
-            "process_event_queue_for_federation", self._process_event_queue_loop
+            "process_event_queue_for_federation",
+            self.server_name,
+            self._process_event_queue_loop,
        )

    async def _process_event_queue_loop(self) -> None:
@@ -650,7 +667,8 @@ class FederationSender(AbstractFederationSender):
                        ts = event_to_received_ts[event.event_id]
                        assert ts is not None
                        synapse.metrics.event_processing_lag_by_event.labels(
-                            "federation_sender"
+                            name="federation_sender",
+                            **{SERVER_NAME_LABEL: self.server_name},
                        ).observe((now - ts) / 1000)

                async def handle_room_events(events: List[EventBase]) -> None:
@@ -694,22 +712,30 @@ class FederationSender(AbstractFederationSender):
                    assert ts is not None

                    synapse.metrics.event_processing_lag.labels(
-                        "federation_sender"
+                        name="federation_sender",
+                        **{SERVER_NAME_LABEL: self.server_name},
                    ).set(now - ts)
                    synapse.metrics.event_processing_last_ts.labels(
-                        "federation_sender"
+                        name="federation_sender",
+                        **{SERVER_NAME_LABEL: self.server_name},
                    ).set(ts)

-                    events_processed_counter.inc(len(event_entries))
+                    events_processed_counter.labels(
+                        **{SERVER_NAME_LABEL: self.server_name}
+                    ).inc(len(event_entries))

-                    event_processing_loop_room_count.labels("federation_sender").inc(
-                        len(events_by_room)
-                    )
+                    event_processing_loop_room_count.labels(
+                        name="federation_sender",
+                        **{SERVER_NAME_LABEL: self.server_name},
+                    ).inc(len(events_by_room))

-                event_processing_loop_counter.labels("federation_sender").inc()
+                event_processing_loop_counter.labels(
+                    name="federation_sender",
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).inc()

                synapse.metrics.event_processing_positions.labels(
-                    "federation_sender"
+                    name="federation_sender", **{SERVER_NAME_LABEL: self.server_name}
                ).set(next_token)

        finally:
@@ -727,8 +753,12 @@ class FederationSender(AbstractFederationSender):
        if not destinations:
            return

-        sent_pdus_destination_dist_total.inc(len(destinations))
-        sent_pdus_destination_dist_count.inc()
+        sent_pdus_destination_dist_total.labels(
+            **{SERVER_NAME_LABEL: self.server_name}
+        ).inc(len(destinations))
+        sent_pdus_destination_dist_count.labels(
+            **{SERVER_NAME_LABEL: self.server_name}
+        ).inc()

        assert pdu.internal_metadata.stream_ordering

@@ -40,7 +40,7 @@ from synapse.federation.units import Edu
 from synapse.handlers.presence import format_user_presence_state
 from synapse.logging import issue9533_logger
 from synapse.logging.opentracing import SynapseTags, set_tag
-from synapse.metrics import sent_transactions_counter
+from synapse.metrics import SERVER_NAME_LABEL, sent_transactions_counter
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.types import JsonDict, ReadReceipt
 from synapse.util.retryutils import NotRetryingDestination, get_retry_limiter
@@ -56,13 +56,15 @@ logger = logging.getLogger(__name__)


 sent_edus_counter = Counter(
-    "synapse_federation_client_sent_edus", "Total number of EDUs successfully sent"
+    "synapse_federation_client_sent_edus",
+    "Total number of EDUs successfully sent",
+    labelnames=[SERVER_NAME_LABEL],
 )

 sent_edus_by_type = Counter(
    "synapse_federation_client_sent_edus_by_type",
    "Number of sent EDUs successfully sent, by event type",
-    ["type"],
+    labelnames=["type", SERVER_NAME_LABEL],
 )


@@ -91,7 +93,7 @@ class PerDestinationQueue:
        transaction_manager: "synapse.federation.sender.TransactionManager",
        destination: str,
    ):
-        self._server_name = hs.hostname
+        self.server_name = hs.hostname
        self._clock = hs.get_clock()
        self._storage_controllers = hs.get_storage_controllers()
        self._store = hs.get_datastores().main
@@ -311,6 +313,7 @@ class PerDestinationQueue:

        run_as_background_process(
            "federation_transaction_transmission_loop",
+            self.server_name,
            self._transaction_transmission_loop,
        )

@@ -322,7 +325,12 @@ class PerDestinationQueue:
            # This will throw if we wouldn't retry. We do this here so we fail
            # quickly, but we will later check this again in the http client,
            # hence why we throw the result away.
-            await get_retry_limiter(self._destination, self._clock, self._store)
+            await get_retry_limiter(
+                destination=self._destination,
+                our_server_name=self.server_name,
+                clock=self._clock,
+                store=self._store,
+            )

            if self._catching_up:
                # we potentially need to catch-up first
@@ -362,10 +370,17 @@ class PerDestinationQueue:
                        self._destination, pending_pdus, pending_edus
                    )

-                    sent_transactions_counter.inc()
-                    sent_edus_counter.inc(len(pending_edus))
+                    sent_transactions_counter.labels(
+                        **{SERVER_NAME_LABEL: self.server_name}
+                    ).inc()
+                    sent_edus_counter.labels(
+                        **{SERVER_NAME_LABEL: self.server_name}
+                    ).inc(len(pending_edus))
                    for edu in pending_edus:
-                        sent_edus_by_type.labels(edu.edu_type).inc()
+                        sent_edus_by_type.labels(
+                            type=edu.edu_type,
+                            **{SERVER_NAME_LABEL: self.server_name},
+                        ).inc()

        except NotRetryingDestination as e:
            logger.debug(
@@ -566,7 +581,7 @@ class PerDestinationQueue:
                    new_pdus = await filter_events_for_server(
                        self._storage_controllers,
                        self._destination,
-                        self._server_name,
+                        self.server_name,
                        new_pdus,
                        redact=False,
                        filter_out_erased_senders=True,
@@ -590,7 +605,9 @@ class PerDestinationQueue:
                    self._destination, room_catchup_pdus, []
                )

-                sent_transactions_counter.inc()
+                sent_transactions_counter.labels(
+                    **{SERVER_NAME_LABEL: self.server_name}
+                ).inc()

                # We pulled this from the DB, so it'll be non-null
                assert pdu.internal_metadata.stream_ordering
@@ -613,7 +630,7 @@ class PerDestinationQueue:
        # Send at most limit EDUs for receipts.
        for content in self._pending_receipt_edus[:limit]:
            yield Edu(
-                origin=self._server_name,
+                origin=self.server_name,
                destination=self._destination,
                edu_type=EduTypes.RECEIPT,
                content=content,
@@ -639,7 +656,7 @@ class PerDestinationQueue:
        )
        edus = [
            Edu(
-                origin=self._server_name,
+                origin=self.server_name,
                destination=self._destination,
                edu_type=edu_type,
                content=content,
@@ -666,7 +683,7 @@ class PerDestinationQueue:

        edus = [
            Edu(
-                origin=self._server_name,
+                origin=self.server_name,
                destination=self._destination,
                edu_type=EduTypes.DIRECT_TO_DEVICE,
                content=content,
@@ -739,7 +756,7 @@ class _TransactionQueueManager:

            pending_edus.append(
                Edu(
-                    origin=self.queue._server_name,
+                    origin=self.queue.server_name,
                    destination=self.queue._destination,
                    edu_type=EduTypes.PRESENCE,
                    content={"push": presence_to_add},
@@ -34,6 +34,7 @@ from synapse.logging.opentracing import (
    tags,
    whitelisted_homeserver,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import JsonDict
 from synapse.util import json_decoder
 from synapse.util.metrics import measure_func
@@ -47,7 +48,7 @@ issue_8631_logger = logging.getLogger("synapse.8631_debug")
 last_pdu_ts_metric = Gauge(
    "synapse_federation_last_sent_pdu_time",
    "The timestamp of the last PDU which was successfully sent to the given domain",
-    labelnames=("server_name",),
+    labelnames=("destination_server_name", SERVER_NAME_LABEL),
 )


@@ -191,6 +192,7 @@ class TransactionManager:

            if pdus and destination in self._federation_metrics_domains:
                last_pdu = pdus[-1]
-                last_pdu_ts_metric.labels(server_name=destination).set(
-                    last_pdu.origin_server_ts / 1000
-                )
+                last_pdu_ts_metric.labels(
+                    destination_server_name=destination,
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).set(last_pdu.origin_server_ts / 1000)
@@ -38,6 +38,9 @@ logger = logging.getLogger(__name__)
 class AccountValidityHandler:
    def __init__(self, hs: "HomeServer"):
        self.hs = hs
+        self.server_name = (
+            hs.hostname
+        )  # nb must be called this for @wrap_as_background_process
        self.config = hs.config
        self.store = hs.get_datastores().main
        self.send_email_handler = hs.get_send_email_handler()
@@ -42,6 +42,7 @@ from synapse.events import EventBase
 from synapse.handlers.presence import format_user_presence_state
 from synapse.logging.context import make_deferred_yieldable, run_in_background
 from synapse.metrics import (
+    SERVER_NAME_LABEL,
    event_processing_loop_counter,
    event_processing_loop_room_count,
 )
@@ -68,12 +69,16 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

-events_processed_counter = Counter("synapse_handlers_appservice_events_processed", "")
+events_processed_counter = Counter(
+    "synapse_handlers_appservice_events_processed", "", labelnames=[SERVER_NAME_LABEL]
+)


 class ApplicationServicesHandler:
    def __init__(self, hs: "HomeServer"):
-        self.server_name = hs.hostname
+        self.server_name = (
+            hs.hostname
+        )  # nb must be called this for @wrap_as_background_process
        self.store = hs.get_datastores().main
        self.is_mine_id = hs.is_mine_id
        self.appservice_api = hs.get_application_service_api()
@@ -166,7 +171,9 @@ class ApplicationServicesHandler:
                                except Exception:
                                    logger.error("Application Services Failure")

-                            run_as_background_process("as_scheduler", start_scheduler)
+                            run_as_background_process(
+                                "as_scheduler", self.server_name, start_scheduler
+                            )
                            self.started_scheduler = True

                        # Fork off pushes to these services
@@ -180,7 +187,8 @@ class ApplicationServicesHandler:
                        assert ts is not None

                        synapse.metrics.event_processing_lag_by_event.labels(
-                            "appservice_sender"
+                            name="appservice_sender",
+                            **{SERVER_NAME_LABEL: self.server_name},
                        ).observe((now - ts) / 1000)

                    async def handle_room_events(events: Iterable[EventBase]) -> None:
@@ -200,16 +208,23 @@ class ApplicationServicesHandler:
                    await self.store.set_appservice_last_pos(upper_bound)

                    synapse.metrics.event_processing_positions.labels(
-                        "appservice_sender"
+                        name="appservice_sender",
+                        **{SERVER_NAME_LABEL: self.server_name},
                    ).set(upper_bound)

-                    events_processed_counter.inc(len(events))
+                    events_processed_counter.labels(
+                        **{SERVER_NAME_LABEL: self.server_name}
+                    ).inc(len(events))

-                    event_processing_loop_room_count.labels("appservice_sender").inc(
-                        len(events_by_room)
-                    )
+                    event_processing_loop_room_count.labels(
+                        name="appservice_sender",
+                        **{SERVER_NAME_LABEL: self.server_name},
+                    ).inc(len(events_by_room))

-                    event_processing_loop_counter.labels("appservice_sender").inc()
+                    event_processing_loop_counter.labels(
+                        name="appservice_sender",
+                        **{SERVER_NAME_LABEL: self.server_name},
+                    ).inc()

                    if events:
                        now = self.clock.time_msec()
@@ -217,10 +232,12 @@ class ApplicationServicesHandler:
                        assert ts is not None

                        synapse.metrics.event_processing_lag.labels(
-                            "appservice_sender"
+                            name="appservice_sender",
+                            **{SERVER_NAME_LABEL: self.server_name},
                        ).set(now - ts)
                        synapse.metrics.event_processing_last_ts.labels(
-                            "appservice_sender"
+                            name="appservice_sender",
+                            **{SERVER_NAME_LABEL: self.server_name},
                        ).set(ts)
            finally:
                self.is_processing = False
@@ -70,6 +70,7 @@ from synapse.http import get_request_user_agent
 from synapse.http.server import finish_request, respond_with_html
 from synapse.http.site import SynapseRequest
 from synapse.logging.context import defer_to_thread
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.storage.databases.main.registration import (
    LoginTokenExpired,
@@ -95,7 +96,7 @@ INVALID_USERNAME_OR_PASSWORD = "Invalid username or password"
 invalid_login_token_counter = Counter(
    "synapse_user_login_invalid_login_tokens",
    "Counts the number of rejected m.login.token on /login",
-    ["reason"],
+    labelnames=["reason", SERVER_NAME_LABEL],
 )


@@ -199,6 +200,7 @@ class AuthHandler:
    SESSION_EXPIRE_MS = 48 * 60 * 60 * 1000

    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
        self.store = hs.get_datastores().main
        self.auth = hs.get_auth()
        self.auth_blocking = hs.get_auth_blocking()
@@ -248,6 +250,7 @@ class AuthHandler:
                run_as_background_process,
                5 * 60 * 1000,
                "expire_old_sessions",
+                self.server_name,
                self._expire_old_sessions,
            )

@@ -272,8 +275,6 @@ class AuthHandler:
            hs.config.sso.sso_account_deactivated_template
        )

-        self._server_name = hs.config.server.server_name
-
        # cast to tuple for use with str.startswith
        self._whitelisted_sso_clients = tuple(hs.config.sso.sso_client_whitelist)

@@ -281,7 +282,9 @@ class AuthHandler:
        # response.
        self._extra_attributes: Dict[str, SsoLoginExtraAttributes] = {}

-        self.msc3861_oauth_delegation_enabled = hs.config.experimental.msc3861.enabled
+        self._auth_delegation_enabled = (
+            hs.config.mas.enabled or hs.config.experimental.msc3861.enabled
+        )

    async def validate_user_via_ui_auth(
        self,
@@ -332,7 +335,7 @@ class AuthHandler:
            LimitExceededError if the ratelimiter's failed request count for this
                user is too high to proceed
        """
-        if self.msc3861_oauth_delegation_enabled:
+        if self._auth_delegation_enabled:
            raise SynapseError(
                HTTPStatus.INTERNAL_SERVER_ERROR, "UIA shouldn't be used with MSC3861"
            )
@@ -1479,11 +1482,20 @@ class AuthHandler:
        try:
            return await self.store.consume_login_token(login_token)
        except LoginTokenExpired:
-            invalid_login_token_counter.labels("expired").inc()
+            invalid_login_token_counter.labels(
+                reason="expired",
+                **{SERVER_NAME_LABEL: self.server_name},
+            ).inc()
        except LoginTokenReused:
-            invalid_login_token_counter.labels("reused").inc()
+            invalid_login_token_counter.labels(
+                reason="reused",
+                **{SERVER_NAME_LABEL: self.server_name},
+            ).inc()
        except NotFoundError:
-            invalid_login_token_counter.labels("not found").inc()
+            invalid_login_token_counter.labels(
+                reason="not found",
+                **{SERVER_NAME_LABEL: self.server_name},
+            ).inc()

        raise AuthError(403, "Invalid login token", errcode=Codes.FORBIDDEN)

@@ -1858,7 +1870,7 @@ class AuthHandler:
        html = self._sso_redirect_confirm_template.render(
            display_url=display_url,
            redirect_url=redirect_url,
-            server_name=self._server_name,
+            server_name=self.server_name,
            new_user=new_user,
            user_id=registered_user_id,
            user_profile=user_profile_data,
@@ -42,6 +42,7 @@ class DeactivateAccountHandler:
    def __init__(self, hs: "HomeServer"):
        self.store = hs.get_datastores().main
        self.hs = hs
+        self.server_name = hs.hostname
        self._auth_handler = hs.get_auth_handler()
        self._device_handler = hs.get_device_handler()
        self._room_member_handler = hs.get_room_member_handler()
@@ -271,7 +272,9 @@ class DeactivateAccountHandler:
        pending deactivation, if it isn't already running.
        """
        if not self._user_parter_running:
-            run_as_background_process("user_parter_loop", self._user_parter_loop)
+            run_as_background_process(
+                "user_parter_loop", self.server_name, self._user_parter_loop
+            )

    async def _user_parter_loop(self) -> None:
        """Loop that parts deactivated users from rooms"""
@@ -22,7 +22,7 @@ from synapse.api.errors import ShadowBanError
 from synapse.api.ratelimiting import Ratelimiter
 from synapse.config.workers import MAIN_PROCESS_INSTANCE_NAME
 from synapse.logging.opentracing import set_tag
-from synapse.metrics import event_processing_positions
+from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.replication.http.delayed_events import (
    ReplicationAddedDelayedEventRestServlet,
@@ -110,12 +110,13 @@ class DelayedEventsHandler:
                # Can send the events in background after having awaited on marking them as processed
                run_as_background_process(
                    "_send_events",
+                    self.server_name,
                    self._send_events,
                    events,
                )

            self._initialized_from_db = run_as_background_process(
-                "_schedule_db_events", _schedule_db_events
+                "_schedule_db_events", self.server_name, _schedule_db_events
            )
        else:
            self._repl_client = ReplicationAddedDelayedEventRestServlet.make_client(hs)
@@ -140,7 +141,9 @@ class DelayedEventsHandler:
            finally:
                self._event_processing = False

-        run_as_background_process("delayed_events.notify_new_event", process)
+        run_as_background_process(
+            "delayed_events.notify_new_event", self.server_name, process
+        )

    async def _unsafe_process_new_event(self) -> None:
        # If self._event_pos is None then means we haven't fetched it from the DB yet
@@ -188,7 +191,9 @@ class DelayedEventsHandler:
                self._event_pos = max_pos

                # Expose current event processing position to prometheus
-                event_processing_positions.labels("delayed_events").set(max_pos)
+                event_processing_positions.labels(
+                    name="delayed_events", **{SERVER_NAME_LABEL: self.server_name}
+                ).set(max_pos)

                await self._store.update_delayed_events_stream_pos(max_pos)

@@ -450,6 +455,7 @@ class DelayedEventsHandler:
                delay_sec,
                run_as_background_process,
                "_send_on_timeout",
+                self.server_name,
                self._send_on_timeout,
            )
        else:
@@ -193,8 +193,9 @@ class DeviceHandler:
            self.clock.looping_call(
                run_as_background_process,
                DELETE_STALE_DEVICES_INTERVAL_MS,
-                "delete_stale_devices",
-                self._delete_stale_devices,
+                desc="delete_stale_devices",
+                server_name=self.server_name,
+                func=self._delete_stale_devices,
            )

    async def _delete_stale_devices(self) -> None:
@@ -963,6 +964,9 @@ class DeviceWriterHandler(DeviceHandler):
    def __init__(self, hs: "HomeServer"):
        super().__init__(hs)

+        self.server_name = (
+            hs.hostname
+        )  # nb must be called this for @measure_func and @wrap_as_background_process
        # We only need to poke the federation sender explicitly if its on the
        # same instance. Other federation sender instances will get notified by
        # `synapse.app.generic_worker.FederationSenderHandler` when it sees it
@@ -1440,6 +1444,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
    def __init__(self, hs: "HomeServer", device_handler: DeviceWriterHandler):
        super().__init__(hs)

+        self.server_name = hs.hostname
        self.federation = hs.get_federation_client()
        self.server_name = hs.hostname  # nb must be called this for @measure_func
        self.clock = hs.get_clock()  # nb must be called this for @measure_func
@@ -1470,6 +1475,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
        self.clock.looping_call(
            run_as_background_process,
            30 * 1000,
+            server_name=self.server_name,
            func=self._maybe_retry_device_resync,
            desc="_maybe_retry_device_resync",
        )
@@ -1591,6 +1597,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
                await self.store.mark_remote_users_device_caches_as_stale([user_id])
                run_as_background_process(
                    "_maybe_retry_device_resync",
+                    self.server_name,
                    self.multi_user_device_resync,
                    [user_id],
                    False,
@@ -71,6 +71,7 @@ from synapse.handlers.pagination import PURGE_PAGINATION_LOCK_NAME
 from synapse.http.servlet import assert_params_in_dict
 from synapse.logging.context import nested_logging_context
 from synapse.logging.opentracing import SynapseTags, set_tag, tag_args, trace
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.module_api import NOT_SPAM
 from synapse.storage.databases.main.events_worker import EventRedactBehaviour
@@ -90,7 +91,7 @@ logger = logging.getLogger(__name__)
 backfill_processing_before_timer = Histogram(
    "synapse_federation_backfill_processing_before_time_seconds",
    "sec",
-    [],
+    labelnames=[SERVER_NAME_LABEL],
    buckets=(
        0.1,
        0.5,
@@ -187,7 +188,9 @@ class FederationHandler:
        # were shut down.
        if not hs.config.worker.worker_app:
            run_as_background_process(
-                "resume_sync_partial_state_room", self._resume_partial_state_room_sync
+                "resume_sync_partial_state_room",
+                self.server_name,
+                self._resume_partial_state_room_sync,
            )

    @trace
@@ -316,6 +319,7 @@ class FederationHandler:
            )
            run_as_background_process(
                "_maybe_backfill_inner_anyway_with_max_depth",
+                self.server_name,
                self.maybe_backfill,
                room_id=room_id,
                # We use `MAX_DEPTH` so that we find all backfill points next
@@ -530,9 +534,9 @@ class FederationHandler:
        # backfill points regardless of `current_depth`.
        if processing_start_time is not None:
            processing_end_time = self.clock.time_msec()
-            backfill_processing_before_timer.observe(
-                (processing_end_time - processing_start_time) / 1000
-            )
+            backfill_processing_before_timer.labels(
+                **{SERVER_NAME_LABEL: self.server_name}
+            ).observe((processing_end_time - processing_start_time) / 1000)

        success = await try_backfill(likely_domains)
        if success:
@@ -798,7 +802,10 @@ class FederationHandler:
            # have. Hence we fire off the background task, but don't wait for it.

            run_as_background_process(
-                "handle_queued_pdus", self._handle_queued_pdus, room_queue
+                "handle_queued_pdus",
+                self.server_name,
+                self._handle_queued_pdus,
+                room_queue,
            )

    async def do_knock(
@@ -1870,7 +1877,9 @@ class FederationHandler:
                        )

        run_as_background_process(
-            desc="sync_partial_state_room", func=_sync_partial_state_room_wrapper
+            desc="sync_partial_state_room",
+            server_name=self.server_name,
+            func=_sync_partial_state_room_wrapper,
        )

    async def _sync_partial_state_room(
@@ -76,6 +76,7 @@ from synapse.logging.opentracing import (
    tag_args,
    trace,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.replication.http.federation import (
    ReplicationFederationSendEventsRestServlet,
@@ -105,13 +106,14 @@ logger = logging.getLogger(__name__)
 soft_failed_event_counter = Counter(
    "synapse_federation_soft_failed_events_total",
    "Events received over federation that we marked as soft_failed",
+    labelnames=[SERVER_NAME_LABEL],
 )

 # Added to debug performance and track progress on optimizations
 backfill_processing_after_timer = Histogram(
    "synapse_federation_backfill_processing_after_time_seconds",
    "sec",
-    [],
+    labelnames=[SERVER_NAME_LABEL],
    buckets=(
        0.1,
        0.25,
@@ -146,6 +148,7 @@ class FederationEventHandler:
    """

    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
        self._clock = hs.get_clock()
        self._store = hs.get_datastores().main
        self._state_store = hs.get_datastores().state
@@ -170,7 +173,6 @@ class FederationEventHandler:

        self._is_mine_id = hs.is_mine_id
        self._is_mine_server_name = hs.is_mine_server_name
-        self._server_name = hs.hostname
        self._instance_name = hs.get_instance_name()

        self._config = hs.config
@@ -249,7 +251,7 @@ class FederationEventHandler:
        # Note that if we were never in the room then we would have already
        # dropped the event, since we wouldn't know the room version.
        is_in_room = await self._event_auth_handler.is_host_in_room(
-            room_id, self._server_name
+            room_id, self.server_name
        )
        if not is_in_room:
            logger.info(
@@ -690,7 +692,9 @@ class FederationEventHandler:
        if not events:
            return

-        with backfill_processing_after_timer.time():
+        with backfill_processing_after_timer.labels(
+            **{SERVER_NAME_LABEL: self.server_name}
+        ).time():
            # if there are any events in the wrong room, the remote server is buggy and
            # should not be trusted.
            for ev in events:
@@ -930,6 +934,7 @@ class FederationEventHandler:
        if len(events_with_failed_pull_attempts) > 0:
            run_as_background_process(
                "_process_new_pulled_events_with_failed_pull_attempts",
+                self.server_name,
                _process_new_pulled_events,
                events_with_failed_pull_attempts,
            )
@@ -1523,6 +1528,7 @@ class FederationEventHandler:
            if resync:
                run_as_background_process(
                    "resync_device_due_to_pdu",
+                    self.server_name,
                    self._resync_device,
                    event.sender,
                )
@@ -2049,7 +2055,9 @@ class FederationEventHandler:
                    "hs": origin,
                },
            )
-            soft_failed_event_counter.inc()
+            soft_failed_event_counter.labels(
+                **{SERVER_NAME_LABEL: self.server_name}
+            ).inc()
            event.internal_metadata.soft_failed = True

    async def _load_or_fetch_auth_events_for_event(
@@ -67,7 +67,6 @@ from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
 from synapse.logging import opentracing
 from synapse.logging.context import make_deferred_yieldable, run_in_background
 from synapse.metrics.background_process_metrics import run_as_background_process
-from synapse.replication.http.send_event import ReplicationSendEventRestServlet
 from synapse.replication.http.send_events import ReplicationSendEventsRestServlet
 from synapse.storage.databases.main.events_worker import EventRedactBehaviour
 from synapse.types import (
@@ -97,6 +96,7 @@ class MessageHandler:
    """Contains some read only APIs to get state about a room"""

    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
        self.auth = hs.get_auth()
        self.clock = hs.get_clock()
        self.state = hs.get_state_handler()
@@ -112,7 +112,7 @@ class MessageHandler:

        if not hs.config.worker.worker_app:
            run_as_background_process(
-                "_schedule_next_expiry", self._schedule_next_expiry
+                "_schedule_next_expiry", self.server_name, self._schedule_next_expiry
            )

    async def get_room_data(
@@ -444,6 +444,7 @@ class MessageHandler:
            delay,
            run_as_background_process,
            "_expire_event",
+            self.server_name,
            self._expire_event,
            event_id,
        )
@@ -504,7 +505,6 @@ class EventCreationHandler:

        self.room_prejoin_state_types = self.hs.config.api.room_prejoin_state

-        self.send_event = ReplicationSendEventRestServlet.make_client(hs)
        self.send_events = ReplicationSendEventsRestServlet.make_client(hs)

        self.request_ratelimiter = hs.get_request_ratelimiter()
@@ -546,6 +546,7 @@ class EventCreationHandler:
            self.clock.looping_call(
                lambda: run_as_background_process(
                    "send_dummy_events_to_fill_extremities",
+                    self.server_name,
                    self._send_dummy_events_to_fill_extremities,
                ),
                5 * 60 * 1000,
@@ -646,38 +647,46 @@ class EventCreationHandler:
        """
        await self.auth_blocking.check_auth_blocking(requester=requester)

-        requester_suspended = await self.store.get_user_suspended_status(
-            requester.user.to_string()
+        # The requester may be a regular user, but puppeted by the server.
+        request_by_server = (
+            requester.authenticated_entity == self.hs.config.server.server_name
        )
-        if requester_suspended:
-            # We want to allow suspended users to perform "corrective" actions
-            # asked of them by server admins, such as redact their messages and
-            # leave rooms.
-            if event_dict["type"] in ["m.room.redaction", "m.room.member"]:
-                if event_dict["type"] == "m.room.redaction":
-                    event = await self.store.get_event(
-                        event_dict["content"]["redacts"], allow_none=True
-                    )
-                    if event:
-                        if event.sender != requester.user.to_string():
+
+        # If the request is initiated by the server, ignore whether the
+        # requester or target is suspended.
+        if not request_by_server:
+            requester_suspended = await self.store.get_user_suspended_status(
+                requester.user.to_string()
+            )
+            if requester_suspended:
+                # We want to allow suspended users to perform "corrective" actions
+                # asked of them by server admins, such as redact their messages and
+                # leave rooms.
+                if event_dict["type"] in ["m.room.redaction", "m.room.member"]:
+                    if event_dict["type"] == "m.room.redaction":
+                        event = await self.store.get_event(
+                            event_dict["content"]["redacts"], allow_none=True
+                        )
+                        if event:
+                            if event.sender != requester.user.to_string():
+                                raise SynapseError(
+                                    403,
+                                    "You can only redact your own events while account is suspended.",
+                                    Codes.USER_ACCOUNT_SUSPENDED,
+                                )
+                    if event_dict["type"] == "m.room.member":
+                        if event_dict["content"]["membership"] != "leave":
                            raise SynapseError(
                                403,
-                                "You can only redact your own events while account is suspended.",
+                                "Changing membership while account is suspended is not allowed.",
                                Codes.USER_ACCOUNT_SUSPENDED,
                            )
-                if event_dict["type"] == "m.room.member":
-                    if event_dict["content"]["membership"] != "leave":
-                        raise SynapseError(
-                            403,
-                            "Changing membership while account is suspended is not allowed.",
-                            Codes.USER_ACCOUNT_SUSPENDED,
-                        )
-            else:
-                raise SynapseError(
-                    403,
-                    "Sending messages while account is suspended is not allowed.",
-                    Codes.USER_ACCOUNT_SUSPENDED,
-                )
+                else:
+                    raise SynapseError(
+                        403,
+                        "Sending messages while account is suspended is not allowed.",
+                        Codes.USER_ACCOUNT_SUSPENDED,
+                    )

        is_create_event = (
            event_dict["type"] == EventTypes.Create and event_dict["state_key"] == ""
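
Reviewer note: the suspension check in the hunk above can be read as a small decision function. The sketch below is a standalone illustration, not Synapse code: `suspended_send_error` and its parameters are hypothetical names, and it returns the error message instead of raising `SynapseError`, so it runs without a homeserver.

```python
from typing import Optional


def suspended_send_error(
    event_type: str,
    content: dict,
    requester_id: str,
    redacted_event_sender: Optional[str],
    requester_suspended: bool,
    request_by_server: bool,
) -> Optional[str]:
    """Return the rejection message for a send attempt, or None if allowed.

    Hypothetical helper mirroring the branch structure of the hunk above.
    `redacted_event_sender` is None when the redacted event is not found.
    """
    # Requests puppeted by the server skip the suspension check entirely.
    if request_by_server or not requester_suspended:
        return None

    if event_type == "m.room.redaction":
        # Suspended users may only redact their own events.
        if redacted_event_sender is not None and redacted_event_sender != requester_id:
            return "You can only redact your own events while account is suspended."
        return None

    if event_type == "m.room.member":
        # Suspended users may only leave rooms.
        if content["membership"] != "leave":
            return "Changing membership while account is suspended is not allowed."
        return None

    # Every other event type is rejected for suspended users.
    return "Sending messages while account is suspended is not allowed."
```

The one behavioural change in the hunk is the early return when `request_by_server` is true. The rest of the branch structure is the previous logic, re-indented.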
@@ -1107,6 +1116,9 @@ class EventCreationHandler:

                policy_allowed = await self._policy_handler.is_event_allowed(event)
                if not policy_allowed:
+                    # We shouldn't need to set the metadata because the raise should
+                    # cause the request to be denied, but just in case:
+                    event.internal_metadata.policy_server_spammy = True
                    logger.warning(
                        "Event not allowed by policy server, rejecting %s",
                        event.event_id,
@@ -2070,6 +2082,7 @@ class EventCreationHandler:
                # matters as sometimes presence code can take a while.
                run_as_background_process(
                    "bump_presence_active_time",
+                    self.server_name,
                    self._bump_active_time,
                    requester.user,
                    requester.device_id,
@@ -79,12 +79,12 @@ class PaginationHandler:

    def __init__(self, hs: "HomeServer"):
        self.hs = hs
+        self.server_name = hs.hostname
        self.auth = hs.get_auth()
        self.store = hs.get_datastores().main
        self._storage_controllers = hs.get_storage_controllers()
        self._state_storage_controller = self._storage_controllers.state
        self.clock = hs.get_clock()
-        self._server_name = hs.hostname
        self._room_shutdown_handler = hs.get_room_shutdown_handler()
        self._relations_handler = hs.get_relations_handler()
        self._worker_locks = hs.get_worker_locks_handler()
@@ -119,6 +119,7 @@ class PaginationHandler:
                    run_as_background_process,
                    job.interval,
                    "purge_history_for_rooms_in_range",
+                    self.server_name,
                    self.purge_history_for_rooms_in_range,
                    job.shortest_max_lifetime,
                    job.longest_max_lifetime,
@@ -245,6 +246,7 @@ class PaginationHandler:
            # other purges in the same room.
            run_as_background_process(
                PURGE_HISTORY_ACTION_NAME,
+                self.server_name,
                self.purge_history,
                room_id,
                token,
@@ -395,7 +397,7 @@ class PaginationHandler:
            write=True,
        ):
            # first check that we have no users in this room
-            joined = await self.store.is_host_joined(room_id, self._server_name)
+            joined = await self.store.is_host_joined(room_id, self.server_name)
            if joined:
                if force:
                    logger.info(
@@ -604,6 +606,7 @@ class PaginationHandler:
                # for a costly federation call and processing.
                run_as_background_process(
                    "maybe_backfill_in_the_background",
+                    self.server_name,
                    self.hs.get_federation_handler().maybe_backfill,
                    room_id,
                    curr_topo,
@@ -105,7 +105,7 @@ from synapse.api.presence import UserDevicePresenceState, UserPresenceState
 from synapse.appservice import ApplicationService
 from synapse.events.presence_router import PresenceRouter
 from synapse.logging.context import run_in_background
-from synapse.metrics import LaterGauge
+from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
 from synapse.metrics.background_process_metrics import (
    run_as_background_process,
    wrap_as_background_process,
@@ -137,24 +137,40 @@ if TYPE_CHECKING:
 logger = logging.getLogger(__name__)


-notified_presence_counter = Counter("synapse_handler_presence_notified_presence", "")
+notified_presence_counter = Counter(
+    "synapse_handler_presence_notified_presence", "", labelnames=[SERVER_NAME_LABEL]
+)
 federation_presence_out_counter = Counter(
-    "synapse_handler_presence_federation_presence_out", ""
+    "synapse_handler_presence_federation_presence_out",
+    "",
+    labelnames=[SERVER_NAME_LABEL],
+)
+presence_updates_counter = Counter(
+    "synapse_handler_presence_presence_updates", "", labelnames=[SERVER_NAME_LABEL]
+)
+timers_fired_counter = Counter(
+    "synapse_handler_presence_timers_fired", "", labelnames=[SERVER_NAME_LABEL]
 )
-presence_updates_counter = Counter("synapse_handler_presence_presence_updates", "")
-timers_fired_counter = Counter("synapse_handler_presence_timers_fired", "")
 federation_presence_counter = Counter(
-    "synapse_handler_presence_federation_presence", ""
+    "synapse_handler_presence_federation_presence", "", labelnames=[SERVER_NAME_LABEL]
+)
+bump_active_time_counter = Counter(
+    "synapse_handler_presence_bump_active_time", "", labelnames=[SERVER_NAME_LABEL]
 )
-bump_active_time_counter = Counter("synapse_handler_presence_bump_active_time", "")

-get_updates_counter = Counter("synapse_handler_presence_get_updates", "", ["type"])
+get_updates_counter = Counter(
+    "synapse_handler_presence_get_updates", "", labelnames=["type", SERVER_NAME_LABEL]
+)

 notify_reason_counter = Counter(
-    "synapse_handler_presence_notify_reason", "", ["locality", "reason"]
+    "synapse_handler_presence_notify_reason",
+    "",
+    labelnames=["locality", "reason", SERVER_NAME_LABEL],
 )
 state_transition_counter = Counter(
-    "synapse_handler_presence_state_transition", "", ["locality", "from", "to"]
+    "synapse_handler_presence_state_transition",
+    "",
+    labelnames=["locality", "from", "to", SERVER_NAME_LABEL],
 )

 # If a user was last active in the last LAST_ACTIVE_GRANULARITY, consider them
@@ -484,6 +500,7 @@ class _NullContextManager(ContextManager[None]):
 class WorkerPresenceHandler(BasePresenceHandler):
    def __init__(self, hs: "HomeServer"):
        super().__init__(hs)
+        self.server_name = hs.hostname
        self._presence_writer_instance = hs.config.worker.writers.presence[0]

        # Route presence EDUs to the right worker
@@ -517,6 +534,7 @@ class WorkerPresenceHandler(BasePresenceHandler):
            "shutdown",
            run_as_background_process,
            "generic_presence.on_shutdown",
+            self.server_name,
            self._on_shutdown,
        )

@@ -666,7 +684,9 @@ class WorkerPresenceHandler(BasePresenceHandler):
            old_state = self.user_to_current_state.get(new_state.user_id)
            self.user_to_current_state[new_state.user_id] = new_state
            is_mine = self.is_mine_id(new_state.user_id)
-            if not old_state or should_notify(old_state, new_state, is_mine):
+            if not old_state or should_notify(
+                old_state, new_state, is_mine, self.server_name
+            ):
                state_to_notify.append(new_state)

        stream_id = token
@@ -747,7 +767,9 @@ class WorkerPresenceHandler(BasePresenceHandler):
 class PresenceHandler(BasePresenceHandler):
    def __init__(self, hs: "HomeServer"):
        super().__init__(hs)
-        self.server_name = hs.hostname
+        self.server_name = (
+            hs.hostname
+        )  # nb must be called this for @wrap_as_background_process
        self.wheel_timer: WheelTimer[str] = WheelTimer()
        self.notifier = hs.get_notifier()

@@ -758,10 +780,10 @@ class PresenceHandler(BasePresenceHandler):
        )

        LaterGauge(
-            "synapse_handlers_presence_user_to_current_state_size",
-            "",
-            [],
-            lambda: len(self.user_to_current_state),
+            name="synapse_handlers_presence_user_to_current_state_size",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {(self.server_name,): len(self.user_to_current_state)},
        )

        # The per-device presence state, maps user to devices to per-device presence state.
@@ -815,6 +837,7 @@ class PresenceHandler(BasePresenceHandler):
            "shutdown",
            run_as_background_process,
            "presence.on_shutdown",
+            self.server_name,
            self._on_shutdown,
        )

@@ -860,10 +883,10 @@ class PresenceHandler(BasePresenceHandler):
            )

        LaterGauge(
-            "synapse_handlers_presence_wheel_timer_size",
-            "",
-            [],
-            lambda: len(self.wheel_timer),
+            name="synapse_handlers_presence_wheel_timer_size",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {(self.server_name,): len(self.wheel_timer)},
        )

        # Used to handle sending of presence to newly joined users/servers
@@ -972,6 +995,7 @@ class PresenceHandler(BasePresenceHandler):
                    prev_state,
                    new_state,
                    is_mine=self.is_mine_id(user_id),
+                    our_server_name=self.server_name,
                    wheel_timer=self.wheel_timer,
                    now=now,
                    # When overriding disabled presence, don't kick off all the
@@ -991,10 +1015,14 @@ class PresenceHandler(BasePresenceHandler):

            # TODO: We should probably ensure there are no races hereafter

-            presence_updates_counter.inc(len(new_states))
+            presence_updates_counter.labels(
+                **{SERVER_NAME_LABEL: self.server_name}
+            ).inc(len(new_states))

            if to_notify:
-                notified_presence_counter.inc(len(to_notify))
+                notified_presence_counter.labels(
+                    **{SERVER_NAME_LABEL: self.server_name}
+                ).inc(len(to_notify))
                await self._persist_and_notify(list(to_notify.values()))

            self.unpersisted_users_changes |= {s.user_id for s in new_states}
@@ -1013,7 +1041,9 @@ class PresenceHandler(BasePresenceHandler):
                if user_id not in to_notify
            }
            if to_federation_ping:
-                federation_presence_out_counter.inc(len(to_federation_ping))
+                federation_presence_out_counter.labels(
+                    **{SERVER_NAME_LABEL: self.server_name}
+                ).inc(len(to_federation_ping))

                hosts_to_states = await get_interested_remotes(
                    self.store,
@@ -1063,7 +1093,9 @@ class PresenceHandler(BasePresenceHandler):
            for user_id in users_to_check
        ]

-        timers_fired_counter.inc(len(states))
+        timers_fired_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc(
+            len(states)
+        )

        # Set of user ID & device IDs which are currently syncing.
        syncing_user_devices = {
@@ -1097,7 +1129,7 @@ class PresenceHandler(BasePresenceHandler):

        user_id = user.to_string()

-        bump_active_time_counter.inc()
+        bump_active_time_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc()

        now = self.clock.time_msec()

@@ -1349,7 +1381,9 @@ class PresenceHandler(BasePresenceHandler):
            updates.append(prev_state.copy_and_replace(**new_fields))

        if updates:
-            federation_presence_counter.inc(len(updates))
+            federation_presence_counter.labels(
+                **{SERVER_NAME_LABEL: self.server_name}
+            ).inc(len(updates))
            await self._update_states(updates)

    async def set_state(
@@ -1495,7 +1529,9 @@ class PresenceHandler(BasePresenceHandler):
            finally:
                self._event_processing = False

-        run_as_background_process("presence.notify_new_event", _process_presence)
+        run_as_background_process(
+            "presence.notify_new_event", self.server_name, _process_presence
+        )

    async def _unsafe_process(self) -> None:
        # Loop round handling deltas until we're up to date
@@ -1532,9 +1568,9 @@ class PresenceHandler(BasePresenceHandler):
                self._event_pos = max_pos

                # Expose current event processing position to prometheus
-                synapse.metrics.event_processing_positions.labels("presence").set(
-                    max_pos
-                )
+                synapse.metrics.event_processing_positions.labels(
+                    name="presence", **{SERVER_NAME_LABEL: self.server_name}
+                ).set(max_pos)

    async def _handle_state_delta(self, room_id: str, deltas: List[StateDelta]) -> None:
        """Process current state deltas for the room to find new joins that need
@@ -1660,7 +1696,10 @@ class PresenceHandler(BasePresenceHandler):


 def should_notify(
-    old_state: UserPresenceState, new_state: UserPresenceState, is_mine: bool
+    old_state: UserPresenceState,
+    new_state: UserPresenceState,
+    is_mine: bool,
+    our_server_name: str,
 ) -> bool:
    """Decides if a presence state change should be sent to interested parties."""
    user_location = "remote"
@@ -1671,19 +1710,38 @@ def should_notify(
        return False

    if old_state.status_msg != new_state.status_msg:
-        notify_reason_counter.labels(user_location, "status_msg_change").inc()
+        notify_reason_counter.labels(
+            locality=user_location,
+            reason="status_msg_change",
+            **{SERVER_NAME_LABEL: our_server_name},
+        ).inc()
        return True

    if old_state.state != new_state.state:
-        notify_reason_counter.labels(user_location, "state_change").inc()
+        notify_reason_counter.labels(
+            locality=user_location,
+            reason="state_change",
+            **{SERVER_NAME_LABEL: our_server_name},
+        ).inc()
        state_transition_counter.labels(
-            user_location, old_state.state, new_state.state
+            **{
+                "locality": user_location,
+                # `from` is a reserved word in Python so we have to label it this way if
+                # we want to use keyword args.
+                "from": old_state.state,
+                "to": new_state.state,
+                SERVER_NAME_LABEL: our_server_name,
+            },
        ).inc()
        return True

    if old_state.state == PresenceState.ONLINE:
        if new_state.currently_active != old_state.currently_active:
-            notify_reason_counter.labels(user_location, "current_active_change").inc()
+            notify_reason_counter.labels(
+                locality=user_location,
+                reason="current_active_change",
+                **{SERVER_NAME_LABEL: our_server_name},
+            ).inc()
            return True

        if (
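
Reviewer note: the `state_transition_counter` call above passes its labels through `**{...}` because `from` is a reserved word and cannot be written as a plain keyword argument. The sketch below shows the same trick on a hypothetical stand-in class. `LabelledCounter` is not the `prometheus_client` API, and the value given to `SERVER_NAME_LABEL` here is an assumption made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Assumed value, for illustration only.
SERVER_NAME_LABEL = "server_name"


class LabelledCounter:
    """Hypothetical in-memory counter keyed by label values."""

    def __init__(self, labelnames: Iterable[str]) -> None:
        self.labelnames = tuple(labelnames)
        self.values: Dict[Tuple[str, ...], int] = {}

    def inc(self, amount: int = 1, **labels: str) -> None:
        # Require exactly the declared label names, as keyword arguments.
        if set(labels) != set(self.labelnames):
            raise ValueError(f"expected labels {self.labelnames}, got {tuple(labels)}")
        key = tuple(labels[name] for name in self.labelnames)
        self.values[key] = self.values.get(key, 0) + amount


transitions = LabelledCounter(["locality", "from", "to", SERVER_NAME_LABEL])

# Writing `transitions.inc(from="online")` is a SyntaxError, so the labels
# are unpacked from a dict instead. A dict key may be any string.
transitions.inc(
    **{
        "locality": "local",
        "from": "online",
        "to": "offline",
        SERVER_NAME_LABEL: "example.org",
    }
)
```

Unpacking a dict also lets the label name come from a constant such as `SERVER_NAME_LABEL`, which is why the other counters in this file use the same form even when none of their labels is a reserved word.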
@@ -1693,14 +1751,18 @@ def should_notify(
            # Only notify about last active bumps if we're not currently active
            if not new_state.currently_active:
                notify_reason_counter.labels(
-                    user_location, "last_active_change_online"
+                    locality=user_location,
+                    reason="last_active_change_online",
+                    **{SERVER_NAME_LABEL: our_server_name},
                ).inc()
                return True

    elif new_state.last_active_ts - old_state.last_active_ts > LAST_ACTIVE_GRANULARITY:
        # Always notify for a transition where last active gets bumped.
        notify_reason_counter.labels(
-            user_location, "last_active_change_not_online"
+            locality=user_location,
+            reason="last_active_change_not_online",
+            **{SERVER_NAME_LABEL: our_server_name},
        ).inc()
        return True

@@ -1767,6 +1829,7 @@ class PresenceEventSource(EventSource[int, UserPresenceState]):
        self.server_name = hs.hostname
        self.get_presence_handler = hs.get_presence_handler
        self.get_presence_router = hs.get_presence_router
+        self.server_name = hs.hostname
        self.clock = hs.get_clock()
        self.store = hs.get_datastores().main

@@ -1878,7 +1941,10 @@ class PresenceEventSource(EventSource[int, UserPresenceState]):

                    # If we have the full list of changes for presence we can
                    # simply check which ones share a room with the user.
-                    get_updates_counter.labels("stream").inc()
+                    get_updates_counter.labels(
+                        type="stream",
+                        **{SERVER_NAME_LABEL: self.server_name},
+                    ).inc()

                    sharing_users = await self.store.do_users_share_a_room(
                        user_id, updated_users
@@ -1891,7 +1957,10 @@ class PresenceEventSource(EventSource[int, UserPresenceState]):
                else:
                    # Too many possible updates. Find all users we can see and check
                    # if any of them have changed.
-                    get_updates_counter.labels("full").inc()
+                    get_updates_counter.labels(
+                        type="full",
+                        **{SERVER_NAME_LABEL: self.server_name},
+                    ).inc()

                    users_interested_in = (
                        await self.store.get_users_who_share_room_with_user(user_id)
@@ -2141,6 +2210,7 @@ def handle_update(
    prev_state: UserPresenceState,
    new_state: UserPresenceState,
    is_mine: bool,
+    our_server_name: str,
    wheel_timer: WheelTimer,
    now: int,
    persist: bool,
@@ -2153,6 +2223,7 @@ def handle_update(
        prev_state
        new_state
        is_mine: Whether the user is ours
+        our_server_name: The homeserver name of our server (`hs.hostname`)
        wheel_timer
        now: Time now in ms
        persist: True if this state should persist until another update occurs.
@@ -2221,7 +2292,7 @@ def handle_update(
            )

    # Check whether the change was something worth notifying about
-    if should_notify(prev_state, new_state, is_mine):
+    if should_notify(prev_state, new_state, is_mine, our_server_name):
        new_state = new_state.copy_and_replace(last_federation_update_ts=now)
        persist_and_notify = True

@@ -124,7 +124,7 @@ class ProfileHandler:
            except RequestSendFailed as e:
                raise SynapseError(502, "Failed to fetch profile") from e
            except HttpResponseException as e:
-                if e.code < 500 and e.code != 404:
+                if e.code < 500 and e.code not in (403, 404):
                    # Other codes are not allowed in c2s API
                    logger.info(
                        "Server replied with wrong response: %s %s", e.code, e.msg
@@ -45,6 +45,7 @@ from synapse.api.errors import (
 from synapse.appservice import ApplicationService
 from synapse.config.server import is_threepid_reserved
 from synapse.http.servlet import assert_params_in_dict
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.replication.http.login import RegisterDeviceReplicationServlet
 from synapse.replication.http.register import (
    ReplicationPostRegisterActionsServlet,
@@ -62,29 +63,38 @@ logger = logging.getLogger(__name__)
 registration_counter = Counter(
    "synapse_user_registrations_total",
    "Number of new users registered (since restart)",
-    ["guest", "shadow_banned", "auth_provider"],
+    labelnames=["guest", "shadow_banned", "auth_provider", SERVER_NAME_LABEL],
 )

 login_counter = Counter(
    "synapse_user_logins_total",
    "Number of user logins (since restart)",
-    ["guest", "auth_provider"],
+    labelnames=["guest", "auth_provider", SERVER_NAME_LABEL],
 )


-def init_counters_for_auth_provider(auth_provider_id: str) -> None:
+def init_counters_for_auth_provider(auth_provider_id: str, server_name: str) -> None:
    """Ensure the prometheus counters for the given auth provider are initialised

    This fixes a problem where the counters are not reported for a given auth provider
    until the user first logs in/registers.
+
+    Args:
+        auth_provider_id: The ID of the auth provider to initialise counters for.
+        server_name: Our server name, used to label metrics (this should be `hs.hostname`).
    """
    for is_guest in (True, False):
-        login_counter.labels(guest=is_guest, auth_provider=auth_provider_id)
+        login_counter.labels(
+            guest=is_guest,
+            auth_provider=auth_provider_id,
+            **{SERVER_NAME_LABEL: server_name},
+        )
        for shadow_banned in (True, False):
            registration_counter.labels(
                guest=is_guest,
                shadow_banned=shadow_banned,
                auth_provider=auth_provider_id,
+                **{SERVER_NAME_LABEL: server_name},
            )


@@ -97,6 +107,7 @@ class LoginDict(TypedDict):

 class RegistrationHandler:
    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
        self.store = hs.get_datastores().main
        self._storage_controllers = hs.get_storage_controllers()
        self.clock = hs.get_clock()
@@ -112,7 +123,6 @@ class RegistrationHandler:
        self._account_validity_handler = hs.get_account_validity_handler()
        self._user_consent_version = self.hs.config.consent.user_consent_version
        self._server_notices_mxid = hs.config.servernotices.server_notices_mxid
-        self._server_name = hs.hostname
        self._user_types_config = hs.config.user_types

        self._spam_checker_module_callbacks = hs.get_module_api_callbacks().spam_checker
@@ -138,7 +148,9 @@ class RegistrationHandler:
        )
        self.refresh_token_lifetime = hs.config.registration.refresh_token_lifetime

-        init_counters_for_auth_provider("")
+        init_counters_for_auth_provider(
+            auth_provider_id="", server_name=self.server_name
+        )

    async def check_username(
        self,
@@ -362,6 +374,7 @@ class RegistrationHandler:
            guest=make_guest,
            shadow_banned=shadow_banned,
            auth_provider=(auth_provider_id or ""),
+            **{SERVER_NAME_LABEL: self.server_name},
        ).inc()

        # If the user does not need to consent at registration, auto-join any
@@ -422,7 +435,7 @@ class RegistrationHandler:
        if self.hs.config.registration.auto_join_user_id:
            fake_requester = create_requester(
                self.hs.config.registration.auto_join_user_id,
-                authenticated_entity=self._server_name,
+                authenticated_entity=self.server_name,
            )

            # If the room requires an invite, add the user to the list of invites.
@@ -435,7 +448,7 @@ class RegistrationHandler:
            requires_join = True
        else:
            fake_requester = create_requester(
-                user_id, authenticated_entity=self._server_name
+                user_id, authenticated_entity=self.server_name
            )

        # Choose whether to federate the new room.
@@ -467,7 +480,7 @@ class RegistrationHandler:

                    await room_member_handler.update_membership(
                        requester=create_requester(
-                            user_id, authenticated_entity=self._server_name
+                            user_id, authenticated_entity=self.server_name
                        ),
                        target=UserID.from_string(user_id),
                        room_id=room_id,
@@ -493,7 +506,7 @@ class RegistrationHandler:
                    if requires_join:
                        await room_member_handler.update_membership(
                            requester=create_requester(
-                                user_id, authenticated_entity=self._server_name
+                                user_id, authenticated_entity=self.server_name
                            ),
                            target=UserID.from_string(user_id),
                            room_id=room_id,
@@ -539,7 +552,7 @@ class RegistrationHandler:
                # we don't have a local user in the room to craft up an invite with.
                requires_invite = await self.store.is_host_joined(
                    room_id,
-                    self._server_name,
+                    self.server_name,
                )

                if requires_invite:
@@ -567,7 +580,7 @@ class RegistrationHandler:
                    await room_member_handler.update_membership(
                        requester=create_requester(
                            self.hs.config.registration.auto_join_user_id,
-                            authenticated_entity=self._server_name,
+                            authenticated_entity=self.server_name,
                        ),
                        target=UserID.from_string(user_id),
                        room_id=room_id,
@@ -579,7 +592,7 @@ class RegistrationHandler:
                # Send the join.
                await room_member_handler.update_membership(
                    requester=create_requester(
-                        user_id, authenticated_entity=self._server_name
+                        user_id, authenticated_entity=self.server_name
                    ),
                    target=UserID.from_string(user_id),
                    room_id=room_id,
@@ -790,6 +803,7 @@ class RegistrationHandler:
        login_counter.labels(
            guest=is_guest,
            auth_provider=(auth_provider_id or ""),
+            **{SERVER_NAME_LABEL: self.server_name},
        ).inc()

        return (
@@ -66,6 +66,7 @@ from synapse.api.errors import (
    SynapseError,
 )
 from synapse.api.filtering import Filter
+from synapse.api.ratelimiting import Ratelimiter
 from synapse.api.room_versions import KNOWN_ROOM_VERSIONS, RoomVersion
 from synapse.event_auth import validate_event_for_room_version
 from synapse.events import EventBase
@@ -134,7 +135,12 @@ class RoomCreationHandler:
        self.room_member_handler = hs.get_room_member_handler()
        self._event_auth_handler = hs.get_event_auth_handler()
        self.config = hs.config
-        self.request_ratelimiter = hs.get_request_ratelimiter()
+        self.common_request_ratelimiter = hs.get_request_ratelimiter()
+        self.creation_ratelimiter = Ratelimiter(
+            store=self.store,
+            clock=self.clock,
+            cfg=self.config.ratelimiting.rc_room_creation,
+        )

        # Room state based off defined presets
        self._presets_dict: Dict[str, Dict[str, Any]] = {
@@ -216,7 +222,11 @@ class RoomCreationHandler:
            ShadowBanError if the requester is shadow-banned.
        """
        if ratelimit:
-            await self.request_ratelimiter.ratelimit(requester)
+            await self.creation_ratelimiter.ratelimit(requester, update=False)
+
+            # then apply the ratelimits
+            await self.common_request_ratelimiter.ratelimit(requester)
+            await self.creation_ratelimiter.ratelimit(requester)

        user_id = requester.user.to_string()

@@ -566,6 +576,7 @@ class RoomCreationHandler:
                created with _generate_room_id())
            new_room_version: the new room version to use
            tombstone_event_id: the ID of the tombstone event in the old room.
+            additional_creators: additional room creators, for MSC4289.
            creation_event_with_context: The create event of the new room, if the new room supports
            room ID as create event ID hash.
            auto_member: Whether to automatically join local users to the new
@@ -1060,6 +1071,25 @@ class RoomCreationHandler:

        await self.auth_blocking.check_auth_blocking(requester=requester)

+        if ratelimit:
+            # Limit the rate of room creations,
+            # using both the limiter specific to room creations as well
+            # as the general request ratelimiter.
+            #
+            # Note that we don't rate limit the individual
+            # events in the room — room creation isn't atomic and
+            # historically it was very janky if half the events in the
+            # initial state don't make it because of rate limiting.
+
+            # First check the room creation ratelimiter without updating it
+            # (this is so we don't consume a token if the other ratelimiter doesn't
+            # allow us to proceed)
+            await self.creation_ratelimiter.ratelimit(requester, update=False)
+
+            # then apply the ratelimits
+            await self.common_request_ratelimiter.ratelimit(requester)
+            await self.creation_ratelimiter.ratelimit(requester)
+
        if (
            self._server_notices_mxid is not None
            and user_id == self._server_notices_mxid
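
Reviewer note: the check-then-consume ordering added above means a request denied by one limiter does not use up budget in the other. The sketch below illustrates that ordering with a hypothetical fixed-budget limiter. `SimpleLimiter` and `check_room_creation` are made-up names, and the class does not model Synapse's `Ratelimiter` (no time window, no per-requester state).

```python
class SimpleLimiter:
    """Hypothetical limiter with a fixed budget of allowed actions."""

    def __init__(self, budget: int) -> None:
        self.remaining = budget

    def ratelimit(self, update: bool = True) -> None:
        # Deny when the budget is exhausted.
        if self.remaining <= 0:
            raise RuntimeError("rate limited")
        # Only consume budget when asked to; update=False is a dry run.
        if update:
            self.remaining -= 1


def check_room_creation(common: SimpleLimiter, creation: SimpleLimiter) -> None:
    # Dry-run the creation limiter first: if it would deny the request,
    # fail before the common limiter has consumed anything.
    creation.ratelimit(update=False)
    # Then apply both limits for real. If the common limiter denies here,
    # the creation limiter has not been updated yet.
    common.ratelimit()
    creation.ratelimit()
```

In this single-threaded sketch the final `creation.ratelimit()` cannot fail after the dry run has passed. Whether that also holds under concurrent requests in the real handler is not something the diff states.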
@@ -1091,25 +1121,6 @@ class RoomCreationHandler:
                    Codes.MISSING_PARAM,
                )

-        if not is_requester_admin:
-            spam_check = await self._spam_checker_module_callbacks.user_may_create_room(
-                user_id, config
-            )
-            if spam_check != self._spam_checker_module_callbacks.NOT_SPAM:
-                raise SynapseError(
-                    403,
-                    "You are not permitted to create rooms",
-                    errcode=spam_check[0],
-                    additional_fields=spam_check[1],
-                )
-
-        if ratelimit:
-            # Rate limit once in advance, but don't rate limit the individual
-            # events in the room — room creation isn't atomic and it's very
-            # janky if half the events in the initial state don't make it because
-            # of rate limiting.
-            await self.request_ratelimiter.ratelimit(requester)
-
        room_version_id = config.get(
            "room_version", self.config.server.default_room_version.identifier
        )
@@ -1202,6 +1213,19 @@ class RoomCreationHandler:

        self._validate_room_config(config, visibility)

+        # Run the spam checker after other validation
+        if not is_requester_admin:
+            spam_check = await self._spam_checker_module_callbacks.user_may_create_room(
+                user_id, config
+            )
+            if spam_check != self._spam_checker_module_callbacks.NOT_SPAM:
+                raise SynapseError(
+                    403,
+                    "You are not permitted to create rooms",
+                    errcode=spam_check[0],
+                    additional_fields=spam_check[1],
+                )
+
        creation_content = config.get("creation_content", {})
        # override any attempt to set room versions via the creation_content
        creation_content["room_version"] = room_version.identifier
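The reordered ratelimit block in this file checks the room-creation limiter with `update=False` before consuming from either limiter, so a rejection by one limiter does not cost a token on the other. A minimal sketch of that pattern follows. `SimpleLimiter`, `LimitExceeded` and `create_room` are hypothetical stand-ins for illustration and are not Synapse's `Ratelimiter` API.

```python
class LimitExceeded(Exception):
    """Raised when a limiter has no budget left."""


class SimpleLimiter:
    """Hypothetical fixed-budget limiter (not Synapse's Ratelimiter)."""

    def __init__(self, budget: int) -> None:
        self.remaining = budget

    def ratelimit(self, update: bool = True) -> None:
        # Reject when the budget is exhausted; only consume when update=True.
        if self.remaining <= 0:
            raise LimitExceeded()
        if update:
            self.remaining -= 1


def create_room(creation: SimpleLimiter, common: SimpleLimiter) -> None:
    # 1. Peek at the creation limiter without consuming a token.
    creation.ratelimit(update=False)
    # 2. Consume from the general limiter; if it rejects, the creation
    #    limiter has not been charged yet.
    common.ratelimit()
    # 3. Both limiters allow the request, so charge the creation limiter.
    creation.ratelimit()


creation = SimpleLimiter(budget=0)
common = SimpleLimiter(budget=5)
try:
    create_room(creation, common)
except LimitExceeded:
    pass  # rejected by the creation limiter before `common` was charged
```

With this ordering, a token is only consumed from both limiters when both would allow the request.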
@@ -49,7 +49,7 @@ from synapse.handlers.profile import MAX_AVATAR_URL_LEN, MAX_DISPLAYNAME_LEN
 from synapse.handlers.state_deltas import MatchChange, StateDeltasHandler
 from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
 from synapse.logging import opentracing
-from synapse.metrics import event_processing_positions
+from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.replication.http.push import ReplicationCopyPusherRestServlet
 from synapse.storage.databases.main.state_deltas import StateDelta
@@ -746,35 +746,41 @@ class RoomMemberHandler(metaclass=abc.ABCMeta):
            and requester.user.to_string() == self._server_notices_mxid
        )

-        requester_suspended = await self.store.get_user_suspended_status(
-            requester.user.to_string()
-        )
-        if action == Membership.INVITE and requester_suspended:
-            raise SynapseError(
-                403,
-                "Sending invites while account is suspended is not allowed.",
-                Codes.USER_ACCOUNT_SUSPENDED,
-            )
+        # The requester may be a regular user, but puppeted by the server.
+        request_by_server = requester.authenticated_entity == self._server_name

-        if target.to_string() != requester.user.to_string():
-            target_suspended = await self.store.get_user_suspended_status(
-                target.to_string()
+        # If the request is initiated by the server, ignore whether the
+        # requester or target is suspended.
+        if not request_by_server:
+            requester_suspended = await self.store.get_user_suspended_status(
+                requester.user.to_string()
            )
-        else:
-            target_suspended = requester_suspended
+            if action == Membership.INVITE and requester_suspended:
+                raise SynapseError(
+                    403,
+                    "Sending invites while account is suspended is not allowed.",
+                    Codes.USER_ACCOUNT_SUSPENDED,
+                )

-        if action == Membership.JOIN and target_suspended:
-            raise SynapseError(
-                403,
-                "Joining rooms while account is suspended is not allowed.",
-                Codes.USER_ACCOUNT_SUSPENDED,
-            )
-        if action == Membership.KNOCK and target_suspended:
-            raise SynapseError(
-                403,
-                "Knocking on rooms while account is suspended is not allowed.",
-                Codes.USER_ACCOUNT_SUSPENDED,
-            )
+            if target.to_string() != requester.user.to_string():
+                target_suspended = await self.store.get_user_suspended_status(
+                    target.to_string()
+                )
+            else:
+                target_suspended = requester_suspended
+
+            if action == Membership.JOIN and target_suspended:
+                raise SynapseError(
+                    403,
+                    "Joining rooms while account is suspended is not allowed.",
+                    Codes.USER_ACCOUNT_SUSPENDED,
+                )
+            if action == Membership.KNOCK and target_suspended:
+                raise SynapseError(
+                    403,
+                    "Knocking on rooms while account is suspended is not allowed.",
+                    Codes.USER_ACCOUNT_SUSPENDED,
+                )

        if (
            not self.allow_per_room_profiles and not is_requester_server_notices_user
@@ -2163,6 +2169,7 @@ class RoomForgetterHandler(StateDeltasHandler):
        super().__init__(hs)

        self._hs = hs
+        self.server_name = hs.hostname
        self._store = hs.get_datastores().main
        self._storage_controllers = hs.get_storage_controllers()
        self._clock = hs.get_clock()
@@ -2194,7 +2201,9 @@ class RoomForgetterHandler(StateDeltasHandler):
            finally:
                self._is_processing = False

-        run_as_background_process("room_forgetter.notify_new_event", process)
+        run_as_background_process(
+            "room_forgetter.notify_new_event", self.server_name, process
+        )

    async def _unsafe_process(self) -> None:
        # If self.pos is None then means we haven't fetched it from DB
@@ -2251,7 +2260,9 @@ class RoomForgetterHandler(StateDeltasHandler):
            self.pos = max_pos

            # Expose current event processing position to prometheus
-            event_processing_positions.labels("room_forgetter").set(max_pos)
+            event_processing_positions.labels(
+                name="room_forgetter", **{SERVER_NAME_LABEL: self.server_name}
+            ).set(max_pos)

            await self._store.update_room_forgetter_stream_pos(max_pos)

@@ -24,16 +24,13 @@ import logging
 from email.mime.multipart import MIMEMultipart
 from email.mime.text import MIMEText
 from io import BytesIO
-from typing import TYPE_CHECKING, Any, Dict, Optional
+from typing import TYPE_CHECKING, Dict, Optional

-from pkg_resources import parse_version
-
-import twisted
 from twisted.internet.defer import Deferred
 from twisted.internet.endpoints import HostnameEndpoint
-from twisted.internet.interfaces import IOpenSSLContextFactory, IProtocolFactory
+from twisted.internet.interfaces import IProtocolFactory
 from twisted.internet.ssl import optionsForClientTLS
-from twisted.mail.smtp import ESMTPSender, ESMTPSenderFactory
+from twisted.mail.smtp import ESMTPSenderFactory
 from twisted.protocols.tls import TLSMemoryBIOFactory

 from synapse.logging.context import make_deferred_yieldable
@@ -44,49 +41,6 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

-_is_old_twisted = parse_version(twisted.__version__) < parse_version("21")
-
-
-class _BackportESMTPSender(ESMTPSender):
-    """Extend old versions of ESMTPSender to configure TLS.
-
-    Unfortunately, before Twisted 21.2, ESMTPSender doesn't give an easy way to
-    disable TLS, or to configure the hostname used for TLS certificate validation.
-    This backports the `hostname` parameter for that functionality.
-    """
-
-    __hostname: Optional[str]
-
-    def __init__(self, *args: Any, **kwargs: Any) -> None:
-        """"""
-        self.__hostname = kwargs.pop("hostname", None)
-        super().__init__(*args, **kwargs)
-
-    def _getContextFactory(self) -> Optional[IOpenSSLContextFactory]:
-        if self.context is not None:
-            return self.context
-        elif self.__hostname is None:
-            return None  # disable TLS if hostname is None
-        return optionsForClientTLS(self.__hostname)
-
-
-class _BackportESMTPSenderFactory(ESMTPSenderFactory):
-    """An ESMTPSenderFactory for _BackportESMTPSender.
-
-    This backports the `hostname` parameter, to disable or configure TLS.
-    """
-
-    __hostname: Optional[str]
-
-    def __init__(self, *args: Any, **kwargs: Any) -> None:
-        self.__hostname = kwargs.pop("hostname", None)
-        super().__init__(*args, **kwargs)
-
-    def protocol(self, *args: Any, **kwargs: Any) -> ESMTPSender:  # type: ignore
-        # this overrides ESMTPSenderFactory's `protocol` attribute, with a Callable
-        # instantiating our _BackportESMTPSender, providing the hostname parameter
-        return _BackportESMTPSender(*args, **kwargs, hostname=self.__hostname)
-

 async def _sendmail(
    reactor: ISynapseReactor,
@@ -129,9 +83,7 @@ async def _sendmail(
    elif tlsname is None:
        tlsname = smtphost

-    factory: IProtocolFactory = (
-        _BackportESMTPSenderFactory if _is_old_twisted else ESMTPSenderFactory
-    )(
+    factory: IProtocolFactory = ESMTPSenderFactory(
        username,
        password,
        from_addr,
@@ -38,6 +38,7 @@ from synapse.logging.opentracing import (
    tag_args,
    trace,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.storage.databases.main.roommember import extract_heroes_from_room_summary
 from synapse.storage.databases.main.state_deltas import StateDelta
 from synapse.storage.databases.main.stream import PaginateFunction
@@ -79,7 +80,7 @@ logger = logging.getLogger(__name__)
 sync_processing_time = Histogram(
    "synapse_sliding_sync_processing_time",
    "Time taken to generate a sliding sync response, ignoring wait times.",
-    ["initial"],
+    labelnames=["initial", SERVER_NAME_LABEL],
 )

 # Limit the number of state_keys we should remember sending down the connection for each
@@ -94,6 +95,7 @@ MAX_NUMBER_PREVIOUS_STATE_KEYS_TO_REMEMBER = 100

 class SlidingSyncHandler:
    def __init__(self, hs: "HomeServer"):
+        self.server_name = hs.hostname
        self.clock = hs.get_clock()
        self.store = hs.get_datastores().main
        self.storage_controllers = hs.get_storage_controllers()
@@ -368,9 +370,9 @@ class SlidingSyncHandler:
        set_tag(SynapseTags.FUNC_ARG_PREFIX + "sync_config.user", user_id)

        end_time_s = self.clock.time()
-        sync_processing_time.labels(from_token is not None).observe(
-            end_time_s - start_time_s
-        )
+        sync_processing_time.labels(
+            initial=from_token is not None, **{SERVER_NAME_LABEL: self.server_name}
+        ).observe(end_time_s - start_time_s)

        return sliding_sync_result

@@ -202,7 +202,7 @@ class SsoHandler:
    def __init__(self, hs: "HomeServer"):
        self._clock = hs.get_clock()
        self._store = hs.get_datastores().main
-        self._server_name = hs.hostname
+        self.server_name = hs.hostname
        self._is_mine_server_name = hs.is_mine_server_name
        self._registration_handler = hs.get_registration_handler()
        self._auth_handler = hs.get_auth_handler()
@@ -238,7 +238,9 @@ class SsoHandler:
        p_id = p.idp_id
        assert p_id not in self._identity_providers
        self._identity_providers[p_id] = p
-        init_counters_for_auth_provider(p_id)
+        init_counters_for_auth_provider(
+            auth_provider_id=p_id, server_name=self.server_name
+        )

    def get_identity_providers(self) -> Mapping[str, SsoIdentityProvider]:
        """Get the configured identity providers"""
@@ -569,7 +571,7 @@ class SsoHandler:
                return attributes

            # Check if this mxid already exists
-            user_id = UserID(attributes.localpart, self._server_name).to_string()
+            user_id = UserID(attributes.localpart, self.server_name).to_string()
            if not await self._store.get_users_by_id_case_insensitive(user_id):
                # This mxid is free
                break
@@ -907,7 +909,7 @@ class SsoHandler:

        # render an error page.
        html = self._bad_user_template.render(
-            server_name=self._server_name,
+            server_name=self.server_name,
            user_id_to_verify=user_id_to_verify,
        )
        respond_with_html(request, 200, html)
@@ -959,7 +961,7 @@ class SsoHandler:

        if contains_invalid_mxid_characters(localpart):
            raise SynapseError(400, "localpart is invalid: %s" % (localpart,))
-        user_id = UserID(localpart, self._server_name).to_string()
+        user_id = UserID(localpart, self.server_name).to_string()
        user_infos = await self._store.get_users_by_id_case_insensitive(user_id)

        logger.info("[session %s] users: %s", session_id, user_infos)
@@ -32,7 +32,7 @@ from typing import (
 )

 from synapse.api.constants import EventContentFields, EventTypes, Membership
-from synapse.metrics import event_processing_positions
+from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.storage.databases.main.state_deltas import StateDelta
 from synapse.types import JsonDict
@@ -54,6 +54,7 @@ class StatsHandler:

    def __init__(self, hs: "HomeServer"):
        self.hs = hs
+        self.server_name = hs.hostname
        self.store = hs.get_datastores().main
        self._storage_controllers = hs.get_storage_controllers()
        self.state = hs.get_state_handler()
@@ -89,7 +90,7 @@ class StatsHandler:
            finally:
                self._is_processing = False

-        run_as_background_process("stats.notify_new_event", process)
+        run_as_background_process("stats.notify_new_event", self.server_name, process)

    async def _unsafe_process(self) -> None:
        # If self.pos is None then means we haven't fetched it from DB
@@ -146,7 +147,9 @@ class StatsHandler:

            logger.debug("Handled room stats to %s -> %s", self.pos, max_pos)

-            event_processing_positions.labels("stats").set(max_pos)
+            event_processing_positions.labels(
+                name="stats", **{SERVER_NAME_LABEL: self.server_name}
+            ).set(max_pos)

            self.pos = max_pos

@@ -63,6 +63,7 @@ from synapse.logging.opentracing import (
    start_active_span,
    trace,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.storage.databases.main.event_push_actions import RoomNotifCounts
 from synapse.storage.databases.main.roommember import extract_heroes_from_room_summary
 from synapse.storage.databases.main.stream import PaginateFunction
@@ -104,7 +105,7 @@ non_empty_sync_counter = Counter(
    "Count of non empty sync responses. type is initial_sync/full_state_sync"
    "/incremental_sync. lazy_loaded indicates if lazy loaded members were "
    "enabled for that request.",
-    ["type", "lazy_loaded"],
+    labelnames=["type", "lazy_loaded", SERVER_NAME_LABEL],
 )

 # Store the cache that tracks which lazy-loaded members have been sent to a given
@@ -614,7 +615,11 @@ class SyncHandler:
                lazy_loaded = "true"
            else:
                lazy_loaded = "false"
-            non_empty_sync_counter.labels(sync_label, lazy_loaded).inc()
+            non_empty_sync_counter.labels(
+                type=sync_label,
+                lazy_loaded=lazy_loaded,
+                **{SERVER_NAME_LABEL: self.server_name},
+            ).inc()

        return result
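Several hunks in this commit switch metric calls from positional labels to keyword labels and pass the server name via `**{SERVER_NAME_LABEL: ...}`. The dict unpacking is needed because the label name lives in a constant and so cannot be written as a literal keyword. A minimal sketch of the idea, using a hypothetical `TinyCounter` stand-in rather than the real metrics library, and assuming the constant's value is `"server_name"` (the diff only shows its name):

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

# Assumed value for illustration; the diff only shows the constant's name.
SERVER_NAME_LABEL = "server_name"


class TinyCounter:
    """Hypothetical labelled counter (not the prometheus_client API)."""

    def __init__(self, labelnames: Sequence[str]) -> None:
        self.labelnames = tuple(labelnames)
        self.values: Dict[Tuple[str, ...], int] = defaultdict(int)

    def inc(self, **labels: str) -> None:
        # Require exactly the declared label names, so a forgotten
        # server-name label fails loudly instead of being misattributed.
        if set(labels) != set(self.labelnames):
            raise ValueError(f"expected labels {self.labelnames}, got {tuple(labels)}")
        key = tuple(labels[name] for name in self.labelnames)
        self.values[key] += 1


requests = TinyCounter(labelnames=["method", SERVER_NAME_LABEL])
# The label name is held in a constant, so it is passed via dict unpacking.
requests.inc(method="GET", **{SERVER_NAME_LABEL: "example.org"})
```

Keyword labels also make each call site independent of the order in which the label names were declared.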

@@ -1,9 +1,15 @@
 import logging
+from http import HTTPStatus
 from typing import TYPE_CHECKING, Optional

-from synapse.api.errors import AuthError, NotFoundError
-from synapse.storage.databases.main.thread_subscriptions import ThreadSubscription
-from synapse.types import UserID
+from synapse.api.constants import RelationTypes
+from synapse.api.errors import AuthError, Codes, NotFoundError, SynapseError
+from synapse.events import relation_from_event
+from synapse.storage.databases.main.thread_subscriptions import (
+    AutomaticSubscriptionConflicted,
+    ThreadSubscription,
+)
+from synapse.types import EventOrderings, UserID

 if TYPE_CHECKING:
    from synapse.server import HomeServer
@@ -55,42 +61,79 @@ class ThreadSubscriptionsHandler:
        room_id: str,
        thread_root_event_id: str,
        *,
-        automatic: bool,
+        automatic_event_id: Optional[str],
    ) -> Optional[int]:
        """Sets or updates a user's subscription settings for a specific thread root.

        Args:
            requester_user_id: The ID of the user whose settings are being updated.
            thread_root_event_id: The event ID of the thread root.
-            automatic: whether the user was subscribed by an automatic decision by
-                their client.
+            automatic_event_id: if the user was subscribed by an automatic decision by
+                their client, the event ID that caused this.

        Returns:
            The stream ID for this update, if the update isn't no-opped.

        Raises:
            NotFoundError if the user cannot access the thread root event, or it isn't
-            known to this homeserver.
+            known to this homeserver. Ditto for the automatic cause event if supplied.
+
+            SynapseError(400, MSC4306_NOT_IN_THREAD): if the client supplied an automatic
+            cause event that is not part of the given thread.
+
+            SynapseError(409, MSC4306_CONFLICTING_UNSUBSCRIPTION): if the client requested an automatic
+            subscription but it was skipped because the cause event is logically later than an unsubscription.
        """
        # First check that the user can access the thread root event
        # and that it exists
        try:
-            event = await self.event_handler.get_event(
+            thread_root_event = await self.event_handler.get_event(
                user_id, room_id, thread_root_event_id
            )
-            if event is None:
+            if thread_root_event is None:
                raise NotFoundError("No such thread root")
        except AuthError:
            logger.info("rejecting thread subscriptions change (thread not accessible)")
            raise NotFoundError("No such thread root")

-        return await self.store.subscribe_user_to_thread(
+        if automatic_event_id:
+            autosub_cause_event = await self.event_handler.get_event(
+                user_id, room_id, automatic_event_id
+            )
+            if autosub_cause_event is None:
+                raise NotFoundError("Automatic subscription event not found")
+            relation = relation_from_event(autosub_cause_event)
+            if (
+                relation is None
+                or relation.rel_type != RelationTypes.THREAD
+                or relation.parent_id != thread_root_event_id
+            ):
+                raise SynapseError(
+                    HTTPStatus.BAD_REQUEST,
+                    "Automatic subscription must use an event in the thread",
+                    errcode=Codes.MSC4306_NOT_IN_THREAD,
+                )
+
+            automatic_event_orderings = EventOrderings.from_event(autosub_cause_event)
+        else:
+            automatic_event_orderings = None
+
+        outcome = await self.store.subscribe_user_to_thread(
            user_id.to_string(),
-            event.room_id,
+            room_id,
            thread_root_event_id,
-            automatic=automatic,
+            automatic_event_orderings=automatic_event_orderings,
        )

+        if isinstance(outcome, AutomaticSubscriptionConflicted):
+            raise SynapseError(
+                HTTPStatus.CONFLICT,
+                "Automatic subscription obsoleted by an unsubscription request.",
+                errcode=Codes.MSC4306_CONFLICTING_UNSUBSCRIPTION,
+            )
+
+        return outcome
+
    async def unsubscribe_user_from_thread(
        self, user_id: UserID, room_id: str, thread_root_event_id: str
    ) -> Optional[int]:
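The new automatic-subscription path in `subscribe_user_to_thread` rejects a cause event unless it carries a thread relation pointing at the given thread root. The check can be sketched in isolation as below. `Relation` and `is_in_thread` are hypothetical stand-ins for the object returned by `relation_from_event` and for the inline condition in the handler; `"m.thread"` is the thread relation type from the Matrix specification.

```python
from dataclasses import dataclass
from typing import Optional

# Thread relation type as defined by the Matrix specification.
THREAD_REL_TYPE = "m.thread"


@dataclass(frozen=True)
class Relation:
    """Hypothetical stand-in for the relation extracted from an event."""

    rel_type: str
    parent_id: str


def is_in_thread(relation: Optional[Relation], thread_root_event_id: str) -> bool:
    # The cause event must relate to the thread root with a thread relation;
    # a missing relation, another relation type, or another parent all fail.
    return (
        relation is not None
        and relation.rel_type == THREAD_REL_TYPE
        and relation.parent_id == thread_root_event_id
    )


in_thread = is_in_thread(Relation("m.thread", "$root"), "$root")
other_thread = is_in_thread(Relation("m.thread", "$elsewhere"), "$root")
```

In the handler, a failed check maps to a 400 response with `Codes.MSC4306_NOT_IN_THREAD`.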
@@ -80,7 +80,9 @@ class FollowerTypingHandler:
    def __init__(self, hs: "HomeServer"):
        self.store = hs.get_datastores().main
        self._storage_controllers = hs.get_storage_controllers()
-        self.server_name = hs.config.server.server_name
+        self.server_name = (
+            hs.hostname
+        )  # nb must be called this for @wrap_as_background_process
        self.clock = hs.get_clock()
        self.is_mine_id = hs.is_mine_id
        self.is_mine_server_name = hs.is_mine_server_name
@@ -143,7 +145,11 @@ class FollowerTypingHandler:
            last_fed_poke = self._member_last_federation_poke.get(member, None)
            if not last_fed_poke or last_fed_poke + FEDERATION_PING_INTERVAL <= now:
                run_as_background_process(
-                    "typing._push_remote", self._push_remote, member=member, typing=True
+                    "typing._push_remote",
+                    self.server_name,
+                    self._push_remote,
+                    member=member,
+                    typing=True,
                )

        # Add a paranoia timer to ensure that we always have a timer for
@@ -216,6 +222,7 @@ class FollowerTypingHandler:
            if self.federation:
                run_as_background_process(
                    "_send_changes_in_typing_to_remotes",
+                    self.server_name,
                    self._send_changes_in_typing_to_remotes,
                    row.room_id,
                    prev_typing,
@@ -378,7 +385,11 @@ class TypingWriterHandler(FollowerTypingHandler):
        if self.hs.is_mine_id(member.user_id):
            # Only send updates for changes to our own users.
            run_as_background_process(
-                "typing._push_remote", self._push_remote, member, typing
+                "typing._push_remote",
+                self.server_name,
+                self._push_remote,
+                member,
+                typing,
            )

        self._push_update_local(member=member, typing=typing)
@@ -35,6 +35,7 @@ from synapse.api.constants import (
 )
 from synapse.api.errors import Codes, SynapseError
 from synapse.handlers.state_deltas import MatchChange, StateDeltasHandler
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.storage.databases.main.state_deltas import StateDelta
 from synapse.storage.databases.main.user_directory import SearchResult
@@ -192,7 +193,9 @@ class UserDirectoryHandler(StateDeltasHandler):
                self._is_processing = False

        self._is_processing = True
-        run_as_background_process("user_directory.notify_new_event", process)
+        run_as_background_process(
+            "user_directory.notify_new_event", self.server_name, process
+        )

    async def handle_local_profile_change(
        self, user_id: str, profile: ProfileInfo
@@ -260,9 +263,9 @@ class UserDirectoryHandler(StateDeltasHandler):
                self.pos = max_pos

                # Expose current event processing position to prometheus
-                synapse.metrics.event_processing_positions.labels("user_dir").set(
-                    max_pos
-                )
+                synapse.metrics.event_processing_positions.labels(
+                    name="user_dir", **{SERVER_NAME_LABEL: self.server_name}
+                ).set(max_pos)

                await self.store.update_user_directory_stream_pos(max_pos)

@@ -606,7 +609,9 @@ class UserDirectoryHandler(StateDeltasHandler):
                self._is_refreshing_remote_profiles = False

        self._is_refreshing_remote_profiles = True
-        run_as_background_process("user_directory.refresh_remote_profiles", process)
+        run_as_background_process(
+            "user_directory.refresh_remote_profiles", self.server_name, process
+        )

    async def _unsafe_refresh_remote_profiles(self) -> None:
        limit = MAX_SERVERS_TO_REFRESH_PROFILES_FOR_IN_ONE_GO - len(
@@ -688,7 +693,9 @@ class UserDirectoryHandler(StateDeltasHandler):

        self._is_refreshing_remote_profiles_for_servers.add(server_name)
        run_as_background_process(
-            "user_directory.refresh_remote_profiles_for_remote_server", process
+            "user_directory.refresh_remote_profiles_for_remote_server",
+            self.server_name,
+            process,
        )

    async def _unsafe_refresh_remote_profiles_for_remote_server(
@@ -66,6 +66,9 @@ class WorkerLocksHandler:
    """

    def __init__(self, hs: "HomeServer") -> None:
+        self.server_name = (
+            hs.hostname
+        )  # nb must be called this for @wrap_as_background_process
        self._reactor = hs.get_reactor()
        self._store = hs.get_datastores().main
        self._clock = hs.get_clock()
@@ -85,6 +85,7 @@ from synapse.http.replicationagent import ReplicationAgent
 from synapse.http.types import QueryParams
 from synapse.logging.context import make_deferred_yieldable, run_in_background
 from synapse.logging.opentracing import set_tag, start_active_span, tags
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import ISynapseReactor, StrSequence
 from synapse.util import json_decoder
 from synapse.util.async_helpers import timeout_deferred
@@ -108,9 +109,13 @@ except ImportError:

 logger = logging.getLogger(__name__)

-outgoing_requests_counter = Counter("synapse_http_client_requests", "", ["method"])
+outgoing_requests_counter = Counter(
+    "synapse_http_client_requests", "", labelnames=["method", SERVER_NAME_LABEL]
+)
 incoming_responses_counter = Counter(
-    "synapse_http_client_responses", "", ["method", "code"]
+    "synapse_http_client_responses",
+    "",
+    labelnames=["method", "code", SERVER_NAME_LABEL],
 )

 # the type of the headers map, to be passed to the t.w.h.Headers.
@@ -346,6 +351,7 @@ class BaseHttpClient:
        treq_args: Optional[Dict[str, Any]] = None,
    ):
        self.hs = hs
+        self.server_name = hs.hostname
        self.reactor = hs.get_reactor()

        self._extra_treq_args = treq_args or {}
@@ -384,7 +390,9 @@ class BaseHttpClient:
            RequestTimedOutError if the request times out before the headers are read

        """
-        outgoing_requests_counter.labels(method).inc()
+        outgoing_requests_counter.labels(
+            method=method, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc()

        # log request but strip `access_token` (AS requests for example include this)
        logger.debug("Sending request %s %s", method, redact_uri(uri))
@@ -438,7 +446,11 @@ class BaseHttpClient:

                response = await make_deferred_yieldable(request_deferred)

-                incoming_responses_counter.labels(method, response.code).inc()
+                incoming_responses_counter.labels(
+                    method=method,
+                    code=response.code,
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).inc()
                logger.info(
                    "Received response to %s %s: %s",
                    method,
@@ -447,7 +459,11 @@ class BaseHttpClient:
                )
                return response
            except Exception as e:
-                incoming_responses_counter.labels(method, "ERR").inc()
+                incoming_responses_counter.labels(
+                    method=method,
+                    code="ERR",
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).inc()
                logger.info(
                    "Error sending request to  %s %s: %s %s",
                    method,
@@ -821,12 +837,12 @@ class SimpleHttpClient(BaseHttpClient):
        pool.cachedConnectionTimeout = 2 * 60

        self.agent: IAgent = ProxyAgent(
-            self.reactor,
-            hs.get_reactor(),
+            reactor=self.reactor,
+            proxy_reactor=hs.get_reactor(),
            connectTimeout=15,
            contextFactory=self.hs.get_http_client_context_factory(),
            pool=pool,
-            use_proxy=use_proxy,
+            proxy_config=hs.config.server.proxy_config,
        )

        if self._ip_blocklist:
@@ -855,6 +871,7 @@ class ReplicationClient(BaseHttpClient):
            hs: The HomeServer instance to pass in
        """
        super().__init__(hs)
+        self.server_name = hs.hostname

        # Use a pool, but a very small one.
        pool = HTTPConnectionPool(self.reactor)
@@ -891,7 +908,9 @@ class ReplicationClient(BaseHttpClient):
            RequestTimedOutError if the request times out before the headers are read

        """
-        outgoing_requests_counter.labels(method).inc()
+        outgoing_requests_counter.labels(
+            method=method, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc()

        logger.debug("Sending request %s %s", method, uri)

@@ -948,7 +967,11 @@ class ReplicationClient(BaseHttpClient):

                response = await make_deferred_yieldable(request_deferred)

-                incoming_responses_counter.labels(method, response.code).inc()
+                incoming_responses_counter.labels(
+                    method=method,
+                    code=response.code,
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).inc()
                logger.info(
                    "Received response to %s %s: %s",
                    method,
@@ -957,7 +980,11 @@ class ReplicationClient(BaseHttpClient):
                )
                return response
            except Exception as e:
-                incoming_responses_counter.labels(method, "ERR").inc()
+                incoming_responses_counter.labels(
+                    method=method,
+                    code="ERR",
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).inc()
                logger.info(
                    "Error sending request to  %s %s: %s %s",
                    method,
@@ -21,7 +21,6 @@ import logging
 import urllib.parse
 from typing import Any, Generator, List, Optional
 from urllib.request import (  # type: ignore[attr-defined]
-    getproxies_environment,
    proxy_bypass_environment,
 )

@@ -40,6 +39,7 @@ from twisted.web.client import URI, Agent, HTTPConnectionPool
 from twisted.web.http_headers import Headers
 from twisted.web.iweb import IAgent, IAgentEndpointFactory, IBodyProducer, IResponse

+from synapse.config.server import ProxyConfig
 from synapse.crypto.context_factory import FederationPolicyForHTTPS
 from synapse.http import proxyagent
 from synapse.http.client import BlocklistingAgentWrapper, BlocklistingReactorWrapper
@@ -77,6 +77,8 @@ class MatrixFederationAgent:

        ip_blocklist: Disallowed IP addresses.

+        proxy_config: Proxy configuration to use for this agent.
+
        proxy_reactor: twisted reactor to use for connections to the proxy server
           reactor might have some blocking applied (i.e. for DNS queries),
           but we need unblocked access to the proxy.
@@ -92,12 +94,14 @@ class MatrixFederationAgent:

    def __init__(
        self,
+        *,
        server_name: str,
        reactor: ISynapseReactor,
        tls_client_options_factory: Optional[FederationPolicyForHTTPS],
        user_agent: bytes,
        ip_allowlist: Optional[IPSet],
        ip_blocklist: IPSet,
+        proxy_config: Optional[ProxyConfig] = None,
        _srv_resolver: Optional[SrvResolver] = None,
        _well_known_resolver: Optional[WellKnownResolver] = None,
    ):
@@ -129,10 +133,11 @@ class MatrixFederationAgent:
        self._agent = Agent.usingEndpointFactory(
            reactor,
            MatrixHostnameEndpointFactory(
-                reactor,
-                proxy_reactor,
-                tls_client_options_factory,
-                _srv_resolver,
+                reactor=reactor,
+                proxy_reactor=proxy_reactor,
+                tls_client_options_factory=tls_client_options_factory,
+                srv_resolver=_srv_resolver,
+                proxy_config=proxy_config,
            ),
            pool=self._pool,
        )
@@ -144,11 +149,11 @@ class MatrixFederationAgent:
                reactor=reactor,
                agent=BlocklistingAgentWrapper(
                    ProxyAgent(
-                        reactor,
-                        proxy_reactor,
+                        reactor=reactor,
+                        proxy_reactor=proxy_reactor,
                        pool=self._pool,
                        contextFactory=tls_client_options_factory,
-                        use_proxy=True,
+                        proxy_config=proxy_config,
                    ),
                    ip_blocklist=ip_blocklist,
                ),
@@ -246,14 +251,17 @@ class MatrixHostnameEndpointFactory:

    def __init__(
        self,
+        *,
        reactor: IReactorCore,
        proxy_reactor: IReactorCore,
        tls_client_options_factory: Optional[FederationPolicyForHTTPS],
        srv_resolver: Optional[SrvResolver],
+        proxy_config: Optional[ProxyConfig],
    ):
        self._reactor = reactor
        self._proxy_reactor = proxy_reactor
        self._tls_client_options_factory = tls_client_options_factory
+        self._proxy_config = proxy_config

        if srv_resolver is None:
            srv_resolver = SrvResolver()
@@ -262,11 +270,12 @@ class MatrixHostnameEndpointFactory:

    def endpointForURI(self, parsed_uri: URI) -> "MatrixHostnameEndpoint":
        return MatrixHostnameEndpoint(
-            self._reactor,
-            self._proxy_reactor,
-            self._tls_client_options_factory,
-            self._srv_resolver,
-            parsed_uri,
+            reactor=self._reactor,
+            proxy_reactor=self._proxy_reactor,
+            tls_client_options_factory=self._tls_client_options_factory,
+            srv_resolver=self._srv_resolver,
+            proxy_config=self._proxy_config,
+            parsed_uri=parsed_uri,
        )


@@ -283,6 +292,7 @@ class MatrixHostnameEndpoint:
        tls_client_options_factory:
            factory to use for fetching client tls options, or none to disable TLS.
        srv_resolver: The SRV resolver to use
+        proxy_config: Proxy configuration to use for this endpoint.
        parsed_uri: The parsed URI that we're wanting to connect to.

    Raises:
@@ -292,26 +302,28 @@ class MatrixHostnameEndpoint:

    def __init__(
        self,
+        *,
        reactor: IReactorCore,
        proxy_reactor: IReactorCore,
        tls_client_options_factory: Optional[FederationPolicyForHTTPS],
        srv_resolver: SrvResolver,
+        proxy_config: Optional[ProxyConfig],
        parsed_uri: URI,
    ):
        self._reactor = reactor
        self._parsed_uri = parsed_uri
+        self.proxy_config = proxy_config

        # http_proxy is not needed because federation is always over TLS
-        proxies = getproxies_environment()
-        https_proxy = proxies["https"].encode() if "https" in proxies else None
-        self.no_proxy = proxies["no"] if "no" in proxies else None

        # endpoint and credentials to use to connect to the outbound https proxy, if any.
        (
            self._https_proxy_endpoint,
            self._https_proxy_creds,
        ) = proxyagent.http_proxy_endpoint(
-            https_proxy,
+            self.proxy_config.https_proxy.encode()
+            if self.proxy_config and self.proxy_config.https_proxy
+            else None,
            proxy_reactor,
            tls_client_options_factory,
        )
@@ -348,10 +360,10 @@ class MatrixHostnameEndpoint:
            port = server.port

            should_skip_proxy = False
-            if self.no_proxy is not None:
+            if self.proxy_config is not None:
                should_skip_proxy = proxy_bypass_environment(
                    host.decode(),
-                    proxies={"no": self.no_proxy},
+                    proxies=self.proxy_config.get_proxies_dictionary(),
                )

            endpoint: IStreamClientEndpoint
@@ -87,6 +87,7 @@ from synapse.http.types import QueryParams
 from synapse.logging import opentracing
 from synapse.logging.context import make_deferred_yieldable, run_in_background
 from synapse.logging.opentracing import set_tag, start_active_span, tags
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import JsonDict
 from synapse.util import json_decoder
 from synapse.util.async_helpers import AwakenableSleeper, Linearizer, timeout_deferred
@@ -99,10 +100,14 @@ if TYPE_CHECKING:
 logger = logging.getLogger(__name__)

 outgoing_requests_counter = Counter(
-    "synapse_http_matrixfederationclient_requests", "", ["method"]
+    "synapse_http_matrixfederationclient_requests",
+    "",
+    labelnames=["method", SERVER_NAME_LABEL],
 )
 incoming_responses_counter = Counter(
-    "synapse_http_matrixfederationclient_responses", "", ["method", "code"]
+    "synapse_http_matrixfederationclient_responses",
+    "",
+    labelnames=["method", "code", SERVER_NAME_LABEL],
 )


@@ -423,6 +428,7 @@ class MatrixFederationHttpClient:
                user_agent=user_agent.encode("ascii"),
                ip_allowlist=hs.config.server.federation_ip_range_allowlist,
                ip_blocklist=hs.config.server.federation_ip_range_blocklist,
+                proxy_config=hs.config.server.proxy_config,
            )
        else:
            proxy_authorization_secret = hs.config.worker.worker_replication_secret
@@ -437,9 +443,9 @@ class MatrixFederationHttpClient:
            # locations
            federation_proxy_locations = outbound_federation_restricted_to.locations
            federation_agent = ProxyAgent(
-                self.reactor,
-                self.reactor,
-                tls_client_options_factory,
+                reactor=self.reactor,
+                proxy_reactor=self.reactor,
+                contextFactory=tls_client_options_factory,
                federation_proxy_locations=federation_proxy_locations,
                federation_proxy_credentials=federation_proxy_credentials,
            )
@@ -619,9 +625,10 @@ class MatrixFederationHttpClient:
            raise FederationDeniedError(request.destination)

        limiter = await synapse.util.retryutils.get_retry_limiter(
-            request.destination,
-            self.clock,
-            self._store,
+            destination=request.destination,
+            our_server_name=self.server_name,
+            clock=self.clock,
+            store=self._store,
            backoff_on_404=backoff_on_404,
            ignore_backoff=ignore_backoff,
            notifier=self.hs.get_notifier(),
@@ -695,7 +702,9 @@ class MatrixFederationHttpClient:
                        _sec_timeout,
                    )

-                    outgoing_requests_counter.labels(request.method).inc()
+                    outgoing_requests_counter.labels(
+                        method=request.method, **{SERVER_NAME_LABEL: self.server_name}
+                    ).inc()

                    try:
                        with Measure(
@@ -734,7 +743,9 @@ class MatrixFederationHttpClient:
                        raise RequestSendFailed(e, can_retry=True) from e

                    incoming_responses_counter.labels(
-                        request.method, response.code
+                        method=request.method,
+                        code=response.code,
+                        **{SERVER_NAME_LABEL: self.server_name},
                    ).inc()

                    set_tag(tags.HTTP_STATUS_CODE, response.code)
@@ -24,7 +24,6 @@ import re
 from typing import Any, Collection, Dict, List, Optional, Sequence, Tuple, Union, cast
 from urllib.parse import urlparse
 from urllib.request import (  # type: ignore[attr-defined]
-    getproxies_environment,
    proxy_bypass_environment,
 )

@@ -54,6 +53,7 @@ from twisted.web.error import SchemeNotSupported
 from twisted.web.http_headers import Headers
 from twisted.web.iweb import IAgent, IBodyProducer, IPolicyForHTTPS, IResponse

+from synapse.config.server import ProxyConfig
 from synapse.config.workers import (
    InstanceLocationConfig,
    InstanceTcpLocationConfig,
@@ -99,8 +99,7 @@ class ProxyAgent(_AgentBase):
        pool: connection pool to be used. If None, a
            non-persistent pool instance will be created.

-        use_proxy: Whether proxy settings should be discovered and used
-            from conventional environment variables.
+        proxy_config: Proxy configuration to use for this agent.

        federation_proxy_locations: An optional list of locations to proxy outbound federation
            traffic through (only requests that use the `matrix-federation://` scheme
@@ -118,13 +117,14 @@ class ProxyAgent(_AgentBase):

    def __init__(
        self,
+        *,
        reactor: IReactorCore,
        proxy_reactor: Optional[IReactorCore] = None,
        contextFactory: Optional[IPolicyForHTTPS] = None,
        connectTimeout: Optional[float] = None,
        bindAddress: Optional[bytes] = None,
        pool: Optional[HTTPConnectionPool] = None,
-        use_proxy: bool = False,
+        proxy_config: Optional[ProxyConfig] = None,
        federation_proxy_locations: Collection[InstanceLocationConfig] = (),
        federation_proxy_credentials: Optional[ProxyCredentials] = None,
    ):
@@ -145,31 +145,33 @@ class ProxyAgent(_AgentBase):
        if bindAddress is not None:
            self._endpoint_kwargs["bindAddress"] = bindAddress

-        http_proxy = None
-        https_proxy = None
-        no_proxy = None
-        if use_proxy:
-            proxies = getproxies_environment()
-            http_proxy = proxies["http"].encode() if "http" in proxies else None
-            https_proxy = proxies["https"].encode() if "https" in proxies else None
-            no_proxy = proxies["no"] if "no" in proxies else None
+        self.proxy_config = proxy_config
+        if self.proxy_config is not None:
            logger.debug(
                "Using proxy settings: http_proxy=%s, https_proxy=%s, no_proxy=%s",
-                http_proxy,
-                https_proxy,
-                no_proxy,
+                self.proxy_config.http_proxy,
+                self.proxy_config.https_proxy,
+                self.proxy_config.no_proxy_hosts,
            )

        self.http_proxy_endpoint, self.http_proxy_creds = http_proxy_endpoint(
-            http_proxy, self.proxy_reactor, contextFactory, **self._endpoint_kwargs
+            self.proxy_config.http_proxy.encode()
+            if self.proxy_config and self.proxy_config.http_proxy
+            else None,
+            self.proxy_reactor,
+            contextFactory,
+            **self._endpoint_kwargs,
        )

        self.https_proxy_endpoint, self.https_proxy_creds = http_proxy_endpoint(
-            https_proxy, self.proxy_reactor, contextFactory, **self._endpoint_kwargs
+            self.proxy_config.https_proxy.encode()
+            if self.proxy_config and self.proxy_config.https_proxy
+            else None,
+            self.proxy_reactor,
+            contextFactory,
+            **self._endpoint_kwargs,
        )

-        self.no_proxy = no_proxy
-
        self._policy_for_https = contextFactory
        self._reactor = cast(IReactorTime, reactor)

@@ -268,10 +270,10 @@ class ProxyAgent(_AgentBase):
        request_path = parsed_uri.originForm

        should_skip_proxy = False
-        if self.no_proxy is not None:
+        if self.proxy_config is not None:
            should_skip_proxy = proxy_bypass_environment(
                parsed_uri.host.decode(),
-                proxies={"no": self.no_proxy},
+                proxies=self.proxy_config.get_proxies_dictionary(),
            )

        if (
@@ -27,40 +27,52 @@ from typing import Dict, Mapping, Set, Tuple
 from prometheus_client.core import Counter, Histogram

 from synapse.logging.context import current_context
-from synapse.metrics import LaterGauge
+from synapse.metrics import SERVER_NAME_LABEL, LaterGauge

 logger = logging.getLogger(__name__)


 # total number of responses served, split by method/servlet/tag
 response_count = Counter(
-    "synapse_http_server_response_count", "", ["method", "servlet", "tag"]
+    "synapse_http_server_response_count",
+    "",
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 requests_counter = Counter(
-    "synapse_http_server_requests_received", "", ["method", "servlet"]
+    "synapse_http_server_requests_received",
+    "",
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
 )

 outgoing_responses_counter = Counter(
-    "synapse_http_server_responses", "", ["method", "code"]
+    "synapse_http_server_responses",
+    "",
+    labelnames=["method", "code", SERVER_NAME_LABEL],
 )

 response_timer = Histogram(
    "synapse_http_server_response_time_seconds",
    "sec",
-    ["method", "servlet", "tag", "code"],
+    labelnames=["method", "servlet", "tag", "code", SERVER_NAME_LABEL],
 )

 response_ru_utime = Counter(
-    "synapse_http_server_response_ru_utime_seconds", "sec", ["method", "servlet", "tag"]
+    "synapse_http_server_response_ru_utime_seconds",
+    "sec",
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 response_ru_stime = Counter(
-    "synapse_http_server_response_ru_stime_seconds", "sec", ["method", "servlet", "tag"]
+    "synapse_http_server_response_ru_stime_seconds",
+    "sec",
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 response_db_txn_count = Counter(
-    "synapse_http_server_response_db_txn_count", "", ["method", "servlet", "tag"]
+    "synapse_http_server_response_db_txn_count",
+    "",
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 # seconds spent waiting for db txns, excluding scheduling time, when processing
@@ -68,34 +80,42 @@ response_db_txn_count = Counter(
 response_db_txn_duration = Counter(
    "synapse_http_server_response_db_txn_duration_seconds",
    "",
-    ["method", "servlet", "tag"],
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 # seconds spent waiting for a db connection, when processing this request
 response_db_sched_duration = Counter(
    "synapse_http_server_response_db_sched_duration_seconds",
    "",
-    ["method", "servlet", "tag"],
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 # size in bytes of the response written
 response_size = Counter(
-    "synapse_http_server_response_size", "", ["method", "servlet", "tag"]
+    "synapse_http_server_response_size",
+    "",
+    labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
 )

 # In flight metrics are incremented while the requests are in flight, rather
 # than when the response was written.

 in_flight_requests_ru_utime = Counter(
-    "synapse_http_server_in_flight_requests_ru_utime_seconds", "", ["method", "servlet"]
+    "synapse_http_server_in_flight_requests_ru_utime_seconds",
+    "",
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
 )

 in_flight_requests_ru_stime = Counter(
-    "synapse_http_server_in_flight_requests_ru_stime_seconds", "", ["method", "servlet"]
+    "synapse_http_server_in_flight_requests_ru_stime_seconds",
+    "",
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
 )

 in_flight_requests_db_txn_count = Counter(
-    "synapse_http_server_in_flight_requests_db_txn_count", "", ["method", "servlet"]
+    "synapse_http_server_in_flight_requests_db_txn_count",
+    "",
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
 )

 # seconds spent waiting for db txns, excluding scheduling time, when processing
@@ -103,14 +123,14 @@ in_flight_requests_db_txn_count = Counter(
 in_flight_requests_db_txn_duration = Counter(
    "synapse_http_server_in_flight_requests_db_txn_duration_seconds",
    "",
-    ["method", "servlet"],
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
 )

 # seconds spent waiting for a db connection, when processing this request
 in_flight_requests_db_sched_duration = Counter(
    "synapse_http_server_in_flight_requests_db_sched_duration_seconds",
    "",
-    ["method", "servlet"],
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
 )

 _in_flight_requests: Set["RequestMetrics"] = set()
@@ -124,31 +144,42 @@ def _get_in_flight_counts() -> Mapping[Tuple[str, ...], int]:
    # Cast to a list to prevent it changing while the Prometheus
    # thread is collecting metrics
    with _in_flight_requests_lock:
-        reqs = list(_in_flight_requests)
+        request_metrics = list(_in_flight_requests)

-    for rm in reqs:
-        rm.update_metrics()
+    for request_metric in request_metrics:
+        request_metric.update_metrics()

    # Map from (method, name) -> int, the number of in flight requests of that
    # type. The key type is Tuple[str, str], but we leave the length unspecified
    # for compatability with LaterGauge's annotations.
    counts: Dict[Tuple[str, ...], int] = {}
-    for rm in reqs:
-        key = (rm.method, rm.name)
+    for request_metric in request_metrics:
+        key = (
+            request_metric.method,
+            request_metric.name,
+            request_metric.our_server_name,
+        )
        counts[key] = counts.get(key, 0) + 1

    return counts


 LaterGauge(
-    "synapse_http_server_in_flight_requests_count",
-    "",
-    ["method", "servlet"],
-    _get_in_flight_counts,
+    name="synapse_http_server_in_flight_requests_count",
+    desc="",
+    labelnames=["method", "servlet", SERVER_NAME_LABEL],
+    caller=_get_in_flight_counts,
 )


 class RequestMetrics:
+    def __init__(self, our_server_name: str) -> None:
+        """
+        Args:
+            our_server_name: Our homeserver name (used to label metrics) (`hs.hostname`)
+        """
+        self.our_server_name = our_server_name
+
    def start(self, time_sec: float, name: str, method: str) -> None:
        self.start_ts = time_sec
        self.start_context = current_context()
@@ -194,33 +225,40 @@ class RequestMetrics:

        response_code_str = str(response_code)

-        outgoing_responses_counter.labels(self.method, response_code_str).inc()
+        outgoing_responses_counter.labels(
+            method=self.method,
+            code=response_code_str,
+            **{SERVER_NAME_LABEL: self.our_server_name},
+        ).inc()

-        response_count.labels(self.method, self.name, tag).inc()
+        response_base_labels = {
+            "method": self.method,
+            "servlet": self.name,
+            "tag": tag,
+            SERVER_NAME_LABEL: self.our_server_name,
+        }

-        response_timer.labels(self.method, self.name, tag, response_code_str).observe(
-            time_sec - self.start_ts
-        )
+        response_count.labels(**response_base_labels).inc()
+
+        response_timer.labels(
+            code=response_code_str,
+            **response_base_labels,
+        ).observe(time_sec - self.start_ts)

        resource_usage = context.get_resource_usage()

-        response_ru_utime.labels(self.method, self.name, tag).inc(
-            resource_usage.ru_utime
-        )
-        response_ru_stime.labels(self.method, self.name, tag).inc(
-            resource_usage.ru_stime
-        )
-        response_db_txn_count.labels(self.method, self.name, tag).inc(
+        response_ru_utime.labels(**response_base_labels).inc(resource_usage.ru_utime)
+        response_ru_stime.labels(**response_base_labels).inc(resource_usage.ru_stime)
+        response_db_txn_count.labels(**response_base_labels).inc(
            resource_usage.db_txn_count
        )
-        response_db_txn_duration.labels(self.method, self.name, tag).inc(
+        response_db_txn_duration.labels(**response_base_labels).inc(
            resource_usage.db_txn_duration_sec
        )
-        response_db_sched_duration.labels(self.method, self.name, tag).inc(
+        response_db_sched_duration.labels(**response_base_labels).inc(
            resource_usage.db_sched_duration_sec
        )
-
-        response_size.labels(self.method, self.name, tag).inc(sent_bytes)
+        response_size.labels(**response_base_labels).inc(sent_bytes)

        # We always call this at the end to ensure that we update the metrics
        # regardless of whether a call to /metrics while the request was in
@@ -240,24 +278,30 @@ class RequestMetrics:
        diff = new_stats - self._request_stats
        self._request_stats = new_stats

+        in_flight_labels = {
+            "method": self.method,
+            "servlet": self.name,
+            SERVER_NAME_LABEL: self.our_server_name,
+        }
+
        # max() is used since rapid use of ru_stime/ru_utime can end up with the
        # count going backwards due to NTP, time smearing, fine-grained
        # correction, or floating points. Who knows, really?
-        in_flight_requests_ru_utime.labels(self.method, self.name).inc(
+        in_flight_requests_ru_utime.labels(**in_flight_labels).inc(
            max(diff.ru_utime, 0)
        )
-        in_flight_requests_ru_stime.labels(self.method, self.name).inc(
+        in_flight_requests_ru_stime.labels(**in_flight_labels).inc(
            max(diff.ru_stime, 0)
        )

-        in_flight_requests_db_txn_count.labels(self.method, self.name).inc(
+        in_flight_requests_db_txn_count.labels(**in_flight_labels).inc(
            diff.db_txn_count
        )

-        in_flight_requests_db_txn_duration.labels(self.method, self.name).inc(
+        in_flight_requests_db_txn_duration.labels(**in_flight_labels).inc(
            diff.db_txn_duration_sec
        )

-        in_flight_requests_db_sched_duration.labels(self.method, self.name).inc(
+        in_flight_requests_db_sched_duration.labels(**in_flight_labels).inc(
            diff.db_sched_duration_sec
        )
@@ -337,7 +337,7 @@ class _AsyncResource(resource.Resource, metaclass=abc.ABCMeta):
                    callback_return = await self._async_render(request)
                except LimitExceededError as e:
                    if e.pause:
-                        self._clock.sleep(e.pause)
+                        await self._clock.sleep(e.pause)
                    raise

                if callback_return is not None:
@@ -44,6 +44,7 @@ from synapse.logging.context import (
    LoggingContext,
    PreserveLoggingContext,
 )
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.types import ISynapseReactor, Requester

 if TYPE_CHECKING:
@@ -83,12 +84,14 @@ class SynapseRequest(Request):
        self,
        channel: HTTPChannel,
        site: "SynapseSite",
+        our_server_name: str,
        *args: Any,
        max_request_body_size: int = 1024,
        request_id_header: Optional[str] = None,
        **kw: Any,
    ):
        super().__init__(channel, *args, **kw)
+        self.our_server_name = our_server_name
        self._max_request_body_size = max_request_body_size
        self.request_id_header = request_id_header
        self.synapse_site = site
@@ -334,7 +337,11 @@ class SynapseRequest(Request):
            # dispatching to the handler, so that the handler
            # can update the servlet name in the request
            # metrics
-            requests_counter.labels(self.get_method(), self.request_metrics.name).inc()
+            requests_counter.labels(
+                method=self.get_method(),
+                servlet=self.request_metrics.name,
+                **{SERVER_NAME_LABEL: self.our_server_name},
+            ).inc()

    @contextlib.contextmanager
    def processing(self) -> Generator[None, None, None]:
@@ -455,7 +462,7 @@ class SynapseRequest(Request):
                self.request_metrics.name.
        """
        self.start_time = time.time()
-        self.request_metrics = RequestMetrics()
+        self.request_metrics = RequestMetrics(our_server_name=self.our_server_name)
        self.request_metrics.start(
            self.start_time, name=servlet_name, method=self.get_method()
        )
@@ -694,6 +701,7 @@ class SynapseSite(ProxySite):

        self.site_tag = site_tag
        self.reactor: ISynapseReactor = reactor
+        self.server_name = hs.hostname

        assert config.http_options is not None
        proxied = config.http_options.x_forwarded
@@ -705,6 +713,7 @@ class SynapseSite(ProxySite):
            return request_class(
                channel,
                self,
+                our_server_name=self.server_name,
                max_request_body_size=max_request_body_size,
                queued=queued,
                request_id_header=request_id_header,
@@ -0,0 +1,25 @@
+import logging
+
+root_logger = logging.getLogger()
+
+
+class ExplicitlyConfiguredLogger(logging.Logger):
+    """
+    A custom logger class that only allows logging if the logger is explicitly
+    configured (does not inherit log level from parent).
+    """
+
+    def isEnabledFor(self, level: int) -> bool:
+        # Check if the logger is explicitly configured
+        explicitly_configured_logger = self.manager.loggerDict.get(self.name)
+
+        log_level = logging.NOTSET
+        if isinstance(explicitly_configured_logger, logging.Logger):
+            log_level = explicitly_configured_logger.level
+
+        # If the logger is not configured, we don't log anything
+        if log_level == logging.NOTSET:
+            return False
+
+        # Otherwise, follow the normal logging behavior
+        return level >= log_level
@@ -186,12 +186,16 @@ class MediaRepository:

    def _start_update_recently_accessed(self) -> Deferred:
        return run_as_background_process(
-            "update_recently_accessed_media", self._update_recently_accessed
+            "update_recently_accessed_media",
+            self.server_name,
+            self._update_recently_accessed,
        )

    def _start_apply_media_retention_rules(self) -> Deferred:
        return run_as_background_process(
-            "apply_media_retention_rules", self._apply_media_retention_rules
+            "apply_media_retention_rules",
+            self.server_name,
+            self._apply_media_retention_rules,
        )

    async def _update_recently_accessed(self) -> None:
@@ -740,7 +740,7 @@ class UrlPreviewer:

    def _start_expire_url_cache_data(self) -> Deferred:
        return run_as_background_process(
-            "expire_url_cache_data", self._expire_url_cache_data
+            "expire_url_cache_data", self.server_name, self._expire_url_cache_data
        )

    async def _expire_url_cache_data(self) -> None:
@@ -33,6 +33,7 @@ from typing import (
    Iterable,
    Mapping,
    Optional,
+    Sequence,
    Set,
    Tuple,
    Type,
@@ -91,6 +92,7 @@ terms, an endpoint you can scrape is called an *instance*, usually corresponding
 single process." (source: https://prometheus.io/docs/concepts/jobs_instances/)
 """

+
 CONTENT_TYPE_LATEST = "text/plain; version=0.0.4; charset=utf-8"
 """
 Content type of the latest text format for Prometheus metrics.
@@ -154,13 +156,13 @@ class _RegistryProxy:
 RegistryProxy = cast(CollectorRegistry, _RegistryProxy)


-@attr.s(slots=True, hash=True, auto_attribs=True)
+@attr.s(slots=True, hash=True, auto_attribs=True, kw_only=True)
 class LaterGauge(Collector):
    """A Gauge which periodically calls a user-provided callback to produce metrics."""

    name: str
    desc: str
-    labels: Optional[StrSequence] = attr.ib(hash=False)
+    labelnames: Optional[StrSequence] = attr.ib(hash=False)
    # callback: should either return a value (if there are no labels for this metric),
    # or dict mapping from a label tuple to a value
    caller: Callable[
@@ -168,7 +170,9 @@ class LaterGauge(Collector):
    ]

    def collect(self) -> Iterable[Metric]:
-        g = GaugeMetricFamily(self.name, self.desc, labels=self.labels)
+        # The decision to add `SERVER_NAME_LABEL` is from the `LaterGauge` usage itself
+        # (we don't enforce it here, one level up).
+        g = GaugeMetricFamily(self.name, self.desc, labels=self.labelnames)  # type: ignore[missing-server-name-label]

        try:
            calls = self.caller()
@@ -302,7 +306,9 @@ class InFlightGauge(Generic[MetricsEntry], Collector):

        Note: may be called by a separate thread.
        """
-        in_flight = GaugeMetricFamily(
+        # The decision to add `SERVER_NAME_LABEL` is from the `InFlightGauge` usage
+        # itself (we don't enforce it here, one level up).
+        in_flight = GaugeMetricFamily(  # type: ignore[missing-server-name-label]
            self.name + "_total", self.desc, labels=self.labels
        )

@@ -326,7 +332,9 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
        yield in_flight

        for name in self.sub_metrics:
-            gauge = GaugeMetricFamily(
+            # The decision to add `SERVER_NAME_LABEL` is from the `InFlightGauge` usage
+            # itself (we don't enforce it here, one level up).
+            gauge = GaugeMetricFamily(  # type: ignore[missing-server-name-label]
                "_".join([self.name, name]), "", labels=self.labels
            )
            for key, metrics in metrics_by_key.items():
@@ -342,6 +350,51 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
        all_gauges[self.name] = self


+class GaugeHistogramMetricFamilyWithLabels(GaugeHistogramMetricFamily):
+    """
+    Custom version of `GaugeHistogramMetricFamily` from `prometheus_client` that allows
+    specifying labels and label values.
+
+    A single gauge histogram and its samples.
+
+    For use by custom collectors.
+    """
+
+    def __init__(
+        self,
+        *,
+        name: str,
+        documentation: str,
+        gsum_value: float,
+        buckets: Optional[Sequence[Tuple[str, float]]] = None,
+        labelnames: StrSequence = (),
+        labelvalues: StrSequence = (),
+        unit: str = "",
+    ):
+        # Sanity check the number of label values matches the number of label names.
+        if len(labelvalues) != len(labelnames):
+            raise ValueError(
+                "The number of label values must match the number of label names"
+            )
+
+        # Call the super to validate and set the labelnames. We use this stable API
+        # instead of setting the internal `_labelnames` field directly.
+        super().__init__(
+            name=name,
+            documentation=documentation,
+            labels=labelnames,
+            # Since `GaugeHistogramMetricFamily` doesn't support supplying `labels` and
+            # `buckets` at the same time (artificial limitation), we will just set these
+            # as `None` and set up the buckets ourselves just below.
+            buckets=None,
+            gsum_value=None,
+        )
+
+        # Create a gauge for each bucket.
+        if buckets is not None:
+            self.add_metric(labels=labelvalues, buckets=buckets, gsum_value=gsum_value)
+
+
 class GaugeBucketCollector(Collector):
    """Like a Histogram, but the buckets are Gauges which are updated atomically.

@@ -354,14 +407,17 @@ class GaugeBucketCollector(Collector):
    __slots__ = (
        "_name",
        "_documentation",
+        "_labelnames",
        "_bucket_bounds",
        "_metric",
    )

    def __init__(
        self,
+        *,
        name: str,
        documentation: str,
+        labelnames: Optional[StrSequence],
        buckets: Iterable[float],
        registry: CollectorRegistry = REGISTRY,
    ):
@@ -375,6 +431,7 @@ class GaugeBucketCollector(Collector):
        """
        self._name = name
        self._documentation = documentation
+        self._labelnames = labelnames if labelnames else ()

        # the tops of the buckets
        self._bucket_bounds = [float(b) for b in buckets]
@@ -386,7 +443,7 @@ class GaugeBucketCollector(Collector):

        # We initially set this to None. We won't report metrics until
        # this has been initialised after a successful data update
-        self._metric: Optional[GaugeHistogramMetricFamily] = None
+        self._metric: Optional[GaugeHistogramMetricFamilyWithLabels] = None

        registry.register(self)

@@ -395,15 +452,26 @@ class GaugeBucketCollector(Collector):
        if self._metric is not None:
            yield self._metric

-    def update_data(self, values: Iterable[float]) -> None:
+    def update_data(self, values: Iterable[float], labels: StrSequence = ()) -> None:
        """Update the data to be reported by the metric

        The existing data is cleared, and each measurement in the input is assigned
        to the relevant bucket.
-        """
-        self._metric = self._values_to_metric(values)

-    def _values_to_metric(self, values: Iterable[float]) -> GaugeHistogramMetricFamily:
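+        Args:
+            values: The measurements to sort into buckets.
+            labels: The label values for this metric, in the same order as the
+                `labelnames` given to the constructor.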
+        """
+        self._metric = self._values_to_metric(values, labels)
+
+    def _values_to_metric(
+        self, values: Iterable[float], labels: StrSequence = ()
+    ) -> GaugeHistogramMetricFamilyWithLabels:
+        """
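+        Args:
+            values: The measurements to sort into buckets.
+            labels: The label values for this metric, in the same order as the
+                `labelnames` given to the constructor.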
+        """
        total = 0.0
        bucket_values = [0 for _ in self._bucket_bounds]

@@ -421,9 +489,13 @@ class GaugeBucketCollector(Collector):
        # that bucket or below.
        accumulated_values = itertools.accumulate(bucket_values)

-        return GaugeHistogramMetricFamily(
-            self._name,
-            self._documentation,
+        # The decision to include `SERVER_NAME_LABEL` is made where the
+        # `GaugeBucketCollector` is created (one level up), so we don't enforce it here.
+        return GaugeHistogramMetricFamilyWithLabels(  # type: ignore[missing-server-name-label]
+            name=self._name,
+            documentation=self._documentation,
+            labelnames=self._labelnames,
+            labelvalues=labels,
            buckets=list(
                zip((str(b) for b in self._bucket_bounds), accumulated_values)
            ),
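
The bucketing in `_values_to_metric` is only partly visible in this hunk, so the following is a minimal standalone sketch of the idea rather than the Synapse implementation. It assumes the bucket bounds are sorted and end with `inf`; the helper name `values_to_buckets` is hypothetical and it uses only the standard library.

```python
import itertools
from typing import Iterable, List, Tuple


def values_to_buckets(
    values: Iterable[float], bucket_bounds: List[float]
) -> Tuple[List[Tuple[str, int]], float]:
    """Return cumulative (upper bound, count) pairs and the sum of all values.

    Assumes `bucket_bounds` is sorted and ends with `inf`, so every value
    falls into some bucket.
    """
    total = 0.0
    bucket_values = [0 for _ in bucket_bounds]

    for v in values:
        total += v
        # Find the first bucket whose upper bound is >= the value.
        for i, bound in enumerate(bucket_bounds):
            if v <= bound:
                bucket_values[i] += 1
                break

    # Histogram buckets are cumulative: each one counts every value at or
    # below its upper bound.
    accumulated = itertools.accumulate(bucket_values)
    return list(zip((str(b) for b in bucket_bounds), accumulated)), total


buckets, gsum = values_to_buckets(
    [0.5, 1.5, 1.75, 12.0], [1.0, 2.0, 5.0, float("inf")]
)
print(buckets)
print(gsum)
```

The resulting list of `(bound, cumulative count)` pairs has the same shape as the `buckets` argument built in the hunk above.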
@@ -455,61 +527,82 @@ class CPUMetrics(Collector):
            line = s.read()
            raw_stats = line.split(") ", 1)[1].split(" ")

-            user = GaugeMetricFamily("process_cpu_user_seconds_total", "")
+            # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+            user = GaugeMetricFamily("process_cpu_user_seconds_total", "")  # type: ignore[missing-server-name-label]
            user.add_metric([], float(raw_stats[11]) / self.ticks_per_sec)
            yield user

-            sys = GaugeMetricFamily("process_cpu_system_seconds_total", "")
+            # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+            sys = GaugeMetricFamily("process_cpu_system_seconds_total", "")  # type: ignore[missing-server-name-label]
            sys.add_metric([], float(raw_stats[12]) / self.ticks_per_sec)
            yield sys


-REGISTRY.register(CPUMetrics())
+# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+REGISTRY.register(CPUMetrics())  # type: ignore[missing-server-name-label]


 #
 # Federation Metrics
 #

-sent_transactions_counter = Counter("synapse_federation_client_sent_transactions", "")
+sent_transactions_counter = Counter(
+    "synapse_federation_client_sent_transactions", "", labelnames=[SERVER_NAME_LABEL]
+)

-events_processed_counter = Counter("synapse_federation_client_events_processed", "")
+events_processed_counter = Counter(
+    "synapse_federation_client_events_processed", "", labelnames=[SERVER_NAME_LABEL]
+)

 event_processing_loop_counter = Counter(
-    "synapse_event_processing_loop_count", "Event processing loop iterations", ["name"]
+    "synapse_event_processing_loop_count",
+    "Event processing loop iterations",
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 event_processing_loop_room_count = Counter(
    "synapse_event_processing_loop_room_count",
    "Rooms seen per event processing loop iteration",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )


 # Used to track where various components have processed in the event stream,
 # e.g. federation sending, appservice sending, etc.
-event_processing_positions = Gauge("synapse_event_processing_positions", "", ["name"])
+event_processing_positions = Gauge(
+    "synapse_event_processing_positions", "", labelnames=["name", SERVER_NAME_LABEL]
+)

 # Used to track the current max events stream position
-event_persisted_position = Gauge("synapse_event_persisted_position", "")
+event_persisted_position = Gauge(
+    "synapse_event_persisted_position", "", labelnames=[SERVER_NAME_LABEL]
+)

 # Used to track the received_ts of the last event processed by various
 # components
-event_processing_last_ts = Gauge("synapse_event_processing_last_ts", "", ["name"])
+event_processing_last_ts = Gauge(
+    "synapse_event_processing_last_ts", "", labelnames=["name", SERVER_NAME_LABEL]
+)

 # Used to track the lag processing events. This is the time difference
 # between the last processed event's received_ts and the time it was
 # finished being processed.
-event_processing_lag = Gauge("synapse_event_processing_lag", "", ["name"])
+event_processing_lag = Gauge(
+    "synapse_event_processing_lag", "", labelnames=["name", SERVER_NAME_LABEL]
+)

 event_processing_lag_by_event = Histogram(
    "synapse_event_processing_lag_by_event",
    "Time between an event being persisted and it being queued up to be sent to the relevant remote servers",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 # Build info of the running server.
-build_info = Gauge(
+#
+# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`. We
+# consider this process-level because all Synapse homeservers running in the process
+# will use the same Synapse version.
+build_info = Gauge(  # type: ignore[missing-server-name-label]
    "synapse_build_info", "Build information", ["pythonversion", "version", "osversion"]
 )
 build_info.labels(
@@ -525,44 +618,57 @@ threepid_send_requests = Histogram(
    " there is a request with try count of 4, then there would have been one"
    " each for 1, 2 and 3",
    buckets=(1, 2, 3, 4, 5, 10),
-    labelnames=("type", "reason"),
+    labelnames=("type", "reason", SERVER_NAME_LABEL),
 )

 threadpool_total_threads = Gauge(
    "synapse_threadpool_total_threads",
    "Total number of threads currently in the threadpool",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 threadpool_total_working_threads = Gauge(
    "synapse_threadpool_working_threads",
    "Number of threads currently working in the threadpool",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 threadpool_total_min_threads = Gauge(
    "synapse_threadpool_min_threads",
    "Minimum number of threads configured in the threadpool",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 threadpool_total_max_threads = Gauge(
    "synapse_threadpool_max_threads",
    "Maximum number of threads configured in the threadpool",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )


-def register_threadpool(name: str, threadpool: ThreadPool) -> None:
-    """Add metrics for the threadpool."""
+def register_threadpool(*, name: str, server_name: str, threadpool: ThreadPool) -> None:
+    """
+    Add metrics for the threadpool.

-    threadpool_total_min_threads.labels(name).set(threadpool.min)
-    threadpool_total_max_threads.labels(name).set(threadpool.max)
+    Args:
+        name: The name of the threadpool, used to identify it in the metrics.
+        server_name: The homeserver name, used to label the metrics (this should be
+            `hs.hostname`).
+        threadpool: The threadpool to register metrics for.
+    """

-    threadpool_total_threads.labels(name).set_function(lambda: len(threadpool.threads))
-    threadpool_total_working_threads.labels(name).set_function(
-        lambda: len(threadpool.working)
-    )
+    threadpool_total_min_threads.labels(
+        name=name, **{SERVER_NAME_LABEL: server_name}
+    ).set(threadpool.min)
+    threadpool_total_max_threads.labels(
+        name=name, **{SERVER_NAME_LABEL: server_name}
+    ).set(threadpool.max)
+
+    threadpool_total_threads.labels(
+        name=name, **{SERVER_NAME_LABEL: server_name}
+    ).set_function(lambda: len(threadpool.threads))
+    threadpool_total_working_threads.labels(
+        name=name, **{SERVER_NAME_LABEL: server_name}
+    ).set_function(lambda: len(threadpool.working))


 class MetricsResource(Resource):
@@ -54,8 +54,9 @@ running_on_pypy = platform.python_implementation() == "PyPy"
 # Python GC metrics
 #

-gc_unreachable = Gauge("python_gc_unreachable_total", "Unreachable GC objects", ["gen"])
-gc_time = Histogram(
+# These are process-level metrics, so they do not have the `SERVER_NAME_LABEL`.
+gc_unreachable = Gauge("python_gc_unreachable_total", "Unreachable GC objects", ["gen"])  # type: ignore[missing-server-name-label]
+gc_time = Histogram(  # type: ignore[missing-server-name-label]
    "python_gc_time",
    "Time taken to GC (sec)",
    ["gen"],
@@ -82,7 +83,8 @@ gc_time = Histogram(

 class GCCounts(Collector):
    def collect(self) -> Iterable[Metric]:
-        cm = GaugeMetricFamily("python_gc_counts", "GC object counts", labels=["gen"])
+        # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+        cm = GaugeMetricFamily("python_gc_counts", "GC object counts", labels=["gen"])  # type: ignore[missing-server-name-label]
        for n, m in enumerate(gc.get_count()):
            cm.add_metric([str(n)], m)

@@ -101,7 +103,8 @@ def install_gc_manager() -> None:
    if running_on_pypy:
        return

-    REGISTRY.register(GCCounts())
+    # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+    REGISTRY.register(GCCounts())  # type: ignore[missing-server-name-label]

    gc.disable()

@@ -176,7 +179,8 @@ class PyPyGCStats(Collector):
        #
        #     Total time spent in GC:  0.073                  # s.total_gc_time

-        pypy_gc_time = CounterMetricFamily(
+        # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+        pypy_gc_time = CounterMetricFamily(  # type: ignore[missing-server-name-label]
            "pypy_gc_time_seconds_total",
            "Total time spent in PyPy GC",
            labels=[],
@@ -184,7 +188,8 @@ class PyPyGCStats(Collector):
        pypy_gc_time.add_metric([], s.total_gc_time / 1000)
        yield pypy_gc_time

-        pypy_mem = GaugeMetricFamily(
+        # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+        pypy_mem = GaugeMetricFamily(  # type: ignore[missing-server-name-label]
            "pypy_memory_bytes",
            "Memory tracked by PyPy allocator",
            labels=["state", "class", "kind"],
@@ -208,4 +213,5 @@ class PyPyGCStats(Collector):


 if running_on_pypy:
-    REGISTRY.register(PyPyGCStats())
+    # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+    REGISTRY.register(PyPyGCStats())  # type: ignore[missing-server-name-label]
@@ -62,7 +62,8 @@ logger = logging.getLogger(__name__)
 # Twisted reactor metrics
 #

-tick_time = Histogram(
+# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+tick_time = Histogram(  # type: ignore[missing-server-name-label]
    "python_twisted_reactor_tick_time",
    "Tick time of the Twisted reactor (sec)",
    buckets=[0.001, 0.002, 0.005, 0.01, 0.025, 0.05, 0.1, 0.2, 0.5, 1, 2, 5],
@@ -114,7 +115,8 @@ class ReactorLastSeenMetric(Collector):
        self._call_wrapper = call_wrapper

    def collect(self) -> Iterable[Metric]:
-        cm = GaugeMetricFamily(
+        # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+        cm = GaugeMetricFamily(  # type: ignore[missing-server-name-label]
            "python_twisted_reactor_last_seen",
            "Seconds since the Twisted reactor was last seen",
        )
@@ -165,4 +167,5 @@ except Exception as e:


 if wrapper:
-    REGISTRY.register(ReactorLastSeenMetric(wrapper))
+    # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+    REGISTRY.register(ReactorLastSeenMetric(wrapper))  # type: ignore[missing-server-name-label]
@@ -31,6 +31,7 @@ from typing import (
    Dict,
    Iterable,
    Optional,
+    Protocol,
    Set,
    Type,
    TypeVar,
@@ -39,7 +40,7 @@ from typing import (

 from prometheus_client import Metric
 from prometheus_client.core import REGISTRY, Counter, Gauge
-from typing_extensions import ParamSpec
+from typing_extensions import Concatenate, ParamSpec

 from twisted.internet import defer

@@ -49,6 +50,7 @@ from synapse.logging.context import (
    PreserveLoggingContext,
 )
 from synapse.logging.opentracing import SynapseTags, start_active_span
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics._types import Collector

 if TYPE_CHECKING:
@@ -64,13 +66,13 @@ logger = logging.getLogger(__name__)
 _background_process_start_count = Counter(
    "synapse_background_process_start_count",
    "Number of background processes started",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 _background_process_in_flight_count = Gauge(
    "synapse_background_process_in_flight_count",
    "Number of background processes in flight",
-    labelnames=["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
 )

 # we set registry=None in all of these to stop them getting registered with
@@ -80,21 +82,21 @@ _background_process_in_flight_count = Gauge(
 _background_process_ru_utime = Counter(
    "synapse_background_process_ru_utime_seconds",
    "User CPU time used by background processes, in seconds",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
    registry=None,
 )

 _background_process_ru_stime = Counter(
    "synapse_background_process_ru_stime_seconds",
    "System CPU time used by background processes, in seconds",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
    registry=None,
 )

 _background_process_db_txn_count = Counter(
    "synapse_background_process_db_txn_count",
    "Number of database transactions done by background processes",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
    registry=None,
 )

@@ -104,14 +106,14 @@ _background_process_db_txn_duration = Counter(
        "Seconds spent by background processes waiting for database "
        "transactions, excluding scheduling time"
    ),
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
    registry=None,
 )

 _background_process_db_sched_duration = Counter(
    "synapse_background_process_db_sched_duration_seconds",
    "Seconds spent by background processes waiting for database connections",
-    ["name"],
+    labelnames=["name", SERVER_NAME_LABEL],
    registry=None,
 )

@@ -165,12 +167,15 @@ class _Collector(Collector):
            yield from m.collect()


-REGISTRY.register(_Collector())
+# The `SERVER_NAME_LABEL` is included in the individual metrics added to this registry,
+# so we don't need to worry about it on the collector itself.
+REGISTRY.register(_Collector())  # type: ignore[missing-server-name-label]


 class _BackgroundProcess:
-    def __init__(self, desc: str, ctx: LoggingContext):
+    def __init__(self, *, desc: str, server_name: str, ctx: LoggingContext):
        self.desc = desc
+        self.server_name = server_name
        self._context = ctx
        self._reported_stats: Optional[ContextResourceUsage] = None

@@ -185,15 +190,21 @@ class _BackgroundProcess:

        # For unknown reasons, the difference in times can be negative. See comment in
        # synapse.http.request_metrics.RequestMetrics.update_metrics.
-        _background_process_ru_utime.labels(self.desc).inc(max(diff.ru_utime, 0))
-        _background_process_ru_stime.labels(self.desc).inc(max(diff.ru_stime, 0))
-        _background_process_db_txn_count.labels(self.desc).inc(diff.db_txn_count)
-        _background_process_db_txn_duration.labels(self.desc).inc(
-            diff.db_txn_duration_sec
-        )
-        _background_process_db_sched_duration.labels(self.desc).inc(
-            diff.db_sched_duration_sec
-        )
+        _background_process_ru_utime.labels(
+            name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc(max(diff.ru_utime, 0))
+        _background_process_ru_stime.labels(
+            name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc(max(diff.ru_stime, 0))
+        _background_process_db_txn_count.labels(
+            name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc(diff.db_txn_count)
+        _background_process_db_txn_duration.labels(
+            name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc(diff.db_txn_duration_sec)
+        _background_process_db_sched_duration.labels(
+            name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
+        ).inc(diff.db_sched_duration_sec)


 R = TypeVar("R")
@@ -201,6 +212,7 @@ R = TypeVar("R")

 def run_as_background_process(
    desc: "LiteralString",
+    server_name: str,
    func: Callable[..., Awaitable[Optional[R]]],
    *args: Any,
    bg_start_span: bool = True,
@@ -218,6 +230,8 @@ def run_as_background_process(

    Args:
        desc: a description for this background process type
+        server_name: The homeserver name that this background process is being run for
+            (this should be `hs.hostname`).
        func: a function, which may return a Deferred or a coroutine
        bg_start_span: Whether to start an opentracing span. Defaults to True.
            Should only be disabled for processes that will not log to or tag
@@ -236,10 +250,16 @@ def run_as_background_process(
            count = _background_process_counts.get(desc, 0)
            _background_process_counts[desc] = count + 1

-        _background_process_start_count.labels(desc).inc()
-        _background_process_in_flight_count.labels(desc).inc()
+        _background_process_start_count.labels(
+            name=desc, **{SERVER_NAME_LABEL: server_name}
+        ).inc()
+        _background_process_in_flight_count.labels(
+            name=desc, **{SERVER_NAME_LABEL: server_name}
+        ).inc()

-        with BackgroundProcessLoggingContext(desc, count) as context:
+        with BackgroundProcessLoggingContext(
+            name=desc, server_name=server_name, instance_id=count
+        ) as context:
            try:
                if bg_start_span:
                    ctx = start_active_span(
@@ -256,7 +276,9 @@ def run_as_background_process(
                )
                return None
            finally:
-                _background_process_in_flight_count.labels(desc).dec()
+                _background_process_in_flight_count.labels(
+                    name=desc, **{SERVER_NAME_LABEL: server_name}
+                ).dec()

    with PreserveLoggingContext():
        # Note that we return a Deferred here so that it can be used in a
@@ -267,6 +289,14 @@ def run_as_background_process(
 P = ParamSpec("P")


+class HasServerName(Protocol):
+    server_name: str
+    """
+    The homeserver name that this object is associated with (used to label the
+    metrics) (`hs.hostname`).
+    """
+
+
 def wrap_as_background_process(
    desc: "LiteralString",
 ) -> Callable[
@@ -292,22 +322,37 @@ def wrap_as_background_process(
    multiple places.
    """

-    def wrap_as_background_process_inner(
-        func: Callable[P, Awaitable[Optional[R]]],
+    def wrapper(
+        func: Callable[Concatenate[HasServerName, P], Awaitable[Optional[R]]],
    ) -> Callable[P, "defer.Deferred[Optional[R]]"]:
        @wraps(func)
-        def wrap_as_background_process_inner_2(
-            *args: P.args, **kwargs: P.kwargs
+        def wrapped_func(
+            self: HasServerName, *args: P.args, **kwargs: P.kwargs
        ) -> "defer.Deferred[Optional[R]]":
-            # type-ignore: mypy is confusing kwargs with the bg_start_span kwarg.
-            #     Argument 4 to "run_as_background_process" has incompatible type
-            #     "**P.kwargs"; expected "bool"
-            # See https://github.com/python/mypy/issues/8862
-            return run_as_background_process(desc, func, *args, **kwargs)  # type: ignore[arg-type]
+            assert self.server_name is not None, (
+                "The `server_name` attribute must be set on the object where `@wrap_as_background_process` decorator is used."
+            )

-        return wrap_as_background_process_inner_2
+            return run_as_background_process(
+                desc,
+                self.server_name,
+                func,
+                self,
+                *args,
+                # type-ignore: mypy is confusing kwargs with the bg_start_span kwarg.
+                #     Argument 4 to "run_as_background_process" has incompatible type
+                #     "**P.kwargs"; expected "bool"
+                # See https://github.com/python/mypy/issues/8862
+                **kwargs,  # type: ignore[arg-type]
+            )

-    return wrap_as_background_process_inner
+        # There are some shenanigans here, because we're decorating a method but
+        # explicitly making use of the `self` parameter. The key thing here is that the
+        # return type within the return type for `wrap_as_background_process` itself
+        # describes how the decorated function will be called.
+        return wrapped_func  # type: ignore[return-value]
+
+    return wrapper  # type: ignore[return-value]


 class BackgroundProcessLoggingContext(LoggingContext):
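
The reworked `wrap_as_background_process` decorator reads `server_name` from the instance it decorates and passes `self` on explicitly. A minimal sketch of that pattern follows, using only the standard library. `run_with_labels`, `wrap_with_labels` and `Pusher` are hypothetical stand-ins, and the stand-in runner calls the function synchronously instead of returning a Deferred.

```python
from functools import wraps
from typing import Any, Callable, List, Tuple

# Hypothetical stand-in for `run_as_background_process`: it records the
# labels it was given and then calls the function synchronously.
calls: List[Tuple[str, str]] = []


def run_with_labels(
    desc: str, server_name: str, func: Callable[..., Any], *args: Any, **kwargs: Any
) -> Any:
    calls.append((desc, server_name))
    return func(*args, **kwargs)


def wrap_with_labels(desc: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def wrapper(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapped_func(self: Any, *args: Any, **kwargs: Any) -> Any:
            assert self.server_name is not None
            # `func` is the plain function, not a bound method, so `self`
            # has to be passed through explicitly.
            return run_with_labels(desc, self.server_name, func, self, *args, **kwargs)

        return wrapped_func

    return wrapper


class Pusher:
    def __init__(self, server_name: str) -> None:
        self.server_name = server_name

    @wrap_with_labels("pusher.send")
    def send(self, n: int) -> int:
        return n * 2


result = Pusher("example.com").send(21)
print(result, calls)
```

The decorator only works on methods of objects that expose a `server_name` attribute, which is what the `HasServerName` protocol in the hunk above expresses for the type checker.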
@@ -317,13 +362,20 @@ class BackgroundProcessLoggingContext(LoggingContext):

    __slots__ = ["_proc"]

-    def __init__(self, name: str, instance_id: Optional[Union[int, str]] = None):
+    def __init__(
+        self,
+        *,
+        name: str,
+        server_name: str,
+        instance_id: Optional[Union[int, str]] = None,
+    ):
        """

        Args:
            name: The name of the background process. Each distinct `name` gets a
                separate prometheus time series.
-
+            server_name: The homeserver name that this background process is being run for
+                (this should be `hs.hostname`).
            instance_id: an identifer to add to `name` to distinguish this instance of
                the named background process in the logs. If this is `None`, one is
                made up based on id(self).
@@ -331,7 +383,9 @@ class BackgroundProcessLoggingContext(LoggingContext):
        if instance_id is None:
            instance_id = id(self)
        super().__init__("%s-%s" % (name, instance_id))
-        self._proc: Optional[_BackgroundProcess] = _BackgroundProcess(name, self)
+        self._proc: Optional[_BackgroundProcess] = _BackgroundProcess(
+            desc=name, server_name=server_name, ctx=self
+        )

    def start(self, rusage: "Optional[resource.struct_rusage]") -> None:
        """Log context has started running (again)."""
@@ -22,6 +22,7 @@ from typing import TYPE_CHECKING

 import attr

+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.metrics.background_process_metrics import run_as_background_process

 if TYPE_CHECKING:
@@ -33,6 +34,7 @@ from prometheus_client import Gauge
 current_dau_gauge = Gauge(
    "synapse_admin_daily_active_users",
    "Current daily active users count",
+    labelnames=[SERVER_NAME_LABEL],
 )


@@ -47,6 +49,7 @@ class CommonUsageMetricsManager:
    """Collects common usage metrics."""

    def __init__(self, hs: "HomeServer") -> None:
+        self.server_name = hs.hostname
        self._store = hs.get_datastores().main
        self._clock = hs.get_clock()

@@ -62,12 +65,15 @@ class CommonUsageMetricsManager:
    async def setup(self) -> None:
        """Keep the gauges for common usage metrics up to date."""
        run_as_background_process(
-            desc="common_usage_metrics_update_gauges", func=self._update_gauges
+            desc="common_usage_metrics_update_gauges",
+            server_name=self.server_name,
+            func=self._update_gauges,
        )
        self._clock.looping_call(
            run_as_background_process,
            5 * 60 * 1000,
            desc="common_usage_metrics_update_gauges",
+            server_name=self.server_name,
            func=self._update_gauges,
        )

@@ -85,4 +91,6 @@ class CommonUsageMetricsManager:
        """Update the Prometheus gauges."""
        metrics = await self._collect()

-        current_dau_gauge.set(float(metrics.daily_active_users))
+        current_dau_gauge.labels(
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).set(float(metrics.daily_active_users))
@@ -188,7 +188,8 @@ def _setup_jemalloc_stats() -> None:
        def collect(self) -> Iterable[Metric]:
            stats.refresh_stats()

-            g = GaugeMetricFamily(
+            # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+            g = GaugeMetricFamily(  # type: ignore[missing-server-name-label]
                "jemalloc_stats_app_memory_bytes",
                "The stats reported by jemalloc",
                labels=["type"],
@@ -230,7 +231,8 @@ def _setup_jemalloc_stats() -> None:

            yield g

-    REGISTRY.register(JemallocCollector())
+    # This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
+    REGISTRY.register(JemallocCollector())  # type: ignore[missing-server-name-label]

    logger.debug("Added jemalloc stats")

@@ -23,6 +23,7 @@ import logging
 from typing import (
    TYPE_CHECKING,
    Any,
+    Awaitable,
    Callable,
    Collection,
    Dict,
@@ -80,7 +81,9 @@ from synapse.logging.context import (
    make_deferred_yieldable,
    run_in_background,
 )
-from synapse.metrics.background_process_metrics import run_as_background_process
+from synapse.metrics.background_process_metrics import (
+    run_as_background_process as _run_as_background_process,
+)
 from synapse.module_api.callbacks.account_validity_callbacks import (
    IS_USER_EXPIRED_CALLBACK,
    ON_LEGACY_ADMIN_REQUEST,
@@ -158,6 +161,9 @@ from synapse.util.caches.descriptors import CachedFunction, cached as _cached
 from synapse.util.frozenutils import freeze

 if TYPE_CHECKING:
+    # Old versions don't have `LiteralString`
+    from typing_extensions import LiteralString
+
    from synapse.app.generic_worker import GenericWorkerStore
    from synapse.server import HomeServer

@@ -216,6 +222,65 @@ class UserIpAndAgent:
    last_seen: int


+def run_as_background_process(
+    desc: "LiteralString",
+    func: Callable[..., Awaitable[Optional[T]]],
+    *args: Any,
+    bg_start_span: bool = True,
+    **kwargs: Any,
+) -> "defer.Deferred[Optional[T]]":
+    """
+    XXX: Deprecated: use `ModuleApi.run_as_background_process` instead.
+
+    Run the given function in its own logcontext, with resource metrics
+
+    This should be used to wrap processes which are fired off to run in the
+    background, instead of being associated with a particular request.
+
+    It returns a Deferred which completes when the function completes, but it doesn't
+    follow the synapse logcontext rules, which makes it appropriate for passing to
+    clock.looping_call and friends (or for firing-and-forgetting in the middle of a
+    normal synapse async function).
+
+    Args:
+        desc: a description for this background process type
+        func: a function, which may return a Deferred or a coroutine
+        bg_start_span: Whether to start an opentracing span. Defaults to True.
+            Should only be disabled for processes that will not log to or tag
+            a span.
+        args: positional args for func
+        kwargs: keyword args for func
+
+    Returns:
+        Deferred which returns the result of func, or `None` if func raises.
+        Note that the returned Deferred does not follow the synapse logcontext
+        rules.
+    """
+
+    logger.warning(
+        "Using deprecated `run_as_background_process` that's exported from the Module API. "
+        "Prefer `ModuleApi.run_as_background_process` instead.",
+    )
+
+    # This function has historically been exported from the module API, so we can't
+    # just change the signature to require a `server_name` argument. Since
+    # `run_as_background_process` internally in Synapse requires `server_name` now, we
+    # just have to stub this out with a placeholder value and tell people to use the new
+    # function instead.
+    stub_server_name = "synapse_module_running_from_unknown_server"
+
+    return _run_as_background_process(
+        desc,
+        stub_server_name,
+        func,
+        *args,
+        bg_start_span=bg_start_span,
+        **kwargs,
+    )
+
+
 def cached(
    *,
    max_entries: int = 1000,
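
The hunk above keeps the old module-level `run_as_background_process` signature working by substituting a placeholder server name, while the new `ModuleApi` method supplies the real one. A minimal sketch of that compatibility-shim pattern follows. The names `_run_labelled`, `run_labelled` and `Api` are hypothetical, and the internal function returns the label it received purely so the behaviour can be inspected.

```python
import logging
from typing import Any, Callable, Tuple

logger = logging.getLogger(__name__)

# Same placeholder value as in the diff above.
STUB_SERVER_NAME = "synapse_module_running_from_unknown_server"


def _run_labelled(
    desc: str, server_name: str, func: Callable[..., Any], *args: Any, **kwargs: Any
) -> Tuple[str, Any]:
    # Hypothetical internal function that now requires `server_name`.
    return server_name, func(*args, **kwargs)


def run_labelled(desc: str, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[str, Any]:
    """Deprecated: keeps the old signature and uses a placeholder label."""
    logger.warning("Using deprecated `run_labelled`; prefer `Api.run_labelled`.")
    return _run_labelled(desc, STUB_SERVER_NAME, func, *args, **kwargs)


class Api:
    def __init__(self, server_name: str) -> None:
        self.server_name = server_name

    def run_labelled(self, desc: str, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[str, Any]:
        # The instance knows which homeserver it belongs to, so no stub is needed.
        return _run_labelled(desc, self.server_name, func, *args, **kwargs)


old_style = run_labelled("job", lambda: 1)
new_style = Api("example.com").run_labelled("job", lambda: 1)
print(old_style, new_style)
```

Existing callers of the old function keep working, but their metrics end up under the placeholder label, which is why the deprecation warning points them at the method.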
@@ -277,7 +342,9 @@ class ModuleApi:
        self._device_handler = hs.get_device_handler()
        self.custom_template_dir = hs.config.server.custom_template_directory
        self._callbacks = hs.get_module_api_callbacks()
-        self.msc3861_oauth_delegation_enabled = hs.config.experimental.msc3861.enabled
+        self._auth_delegation_enabled = (
+            hs.config.mas.enabled or hs.config.experimental.msc3861.enabled
+        )
        self._event_serializer = hs.get_event_client_serializer()

        try:
@@ -484,7 +551,7 @@ class ModuleApi:

        Added in Synapse v1.46.0.
        """
-        if self.msc3861_oauth_delegation_enabled:
+        if self._auth_delegation_enabled:
            raise ConfigError(
                "Cannot use password auth provider callbacks when OAuth delegation is enabled"
            )
@@ -1323,7 +1390,7 @@ class ModuleApi:

        if self._hs.config.worker.run_background_tasks or run_on_all_instances:
            self._clock.looping_call(
-                run_as_background_process,
+                self.run_as_background_process,
                msec,
                desc,
                lambda: maybe_awaitable(f(*args, **kwargs)),
@@ -1381,7 +1448,7 @@ class ModuleApi:
        return self._clock.call_later(
            # convert ms to seconds as needed by call_later.
            msec * 0.001,
-            run_as_background_process,
+            self.run_as_background_process,
            desc,
            lambda: maybe_awaitable(f(*args, **kwargs)),
        )
@@ -1588,6 +1655,44 @@ class ModuleApi:

        return {key: state_events[event_id] for key, event_id in state_ids.items()}

+    def run_as_background_process(
+        self,
+        desc: "LiteralString",
+        func: Callable[..., Awaitable[Optional[T]]],
+        *args: Any,
+        bg_start_span: bool = True,
+        **kwargs: Any,
+    ) -> "defer.Deferred[Optional[T]]":
+        """Run the given function in its own logcontext, with resource metrics
+
+        This should be used to wrap processes which are fired off to run in the
+        background, instead of being associated with a particular request.
+
+        It returns a Deferred which completes when the function completes, but it doesn't
+        follow the synapse logcontext rules, which makes it appropriate for passing to
+        clock.looping_call and friends (or for firing-and-forgetting in the middle of a
+        normal synapse async function).
+
+        Args:
+            desc: a description for this background process type
+            func: a function, which may return a Deferred or a coroutine
+            bg_start_span: Whether to start an opentracing span. Defaults to True.
+                Should only be disabled for processes that will not log to or tag
+                a span.
+            args: positional args for func
+            kwargs: keyword args for func
+
+        Returns:
+            Deferred which returns the result of func, or `None` if func raises.
+            Note that the returned Deferred does not follow the synapse logcontext
+            rules.
+        """
+        return _run_as_background_process(
+            desc, self.server_name, func, *args, bg_start_span=bg_start_span, **kwargs
+        )
+
    async def defer_to_thread(
        self,
        f: Callable[P, T],
@@ -29,6 +29,7 @@ from typing import (
    Iterable,
    List,
    Literal,
+    Mapping,
    Optional,
    Set,
    Tuple,
@@ -50,7 +51,7 @@ from synapse.handlers.presence import format_user_presence_state
 from synapse.logging import issue9533_logger
 from synapse.logging.context import PreserveLoggingContext
 from synapse.logging.opentracing import log_kv, start_active_span
-from synapse.metrics import LaterGauge
+from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
 from synapse.streams.config import PaginationConfig
 from synapse.types import (
    ISynapseReactor,
@@ -74,10 +75,15 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

-notified_events_counter = Counter("synapse_notifier_notified_events", "")
+# FIXME: Unused metric, remove if not needed.
+notified_events_counter = Counter(
+    "synapse_notifier_notified_events", "", labelnames=[SERVER_NAME_LABEL]
+)

 users_woken_by_stream_counter = Counter(
-    "synapse_notifier_users_woken_by_stream", "", ["stream"]
+    "synapse_notifier_users_woken_by_stream",
+    "",
+    labelnames=["stream", SERVER_NAME_LABEL],
 )

 T = TypeVar("T")
@@ -224,6 +230,7 @@ class Notifier:
        self.room_to_user_streams: Dict[str, Set[_NotifierUserStream]] = {}

        self.hs = hs
+        self.server_name = hs.hostname
        self._storage_controllers = hs.get_storage_controllers()
        self.event_sources = hs.get_event_sources()
        self.store = hs.get_datastores().main
@@ -257,7 +264,10 @@ class Notifier:
        # This is not a very cheap test to perform, but it's only executed
        # when rendering the metrics page, which is likely once per minute at
        # most when scraping it.
-        def count_listeners() -> int:
+        #
+        # Ideally, we'd use `Mapping[Tuple[str], int]` here but mypy doesn't like it.
+        # This is close enough and better than a type ignore.
+        def count_listeners() -> Mapping[Tuple[str, ...], int]:
            all_user_streams: Set[_NotifierUserStream] = set()

            for streams in list(self.room_to_user_streams.values()):
@@ -265,18 +275,34 @@ class Notifier:
            for stream in list(self.user_to_user_stream.values()):
                all_user_streams.add(stream)

-            return sum(stream.count_listeners() for stream in all_user_streams)
-
-        LaterGauge("synapse_notifier_listeners", "", [], count_listeners)
+            return {
+                (self.server_name,): sum(
+                    stream.count_listeners() for stream in all_user_streams
+                )
+            }

        LaterGauge(
-            "synapse_notifier_rooms",
-            "",
-            [],
-            lambda: count(bool, list(self.room_to_user_streams.values())),
+            name="synapse_notifier_listeners",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=count_listeners,
+        )
+
+        LaterGauge(
+            name="synapse_notifier_rooms",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {
+                (self.server_name,): count(
+                    bool, list(self.room_to_user_streams.values())
+                )
+            },
        )
        LaterGauge(
-            "synapse_notifier_users", "", [], lambda: len(self.user_to_user_stream)
+            name="synapse_notifier_users",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+            caller=lambda: {(self.server_name,): len(self.user_to_user_stream)},
        )

    def add_replication_callback(self, cb: Callable[[], None]) -> None:
@@ -350,9 +376,10 @@ class Notifier:
            for listener in listeners:
                listener.callback(current_token)

-        users_woken_by_stream_counter.labels(StreamKeyType.UN_PARTIAL_STATED_ROOMS).inc(
-            len(user_streams)
-        )
+        users_woken_by_stream_counter.labels(
+            stream=StreamKeyType.UN_PARTIAL_STATED_ROOMS,
+            **{SERVER_NAME_LABEL: self.server_name},
+        ).inc(len(user_streams))

        # Poke the replication so that other workers also see the write to
        # the un-partial-stated rooms stream.
@@ -575,7 +602,10 @@ class Notifier:
                        listener.callback(current_token)

            if user_streams:
-                users_woken_by_stream_counter.labels(stream_key).inc(len(user_streams))
+                users_woken_by_stream_counter.labels(
+                    stream=stream_key,
+                    **{SERVER_NAME_LABEL: self.server_name},
+                ).inc(len(user_streams))

        self.notify_replication()
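
The `LaterGauge` callbacks in the hunks above now return a mapping keyed by a tuple of label values instead of a bare number. The following is a minimal, self-contained sketch of that shape in plain Python. It does not use Synapse's `LaterGauge`; `render_samples` is a hypothetical helper, and the value `"server_name"` for `SERVER_NAME_LABEL` as well as the sample user IDs are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Mapping, Sequence, Tuple

# Assumed value, for illustration only.
SERVER_NAME_LABEL = "server_name"


def render_samples(
    name: str,
    labelnames: Sequence[str],
    caller: Callable[[], Mapping[Tuple[str, ...], int]],
) -> List[str]:
    """Hypothetical helper: pair each label-value tuple with the label names."""
    lines = []
    for label_values, value in caller().items():
        # Each key must supply exactly one value per declared label name.
        if len(label_values) != len(labelnames):
            raise ValueError("label values do not match label names")
        labels = ",".join(f'{k}="{v}"' for k, v in zip(labelnames, label_values))
        lines.append(f"{name}{{{labels}}} {value}")
    return lines


# Stand-ins for the Notifier's state.
server_name = "example.com"
user_to_user_stream: Dict[str, object] = {
    "@alice:example.com": object(),
    "@bob:example.com": object(),
}

samples = render_samples(
    "synapse_notifier_users",
    [SERVER_NAME_LABEL],
    # Same shape as the callback in the diff: one entry keyed by (server_name,).
    lambda: {(server_name,): len(user_to_user_stream)},
)
print(samples)
```

Keying by a tuple keeps the callback's return type uniform whether a gauge has one label or several, which is presumably why the diff annotates `count_listeners` as `Mapping[Tuple[str, ...], int]`.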

@@ -25,6 +25,7 @@ from typing import (
    Any,
    Collection,
    Dict,
+    FrozenSet,
    List,
    Mapping,
    Optional,
@@ -50,6 +51,7 @@ from synapse.event_auth import auth_types_for_event, get_user_power_level
 from synapse.events import EventBase, relation_from_event
 from synapse.events.snapshot import EventContext
 from synapse.logging.context import make_deferred_yieldable, run_in_background
+from synapse.metrics import SERVER_NAME_LABEL
 from synapse.state import CREATE_KEY, POWER_KEY
 from synapse.storage.databases.main.roommember import EventIdMembership
 from synapse.storage.invite_rule import InviteRule
@@ -68,11 +70,17 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

+# FIXME: Unused metric, remove if not needed.
 push_rules_invalidation_counter = Counter(
-    "synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter", ""
+    "synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter",
+    "",
+    labelnames=[SERVER_NAME_LABEL],
 )
+# FIXME: Unused metric, remove if not needed.
 push_rules_state_size_counter = Counter(
-    "synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter", ""
+    "synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter",
+    "",
+    labelnames=[SERVER_NAME_LABEL],
 )


@@ -470,8 +478,18 @@ class BulkPushRuleEvaluator:
            event.room_version.msc3931_push_features,
            self.hs.config.experimental.msc1767_enabled,  # MSC3931 flag
            self.hs.config.experimental.msc4210_enabled,
+            self.hs.config.experimental.msc4306_enabled,
        )

+        msc4306_thread_subscribers: Optional[FrozenSet[str]] = None
+        if self.hs.config.experimental.msc4306_enabled and thread_id != MAIN_TIMELINE:
+            # pull out, in batch, all local subscribers to this thread
+            # (in the common case, they will all be getting processed for push
+            # rules right now)
+            msc4306_thread_subscribers = await self.store.get_subscribers_to_thread(
+                event.room_id, thread_id
+            )
+
        for uid, rules in rules_by_user.items():
            if event.sender == uid:
                continue
@@ -496,7 +514,13 @@ class BulkPushRuleEvaluator:
                # current user, it'll be added to the dict later.
                actions_by_user[uid] = []

-            actions = evaluator.run(rules, uid, display_name)
+            msc4306_thread_subscription_state: Optional[bool] = None
+            if msc4306_thread_subscribers is not None:
+                msc4306_thread_subscription_state = uid in msc4306_thread_subscribers
+
+            actions = evaluator.run(
+                rules, uid, display_name, msc4306_thread_subscription_state
+            )
            if "notify" in actions:
                # Push rules say we should notify the user of this event
                actions_by_user[uid] = actions
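
The `BulkPushRuleEvaluator` hunks above pass a tri-state value per user: `None` when MSC4306 is disabled or the event is on the main timeline, otherwise `True` or `False` depending on whether the user is subscribed to the thread. A minimal sketch of that logic in isolation follows. The function `subscription_states`, the `"main"` value for `MAIN_TIMELINE`, and the `subscribers` argument (standing in for the batched `get_subscribers_to_thread` lookup) are assumptions for illustration, not Synapse's actual API.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Assumed sentinel for events that are not in a thread.
MAIN_TIMELINE = "main"


def subscription_states(
    user_ids: Iterable[str],
    thread_id: str,
    feature_enabled: bool,
    subscribers: FrozenSet[str],
) -> Dict[str, Optional[bool]]:
    """Hypothetical helper mirroring the per-user tri-state in the diff."""
    thread_subscribers: Optional[FrozenSet[str]] = None
    if feature_enabled and thread_id != MAIN_TIMELINE:
        # In the diff this is a single batched store lookup made before the
        # per-user loop; here the result is passed in directly.
        thread_subscribers = subscribers

    states: Dict[str, Optional[bool]] = {}
    for uid in user_ids:
        state: Optional[bool] = None
        if thread_subscribers is not None:
            state = uid in thread_subscribers
        states[uid] = state
    return states


users = ["@alice:example.com", "@bob:example.com"]
subs = frozenset({"@alice:example.com"})

in_thread = subscription_states(users, "$thread_root", True, subs)
on_main = subscription_states(users, MAIN_TIMELINE, True, subs)
disabled = subscription_states(users, "$thread_root", False, subs)
```

Fetching the subscriber set once and testing membership per user avoids one lookup per recipient, which matches the "in batch" comment in the hunk.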