
Merge 'release-v1.136' into 'master'

Andrew Morgan
2025-08-12 15:36:51 +01:00
parent 4054d956f7
commit 283ade8e33
402 changed files with 8961 additions and 2567 deletions
+4
@@ -16,6 +16,10 @@ jobs:
with:
project-url: "https://github.com/orgs/matrix-org/projects/67"
github-token: ${{ secrets.ELEMENT_BOT_TOKEN }}
# This action will error if the issue already exists on the project, which is
# common as `X-Needs-Info` will often be added to issues that are already in
# the triage queue. Prevent the whole job from failing in this case.
continue-on-error: true
- name: Set status
env:
GITHUB_TOKEN: ${{ secrets.ELEMENT_BOT_TOKEN }}
+79
@@ -1,3 +1,12 @@
# Synapse 1.136.0 (2025-08-12)
Note: This release includes the security fixes from `1.135.2` and `1.136.0rc2`, detailed below.
### Bugfixes
- Fix bug introduced in 1.135.2 and 1.136.0rc2 where the [Make Room Admin API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#make-room-admin-api) would not treat a room v12's creator power level as the highest in room. ([\#18805](https://github.com/element-hq/synapse/issues/18805))
# Synapse 1.135.2 (2025-08-11)
This is the Synapse portion of the [Matrix coordinated security release](https://matrix.org/blog/2025/07/security-predisclosure/). This release includes support for [room version](https://spec.matrix.org/v1.15/rooms/) 12 which fixes a number of security vulnerabilities, including [CVE-2025-49090](https://www.cve.org/CVERecord?id=CVE-2025-49090).
@@ -23,7 +32,77 @@ Two patched Synapse releases are now available:
- Speed up upgrading a room with large numbers of banned users. ([\#18574](https://github.com/element-hq/synapse/issues/18574))
# Synapse 1.136.0rc2 (2025-08-11)
- Update MSC4293 redaction logic for room v12. ([\#80](https://github.com/element-hq/synapse/issues/80))
### Internal Changes
- Add a parameter to `upgrade_rooms(..)` to allow auto join local users. ([\#83](https://github.com/element-hq/synapse/issues/83))
# Synapse 1.136.0rc1 (2025-08-05)
Please check [the relevant section in the upgrade notes](https://github.com/element-hq/synapse/blob/develop/docs/upgrade.md#upgrading-to-v11360) as this release contains changes to MAS support, metrics labels and the module API which may require your attention when upgrading.
### Features
- Add configurable rate limiting for the creation of rooms. ([\#18514](https://github.com/element-hq/synapse/issues/18514))
- Add support for [MSC4293](https://github.com/matrix-org/matrix-spec-proposals/pull/4293) - Redact on Kick/Ban. ([\#18540](https://github.com/element-hq/synapse/issues/18540))
- When admins enable themselves to see soft-failed events, they will also see if the cause is due to the policy server flagging them as spam via `unsigned`. ([\#18585](https://github.com/element-hq/synapse/issues/18585))
- Add ability to configure forward/outbound proxy via homeserver config instead of environment variables. See `http_proxy`, `https_proxy`, `no_proxy_hosts`. ([\#18686](https://github.com/element-hq/synapse/issues/18686))
- Advertise experimental support for [MSC4306](https://github.com/matrix-org/matrix-spec-proposals/pull/4306) (Thread Subscriptions) through `/_matrix/client/versions` if enabled. ([\#18722](https://github.com/element-hq/synapse/issues/18722))
- Stabilise support for delegating authentication to [Matrix Authentication Service](https://github.com/element-hq/matrix-authentication-service/). ([\#18759](https://github.com/element-hq/synapse/issues/18759))
- Implement the push rules for experimental [MSC4306: Thread Subscriptions](https://github.com/matrix-org/matrix-doc/issues/4306). ([\#18762](https://github.com/element-hq/synapse/issues/18762))
### Bugfixes
- Allow return code 403 (allowed by C2S Spec since v1.2) when fetching profiles via federation. ([\#18696](https://github.com/element-hq/synapse/issues/18696))
- Register the MSC4306 (Thread Subscriptions) endpoints in the CS API when the experimental feature is enabled. ([\#18726](https://github.com/element-hq/synapse/issues/18726))
- Fix a long-standing bug where suspended users could not have server notices sent to them (a 403 was returned to the admin). ([\#18750](https://github.com/element-hq/synapse/issues/18750))
- Fix an issue that could cause logcontexts to be lost on rate-limited requests. Found by @realtyem. ([\#18763](https://github.com/element-hq/synapse/issues/18763))
- Fix invalidation of storage cache that was broken in 1.135.0. ([\#18786](https://github.com/element-hq/synapse/issues/18786))
### Improved Documentation
- Minor improvements to README. ([\#18700](https://github.com/element-hq/synapse/issues/18700))
- Document that there can be multiple workers handling the `receipts` stream. ([\#18760](https://github.com/element-hq/synapse/issues/18760))
- Improve worker documentation for some device paths. ([\#18761](https://github.com/element-hq/synapse/issues/18761))
### Deprecations and Removals
- Deprecate `run_as_background_process` exported as part of the module API interface in favor of `ModuleApi.run_as_background_process`. See [the relevant section in the upgrade notes](https://github.com/element-hq/synapse/blob/develop/docs/upgrade.md#upgrading-to-v11360) for more information. ([\#18737](https://github.com/element-hq/synapse/issues/18737))
### Internal Changes
- Add debug logging for HMAC digest verification failures when using the admin API to register users. ([\#18474](https://github.com/element-hq/synapse/issues/18474))
- Speed up upgrading a room with large numbers of banned users. ([\#18574](https://github.com/element-hq/synapse/issues/18574))
- Fix config documentation generation script on Windows by enforcing UTF-8. ([\#18580](https://github.com/element-hq/synapse/issues/18580))
- Refactor cache, background process, `Counter`, `LaterGauge`, `GaugeBucketCollector`, `Histogram`, and `Gauge` metrics to be homeserver-scoped. ([\#18656](https://github.com/element-hq/synapse/issues/18656), [\#18714](https://github.com/element-hq/synapse/issues/18714), [\#18715](https://github.com/element-hq/synapse/issues/18715), [\#18724](https://github.com/element-hq/synapse/issues/18724), [\#18753](https://github.com/element-hq/synapse/issues/18753), [\#18725](https://github.com/element-hq/synapse/issues/18725), [\#18670](https://github.com/element-hq/synapse/issues/18670), [\#18748](https://github.com/element-hq/synapse/issues/18748), [\#18751](https://github.com/element-hq/synapse/issues/18751))
- Reduce database usage in Sliding Sync by not querying for background update completion after the update is known to be complete. ([\#18718](https://github.com/element-hq/synapse/issues/18718))
- Improve order of validation and ratelimiting in room creation. ([\#18723](https://github.com/element-hq/synapse/issues/18723))
- Bump minimum version bound on Twisted to 21.2.0. ([\#18727](https://github.com/element-hq/synapse/issues/18727), [\#18729](https://github.com/element-hq/synapse/issues/18729))
- Use `twisted.internet.testing` module in tests instead of deprecated `twisted.test.proto_helpers`. ([\#18728](https://github.com/element-hq/synapse/issues/18728))
- Remove obsolete `/send_event` replication endpoint. ([\#18730](https://github.com/element-hq/synapse/issues/18730))
- Update metrics linting to be able to handle custom metrics. ([\#18733](https://github.com/element-hq/synapse/issues/18733))
- Work around `twisted.protocols.amp.TooLong` error by reducing logging in some tests. ([\#18736](https://github.com/element-hq/synapse/issues/18736))
- Prevent "Move labelled issues to correct projects" GitHub Actions workflow from failing when an issue is already on the project board. ([\#18755](https://github.com/element-hq/synapse/issues/18755))
- Bump minimum supported Rust version (MSRV) to 1.82.0. Missed in [#18553](https://github.com/element-hq/synapse/pull/18553) (released in Synapse 1.134.0). ([\#18757](https://github.com/element-hq/synapse/issues/18757))
- Make `Clock.sleep(...)` return a coroutine, so that mypy can catch places where we don't await on it. ([\#18772](https://github.com/element-hq/synapse/issues/18772))
- Update implementation of [MSC4306: Thread Subscriptions](https://github.com/matrix-org/matrix-doc/issues/4306) to include automatic subscription conflict prevention as introduced in later drafts. ([\#18756](https://github.com/element-hq/synapse/issues/18756))
### Updates to locked dependencies
* Bump gitpython from 3.1.44 to 3.1.45. ([\#18743](https://github.com/element-hq/synapse/issues/18743))
* Bump mypy-zope from 1.0.12 to 1.0.13. ([\#18744](https://github.com/element-hq/synapse/issues/18744))
* Bump phonenumbers from 9.0.9 to 9.0.10. ([\#18741](https://github.com/element-hq/synapse/issues/18741))
* Bump ruff from 0.12.4 to 0.12.5. ([\#18742](https://github.com/element-hq/synapse/issues/18742))
* Bump sentry-sdk from 2.32.0 to 2.33.2. ([\#18745](https://github.com/element-hq/synapse/issues/18745))
* Bump tokio from 1.46.1 to 1.47.0. ([\#18740](https://github.com/element-hq/synapse/issues/18740))
* Bump types-jsonschema from 4.24.0.20250708 to 4.25.0.20250720. ([\#18703](https://github.com/element-hq/synapse/issues/18703))
* Bump types-psycopg2 from 2.9.21.20250516 to 2.9.21.20250718. ([\#18706](https://github.com/element-hq/synapse/issues/18706))
# Synapse 1.135.0 (2025-08-01)
Generated
+208 -358
File diff suppressed because it is too large
+14 -15
@@ -8,7 +8,7 @@
Synapse is an open source `Matrix <https://matrix.org>`__ homeserver
implementation, written and maintained by `Element <https://element.io>`_.
`Matrix <https://github.com/matrix-org>`__ is the open standard for
secure and interoperable real time communications. You can directly run
secure and interoperable real-time communications. You can directly run
and manage the source code in this repository, available under an AGPL
license (or alternatively under a commercial license from Element).
There is no support provided by Element unless you have a
@@ -23,13 +23,13 @@ ESS builds on Synapse to offer a complete Matrix-based backend including the ful
`Admin Console product <https://element.io/enterprise-functionality/admin-console>`_,
giving admins the power to easily manage an organization-wide
deployment. It includes advanced identity management, auditing,
moderation and data retention options as well as Long Term Support and
SLAs. ESS can be used to support any Matrix-based frontend client.
moderation and data retention options as well as Long-Term Support and
SLAs. ESS supports any Matrix-compatible client.
.. contents::
🛠️ Installing and configuration
===============================
🛠️ Installation and configuration
==================================
The Synapse documentation describes `how to install Synapse <https://element-hq.github.io/synapse/latest/setup/installation.html>`_. We recommend using
`Docker images <https://element-hq.github.io/synapse/latest/setup/installation.html#docker-images-and-ansible-playbooks>`_ or `Debian packages from Matrix.org
@@ -133,7 +133,7 @@ connect from a client: see
An easy way to get started is to login or register via Element at
https://app.element.io/#/login or https://app.element.io/#/register respectively.
You will need to change the server you are logging into from ``matrix.org``
and instead specify a Homeserver URL of ``https://<server_name>:8448``
and instead specify a homeserver URL of ``https://<server_name>:8448``
(or just ``https://<server_name>`` if you are using a reverse proxy).
If you prefer to use another client, refer to our
`client breakdown <https://matrix.org/ecosystem/clients/>`_.
@@ -162,16 +162,15 @@ the public internet. Without it, anyone can freely register accounts on your hom
This can be exploited by attackers to create spambots targeting the rest of the Matrix
federation.
Your new user name will be formed partly from the ``server_name``, and partly
from a localpart you specify when you create the account. Your name will take
the form of::
Your new Matrix ID will be formed partly from the ``server_name``, and partly
from a localpart you specify when you create the account in the form of::
@localpart:my.domain.name
(pronounced "at localpart on my dot domain dot name").
As when logging in, you will need to specify a "Custom server". Specify your
desired ``localpart`` in the 'User name' box.
desired ``localpart`` in the 'Username' box.
🎯 Troubleshooting and support
==============================
@@ -209,10 +208,10 @@ Identity servers have the job of mapping email addresses and other 3rd Party
IDs (3PIDs) to Matrix user IDs, as well as verifying the ownership of 3PIDs
before creating that mapping.
**They are not where accounts or credentials are stored - these live on home
servers. Identity Servers are just for mapping 3rd party IDs to matrix IDs.**
**Identity servers do not store accounts or credentials - these are stored and managed on homeservers.
Identity Servers are just for mapping 3rd Party IDs to Matrix IDs.**
This process is very security-sensitive, as there is obvious risk of spam if it
This process is highly security-sensitive, as there is an obvious risk of spam if it
is too easy to sign up for Matrix accounts or harvest 3PID data. In the longer
term, we hope to create a decentralised system to manage it (`matrix-doc #712
<https://github.com/matrix-org/matrix-doc/issues/712>`_), but in the meantime,
@@ -238,9 +237,9 @@ email address.
We welcome contributions to Synapse from the community!
The best place to get started is our
`guide for contributors <https://element-hq.github.io/synapse/latest/development/contributing_guide.html>`_.
This is part of our larger `documentation <https://element-hq.github.io/synapse/latest>`_, which includes
This is part of our broader `documentation <https://element-hq.github.io/synapse/latest>`_, which includes
information for Synapse developers as well as Synapse administrators.
Developers might be particularly interested in:
* `Synapse's database schema <https://element-hq.github.io/synapse/latest/development/database_schema.html>`_,
+5 -5
@@ -19,17 +19,17 @@ def build(setup_kwargs: Dict[str, Any]) -> None:
# This flag is a no-op in the latest versions. Instead, we need to
# specify this in the `bdist_wheel` config below.
py_limited_api=True,
# We force always building in release mode, as we can't tell the
# difference between using `poetry` in development vs production.
# We always build in release mode, as we can't distinguish
# between using `poetry` in development vs production.
debug=False,
)
setup_kwargs.setdefault("rust_extensions", []).append(extension)
setup_kwargs["zip_safe"] = False
# We lookup the minimum supported python version by looking at
# `python_requires` (e.g. ">=3.9.0,<4.0.0") and finding the first python
# We look up the minimum supported Python version with
# `python_requires` (e.g. ">=3.9.0,<4.0.0") and finding the first Python
# version that matches. We then convert that into the `py_limited_api` form,
# e.g. cp39 for python 3.9.
# e.g. cp39 for Python 3.9.
py_limited_api: str
python_bounds = SpecifierSet(setup_kwargs["python_requires"])
for minor_version in itertools.count(start=8):
+2 -2
@@ -4396,7 +4396,7 @@
"exemplar": false,
"expr": "(time() - max without (job, index, host) (avg_over_time(synapse_federation_last_received_pdu_time[10m]))) / 60",
"instant": false,
"legendFormat": "{{server_name}} ",
"legendFormat": "{{origin_server_name}} ",
"range": true,
"refId": "A"
}
@@ -4518,7 +4518,7 @@
"exemplar": false,
"expr": "(time() - max without (job, index, host) (avg_over_time(synapse_federation_last_sent_pdu_time[10m]))) / 60",
"instant": false,
"legendFormat": "{{server_name}}",
"legendFormat": "{{destination_server_name}}",
"range": true,
"refId": "A"
}
+18
@@ -1,3 +1,21 @@
matrix-synapse-py3 (1.136.0) stable; urgency=medium
* New Synapse release 1.136.0.
-- Synapse Packaging team <packages@matrix.org> Tue, 12 Aug 2025 13:18:03 +0100
matrix-synapse-py3 (1.136.0~rc2) stable; urgency=medium
* New Synapse release 1.136.0rc2.
-- Synapse Packaging team <packages@matrix.org> Mon, 11 Aug 2025 12:18:52 -0600
matrix-synapse-py3 (1.136.0~rc1) stable; urgency=medium
* New Synapse release 1.136.0rc1.
-- Synapse Packaging team <packages@matrix.org> Tue, 05 Aug 2025 08:13:30 -0600
matrix-synapse-py3 (1.135.2) stable; urgency=medium
* New Synapse release 1.135.2.
@@ -98,6 +98,10 @@ rc_delayed_event_mgmt:
per_second: 9999
burst_count: 9999
rc_room_creation:
per_second: 9999
burst_count: 9999
federation_rr_transactions_per_room_per_second: 9999
allow_device_name_lookup_over_federation: true
@@ -22,4 +22,46 @@ To receive soft failed events in APIs like `/sync` and `/messages`, set `return_
to `true` in the admin client config. When `false`, the normal behaviour of these endpoints is to
exclude soft failed events.
**Note**: If the policy server flagged the event as spam and that caused soft failure, that will be indicated
in the event's `unsigned` content like so:
```json
{
  "type": "m.room.message",
  "other": "event_fields_go_here",
  "unsigned": {
    "io.element.synapse.soft_failed": true,
    "io.element.synapse.policy_server_spammy": true
  }
}
```
Default: `false`
## See events marked spammy by policy servers
Learn more about policy servers from [MSC4284](https://github.com/matrix-org/matrix-spec-proposals/pull/4284).
Similar to `return_soft_failed_events`, clients logged in with admin accounts can see events which were
flagged by the policy server as spammy (and thus soft failed) by setting `return_policy_server_spammy_events`
to `true`.
`return_policy_server_spammy_events` may be `true` while `return_soft_failed_events` is `false` to only see
policy server-flagged events. When `return_soft_failed_events` is `true` however, `return_policy_server_spammy_events`
is always `true`.
Events which were flagged by the policy will be flagged as `io.element.synapse.policy_server_spammy` in the
event's `unsigned` content, like so:
```json
{
  "type": "m.room.message",
  "other": "event_fields_go_here",
  "unsigned": {
    "io.element.synapse.soft_failed": true,
    "io.element.synapse.policy_server_spammy": true
  }
}
```
Default: `true` if `return_soft_failed_events` is `true`, otherwise `false`
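
The interaction between the two options can be summarised in a few lines of Python. This is only a sketch of the rules described above, not Synapse's implementation; the helper name `effective_spammy_flag` is hypothetical.

```python
from typing import Optional


def effective_spammy_flag(
    return_soft_failed_events: bool,
    return_policy_server_spammy_events: Optional[bool] = None,
) -> bool:
    """Hypothetical helper mirroring the documented rules (not Synapse code)."""
    # When soft failed events are returned, policy-server-flagged events always are too.
    if return_soft_failed_events:
        return True
    # Otherwise use the explicit setting, which defaults to false when unset.
    return bool(return_policy_server_spammy_events)
```

For example, `effective_spammy_flag(False, True)` models an admin client that only wants to see policy server-flagged events.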
+19 -4
@@ -7,8 +7,23 @@ proxy is supported, not SOCKS proxy or anything else.
## Configure
The `http_proxy`, `https_proxy`, `no_proxy` environment variables are used to
specify proxy settings. The environment variable is not case sensitive.
The proxy settings can be configured in the homeserver configuration file via
[`http_proxy`](../usage/configuration/config_documentation.md#http_proxy),
[`https_proxy`](../usage/configuration/config_documentation.md#https_proxy), and
[`no_proxy_hosts`](../usage/configuration/config_documentation.md#no_proxy_hosts).
`homeserver.yaml` example:
```yaml
http_proxy: http://USERNAME:PASSWORD@10.0.1.1:8080/
https_proxy: http://USERNAME:PASSWORD@proxy.example.com:8080/
no_proxy_hosts:
- master.hostname.example.com
- 10.1.0.0/16
- 172.30.0.0/16
```
The proxy settings can also be configured via the `http_proxy`, `https_proxy`,
`no_proxy` environment variables. The environment variable names are not case sensitive.
- `http_proxy`: Proxy server to use for HTTP requests.
- `https_proxy`: Proxy server to use for HTTPS requests.
- `no_proxy`: Comma-separated list of hosts, IP addresses, or IP ranges in CIDR
@@ -44,7 +59,7 @@ The proxy will be **used** for:
- phone-home stats
- recaptcha validation
- CAS auth validation
- OpenID Connect
- OpenID Connect (OIDC)
- Outbound federation
- Federation (checking public key revocation)
- Fetching public keys of other servers
@@ -53,7 +68,7 @@ The proxy will be **used** for:
It will **not be used** for:
- Application Services
- Identity servers
- Matrix Identity servers
- In worker configurations
- connections between workers
- connections from workers to Redis
+74 -3
@@ -117,6 +117,77 @@ each upgrade are complete before moving on to the next upgrade, to avoid
stacking them up. You can monitor the currently running background updates with
[the Admin API](usage/administration/admin_api/background_updates.html#status).
# Upgrading to v1.136.0
## Deprecate `run_as_background_process` exported as part of the module API interface in favor of `ModuleApi.run_as_background_process`
The `run_as_background_process` function is now a method of the `ModuleApi` class. If
you were using the function directly from the module API, it will continue to work fine
but the background process metrics will not include an accurate `server_name` label.
This kind of metric labeling isn't relevant for many use cases and is used to
differentiate Synapse instances running in the same Python process (relevant to Synapse
Pro: Small Hosts). We recommend updating your usage to use the new
`ModuleApi.run_as_background_process` method to stay on top of future changes.
<details>
<summary>Example <code>run_as_background_process</code> upgrade</summary>
Before:
```python
from synapse.module_api import ModuleApi, run_as_background_process

class MyModule:
    def __init__(self, module_api: ModuleApi) -> None:
        run_as_background_process(__name__ + ":setup_database", self.setup_database)
```
After:
```python
from synapse.module_api import ModuleApi

class MyModule:
    def __init__(self, module_api: ModuleApi) -> None:
        module_api.run_as_background_process(__name__ + ":setup_database", self.setup_database)
```
</details>
## Metric labels have changed on `synapse_federation_last_received_pdu_time` and `synapse_federation_last_sent_pdu_time`
Previously, the `synapse_federation_last_received_pdu_time` and
`synapse_federation_last_sent_pdu_time` metrics both used the `server_name` label to
differentiate between different servers that we send and receive events from.
Since we're now using the `server_name` label to differentiate between different Synapse
homeserver instances running in the same process, these metrics have been changed as follows:
- `synapse_federation_last_received_pdu_time` now uses the `origin_server_name` label
- `synapse_federation_last_sent_pdu_time` now uses the `destination_server_name` label
The Grafana dashboard JSON in `contrib/grafana/synapse.json` has been updated to reflect
this change but you will need to manually update your own existing Grafana dashboards
using these metrics.
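
If you maintain your own dashboards as exported JSON, the legend rename can be scripted. The following is a minimal sketch, assuming your panels use `{{server_name}}` in `legendFormat` the way the bundled dashboard did; the function name `update_legend_formats` and the inline sample dashboard are illustrative, not part of Synapse or Grafana.

```python
import json

# Metric name -> label that replaces `server_name` for that metric.
LABEL_RENAMES = {
    "synapse_federation_last_received_pdu_time": "origin_server_name",
    "synapse_federation_last_sent_pdu_time": "destination_server_name",
}


def update_legend_formats(node):
    """Recursively rewrite `legendFormat` on targets that query the renamed metrics."""
    if isinstance(node, dict):
        expr = node.get("expr")
        legend = node.get("legendFormat")
        if isinstance(expr, str) and isinstance(legend, str):
            for metric, new_label in LABEL_RENAMES.items():
                if metric in expr:
                    node["legendFormat"] = legend.replace(
                        "{{server_name}}", "{{" + new_label + "}}"
                    )
        # Descend into nested panels, rows and targets.
        for value in node.values():
            update_legend_formats(value)
    elif isinstance(node, list):
        for item in node:
            update_legend_formats(item)
    return node


# Tiny stand-in for an exported dashboard (assumed structure).
dashboard = json.loads(
    '{"panels": [{"targets": [{"expr": "time() - synapse_federation_last_sent_pdu_time",'
    ' "legendFormat": "{{server_name}}"}]}]}'
)
update_legend_formats(dashboard)
```

This only touches legends. Queries that filter or group by `server_name` inside the expression itself still need to be reviewed by hand.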
## Stable integration with Matrix Authentication Service
Support for [Matrix Authentication Service (MAS)](https://github.com/element-hq/matrix-authentication-service) is now stable, with a simplified configuration.
This stable integration requires MAS 0.20.0 or later.
The existing `experimental_features.msc3861` configuration option is now deprecated and will be removed in Synapse v1.137.0.
Synapse deployments already using MAS should now use the new configuration options:
```yaml
matrix_authentication_service:
  # Enable the MAS integration
  enabled: true
  # The base URL where Synapse will contact MAS
  endpoint: http://localhost:8080
  # The shared secret used to authenticate MAS requests, must be the same as `matrix.secret` in the MAS configuration
  # See https://element-hq.github.io/matrix-authentication-service/reference/configuration.html#matrix
  secret: "asecurerandomsecretstring"
```
They must remove the `experimental_features.msc3861` configuration option from their configuration.
They can also remove the client previously used by Synapse [in the MAS configuration](https://element-hq.github.io/matrix-authentication-service/reference/configuration.html#clients) as it is no longer in use.
# Upgrading to v1.135.0
## `on_user_registration` module API callback may now run on any worker
@@ -137,10 +208,10 @@ native ICU library on your system is no longer required.
## Documented endpoint which can be delegated to a federation worker
The endpoint `^/_matrix/federation/v1/version$` can be delegated to a federation
worker. This is not new behaviour, but had not been documented yet. The
[list of delegatable endpoints](workers.md#synapseappgeneric_worker) has
worker. This is not new behaviour, but had not been documented yet. The
[list of delegatable endpoints](workers.md#synapseappgeneric_worker) has
been updated to include it. Make sure to check your reverse proxy rules if you
are using workers.
are using workers.
# Upgrading to v1.126.0
@@ -610,6 +610,61 @@ manhole_settings:
ssh_pub_key_path: CONFDIR/id_rsa.pub
```
---
### `http_proxy`
*(string|null)* Proxy server to use for HTTP requests.
For more details, see the [forward proxy documentation](../../setup/forward_proxy.md). There is no default for this option.
Example configuration:
```yaml
http_proxy: http://USERNAME:PASSWORD@10.0.1.1:8080/
```
---
### `https_proxy`
*(string|null)* Proxy server to use for HTTPS requests.
For more details, see the [forward proxy documentation](../../setup/forward_proxy.md). There is no default for this option.
Example configuration:
```yaml
https_proxy: http://USERNAME:PASSWORD@proxy.example.com:8080/
```
---
### `no_proxy_hosts`
*(array)* List of hosts, IP addresses, or IP ranges in CIDR format which should not use the proxy. Synapse will directly connect to these hosts.
For more details, see the [forward proxy documentation](../../setup/forward_proxy.md). There is no default for this option.
Example configuration:
```yaml
no_proxy_hosts:
- master.hostname.example.com
- 10.1.0.0/16
- 172.30.0.0/16
```
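
To make the mix of hostnames and CIDR ranges concrete, here is a rough sketch of how such a list could be evaluated. It is an illustration only: the function name `bypasses_proxy` is hypothetical, and Synapse's actual matching rules may differ (for example in how domain suffixes are treated).

```python
import ipaddress
from typing import List


def bypasses_proxy(host: str, no_proxy_hosts: List[str]) -> bool:
    """Illustrative check only; not Synapse's implementation."""
    for entry in no_proxy_hosts:
        if "/" in entry:
            # CIDR entry: only meaningful when `host` is an IP address literal.
            try:
                if ipaddress.ip_address(host) in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                # `host` is a hostname (or `entry` is malformed); try the next entry.
                continue
        elif host.lower() == entry.lower():
            # Plain hostname or single IP address: exact, case-insensitive match.
            return True
    return False
```

With the example list above, `bypasses_proxy("10.1.2.3", [...])` would be true because the address falls inside `10.1.0.0/16`.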
---
### `matrix_authentication_service`
*(object)* The `matrix_authentication_service` setting configures integration with [Matrix Authentication Service (MAS)](https://github.com/element-hq/matrix-authentication-service).
This setting has the following sub-options:
* `enabled` (boolean): Whether or not to enable the MAS integration. If this is set to `false`, Synapse will use its legacy internal authentication API. Defaults to `false`.
* `endpoint` (string): The URL where Synapse can reach MAS. This *must* have the `discovery` and `oauth` resources mounted. Defaults to `"http://localhost:8080"`.
* `secret` (string|null): A shared secret that will be used to authenticate requests from and to MAS.
* `secret_path` (string|null): Alternative to `secret`, reading the shared secret from a file. The file should be a plain text file, containing only the secret. Synapse reads the secret from the given file once at startup.
Example configuration:
```yaml
matrix_authentication_service:
  enabled: true
  secret: someverysecuresecret
  endpoint: http://localhost:8080
```
---
### `dummy_events_threshold`
*(integer)* Forward extremities can build up in a room due to networking delays between homeservers. Once this happens in a large room, calculation of the state of that room can become quite expensive. To mitigate this, once the number of forward extremities reaches a given threshold, Synapse will send an `org.matrix.dummy_event` event, which will reduce the forward extremities in the room.
@@ -1963,6 +2018,31 @@ rc_reports:
burst_count: 20.0
```
---
### `rc_room_creation`
*(object)* Sets rate limits for how often users are able to create rooms.
This setting has the following sub-options:
* `per_second` (number): Maximum number of requests a client can send per second.
* `burst_count` (number): Maximum number of requests a client can send before being throttled.
Default configuration:
```yaml
rc_room_creation:
  per_second: 0.016
  burst_count: 10.0
```
Example configuration:
```yaml
rc_room_creation:
  per_second: 1.0
  burst_count: 5.0
```
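
As a rough intuition for how `per_second` and `burst_count` interact, consider a generic token bucket. This is a simplified sketch and not Synapse's ratelimiter; the class name `TokenBucket` and the explicit `now` argument are assumptions made for the example.

```python
class TokenBucket:
    """Generic token bucket; an illustration, not Synapse's ratelimiter."""

    def __init__(self, per_second: float, burst_count: float) -> None:
        self.per_second = per_second
        self.burst_count = burst_count
        self.tokens = burst_count  # start with a full burst allowance
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst_count, self.tokens + elapsed * self.per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# With per_second=1.0 and burst_count=5.0, five immediate requests pass
# and the sixth is throttled until the bucket refills.
bucket = TokenBucket(per_second=1.0, burst_count=5.0)
burst = [bucket.allow(now=0.0) for _ in range(6)]
```

Under this model, `burst_count` bounds how many rooms can be created back to back, while `per_second` sets the sustained rate afterwards.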
---
### `federation_rr_transactions_per_room_per_second`
*(integer)* Sets outgoing federation transaction frequency for sending read-receipts, per-room.
+7 -6
@@ -260,7 +260,7 @@ information.
^/_matrix/client/(r0|v3|unstable)/keys/claim$
^/_matrix/client/(r0|v3|unstable)/room_keys/
^/_matrix/client/(r0|v3|unstable)/keys/upload
^/_matrix/client/(api/v1|r0|v3|unstable/keys/device_signing/upload$
^/_matrix/client/(api/v1|r0|v3|unstable)/keys/device_signing/upload$
^/_matrix/client/(api/v1|r0|v3|unstable)/keys/signatures/upload$
# Registration/login requests
@@ -532,8 +532,9 @@ the stream writer for the `account_data` stream:
##### The `receipts` stream
The following endpoints should be routed directly to the worker configured as
the stream writer for the `receipts` stream:
The `receipts` stream supports multiple writers. The following endpoints
can be handled by any worker, but should be routed directly to one of the workers
configured as stream writer for the `receipts` stream:
^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt
^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers
@@ -555,13 +556,13 @@ the stream writer for the `push_rules` stream:
##### The `device_lists` stream
The `device_lists` stream supports multiple writers. The following endpoints
can be handled by any worker, but should be routed directly one of the workers
can be handled by any worker, but should be routed directly to one of the workers
configured as stream writer for the `device_lists` stream:
^/_matrix/client/(r0|v3)/delete_devices$
^/_matrix/client/(api/v1|r0|v3|unstable)/devices/
^/_matrix/client/(api/v1|r0|v3|unstable)/devices(/|$)
^/_matrix/client/(r0|v3|unstable)/keys/upload
^/_matrix/client/(api/v1|r0|v3|unstable/keys/device_signing/upload$
^/_matrix/client/(api/v1|r0|v3|unstable)/keys/device_signing/upload$
^/_matrix/client/(api/v1|r0|v3|unstable)/keys/signatures/upload$
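
The effect of the pattern changes in this list can be checked quickly with Python's `re` module. This is a sketch for illustration: reverse proxies use their own regex engines, although these simple patterns are expected to behave the same way there.

```python
import re

old_pattern = re.compile(r"^/_matrix/client/(api/v1|r0|v3|unstable)/devices/")
new_pattern = re.compile(r"^/_matrix/client/(api/v1|r0|v3|unstable)/devices(/|$)")

# The old pattern required a trailing slash, so the bare `/devices` path was missed.
results = {
    path: (bool(old_pattern.match(path)), bool(new_pattern.match(path)))
    for path in (
        "/_matrix/client/v3/devices",
        "/_matrix/client/v3/devices/ABCDEF",
    )
}

# The previous `device_signing` pattern had an unbalanced parenthesis and does
# not compile at all in Python's `re`.
try:
    re.compile(r"^/_matrix/client/(api/v1|r0|v3|unstable/keys/device_signing/upload$")
    broken_pattern_compiles = True
except re.error:
    broken_pattern_compiles = False
```

Here `results` maps each path to an `(old, new)` pair of match outcomes, which makes it easy to compare your own reverse proxy rules against the updated list.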
#### Restrict outbound federation traffic to a specific set of workers
+15 -1
@@ -1,6 +1,17 @@
[mypy]
namespace_packages = True
plugins = pydantic.mypy, mypy_zope:plugin, scripts-dev/mypy_synapse_plugin.py
# Our custom mypy plugin should remain first in this list.
#
# mypy has a limitation where it only chooses the first plugin that returns a non-None
# value for each hook (a known limitation, cf.
# https://github.com/python/mypy/issues/19524). We work around this by putting our custom
# plugin first in the plugin order and then manually calling any other conflicting
# plugin hooks in our own plugin, followed by our own checks.
#
# If you add a new plugin, make sure to check whether the hooks being used conflict with
# our custom plugin hooks and, if so, manually call the other plugin's hooks in our
# custom plugin. (This also applies if a plugin is updated in the future.)
plugins = scripts-dev/mypy_synapse_plugin.py, pydantic.mypy, mypy_zope:plugin
follow_imports = normal
show_error_codes = True
show_traceback = True
@@ -99,3 +110,6 @@ ignore_missing_imports = True
[mypy-multipart.*]
ignore_missing_imports = True
[mypy-mypy_zope.*]
ignore_missing_imports = True
Generated
+40 -39
@@ -504,18 +504,19 @@ smmap = ">=3.0.1,<6"
[[package]]
name = "gitpython"
version = "3.1.44"
version = "3.1.45"
description = "GitPython is a Python library used to interact with Git repositories"
optional = false
python-versions = ">=3.7"
groups = ["dev"]
files = [
{file = "GitPython-3.1.44-py3-none-any.whl", hash = "sha256:9e0e10cda9bed1ee64bc9a6de50e7e38a9c9943241cd7f585f6df3ed28011110"},
{file = "gitpython-3.1.44.tar.gz", hash = "sha256:c87e30b26253bf5418b01b0660f818967f3c503193838337fe5e573331249269"},
{file = "gitpython-3.1.45-py3-none-any.whl", hash = "sha256:8908cb2e02fb3b93b7eb0f2827125cb699869470432cc885f019b8fd0fccff77"},
{file = "gitpython-3.1.45.tar.gz", hash = "sha256:85b0ee964ceddf211c41b9f27a49086010a190fd8132a24e21f362a4b36a791c"},
]
[package.dependencies]
gitdb = ">=4.0.1,<5"
typing-extensions = {version = ">=3.10.0.2", markers = "python_version < \"3.10\""}
[package.extras]
doc = ["sphinx (>=7.1.2,<7.2)", "sphinx-autodoc-typehints", "sphinx_rtd_theme"]
@@ -1453,18 +1454,18 @@ files = [
[[package]]
name = "mypy-zope"
version = "1.0.12"
version = "1.0.13"
description = "Plugin for mypy to support zope interfaces"
optional = false
python-versions = "*"
groups = ["dev"]
files = [
{file = "mypy_zope-1.0.12-py3-none-any.whl", hash = "sha256:f2ecf169f886fbc266e9339db0c2f3818528a7536b9bb4f5ece1d5854dc2f27c"},
{file = "mypy_zope-1.0.12.tar.gz", hash = "sha256:d6f8f99eb5644885553b4ec7afc8d68f5daf412c9bf238ec3c36b65d97df6cbe"},
{file = "mypy_zope-1.0.13-py3-none-any.whl", hash = "sha256:13740c4cbc910cca2c143c6709e1c483c991abeeeb7b629ad6f73d8ac1edad15"},
{file = "mypy_zope-1.0.13.tar.gz", hash = "sha256:63fb4d035ea874baf280dc69e714dcde4bd2a4a4837a0fd8d90ce91bea510f99"},
]
[package.dependencies]
mypy = ">=1.0.0,<1.17.0"
mypy = ">=1.0.0,<1.18.0"
"zope.interface" = "*"
"zope.schema" = "*"
@@ -1542,14 +1543,14 @@ files = [
[[package]]
name = "phonenumbers"
version = "9.0.9"
version = "9.0.10"
description = "Python version of Google's common library for parsing, formatting, storing and validating international phone numbers."
optional = false
python-versions = "*"
groups = ["main"]
files = [
{file = "phonenumbers-9.0.9-py2.py3-none-any.whl", hash = "sha256:13b91aa153f87675902829b38a556bad54824f9c121b89588bbb5fa8550d97ef"},
{file = "phonenumbers-9.0.9.tar.gz", hash = "sha256:c640545019a07e68b0bea57a5fede6eef45c7391165d28935f45615f9a567a5b"},
{file = "phonenumbers-9.0.10-py2.py3-none-any.whl", hash = "sha256:13b12d269be1f2b363c9bc2868656a7e2e8b50f1a1cef629c75005da6c374c6b"},
{file = "phonenumbers-9.0.10.tar.gz", hash = "sha256:c2d15a6a9d0534b14a7764f51246ada99563e263f65b80b0251d1a760ac4a1ba"},
]
[[package]]
@@ -2408,30 +2409,30 @@ files = [
[[package]]
name = "ruff"
version = "0.12.4"
version = "0.12.7"
description = "An extremely fast Python linter and code formatter, written in Rust."
optional = false
python-versions = ">=3.7"
groups = ["dev"]
files = [
{file = "ruff-0.12.4-py3-none-linux_armv6l.whl", hash = "sha256:cb0d261dac457ab939aeb247e804125a5d521b21adf27e721895b0d3f83a0d0a"},
{file = "ruff-0.12.4-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:55c0f4ca9769408d9b9bac530c30d3e66490bd2beb2d3dae3e4128a1f05c7442"},
{file = "ruff-0.12.4-py3-none-macosx_11_0_arm64.whl", hash = "sha256:a8224cc3722c9ad9044da7f89c4c1ec452aef2cfe3904365025dd2f51daeae0e"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e9949d01d64fa3672449a51ddb5d7548b33e130240ad418884ee6efa7a229586"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:be0593c69df9ad1465e8a2d10e3defd111fdb62dcd5be23ae2c06da77e8fcffb"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a7dea966bcb55d4ecc4cc3270bccb6f87a337326c9dcd3c07d5b97000dbff41c"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:afcfa3ab5ab5dd0e1c39bf286d829e042a15e966b3726eea79528e2e24d8371a"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c057ce464b1413c926cdb203a0f858cd52f3e73dcb3270a3318d1630f6395bb3"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e64b90d1122dc2713330350626b10d60818930819623abbb56535c6466cce045"},
{file = "ruff-0.12.4-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2abc48f3d9667fdc74022380b5c745873499ff827393a636f7a59da1515e7c57"},
{file = "ruff-0.12.4-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:2b2449dc0c138d877d629bea151bee8c0ae3b8e9c43f5fcaafcd0c0d0726b184"},
{file = "ruff-0.12.4-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:56e45bb11f625db55f9b70477062e6a1a04d53628eda7784dce6e0f55fd549eb"},
{file = "ruff-0.12.4-py3-none-musllinux_1_2_i686.whl", hash = "sha256:478fccdb82ca148a98a9ff43658944f7ab5ec41c3c49d77cd99d44da019371a1"},
{file = "ruff-0.12.4-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:0fc426bec2e4e5f4c4f182b9d2ce6a75c85ba9bcdbe5c6f2a74fcb8df437df4b"},
{file = "ruff-0.12.4-py3-none-win32.whl", hash = "sha256:4de27977827893cdfb1211d42d84bc180fceb7b72471104671c59be37041cf93"},
{file = "ruff-0.12.4-py3-none-win_amd64.whl", hash = "sha256:fe0b9e9eb23736b453143d72d2ceca5db323963330d5b7859d60d101147d461a"},
{file = "ruff-0.12.4-py3-none-win_arm64.whl", hash = "sha256:0618ec4442a83ab545e5b71202a5c0ed7791e8471435b94e655b570a5031a98e"},
{file = "ruff-0.12.4.tar.gz", hash = "sha256:13efa16df6c6eeb7d0f091abae50f58e9522f3843edb40d56ad52a5a4a4b6873"},
{file = "ruff-0.12.7-py3-none-linux_armv6l.whl", hash = "sha256:76e4f31529899b8c434c3c1dede98c4483b89590e15fb49f2d46183801565303"},
{file = "ruff-0.12.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:789b7a03e72507c54fb3ba6209e4bb36517b90f1a3569ea17084e3fd295500fb"},
{file = "ruff-0.12.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:2e1c2a3b8626339bb6369116e7030a4cf194ea48f49b64bb505732a7fce4f4e3"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:32dec41817623d388e645612ec70d5757a6d9c035f3744a52c7b195a57e03860"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:47ef751f722053a5df5fa48d412dbb54d41ab9b17875c6840a58ec63ff0c247c"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a828a5fc25a3efd3e1ff7b241fd392686c9386f20e5ac90aa9234a5faa12c423"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:5726f59b171111fa6a69d82aef48f00b56598b03a22f0f4170664ff4d8298efb"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:74e6f5c04c4dd4aba223f4fe6e7104f79e0eebf7d307e4f9b18c18362124bccd"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5d0bfe4e77fba61bf2ccadf8cf005d6133e3ce08793bbe870dd1c734f2699a3e"},
{file = "ruff-0.12.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:06bfb01e1623bf7f59ea749a841da56f8f653d641bfd046edee32ede7ff6c606"},
{file = "ruff-0.12.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e41df94a957d50083fd09b916d6e89e497246698c3f3d5c681c8b3e7b9bb4ac8"},
{file = "ruff-0.12.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:4000623300563c709458d0ce170c3d0d788c23a058912f28bbadc6f905d67afa"},
{file = "ruff-0.12.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:69ffe0e5f9b2cf2b8e289a3f8945b402a1b19eff24ec389f45f23c42a3dd6fb5"},
{file = "ruff-0.12.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a07a5c8ffa2611a52732bdc67bf88e243abd84fe2d7f6daef3826b59abbfeda4"},
{file = "ruff-0.12.7-py3-none-win32.whl", hash = "sha256:c928f1b2ec59fb77dfdf70e0419408898b63998789cc98197e15f560b9e77f77"},
{file = "ruff-0.12.7-py3-none-win_amd64.whl", hash = "sha256:9c18f3d707ee9edf89da76131956aba1270c6348bfee8f6c647de841eac7194f"},
{file = "ruff-0.12.7-py3-none-win_arm64.whl", hash = "sha256:dfce05101dbd11833a0776716d5d1578641b7fddb537fe7fa956ab85d1769b69"},
{file = "ruff-0.12.7.tar.gz", hash = "sha256:1fc3193f238bc2d7968772c82831a4ff69252f673be371fb49663f0068b7ec71"},
]
[[package]]
@@ -2469,15 +2470,15 @@ doc = ["Sphinx", "sphinx-rtd-theme"]
[[package]]
name = "sentry-sdk"
version = "2.32.0"
version = "2.34.1"
description = "Python client for Sentry (https://sentry.io)"
optional = true
python-versions = ">=3.6"
groups = ["main"]
markers = "extra == \"all\" or extra == \"sentry\""
files = [
{file = "sentry_sdk-2.32.0-py2.py3-none-any.whl", hash = "sha256:6cf51521b099562d7ce3606da928c473643abe99b00ce4cb5626ea735f4ec345"},
{file = "sentry_sdk-2.32.0.tar.gz", hash = "sha256:9016c75d9316b0f6921ac14c8cd4fb938f26002430ac5be9945ab280f78bec6b"},
{file = "sentry_sdk-2.34.1-py2.py3-none-any.whl", hash = "sha256:b7a072e1cdc5abc48101d5146e1ae680fa81fe886d8d95aaa25a0b450c818d32"},
{file = "sentry_sdk-2.34.1.tar.gz", hash = "sha256:69274eb8c5c38562a544c3e9f68b5be0a43be4b697f5fd385bf98e4fbe672687"},
]
[package.dependencies]
@@ -2931,14 +2932,14 @@ files = [
[[package]]
name = "types-jsonschema"
version = "4.24.0.20250708"
version = "4.25.0.20250720"
description = "Typing stubs for jsonschema"
optional = false
python-versions = ">=3.9"
groups = ["dev"]
files = [
{file = "types_jsonschema-4.24.0.20250708-py3-none-any.whl", hash = "sha256:d574aa3421d178a8435cc898cf4cf5e5e8c8f37b949c8e3ceeff06da433a18bf"},
{file = "types_jsonschema-4.24.0.20250708.tar.gz", hash = "sha256:a910e4944681cbb1b18a93ffb502e09910db788314312fc763df08d8ac2aadb7"},
{file = "types_jsonschema-4.25.0.20250720-py3-none-any.whl", hash = "sha256:7d7897c715310d8bf9ae27a2cedba78bbb09e4cad83ce06d2aa79b73a88941df"},
{file = "types_jsonschema-4.25.0.20250720.tar.gz", hash = "sha256:765a3b6144798fe3161fd8cbe570a756ed3e8c0e5adb7c09693eb49faad39dbd"},
]
[package.dependencies]
@@ -2982,14 +2983,14 @@ files = [
[[package]]
name = "types-psycopg2"
version = "2.9.21.20250516"
version = "2.9.21.20250718"
description = "Typing stubs for psycopg2"
optional = false
python-versions = ">=3.9"
groups = ["dev"]
files = [
{file = "types_psycopg2-2.9.21.20250516-py3-none-any.whl", hash = "sha256:2a9212d1e5e507017b31486ce8147634d06b85d652769d7a2d91d53cb4edbd41"},
{file = "types_psycopg2-2.9.21.20250516.tar.gz", hash = "sha256:6721018279175cce10b9582202e2a2b4a0da667857ccf82a97691bdb5ecd610f"},
{file = "types_psycopg2-2.9.21.20250718-py3-none-any.whl", hash = "sha256:bcf085d4293bda48f5943a46dadf0389b2f98f7e8007722f7e1c12ee0f541858"},
{file = "types_psycopg2-2.9.21.20250718.tar.gz", hash = "sha256:dc09a97272ef67e739e57b9f4740b761208f4514257e311c0b05c8c7a37d04b4"},
]
[[package]]
@@ -3352,4 +3353,4 @@ url-preview = ["lxml"]
[metadata]
lock-version = "2.1"
python-versions = "^3.9.0"
content-hash = "b1a0f4708465fd597d0bc7ebb09443ce0e2613cd58a33387a28036249f26856b"
content-hash = "600a349d08dde732df251583094a121b5385eb43ae0c6ceff10dcf9749359446"
+9 -4
@@ -101,7 +101,7 @@ module-name = "synapse.synapse_rust"
[tool.poetry]
name = "matrix-synapse"
version = "1.135.2"
version = "1.136.0"
description = "Homeserver for the Matrix decentralised comms protocol"
authors = ["Matrix.org Team and Contributors <packages@matrix.org>"]
license = "AGPL-3.0-or-later"
@@ -178,8 +178,13 @@ signedjson = "^1.1.0"
service-identity = ">=18.1.0"
# Twisted 18.9 introduces some logger improvements that the structured
# logger utilises
Twisted = {extras = ["tls"], version = ">=18.9.0"}
treq = ">=15.1"
# Twisted 19.7.0 moves test helpers to a new module and deprecates the old location.
# Twisted 21.2.0 introduces contextvar support.
# We could likely bump this to 22.1 without making distro packagers'
# lives hard (as of 2025-07, distro support is Ubuntu LTS: 22.1, Debian stable: 22.4,
# RHEL 9: 22.10)
Twisted = {extras = ["tls"], version = ">=21.2.0"}
treq = ">=21.5.0"
# Twisted has required pyopenssl 16.0 since about Twisted 16.6.
pyOpenSSL = ">=16.0.0"
PyYAML = ">=5.3"
@@ -319,7 +324,7 @@ all = [
# failing on new releases. Keeping lower bounds loose here means that dependabot
# can bump versions without having to update the content-hash in the lockfile.
# This helps prevent merge conflicts when running a batch of dependabot updates.
ruff = "0.12.4"
ruff = "0.12.7"
# Type checking only works with the pydantic.v1 compat module from pydantic v2
pydantic = "^2"
+1 -1
@@ -7,7 +7,7 @@ name = "synapse"
version = "0.1.0"
edition = "2021"
rust-version = "1.81.0"
rust-version = "1.82.0"
[lib]
name = "synapse"
+12 -7
@@ -61,6 +61,7 @@ fn bench_match_exact(b: &mut Bencher) {
vec![],
false,
false,
false,
)
.unwrap();
@@ -71,10 +72,10 @@ fn bench_match_exact(b: &mut Bencher) {
},
));
let matched = eval.match_condition(&condition, None, None).unwrap();
let matched = eval.match_condition(&condition, None, None, None).unwrap();
assert!(matched, "Didn't match");
b.iter(|| eval.match_condition(&condition, None, None).unwrap());
b.iter(|| eval.match_condition(&condition, None, None, None).unwrap());
}
#[bench]
@@ -107,6 +108,7 @@ fn bench_match_word(b: &mut Bencher) {
vec![],
false,
false,
false,
)
.unwrap();
@@ -117,10 +119,10 @@ fn bench_match_word(b: &mut Bencher) {
},
));
let matched = eval.match_condition(&condition, None, None).unwrap();
let matched = eval.match_condition(&condition, None, None, None).unwrap();
assert!(matched, "Didn't match");
b.iter(|| eval.match_condition(&condition, None, None).unwrap());
b.iter(|| eval.match_condition(&condition, None, None, None).unwrap());
}
#[bench]
@@ -153,6 +155,7 @@ fn bench_match_word_miss(b: &mut Bencher) {
vec![],
false,
false,
false,
)
.unwrap();
@@ -163,10 +166,10 @@ fn bench_match_word_miss(b: &mut Bencher) {
},
));
let matched = eval.match_condition(&condition, None, None).unwrap();
let matched = eval.match_condition(&condition, None, None, None).unwrap();
assert!(!matched, "Unexpectedly matched");
b.iter(|| eval.match_condition(&condition, None, None).unwrap());
b.iter(|| eval.match_condition(&condition, None, None, None).unwrap());
}
#[bench]
@@ -199,6 +202,7 @@ fn bench_eval_message(b: &mut Bencher) {
vec![],
false,
false,
false,
)
.unwrap();
@@ -210,7 +214,8 @@ fn bench_eval_message(b: &mut Bencher) {
false,
false,
false,
false,
);
b.iter(|| eval.run(&rules, Some("bob"), Some("person")));
b.iter(|| eval.run(&rules, Some("bob"), Some("person"), None));
}
+24
@@ -54,6 +54,7 @@ enum EventInternalMetadataData {
RecheckRedaction(bool),
SoftFailed(bool),
ProactivelySend(bool),
PolicyServerSpammy(bool),
Redacted(bool),
TxnId(Box<str>),
TokenId(i64),
@@ -96,6 +97,13 @@ impl EventInternalMetadataData {
.to_owned()
.into_any(),
),
EventInternalMetadataData::PolicyServerSpammy(o) => (
pyo3::intern!(py, "policy_server_spammy"),
o.into_pyobject(py)
.unwrap_infallible()
.to_owned()
.into_any(),
),
EventInternalMetadataData::Redacted(o) => (
pyo3::intern!(py, "redacted"),
o.into_pyobject(py)
@@ -155,6 +163,11 @@ impl EventInternalMetadataData {
.extract()
.with_context(|| format!("'{key_str}' has invalid type"))?,
),
"policy_server_spammy" => EventInternalMetadataData::PolicyServerSpammy(
value
.extract()
.with_context(|| format!("'{key_str}' has invalid type"))?,
),
"redacted" => EventInternalMetadataData::Redacted(
value
.extract()
@@ -427,6 +440,17 @@ impl EventInternalMetadata {
set_property!(self, ProactivelySend, obj);
}
#[getter]
fn get_policy_server_spammy(&self) -> PyResult<bool> {
Ok(get_property_opt!(self, PolicyServerSpammy)
.copied()
.unwrap_or(false))
}
#[setter]
fn set_policy_server_spammy(&mut self, obj: bool) {
set_property!(self, PolicyServerSpammy, obj);
}
#[getter]
fn get_redacted(&self) -> PyResult<bool> {
let bool = get_property!(self, Redacted)?;
+20
@@ -290,6 +290,26 @@ pub const BASE_APPEND_CONTENT_RULES: &[PushRule] = &[PushRule {
}];
pub const BASE_APPEND_UNDERRIDE_RULES: &[PushRule] = &[
PushRule {
rule_id: Cow::Borrowed("global/content/.io.element.msc4306.rule.unsubscribed_thread"),
priority_class: 1,
conditions: Cow::Borrowed(&[Condition::Known(
KnownCondition::Msc4306ThreadSubscription { subscribed: false },
)]),
actions: Cow::Borrowed(&[]),
default: true,
default_enabled: true,
},
PushRule {
rule_id: Cow::Borrowed("global/content/.io.element.msc4306.rule.subscribed_thread"),
priority_class: 1,
conditions: Cow::Borrowed(&[Condition::Known(
KnownCondition::Msc4306ThreadSubscription { subscribed: true },
)]),
actions: Cow::Borrowed(&[Action::Notify, SOUND_ACTION]),
default: true,
default_enabled: true,
},
PushRule {
rule_id: Cow::Borrowed("global/underride/.m.rule.call"),
priority_class: 1,
+52 -7
@@ -106,8 +106,11 @@ pub struct PushRuleEvaluator {
/// flag as MSC1767 (extensible events core).
msc3931_enabled: bool,
// If MSC4210 (remove legacy mentions) is enabled.
/// If MSC4210 (remove legacy mentions) is enabled.
msc4210_enabled: bool,
/// If MSC4306 (thread subscriptions) is enabled.
msc4306_enabled: bool,
}
#[pymethods]
@@ -126,6 +129,7 @@ impl PushRuleEvaluator {
room_version_feature_flags,
msc3931_enabled,
msc4210_enabled,
msc4306_enabled,
))]
pub fn py_new(
flattened_keys: BTreeMap<String, JsonValue>,
@@ -138,6 +142,7 @@ impl PushRuleEvaluator {
room_version_feature_flags: Vec<String>,
msc3931_enabled: bool,
msc4210_enabled: bool,
msc4306_enabled: bool,
) -> Result<Self, Error> {
let body = match flattened_keys.get("content.body") {
Some(JsonValue::Value(SimpleJsonValue::Str(s))) => s.clone().into_owned(),
@@ -156,6 +161,7 @@ impl PushRuleEvaluator {
room_version_feature_flags,
msc3931_enabled,
msc4210_enabled,
msc4306_enabled,
})
}
@@ -167,12 +173,19 @@ impl PushRuleEvaluator {
///
/// Returns the set of actions, if any, that match (filtering out any
/// `dont_notify` and `coalesce` actions).
#[pyo3(signature = (push_rules, user_id=None, display_name=None))]
///
/// msc4306_thread_subscription_state: (Only populated if MSC4306 is enabled)
/// The thread subscription state corresponding to the thread containing this event.
/// - `None` if the event is not in a thread, or if MSC4306 is disabled.
/// - `Some(true)` if the event is in a thread and the user has a subscription for that thread
/// - `Some(false)` if the event is in a thread and the user does NOT have a subscription for that thread
#[pyo3(signature = (push_rules, user_id=None, display_name=None, msc4306_thread_subscription_state=None))]
pub fn run(
&self,
push_rules: &FilteredPushRules,
user_id: Option<&str>,
display_name: Option<&str>,
msc4306_thread_subscription_state: Option<bool>,
) -> Vec<Action> {
'outer: for (push_rule, enabled) in push_rules.iter() {
if !enabled {
@@ -204,7 +217,12 @@ impl PushRuleEvaluator {
Condition::Known(KnownCondition::RoomVersionSupports { feature: _ }),
);
match self.match_condition(condition, user_id, display_name) {
match self.match_condition(
condition,
user_id,
display_name,
msc4306_thread_subscription_state,
) {
Ok(true) => {}
Ok(false) => continue 'outer,
Err(err) => {
@@ -237,14 +255,20 @@ impl PushRuleEvaluator {
}
/// Check if the given condition matches.
#[pyo3(signature = (condition, user_id=None, display_name=None))]
#[pyo3(signature = (condition, user_id=None, display_name=None, msc4306_thread_subscription_state=None))]
fn matches(
&self,
condition: Condition,
user_id: Option<&str>,
display_name: Option<&str>,
msc4306_thread_subscription_state: Option<bool>,
) -> bool {
match self.match_condition(&condition, user_id, display_name) {
match self.match_condition(
&condition,
user_id,
display_name,
msc4306_thread_subscription_state,
) {
Ok(true) => true,
Ok(false) => false,
Err(err) => {
@@ -262,6 +286,7 @@ impl PushRuleEvaluator {
condition: &Condition,
user_id: Option<&str>,
display_name: Option<&str>,
msc4306_thread_subscription_state: Option<bool>,
) -> Result<bool, Error> {
let known_condition = match condition {
Condition::Known(known) => known,
@@ -393,6 +418,13 @@ impl PushRuleEvaluator {
&& self.room_version_feature_flags.contains(&flag)
}
}
KnownCondition::Msc4306ThreadSubscription { subscribed } => {
if !self.msc4306_enabled {
false
} else {
msc4306_thread_subscription_state == Some(*subscribed)
}
}
};
Ok(result)
@@ -536,10 +568,11 @@ fn push_rule_evaluator() {
vec![],
true,
false,
false,
)
.unwrap();
let result = evaluator.run(&FilteredPushRules::default(), None, Some("bob"));
let result = evaluator.run(&FilteredPushRules::default(), None, Some("bob"), None);
assert_eq!(result.len(), 3);
}
@@ -566,6 +599,7 @@ fn test_requires_room_version_supports_condition() {
flags,
true,
false,
false,
)
.unwrap();
@@ -575,6 +609,7 @@ fn test_requires_room_version_supports_condition() {
&FilteredPushRules::default(),
Some("@bob:example.org"),
None,
None,
);
assert_eq!(result.len(), 3);
@@ -593,7 +628,17 @@ fn test_requires_room_version_supports_condition() {
};
let rules = PushRules::new(vec![custom_rule]);
result = evaluator.run(
&FilteredPushRules::py_new(rules, BTreeMap::new(), true, false, true, false, false),
&FilteredPushRules::py_new(
rules,
BTreeMap::new(),
true,
false,
true,
false,
false,
false,
),
None,
None,
None,
);
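The `Msc4306ThreadSubscription` branch added to `match_condition` above compares a three-valued state (`None`, `Some(true)`, `Some(false)`) against the rule's `subscribed` flag. The following Python sketch restates that logic for illustration only; the function name `matches_thread_subscription` is hypothetical and is not part of Synapse.

```python
from typing import Optional


def matches_thread_subscription(
    msc4306_enabled: bool,
    subscribed: bool,
    thread_subscription_state: Optional[bool],
) -> bool:
    """Illustrative restatement of the Rust branch in the diff (hypothetical helper)."""
    # With the feature flag off, the condition never matches.
    if not msc4306_enabled:
        return False
    # `None` means "not in a thread" (or MSC4306 disabled), so it matches
    # neither the subscribed rule nor the unsubscribed rule.
    if thread_subscription_state is None:
        return False
    return thread_subscription_state == subscribed


if __name__ == "__main__":
    for state in (None, True, False):
        print(
            state,
            matches_thread_subscription(True, True, state),
            matches_thread_subscription(True, False, state),
        )
```

Under this reading, an event outside any thread falls through both new MSC4306 rules and is evaluated by the remaining rules as before.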
+12
@@ -369,6 +369,10 @@ pub enum KnownCondition {
RoomVersionSupports {
feature: Cow<'static, str>,
},
#[serde(rename = "io.element.msc4306.thread_subscription")]
Msc4306ThreadSubscription {
subscribed: bool,
},
}
impl<'source> IntoPyObject<'source> for Condition {
@@ -547,11 +551,13 @@ pub struct FilteredPushRules {
msc3664_enabled: bool,
msc4028_push_encrypted_events: bool,
msc4210_enabled: bool,
msc4306_enabled: bool,
}
#[pymethods]
impl FilteredPushRules {
#[new]
#[allow(clippy::too_many_arguments)]
pub fn py_new(
push_rules: PushRules,
enabled_map: BTreeMap<String, bool>,
@@ -560,6 +566,7 @@ impl FilteredPushRules {
msc3664_enabled: bool,
msc4028_push_encrypted_events: bool,
msc4210_enabled: bool,
msc4306_enabled: bool,
) -> Self {
Self {
push_rules,
@@ -569,6 +576,7 @@ impl FilteredPushRules {
msc3664_enabled,
msc4028_push_encrypted_events,
msc4210_enabled,
msc4306_enabled,
}
}
@@ -619,6 +627,10 @@ impl FilteredPushRules {
return false;
}
if !self.msc4306_enabled && rule.rule_id.contains("/.io.element.msc4306.rule.") {
return false;
}
true
})
.map(|r| {
+76 -1
@@ -1,5 +1,5 @@
$schema: https://element-hq.github.io/synapse/latest/schema/v1/meta.schema.json
$id: https://element-hq.github.io/synapse/schema/synapse/v1.135/synapse-config.schema.json
$id: https://element-hq.github.io/synapse/schema/synapse/v1.136/synapse-config.schema.json
type: object
properties:
modules:
@@ -629,6 +629,70 @@ properties:
password: mypassword
ssh_priv_key_path: CONFDIR/id_rsa
ssh_pub_key_path: CONFDIR/id_rsa.pub
http_proxy:
type: ["string", "null"]
description: >-
Proxy server to use for HTTP requests.
For more details, see the [forward proxy documentation](../../setup/forward_proxy.md).
examples:
- "http://USERNAME:PASSWORD@10.0.1.1:8080/"
https_proxy:
type: ["string", "null"]
description: >-
Proxy server to use for HTTPS requests.
For more details, see the [forward proxy documentation](../../setup/forward_proxy.md).
examples:
- "http://USERNAME:PASSWORD@proxy.example.com:8080/"
no_proxy_hosts:
type: array
description: >-
List of hosts, IP addresses, or IP ranges in CIDR format which should not use the
proxy. Synapse will directly connect to these hosts.
For more details, see the [forward proxy documentation](../../setup/forward_proxy.md).
examples:
- - master.hostname.example.com
- 10.1.0.0/16
- 172.30.0.0/16
matrix_authentication_service:
type: object
description: >-
The `matrix_authentication_service` setting configures integration with
[Matrix Authentication Service (MAS)](https://github.com/element-hq/matrix-authentication-service).
properties:
enabled:
type: boolean
description: >-
Whether or not to enable the MAS integration. If this is set to
`false`, Synapse will use its legacy internal authentication API.
default: false
endpoint:
type: string
format: uri
description: >-
The URL where Synapse can reach MAS. This *must* have the `discovery`
and `oauth` resources mounted.
default: http://localhost:8080
secret:
type: ["string", "null"]
description: >-
A shared secret that will be used to authenticate requests from and to MAS.
secret_path:
type: ["string", "null"]
description: >-
Alternative to `secret`, reading the shared secret from a file.
The file should be a plain text file, containing only the secret.
Synapse reads the secret from the given file once at startup.
examples:
- enabled: true
secret: someverysecuresecret
endpoint: http://localhost:8080
dummy_events_threshold:
type: integer
description: >-
@@ -2201,6 +2265,17 @@ properties:
examples:
- per_second: 2.0
burst_count: 20.0
rc_room_creation:
$ref: "#/$defs/rc"
description: >-
Sets rate limits for how often users are able to create rooms.
default:
per_user:
per_second: 0.016
burst_count: 10.0
examples:
- per_second: 1.0
burst_count: 5.0
federation_rr_transactions_per_room_per_second:
type: integer
description: >-
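To make the `rc_room_creation` defaults concrete: `per_second: 0.016` works out to roughly one room every 1 / 0.016 = 62.5 seconds on average, and `burst_count: 10.0` allows up to ten creations in quick succession before that sustained rate applies. The sketch below uses a generic token bucket to show the arithmetic. It is an assumption for illustration; the `TokenBucket` class is hypothetical, and Synapse's own rate limiter may behave differently in detail.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Generic token bucket (hypothetical; not Synapse's implementation)."""

    per_second: float
    burst_count: float
    tokens: float = field(init=False)
    last: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so an initial burst is permitted.
        self.tokens = self.burst_count

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst_count, self.tokens + elapsed * self.per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(per_second=0.016, burst_count=10.0)
    print([bucket.allow(0.0) for _ in range(11)])  # ten allowed, then one denied
    print(bucket.allow(63.0))  # a little over 62.5 s later, one more is allowed
```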
+8 -1
@@ -473,6 +473,10 @@ def section(prop: str, values: dict) -> str:
def main() -> None:
# For Windows: reconfigure the terminal to be UTF-8 for `print()` calls.
if sys.platform == "win32":
sys.stdout.reconfigure(encoding="utf-8")
def usage(err_msg: str) -> int:
script_name = (sys.argv[:1] or ["__main__.py"])[0]
print(err_msg, file=sys.stderr)
@@ -485,7 +489,10 @@ def main() -> None:
exit(usage("Too many arguments."))
if not (filepath := (sys.argv[1:] or [""])[0]):
exit(usage("No schema file provided."))
with open(filepath) as f:
with open(filepath, "r", encoding="utf-8") as f:
# Note: Windows requires that we specify the encoding otherwise it uses
# things like CP-1251, which can cause explosions.
# See https://github.com/yaml/pyyaml/issues/123 for more info.
return yaml.safe_load(f)
schema = read_json_file_arg()
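The change above passes `encoding="utf-8"` to `open()` so that reading the schema does not depend on the platform's default encoding. A minimal sketch of the same idea follows, assuming a hypothetical helper `read_text_utf8` and a temporary file created only for the demonstration.

```python
import os
import tempfile


def read_text_utf8(path: str) -> str:
    """Read a text file as UTF-8 regardless of the platform default (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    # Write some non-ASCII text as UTF-8 bytes, then read it back explicitly as UTF-8.
    fd, path = tempfile.mkstemp(suffix=".yaml")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write("description: café ✓\n".encode("utf-8"))
        print(read_text_utf8(path))
    finally:
        os.remove(path)
```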
+324 -3
@@ -23,28 +23,195 @@
can crop up, e.g the cache descriptors.
"""
from typing import Callable, Optional, Tuple, Type, Union
import enum
from typing import Callable, Mapping, Optional, Tuple, Type, Union
import attr
import mypy.types
from mypy.erasetype import remove_instance_last_known_values
from mypy.errorcodes import ErrorCode
from mypy.nodes import ARG_NAMED_OPT, TempNode, Var
from mypy.plugin import FunctionSigContext, MethodSigContext, Plugin
from mypy.nodes import ARG_NAMED_OPT, ListExpr, NameExpr, TempNode, TupleExpr, Var
from mypy.plugin import (
ClassDefContext,
Context,
FunctionLike,
FunctionSigContext,
MethodSigContext,
MypyFile,
Plugin,
)
from mypy.typeops import bind_self
from mypy.types import (
AnyType,
CallableType,
Instance,
NoneType,
Options,
TupleType,
TypeAliasType,
TypeVarType,
UninhabitedType,
UnionType,
)
from mypy_zope import plugin as mypy_zope_plugin
from pydantic.mypy import plugin as mypy_pydantic_plugin
PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL = ErrorCode(
"missing-server-name-label",
"`SERVER_NAME_LABEL` required in metric",
category="per-homeserver-tenant-metrics",
)
PROMETHEUS_METRIC_MISSING_FROM_LIST_TO_CHECK = ErrorCode(
"metric-type-missing-from-list",
"Every Prometheus metric type must be included in the `prometheus_metric_fullname_to_label_arg_map`.",
category="per-homeserver-tenant-metrics",
)
class Sentinel(enum.Enum):
# defining a sentinel in this way allows mypy to correctly handle the
# type of a dictionary lookup and subsequent type narrowing.
UNSET_SENTINEL = object()
@attr.s(auto_attribs=True)
class ArgLocation:
keyword_name: str
"""
The keyword argument name for this argument
"""
position: int
"""
The 0-based positional index of this argument
"""
prometheus_metric_fullname_to_label_arg_map: Mapping[str, Optional[ArgLocation]] = {
# `Collector` subclasses:
"prometheus_client.metrics.MetricWrapperBase": ArgLocation("labelnames", 2),
"prometheus_client.metrics.Counter": ArgLocation("labelnames", 2),
"prometheus_client.metrics.Histogram": ArgLocation("labelnames", 2),
"prometheus_client.metrics.Gauge": ArgLocation("labelnames", 2),
"prometheus_client.metrics.Summary": ArgLocation("labelnames", 2),
"prometheus_client.metrics.Info": ArgLocation("labelnames", 2),
"prometheus_client.metrics.Enum": ArgLocation("labelnames", 2),
"synapse.metrics.LaterGauge": ArgLocation("labelnames", 2),
"synapse.metrics.InFlightGauge": ArgLocation("labels", 2),
"synapse.metrics.GaugeBucketCollector": ArgLocation("labelnames", 2),
"prometheus_client.registry.Collector": None,
"prometheus_client.registry._EmptyCollector": None,
"prometheus_client.registry.CollectorRegistry": None,
"prometheus_client.process_collector.ProcessCollector": None,
"prometheus_client.platform_collector.PlatformCollector": None,
"prometheus_client.gc_collector.GCCollector": None,
"synapse.metrics._gc.GCCounts": None,
"synapse.metrics._gc.PyPyGCStats": None,
"synapse.metrics._reactor_metrics.ReactorLastSeenMetric": None,
"synapse.metrics.CPUMetrics": None,
"synapse.metrics.jemalloc.JemallocCollector": None,
"synapse.util.metrics.DynamicCollectorRegistry": None,
"synapse.metrics.background_process_metrics._Collector": None,
#
# `Metric` subclasses:
"prometheus_client.metrics_core.Metric": None,
"prometheus_client.metrics_core.UnknownMetricFamily": ArgLocation("labels", 3),
"prometheus_client.metrics_core.CounterMetricFamily": ArgLocation("labels", 3),
"prometheus_client.metrics_core.GaugeMetricFamily": ArgLocation("labels", 3),
"prometheus_client.metrics_core.SummaryMetricFamily": ArgLocation("labels", 3),
"prometheus_client.metrics_core.InfoMetricFamily": ArgLocation("labels", 3),
"prometheus_client.metrics_core.HistogramMetricFamily": ArgLocation("labels", 3),
"prometheus_client.metrics_core.GaugeHistogramMetricFamily": ArgLocation(
"labels", 4
),
"prometheus_client.metrics_core.StateSetMetricFamily": ArgLocation("labels", 3),
"synapse.metrics.GaugeHistogramMetricFamilyWithLabels": ArgLocation(
"labelnames", 4
),
}
"""
Map from the fullname of the Prometheus `Metric`/`Collector` classes to the keyword
argument name and positional index of the label names. This map is useful because
different metrics have different signatures for passing in label names and we just need
to know where to look.
This map should include any metrics that we collect with Prometheus, which corresponds
to anything that inherits from `prometheus_client.registry.Collector`
(`synapse.metrics._types.Collector`) or `prometheus_client.metrics_core.Metric`. The
exhaustiveness of this list is enforced by `analyze_prometheus_metric_classes`.
The entries with `None` always fail the lint because they don't have a `labelnames`
argument (therefore, no `SERVER_NAME_LABEL`), but we include them here so that people
can notice and manually allow via a type ignore comment as the source of truth
should be in the source code.
"""
# Unbound at this point because we don't know the mypy version yet.
# This is set in the `plugin(...)` function below.
MypyPydanticPluginClass: Type[Plugin]
MypyZopePluginClass: Type[Plugin]
class SynapsePlugin(Plugin):
def __init__(self, options: Options):
super().__init__(options)
self.mypy_pydantic_plugin = MypyPydanticPluginClass(options)
self.mypy_zope_plugin = MypyZopePluginClass(options)
def set_modules(self, modules: dict[str, MypyFile]) -> None:
"""
This is called by mypy internals. We have to override this to ensure it's also
called for any other plugins that we're manually handling.
Here is how mypy describes it:
> [`self._modules`] can't be set in `__init__` because it is executed too soon
> in `build.py`. Therefore, `build.py` *must* set it later before graph processing
> starts by calling `set_modules()`.
"""
super().set_modules(modules)
self.mypy_pydantic_plugin.set_modules(modules)
self.mypy_zope_plugin.set_modules(modules)
def get_base_class_hook(
self, fullname: str
) -> Optional[Callable[[ClassDefContext], None]]:
def _get_base_class_hook(ctx: ClassDefContext) -> None:
# Run any `get_base_class_hook` checks from other plugins first.
#
# Unfortunately, because mypy only chooses the first plugin that returns a
# non-None value (known-limitation, c.f.
# https://github.com/python/mypy/issues/19524), we work around this by
# putting our custom plugin first in the plugin order and then calling the
# other plugin's hook manually followed by our own checks.
if callback := self.mypy_pydantic_plugin.get_base_class_hook(fullname):
callback(ctx)
if callback := self.mypy_zope_plugin.get_base_class_hook(fullname):
callback(ctx)
# Now run our own checks
analyze_prometheus_metric_classes(ctx)
return _get_base_class_hook
def get_function_signature_hook(
self, fullname: str
) -> Optional[Callable[[FunctionSigContext], FunctionLike]]:
# Strip off the unique identifier for classes that are dynamically created inside
# functions. ex. `synapse.metrics.jemalloc.JemallocCollector@185` (this is the line
# number)
if "@" in fullname:
fullname = fullname.split("@", 1)[0]
# Look for any Prometheus metrics to make sure they have the `SERVER_NAME_LABEL`
# label.
if fullname in prometheus_metric_fullname_to_label_arg_map.keys():
# Because it's difficult to determine the `fullname` of the function in the
# callback, let's just pass it in while we have it.
return lambda ctx: check_prometheus_metric_instantiation(ctx, fullname)
return None
def get_method_signature_hook(
self, fullname: str
) -> Optional[Callable[[MethodSigContext], CallableType]]:
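The `Sentinel` enum defined earlier in this file exists because the map's values may legitimately be `None`, so `None` cannot also signal "key not present". Using a single-member enum as the `.get()` default lets a type checker narrow the result with an `is` comparison. The sketch below illustrates the pattern in isolation; `LABEL_ARGS` and `describe` are hypothetical names with simplified values, not the plugin's real data.

```python
import enum
from typing import Mapping, Optional, Union


class Sentinel(enum.Enum):
    # A single-member enum: `x is Sentinel.UNSET_SENTINEL` can be narrowed by mypy.
    UNSET_SENTINEL = object()


# Hypothetical, simplified stand-in for the plugin's map.
LABEL_ARGS: Mapping[str, Optional[str]] = {
    "Counter": "labelnames",
    "Collector": None,  # known class, but it has no label argument
}


def describe(name: str) -> str:
    found: Union[str, None, Sentinel] = LABEL_ARGS.get(name, Sentinel.UNSET_SENTINEL)
    if found is Sentinel.UNSET_SENTINEL:
        # The key is absent from the map entirely.
        return "unknown"
    # From here on, `found` is narrowed to Optional[str].
    if found is None:
        return "known, no label argument"
    return f"label argument: {found}"


if __name__ == "__main__":
    for name in ("Counter", "Collector", "Mystery"):
        print(name, "->", describe(name))
```

A plain `dict.get(name)` would return `None` both for "Collector" and for "Mystery", which is the ambiguity the sentinel avoids.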
@@ -65,6 +232,157 @@ class SynapsePlugin(Plugin):
return None
def analyze_prometheus_metric_classes(ctx: ClassDefContext) -> None:
"""
Cross-check the list of Prometheus metric classes against the
`prometheus_metric_fullname_to_label_arg_map` to ensure the list is exhaustive and
up-to-date.
"""
fullname = ctx.cls.fullname
# Strip off the unique identifier for classes that are dynamically created inside
# functions. ex. `synapse.metrics.jemalloc.JemallocCollector@185` (this is the line
# number)
if "@" in fullname:
fullname = fullname.split("@", 1)[0]
if any(
ancestor_type.fullname
in (
# All of the Prometheus metric classes inherit from the `Collector`.
"prometheus_client.registry.Collector",
"synapse.metrics._types.Collector",
# And custom metrics that inherit from `Metric`.
"prometheus_client.metrics_core.Metric",
)
for ancestor_type in ctx.cls.info.mro
):
if fullname not in prometheus_metric_fullname_to_label_arg_map:
ctx.api.fail(
f"Expected {fullname} to be in `prometheus_metric_fullname_to_label_arg_map`, "
f"but it was not found. This is a problem with our custom mypy plugin. "
f"Please add it to the map.",
Context(),
code=PROMETHEUS_METRIC_MISSING_FROM_LIST_TO_CHECK,
)
def check_prometheus_metric_instantiation(
ctx: FunctionSigContext, fullname: str
) -> CallableType:
"""
Ensure that the `prometheus_client` metrics include the `SERVER_NAME_LABEL` label
when instantiated.
This is important because we support multiple Synapse instances running in the same
process, where all metrics share a single global `REGISTRY`. The `server_name` label
ensures metrics are correctly separated by homeserver.
There are also some metrics that apply at the process level, such as CPU usage,
Python garbage collection, and Twisted reactor tick time, which shouldn't have the
`SERVER_NAME_LABEL`. In those cases, use a type ignore comment to disable the
check, e.g. `# type: ignore[missing-server-name-label]`.
Args:
ctx: The `FunctionSigContext` from mypy.
fullname: The fully qualified name of the function being called,
e.g. `"prometheus_client.metrics.Counter"`
"""
# The true signature. We don't modify it, so this is what will be returned.
signature = ctx.default_signature
# Find where the label names argument is in the function signature.
arg_location = prometheus_metric_fullname_to_label_arg_map.get(
fullname, Sentinel.UNSET_SENTINEL
)
assert arg_location is not Sentinel.UNSET_SENTINEL, (
f"Expected to find {fullname} in `prometheus_metric_fullname_to_label_arg_map`, "
f"but it was not found. This is a problem with our custom mypy plugin. "
f"Please add it to the map. Context: {ctx.context}"
)
# People should be using `# type: ignore[missing-server-name-label]` for
# process-level metrics that should not have the `SERVER_NAME_LABEL`.
if arg_location is None:
ctx.api.fail(
f"{signature.name} does not have a `labelnames`/`labels` argument "
"(if this is untrue, update `prometheus_metric_fullname_to_label_arg_map` "
"in our custom mypy plugin) and should probably have a type ignore comment, "
"e.g. `# type: ignore[missing-server-name-label]`. We don't automatically "
"ignore this because the source of truth should be in the source code.",
ctx.context,
code=PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL,
)
return signature
# Sanity check the arguments are still as expected in this version of
# `prometheus_client`. ex. `Counter(name, documentation, labelnames, ...)`
#
# `signature.arg_names` should be: ["name", "documentation", "labelnames", ...]
if (
len(signature.arg_names) < (arg_location.position + 1)
or signature.arg_names[arg_location.position] != arg_location.keyword_name
):
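ctx.api.fail(
# Guard the index: the signature may be shorter than expected, in which
# case indexing `arg_names` here would raise an `IndexError`.
f"Expected argument number {arg_location.position + 1} of {signature.name} to be `labelnames`/`labels`, "
f"but got {signature.arg_names[arg_location.position] if arg_location.position < len(signature.arg_names) else 'nothing'}",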
ctx.context,
)
return signature
# Ensure mypy is passing the correct number of arguments because we are doing some
# dirty indexing into `ctx.args` later on.
assert len(ctx.args) == len(signature.arg_names), (
f"Expected the list of arguments in the {signature.name} signature ({len(signature.arg_names)}) "
f"to match the number of arguments from the function signature context ({len(ctx.args)})"
)
# Check if the `labelnames` argument includes `SERVER_NAME_LABEL`
#
# `ctx.args` should look like this:
# ```
# [
# [StrExpr("name")],
# [StrExpr("documentation")],
# [ListExpr([StrExpr("label1"), StrExpr("label2")])]
# ...
# ]
# ```
labelnames_arg_expression = (
ctx.args[arg_location.position][0]
if len(ctx.args[arg_location.position]) > 0
else None
)
if isinstance(labelnames_arg_expression, (ListExpr, TupleExpr)):
# Check if the `labelnames` argument includes the `server_name` label (`SERVER_NAME_LABEL`).
for labelname_expression in labelnames_arg_expression.items:
if (
isinstance(labelname_expression, NameExpr)
and labelname_expression.fullname == "synapse.metrics.SERVER_NAME_LABEL"
):
# Found the `SERVER_NAME_LABEL`, all good!
break
else:
ctx.api.fail(
f"Expected {signature.name} to include `SERVER_NAME_LABEL` in the list of labels. "
"If this is a process-level metric (vs homeserver-level), use a type ignore comment "
"to disable this check.",
ctx.context,
code=PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL,
)
else:
ctx.api.fail(
f"Expected the `labelnames` argument of {signature.name} to be a list of label names "
f"(including `SERVER_NAME_LABEL`), but got {labelnames_arg_expression}. "
"If this is a process-level metric (vs homeserver-level), use a type ignore comment "
"to disable this check.",
ctx.context,
code=PROMETHEUS_METRIC_MISSING_SERVER_NAME_LABEL,
)
return signature
return signature
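The label check above relies on Python's `for`/`else`: the `else` branch runs only when the loop ends without `break`. A minimal sketch on plain strings instead of mypy expression nodes; the value `"server_name"` for `SERVER_NAME_LABEL` is an assumption made for illustration.

```python
from typing import Sequence

SERVER_NAME_LABEL = "server_name"  # assumed value, for illustration only


def has_server_name_label(labelnames: Sequence[str]) -> bool:
    for labelname in labelnames:
        if labelname == SERVER_NAME_LABEL:
            # Found it; `break` skips the `else` branch below.
            break
    else:
        # Only reached when the loop finished without hitting `break`,
        # which includes the case of an empty sequence.
        return False
    return True
```

The plugin reports a mypy error in the `else` branch where this sketch returns `False`.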
def _get_true_return_type(signature: CallableType) -> mypy.types.Type:
"""
Get the "final" return type of a callable which might return an Awaitable/Deferred.
@@ -372,10 +690,13 @@ def is_cacheable(
def plugin(version: str) -> Type[SynapsePlugin]:
global MypyPydanticPluginClass, MypyZopePluginClass
# This is the entry point of the plugin, and lets us deal with the fact
# that the mypy plugin interface is *not* stable by looking at the version
# string.
#
# However, since we pin the version of mypy Synapse uses in CI, we don't
# really care.
MypyPydanticPluginClass = mypy_pydantic_plugin(version)
MypyZopePluginClass = mypy_zope_plugin(version)
return SynapsePlugin
-10
@@ -45,16 +45,6 @@ if py_version < (3, 9):
# Allow using the asyncio reactor via env var.
if strtobool(os.environ.get("SYNAPSE_ASYNC_IO_REACTOR", "0")):
from incremental import Version
import twisted
# We need a bugfix that is included in Twisted 21.2.0:
# https://twistedmatrix.com/trac/ticket/9787
if twisted.version < Version("Twisted", 21, 2, 0):
print("Using asyncio reactor requires Twisted>=21.2.0")
sys.exit(1)
import asyncio
from twisted.internet import asyncioreactor
+6
@@ -34,9 +34,11 @@ HAS_PYDANTIC_V2: bool = Version(pydantic_version).major == 2
if TYPE_CHECKING or HAS_PYDANTIC_V2:
from pydantic.v1 import (
AnyHttpUrl,
BaseModel,
Extra,
Field,
FilePath,
MissingError,
PydanticValueError,
StrictBool,
@@ -55,9 +57,11 @@ if TYPE_CHECKING or HAS_PYDANTIC_V2:
from pydantic.v1.typing import get_args
else:
from pydantic import (
AnyHttpUrl,
BaseModel,
Extra,
Field,
FilePath,
MissingError,
PydanticValueError,
StrictBool,
@@ -77,6 +81,7 @@ else:
__all__ = (
"HAS_PYDANTIC_V2",
"AnyHttpUrl",
"BaseModel",
"constr",
"conbytes",
@@ -85,6 +90,7 @@ __all__ = (
"ErrorWrapper",
"Extra",
"Field",
"FilePath",
"get_args",
"MissingError",
"parse_obj_as",
+14 -3
@@ -29,19 +29,21 @@ import attr
from synapse.config._base import (
Config,
ConfigError,
RootConfig,
find_config_files,
read_config_files,
)
from synapse.config.database import DatabaseConfig
from synapse.config.server import ServerConfig
from synapse.storage.database import DatabasePool, LoggingTransaction, make_conn
from synapse.storage.engines import create_engine
class ReviewConfig(RootConfig):
"A config class that just pulls out the database config"
"A config class that just pulls out the server and database config"
config_classes = [DatabaseConfig]
config_classes = [ServerConfig, DatabaseConfig]
@attr.s(auto_attribs=True)
@@ -148,6 +150,10 @@ def main() -> None:
config_dict = read_config_files(config_files)
config.parse_config_dict(config_dict, "", "")
server_name = config.server.server_name
if not isinstance(server_name, str):
raise ConfigError("Must be a string", ("server_name",))
since_ms = time.time() * 1000 - Config.parse_duration(config_args.since)
exclude_users_with_email = config_args.exclude_emails
exclude_users_with_appservice = config_args.exclude_app_service
@@ -159,7 +165,12 @@ def main() -> None:
engine = create_engine(database_config.config)
with make_conn(database_config, engine, "review_recent_signups") as db_conn:
with make_conn(
db_config=database_config,
engine=engine,
default_txn_name="review_recent_signups",
server_name=server_name,
) as db_conn:
# This generates a type of Cursor, not LoggingTransaction.
user_infos = get_recent_users(
db_conn.cursor(),
+7 -1
@@ -672,8 +672,14 @@ class Porter:
engine = create_engine(db_config.config)
hs = MockHomeserver(self.hs_config)
server_name = hs.hostname
with make_conn(db_config, engine, "portdb") as db_conn:
with make_conn(
db_config=db_config,
engine=engine,
default_txn_name="portdb",
server_name=server_name,
) as db_conn:
engine.check_database(
db_conn, allow_outdated_version=allow_outdated_version
)
+6 -1
@@ -53,6 +53,7 @@ class MockHomeserver(HomeServer):
def run_background_updates(hs: HomeServer) -> None:
server_name = hs.hostname
main = hs.get_datastores().main
state = hs.get_datastores().state
@@ -66,7 +67,11 @@ def run_background_updates(hs: HomeServer) -> None:
def run() -> None:
# Apply all background updates on the database.
defer.ensureDeferred(
run_as_background_process("background_updates", run_background_updates)
run_as_background_process(
"background_updates",
server_name,
run_background_updates,
)
)
reactor.callWhenRunning(run)
+10
@@ -20,10 +20,13 @@
#
from typing import TYPE_CHECKING, Optional, Protocol, Tuple
from prometheus_client import Histogram
from twisted.web.server import Request
from synapse.appservice import ApplicationService
from synapse.http.site import SynapseRequest
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import Requester
if TYPE_CHECKING:
@@ -33,6 +36,13 @@ if TYPE_CHECKING:
GUEST_DEVICE_ID = "guest_device"
introspection_response_timer = Histogram(
"synapse_api_auth_delegated_introspection_response",
"Time taken to get a response for an introspection request",
labelnames=["code", SERVER_NAME_LABEL],
)
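Because the label name is held in a constant, call sites cannot write it as a literal keyword argument and unpack a one-entry dict instead. A stdlib-only sketch of that pattern; `labels` here is a hypothetical stand-in for a metric's `.labels(...)` method, and the value of `SERVER_NAME_LABEL` is assumed.

```python
from typing import Dict

SERVER_NAME_LABEL = "server_name"  # assumed value of the Synapse constant


def labels(**labelkwargs: str) -> Dict[str, str]:
    """Hypothetical stand-in for `.labels(...)`; returns what it was given."""
    return dict(labelkwargs)


# Writing `labels(code="200", SERVER_NAME_LABEL="example.org")` would create a
# label literally named "SERVER_NAME_LABEL", so the constant is used as a dict
# key and unpacked into keyword arguments instead.
selected = labels(code="200", **{SERVER_NAME_LABEL: "example.org"})
```

Keeping the name in one constant means a later rename only touches the constant, not every call site.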
class Auth(Protocol):
"""The interface that an auth provider must implement."""
+432
@@ -0,0 +1,432 @@
#
# This file is licensed under the Affero General Public License (AGPL) version 3.
#
# Copyright (C) 2025 New Vector, Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# See the GNU Affero General Public License for more details:
# <https://www.gnu.org/licenses/agpl-3.0.html>.
#
#
import logging
from typing import TYPE_CHECKING, Optional
from urllib.parse import urlencode
from synapse._pydantic_compat import (
BaseModel,
Extra,
StrictBool,
StrictInt,
StrictStr,
ValidationError,
)
from synapse.api.auth.base import BaseAuth
from synapse.api.errors import (
AuthError,
HttpResponseException,
InvalidClientTokenError,
SynapseError,
UnrecognizedRequestError,
)
from synapse.http.site import SynapseRequest
from synapse.logging.context import PreserveLoggingContext
from synapse.logging.opentracing import (
active_span,
force_tracing,
inject_request_headers,
start_active_span,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.synapse_rust.http_client import HttpClient
from synapse.types import JsonDict, Requester, UserID, create_requester
from synapse.util import json_decoder
from synapse.util.caches.cached_call import RetryOnExceptionCachedCall
from synapse.util.caches.response_cache import ResponseCache, ResponseCacheContext
from . import introspection_response_timer
if TYPE_CHECKING:
from synapse.rest.admin.experimental_features import ExperimentalFeature
from synapse.server import HomeServer
logger = logging.getLogger(__name__)
# Scope as defined by MSC2967
# https://github.com/matrix-org/matrix-spec-proposals/pull/2967
SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
class ServerMetadata(BaseModel):
class Config:
extra = Extra.allow
issuer: StrictStr
account_management_uri: StrictStr
class IntrospectionResponse(BaseModel):
retrieved_at_ms: StrictInt
active: StrictBool
scope: Optional[StrictStr]
username: Optional[StrictStr]
sub: Optional[StrictStr]
device_id: Optional[StrictStr]
expires_in: Optional[StrictInt]
class Config:
extra = Extra.allow
def get_scope_set(self) -> set[str]:
if not self.scope:
return set()
return {token for token in self.scope.split(" ") if token}
def is_active(self, now_ms: int) -> bool:
if not self.active:
return False
# Compatibility tokens don't expire and don't have an 'expires_in' field
if self.expires_in is None:
return True
absolute_expiry_ms = self.expires_in * 1000 + self.retrieved_at_ms
return now_ms < absolute_expiry_ms
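The expiry arithmetic in `is_active` can be checked with a small worked example. `Introspection` below is a simplified stand-in for `IntrospectionResponse` (a dataclass rather than a pydantic model, with only the fields the check needs).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Introspection:
    """Simplified stand-in for `IntrospectionResponse` (no validation)."""

    retrieved_at_ms: int
    active: bool
    expires_in: Optional[int] = None  # seconds, relative to retrieval time

    def is_active(self, now_ms: int) -> bool:
        if not self.active:
            return False
        if self.expires_in is None:
            # No expiry reported: treat the token as non-expiring.
            return True
        return now_ms < self.expires_in * 1000 + self.retrieved_at_ms


# Retrieved at t = 1_000_000 ms with a 300 s lifetime: expires at 1_300_000 ms.
token = Introspection(retrieved_at_ms=1_000_000, active=True, expires_in=300)
```

The comparison is strict, so the token is already inactive at exactly `1_300_000` ms.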
class MasDelegatedAuth(BaseAuth):
def __init__(self, hs: "HomeServer"):
super().__init__(hs)
self.server_name = hs.hostname
self._clock = hs.get_clock()
self._config = hs.config.mas
self._http_client = hs.get_proxied_http_client()
self._rust_http_client = HttpClient(
reactor=hs.get_reactor(),
user_agent=self._http_client.user_agent.decode("utf8"),
)
self._server_metadata = RetryOnExceptionCachedCall[ServerMetadata](
self._load_metadata
)
self._force_tracing_for_users = hs.config.tracing.force_tracing_for_users
# # Token Introspection Cache
# This remembers what users/devices are represented by which access tokens,
# in order to reduce overall system load:
# - on Synapse (as requests are relatively expensive)
# - on the network
# - on MAS
#
# Since there is no invalidation mechanism currently,
# the entries expire after 2 minutes.
# This does mean Synapse can keep treating a token as valid
# for up to 2 minutes after it has actually been invalidated.
#
# Ideally, tokens should logically be invalidated in the following circumstances:
# - If a session logout happens.
# In this case, MAS will delete the device within Synapse
# anyway and this is good enough as an invalidation.
# - If the client refreshes their token in MAS.
# In this case, the device still exists and it's not the end of the world for
# the old access token to continue working for a short time.
self._introspection_cache: ResponseCache[str] = ResponseCache(
clock=self._clock,
name="mas_token_introspection",
server_name=self.server_name,
timeout_ms=120_000,
# don't log because the keys are access tokens
enable_logging=False,
)
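The time-based expiry described in the comment above can be sketched as follows. `TtlCache` is a hypothetical, heavily simplified class, not Synapse's `ResponseCache` (which also de-duplicates in-flight requests); the clock is injected so that expiry can be exercised deterministically.

```python
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

V = TypeVar("V")


class TtlCache(Generic[V]):
    """Hypothetical, simplified cache whose entries expire after a timeout."""

    def __init__(self, now_ms: Callable[[], int], timeout_ms: int) -> None:
        self._now_ms = now_ms
        self._timeout_ms = timeout_ms
        self._entries: Dict[str, Tuple[int, V]] = {}

    def get(self, key: str) -> Optional[V]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._now_ms() - stored_at >= self._timeout_ms:
            # Expired: drop the entry so the caller introspects again.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: V) -> None:
        self._entries[key] = (self._now_ms(), value)

    def unset(self, key: str) -> None:
        # Explicit invalidation, e.g. when the device turns out to be deleted.
        self._entries.pop(key, None)


clock = {"now": 0}
cache: TtlCache[str] = TtlCache(now_ms=lambda: clock["now"], timeout_ms=120_000)
cache.set("token-a", "@alice:example.org")
hit = cache.get("token-a")
clock["now"] = 120_000
miss = cache.get("token-a")
```

In this sketch the first lookup returns the stored value and the lookup at the 120 000 ms mark returns `None`.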
@property
def _metadata_url(self) -> str:
return f"{self._config.endpoint.rstrip('/')}/.well-known/openid-configuration"
@property
def _introspection_endpoint(self) -> str:
return f"{self._config.endpoint.rstrip('/')}/oauth2/introspect"
async def _load_metadata(self) -> ServerMetadata:
response = await self._http_client.get_json(self._metadata_url)
metadata = ServerMetadata(**response)
return metadata
async def issuer(self) -> str:
metadata = await self._server_metadata.get()
return metadata.issuer
async def account_management_url(self) -> str:
metadata = await self._server_metadata.get()
return metadata.account_management_uri
async def auth_metadata(self) -> JsonDict:
metadata = await self._server_metadata.get()
return metadata.dict()
def is_request_using_the_shared_secret(self, request: SynapseRequest) -> bool:
"""
Check if the request is using the shared secret.
Args:
request: The request to check.
Returns:
True if the request is using the shared secret, False otherwise.
"""
access_token = self.get_access_token_from_request(request)
shared_secret = self._config.secret()
if not shared_secret:
return False
return access_token == shared_secret
async def _introspect_token(
self, token: str, cache_context: ResponseCacheContext[str]
) -> IntrospectionResponse:
"""
Send a token to the introspection endpoint and return the introspection response.
Parameters:
token: The token to introspect
Raises:
HttpResponseException: If the introspection endpoint returns a non-2xx response
ValueError: If the introspection endpoint returns an invalid JSON response
JSONDecodeError: If the introspection endpoint returns a non-JSON response
Exception: If the HTTP request fails
Returns:
The introspection response
"""
# By default, we shouldn't cache the result unless we know it's valid
cache_context.should_cache = False
raw_headers: dict[str, str] = {
"Content-Type": "application/x-www-form-urlencoded",
"Accept": "application/json",
"Authorization": f"Bearer {self._config.secret()}",
# Tell MAS that we support reading the device ID as an explicit
# value, not encoded in the scope. This is supported by MAS 0.15+
"X-MAS-Supports-Device-Id": "1",
}
args = {"token": token, "token_type_hint": "access_token"}
body = urlencode(args, True)
# Do the actual request
logger.debug("Fetching token from MAS")
start_time = self._clock.time()
try:
with start_active_span("mas-introspect-token"):
inject_request_headers(raw_headers)
with PreserveLoggingContext():
resp_body = await self._rust_http_client.post(
url=self._introspection_endpoint,
response_limit=1 * 1024 * 1024,
headers=raw_headers,
request_body=body,
)
except HttpResponseException as e:
end_time = self._clock.time()
introspection_response_timer.labels(
code=e.code, **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time - start_time)
raise
except Exception:
end_time = self._clock.time()
introspection_response_timer.labels(
code="ERR", **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time - start_time)
raise
logger.debug("Fetched token from MAS")
end_time = self._clock.time()
introspection_response_timer.labels(
code=200, **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time - start_time)
raw_response = json_decoder.decode(resp_body.decode("utf-8"))
try:
response = IntrospectionResponse(
retrieved_at_ms=self._clock.time_msec(),
**raw_response,
)
except ValidationError as e:
raise ValueError(
"The introspection endpoint returned an invalid JSON response"
) from e
# We had a valid response, so we can cache it
cache_context.should_cache = True
return response
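The form-encoded body sent to the introspection endpoint is built with `urllib.parse.urlencode`. A minimal sketch with a made-up token value, showing that characters such as `/` and `+` are escaped and survive a round trip:

```python
from urllib.parse import parse_qs, urlencode

# The token value is invented for illustration; real tokens are opaque strings.
args = {"token": "abc/123+x", "token_type_hint": "access_token"}

# The second positional argument is `doseq`; with plain string values it does
# not change the result, it only matters when a value is a sequence.
body = urlencode(args, True)

# Decoding the body again yields the original values (as one-element lists).
decoded = parse_qs(body)
```

This matches the `Content-Type: application/x-www-form-urlencoded` header set on the request.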
async def is_server_admin(self, requester: Requester) -> bool:
return "urn:synapse:admin:*" in requester.scope
async def get_user_by_req(
self,
request: SynapseRequest,
allow_guest: bool = False,
allow_expired: bool = False,
allow_locked: bool = False,
) -> Requester:
parent_span = active_span()
with start_active_span("get_user_by_req"):
access_token = self.get_access_token_from_request(request)
requester = await self.get_appservice_user(request, access_token)
if not requester:
requester = await self.get_user_by_access_token(
token=access_token,
allow_expired=allow_expired,
)
await self._record_request(request, requester)
request.requester = requester
if parent_span:
if requester.authenticated_entity in self._force_tracing_for_users:
# Request tracing is enabled for this user, so we need to force
# tracing on for the parent span (which will be the servlet span).
#
# It's too late for the get_user_by_req span to inherit the setting,
# so we also force it on for that.
force_tracing()
force_tracing(parent_span)
parent_span.set_tag(
"authenticated_entity", requester.authenticated_entity
)
parent_span.set_tag("user_id", requester.user.to_string())
if requester.device_id is not None:
parent_span.set_tag("device_id", requester.device_id)
if requester.app_service is not None:
parent_span.set_tag("appservice_id", requester.app_service.id)
return requester
async def get_user_by_access_token(
self,
token: str,
allow_expired: bool = False,
) -> Requester:
try:
introspection_result = await self._introspection_cache.wrap(
token, self._introspect_token, token, cache_context=True
)
except Exception:
logger.exception("Failed to introspect token")
raise SynapseError(503, "Unable to introspect the access token")
logger.debug("Introspection result: %r", introspection_result)
if not introspection_result.is_active(self._clock.time_msec()):
raise InvalidClientTokenError("Token is not active")
# Let's look at the scope
scope = introspection_result.get_scope_set()
# Determine type of user based on presence of particular scopes
if SCOPE_MATRIX_API not in scope:
raise InvalidClientTokenError(
"Token doesn't grant access to the Matrix C-S API"
)
if introspection_result.username is None:
raise AuthError(
500,
"Invalid username claim in the introspection result",
)
user_id = UserID(
localpart=introspection_result.username,
domain=self.server_name,
)
# Try to find a user from the username claim
user_info = await self.store.get_user_by_id(user_id=user_id.to_string())
if user_info is None:
raise AuthError(
500,
"User not found",
)
# MAS will give us the device ID as an explicit value for *compatibility* sessions
# If present, we get it from here, if not we get it in the scope for next-gen sessions
device_id = introspection_result.device_id
if device_id is None:
# Find device_ids in scope
# We only allow a single device_id in the scope, so we find them all in the
# scope list, and raise if there is more than one. The OIDC server should be
# the one enforcing valid scopes, so we raise a 500 if we find an invalid scope.
device_ids = [
tok[len(SCOPE_MATRIX_DEVICE_PREFIX) :]
for tok in scope
if tok.startswith(SCOPE_MATRIX_DEVICE_PREFIX)
]
if len(device_ids) > 1:
raise AuthError(
500,
"Multiple device IDs in scope",
)
device_id = device_ids[0] if device_ids else None
if device_id is not None:
# Sanity check the device_id
if len(device_id) > 255 or len(device_id) < 1:
raise AuthError(
500,
"Invalid device ID in introspection result",
)
# Make sure the device exists. This helps with introspection cache
# invalidation: if we log out, the device gets deleted by MAS
device = await self.store.get_device(
user_id=user_id.to_string(),
device_id=device_id,
)
if device is None:
# Invalidate the introspection cache, the device was deleted
self._introspection_cache.unset(token)
raise InvalidClientTokenError("Token is not active")
return create_requester(
user_id=user_id,
device_id=device_id,
scope=scope,
)
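The scope-based device ID lookup in `get_user_by_access_token` can be sketched in isolation. The helper name `device_id_from_scope` is an assumption, and it raises `ValueError` where the real method raises `AuthError(500, ...)`.

```python
from typing import Optional, Set

SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"


def device_id_from_scope(scope: Set[str]) -> Optional[str]:
    """Return the single device ID encoded in the scope, if there is one."""
    device_ids = [
        tok[len(SCOPE_MATRIX_DEVICE_PREFIX):]
        for tok in scope
        if tok.startswith(SCOPE_MATRIX_DEVICE_PREFIX)
    ]
    if len(device_ids) > 1:
        # Stand-in for the `AuthError(500, ...)` raised in the real code.
        raise ValueError("Multiple device IDs in scope")
    return device_ids[0] if device_ids else None


scope = set(
    "urn:matrix:org.matrix.msc2967.client:api:* "
    "urn:matrix:org.matrix.msc2967.client:device:ABCDEF".split(" ")
)
device_id = device_id_from_scope(scope)
```

A scope without any device token yields `None`, which the caller treats as a session with no device.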
async def get_user_by_req_experimental_feature(
self,
request: SynapseRequest,
feature: "ExperimentalFeature",
allow_guest: bool = False,
allow_expired: bool = False,
allow_locked: bool = False,
) -> Requester:
try:
requester = await self.get_user_by_req(
request,
allow_guest=allow_guest,
allow_expired=allow_expired,
allow_locked=allow_locked,
)
if await self.store.is_feature_enabled(requester.user.to_string(), feature):
return requester
raise UnrecognizedRequestError(code=404)
except (AuthError, InvalidClientTokenError):
if feature.is_globally_enabled(self.hs.config):
# If it's globally enabled then re-raise the auth error
raise
raise UnrecognizedRequestError(code=404)
+12 -11
@@ -28,7 +28,6 @@ from authlib.oauth2.auth import encode_client_secret_basic, encode_client_secret
from authlib.oauth2.rfc7523 import ClientSecretJWT, PrivateKeyJWT, private_key_jwt_sign
from authlib.oauth2.rfc7662 import IntrospectionToken
from authlib.oidc.discovery import OpenIDProviderMetadata, get_well_known_url
from prometheus_client import Histogram
from synapse.api.auth.base import BaseAuth
from synapse.api.errors import (
@@ -47,25 +46,21 @@ from synapse.logging.opentracing import (
inject_request_headers,
start_active_span,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.synapse_rust.http_client import HttpClient
from synapse.types import Requester, UserID, create_requester
from synapse.util import json_decoder
from synapse.util.caches.cached_call import RetryOnExceptionCachedCall
from synapse.util.caches.response_cache import ResponseCache, ResponseCacheContext
from . import introspection_response_timer
if TYPE_CHECKING:
from synapse.rest.admin.experimental_features import ExperimentalFeature
from synapse.server import HomeServer
logger = logging.getLogger(__name__)
introspection_response_timer = Histogram(
"synapse_api_auth_delegated_introspection_response",
"Time taken to get a response for an introspection request",
["code"],
)
# Scope as defined by MSC2967
# https://github.com/matrix-org/matrix-spec-proposals/pull/2967
SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
@@ -341,17 +336,23 @@ class MSC3861DelegatedAuth(BaseAuth):
)
except HttpResponseException as e:
end_time = self._clock.time()
introspection_response_timer.labels(e.code).observe(end_time - start_time)
introspection_response_timer.labels(
code=e.code, **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time - start_time)
raise
except Exception:
end_time = self._clock.time()
introspection_response_timer.labels("ERR").observe(end_time - start_time)
introspection_response_timer.labels(
code="ERR", **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time - start_time)
raise
logger.debug("Fetched token from MAS")
end_time = self._clock.time()
introspection_response_timer.labels(200).observe(end_time - start_time)
introspection_response_timer.labels(
code=200, **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time - start_time)
resp = json_decoder.decode(resp_body.decode("utf-8"))
+6
@@ -140,6 +140,12 @@ class Codes(str, Enum):
# Part of MSC4155
INVITE_BLOCKED = "ORG.MATRIX.MSC4155.M_INVITE_BLOCKED"
# Part of MSC4306: Thread Subscriptions
MSC4306_CONFLICTING_UNSUBSCRIPTION = (
"IO.ELEMENT.MSC4306.M_CONFLICTING_UNSUBSCRIPTION"
)
MSC4306_NOT_IN_THREAD = "IO.ELEMENT.MSC4306.M_NOT_IN_THREAD"
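Since `Codes` mixes in `str`, its members compare equal to plain strings, which is convenient when matching an `errcode` from a JSON body. A minimal sketch using a one-member copy of the enum:

```python
from enum import Enum


class Codes(str, Enum):
    # One-member copy of the enum above, for illustration only.
    MSC4306_NOT_IN_THREAD = "IO.ELEMENT.MSC4306.M_NOT_IN_THREAD"


# A member of a `str`-mixin enum compares equal to its underlying string.
matches_plain_string = (
    Codes.MSC4306_NOT_IN_THREAD == "IO.ELEMENT.MSC4306.M_NOT_IN_THREAD"
)

# Looking a member up by value returns the same singleton member.
looked_up = Codes("IO.ELEMENT.MSC4306.M_NOT_IN_THREAD")
```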
class CodeMessageException(RuntimeError):
"""An exception with an integer code, a message string and optional headers as attributes.
+24 -11
@@ -75,7 +75,7 @@ from synapse.http.site import SynapseSite
from synapse.logging.context import PreserveLoggingContext
from synapse.logging.opentracing import init_tracer
from synapse.metrics import install_gc_manager, register_threadpool
from synapse.metrics.background_process_metrics import wrap_as_background_process
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.metrics.jemalloc import setup_jemalloc_stats
from synapse.module_api.callbacks.spamchecker_callbacks import load_legacy_spam_checkers
from synapse.module_api.callbacks.third_party_event_rules_callbacks import (
@@ -512,6 +512,7 @@ async def start(hs: "HomeServer") -> None:
Args:
hs: homeserver instance
"""
server_name = hs.hostname
reactor = hs.get_reactor()
# We want to use a separate thread pool for the resolver so that large
@@ -524,22 +525,34 @@ async def start(hs: "HomeServer") -> None:
)
# Register the threadpools with our metrics.
register_threadpool("default", reactor.getThreadPool())
register_threadpool("gai_resolver", resolver_threadpool)
register_threadpool(
name="default", server_name=server_name, threadpool=reactor.getThreadPool()
)
register_threadpool(
name="gai_resolver", server_name=server_name, threadpool=resolver_threadpool
)
# Set up the SIGHUP machinery.
if hasattr(signal, "SIGHUP"):
@wrap_as_background_process("sighup")
async def handle_sighup(*args: Any, **kwargs: Any) -> None:
# Tell systemd our state, if we're using it. This will silently fail if
# we're not using systemd.
sdnotify(b"RELOADING=1")
def handle_sighup(*args: Any, **kwargs: Any) -> "defer.Deferred[None]":
async def _handle_sighup(*args: Any, **kwargs: Any) -> None:
# Tell systemd our state, if we're using it. This will silently fail if
# we're not using systemd.
sdnotify(b"RELOADING=1")
for i, args, kwargs in _sighup_callbacks:
i(*args, **kwargs)
for i, args, kwargs in _sighup_callbacks:
i(*args, **kwargs)
sdnotify(b"READY=1")
sdnotify(b"READY=1")
return run_as_background_process(
"sighup",
server_name,
_handle_sighup,
*args,
**kwargs,
)
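The rewritten SIGHUP handler swaps a decorator for a plain function that wraps an inner coroutine, presumably so that `server_name` can be passed at call time. A sketch of that shape using `asyncio` instead of Twisted; the `run_as_background_process` below is a hypothetical stand-in that only mirrors the argument order seen in this diff.

```python
import asyncio
from typing import Any, Awaitable, Callable, List

events: List[str] = []


def run_as_background_process(
    desc: str,
    server_name: str,
    func: Callable[..., Awaitable[None]],
    *args: Any,
    **kwargs: Any,
) -> "asyncio.Future[None]":
    """Hypothetical stand-in: records the labels, then schedules the coroutine."""
    events.append(f"{desc}@{server_name}")
    return asyncio.ensure_future(func(*args, **kwargs))


def make_sighup_handler(server_name: str) -> Callable[..., "asyncio.Future[None]"]:
    def handle_sighup(*args: Any, **kwargs: Any) -> "asyncio.Future[None]":
        async def _handle_sighup(*args: Any, **kwargs: Any) -> None:
            events.append(f"reloading {args}")

        # `server_name` is captured from the enclosing scope.
        return run_as_background_process(
            "sighup", server_name, _handle_sighup, *args, **kwargs
        )

    return handle_sighup


async def main() -> None:
    handler = make_sighup_handler("example.org")
    await handler(1)


asyncio.run(main())
```

The outer function stays synchronous and returns an awaitable, which mirrors the `defer.Deferred[None]` return type in the diff.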
# We defer running the sighup handlers until next reactor tick. This
# is so that we're in a sane state, e.g. flushing the logs may fail
+162 -123
@@ -26,7 +26,12 @@ from typing import TYPE_CHECKING, List, Mapping, Sized, Tuple
from prometheus_client import Gauge
from synapse.metrics.background_process_metrics import wrap_as_background_process
from twisted.internet import defer
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import (
run_as_background_process,
)
from synapse.types import JsonDict
from synapse.util.constants import ONE_HOUR_SECONDS, ONE_MINUTE_SECONDS
@@ -53,138 +58,158 @@ Phone home stats are sent every 3 hours
_stats_process: List[Tuple[int, "resource.struct_rusage"]] = []
# Gauges to expose monthly active user control metrics
current_mau_gauge = Gauge("synapse_admin_mau_current", "Current MAU")
current_mau_gauge = Gauge(
"synapse_admin_mau_current",
"Current MAU",
labelnames=[SERVER_NAME_LABEL],
)
current_mau_by_service_gauge = Gauge(
"synapse_admin_mau_current_mau_by_service",
"Current MAU by service",
["app_service"],
labelnames=["app_service", SERVER_NAME_LABEL],
)
max_mau_gauge = Gauge(
"synapse_admin_mau_max",
"MAU Limit",
labelnames=[SERVER_NAME_LABEL],
)
max_mau_gauge = Gauge("synapse_admin_mau_max", "MAU Limit")
registered_reserved_users_mau_gauge = Gauge(
"synapse_admin_mau_registered_reserved_users",
"Registered users with reserved threepids",
labelnames=[SERVER_NAME_LABEL],
)
@wrap_as_background_process("phone_stats_home")
async def phone_stats_home(
def phone_stats_home(
hs: "HomeServer",
stats: JsonDict,
stats_process: List[Tuple[int, "resource.struct_rusage"]] = _stats_process,
) -> None:
"""Collect usage statistics and send them to the configured endpoint.
) -> "defer.Deferred[None]":
server_name = hs.hostname
Args:
hs: the HomeServer object to use for gathering usage data.
stats: the dict in which to store the statistics sent to the configured
endpoint. Mostly used in tests to figure out the data that is supposed to
be sent.
stats_process: statistics about resource usage of the process.
"""
async def _phone_stats_home(
hs: "HomeServer",
stats: JsonDict,
stats_process: List[Tuple[int, "resource.struct_rusage"]] = _stats_process,
) -> None:
"""Collect usage statistics and send them to the configured endpoint.
logger.info("Gathering stats for reporting")
now = int(hs.get_clock().time())
# Ensure the homeserver has started.
assert hs.start_time is not None
uptime = int(now - hs.start_time)
if uptime < 0:
uptime = 0
Args:
hs: the HomeServer object to use for gathering usage data.
stats: the dict in which to store the statistics sent to the configured
endpoint. Mostly used in tests to figure out the data that is supposed to
be sent.
stats_process: statistics about resource usage of the process.
"""
#
# Performance statistics. Keep this early in the function to maintain reliability of `test_performance_100` test.
#
old = stats_process[0]
new = (now, resource.getrusage(resource.RUSAGE_SELF))
stats_process[0] = new
logger.info("Gathering stats for reporting")
now = int(hs.get_clock().time())
# Ensure the homeserver has started.
assert hs.start_time is not None
uptime = int(now - hs.start_time)
if uptime < 0:
uptime = 0
# Get RSS in bytes
stats["memory_rss"] = new[1].ru_maxrss
#
# Performance statistics. Keep this early in the function to maintain reliability of `test_performance_100` test.
#
old = stats_process[0]
new = (now, resource.getrusage(resource.RUSAGE_SELF))
stats_process[0] = new
# Get CPU time in % of a single core, not % of all cores
used_cpu_time = (new[1].ru_utime + new[1].ru_stime) - (
old[1].ru_utime + old[1].ru_stime
)
if used_cpu_time == 0 or new[0] == old[0]:
stats["cpu_average"] = 0
else:
stats["cpu_average"] = math.floor(used_cpu_time / (new[0] - old[0]) * 100)
# Get RSS in bytes
stats["memory_rss"] = new[1].ru_maxrss
#
# General statistics
#
store = hs.get_datastores().main
common_metrics = await hs.get_common_usage_metrics_manager().get_metrics()
stats["homeserver"] = hs.config.server.server_name
stats["server_context"] = hs.config.server.server_context
stats["timestamp"] = now
stats["uptime_seconds"] = uptime
version = sys.version_info
stats["python_version"] = "{}.{}.{}".format(
version.major, version.minor, version.micro
)
stats["total_users"] = await store.count_all_users()
total_nonbridged_users = await store.count_nonbridged_users()
stats["total_nonbridged_users"] = total_nonbridged_users
daily_user_type_results = await store.count_daily_user_type()
for name, count in daily_user_type_results.items():
stats["daily_user_type_" + name] = count
room_count = await store.get_room_count()
stats["total_room_count"] = room_count
stats["daily_active_users"] = common_metrics.daily_active_users
stats["monthly_active_users"] = await store.count_monthly_users()
daily_active_e2ee_rooms = await store.count_daily_active_e2ee_rooms()
stats["daily_active_e2ee_rooms"] = daily_active_e2ee_rooms
stats["daily_e2ee_messages"] = await store.count_daily_e2ee_messages()
daily_sent_e2ee_messages = await store.count_daily_sent_e2ee_messages()
stats["daily_sent_e2ee_messages"] = daily_sent_e2ee_messages
stats["daily_active_rooms"] = await store.count_daily_active_rooms()
stats["daily_messages"] = await store.count_daily_messages()
daily_sent_messages = await store.count_daily_sent_messages()
stats["daily_sent_messages"] = daily_sent_messages
r30v2_results = await store.count_r30v2_users()
for name, count in r30v2_results.items():
stats["r30v2_users_" + name] = count
stats["cache_factor"] = hs.config.caches.global_factor
stats["event_cache_size"] = hs.config.caches.event_cache_size
#
# Database version
#
# This only reports info about the *main* database.
stats["database_engine"] = store.db_pool.engine.module.__name__
stats["database_server_version"] = store.db_pool.engine.server_version
#
# Logging configuration
#
synapse_logger = logging.getLogger("synapse")
log_level = synapse_logger.getEffectiveLevel()
stats["log_level"] = logging.getLevelName(log_level)
logger.info(
"Reporting stats to %s: %s", hs.config.metrics.report_stats_endpoint, stats
)
try:
await hs.get_proxied_http_client().put_json(
hs.config.metrics.report_stats_endpoint, stats
# Get CPU time in % of a single core, not % of all cores
used_cpu_time = (new[1].ru_utime + new[1].ru_stime) - (
old[1].ru_utime + old[1].ru_stime
)
except Exception as e:
logger.warning("Error reporting stats: %s", e)
if used_cpu_time == 0 or new[0] == old[0]:
stats["cpu_average"] = 0
else:
stats["cpu_average"] = math.floor(used_cpu_time / (new[0] - old[0]) * 100)
#
# General statistics
#
store = hs.get_datastores().main
common_metrics = await hs.get_common_usage_metrics_manager().get_metrics()
stats["homeserver"] = hs.config.server.server_name
stats["server_context"] = hs.config.server.server_context
stats["timestamp"] = now
stats["uptime_seconds"] = uptime
version = sys.version_info
stats["python_version"] = "{}.{}.{}".format(
version.major, version.minor, version.micro
)
stats["total_users"] = await store.count_all_users()
total_nonbridged_users = await store.count_nonbridged_users()
stats["total_nonbridged_users"] = total_nonbridged_users
daily_user_type_results = await store.count_daily_user_type()
for name, count in daily_user_type_results.items():
stats["daily_user_type_" + name] = count
room_count = await store.get_room_count()
stats["total_room_count"] = room_count
stats["daily_active_users"] = common_metrics.daily_active_users
stats["monthly_active_users"] = await store.count_monthly_users()
daily_active_e2ee_rooms = await store.count_daily_active_e2ee_rooms()
stats["daily_active_e2ee_rooms"] = daily_active_e2ee_rooms
stats["daily_e2ee_messages"] = await store.count_daily_e2ee_messages()
daily_sent_e2ee_messages = await store.count_daily_sent_e2ee_messages()
stats["daily_sent_e2ee_messages"] = daily_sent_e2ee_messages
stats["daily_active_rooms"] = await store.count_daily_active_rooms()
stats["daily_messages"] = await store.count_daily_messages()
daily_sent_messages = await store.count_daily_sent_messages()
stats["daily_sent_messages"] = daily_sent_messages
r30v2_results = await store.count_r30v2_users()
for name, count in r30v2_results.items():
stats["r30v2_users_" + name] = count
stats["cache_factor"] = hs.config.caches.global_factor
stats["event_cache_size"] = hs.config.caches.event_cache_size
#
# Database version
#
# This only reports info about the *main* database.
stats["database_engine"] = store.db_pool.engine.module.__name__
stats["database_server_version"] = store.db_pool.engine.server_version
#
# Logging configuration
#
synapse_logger = logging.getLogger("synapse")
log_level = synapse_logger.getEffectiveLevel()
stats["log_level"] = logging.getLevelName(log_level)
logger.info(
"Reporting stats to %s: %s", hs.config.metrics.report_stats_endpoint, stats
)
try:
await hs.get_proxied_http_client().put_json(
hs.config.metrics.report_stats_endpoint, stats
)
except Exception as e:
logger.warning("Error reporting stats: %s", e)
return run_as_background_process(
"phone_stats_home", server_name, _phone_stats_home, hs, stats, stats_process
)
def start_phone_stats_home(hs: "HomeServer") -> None:
"""
Start the background tasks which report phone home stats.
"""
server_name = hs.hostname
clock = hs.get_clock()
stats: JsonDict = {}
@@ -210,25 +235,39 @@ def start_phone_stats_home(hs: "HomeServer") -> None:
)
hs.get_datastores().main.reap_monthly_active_users()
@wrap_as_background_process("generate_monthly_active_users")
async def generate_monthly_active_users() -> None:
current_mau_count = 0
current_mau_count_by_service: Mapping[str, int] = {}
reserved_users: Sized = ()
store = hs.get_datastores().main
if hs.config.server.limit_usage_by_mau or hs.config.server.mau_stats_only:
current_mau_count = await store.get_monthly_active_count()
current_mau_count_by_service = (
await store.get_monthly_active_count_by_service()
def generate_monthly_active_users() -> "defer.Deferred[None]":
async def _generate_monthly_active_users() -> None:
current_mau_count = 0
current_mau_count_by_service: Mapping[str, int] = {}
reserved_users: Sized = ()
store = hs.get_datastores().main
if hs.config.server.limit_usage_by_mau or hs.config.server.mau_stats_only:
current_mau_count = await store.get_monthly_active_count()
current_mau_count_by_service = (
await store.get_monthly_active_count_by_service()
)
reserved_users = await store.get_registered_reserved_users()
current_mau_gauge.labels(**{SERVER_NAME_LABEL: server_name}).set(
float(current_mau_count)
)
reserved_users = await store.get_registered_reserved_users()
current_mau_gauge.set(float(current_mau_count))
for app_service, count in current_mau_count_by_service.items():
current_mau_by_service_gauge.labels(app_service).set(float(count))
for app_service, count in current_mau_count_by_service.items():
current_mau_by_service_gauge.labels(
app_service=app_service, **{SERVER_NAME_LABEL: server_name}
).set(float(count))
registered_reserved_users_mau_gauge.set(float(len(reserved_users)))
max_mau_gauge.set(float(hs.config.server.max_mau_value))
registered_reserved_users_mau_gauge.labels(
**{SERVER_NAME_LABEL: server_name}
).set(float(len(reserved_users)))
max_mau_gauge.labels(**{SERVER_NAME_LABEL: server_name}).set(
float(hs.config.server.max_mau_value)
)
return run_as_background_process(
"generate_monthly_active_users",
server_name,
_generate_monthly_active_users,
)
if hs.config.server.limit_usage_by_mau or hs.config.server.mau_stats_only:
generate_monthly_active_users()
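The `cpu_average` figure gathered in the stats function above is a percentage of a single core: CPU seconds (user plus system) consumed between two samples, divided by the wall-clock seconds between them. The following is a minimal standalone sketch of that arithmetic, assuming samples are `(timestamp, rusage)` pairs as in the diff. The `Usage` tuple and the `cpu_average` function name are illustrative stand-ins, not Synapse APIs.

```python
import math
from typing import NamedTuple, Tuple


class Usage(NamedTuple):
    """Hypothetical stand-in for the two `resource.getrusage` fields used."""

    ru_utime: float  # user-mode CPU seconds
    ru_stime: float  # kernel-mode CPU seconds


Sample = Tuple[int, Usage]


def cpu_average(old: Sample, new: Sample) -> int:
    """Return CPU use between two samples as a percentage of one core."""
    used_cpu_time = (new[1].ru_utime + new[1].ru_stime) - (
        old[1].ru_utime + old[1].ru_stime
    )
    # Guard against a zero-length interval, which would divide by zero.
    if used_cpu_time == 0 or new[0] == old[0]:
        return 0
    return math.floor(used_cpu_time / (new[0] - old[0]) * 100)
```

Because the result is relative to one core, a multi-threaded process can report more than 100.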
+14 -10
@@ -48,6 +48,7 @@ from synapse.events import EventBase
from synapse.events.utils import SerializeEventConfig, serialize_event
from synapse.http.client import SimpleHttpClient, is_unknown_endpoint
from synapse.logging import opentracing
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import DeviceListUpdates, JsonDict, JsonMapping, ThirdPartyInstanceID
from synapse.util.caches.response_cache import ResponseCache
@@ -59,29 +60,31 @@ logger = logging.getLogger(__name__)
sent_transactions_counter = Counter(
"synapse_appservice_api_sent_transactions",
"Number of /transactions/ requests sent",
["service"],
labelnames=["service", SERVER_NAME_LABEL],
)
failed_transactions_counter = Counter(
"synapse_appservice_api_failed_transactions",
"Number of /transactions/ requests that failed to send",
["service"],
labelnames=["service", SERVER_NAME_LABEL],
)
sent_events_counter = Counter(
"synapse_appservice_api_sent_events", "Number of events sent to the AS", ["service"]
"synapse_appservice_api_sent_events",
"Number of events sent to the AS",
labelnames=["service", SERVER_NAME_LABEL],
)
sent_ephemeral_counter = Counter(
"synapse_appservice_api_sent_ephemeral",
"Number of ephemeral events sent to the AS",
["service"],
labelnames=["service", SERVER_NAME_LABEL],
)
sent_todevice_counter = Counter(
"synapse_appservice_api_sent_todevice",
"Number of todevice messages sent to the AS",
["service"],
labelnames=["service", SERVER_NAME_LABEL],
)
HOUR_IN_MS = 60 * 60 * 1000
@@ -382,6 +385,7 @@ class ApplicationServiceApi(SimpleHttpClient):
"left": list(device_list_summary.left),
}
labels = {"service": service.id, SERVER_NAME_LABEL: self.server_name}
try:
args = None
if self.config.use_appservice_legacy_authorization:
@@ -399,10 +403,10 @@ class ApplicationServiceApi(SimpleHttpClient):
service.url,
[event.get("event_id") for event in events],
)
sent_transactions_counter.labels(service.id).inc()
sent_events_counter.labels(service.id).inc(len(serialized_events))
sent_ephemeral_counter.labels(service.id).inc(len(ephemeral))
sent_todevice_counter.labels(service.id).inc(len(to_device_messages))
sent_transactions_counter.labels(**labels).inc()
sent_events_counter.labels(**labels).inc(len(serialized_events))
sent_ephemeral_counter.labels(**labels).inc(len(ephemeral))
sent_todevice_counter.labels(**labels).inc(len(to_device_messages))
return True
except CodeMessageException as e:
logger.warning(
@@ -421,7 +425,7 @@ class ApplicationServiceApi(SimpleHttpClient):
ex.args,
exc_info=logger.isEnabledFor(logging.DEBUG),
)
failed_transactions_counter.labels(service.id).inc()
failed_transactions_counter.labels(**labels).inc()
return False
async def claim_client_keys(
+36 -23
@@ -103,18 +103,16 @@ MAX_TO_DEVICE_MESSAGES_PER_TRANSACTION = 100
class ApplicationServiceScheduler:
"""Public facing API for this module. Does the required DI to tie the
components together. This also serves as the "event_pool", which in this
"""
Public facing API for this module. Does the required dependency injection (DI) to
tie the components together. This also serves as the "event_pool", which in this
case is a simple array.
"""
def __init__(self, hs: "HomeServer"):
self.clock = hs.get_clock()
self.txn_ctrl = _TransactionController(hs)
self.store = hs.get_datastores().main
self.as_api = hs.get_application_service_api()
self.txn_ctrl = _TransactionController(self.clock, self.store, self.as_api)
self.queuer = _ServiceQueuer(self.txn_ctrl, self.clock, hs)
self.queuer = _ServiceQueuer(self.txn_ctrl, hs)
async def start(self) -> None:
logger.info("Starting appservice scheduler")
@@ -184,9 +182,7 @@ class _ServiceQueuer:
appservice at a given time.
"""
def __init__(
self, txn_ctrl: "_TransactionController", clock: Clock, hs: "HomeServer"
):
def __init__(self, txn_ctrl: "_TransactionController", hs: "HomeServer"):
# dict of {service_id: [events]}
self.queued_events: Dict[str, List[EventBase]] = {}
# dict of {service_id: [events]}
@@ -199,10 +195,11 @@ class _ServiceQueuer:
# the appservices which currently have a transaction in flight
self.requests_in_flight: Set[str] = set()
self.txn_ctrl = txn_ctrl
self.clock = clock
self._msc3202_transaction_extensions_enabled: bool = (
hs.config.experimental.msc3202_transaction_extensions
)
self.server_name = hs.hostname
self.clock = hs.get_clock()
self._store = hs.get_datastores().main
def start_background_request(self, service: ApplicationService) -> None:
@@ -210,7 +207,9 @@ class _ServiceQueuer:
if service.id in self.requests_in_flight:
return
run_as_background_process("as-sender", self._send_request, service)
run_as_background_process(
"as-sender", self.server_name, self._send_request, service
)
async def _send_request(self, service: ApplicationService) -> None:
# sanity-check: we shouldn't get here if this service already has a sender
@@ -359,10 +358,11 @@ class _TransactionController:
(Note we only have one of these in the homeserver.)
"""
def __init__(self, clock: Clock, store: DataStore, as_api: ApplicationServiceApi):
self.clock = clock
self.store = store
self.as_api = as_api
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.clock = hs.get_clock()
self.store = hs.get_datastores().main
self.as_api = hs.get_application_service_api()
# map from service id to recoverer instance
self.recoverers: Dict[str, "_Recoverer"] = {}
@@ -446,7 +446,12 @@ class _TransactionController:
logger.info("Starting recoverer for AS ID %s", service.id)
assert service.id not in self.recoverers
recoverer = self.RECOVERER_CLASS(
self.clock, self.store, self.as_api, service, self.on_recovered
self.server_name,
self.clock,
self.store,
self.as_api,
service,
self.on_recovered,
)
self.recoverers[service.id] = recoverer
recoverer.recover()
@@ -477,21 +482,24 @@ class _Recoverer:
We have one of these for each appservice which is currently considered DOWN.
Args:
clock (synapse.util.Clock):
store (synapse.storage.DataStore):
as_api (synapse.appservice.api.ApplicationServiceApi):
service (synapse.appservice.ApplicationService): the service we are managing
callback (callable[_Recoverer]): called once the service recovers.
server_name: the homeserver name (used to label metrics) (this should be `hs.hostname`).
clock:
store:
as_api:
service: the service we are managing
callback: called once the service recovers.
"""
def __init__(
self,
server_name: str,
clock: Clock,
store: DataStore,
as_api: ApplicationServiceApi,
service: ApplicationService,
callback: Callable[["_Recoverer"], Awaitable[None]],
):
self.server_name = server_name
self.clock = clock
self.store = store
self.as_api = as_api
@@ -504,7 +512,11 @@ class _Recoverer:
delay = 2**self.backoff_counter
logger.info("Scheduling retries on %s in %fs", self.service.id, delay)
self.scheduled_recovery = self.clock.call_later(
delay, run_as_background_process, "as-recoverer", self.retry
delay,
run_as_background_process,
"as-recoverer",
self.server_name,
self.retry,
)
def _backoff(self) -> None:
@@ -525,6 +537,7 @@ class _Recoverer:
# Run a retry, which will reschedule a recovery if it fails.
run_as_background_process(
"retry",
self.server_name,
self.retry,
)
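The recoverer above schedules each retry after `2**self.backoff_counter` seconds. As a rough illustration of how quickly that grows, here is a small sketch. It assumes the counter starts at 1 and increases by one per failed attempt; neither the starting value nor any upper cap is shown in this diff, and `recovery_delays` is a hypothetical helper name.

```python
from typing import List


def recovery_delays(attempts: int, start_counter: int = 1) -> List[int]:
    """Delays in seconds for successive retries, doubling each time.

    Assumption: the counter is incremented by one after every failure.
    """
    return [
        2**counter for counter in range(start_counter, start_counter + attempts)
    ]
```

Under these assumptions the tenth retry would wait 1024 seconds, which is why an upper bound on the counter matters in practice.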
+2
@@ -36,6 +36,7 @@ from synapse.config import ( # noqa: F401
jwt,
key,
logger,
mas,
metrics,
modules,
oembed,
@@ -124,6 +125,7 @@ class RootConfig:
background_updates: background_updates.BackgroundUpdateConfig
auto_accept_invites: auto_accept_invites.AutoAcceptInvitesConfig
user_types: user_types.UserTypesConfig
mas: mas.MasConfig
config_classes: List[Type["Config"]] = ...
config_files: List[str]
+8 -7
@@ -36,13 +36,14 @@ class AuthConfig(Config):
if password_config is None:
password_config = {}
# The default value of password_config.enabled is True, unless msc3861 is enabled.
msc3861_enabled = (
(config.get("experimental_features") or {})
.get("msc3861", {})
.get("enabled", False)
)
passwords_enabled = password_config.get("enabled", not msc3861_enabled)
auth_delegated = (config.get("experimental_features") or {}).get(
"msc3861", {}
).get("enabled", False) or (
config.get("matrix_authentication_service") or {}
).get("enabled", False)
# The default value of password_config.enabled is True, unless auth is delegated
passwords_enabled = password_config.get("enabled", not auth_delegated)
# 'only_for_reauth' allows users who have previously set a password to use it,
# even though passwords would otherwise be disabled.
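The default for `password_config.enabled` now depends on whether authentication is delegated through either the experimental MSC3861 option or the new `matrix_authentication_service` section. A minimal sketch of that lookup on a plain config dictionary follows; the function names are illustrative and not part of Synapse.

```python
from typing import Any, Dict


def is_auth_delegated(config: Dict[str, Any]) -> bool:
    """True if either MSC3861 or Matrix Authentication Service is enabled."""
    msc3861_enabled = (
        (config.get("experimental_features") or {})
        .get("msc3861", {})
        .get("enabled", False)
    )
    mas_enabled = (config.get("matrix_authentication_service") or {}).get(
        "enabled", False
    )
    return bool(msc3861_enabled or mas_enabled)


def passwords_enabled_setting(config: Dict[str, Any]) -> Any:
    """Explicit `password_config.enabled`, else True unless auth is delegated."""
    password_config = config.get("password_config") or {}
    return password_config.get("enabled", not is_auth_delegated(config))
```

An explicit `enabled: true` still wins over the default here; with delegation enabled, such a combination is what the conflict checks in `MasConfig` reject.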
+3
@@ -582,6 +582,9 @@ class ExperimentalConfig(Config):
# MSC4155: Invite filtering
self.msc4155_enabled: bool = experimental.get("msc4155_enabled", False)
# MSC4293: Redact on Kick/Ban
self.msc4293_enabled: bool = experimental.get("msc4293_enabled", False)
# MSC4306: Thread Subscriptions
# (and MSC4308: sliding sync extension for thread subscriptions)
self.msc4306_enabled: bool = experimental.get("msc4306_enabled", False)
+3
@@ -36,6 +36,7 @@ from .federation import FederationConfig
from .jwt import JWTConfig
from .key import KeyConfig
from .logger import LoggingConfig
from .mas import MasConfig
from .metrics import MetricsConfig
from .modules import ModulesConfig
from .oembed import OembedConfig
@@ -109,4 +110,6 @@ class HomeServerConfig(RootConfig):
BackgroundUpdateConfig,
AutoAcceptInvitesConfig,
UserTypesConfig,
# This must be last, as it checks for conflicts with other config options.
MasConfig,
]
+192
@@ -0,0 +1,192 @@
#
# This file is licensed under the Affero General Public License (AGPL) version 3.
#
# Copyright (C) 2025 New Vector, Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# See the GNU Affero General Public License for more details:
# <https://www.gnu.org/licenses/agpl-3.0.html>.
#
#
from typing import Any, Optional
from synapse._pydantic_compat import (
AnyHttpUrl,
Field,
FilePath,
StrictBool,
StrictStr,
ValidationError,
validator,
)
from synapse.config.experimental import read_secret_from_file_once
from synapse.types import JsonDict
from synapse.util.pydantic_models import ParseModel
from ._base import Config, ConfigError, RootConfig
class MasConfigModel(ParseModel):
enabled: StrictBool = False
endpoint: AnyHttpUrl = Field(default="http://localhost:8080")
secret: Optional[StrictStr] = Field(default=None)
secret_path: Optional[FilePath] = Field(default=None)
@validator("secret")
def validate_secret_is_set_if_enabled(cls, v: Any, values: dict) -> Any:
if values.get("enabled", False) and not values.get("secret_path") and not v:
raise ValueError(
"You must set a `secret` or `secret_path` when enabling Matrix Authentication Service integration."
)
return v
@validator("secret_path")
def validate_secret_path_is_set_if_enabled(cls, v: Any, values: dict) -> Any:
if values.get("secret"):
raise ValueError(
"`secret` and `secret_path` cannot be set at the same time."
)
return v
class MasConfig(Config):
section = "mas"
def read_config(
self, config: JsonDict, allow_secrets_in_config: bool, **kwargs: Any
) -> None:
mas_config = config.get("matrix_authentication_service", {})
if mas_config is None:
mas_config = {}
try:
parsed = MasConfigModel(**mas_config)
except ValidationError as e:
raise ConfigError(
"Could not validate Matrix Authentication Service configuration",
path=("matrix_authentication_service",),
) from e
if parsed.secret and not allow_secrets_in_config:
raise ConfigError(
"Config options that expect an in-line secret as value are disabled",
("matrix_authentication_service", "secret"),
)
self.enabled = parsed.enabled
self.endpoint = parsed.endpoint
self._secret = parsed.secret
self._secret_path = parsed.secret_path
self.check_config_conflicts(self.root)
def check_config_conflicts(
self,
root: RootConfig,
) -> None:
"""Checks for any configuration conflicts with other parts of Synapse.
Raises:
ConfigError: If there are any configuration conflicts.
"""
if not self.enabled:
return
if root.experimental.msc3861.enabled:
raise ConfigError(
"Experimental MSC3861 was replaced by Matrix Authentication Service. "
"Please disable MSC3861 or disable Matrix Authentication Service.",
("experimental", "msc3861"),
)
if (
root.auth.password_enabled_for_reauth
or root.auth.password_enabled_for_login
):
raise ConfigError(
"Password auth cannot be enabled when OAuth delegation is enabled",
("password_config", "enabled"),
)
if root.registration.enable_registration:
raise ConfigError(
"Registration cannot be enabled when OAuth delegation is enabled",
("enable_registration",),
)
# We only need to test the user consent version, as it must be set if the user_consent section is present in the config
if root.consent.user_consent_version is not None:
raise ConfigError(
"User consent cannot be enabled when OAuth delegation is enabled",
("user_consent",),
)
if (
root.oidc.oidc_enabled
or root.saml2.saml2_enabled
or root.cas.cas_enabled
or root.jwt.jwt_enabled
):
raise ConfigError("SSO cannot be enabled when OAuth delegation is enabled")
if bool(root.authproviders.password_providers):
raise ConfigError(
"Password auth providers cannot be enabled when OAuth delegation is enabled"
)
if root.captcha.enable_registration_captcha:
raise ConfigError(
"CAPTCHA cannot be enabled when OAuth delegation is enabled",
("captcha", "enable_registration_captcha"),
)
if root.auth.login_via_existing_enabled:
raise ConfigError(
"Login via existing session cannot be enabled when OAuth delegation is enabled",
("login_via_existing_session", "enabled"),
)
if root.registration.refresh_token_lifetime:
raise ConfigError(
"refresh_token_lifetime cannot be set when OAuth delegation is enabled",
("refresh_token_lifetime",),
)
if root.registration.nonrefreshable_access_token_lifetime:
raise ConfigError(
"nonrefreshable_access_token_lifetime cannot be set when OAuth delegation is enabled",
("nonrefreshable_access_token_lifetime",),
)
if root.registration.session_lifetime:
raise ConfigError(
"session_lifetime cannot be set when OAuth delegation is enabled",
("session_lifetime",),
)
if root.registration.enable_3pid_changes:
raise ConfigError(
"enable_3pid_changes cannot be enabled when OAuth delegation is enabled",
("enable_3pid_changes",),
)
def secret(self) -> str:
if self._secret is not None:
return self._secret
elif self._secret_path is not None:
return read_secret_from_file_once(
str(self._secret_path),
("matrix_authentication_service", "secret_path"),
)
else:
raise RuntimeError(
"Neither `secret` nor `secret_path` are set, this is a bug.",
)
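The two error messages in `MasConfigModel` describe a simple rule: when the integration is enabled, exactly one of `secret` and `secret_path` should be provided. The sketch below states that intended rule in plain Python. It deliberately does not reproduce the pydantic validator mechanics, and `check_mas_secret` is a hypothetical helper name.

```python
from typing import Optional


def check_mas_secret(
    enabled: bool, secret: Optional[str], secret_path: Optional[str]
) -> None:
    """Raise ValueError if the secret settings are inconsistent."""
    if secret and secret_path:
        raise ValueError(
            "`secret` and `secret_path` cannot be set at the same time."
        )
    if enabled and not secret and not secret_path:
        raise ValueError(
            "You must set a `secret` or `secret_path` when enabling "
            "Matrix Authentication Service integration."
        )
```

Keeping the check in one place makes the rule independent of the order in which the individual fields are validated.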
+6
@@ -241,6 +241,12 @@ class RatelimitConfig(Config):
defaults={"per_second": 1, "burst_count": 5},
)
self.rc_room_creation = RatelimitSettings.parse(
config,
"rc_room_creation",
defaults={"per_second": 0.016, "burst_count": 10},
)
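For a sense of scale, and assuming `per_second` is the sustained rate while `burst_count` is the number of actions allowed in quick succession (the usual reading of Synapse ratelimit settings, not something this diff spells out), the `rc_room_creation` defaults work out roughly as follows:

```python
per_second = 0.016
burst_count = 10

# Sustained rate: roughly one room creation every 62.5 seconds.
seconds_per_action = 1 / per_second

# Time for a fully used burst allowance to recover: roughly 625 seconds.
seconds_to_recover_burst = burst_count / per_second
```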
self.rc_reports = RatelimitSettings.parse(
config,
"rc_reports",
+7 -8
@@ -148,15 +148,14 @@ class RegistrationConfig(Config):
self.enable_set_displayname = config.get("enable_set_displayname", True)
self.enable_set_avatar_url = config.get("enable_set_avatar_url", True)
auth_delegated = (config.get("experimental_features") or {}).get(
"msc3861", {}
).get("enabled", False) or (
config.get("matrix_authentication_service") or {}
).get("enabled", False)
# The default value of enable_3pid_changes is True, unless msc3861 is enabled.
msc3861_enabled = (
(config.get("experimental_features") or {})
.get("msc3861", {})
.get("enabled", False)
)
self.enable_3pid_changes = config.get(
"enable_3pid_changes", not msc3861_enabled
)
self.enable_3pid_changes = config.get("enable_3pid_changes", not auth_delegated)
self.disable_msisdn_registration = config.get(
"disable_msisdn_registration", False
+16 -9
@@ -22,11 +22,10 @@
import logging
import os
from typing import Any, Dict, List, Tuple
from urllib.request import getproxies_environment
import attr
from synapse.config.server import generate_ip_set
from synapse.config.server import generate_ip_set, parse_proxy_config
from synapse.types import JsonDict
from synapse.util.check_dependencies import check_requirements
from synapse.util.module_loader import load_module
@@ -61,7 +60,7 @@ THUMBNAIL_SUPPORTED_MEDIA_FORMAT_MAP = {
"image/png": "png",
}
HTTP_PROXY_SET_WARNING = """\
URL_PREVIEW_BLACKLIST_IGNORED_BECAUSE_HTTP_PROXY_SET_WARNING = """\
The Synapse config url_preview_ip_range_blacklist will be ignored as an HTTP(s) proxy is configured."""
@@ -234,17 +233,25 @@ class ContentRepositoryConfig(Config):
if self.url_preview_enabled:
check_requirements("url-preview")
proxy_env = getproxies_environment()
if "url_preview_ip_range_blacklist" not in config:
if "http" not in proxy_env or "https" not in proxy_env:
proxy_config = parse_proxy_config(config)
is_proxy_configured = (
proxy_config.http_proxy is not None
or proxy_config.https_proxy is not None
)
if "url_preview_ip_range_blacklist" in config:
if is_proxy_configured:
logger.warning(
"".join(
URL_PREVIEW_BLACKLIST_IGNORED_BECAUSE_HTTP_PROXY_SET_WARNING
)
)
else:
if not is_proxy_configured:
raise ConfigError(
"For security, you must specify an explicit target IP address "
"blacklist in url_preview_ip_range_blacklist for url previewing "
"to work"
)
else:
if "http" in proxy_env or "https" in proxy_env:
logger.warning("".join(HTTP_PROXY_SET_WARNING))
# we always block '0.0.0.0' and '::', which are supposed to be
# unroutable addresses.
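The reworked check above has four cases, depending on whether `url_preview_ip_range_blacklist` is present and whether a proxy is configured. A small sketch of the resulting decision table, using a hypothetical function that returns a label instead of logging or raising:

```python
def url_preview_blacklist_outcome(
    blacklist_in_config: bool, proxy_configured: bool
) -> str:
    """Return "ok", "warn" (blacklist ignored) or "error" (config rejected)."""
    if blacklist_in_config:
        # With a proxy, the blacklist is ignored, so a warning is logged.
        return "warn" if proxy_configured else "ok"
    # Without a blacklist, URL previews are only accepted behind a proxy.
    return "ok" if proxy_configured else "error"
```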
+125 -1
@@ -25,11 +25,13 @@ import logging
import os.path
import urllib.parse
from textwrap import indent
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple, Union
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple, TypedDict, Union
from urllib.request import getproxies_environment
import attr
import yaml
from netaddr import AddrFormatError, IPNetwork, IPSet
from typing_extensions import TypeGuard
from twisted.conch.ssh.keys import Key
@@ -43,6 +45,21 @@ from ._util import validate_config
logger = logging.getLogger(__name__)
# Directly from the mypy docs:
# https://typing.python.org/en/latest/spec/narrowing.html#typeguard
def is_str_list(val: Any, allow_empty: bool) -> TypeGuard[list[str]]:
"""
Type-narrow a value to a list of strings (compatible with mypy).
"""
if not isinstance(val, list):
return False
if len(val) == 0:
return allow_empty
return all(isinstance(x, str) for x in val)
DIRECT_TCP_ERROR = """
Using direct TCP replication for workers is no longer supported.
@@ -291,6 +308,102 @@ class LimitRemoteRoomsConfig:
)
class ProxyConfigDictionary(TypedDict):
"""
Dictionary of proxy settings suitable for interacting with `urllib.request` APIs
"""
http: Optional[str]
"""
Proxy server to use for HTTP requests.
"""
https: Optional[str]
"""
Proxy server to use for HTTPS requests.
"""
no: str
"""
Comma-separated list of hosts, IP addresses, or IP ranges in CIDR format which
should not use the proxy.
Empty string means no hosts should be excluded from the proxy.
"""
@attr.s(slots=True, frozen=True, auto_attribs=True)
class ProxyConfig:
"""
Synapse configuration for HTTP proxy settings.
"""
http_proxy: Optional[str]
"""
Proxy server to use for HTTP requests.
"""
https_proxy: Optional[str]
"""
Proxy server to use for HTTPS requests.
"""
no_proxy_hosts: Optional[List[str]]
"""
List of hosts, IP addresses, or IP ranges in CIDR format which should not use the
proxy. Synapse will directly connect to these hosts.
"""
def get_proxies_dictionary(self) -> ProxyConfigDictionary:
"""
Returns a dictionary of proxy settings suitable for interacting with
`urllib.request` APIs (e.g. `urllib.request.proxy_bypass_environment`)
The keys are `"http"`, `"https"`, and `"no"`.
"""
return ProxyConfigDictionary(
http=self.http_proxy,
https=self.https_proxy,
no=",".join(self.no_proxy_hosts) if self.no_proxy_hosts else "",
)
def parse_proxy_config(config: JsonDict) -> ProxyConfig:
"""
Figure out forward proxy config for outgoing HTTP requests.
Prefer values from the given config over the environment variables (`http_proxy`,
`https_proxy`, `no_proxy`, not case-sensitive).
Args:
config: The top-level homeserver configuration dictionary.
"""
proxies_from_env = getproxies_environment()
http_proxy = config.get("http_proxy", proxies_from_env.get("http"))
if http_proxy is not None and not isinstance(http_proxy, str):
raise ConfigError("'http_proxy' must be a string", ("http_proxy",))
https_proxy = config.get("https_proxy", proxies_from_env.get("https"))
if https_proxy is not None and not isinstance(https_proxy, str):
raise ConfigError("'https_proxy' must be a string", ("https_proxy",))
# List of hosts which should not use the proxy. Synapse will directly connect to
# these hosts.
no_proxy_hosts = config.get("no_proxy_hosts")
# The `no_proxy` environment variable should be a comma-separated list of hosts,
# IP addresses, or IP ranges in CIDR format
no_proxy_from_env = proxies_from_env.get("no")
if no_proxy_hosts is None and no_proxy_from_env is not None:
no_proxy_hosts = no_proxy_from_env.split(",")
if no_proxy_hosts is not None and not is_str_list(no_proxy_hosts, allow_empty=True):
raise ConfigError(
"'no_proxy_hosts' must be a list of strings", ("no_proxy_hosts",)
)
return ProxyConfig(
http_proxy=http_proxy,
https_proxy=https_proxy,
no_proxy_hosts=no_proxy_hosts,
)
class ServerConfig(Config):
section = "server"
@@ -718,6 +831,17 @@ class ServerConfig(Config):
)
)
# Figure out forward proxy config for outgoing HTTP requests.
#
# Prefer values from the file config over the environment variables
self.proxy_config = parse_proxy_config(config)
logger.debug(
"Using proxy settings: http_proxy=%s, https_proxy=%s, no_proxy=%s",
self.proxy_config.http_proxy,
self.proxy_config.https_proxy,
self.proxy_config.no_proxy_hosts,
)
self.cleanup_extremities_with_dummy_events = config.get(
"cleanup_extremities_with_dummy_events", True
)
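The precedence rule in `parse_proxy_config` (values from the config file win, environment variables are the fallback) can be sketched on its own with only the standard library. This is a simplified illustration that omits the type validation. `resolve_proxy_settings` and the plain-dict return value are assumptions for the example, not Synapse's API.

```python
from typing import Any, Dict
from urllib.request import getproxies_environment


def resolve_proxy_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Prefer config values; fall back to `*_proxy` environment variables."""
    env = getproxies_environment()

    http_proxy = config.get("http_proxy", env.get("http"))
    https_proxy = config.get("https_proxy", env.get("https"))

    no_proxy_hosts = config.get("no_proxy_hosts")
    no_proxy_from_env = env.get("no")
    if no_proxy_hosts is None and no_proxy_from_env is not None:
        # The environment variable is a comma-separated string.
        no_proxy_hosts = no_proxy_from_env.split(",")

    return {
        "http_proxy": http_proxy,
        "https_proxy": https_proxy,
        "no_proxy_hosts": no_proxy_hosts,
    }
```

Note that a key that is present in the config with a null value still counts as set for `http_proxy` and `https_proxy`, so it suppresses the environment fallback for that key.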
+9 -2
@@ -152,6 +152,8 @@ class Keyring:
def __init__(
self, hs: "HomeServer", key_fetchers: "Optional[Iterable[KeyFetcher]]" = None
):
self.server_name = hs.hostname
if key_fetchers is None:
# Always fetch keys from the database.
mutable_key_fetchers: List[KeyFetcher] = [StoreKeyFetcher(hs)]
@@ -169,7 +171,8 @@ class Keyring:
self._fetch_keys_queue: BatchingQueue[
_FetchKeyRequest, Dict[str, Dict[str, FetchKeyResult]]
] = BatchingQueue(
"keyring_server",
name="keyring_server",
server_name=self.server_name,
clock=hs.get_clock(),
# The method called to fetch each key
process_batch_callback=self._inner_fetch_key_requests,
@@ -473,8 +476,12 @@ class Keyring:
class KeyFetcher(metaclass=abc.ABCMeta):
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self._queue = BatchingQueue(
self.__class__.__name__, hs.get_clock(), self._fetch_keys
name=self.__class__.__name__,
server_name=self.server_name,
clock=hs.get_clock(),
process_batch_callback=self._fetch_keys,
)
async def get_keys(
+1
@@ -34,6 +34,7 @@ class InviteAutoAccepter:
def __init__(self, config: AutoAcceptInvitesConfig, api: ModuleApi):
# Keep a reference to the Module API.
self._api = api
self.server_name = api.server_name
self._config = config
if not self._config.enabled:
+5 -2
@@ -545,8 +545,11 @@ def serialize_event(
d["content"] = dict(d["content"])
d["content"]["redacts"] = e.redacts
if config.include_admin_metadata and e.internal_metadata.is_soft_failed():
d["unsigned"]["io.element.synapse.soft_failed"] = True
if config.include_admin_metadata:
if e.internal_metadata.is_soft_failed():
d["unsigned"]["io.element.synapse.soft_failed"] = True
if e.internal_metadata.policy_server_spammy:
d["unsigned"]["io.element.synapse.policy_server_spammy"] = True
only_event_fields = config.only_event_fields
if only_event_fields:
+1
@@ -174,6 +174,7 @@ class FederationBase:
"Event not allowed by policy server, soft-failing %s", pdu.event_id
)
pdu.internal_metadata.soft_failed = True
pdu.internal_metadata.policy_server_spammy = True
# Note: we don't redact the event so admins can inspect the event after the
# fact. Other processes may redact the event, but that won't be applied to
# the database copy of the event until the server's config requires it.
+20 -5
@@ -74,6 +74,7 @@ from synapse.federation.transport.client import SendJoinResponse
from synapse.http.client import is_unknown_endpoint
from synapse.http.types import QueryParams
from synapse.logging.opentracing import SynapseTags, log_kv, set_tag, tag_args, trace
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import JsonDict, StrCollection, UserID, get_domain_from_id
from synapse.types.handlers.policy_server import RECOMMENDATION_OK, RECOMMENDATION_SPAM
from synapse.util.async_helpers import concurrently_execute
@@ -85,7 +86,9 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
sent_queries_counter = Counter("synapse_federation_client_sent_queries", "", ["type"])
sent_queries_counter = Counter(
"synapse_federation_client_sent_queries", "", labelnames=["type", SERVER_NAME_LABEL]
)
PDU_RETRY_TIME_MS = 1 * 60 * 1000
@@ -209,7 +212,10 @@ class FederationClient(FederationBase):
Returns:
The JSON object from the response
"""
sent_queries_counter.labels(query_type).inc()
sent_queries_counter.labels(
type=query_type,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
return await self.transport_layer.make_query(
destination,
@@ -231,7 +237,10 @@ class FederationClient(FederationBase):
Returns:
The JSON object from the response
"""
sent_queries_counter.labels("client_device_keys").inc()
sent_queries_counter.labels(
type="client_device_keys",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
return await self.transport_layer.query_client_keys(
destination, content, timeout
)
@@ -242,7 +251,10 @@ class FederationClient(FederationBase):
"""Query the device keys for a list of user ids hosted on a remote
server.
"""
sent_queries_counter.labels("user_devices").inc()
sent_queries_counter.labels(
type="user_devices",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
return await self.transport_layer.query_user_devices(
destination, user_id, timeout
)
@@ -264,7 +276,10 @@ class FederationClient(FederationBase):
Returns:
The JSON object from the response
"""
sent_queries_counter.labels("client_one_time_keys").inc()
sent_queries_counter.labels(
type="client_one_time_keys",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
# Convert the query with counts into a stable and unstable query and check
# if attempting to claim more than 1 OTK.
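The repeated `.labels(type=..., **{SERVER_NAME_LABEL: self.server_name})` calls above pass one label by its literal name and the other through dict unpacking, because the second label's name is held in a constant. The sketch below shows why that spelling is needed. It uses a toy `LabelledCounter` written for this example, not the real metrics library, and the value given to `SERVER_NAME_LABEL` here is an assumption.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

# Assumption: the real constant is imported from synapse.metrics; this value
# is a guess made for the sake of a runnable example.
SERVER_NAME_LABEL = "server_name"


class _Child:
    """One labelled time series of the toy counter."""

    def __init__(self, values: Dict[Tuple[str, ...], float], key: Tuple[str, ...]) -> None:
        self._values = values
        self._key = key

    def inc(self, amount: float = 1) -> None:
        self._values[self._key] += amount


class LabelledCounter:
    """Toy stand-in for a labelled counter (hypothetical, for illustration)."""

    def __init__(self, name: str, labelnames: Sequence[str]) -> None:
        self.name = name
        self.labelnames = tuple(labelnames)
        self.values: Dict[Tuple[str, ...], float] = defaultdict(float)

    def labels(self, **labels: str) -> _Child:
        # Require exactly the declared label names, passed as keywords.
        if set(labels) != set(self.labelnames):
            raise ValueError(f"expected {self.labelnames}, got {tuple(labels)}")
        return _Child(self.values, tuple(labels[n] for n in self.labelnames))


sent_queries = LabelledCounter("sent_queries", labelnames=["type", SERVER_NAME_LABEL])

# Writing `SERVER_NAME_LABEL="example.org"` would create a keyword literally
# named "SERVER_NAME_LABEL", so the constant is expanded via `**{...}` instead.
sent_queries.labels(type="user_devices", **{SERVER_NAME_LABEL: "example.org"}).inc()
sent_queries.labels(type="user_devices", **{SERVER_NAME_LABEL: "example.org"}).inc()
```

Using keywords for every label, rather than positional values, also means a call site cannot silently swap two label values if the order of `labelnames` changes.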
+26 -11
@@ -82,6 +82,7 @@ from synapse.logging.opentracing import (
tag_args,
trace,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import wrap_as_background_process
from synapse.replication.http.federation import (
ReplicationFederationSendEduRestServlet,
@@ -104,23 +105,30 @@ TRANSACTION_CONCURRENCY_LIMIT = 10
logger = logging.getLogger(__name__)
received_pdus_counter = Counter("synapse_federation_server_received_pdus", "")
received_pdus_counter = Counter(
"synapse_federation_server_received_pdus", "", labelnames=[SERVER_NAME_LABEL]
)
received_edus_counter = Counter("synapse_federation_server_received_edus", "")
received_edus_counter = Counter(
"synapse_federation_server_received_edus", "", labelnames=[SERVER_NAME_LABEL]
)
received_queries_counter = Counter(
"synapse_federation_server_received_queries", "", ["type"]
"synapse_federation_server_received_queries",
"",
labelnames=["type", SERVER_NAME_LABEL],
)
pdu_process_time = Histogram(
"synapse_federation_server_pdu_process_time",
"Time taken to process an event",
labelnames=[SERVER_NAME_LABEL],
)
last_pdu_ts_metric = Gauge(
"synapse_federation_last_received_pdu_time",
"The timestamp of the last PDU which was successfully received from the given domain",
labelnames=("server_name",),
labelnames=("origin_server_name", SERVER_NAME_LABEL),
)
@@ -434,7 +442,9 @@ class FederationServer(FederationBase):
report back to the sending server.
"""
received_pdus_counter.inc(len(transaction.pdus))
received_pdus_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc(
len(transaction.pdus)
)
origin_host, _ = parse_server_name(origin)
@@ -545,7 +555,9 @@ class FederationServer(FederationBase):
)
if newest_pdu_ts and origin in self._federation_metrics_domains:
last_pdu_ts_metric.labels(server_name=origin).set(newest_pdu_ts / 1000)
last_pdu_ts_metric.labels(
origin_server_name=origin, **{SERVER_NAME_LABEL: self.server_name}
).set(newest_pdu_ts / 1000)
return pdu_results
@@ -553,7 +565,7 @@ class FederationServer(FederationBase):
"""Process the EDUs in a received transaction."""
async def _process_edu(edu_dict: JsonDict) -> None:
received_edus_counter.inc()
received_edus_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc()
edu = Edu(
origin=origin,
@@ -668,7 +680,10 @@ class FederationServer(FederationBase):
async def on_query_request(
self, query_type: str, args: Dict[str, str]
) -> Tuple[int, Dict[str, Any]]:
received_queries_counter.labels(query_type).inc()
received_queries_counter.labels(
type=query_type,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
resp = await self.registry.on_query(query_type, args)
return 200, resp
@@ -1310,9 +1325,9 @@ class FederationServer(FederationBase):
origin, event.event_id
)
if received_ts is not None:
pdu_process_time.observe(
(self._clock.time_msec() - received_ts) / 1000
)
pdu_process_time.labels(
**{SERVER_NAME_LABEL: self.server_name}
).observe((self._clock.time_msec() - received_ts) / 1000)
next = await self._get_next_nonspam_staged_event_for_room(
room_id, room_version
+5 -5
@@ -54,7 +54,7 @@ from sortedcontainers import SortedDict
from synapse.api.presence import UserPresenceState
from synapse.federation.sender import AbstractFederationSender, FederationSender
from synapse.metrics import LaterGauge
from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
from synapse.replication.tcp.streams.federation import FederationStream
from synapse.types import JsonDict, ReadReceipt, RoomStreamToken, StrCollection
from synapse.util.metrics import Measure
@@ -113,10 +113,10 @@ class FederationRemoteSendQueue(AbstractFederationSender):
# changes. ARGH.
def register(name: str, queue: Sized) -> None:
LaterGauge(
"synapse_federation_send_queue_%s_size" % (queue_name,),
"",
[],
lambda: len(queue),
name="synapse_federation_send_queue_%s_size" % (queue_name,),
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {(self.server_name,): len(queue)},
)
for queue_name in [
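In the `LaterGauge` change above, the callback no longer returns a bare number. It returns a mapping from a tuple of label values to the measurement, with the tuple ordered like `labelnames`. The sketch below shows one way such a result could be turned into labelled samples. `collect_samples` is a hypothetical helper written for this example and is not how `LaterGauge` is actually implemented.

```python
from typing import Callable, Dict, List, Sequence, Tuple, Union

Number = Union[int, float]
GaugeResult = Union[Number, Dict[Tuple[str, ...], Number]]


def collect_samples(
    labelnames: Sequence[str], caller: Callable[[], GaugeResult]
) -> List[Tuple[Dict[str, str], float]]:
    """Evaluate `caller` lazily and pair each value with its label set."""
    result = caller()
    if isinstance(result, dict):
        # Each key is a tuple of label values, in the same order as `labelnames`.
        return [
            (dict(zip(labelnames, key)), float(value))
            for key, value in result.items()
        ]
    # Unlabelled form: a single bare number.
    return [({}, float(result))]


queue = ["a", "b", "c"]
samples = collect_samples(
    ["server_name"],
    lambda: {("example.org",): len(queue)},
)
```

Because the callback is only evaluated at collection time, the reported size reflects the queue as it is when metrics are scraped, not when the gauge was registered.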
+63 -33
@@ -160,6 +160,7 @@ from synapse.federation.sender.transaction_manager import TransactionManager
from synapse.federation.units import Edu
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.metrics import (
SERVER_NAME_LABEL,
LaterGauge,
event_processing_loop_counter,
event_processing_loop_room_count,
@@ -189,11 +190,13 @@ logger = logging.getLogger(__name__)
sent_pdus_destination_dist_count = Counter(
"synapse_federation_client_sent_pdu_destinations_count",
"Number of PDUs queued for sending to one or more destinations",
labelnames=[SERVER_NAME_LABEL],
)
sent_pdus_destination_dist_total = Counter(
"synapse_federation_client_sent_pdu_destinations",
"Total number of PDUs queued for sending across all destinations",
labelnames=[SERVER_NAME_LABEL],
)
# Time (in s) to wait before trying to wake up destinations that have
@@ -296,6 +299,7 @@ class _DestinationWakeupQueue:
Staggers waking up of per destination queues to ensure that we don't attempt
to start TLS connections with many hosts all at once, leading to pinned CPU.
"""
# The maximum duration in seconds between queuing up a destination and it
@@ -303,6 +307,10 @@ class _DestinationWakeupQueue:
_MAX_TIME_IN_QUEUE = 30.0
sender: "FederationSender" = attr.ib()
server_name: str = attr.ib()
"""
Our homeserver name (used to label metrics) (`hs.hostname`).
"""
clock: Clock = attr.ib()
max_delay_s: int = attr.ib()
@@ -391,31 +399,37 @@ class FederationSender(AbstractFederationSender):
self._per_destination_queues: Dict[str, PerDestinationQueue] = {}
LaterGauge(
"synapse_federation_transaction_queue_pending_destinations",
"",
[],
lambda: sum(
1
for d in self._per_destination_queues.values()
if d.transmission_loop_running
),
name="synapse_federation_transaction_queue_pending_destinations",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {
(self.server_name,): sum(
1
for d in self._per_destination_queues.values()
if d.transmission_loop_running
)
},
)
LaterGauge(
"synapse_federation_transaction_queue_pending_pdus",
"",
[],
lambda: sum(
d.pending_pdu_count() for d in self._per_destination_queues.values()
),
name="synapse_federation_transaction_queue_pending_pdus",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {
(self.server_name,): sum(
d.pending_pdu_count() for d in self._per_destination_queues.values()
)
},
)
LaterGauge(
"synapse_federation_transaction_queue_pending_edus",
"",
[],
lambda: sum(
d.pending_edu_count() for d in self._per_destination_queues.values()
),
name="synapse_federation_transaction_queue_pending_edus",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {
(self.server_name,): sum(
d.pending_edu_count() for d in self._per_destination_queues.values()
)
},
)
self._is_processing = False
@@ -427,7 +441,7 @@ class FederationSender(AbstractFederationSender):
1.0 / hs.config.ratelimiting.federation_rr_transactions_per_room_per_second
)
self._destination_wakeup_queue = _DestinationWakeupQueue(
self, self.clock, max_delay_s=rr_txn_interval_per_room_s
self, self.server_name, self.clock, max_delay_s=rr_txn_interval_per_room_s
)
# Regularly wake up destinations that have outstanding PDUs to be caught up
@@ -435,6 +449,7 @@ class FederationSender(AbstractFederationSender):
run_as_background_process,
WAKEUP_RETRY_PERIOD_SEC * 1000.0,
"wake_destinations_needing_catchup",
self.server_name,
self._wake_destinations_needing_catchup,
)
@@ -477,7 +492,9 @@ class FederationSender(AbstractFederationSender):
# fire off a processing loop in the background
run_as_background_process(
"process_event_queue_for_federation", self._process_event_queue_loop
"process_event_queue_for_federation",
self.server_name,
self._process_event_queue_loop,
)
async def _process_event_queue_loop(self) -> None:
@@ -650,7 +667,8 @@ class FederationSender(AbstractFederationSender):
ts = event_to_received_ts[event.event_id]
assert ts is not None
synapse.metrics.event_processing_lag_by_event.labels(
"federation_sender"
name="federation_sender",
**{SERVER_NAME_LABEL: self.server_name},
).observe((now - ts) / 1000)
async def handle_room_events(events: List[EventBase]) -> None:
@@ -694,22 +712,30 @@ class FederationSender(AbstractFederationSender):
assert ts is not None
synapse.metrics.event_processing_lag.labels(
"federation_sender"
name="federation_sender",
**{SERVER_NAME_LABEL: self.server_name},
).set(now - ts)
synapse.metrics.event_processing_last_ts.labels(
"federation_sender"
name="federation_sender",
**{SERVER_NAME_LABEL: self.server_name},
).set(ts)
events_processed_counter.inc(len(event_entries))
events_processed_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(event_entries))
event_processing_loop_room_count.labels("federation_sender").inc(
len(events_by_room)
)
event_processing_loop_room_count.labels(
name="federation_sender",
**{SERVER_NAME_LABEL: self.server_name},
).inc(len(events_by_room))
event_processing_loop_counter.labels("federation_sender").inc()
event_processing_loop_counter.labels(
name="federation_sender",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
synapse.metrics.event_processing_positions.labels(
"federation_sender"
name="federation_sender", **{SERVER_NAME_LABEL: self.server_name}
).set(next_token)
finally:
@@ -727,8 +753,12 @@ class FederationSender(AbstractFederationSender):
if not destinations:
return
sent_pdus_destination_dist_total.inc(len(destinations))
sent_pdus_destination_dist_count.inc()
sent_pdus_destination_dist_total.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(destinations))
sent_pdus_destination_dist_count.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc()
assert pdu.internal_metadata.stream_ordering
@@ -40,7 +40,7 @@ from synapse.federation.units import Edu
from synapse.handlers.presence import format_user_presence_state
from synapse.logging import issue9533_logger
from synapse.logging.opentracing import SynapseTags, set_tag
from synapse.metrics import sent_transactions_counter
from synapse.metrics import SERVER_NAME_LABEL, sent_transactions_counter
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.types import JsonDict, ReadReceipt
from synapse.util.retryutils import NotRetryingDestination, get_retry_limiter
@@ -56,13 +56,15 @@ logger = logging.getLogger(__name__)
sent_edus_counter = Counter(
"synapse_federation_client_sent_edus", "Total number of EDUs successfully sent"
"synapse_federation_client_sent_edus",
"Total number of EDUs successfully sent",
labelnames=[SERVER_NAME_LABEL],
)
sent_edus_by_type = Counter(
"synapse_federation_client_sent_edus_by_type",
"Number of sent EDUs successfully sent, by event type",
["type"],
labelnames=["type", SERVER_NAME_LABEL],
)
@@ -91,7 +93,7 @@ class PerDestinationQueue:
transaction_manager: "synapse.federation.sender.TransactionManager",
destination: str,
):
self._server_name = hs.hostname
self.server_name = hs.hostname
self._clock = hs.get_clock()
self._storage_controllers = hs.get_storage_controllers()
self._store = hs.get_datastores().main
@@ -311,6 +313,7 @@ class PerDestinationQueue:
run_as_background_process(
"federation_transaction_transmission_loop",
self.server_name,
self._transaction_transmission_loop,
)
@@ -322,7 +325,12 @@ class PerDestinationQueue:
# This will throw if we wouldn't retry. We do this here so we fail
# quickly, but we will later check this again in the http client,
# hence why we throw the result away.
await get_retry_limiter(self._destination, self._clock, self._store)
await get_retry_limiter(
destination=self._destination,
our_server_name=self.server_name,
clock=self._clock,
store=self._store,
)
if self._catching_up:
# we potentially need to catch-up first
@@ -362,10 +370,17 @@ class PerDestinationQueue:
self._destination, pending_pdus, pending_edus
)
sent_transactions_counter.inc()
sent_edus_counter.inc(len(pending_edus))
sent_transactions_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc()
sent_edus_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(pending_edus))
for edu in pending_edus:
sent_edus_by_type.labels(edu.edu_type).inc()
sent_edus_by_type.labels(
type=edu.edu_type,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
except NotRetryingDestination as e:
logger.debug(
@@ -566,7 +581,7 @@ class PerDestinationQueue:
new_pdus = await filter_events_for_server(
self._storage_controllers,
self._destination,
self._server_name,
self.server_name,
new_pdus,
redact=False,
filter_out_erased_senders=True,
@@ -590,7 +605,9 @@ class PerDestinationQueue:
self._destination, room_catchup_pdus, []
)
sent_transactions_counter.inc()
sent_transactions_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc()
# We pulled this from the DB, so it'll be non-null
assert pdu.internal_metadata.stream_ordering
@@ -613,7 +630,7 @@ class PerDestinationQueue:
# Send at most limit EDUs for receipts.
for content in self._pending_receipt_edus[:limit]:
yield Edu(
origin=self._server_name,
origin=self.server_name,
destination=self._destination,
edu_type=EduTypes.RECEIPT,
content=content,
@@ -639,7 +656,7 @@ class PerDestinationQueue:
)
edus = [
Edu(
origin=self._server_name,
origin=self.server_name,
destination=self._destination,
edu_type=edu_type,
content=content,
@@ -666,7 +683,7 @@ class PerDestinationQueue:
edus = [
Edu(
origin=self._server_name,
origin=self.server_name,
destination=self._destination,
edu_type=EduTypes.DIRECT_TO_DEVICE,
content=content,
@@ -739,7 +756,7 @@ class _TransactionQueueManager:
pending_edus.append(
Edu(
origin=self.queue._server_name,
origin=self.queue.server_name,
destination=self.queue._destination,
edu_type=EduTypes.PRESENCE,
content={"push": presence_to_add},
@@ -34,6 +34,7 @@ from synapse.logging.opentracing import (
tags,
whitelisted_homeserver,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import JsonDict
from synapse.util import json_decoder
from synapse.util.metrics import measure_func
@@ -47,7 +48,7 @@ issue_8631_logger = logging.getLogger("synapse.8631_debug")
last_pdu_ts_metric = Gauge(
"synapse_federation_last_sent_pdu_time",
"The timestamp of the last PDU which was successfully sent to the given domain",
labelnames=("server_name",),
labelnames=("destination_server_name", SERVER_NAME_LABEL),
)
@@ -191,6 +192,7 @@ class TransactionManager:
if pdus and destination in self._federation_metrics_domains:
last_pdu = pdus[-1]
last_pdu_ts_metric.labels(server_name=destination).set(
last_pdu.origin_server_ts / 1000
)
last_pdu_ts_metric.labels(
destination_server_name=destination,
**{SERVER_NAME_LABEL: self.server_name},
).set(last_pdu.origin_server_ts / 1000)
+3
@@ -38,6 +38,9 @@ logger = logging.getLogger(__name__)
class AccountValidityHandler:
def __init__(self, hs: "HomeServer"):
self.hs = hs
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.config = hs.config
self.store = hs.get_datastores().main
self.send_email_handler = hs.get_send_email_handler()
+29 -12
@@ -42,6 +42,7 @@ from synapse.events import EventBase
from synapse.handlers.presence import format_user_presence_state
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.metrics import (
SERVER_NAME_LABEL,
event_processing_loop_counter,
event_processing_loop_room_count,
)
@@ -68,12 +69,16 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
events_processed_counter = Counter("synapse_handlers_appservice_events_processed", "")
events_processed_counter = Counter(
"synapse_handlers_appservice_events_processed", "", labelnames=[SERVER_NAME_LABEL]
)
class ApplicationServicesHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.store = hs.get_datastores().main
self.is_mine_id = hs.is_mine_id
self.appservice_api = hs.get_application_service_api()
@@ -166,7 +171,9 @@ class ApplicationServicesHandler:
except Exception:
logger.error("Application Services Failure")
run_as_background_process("as_scheduler", start_scheduler)
run_as_background_process(
"as_scheduler", self.server_name, start_scheduler
)
self.started_scheduler = True
# Fork off pushes to these services
@@ -180,7 +187,8 @@ class ApplicationServicesHandler:
assert ts is not None
synapse.metrics.event_processing_lag_by_event.labels(
"appservice_sender"
name="appservice_sender",
**{SERVER_NAME_LABEL: self.server_name},
).observe((now - ts) / 1000)
async def handle_room_events(events: Iterable[EventBase]) -> None:
@@ -200,16 +208,23 @@ class ApplicationServicesHandler:
await self.store.set_appservice_last_pos(upper_bound)
synapse.metrics.event_processing_positions.labels(
"appservice_sender"
name="appservice_sender",
**{SERVER_NAME_LABEL: self.server_name},
).set(upper_bound)
events_processed_counter.inc(len(events))
events_processed_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(events))
event_processing_loop_room_count.labels("appservice_sender").inc(
len(events_by_room)
)
event_processing_loop_room_count.labels(
name="appservice_sender",
**{SERVER_NAME_LABEL: self.server_name},
).inc(len(events_by_room))
event_processing_loop_counter.labels("appservice_sender").inc()
event_processing_loop_counter.labels(
name="appservice_sender",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
if events:
now = self.clock.time_msec()
@@ -217,10 +232,12 @@ class ApplicationServicesHandler:
assert ts is not None
synapse.metrics.event_processing_lag.labels(
"appservice_sender"
name="appservice_sender",
**{SERVER_NAME_LABEL: self.server_name},
).set(now - ts)
synapse.metrics.event_processing_last_ts.labels(
"appservice_sender"
name="appservice_sender",
**{SERVER_NAME_LABEL: self.server_name},
).set(ts)
finally:
self.is_processing = False
+21 -9
@@ -70,6 +70,7 @@ from synapse.http import get_request_user_agent
from synapse.http.server import finish_request, respond_with_html
from synapse.http.site import SynapseRequest
from synapse.logging.context import defer_to_thread
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.storage.databases.main.registration import (
LoginTokenExpired,
@@ -95,7 +96,7 @@ INVALID_USERNAME_OR_PASSWORD = "Invalid username or password"
invalid_login_token_counter = Counter(
"synapse_user_login_invalid_login_tokens",
"Counts the number of rejected m.login.token on /login",
["reason"],
labelnames=["reason", SERVER_NAME_LABEL],
)
@@ -199,6 +200,7 @@ class AuthHandler:
SESSION_EXPIRE_MS = 48 * 60 * 60 * 1000
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.store = hs.get_datastores().main
self.auth = hs.get_auth()
self.auth_blocking = hs.get_auth_blocking()
@@ -248,6 +250,7 @@ class AuthHandler:
run_as_background_process,
5 * 60 * 1000,
"expire_old_sessions",
self.server_name,
self._expire_old_sessions,
)
@@ -272,8 +275,6 @@ class AuthHandler:
hs.config.sso.sso_account_deactivated_template
)
self._server_name = hs.config.server.server_name
# cast to tuple for use with str.startswith
self._whitelisted_sso_clients = tuple(hs.config.sso.sso_client_whitelist)
@@ -281,7 +282,9 @@ class AuthHandler:
# response.
self._extra_attributes: Dict[str, SsoLoginExtraAttributes] = {}
self.msc3861_oauth_delegation_enabled = hs.config.experimental.msc3861.enabled
self._auth_delegation_enabled = (
hs.config.mas.enabled or hs.config.experimental.msc3861.enabled
)
async def validate_user_via_ui_auth(
self,
@@ -332,7 +335,7 @@ class AuthHandler:
LimitExceededError if the ratelimiter's failed request count for this
user is too high to proceed
"""
if self.msc3861_oauth_delegation_enabled:
if self._auth_delegation_enabled:
raise SynapseError(
HTTPStatus.INTERNAL_SERVER_ERROR, "UIA shouldn't be used with MSC3861"
)
@@ -1479,11 +1482,20 @@ class AuthHandler:
try:
return await self.store.consume_login_token(login_token)
except LoginTokenExpired:
invalid_login_token_counter.labels("expired").inc()
invalid_login_token_counter.labels(
reason="expired",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
except LoginTokenReused:
invalid_login_token_counter.labels("reused").inc()
invalid_login_token_counter.labels(
reason="reused",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
except NotFoundError:
invalid_login_token_counter.labels("not found").inc()
invalid_login_token_counter.labels(
reason="not found",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
raise AuthError(403, "Invalid login token", errcode=Codes.FORBIDDEN)
@@ -1858,7 +1870,7 @@ class AuthHandler:
html = self._sso_redirect_confirm_template.render(
display_url=display_url,
redirect_url=redirect_url,
server_name=self._server_name,
server_name=self.server_name,
new_user=new_user,
user_id=registered_user_id,
user_profile=user_profile_data,
+4 -1
@@ -42,6 +42,7 @@ class DeactivateAccountHandler:
def __init__(self, hs: "HomeServer"):
self.store = hs.get_datastores().main
self.hs = hs
self.server_name = hs.hostname
self._auth_handler = hs.get_auth_handler()
self._device_handler = hs.get_device_handler()
self._room_member_handler = hs.get_room_member_handler()
@@ -271,7 +272,9 @@ class DeactivateAccountHandler:
pending deactivation, if it isn't already running.
"""
if not self._user_parter_running:
run_as_background_process("user_parter_loop", self._user_parter_loop)
run_as_background_process(
"user_parter_loop", self.server_name, self._user_parter_loop
)
async def _user_parter_loop(self) -> None:
"""Loop that parts deactivated users from rooms"""
+10 -4
@@ -22,7 +22,7 @@ from synapse.api.errors import ShadowBanError
from synapse.api.ratelimiting import Ratelimiter
from synapse.config.workers import MAIN_PROCESS_INSTANCE_NAME
from synapse.logging.opentracing import set_tag
from synapse.metrics import event_processing_positions
from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.delayed_events import (
ReplicationAddedDelayedEventRestServlet,
@@ -110,12 +110,13 @@ class DelayedEventsHandler:
# Can send the events in background after having awaited on marking them as processed
run_as_background_process(
"_send_events",
self.server_name,
self._send_events,
events,
)
self._initialized_from_db = run_as_background_process(
"_schedule_db_events", _schedule_db_events
"_schedule_db_events", self.server_name, _schedule_db_events
)
else:
self._repl_client = ReplicationAddedDelayedEventRestServlet.make_client(hs)
@@ -140,7 +141,9 @@ class DelayedEventsHandler:
finally:
self._event_processing = False
run_as_background_process("delayed_events.notify_new_event", process)
run_as_background_process(
"delayed_events.notify_new_event", self.server_name, process
)
async def _unsafe_process_new_event(self) -> None:
# If self._event_pos is None then means we haven't fetched it from the DB yet
@@ -188,7 +191,9 @@ class DelayedEventsHandler:
self._event_pos = max_pos
# Expose current event processing position to prometheus
event_processing_positions.labels("delayed_events").set(max_pos)
event_processing_positions.labels(
name="delayed_events", **{SERVER_NAME_LABEL: self.server_name}
).set(max_pos)
await self._store.update_delayed_events_stream_pos(max_pos)
@@ -450,6 +455,7 @@ class DelayedEventsHandler:
delay_sec,
run_as_background_process,
"_send_on_timeout",
self.server_name,
self._send_on_timeout,
)
else:
+9 -2
@@ -193,8 +193,9 @@ class DeviceHandler:
self.clock.looping_call(
run_as_background_process,
DELETE_STALE_DEVICES_INTERVAL_MS,
"delete_stale_devices",
self._delete_stale_devices,
desc="delete_stale_devices",
server_name=self.server_name,
func=self._delete_stale_devices,
)
async def _delete_stale_devices(self) -> None:
@@ -963,6 +964,9 @@ class DeviceWriterHandler(DeviceHandler):
def __init__(self, hs: "HomeServer"):
super().__init__(hs)
self.server_name = (
hs.hostname
) # nb must be called this for @measure_func and @wrap_as_background_process
# We only need to poke the federation sender explicitly if its on the
# same instance. Other federation sender instances will get notified by
# `synapse.app.generic_worker.FederationSenderHandler` when it sees it
@@ -1440,6 +1444,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
def __init__(self, hs: "HomeServer", device_handler: DeviceWriterHandler):
super().__init__(hs)
self.server_name = hs.hostname
self.federation = hs.get_federation_client()
self.server_name = hs.hostname # nb must be called this for @measure_func
self.clock = hs.get_clock() # nb must be called this for @measure_func
@@ -1470,6 +1475,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
self.clock.looping_call(
run_as_background_process,
30 * 1000,
server_name=self.server_name,
func=self._maybe_retry_device_resync,
desc="_maybe_retry_device_resync",
)
@@ -1591,6 +1597,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
await self.store.mark_remote_users_device_caches_as_stale([user_id])
run_as_background_process(
"_maybe_retry_device_resync",
self.server_name,
self.multi_user_device_resync,
[user_id],
False,
+16 -7
@@ -71,6 +71,7 @@ from synapse.handlers.pagination import PURGE_PAGINATION_LOCK_NAME
from synapse.http.servlet import assert_params_in_dict
from synapse.logging.context import nested_logging_context
from synapse.logging.opentracing import SynapseTags, set_tag, tag_args, trace
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.module_api import NOT_SPAM
from synapse.storage.databases.main.events_worker import EventRedactBehaviour
@@ -90,7 +91,7 @@ logger = logging.getLogger(__name__)
backfill_processing_before_timer = Histogram(
"synapse_federation_backfill_processing_before_time_seconds",
"sec",
[],
labelnames=[SERVER_NAME_LABEL],
buckets=(
0.1,
0.5,
@@ -187,7 +188,9 @@ class FederationHandler:
# were shut down.
if not hs.config.worker.worker_app:
run_as_background_process(
"resume_sync_partial_state_room", self._resume_partial_state_room_sync
"resume_sync_partial_state_room",
self.server_name,
self._resume_partial_state_room_sync,
)
@trace
@@ -316,6 +319,7 @@ class FederationHandler:
)
run_as_background_process(
"_maybe_backfill_inner_anyway_with_max_depth",
self.server_name,
self.maybe_backfill,
room_id=room_id,
# We use `MAX_DEPTH` so that we find all backfill points next
@@ -530,9 +534,9 @@ class FederationHandler:
# backfill points regardless of `current_depth`.
if processing_start_time is not None:
processing_end_time = self.clock.time_msec()
backfill_processing_before_timer.observe(
(processing_end_time - processing_start_time) / 1000
)
backfill_processing_before_timer.labels(
**{SERVER_NAME_LABEL: self.server_name}
).observe((processing_end_time - processing_start_time) / 1000)
success = await try_backfill(likely_domains)
if success:
@@ -798,7 +802,10 @@ class FederationHandler:
# have. Hence we fire off the background task, but don't wait for it.
run_as_background_process(
"handle_queued_pdus", self._handle_queued_pdus, room_queue
"handle_queued_pdus",
self.server_name,
self._handle_queued_pdus,
room_queue,
)
async def do_knock(
@@ -1870,7 +1877,9 @@ class FederationHandler:
)
run_as_background_process(
desc="sync_partial_state_room", func=_sync_partial_state_room_wrapper
desc="sync_partial_state_room",
server_name=self.server_name,
func=_sync_partial_state_room_wrapper,
)
async def _sync_partial_state_room(
+13 -5
@@ -76,6 +76,7 @@ from synapse.logging.opentracing import (
tag_args,
trace,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.federation import (
ReplicationFederationSendEventsRestServlet,
@@ -105,13 +106,14 @@ logger = logging.getLogger(__name__)
soft_failed_event_counter = Counter(
"synapse_federation_soft_failed_events_total",
"Events received over federation that we marked as soft_failed",
labelnames=[SERVER_NAME_LABEL],
)
# Added to debug performance and track progress on optimizations
backfill_processing_after_timer = Histogram(
"synapse_federation_backfill_processing_after_time_seconds",
"sec",
[],
labelnames=[SERVER_NAME_LABEL],
buckets=(
0.1,
0.25,
@@ -146,6 +148,7 @@ class FederationEventHandler:
"""
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self._clock = hs.get_clock()
self._store = hs.get_datastores().main
self._state_store = hs.get_datastores().state
@@ -170,7 +173,6 @@ class FederationEventHandler:
self._is_mine_id = hs.is_mine_id
self._is_mine_server_name = hs.is_mine_server_name
self._server_name = hs.hostname
self._instance_name = hs.get_instance_name()
self._config = hs.config
@@ -249,7 +251,7 @@ class FederationEventHandler:
# Note that if we were never in the room then we would have already
# dropped the event, since we wouldn't know the room version.
is_in_room = await self._event_auth_handler.is_host_in_room(
room_id, self._server_name
room_id, self.server_name
)
if not is_in_room:
logger.info(
@@ -690,7 +692,9 @@ class FederationEventHandler:
if not events:
return
with backfill_processing_after_timer.time():
with backfill_processing_after_timer.labels(
**{SERVER_NAME_LABEL: self.server_name}
).time():
# if there are any events in the wrong room, the remote server is buggy and
# should not be trusted.
for ev in events:
@@ -930,6 +934,7 @@ class FederationEventHandler:
if len(events_with_failed_pull_attempts) > 0:
run_as_background_process(
"_process_new_pulled_events_with_failed_pull_attempts",
self.server_name,
_process_new_pulled_events,
events_with_failed_pull_attempts,
)
@@ -1523,6 +1528,7 @@ class FederationEventHandler:
if resync:
run_as_background_process(
"resync_device_due_to_pdu",
self.server_name,
self._resync_device,
event.sender,
)
@@ -2049,7 +2055,9 @@ class FederationEventHandler:
"hs": origin,
},
)
soft_failed_event_counter.inc()
soft_failed_event_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc()
event.internal_metadata.soft_failed = True
async def _load_or_fetch_auth_events_for_event(
+43 -30
@@ -67,7 +67,6 @@ from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
from synapse.logging import opentracing
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.send_event import ReplicationSendEventRestServlet
from synapse.replication.http.send_events import ReplicationSendEventsRestServlet
from synapse.storage.databases.main.events_worker import EventRedactBehaviour
from synapse.types import (
@@ -97,6 +96,7 @@ class MessageHandler:
"""Contains some read only APIs to get state about a room"""
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.auth = hs.get_auth()
self.clock = hs.get_clock()
self.state = hs.get_state_handler()
@@ -112,7 +112,7 @@ class MessageHandler:
if not hs.config.worker.worker_app:
run_as_background_process(
"_schedule_next_expiry", self._schedule_next_expiry
"_schedule_next_expiry", self.server_name, self._schedule_next_expiry
)
async def get_room_data(
@@ -444,6 +444,7 @@ class MessageHandler:
delay,
run_as_background_process,
"_expire_event",
self.server_name,
self._expire_event,
event_id,
)
@@ -504,7 +505,6 @@ class EventCreationHandler:
self.room_prejoin_state_types = self.hs.config.api.room_prejoin_state
self.send_event = ReplicationSendEventRestServlet.make_client(hs)
self.send_events = ReplicationSendEventsRestServlet.make_client(hs)
self.request_ratelimiter = hs.get_request_ratelimiter()
@@ -546,6 +546,7 @@ class EventCreationHandler:
self.clock.looping_call(
lambda: run_as_background_process(
"send_dummy_events_to_fill_extremities",
self.server_name,
self._send_dummy_events_to_fill_extremities,
),
5 * 60 * 1000,
@@ -646,38 +647,46 @@ class EventCreationHandler:
"""
await self.auth_blocking.check_auth_blocking(requester=requester)
requester_suspended = await self.store.get_user_suspended_status(
requester.user.to_string()
# The requester may be a regular user, but puppeted by the server.
request_by_server = (
requester.authenticated_entity == self.hs.config.server.server_name
)
if requester_suspended:
# We want to allow suspended users to perform "corrective" actions
# asked of them by server admins, such as redact their messages and
# leave rooms.
if event_dict["type"] in ["m.room.redaction", "m.room.member"]:
if event_dict["type"] == "m.room.redaction":
event = await self.store.get_event(
event_dict["content"]["redacts"], allow_none=True
)
if event:
if event.sender != requester.user.to_string():
# If the request is initiated by the server, ignore whether the
# requester or target is suspended.
if not request_by_server:
requester_suspended = await self.store.get_user_suspended_status(
requester.user.to_string()
)
if requester_suspended:
# We want to allow suspended users to perform "corrective" actions
# asked of them by server admins, such as redact their messages and
# leave rooms.
if event_dict["type"] in ["m.room.redaction", "m.room.member"]:
if event_dict["type"] == "m.room.redaction":
event = await self.store.get_event(
event_dict["content"]["redacts"], allow_none=True
)
if event:
if event.sender != requester.user.to_string():
raise SynapseError(
403,
"You can only redact your own events while account is suspended.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if event_dict["type"] == "m.room.member":
if event_dict["content"]["membership"] != "leave":
raise SynapseError(
403,
"You can only redact your own events while account is suspended.",
"Changing membership while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if event_dict["type"] == "m.room.member":
if event_dict["content"]["membership"] != "leave":
raise SynapseError(
403,
"Changing membership while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
else:
raise SynapseError(
403,
"Sending messages while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
else:
raise SynapseError(
403,
"Sending messages while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
is_create_event = (
event_dict["type"] == EventTypes.Create and event_dict["state_key"] == ""
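Note on the hunk above: the suspension check now runs only when the request is not initiated by the server. A minimal sketch of that decision logic follows, written as a pure function under stated assumptions. `SuspendedError` and `check_suspended_sender` are hypothetical names, not Synapse APIs, and the suspension status is passed in as a flag here, whereas the handler looks it up in the store only when needed.

```python
from typing import Optional


class SuspendedError(Exception):
    """Hypothetical stand-in for a 403 error with a 'user suspended' code."""


def check_suspended_sender(
    event_type: str,
    content: dict,
    requester_id: str,
    requester_suspended: bool,
    request_by_server: bool,
    redacted_event_sender: Optional[str] = None,
) -> None:
    """Raise SuspendedError if a suspended user may not send this event."""
    # Requests initiated by the server bypass the suspension checks entirely.
    if request_by_server or not requester_suspended:
        return

    if event_type == "m.room.redaction":
        # Suspended users may only redact their own events.
        if redacted_event_sender is not None and redacted_event_sender != requester_id:
            raise SuspendedError(
                "You can only redact your own events while account is suspended."
            )
        return

    if event_type == "m.room.member":
        # Suspended users may only leave rooms.
        if content.get("membership") != "leave":
            raise SuspendedError(
                "Changing membership while account is suspended is not allowed."
            )
        return

    # Every other event type is refused for a suspended user.
    raise SuspendedError("Sending messages while account is suspended is not allowed.")
```

The early return keeps the "corrective actions only" rules intact for ordinary requests while letting server-puppeted requests through unchanged.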
@@ -1107,6 +1116,9 @@ class EventCreationHandler:
policy_allowed = await self._policy_handler.is_event_allowed(event)
if not policy_allowed:
# We shouldn't need to set the metadata because the raise should
# cause the request to be denied, but just in case:
event.internal_metadata.policy_server_spammy = True
logger.warning(
"Event not allowed by policy server, rejecting %s",
event.event_id,
@@ -2070,6 +2082,7 @@ class EventCreationHandler:
# matters as sometimes presence code can take a while.
run_as_background_process(
"bump_presence_active_time",
self.server_name,
self._bump_active_time,
requester.user,
requester.device_id,
+5 -2
@@ -79,12 +79,12 @@ class PaginationHandler:
def __init__(self, hs: "HomeServer"):
self.hs = hs
self.server_name = hs.hostname
self.auth = hs.get_auth()
self.store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
self._state_storage_controller = self._storage_controllers.state
self.clock = hs.get_clock()
self._server_name = hs.hostname
self._room_shutdown_handler = hs.get_room_shutdown_handler()
self._relations_handler = hs.get_relations_handler()
self._worker_locks = hs.get_worker_locks_handler()
@@ -119,6 +119,7 @@ class PaginationHandler:
run_as_background_process,
job.interval,
"purge_history_for_rooms_in_range",
self.server_name,
self.purge_history_for_rooms_in_range,
job.shortest_max_lifetime,
job.longest_max_lifetime,
@@ -245,6 +246,7 @@ class PaginationHandler:
# other purges in the same room.
run_as_background_process(
PURGE_HISTORY_ACTION_NAME,
self.server_name,
self.purge_history,
room_id,
token,
@@ -395,7 +397,7 @@ class PaginationHandler:
write=True,
):
# first check that we have no users in this room
joined = await self.store.is_host_joined(room_id, self._server_name)
joined = await self.store.is_host_joined(room_id, self.server_name)
if joined:
if force:
logger.info(
@@ -604,6 +606,7 @@ class PaginationHandler:
# for a costly federation call and processing.
run_as_background_process(
"maybe_backfill_in_the_background",
self.server_name,
self.hs.get_federation_handler().maybe_backfill,
room_id,
curr_topo,
+111 -40
@@ -105,7 +105,7 @@ from synapse.api.presence import UserDevicePresenceState, UserPresenceState
from synapse.appservice import ApplicationService
from synapse.events.presence_router import PresenceRouter
from synapse.logging.context import run_in_background
from synapse.metrics import LaterGauge
from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
from synapse.metrics.background_process_metrics import (
run_as_background_process,
wrap_as_background_process,
@@ -137,24 +137,40 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
notified_presence_counter = Counter("synapse_handler_presence_notified_presence", "")
notified_presence_counter = Counter(
"synapse_handler_presence_notified_presence", "", labelnames=[SERVER_NAME_LABEL]
)
federation_presence_out_counter = Counter(
"synapse_handler_presence_federation_presence_out", ""
"synapse_handler_presence_federation_presence_out",
"",
labelnames=[SERVER_NAME_LABEL],
)
presence_updates_counter = Counter(
"synapse_handler_presence_presence_updates", "", labelnames=[SERVER_NAME_LABEL]
)
timers_fired_counter = Counter(
"synapse_handler_presence_timers_fired", "", labelnames=[SERVER_NAME_LABEL]
)
presence_updates_counter = Counter("synapse_handler_presence_presence_updates", "")
timers_fired_counter = Counter("synapse_handler_presence_timers_fired", "")
federation_presence_counter = Counter(
"synapse_handler_presence_federation_presence", ""
"synapse_handler_presence_federation_presence", "", labelnames=[SERVER_NAME_LABEL]
)
bump_active_time_counter = Counter(
"synapse_handler_presence_bump_active_time", "", labelnames=[SERVER_NAME_LABEL]
)
bump_active_time_counter = Counter("synapse_handler_presence_bump_active_time", "")
get_updates_counter = Counter("synapse_handler_presence_get_updates", "", ["type"])
get_updates_counter = Counter(
"synapse_handler_presence_get_updates", "", labelnames=["type", SERVER_NAME_LABEL]
)
notify_reason_counter = Counter(
"synapse_handler_presence_notify_reason", "", ["locality", "reason"]
"synapse_handler_presence_notify_reason",
"",
labelnames=["locality", "reason", SERVER_NAME_LABEL],
)
state_transition_counter = Counter(
"synapse_handler_presence_state_transition", "", ["locality", "from", "to"]
"synapse_handler_presence_state_transition",
"",
labelnames=["locality", "from", "to", SERVER_NAME_LABEL],
)
# If a user was last active in the last LAST_ACTIVE_GRANULARITY, consider them
@@ -484,6 +500,7 @@ class _NullContextManager(ContextManager[None]):
class WorkerPresenceHandler(BasePresenceHandler):
def __init__(self, hs: "HomeServer"):
super().__init__(hs)
self.server_name = hs.hostname
self._presence_writer_instance = hs.config.worker.writers.presence[0]
# Route presence EDUs to the right worker
@@ -517,6 +534,7 @@ class WorkerPresenceHandler(BasePresenceHandler):
"shutdown",
run_as_background_process,
"generic_presence.on_shutdown",
self.server_name,
self._on_shutdown,
)
@@ -666,7 +684,9 @@ class WorkerPresenceHandler(BasePresenceHandler):
old_state = self.user_to_current_state.get(new_state.user_id)
self.user_to_current_state[new_state.user_id] = new_state
is_mine = self.is_mine_id(new_state.user_id)
if not old_state or should_notify(old_state, new_state, is_mine):
if not old_state or should_notify(
old_state, new_state, is_mine, self.server_name
):
state_to_notify.append(new_state)
stream_id = token
@@ -747,7 +767,9 @@ class WorkerPresenceHandler(BasePresenceHandler):
class PresenceHandler(BasePresenceHandler):
def __init__(self, hs: "HomeServer"):
super().__init__(hs)
self.server_name = hs.hostname
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.wheel_timer: WheelTimer[str] = WheelTimer()
self.notifier = hs.get_notifier()
@@ -758,10 +780,10 @@ class PresenceHandler(BasePresenceHandler):
)
LaterGauge(
"synapse_handlers_presence_user_to_current_state_size",
"",
[],
lambda: len(self.user_to_current_state),
name="synapse_handlers_presence_user_to_current_state_size",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {(self.server_name,): len(self.user_to_current_state)},
)
# The per-device presence state, maps user to devices to per-device presence state.
@@ -815,6 +837,7 @@ class PresenceHandler(BasePresenceHandler):
"shutdown",
run_as_background_process,
"presence.on_shutdown",
self.server_name,
self._on_shutdown,
)
@@ -860,10 +883,10 @@ class PresenceHandler(BasePresenceHandler):
)
LaterGauge(
"synapse_handlers_presence_wheel_timer_size",
"",
[],
lambda: len(self.wheel_timer),
name="synapse_handlers_presence_wheel_timer_size",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {(self.server_name,): len(self.wheel_timer)},
)
# Used to handle sending of presence to newly joined users/servers
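Note on the `LaterGauge` hunks above: the callback no longer returns a bare number but a mapping from a tuple of label values to the value. The sketch below illustrates that callback shape only. `collect_samples` is a hypothetical helper, not part of Synapse or `prometheus_client`, and the value `"server_name"` for `SERVER_NAME_LABEL` is an assumption.

```python
from typing import Callable, Dict, List, Mapping, Tuple

SERVER_NAME_LABEL = "server_name"  # assumed value of the constant

Sample = Tuple[str, Dict[str, str], float]


def collect_samples(
    name: str,
    labelnames: List[str],
    caller: Callable[[], Mapping[Tuple[str, ...], float]],
) -> List[Sample]:
    """Turn a {label-values-tuple: value} mapping into labelled samples."""
    samples: List[Sample] = []
    for label_values, value in caller().items():
        if len(label_values) != len(labelnames):
            raise ValueError("label values do not match labelnames")
        # Pair each label name with the value at the same tuple position.
        samples.append((name, dict(zip(labelnames, label_values)), float(value)))
    return samples


# Usage: one sample per homeserver, keyed by a one-element tuple.
user_to_current_state = {"@alice:example.com": "online", "@bob:example.com": "offline"}
server_name = "example.com"
samples = collect_samples(
    "synapse_handlers_presence_user_to_current_state_size",
    [SERVER_NAME_LABEL],
    lambda: {(server_name,): len(user_to_current_state)},
)
```

Keying by a tuple lets one gauge report values for several label combinations from a single callback.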
@@ -972,6 +995,7 @@ class PresenceHandler(BasePresenceHandler):
prev_state,
new_state,
is_mine=self.is_mine_id(user_id),
our_server_name=self.server_name,
wheel_timer=self.wheel_timer,
now=now,
# When overriding disabled presence, don't kick off all the
@@ -991,10 +1015,14 @@ class PresenceHandler(BasePresenceHandler):
# TODO: We should probably ensure there are no races hereafter
presence_updates_counter.inc(len(new_states))
presence_updates_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(new_states))
if to_notify:
notified_presence_counter.inc(len(to_notify))
notified_presence_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(to_notify))
await self._persist_and_notify(list(to_notify.values()))
self.unpersisted_users_changes |= {s.user_id for s in new_states}
@@ -1013,7 +1041,9 @@ class PresenceHandler(BasePresenceHandler):
if user_id not in to_notify
}
if to_federation_ping:
federation_presence_out_counter.inc(len(to_federation_ping))
federation_presence_out_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(to_federation_ping))
hosts_to_states = await get_interested_remotes(
self.store,
@@ -1063,7 +1093,9 @@ class PresenceHandler(BasePresenceHandler):
for user_id in users_to_check
]
timers_fired_counter.inc(len(states))
timers_fired_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc(
len(states)
)
# Set of user ID & device IDs which are currently syncing.
syncing_user_devices = {
@@ -1097,7 +1129,7 @@ class PresenceHandler(BasePresenceHandler):
user_id = user.to_string()
bump_active_time_counter.inc()
bump_active_time_counter.labels(**{SERVER_NAME_LABEL: self.server_name}).inc()
now = self.clock.time_msec()
@@ -1349,7 +1381,9 @@ class PresenceHandler(BasePresenceHandler):
updates.append(prev_state.copy_and_replace(**new_fields))
if updates:
federation_presence_counter.inc(len(updates))
federation_presence_counter.labels(
**{SERVER_NAME_LABEL: self.server_name}
).inc(len(updates))
await self._update_states(updates)
async def set_state(
@@ -1495,7 +1529,9 @@ class PresenceHandler(BasePresenceHandler):
finally:
self._event_processing = False
run_as_background_process("presence.notify_new_event", _process_presence)
run_as_background_process(
"presence.notify_new_event", self.server_name, _process_presence
)
async def _unsafe_process(self) -> None:
# Loop round handling deltas until we're up to date
@@ -1532,9 +1568,9 @@ class PresenceHandler(BasePresenceHandler):
self._event_pos = max_pos
# Expose current event processing position to prometheus
synapse.metrics.event_processing_positions.labels("presence").set(
max_pos
)
synapse.metrics.event_processing_positions.labels(
name="presence", **{SERVER_NAME_LABEL: self.server_name}
).set(max_pos)
async def _handle_state_delta(self, room_id: str, deltas: List[StateDelta]) -> None:
"""Process current state deltas for the room to find new joins that need
@@ -1660,7 +1696,10 @@ class PresenceHandler(BasePresenceHandler):
def should_notify(
old_state: UserPresenceState, new_state: UserPresenceState, is_mine: bool
old_state: UserPresenceState,
new_state: UserPresenceState,
is_mine: bool,
our_server_name: str,
) -> bool:
"""Decides if a presence state change should be sent to interested parties."""
user_location = "remote"
@@ -1671,19 +1710,38 @@ def should_notify(
return False
if old_state.status_msg != new_state.status_msg:
notify_reason_counter.labels(user_location, "status_msg_change").inc()
notify_reason_counter.labels(
locality=user_location,
reason="status_msg_change",
**{SERVER_NAME_LABEL: our_server_name},
).inc()
return True
if old_state.state != new_state.state:
notify_reason_counter.labels(user_location, "state_change").inc()
notify_reason_counter.labels(
locality=user_location,
reason="state_change",
**{SERVER_NAME_LABEL: our_server_name},
).inc()
state_transition_counter.labels(
user_location, old_state.state, new_state.state
**{
"locality": user_location,
# `from` is a reserved word in Python so we have to label it this way if
# we want to use keyword args.
"from": old_state.state,
"to": new_state.state,
SERVER_NAME_LABEL: our_server_name,
},
).inc()
return True
if old_state.state == PresenceState.ONLINE:
if new_state.currently_active != old_state.currently_active:
notify_reason_counter.labels(user_location, "current_active_change").inc()
notify_reason_counter.labels(
locality=user_location,
reason="current_active_change",
**{SERVER_NAME_LABEL: our_server_name},
).inc()
return True
if (
@@ -1693,14 +1751,18 @@ def should_notify(
# Only notify about last active bumps if we're not currently active
if not new_state.currently_active:
notify_reason_counter.labels(
user_location, "last_active_change_online"
locality=user_location,
reason="last_active_change_online",
**{SERVER_NAME_LABEL: our_server_name},
).inc()
return True
elif new_state.last_active_ts - old_state.last_active_ts > LAST_ACTIVE_GRANULARITY:
# Always notify for a transition where last active gets bumped.
notify_reason_counter.labels(
user_location, "last_active_change_not_online"
locality=user_location,
reason="last_active_change_not_online",
**{SERVER_NAME_LABEL: our_server_name},
).inc()
return True
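Note on the `should_notify` hunks above: the label calls move from positional to keyword arguments, and `from` has to be passed through a dict because it is a reserved word in Python. The sketch below shows why the `**{...}` form is needed. `LabelledCounter` is a hypothetical stand-in written for this note, not the `prometheus_client` API, and the value of `SERVER_NAME_LABEL` is an assumption.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Tuple

SERVER_NAME_LABEL = "server_name"  # assumed value of the constant


class _Child:
    """A counter bound to one concrete combination of label values."""

    def __init__(self, values: DefaultDict[Tuple[str, ...], float], key: Tuple[str, ...]):
        self._values = values
        self._key = key

    def inc(self, amount: float = 1.0) -> None:
        self._values[self._key] += amount


class LabelledCounter:
    """Hypothetical stand-in for a labelled counter; keyword labels only."""

    def __init__(self, labelnames: Iterable[str]):
        self._labelnames = tuple(labelnames)
        self._values: DefaultDict[Tuple[str, ...], float] = defaultdict(float)

    def _key(self, labels: dict) -> Tuple[str, ...]:
        # Require exactly the declared label names, so a missing
        # server-name label fails loudly instead of being dropped.
        if set(labels) != set(self._labelnames):
            raise ValueError(f"expected labels {self._labelnames}, got {tuple(labels)}")
        return tuple(labels[name] for name in self._labelnames)

    def labels(self, **labels: str) -> _Child:
        return _Child(self._values, self._key(labels))

    def value(self, **labels: str) -> float:
        return self._values[self._key(labels)]


state_transition_counter = LabelledCounter(["locality", "from", "to", SERVER_NAME_LABEL])

# `labels(from="online")` would be a SyntaxError, so the labels go in a dict.
transition = {
    "locality": "local",
    "from": "online",
    "to": "offline",
    SERVER_NAME_LABEL: "example.com",
}
state_transition_counter.labels(**transition).inc()
```

Unpacking a dict also allows the label name to come from a constant, which a literal keyword argument cannot do.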
@@ -1767,6 +1829,7 @@ class PresenceEventSource(EventSource[int, UserPresenceState]):
self.server_name = hs.hostname
self.get_presence_handler = hs.get_presence_handler
self.get_presence_router = hs.get_presence_router
self.server_name = hs.hostname
self.clock = hs.get_clock()
self.store = hs.get_datastores().main
@@ -1878,7 +1941,10 @@ class PresenceEventSource(EventSource[int, UserPresenceState]):
# If we have the full list of changes for presence we can
# simply check which ones share a room with the user.
get_updates_counter.labels("stream").inc()
get_updates_counter.labels(
type="stream",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
sharing_users = await self.store.do_users_share_a_room(
user_id, updated_users
@@ -1891,7 +1957,10 @@ class PresenceEventSource(EventSource[int, UserPresenceState]):
else:
# Too many possible updates. Find all users we can see and check
# if any of them have changed.
get_updates_counter.labels("full").inc()
get_updates_counter.labels(
type="full",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
users_interested_in = (
await self.store.get_users_who_share_room_with_user(user_id)
@@ -2141,6 +2210,7 @@ def handle_update(
prev_state: UserPresenceState,
new_state: UserPresenceState,
is_mine: bool,
our_server_name: str,
wheel_timer: WheelTimer,
now: int,
persist: bool,
@@ -2153,6 +2223,7 @@ def handle_update(
prev_state
new_state
is_mine: Whether the user is ours
our_server_name: The homeserver name of our server (`hs.hostname`)
wheel_timer
now: Time now in ms
persist: True if this state should persist until another update occurs.
@@ -2221,7 +2292,7 @@ def handle_update(
)
# Check whether the change was something worth notifying about
if should_notify(prev_state, new_state, is_mine):
if should_notify(prev_state, new_state, is_mine, our_server_name):
new_state = new_state.copy_and_replace(last_federation_update_ts=now)
persist_and_notify = True
+1 -1
@@ -124,7 +124,7 @@ class ProfileHandler:
except RequestSendFailed as e:
raise SynapseError(502, "Failed to fetch profile") from e
except HttpResponseException as e:
if e.code < 500 and e.code != 404:
if e.code < 500 and e.code not in (403, 404):
# Other codes are not allowed in c2s API
logger.info(
"Server replied with wrong response: %s %s", e.code, e.msg
+27 -13
@@ -45,6 +45,7 @@ from synapse.api.errors import (
from synapse.appservice import ApplicationService
from synapse.config.server import is_threepid_reserved
from synapse.http.servlet import assert_params_in_dict
from synapse.metrics import SERVER_NAME_LABEL
from synapse.replication.http.login import RegisterDeviceReplicationServlet
from synapse.replication.http.register import (
ReplicationPostRegisterActionsServlet,
@@ -62,29 +63,38 @@ logger = logging.getLogger(__name__)
registration_counter = Counter(
"synapse_user_registrations_total",
"Number of new users registered (since restart)",
["guest", "shadow_banned", "auth_provider"],
labelnames=["guest", "shadow_banned", "auth_provider", SERVER_NAME_LABEL],
)
login_counter = Counter(
"synapse_user_logins_total",
"Number of user logins (since restart)",
["guest", "auth_provider"],
labelnames=["guest", "auth_provider", SERVER_NAME_LABEL],
)
def init_counters_for_auth_provider(auth_provider_id: str) -> None:
def init_counters_for_auth_provider(auth_provider_id: str, server_name: str) -> None:
"""Ensure the prometheus counters for the given auth provider are initialised
This fixes a problem where the counters are not reported for a given auth provider
until the user first logs in/registers.
Args:
auth_provider_id: The ID of the auth provider to initialise counters for.
server_name: Our server name (used to label metrics) (this should be `hs.hostname`).
"""
for is_guest in (True, False):
login_counter.labels(guest=is_guest, auth_provider=auth_provider_id)
login_counter.labels(
guest=is_guest,
auth_provider=auth_provider_id,
**{SERVER_NAME_LABEL: server_name},
)
for shadow_banned in (True, False):
registration_counter.labels(
guest=is_guest,
shadow_banned=shadow_banned,
auth_provider=auth_provider_id,
**{SERVER_NAME_LABEL: server_name},
)
@@ -97,6 +107,7 @@ class LoginDict(TypedDict):
class RegistrationHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
self.clock = hs.get_clock()
@@ -112,7 +123,6 @@ class RegistrationHandler:
self._account_validity_handler = hs.get_account_validity_handler()
self._user_consent_version = self.hs.config.consent.user_consent_version
self._server_notices_mxid = hs.config.servernotices.server_notices_mxid
self._server_name = hs.hostname
self._user_types_config = hs.config.user_types
self._spam_checker_module_callbacks = hs.get_module_api_callbacks().spam_checker
@@ -138,7 +148,9 @@ class RegistrationHandler:
)
self.refresh_token_lifetime = hs.config.registration.refresh_token_lifetime
init_counters_for_auth_provider("")
init_counters_for_auth_provider(
auth_provider_id="", server_name=self.server_name
)
async def check_username(
self,
@@ -362,6 +374,7 @@ class RegistrationHandler:
guest=make_guest,
shadow_banned=shadow_banned,
auth_provider=(auth_provider_id or ""),
**{SERVER_NAME_LABEL: self.server_name},
).inc()
# If the user does not need to consent at registration, auto-join any
@@ -422,7 +435,7 @@ class RegistrationHandler:
if self.hs.config.registration.auto_join_user_id:
fake_requester = create_requester(
self.hs.config.registration.auto_join_user_id,
authenticated_entity=self._server_name,
authenticated_entity=self.server_name,
)
# If the room requires an invite, add the user to the list of invites.
@@ -435,7 +448,7 @@ class RegistrationHandler:
requires_join = True
else:
fake_requester = create_requester(
user_id, authenticated_entity=self._server_name
user_id, authenticated_entity=self.server_name
)
# Choose whether to federate the new room.
@@ -467,7 +480,7 @@ class RegistrationHandler:
await room_member_handler.update_membership(
requester=create_requester(
user_id, authenticated_entity=self._server_name
user_id, authenticated_entity=self.server_name
),
target=UserID.from_string(user_id),
room_id=room_id,
@@ -493,7 +506,7 @@ class RegistrationHandler:
if requires_join:
await room_member_handler.update_membership(
requester=create_requester(
user_id, authenticated_entity=self._server_name
user_id, authenticated_entity=self.server_name
),
target=UserID.from_string(user_id),
room_id=room_id,
@@ -539,7 +552,7 @@ class RegistrationHandler:
# we don't have a local user in the room to craft up an invite with.
requires_invite = await self.store.is_host_joined(
room_id,
self._server_name,
self.server_name,
)
if requires_invite:
@@ -567,7 +580,7 @@ class RegistrationHandler:
await room_member_handler.update_membership(
requester=create_requester(
self.hs.config.registration.auto_join_user_id,
authenticated_entity=self._server_name,
authenticated_entity=self.server_name,
),
target=UserID.from_string(user_id),
room_id=room_id,
@@ -579,7 +592,7 @@ class RegistrationHandler:
# Send the join.
await room_member_handler.update_membership(
requester=create_requester(
user_id, authenticated_entity=self._server_name
user_id, authenticated_entity=self.server_name
),
target=UserID.from_string(user_id),
room_id=room_id,
@@ -790,6 +803,7 @@ class RegistrationHandler:
login_counter.labels(
guest=is_guest,
auth_provider=(auth_provider_id or ""),
**{SERVER_NAME_LABEL: self.server_name},
).inc()
return (
+45 -21
@@ -66,6 +66,7 @@ from synapse.api.errors import (
SynapseError,
)
from synapse.api.filtering import Filter
from synapse.api.ratelimiting import Ratelimiter
from synapse.api.room_versions import KNOWN_ROOM_VERSIONS, RoomVersion
from synapse.event_auth import validate_event_for_room_version
from synapse.events import EventBase
@@ -134,7 +135,12 @@ class RoomCreationHandler:
self.room_member_handler = hs.get_room_member_handler()
self._event_auth_handler = hs.get_event_auth_handler()
self.config = hs.config
self.request_ratelimiter = hs.get_request_ratelimiter()
self.common_request_ratelimiter = hs.get_request_ratelimiter()
self.creation_ratelimiter = Ratelimiter(
store=self.store,
clock=self.clock,
cfg=self.config.ratelimiting.rc_room_creation,
)
# Room state based off defined presets
self._presets_dict: Dict[str, Dict[str, Any]] = {
@@ -216,7 +222,11 @@ class RoomCreationHandler:
ShadowBanError if the requester is shadow-banned.
"""
if ratelimit:
await self.request_ratelimiter.ratelimit(requester)
await self.creation_ratelimiter.ratelimit(requester, update=False)
# then apply the ratelimits
await self.common_request_ratelimiter.ratelimit(requester)
await self.creation_ratelimiter.ratelimit(requester)
user_id = requester.user.to_string()
@@ -566,6 +576,7 @@ class RoomCreationHandler:
created with _generate_room_id())
new_room_version: the new room version to use
tombstone_event_id: the ID of the tombstone event in the old room.
additional_creators: additional room creators, for MSC4289.
creation_event_with_context: The create event of the new room, if the new room supports
room ID as create event ID hash.
auto_member: Whether to automatically join local users to the new
@@ -1060,6 +1071,25 @@ class RoomCreationHandler:
await self.auth_blocking.check_auth_blocking(requester=requester)
if ratelimit:
# Limit the rate of room creations,
# using both the limiter specific to room creations as well
# as the general request ratelimiter.
#
# Note that we don't rate limit the individual
# events in the room — room creation isn't atomic and
# historically it was very janky if half the events in the
# initial state don't make it because of rate limiting.
# First check the room creation ratelimiter without updating it
# (this is so we don't consume a token if the other ratelimiter doesn't
# allow us to proceed)
await self.creation_ratelimiter.ratelimit(requester, update=False)
# then apply the ratelimits
await self.common_request_ratelimiter.ratelimit(requester)
await self.creation_ratelimiter.ratelimit(requester)
if (
self._server_notices_mxid is not None
and user_id == self._server_notices_mxid
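Note on the rate-limiting hunk above: the room-creation limiter is first checked without being updated, then both limiters are applied, so that a rejection by one limiter does not consume a token from the other. The sketch below models that ordering with a deliberately simple fixed-budget limiter. `BudgetLimiter`, `LimitExceeded` and `ratelimit_room_creation` are hypothetical names, and the real `Ratelimiter` is time-based rather than a fixed budget.

```python
from typing import Dict


class LimitExceeded(Exception):
    """Raised when a key has used up its budget."""


class BudgetLimiter:
    """Toy limiter: each key may perform `budget` actions in total."""

    def __init__(self, budget: int):
        self._budget = budget
        self._used: Dict[str, int] = {}

    def ratelimit(self, key: str, update: bool = True) -> None:
        used = self._used.get(key, 0)
        if used >= self._budget:
            raise LimitExceeded(key)
        # With update=False this is a dry run: check only, consume nothing.
        if update:
            self._used[key] = used + 1

    def used(self, key: str) -> int:
        return self._used.get(key, 0)


def ratelimit_room_creation(
    creation: BudgetLimiter, common: BudgetLimiter, key: str
) -> None:
    # 1. Dry-run the creation limiter so a refusal here costs no common token.
    creation.ratelimit(key, update=False)
    # 2. Apply the general request limiter; if it refuses, the creation
    #    limiter has not been charged yet.
    common.ratelimit(key)
    # 3. Both allowed the request, so charge the creation limiter too.
    creation.ratelimit(key)
```

Without the dry run, a user blocked by the creation limit would still burn general request tokens on every failed attempt.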
@@ -1091,25 +1121,6 @@ class RoomCreationHandler:
Codes.MISSING_PARAM,
)
if not is_requester_admin:
spam_check = await self._spam_checker_module_callbacks.user_may_create_room(
user_id, config
)
if spam_check != self._spam_checker_module_callbacks.NOT_SPAM:
raise SynapseError(
403,
"You are not permitted to create rooms",
errcode=spam_check[0],
additional_fields=spam_check[1],
)
if ratelimit:
# Rate limit once in advance, but don't rate limit the individual
# events in the room — room creation isn't atomic and it's very
# janky if half the events in the initial state don't make it because
# of rate limiting.
await self.request_ratelimiter.ratelimit(requester)
room_version_id = config.get(
"room_version", self.config.server.default_room_version.identifier
)
@@ -1202,6 +1213,19 @@ class RoomCreationHandler:
self._validate_room_config(config, visibility)
# Run the spam checker after other validation
if not is_requester_admin:
spam_check = await self._spam_checker_module_callbacks.user_may_create_room(
user_id, config
)
if spam_check != self._spam_checker_module_callbacks.NOT_SPAM:
raise SynapseError(
403,
"You are not permitted to create rooms",
errcode=spam_check[0],
additional_fields=spam_check[1],
)
creation_content = config.get("creation_content", {})
# override any attempt to set room versions via the creation_content
creation_content["room_version"] = room_version.identifier
+40 -29
@@ -49,7 +49,7 @@ from synapse.handlers.profile import MAX_AVATAR_URL_LEN, MAX_DISPLAYNAME_LEN
from synapse.handlers.state_deltas import MatchChange, StateDeltasHandler
from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
from synapse.logging import opentracing
from synapse.metrics import event_processing_positions
from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.push import ReplicationCopyPusherRestServlet
from synapse.storage.databases.main.state_deltas import StateDelta
@@ -746,35 +746,41 @@ class RoomMemberHandler(metaclass=abc.ABCMeta):
and requester.user.to_string() == self._server_notices_mxid
)
requester_suspended = await self.store.get_user_suspended_status(
requester.user.to_string()
)
if action == Membership.INVITE and requester_suspended:
raise SynapseError(
403,
"Sending invites while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
# The requester may be a regular user, but puppeted by the server.
request_by_server = requester.authenticated_entity == self._server_name
if target.to_string() != requester.user.to_string():
target_suspended = await self.store.get_user_suspended_status(
target.to_string()
# If the request is initiated by the server, ignore whether the
# requester or target is suspended.
if not request_by_server:
requester_suspended = await self.store.get_user_suspended_status(
requester.user.to_string()
)
else:
target_suspended = requester_suspended
if action == Membership.INVITE and requester_suspended:
raise SynapseError(
403,
"Sending invites while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if action == Membership.JOIN and target_suspended:
raise SynapseError(
403,
"Joining rooms while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if action == Membership.KNOCK and target_suspended:
raise SynapseError(
403,
"Knocking on rooms while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if target.to_string() != requester.user.to_string():
target_suspended = await self.store.get_user_suspended_status(
target.to_string()
)
else:
target_suspended = requester_suspended
if action == Membership.JOIN and target_suspended:
raise SynapseError(
403,
"Joining rooms while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if action == Membership.KNOCK and target_suspended:
raise SynapseError(
403,
"Knocking on rooms while account is suspended is not allowed.",
Codes.USER_ACCOUNT_SUSPENDED,
)
if (
not self.allow_per_room_profiles and not is_requester_server_notices_user
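Note on the membership hunk above: the invite, join and knock restrictions for suspended accounts are now skipped when the server itself is the authenticated entity. The sketch below restates that control flow as a standalone function. `AccountSuspendedError` and `check_membership_allowed` are hypothetical names, the action strings are assumed to match the `Membership` constants, and the `is_suspended` callable stands in for the store lookup.

```python
from typing import Callable


class AccountSuspendedError(Exception):
    """Hypothetical stand-in for a 403 error with a 'user suspended' code."""


def check_membership_allowed(
    action: str,
    requester_id: str,
    target_id: str,
    is_suspended: Callable[[str], bool],
    request_by_server: bool,
) -> None:
    """Raise if a suspended requester or target may not perform `action`."""
    # Server-initiated requests ignore suspension of requester and target.
    if request_by_server:
        return

    requester_suspended = is_suspended(requester_id)
    if action == "invite" and requester_suspended:
        raise AccountSuspendedError(
            "Sending invites while account is suspended is not allowed."
        )

    # Only look up the target separately when it differs from the requester.
    if target_id != requester_id:
        target_suspended = is_suspended(target_id)
    else:
        target_suspended = requester_suspended

    if action == "join" and target_suspended:
        raise AccountSuspendedError(
            "Joining rooms while account is suspended is not allowed."
        )
    if action == "knock" and target_suspended:
        raise AccountSuspendedError(
            "Knocking on rooms while account is suspended is not allowed."
        )
```

Leaving a room is never refused here, which matches the intent that suspended users can still take corrective actions.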
@@ -2163,6 +2169,7 @@ class RoomForgetterHandler(StateDeltasHandler):
super().__init__(hs)
self._hs = hs
self.server_name = hs.hostname
self._store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
self._clock = hs.get_clock()
@@ -2194,7 +2201,9 @@ class RoomForgetterHandler(StateDeltasHandler):
finally:
self._is_processing = False
run_as_background_process("room_forgetter.notify_new_event", process)
run_as_background_process(
"room_forgetter.notify_new_event", self.server_name, process
)
async def _unsafe_process(self) -> None:
# If self.pos is None then it means we haven't fetched it from DB
@@ -2251,7 +2260,9 @@ class RoomForgetterHandler(StateDeltasHandler):
self.pos = max_pos
# Expose current event processing position to prometheus
event_processing_positions.labels("room_forgetter").set(max_pos)
event_processing_positions.labels(
name="room_forgetter", **{SERVER_NAME_LABEL: self.server_name}
).set(max_pos)
await self._store.update_room_forgetter_stream_pos(max_pos)
+4 -52
@@ -24,16 +24,13 @@ import logging
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from io import BytesIO
from typing import TYPE_CHECKING, Any, Dict, Optional
from typing import TYPE_CHECKING, Dict, Optional
from pkg_resources import parse_version
import twisted
from twisted.internet.defer import Deferred
from twisted.internet.endpoints import HostnameEndpoint
from twisted.internet.interfaces import IOpenSSLContextFactory, IProtocolFactory
from twisted.internet.interfaces import IProtocolFactory
from twisted.internet.ssl import optionsForClientTLS
from twisted.mail.smtp import ESMTPSender, ESMTPSenderFactory
from twisted.mail.smtp import ESMTPSenderFactory
from twisted.protocols.tls import TLSMemoryBIOFactory
from synapse.logging.context import make_deferred_yieldable
@@ -44,49 +41,6 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
_is_old_twisted = parse_version(twisted.__version__) < parse_version("21")
class _BackportESMTPSender(ESMTPSender):
"""Extend old versions of ESMTPSender to configure TLS.
Unfortunately, before Twisted 21.2, ESMTPSender doesn't give an easy way to
disable TLS, or to configure the hostname used for TLS certificate validation.
This backports the `hostname` parameter for that functionality.
"""
__hostname: Optional[str]
def __init__(self, *args: Any, **kwargs: Any) -> None:
""""""
self.__hostname = kwargs.pop("hostname", None)
super().__init__(*args, **kwargs)
def _getContextFactory(self) -> Optional[IOpenSSLContextFactory]:
if self.context is not None:
return self.context
elif self.__hostname is None:
return None # disable TLS if hostname is None
return optionsForClientTLS(self.__hostname)
class _BackportESMTPSenderFactory(ESMTPSenderFactory):
"""An ESMTPSenderFactory for _BackportESMTPSender.
This backports the `hostname` parameter, to disable or configure TLS.
"""
__hostname: Optional[str]
def __init__(self, *args: Any, **kwargs: Any) -> None:
self.__hostname = kwargs.pop("hostname", None)
super().__init__(*args, **kwargs)
def protocol(self, *args: Any, **kwargs: Any) -> ESMTPSender: # type: ignore
# this overrides ESMTPSenderFactory's `protocol` attribute, with a Callable
# instantiating our _BackportESMTPSender, providing the hostname parameter
return _BackportESMTPSender(*args, **kwargs, hostname=self.__hostname)
async def _sendmail(
reactor: ISynapseReactor,
@@ -129,9 +83,7 @@ async def _sendmail(
elif tlsname is None:
tlsname = smtphost
factory: IProtocolFactory = (
_BackportESMTPSenderFactory if _is_old_twisted else ESMTPSenderFactory
)(
factory: IProtocolFactory = ESMTPSenderFactory(
username,
password,
from_addr,
+6 -4
@@ -38,6 +38,7 @@ from synapse.logging.opentracing import (
tag_args,
trace,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.storage.databases.main.roommember import extract_heroes_from_room_summary
from synapse.storage.databases.main.state_deltas import StateDelta
from synapse.storage.databases.main.stream import PaginateFunction
@@ -79,7 +80,7 @@ logger = logging.getLogger(__name__)
sync_processing_time = Histogram(
"synapse_sliding_sync_processing_time",
"Time taken to generate a sliding sync response, ignoring wait times.",
["initial"],
labelnames=["initial", SERVER_NAME_LABEL],
)
# Limit the number of state_keys we should remember sending down the connection for each
@@ -94,6 +95,7 @@ MAX_NUMBER_PREVIOUS_STATE_KEYS_TO_REMEMBER = 100
class SlidingSyncHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.clock = hs.get_clock()
self.store = hs.get_datastores().main
self.storage_controllers = hs.get_storage_controllers()
@@ -368,9 +370,9 @@ class SlidingSyncHandler:
set_tag(SynapseTags.FUNC_ARG_PREFIX + "sync_config.user", user_id)
end_time_s = self.clock.time()
sync_processing_time.labels(from_token is not None).observe(
end_time_s - start_time_s
)
sync_processing_time.labels(
initial=from_token is not None, **{SERVER_NAME_LABEL: self.server_name}
).observe(end_time_s - start_time_s)
return sliding_sync_result
+7 -5
@@ -202,7 +202,7 @@ class SsoHandler:
def __init__(self, hs: "HomeServer"):
self._clock = hs.get_clock()
self._store = hs.get_datastores().main
self._server_name = hs.hostname
self.server_name = hs.hostname
self._is_mine_server_name = hs.is_mine_server_name
self._registration_handler = hs.get_registration_handler()
self._auth_handler = hs.get_auth_handler()
@@ -238,7 +238,9 @@ class SsoHandler:
p_id = p.idp_id
assert p_id not in self._identity_providers
self._identity_providers[p_id] = p
init_counters_for_auth_provider(p_id)
init_counters_for_auth_provider(
auth_provider_id=p_id, server_name=self.server_name
)
def get_identity_providers(self) -> Mapping[str, SsoIdentityProvider]:
"""Get the configured identity providers"""
@@ -569,7 +571,7 @@ class SsoHandler:
return attributes
# Check if this mxid already exists
user_id = UserID(attributes.localpart, self._server_name).to_string()
user_id = UserID(attributes.localpart, self.server_name).to_string()
if not await self._store.get_users_by_id_case_insensitive(user_id):
# This mxid is free
break
@@ -907,7 +909,7 @@ class SsoHandler:
# render an error page.
html = self._bad_user_template.render(
server_name=self._server_name,
server_name=self.server_name,
user_id_to_verify=user_id_to_verify,
)
respond_with_html(request, 200, html)
@@ -959,7 +961,7 @@ class SsoHandler:
if contains_invalid_mxid_characters(localpart):
raise SynapseError(400, "localpart is invalid: %s" % (localpart,))
user_id = UserID(localpart, self._server_name).to_string()
user_id = UserID(localpart, self.server_name).to_string()
user_infos = await self._store.get_users_by_id_case_insensitive(user_id)
logger.info("[session %s] users: %s", session_id, user_infos)
+6 -3
@@ -32,7 +32,7 @@ from typing import (
)
from synapse.api.constants import EventContentFields, EventTypes, Membership
from synapse.metrics import event_processing_positions
from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.storage.databases.main.state_deltas import StateDelta
from synapse.types import JsonDict
@@ -54,6 +54,7 @@ class StatsHandler:
def __init__(self, hs: "HomeServer"):
self.hs = hs
self.server_name = hs.hostname
self.store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
self.state = hs.get_state_handler()
@@ -89,7 +90,7 @@ class StatsHandler:
finally:
self._is_processing = False
run_as_background_process("stats.notify_new_event", process)
run_as_background_process("stats.notify_new_event", self.server_name, process)
async def _unsafe_process(self) -> None:
# If self.pos is None then it means we haven't fetched it from DB
@@ -146,7 +147,9 @@ class StatsHandler:
logger.debug("Handled room stats to %s -> %s", self.pos, max_pos)
event_processing_positions.labels("stats").set(max_pos)
event_processing_positions.labels(
name="stats", **{SERVER_NAME_LABEL: self.server_name}
).set(max_pos)
self.pos = max_pos
+7 -2
@@ -63,6 +63,7 @@ from synapse.logging.opentracing import (
start_active_span,
trace,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.storage.databases.main.event_push_actions import RoomNotifCounts
from synapse.storage.databases.main.roommember import extract_heroes_from_room_summary
from synapse.storage.databases.main.stream import PaginateFunction
@@ -104,7 +105,7 @@ non_empty_sync_counter = Counter(
"Count of non empty sync responses. type is initial_sync/full_state_sync"
"/incremental_sync. lazy_loaded indicates if lazy loaded members were "
"enabled for that request.",
["type", "lazy_loaded"],
labelnames=["type", "lazy_loaded", SERVER_NAME_LABEL],
)
# Store the cache that tracks which lazy-loaded members have been sent to a given
@@ -614,7 +615,11 @@ class SyncHandler:
lazy_loaded = "true"
else:
lazy_loaded = "false"
non_empty_sync_counter.labels(sync_label, lazy_loaded).inc()
non_empty_sync_counter.labels(
type=sync_label,
lazy_loaded=lazy_loaded,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
return result
+55 -12
@@ -1,9 +1,15 @@
import logging
from http import HTTPStatus
from typing import TYPE_CHECKING, Optional
from synapse.api.errors import AuthError, NotFoundError
from synapse.storage.databases.main.thread_subscriptions import ThreadSubscription
from synapse.types import UserID
from synapse.api.constants import RelationTypes
from synapse.api.errors import AuthError, Codes, NotFoundError, SynapseError
from synapse.events import relation_from_event
from synapse.storage.databases.main.thread_subscriptions import (
AutomaticSubscriptionConflicted,
ThreadSubscription,
)
from synapse.types import EventOrderings, UserID
if TYPE_CHECKING:
from synapse.server import HomeServer
@@ -55,42 +61,79 @@ class ThreadSubscriptionsHandler:
room_id: str,
thread_root_event_id: str,
*,
automatic: bool,
automatic_event_id: Optional[str],
) -> Optional[int]:
"""Sets or updates a user's subscription settings for a specific thread root.
Args:
requester_user_id: The ID of the user whose settings are being updated.
thread_root_event_id: The event ID of the thread root.
automatic: whether the user was subscribed by an automatic decision by
their client.
automatic_event_id: if the user was subscribed by an automatic decision by
their client, the event ID that caused this.
Returns:
The stream ID for this update, if the update isn't no-opped.
Raises:
NotFoundError if the user cannot access the thread root event, or it isn't
known to this homeserver.
known to this homeserver. Ditto for the automatic cause event if supplied.
SynapseError(400, M_NOT_IN_THREAD): if client supplied an automatic cause event
but user cannot access the event.
SynapseError(409, M_SKIPPED): if client requested an automatic subscription
but it was skipped because the cause event is logically later than an unsubscription.
"""
# First check that the user can access the thread root event
# and that it exists
try:
event = await self.event_handler.get_event(
thread_root_event = await self.event_handler.get_event(
user_id, room_id, thread_root_event_id
)
if event is None:
if thread_root_event is None:
raise NotFoundError("No such thread root")
except AuthError:
logger.info("rejecting thread subscriptions change (thread not accessible)")
raise NotFoundError("No such thread root")
return await self.store.subscribe_user_to_thread(
if automatic_event_id:
autosub_cause_event = await self.event_handler.get_event(
user_id, room_id, automatic_event_id
)
if autosub_cause_event is None:
raise NotFoundError("Automatic subscription event not found")
relation = relation_from_event(autosub_cause_event)
if (
relation is None
or relation.rel_type != RelationTypes.THREAD
or relation.parent_id != thread_root_event_id
):
raise SynapseError(
HTTPStatus.BAD_REQUEST,
"Automatic subscription must use an event in the thread",
errcode=Codes.MSC4306_NOT_IN_THREAD,
)
automatic_event_orderings = EventOrderings.from_event(autosub_cause_event)
else:
automatic_event_orderings = None
outcome = await self.store.subscribe_user_to_thread(
user_id.to_string(),
event.room_id,
room_id,
thread_root_event_id,
automatic=automatic,
automatic_event_orderings=automatic_event_orderings,
)
if isinstance(outcome, AutomaticSubscriptionConflicted):
raise SynapseError(
HTTPStatus.CONFLICT,
"Automatic subscription obsoleted by an unsubscription request.",
errcode=Codes.MSC4306_CONFLICTING_UNSUBSCRIPTION,
)
return outcome
async def unsubscribe_user_from_thread(
self, user_id: UserID, room_id: str, thread_root_event_id: str
) -> Optional[int]:
+14 -3
@@ -80,7 +80,9 @@ class FollowerTypingHandler:
def __init__(self, hs: "HomeServer"):
self.store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
self.server_name = hs.config.server.server_name
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.clock = hs.get_clock()
self.is_mine_id = hs.is_mine_id
self.is_mine_server_name = hs.is_mine_server_name
@@ -143,7 +145,11 @@ class FollowerTypingHandler:
last_fed_poke = self._member_last_federation_poke.get(member, None)
if not last_fed_poke or last_fed_poke + FEDERATION_PING_INTERVAL <= now:
run_as_background_process(
"typing._push_remote", self._push_remote, member=member, typing=True
"typing._push_remote",
self.server_name,
self._push_remote,
member=member,
typing=True,
)
# Add a paranoia timer to ensure that we always have a timer for
@@ -216,6 +222,7 @@ class FollowerTypingHandler:
if self.federation:
run_as_background_process(
"_send_changes_in_typing_to_remotes",
self.server_name,
self._send_changes_in_typing_to_remotes,
row.room_id,
prev_typing,
@@ -378,7 +385,11 @@ class TypingWriterHandler(FollowerTypingHandler):
if self.hs.is_mine_id(member.user_id):
# Only send updates for changes to our own users.
run_as_background_process(
"typing._push_remote", self._push_remote, member, typing
"typing._push_remote",
self.server_name,
self._push_remote,
member,
typing,
)
self._push_update_local(member=member, typing=typing)
+13 -6
@@ -35,6 +35,7 @@ from synapse.api.constants import (
)
from synapse.api.errors import Codes, SynapseError
from synapse.handlers.state_deltas import MatchChange, StateDeltasHandler
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.storage.databases.main.state_deltas import StateDelta
from synapse.storage.databases.main.user_directory import SearchResult
@@ -192,7 +193,9 @@ class UserDirectoryHandler(StateDeltasHandler):
self._is_processing = False
self._is_processing = True
run_as_background_process("user_directory.notify_new_event", process)
run_as_background_process(
"user_directory.notify_new_event", self.server_name, process
)
async def handle_local_profile_change(
self, user_id: str, profile: ProfileInfo
@@ -260,9 +263,9 @@ class UserDirectoryHandler(StateDeltasHandler):
self.pos = max_pos
# Expose current event processing position to prometheus
synapse.metrics.event_processing_positions.labels("user_dir").set(
max_pos
)
synapse.metrics.event_processing_positions.labels(
name="user_dir", **{SERVER_NAME_LABEL: self.server_name}
).set(max_pos)
await self.store.update_user_directory_stream_pos(max_pos)
@@ -606,7 +609,9 @@ class UserDirectoryHandler(StateDeltasHandler):
self._is_refreshing_remote_profiles = False
self._is_refreshing_remote_profiles = True
run_as_background_process("user_directory.refresh_remote_profiles", process)
run_as_background_process(
"user_directory.refresh_remote_profiles", self.server_name, process
)
async def _unsafe_refresh_remote_profiles(self) -> None:
limit = MAX_SERVERS_TO_REFRESH_PROFILES_FOR_IN_ONE_GO - len(
@@ -688,7 +693,9 @@ class UserDirectoryHandler(StateDeltasHandler):
self._is_refreshing_remote_profiles_for_servers.add(server_name)
run_as_background_process(
"user_directory.refresh_remote_profiles_for_remote_server", process
"user_directory.refresh_remote_profiles_for_remote_server",
self.server_name,
process,
)
async def _unsafe_refresh_remote_profiles_for_remote_server(
+3
@@ -66,6 +66,9 @@ class WorkerLocksHandler:
"""
def __init__(self, hs: "HomeServer") -> None:
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self._reactor = hs.get_reactor()
self._store = hs.get_datastores().main
self._clock = hs.get_clock()
+38 -11
@@ -85,6 +85,7 @@ from synapse.http.replicationagent import ReplicationAgent
from synapse.http.types import QueryParams
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.logging.opentracing import set_tag, start_active_span, tags
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import ISynapseReactor, StrSequence
from synapse.util import json_decoder
from synapse.util.async_helpers import timeout_deferred
@@ -108,9 +109,13 @@ except ImportError:
logger = logging.getLogger(__name__)
outgoing_requests_counter = Counter("synapse_http_client_requests", "", ["method"])
outgoing_requests_counter = Counter(
"synapse_http_client_requests", "", labelnames=["method", SERVER_NAME_LABEL]
)
incoming_responses_counter = Counter(
"synapse_http_client_responses", "", ["method", "code"]
"synapse_http_client_responses",
"",
labelnames=["method", "code", SERVER_NAME_LABEL],
)
# the type of the headers map, to be passed to the t.w.h.Headers.
@@ -346,6 +351,7 @@ class BaseHttpClient:
treq_args: Optional[Dict[str, Any]] = None,
):
self.hs = hs
self.server_name = hs.hostname
self.reactor = hs.get_reactor()
self._extra_treq_args = treq_args or {}
@@ -384,7 +390,9 @@ class BaseHttpClient:
RequestTimedOutError if the request times out before the headers are read
"""
outgoing_requests_counter.labels(method).inc()
outgoing_requests_counter.labels(
method=method, **{SERVER_NAME_LABEL: self.server_name}
).inc()
# log request but strip `access_token` (AS requests for example include this)
logger.debug("Sending request %s %s", method, redact_uri(uri))
@@ -438,7 +446,11 @@ class BaseHttpClient:
response = await make_deferred_yieldable(request_deferred)
incoming_responses_counter.labels(method, response.code).inc()
incoming_responses_counter.labels(
method=method,
code=response.code,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
logger.info(
"Received response to %s %s: %s",
method,
@@ -447,7 +459,11 @@ class BaseHttpClient:
)
return response
except Exception as e:
incoming_responses_counter.labels(method, "ERR").inc()
incoming_responses_counter.labels(
method=method,
code="ERR",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
logger.info(
"Error sending request to %s %s: %s %s",
method,
@@ -821,12 +837,12 @@ class SimpleHttpClient(BaseHttpClient):
pool.cachedConnectionTimeout = 2 * 60
self.agent: IAgent = ProxyAgent(
self.reactor,
hs.get_reactor(),
reactor=self.reactor,
proxy_reactor=hs.get_reactor(),
connectTimeout=15,
contextFactory=self.hs.get_http_client_context_factory(),
pool=pool,
use_proxy=use_proxy,
proxy_config=hs.config.server.proxy_config,
)
if self._ip_blocklist:
@@ -855,6 +871,7 @@ class ReplicationClient(BaseHttpClient):
hs: The HomeServer instance to pass in
"""
super().__init__(hs)
self.server_name = hs.hostname
# Use a pool, but a very small one.
pool = HTTPConnectionPool(self.reactor)
@@ -891,7 +908,9 @@ class ReplicationClient(BaseHttpClient):
RequestTimedOutError if the request times out before the headers are read
"""
outgoing_requests_counter.labels(method).inc()
outgoing_requests_counter.labels(
method=method, **{SERVER_NAME_LABEL: self.server_name}
).inc()
logger.debug("Sending request %s %s", method, uri)
@@ -948,7 +967,11 @@ class ReplicationClient(BaseHttpClient):
response = await make_deferred_yieldable(request_deferred)
incoming_responses_counter.labels(method, response.code).inc()
incoming_responses_counter.labels(
method=method,
code=response.code,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
logger.info(
"Received response to %s %s: %s",
method,
@@ -957,7 +980,11 @@ class ReplicationClient(BaseHttpClient):
)
return response
except Exception as e:
incoming_responses_counter.labels(method, "ERR").inc()
incoming_responses_counter.labels(
method=method,
code="ERR",
**{SERVER_NAME_LABEL: self.server_name},
).inc()
logger.info(
"Error sending request to %s %s: %s %s",
method,
@@ -21,7 +21,6 @@ import logging
import urllib.parse
from typing import Any, Generator, List, Optional
from urllib.request import ( # type: ignore[attr-defined]
getproxies_environment,
proxy_bypass_environment,
)
@@ -40,6 +39,7 @@ from twisted.web.client import URI, Agent, HTTPConnectionPool
from twisted.web.http_headers import Headers
from twisted.web.iweb import IAgent, IAgentEndpointFactory, IBodyProducer, IResponse
from synapse.config.server import ProxyConfig
from synapse.crypto.context_factory import FederationPolicyForHTTPS
from synapse.http import proxyagent
from synapse.http.client import BlocklistingAgentWrapper, BlocklistingReactorWrapper
@@ -77,6 +77,8 @@ class MatrixFederationAgent:
ip_blocklist: Disallowed IP addresses.
proxy_config: Proxy configuration to use for this agent.
proxy_reactor: twisted reactor to use for connections to the proxy server
reactor might have some blocking applied (i.e. for DNS queries),
but we need unblocked access to the proxy.
@@ -92,12 +94,14 @@ class MatrixFederationAgent:
def __init__(
self,
*,
server_name: str,
reactor: ISynapseReactor,
tls_client_options_factory: Optional[FederationPolicyForHTTPS],
user_agent: bytes,
ip_allowlist: Optional[IPSet],
ip_blocklist: IPSet,
proxy_config: Optional[ProxyConfig] = None,
_srv_resolver: Optional[SrvResolver] = None,
_well_known_resolver: Optional[WellKnownResolver] = None,
):
@@ -129,10 +133,11 @@ class MatrixFederationAgent:
self._agent = Agent.usingEndpointFactory(
reactor,
MatrixHostnameEndpointFactory(
reactor,
proxy_reactor,
tls_client_options_factory,
_srv_resolver,
reactor=reactor,
proxy_reactor=proxy_reactor,
tls_client_options_factory=tls_client_options_factory,
srv_resolver=_srv_resolver,
proxy_config=proxy_config,
),
pool=self._pool,
)
@@ -144,11 +149,11 @@ class MatrixFederationAgent:
reactor=reactor,
agent=BlocklistingAgentWrapper(
ProxyAgent(
reactor,
proxy_reactor,
reactor=reactor,
proxy_reactor=proxy_reactor,
pool=self._pool,
contextFactory=tls_client_options_factory,
use_proxy=True,
proxy_config=proxy_config,
),
ip_blocklist=ip_blocklist,
),
@@ -246,14 +251,17 @@ class MatrixHostnameEndpointFactory:
def __init__(
self,
*,
reactor: IReactorCore,
proxy_reactor: IReactorCore,
tls_client_options_factory: Optional[FederationPolicyForHTTPS],
srv_resolver: Optional[SrvResolver],
proxy_config: Optional[ProxyConfig],
):
self._reactor = reactor
self._proxy_reactor = proxy_reactor
self._tls_client_options_factory = tls_client_options_factory
self._proxy_config = proxy_config
if srv_resolver is None:
srv_resolver = SrvResolver()
@@ -262,11 +270,12 @@ class MatrixHostnameEndpointFactory:
def endpointForURI(self, parsed_uri: URI) -> "MatrixHostnameEndpoint":
return MatrixHostnameEndpoint(
self._reactor,
self._proxy_reactor,
self._tls_client_options_factory,
self._srv_resolver,
parsed_uri,
reactor=self._reactor,
proxy_reactor=self._proxy_reactor,
tls_client_options_factory=self._tls_client_options_factory,
srv_resolver=self._srv_resolver,
proxy_config=self._proxy_config,
parsed_uri=parsed_uri,
)
@@ -283,6 +292,7 @@ class MatrixHostnameEndpoint:
tls_client_options_factory:
factory to use for fetching client tls options, or none to disable TLS.
srv_resolver: The SRV resolver to use
proxy_config: Proxy configuration to use for this agent.
parsed_uri: The parsed URI that we're wanting to connect to.
Raises:
@@ -292,26 +302,28 @@ class MatrixHostnameEndpoint:
def __init__(
self,
*,
reactor: IReactorCore,
proxy_reactor: IReactorCore,
tls_client_options_factory: Optional[FederationPolicyForHTTPS],
srv_resolver: SrvResolver,
proxy_config: Optional[ProxyConfig],
parsed_uri: URI,
):
self._reactor = reactor
self._parsed_uri = parsed_uri
self.proxy_config = proxy_config
# http_proxy is not needed because federation is always over TLS
proxies = getproxies_environment()
https_proxy = proxies["https"].encode() if "https" in proxies else None
self.no_proxy = proxies["no"] if "no" in proxies else None
# endpoint and credentials to use to connect to the outbound https proxy, if any.
(
self._https_proxy_endpoint,
self._https_proxy_creds,
) = proxyagent.http_proxy_endpoint(
https_proxy,
self.proxy_config.https_proxy.encode()
if self.proxy_config and self.proxy_config.https_proxy
else None,
proxy_reactor,
tls_client_options_factory,
)
@@ -348,10 +360,10 @@ class MatrixHostnameEndpoint:
port = server.port
should_skip_proxy = False
if self.no_proxy is not None:
if self.proxy_config is not None:
should_skip_proxy = proxy_bypass_environment(
host.decode(),
proxies={"no": self.no_proxy},
proxies=self.proxy_config.get_proxies_dictionary(),
)
endpoint: IStreamClientEndpoint
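As a rough standalone sketch of the bypass check in the hunk above: the standard library's `proxy_bypass_environment` takes a host and a proxies mapping and reports whether the host matches the mapping's `no` entry. The dictionary below is a hypothetical stand-in for what `ProxyConfig.get_proxies_dictionary()` might return; its real shape is not shown in this diff, and the host names are made up for illustration.

```python
from urllib.request import proxy_bypass_environment

# Hypothetical stand-in for ProxyConfig.get_proxies_dictionary(); the real
# return value is not shown in this diff.
proxies = {
    "https": "http://proxy.example.com:8888",
    "no": "localhost,.internal.example.com",
}


def should_skip_proxy(host: str) -> bool:
    """Return True when `host` matches an entry in the `no` list."""
    return bool(proxy_bypass_environment(host, proxies=proxies))


# A host under a listed suffix bypasses the proxy; an unrelated host does not.
skip_internal = should_skip_proxy("matrix.internal.example.com")
skip_external = should_skip_proxy("matrix.org")
```

Passing the mapping explicitly, as the diff does, keeps the decision tied to the configured values rather than to whatever is in the process environment at call time.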
+21 -10
@@ -87,6 +87,7 @@ from synapse.http.types import QueryParams
from synapse.logging import opentracing
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.logging.opentracing import set_tag, start_active_span, tags
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import JsonDict
from synapse.util import json_decoder
from synapse.util.async_helpers import AwakenableSleeper, Linearizer, timeout_deferred
@@ -99,10 +100,14 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
outgoing_requests_counter = Counter(
"synapse_http_matrixfederationclient_requests", "", ["method"]
"synapse_http_matrixfederationclient_requests",
"",
labelnames=["method", SERVER_NAME_LABEL],
)
incoming_responses_counter = Counter(
"synapse_http_matrixfederationclient_responses", "", ["method", "code"]
"synapse_http_matrixfederationclient_responses",
"",
labelnames=["method", "code", SERVER_NAME_LABEL],
)
@@ -423,6 +428,7 @@ class MatrixFederationHttpClient:
user_agent=user_agent.encode("ascii"),
ip_allowlist=hs.config.server.federation_ip_range_allowlist,
ip_blocklist=hs.config.server.federation_ip_range_blocklist,
proxy_config=hs.config.server.proxy_config,
)
else:
proxy_authorization_secret = hs.config.worker.worker_replication_secret
@@ -437,9 +443,9 @@ class MatrixFederationHttpClient:
# locations
federation_proxy_locations = outbound_federation_restricted_to.locations
federation_agent = ProxyAgent(
self.reactor,
self.reactor,
tls_client_options_factory,
reactor=self.reactor,
proxy_reactor=self.reactor,
contextFactory=tls_client_options_factory,
federation_proxy_locations=federation_proxy_locations,
federation_proxy_credentials=federation_proxy_credentials,
)
@@ -619,9 +625,10 @@ class MatrixFederationHttpClient:
raise FederationDeniedError(request.destination)
limiter = await synapse.util.retryutils.get_retry_limiter(
request.destination,
self.clock,
self._store,
destination=request.destination,
our_server_name=self.server_name,
clock=self.clock,
store=self._store,
backoff_on_404=backoff_on_404,
ignore_backoff=ignore_backoff,
notifier=self.hs.get_notifier(),
@@ -695,7 +702,9 @@ class MatrixFederationHttpClient:
_sec_timeout,
)
outgoing_requests_counter.labels(request.method).inc()
outgoing_requests_counter.labels(
method=request.method, **{SERVER_NAME_LABEL: self.server_name}
).inc()
try:
with Measure(
@@ -734,7 +743,9 @@ class MatrixFederationHttpClient:
raise RequestSendFailed(e, can_retry=True) from e
incoming_responses_counter.labels(
request.method, response.code
method=request.method,
code=response.code,
**{SERVER_NAME_LABEL: self.server_name},
).inc()
set_tag(tags.HTTP_STATUS_CODE, response.code)
+23 -21
@@ -24,7 +24,6 @@ import re
from typing import Any, Collection, Dict, List, Optional, Sequence, Tuple, Union, cast
from urllib.parse import urlparse
from urllib.request import ( # type: ignore[attr-defined]
getproxies_environment,
proxy_bypass_environment,
)
@@ -54,6 +53,7 @@ from twisted.web.error import SchemeNotSupported
from twisted.web.http_headers import Headers
from twisted.web.iweb import IAgent, IBodyProducer, IPolicyForHTTPS, IResponse
from synapse.config.server import ProxyConfig
from synapse.config.workers import (
InstanceLocationConfig,
InstanceTcpLocationConfig,
@@ -99,8 +99,7 @@ class ProxyAgent(_AgentBase):
pool: connection pool to be used. If None, a
non-persistent pool instance will be created.
use_proxy: Whether proxy settings should be discovered and used
from conventional environment variables.
proxy_config: Proxy configuration to use for this agent.
federation_proxy_locations: An optional list of locations to proxy outbound federation
traffic through (only requests that use the `matrix-federation://` scheme
@@ -118,13 +117,14 @@ class ProxyAgent(_AgentBase):
def __init__(
self,
*,
reactor: IReactorCore,
proxy_reactor: Optional[IReactorCore] = None,
contextFactory: Optional[IPolicyForHTTPS] = None,
connectTimeout: Optional[float] = None,
bindAddress: Optional[bytes] = None,
pool: Optional[HTTPConnectionPool] = None,
use_proxy: bool = False,
proxy_config: Optional[ProxyConfig] = None,
federation_proxy_locations: Collection[InstanceLocationConfig] = (),
federation_proxy_credentials: Optional[ProxyCredentials] = None,
):
@@ -145,31 +145,33 @@ class ProxyAgent(_AgentBase):
if bindAddress is not None:
self._endpoint_kwargs["bindAddress"] = bindAddress
http_proxy = None
https_proxy = None
no_proxy = None
if use_proxy:
proxies = getproxies_environment()
http_proxy = proxies["http"].encode() if "http" in proxies else None
https_proxy = proxies["https"].encode() if "https" in proxies else None
no_proxy = proxies["no"] if "no" in proxies else None
self.proxy_config = proxy_config
if self.proxy_config is not None:
logger.debug(
"Using proxy settings: http_proxy=%s, https_proxy=%s, no_proxy=%s",
http_proxy,
https_proxy,
no_proxy,
self.proxy_config.http_proxy,
self.proxy_config.https_proxy,
self.proxy_config.no_proxy_hosts,
)
self.http_proxy_endpoint, self.http_proxy_creds = http_proxy_endpoint(
http_proxy, self.proxy_reactor, contextFactory, **self._endpoint_kwargs
self.proxy_config.http_proxy.encode()
if self.proxy_config and self.proxy_config.http_proxy
else None,
self.proxy_reactor,
contextFactory,
**self._endpoint_kwargs,
)
self.https_proxy_endpoint, self.https_proxy_creds = http_proxy_endpoint(
https_proxy, self.proxy_reactor, contextFactory, **self._endpoint_kwargs
self.proxy_config.https_proxy.encode()
if self.proxy_config and self.proxy_config.https_proxy
else None,
self.proxy_reactor,
contextFactory,
**self._endpoint_kwargs,
)
self.no_proxy = no_proxy
self._policy_for_https = contextFactory
self._reactor = cast(IReactorTime, reactor)
@@ -268,10 +270,10 @@ class ProxyAgent(_AgentBase):
request_path = parsed_uri.originForm
should_skip_proxy = False
if self.no_proxy is not None:
if self.proxy_config is not None:
should_skip_proxy = proxy_bypass_environment(
parsed_uri.host.decode(),
proxies={"no": self.no_proxy},
proxies=self.proxy_config.get_proxies_dictionary(),
)
if (
+90 -46
@@ -27,40 +27,52 @@ from typing import Dict, Mapping, Set, Tuple
from prometheus_client.core import Counter, Histogram
from synapse.logging.context import current_context
from synapse.metrics import LaterGauge
from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
logger = logging.getLogger(__name__)
# total number of responses served, split by method/servlet/tag
response_count = Counter(
"synapse_http_server_response_count", "", ["method", "servlet", "tag"]
"synapse_http_server_response_count",
"",
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
requests_counter = Counter(
"synapse_http_server_requests_received", "", ["method", "servlet"]
"synapse_http_server_requests_received",
"",
labelnames=["method", "servlet", SERVER_NAME_LABEL],
)
outgoing_responses_counter = Counter(
"synapse_http_server_responses", "", ["method", "code"]
"synapse_http_server_responses",
"",
labelnames=["method", "code", SERVER_NAME_LABEL],
)
response_timer = Histogram(
"synapse_http_server_response_time_seconds",
"sec",
["method", "servlet", "tag", "code"],
labelnames=["method", "servlet", "tag", "code", SERVER_NAME_LABEL],
)
response_ru_utime = Counter(
"synapse_http_server_response_ru_utime_seconds", "sec", ["method", "servlet", "tag"]
"synapse_http_server_response_ru_utime_seconds",
"sec",
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
response_ru_stime = Counter(
"synapse_http_server_response_ru_stime_seconds", "sec", ["method", "servlet", "tag"]
"synapse_http_server_response_ru_stime_seconds",
"sec",
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
response_db_txn_count = Counter(
"synapse_http_server_response_db_txn_count", "", ["method", "servlet", "tag"]
"synapse_http_server_response_db_txn_count",
"",
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
# seconds spent waiting for db txns, excluding scheduling time, when processing
@@ -68,34 +80,42 @@ response_db_txn_count = Counter(
response_db_txn_duration = Counter(
"synapse_http_server_response_db_txn_duration_seconds",
"",
["method", "servlet", "tag"],
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
# seconds spent waiting for a db connection, when processing this request
response_db_sched_duration = Counter(
"synapse_http_server_response_db_sched_duration_seconds",
"",
["method", "servlet", "tag"],
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
# size in bytes of the response written
response_size = Counter(
"synapse_http_server_response_size", "", ["method", "servlet", "tag"]
"synapse_http_server_response_size",
"",
labelnames=["method", "servlet", "tag", SERVER_NAME_LABEL],
)
# In flight metrics are incremented while the requests are in flight, rather
# than when the response was written.
in_flight_requests_ru_utime = Counter(
"synapse_http_server_in_flight_requests_ru_utime_seconds", "", ["method", "servlet"]
"synapse_http_server_in_flight_requests_ru_utime_seconds",
"",
labelnames=["method", "servlet", SERVER_NAME_LABEL],
)
in_flight_requests_ru_stime = Counter(
"synapse_http_server_in_flight_requests_ru_stime_seconds", "", ["method", "servlet"]
"synapse_http_server_in_flight_requests_ru_stime_seconds",
"",
labelnames=["method", "servlet", SERVER_NAME_LABEL],
)
in_flight_requests_db_txn_count = Counter(
"synapse_http_server_in_flight_requests_db_txn_count", "", ["method", "servlet"]
"synapse_http_server_in_flight_requests_db_txn_count",
"",
labelnames=["method", "servlet", SERVER_NAME_LABEL],
)
# seconds spent waiting for db txns, excluding scheduling time, when processing
@@ -103,14 +123,14 @@ in_flight_requests_db_txn_count = Counter(
in_flight_requests_db_txn_duration = Counter(
"synapse_http_server_in_flight_requests_db_txn_duration_seconds",
"",
["method", "servlet"],
labelnames=["method", "servlet", SERVER_NAME_LABEL],
)
# seconds spent waiting for a db connection, when processing this request
in_flight_requests_db_sched_duration = Counter(
"synapse_http_server_in_flight_requests_db_sched_duration_seconds",
"",
["method", "servlet"],
labelnames=["method", "servlet", SERVER_NAME_LABEL],
)
_in_flight_requests: Set["RequestMetrics"] = set()
@@ -124,31 +144,42 @@ def _get_in_flight_counts() -> Mapping[Tuple[str, ...], int]:
# Cast to a list to prevent it changing while the Prometheus
# thread is collecting metrics
with _in_flight_requests_lock:
reqs = list(_in_flight_requests)
request_metrics = list(_in_flight_requests)
for rm in reqs:
rm.update_metrics()
for request_metric in request_metrics:
request_metric.update_metrics()
# Map from (method, name) -> int, the number of in flight requests of that
# type. The key type is Tuple[str, str], but we leave the length unspecified
# for compatibility with LaterGauge's annotations.
counts: Dict[Tuple[str, ...], int] = {}
for rm in reqs:
key = (rm.method, rm.name)
for request_metric in request_metrics:
key = (
request_metric.method,
request_metric.name,
request_metric.our_server_name,
)
counts[key] = counts.get(key, 0) + 1
return counts
LaterGauge(
"synapse_http_server_in_flight_requests_count",
"",
["method", "servlet"],
_get_in_flight_counts,
name="synapse_http_server_in_flight_requests_count",
desc="",
labelnames=["method", "servlet", SERVER_NAME_LABEL],
caller=_get_in_flight_counts,
)
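The key change in `_get_in_flight_counts` is that the grouping key grows from `(method, name)` to `(method, name, our_server_name)`, matching the new `labelnames` order. A minimal sketch of that grouping, assuming a hypothetical `InFlightRequest` stand-in for `RequestMetrics` (only the three fields used for the key are modelled, and no locking or Prometheus types are involved):

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class InFlightRequest:
    # Hypothetical stand-in for `RequestMetrics`: just the fields used in the key.
    method: str
    name: str
    our_server_name: str


def count_in_flight(requests: Iterable[InFlightRequest]) -> Dict[Tuple[str, ...], int]:
    """Count requests per (method, servlet, server name) label tuple."""
    counts: Dict[Tuple[str, ...], int] = {}
    for request in requests:
        # The tuple order must match the order of the metric's label names.
        key = (request.method, request.name, request.our_server_name)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Because the server name is part of the key, two homeservers sharing one process would report separate in-flight counts for the same servlet.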
class RequestMetrics:
def __init__(self, our_server_name: str) -> None:
"""
Args:
our_server_name: Our homeserver name (used to label metrics) (`hs.hostname`)
"""
self.our_server_name = our_server_name
def start(self, time_sec: float, name: str, method: str) -> None:
self.start_ts = time_sec
self.start_context = current_context()
@@ -194,33 +225,40 @@ class RequestMetrics:
response_code_str = str(response_code)
outgoing_responses_counter.labels(self.method, response_code_str).inc()
outgoing_responses_counter.labels(
method=self.method,
code=response_code_str,
**{SERVER_NAME_LABEL: self.our_server_name},
).inc()
response_count.labels(self.method, self.name, tag).inc()
response_base_labels = {
"method": self.method,
"servlet": self.name,
"tag": tag,
SERVER_NAME_LABEL: self.our_server_name,
}
response_timer.labels(self.method, self.name, tag, response_code_str).observe(
time_sec - self.start_ts
)
response_count.labels(**response_base_labels).inc()
response_timer.labels(
code=response_code_str,
**response_base_labels,
).observe(time_sec - self.start_ts)
resource_usage = context.get_resource_usage()
response_ru_utime.labels(self.method, self.name, tag).inc(
resource_usage.ru_utime
)
response_ru_stime.labels(self.method, self.name, tag).inc(
resource_usage.ru_stime
)
response_db_txn_count.labels(self.method, self.name, tag).inc(
response_ru_utime.labels(**response_base_labels).inc(resource_usage.ru_utime)
response_ru_stime.labels(**response_base_labels).inc(resource_usage.ru_stime)
response_db_txn_count.labels(**response_base_labels).inc(
resource_usage.db_txn_count
)
response_db_txn_duration.labels(self.method, self.name, tag).inc(
response_db_txn_duration.labels(**response_base_labels).inc(
resource_usage.db_txn_duration_sec
)
response_db_sched_duration.labels(self.method, self.name, tag).inc(
response_db_sched_duration.labels(**response_base_labels).inc(
resource_usage.db_sched_duration_sec
)
response_size.labels(self.method, self.name, tag).inc(sent_bytes)
response_size.labels(**response_base_labels).inc(sent_bytes)
# We always call this at the end to ensure that we update the metrics
# regardless of whether a call to /metrics while the request was in
@@ -240,24 +278,30 @@ class RequestMetrics:
diff = new_stats - self._request_stats
self._request_stats = new_stats
in_flight_labels = {
"method": self.method,
"servlet": self.name,
SERVER_NAME_LABEL: self.our_server_name,
}
# max() is used since rapid use of ru_stime/ru_utime can end up with the
# count going backwards due to NTP, time smearing, fine-grained
# correction, or floating points. Who knows, really?
in_flight_requests_ru_utime.labels(self.method, self.name).inc(
in_flight_requests_ru_utime.labels(**in_flight_labels).inc(
max(diff.ru_utime, 0)
)
in_flight_requests_ru_stime.labels(self.method, self.name).inc(
in_flight_requests_ru_stime.labels(**in_flight_labels).inc(
max(diff.ru_stime, 0)
)
in_flight_requests_db_txn_count.labels(self.method, self.name).inc(
in_flight_requests_db_txn_count.labels(**in_flight_labels).inc(
diff.db_txn_count
)
in_flight_requests_db_txn_duration.labels(self.method, self.name).inc(
in_flight_requests_db_txn_duration.labels(**in_flight_labels).inc(
diff.db_txn_duration_sec
)
in_flight_requests_db_sched_duration.labels(self.method, self.name).inc(
in_flight_requests_db_sched_duration.labels(**in_flight_labels).inc(
diff.db_sched_duration_sec
)
+1 -1
@@ -337,7 +337,7 @@ class _AsyncResource(resource.Resource, metaclass=abc.ABCMeta):
callback_return = await self._async_render(request)
except LimitExceededError as e:
if e.pause:
self._clock.sleep(e.pause)
await self._clock.sleep(e.pause)
raise
if callback_return is not None:
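The one-line fix above adds a missing `await`. Assuming `Clock.sleep` is a coroutine function (which the fix implies), calling it without `await` only creates a coroutine object and never pauses. The sketch below shows the same pitfall with `asyncio.sleep` as a stand-in; Synapse itself runs on Twisted, so this illustrates the pattern rather than reproducing the real code path, and both function names are hypothetical.

```python
import asyncio
import time


async def pause_without_await(delay: float) -> float:
    """Bug pattern: the sleep coroutine is created but never run."""
    start = time.monotonic()
    pending = asyncio.sleep(delay)  # no `await`, so no pause happens
    elapsed = time.monotonic() - start
    pending.close()  # tidy up so Python does not warn about an un-awaited coroutine
    return elapsed


async def pause_with_await(delay: float) -> float:
    """Fixed pattern: awaiting the coroutine actually suspends for `delay` seconds."""
    start = time.monotonic()
    await asyncio.sleep(delay)
    return time.monotonic() - start
```

In the rate-limiting path, the un-awaited version would re-raise `LimitExceededError` immediately instead of pausing first.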
+11 -2
@@ -44,6 +44,7 @@ from synapse.logging.context import (
LoggingContext,
PreserveLoggingContext,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import ISynapseReactor, Requester
if TYPE_CHECKING:
@@ -83,12 +84,14 @@ class SynapseRequest(Request):
self,
channel: HTTPChannel,
site: "SynapseSite",
our_server_name: str,
*args: Any,
max_request_body_size: int = 1024,
request_id_header: Optional[str] = None,
**kw: Any,
):
super().__init__(channel, *args, **kw)
self.our_server_name = our_server_name
self._max_request_body_size = max_request_body_size
self.request_id_header = request_id_header
self.synapse_site = site
@@ -334,7 +337,11 @@ class SynapseRequest(Request):
# dispatching to the handler, so that the handler
# can update the servlet name in the request
# metrics
requests_counter.labels(self.get_method(), self.request_metrics.name).inc()
requests_counter.labels(
method=self.get_method(),
servlet=self.request_metrics.name,
**{SERVER_NAME_LABEL: self.our_server_name},
).inc()
@contextlib.contextmanager
def processing(self) -> Generator[None, None, None]:
@@ -455,7 +462,7 @@ class SynapseRequest(Request):
self.request_metrics.name.
"""
self.start_time = time.time()
self.request_metrics = RequestMetrics()
self.request_metrics = RequestMetrics(our_server_name=self.our_server_name)
self.request_metrics.start(
self.start_time, name=servlet_name, method=self.get_method()
)
@@ -694,6 +701,7 @@ class SynapseSite(ProxySite):
self.site_tag = site_tag
self.reactor: ISynapseReactor = reactor
self.server_name = hs.hostname
assert config.http_options is not None
proxied = config.http_options.x_forwarded
@@ -705,6 +713,7 @@ class SynapseSite(ProxySite):
return request_class(
channel,
self,
our_server_name=self.server_name,
max_request_body_size=max_request_body_size,
queued=queued,
request_id_header=request_id_header,
+25
@@ -0,0 +1,25 @@
import logging
root_logger = logging.getLogger()
class ExplicitlyConfiguredLogger(logging.Logger):
"""
A custom logger class that only allows logging if the logger is explicitly
configured (does not inherit log level from parent).
"""
def isEnabledFor(self, level: int) -> bool:
# Check if the logger is explicitly configured
explicitly_configured_logger = self.manager.loggerDict.get(self.name)
log_level = logging.NOTSET
if isinstance(explicitly_configured_logger, logging.Logger):
log_level = explicitly_configured_logger.level
# If the logger is not configured, we don't log anything
if log_level == logging.NOTSET:
return False
# Otherwise, follow the normal logging behavior
return level >= log_level
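To see what "explicitly configured" means in practice, the sketch below repeats the class so that it runs on its own and creates a logger under a parent whose level is `DEBUG`. How Synapse wires the class in is not shown in this diff; using `logging.setLoggerClass` via the `make_logger` helper is an assumption made for illustration, and the logger names are hypothetical.

```python
import logging


class ExplicitlyConfiguredLogger(logging.Logger):
    # Same logic as the class in the diff above, repeated so the sketch is self-contained.
    def isEnabledFor(self, level: int) -> bool:
        configured = self.manager.loggerDict.get(self.name)
        log_level = logging.NOTSET
        if isinstance(configured, logging.Logger):
            log_level = configured.level
        # No explicit level on this logger: never log, whatever the parent says.
        if log_level == logging.NOTSET:
            return False
        return level >= log_level


def make_logger(name: str) -> logging.Logger:
    """Hypothetical helper: create `name` using the custom class, then restore the default."""
    previous = logging.getLoggerClass()
    logging.setLoggerClass(ExplicitlyConfiguredLogger)
    try:
        return logging.getLogger(name)
    finally:
        logging.setLoggerClass(previous)


logging.getLogger("sketch_parent").setLevel(logging.DEBUG)
child = make_logger("sketch_parent.child")
inherits_parent_level = child.isEnabledFor(logging.ERROR)  # parent level is ignored
child.setLevel(logging.INFO)
enabled_after_config = child.isEnabledFor(logging.INFO)
```

A standard `logging.Logger` would have reported `ERROR` as enabled through the parent's level; this subclass only consults the level set on the logger itself.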
+6 -2
@@ -186,12 +186,16 @@ class MediaRepository:
def _start_update_recently_accessed(self) -> Deferred:
return run_as_background_process(
"update_recently_accessed_media", self._update_recently_accessed
"update_recently_accessed_media",
self.server_name,
self._update_recently_accessed,
)
def _start_apply_media_retention_rules(self) -> Deferred:
return run_as_background_process(
"apply_media_retention_rules", self._apply_media_retention_rules
"apply_media_retention_rules",
self.server_name,
self._apply_media_retention_rules,
)
async def _update_recently_accessed(self) -> None:
+1 -1
@@ -740,7 +740,7 @@ class UrlPreviewer:
def _start_expire_url_cache_data(self) -> Deferred:
return run_as_background_process(
"expire_url_cache_data", self._expire_url_cache_data
"expire_url_cache_data", self.server_name, self._expire_url_cache_data
)
async def _expire_url_cache_data(self) -> None:
+145 -39
@@ -33,6 +33,7 @@ from typing import (
Iterable,
Mapping,
Optional,
Sequence,
Set,
Tuple,
Type,
@@ -91,6 +92,7 @@ terms, an endpoint you can scrape is called an *instance*, usually corresponding
single process." (source: https://prometheus.io/docs/concepts/jobs_instances/)
"""
CONTENT_TYPE_LATEST = "text/plain; version=0.0.4; charset=utf-8"
"""
Content type of the latest text format for Prometheus metrics.
@@ -154,13 +156,13 @@ class _RegistryProxy:
RegistryProxy = cast(CollectorRegistry, _RegistryProxy)
@attr.s(slots=True, hash=True, auto_attribs=True)
@attr.s(slots=True, hash=True, auto_attribs=True, kw_only=True)
class LaterGauge(Collector):
"""A Gauge which periodically calls a user-provided callback to produce metrics."""
name: str
desc: str
labels: Optional[StrSequence] = attr.ib(hash=False)
labelnames: Optional[StrSequence] = attr.ib(hash=False)
# callback: should either return a value (if there are no labels for this metric),
# or dict mapping from a label tuple to a value
caller: Callable[
@@ -168,7 +170,9 @@ class LaterGauge(Collector):
]
def collect(self) -> Iterable[Metric]:
g = GaugeMetricFamily(self.name, self.desc, labels=self.labels)
# The decision to add `SERVER_NAME_LABEL` is from the `LaterGauge` usage itself
# (we don't enforce it here, one level up).
g = GaugeMetricFamily(self.name, self.desc, labels=self.labelnames) # type: ignore[missing-server-name-label]
try:
calls = self.caller()
@@ -302,7 +306,9 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
Note: may be called by a separate thread.
"""
in_flight = GaugeMetricFamily(
# The decision to add `SERVER_NAME_LABEL` is from the `InFlightGauge` usage
# itself (we don't enforce it here, one level up).
in_flight = GaugeMetricFamily( # type: ignore[missing-server-name-label]
self.name + "_total", self.desc, labels=self.labels
)
@@ -326,7 +332,9 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
yield in_flight
for name in self.sub_metrics:
gauge = GaugeMetricFamily(
# The decision to add `SERVER_NAME_LABEL` is from the `InFlightGauge` usage
# itself (we don't enforce it here, one level up).
gauge = GaugeMetricFamily( # type: ignore[missing-server-name-label]
"_".join([self.name, name]), "", labels=self.labels
)
for key, metrics in metrics_by_key.items():
@@ -342,6 +350,51 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
all_gauges[self.name] = self
class GaugeHistogramMetricFamilyWithLabels(GaugeHistogramMetricFamily):
"""
Custom version of `GaugeHistogramMetricFamily` from `prometheus_client` that allows
specifying labels and label values.
A single gauge histogram and its samples.
For use by custom collectors.
"""
def __init__(
self,
*,
name: str,
documentation: str,
gsum_value: float,
buckets: Optional[Sequence[Tuple[str, float]]] = None,
labelnames: StrSequence = (),
labelvalues: StrSequence = (),
unit: str = "",
):
# Sanity check the number of label values matches the number of label names.
if len(labelvalues) != len(labelnames):
raise ValueError(
"The number of label values must match the number of label names"
)
# Call the super to validate and set the labelnames. We use this stable API
# instead of setting the internal `_labelnames` field directly.
super().__init__(
name=name,
documentation=documentation,
labels=labelnames,
# Since `GaugeHistogramMetricFamily` doesn't support supplying `labels` and
# `buckets` at the same time (artificial limitation), we will just set these
# as `None` and set up the buckets ourselves just below.
buckets=None,
gsum_value=None,
)
# Create a gauge for each bucket.
if buckets is not None:
self.add_metric(labels=labelvalues, buckets=buckets, gsum_value=gsum_value)
class GaugeBucketCollector(Collector):
"""Like a Histogram, but the buckets are Gauges which are updated atomically.
@@ -354,14 +407,17 @@ class GaugeBucketCollector(Collector):
__slots__ = (
"_name",
"_documentation",
"_labelnames",
"_bucket_bounds",
"_metric",
)
def __init__(
self,
*,
name: str,
documentation: str,
labelnames: Optional[StrSequence],
buckets: Iterable[float],
registry: CollectorRegistry = REGISTRY,
):
@@ -375,6 +431,7 @@ class GaugeBucketCollector(Collector):
"""
self._name = name
self._documentation = documentation
self._labelnames = labelnames if labelnames else ()
# the tops of the buckets
self._bucket_bounds = [float(b) for b in buckets]
@@ -386,7 +443,7 @@ class GaugeBucketCollector(Collector):
# We initially set this to None. We won't report metrics until
# this has been initialised after a successful data update
self._metric: Optional[GaugeHistogramMetricFamily] = None
self._metric: Optional[GaugeHistogramMetricFamilyWithLabels] = None
registry.register(self)
@@ -395,15 +452,26 @@ class GaugeBucketCollector(Collector):
if self._metric is not None:
yield self._metric
def update_data(self, values: Iterable[float]) -> None:
def update_data(self, values: Iterable[float], labels: StrSequence = ()) -> None:
"""Update the data to be reported by the metric
The existing data is cleared, and each measurement in the input is assigned
to the relevant bucket.
"""
self._metric = self._values_to_metric(values)
def _values_to_metric(self, values: Iterable[float]) -> GaugeHistogramMetricFamily:
Args:
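values: The measurements to assign to buckets.
labels: The label values for this metric, in the same order as `labelnames`.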
"""
self._metric = self._values_to_metric(values, labels)
def _values_to_metric(
self, values: Iterable[float], labels: StrSequence = ()
) -> GaugeHistogramMetricFamilyWithLabels:
"""
Args:
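values: The measurements to assign to buckets.
labels: The label values for this metric, in the same order as `labelnames`.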
"""
total = 0.0
bucket_values = [0 for _ in self._bucket_bounds]
@@ -421,9 +489,13 @@ class GaugeBucketCollector(Collector):
# that bucket or below.
accumulated_values = itertools.accumulate(bucket_values)
return GaugeHistogramMetricFamily(
self._name,
self._documentation,
# The decision to add `SERVER_NAME_LABEL` is from the `GaugeBucketCollector`
# usage itself (we don't enforce it here, one level up).
return GaugeHistogramMetricFamilyWithLabels( # type: ignore[missing-server-name-label]
name=self._name,
documentation=self._documentation,
labelnames=self._labelnames,
labelvalues=labels,
buckets=list(
zip((str(b) for b in self._bucket_bounds), accumulated_values)
),
@@ -455,61 +527,82 @@ class CPUMetrics(Collector):
line = s.read()
raw_stats = line.split(") ", 1)[1].split(" ")
user = GaugeMetricFamily("process_cpu_user_seconds_total", "")
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
user = GaugeMetricFamily("process_cpu_user_seconds_total", "") # type: ignore[missing-server-name-label]
user.add_metric([], float(raw_stats[11]) / self.ticks_per_sec)
yield user
sys = GaugeMetricFamily("process_cpu_system_seconds_total", "")
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
sys = GaugeMetricFamily("process_cpu_system_seconds_total", "") # type: ignore[missing-server-name-label]
sys.add_metric([], float(raw_stats[12]) / self.ticks_per_sec)
yield sys
REGISTRY.register(CPUMetrics())
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
REGISTRY.register(CPUMetrics()) # type: ignore[missing-server-name-label]
#
# Federation Metrics
#
sent_transactions_counter = Counter("synapse_federation_client_sent_transactions", "")
sent_transactions_counter = Counter(
"synapse_federation_client_sent_transactions", "", labelnames=[SERVER_NAME_LABEL]
)
events_processed_counter = Counter("synapse_federation_client_events_processed", "")
events_processed_counter = Counter(
"synapse_federation_client_events_processed", "", labelnames=[SERVER_NAME_LABEL]
)
event_processing_loop_counter = Counter(
"synapse_event_processing_loop_count", "Event processing loop iterations", ["name"]
"synapse_event_processing_loop_count",
"Event processing loop iterations",
labelnames=["name", SERVER_NAME_LABEL],
)
event_processing_loop_room_count = Counter(
"synapse_event_processing_loop_room_count",
"Rooms seen per event processing loop iteration",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
# Used to track where various components have processed in the event stream,
# e.g. federation sending, appservice sending, etc.
event_processing_positions = Gauge("synapse_event_processing_positions", "", ["name"])
event_processing_positions = Gauge(
"synapse_event_processing_positions", "", labelnames=["name", SERVER_NAME_LABEL]
)
# Used to track the current max events stream position
event_persisted_position = Gauge("synapse_event_persisted_position", "")
event_persisted_position = Gauge(
"synapse_event_persisted_position", "", labelnames=[SERVER_NAME_LABEL]
)
# Used to track the received_ts of the last event processed by various
# components
event_processing_last_ts = Gauge("synapse_event_processing_last_ts", "", ["name"])
event_processing_last_ts = Gauge(
"synapse_event_processing_last_ts", "", labelnames=["name", SERVER_NAME_LABEL]
)
# Used to track the lag processing events. This is the time difference
# between the last processed event's received_ts and the time it was
# finished being processed.
event_processing_lag = Gauge("synapse_event_processing_lag", "", ["name"])
event_processing_lag = Gauge(
"synapse_event_processing_lag", "", labelnames=["name", SERVER_NAME_LABEL]
)
event_processing_lag_by_event = Histogram(
"synapse_event_processing_lag_by_event",
"Time between an event being persisted and it being queued up to be sent to the relevant remote servers",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
# Build info of the running server.
build_info = Gauge(
#
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`. We
# consider this process-level because all Synapse homeservers running in the process
# will use the same Synapse version.
build_info = Gauge( # type: ignore[missing-server-name-label]
"synapse_build_info", "Build information", ["pythonversion", "version", "osversion"]
)
build_info.labels(
@@ -525,44 +618,57 @@ threepid_send_requests = Histogram(
" there is a request with try count of 4, then there would have been one"
" each for 1, 2 and 3",
buckets=(1, 2, 3, 4, 5, 10),
labelnames=("type", "reason"),
labelnames=("type", "reason", SERVER_NAME_LABEL),
)
threadpool_total_threads = Gauge(
"synapse_threadpool_total_threads",
"Total number of threads currently in the threadpool",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
threadpool_total_working_threads = Gauge(
"synapse_threadpool_working_threads",
"Number of threads currently working in the threadpool",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
threadpool_total_min_threads = Gauge(
"synapse_threadpool_min_threads",
"Minimum number of threads configured in the threadpool",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
threadpool_total_max_threads = Gauge(
"synapse_threadpool_max_threads",
"Maximum number of threads configured in the threadpool",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
def register_threadpool(name: str, threadpool: ThreadPool) -> None:
"""Add metrics for the threadpool."""
def register_threadpool(*, name: str, server_name: str, threadpool: ThreadPool) -> None:
"""
Add metrics for the threadpool.
threadpool_total_min_threads.labels(name).set(threadpool.min)
threadpool_total_max_threads.labels(name).set(threadpool.max)
Args:
name: The name of the threadpool, used to identify it in the metrics.
server_name: The homeserver name (used to label metrics) (this should be `hs.hostname`).
threadpool: The threadpool to register metrics for.
"""
threadpool_total_threads.labels(name).set_function(lambda: len(threadpool.threads))
threadpool_total_working_threads.labels(name).set_function(
lambda: len(threadpool.working)
)
threadpool_total_min_threads.labels(
name=name, **{SERVER_NAME_LABEL: server_name}
).set(threadpool.min)
threadpool_total_max_threads.labels(
name=name, **{SERVER_NAME_LABEL: server_name}
).set(threadpool.max)
threadpool_total_threads.labels(
name=name, **{SERVER_NAME_LABEL: server_name}
).set_function(lambda: len(threadpool.threads))
threadpool_total_working_threads.labels(
name=name, **{SERVER_NAME_LABEL: server_name}
).set_function(lambda: len(threadpool.working))
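The repeated `**{SERVER_NAME_LABEL: server_name}` idiom exists because a keyword argument name must be a literal identifier: writing `SERVER_NAME_LABEL=server_name` would pass a label literally called `SERVER_NAME_LABEL` rather than the constant's value. The sketch below shows the idiom against a hypothetical `FakeGauge` stand-in, not the real `prometheus_client` class. The value `"server_name"` for the constant is an assumption, since the diff only shows its name.

```python
from typing import Dict, Sequence, Tuple

SERVER_NAME_LABEL = "server_name"  # assumed value; only the constant's name appears in the diff


class _Child:
    """Hypothetical labelled child that stores one sample."""

    def __init__(self, samples: Dict[Tuple[str, ...], float], key: Tuple[str, ...]) -> None:
        self._samples = samples
        self._key = key

    def set(self, value: float) -> None:
        self._samples[self._key] = value


class FakeGauge:
    """Hypothetical stand-in for a labelled gauge; rejects unknown or missing labels."""

    def __init__(self, labelnames: Sequence[str]) -> None:
        self._labelnames = tuple(labelnames)
        self.samples: Dict[Tuple[str, ...], float] = {}

    def labels(self, **labelkwargs: str) -> _Child:
        if sorted(labelkwargs) != sorted(self._labelnames):
            raise ValueError("Incorrect label names")
        key = tuple(labelkwargs[name] for name in self._labelnames)
        return _Child(self.samples, key)


def register_pool_max(gauge: FakeGauge, *, name: str, server_name: str, max_threads: int) -> None:
    # Dict unpacking lets the constant's *value* become the keyword name.
    gauge.labels(name=name, **{SERVER_NAME_LABEL: server_name}).set(max_threads)
```

Keyword-style labels also make the call independent of label order, which is presumably why the diff moves away from positional `labels(name)` calls while adding the new label.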
class MetricsResource(Resource):
+13 -7
@@ -54,8 +54,9 @@ running_on_pypy = platform.python_implementation() == "PyPy"
# Python GC metrics
#
gc_unreachable = Gauge("python_gc_unreachable_total", "Unreachable GC objects", ["gen"])
gc_time = Histogram(
# These are process-level metrics, so they do not have the `SERVER_NAME_LABEL`.
gc_unreachable = Gauge("python_gc_unreachable_total", "Unreachable GC objects", ["gen"]) # type: ignore[missing-server-name-label]
gc_time = Histogram( # type: ignore[missing-server-name-label]
"python_gc_time",
"Time taken to GC (sec)",
["gen"],
@@ -82,7 +83,8 @@ gc_time = Histogram(
class GCCounts(Collector):
def collect(self) -> Iterable[Metric]:
cm = GaugeMetricFamily("python_gc_counts", "GC object counts", labels=["gen"])
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
cm = GaugeMetricFamily("python_gc_counts", "GC object counts", labels=["gen"]) # type: ignore[missing-server-name-label]
for n, m in enumerate(gc.get_count()):
cm.add_metric([str(n)], m)
@@ -101,7 +103,8 @@ def install_gc_manager() -> None:
if running_on_pypy:
return
REGISTRY.register(GCCounts())
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
REGISTRY.register(GCCounts()) # type: ignore[missing-server-name-label]
gc.disable()
@@ -176,7 +179,8 @@ class PyPyGCStats(Collector):
#
# Total time spent in GC: 0.073 # s.total_gc_time
pypy_gc_time = CounterMetricFamily(
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
pypy_gc_time = CounterMetricFamily( # type: ignore[missing-server-name-label]
"pypy_gc_time_seconds_total",
"Total time spent in PyPy GC",
labels=[],
@@ -184,7 +188,8 @@ class PyPyGCStats(Collector):
pypy_gc_time.add_metric([], s.total_gc_time / 1000)
yield pypy_gc_time
pypy_mem = GaugeMetricFamily(
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
pypy_mem = GaugeMetricFamily( # type: ignore[missing-server-name-label]
"pypy_memory_bytes",
"Memory tracked by PyPy allocator",
labels=["state", "class", "kind"],
@@ -208,4 +213,5 @@ class PyPyGCStats(Collector):
if running_on_pypy:
REGISTRY.register(PyPyGCStats())
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
REGISTRY.register(PyPyGCStats()) # type: ignore[missing-server-name-label]
+6 -3
@@ -62,7 +62,8 @@ logger = logging.getLogger(__name__)
# Twisted reactor metrics
#
tick_time = Histogram(
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
tick_time = Histogram( # type: ignore[missing-server-name-label]
"python_twisted_reactor_tick_time",
"Tick time of the Twisted reactor (sec)",
buckets=[0.001, 0.002, 0.005, 0.01, 0.025, 0.05, 0.1, 0.2, 0.5, 1, 2, 5],
@@ -114,7 +115,8 @@ class ReactorLastSeenMetric(Collector):
self._call_wrapper = call_wrapper
def collect(self) -> Iterable[Metric]:
cm = GaugeMetricFamily(
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
cm = GaugeMetricFamily( # type: ignore[missing-server-name-label]
"python_twisted_reactor_last_seen",
"Seconds since the Twisted reactor was last seen",
)
@@ -165,4 +167,5 @@ except Exception as e:
if wrapper:
REGISTRY.register(ReactorLastSeenMetric(wrapper))
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
REGISTRY.register(ReactorLastSeenMetric(wrapper)) # type: ignore[missing-server-name-label]
+91 -37
@@ -31,6 +31,7 @@ from typing import (
Dict,
Iterable,
Optional,
Protocol,
Set,
Type,
TypeVar,
@@ -39,7 +40,7 @@ from typing import (
from prometheus_client import Metric
from prometheus_client.core import REGISTRY, Counter, Gauge
from typing_extensions import ParamSpec
from typing_extensions import Concatenate, ParamSpec
from twisted.internet import defer
@@ -49,6 +50,7 @@ from synapse.logging.context import (
PreserveLoggingContext,
)
from synapse.logging.opentracing import SynapseTags, start_active_span
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics._types import Collector
if TYPE_CHECKING:
@@ -64,13 +66,13 @@ logger = logging.getLogger(__name__)
_background_process_start_count = Counter(
"synapse_background_process_start_count",
"Number of background processes started",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
_background_process_in_flight_count = Gauge(
"synapse_background_process_in_flight_count",
"Number of background processes in flight",
labelnames=["name"],
labelnames=["name", SERVER_NAME_LABEL],
)
# we set registry=None in all of these to stop them getting registered with
@@ -80,21 +82,21 @@ _background_process_in_flight_count = Gauge(
_background_process_ru_utime = Counter(
"synapse_background_process_ru_utime_seconds",
"User CPU time used by background processes, in seconds",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
registry=None,
)
_background_process_ru_stime = Counter(
"synapse_background_process_ru_stime_seconds",
"System CPU time used by background processes, in seconds",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
registry=None,
)
_background_process_db_txn_count = Counter(
"synapse_background_process_db_txn_count",
"Number of database transactions done by background processes",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
registry=None,
)
@@ -104,14 +106,14 @@ _background_process_db_txn_duration = Counter(
"Seconds spent by background processes waiting for database "
"transactions, excluding scheduling time"
),
["name"],
labelnames=["name", SERVER_NAME_LABEL],
registry=None,
)
_background_process_db_sched_duration = Counter(
"synapse_background_process_db_sched_duration_seconds",
"Seconds spent by background processes waiting for database connections",
["name"],
labelnames=["name", SERVER_NAME_LABEL],
registry=None,
)
@@ -165,12 +167,15 @@ class _Collector(Collector):
yield from m.collect()
REGISTRY.register(_Collector())
# The `SERVER_NAME_LABEL` is included in the individual metrics added to this registry,
# so we don't need to worry about it on the collector itself.
REGISTRY.register(_Collector()) # type: ignore[missing-server-name-label]
class _BackgroundProcess:
def __init__(self, desc: str, ctx: LoggingContext):
def __init__(self, *, desc: str, server_name: str, ctx: LoggingContext):
self.desc = desc
self.server_name = server_name
self._context = ctx
self._reported_stats: Optional[ContextResourceUsage] = None
@@ -185,15 +190,21 @@ class _BackgroundProcess:
# For unknown reasons, the difference in times can be negative. See comment in
# synapse.http.request_metrics.RequestMetrics.update_metrics.
_background_process_ru_utime.labels(self.desc).inc(max(diff.ru_utime, 0))
_background_process_ru_stime.labels(self.desc).inc(max(diff.ru_stime, 0))
_background_process_db_txn_count.labels(self.desc).inc(diff.db_txn_count)
_background_process_db_txn_duration.labels(self.desc).inc(
diff.db_txn_duration_sec
)
_background_process_db_sched_duration.labels(self.desc).inc(
diff.db_sched_duration_sec
)
_background_process_ru_utime.labels(
name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
).inc(max(diff.ru_utime, 0))
_background_process_ru_stime.labels(
name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
).inc(max(diff.ru_stime, 0))
_background_process_db_txn_count.labels(
name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
).inc(diff.db_txn_count)
_background_process_db_txn_duration.labels(
name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
).inc(diff.db_txn_duration_sec)
_background_process_db_sched_duration.labels(
name=self.desc, **{SERVER_NAME_LABEL: self.server_name}
).inc(diff.db_sched_duration_sec)
R = TypeVar("R")
@@ -201,6 +212,7 @@ R = TypeVar("R")
def run_as_background_process(
desc: "LiteralString",
server_name: str,
func: Callable[..., Awaitable[Optional[R]]],
*args: Any,
bg_start_span: bool = True,
@@ -218,6 +230,8 @@ def run_as_background_process(
Args:
desc: a description for this background process type
server_name: The homeserver name that this background process is being run for
(this should be `hs.hostname`).
func: a function, which may return a Deferred or a coroutine
bg_start_span: Whether to start an opentracing span. Defaults to True.
Should only be disabled for processes that will not log to or tag
@@ -236,10 +250,16 @@ def run_as_background_process(
count = _background_process_counts.get(desc, 0)
_background_process_counts[desc] = count + 1
_background_process_start_count.labels(desc).inc()
_background_process_in_flight_count.labels(desc).inc()
_background_process_start_count.labels(
name=desc, **{SERVER_NAME_LABEL: server_name}
).inc()
_background_process_in_flight_count.labels(
name=desc, **{SERVER_NAME_LABEL: server_name}
).inc()
with BackgroundProcessLoggingContext(desc, count) as context:
with BackgroundProcessLoggingContext(
name=desc, server_name=server_name, instance_id=count
) as context:
try:
if bg_start_span:
ctx = start_active_span(
@@ -256,7 +276,9 @@ def run_as_background_process(
)
return None
finally:
_background_process_in_flight_count.labels(desc).dec()
_background_process_in_flight_count.labels(
name=desc, **{SERVER_NAME_LABEL: server_name}
).dec()
with PreserveLoggingContext():
# Note that we return a Deferred here so that it can be used in a
@@ -267,6 +289,14 @@ def run_as_background_process(
P = ParamSpec("P")
class HasServerName(Protocol):
server_name: str
"""
The homeserver name that this object is associated with (used to label metrics)
(`hs.hostname`).
"""
def wrap_as_background_process(
desc: "LiteralString",
) -> Callable[
@@ -292,22 +322,37 @@ def wrap_as_background_process(
multiple places.
"""
def wrap_as_background_process_inner(
func: Callable[P, Awaitable[Optional[R]]],
def wrapper(
func: Callable[Concatenate[HasServerName, P], Awaitable[Optional[R]]],
) -> Callable[P, "defer.Deferred[Optional[R]]"]:
@wraps(func)
def wrap_as_background_process_inner_2(
*args: P.args, **kwargs: P.kwargs
def wrapped_func(
self: HasServerName, *args: P.args, **kwargs: P.kwargs
) -> "defer.Deferred[Optional[R]]":
# type-ignore: mypy is confusing kwargs with the bg_start_span kwarg.
# Argument 4 to "run_as_background_process" has incompatible type
# "**P.kwargs"; expected "bool"
# See https://github.com/python/mypy/issues/8862
return run_as_background_process(desc, func, *args, **kwargs) # type: ignore[arg-type]
assert self.server_name is not None, (
"The `server_name` attribute must be set on the object where `@wrap_as_background_process` decorator is used."
)
return wrap_as_background_process_inner_2
return run_as_background_process(
desc,
self.server_name,
func,
self,
*args,
# type-ignore: mypy is confusing kwargs with the bg_start_span kwarg.
# Argument 4 to "run_as_background_process" has incompatible type
# "**P.kwargs"; expected "bool"
# See https://github.com/python/mypy/issues/8862
**kwargs, # type: ignore[arg-type]
)
return wrap_as_background_process_inner
# There are some shenanigans here, because we're decorating a method but
# explicitly making use of the `self` parameter. The key thing here is that the
# return type within the return type for `wrap_as_background_process` itself describes how the
# decorated function will be called.
return wrapped_func # type: ignore[return-value]
return wrapper # type: ignore[return-value]
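The reworked decorator above takes the metric label from the decorated object instead of from the caller. A minimal sketch of that pattern follows. It assumes `asyncio` in place of Twisted Deferreds and a plain `Counter` in place of the Prometheus metrics; `process_starts` and `Pruner` are hypothetical names used only for illustration, and this is not Synapse's implementation.

```python
import asyncio
import functools
from collections import Counter
from typing import Any, Awaitable, Callable, Optional, Protocol

# Stand-in for the labelled Prometheus counter: starts per (desc, server_name).
process_starts: Counter = Counter()


class HasServerName(Protocol):
    server_name: str


async def run_as_background_process(
    desc: str,
    server_name: str,
    func: Callable[..., Awaitable[Optional[Any]]],
    *args: Any,
    **kwargs: Any,
) -> Optional[Any]:
    """Record a labelled start, run `func`, and return None if it raises."""
    process_starts[(desc, server_name)] += 1
    try:
        return await func(*args, **kwargs)
    except Exception:
        return None


def wrap_as_background_process(
    desc: str,
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def wrapper(func: Callable[..., Awaitable[Optional[Any]]]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapped_func(self: HasServerName, *args: Any, **kwargs: Any) -> Any:
            # The label is read from the instance, so callers of the decorated
            # method never pass `server_name` themselves.
            assert self.server_name is not None
            return run_as_background_process(
                desc, self.server_name, func, self, *args, **kwargs
            )

        return wrapped_func

    return wrapper


class Pruner:  # hypothetical class, for illustration only
    def __init__(self, server_name: str) -> None:
        self.server_name = server_name

    @wrap_as_background_process("prune_things")
    async def prune(self, n: int) -> int:
        return n * 2


result = asyncio.run(Pruner("example.com").prune(21))
```

Because the wrapper passes `self` on explicitly as the first positional argument to `func`, the decorator only works on methods of objects that expose a `server_name` attribute, which is what the `HasServerName` protocol expresses.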
class BackgroundProcessLoggingContext(LoggingContext):
@@ -317,13 +362,20 @@ class BackgroundProcessLoggingContext(LoggingContext):
__slots__ = ["_proc"]
def __init__(self, name: str, instance_id: Optional[Union[int, str]] = None):
def __init__(
self,
*,
name: str,
server_name: str,
instance_id: Optional[Union[int, str]] = None,
):
"""
Args:
name: The name of the background process. Each distinct `name` gets a
separate prometheus time series.
server_name: The homeserver name that this background process is being run for
(this should be `hs.hostname`).
instance_id: an identifier to add to `name` to distinguish this instance of
the named background process in the logs. If this is `None`, one is
made up based on id(self).
@@ -331,7 +383,9 @@ class BackgroundProcessLoggingContext(LoggingContext):
if instance_id is None:
instance_id = id(self)
super().__init__("%s-%s" % (name, instance_id))
self._proc: Optional[_BackgroundProcess] = _BackgroundProcess(name, self)
self._proc: Optional[_BackgroundProcess] = _BackgroundProcess(
desc=name, server_name=server_name, ctx=self
)
def start(self, rusage: "Optional[resource.struct_rusage]") -> None:
"""Log context has started running (again)."""
+10 -2
@@ -22,6 +22,7 @@ from typing import TYPE_CHECKING
import attr
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
if TYPE_CHECKING:
@@ -33,6 +34,7 @@ from prometheus_client import Gauge
current_dau_gauge = Gauge(
"synapse_admin_daily_active_users",
"Current daily active users count",
labelnames=[SERVER_NAME_LABEL],
)
@@ -47,6 +49,7 @@ class CommonUsageMetricsManager:
"""Collects common usage metrics."""
def __init__(self, hs: "HomeServer") -> None:
self.server_name = hs.hostname
self._store = hs.get_datastores().main
self._clock = hs.get_clock()
@@ -62,12 +65,15 @@ class CommonUsageMetricsManager:
async def setup(self) -> None:
"""Keep the gauges for common usage metrics up to date."""
run_as_background_process(
desc="common_usage_metrics_update_gauges", func=self._update_gauges
desc="common_usage_metrics_update_gauges",
server_name=self.server_name,
func=self._update_gauges,
)
self._clock.looping_call(
run_as_background_process,
5 * 60 * 1000,
desc="common_usage_metrics_update_gauges",
server_name=self.server_name,
func=self._update_gauges,
)
@@ -85,4 +91,6 @@ class CommonUsageMetricsManager:
"""Update the Prometheus gauges."""
metrics = await self._collect()
current_dau_gauge.set(float(metrics.daily_active_users))
current_dau_gauge.labels(
**{SERVER_NAME_LABEL: self.server_name},
).set(float(metrics.daily_active_users))
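The gauge change above swaps a bare `.set(...)` for `.labels(**{SERVER_NAME_LABEL: ...}).set(...)`. The sketch below shows why the label name is passed by dict unpacking: the name lives in a constant, so it cannot be written as a literal keyword argument. `ToyGauge` and `ToyChild` are hypothetical stand-ins rather than the `prometheus_client` classes, and the value `"server_name"` for the constant is an assumption.

```python
from typing import Dict, Tuple

SERVER_NAME_LABEL = "server_name"  # assumed value of the constant


class ToyGauge:
    """Tiny stand-in for a labelled gauge: one value per label combination."""

    def __init__(self, name: str, labelnames: Tuple[str, ...]) -> None:
        self.name = name
        self.labelnames = labelnames
        self.values: Dict[Tuple[str, ...], float] = {}

    def labels(self, **labels: str) -> "ToyChild":
        # Every declared label must be supplied, and no others.
        if set(labels) != set(self.labelnames):
            raise ValueError("expected labels %r" % (self.labelnames,))
        key = tuple(labels[name] for name in self.labelnames)
        return ToyChild(self, key)


class ToyChild:
    """Handle for one label combination of a ToyGauge."""

    def __init__(self, gauge: ToyGauge, key: Tuple[str, ...]) -> None:
        self._gauge = gauge
        self._key = key

    def set(self, value: float) -> None:
        self._gauge.values[self._key] = value


dau = ToyGauge("daily_active_users", (SERVER_NAME_LABEL,))
# `**{CONSTANT: value}` keeps the label name in one place.
dau.labels(**{SERVER_NAME_LABEL: "a.example"}).set(12.0)
dau.labels(**{SERVER_NAME_LABEL: "b.example"}).set(3.0)
```

With a label in place, two homeservers sharing one process would report separate series instead of overwriting each other's value.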
+4 -2
@@ -188,7 +188,8 @@ def _setup_jemalloc_stats() -> None:
def collect(self) -> Iterable[Metric]:
stats.refresh_stats()
g = GaugeMetricFamily(
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
g = GaugeMetricFamily( # type: ignore[missing-server-name-label]
"jemalloc_stats_app_memory_bytes",
"The stats reported by jemalloc",
labels=["type"],
@@ -230,7 +231,8 @@ def _setup_jemalloc_stats() -> None:
yield g
REGISTRY.register(JemallocCollector())
# This is a process-level metric, so it does not have the `SERVER_NAME_LABEL`.
REGISTRY.register(JemallocCollector()) # type: ignore[missing-server-name-label]
logger.debug("Added jemalloc stats")
+110 -5
@@ -23,6 +23,7 @@ import logging
from typing import (
TYPE_CHECKING,
Any,
Awaitable,
Callable,
Collection,
Dict,
@@ -80,7 +81,9 @@ from synapse.logging.context import (
make_deferred_yieldable,
run_in_background,
)
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.metrics.background_process_metrics import (
run_as_background_process as _run_as_background_process,
)
from synapse.module_api.callbacks.account_validity_callbacks import (
IS_USER_EXPIRED_CALLBACK,
ON_LEGACY_ADMIN_REQUEST,
@@ -158,6 +161,9 @@ from synapse.util.caches.descriptors import CachedFunction, cached as _cached
from synapse.util.frozenutils import freeze
if TYPE_CHECKING:
# Old versions don't have `LiteralString`
from typing_extensions import LiteralString
from synapse.app.generic_worker import GenericWorkerStore
from synapse.server import HomeServer
@@ -216,6 +222,65 @@ class UserIpAndAgent:
last_seen: int
def run_as_background_process(
desc: "LiteralString",
func: Callable[..., Awaitable[Optional[T]]],
*args: Any,
bg_start_span: bool = True,
**kwargs: Any,
) -> "defer.Deferred[Optional[T]]":
"""
XXX: Deprecated: use `ModuleApi.run_as_background_process` instead.
Run the given function in its own logcontext, with resource metrics
This should be used to wrap processes which are fired off to run in the
background, instead of being associated with a particular request.
It returns a Deferred which completes when the function completes, but it doesn't
follow the synapse logcontext rules, which makes it appropriate for passing to
clock.looping_call and friends (or for firing-and-forgetting in the middle of a
normal synapse async function).
Args:
desc: a description for this background process type
server_name: The homeserver name that this background process is being run for
(this should be `hs.hostname`).
func: a function, which may return a Deferred or a coroutine
bg_start_span: Whether to start an opentracing span. Defaults to True.
Should only be disabled for processes that will not log to or tag
a span.
args: positional args for func
kwargs: keyword args for func
Returns:
Deferred which returns the result of func, or `None` if func raises.
Note that the returned Deferred does not follow the synapse logcontext
rules.
"""
logger.warning(
"Using deprecated `run_as_background_process` that's exported from the Module API. "
"Prefer `ModuleApi.run_as_background_process` instead.",
)
# This function has historically been exported from the module API, so we can't just
# change its signature to require a `server_name` argument. Since
# `run_as_background_process` internally in Synapse requires `server_name` now, we
# just have to stub this out with a placeholder value and tell people to use the new
# function instead.
stub_server_name = "synapse_module_running_from_unknown_server"
return _run_as_background_process(
desc,
stub_server_name,
func,
*args,
bg_start_span=bg_start_span,
**kwargs,
)
def cached(
*,
max_entries: int = 1000,
@@ -277,7 +342,9 @@ class ModuleApi:
self._device_handler = hs.get_device_handler()
self.custom_template_dir = hs.config.server.custom_template_directory
self._callbacks = hs.get_module_api_callbacks()
self.msc3861_oauth_delegation_enabled = hs.config.experimental.msc3861.enabled
self._auth_delegation_enabled = (
hs.config.mas.enabled or hs.config.experimental.msc3861.enabled
)
self._event_serializer = hs.get_event_client_serializer()
try:
@@ -484,7 +551,7 @@ class ModuleApi:
Added in Synapse v1.46.0.
"""
if self.msc3861_oauth_delegation_enabled:
if self._auth_delegation_enabled:
raise ConfigError(
"Cannot use password auth provider callbacks when OAuth delegation is enabled"
)
@@ -1323,7 +1390,7 @@ class ModuleApi:
if self._hs.config.worker.run_background_tasks or run_on_all_instances:
self._clock.looping_call(
run_as_background_process,
self.run_as_background_process,
msec,
desc,
lambda: maybe_awaitable(f(*args, **kwargs)),
@@ -1381,7 +1448,7 @@ class ModuleApi:
return self._clock.call_later(
# convert ms to seconds as needed by call_later.
msec * 0.001,
run_as_background_process,
self.run_as_background_process,
desc,
lambda: maybe_awaitable(f(*args, **kwargs)),
)
@@ -1588,6 +1655,44 @@ class ModuleApi:
return {key: state_events[event_id] for key, event_id in state_ids.items()}
def run_as_background_process(
self,
desc: "LiteralString",
func: Callable[..., Awaitable[Optional[T]]],
*args: Any,
bg_start_span: bool = True,
**kwargs: Any,
) -> "defer.Deferred[Optional[T]]":
"""Run the given function in its own logcontext, with resource metrics
This should be used to wrap processes which are fired off to run in the
background, instead of being associated with a particular request.
It returns a Deferred which completes when the function completes, but it doesn't
follow the synapse logcontext rules, which makes it appropriate for passing to
clock.looping_call and friends (or for firing-and-forgetting in the middle of a
normal synapse async function).
Args:
desc: a description for this background process type
server_name: The homeserver name that this background process is being run for
(this should be `hs.hostname`).
func: a function, which may return a Deferred or a coroutine
bg_start_span: Whether to start an opentracing span. Defaults to True.
Should only be disabled for processes that will not log to or tag
a span.
args: positional args for func
kwargs: keyword args for func
Returns:
Deferred which returns the result of func, or `None` if func raises.
Note that the returned Deferred does not follow the synapse logcontext
rules.
"""
return _run_as_background_process(
desc, self.server_name, func, *args, bg_start_span=bg_start_span, **kwargs
)
async def defer_to_thread(
self,
f: Callable[P, T],
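The module API hunks above keep the old module-level `run_as_background_process` as a deprecated shim and add a method of the same name that supplies the real server name. A rough sketch of that compatibility pattern follows. It assumes `asyncio` coroutines instead of Deferreds; `calls` and `job` are hypothetical helpers, and the class is a toy reduction rather than the real `ModuleApi`.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, List, Tuple

logger = logging.getLogger(__name__)

# (desc, server_name) pairs seen by the internal function, for inspection.
calls: List[Tuple[str, str]] = []


async def _run_as_background_process(
    desc: str,
    server_name: str,
    func: Callable[..., Awaitable[Any]],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Internal function: `server_name` is now a required argument."""
    calls.append((desc, server_name))
    return await func(*args, **kwargs)


def run_as_background_process(
    desc: str, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
) -> Awaitable[Any]:
    """Deprecated shim: keeps the old signature and fills in a placeholder."""
    logger.warning("Deprecated: prefer `ModuleApi.run_as_background_process`.")
    stub_server_name = "synapse_module_running_from_unknown_server"
    return _run_as_background_process(desc, stub_server_name, func, *args, **kwargs)


class ModuleApi:  # toy reduction, for illustration only
    def __init__(self, server_name: str) -> None:
        self.server_name = server_name

    def run_as_background_process(
        self, desc: str, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
    ) -> Awaitable[Any]:
        """Replacement: same call shape, but the real server name is known."""
        return _run_as_background_process(
            desc, self.server_name, func, *args, **kwargs
        )


async def job(x: int) -> int:  # hypothetical background job
    return x + 1


old = asyncio.run(run_as_background_process("job", job, 1))
new = asyncio.run(ModuleApi("example.com").run_as_background_process("job", job, 1))
```

Existing modules that import the function keep working, but their metrics end up under the placeholder label until they switch to the method.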
+46 -16
@@ -29,6 +29,7 @@ from typing import (
Iterable,
List,
Literal,
Mapping,
Optional,
Set,
Tuple,
@@ -50,7 +51,7 @@ from synapse.handlers.presence import format_user_presence_state
from synapse.logging import issue9533_logger
from synapse.logging.context import PreserveLoggingContext
from synapse.logging.opentracing import log_kv, start_active_span
from synapse.metrics import LaterGauge
from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
from synapse.streams.config import PaginationConfig
from synapse.types import (
ISynapseReactor,
@@ -74,10 +75,15 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
notified_events_counter = Counter("synapse_notifier_notified_events", "")
# FIXME: Unused metric, remove if not needed.
notified_events_counter = Counter(
"synapse_notifier_notified_events", "", labelnames=[SERVER_NAME_LABEL]
)
users_woken_by_stream_counter = Counter(
"synapse_notifier_users_woken_by_stream", "", ["stream"]
"synapse_notifier_users_woken_by_stream",
"",
labelnames=["stream", SERVER_NAME_LABEL],
)
T = TypeVar("T")
@@ -224,6 +230,7 @@ class Notifier:
self.room_to_user_streams: Dict[str, Set[_NotifierUserStream]] = {}
self.hs = hs
self.server_name = hs.hostname
self._storage_controllers = hs.get_storage_controllers()
self.event_sources = hs.get_event_sources()
self.store = hs.get_datastores().main
@@ -257,7 +264,10 @@ class Notifier:
# This is not a very cheap test to perform, but it's only executed
# when rendering the metrics page, which is likely once per minute at
# most when scraping it.
def count_listeners() -> int:
#
# Ideally, we'd use `Mapping[Tuple[str], int]` here but mypy doesn't like it.
# This is close enough and better than a type ignore.
def count_listeners() -> Mapping[Tuple[str, ...], int]:
all_user_streams: Set[_NotifierUserStream] = set()
for streams in list(self.room_to_user_streams.values()):
@@ -265,18 +275,34 @@ class Notifier:
for stream in list(self.user_to_user_stream.values()):
all_user_streams.add(stream)
return sum(stream.count_listeners() for stream in all_user_streams)
LaterGauge("synapse_notifier_listeners", "", [], count_listeners)
return {
(self.server_name,): sum(
stream.count_listeners() for stream in all_user_streams
)
}
LaterGauge(
"synapse_notifier_rooms",
"",
[],
lambda: count(bool, list(self.room_to_user_streams.values())),
name="synapse_notifier_listeners",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=count_listeners,
)
LaterGauge(
name="synapse_notifier_rooms",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {
(self.server_name,): count(
bool, list(self.room_to_user_streams.values())
)
},
)
LaterGauge(
"synapse_notifier_users", "", [], lambda: len(self.user_to_user_stream)
name="synapse_notifier_users",
desc="",
labelnames=[SERVER_NAME_LABEL],
caller=lambda: {(self.server_name,): len(self.user_to_user_stream)},
)
def add_replication_callback(self, cb: Callable[[], None]) -> None:
@@ -350,9 +376,10 @@ class Notifier:
for listener in listeners:
listener.callback(current_token)
users_woken_by_stream_counter.labels(StreamKeyType.UN_PARTIAL_STATED_ROOMS).inc(
len(user_streams)
)
users_woken_by_stream_counter.labels(
stream=StreamKeyType.UN_PARTIAL_STATED_ROOMS,
**{SERVER_NAME_LABEL: self.server_name},
).inc(len(user_streams))
# Poke the replication so that other workers also see the write to
# the un-partial-stated rooms stream.
@@ -575,7 +602,10 @@ class Notifier:
listener.callback(current_token)
if user_streams:
users_woken_by_stream_counter.labels(stream_key).inc(len(user_streams))
users_woken_by_stream_counter.labels(
stream=stream_key,
**{SERVER_NAME_LABEL: self.server_name},
).inc(len(user_streams))
self.notify_replication()
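In the notifier hunks above, the `LaterGauge` callbacks change from returning a bare number to returning a mapping keyed by a tuple of label values. The sketch below illustrates that callback contract with a toy collector. `collect` and `Sample` are hypothetical names, the acceptance of a bare number is an assumption made to show both forms side by side, and none of this is Synapse's `LaterGauge` code.

```python
from typing import Callable, Dict, List, Mapping, Sequence, Tuple, Union

Sample = Tuple[Dict[str, str], float]


def collect(
    labelnames: Sequence[str],
    caller: Callable[[], Union[Mapping[Tuple[str, ...], float], float]],
) -> List[Sample]:
    """Turn a callback's result into (labels, value) samples."""
    result = caller()
    if isinstance(result, (int, float)):
        # Unlabelled form: a bare number, only meaningful with no label names.
        return [({}, float(result))]
    # Labelled form: each key is a tuple of label values, in `labelnames` order.
    return [
        (dict(zip(labelnames, key)), float(value)) for key, value in result.items()
    ]


server_name = "example.com"
user_to_user_stream = {"@a:example.com": object(), "@b:example.com": object()}

samples = collect(
    ["server_name"],
    lambda: {(server_name,): len(user_to_user_stream)},
)
```

The one-element tuple `(server_name,)` needs its trailing comma; without it the key would be a plain string and `zip` would pair the label name with the string's first character.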
+27 -3
@@ -25,6 +25,7 @@ from typing import (
Any,
Collection,
Dict,
FrozenSet,
List,
Mapping,
Optional,
@@ -50,6 +51,7 @@ from synapse.event_auth import auth_types_for_event, get_user_power_level
from synapse.events import EventBase, relation_from_event
from synapse.events.snapshot import EventContext
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.metrics import SERVER_NAME_LABEL
from synapse.state import CREATE_KEY, POWER_KEY
from synapse.storage.databases.main.roommember import EventIdMembership
from synapse.storage.invite_rule import InviteRule
@@ -68,11 +70,17 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
# FIXME: Unused metric, remove if not needed.
push_rules_invalidation_counter = Counter(
"synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter", ""
"synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter",
"",
labelnames=[SERVER_NAME_LABEL],
)
# FIXME: Unused metric, remove if not needed.
push_rules_state_size_counter = Counter(
"synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter", ""
"synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter",
"",
labelnames=[SERVER_NAME_LABEL],
)
@@ -470,8 +478,18 @@ class BulkPushRuleEvaluator:
event.room_version.msc3931_push_features,
self.hs.config.experimental.msc1767_enabled, # MSC3931 flag
self.hs.config.experimental.msc4210_enabled,
self.hs.config.experimental.msc4306_enabled,
)
msc4306_thread_subscribers: Optional[FrozenSet[str]] = None
if self.hs.config.experimental.msc4306_enabled and thread_id != MAIN_TIMELINE:
# pull out, in batch, all local subscribers to this thread
# (in the common case, they will all be getting processed for push
# rules right now)
msc4306_thread_subscribers = await self.store.get_subscribers_to_thread(
event.room_id, thread_id
)
for uid, rules in rules_by_user.items():
if event.sender == uid:
continue
@@ -496,7 +514,13 @@ class BulkPushRuleEvaluator:
# current user, it'll be added to the dict later.
actions_by_user[uid] = []
actions = evaluator.run(rules, uid, display_name)
msc4306_thread_subscription_state: Optional[bool] = None
if msc4306_thread_subscribers is not None:
msc4306_thread_subscription_state = uid in msc4306_thread_subscribers
actions = evaluator.run(
rules, uid, display_name, msc4306_thread_subscription_state
)
if "notify" in actions:
# Push rules say we should notify the user of this event
actions_by_user[uid] = actions
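The push evaluator hunks above fetch the thread's subscribers once per event and then derive a three-valued state per user: `None` when the feature is off or the event is on the main timeline, otherwise `True` or `False`. A minimal sketch of that logic, assuming a synchronous `fetch_subscribers` callable in place of the awaited store lookup and `"main"` as the value of `MAIN_TIMELINE`; `subscription_states` is a hypothetical helper name.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional

MAIN_TIMELINE = "main"  # assumed sentinel for events outside any thread


def subscription_states(
    user_ids: Iterable[str],
    thread_id: str,
    feature_enabled: bool,
    fetch_subscribers: Callable[[], FrozenSet[str]],
) -> Dict[str, Optional[bool]]:
    """Return None (not applicable), True, or False for each user."""
    thread_subscribers: Optional[FrozenSet[str]] = None
    if feature_enabled and thread_id != MAIN_TIMELINE:
        # One batched lookup per event, not one lookup per user.
        thread_subscribers = fetch_subscribers()

    states: Dict[str, Optional[bool]] = {}
    for uid in user_ids:
        state: Optional[bool] = None
        if thread_subscribers is not None:
            state = uid in thread_subscribers
        states[uid] = state
    return states


fetch_calls: List[int] = []


def fetch() -> FrozenSet[str]:  # hypothetical stand-in for the store query
    fetch_calls.append(1)
    return frozenset({"@alice:example.com"})


users = ["@alice:example.com", "@bob:example.com"]
in_thread = subscription_states(users, "$thread", True, fetch)
on_main = subscription_states(users, MAIN_TIMELINE, True, fetch)
```

Keeping `None` distinct from `False` lets the rule evaluator tell "this user is not subscribed" apart from "subscription state does not apply to this event".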

Some files were not shown because too many files have changed in this diff