This fixes one of the 2 blockers to using pytest instead of Trial. (That
switch is not formally motivated, but it sometimes seems like an
interesting idea because pytest has gained a lot of developer-experience
features that Trial hasn't. It would also remove one more coupling to the
Twisted framework.)
---
The `test_` prefix on this test helper makes pytest collect it as a
test.
We *could* set a `__test__ = False` attribute on the helper, but it felt
cleaner to just rename it (I also assumed it was a test from that
name!).
This was previously reported as:
https://github.com/element-hq/synapse/issues/18665
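To illustrate the two options, here is a minimal sketch. `looks_like_test` is a deliberately simplified, hypothetical model of pytest's default name-based collection (it is not pytest's real code), and the helper names are made up for the example.

```python
def looks_like_test(name: str, obj: object) -> bool:
    """Simplified stand-in for pytest's default rule for module-level functions:
    the name matches `test_*` and the object has not opted out via `__test__`."""
    return name.startswith("test_") and callable(obj) and getattr(obj, "__test__", True)


# A helper (not a test) whose name happens to match the `test_*` pattern.
def test_room_is_sane(room_id: str) -> bool:
    return room_id.startswith("!")


collected_before = looks_like_test("test_room_is_sane", test_room_is_sane)

# Option 1: keep the name, but opt out of collection explicitly.
test_room_is_sane.__test__ = False
collected_after_opt_out = looks_like_test("test_room_is_sane", test_room_is_sane)


# Option 2 (the one taken here): rename the helper so the pattern never matches.
def check_room_is_sane(room_id: str) -> bool:
    return room_id.startswith("!")


collected_after_rename = looks_like_test("check_room_is_sane", check_room_is_sane)
```

Renaming has the side benefit that human readers stop mistaking the helper for a test as well.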
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Part of: MSC4354 whose experimental feature tracking issue is
https://github.com/element-hq/synapse/issues/19409
Follows: #19340 (a necessary bugfix for `/event/` to set this metadata)
Partially supersedes: #18968
This PR implements the first batch of work to support MSC4354 Sticky
Events.
Sticky events are events that have been configured with a finite
'stickiness' duration, capped to 1 hour per the current MSC draft.
Whilst an event is sticky, we provide stronger delivery guarantees for
the event, both to our clients and to remote homeservers, essentially
making delivery reliable as long as we have a functional connection to
the client/server and until the stickiness expires.
This PR merely supports creating sticky events and receiving the sticky
TTL metadata in clients.
It is not suitable for trialling sticky events since none of the other
semantics are implemented.
Contains a temporary SQLite workaround due to a bug in our supported
version enforcement: https://github.com/element-hq/synapse/issues/19452
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
When we change the `required_state` config for a room in sliding sync,
we insert a new entry into the `sliding_sync_connection_required_state`
table. As the sliding sync connection advances we can accrue a lot of
stale entries, so let's clear those out.
This is a follow-on from #19211.
---------
Co-authored-by: Eric Eastwood <erice@element.io>
Fixes #19375
`prometheus_client` 0.24 makes `Collector` a generic type.
Previously, `InFlightGauge` inherited from both `Generic[MetricsEntry]`
and `Collector`, resulting in the error `TypeError: cannot create a
consistent MRO` when using `prometheus_client` >= 0.24. This restriction
on multiple `Generic` inheritance is more strictly enforced starting
with Python 3.14, but can still lead to issues with earlier versions of
Python.
This PR separates runtime and typing inheritance for `InFlightGauge`:
- Runtime: `InFlightGauge` inherits only from `Collector`
- Typing: `InFlightGauge` is generic
This preserves static typing, avoids MRO conflicts, and supports both
`prometheus_client` <0.24 and >=0.24.
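For readers unfamiliar with the failure mode, here is a rough, self-contained sketch. `Collector` below is a hypothetical stand-in for a third-party base class that has become generic (it is not the real `prometheus_client` class), and `InFlightGaugeSketch` only illustrates one way to split runtime and typing inheritance; it is not necessarily how this PR does it.

```python
from typing import TYPE_CHECKING, Any, Generic, Optional, TypeVar

T = TypeVar("T")
MetricsEntry = TypeVar("MetricsEntry")


class Collector(Generic[T]):
    """Hypothetical stand-in for a third-party base class that became generic."""

    def collect(self) -> list:
        return []


def naive_definition_error() -> Optional[str]:
    """Try the old-style definition and return the error message, if any."""
    try:
        # `Generic` is listed before a class that itself derives from `Generic`,
        # so Python cannot linearise the bases.
        class NaiveGauge(Generic[MetricsEntry], Collector):
            pass
    except TypeError as exc:
        return str(exc)
    return None


if TYPE_CHECKING:
    # What static type checkers see: a collector that is generic over MetricsEntry.
    class _GaugeBase(Collector[Any], Generic[MetricsEntry]):
        ...
else:
    # What actually runs: only `Collector` is a base. Subscripting is a no-op so
    # that `InFlightGaugeSketch[SomeEntry]` keeps working at runtime.
    class _GaugeBase(Collector):
        def __class_getitem__(cls, item):
            return cls


class InFlightGaugeSketch(_GaugeBase[MetricsEntry]):
    def __init__(self, name: str) -> None:
        self.name = name


error = naive_definition_error()
gauge = InFlightGaugeSketch[int]("in_flight_sketch")
```

The runtime branch never mentions `Generic` directly, so it does not care whether the real base class is generic or not.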
I have tested these changes locally with `prometheus_client` 0.23.1 &
0.24 on Python 3.14, sending a bunch of messages over federation while
watching a Grafana dashboard configured to show
`synapse_util_metrics_block_in_flight_total` &
`synapse_util_metrics_block_in_flight_real_time_sum` (the only metrics
set up to use `InFlightGauge`). Things work in each case.
https://github.com/element-hq/synapse/blob/a1e9abc7df3e6c43a95cba059348546a4c9d4491/synapse/util/metrics.py#L112-L119
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
Store the JSON content of scheduled delayed events as text instead of a
byte array. This brings it in line with the `event_json` table's `json`
column, and fixes the inability to schedule a delayed event with
non-ASCII characters in its content.
Fixes #19242
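As a rough illustration of the text-column approach, the sketch below round-trips non-ASCII JSON content through a `TEXT` column in an in-memory SQLite database. The table and column names are hypothetical and do not reflect Synapse's real schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the JSON content lives in a TEXT column, not a byte array.
conn.execute(
    "CREATE TABLE delayed_events_sketch ("
    " delay_id TEXT PRIMARY KEY,"
    " content TEXT NOT NULL"
    ")"
)

content = {"msgtype": "m.text", "body": "Grüße, 世界 ☕"}

# Keep non-ASCII characters as-is in the encoded JSON string.
encoded = json.dumps(content, ensure_ascii=False)
conn.execute(
    "INSERT INTO delayed_events_sketch (delay_id, content) VALUES (?, ?)",
    ("delay1", encoded),
)

(stored,) = conn.execute(
    "SELECT content FROM delayed_events_sketch WHERE delay_id = ?",
    ("delay1",),
).fetchone()

round_tripped = json.loads(stored)
```

Because the value stays a `str` end to end, there is no separate encode/decode step at which non-ASCII characters could be mishandled.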
Fixes #19347
This deprecates MSC2697, which has been closed since May 2024. As per
#19347, this seems to be a thing we can just rip out. The crypto team
have moved on to MSC3814 and suggest that developers who rely on
MSC2697 use MSC3814 instead.
MSC2697 implementation originally introduced by https://github.com/matrix-org/synapse/pull/8380
Fix `/event/` endpoint not transforming event with per-requester metadata
Pass `notif_event` through `filter_events_for_client`.
Not aware of an actual issue here, but it seems silly to bypass it.
Call it `filter_and_transform_events_for_client` to make it more obvious
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Spawned from wanting to [run a load
test](https://github.com/element-hq/synapse-rust-apps/pull/397) against
the Complement Docker image of Synapse and see metrics from the
homeserver.
### Why not just provide your own homeserver config?
Probably possible, but it gets tricky when you try to use the workers
variant of the Docker image (`docker/Dockerfile-workers`). The way to
work around it would probably be to `yq` edit everything in a script and
change `/data/homeserver.yaml` and `/conf/workers/*.yaml` to add the
`metrics` listener, and then modify `/conf/workers/shared.yaml` to add
`enable_metrics: true`. Doesn't spark much joy.
Fixes #19269
Versions of zope-interface from RHEL, Ubuntu LTS 22 & 24 and openSUSE
don't support the new Python union `X | Y` syntax for interfaces. This
PR partially reverts the change over to fully use the new syntax, adds a
minimum supported version of zope-interface to Synapse's dependency
list, and removes the linter auto-upgrades which prefer the newer
syntax.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Fixes https://github.com/element-hq/synapse/issues/19175
This PR moves tracking of which lazy-loaded memberships we've sent for
each room out of the required state table. This stops that table from
continuously growing, which massively helps performance, as we pull out
all matching rows for the connection when we receive a request.
The new table is only read when we have data in a room to send, so we
end up reading far fewer rows from the DB. However, we now read from
that table for every room we have events to return in, rather than once
at the start of the request.
For an explanation of how the new table works, see the
[comment](https://github.com/element-hq/synapse/blob/erikj/sss_better_membership_storage2/synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql#L15-L38)
on the table schema.
The table is designed so that we can later prune old entries if we wish,
but that is not implemented in this PR.
Reviewable commit-by-commit.
---------
Co-authored-by: Eric Eastwood <erice@element.io>
Related to https://github.com/element-hq/synapse/issues/17035. When
Synapse receives a request that is larger than the maximum size allowed,
it aborts the connection without ever sending back an HTTP response.
I dug into our usage of Twisted and how best to report such an error,
and this is what I came up with.
It would be ideal to report the status from within
`handleContentChunk`, but that is called too early in the Twisted HTTP
handling code, before things have been set up enough to properly write
a response.
I tested this change locally (with both the C-S and S-S APIs), and
requesters now receive a 413 response in addition to the connection
being closed.
Hopefully this will help to quickly detect when
https://github.com/element-hq/synapse/issues/17035 is occurring, as the
current situation makes it very hard to narrow things down to that
specific issue without making a lot of assumptions.
This PR also responds with more meaningful error codes now in the case
of:
- multiple `Content-Length` headers
- invalid `Content-Length` header value
- request content size being larger than the `Content-Length` value
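The checks above could be sketched roughly as follows. This is a standalone illustration, not the Twisted-integrated code in this PR: `check_content_length` is a hypothetical helper, and while the 413 for oversized requests matches the description above, using 400 for the other cases is an assumption made for the example.

```python
from typing import Optional, Sequence

# 413 comes from the description above; 400 is an assumption for this sketch.
PAYLOAD_TOO_LARGE = 413
BAD_REQUEST = 400


def check_content_length(
    header_values: Sequence[str], body_size: int, max_size: int
) -> Optional[int]:
    """Return an HTTP status code to reject the request with, or None to accept it.

    `header_values` holds every `Content-Length` header value seen on the request.
    """
    if len(header_values) > 1:
        # Multiple `Content-Length` headers.
        return BAD_REQUEST

    if header_values:
        raw = header_values[0].strip()
        if not (raw.isascii() and raw.isdigit()):
            # Invalid `Content-Length` header value.
            return BAD_REQUEST
        declared = int(raw)
        if declared > max_size:
            return PAYLOAD_TOO_LARGE
        if body_size > declared:
            # More content arrived than the header promised.
            return BAD_REQUEST

    if body_size > max_size:
        return PAYLOAD_TOO_LARGE
    return None
```

For example, `check_content_length(["2048"], 2048, max_size=1024)` rejects with 413, whereas a request carrying two `Content-Length` headers is rejected before its size is even considered.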
---------
Co-authored-by: Eric Eastwood <erice@element.io>
Fix #19233
Synapse fails to handle events in v12 rooms when the server is run with
the `{use_frozen_dicts: True}` config.
This PR fixes the issue, and adds tests which cover room creation,
joining, and joining over federation, with both frozen and not frozen
config settings, by extending the existing `test_send_join` federation
tests.
This approach to testing was chosen as it is a simple way to get
high-level, integration-style test coverage without going through all
our existing tests and trying to retroactively add in coverage when
using frozen dicts.
This should provide an easy place for future room versions to extend the
suite of tests and reduce the chance of introducing subtle bugs like
this in the future.
This changes the arguments in clock functions to be `Duration` and
converts call sites and constants into `Duration`. There are still some
more functions around that should be converted (e.g.
`timeout_deferred`), but we leave that to another PR.
We also change `.as_secs()` to return a float, as the rounding broke
things subtly. The only reason to keep it (it's the same as
`timedelta.total_seconds()`) is for symmetry with `as_millis()`.
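A minimal sketch of the idea, assuming `Duration` is a thin `timedelta` subclass. The method bodies and the integer return type of `as_millis()` are guesses for illustration, not Synapse's actual implementation.

```python
from datetime import timedelta


class Duration(timedelta):
    """Sketch of a `timedelta` with helper methods (illustrative only)."""

    def as_millis(self) -> int:
        # Whole milliseconds; returning an int here is an assumption.
        return int(self / timedelta(milliseconds=1))

    def as_secs(self) -> float:
        # Same value as `timedelta.total_seconds()`, kept for symmetry.
        return self.total_seconds()


one_hour = Duration(hours=1)
half_second = Duration(milliseconds=500)

one_hour_ms = one_hour.as_millis()
half_second_secs = half_second.as_secs()

# If `as_secs()` rounded to an integer, a sub-second duration would silently
# collapse to zero, which is the kind of subtle breakage a float avoids.
half_second_if_truncated = int(half_second.as_secs())
```

A sub-second delay passed to something like a `call_later`-style function is where such truncation would bite: the call would fire immediately instead of after half a second.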
Follows on from https://github.com/element-hq/synapse/pull/19223
MSC4380 aims to be a simplified implementation of MSC4155; the hope is
that we can get it specced and rolled out rapidly, so that we can
resolve the fact that `matrix.org` has enabled MSC4155.
The implementation leans heavily on what's already there for MSC4155.
It has its own `experimental_features` flag. If both MSC4155 and MSC4380
are enabled, and a user has both configurations set, then we prioritise
the MSC4380 one.
Contributed wearing my 🎩 Spec Core Team hat.
We have various constants to try and avoid mistyping of durations, e.g.
`ONE_HOUR_SECONDS * MILLISECONDS_PER_SECOND`. However, this can get a
little verbose and doesn't help with typing.
Instead, let's move towards a dedicated `Duration` class (basically a
[`timedelta`](https://docs.python.org/3/library/datetime.html#timedelta-objects)
with helper methods).
This PR introduces the new types and converts all usages of the existing
constants to it. Future PRs may work to move the clock methods to also
use it (e.g. `call_later` and `looping_call`).
Reviewable commit-by-commit.
We add some logic to expire sliding sync connections if they get old or
if there is too much pending data to return.
The values of the constants are picked fairly arbitrarily. A connection
is currently expired if either:
1. It has more than 100 rooms with pending events and hasn't been used
in over an hour
2. It hasn't been used for over a week
Reviewable commit-by-commit.
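The two expiry rules can be sketched as a small predicate. The constant and function names are hypothetical; only the thresholds (100 rooms, one hour, one week) come from the description above.

```python
from datetime import timedelta

# Thresholds from the description; the names are made up for this sketch.
MAX_ROOMS_WITH_PENDING_EVENTS = 100
IDLE_LIMIT_WITH_BACKLOG = timedelta(hours=1)
IDLE_LIMIT = timedelta(weeks=1)


def should_expire_connection(idle_for: timedelta, rooms_with_pending_events: int) -> bool:
    """Decide whether a sliding sync connection should be expired.

    `idle_for` is how long it has been since the connection was last used.
    """
    # Rule 2: unused for over a week, regardless of backlog.
    if idle_for > IDLE_LIMIT:
        return True

    # Rule 1: a large backlog *and* unused for over an hour.
    return (
        rooms_with_pending_events > MAX_ROOMS_WITH_PENDING_EVENTS
        and idle_for > IDLE_LIMIT_WITH_BACKLOG
    )
```

Note that a large backlog alone is not enough: a connection that is actively being used keeps its state however many rooms have pending events.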
---------
Co-authored-by: Eric Eastwood <erice@element.io>
As per recent proposals in MSC4140, remove authentication for
restarting/cancelling/sending a delayed event, and give each of those
actions its own endpoint. (The original consolidated endpoint is still
supported for backwards compatibility.)
---------
Co-authored-by: Half-Shot <will@half-shot.uk>
Spawning a background process comes with a bunch of overhead, so let's
try to reduce the number of background processes we need to spawn when
handling inbound federation.
Currently, we seem to be doing roughly one per command. Instead, let's
keep the background process alive for a bit, waiting for a new command
to come in.
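The general pattern looks roughly like the sketch below. It uses `asyncio` purely as a stand-in (Synapse itself runs on Twisted), and `CommandProcessor` and its attributes are hypothetical names for illustration.

```python
import asyncio
from typing import List, Optional


class CommandProcessor:
    """Handles queued commands in one worker that lingers while idle."""

    def __init__(self, idle_timeout: float) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._idle_timeout = idle_timeout
        self._worker: Optional[asyncio.Task] = None
        self.workers_spawned = 0
        self.handled: List[str] = []

    def submit(self, command: str) -> None:
        self._queue.put_nowait(command)
        # Only pay the spawn cost if no worker is currently alive.
        if self._worker is None or self._worker.done():
            self.workers_spawned += 1
            self._worker = asyncio.get_running_loop().create_task(self._run())

    async def _run(self) -> None:
        while True:
            try:
                command = await asyncio.wait_for(self._queue.get(), self._idle_timeout)
            except asyncio.TimeoutError:
                if self._queue.empty():
                    return  # Idle for long enough: let the worker exit.
                continue  # Something arrived just as we timed out.
            self.handled.append(command)


async def demo() -> CommandProcessor:
    processor = CommandProcessor(idle_timeout=0.2)
    for command in ("cmd-1", "cmd-2", "cmd-3"):
        processor.submit(command)
        await asyncio.sleep(0)  # Let the worker run between submissions.
    await asyncio.sleep(0.05)
    return processor
```

In `demo()`, three commands arrive in quick succession and are all handled by a single worker rather than three. The idle timeout is the knob: longer means fewer spawns but a worker that hangs around doing nothing for longer.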
I noticed this in some profiling. Basically, we prune the ratelimiters
by copying and iterating over every entry every 60 seconds. Instead,
let's use a wheel timer to track when we should potentially prune a
given key, so that we a) check fewer keys, and b) can run more
frequently. Hopefully this should mean we don't have a large pause
every time we prune a ratelimiter with lots of keys.
Also fixes a bug where we didn't prune entries that were added via
`record_action` and never subsequently updated. This affected the media
and joins-per-room ratelimiters.
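A toy version of the approach, to show why it checks fewer keys. `PruneWheel` and `prune` are hypothetical, simplified stand-ins and not Synapse's actual wheel timer or ratelimiter code; entries are modelled as a plain dict of key to expiry time in seconds.

```python
import math
from collections import defaultdict
from typing import DefaultDict, Dict, Hashable, Set


class PruneWheel:
    """Groups keys into coarse time buckets so only due keys are visited."""

    def __init__(self, bucket_size_s: float = 5.0) -> None:
        self._bucket_size_s = bucket_size_s
        self._buckets: DefaultDict[int, Set[Hashable]] = defaultdict(set)

    def schedule(self, key: Hashable, check_at_s: float) -> None:
        # Round up so a key is never handed back before its check time.
        bucket = math.ceil(check_at_s / self._bucket_size_s)
        self._buckets[bucket].add(key)

    def due(self, now_s: float) -> Set[Hashable]:
        current = int(now_s // self._bucket_size_s)
        ready: Set[Hashable] = set()
        for bucket in [b for b in self._buckets if b <= current]:
            ready |= self._buckets.pop(bucket)
        return ready


def prune(entries: Dict[Hashable, float], wheel: PruneWheel, now_s: float) -> None:
    """Visit only the keys whose bucket has come due, not every entry."""
    for key in wheel.due(now_s):
        expires_at = entries.get(key)
        if expires_at is None:
            continue  # Already removed some other way.
        if expires_at <= now_s:
            del entries[key]
        else:
            # The entry was extended since it was scheduled; check again later.
            wheel.schedule(key, expires_at)


entries: Dict[Hashable, float] = {"a": 10.0, "b": 100.0, "c": 10.0}
wheel = PruneWheel(bucket_size_s=5.0)
for key, expires_at in entries.items():
    wheel.schedule(key, expires_at)

entries["c"] = 50.0  # "c" is refreshed after being scheduled.

prune(entries, wheel, now_s=20.0)
after_first_prune = dict(entries)

prune(entries, wheel, now_s=60.0)
after_second_prune = dict(entries)
```

Each prune pass touches only the keys in buckets that have come due, so the cost scales with the number of keys that might need pruning rather than with the total number of entries.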