* Properly update retry_last_ts when hitting the maximum retry interval
This was broken in 1.87 when the maximum retry interval got changed from
almost infinite to a week (and made configurable).
fixes#16101
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
* Add changelog
* Change fix + add test
* Add comment
---------
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
Co-authored-by: Mathieu Velten <mathieuv@matrix.org>
This should only be called on HomeServer objects which are configured
to run background tasks, which is automatically (and properly) done via
the call to setup().
If we don't have all the auth events in a room then not all state events will have a chain cover index. Even so, we can still use the chain cover index on the events that do have it, rather than bailing and using the slower functions.
This situation should not arise for newly persisted rooms, as we check we have the full auth chain for each event, but can happen for existing rooms.
c.f. #15245
We were seeing serialization errors when taking out multiple read locks.
The transactions were retried, so isn't causing any failures.
Introduced in #15782.
* Fix the method signature of `run_db_interaction` on the module API
* Newsfile
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
---------
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Misc. clean-ups to:
* Use keyword arguments.
* Return early (reducing indentation) of some functions.
* Removing duplicated / unused code.
* Use wrap_as_background_process.
* Add a module API function to provide `call_later`
* Newsfile
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
* Add comments
* Update version number
---------
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
* Add a cache invalidation clean-up task
* Run the cache invalidation stream clean-up on the background worker
* Tune down
* call_later is in millis!
* Newsfile
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
* fixup! Add a cache invalidation clean-up task
* Update synapse/storage/databases/main/cache.py
Co-authored-by: Eric Eastwood <erice@element.io>
* Update synapse/storage/databases/main/cache.py
Co-authored-by: Eric Eastwood <erice@element.io>
* MILLISEC -> MS
* Expand on comment
* Move and tweak comment about Postgres
* Use `wrap_as_background_process`
---------
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
For now this maintains compatible with old Synapses by falling back
to using transaction semantics on a per-access token. A future version
of Synapse will drop support for this.
Adds three new configuration variables:
* destination_min_retry_interval is identical to before (10mn).
* destination_retry_multiplier is now 2 instead of 5, the maximum value will
be reached slower.
* destination_max_retry_interval is one day instead of (essentially) infinity.
Capping this will cause destinations to continue to be retried sometimes instead
of being lost forever. The previous value was 2 ^ 62 milliseconds.
The location of the redacts field changes in room version 11. Ensure
it is copied to the *new* location for *old* room versions for
forwards-compatibility with clients.
Note that copying it to the *old* location for the *new* room version
was previously handled.
* Updates the rule ID.
* Use `event_property_is` instead of `event_match`.
This updates the implementation of MSC3958 to match the latest
text from the MSC.
The un_partial_stated_event_stream_sequence and
application_services_txn_id_seq were never properly configured
in the portdb script, resulting in an error on start-up.
And fix a bug in the implementation of the updated redaction
format (MSC2174) where the top-level redacts field was not
properly added for backwards-compatibility.
Allow configuring the set of workers to proxy outbound federation traffic through (`outbound_federation_restricted_to`).
This is useful when you have a worker setup with `federation_sender` instances responsible for sending outbound federation requests and want to make sure *all* outbound federation traffic goes through those instances. Before this change, the generic workers would still contact federation themselves for things like profile lookups, backfill, etc. This PR allows you to set more strict access controls/firewall for all workers and only allow the `federation_sender`'s to contact the outside world.
Make it more obvious which Python version runs on a given Linux distribution so when we end up dropping support for a given Python version, we can more easily find the reference to the Python version and remove any references for the distribution. We don't want to be running tests or building packages on a distribution that no longer has a supported Python version.
This way, we can avoid another situation like when we dropped support for Python 3.7 but forgot to drop the Debian Buster references everywhere (https://github.com/matrix-org/synapse/pull/15893)
Previously, if you just followed the instructions per the docs, you just ran into an error:
```sh
$ poetry run synapse_worker --config-path homeserver_generic_worker1.yaml
Missing mandatory `server_name` config option.
```
**Before:**
```
Error retrieving alias
```
**After:**
```
Error retrieving alias #foo:bar -> 401 Unauthorized
```
*Spawning from creating the [manual testing strategy for the outbound federation proxy](https://github.com/matrix-org/synapse/pull/15773).*
Unix socket support for `federation` and `client` Listeners has existed now for a little while(since [1.81.0](https://github.com/matrix-org/synapse/pull/15353)), but there was one last hold out before it could be complete: HTTP Replication communication. This should finish it up. The Listeners would have always worked, but would have had no way to be talked to/at.
---------
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
Co-authored-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
A lot of the functions have the same name in this space like `store_file`,
and we also do it multiple times for different reasons (main media repo,
other storage providers, thumbnails, etc) so it's good to differentiate
them so your head doesn't explode.
Follow-up to https://github.com/matrix-org/synapse/pull/15850
Tracing instrumentation to media `/upload` code paths to investigate https://github.com/matrix-org/synapse/issues/15841
Allow configuring the set of workers to proxy outbound federation traffic through (`outbound_federation_restricted_to`).
This is useful when you have a worker setup with `federation_sender` instances responsible for sending outbound federation requests and want to make sure *all* outbound federation traffic goes through those instances. Before this change, the generic workers would still contact federation themselves for things like profile lookups, backfill, etc. This PR allows you to set more strict access controls/firewall for all workers and only allow the `federation_sender`'s to contact the outside world.
The original code is from @erikjohnston's branches which I've gotten in-shape to merge.
Image.ANTIALIAS is not defined in current pillow releases. Since ANTIALIAS was just using LANCZOS anyways, this is just a cosmetic change, but makes synapse work with most recent pillow releases.
Signed-off-by: Giovanni Harting <539@idlegandalf.com>
Old device entries for the same user were being removed in individual
SQL commands, making the batch take way longer than necessary.
This combines the commands into a single one with a IN/ANY clause.
Example of log entry before the change, regularly observed with
"log_min_duration_statement = 10000" in PostgreSQL's config:
LOG: duration: 42538.282 ms statement:
DELETE FROM device_lists_stream
WHERE user_id = '@someone' AND device_id = 'someid1'
AND stream_id < 123456789
;
DELETE FROM device_lists_stream
WHERE user_id = '@someone' AND device_id = 'someid2'
AND stream_id < 123456789
;
[repeated for each device ID of that user, potentially a lot...]
With the patch applied on my instance for the past couple of days, I
no longer notice overly long statements of that particular kind.
Signed-off-by: pacien <pacien.trangirard@pacien.net>
* Fix use of config override directory in `devenv up`
`--config-directory` is for the generate config script; `-c` is for usage
* Add homeserver config override directory to gitignore
* Newsfile
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
---------
Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
If you leave a room and forget it, then rejoin it, the room would be
missing from the next initial sync.
fixes#13262
Signed-off-by: Nicolas Werner <n.werner@famedly.com>
The port DB script would try and run database background tasks, which
could fail if the data they acted on was in the process of being ported.
These exceptions were non fatal.
Fixes#15789
We now only block the client to backfill when we see a large gap in the events (more than 2 events missing in a row according to `depth`), more than 3 single-event holes, or not enough messages to fill the response. Otherwise, we return the messages directly to the client and backfill in the background for eventual consistency sake.
Fix https://github.com/matrix-org/synapse/issues/15696
* Check required power levels earlier in createRoom handler.
- If a server was configured to reject the creation of rooms with E2EE
enabled (by specifying an unattainably high power level for
"m.room.encryption" in default_power_level_content_override), the 403
error was not being triggered until after the room was created and
before the "m.room.power_levels" was sent. This allowed a user to
access the partially-configured room and complete the setup of E2EE
and power levels manually.
- This change causes the power level overrides to be checked earlier and
the request to be rejected before the user gains access to the room.
- A new `_validate_room_config` method is added to contain checks that
should be run before a room is created.
- The new test case confirms that a user request is rejected by the new
validation method.
Signed-off-by: Grant McLean <grant@catalyst.net.nz>
* Add a changelog file.
* Formatting fix for black.
* Remove unneeded line from test.
---------
Signed-off-by: Grant McLean <grant@catalyst.net.nz>
There appears to be a race where you can end up with entries in
`event_push_summary` with both a `NULL` and `main` thread ID.
Fixes#15736
Introduced in #15597
Fix a long-standing bu in `/sync` where timeout=0 does not skip caching, resulting in slow calls in cases where there are no new changes. Contributed by @PlasmaIntec.
Addan`admins`queryparametertothe[ListAccounts](https://matrix-org.github.io/synapse/v1.91/admin_api/user_admin_api.html#list-accounts) [admin API](https://matrix-org.github.io/synapse/v1.91/usage/administration/admin_api/index.html), to include only admins or to exclude admins in user queries.
Filter out user agent references to the sliding sync proxy and rust-sdk from the user_daily_visits table to ensure that Element X can be represented fully.
User constent and 3-PID changes capability cannot be enabled when using experimental [MSC3861](https://github.com/matrix-org/matrix-spec-proposals/pull/3861) support.
User constent and 3-PID changes capability cannot be enabled when using experimental [MSC3861](https://github.com/matrix-org/matrix-spec-proposals/pull/3861) support.
Fix a bug introduced in 1.87 where synapse would send an excessive amount of federation requests to servers which have been offline for a long time. Contributed by Nico.
Fix a bug introduced in 1.87 where synapse would send an excessive amount of federation requests to servers which have been offline for a long time. Contributed by Nico.
Generally speaking, streams are a series of notifications that something in Synapse's database has changed that the application might need to respond to.
For example:
- The events stream reports new events (PDUs) that Synapse creates, or that Synapse accepts from another homeserver.
- The account data stream reports changes to users' [account data](https://spec.matrix.org/v1.7/client-server-api/#client-config).
- The to-device stream reports when a device has a new [to-device message](https://spec.matrix.org/v1.7/client-server-api/#send-to-device-messaging).
It is very helpful to understand the streams mechanism when working on any part of Synapse that needs to respond to changes—especially if those changes are made by different workers.
To that end, let's describe streams formally, paraphrasing from the docstring of [`AbstractStreamIdGenerator`](
A stream is an append-only log `T1, T2, ..., Tn, ...` of facts[^1] which grows over time.
Only "writers" can add facts to a stream, and there may be multiple writers.
Each fact has an ID, called its "stream ID".
Readers should only process facts in ascending stream ID order.
Roughly speaking, each stream is backed by a database table.
It should have a `stream_id` (or similar) bigint column holding stream IDs, plus additional columns as necessary to describe the fact.
Typically, a fact is expressed with a single row in its backing table.[^2]
Within a stream, no two facts may have the same stream_id.
> _Aside_. Some additional notes on streams' backing tables.
>
> 1. Rich would like to [ditch the backing tables](https://github.com/matrix-org/synapse/issues/13456).
> 2. The backing tables may have other uses.
> For example, the events table serves backs the events stream, and is read when processing new events.
> But old rows are read from the table all the time, whenever Synapse needs to lookup some facts about an event.
> 3. Rich suspects that sometimes the stream is backed by multiple tables, so the stream proper is the union of those tables.
Stream writers can "reserve" a stream ID, and then later mark it as having being completed.
Stream writers need to track the completion of each stream fact.
In the happy case, completion means a fact has been written to the stream table.
But unhappy cases (e.g. transaction rollback due to an error) also count as completion.
Once completed, the rows written with that stream ID are fixed, and no new rows
will be inserted with that ID.
### Current stream ID
For any given stream reader (including writers themselves), we may define a per-writer current stream ID:
> The current stream ID _for a writer W_ is the largest stream ID such that
> all transactions added by W with equal or smaller ID have completed.
Similarly, there is a "linear" notion of current stream ID:
> The "linear" current stream ID is the largest stream ID such that
> all facts (added by any writer) with equal or smaller ID have completed.
Because different stream readers A and B learn about new facts at different times, A and B may disagree about current stream IDs.
Put differently: we should think of stream readers as being independent of each other, proceeding through a stream of facts at different rates.
**NB.** For both senses of "current", that if a writer opens a transaction that never completes, the current stream ID will never advance beyond that writer's last written stream ID.
For single-writer streams, the per-writer current ID and the linear current ID are the same.
Both senses of current ID are monotonic, but they may "skip" or jump over IDs because facts complete out of order.
_Example_.
Consider a single-writer stream which is initially at ID 1.
| Complete 3 | 1 | current ID unchanged, waiting for 2 to complete |
| Complete 2 | 3 | current ID jumps from 1 -> 3 |
| Reserve 4 | 3 | |
| Reserve 5 | 3 | |
| Reserve 6 | 3 | |
| Complete 5 | 3 | |
| Complete 4 | 5 | current ID jumps 3->5, even though 6 is pending |
| Complete 6 | 6 | |
### Multi-writer streams
There are two ways to view a multi-writer stream.
1. Treat it as a collection of distinct single-writer streams, one
for each writer.
2. Treat it as a single stream.
The single stream (option 2) is conceptually simpler, and easier to represent (a single stream id).
However, it requires each reader to know about the entire set of writers, to ensures that readers don't erroneously advance their current stream position too early and miss a fact from an unknown writer.
In contrast, multiple parallel streams (option 1) are more complex, requiring more state to represent (map from writer to stream id).
The payoff for doing so is that readers can "peek" ahead to facts that completed on one writer no matter the state of the others, reducing latency.
Note that a multi-writer stream can be viewed in both ways.
For example, the events stream is treated as multiple single-writer streams (option 1) by the sync handler, so that events are sent to clients as soon as possible.
But the background process that works through events treats them as a single linear stream.
Another useful example is the cache invalidation stream.
The facts this stream holds are instructions to "you should now invalidate these cache entries".
We only ever treat this as a multiple single-writer streams as there is no important ordering between cache invalidations.
(Invalidations are self-contained facts; and the invalidations commute/are idempotent).
### Writing to streams
Writers need to track:
- track their current position (i.e. its own per-writer stream ID).
- their facts currently awaiting completion.
At startup,
- the current position of that writer can be found by querying the database (which suggests that facts need to be written to the database atomically, in a transaction); and
- there are no facts awaiting completion.
To reserve a stream ID, call [`nextval`](https://www.postgresql.org/docs/current/functions-sequence.html) on the appropriate postgres sequence.
To write a fact to the stream: insert the appropriate rows to the appropriate backing table.
To complete a fact, first remove it from your map of facts currently awaiting completion.
Then, if no earlier fact is awaiting completion, the writer can advance its current position in that stream.
Upon doing so it should emit an `RDATA` message[^3], once for every fact between the old and the new stream ID.
### Subscribing to streams
Readers need to track the current position of every writer.
At startup, they can find this by contacting each writer with a `REPLICATE` message,
requesting that all writers reply describing their current position in their streams.
Writers reply with a `POSITION` message.
To learn about new facts, readers should listen for `RDATA` messages and process them to respond to the new fact.
The `RDATA` itself is not a self-contained representation of the fact;
readers will have to query the stream tables for the full details.
Readers must also advance their record of the writer's current position for that stream.
# Summary
In a nutshell: we have an append-only log with a "buffer/scratchpad" at the end where we have to wait for the sequence to be linear and contiguous.
---
[^1]: we use the word _fact_ here for two reasons.
Firstly, the word "event" is already heavily overloaded (PDUs, EDUs, account data, ...) and we don't need to make that worse.
Secondly, "fact" emphasises that the things we append to a stream cannot change after the fact.
[^2]: A fact might be expressed with 0 rows, e.g. if we opened a transaction to persist an event, but failed and rolled the transaction back before marking the fact as completed.
In principle a fact might be expressed with 2 or more rows; if so, each of those rows should share the fact's stream ID.
[^3]: This communication used to happen directly with the writers [over TCP](../../tcp_replication.md);
* You will need to add an [`instance_map`](usage/configuration/config_documentation.md#instance_map)
with the `main` process defined, as well as the relevant connection information from
it's HTTP `replication` listener (defined in step 1 above). Note that the `host` defined
is the address the worker needs to look for the `main` process at, not necessarily the same address that is bound to.
with the `main` process defined, as well as the relevant connection information from
it's HTTP `replication` listener (defined in step 1 above).
* Note that the `host` defined is the address the worker needs to look for the `main`
process at, not necessarily the same address that is bound to.
* If you are using Unix sockets for the `replication` resource, make sure to
use a `path` to the socket file instead of a `port`.
* Optionally, a [shared secret](usage/configuration/config_documentation.md#worker_replication_secret)
can be used to authenticate HTTP traffic between workers. For example:
@@ -145,9 +148,6 @@ In the config file for each worker, you must specify:
with an `http` listener.
***Synapse 1.72 and older:** if handling the `^/_matrix/client/v3/keys/upload` endpoint, the HTTP URI for
the main process (`worker_main_http_uri`). This config option is no longer required and is ignored when running Synapse 1.73 and newer.
***Synapse 1.83 and older:** The HTTP replication endpoint that the worker should talk to on the main synapse process
([`worker_replication_host`](usage/configuration/config_documentation.md#worker_replication_host) and
[`worker_replication_http_port`](usage/configuration/config_documentation.md#worker_replication_http_port)). If using Synapse 1.84 and newer, these are not needed if `main` is defined on the [shared configuration](#shared-configuration) `instance_map`
For example:
@@ -177,11 +177,11 @@ The following applies to Synapse installations that have been installed from sou
You can start the main Synapse process with Poetry by running the following command:
```console
poetry run synapse_homeserver -c [your homeserver.yaml]
poetry run synapse_homeserver --config-file [your homeserver.yaml]
```
For worker setups, you can run the following command
```console
poetry run synapse_worker -c [your worker.yaml]
poetry run synapse_worker --config-file [your homeserver.yaml] --config-file [your worker.yaml]
# Define the perl modules we require to run SyTest.
#
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.