1
0
Commit Graph

720 Commits

Author SHA1 Message Date
Richard van der Hoff
d80cd57c54 Fix new scheduled tasks jumping the queue (#17962)
Currently, when a new scheduled task is added and its scheduled time has
already passed, we set it to ACTIVE. This is problematic, because it
means it will jump the queue ahead of all other SCHEDULED tasks;
furthermore, if the Synapse process gets restarted, it will jump ahead
of any ACTIVE tasks which have been started but are taking a while to
run.

Instead, we leave it set to SCHEDULED, but kick off a call to
`_launch_scheduled_tasks`, which will decide if we actually have
capacity to start a new task, and start the newly-added task if so.
2024-11-28 18:06:19 +00:00
Andrew Ferrazzutti
006251a5d0 Add missing license header (#17799)
Co-authored-by: Erik Johnston <erik@matrix.org>
2024-10-08 12:01:44 +01:00
Andrew Morgan
316d635906 Fix NAME attribute of ReplicationRemovePusherRestServlet (#17779) 2024-10-04 09:53:35 +01:00
Andrew Ferrazzutti
5173741c71 Support MSC4140: Delayed events (Futures) (#17326) 2024-09-23 13:33:48 +01:00
Quentin Gliech
7d52ce7d4b Format files with Ruff (#17643)
I thought ruff check would also format, but it doesn't.

This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
2024-09-02 12:39:04 +01:00
Erik Johnston
554a92601a Reintroduce "Reduce device lists replication traffic."" (#17361)
Reintroduces https://github.com/element-hq/synapse/pull/17333


Turns out the reason for revert was down two master instances running
2024-06-25 10:34:34 +01:00
Erik Johnston
a98cb87bee Revert "Reduce device lists replication traffic." (#17360)
Reverts element-hq/synapse#17333

It looks like master was still sending out replication RDATA with the
old format... somehow
2024-06-25 09:57:34 +01:00
Erik Johnston
cf711ac03c Reduce device lists replication traffic. (#17333)
Reduce the replication traffic of device lists, by not sending every
destination that needs to be sent the device list update over
replication. Instead a "hosts to send to have been calculated"
notification over replication, and then federation senders read the
destinations from the DB.

For non federation senders this should heavily reduce the impact of a
user in many large rooms changing a device.
2024-06-24 14:15:13 +01:00
Erik Johnston
b5facbac0f Improve perf of sync device lists (#17216)
Re-introduces #17191, and includes #17197 and #17214

The basic idea is to stop calling `get_rooms_for_user` everywhere, and
instead use the table `device_lists_changes_in_room`.

Commits reviewable one-by-one.
2024-05-21 16:48:20 +01:00
Erik Johnston
ebe77381b0 Reduce pauses on large device list changes (#17192)
For large accounts waking up all the relevant notifier streams can cause
pauses of the reactor.
2024-05-14 14:39:11 +01:00
Erik Johnston
fd48fc4585 Fixups to new push stream (#17038)
Follow on from #17037
2024-03-28 16:29:23 +00:00
Erik Johnston
ea6bfae0fc Add support for moving /push_rules off of main process (#17037) 2024-03-28 15:44:07 +00:00
dependabot[bot]
1e68b56a62 Bump black from 23.10.1 to 24.2.0 (#16936) 2024-03-13 16:46:44 +00:00
Erik Johnston
23740eaa3d Correctly mention previous copyright (#16820)
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
2024-01-23 11:26:48 +00:00
Erik Johnston
eaad9bb156 Merge remote-tracking branch 'gitlab/clokep/license-license' into new_develop 2023-12-13 15:11:56 +00:00
David Robertson
c3627d0f99 Correctly read to-device stream pos on SQLite (#16682) 2023-11-24 13:42:38 +00:00
Patrick Cloke
8e1e62c9e0 Update license headers 2023-11-21 15:29:58 -05:00
Erik Johnston
6fec2d035f Also discard 'caches' and 'backfill' stream POSITIONS (#16655)
Follow on from #16640
2023-11-17 14:14:29 +00:00
Erik Johnston
fef08cbee8 Fix sending out of order POSITION over replication (#16639)
If a worker reconnects to Redis we send out the current positions of all our streams. However, if we're also trying to send out a backlog of RDATA at the same time then we can end up sending a `POSITION` with the current token *before* we've sent all the RDATA before the current token.

This doesn't cause actual bugs as the receiving servers see the POSITION, fetch the relevant rows from the DB, and then ignore the old RDATA as they come in. However, this is inefficient so it'd be better if we didn't  send out-of-order positions
2023-11-16 13:05:09 +00:00
Erik Johnston
898655fd12 More efficiently handle no-op POSITION (#16640)
We may receive `POSITION` commands where we already know that worker has
advanced past that position, so there is no point in handling it.
2023-11-16 12:32:17 +00:00
Erik Johnston
408c13801a Add fast path for replication events stream fetch (#16580)
We can bail early if the from token is greater than or equal to the
current token.
2023-10-30 14:47:57 +00:00
Erik Johnston
8c63e93286 Fix HTTP repl response to use minimum token (#16578) 2023-10-30 12:27:14 +00:00
Erik Johnston
5413cefe32 Reduce amount of caches POSITIONS we send (#16561)
Follow on from / actually correctly does #16557
2023-10-27 16:07:11 +01:00
Erik Johnston
89dbbd68e1 Reduce spurious replication catchup (#16555) 2023-10-27 13:27:20 +00:00
Erik Johnston
0680d76659 Reduce replication traffic due to reflected cache stream POSITION (#16557) 2023-10-27 12:51:08 +01:00
Erik Johnston
ba47fea528 Allow multiple workers to write to receipts stream. (#16432)
Fixes #16417
2023-10-25 16:16:19 +01:00
Jason Little
ffbe9b7666 Remove duplicate call to wake a remote destination when using federation sending worker (#16515) 2023-10-24 08:09:59 -04:00
Erik Johnston
8f35f8148e Fix bug where a new writer advances their token too quickly (#16473)
* Fix bug where a new writer advances their token too quickly

When starting a new writer (for e.g. persisting events), the
`MultiWriterIdGenerator` doesn't have a minimum token for it as there
are no rows matching that new writer in the DB.

This results in the the first stream ID it acquired being announced as
persisted *before* it actually finishes persisting, if another writer
gets and persists a subsequent stream ID. This is due to the logic of
setting the minimum persisted position to the minimum known position of
across all writers, and the new writer starts off not being considered.

* Fix sending out POSITIONs when our token advances without update

Broke in #14820

* For replication HTTP requests, only wait for minimal position
2023-10-23 16:57:30 +01:00
Patrick Cloke
49c9745b45 Avoid sending massive replication updates when purging a room. (#16510) 2023-10-18 12:26:01 -04:00
Richard van der Hoff
109882230c Clean up logging on event persister endpoints (#16488) 2023-10-14 17:57:27 +01:00
Patrick Cloke
ae5b997cfa Fix comments related to replication. (#16428) 2023-10-06 07:25:44 -04:00
Patrick Cloke
4e302b30b6 Add __slots__ to replication commands. (#16429)
To slightly reduce the amount of memory each command takes.
2023-10-05 07:38:55 -04:00
Erik Johnston
80ec81dcc5 Some refactors around receipts stream (#16426) 2023-10-04 16:28:40 +01:00
Erik Johnston
20fb08ec80 Downgrade repl stream time out error to warning (#16401)
This is because if a worker reaches ~100% CPU then everything starts
lagging and we hit the log line a lot. When at error we invoke sentry
and that has a lot of overhead, which then puts even more pressure on
the worker.
2023-09-29 11:52:48 +00:00
Patrick Cloke
f84da3c32e Add a cache around server ACL checking (#16360)
* Pre-compiles the server ACLs onto an object per room and
  invalidates them when new events come in.
* Converts the server ACL checking into Rust.
2023-09-26 11:57:50 -04:00
Erik Johnston
329597022e Some minor performance fixes for task schedular (#16313) 2023-09-14 16:20:47 +01:00
Erik Johnston
ab13fb08bf Improve logging of replication (#16309) 2023-09-13 09:51:50 +00:00
Erik Johnston
1cd410a783 Recheck if remote device is cached before requesting it (#16252)
This fixes a bug where we could get stuck re-requesting the device over
replication again and again.
2023-09-07 12:45:43 +00:00
Patrick Cloke
55c20da4a3 Merge remote-tracking branch 'origin/release-v1.91' into release-v1.92 2023-09-06 11:25:28 -04:00
Quentin Gliech
1940d990a3 Revert MSC3861 introspection cache, admin impersonation and account lock (#16258) 2023-09-06 15:19:51 +01:00
Erik Johnston
d35bed8369 Don't wake up destination transaction queue if they're not due for retry. (#16223) 2023-09-04 17:14:09 +01:00
David Robertson
e9eb26e3af Cache device resync requests over replication (#16241) 2023-09-04 11:57:59 +01:00
Patrick Cloke
e9235d92f2 Track currently syncing users by device for presence (#16172)
Refactoring to use both the user ID & the device ID when tracking
the currently syncing users in the presence handler.

This is done both locally and over replication. Note that the device
ID is discarded but will be used in a future change.
2023-08-29 11:44:07 -04:00
Patrick Cloke
40901af5e0 Pass the device ID around in the presence handler (#16171)
Refactoring to pass the device ID (in addition to the user ID) through
the presence handler (specifically the `user_syncing`, `set_state`,
and `bump_presence_active_time` methods and their replication
versions).
2023-08-28 13:08:49 -04:00
Patrick Cloke
1bf143699c Combine logic about not overriding BUSY presence. (#16170)
Simplify some of the presence code by reducing duplicated code between
worker & non-worker modes.

The main change is to push some of the logic from `user_syncing` into
`set_state`. This is done by passing whether the user is setting the presence
via a `/sync` with a new `is_sync` flag to `set_state`. If this is `true` some
additional logic is performed:

* Don't override `busy` presence.
* Update the `last_user_sync_ts`.
* Never update the status message.
2023-08-28 11:03:23 -04:00
Mathieu Velten
501da8ecd8 Task scheduler: add replication notify for new task to launch ASAP (#16184) 2023-08-28 14:03:51 +00:00
Erik Johnston
803f63df1c Fix perf of wait_for_stream_positions (#16148) 2023-08-22 15:11:22 +00:00
Shay
69048f7b48 Add an admin endpoint to allow authorizing server to signal token revocations (#16125) 2023-08-22 14:15:34 +00:00
Patrick Cloke
ad3f43be9a Run pyupgrade for python 3.7 & 3.8. (#16110) 2023-08-15 08:11:20 -04:00
Erik Johnston
ae55cc1e6b Add ability to wait for locks and add locks to purge history / room deletion (#15791)
c.f. #13476
2023-07-31 10:58:03 +01:00