This is to handle the case of deleting lots of "bot" devices at once.
Reviewable commit-by-commit
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
<ol>
<li>
Reorder columns in `event_txn_id_device_id_txn_id` index.
The reordered index satisfies the foreign key on `(user_id, device_id)`, making
reverse lookups, as needed for device deletions, more efficient.
On matrix.org this improves device deletion performance by roughly 8 to 10×.
</li>
</ol>
Rationale:
## On the `event_txn_id_device_id` table:
We currently have this index:
```sql
-- This ensures that there is only one mapping per (room_id, user_id, device_id, txn_id) tuple.
CREATE UNIQUE INDEX IF NOT EXISTS event_txn_id_device_id_txn_id
ON event_txn_id_device_id(room_id, user_id, device_id, txn_id);
```
The main way we use this table is
```python
return await self.db_pool.simple_select_one_onecol(
    table="event_txn_id_device_id",
    keyvalues={
        "room_id": room_id,
        "user_id": user_id,
        "device_id": device_id,
        "txn_id": txn_id,
    },
    retcol="event_id",
    allow_none=True,
    desc="get_event_id_from_transaction_id_and_device_id",
)
```
But this foreign key is poorly supported by that index, which makes deletions in
the `devices` table inefficient (each one triggers a full scan of the above index):
```sql
FOREIGN KEY (user_id, device_id)
REFERENCES devices (user_id, device_id) ON DELETE CASCADE
```
I propose re-ordering the columns in that index to `(user_id, device_id,
room_id, txn_id)` (by replacing the index).
That way the foreign key back-check can use the prefix of this index, while the
index remains useful for its original purpose.
It doesn't take any extra disk space and does not harm write performance
(because the same amount of writing work needs to be performed).
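To make that concrete, a schema change along these lines would perform the swap.
This is a minimal sketch written against a plain DB-API cursor, not the actual
Synapse migration; in particular the replacement index name
`event_txn_id_device_id_txn_id2` is an assumption.
```python
# Hypothetical sketch of the index swap (Postgres syntax); the new index name
# is assumed, and Synapse's real schema delta may look different.
def reorder_event_txn_id_device_id_index(cur) -> None:
    # Create the replacement first, so the uniqueness guarantee and the
    # existing (room_id, user_id, device_id, txn_id) lookups stay covered.
    cur.execute(
        """
        CREATE UNIQUE INDEX IF NOT EXISTS event_txn_id_device_id_txn_id2
        ON event_txn_id_device_id(user_id, device_id, room_id, txn_id)
        """
    )
    # Drop the old index: the foreign key back-check performed when rows are
    # deleted from `devices` can now use the (user_id, device_id) prefix of
    # the new index instead of scanning the whole thing.
    cur.execute("DROP INDEX IF EXISTS event_txn_id_device_id_txn_id")
```
Equality lookups on the full `(room_id, user_id, device_id, txn_id)` tuple are
unaffected by the column order, so the original use case keeps working.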
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
It came up that this was somewhat confusing and an example might help.
So here's an example :)
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
When a request gets ratelimited we (optionally) wait ~500ms before
returning, to mitigate clients that like to tight-loop on request
failures. However, this is currently implemented by pausing request
processing at the point where we check the ratelimit, which might be deep
within request processing, e.g. while locks are held. Instead, let's hoist
the pause to the very top of the HTTP handler.
Hopefully, this mitigates the issue where a user sending lots of events
to a single room can see their requests time out due to the combination
of the linearizer and the pausing of the request. Instead, they should
see the requests 429 after ~500ms.
The first commit is a refactor to pass the `Clock` to `AsyncResource`,
the second commit is the behavioural change.
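The rough shape of the change is sketched below. This is a hypothetical
illustration using asyncio rather than Twisted, and `LimitExceededError` here
is a locally defined stand-in, not a reference to Synapse's actual classes.
```python
import asyncio

RATELIMIT_PAUSE_SECONDS = 0.5  # assumed value, matching the ~500ms above


class LimitExceededError(Exception):
    """Stand-in for the error raised when a ratelimit check fails."""


async def handle_request(request, handler):
    try:
        # The ratelimit check still happens wherever it currently happens
        # inside `handler`, but on failure the error now propagates straight
        # out, releasing any locks (e.g. a per-room linearizer) on the way.
        return await handler(request)
    except LimitExceededError:
        # The pause is applied here, at the very top of the HTTP handler,
        # instead of deep inside request processing.
        await asyncio.sleep(RATELIMIT_PAUSE_SECONDS)
        return 429, {"errcode": "M_LIMIT_EXCEEDED"}
```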
The background updates are being registered on an object that is for the
_state_ database, but the actual tables are on the _main_ database. This
just moves them to a different store that has access to the right tables.
I noticed this when trying to do a full schema dump because I was curious
what has changed since the last one.
Fixes #16054
### Pull Request Checklist
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
We do this by shoving it into Rust. We believe our Python HTTP client is
a bit slow.
Also bumps the minimum Rust version to 1.81.0, released last September (over
six months ago).
To allow for async Rust, this includes some adapters between Tokio in Rust
and the Twisted reactor in Python.
This was correctly handled for the "fallback" case where the background
updates hadn't finished.
---------
Co-authored-by: Eric Eastwood <erice@element.io>
This can be reviewed commit by commit.
This enables the `flake8-logging` and `flake8-logging-format` rules in
Ruff, as well as logging exception stack traces in a few places where it
makes sense:
- https://docs.astral.sh/ruff/rules/#flake8-logging-log
- https://docs.astral.sh/ruff/rules/#flake8-logging-format-g
### Linting to avoid pre-formatting log messages
See [`adamchainz/flake8-logging` -> *LOG011 avoid pre-formatting log
messages*](152db2f167/README.rst (log011-avoid-pre-formatting-log-messages))
Practically, this means preferring placeholders (`%s`) over f-strings for
logging.
This is because the placeholder arguments are passed to the logger
separately, so it can do special handling of them.
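To make this concrete, here is a small illustrative snippet (not taken from the
Synapse codebase):
```python
import logging

logger = logging.getLogger(__name__)

user_id = "@alice:example.com"  # hypothetical value for illustration

# Discouraged (LOG011): the message is pre-formatted before the logger sees it,
# so handlers and integrations only ever receive the final string.
logger.info(f"Deleted devices for {user_id}")

# Preferred: pass a placeholder and the argument separately. Formatting only
# happens if the record is actually emitted, and handlers can inspect the args.
logger.info("Deleted devices for %s", user_id)
```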
For example, Sentry will record the args separately in their logging
integration:
c15b390dfe/sentry_sdk/integrations/logging.py (L280-L284)
One theoretical small perf benefit is that messages for log levels that
aren't enabled won't get formatted, so we don't unnecessarily create
formatted strings.