Refactor `Measure` block metrics to be homeserver-scoped (add
`server_name` label to block metrics).
Part of https://github.com/element-hq/synapse/issues/18592
### Testing strategy
#### See behavior of previous `metrics` listener
1. Add the `metrics` listener in your `homeserver.yaml`
```yaml
listeners:
- port: 9323
type: metrics
bind_addresses: ['127.0.0.1']
```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9323/metrics`
1. Observe response includes the block metrics
(`synapse_util_metrics_block_count`,
`synapse_util_metrics_block_in_flight`, etc)
#### See behavior of the `http` `metrics` resource
1. Add the `metrics` resource to a new or existing `http` listeners in
your `homeserver.yaml`
```yaml
listeners:
- port: 9322
type: http
bind_addresses: ['127.0.0.1']
resources:
- names: [metrics]
compress: false
```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` (it's just a `GET`
request so you can even do in the browser)
1. Observe response includes the block metrics
(`synapse_util_metrics_block_count`,
`synapse_util_metrics_block_in_flight`, etc)
Fixes: #18491
Fix hotlooping due to skipped PDUs if there is still no progress to be
made.
This could bite if the event was purged since being skipped during
catch-up.
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Another config option on my quest to a `*_path` variant for every
secret. Adds the config options `recaptcha_private_key_path` and
`recaptcha_public_key_path`. Tests and docs are included.
A public key is of course no secret, but it is closely related to the
private key, so it’s still useful to have a `*_path` variant for it.
`commonmark` has been deprecated in favor of `markdown-it-py`, and its
type hints have been [removed from
typeshed](https://github.com/python/typeshed/issues/13725).
This switches the release script to use `markdown-it-py` instead of
`commonmark` for parsing the `CHANGES.md`
This should be reviewed commit by commit.
Nowadays it's trivial to propagate cache invalidations, which means we
can move some things off the main process, and not go through HTTP
replication.
`ReplicationGetQueryRestServlet` appeared to be unused, and was very
weird, as it was being called if the current instance is the main one…
to RPC to the main one (if no instance is set on a replication client,
it makes it to the main process)
The other two handlers could be relatively trivially moved to any
workers, moving some methods to the worker store.
**I've intentionally not removed the replication servlets yet** so that
it's safe to rollout, and will do another PR that clean those up to
remove on the N+1 version
You can now configure how much media can be uploaded by a user in a
given time period.
Note the first commit here is a refactor of create/upload content
function
This implements
https://github.com/matrix-org/matrix-spec-proposals/pull/3765 which is
already merged and, therefore, can use stable identifiers.
For `/publicRooms` and `/hierarchy`, the topic is read from the
eponymous field of the `current_state_events` table. Rather than
introduce further columns in this table, I changed the insertion /
update logic to write the plain-text topic from the rich topic into the
existing field. This will not take effect for existing rooms unless
their topic is changed. However, existing rooms shouldn't have rich
topics to begin with.
Similarly, for server-side search, I changed the insertion logic of the
`event_search` table to prefer the value from the rich topic. Again,
existing events shouldn't have rich topics and, therefore, don't need to
be migrated in the table.
Spec doc: https://spec.matrix.org/v1.15/client-server-api/#mroomtopic
Part of supporting Matrix v1.15:
https://spec.matrix.org/v1.15/client-server-api/#mroomtopic
Signed-off-by: Johannes Marbach <n0-0ne+github@mailbox.org>
Co-authored-by: Eric Eastwood <erice@element.io>
This takes down the CI time to build wheels from 50 minutes to <10
minutes.
**It also fixes macOS ARM builds, and includes more ARM builds in
general** (we were ignoring pypy and musl before). This doesn't cost
much for us to do this, reasons for not doing this is 1. space on PyPI
and 2. keeping them 'officially' supported?
This is the list of wheels this built (`+` are the ones added):
```diff
matrix_synapse-1.133.0-cp39-abi3-macosx_10_9_x86_64.whl
+ matrix_synapse-1.133.0-cp39-abi3-macosx_11_0_arm64.whl
matrix_synapse-1.133.0-cp39-abi3-manylinux_2_28_aarch64.whl
matrix_synapse-1.133.0-cp39-abi3-manylinux_2_28_x86_64.whl
+ matrix_synapse-1.133.0-cp39-abi3-musllinux_1_2_aarch64.whl
matrix_synapse-1.133.0-cp39-abi3-musllinux_1_2_x86_64.whl
matrix_synapse-1.133.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl
+ matrix_synapse-1.133.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl
+ matrix_synapse-1.133.0-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl
matrix_synapse-1.133.0-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl
matrix_synapse-1.133.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl
+ matrix_synapse-1.133.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl
+ matrix_synapse-1.133.0-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl
matrix_synapse-1.133.0-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl
```
And the numbers aaaaare 🥁
-
[before](https://github.com/element-hq/synapse/actions/runs/16072488018):
54 minutes
-
[after](https://github.com/element-hq/synapse/actions/runs/16004034949?pr=18618):
10 minutes
**Revert
[e43b0f9](e43b0f9bd1)
before merging**
This splits the building of docker images in 2 jobs, one for each
platform, using the native ARM runners for arm64.
The tricky part here is to get back a nice multi-arch manifest.
Previously, you'd do that by pushing each platform image in two distinct
tags, then referencing them in a multi-arch manifest. Nowadays, it's
possible to push images by their digest only, then creating the manifest
for those pushed digests separately
This is inspired by the Docker docs on how to distribute multi-platform
image builds:
https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners
`ghcr.io/element-hq/synapse:sha-c733dd6` is an example image that got
built by this workflow (there is a temporary sha-* tag on
workflow_dispatch runs to help trying out the workflow)
I also had to make sure we sign the manifests correctly:
```
$ cosign verify --certificate-oidc-issuer https://token.actions.githubusercontent.com --certificate-identity-regexp 'https://github.com/element-hq/synapse/.github/workflows/docker.yml@.*' ghcr.io/element-hq/synapse:sha-c733dd6
Verification for ghcr.io/element-hq/synapse:sha-c733dd6 --
The following checks were performed on each of these signatures:
- The cosign claims were validated
- Existence of the claims in the transparency log was verified offline
- The code-signing certificate was verified using trusted certificate authority certificates
```
And the numbers aaaaare 🥁
-
[before](https://github.com/element-hq/synapse/actions/runs/16118229296/job/45477093703):
30 minutes
-
[after](https://github.com/element-hq/synapse/actions/runs/16021743575):
4 minutes
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Request to raise the defensive version cap for poetry-core from 1.9.1 to
2.1.3.
My understanding is that the major version bump of poetry signals the
transition to standardized pyproject.toml metadata, but does not affect
backwards compatibility.
This is a subset of the changes in #18432Fixes#18200
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
This is to handle the case of deleting lots of "bot" devices at once.
Reviewable commit-by-commit
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
<ol>
<li>
Reorder columns in `event_txn_id_device_id_txn_id` index \
This now satisfies the foreign key on `(user_id, device_id)` making
reverse lookups, as needed for device deletions, more efficient.
This improves device deletion performance by on the order of 8 to 10×
on matrix.org.
</li>
</ol>
Rationale:
## On the `event_txn_id_device_id` table:
We currently have this index:
```sql
-- This ensures that there is only one mapping per (room_id, user_id, device_id, txn_id) tuple.
CREATE UNIQUE INDEX IF NOT EXISTS event_txn_id_device_id_txn_id
ON event_txn_id_device_id(room_id, user_id, device_id, txn_id);
```
The main way we use this table is
```python
return await self.db_pool.simple_select_one_onecol(
table="event_txn_id_device_id",
keyvalues={
"room_id": room_id,
"user_id": user_id,
"device_id": device_id,
"txn_id": txn_id,
},
retcol="event_id",
allow_none=True,
desc="get_event_id_from_transaction_id_and_device_id",
)
```
But this foreign key is relatively unsupported, making deletions in
the devices table inefficient (full index scan on the above index):
```sql
FOREIGN KEY (user_id, device_id)
REFERENCES devices (user_id, device_id) ON DELETE CASCADE
```
I propose re-ordering the columns in that index to: `(user_id,
device_id, room_id, txn_id)` (by replacing it).
That way the foreign key back-check can rely on the prefix of this
index, but it's still useful for the original purpose it was made for.
It doesn't take any extra disk space and does not harm write performance
(because the same amount of writing work needs to be performed).
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
It came up that this was somewhat confusing and an example might help.
So here's an example :)
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
When a request gets ratelimited we (optionally) wait ~500ms before
returning to mitigate clients that like to tightloop on request
failures. However, this is currently implemented by pausing request
processing when we check for ratelimits, which might be deep within
request processing, and e.g. while locks are held. Instead, let's hoist
the pause to the very top of the HTTP handler.
Hopefully, this mitigates the issue where a user sending lots of events
to a single room can see their requests time out due to the combination
of the linearizer and the pausing of the request. Instead, they should
see the requests 429 after ~500ms.
The first commit is a refactor to pass the `Clock` to `AsyncResource`,
the second commit is the behavioural change.
The background updates are being registered on an object that is for the
_state_ database, but the actual tables are on the _main_ database. This
just moves them to a different store that can access the right stuff.
I noticed this when trying to do a full schema dump cause I was curious
what has changed since the last one.
Fixes#16054
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))