We add some logic to expire sliding sync connections if they get old or
if there is too much pending data to return.
The values of the constants are picked fairly arbitrarily, these are
currently:
1. More than 100 rooms with pending events if the connection hasn't been
used in over an hour
2. The connection hasn't been used for over a week
Reviewable commit-by-commit
---------
Co-authored-by: Eric Eastwood <erice@element.io>
This is useful as someone downstream can source the
`scripts-dev/complement.sh` script and run the same set of tests as
Synapse:
```bash
# Grab the test packages supported by Synapse.
#
# --fast: Skip rebuilding the docker images,
# --build-only: Will only build Docker images but because we also used `--fast`, it won't do anything.
# `>/dev/null` to redirect stdout to `/dev/null` to get rid of the `echo` logs from the script.
test_packages=$(source ${SYNAPSE_DIR}/scripts-dev/complement.sh --fast --build-only >/dev/null && echo "$SYNAPSE_SUPPORTED_COMPLEMENT_TEST_PACKAGES")
echo $test_packages
```
This is spawning from wanting to run the same set of Complement tests in
the https://github.com/element-hq/synapse-rust-apps project.
This is useful so that the script can be sourced by other scripts
without exiting the calling subshell (composable).
This is split out from https://github.com/element-hq/synapse/pull/19208
to make easy to understand PR's and build up to where we want to go.
These aren't really something personally experienced but I just went
around the codebase looking for all of the Deferred `.callback`,
`.errback`, and `.cancel` and wrapped them with
`PreserveLoggingContext()`
Spawning from wanting to solve
https://github.com/element-hq/synapse/issues/19165 but unconfirmed
whether this has any effect.
To explain the fix, see the [*Deferred
callbacks*](3b59ac3b69/docs/log_contexts.md (deferred-callbacks))
section of our logcontext docs for more info (specifically using
solution 2).
As per recent proposals in MSC4140, remove authentication for
restarting/cancelling/sending a delayed event, and give each of those
actions its own endpoint. (The original consolidated endpoint is still
supported for backwards compatibility.)
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Half-Shot <will@half-shot.uk>
This was unintentionally changed in
https://github.com/element-hq/synapse/pull/19068.
There is no real bug here. Without this PR, we just printed an empty
string for the `sentinel` logcontext whereas the prior art behavior was
to print `sentinel` which this PR restores.
Found while staring at the logs in
https://github.com/element-hq/synapse/issues/19165
### Reproduction strategy
1. Configure Synapse with
[logging](df802882bb/docs/sample_log_config.yaml)
1. Start Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Notice the `asyncio - 64 - DEBUG - - Using selector: EpollSelector`
log line (notice empty string `- -`)
1. With this PR, the log line will be `asyncio - 64 - DEBUG - sentinel -
Using selector: EpollSelector` (notice `sentinel`)
This arises mostly from my recent experience adding a stream for Thread
Subscriptions
and trying to help others add their own streams.
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Spawning a background process comes with a bunch of overhead, so let's
try to reduce the number of background processes we need to spawn when
handling inbound fed.
Currently, we seem to be doing roughly one per command. Instead, lets
keep the background process alive for a bit waiting for a new command to
come in.
I noticed this in some profiling. Basically, we prune the ratelimiters
by copying and iterating over every entry every 60 seconds. Instead,
let's use a wheel timer to track when we should potentially prune a
given key, and then we a) check fewer keys, and b) can run more
frequently. Hopefully this should mean we don't have a large pause
everytime we prune a ratelimiter with lots of keys.
Also fixes a bug where we didn't prune entries that were added via
`record_action` and never subsequently updated. This affected the media
and joins-per-room ratelimiter.
This regressed in https://github.com/element-hq/synapse/pull/19121. I
moved things in https://github.com/element-hq/synapse/pull/19121 because
I thought that it made sense to redirect anything printed to
`stdout`/`stderr` to the logs as early as possible. But we actually want
to log any immediately apparent problems during initialization to
`stderr` in the terminal so that they are obvious and visible to the
operator.
Now, I've moved `redirect_stdio_to_logs()` back to where it was
previously along with some proper comment context for why we have it
there.
- Move `register_start` (calls `os._exit(1)`) out of `setup` (our
composable function)
- We want to avoid `exit(...)` because we use these composable functions
in Synapse Pro for small hosts where we have multiple Synapse instances
running in the same process. We don't want a problem from one homeserver
tenant causing the entire Python process to exit and affect all of the
other homeserver tenants.
- Continuation of https://github.com/element-hq/synapse/pull/19116
- Align our app entrypoints: `homeserver` (main), `generic_worker`
(worker), and `admin_cmd`
### Background
As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process) (c.f
Synapse Pro for small hosts), we're currently diving into the details
and implications of running multiple instances of Synapse in the same
Python process.
"Clean tenant provisioning" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
Move exception handling up the stack (avoid `exit(1)` in our composable
functions)
Relevant to Synapse Pro for small hosts as we don't want to exit the
entire Python process and affect all homeserver tenants.
### Background
As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process) (c.f
Synapse Pro for small hosts), we're currently diving into the details
and implications of running multiple instances of Synapse in the same
Python process.
"Clean tenant provisioning" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
Be mindful that Synapse can be run alongside other code in the same
Python process. We shouldn't overwrite fields on given log record unless
we know it's relevant to Synapse.
(no clobber)
### Background
As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.
"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
The schema lint tries to make sure we don't add or remove indices in
schema files (rather than as background updates), *unless* the table was
created in the same schema file.
The regex to pull out the `CREATE TABLE` SQL incorrectly didn't
recognise `IF NOT EXISTS`.
There is a test delta file that shows that we accept different types of
`CREATE TABLE` and `CREATE INDEX` statements, as well as an index
creation that doesn't have a matching create table (to show that we do
still catch it). The test delta should be removed before merge.
Fix lost logcontext when using `timeout_deferred(...)` and things
actually timeout.
Fix https://github.com/element-hq/synapse/issues/19087 (our HTTP client
times out requests using `timeout_deferred(...)`
Fix https://github.com/element-hq/synapse/issues/19066 (`/sync` uses
`notifier.wait_for_events()` which uses `timeout_deferred(...)` under
the hood)
### When/why did these lost logcontext warnings start happening?
```
synapse.logging.context - 107 - WARNING - sentinel - Expected logging context call_later but found POST-2453
synapse.logging.context - 107 - WARNING - sentinel - Expected logging context call_later was lost
```
In https://github.com/element-hq/synapse/pull/18828, we switched
`timeout_deferred(...)` from using `reactor.callLater(...)` to
[`clock.call_later(...)`](3b59ac3b69/synapse/util/clock.py (L224-L313))
under the hood. This meant it started dealing with logcontexts but our
`time_it_out()` callback didn't follow our [Synapse logcontext
rules](3b59ac3b69/docs/log_contexts.md).