Be mindful that it's possible to run Synapse multiple times in the same
Python process. So we only need to do some part of the logging setup
once.
- We only need to setup the global log record factory and context filter
once
- We only need to redirect Twisted logging once
### Background
As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.
"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
### Background
As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.
"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
### Prior art
Previously, we exposed `server_name` by providing a static logging
`MetadataFilter` that injected the values:
205d9e4fc4/synapse/config/logger.py (L216)
While this can work fine for the normal case of one Synapse instance per
Python process, this configures things globally and isn't compatible
when we try to start multiple Synapse instances because each subsequent
tenant will overwrite the previous tenant.
### What does this PR do?
We remove the `MetadataFilter` and replace it by tracking the
`server_name` in the `LoggingContext` and expose it with our existing
[`LoggingContextFilter`](205d9e4fc4/synapse/logging/context.py (L584-L622))
that we already use to expose information about the `request`.
This means that the `server_name` value follows wherever we log as
expected even when we have multiple Synapse instances running in the
same process.
### A note on logcontext
Anywhere, Synapse mistakenly uses the `sentinel` logcontext to log
something, we won't know which server sent the log. We've been fixing up
`sentinel` logcontext usage as tracked by
https://github.com/element-hq/synapse/issues/18905
Any further `sentinel` logcontext usage we find in the future can be
fixed piecemeal as normal.
d2a966f922/docs/log_contexts.md (L71-L81)
### Testing strategy
1. Adjust your logging config to include `%(server_name)s` in the format
```yaml
formatters:
precise:
format: '%(asctime)s - %(server_name)s - %(name)s - %(lineno)d -
%(levelname)s - %(request)s - %(message)s'
```
1. Start Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Make some requests (`curl
http://localhost:8008/_matrix/client/versions`, etc)
1. Open the homeserver logs and notice the `server_name` in the logs as
expected. `unknown_server_from_sentinel_context` is expected for the
`sentinel` logcontext (things outside of Synapse).
Introduce `Clock.call_when_running(...)` to wrap startup code in a
logcontext, ensuring we can identify which server generated the logs.
Background:
> Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
> logcontext as we want to know which server the logs came from. In practice, this is not
> always the case yet especially outside of request handling.
>
> Global things outside of Synapse (e.g. Twisted reactor code) should run in the
> `sentinel` logcontext. It's only when it calls into application code that a logcontext
> gets activated. This means the reactor should be started in the `sentinel` logcontext,
> and any time an awaitable yields control back to the reactor, it should reset the
> logcontext to be the `sentinel` logcontext. This is important to avoid leaking the
> current logcontext to the reactor (which would then get picked up and associated with
> the next thing the reactor does).
>
> *-- `docs/log_contexts.md`
Also adds a lint to prefer `Clock.call_when_running(...)` over
`reactor.callWhenRunning(...)`
Part of https://github.com/element-hq/synapse/issues/18905
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
This modifies the configuration of structured logging to be usable from
the standard Python logging configuration.
This also separates the formatting of logs from the transport allowing
JSON logs to files or standard logs to sockets.