Before https://github.com/element-hq/synapse/pull/18849, we we're using our own custom `LogContextScopeManager` which tied the tracing scope to the `LoggingContext`. Since we created a new `BackgroundProcessLoggingContext` any time we `run_as_background_process(...)`, the trace for the background work was separate from the trace that kicked of the work as expected (e.g. request trace is separate from the background process we kicked to fetch more messages from the federation). Since we've now switched to the `ContextVarsScopeManager` (in https://github.com/element-hq/synapse/pull/18849), the tracing scope now crosses the `LoggingContext` boundaries (and thread boundaries) without a problem. This means we end up with request traces that include all of the background work that we've kicked off bloating the trace and making it hard to understand what's going on. This PR separates the traces again to how things were before. Additionally, things are even better now since I added some cross-link references between the traces to easily be able to jump between. Follow-up to https://github.com/element-hq/synapse/pull/18849 --- In the before, you can see that the trace is blown up by the background process (`bgproc.qwer`). In the after, we now only have a little cross-link marker span (`start_bgproc.qwer`) to jump to background process trace. Before | After --- | --- <some image> | <some image> ### Testing strategy 1. Run a Jaeger instance (https://www.jaegertracing.io/docs/1.6/getting-started/) ```shell $ docker run -d --name jaeger \ -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \ -p 5775:5775/udp \ -p 6831:6831/udp \ -p 6832:6832/udp \ -p 5778:5778 \ -p 16686:16686 \ -p 14268:14268 \ -p 9411:9411 \ jaegertracing/all-in-one:1.59.0 ``` 1. Configure Synapse to use tracing: `homeserver.yaml` ```yaml ## Tracing ## opentracing: enabled: true jaeger_config: sampler: type: const param: 1 logging: false ``` 1. Make sure the optional `opentracing` dependency is installed: `poetry install --extras all` 1. In the `VersionsRestServlet`, modify it to kick off a dummy background process (easy to test this way) ```python from synapse.metrics.background_process_metrics import run_as_background_process async def _qwer() -> None: await self.clock.sleep(1) run_as_background_process("qwer", "test_server", _qwer) ``` 1. Run Synapse: `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Fire off a version requests: `curl http://localhost:8008/_matrix/client/versions` 1. Visit http://localhost:16686/search to view the traces - Select the correct service - Look for the `VersionsRestServlet` operation - Press 'Find Traces' button - Select the relevant trace - Notice how the trace isn't bloated - Look for the `start_bgproc.qwer` span cross-linking to the background process - Jump to the other trace using the cross-link reference -> `bgproc.qwer`
2 lines
55 B
Plaintext
2 lines
55 B
Plaintext
Disconnect background process work from request trace.
|