Replace `client_secret` query parameter values with `<redacted>` in the logs. Prevents a scenario where a MITM of server traffic can hoard 3pids on their account.
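A rough sketch of the kind of redaction this describes; the helper name and regex are illustrative, not Synapse's actual logging code:
```
import re

def redact_client_secret(uri):
    """Replace any client_secret query value with a placeholder before logging."""
    return re.sub(r"(client_secret=)[^&]*", r"\1<redacted>", uri)

# redact_client_secret("/requestToken?client_secret=abc123&email=a@b.c")
# -> "/requestToken?client_secret=<redacted>&email=a@b.c"
```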
We have set the max retry interval to a value larger than a postgres or
sqlite int can hold, which caused exceptions when updating the
destinations table.
To fix postgres we need to change the column to a bigint, and for sqlite
we lower the max interval to 2**62 (which is still incredibly long).
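As a rough illustration of the cap (the constant and function names below are made up, not Synapse's actual retry code):
```
# 2**62 fits comfortably in a signed 64-bit (bigint) column.
MAX_RETRY_INTERVAL_MS = 2 ** 62

def next_retry_interval(current_ms, multiplier=2):
    """Back off exponentially, but never exceed what the destinations table can store."""
    return min(current_ms * multiplier, MAX_RETRY_INTERVAL_MS)
```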
Joining against `events` and ordering by `stream_ordering` is
inefficient, as it forces a scan of the entire redactions table.
This isn't the case if we use the `redactions.received_ts` column, as we can
then use an index.
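Roughly, the change in query shape looks like this (illustrative SQL only; the real statements live in Synapse's storage layer):
```
# Old shape: the join plus ORDER BY on events.stream_ordering defeats the indexes.
SLOW_QUERY = """
    SELECT redactions.event_id, redactions.redacts
    FROM redactions
    INNER JOIN events USING (event_id)
    ORDER BY events.stream_ordering
"""

# New shape: filtering and ordering on redactions.received_ts can use an index
# on that column directly.
FAST_QUERY = """
    SELECT event_id, redacts
    FROM redactions
    WHERE received_ts < ?
    ORDER BY received_ts
"""
```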
Currently we don't set the `have_censored` column if we don't have the
target event of a redaction, which means we repeatedly attempt to censor
the same non-existent event.
When we persist non-redacted events we unset the `have_censored` column
for any redactions that target said event.
Pull the checkers out to their own classes, rather than having them lost in a
massive 1000-line class which does everything.
This is also preparation for some more intelligent advertising of flows, as per #6100
because, frankly, it looked like it was written by an axe-murderer.
This should be a non-functional change, except that where `m.login.dummy` was
previously advertised *before* `m.login.terms`, it will now be advertised
afterwards. AFAICT that should have no effect, and will be more consistent with
the flows that involve passing a 3pid.
Second part of solving #6076.
Fixes #6076
We return a submit_url parameter on calls to POST */msisdn/requestToken so that clients know where to submit token information to.
Uses a SimpleHttpClient instance equipped with the federation_ip_range_blacklist for requests to identity servers provided by user input. Does not use a blacklist when contacting identity servers specified by account_threepid_delegates. The homeserver trusts the latter, and we don't want to prevent homeserver admins from specifying delegates that are on internal IP addresses.
Fixes #5935
As MSC2263 states, m.require_identity_server must be set to false when the homeserver does not require an identity server to be provided by the client for the purposes of email registration or password reset.
Adds an m.require_identity_server flag to /versions's unstable_flags section. This will advertise that Synapse no longer needs id_server as a parameter.
This is a) simpler than querying user_ips directly and b) means we can
purge older entries from user_ips without losing the required info.
The storage functions now no longer return the access_token, since it
was unused.
Implements MSC2290. This PR adds two new endpoints, /unstable/account/3pid/add and /unstable/account/3pid/bind. Depending on the progress of that MSC the unstable prefix may go away.
This PR also removes the blacklist on some 3PID tests which occurs in #6042, as the corresponding Sytest PR changes them to use the new endpoints.
Finally, it also modifies the account deactivation code such that it doesn't just try to deactivate 3PIDs that were bound to the user's account, but any 3PIDs that were bound through the homeserver on that user's account.
Fixes #6066
This register endpoint should be disabled if registration is disabled, otherwise we're needlessly giving anyone the ability to check whether a username exists on the server.
Error code is 403 (Forbidden) as that's the same returned by /register when registration is disabled.
In ancient times Synapse would only send emails when it was notifying a user about a message they received...
Now it can do all sorts of neat things!
Change the logging so it's not just about notifications.
Remove trailing slash ability from the password reset submit_token endpoint. Since we provide the link in an email, and have never sent it with a trailing slash, there's no point for us to accept them on the endpoint.
The validation links sent via email had their query parameters inserted without any URL-encoding. Surprisingly this didn't seem to cause any issues, but if a user were to put a `/` in their client_secret it could lead to problems.
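A minimal example of the safer approach, assuming the link is built from a dict of parameters (the URL and values below are just examples):
```
from urllib.parse import urlencode

params = {"client_secret": "has/a/slash", "sid": "12345"}
link = "https://hs.example.com/validate?" + urlencode(params)
# -> https://hs.example.com/validate?client_secret=has%2Fa%2Fslash&sid=12345
```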
* Allow passing SYNAPSE_WORKER envvar
* changelog.d
* Document SYNAPSE_WORKER.
Attempting to imply that you don't need to change this default
unless you're in worker mode.
Also aware that there's a bigger problem of attempting to document
a complete working configuration of workers using docker, as we
currently only document to use `synctl` for worker mode, and synctl
doesn't work that way in docker.
Fixes a bug where the default attribute maps were prioritised over
user-specified ones, resulting in incorrect mappings.
The problem is that if you call SPConfig.load() multiple times, it adds new
attribute mappers to a list. So by calling it with the default config first,
and then the user-specified config, we would always get the default mappers
before the user-specified mappers.
To solve this, let's merge the config dicts first, and then pass them to
SPConfig.
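A minimal sketch of the fix, assuming `default_config` and `user_config` are plain dicts of pysaml2 settings (the function and variable names are illustrative):
```
from saml2.config import SPConfig

def load_saml_config(default_config, user_config):
    # Merge the dicts first so that load() only runs once; keys in user_config
    # (including attribute mappers) take priority over the defaults.
    merged = {**default_config, **user_config}
    sp_config = SPConfig()
    sp_config.load(merged)
    return sp_config
```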
* make it clear that if you installed from a package manager, you should use
that to upgrade
* Document the new way of getting the server version (cf #4878)
* Write some words about downgrading.
This is a partial revert of #5893. The problem is that if we drop these tables
in the same release as removing the code that writes to them, it prevents users
from being able to roll back to a previous release.
So let's leave the tables in place for now, and remember to drop them in a
subsequent release.
(Note that these tables haven't been *read* for *years*, so any missing rows
resulting from a temporary upgrade to vNext won't cause a problem.)
Removes the POST method from `/password_reset/<medium>/submit_token/` as it's only used by phone number verification which Synapse does not support yet.
3PID invites require making a request to an identity server to check that the invited 3PID has a Matrix ID linked, and if so, what it is.
These requests are being made on behalf of a user. The user will supply an identity server and an access token for that identity server. The homeserver will then forward this request with the access token (using an `Authorization` header) and, if the given identity server doesn't support v2 endpoints, will fall back to v1 (which doesn't require any access tokens).
Requires: ~~#5976~~
Broken in #5971
Basically the bug is that if get_current_state_deltas returns no new updates and we then take the max pos, it's possible that we miss an update that happens in between the two calls. (e.g. get_current_state_deltas looks up to stream pos 5, then an event persists and so getting the max stream pos returns 6, meaning that next time we only check for things with a stream pos bigger than 6, so the event at 6 is never processed)
We want to assign unique mxids to saml users based on an incrementing
suffix. For that to work, we need to record the allocated mxid in a separate
table.
This allows support users to be created via the admin API even when MAU
limits have been reached. Support users are excluded from MAU after creation,
so it makes sense to exclude them at creation too - except if the
whole host is in a disabled state.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
Some small fixes to `room_member.py` found while doing other PRs.
1. Add requester to the base `_remote_reject_invite` method.
2. `send_membership_event`'s docstring was out of date and took a `remote_room_hosts` arg that was not used and that no calling function provided.
Another small fixup noticed during work on a larger PR. The `origin` field of `add_display_name_to_third_party_invite` is not used and likely was just carried over from the `on_PUT` method of `FederationThirdPartyInviteExchangeServlet` which, like all other servlets, provides an `origin` argument.
Since it's not used anywhere in the handler function though, we should remove it from the function arguments.
Previously if the first registered user was a "support" or "bot" user,
when the first real user registers, the auto-join rooms were not
created.
Fix to exclude non-real users (i.e. users with a special user type)
when counting how many users there are to determine whether we should
auto-create a room.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
This is a combination of a few different PRs, finally all being merged into `develop`:
* #5875
* #5876
* #5868 (This one added the `/versions` flag but the flag itself was actually [backed out](891afb57cb (diff-e591d42d30690ffb79f63bb726200891)) in #5969. What's left is just giving /versions access to the config file, which could be useful in the future)
* #5835
* #5969
* #5940
Clients should not actually use the new registration functionality until https://github.com/matrix-org/synapse/pull/5972 is merged.
UPGRADE.rst, changelog entries and config file changes should all be reviewed closely before this PR is merged.
* Ensure the list media admin API is always available
This API is required for some external media repo implementations to operate (mostly for doing quarantine operations on a room).
* changelog
Remove all the "double return" statements which were a result of us removing all the instances of
```
defer.returnValue(...)
return
```
statements when we switched to python3 fully.
Python will return a tuple whether there are parentheses around the returned values or not.
I'm just sick of my editor complaining about this all over the place :)
Template config files
* Imagine a system composed entirely of x, y, z etc and the basic operations..
Wait George, why XOR? Why not just neq?
George: Eh, I didn't think of that..
Co-Authored-By: Erik Johnston <erik@matrix.org>
Some of the caches on worker processes were not being correctly invalidated
when a room's state was changed in a way that did not affect the membership
list of the room.
We need to make sure we send out cache invalidations even when no memberships
are changing.
Propagate opentracing contexts across workers
Also includes some convenience modifications to opentracing for servlets, notably:
- Add a boolean to skip the whitelisting check on the inject/extract methods - useful when injecting into carriers locally, since otherwise we'd always have to include our own servername and whitelist it.
- start_active_span_from_request instead of header.
- Add a boolean to decide whether to extract context from a request to a servlet.
This type of registration was probably never used. It only includes the
user name in the HMAC but not the password.
Shared secret registration is still available via
client/r0/admin/register.
Signed-off-by: Manuel Stahl <manuel.stahl@awesome-technologies.de>
There's no point doing a raise_from here, because the exception is always
logged at warn with no stacktrace in the caller. Instead, let's try to give
better messages to reduce confusion.
In particular, this means that we won't log 'Failed to connect to remote
server' when we don't even attempt to connect to the remote server due to
blacklisting.
Get rid of the labyrinthine `recoverer_fn` code, and clean up the startup code
(it seemed to be previously inexplicably split between
`ApplicationServiceScheduler.start` and `_Recoverer.start`).
Add some docstrings too.
Hopefully, this will fix a stack overflow when recovering an appservice.
The recursion here leads to a huge chain of deferred callbacks, which then
overflows the stack when the chain completes. `inlineCallbacks` makes a better
job of this if we use iteration instead.
Clean up the code a bit too, while we're there.
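The general pattern being applied here, sketched generically (none of these names come from the appservice code):
```
from twisted.internet import defer, reactor
from twisted.internet.task import deferLater

@defer.inlineCallbacks
def retry_until_success(attempt_once, backoff_seconds=1.0):
    """Retry in a loop rather than by recursing, so the Deferred callback
    chain stays flat instead of growing with every failed attempt."""
    while True:
        try:
            result = yield attempt_once()
            return result
        except Exception:
            yield deferLater(reactor, backoff_seconds, lambda: None)
```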
Add authenticated_entity and servlet_names tags.
Functionally:
- Add a tag for authenticated_entity
- Add a tag for servlet_names
Stylistically:
Moved to importing methods directly from opentracing.
Fixes #5833
The emailconfig code was attempting to pull incorrect config file names. This corrects that, while also marking a difference between a config file variable that's a filepath versus a str containing HTML.
This refactors MatrixFederationAgent to move the SRV lookup into the
endpoint code, which has two benefits:
1. It's easier to retry different host/ports in the same way as
HostnameEndpoint.
2. We avoid SRV lookups if we have a free connection in the pool
If we have recently seen a valid well-known for a domain we want to
retry on (non-final) errors a few times, to handle temporary blips in
networking/etc.
is cached and so does not always return a `Deferred`.
`await` does not silently pass-through non-Deferreds like `yield` used to.
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
This gives a bit of a grace period where we can attempt to refetch a
remote `well-known`, while still using the cached result if that fails.
Hopefully this will make the well-known resolution a bit more tolerant
of failures, rather than it immediately treating failures as "no result"
and caching that for an hour.
It costs both us and the remote server for us to fetch the well known
for every single request we send, so we add a minimum cache period. This
is set to 5m so that we still honour the basic premise of "refetch
frequently".
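Conceptually the clamp looks like this (the constant names mirror the description above but are written out here as an assumption, not a quote of the code):
```
WELL_KNOWN_MIN_CACHE_PERIOD = 5 * 60       # 5 minutes, as described above
WELL_KNOWN_MAX_CACHE_PERIOD = 48 * 3600    # an assumed upper bound

def clamp_well_known_cache_period(period_secs):
    """Keep the cache lifetime within sane bounds, whatever the response said."""
    return max(WELL_KNOWN_MIN_CACHE_PERIOD,
               min(period_secs, WELL_KNOWN_MAX_CACHE_PERIOD))
```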
When persisting events we calculate new stream orderings up front.
Before we notify about an event all events with lower stream orderings
must have finished being persisted.
This PR moves the assignment of stream orderings till *after* we have
calculated the new current state and split the batch of events into separate chunks
for persistence. This means that if it takes a long time to calculate
new current state then it will not block events in other rooms being
notified about.
This should help reduce some global pauses in the events stream which
can last for tens of seconds (if not longer), caused by some
particularly expensive state resolutions.
This hopefully addresses #5407 by gracefully handling an empty but
limited TimelineBatch. We also add some logging to figure out how this
is happening.
This is intended as an amendment to #5674 as using M_UNKNOWN as the errcode makes it hard for clients to differentiate between an invalid password and a deactivated user (the problem we were trying to solve in the first place).
M_UNKNOWN was originally chosen as it was presumed that an MSC would have to be carried out to add a new code, but as Synapse often is the testing bed for new MSC implementations, it makes sense to try it out first in the wild and then add it into the spec if it is successful. Thus this PR returns a new M_USER_DEACTIVATED code when a deactivated user attempts to login.
The double posting is really annoying, and I don't think anyone is
actually reading them. The commit statuses should give a good summary
and will link to a full report.
The `expire_access_token` didn't do what it sounded like it should do. What it
actually did was make Synapse enforce the 'time' caveat on macaroons used as
access tokens, but since our access token macaroons never contained such a
caveat, it was always a no-op.
(The code to add 'time' caveats was removed back in v0.18.5, in #1656)
* Fix debian packages for sid being called buster.
I don't know why the sid images return buster as their codename in
`lsb_release`, but they do, so let's just grab the codename from the
distro we pass into the dockerfile.
* Newsfile
Annoyingly, `current_state_events` table can include rejected events,
in which case the membership column will be null. To work around this
lets just always filter out null membership for now.
There was some inconsistent behaviour in the caching layer around how
exceptions were handled - particularly synchronously-thrown ones.
This seems to be most easily handled by pushing the creation of
ObservableDeferreds down from CacheDescriptor to the Cache.
`None` is not a valid event id, so queuing up a database fetch for it seems
like a silly thing to do.
I considered making `get_event` return `None` if `event_id is None`, but then
its interaction with `allow_none` seemed unintuitive, and strong typing ftw.
This will allow us to efficiently filter out rooms that have been
forgotten in other queries without having to join against the
`room_memberships` table.
* Add decorators for tracing functions
* Use the new clean contexts
* Context and edu utils
* Move opentracing setters
* Move whitelisting
* Sectioning comments
* Better args wrapper
* Docstrings
Co-Authored-By: Erik Johnston <erik@matrix.org>
* Remove unused methods.
* Don't use global
* One tracing decorator to rule them all.
* Refactor Keyring._start_key_lookups
There's an awful lot of deferreds and dictionaries flying around here. The
whole thing can be made much simpler and achieve the same effect.
* Add a delay to key lookup lock release to fix stack overflow
A tactical call_later here should fix #5723
* changelog
The version of a module isn't going to change over the lifetime of the
process (assuming no funky hot reloading is going on, which it isn't),
so let's just cache the result to avoid spawning lots of git
subprocesses.
Fixes #5672.
* Opentracing survival guide
* Update decorator names in doc
* Doc cleanup
These are all alterations as a result of comments in #5703; they
include mostly typos and clarifications. The most interesting
changes are:
- Split developer and user docs into two sections
- Add a high level description of OpenTracing
* newsfile
* Move contributor-specific info to docstring.
* Sample config.
* Trailing whitespace.
* Update 5703.misc
* Apply suggestions from code review
Mostly just rewording parts of the docs for clarity.
Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Clean up config settings and dead code.
This is mostly about cleaning up the config format, to bring it into line with our conventions. In particular:
* There should be a blank line after `## Section ##` headings
* There should be a blank line between each config setting
* There should be a `#`-only line between a comment and the setting it describes
* We don't really do the `# #` style commenting-out of whole sections if we can help it
* rename `tracer_enabled` to `enabled`
While we're here, do more config parsing upfront, which makes it easier to use
later on.
Also removes redundant code from LogContextScopeManager.
Also changes the changelog fragment to a `feature` - it's exciting!
* Convert BaseFederationServlet._wrap to async
Empirically, this fixes some lost stacktraces. It should be safe because the
wrapped function is called from JsonResource._async_render, which is already
async.
* Convert the rest of synapse.federation.transport.server to async
We may as well do the whole file while we're here.
* changelog
* flake8
This is basically a contrived way of adding a `Recommends` on `libpq5`, to fix #5653.
The way this is supposed to happen in debhelper is to run
`dh_shlibdeps`, which in turn runs `dpkg-shlibdeps`, which spits things out
into `debian/<package>.substvars` whence they can later be included by
`control`.
Previously, we had disabled `dh_shlibdeps`, mostly because `dpkg-shlibdeps`
gets confused about PIL's interdependent objects, but that's not really the
right thing to do and there is another way to work around that.
Since we don't always use postgres, we don't necessarily want a hard Depends on
libpq5, so I've actually ended up adding an explicit invocation of
`dpkg-shlibdeps` for `psycopg2`.
I've also updated the build-depends list for the package, which was missing a
couple of entries.
We can now use `_get_events_from_cache_or_db` rather than going right back to
the database, which means that (a) we can benefit from caching, and (b) it
opens the way forward to more extensive checks on the original event.
We now always require the original event to exist before we will serve up a
redaction.
Ensures that redactions are correctly authenticated for recent room versions.
There are a few things going on here:
* `_fetch_event_rows` is updated to return a dict rather than a list of rows.
* Rather than returning multiple copies of an event which was redacted
multiple times, it returns the redactions as a list within the dict.
* It also returns the actual rejection reason, rather than merely the fact
that it was rejected, so that we don't have to query the table again in
`_get_event_from_row`.
* The redaction handling is factored out of `_get_event_from_row`, and now
checks if any of the redactions are valid.
A couple of changes here:
* get rid of a redundant `allow_rejected` condition - we should already have filtered out any rejected
events before we get to that point in the code, and the redundancy is confusing. Instead, let's stick in
an assertion just to make double-sure we aren't leaking rejected events by mistake.
* factor out a `_get_events_from_cache_or_db` method, which is going to be important for a
forthcoming fix to redactions.
Alpine Linux 3.8 is still supported, but it seems like
it's quite outdated now.
While Python should be the same on both, all other libraries, etc.,
are much newer in Alpine 3.9 and 3.10.
Signed-off-by: Slavi Pantaleev <slavi@devture.com>
First of all, let's get rid of `TOKEN_NOT_FOUND_HTTP_STATUS`. It was a hack we
did at one point when it was possible to return either a 403 or a 401 if the
creds were missing. We always return a 401 in these cases now (thankfully), so
it's not needed.
Let's also stop abusing `AuthError` for these cases. Honestly they have nothing
that relates them to the other places that `AuthError` is used, other than the
fact that they are loosely under the 'Auth' banner. It makes no sense for them
to share exception classes.
Instead, let's add a couple of new exception classes: `InvalidClientTokenError`
and `MissingClientTokenError`, for the `M_UNKNOWN_TOKEN` and `M_MISSING_TOKEN`
cases respectively - and an `InvalidClientCredentialsError` base class for the
two of them.
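A self-contained sketch of the hierarchy described above (SynapseError is stubbed here so the example stands alone; the real classes live in synapse.api.errors and differ in detail):
```
class SynapseError(Exception):
    """Stub of Synapse's base error, just so this example stands alone."""
    def __init__(self, code, msg, errcode="M_UNKNOWN"):
        super().__init__(msg)
        self.code, self.msg, self.errcode = code, msg, errcode

class InvalidClientCredentialsError(SynapseError):
    """Base class for both of the token problems below."""

class MissingClientTokenError(InvalidClientCredentialsError):
    def __init__(self):
        super().__init__(401, "Missing access token", errcode="M_MISSING_TOKEN")

class InvalidClientTokenError(InvalidClientCredentialsError):
    def __init__(self):
        super().__init__(401, "Invalid access token", errcode="M_UNKNOWN_TOKEN")
```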
* Configure and initialise tracer
Includes config options for the tracer and sets up JaegerClient.
* Scope manager using LogContexts
We piggy-back our tracer scopes by using log context.
The current log context gives us the current scope. If new scope is
created we create a stack of scopes in the context.
* jaeger is a dependency now
* Carrier inject and extraction for Twisted Headers
* Trace federation requests on the way in and out.
The span is created in _started_processing and closed in
_finished_processing because we need a meaningful log context.
* Create logcontext for new scope.
Instead of having a stack of scopes in a logcontext we create a new
context for a new scope if the current logcontext already has a scope.
* Remove scope from logcontext if logcontext is top level
* Disable tracer if not configured
* typo
* Remove dependence on jaeger internals
* bools
* Set service name
* Explicitly state that the tracer is disabled
* Black is the new black
* Newsfile
* Code style
* Use the new config setup.
* Generate config.
* Copyright
* Rename config to opentracing
* Remove user whitelisting
* Empty whitelist by default
* Use ConfigError instead of RuntimeError
* Use isinstance
* Use tag constants for opentracing.
* Remove debug comment and no need to explicitly record error
* Two errors a "s(c)entry"
* Docstrings!
* Remove debugging brainslip
* Homeserver Whitelisting
* Better opentracing config comment
* linting
* Include worker name in service_name
* Make opentracing an optional dependency
* Neater config retrieval
* Clean up dummy tags
* Instantiate tracing as object instead of global class
* Include opentracing as a homeserver member.
* Thread opentracing to the request level
* Reference opentracing through hs
* Instantiate dummy opentracing for tests.
* About to revert, just keeping the unfinished changes just in case
* Revert back to global state, commit number:
9ce4a3d9067bf9889b86c360c05ac88618b85c4f
* Use class level methods in tracerutils
* Start and stop requests spans in a place where we
have access to the authenticated entity
* Seen it, isort it
* Make sure to close the active span.
* I'm getting black and blue from this.
* Logger formatting
Co-Authored-By: Erik Johnston <erik@matrix.org>
* Outdated comment
* Import opentracing at the top
* Return a contextmanager
* Start tracing client requests from the servlet
* Return noop context manager if not tracing
* Explicitly say that these are federation requests
* Include servlet name in client requests
* Use context manager
* Move opentracing to logging/
* Seen it, isort it again!
* Ignore twisted return exceptions on context exit
* Escape the scope
* Scopes should be entered to make them useful.
* Nicer decorator names
* Just one init, init?
* Don't need to close something that isn't open
* Docs make you smarter
this is only used in one place, so it's clearer if we inline it and reduce the
API surface.
Also, fixes a buglet where we would create an access token even if we were
about to block the user (we would never return the AT, so the user could never
use it, but it was still created and added to the db.)
A fix for PR #5626, which returned the original event content as part of a call to /relations.
Only problem was that we were attempting to aggregate the relations on top of it when we did so. We now set bundle_aggregations to False in the get_event call.
We also do this when pulling the relation events as well, because edits of edits are not something we'd like to support here.
FederationDeniedError is a subclass of SynapseError, which is a subclass of
CodeMessageException, so if e is a FederationDeniedError, then this check for
FederationDeniedError will never be reached since it will be caught by the
check for CodeMessageException above. The check for CodeMessageException does
almost the same thing as this check (since FederationDeniedError initialises
with code=403 and msg="Federation denied with %s."), so may as well just keep
allowing it to handle this case.
When asking for the relations of an event, include the original event in the response. This will mostly be used for efficiently showing edit history, but could be useful in other circumstances.
Nothing uses this now, so we can remove the dead code, and clean up the
API.
Since we're changing the shape of the return value anyway, we take the
opportunity to give the method a better name.
When a user creates an account and the 'require_auth_for_profile_requests' config flag is set, and a client that performed the registration wants to look up the newly-created profile, the request will be denied because the user doesn't share a room with themselves yet.
This has never been documented, and I'm not sure it's ever been used outside
sytest.
It's quite a lot of poorly-maintained code, so I'd like to get rid of it.
For now I haven't removed the database table; I suggest we leave that for a
future clearout.
- Put the default window_size back to 1000ms (broken by #5181)
- Make the `rc_federation` config actually do something
- fix an off-by-one error in the 'concurrent' limit
- Avoid creating an unused `_PerHostRatelimiter` object for every single
incoming request
The runtime errors that dealt with local email password resets talked about config options that users may not even have in their config file yet (if upgrading). Instead, the cryptic errors are now replaced with hopefully much more helpful ones.
* SAML2 Improvements and redirect stuff
Signed-off-by: Alexander Trost <galexrt@googlemail.com>
* Code cleanups and simplifications.
Also: share the saml client between redirect and response handlers.
* changelog
* Revert redundant changes to static js
* Move all the saml stuff out to a centralised handler
* Add support for tracking SAML2 sessions.
This allows us to correctly handle `allow_unsolicited: False`.
* update sample config
* cleanups
* update sample config
* rename BaseSSORedirectServlet for consistency
* Address review comments
This would cause emails being sent, but Synapse responding with a 429 when creating the event. The client would then retry, and with bad timing the same scenario would happen again. Some testing I did ended up sending me 10 emails for one single invite because of this.
This is mostly a documentation change, but also adds a default value for
SYNAPSE_CONFIG_PATH, so that running from the generated config is the default,
and will Just Work provided your config is in the right place.
When running under docker, we want to use docker's own logging stuff rather
than losing the logs somewhere on the container's filesystem, so let's use log
configs that spit logs out to stdout instead.
We don't want to generate any missing configs when running from a precanned
config.
(There's a strong argument that we don't want to do this at all, since
generating a new signing key on each invocation sounds disastrous, but I don't
fancy unpicking that for now.)
When a client asks for users whose devices have changed since a token we
used to pull *all* users from the database since the token, which could
easily be thousands of rows for old tokens.
This PR changes this to only check for changes for users the client is
actually interested in.
Fixes #5553
Closes #4583
Does slightly less than #5045, which prevented a room from being upgraded multiple times, one after another. This PR still allows that, but just prevents two from happening at the same time.
Mostly just to mitigate the fact that servers are slow and it can take a moment for the room upgrade to actually complete. We don't want people sending another request to upgrade the room when really they just thought the first didn't go through.
Updates the v1.0.0 changelog to provide more information to those upgrading to v1.0 on why they may receive errors or find their password reset abilities have now been disabled.
Helps address #5444
If no `from` param is specified we calculate and use the "current
token" that includes typing, presence, etc. These are unused during
pagination and are not available on workers, so we simply don't
calculate them.
* group the arguments together into a group
* add new names "--generate-missing-config" and "--config-directory" for
existing cmdline options "--generate-keys" and "--keys-dir", which better
reflect their purposes.
Fixes https://github.com/matrix-org/synapse/issues/5431
`jinja2` was being imported even when it wasn't strictly necessary. This made it required to run Synapse, even if the functionality that required it wasn't enabled. This was causing new Synapse installations to crash on startup.
Email modules are now required.
There is a README.txt which always sets off this warning, which is a bit
alarming when you first start synapse. I don't think we need to warn about
this.
If, for some reason, presence updates take a while to persist then it
can trigger clients to tightloop calling `/sync` due to the presence
handler returning updates but not advancing the stream token.
Fixes #5503.
Fixes intermittent errors observed on Apple hardware which were caused by
time.clock() appearing to go backwards when called from different threads.
Also fixes a bug where database activity times were logged as 1/1000 of their
correct ratio due to confusion between milliseconds and seconds.
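The general idea (not Synapse's exact change) is to time durations with a monotonic clock and keep the units consistent:
```
import time

def timed(fn):
    """Time fn with a monotonic clock (which can't go backwards) and return seconds."""
    start = time.monotonic()
    result = fn()
    return result, time.monotonic() - start

result, elapsed_seconds = timed(lambda: sum(range(1_000_000)))
```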
I had to add quite a lot of logging to diagnose a problem with 3pid
invites - we only logged the one failure which isn't all that
informative.
NB. I'm not convinced the logic of this loop is right: I think it
should just accept a single valid signature from a trusted source
rather than fail if *any* signature is invalid. Also it should
probably not skip the rest of middle loop if a check fails? However,
I'm deliberately not changing the logic here.
Adds new config option `cleanup_extremities_with_dummy_events` which
periodically sends dummy events to rooms with more than 10 extremities.
THIS IS REALLY EXPERIMENTAL.
Configure the demo servers to use untrusted
tls certs so that they communicate with each other.
This configuration makes them very unsafe so I've added warnings about
it in the readme.
Fixes that when a user exchanges a 3PID invite for a proper invite over
federation it does not include the `invite_room_state` key.
This was due to synapse incorrectly sending out two invite requests.
* remove 2.7 from CI and publishing
* fill out classifiers and also make it not be installed on 3.5
* some minor bumps so that the old deps work on python 3.5
Moves the warning about password resets being disabled to the point where a user actually tries to reset their password. Is this an appropriate place for it to happen?
Also removed the disabling of msisdn password resets when you don't have an email config, as that just doesn't make sense.
Also change the error a user receives upon disabled passwords to specify that only email-based password reset is disabled.
If we try and send a transaction with lots of EDUs and we run out of
space, we call get_new_device_msgs_for_remote with a limit of 0, which
then failed.
Some keys are stored in the synapse database with a null valid_until_ms
which caused an exception to be thrown when using that key. We fix this
by treating nulls as zeroes, i.e. the keys will match verification
requests with a minimum_valid_until_ms of zero (i.e. don't validate ts)
but will not match requests with a non-zero minimum_valid_until_ms.
Fixes #5391.
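In effect the comparison becomes something like this (the function and argument names are illustrative):
```
def key_satisfies_request(valid_until_ms, minimum_valid_until_ms):
    """Treat a NULL/None valid_until_ms as 0: such keys satisfy requests that
    don't care about freshness (minimum of 0), but not requests with a real
    minimum_valid_until_ms."""
    return (valid_until_ms or 0) >= minimum_valid_until_ms

assert key_satisfies_request(None, 0)
assert not key_satisfies_request(None, 1000)
```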
It's not really a problem to trust notary responses signed by the old key so
long as we are also doing TLS validation.
This commit adds a check to the config parsing code at startup to check that
we do not have the insecure matrix.org key without tls validation, and refuses
to start without it.
This allows us to remove the rather alarming-looking warning which happens at
runtime.
When we try and calculate a description for a room with no name but
multiple other users we threw an exception (due to trying to subscript
result of `dict.values()`).
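For context, this is the Python 3 gotcha involved (the room-description code itself is more involved):
```
member_displaynames = {"@a:example.org": "Alice", "@b:example.org": "Bob"}

# member_displaynames.values()[0] raises TypeError on Python 3: dict_values is
# a view, not a sequence, so convert it before subscripting.
first = list(member_displaynames.values())[0]
```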
Sometimes the build agents get lost or die (error codes -1 and 2). Retry automatically a maximum of 2 times if this happens.
Error code reference:
* -1: Agent was lost
* 0: Build successful
* 1: There was an error in your code
* 2: The build stopped abruptly
* 255: The build was cancelled
Sends password reset emails from the homeserver instead of proxying to the identity server. This is now the default behaviour for security reasons. If you wish to continue proxying password reset requests to the identity server you must now enable the email.trust_identity_server_for_password_resets option.
This PR is a culmination of 3 smaller PRs which have each been separately reviewed:
* #5308
* #5345
* #5368
There are a few changes going on here:
* We make checking the signature on a key server response optional: if no
verify_keys are specified, we trust to TLS to validate the connection.
* We change the default config so that it does not require responses to be
signed by the old key.
* We replace the old 'perspectives' config with 'trusted_key_servers', which
is also formatted slightly differently.
* We emit a warning to the logs every time we trust a key server response
signed by the old key.
* Fix background updates to handle redactions/rejections
In background updates based on current state delta stream we need to
handle that we may not have all the events (or at least that
`get_events` may raise an exception).
Also:
* rename VerifyKeyRequest->VerifyJsonRequest
* calculate key_ids on VerifyJsonRequest construction
* refactor things to pass around VerifyJsonRequests instead of 4-tuples
FederationClient.get_pdu is called in a loop to fetch a batch of PDUs. A
failure to fetch one should not result in a failure of the whole batch. Add the
missing `continue`.
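The shape of the fix, written generically (the loop below is illustrative, not the real calling code):
```
import logging

logger = logging.getLogger(__name__)

def fetch_batch(fetch_one, event_ids):
    """Fetch each PDU independently; a single failure must not sink the batch."""
    results = []
    for event_id in event_ids:
        try:
            results.append(fetch_one(event_id))
        except Exception:
            logger.warning("Failed to fetch %s, skipping", event_id)
            continue  # the crucial bit: carry on with the next PDU
    return results
```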
We have too many things called get_event, and it's hard to figure out what we
mean. Also remove some unused params from the signature, and add some logging.
It takes at least 20 minutes to work through the long_retries schedule (11
attempts, each with a 60 second timeout, and 60 seconds between each request),
so if the notary server isn't returning within the timeout, we'll just end up
blocking whatever request is happening for 20 minutes.
Ain't nobody got time for that.
When handling incoming federation requests, make sure that we have an
up-to-date copy of the signing key.
We do not yet enforce the validity period for event signatures.
When processing an incoming event over federation, we may try and
resolve any unexpected differences in auth events. This is a
non-essential process and so should not stop the processing of the event
if it fails (e.g. due to the remote disappearing or not implementing the
necessary endpoints).
Fixes #3330
We have to do this by re-inserting a background update and recreating
tables, as the tables only get created during a background update and
will later be deleted.
We also make sure that we remove any entries that should have been
removed but weren't due to a race that has been fixed in a previous
commit.
The verify_request deferred already returns a suitable SynapseError, so I don't
really know what we expect to achieve by doing more wrapping, other than log
spam.
Fixes #4278.
When we receive a soft failed event we, correctly, *do not* update the
forward extremity table with the event. However, if we later receive an
event that references the soft failed event we then need to remove the
soft failed event's prev events from the forward extremities table,
otherwise we just build up forward extremities.
Fixes #5269
This fixes a bug which was causing the "event_format" field to be
ignored in the filter of requests to the `/messages` endpoint of the
CS API.
Signed-off-by: Eisha Chen-yen-su <chenyensu0@gmail.com>
When enabling the account validity feature, Synapse will look at startup for registered accounts without an expiration date, and will set one equal to 'now + validity_period' for them. On large servers, it can mean that a large number of users will have the same expiration date, which means that they will all be sent a renewal email at the same time, which isn't ideal.
In order to mitigate this, this PR allows server admins to define a 'max_delta' so that the expiration date is a random value in the [now + validity_period ; now + validity_period + max_delta] range. This allows renewal emails to be progressively sent over a configured period instead of being sent all in one big batch.
The list of server names was redundant, since it was equivalent to the keys on
the server_to_deferred map. This reduces the number of large lists being passed
around, and has the benefit of deduplicating the entries in `wait_on`.
Replaces DEFAULT_ROOM_VERSION constant with a method that first checks the config, then returns a hardcoded value if the option is not present.
That hardcoded value is now located in the server.py config file.
Rather than have three methods which have to have the same interface,
factor out a separate interface which is provided by three implementations.
I find it easier to grok the code this way.
This is a first step to checking that the key is valid at the required moment.
The idea here is that, rather than passing VerifyKey objects in and out of the
storage layer, we instead pass FetchKeyResult objects, which simply wrap the
VerifyKey and add a valid_until_ts field.
* Pass time_added_ms into process_v2_response
* Simplify process_v2_response
We can merge old_verify_keys into verify_keys, and reduce the number of dicts
flying around.
Storing server keys hammered the database a bit. This replaces the
implementation which stored a single key, with one which can do many updates at
once.
I was staring at this function trying to figure out wtf it was actually
doing. This is (hopefully) a non-functional refactor which makes it a bit
clearer.
If we remove support for a particular room version, we should behave more
gracefully. This should make client requests fail with a 400 rather than a 500,
and will ignore individual PDUs in a federation transaction, rather than the
whole transaction.
This reverts commit ce5bcefc60.
This caused:
```
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/synapse/src/synapse/app/client_reader.py", line 32, in <module>
from synapse.replication.slave.storage import SlavedProfileStore
ImportError: cannot import name 'SlavedProfileStore' from 'synapse.replication.slave.storage' (/home/synapse/src/synapse/replication/slave/storage/__init__.py)
error starting synapse.app.client_reader('/home/synapse/config/workers/client_reader.yaml') (exit code: 1); see above for logs
```
If account validity is enabled in the server's configuration, this job will run at startup as a background job and will stick an expiration date to any registered account missing one.
We checked that 3pids were not already in use before we checked if
we were going to return the account previously registered in the
same UI auth session, in which case the 3pids will definitely
be in use.
https://github.com/vector-im/riot-web/issues/9586
Prevents a SynapseError being raised inside of an IResolutionReceiver and instead opts to just return 0 results. This thus means that we have to lump a failed lookup and a blacklisted lookup together with the same error message, but the substitute should be generic enough to cover both cases.
It's more natural for the user if the bit that takes them away
from the registration flow comes last. Adding the dummy stage allows
us to do the stages in this order without the ambiguity.
This allows the client to complete the email last, which is more
natural for the user. Without this stage, if the client completed the
recaptcha (and terms, if enabled) stages, the registration request
would complete because you've now completed a flow, even if you were
intending to complete the flow that's the same except with email auth at
the end.
Adding a dummy auth stage to the recaptcha-only flow means it's
always unambiguous which flow the client was trying to complete.
Longer term we should think about changing the protocol so the
client explicitly says which flow it's trying to complete.
vector-im/riot-web#9586
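To make the ambiguity concrete, here is roughly how the advertised flows change (the stage names are real UIA identifiers; the exact flows depend on server config):
```
# Before: the first flow is a strict prefix of the second, so completing
# recaptcha alone already finishes a registration.
flows_before = [
    ["m.login.recaptcha"],
    ["m.login.recaptcha", "m.login.email.identity"],
]

# After: appending m.login.dummy means no flow is a prefix of another, so the
# client can save the email stage for last without ambiguity.
flows_after = [
    ["m.login.recaptcha", "m.login.dummy"],
    ["m.login.recaptcha", "m.login.email.identity"],
]
```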
* Add AllowEncodedSlashes to apache
Add `AllowEncodedSlashes On` to apache config to support encoding for v3 rooms. "The AllowEncodedSlashes setting is not inherited by virtual hosts, and virtual hosts are used in many default Apache configurations, such as the one in Ubuntu. The workaround is to add the AllowEncodedSlashes setting inside a <VirtualHost> container (/etc/apache2/sites-available/default in Ubuntu)." Source: https://stackoverflow.com/questions/4390436/need-to-allow-encoded-slashes-on-apache
* change allowencodedslashes to nodecode
This commit adds two config options:
* `restrict_public_rooms_to_local_users`
Requires auth to fetch the public rooms directory through the CS API and disables fetching it through the federation API.
* `require_auth_for_profile_requests`
When set to `true`, requires that requests to `/profile` over the CS API are authenticated, and only returns the user's profile if the requester shares a room with the profile's owner, as per MSC1301.
MSC1301 also specifies a behaviour for federation (only returning the profile if the server asking for it shares a room with the profile's owner), but that's currently really non-trivial to do in a not too expensive way. Next step is writing down an MSC that allows a HS to specify which user sent the profile query. In this implementation, Synapse won't send a profile query over federation if it doesn't believe it already shares a room with the profile's owner, though.
Groups have been intentionally omitted from this commit.
This endpoint isn't much use for its intended purpose if you first need to get
yourself an admin's auth token.
I've restricted it to the `/_synapse/admin` path to make it a bit easier to
lock down for those concerned about exposing this information. I don't imagine
anyone is using it in anger currently.
Synapse 0.99.3.2 (2019-05-03)
=============================
Internal Changes
----------------
- Ensure that we have `urllib3` <1.25, to resolve incompatibility with `requests`. ([\#5135](https://github.com/matrix-org/synapse/issues/5135))
psycopg 2.8 is now out, which means that the C library gets built from source,
so we now need libpq-dev when building.
Turns out the need for this package is already documented in
docs/postgres.rst.
Using systemd-python allows for logging to the systemd journal,
as is documented in: `synapse/contrib/systemd/log_config.yaml`.
Signed-off-by: Silke Hofstra <silke@slxh.eu>
Currently if a call to `/get_missing_events` fails we log an exception
and stop processing the top level event we received over federation.
Instead let's try and handle it sensibly given it is a somewhat expected
failure mode.
We need to drop tables in the correct order due to foreign table
constraints (on `application_services`), otherwise the DROP TABLE
command will fail.
Introduced in #4992.
There's no point in collecting a merged dict of keys: it is sufficient to
consider just the new keys which have been fetched by the most recent
key_fetch_fns.
Make this just return the key dict, rather than a single-entry dict mapping the
server name to the key dict. It's easy for the caller to get the server name
from the response object anyway.
We start all pushers on start up and immediately start a background
process to fetch push to send. This makes start up incredibly painful
when dealing with many pushers.
Instead, let's do a quick fast DB check to see if there *may* be push to
send and only start the background processes for those pushers. We also
stagger starting up and doing those checks so that we don't try and
handle all pushers at once.
Hopefully this time we really will fix #4422.
We need to make sure that the cache on
`get_rooms_for_user_with_stream_ordering` is invalidated *before* the
SyncHandler is notified for the new events, and we can now do so reliably via
the `events` stream.
We assume, as we did before, that users bound their threepid to one of
the trusted identity servers. So we simply fill the new table with all
threepids in `user_threepids` joined with the trusted identity servers.
By default the homeserver will use the identity server used during the
binding of the 3PID to unbind the 3PID. However, we need to allow
clients to explicitly ask the homeserver to unbind via a particular
identity server, for the case where the 3PID was bound out of band from
the homeserver.
Implements MSC1915.
This changes the behaviour from using the server specified trusted
identity server to using the IS that was used during the binding of the
3PID, if known.
This is the behaviour specified by MSC1915.
Primarily this fixes a bug in the handling of remote users joining a
room where the server sent out the presence for all local users in the
room to all servers in the room.
We also change to using the state delta stream, rather than the
distributor, as it will make it easier to split processing out of the
master process (as well as being more flexible).
Finally, when sending presence states to newly joined servers we filter
out old presence states to reduce the number sent. Initially we filter
out states that are offline and have a last active more than a week ago,
though this can be changed down the line.
Fixes #3962
Adds a new method, check_3pid_auth, which gives password providers
the chance to allow authentication with third-party identifiers such
as email or msisdn.
`__str__` depended on `self.addr`, which was absent from
ClientReplicationStreamProtocol, so attempting to call str on such an object
would raise an exception.
We can calculate the peer addr from the transport, so there is no need for addr
anyway.
I don't have a database with the same name as my user, so leaving the database
name unset fails.
While we're at it, clear out some unused stuff in the test setup.
As per #3622, we remove trailing slashes from outbound federation requests. However, to ensure that we remain backwards compatible with previous versions of Synapse, if we receive a HTTP 400 with `M_UNRECOGNIZED`, then we are likely talking to an older version of Synapse in which case we retry with a trailing slash appended to the request path.
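Sketched generically below (the request helper and response shape here are simplified assumptions, not the real MatrixFederationHttpClient API):
```
def request_with_trailing_slash_fallback(send, path):
    """Retry with a trailing slash if the remote looks like an older Synapse."""
    response = send(path)
    if response.get("status") == 400 and response.get("errcode") == "M_UNRECOGNIZED":
        response = send(path + "/")
    return response
```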
There are a number of instances where a server or admin may puppet a
user to join/leave rooms, which we don't want to fail if the user has
not consented to the privacy policy. We fix this by adding a check to
test if the requester has an associated access_token, which is used as a
proxy to answer the question of whether the action is being done on
behalf of a real request from the user.
Rather than using a Mock for the homeserver config, use a genuine
HomeServerConfig object. This makes for a more realistic test, and means that
we don't have to keep remembering to add things to the mock config every time
we add a new config setting.
The Mailer expects the config object to have `email_smtp_pass` and
`email_riot_base_url` attributes (and it won't by default, because the default
config impl doesn't set any of the attributes unless email_enable_notifs is
set).
Make it so that most options in the config are optional, and commented out in
the generated config.
The reasons this is a good thing are as follows:
* If we decide that we should change the default for an option, we can do so,
and only those admins that have deliberately chosen to override that option
will be stuck on the old setting.
* It moves us towards a point where we can get rid of the super-surprising
feature of synapse where the default settings for the config come from the
generated yaml.
* It makes setting up a test config for unit testing an order of magnitude
easier (see forthcoming PR).
* It makes the generated config more consistent, and hopefully easier for users
to understand.
This setup is a way to manage workers with systemd. It does not, however,
require workers. You can use this setup without workers. You just have
to make sure that the homeserver is forking and writes its PID file
to the location the service is looking in.
The currently distributed setup in the debian package does not work in
conjunction with workers.
* Adds changelog
* Lets systemd handle the forking
Sets all services to `type=simple` and disables daemonizing on the
synapse side.
* Formats readme to 80 columns per line
* Allows for full restart of all workers
* Changes README to reflect the new setup
* Adds dot to end of changelog file
* Removes surplus word
Co-Authored-By: targodan <targodan@users.noreply.github.com>
* Adds missing word
Co-Authored-By: targodan <targodan@users.noreply.github.com>
* Fixes linebreak
Co-Authored-By: targodan <targodan@users.noreply.github.com>
* Fixes unit type
* Add some stuff back to the .gitignore
Signed-off-by: Aaron Raimist <aaron@raim.ist>
* Add changelog
Signed-off-by: Aaron Raimist <aaron@raim.ist>
* Reorder and remove old items from .gitignore
Signed-off-by: Aaron Raimist <aaron@raim.ist>
We keep track of what stream IDs we've seen so that we know what updates
we've handled or missed. If we re-sync we don't know if the updates
we've seen are included in the re-sync (there may be a race), so we
should reset the seen updates.
Currently the explanation message is sent to the abuse room before any
users are forced joined, which means it tends to get lost in the backlog
of joins.
So instead we send the message *after* we've forced joined everyone.
* Rate-limiting for registration
* Add unit test for registration rate limiting
* Add config parameters for rate limiting on auth endpoints
* Doc
* Fix doc of rate limiting function
Co-Authored-By: babolivier <contact@brendanabolivier.com>
* Incorporate review
* Fix config parsing
* Fix linting errors
* Set default config for auth rate limiting
* Fix tests
* Add changelog
* Advance reactor instead of mocked clock
* Move parameters to registration specific config and give them more sensible default values
* Remove unused config options
* Don't mock the rate limiter in MAU tests
* Rename _register_with_store into register_with_store
* Make CI happy
* Remove unused import
* Update sample config
* Fix ratelimiting test for py2
* Add non-guest test
Remove a call to run_as_background_process: there is no need to run this as a
background process, because build_and_send_edu does not block.
We may as well inline the whole of _push_remotes.
When filtering events to send to server we check more than just history
visibility. However when deciding whether to backfill or not we only
care about the history visibility.
In worker mode, on the federation sender, when we receive an edu for sending
over the replication socket, it is parsed into an Edu object. There is no point
extracting the contents of it so that we can then immediately build another Edu.
We will later need also to import 'register_servlets' from the
'login' module, so we un-pollute the namespace now to keep the
logical changes separate.
If the client failed to process incoming commands during the initial set
up of the replication connection it would immediately disconnect and
reconnect, resulting in a tightloop.
This can happen, for example, when subscribing to a stream that has a
row that is too long in the backlog.
The fix here is to not consider the connection successfully set up until
the client has successfully subscribed and caught up with the streams.
This ensures that the retry logic timers aren't reset until then,
meaning that if an error does happen during start up the client will
continue backing off before retrying again.
* You need an entry in the debian changelog (and not a regular newsfragment)
for debian packaging changes.
* Regular newsfragments must end in full stops.
* Added HAProxy example
Proposal of an example with HAProxy. Asked by #4541.
Signed-off-by: Benoît S. (“Benpro”) <gitlab@benpro.fr>
* Following suggestions of @richvdh
I just got bitten by a file being caught by the .gitignore, which shouldn't
have been, and am now pissed off with the .gitignore. I have basically declared
bankruptcy on it and started again.
* Run unit tests against python 3.7
... so that we span the full range of our supported python versions
* Switch to xenial
* fix psql fail
* pep8 etc want python 3.6
The general idea here is that config examples should just have a hash and no
extraneous whitespace, both to make it easier for people who don't understand
yaml, and to make the examples stand out from the comments.
Currently whenever the current state changes in a room we invalidate a lot
of caches, which causes *a lot* of traffic over replication. Instead,
lets batch up all those invalidations and send a single poke down
the replication streams.
Hopefully this will reduce load on the master process by substantially
reducing traffic.
Firstly, we always logged that the request was being handled via
`JsonResource._async_render`, so we change that to use the servlet name
we add to the request.
Secondly, we pass the exception information to the logger rather than
formatting it manually. This makes it consistent with other exception
logging, allowing logging hooks and formatters to access the exception
information.
When guest_access changes from allowed to forbidden all local guest
users should be kicked from the room. This did not happen when
revocation was received from federation on a worker.
Presumably broken in #4141
This allows registration to be handled by a worker, though the actual
write to the database still happens on master.
Note: due to the in-memory session map all registration requests must be
handled by the same worker.
* Better logging for errors on startup
* Fix "TypeError: '>' not supported" when starting without an existing
certificate
* Fix a bug where an existing certificate would be reprovisioned every day
* implement `reload` by sending the HUP signal
According to the 0.99 release info* synapse now uses the HUP signal to reload certificates:
> Synapse will now reload TLS certificates from disk upon SIGHUP. (#4495, #4524)
So the matrix-synapse.service unit file should include a reload directive.
Signed-off-by: Дамјан Георгиевски <gdamjan@gmail.com>
I wanted to bring listen_tcp into line with listen_ssl in terms of returning a
list of ports, and wanted to check that was a safe thing to do - hence the
logging in `refresh_certificate`.
Also, pull the 'Synapse now listening' message up to homeserver.py, because it
was being duplicated everywhere else.
Due to the table locks taken out by the naive upsert, the table
statistics may be out of date. During deduplication it is important that
the correct index is used as otherwise a full table scan may be
incorrectly used, which can end up thrashing the database badly.
The background update to remove duplicate rows naively deleted and
reinserted the duplicates. For large tables with a large number of
duplicates this causes a lot of bloat (with postgres), as the inserted
rows are appended to the table, since deleted rows will not be
overwritten until a VACUUM has happened.
This should hopefully also help ensure that the query in the last batch
uses the correct index, as inserting a large number of new rows without
analyzing will upset the query planner.
Rearrange the comments to try to clarify them, and expand on what some of them
mean.
Use a sensible default 'bind_addresses' setting.
For the insecure port, only bind to localhost, and enable x_forwarded, since
apparently it's for use behind a load-balancer.
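For illustration, the resulting listener for the unsecured port looks roughly like this (a sketch of a `listeners` entry in `homeserver.yaml`; the exact defaults may differ):
```
listeners:
  - port: 8008
    tls: false
    type: http
    x_forwarded: true
    bind_addresses: ['127.0.0.1']
```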
Add more tables to the list of tables which need a background update to
complete before we can upsert into them, which fixes a race against the
background updates.
A surprising number of people are using the well-known method, and are
simply copying the example configuration. This is problematic as the
example includes an explicit port, which causes inbound federation
requests to have the HTTP Host header include the port, upsetting some
reverse proxies.
Given that, we update the well-known example to be more explicit about
the various ways you can set it up, and the consequence of using an
explicit port.
The readme was getting pretty unmanageable and hard to grok. This is an attempt to simplify things by moving installation instructions from the README to a separate file. I've tried to resist the temptation to fix too much stuff while I'm here - it mostly just copies-and-pastes from one doc to the other, and changes from rst to md syntax.
There are two reasons this is a good thing:
* first, it means that you don't end up with stuff kicking around your working
copy ending up in the build image by mistake (which can upset the pip
install process)
* second: it means that the docker image cache is more effective, and we can
reuse docker images when iterating on the docker stuff.
This was broken in PR #4405, commit 886e5ac, where we changed remote
rejections to be outliers.
The fix is to explicitly add the leave event in when we know it's an out
of band invite. We can't always add the event as if the server is/was in
the room there might be more events to send down the sync than just the
leave.
* Fix replication for room v3
We were not correctly quoting the path fragments over http replication,
which meant that it exploded when the event IDs had a slash in them
* Newsfile
* Handle listening for ACME requests on IPv6 addresses
the weird url-but-not-actually-a-url-string doesn't handle IPv6 addresses
without extra quoting. Building a string which you are about to parse again
seems like a weird choice. Let's just use listenTCP, which is consistent with
what we do elsewhere.
* Clean up the default ACME config
make it look a bit more consistent with everything else, and tweak the defaults
to listen on port 80.
* newsfile
We only process events sent to us from a server if the event ID matches
the server, to help guard against federation storms. We replace this
with a check against the event origin.
The transaction queue only sends out events that we generate. This was
done by checking the domain of the event ID, but that can no longer be used.
Instead, we may as well use the sender field.
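Roughly, the checks become something like the following (a sketch; `get_domain_from_id` mirrors the Synapse helper of that name, the rest of the names are illustrative):
```
def get_domain_from_id(user_id):
    # Mirrors the Synapse helper of the same name: "@alice:example.com" -> "example.com"
    return user_id.split(":", 1)[1]


def should_process_incoming(event, origin):
    # Accept the event only if the server that sent it to us owns the sender;
    # event IDs can no longer be used for this check.
    return get_domain_from_id(event.sender) == origin


def should_send_outgoing(event, our_server_name):
    # The transaction queue should only send out events generated by our own users.
    return get_domain_from_id(event.sender) == our_server_name
```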
We add the constant, but don't add it to the known room versions. This
lets us start adding V3 logic, but the servers will never join or create
V3 rooms
We currently pass FrozenEvent instead of `dict` to
`compute_event_signature`, which works by accident due to `dict(event)`
producing the correct result.
This fixes PR #4493 commit 855a151
two reasons for this. One, it saves a bunch of boilerplate. Two, it squashes
unicode to IDNA-in-a-`str` (even on python 3) in a way that it turns out we
rely on to give consistent behaviour between python 2 and 3.
The validator was being run on the EventBuilder objects, and so the
validator only checked a subset of fields. With the upcoming
EventBuilder refactor even fewer fields will be there to validate.
To get around this we split the validation into those that can be run
against an EventBuilder and those run against a fully fledged event.
This is in preparation for making EventBuilder format agnostic, which
means event signing should be done against the event dict rather than
the EventBuilder object.
Turns out that the library does a better job of parsing URIs than our
reinvented wheel. Who knew.
There are two things going on here. The first is that, unlike
parse_server_name, URI.fromBytes will strip off square brackets from IPv6
literals, which means that it is valid input to ClientTLSOptionsFactory and
HostnameEndpoint.
The second is that we stay in `bytes` throughout (except for the argument to
ClientTLSOptionsFactory), which avoids the weirdness of (sometimes) ending up
with idna-encoded values being held in `unicode` variables. TBH it probably
would have been ok but it made the tests fragile.
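For example (a sketch using `twisted.web.client.URI`; the URL itself is made up):
```
from twisted.web.client import URI

uri = URI.fromBytes(b"https://[2001:db8::1]:8448/_matrix/federation/v1/version")

# The IPv6 literal comes back without square brackets, and everything stays
# as bytes, so it can be fed straight into the TLS/endpoint machinery.
print(uri.scheme)  # b'https'
print(uri.host)    # b'2001:db8::1'
print(uri.port)    # 8448
print(uri.path)    # b'/_matrix/federation/v1/version'
```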
If you use double-quotes here, you have to escape your backslashes. It's much
easier with single-quotes.
(Note that the existing double-backslashes are already interpreted by python's
""" parsing.)
This is leading to problems with people upgrading to clients that
support MSC1730 because people have this misconfigured, so try
to make the docs completely unambiguous.
The problem here is that we have cut-and-pasted an impl from Twisted, and then
failed to maintain it. It was fixed in Twisted in
https://github.com/twisted/twisted/pull/1047/files; let's do the same here.
Currently they're stored as non-outliers even though the server isn't in
the room, which can be problematic in places where the code assumes it
has the state for all non outlier events.
In particular, there is an edge case where persisting the leave event
triggers a state resolution, which requires looking up the room version
from state. Since the server doesn't have the state, this causes an
exception to be thrown.
In the debian package, make the virtualenv symlink python to /usr/bin/python3.X
rather than /usr/bin/python3. Also make sure we depend on the right python3.x
package.
This might help a bit with subtle failures when people install a package from
the wrong distro (https://github.com/matrix-org/synapse/issues/4431).
Currently we only have the one event format version defined, but this
adds the necessary infrastructure to persist and fetch the format
versions alongside the events.
We specify the format version rather than the room version as:
1. We don't necessarily know the room version, existing events may be
either v1 or v2.
2. We'd need to be careful to prevent/handle correctly if different
events in the same room reported to be of different versions, which
sounds annoying.
This allows the OpenID userinfo endpoint to be active even if the
federation resource is not active. The OpenID userinfo endpoint
is called by integration managers to verify user actions using the
client API OpenID access token. Without this verification, the
integration manager cannot know that the access token is valid.
The OpenID userinfo endpoint will be loaded in the case that either
"federation" or "openid" resource is defined. The new "openid"
resource is defaulted to active in default configuration.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
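For illustration, the `openid` resource described above can be enabled on its own with a listener along these lines (a sketch of the relevant `resources` shape, not taken from the changelog):
```
listeners:
  - port: 8448
    type: http
    tls: true
    resources:
      - names: [openid]
```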
For all the homeserver classes, only the FrontendProxyServer passes
its reactor when doing the http listen. Looking at previous PRs, it looks
like this was introduced to make it possible to write a test, otherwise
when you try to run a test with the test homeserver it tries to
do a real bind to a port. Passing the reactor that the homeserver
is instantiated with should probably be the right thing to do anyway?
Signed-off-by: Jason Robinson <jasonr@matrix.org>
This was caused by accidentally overwriting a `last_seen` variable
in a for loop, causing the wrong value to be written to the progress
table. The result of which was that we didn't scan sections of the table
when searching for duplicates, and so some duplicates did not get
deleted.
* Fix race when persisting create event
When persisting a chunk of DAG it is sometimes required to do a state
resolution, which requires knowledge of the room version. If this
happens while we're persisting the create event then we need to use that
event rather than attempting to look it up in the database.
* Remove redundant WrappedConnection
The matrix federation client uses an HTTP connection pool, which times out its
idle HTTP connections, so there is no need for any of this business.
Since 0.13.0, pymacaroons works correctly with pynacl, so there
isn’t any more reason to depend on an outdated pynacl fork.
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
turns out that 0.34.1.1+1 comes before 0.34.1.1+bionic (etc).
The version may only contain "~ 0-9 A-Z a-z + - ." (sorting in that order).
Option 1: replace "+" with something that sorts after +. Options are "-" (but
dpkg-source complains about that) or "." (but that would mean we couldn't
distinguish packaging-only changes from real changes).
Option 2: stick with + and just find something that sorts after 'xenial'. The
only options there are "-", "." (same problems as before), "z", and "+".
Hence, ++1. Sorry.
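This is easy to check with `dpkg --compare-versions`, which exits 0 when the stated relation holds:
```
dpkg --compare-versions 0.34.1.1+1  lt 0.34.1.1+bionic && echo "+1 sorts before +bionic"
dpkg --compare-versions 0.34.1.1++1 gt 0.34.1.1+bionic && echo "++1 sorts after +bionic"
```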
[**#matrix:matrix.org**](https://matrix.to/#/#matrix:matrix.org) is the official support room for Matrix, and can be accessed by any client from https://matrix.org/docs/projects/try-matrix-now.html
It can also be accessed via the IRC bridge at irc://irc.freenode.net/matrix or on the web here: https://webchat.freenode.net/?channels=matrix
[**#synapse:matrix.org**](https://matrix.to/#/#synapse:matrix.org) is the official support room for
Synapse, and can be accessed by any client from https://matrix.org/docs/projects/try-matrix-now.html.
Please ask for support there, rather than filing github issues.
```
doas pkg_add python libffi py-pip py-setuptools sqlite3 py-virtualenv \
libxslt jpeg
```
There is currently no port for OpenBSD. Additionally, OpenBSD's security
settings require a slightly more difficult installation process.
XXX: I suspect this is out of date.
1. Create a new directory in `/usr/local` called `_synapse`. Also, create a
new user called `_synapse` and set that directory as the new user's home.
This is required because, by default, OpenBSD only allows binaries which need
write and execute permissions on the same memory space to be run from
`/usr/local`.
2. `su` to the new `_synapse` user and change to their home directory.
3. Create a new virtualenv: `virtualenv -p python2.7 ~/.synapse`
4. Source the virtualenv configuration located at
`/usr/local/_synapse/.synapse/bin/activate`. This is done in `ksh` by
using the `.` command, rather than `bash`'s `source`.
5. Optionally, use `pip` to install `lxml`, which Synapse needs to parse
webpages for their titles.
6. Use `pip` to install this repository: `pip install matrix-synapse`
7. Optionally, change `_synapse`'s shell to `/bin/false` to reduce the
chance of a compromised Synapse server being used to take over your box.
After this, you may proceed with the rest of the install directions.
#### Windows
If you wish to run or develop Synapse on Windows, the Windows Subsystem For
Linux provides a Linux environment on Windows 10 which is capable of using the
Debian, Fedora, or source installation methods. More information about WSL can
be found at https://docs.microsoft.com/en-us/windows/wsl/install-win10 for
Windows 10 and https://docs.microsoft.com/en-us/windows/wsl/install-on-server
for Windows Server.
### Troubleshooting Installation
XXX a bunch of this is no longer relevant.
Synapse requires pip 8 or later, so if your OS provides too old a version you
may need to manually upgrade it::
sudo pip install --upgrade pip
Installing may fail with `Could not find any downloads that satisfy the requirement pymacaroons-pynacl (from matrix-synapse==0.12.0)`.
You can fix this by manually upgrading pip and virtualenv::
sudo pip install --upgrade virtualenv
You can next rerun `virtualenv -p python3 synapse` to update the virtual env.
Installing may fail during installing virtualenv with `InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.`
You can fix this by manually installing ndg-httpsclient::
pip install --upgrade ndg-httpsclient
Installing may fail with `mock requires setuptools>=17.1. Aborting installation`.
You can fix this by upgrading setuptools::
pip install --upgrade setuptools
If pip crashes mid-installation for whatever reason (e.g. lost terminal), pip may
refuse to run until you remove the temporary installation directory it
created. To reset the installation::
rm -rf /tmp/pip_install_matrix
pip seems to leak *lots* of memory during installation. For instance, a Linux
host with 512MB of RAM may run out of memory whilst installing Twisted. If this
happens, you will have to individually install the dependencies which are
failing, e.g.::
pip install twisted
## Prebuilt packages
As an alternative to installing from source, prebuilt packages are available
for a number of platforms.
### Docker images and Ansible playbooks
There is an official synapse image available at
https://hub.docker.com/r/matrixdotorg/synapse which can be used with
the docker-compose file available at [contrib/docker](contrib/docker). Further information on
this including configuration options is available in the README on
hub.docker.com.
Alternatively, Andreas Peters (previously Silvio Fricke) has contributed a
Dockerfile to automate a synapse server in a single Docker image, at
Troubleshooting
===============
Running out of File Handles
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If synapse runs out of filehandles, it typically fails badly - live-locking
at 100% CPU, and/or failing to accept new TCP connections (blocking the
connecting client). Matrix currently can legitimately use a lot of file handles,
thanks to busy rooms like #matrix:matrix.org containing hundreds of participating
servers. The first time a server talks in a room it will try to connect
simultaneously to all participating servers, which could exhaust the available
file descriptors between DNS queries & HTTPS sockets, especially if DNS is slow
to respond. (We need to improve the routing algorithm to something better than
full mesh, but as of June 2017 this hasn't happened yet.)
If you hit this failure mode, we recommend increasing the maximum number of
open file handles to be at least 4096 (assuming a default of 1024 or 256).
This is typically done by editing ``/etc/security/limits.conf``
Separately, Synapse may leak file handles if inbound HTTP requests get stuck
during processing - e.g. blocked behind a lock or talking to a remote server etc.
This is best diagnosed by matching up the 'Received request' and 'Processed request'
log lines and looking for any 'Processed request' lines which take more than
a few seconds to execute. Please let us know at #matrix-dev:matrix.org if
you see this failure mode so we can help debug it.
Upgrading an existing Synapse
=============================
.. _UPGRADE.rst: UPGRADE.rst
.. _federation:
Setting up Federation
=====================
Federation is the process by which users on different servers can participate
in the same room. For this to work, those other servers must be able to contact
yours to send messages.
As explained in `Configuring synapse`_, the ``server_name`` in your
``homeserver.yaml`` file determines the way that other servers will reach
yours. By default, they will treat it as a hostname and try to connect to
port 8448. This is easy to set up and will work with the default configuration,
provided you set the ``server_name`` to match your machine's public DNS
hostname.
For a more flexible configuration, you can set up a DNS SRV record. This allows
you to run your server on a machine that might not have the same name as your
domain name. For example, you might want to run your server at
``synapse.example.com``, but have your Matrix user-ids look like
``@user:example.com``. (A SRV record also allows you to change the port from
the default 8448. However, if you are thinking of using a reverse-proxy on the
federation port, which is not recommended, be sure to read
`Reverse-proxying the federation port`_ first.)
To use a SRV record, first create your SRV record and publish it in DNS. This
should have the format ``_matrix._tcp.<yourdomain.com> <ttl> IN SRV 10 0 <port>
<synapse.server.name>``. The DNS record should then look something like::
$ dig -t srv _matrix._tcp.example.com
_matrix._tcp.example.com. 3600 IN SRV 10 0 8448 synapse.example.com.
Note that the server hostname cannot be an alias (CNAME record): it has to point
directly to the server hosting the synapse instance.
You can then configure your homeserver to use ``<yourdomain.com>`` as the domain in
print("Unable to login via the command line client. Please visit "
"%s to login."%fallback_url)
print(
"Unable to login via the command line client. Please visit "
"%s to login."%fallback_url
)
defer.returnValue(False)
defer.returnValue(True)
<clientSecret> A string of characters generated when requesting an email that you'll supply in subsequent calls to identify yourself
<sendAttempt> The number of times the user has requested an email. Leave this the same between requests to retry the request at the transport level. Increment it to request that the email be sent again.
The demo start.sh will start three synapse servers on ports 8080, 8081 and 8082, with host names localhost:$port. This can be easily changed to `hostname`:$port in start.sh if required.
It will also start a web server on port 8000 pointed at the webclient.
To enable the servers to communicate, untrusted SSL certs are used. In order to do this the servers do not check the certs
and are configured in a highly insecure way. Do not use these configuration files in production.
stop.sh will stop the synapse servers and the webclient.
This Docker image will run Synapse as a single process. By default it uses a
sqlite database; for production use you should connect it to a separate
postgres database. The image does not provide a database server or a TURN
server; you should run these separately.
## Run
We do not currently offer a `latest` image, as this has somewhat undefined semantics.
We instead release only tagged versions so upgrading between releases is entirely
within your control.
### Using docker-compose (easier)
This image is designed to run either with an automatically generated configuration
file or with a custom configuration that requires manual editing.
An easy way to make use of this image is via docker-compose. See the
[contrib/docker](../contrib/docker)
section of the synapse project for examples.
### Without Compose (harder)
If you do not wish to use Compose, you may still run this image using plain
Docker commands. Note that the following is just a guideline and you may need
to add parameters to the docker run command to account for the network situation
with your postgres database.
```
docker run \
-d \
--name synapse \
-v ${DATA_PATH}:/data \
-e SYNAPSE_SERVER_NAME=my.matrix.host \
-e SYNAPSE_REPORT_STATS=yes \
docker.io/matrixdotorg/synapse:latest
```
The image also does *not* provide a TURN server.
## Volumes
By default, the image expects a single volume, located at ``/data``, that will hold:
* configuration files;
* temporary files during uploads;
* uploaded media and thumbnails;
* the SQLite database if you do not configure postgres;
* the appservices configuration.
You are free to use separate volumes depending on storage endpoints at your
disposal. For instance, ``/data/media`` could be stored on a large but low
performance hdd storage while other files could be stored on high performance
endpoints.
In order to set up an application service, simply create an ``appservices``
directory in the data volume and write the application service Yaml
configuration file there. Multiple application services are supported.
## Generating a configuration file
Unless you specify a custom path for the configuration file, a very generic
file will be generated, based on the following environment settings.
These are a good starting point for setting up your own deployment.
The first step is to generate a valid config file. To do this, you can run the
image with the `generate` command line option.
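For example, a sketch reusing the volume and environment variables from the run example above (adjust them to your setup):
```
docker run -it --rm \
    -v ${DATA_PATH}:/data \
    -e SYNAPSE_SERVER_NAME=my.matrix.host \
    -e SYNAPSE_REPORT_STATS=yes \
    docker.io/matrixdotorg/synapse:latest generate
```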
Global settings:
* ``UID``, the user id Synapse will run as [default 991]
* ``GID``, the group id Synapse will run as [default 991]
* ``SYNAPSE_CONFIG_PATH``, path to a custom config file
If ``SYNAPSE_CONFIG_PATH`` is set, you should generate a configuration file
then customize it manually. No other environment variable is required.
Otherwise, a dynamic configuration file will be used. The following environment
variables are available for configuration:
* ``SYNAPSE_SERVER_NAME`` (mandatory), the current server public hostname.
* ``SYNAPSE_REPORT_STATS``, (mandatory, ``yes`` or ``no``), enable anonymous
statistics reporting back to the Matrix project which helps us to get funding.
* ``SYNAPSE_NO_TLS``, set this variable to disable TLS in Synapse (use this if
you run your own TLS-capable reverse proxy).
* ``SYNAPSE_ENABLE_REGISTRATION``, set this variable to enable registration on
the Synapse instance.
* ``SYNAPSE_ALLOW_GUEST``, set this variable to allow guests to join this server.
* ``SYNAPSE_EVENT_CACHE_SIZE``, the event cache size [default `10K`].
* ``SYNAPSE_CACHE_FACTOR``, the cache factor [default `0.5`].
* ``SYNAPSE_RECAPTCHA_PUBLIC_KEY``, set this variable to the recaptcha public
key in order to enable recaptcha upon registration.
* ``SYNAPSE_RECAPTCHA_PRIVATE_KEY``, set this variable to the recaptcha private
key in order to enable recaptcha upon registration.
* ``SYNAPSE_TURN_URIS``, set this variable to the comma-separated list of TURN
URIs to enable TURN for this homeserver.
* ``SYNAPSE_TURN_SECRET``, set this to the TURN shared secret if required.
* ``SYNAPSE_MAX_UPLOAD_SIZE``, set this variable to change the max upload size [default `10M`].
Shared secrets, which will be initialized to random values if not set:
* ``SYNAPSE_REGISTRATION_SHARED_SECRET``, secret for registering users if
registration is disabled.
* ``SYNAPSE_MACAROON_SECRET_KEY``, secret for signing access tokens
to the server.
Database specific values (will use SQLite if not set):
* `POSTGRES_DB` - The database name for the synapse postgres database. [default: `synapse`]
* `POSTGRES_HOST` - The host of the postgres database if you wish to use postgresql instead of sqlite3. [default: `db` which is useful when using a container on the same docker network in a compose file where the postgres service is called `db`]
* `POSTGRES_PASSWORD` - The password for the synapse postgres database. **If this is set then postgres will be used instead of sqlite3.** [default: none] **NOTE**: You are highly encouraged to use postgresql! Please use the compose file to make it easier to deploy.
* `POSTGRES_USER` - The user for the synapse postgres database. [default: `matrix`]
Mail server specific values (will not send emails if not set):
* ``SYNAPSE_SMTP_HOST``, hostname to the mail server.
* ``SYNAPSE_SMTP_PORT``, TCP port for accessing the mail server [default ``25``].
* ``SYNAPSE_SMTP_USER``, username for authenticating against the mail server if any.
* ``SYNAPSE_SMTP_PASSWORD``, password for authenticating against the mail server if any.
## Build
Build the docker image with the `docker build` command from the root of the synapse repository.
You will need to specify values for the `SYNAPSE_SERVER_NAME` and
`SYNAPSE_REPORT_STATS` environment variables, and mount a docker volume to store
The `-t` option sets the image tag. Official images are tagged `matrixdotorg/synapse:<version>` where `<version>` is the same as the release tag in the synapse git repository.
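For example (a sketch; it assumes the Dockerfile lives at `docker/Dockerfile` relative to the repository root):
```
docker build -t matrixdotorg/synapse:latest -f docker/Dockerfile .
```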
For information on picking a suitable server name, see
Synapse v1.0 will require valid TLS certificates for communication between
servers (port `8448` by default) in addition to those that are client-facing
(port `443`). If you do not already have a valid certificate for your domain,
the easiest way to get one is with Synapse's new ACME support, which will use
the ACME protocol to provision a certificate automatically. Synapse v0.99.0+
will provision server-to-server certificates automatically for you for free
through [Let's Encrypt](https://letsencrypt.org/) if you tell it to.
In the case that your `server_name` config variable is the same as
the hostname that the client connects to, then the same certificate can be
used between client and federation ports without issue.
If your configuration file does not already have an `acme` section, you can
generate an example config by running the `generate_config` executable. For
example:
```
~/synapse/env3/bin/generate_config
```
You will need to provide Let's Encrypt (or another ACME provider) access to
your Synapse ACME challenge responder on port 80, at the domain of your
homeserver. This requires you to either change the port of the ACME listener
provided by Synapse to a high port and reverse proxy to it, or use a tool
like `authbind` to allow Synapse to listen on port 80 without root access.
(Do not run Synapse with root permissions!) Detailed instructions are
available under "ACME setup" below.
If you already have certificates, you will need to back up or delete them
(files `example.com.tls.crt` and `example.com.tls.key` in Synapse's root
directory), as Synapse's ACME implementation will not overwrite them.
You may wish to use alternate methods such as Certbot to obtain a certificate
from Let's Encrypt, depending on your server configuration. Of course, if you
already have a valid certificate for your homeserver's domain, that can be
placed in Synapse's config directory without the need for any ACME setup.
## ACME setup
The main steps for enabling ACME support in short summary are:
1. Allow Synapse to listen for incoming ACME challenges.
1. Enable ACME support in `homeserver.yaml`.
1. Move your old certificates (files `example.com.tls.crt` and `example.com.tls.key`) out of the way if they currently exist at the paths specified in `homeserver.yaml`.
1. Restart Synapse.
Detailed instructions for each step are provided below.
### Listening on port 80
In order for Synapse to complete the ACME challenge to provision a
certificate, it needs access to port 80. Typically listening on port 80 is
only granted to applications running as root. There are thus two solutions to
this problem.
#### Using a reverse proxy
A reverse proxy such as Apache or nginx allows a single process (the web
server) to listen on port 80 and proxy traffic to the appropriate program
running on your server. It is the recommended method for setting up ACME as
it allows you to use your existing webserver while also allowing Synapse to
provision certificates as needed.
For nginx users, add the following line to your existing `server` block:
```
location /.well-known/acme-challenge {
proxy_pass http://localhost:8009;
}
```
For Apache, add the following to your existing webserver config:
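For example, a `mod_proxy` stanza along these lines (a sketch assuming the same ACME port 8009 used in the nginx example above):
```
# Requires mod_proxy and mod_proxy_http to be enabled.
<Location /.well-known/acme-challenge>
    ProxyPass http://localhost:8009/.well-known/acme-challenge
</Location>
```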
Make sure to restart/reload your webserver after making changes.
Now make the relevant changes in `homeserver.yaml` to enable ACME support:
```
acme:
enabled: true
port: 8009
```
#### Authbind
`authbind` allows a program which does not run as root to bind to
low-numbered ports in a controlled way. The setup is simpler, but requires a
webserver not to already be running on port 80. **This includes every time
Synapse renews a certificate**, which may be cumbersome if you usually run a
web server on port 80. Nevertheless, if you're sure port 80 is not being used
for any other purpose then all that is necessary is the following:
Install `authbind`. For example, on Debian/Ubuntu:
```
sudo apt-get install authbind
```
Allow `authbind` to bind port 80:
```
sudo touch /etc/authbind/byport/80
sudo chmod 777 /etc/authbind/byport/80
```
When Synapse is started, use the following syntax:
```
authbind --deep <synapse start command>
```
Make the relevant changes in `homeserver.yaml` to enable ACME support:
```
acme:
enabled: true
```
### (Re)starting synapse
Ensure that the certificate paths specified in `homeserver.yaml` (`tls_certificate_path` and `tls_private_key_path`) do not currently point to any files. Synapse will not provision certificates if files exist, as it does not want to overwrite existing certificates.
These architecture notes are spectacularly old, and date back to when Synapse
was just federation code in isolation. This should be merged into the main
spec.
= Server to Server =
== Server to Server Stack ==
To use the server to server stack, home servers should only need to interact with the Messaging layer.
The server to server side of things is split into 4 distinct layers:
1. Messaging Layer
2. Pdu Layer
3. Transaction Layer
4. Transport Layer
Where the bottom (the transport layer) is what talks to the internet via HTTP, and the top (the messaging layer) talks to the rest of the Home Server with a domain specific API.
1. Messaging Layer
This is what the rest of the Home Server hits to send messages, join rooms, etc. It also allows you to register callbacks for when it gets notified by lower levels that e.g. a new message has been received.
It is responsible for serializing requests to send to the data layer, and to parse requests received from the data layer.
2. PDU Layer
This layer handles:
* duplicate pdu_id's - i.e., it makes sure we ignore them.
* responding to requests for a given pdu_id
* responding to requests for all metadata for a given context (i.e. room)
* handling incoming backfill requests
So it has to parse incoming messages to discover which are metadata and which aren't, and has to correctly clobber existing metadata where appropriate.
For incoming PDUs, it has to check the PDUs it references to see if we have missed any. If we have, go and ask someone (another home server) for it.
3. Transaction Layer
This layer makes incoming requests idempotent. I.e., it stores which transaction ids we have seen and what our responses were. If we have already seen a message with the given transaction id, we do not notify higher levels but simply respond with the previous response.
transaction_id is from "GET /send/<tx_id>/"
It's also responsible for batching PDUs into a single transaction for sending to remote destinations, so that we only ever have one transaction in flight to a given destination at any one time.
This is also responsible for answering requests for things after a given set of transactions, i.e., ask for everything after 'ver' X.
4. Transport Layer
This is responsible for starting a HTTP server and hitting the correct callbacks on the Transaction layer, as well as sending both data and requests for data.
== Persistence ==
We persist things in a single sqlite3 database. All database queries get run on a separate, dedicated thread. This means that we only ever have one query running at a time, making it a lot easier to do things in a safe manner.
The queries are located in the synapse.persistence.transactions module, and the table information in the synapse.persistence.tables module.
- Handlers: business logic of synapse itself. Follows a set contract of BaseHandler:
- BaseHandler gives us onNewRoomEvent which: (TODO: flesh this out and make it less cryptic):
- handle_state(event)
- auth(event)
- persist_event(event)
- notify notifier or federation(event)
- PresenceHandler: use distributor to get EDUs out of Federation.
Very lightweight logic built on the distributor
- TypingHandler: use distributor to get EDUs out of Federation.
Very lightweight logic built on the distributor
- EventsHandler: handles the events stream...
- FederationHandler: - gets PDU from Federation Layer; turns into
an event; follows basehandler functionality.
- RoomsHandler: does all the room logic, including members - lots
of classes in RoomsHandler.
- ProfileHandler: talks to the storage to store/retrieve profile
info.
- EventFactory: generates events of particular event types.
- Notifier: Backs the events handler
- REST: Interfaces handlers and events to the outside world via
HTTP/JSON. Converts events back and forth from JSON.
- Federation: holds the HTTP client & server to talk to other servers.
Does replication to make sure there's nothing missing in the graph.
Handles reliability. Handles txns.
- Distributor: generic event bus. used for presence & typing only
currently. Notifier could be implemented using Distributor - so far
we are only using it for things which actually /require/ dynamic
pluggability however as it can obfuscate the actual flow of control.
- Auth: helper singleton to say whether a given event is allowed to do
a given thing (TODO: put this on the diagram)
- State: helper singleton: does state conflict resolution. You give it
an event and it tells you if it actually updates the state or not,
and annotates the event up properly and handles merge conflict
resolution.
- Storage: abstracts the storage engine.