This big ol' change does three high-level things:
1. It modifies `_get_interested_in` to ask the loaded PresenceRouter if
there are any users - in addition to those that share a room with the
user in question - that it thinks should have their presence status
queried. PresenceRouter can either return a Set of users, or "ALL".
2. It modifies `get_new_events` (which is mainly run when a user is
syncing and needs to check for presence updates) to support receiving
"ALL" from `_get_interested_in`. What happens then depends on whether a
`from_key` was provided to `get_new_events`. We also now call
`get_users_for_states` to filter the UserPresenceState objects after
querying all of them from the given `from_key`.
3. It also modifies `get_new_events` to take into account whether the
syncing user is included in
`ModuleApi.send_full_presence_to_local_users`. If so, then we're going
to send them all current user presence state (filtering it through
`get_users_for_states` again). We then remove the user ID from the set
to ensure the same doesn't happen on the next sync.
This is mainly all to support redirecting presence for local users as
they sync, though the same method is called for appservice users.
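The interplay of the pieces above might look roughly like this (a minimal sketch; `StubPresenceRouter`, `get_interested_in` and the `ALL` sentinel are simplified stand-ins, not Synapse's actual signatures):

```python
import asyncio
from typing import Set, Union

ALL = "ALL"  # sentinel: the router is interested in every user's presence

class StubPresenceRouter:
    """Stand-in for a loaded PresenceRouter module (hypothetical)."""
    def __init__(self, extra: Union[Set[str], str]):
        self._extra = extra

    async def get_interested_users(self, user_id: str) -> Union[Set[str], str]:
        return self._extra

async def get_interested_in(
    router: StubPresenceRouter, user_id: str, room_sharers: Set[str]
) -> Union[Set[str], str]:
    """Users whose presence this user should see: room co-members plus
    whatever extra users the router nominates, or ALL."""
    extra = await router.get_interested_users(user_id)
    if extra == ALL:
        return ALL  # short-circuit; no point unioning with room members
    return room_sharers | extra

print(asyncio.run(get_interested_in(
    StubPresenceRouter({"@carol:example.com"}),
    "@alice:example.com",
    {"@bob:example.com"},
)))
```

When `get_new_events` gets `ALL` back, it queries presence for all users (from `from_key` if one was given) and runs the results through `get_users_for_states` to filter them.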
This will be useful for when `PresenceRouter.get_interested_users` returns
"ALL", as it allows querying all current local user presence. Note
that the `presence_stream` table is culled frequently, and doesn't just
grow forever like other stream tables.
This function is useful for 'catching up' a user if you've just started directing
presence updates their way. Sending the current presence (excluding offline) for
each user before you start sending them diffs ensures the target has the right
presence state for each user immediately.
This effectively just forces a presence initial_sync for the user.
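A rough sketch of that catch-up behaviour (the filtering through `get_users_for_states` is elided, and everything except the `send_full_presence_to_local_users` name is illustrative):

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class PresenceState:  # simplified stand-in for UserPresenceState
    user_id: str
    state: str  # "online", "unavailable" or "offline"

# Stand-in for ModuleApi.send_full_presence_to_local_users
send_full_presence_to_local_users: Set[str] = {"@alice:example.com"}

def maybe_send_full_presence(
    syncing_user: str, all_states: List[PresenceState]
) -> List[PresenceState]:
    """If the user is flagged, hand back every current non-offline
    presence state and unflag them, so the next sync only sees diffs."""
    if syncing_user not in send_full_presence_to_local_users:
        return []
    send_full_presence_to_local_users.discard(syncing_user)
    return [s for s in all_states if s.state != "offline"]
```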
This commit asks the PresenceRouter for any users - in addition to those
that they share a room with - that should receive the given presence
updates.
These methods are called when routing new presence updates around as
they come in.
* `get_interested_parties` is called when figuring out which local and
remote users to send presence to. For local users, their sync streams
will be woken up.
* `get_interested_remotes` is specifically for figuring out which remote
user(s) a given presence update needs to go to.
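The local/remote split those two methods perform can be sketched like this (hypothetical helper; the real methods also consult room state and the stores):

```python
from typing import Iterable, Set, Tuple

def split_local_and_remote(
    interested_users: Iterable[str], server_name: str
) -> Tuple[Set[str], Set[str]]:
    """Split the users who should receive a presence update into local
    users (whose sync streams get woken up) and remote hosts (which get
    the update over federation)."""
    local_users: Set[str] = set()
    remote_hosts: Set[str] = set()
    for user in interested_users:
        host = user.split(":", 1)[1]  # Matrix IDs look like @localpart:host
        if host == server_name:
            local_users.add(user)
        else:
            remote_hosts.add(host)
    return local_users, remote_hosts
```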
This class will perform in the same manner as Synapse did before,
unless a custom PresenceRouter module is configured. If one is,
then it will pass through the calls from Synapse to that module.
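The pass-through shape is roughly this (a sketch; the method name mirrors the module API loosely and the no-module default is simplified):

```python
import asyncio
from typing import Set

class PresenceRouter:
    """With no custom module configured, behave as Synapse did before;
    otherwise delegate the call to the module."""

    def __init__(self, custom_module=None):
        self._module = custom_module

    async def get_interested_users(self, user_id: str) -> Set[str]:
        if self._module is None:
            # No module: no extra interested users beyond room co-members
            # (which are computed elsewhere), i.e. the old behaviour.
            return set()
        return await self._module.get_interested_users(user_id)
```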
This bug was discovered by DINUM. We were modifying `serialized_event["content"]`, which - if you've got `USE_FROZEN_DICTS` turned on or are [using a third party rules module](17cd48fe51/synapse/events/third_party_rules.py (L73-L76)) - will raise a 500 if you try to edit a reply to a message.
`serialized_event["content"]` could be set to the edit event's content rather than a copy of it, which is bad because we then attempt to modify it - and end up modifying the original event's content as well. DINUM uses a third party rules module, which meant the event's content got frozen and thus an exception was raised.
To be clear, the problem is not that the event's content was frozen. In fact doing so helped us uncover the fact we weren't copying event content correctly.
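The aliasing bug boils down to this pattern (minimal illustration with a plain dict; the real code deals with serialized events and frozen dicts):

```python
import copy

original = {"body": "hello", "msgtype": "m.text"}

# Buggy pattern: the serialized event aliases the original content dict,
# so mutating it would also mutate the original (and raise if frozen).
aliased = {"content": original}

# Fixed pattern: take a copy before mutating.
serialized = {"content": copy.deepcopy(original)}
serialized["content"]["body"] = "edited reply"

print(original["body"])  # still "hello"
```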
We had two functions named `get_forward_extremities_for_room` and
`get_forward_extremeties_for_room` that took different parameters. We
rename one of them to avoid confusion.
* Populate `internal_metadata.outlier` based on `events` table
Rather than relying on `outlier` being in the `internal_metadata` column,
populate it based on the `events.outlier` column.
* Move `outlier` out of InternalMetadata._dict
Ultimately, this will allow us to stop writing it to the database. For now, we
have to grandfather it back in so as to maintain compatibility with older
versions of Synapse.
Instead of gating on whether the user has a password hash. This allows
an SSO user to add a password to their account, but only if the local
password database is configured.
Fixes https://github.com/matrix-org/synapse/issues/9572
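In effect, the gate changes roughly like so (illustrative function, not Synapse's actual code; the old gate depended on the user's password hash, the new one only on config):

```python
from typing import Optional

def can_set_password(localdb_enabled: bool, password_hash: Optional[str]) -> bool:
    """Allow setting a password whenever local password auth is
    configured, regardless of whether the user (e.g. an SSO user)
    already has a password hash."""
    return localdb_enabled
```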
When an SSO user logs in for the first time, we create a local Matrix user for them. This goes through the register_user flow, which ends up triggering the spam checker. Spam checker modules don't currently have any way to differentiate between a user trying to sign up initially, versus an SSO user (who has presumably already been approved elsewhere) trying to log in for the first time.
This PR passes `auth_provider_id` as an argument to the `check_registration_for_spam` function. This argument will contain an ID of an SSO provider (`"saml"`, `"cas"`, etc.) if one was used, else `None`.
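A spam checker module using the new argument might look like this (hypothetical module: the heuristic is made up, and real implementations return a `RegistrationBehaviour` value rather than a string):

```python
import asyncio
from typing import Optional

async def check_registration_for_spam(
    email_threepid: Optional[dict],
    username: Optional[str],
    request_info,
    auth_provider_id: Optional[str] = None,
) -> str:
    """Skip the spam heuristics for users arriving via an SSO provider,
    since they were presumably vetted there already."""
    if auth_provider_id is not None:  # e.g. "saml", "cas"
        return "ALLOW"
    if username and "spam" in username:
        return "DENY"
    return "ALLOW"

print(asyncio.run(check_registration_for_spam(None, "spambot", [], "saml")))
```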
Federation catch up mode is very inefficient if the number of events
that the remote server has missed is small, since handling gaps can be
very expensive, cf. #9492.
Instead of going into catch up mode whenever we see an error, we instead
do so only if we've backed off from trying the remote for more than an
hour (the assumption being that in such a case it is more than a
transient failure).
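The new heuristic amounts to something like this (names and the threshold constant are illustrative, taken from the description above):

```python
CATCH_UP_THRESHOLD_MS = 60 * 60 * 1000  # one hour

def should_enter_catch_up(retry_interval_ms: int) -> bool:
    """Transient blips (short backoff) stay on the normal send path;
    only a sustained outage - backoff beyond an hour - flips the
    destination into catch up mode."""
    return retry_interval_ms > CATCH_UP_THRESHOLD_MS
```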
Background: When we receive incoming federation traffic, and notice that we are missing prev_events from
the incoming traffic, first we do a `/get_missing_events` request, and then if we still have missing prev_events,
we set up new backwards-extremities. To do that, we need to make a `/state_ids` request to ask the remote
server for the state at those prev_events, and then we may need to then ask the remote server for any events
in that state which we don't already have, as well as the auth events for those missing state events, so that we
can auth them.
This PR attempts to optimise the processing of that state request. The `state_ids` API returns a list of the state
events, as well as a list of all the auth events for *all* of those state events. The optimisation comes from the
observation that we are currently loading all of those auth events into memory at the start of the operation, but
we almost certainly aren't going to need *all* of the auth events. Rather, we can check that we have them, and
leave the actual load into memory for later. (Ideally the federation API would tell us which auth events we're
actually going to need, but it doesn't.)
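The check-don't-load idea can be sketched as follows (stub store and fetch; `have_seen_events` does exist in Synapse's store, but everything else here is simplified):

```python
import asyncio
from typing import Iterable, Set

class StubStore:
    """Stand-in for the event store: can answer "have we seen these
    event IDs?" without loading the full events into memory."""
    def __init__(self, known: Set[str]):
        self._known = known

    async def have_seen_events(self, event_ids: Iterable[str]) -> Set[str]:
        return {e for e in event_ids if e in self._known}

fetched: Set[str] = set()

async def fetch_missing_auth_events(ids: Set[str]) -> None:
    fetched.update(ids)  # stand-in for a federation fetch

async def check_auth_events(store: StubStore, auth_event_ids: Set[str]) -> None:
    # Cheap ID-level existence check; crucially, no bulk load of the
    # auth events we already have - they load lazily if ever needed.
    have = await store.have_seen_events(auth_event_ids)
    missing = auth_event_ids - have
    if missing:
        await fetch_missing_auth_events(missing)

asyncio.run(check_auth_events(StubStore({"$a", "$b"}), {"$a", "$b", "$c"}))
```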
The effect of this is to reduce the number of events that I need to load for an event in Matrix HQ from about
60000 to about 22000, which means it can stay in my in-memory cache, whereas previously the sheer number
of events meant that all 60K events had to be loaded from db for each request, due to the amount of cache
churn. (NB I've already tripled the size of the cache from its default of 10K).
Unfortunately I've ended up basically C&Ping `_get_state_for_room` and `_get_events_from_store_or_dest` into
a new method, because `_get_state_for_room` is also called during backfill, which expects the auth events to be
returned, so the same tricks don't work. That said, I don't really know why that codepath is completely different
(ultimately we're doing the same thing in setting up a new backwards extremity) so I've left a TODO suggesting
that we clean it up.
We either need to pass the auth provider over the replication api, or make sure
we report the auth provider on the worker that received the request. I've gone
with the latter.
Earlier [I was convinced](https://github.com/matrix-org/synapse/issues/9565) that we didn't have an Admin API for listing media uploaded by a user. Foolishly I was looking under the Media Admin API documentation, instead of the User Admin API documentation.
I thought it'd be helpful to link to the latter so others don't hit the same dead end :)
The hashes are from commits due to auto-formatting, e.g. running black.
git can be configured to use this automatically by running the following:
git config blame.ignoreRevsFile .git-blame-ignore-revs