Compare commits

1 Commits

Author: Richard van der Hoff
SHA1: bc435f7d9d
Message: Add some debug to keep track of client state desynchronisation
Fixes https://github.com/element-hq/crypto-internal/issues/179
Date: 2024-04-11 13:09:45 +01:00
20 changed files with 155 additions and 52 deletions


@@ -1,42 +1,3 @@
# Synapse 1.105.0 (2024-04-16)
No significant changes since 1.105.0rc1.
# Synapse 1.105.0rc1 (2024-04-11)
### Features
- Stabilize support for [MSC4010](https://github.com/matrix-org/matrix-spec-proposals/pull/4010) which clarifies the interaction of push rules and account data. Contributed by @clokep. ([\#17022](https://github.com/element-hq/synapse/issues/17022))
- Stabilize support for [MSC3981](https://github.com/matrix-org/matrix-spec-proposals/pull/3981): `/relations` recursion. Contributed by @clokep. ([\#17023](https://github.com/element-hq/synapse/issues/17023))
- Add support for moving `/pushrules` off of main process. ([\#17037](https://github.com/element-hq/synapse/issues/17037), [\#17038](https://github.com/element-hq/synapse/issues/17038))
### Bugfixes
- Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations. ([\#16930](https://github.com/element-hq/synapse/issues/16930), [\#16932](https://github.com/element-hq/synapse/issues/16932), [\#16942](https://github.com/element-hq/synapse/issues/16942), [\#17064](https://github.com/element-hq/synapse/issues/17064), [\#17065](https://github.com/element-hq/synapse/issues/17065), [\#17066](https://github.com/element-hq/synapse/issues/17066))
- Fix server notice rooms not always being created as unencrypted rooms, even when `encryption_enabled_by_default_for_room_type` is in use (server notices are always unencrypted). ([\#17033](https://github.com/element-hq/synapse/issues/17033))
- Fix the `.m.rule.encrypted_room_one_to_one` and `.m.rule.room_one_to_one` default underride push rules being in the wrong order. Contributed by @Sumpy1. ([\#17043](https://github.com/element-hq/synapse/issues/17043))
### Internal Changes
- Refactor auth chain fetching to reduce duplication. ([\#17044](https://github.com/element-hq/synapse/issues/17044))
- Improve database performance by adding a missing index to `access_tokens.refresh_token_id`. ([\#17045](https://github.com/element-hq/synapse/issues/17045), [\#17054](https://github.com/element-hq/synapse/issues/17054))
- Improve database performance by reducing number of receipts fetched when sending push notifications. ([\#17049](https://github.com/element-hq/synapse/issues/17049))
### Updates to locked dependencies
* Bump packaging from 23.2 to 24.0. ([\#17027](https://github.com/element-hq/synapse/issues/17027))
* Bump regex from 1.10.3 to 1.10.4. ([\#17028](https://github.com/element-hq/synapse/issues/17028))
* Bump ruff from 0.3.2 to 0.3.5. ([\#17060](https://github.com/element-hq/synapse/issues/17060))
* Bump serde_json from 1.0.114 to 1.0.115. ([\#17041](https://github.com/element-hq/synapse/issues/17041))
* Bump types-pillow from 10.2.0.20240125 to 10.2.0.20240406. ([\#17061](https://github.com/element-hq/synapse/issues/17061))
* Bump types-requests from 2.31.0.20240125 to 2.31.0.20240406. ([\#17063](https://github.com/element-hq/synapse/issues/17063))
* Bump typing-extensions from 4.9.0 to 4.11.0. ([\#17062](https://github.com/element-hq/synapse/issues/17062))
# Synapse 1.104.0 (2024-04-02)
### Bugfixes

1
changelog.d/16930.bugfix Normal file

@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.

1
changelog.d/16932.bugfix Normal file

@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.

1
changelog.d/16942.bugfix Normal file

@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.


@@ -0,0 +1 @@
Stabilize support for [MSC4010](https://github.com/matrix-org/matrix-spec-proposals/pull/4010) which clarifies the interaction of push rules and account data. Contributed by @clokep.


@@ -0,0 +1 @@
Stabilize support for [MSC3981](https://github.com/matrix-org/matrix-spec-proposals/pull/3981): `/relations` recursion. Contributed by @clokep.

1
changelog.d/17033.bugfix Normal file

@@ -0,0 +1 @@
Fix server notice rooms not always being created as unencrypted rooms, even when `encryption_enabled_by_default_for_room_type` is in use (server notices are always unencrypted).


@@ -0,0 +1 @@
Add support for moving `/pushrules` off of main process.


@@ -0,0 +1 @@
Add support for moving `/pushrules` off of main process.

1
changelog.d/17043.bugfix Normal file

@@ -0,0 +1 @@
Fix the `.m.rule.encrypted_room_one_to_one` and `.m.rule.room_one_to_one` default underride push rules being in the wrong order. Contributed by @Sumpy1.

1
changelog.d/17044.misc Normal file

@@ -0,0 +1 @@
Refactor auth chain fetching to reduce duplication.

1
changelog.d/17045.misc Normal file

@@ -0,0 +1 @@
Improve database performance by adding a missing index to `access_tokens.refresh_token_id`.

1
changelog.d/17049.misc Normal file

@@ -0,0 +1 @@
Improve database performance by reducing number of receipts fetched when sending push notifications.

1
changelog.d/17054.misc Normal file

@@ -0,0 +1 @@
Improve database performance by adding a missing index to `access_tokens.refresh_token_id`.

1
changelog.d/17064.bugfix Normal file

@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.

1
changelog.d/17065.bugfix Normal file

@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.

1
changelog.d/17066.bugfix Normal file

@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.

12
debian/changelog vendored

@@ -1,15 +1,3 @@
matrix-synapse-py3 (1.105.0) stable; urgency=medium
* New Synapse release 1.105.0.
-- Synapse Packaging team <packages@matrix.org> Tue, 16 Apr 2024 15:53:23 +0100
matrix-synapse-py3 (1.105.0~rc1) stable; urgency=medium
* New Synapse release 1.105.0rc1.
-- Synapse Packaging team <packages@matrix.org> Thu, 11 Apr 2024 12:15:49 +0100
matrix-synapse-py3 (1.104.0) stable; urgency=medium
* New Synapse release 1.104.0.


@@ -96,7 +96,7 @@ module-name = "synapse.synapse_rust"
[tool.poetry]
name = "matrix-synapse"
-version = "1.105.0"
+version = "1.104.0"
description = "Homeserver for the Matrix decentralised comms protocol"
authors = ["Matrix.org Team and Contributors <packages@matrix.org>"]
license = "AGPL-3.0-or-later"


@@ -88,6 +88,10 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
# Logging for https://github.com/matrix-org/matrix-spec/issues/1209 and
# https://github.com/element-hq/synapse/issues/16940
client_state_desync_logger = logging.getLogger("synapse.client_state_desync_debug")
# Counts the number of times we returned a non-empty sync. `type` is one of
# "initial_sync", "full_state_sync" or "incremental_sync", `lazy_loaded` is
# "true" or "false" depending on if the request asked for lazy loaded members or
@@ -1214,6 +1218,12 @@ class SyncHandler:
            previous_timeline_end={},
            lazy_load_members=lazy_load_members,
        )

        if client_state_desync_logger.isEnabledFor(logging.DEBUG):
            await self._log_client_state_desync(
                room_id, None, state_ids, timeline_state, lazy_load_members
            )
        return state_ids

    async def _compute_state_delta_for_incremental_sync(
@@ -1359,6 +1369,15 @@ class SyncHandler:
            lazy_load_members=lazy_load_members,
        )

        if client_state_desync_logger.isEnabledFor(logging.DEBUG):
            await self._log_client_state_desync(
                room_id,
                since_token,
                state_ids,
                timeline_state,
                lazy_load_members,
            )

        return state_ids

    async def _find_missing_partial_state_memberships(
@@ -1475,6 +1494,125 @@ class SyncHandler:
        return additional_state_ids

    async def _log_client_state_desync(
        self,
        room_id: str,
        since_token: Optional[StreamToken],
        sync_response_state_state: StateMap[str],
        sync_response_timeline_state: StateMap[str],
        lazy_load_members: bool,
    ) -> None:
        """
        Logging to see how often the client's state gets out of sync with the
        actual current state of the room.

        There are a few different potential failure modes here:

        * State resolution can cause changes in the state of the room that don't
          directly correspond to events with the corresponding (type, state_key).
          https://github.com/matrix-org/matrix-spec/issues/1209 discusses this in
          more detail.

        * Even where there is an event that causes a given state change, Synapse
          may not serve it to the client, since it works on state at specific points
          in the DAG, rather than "current state".
          See https://github.com/element-hq/synapse/issues/16940.

        * Lazy-loading adds more complexity, as it means that events that would
          normally be served via the `state` part of an incremental sync are filtered
          out.

        To try to get a handle on this, let's put ourselves in the shoes of a client,
        and compare the state they will calculate against the actual current state.
        """
        # We only care about membership events.
        state_filter = StateFilter.from_types(types=(("m.room.member", None),))

        if since_token is None:
            if lazy_load_members:
                # For initial syncs with lazy-loading enabled, there's not too much
                # concern here. We know the client will do a `/members` query before
                # doing any encryption, so what sync returns isn't too important.
                #
                # (Of course, then `/members` might also return an incomplete list, but
                # that's a separate problem.)
                return

            # For regular initial syncs, compare the returned response with the actual
            # current state.
            client_calculated_state = {}
            client_calculated_state.update(sync_response_state_state)
            client_calculated_state.update(sync_response_timeline_state)
        else:
            # For an incremental (gappy or otherwise) sync, let's assume the client has
            # a complete membership list as of the last sync (or rather, at
            # `since_token`, which is the closest approximation we have to it
            # right now), and see what they would calculate as the current state given
            # this sync update.
            client_calculated_state = dict(
                await self.get_state_at(
                    room_id,
                    stream_position=since_token,
                    state_filter=state_filter,
                    await_full_state=False,
                )
            )
            client_calculated_state.update(sync_response_state_state)
            client_calculated_state.update(sync_response_timeline_state)

        current_state = await self._state_storage_controller.get_current_state_ids(
            room_id, state_filter=state_filter, await_full_state=False
        )
        missing_users = await self._calculate_missing_members(
            current_state, client_calculated_state
        )
        if missing_users:
            # This is reached for both initial and incremental syncs, so say which.
            client_state_desync_logger.debug(
                "client state discrepancy in %s sync in room %s: missing users %s",
                "initial" if since_token is None else "incremental",
                room_id,
                missing_users,
            )

    async def _calculate_missing_members(
        self,
        actual_state: StateMap[str],
        client_calculated_state: StateMap[str],
    ) -> List[str]:
        """Helper for `_log_client_state_desync`: calculates the difference in
        joined members between two state maps.

        Returns:
            A list of user IDs
        """
        missing_users = []

        async def event_id_to_membership(event_id: Optional[str]) -> Optional[str]:
            if event_id is None:
                return None
            event = await self.store.get_event(event_id, allow_none=True)
            if event is None:
                return "MISSING_EVENT"
            return event.membership

        # Check for joined members in the actual state that are missing or have a
        # different membership in the client-calculated state.
        for (event_type, state_key), actual_event_id in actual_state.items():
            if event_type != EventTypes.Member:
                continue

            calculated_event_id = client_calculated_state.get((event_type, state_key))
            if calculated_event_id != actual_event_id:
                # `event_id_to_membership` is a coroutine function, so it must be
                # awaited; otherwise the comparisons below are always false.
                actual_membership = await event_id_to_membership(actual_event_id)
                calculated_membership = await event_id_to_membership(
                    calculated_event_id
                )
                if (
                    actual_membership == Membership.JOIN
                    and calculated_membership != Membership.JOIN
                ):
                    missing_users.append(state_key)

        return missing_users

    async def unread_notifs_for_room_id(
        self, room_id: str, sync_config: SyncConfig
    ) -> RoomNotifCounts:
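
The comparison performed by `_log_client_state_desync` and `_calculate_missing_members` can be illustrated with plain dictionaries. This is a simplified, synchronous sketch rather than Synapse code: state maps are modelled as `{(event_type, state_key): event_id}`, the hypothetical `MEMBERSHIPS` dict stands in for the event store lookup, and all user and event IDs are made up.

```python
from typing import Dict, List, Optional, Tuple

StateMap = Dict[Tuple[str, str], str]

# Hypothetical stand-in for the event store: event ID -> membership.
MEMBERSHIPS: Dict[str, str] = {
    "$alice_join": "join",
    "$bob_join": "join",
    "$bob_leave": "leave",
    "$carol_join": "join",
}


def membership_of(event_id: Optional[str]) -> Optional[str]:
    """Look up a membership, mirroring the None / missing-event cases."""
    if event_id is None:
        return None
    return MEMBERSHIPS.get(event_id, "MISSING_EVENT")


def client_view(previous: StateMap, state: StateMap, timeline: StateMap) -> StateMap:
    """Apply a sync response on top of what the client already knew."""
    view = dict(previous)
    view.update(state)
    view.update(timeline)  # timeline entries are applied last
    return view


def missing_members(actual: StateMap, calculated: StateMap) -> List[str]:
    """Users joined in `actual` whom the client does not see as joined."""
    missing = []
    for (event_type, state_key), actual_id in actual.items():
        if event_type != "m.room.member":
            continue
        calculated_id = calculated.get((event_type, state_key))
        if calculated_id == actual_id:
            continue
        if (
            membership_of(actual_id) == "join"
            and membership_of(calculated_id) != "join"
        ):
            missing.append(state_key)
    return missing


MEMBER = "m.room.member"

# What the client knew at its previous sync: alice joined, bob had left.
previous: StateMap = {
    (MEMBER, "@alice:example.org"): "$alice_join",
    (MEMBER, "@bob:example.org"): "$bob_leave",
}
# The sync response carries carol's join but not bob's re-join.
state: StateMap = {}
timeline: StateMap = {(MEMBER, "@carol:example.org"): "$carol_join"}

# The actual current state of the room.
actual: StateMap = {
    (MEMBER, "@alice:example.org"): "$alice_join",
    (MEMBER, "@bob:example.org"): "$bob_join",
    (MEMBER, "@carol:example.org"): "$carol_join",
}

result = missing_members(actual, client_view(previous, state, timeline))
```

Here `result` contains only `@bob:example.org`: the client still believes bob has left, which is the kind of discrepancy the new debug logging is meant to surface.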