Compare commits

...

13 Commits

Author SHA1 Message Date
Olivier Wilkinson (reivilibre)
0887bb3052 Add counting for total_events in room statistics.
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-22 17:32:19 +01:00
Olivier Wilkinson (reivilibre)
df3fafa827 Introduce total_events column in room_stats.
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-18 14:45:55 +01:00
Richard van der Hoff
9150ec72fa Convert synapse.federation.transport.server to async (#5689)
* Convert BaseFederationServlet._wrap to async

Empirically, this fixes some lost stacktraces. It should be safe because the
wrapped function is called from JsonResource._async_render, which is already
async.

* Convert the rest of synapse.federation.transport.server to async

We may as well do the whole file while we're here.

* changelog

* flake8
2019-07-18 14:45:55 +01:00
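
A minimal sketch of the before/after pattern described here, assuming a simplified wrapper rather than Synapse's actual `BaseFederationServlet._wrap` signature:

    # Hedged sketch: a simplified servlet wrapper, not Synapse's real _wrap.
    from twisted.internet import defer

    # Before: an inlineCallbacks generator. Exceptions propagate as Twisted
    # Failures, which can lose the original stack trace.
    def wrap_old(func):
        @defer.inlineCallbacks
        def wrapped(request, *args, **kwargs):
            result = yield func(request, *args, **kwargs)
            defer.returnValue(result)
        return wrapped

    # After: a native coroutine. This is safe as long as the caller already
    # awaits the result (as JsonResource._async_render does), and exceptions
    # keep their stack traces.
    def wrap_new(func):
        async def wrapped(request, *args, **kwargs):
            return await func(request, *args, **kwargs)
        return wrapped
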
Richard van der Hoff
df7747b821 Ignore redactions of m.room.create events (#5701) 2019-07-18 14:45:55 +01:00
Richard van der Hoff
583ec23c3c Improve Depends specs in debian package. (#5675)
This is basically a contrived way of adding a `Recommends` on `libpq5`, to fix #5653.

The way this is supposed to happen in debhelper is to run
`dh_shlibdeps`, which in turn runs `dpkg-shlibdeps`, which spits things out
into `debian/<package>.substvars` whence they can later be included by
`control`.

Previously, we had disabled `dh_shlibdeps`, mostly because `dpkg-shlibdeps`
gets confused about PIL's interdependent objects, but that's not really the
right thing to do and there is another way to work around that.

Since we don't always use postgres, we don't necessarily want a hard Depends on
libpq5, so I've actually ended up adding an explicit invocation of
`dpkg-shlibdeps` for `psycopg2`.

I've also updated the build-depends list for the package, which was missing a
couple of entries.
2019-07-18 14:45:55 +01:00
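
For illustration, the substvars plumbing works roughly like this (a hypothetical example; the real dependency line is whatever `dpkg-shlibdeps` computes at build time):

    # debian/<package>.substvars, as written by the explicit
    # `dpkg-shlibdeps -pshlibs1 -dRecommends` run over psycopg2's .so files
    # (the version bound here is invented):
    shlibs1:Recommends=libpq5 (>= 9.1)

    # ...which debian/control then interpolates via:
    Recommends:
     ${shlibs1:Recommends},
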
Richard van der Hoff
c9e8f2c755 More refactoring in get_events_as_list (#5707)
We can now use `_get_events_from_cache_or_db` rather than going right back to
the database, which means that (a) we can benefit from caching, and (b) it
opens the way forward to more extensive checks on the original event.

We now always require the original event to exist before we will serve up a
redaction.
2019-07-18 14:45:55 +01:00
Richard van der Hoff
24bbea7953 Fix redaction authentication (#5700)
Ensures that redactions are correctly authenticated for recent room versions.

There are a few things going on here:

 * `_fetch_event_rows` is updated to return a dict rather than a list of rows.

 * Rather than returning multiple copies of an event which was redacted
   multiple times, it returns the redactions as a list within the dict.

 * It also returns the actual rejection reason, rather than merely the fact
   that it was rejected, so that we don't have to query the table again in
   `_get_event_from_row`.

 * The redaction handling is factored out of `_get_event_from_row`, and now
   checks if any of the redactions are valid.
2019-07-18 14:45:55 +01:00
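
Illustratively, the reshaped `_fetch_event_rows` result is a map like the following (all values invented):

    # Hypothetical example of the returned dict, matching the bullets above:
    {
        "$ev1:example.com": {
            "event_id": "$ev1:example.com",
            "internal_metadata": "{}",    # json-encoded metadata dict
            "json": "{\"type\": \"m.room.message\"}",
            "format_version": 3,
            "rejected_reason": None,      # the actual reason when rejected
            # every event which (claims to) redact this one, as one list
            # rather than duplicated rows:
            "redactions": ["$redact1:example.com", "$redact2:example.com"],
        },
    }
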
Richard van der Hoff
f5a07e51f8 Refactor get_events_as_list (#5699)
A couple of changes here:

* get rid of a redundant `allow_rejected` condition - we should already have filtered out any rejected
  events before we get to that point in the code, and the redundancy is confusing. Instead, let's stick in
  an assertion just to make double-sure we aren't leaking rejected events by mistake.

* factor out a `_get_events_from_cache_or_db` method, which is going to be important for a 
  forthcoming fix to redactions.
2019-07-18 14:45:55 +01:00
Olivier Wilkinson (reivilibre)
94463bf5fb Disuse room_stats.state_events (relates to #5690)
since it is tricky to maintain and has no known use case (for now).

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-18 13:25:35 +01:00
Olivier Wilkinson (reivilibre)
8502c668bf Changelog for #5691
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-17 09:46:01 +01:00
Olivier Wilkinson (reivilibre)
dc68c2a101 Update state_events and current_state_events upon receipt of a state
event #5690.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-17 09:46:01 +01:00
Olivier Wilkinson (reivilibre)
181c1a6072 Don't decrease left_members if the user is joining for the first time.
Fixes #5423

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-17 09:46:01 +01:00
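
A runnable sketch of the fix (simplified from the stats handler changes below; `left_members_delta` is an illustrative helper, not Synapse code):

    LEAVE = "leave"

    def left_members_delta(prev_event_content, new_membership):
        """Adjustment to left_members when a member event arrives."""
        # Take None, not LEAVE, when there is no previous member event:
        # a first-time joiner must not decrement the leave count (#5423).
        prev_membership = None
        if prev_event_content is not None:
            prev_membership = prev_event_content.get("membership", LEAVE)

        delta = 0
        if prev_membership == LEAVE:
            delta -= 1  # the user is no longer counted as having left
        if new_membership == LEAVE:
            delta += 1
        return delta

    # Before the fix, a missing previous event was treated as LEAVE, so a
    # first join returned -1; now it returns 0.
    assert left_members_delta(None, "join") == 0
    assert left_members_delta({"membership": "leave"}, "join") == -1
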
Olivier Wilkinson (reivilibre)
20ae4afe7e Create room_stats rows for new rooms. #5624
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2019-07-17 09:46:01 +01:00
21 changed files with 849 additions and 475 deletions

1 changelog.d/5675.doc Normal file

@@ -0,0 +1 @@
Minor tweaks to postgres documentation.

1 changelog.d/5689.misc Normal file

@@ -0,0 +1 @@
Convert `synapse.federation.transport.server` to `async`. Might improve some stack traces.

1 changelog.d/5691.bugfix Normal file

@@ -0,0 +1 @@
Fix improper or missing room_stats updates when handling state events (deltas).

1 changelog.d/5699.bugfix Normal file

@@ -0,0 +1 @@
Fix some problems with authenticating redactions in recent room versions.

2 changelog.d/5700.bugfix Normal file

@@ -0,0 +1,2 @@
Fix some problems with authenticating redactions in recent room versions.

1 changelog.d/5701.bugfix Normal file

@@ -0,0 +1 @@
Ignore redactions of m.room.create events.

1 changelog.d/5707.bugfix Normal file

@@ -0,0 +1 @@
Fix some problems with authenticating redactions in recent room versions.

3 debian/changelog vendored

@@ -3,6 +3,9 @@ matrix-synapse-py3 (1.1.0-1) UNRELEASED; urgency=medium
[ Amber Brown ]
* Update logging config defaults to match API changes in Synapse.
[ Richard van der Hoff ]
* Add Recommends and Depends for some libraries which you probably want.
-- Erik Johnston <erikj@rae> Thu, 04 Jul 2019 13:59:02 +0100
matrix-synapse-py3 (1.1.0) stable; urgency=medium

7 debian/control vendored

@@ -2,16 +2,20 @@ Source: matrix-synapse-py3
Section: contrib/python
Priority: extra
Maintainer: Synapse Packaging team <packages@matrix.org>
# keep this list in sync with the build dependencies in docker/Dockerfile-dhvirtualenv.
Build-Depends:
debhelper (>= 9),
dh-systemd,
dh-virtualenv (>= 1.1),
libsystemd-dev,
libpq-dev,
lsb-release,
python3-dev,
python3,
python3-setuptools,
python3-pip,
python3-venv,
libsqlite3-dev,
tar,
Standards-Version: 3.9.8
Homepage: https://github.com/matrix-org/synapse
@@ -28,9 +32,12 @@ Depends:
debconf,
python3-distutils|libpython3-stdlib (<< 3.6),
${misc:Depends},
${shlibs:Depends},
${synapse:pydepends},
# some of our scripts use perl, but none of them are important,
# so we put perl:Depends in Suggests rather than Depends.
Recommends:
${shlibs1:Recommends},
Suggests:
sqlite3,
${perl:Depends},

14 debian/rules vendored

@@ -3,15 +3,29 @@
# Build Debian package using https://github.com/spotify/dh-virtualenv
#
# assume we only have one package
PACKAGE_NAME:=`dh_listpackages`
override_dh_systemd_enable:
dh_systemd_enable --name=matrix-synapse
override_dh_installinit:
dh_installinit --name=matrix-synapse
# we don't really want to strip the symbols from our object files.
override_dh_strip:
override_dh_shlibdeps:
# make the postgres package's dependencies a recommendation
# rather than a hard dependency.
find debian/$(PACKAGE_NAME)/ -path '*/site-packages/psycopg2/*.so' | \
xargs dpkg-shlibdeps -Tdebian/$(PACKAGE_NAME).substvars \
-pshlibs1 -dRecommends
# all the other dependencies can be normal 'Depends' requirements,
# except for PIL's, which is self-contained and which confuses
# dpkg-shlibdeps.
dh_shlibdeps -X site-packages/PIL/.libs -X site-packages/psycopg2
override_dh_virtualenv:
./debian/build_virtualenv

View File

@@ -43,6 +43,9 @@ RUN cd dh-virtualenv-1.1 && dpkg-buildpackage -us -uc -b
FROM ${distro}
# Install the build dependencies
#
# NB: keep this list in sync with the list of build-deps in debian/control
# TODO: it would be nice to do that automatically.
RUN apt-get update -qq -o Acquire::Languages=none \
&& env DEBIAN_FRONTEND=noninteractive apt-get install \
-yqq --no-install-recommends -o Dpkg::Options::=--force-unsafe-io \

View File

@@ -11,7 +11,9 @@ a postgres database.
* If you are using the `matrix.org debian/ubuntu
packages <../INSTALL.md#matrixorg-packages>`_,
the necessary libraries will already be installed.
the necessary python library will already be installed, but you will need to
ensure the low-level postgres library is installed, which you can do with
``apt install libpq5``.
* For other pre-built packages, please consult the documentation from the
relevant package.
@@ -34,7 +36,7 @@ Assuming your PostgreSQL database user is called ``postgres``, create a user
su - postgres
createuser --pwprompt synapse_user
Before you can authenticate with the ``synapse_user``, you must create a
database that it can access. To create a database, first connect to the database
with your database user::
@@ -53,7 +55,7 @@ and then run::
This would create an appropriate database named ``synapse`` owned by the
``synapse_user`` user (which must already have been created as above).
Note that the PostgreSQL database *must* have the correct encoding set (as
shown above), otherwise it will not be able to store UTF8 strings.
You may need to enable password authentication so ``synapse_user`` can connect

View File

@@ -606,21 +606,6 @@ class Auth(object):
defer.returnValue(auth_ids)
def check_redaction(self, room_version, event, auth_events):
"""Check whether the event sender is allowed to redact the target event.
Returns:
True if the sender is allowed to redact the target event if the
target event was created by them.
False if the sender is allowed to redact the target event with no
further checks.
Raises:
AuthError if the event sender is definitely not allowed to redact
the target event.
"""
return event_auth.check_redaction(room_version, event, auth_events)
@defer.inlineCallbacks
def check_can_change_room_list(self, room_id, user):
"""Check if the user is allowed to edit the room's entry in the

File diff suppressed because it is too large

View File

@@ -23,6 +23,7 @@ from canonicaljson import encode_canonical_json, json
from twisted.internet import defer
from twisted.internet.defer import succeed
from synapse import event_auth
from synapse.api.constants import EventTypes, Membership, RelationTypes
from synapse.api.errors import (
AuthError,
@@ -784,6 +785,20 @@ class EventCreationHandler(object):
event.signatures.update(returned_invite.signatures)
if event.type == EventTypes.Redaction:
original_event = yield self.store.get_event(
event.redacts,
check_redacted=False,
get_prev_content=False,
allow_rejected=False,
allow_none=True,
check_room_id=event.room_id,
)
# we can make some additional checks now if we have the original event.
if original_event:
if original_event.type == EventTypes.Create:
raise AuthError(403, "Redacting create events is not permitted")
prev_state_ids = yield context.get_prev_state_ids(self.store)
auth_events_ids = yield self.auth.compute_auth_events(
event, prev_state_ids, for_verification=True
@@ -791,18 +806,18 @@ class EventCreationHandler(object):
auth_events = yield self.store.get_events(auth_events_ids)
auth_events = {(e.type, e.state_key): e for e in auth_events.values()}
room_version = yield self.store.get_room_version(event.room_id)
if self.auth.check_redaction(room_version, event, auth_events=auth_events):
original_event = yield self.store.get_event(
event.redacts,
check_redacted=False,
get_prev_content=False,
allow_rejected=False,
allow_none=False,
)
if event_auth.check_redaction(room_version, event, auth_events=auth_events):
# this user doesn't have 'redact' rights, so we need to do some more
# checks on the original event. Let's start by checking the original
# event exists.
if not original_event:
raise NotFoundError("Could not find event %s" % (event.redacts,))
if event.user_id != original_event.user_id:
raise AuthError(403, "You don't have permission to redact events")
# We've already checked.
# all the checks are done.
event.internal_metadata.recheck_redaction = False
if event.type == EventTypes.Create:

View File

@@ -88,20 +88,42 @@ class StatsHandler(StateDeltasHandler):
if self.pos is None:
defer.returnValue(None)
# Loop round handling deltas until we're up to date
while True:
with Measure(self.clock, "stats_delta"):
deltas = yield self.store.get_current_state_deltas(self.pos)
max_pos = yield self.store.get_room_max_stream_ordering()
if max_pos == self.pos:
return
pos_of_delta = self.pos
with Measure(self.clock, "stats_delta"):
# Loop round handling deltas until we're up to date with deltas
while True:
# note that get_current_state_deltas isn't greedy:
# it returns at most a limited batch of deltas
deltas = yield self.store.get_current_state_deltas(pos_of_delta)
if not deltas:
return
break
logger.info("Handling %d state deltas", len(deltas))
yield self._handle_deltas(deltas)
self.pos = deltas[-1]["stream_id"]
yield self.store.update_stats_stream_pos(self.pos)
pos_of_delta = deltas[-1]["stream_id"]
event_processing_positions.labels("stats").set(self.pos)
if pos_of_delta > self.pos:
new_pos = max(pos_of_delta, max_pos)
else:
new_pos = max_pos
with Measure(self.clock, "stats_total_events"):
yield self.store.update_total_event_count_between(
old_pos=self.pos, new_pos=new_pos
)
self.pos = new_pos
yield self.store.update_stats_stream_pos(self.pos)
event_processing_positions.labels("stats").set(self.pos)
@defer.inlineCallbacks
def _handle_deltas(self, deltas):
@@ -148,26 +170,40 @@ class StatsHandler(StateDeltasHandler):
# quantise time to the nearest bucket
now = (now // 1000 // self.stats_bucket_size) * self.stats_bucket_size
if prev_event_id is None:
# this state event doesn't overwrite another,
# so it is a new effective/current state event
yield self.store.update_stats_delta(
now, "room", room_id, "current_state_events", +1
)
if typ == EventTypes.Member:
# we could use _get_key_change here but it's a bit inefficient
# given we're not testing for a specific result; might as well
# just grab the prev_membership and membership strings and
# compare them.
prev_event_content = {}
# We take None rather than leave as a previous membership
# in the absence of a previous event because we do not want to
# reduce the leave count when a new-to-the-room user joins.
prev_membership = None
if prev_event_id is not None:
prev_event = yield self.store.get_event(
prev_event_id, allow_none=True
)
if prev_event:
prev_event_content = prev_event.content
prev_membership = prev_event_content.get(
"membership", Membership.LEAVE
)
membership = event_content.get("membership", Membership.LEAVE)
prev_membership = prev_event_content.get("membership", Membership.LEAVE)
if prev_membership == membership:
continue
if prev_membership == Membership.JOIN:
if prev_membership is None:
logger.debug("No previous membership for this user.")
elif prev_membership == Membership.JOIN:
yield self.store.update_stats_delta(
now, "room", room_id, "joined_members", -1
)
@@ -246,6 +282,28 @@ class StatsHandler(StateDeltasHandler):
},
)
# Also add room stats with just the one state event
# (the room creation state event)
yield self.store.update_stats(
"room",
room_id,
now,
{
"bucket_size": self.stats_bucket_size,
# This m.room.create state event is indeed a state event
# so we count it as one.
"current_state_events": 1,
"joined_members": 0,
"invited_members": 0,
"left_members": 0,
"banned_members": 0,
"total_events": 0,
# This column is disused but not yet removed from the
# schema, so fill with -1.
"state_events": -1,
},
)
elif typ == EventTypes.JoinRules:
yield self.store.update_room_state(
room_id, {"join_rules": event_content.get("join_rule")}

View File

@@ -37,6 +37,7 @@ from synapse.logging.context import (
)
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.types import get_domain_from_id
from synapse.util import batch_iter
from synapse.util.metrics import Measure
from ._base import SQLBaseStore
@@ -218,9 +219,108 @@ class EventsWorkerStore(SQLBaseStore):
if not event_ids:
defer.returnValue([])
event_id_list = event_ids
event_ids = set(event_ids)
# there may be duplicates so we cast the list to a set
event_entry_map = yield self._get_events_from_cache_or_db(
set(event_ids), allow_rejected=allow_rejected
)
events = []
for event_id in event_ids:
entry = event_entry_map.get(event_id, None)
if not entry:
continue
if not allow_rejected:
assert not entry.event.rejected_reason, (
"rejected event returned from _get_events_from_cache_or_db despite "
"allow_rejected=False"
)
# We may not have had the original event when we received a redaction, so
# we have to recheck auth now.
if not allow_rejected and entry.event.type == EventTypes.Redaction:
redacted_event_id = entry.event.redacts
event_map = yield self._get_events_from_cache_or_db([redacted_event_id])
original_event_entry = event_map.get(redacted_event_id)
if not original_event_entry:
# we don't have the redacted event (or it was rejected).
#
# We assume that the redaction isn't authorized for now; if the
# redacted event later turns up, the redaction will be re-checked,
# and if it is found valid, the original will get redacted before it
# is served to the client.
logger.debug(
"Withholding redaction event %s since we don't (yet) have the "
"original %s",
event_id,
redacted_event_id,
)
continue
original_event = original_event_entry.event
if original_event.type == EventTypes.Create:
# we never serve redactions of Creates to clients.
logger.info(
"Withholding redaction %s of create event %s",
event_id,
redacted_event_id,
)
continue
if entry.event.internal_metadata.need_to_check_redaction():
original_domain = get_domain_from_id(original_event.sender)
redaction_domain = get_domain_from_id(entry.event.sender)
if original_domain != redaction_domain:
# the senders don't match, so this is forbidden
logger.info(
"Withholding redaction %s whose sender domain %s doesn't "
"match that of redacted event %s %s",
event_id,
redaction_domain,
redacted_event_id,
original_domain,
)
continue
# Update the cache to save doing the checks again.
entry.event.internal_metadata.recheck_redaction = False
if check_redacted and entry.redacted_event:
event = entry.redacted_event
else:
event = entry.event
events.append(event)
if get_prev_content:
if "replaces_state" in event.unsigned:
prev = yield self.get_event(
event.unsigned["replaces_state"],
get_prev_content=False,
allow_none=True,
)
if prev:
event.unsigned = dict(event.unsigned)
event.unsigned["prev_content"] = prev.content
event.unsigned["prev_sender"] = prev.sender
defer.returnValue(events)
@defer.inlineCallbacks
def _get_events_from_cache_or_db(self, event_ids, allow_rejected=False):
"""Fetch a bunch of events from the cache or the database.
If events are pulled from the database, they will be cached for future lookups.
Args:
event_ids (Iterable[str]): The event_ids of the events to fetch
allow_rejected (bool): Whether to include rejected events
Returns:
Deferred[Dict[str, _EventCacheEntry]]:
map from event id to result
"""
event_entry_map = self._get_events_from_cache(
event_ids, allow_rejected=allow_rejected
)
@@ -243,81 +343,7 @@ class EventsWorkerStore(SQLBaseStore):
event_entry_map.update(missing_events)
events = []
for event_id in event_id_list:
entry = event_entry_map.get(event_id, None)
if not entry:
continue
# Starting in room version v3, some redactions need to be rechecked if we
# didn't have the redacted event at the time, so we recheck on read
# instead.
if not allow_rejected and entry.event.type == EventTypes.Redaction:
if entry.event.internal_metadata.need_to_check_redaction():
# XXX: we need to avoid calling get_event here.
#
# The problem is that we end up at this point when an event
# which has been redacted is pulled out of the database by
# _enqueue_events, because _enqueue_events needs to check
# the redaction before it can cache the redacted event. So
# obviously, calling get_event to get the redacted event out
# of the database gives us an infinite loop.
#
# For now (quick hack to fix during 0.99 release cycle), we
# just go and fetch the relevant row from the db, but it
# would be nice to think about how we can cache this rather
# than hit the db every time we access a redaction event.
#
# One thought on how to do this:
# 1. split get_events_as_list up so that it is divided into
# (a) get the rawish event from the db/cache, (b) do the
# redaction/rejection filtering
# 2. have _get_event_from_row just call the first half of
# that
orig_sender = yield self._simple_select_one_onecol(
table="events",
keyvalues={"event_id": entry.event.redacts},
retcol="sender",
allow_none=True,
)
expected_domain = get_domain_from_id(entry.event.sender)
if (
orig_sender
and get_domain_from_id(orig_sender) == expected_domain
):
# This redaction event is allowed. Mark as not needing a
# recheck.
entry.event.internal_metadata.recheck_redaction = False
else:
# We don't have the event that is being redacted, so we
# assume that the event isn't authorized for now. (If we
# later receive the event, then we will always redact
# it anyway, since we have this redaction)
continue
if allow_rejected or not entry.event.rejected_reason:
if check_redacted and entry.redacted_event:
event = entry.redacted_event
else:
event = entry.event
events.append(event)
if get_prev_content:
if "replaces_state" in event.unsigned:
prev = yield self.get_event(
event.unsigned["replaces_state"],
get_prev_content=False,
allow_none=True,
)
if prev:
event.unsigned = dict(event.unsigned)
event.unsigned["prev_content"] = prev.content
event.unsigned["prev_sender"] = prev.sender
defer.returnValue(events)
return event_entry_map
def _invalidate_get_event_cache(self, event_id):
self._get_event_cache.invalidate((event_id,))
@@ -326,7 +352,7 @@ class EventsWorkerStore(SQLBaseStore):
"""Fetch events from the caches
Args:
events (list(str)): list of event_ids to fetch
events (Iterable[str]): list of event_ids to fetch
allow_rejected (bool): Whether to return events that were rejected
update_metrics (bool): Whether to update the cache hit ratio metrics
@@ -384,19 +410,16 @@ class EventsWorkerStore(SQLBaseStore):
The fetch requests. Each entry consists of a list of event
ids to be fetched, and a deferred to be completed once the
events have been fetched.
"""
with Measure(self._clock, "_fetch_event_list"):
try:
event_id_lists = list(zip(*event_list))[0]
event_ids = [item for sublist in event_id_lists for item in sublist]
rows = self._new_transaction(
row_dict = self._new_transaction(
conn, "do_fetch", [], [], self._fetch_event_rows, event_ids
)
row_dict = {r["event_id"]: r for r in rows}
# We only want to resolve deferreds from the main thread
def fire(lst, res):
for ids, d in lst:
@@ -454,7 +477,7 @@ class EventsWorkerStore(SQLBaseStore):
logger.debug("Loaded %d events (%d rows)", len(events), len(rows))
if not allow_rejected:
rows[:] = [r for r in rows if not r["rejects"]]
rows[:] = [r for r in rows if r["rejected_reason"] is None]
res = yield make_deferred_yieldable(
defer.gatherResults(
@@ -463,8 +486,8 @@ class EventsWorkerStore(SQLBaseStore):
self._get_event_from_row,
row["internal_metadata"],
row["json"],
row["redacts"],
rejected_reason=row["rejects"],
row["redactions"],
rejected_reason=row["rejected_reason"],
format_version=row["format_version"],
)
for row in rows
@@ -475,49 +498,98 @@ class EventsWorkerStore(SQLBaseStore):
defer.returnValue({e.event.event_id: e for e in res if e})
def _fetch_event_rows(self, txn, events):
rows = []
N = 200
for i in range(1 + len(events) // N):
evs = events[i * N : (i + 1) * N]
if not evs:
break
def _fetch_event_rows(self, txn, event_ids):
"""Fetch event rows from the database
Events which are not found are omitted from the result.
The returned per-event dicts contain the following keys:
* event_id (str)
* json (str): json-encoded event structure
* internal_metadata (str): json-encoded internal metadata dict
* format_version (int|None): The format of the event. Hopefully one
of EventFormatVersions. 'None' means the event predates
EventFormatVersions (so the event is format V1).
* rejected_reason (str|None): if the event was rejected, the reason
why.
* redactions (List[str]): a list of event-ids which (claim to) redact
this event.
Args:
txn (twisted.enterprise.adbapi.Connection):
event_ids (Iterable[str]): event IDs to fetch
Returns:
Dict[str, Dict]: a map from event id to event info.
"""
event_dict = {}
for evs in batch_iter(event_ids, 200):
sql = (
"SELECT "
" e.event_id as event_id, "
" e.event_id, "
" e.internal_metadata,"
" e.json,"
" e.format_version, "
" r.redacts as redacts,"
" rej.event_id as rejects "
" rej.reason "
" FROM event_json as e"
" LEFT JOIN rejections as rej USING (event_id)"
" LEFT JOIN redactions as r ON e.event_id = r.redacts"
" WHERE e.event_id IN (%s)"
) % (",".join(["?"] * len(evs)),)
txn.execute(sql, evs)
rows.extend(self.cursor_to_dict(txn))
return rows
for row in txn:
event_id = row[0]
event_dict[event_id] = {
"event_id": event_id,
"internal_metadata": row[1],
"json": row[2],
"format_version": row[3],
"rejected_reason": row[4],
"redactions": [],
}
# check for redactions
redactions_sql = (
"SELECT event_id, redacts FROM redactions WHERE redacts IN (%s)"
) % (",".join(["?"] * len(evs)),)
txn.execute(redactions_sql, evs)
for (redacter, redacted) in txn:
d = event_dict.get(redacted)
if d:
d["redactions"].append(redacter)
return event_dict
@defer.inlineCallbacks
def _get_event_from_row(
self, internal_metadata, js, redacted, format_version, rejected_reason=None
self, internal_metadata, js, redactions, format_version, rejected_reason=None
):
"""Parse an event row which has been read from the database
Args:
internal_metadata (str): json-encoded internal_metadata column
js (str): json-encoded event body from event_json
redactions (list[str]): a list of the events which claim to have redacted
this event, from the redactions table
format_version (str): the 'format_version' column
rejected_reason (str|None): the reason this event was rejected, if any
Returns:
_EventCacheEntry
"""
with Measure(self._clock, "_get_event_from_row"):
d = json.loads(js)
internal_metadata = json.loads(internal_metadata)
if rejected_reason:
rejected_reason = yield self._simple_select_one_onecol(
table="rejections",
keyvalues={"event_id": rejected_reason},
retcol="reason",
desc="_get_event_from_row_rejected_reason",
)
if format_version is None:
# This means that we stored the event before we had the concept
# of a event format version, so it must be a V1 event.
@@ -529,41 +601,7 @@ class EventsWorkerStore(SQLBaseStore):
rejected_reason=rejected_reason,
)
redacted_event = None
if redacted:
redacted_event = prune_event(original_ev)
redaction_id = yield self._simple_select_one_onecol(
table="redactions",
keyvalues={"redacts": redacted_event.event_id},
retcol="event_id",
desc="_get_event_from_row_redactions",
)
redacted_event.unsigned["redacted_by"] = redaction_id
# Get the redaction event.
because = yield self.get_event(
redaction_id, check_redacted=False, allow_none=True
)
if because:
# It's fine to do add the event directly, since get_pdu_json
# will serialise this field correctly
redacted_event.unsigned["redacted_because"] = because
# Starting in room version v3, some redactions need to be
# rechecked if we didn't have the redacted event at the
# time, so we recheck on read instead.
if because.internal_metadata.need_to_check_redaction():
expected_domain = get_domain_from_id(original_ev.sender)
if get_domain_from_id(because.sender) == expected_domain:
# This redaction event is allowed. Mark as not needing a
# recheck.
because.internal_metadata.recheck_redaction = False
else:
# Senders don't match, so the event isn't actually redacted
redacted_event = None
redacted_event = yield self._maybe_redact_event_row(original_ev, redactions)
cache_entry = _EventCacheEntry(
event=original_ev, redacted_event=redacted_event
@@ -573,6 +611,60 @@ class EventsWorkerStore(SQLBaseStore):
defer.returnValue(cache_entry)
@defer.inlineCallbacks
def _maybe_redact_event_row(self, original_ev, redactions):
"""Given an event object and a list of possible redacting event ids,
determine whether to honour any of those redactions and if so return a redacted
event.
Args:
original_ev (EventBase):
redactions (iterable[str]): list of event ids of potential redaction events
Returns:
Deferred[EventBase|None]: if the event should be redacted, a pruned
event object. Otherwise, None.
"""
if original_ev.type == "m.room.create":
# we choose to ignore redactions of m.room.create events.
return None
redaction_map = yield self._get_events_from_cache_or_db(redactions)
for redaction_id in redactions:
redaction_entry = redaction_map.get(redaction_id)
if not redaction_entry:
# we don't have the redaction event, or the redaction event was not
# authorized.
continue
redaction_event = redaction_entry.event
# Starting in room version v3, some redactions need to be
# rechecked if we didn't have the redacted event at the
# time, so we recheck on read instead.
if redaction_event.internal_metadata.need_to_check_redaction():
expected_domain = get_domain_from_id(original_ev.sender)
if get_domain_from_id(redaction_event.sender) == expected_domain:
# This redaction event is allowed. Mark as not needing a recheck.
redaction_event.internal_metadata.recheck_redaction = False
else:
# Senders don't match, so the event isn't actually redacted
continue
# we found a good redaction event. Redact!
redacted_event = prune_event(original_ev)
redacted_event.unsigned["redacted_by"] = redaction_id
# It's fine to add the event directly, since get_pdu_json
# will serialise this field correctly
redacted_event.unsigned["redacted_because"] = redaction_event
return redacted_event
# no valid redaction found for this event
return None
@defer.inlineCallbacks
def have_events_in_timeline(self, event_ids):
"""Given a list of event ids, check if we have already processed and

View File

@@ -0,0 +1,17 @@
/* Copyright 2019 The Matrix.org Foundation C.I.C.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-- the total number of events stored for this room at time `ts`
ALTER TABLE room_stats ADD COLUMN total_events INT;

View File

@@ -97,7 +97,7 @@ class StatsStore(StateDeltasStore):
sql = (
"CREATE TABLE IF NOT EXISTS "
+ TEMP_TABLE
+ "_position(position TEXT NOT NULL)"
+ "_position(position BIGINT NOT NULL)"
)
txn.execute(sql)
@@ -258,9 +258,7 @@ class StatsStore(StateDeltasStore):
membership_counts = self._get_user_counts_in_room_txn(txn, room_id)
total_state_events = self._get_total_state_event_counts_txn(
txn, room_id
)
room_total_event_count = self._count_events_in_room_txn(txn, room_id, current_token)
self._update_stats_txn(
txn,
@@ -274,7 +272,10 @@ class StatsStore(StateDeltasStore):
"invited_members": membership_counts.get(Membership.INVITE, 0),
"left_members": membership_counts.get(Membership.LEAVE, 0),
"banned_members": membership_counts.get(Membership.BAN, 0),
"state_events": total_state_events,
"total_events": room_total_event_count,
# this column is disused but not (yet) removed from the
# schema, so we fill it with -1.
"state_events": -1,
},
)
self._simple_insert_txn(
@@ -427,56 +428,99 @@ class StatsStore(StateDeltasStore):
)
def update_stats_delta(self, ts, stats_type, stats_id, field, value):
def _update_stats_delta(txn):
table, id_col = TYPE_TO_ROOM[stats_type]
return self.runInteraction("update_stats_delta", self._update_stats_delta_txn, ts, stats_type, stats_id, field, value)
sql = (
"SELECT * FROM %s"
" WHERE %s=? and ts=("
" SELECT MAX(ts) FROM %s"
" WHERE %s=?"
")"
) % (table, id_col, table, id_col)
txn.execute(sql, (stats_id, stats_id))
rows = self.cursor_to_dict(txn)
if len(rows) == 0:
# silently skip as we don't have anything to apply a delta to yet.
# this tries to minimise any race between the initial sync and
# subsequent deltas arriving.
return
def _update_stats_delta_txn(self, txn, ts, stats_type, stats_id, field, value):
table, id_col = TYPE_TO_ROOM[stats_type]
current_ts = ts
latest_ts = rows[0]["ts"]
if current_ts < latest_ts:
# This one is in the past, but we're just encountering it now.
# Mark it as part of the current bucket.
current_ts = latest_ts
elif ts != latest_ts:
# we have to copy our absolute counters over to the new entry.
values = {
key: rows[0][key] for key in ABSOLUTE_STATS_FIELDS[stats_type]
}
values[id_col] = stats_id
values["ts"] = ts
values["bucket_size"] = self.stats_bucket_size
sql = (
"SELECT * FROM %s"
" WHERE %s=? and ts=("
" SELECT MAX(ts) FROM %s"
" WHERE %s=?"
")"
) % (table, id_col, table, id_col)
txn.execute(sql, (stats_id, stats_id))
rows = self.cursor_to_dict(txn)
if len(rows) == 0:
# silently skip as we don't have anything to apply a delta to yet.
# this tries to minimise any race between the initial sync and
# subsequent deltas arriving.
return
self._simple_insert_txn(txn, table=table, values=values)
current_ts = ts
latest_ts = rows[0]["ts"]
if current_ts < latest_ts:
# This one is in the past, but we're just encountering it now.
# Mark it as part of the current bucket.
current_ts = latest_ts
elif ts != latest_ts:
# we have to copy our absolute counters over to the new entry.
values = {
key: rows[0][key] for key in ABSOLUTE_STATS_FIELDS[stats_type]
}
values[id_col] = stats_id
values["ts"] = ts
values["bucket_size"] = self.stats_bucket_size
# actually update the new value
if field in ABSOLUTE_STATS_FIELDS[stats_type]:
self._simple_update_txn(
txn,
table=table,
keyvalues={id_col: stats_id, "ts": current_ts},
updatevalues={field: value},
)
else:
sql = ("UPDATE %s SET %s=%s+? WHERE %s=? AND ts=?") % (
table,
field,
field,
id_col,
)
txn.execute(sql, (value, stats_id, current_ts))
self._simple_insert_txn(txn, table=table, values=values)
return self.runInteraction("update_stats_delta", _update_stats_delta)
# actually update the new value
if field in ABSOLUTE_STATS_FIELDS[stats_type]:
self._simple_update_txn(
txn,
table=table,
keyvalues={id_col: stats_id, "ts": current_ts},
updatevalues={field: value},
)
else:
sql = ("UPDATE %s SET %s=%s+? WHERE %s=? AND ts=?") % (
table,
field,
field,
id_col,
)
txn.execute(sql, (value, stats_id, current_ts))
def update_total_event_count_between(self, old_pos, new_pos):
"""
Updates the total_events counts for rooms
Args:
old_pos: The old stream position (stream position of the last event
that was already handled).
new_pos: The new stream position (stream position of the new last
event to handle).
"""
# TODO pass in now or calc here?
now = self.hs.get_reactor().seconds()
# quantise time to the nearest bucket
now = (now // self.stats_bucket_size) * self.stats_bucket_size
def _update_total_event_count_between(txn):
sql = """
SELECT room_id, COUNT(*) AS new_events
FROM events
WHERE ? < stream_ordering AND stream_ordering <= ?
GROUP BY room_id
"""
txn.execute(sql, (old_pos, new_pos))
for room_id, new_events in txn:
self._update_stats_delta_txn(txn, now, "room", room_id, "total_events", new_events)
return self.runInteraction("update_total_event_count_between", _update_total_event_count_between)
def _count_events_in_room_txn(self, txn, room_id, up_to_token):
sql = """
SELECT COUNT(*) AS num_events
FROM events
WHERE room_id = ? AND stream_ordering <= ?
"""
txn.execute(sql, (room_id, up_to_token))
return txn.fetchone()[0]

View File

@@ -0,0 +1,179 @@
# -*- coding: utf-8 -*-
# Copyright 2019 The Matrix.org Foundation C.I.C.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from synapse.rest import admin
from synapse.rest.client.v1 import login, room
from synapse.rest.client.v2_alpha import sync
from tests.unittest import HomeserverTestCase
class RedactionsTestCase(HomeserverTestCase):
"""Tests that various redaction events are handled correctly"""
servlets = [
admin.register_servlets,
room.register_servlets,
login.register_servlets,
sync.register_servlets,
]
def prepare(self, reactor, clock, hs):
# register a couple of users
self.mod_user_id = self.register_user("user1", "pass")
self.mod_access_token = self.login("user1", "pass")
self.other_user_id = self.register_user("otheruser", "pass")
self.other_access_token = self.login("otheruser", "pass")
# Create a room
self.room_id = self.helper.create_room_as(
self.mod_user_id, tok=self.mod_access_token
)
# Invite the other user
self.helper.invite(
room=self.room_id,
src=self.mod_user_id,
tok=self.mod_access_token,
targ=self.other_user_id,
)
# The other user joins
self.helper.join(
room=self.room_id, user=self.other_user_id, tok=self.other_access_token
)
def _redact_event(self, access_token, room_id, event_id, expect_code=200):
"""Helper function to send a redaction event.
Returns the json body.
"""
path = "/_matrix/client/r0/rooms/%s/redact/%s" % (room_id, event_id)
request, channel = self.make_request(
"POST", path, content={}, access_token=access_token
)
self.render(request)
self.assertEqual(int(channel.result["code"]), expect_code)
return channel.json_body
def _sync_room_timeline(self, access_token, room_id):
request, channel = self.make_request(
"GET", "sync", access_token=self.mod_access_token
)
self.render(request)
self.assertEqual(channel.result["code"], b"200")
room_sync = channel.json_body["rooms"]["join"][room_id]
return room_sync["timeline"]["events"]
def test_redact_event_as_moderator(self):
# as a regular user, send a message to redact
b = self.helper.send(room_id=self.room_id, tok=self.other_access_token)
msg_id = b["event_id"]
# as the moderator, send a redaction
b = self._redact_event(self.mod_access_token, self.room_id, msg_id)
redaction_id = b["event_id"]
# now sync
timeline = self._sync_room_timeline(self.mod_access_token, self.room_id)
# the last event should be the redaction
self.assertEqual(timeline[-1]["event_id"], redaction_id)
self.assertEqual(timeline[-1]["redacts"], msg_id)
# and the penultimate should be the redacted original
self.assertEqual(timeline[-2]["event_id"], msg_id)
self.assertEqual(timeline[-2]["unsigned"]["redacted_by"], redaction_id)
self.assertEqual(timeline[-2]["content"], {})
def test_redact_event_as_normal(self):
# as a regular user, send a message to redact
b = self.helper.send(room_id=self.room_id, tok=self.other_access_token)
normal_msg_id = b["event_id"]
# also send one as the admin
b = self.helper.send(room_id=self.room_id, tok=self.mod_access_token)
admin_msg_id = b["event_id"]
# as a normal, try to redact the admin's event
self._redact_event(
self.other_access_token, self.room_id, admin_msg_id, expect_code=403
)
# now try to redact our own event
b = self._redact_event(self.other_access_token, self.room_id, normal_msg_id)
redaction_id = b["event_id"]
# now sync
timeline = self._sync_room_timeline(self.other_access_token, self.room_id)
# the last event should be the redaction of the normal event
self.assertEqual(timeline[-1]["event_id"], redaction_id)
self.assertEqual(timeline[-1]["redacts"], normal_msg_id)
# the penultimate should be the unredacted one from the admin
self.assertEqual(timeline[-2]["event_id"], admin_msg_id)
self.assertNotIn("redacted_by", timeline[-2]["unsigned"])
self.assertTrue(timeline[-2]["content"]["body"], {})
# and the antepenultimate should be the redacted normal
self.assertEqual(timeline[-3]["event_id"], normal_msg_id)
self.assertEqual(timeline[-3]["unsigned"]["redacted_by"], redaction_id)
self.assertEqual(timeline[-3]["content"], {})
def test_redact_nonexistent_event(self):
# control case: an existing event
b = self.helper.send(room_id=self.room_id, tok=self.other_access_token)
msg_id = b["event_id"]
b = self._redact_event(self.other_access_token, self.room_id, msg_id)
redaction_id = b["event_id"]
# room moderators can send redactions for non-existent events
self._redact_event(self.mod_access_token, self.room_id, "$zzz")
# ... but normals cannot
self._redact_event(
self.other_access_token, self.room_id, "$zzz", expect_code=404
)
# when we sync, we should see only the valid redaction
timeline = self._sync_room_timeline(self.other_access_token, self.room_id)
self.assertEqual(timeline[-1]["event_id"], redaction_id)
self.assertEqual(timeline[-1]["redacts"], msg_id)
# and the penultimate should be the redacted original
self.assertEqual(timeline[-2]["event_id"], msg_id)
self.assertEqual(timeline[-2]["unsigned"]["redacted_by"], redaction_id)
self.assertEqual(timeline[-2]["content"], {})
def test_redact_create_event(self):
# control case: an existing event
b = self.helper.send(room_id=self.room_id, tok=self.mod_access_token)
msg_id = b["event_id"]
self._redact_event(self.mod_access_token, self.room_id, msg_id)
# sync the room, to get the id of the create event
timeline = self._sync_room_timeline(self.other_access_token, self.room_id)
create_event_id = timeline[0]["event_id"]
# room moderators cannot send redactions for create events
self._redact_event(
self.mod_access_token, self.room_id, create_event_id, expect_code=403
)
# and nor can normals
self._redact_event(
self.other_access_token, self.room_id, create_event_id, expect_code=403
)

View File

@@ -447,6 +447,7 @@ class HomeserverTestCase(TestCase):
# Create the user
request, channel = self.make_request("GET", "/_matrix/client/r0/admin/register")
self.render(request)
self.assertEqual(channel.code, 200)
nonce = channel.json_body["nonce"]
want_mac = hmac.new(key=b"shared", digestmod=hashlib.sha1)
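
The helper goes on to compute the shared-secret registration MAC from that nonce; roughly like this (a sketch of the scheme, not the verbatim helper):

    import hashlib
    import hmac

    def registration_mac(shared_secret, nonce, username, password, admin=False):
        # HMAC-SHA1 over the NUL-separated nonce, username, password and
        # admin flag, keyed with the server's registration shared secret.
        mac = hmac.new(key=shared_secret, digestmod=hashlib.sha1)
        mac.update(
            b"\x00".join(
                [
                    nonce.encode("ascii"),
                    username.encode("utf8"),
                    password.encode("utf8"),
                    b"admin" if admin else b"notadmin",
                ]
            )
        )
        return mac.hexdigest()

    # The hexdigest is then posted back to /_matrix/client/r0/admin/register
    # together with the nonce, username and password.
    mac = registration_mac(b"shared", "thenonce", "user", "pass", admin=True)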