Fixes https://github.com/element-hq/synapse/issues/19175
This PR moves tracking of what lazy loaded membership we've sent to each
room out of the required state table. This avoids that table from
continuously growing, which massively helps performance as we pull out
all matching rows for the connection when we receive a request.
The new table is only read when we have data in a room to send, so we
end up reading a lot fewer rows from the DB. Though we now read from
that table for every room we have events to return in, rather than once
at the start of the request.
For an explanation of how the new table works, see the
[comment](https://github.com/element-hq/synapse/blob/erikj/sss_better_membership_storage2/synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql#L15-L38)
on the table schema.
The table is designed so that we can later prune old entries if we wish,
but that is not implemented in this PR.
Reviewable commit-by-commit.
---------
Co-authored-by: Eric Eastwood <erice@element.io>
Introduce `Clock.call_when_running(...)` to wrap startup code in a
logcontext, ensuring we can identify which server generated the logs.
Background:
> Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
> logcontext as we want to know which server the logs came from. In practice, this is not
> always the case yet especially outside of request handling.
>
> Global things outside of Synapse (e.g. Twisted reactor code) should run in the
> `sentinel` logcontext. It's only when it calls into application code that a logcontext
> gets activated. This means the reactor should be started in the `sentinel` logcontext,
> and any time an awaitable yields control back to the reactor, it should reset the
> logcontext to be the `sentinel` logcontext. This is important to avoid leaking the
> current logcontext to the reactor (which would then get picked up and associated with
> the next thing the reactor does).
>
> *-- `docs/log_contexts.md`
Also adds a lint to prefer `Clock.call_when_running(...)` over
`reactor.callWhenRunning(...)`
Part of https://github.com/element-hq/synapse/issues/18905
Fixes https://github.com/element-hq/synapse/issues/17698
This handles `required_state` changes by checking if new state has been
added to the config, and if so fetching and returning that from the
current state.
This also takes care to ensure that given a state entry S that is added,
removed and then re-added that we do *not* send S down a second time if
there have been no changes to S in the current state. This is fine for
Rust SDK (as it just remembers all state), but we might decide not to do
this behaviour in the MSC. If we decide to always send down S then its
easy enough to rip out all the code.
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
I thought ruff check would also format, but it doesn't.
This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
Previously, we just had very basic partial room exclusion based on
whether we were lazy-loading room members. Now with this PR, we added
`must_await_full_state(...)` with rules to check if we have a we're only
requesting `required_state` which is completely satisfied even with
partial state.
Partially-stated rooms should have all state events except for remote
membership events so if we require a remote membership event anywhere,
then we need to return `True`.
This triggers the client to start a new sliding sync connection. If we
don't do this and the client asks for the full range of rooms, we end up
sending down all rooms and their state from scratch (which can be very
slow)
This causes things like
https://github.com/element-hq/element-x-ios/issues/3115 after we restart
the server
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>