Commit Graph

9102 Commits

Author SHA1 Message Date
Andrew Morgan
8da8d4b4f5
Remove explicit python 3.8/9 skips (#19177)
Co-authored-by: Devon Hudson <devon.dmytro@gmail.com>
2025-11-14 11:38:39 +00:00
Eric Eastwood
408a05ebbc
Fix potential lost logcontext when PerDestinationQueue.shutdown(...) (#19178)
Spawning from looking at the logs in
https://github.com/element-hq/synapse/issues/19165#issuecomment-3527452941
which mention the `federation_transaction_transmission_loop`. I don't
think it's the source of the lost logcontext that person in the issue is
experiencing because this only applies when you try to `shutdown` the
homeserver.

Problem code introduced in
https://github.com/element-hq/synapse/pull/18828

To explain the fix, see the [*Deferred
callbacks*](3b59ac3b69/docs/log_contexts.md (deferred-callbacks))
section of our logcontext docs for more info (specifically using
solution 2).
2025-11-13 15:17:15 -06:00
Devon Hudson
5d545d1626
Remove support for PostgreSQL 13 (#19170)
This PR removes support for PostgreSQL 13 as it is deprecated
(tomorrow).
Uses https://github.com/element-hq/synapse/pull/18034 as a reference of
where to look, and also found a few other places that needed updating.
I didn't see anywhere in Complement that needs updating.
There is a companion Sytest PR deprecating psql13 over there:
https://github.com/matrix-org/sytest/pull/1418

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-11-13 19:38:59 +00:00
Andrew Ferrazzutti
9e23cded8f
MSC4140: Remove auth from delayed event management endpoints (#19152)
As per recent proposals in MSC4140, remove authentication for
restarting/cancelling/sending a delayed event, and give each of those
actions its own endpoint. (The original consolidated endpoint is still
supported for backwards compatibility.)

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Half-Shot <will@half-shot.uk>
2025-11-13 18:56:17 +00:00
Eric Eastwood
4494cc0694
Point out which event caused the exception when checking MSC4293 redactions (#19169)
Spawning from looking at the stack trace in
https://github.com/element-hq/synapse/issues/19128 which has no useful
information on how to dig in deeper.
2025-11-13 12:08:22 -06:00
Eric Eastwood
47d24bd234
Add debug logs to track Clock callbacks (#19173)
Spawning from wanting to find the source of a `Clock.call_later()`
callback, https://github.com/element-hq/synapse/issues/19165
2025-11-13 12:07:23 -06:00
Eric Eastwood
b9dda0ff22
Restore printing sentinel for log_record.request (#19172)
This was unintentionally changed in
https://github.com/element-hq/synapse/pull/19068.

There is no real bug here. Without this PR, we just printed an empty
string for the `sentinel` logcontext whereas the prior art behavior was
to print `sentinel` which this PR restores.

Found while staring at the logs in
https://github.com/element-hq/synapse/issues/19165


### Reproduction strategy

1. Configure Synapse with
[logging](df802882bb/docs/sample_log_config.yaml)
1. Start Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Notice the `asyncio - 64 - DEBUG - - Using selector: EpollSelector`
log line (notice empty string `- -`)
1. With this PR, the log line will be `asyncio - 64 - DEBUG - sentinel -
Using selector: EpollSelector` (notice `sentinel`)
2025-11-13 09:57:56 -06:00
reivilibre
938c97416d
Add a shortcut return when there are no events to purge. (#19093)
Fixes: #13417

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-11-13 14:26:37 +00:00
Jason Volk
e67ba69f20
Provide same servers list in s2s alias results as c2s. (#18970)
Signed-off-by: Jason Volk <jason@zemos.net>
Co-authored-by: dasha_uwu <dasha@linuxping.win>
2025-11-13 11:12:03 +00:00
Erik Johnston
df802882bb
Further reduce cardinality of metrics on event persister (#19168)
Follow on from #19133 to only track a subset of event types.
2025-11-12 16:40:38 +00:00
Andrew Ferrazzutti
97cc05d1d8
Bump lower bounds of unit test exclusive dependencies for Python 3.10 support (#19167)
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2025-11-12 16:37:14 +00:00
Erik Johnston
3ba3c7fe7d
Reduce cardinality of metrics on event persister (#19133)
This reduces the size of metrics by ~80%. Responding with the metrics
takes significant amounts of time.
2025-11-12 13:41:58 +00:00
Andrew Morgan
9722e05479
Update pyproject.toml to be compatible with other standard Python packaging tools (#19137) 2025-11-12 12:37:42 +00:00
Andrew Morgan
2c91896070
Run trial tests on Python 3.14 in PRs (#19135) 2025-11-12 12:02:50 +00:00
Eric Eastwood
8fa7d4a5a3
Ignore Python language refactors (.git-blame-ignore-revs) (#19150)
Ignore Python language refactors (`.git-blame-ignore-revs`)

 - https://github.com/element-hq/synapse/pull/19046
 - https://github.com/element-hq/synapse/pull/19111

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-11-10 22:34:30 +00:00
V02460
dc7f01f334
register_new_matrix_user: Support multiple config files (#18784)
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
2025-11-10 16:52:57 +00:00
reivilibre
a50923b6bf
Improve documentation around streams, particularly ID generators and adding new streams. (#18943)
This arises mostly from my recent experience adding a stream for Thread
Subscriptions
and trying to help others add their own streams.

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-11-10 13:07:22 +00:00
Andrew Ferrazzutti
8580ab60c9
Add delayed_events table to boolean column port (#19155)
The `delayed_events` table has a boolean column that should be handled
by the SQLite->PostgreSQL migration script.
2025-11-10 12:17:42 +00:00
Andrew Ferrazzutti
fcac7e0282
Write union types as X | Y where possible (#19111)
aka PEP 604, added in Python 3.10
2025-11-06 14:02:33 -06:00
Erik Johnston
6790312831
Fixup logcontexts after replication PR. (#19146)
Fixes logcontext leaks introduced in #19138.
2025-11-05 15:38:14 +00:00
Erik Johnston
d3ffd04f66
Fix spelling (#19145)
Fixes up #19138
2025-11-05 14:00:59 +00:00
Erik Johnston
4906771da1
Faster redis replication handling (#19138)
Spawning a background process comes with a bunch of overhead, so let's
try to reduce the number of background processes we need to spawn when
handling inbound fed.

Currently, we seem to be doing roughly one per command. Instead, lets
keep the background process alive for a bit waiting for a new command to
come in.
2025-11-05 13:42:04 +00:00
Andrew Morgan
2fd8d88b42 1.142.0rc3 2025-11-04 17:39:28 +00:00
Andrew Morgan
0cbb2a15e0
Don't build free-threaded wheels (#19140)
Fixes https://github.com/element-hq/synapse/issues/19139.
2025-11-04 17:38:25 +00:00
Andrew Morgan
5d71034f81 1.142.0rc2 2025-11-04 16:21:50 +00:00
Andrew Morgan
4bbde142dc
Skip building Python 3.9 wheels with cibuildwheel (#19119) 2025-11-04 16:20:01 +00:00
Andrew Morgan
2760d15348 1.142.0rc1 2025-11-04 13:34:46 +00:00
Erik Johnston
5408101d21
Speed up pruning of ratelimiter (#19129)
I noticed this in some profiling. Basically, we prune the ratelimiters
by copying and iterating over every entry every 60 seconds. Instead,
let's use a wheel timer to track when we should potentially prune a
given key, and then we a) check fewer keys, and b) can run more
frequently. Hopefully this should mean we don't have a large pause
everytime we prune a ratelimiter with lots of keys.

Also fixes a bug where we didn't prune entries that were added via
`record_action` and never subsequently updated. This affected the media
and joins-per-room ratelimiter.
2025-11-04 12:44:57 +00:00
Andrew Morgan
08f570f5f5
Fix "There is no current event loop in thread" error in tests (#19134) 2025-11-04 12:32:49 +00:00
Eric Eastwood
db00925ae7
Redirect stdout/stderr to logs after initialization (#19131)
This regressed in https://github.com/element-hq/synapse/pull/19121. I
moved things in https://github.com/element-hq/synapse/pull/19121 because
I thought that it made sense to redirect anything printed to
`stdout`/`stderr` to the logs as early as possible. But we actually want
to log any immediately apparent problems during initialization to
`stderr` in the terminal so that they are obvious and visible to the
operator.

Now, I've moved `redirect_stdio_to_logs()` back to where it was
previously along with some proper comment context for why we have it
there.
2025-11-03 16:16:23 -06:00
Eric Eastwood
891acfd502
Move oidc.load_metadata() startup into _base.start() (#19056)
Slightly related to ["clean-tenant
provisioning"](https://github.com/element-hq/synapse-small-hosts/issues/221)
as making startup cleaner, makes it more clear how to handle clean
provisioning.
2025-11-03 15:23:22 -06:00
Eric Eastwood
e02a6f5e5d
Fix lost logcontext on HomeServer.shutdown() (#19108)
Same fix as https://github.com/element-hq/synapse/pull/19090

Spawning from working on clean tenant deprovisioning in the Synapse Pro
for small hosts project
(https://github.com/element-hq/synapse-small-hosts/pull/204).
2025-11-03 14:07:10 -06:00
Eric Eastwood
a7107458c6
Refactor app entrypoints (avoid exit(1) in our composable functions) (#19121)
- Move `register_start` (calls `os._exit(1)`) out of `setup` (our
composable function)
- We want to avoid `exit(...)` because we use these composable functions
in Synapse Pro for small hosts where we have multiple Synapse instances
running in the same process. We don't want a problem from one homeserver
tenant causing the entire Python process to exit and affect all of the
other homeserver tenants.
     - Continuation of https://github.com/element-hq/synapse/pull/19116
- Align our app entrypoints: `homeserver` (main), `generic_worker`
(worker), and `admin_cmd`

### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process) (c.f
Synapse Pro for small hosts), we're currently diving into the details
and implications of running multiple instances of Synapse in the same
Python process.

"Clean tenant provisioning" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
2025-11-03 12:04:43 -06:00
Eric Eastwood
e00a411837
Move exception handling up the stack (avoid exit(1) in our composable functions) (#19116)
Move exception handling up the stack (avoid `exit(1)` in our composable
functions)

Relevant to Synapse Pro for small hosts as we don't want to exit the
entire Python process and affect all homeserver tenants.


### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process) (c.f
Synapse Pro for small hosts), we're currently diving into the details
and implications of running multiple instances of Synapse in the same
Python process.

"Clean tenant provisioning" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
2025-11-03 11:18:56 -06:00
Andrew Morgan
69bab78b44
Python 3.14 support (#19055)
Co-authored-by: Eric Eastwood <erice@element.io>
2025-11-03 11:53:59 +00:00
Eric Eastwood
41a2762e58
Be mindful of other logging context filters in 3rd-party code (#19068)
Be mindful that Synapse can be run alongside other code in the same
Python process. We shouldn't overwrite fields on given log record unless
we know it's relevant to Synapse.

(no clobber)


### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.

"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
2025-10-31 10:12:05 -05:00
Erik Johnston
3ccc5184e0
Fix schema lint script to understand CREATE TABLE IF NOT EXISTS (#19020)
The schema lint tries to make sure we don't add or remove indices in
schema files (rather than as background updates), *unless* the table was
created in the same schema file.

The regex to pull out the `CREATE TABLE` SQL incorrectly didn't
recognise `IF NOT EXISTS`.

There is a test delta file that shows that we accept different types of
`CREATE TABLE` and `CREATE INDEX` statements, as well as an index
creation that doesn't have a matching create table (to show that we do
still catch it). The test delta should be removed before merge.
2025-10-31 13:16:47 +00:00
V02460
07e7980572
Fix Rust’s confusing lifetime lint (#19118)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2025-10-31 12:09:13 +00:00
V02460
3595ff921f
Pydantic v2 (#19071)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
2025-10-31 09:22:22 +00:00
Andrew Morgan
300c5558ab
Update check_dependencies to support markers (#19110) 2025-10-30 21:33:29 +00:00
Eric Eastwood
c0b9437ab6
Fix lost logcontext when using timeout_deferred(...) (#19090)
Fix lost logcontext when using `timeout_deferred(...)` and things
actually timeout.

Fix https://github.com/element-hq/synapse/issues/19087 (our HTTP client
times out requests using `timeout_deferred(...)`
Fix https://github.com/element-hq/synapse/issues/19066 (`/sync` uses
`notifier.wait_for_events()` which uses `timeout_deferred(...)` under
the hood)


### When/why did these lost logcontext warnings start happening?

```
synapse.logging.context - 107 - WARNING - sentinel - Expected logging context call_later but found POST-2453

synapse.logging.context - 107 - WARNING - sentinel - Expected logging context call_later was lost
```

In https://github.com/element-hq/synapse/pull/18828, we switched
`timeout_deferred(...)` from using `reactor.callLater(...)` to
[`clock.call_later(...)`](3b59ac3b69/synapse/util/clock.py (L224-L313))
under the hood. This meant it started dealing with logcontexts but our
`time_it_out()` callback didn't follow our [Synapse logcontext
rules](3b59ac3b69/docs/log_contexts.md).
2025-10-30 11:49:15 -05:00
Eric Eastwood
f0aae62f85
Cheaper logcontext debug logs (random_string_insecure_fast(...)) (#19094)
Follow-up to https://github.com/element-hq/synapse/pull/18966

During the weekly Backend team meeting, it was mentioned that
`random_string(...)` was taking a significant amount of CPU on
`matrix.org`. This makes sense as it relies on
[`secrets.choice(...)`](https://docs.python.org/3/library/secrets.html#secrets.choice),
a cryptographically secure function that is inherently computationally
expensive. And since https://github.com/element-hq/synapse/pull/18966,
we're calling `random_string(...)` as part of a bunch of logcontext
utilities.

Since we don't need cryptographically secure random strings for our
debug logs, this PR is introducing a new `random_string_insecure_fast(...)`
function that uses
[`random.choice(...)`](https://docs.python.org/3/library/random.html#random.choice)
which uses pseudo-random numbers that are "both fast and threadsafe".
2025-10-30 11:47:53 -05:00
Andrew Morgan
349599143e
Move reading of multipart response into try body (#19062) 2025-10-30 15:22:52 +00:00
Eric Eastwood
2c4057bf93
Prevent duplicate logging setup when running multiple Synapse instances (#19067)
Be mindful that it's possible to run Synapse multiple times in the same
Python process. So we only need to do some part of the logging setup
once.

- We only need to setup the global log record factory and context filter
once
 - We only need to redirect Twisted logging once


### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.

"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48
2025-10-30 10:21:56 -05:00
Andrew Morgan
f54ddbcace
Prevent duplicate GH releases being created during Synapse release process (#19096) 2025-10-30 12:40:53 +00:00
Andrew Morgan
728512918e
Exclude .lock file from /usr/local when building docker images (#19107) 2025-10-30 10:17:35 +00:00
Andrew Ferrazzutti
e0838c2567
Drop Python 3.9, bump tests/builds to Python 3.10 (#19099)
Python 3.9 EOL is on 2025-10-31
2025-10-29 12:15:00 -05:00
Eric Eastwood
6facf98a3a
Be mindful of other SIGHUP handlers in 3rd-party code (#19095)
Be mindful that Synapse can be run alongside other code in the same
Python process. We shouldn't clobber other `SIGHUP` handlers as only one
can be set at time.

(no clobber)

### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.

"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48

Relevant to logging as we use a `SIGHUP` to reload log config in
Synapse.
2025-10-29 10:28:05 -05:00
Eric Eastwood
0417296b9f
Remove logcontext problems caused by awaiting raw deferLater(...) (#19058)
This is a normal
problem where we `await` a deferred without wrapping it in
`make_deferred_yieldable(...)`. But I've opted to replace the usage of
`deferLater` with something more standard for the Synapse codebase.

Part of https://github.com/element-hq/synapse/issues/18905

It's unclear why we're only now seeing these failures happen with the
changes from https://github.com/element-hq/synapse/pull/19057

Example failures seen in
https://github.com/element-hq/synapse/actions/runs/18477454390/job/52645183606?pr=19057

```
builtins.AssertionError: Expected `looping_call` callback from the reactor to start with the sentinel logcontext but saw task-_resumable_task-0-IBzAmHUoepQfLnEA. In other words, another task shouldn't have leaked their logcontext to us.
```
2025-10-29 10:23:10 -05:00
Andrew Morgan
7897c8f6af
Add a docs page with common steps to review the release notes (#19109) 2025-10-29 11:32:33 +00:00
Andrew Ferrazzutti
dc33ef90d3
Update docs on downstream Debian package (#19100) 2025-10-28 17:25:16 -05:00
Andrew Ferrazzutti
a07dd43ac4
Use Pillow's non-experimental getexif (#19098)
It has been available since Pillow 6, and Synapse is now pinned on
Pillow >=10.0.1.

Found this while looking at Debian-shipped dependencies, and figured
this may as well be updated.
2025-10-28 13:11:45 -05:00
Shay
f1695ac20e
Add an admin API to get the space hierarchy (#19021)
It is often useful when investigating a space to get information about
that space and it's children. This PR adds an Admin API to return
information about a space and it's children, regardless of room
membership. Will not fetch information over federation about remote
rooms that the server is not participating in.
2025-10-24 15:32:16 -05:00
Andrew Ferrazzutti
9d81bb703c
Always treat RETURNING as supported by SQL engines (#19047)
Can do this now that SQLite 3.35.0 added support for `RETURNING`.

> The RETURNING syntax has been supported by SQLite since version 3.35.0
(2021-03-12).
>
> *-- https://sqlite.org/lang_returning.html*

This also bumps the minimum supported SQLite version according to
Synapse's [deprecation
policy](https://element-hq.github.io/synapse/latest/deprecation_policy.html#platform-dependencies).

Fix https://github.com/element-hq/synapse/issues/17577
2025-10-24 13:21:49 -05:00
Andrew Morgan
123eff1bc0
Update poetry dev dependencies name (#19081) 2025-10-24 11:19:40 +01:00
Andrew Morgan
a092d2053a
Fix deprecation warning in release script (#19080) 2025-10-24 11:19:04 +01:00
Andrew Morgan
45a042ae88
Remove cibuildwheel pp38* skip selector (#19085) 2025-10-24 10:39:29 +01:00
Andrew Morgan
72d0de9f30
Don't exit the release script if there are uncommitted changes (#19088) 2025-10-24 10:39:06 +01:00
Andrew Morgan
5556b491c1
Spruce up generated announcement text in the release script (#19089) 2025-10-24 10:19:44 +01:00
Bryce Servis
b835eb253c
Make optional networking and security settings for Redis more apparent in workers.md (#19073)
I couldn't really find any documentation regarding how to setup TLS
communication between Synapse and Redis, so I looked through the source
code and found it. I figured I should go ahead and document it here.
2025-10-23 10:10:10 -05:00
Andrew Ferrazzutti
fc244bb592
Use type hinting generics in standard collections (#19046)
aka PEP 585, added in Python 3.9

 - https://peps.python.org/pep-0585/
 - https://docs.astral.sh/ruff/rules/non-pep585-annotation/
2025-10-22 16:48:19 -05:00
Eric Eastwood
cba3a814c6
Fix lints on develop (#19092)
Snuck in with
ff242faad0
2025-10-22 10:39:04 -05:00
Andrew Morgan
3b59ac3b69 Merge branch 'release-v1.141' into develop 2025-10-21 16:48:09 +01:00
Andrew Morgan
6c16734cf3 Revert "newsfile"
This reverts commit 4427908340.

This should not have been committed to `develop`.
2025-10-21 14:18:40 +01:00
Andrew Morgan
4427908340 newsfile 2025-10-21 14:17:53 +01:00
Kieran Lane
2f65b9e001
Update oidc_session_no_samesite cookie to be Secure (#19079) 2025-10-21 13:35:55 +01:00
Andrew Morgan
1271e896b5 1.141.0rc1 2025-10-21 11:12:59 +01:00
Andrew Morgan
418c9f3fe5
Prevent bcrypt from raising a ValueError and log (#19078) 2025-10-21 10:52:28 +01:00
Eric Eastwood
eac862629f
Revert "Move start_doing_background_updates() to SynapseHomeServer.start_background_tasks() (#19036)" (#19059)
### Why

See
https://github.com/element-hq/synapse/pull/19036#discussion_r2427070612

Revert while I figure out the tests in
https://github.com/element-hq/synapse/pull/19057
2025-10-20 10:55:41 -05:00
Ben Banfield-Zanin
67f22a200d
Update Docker images to use Debian trixie (13) and thus Python 3.13 (#19064) 2025-10-20 16:49:17 +01:00
Andrew Morgan
a4f9274107
Fix indentation of sighup handler calling code (#19060) 2025-10-14 15:10:48 +01:00
Tulir Asokan
ec7554b768
Stabilize support for MSC4326: Device masquerading for appservices (#19033)
Note: the code references MSC3202, which is what MSC4326 was split off
from. Only MSC4326 was accepted, MSC3202 wasn't yet.
2025-10-13 11:13:07 -05:00
Eric Eastwood
d2c582ef3c
Move unique snowflake homeserver background tasks to start_background_tasks (#19037)
(the standard pattern for this kind of thing)
2025-10-13 10:19:09 -05:00
Eric Eastwood
2d07bd7fd2
Update TODO list of conflicting areas where we encounter metrics being clobbered (ApplicationService) (#19040)
These errors are harmless and are a long-standing issue that is just now
being logged, see https://github.com/element-hq/synapse/issues/19042

```
2025-10-10 15:30:00,026 - synapse.util.metrics - 330 - ERROR - notify_interested_services-0 - Metric named cache_lru_cache__matches_user_in_member_list_example.com already registered for server example.com
	2025-10-10 16:30:00.167
2025-10-10 15:30:00,026 - synapse.util.metrics - 330 - ERROR - notify_interested_services-0 - Metric named cache_lru_cache_is_interested_in_room_example.com already registered for server example.com
	2025-10-10 16:30:00.167
2025-10-10 15:30:00,025 - synapse.util.metrics - 330 - ERROR - notify_interested_services-0 - Metric named cache_lru_cache_is_interested_in_event_example.com already registered for server example.com
	2025-10-10 16:29:15.560
2025-10-10 15:29:15,449 - synapse.util.metrics - 330 - ERROR - notify_interested_services_ephemeral-0 - Metric named cache_lru_cache__matches_user_in_member_list_example.com already registered for server example.com
	2025-10-10 16:29:15.560
2025-10-10 15:29:15,449 - synapse.util.metrics - 330 - ERROR - notify_interested_services_ephemeral-0 - Metric named cache_lru_cache_is_interested_in_room_example.com already registered for server example.com
```
2025-10-13 10:15:47 -05:00
Andrew Morgan
a7303c5311
Fix deprecated token field in release script (#19039) 2025-10-13 14:31:09 +01:00
Tulir Asokan
690b3a4fcc
Allow using MSC4190 features without opt-in (#19031) 2025-10-13 13:07:11 +00:00
Eric Eastwood
d399d7649a
Move start_doing_background_updates() to SynapseHomeServer.start_background_tasks() (#19036)
(more sane standard location for this sort of thing)

The one difference here is that previously, `start_doing_background_updates
()` only ran on the main Synapse instance. But since it now lives in
`start_background_tasks()`, it will run on the worker that supposed to
`run_background_tasks`. Doesn't seem like a problem though.
2025-10-10 14:30:38 -05:00
Andrew Morgan
c0d6998dea 1.140.0rc1 2025-10-10 11:24:27 +01:00
Eric Eastwood
47fb4b43ca
Introduce RootConfig.validate_config() which can be subclassed in HomeServerConfig to do cross-config class validation (#19027)
This means we
can move the open registration config validation from `setup()` to
`HomeServerConfig.validate_config()` (much more sane).

Spawning from looking at this area of code in
https://github.com/element-hq/synapse/pull/19015
2025-10-09 14:56:22 -05:00
Eric Eastwood
715cc5ee37
Split homeserver creation and setup (#19015)
### Background

As part of Element's plan to support a light form of vhosting (virtual
host) (multiple instances of Synapse in the same Python process), we're
currently diving into the details and implications of running multiple
instances of Synapse in the same Python process.

"Clean tenant provisioning" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/221


### Partial startup problem

In the context of Synapse Pro for Small Hosts, since the Twisted reactor
is already running (from the `multi_synapse` shard process itself), when
provisioning a homeserver tenant, the `reactor.callWhenRunning(...)`
callbacks will be invoked immediately. This includes the Synapse's
[`start`](0615b64bb4/synapse/app/homeserver.py (L418-L429))
callback which sets up everything (including listeners, background
tasks, etc). If we encounter an error at this point, we are partially
setup but the exception will [bubble back to
us](8be122186b/multi_synapse/app/shard.py (L114-L121))
without us having a handle to the homeserver yet so we can't call
`hs.shutdown()` and clean everything up.


### What does this PR do?

Structures Synapse so we split creating the homeserver instance from
setting everything up. This way we have access to `hs` if anything goes
wrong during setup and can subsequently `hs.shutdown()` to clean
everything up.
2025-10-09 13:12:10 -05:00
Andrew Morgan
d440cfc9e2
Allow any release script command to accept --gh-token (#19035) 2025-10-09 17:15:54 +01:00
fkwp
18f07fdc4c
Add MatrixRTC backend/services discovery endpoint (#18967)
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
2025-10-09 17:15:47 +01:00
Andrew Morgan
e3344dc0c3
Expose defer_to_threadpool in the module API (#19032) 2025-10-09 15:15:13 +01:00
Andrew Morgan
bcbbccca23
Swap macos-13 with macos-15-intel GHA runner in CI (#19025) 2025-10-08 12:58:42 +01:00
Shay
8f01eb8ee0
Add an Admin API to fetch an event by ID (#18963)
Adds an endpoint to allow server admins to fetch an event regardless of
their membership in the originating room.
2025-10-08 11:38:15 +01:00
Eric Eastwood
631eed91f1
Fix bad merge with start_background_tasks (#19013)
This was originally removed in
https://github.com/element-hq/synapse/pull/18886 but it looks like it
snuck back in https://github.com/element-hq/synapse/pull/18828 during a
[bad
merge](4cd3d9172e).

Noticed while looking at Synapse setup and startup (just by happen
stance).

I don't think this has adverse effects on Synapse actually working and
`start_background_tasks()` can be called multiple times.


### Is there a good way to audit all of these merges?

As I would like to see the conflicts for each merge.

This works but it's still hard to notice anything is wrong:

```
git log --remerge-diff <commit-sha>
```

> shows the difference from mechanical merge result and the result that
is actually recorded in a merge commit

via
https://stackoverflow.com/questions/15277708/how-do-you-see-show-a-git-merge-conflict-resolution-that-was-done-given-a-mer/71181334#71181334

The following better. Specify the version range to the commit right
before the merge to the merge. And can even specify which file to look
at to make it more obvious with the hindsight we have now.

```
git log --remerge-diff <merge-commit-sha>~1..<merge-commit-sha> -- synapse/server.py
```

Example:
```
git log --remerge-diff 4cd3d9172ed7b87e509746851a376c861a27820e~1..4cd3d9172ed7b87e509746851a376c861a27820e -- synapse/server.py
```
2025-10-07 13:29:22 -05:00
Eric Eastwood
7b8831310f
No need to have version_string as an argument since it's always the same (#19012)
Assuming, we're happy with
https://github.com/element-hq/synapse/pull/19011, this PR makes sense.
2025-10-07 13:27:24 -05:00
Eric Eastwood
ca27938257
Align Synapse version string to use SYNAPSE_VERSION (#19011)
See https://github.com/matrix-org/synapse/pull/12973 where we previously
used `version_string="Synapse/" +
get_distribution_version_string("matrix-synapse")` everywhere; and then
updated to use `version_string=f"Synapse/{SYNAPSE_VERSION}"` for every
other place except `synapse/app/homeserver.py` (why?!?!?!). This seems
more like a typo than something on purpose especially without any
context in the comments or PR. The whole point of that PR was trying to
solve the missing git info in version strings.

For reference, here is what both variables look like for me locally on
the latest `develop`:

 - `SYNAPSE_VERSION`: `1.139.0 (b=develop,1d2ddbc76e,dirty)`
 - `VERSION`: `1.139.0`

Only reason we may want to do this is to hide the branch name (some
sensitive name that exposes a security fix, etc). But we don't hide
anything:

`https://matrix.org/_matrix/federation/v1/version`
```json
{
  "server": {
    "name": "Synapse",
    "version": "1.139.0rc3 (b=matrix-org-hotfixes-priv,f538ed5ac3)"
  }
}
```

On `matrix.org`, the `Server` response header is masked as `cloudflare`
which would otherwise show `1.139.0rc3` for everything from the main
process.

---

This is spawning from looking at the way we setup and start Synapse for
homeserver tenant provisioning in the Synapse Pro for Small Hosts
project (https://github.com/element-hq/synapse-small-hosts/issues/221)
2025-10-07 10:44:56 -05:00
Andrew Morgan
2443760d0d
Update KeyUploadServlet to handle case where client sends device_keys: null (#19023) 2025-10-07 16:23:55 +01:00
Till
42bbff8294
Validate the body of requests to /keys/upload (#17097)
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Eric Eastwood <erice@element.io>
2025-10-07 11:27:53 +01:00
Andrew Morgan
5465c68553
Remove unstable prefixes for MSC2732: Olm fallback keys (#18996)
Co-authored-by: Eric Eastwood <erice@element.io>
2025-10-07 11:15:35 +01:00
Francesco Stefanini
1d2ddbc76e
Fix bug where ephemeral events were not filtered by room ID (#19002)
Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
2025-10-03 13:19:57 +01:00
Eric Eastwood
70c044db8e
Remove deprecated LoggingContext.set_current_context/LoggingContext.current_context methods (#18989)
These were added for backwards compatibility (and essentially
deprecated) in https://github.com/matrix-org/synapse/pull/7408
(2020-05-04) because
[`synapse-s3-storage-provider`](https://github.com/matrix-org/synapse-s3-storage-provider)
previously relied on them -- but `synapse-s3-storage-provider` since
been
[updated](https://github.com/matrix-org/synapse-s3-storage-provider/pull/36)
to no longer use them.
2025-10-02 13:21:37 -05:00
Eric Eastwood
6835e7be0d
Wrap the Rust HTTP client with make_deferred_yieldable (#18903)
Wrap the Rust HTTP client with `make_deferred_yieldable` so downstream
usage doesn't need to use `PreserveLoggingContext()` or
`make_deferred_yieldable`.

> it seems like we should have some wrapper around it that uses
[`make_deferred_yieldable(...)`](40edb10a98/docs/log_contexts.md (where-you-create-a-new-awaitable-make-it-follow-the-rules))
to make things right so we don't have to do this in the downstream code.
>
> *-- @MadLittleMods,
https://github.com/element-hq/synapse/pull/18357#discussion_r2294941827*

Spawning from wanting to [remove `PreserveLoggingContext()` from the
codebase](https://github.com/element-hq/synapse/pull/18870) and thinking
that we [shouldn't have to pollute all downstream usage with
`PreserveLoggingContext()` or
`make_deferred_yieldable`](https://github.com/element-hq/synapse/pull/18357#discussion_r2294941827)

Part of https://github.com/element-hq/synapse/issues/18905 (Remove
`sentinel` logcontext where we log in Synapse)
2025-10-02 13:00:50 -05:00
Eric Eastwood
d27ff161f5
Add debug logs wherever we change current logcontext (#18966)
Add debug logs wherever we change current logcontext (`LoggingContext`).
I've had to make this same set of changes over and over as I've been
debugging things so it seems useful enough to include by default.

Instead of tracing things at the `set_current_context(...)` level, I've
added the debug logging on all of the utilities that utilize
`set_current_context(...)`. It's much easier to reason about the log
context changing because of `PreserveLoggingContext` changing things
than an opaque `set_current_context(...)` call.
2025-10-02 11:51:17 -05:00
Eric Eastwood
06a84f4fe0
Revert "Switch to OpenTracing's ContextVarsScopeManager (#18849)" (#19007)
Revert https://github.com/element-hq/synapse/pull/18849

Go back to our custom `LogContextScopeManager` after trying
OpenTracing's `ContextVarsScopeManager`.

Fix https://github.com/element-hq/synapse/issues/19004

### Why revert?

For reference, with the normal reactor, `ContextVarsScopeManager` worked
just as good as our custom `LogContextScopeManager` as far as I can tell
(and even better in some cases). But since Twisted appears to not fully
support `ContextVar`'s, it doesn't work as expected in all cases.
Compounding things, `ContextVarsScopeManager` was causing errors with
the experimental `SYNAPSE_ASYNC_IO_REACTOR` option.

Since we're not getting the full benefit that we originally desired, we
might as well revert and figure out alternatives for extending the
logcontext lifetimes to support the use case we were trying to unlock
(c.f. https://github.com/element-hq/synapse/pull/18804).

See
https://github.com/element-hq/synapse/issues/19004#issuecomment-3358052171
for more info.


### Does this require backporting and patch releases?

No. Since `ContextVarsScopeManager` operates just as good with the
normal reactor and was only causing actual errors with the experimental
`SYNAPSE_ASYNC_IO_REACTOR` option, I don't think this requires us to
backport and make patch releases at all.



### Maintain cross-links between main trace and background process work

In order to maintain the functionality introduced in https://github.com/element-hq/synapse/pull/18932 (cross-links between the background process trace and currently active trace), we also needed a small change.

Previously, when we were using `ContextVarsScopeManager`, it tracked the tracing scope across the logcontext changes without issue. Now that we're using our own custom `LogContextScopeManager` again, we need to capture the active span from the logcontext before we reset to the sentinel context because of the `PreserveLoggingContext()` below.

Added some tests to ensure we maintain the `run_as_background` tracing behavior regardless of the tracing scope manager we use.
2025-10-02 11:27:26 -05:00
Eric Eastwood
1c093509ce
Switch task scheduler from raw logcontext manipulation (set_current_context) to utils (PreserveLoggingContext) (#18990)
Prefer the utils over raw logcontext manipulation.

Spawning from adding some logcontext debug logs in
https://github.com/element-hq/synapse/pull/18966 and since we're not
logging at the `set_current_context(...)` level (see reasoning there),
this removes some usage of `set_current_context(...)`.
2025-10-02 10:22:25 -05:00
Devon Hudson
396de6544a
Cleanly shutdown SynapseHomeServer object (#18828)
This PR aims to allow for a clean shutdown of the `SynapseHomeServer`
object so that it can be fully deleted and cleaned up by garbage
collection without shutting down the entire python process.

Fix https://github.com/element-hq/synapse-small-hosts/issues/50

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-10-01 02:42:09 +00:00
Sebastian Spaeth
d1c96ee0f2
Fix rc_room_creation and rc_reports docs - remove per_user typo (#18998) 2025-09-30 15:17:11 -05:00
Eric Eastwood
5adb08f3c9
Remove MockClock() (#18992)
Spawning from adding some logcontext debug logs in
https://github.com/element-hq/synapse/pull/18966 and since we're not
logging at the `set_current_context(...)` level (see reasoning there),
this removes some usage of `set_current_context(...)`.

Specifically, `MockClock.call_later(...)` doesn't handle logcontexts
correctly. It uses the calling logcontext as the callback context
(wrong, as the logcontext could finish before the callback finishes) and
it didn't reset back to the sentinel context before handing back to the
reactor. It was like this since it was [introduced 10+ years
ago](38da9884e7).
Instead of fixing the implementation which would just be a copy of our
normal `Clock`, we can just remove `MockClock`
2025-09-30 11:27:29 -05:00