Commit Graph

24984 Commits

Author SHA1 Message Date
Eric Eastwood
ff03a51cb0
Revert "Fix LaterGauge metrics to collect from all servers (#18751)" (#18789)
This PR reverts https://github.com/element-hq/synapse/pull/18751

### Why revert?

@reivilibre
[found](https://matrix.to/#/!vcyiEtMVHIhWXcJAfl:sw1v.org/$u9OEmMxaFYUzWHhCk1A_r50Y0aGrtKEhepF7WxWJkUA?via=matrix.org&via=node.marinchik.ink&via=element.io)
that our CI was failing in bizarre ways (thanks for stepping up to dive
into this 🙇). Examples:

- `twisted.internet.error.ProcessTerminated: A process has ended with a
probable error condition: process ended by signal 9.`
- `twisted.internet.error.ProcessTerminated: A process has ended with a
probable error condition: process ended by signal 15.`

<details>
<summary>More detailed part of the log</summary>


https://github.com/element-hq/synapse/actions/runs/16758038107/job/47500520633#step:9:6809
```
tests.util.test_wheel_timer.WheelTimerTestCase.test_single_insert_fetch
===============================================================================
Error: 
Traceback (most recent call last):
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/trial/_dist/disttrial.py", line 371, in task
    await worker.run(case, result)
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/trial/_dist/worker.py", line 305, in run
    return await self.callRemote(workercommands.Run, testCase=testCaseId)  # type: ignore[no-any-return]
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/internet/defer.py", line 1187, in __iter__
    yield self
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/internet/defer.py", line 1092, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/protocols/amp.py", line 1968, in _massageError
    error.trap(RemoteAmpError)
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/python/failure.py", line 431, in trap
    self.raiseException()
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/python/failure.py", line 455, in raiseException
    raise self.value.with_traceback(self.tb)
twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended by signal 9.

tests.util.test_macaroons.MacaroonGeneratorTestCase.test_guest_access_token
-------------------------------------------------------------------------------
Ran 4325 tests in 669.321s

FAILED (skips=159, errors=62, successes=4108)
while calling from thread
Traceback (most recent call last):
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/internet/base.py", line 1064, in runUntilCurrent
    f(*a, **kw)
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/internet/base.py", line 790, in stop
    raise error.ReactorNotRunning("Can't stop reactor that isn't running.")
twisted.internet.error.ReactorNotRunning: Can't stop reactor that isn't running.

joining disttrial worker #0 failed
Traceback (most recent call last):
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/internet/defer.py", line 1853, in _inlineCallbacks
    result = context.run(
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/python/failure.py", line 467, in throwExceptionIntoGenerator
    return g.throw(self.value.with_traceback(self.tb))
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/trial/_dist/worker.py", line 406, in exit
    await endDeferred
  File "/home/runner/.cache/pypoetry/virtualenvs/matrix-synapse-pswDeSvb-py3.9/lib/python3.9/site-packages/twisted/internet/defer.py", line 1187, in __iter__
    yield self
twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended by signal 15.
```

</details>


With more debugging (thanks @devonh for also stepping in as maintainer),
we were finding that the CI was consistently failing at
`test_exposed_to_prometheus` which was a bit of smoke because of all of
the [metrics
changes](https://github.com/element-hq/synapse/issues/18592) that were
merged recently.

Locally, although I wasn't able to reproduce the bizarre errors, I could
easily see increased memory usage (~20GB vs ~2GB) and the
`test_exposed_to_prometheus` test taking a while to complete when
running a full test run (`SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial
tests`).

<img width="1485" height="78" alt="Lots of memory usage"
src="https://github.com/user-attachments/assets/811e2a96-75e5-4a3c-966c-00dc0512cea9"
/>

After updating `test_exposed_to_prometheus` to dump the
`latest_metrics_response = generate_latest(REGISTRY)`, I could see that
it's a massive 3.2GB response. Inspecting the contents, we can see 4.1M
(4,137,123) entries for just
`synapse_background_update_status{server_name="test"} 3.0` which is a
`LaterGauge`. I don't think we have 4.1M test cases so it's also unclear
why we end up with so many samples but it does make sense that we do see
a lot of duplicates because each `HomeserverTestCase` will create a
homeserver for each test case that will `LaterGauge.register_hook(...)`
(part of the https://github.com/element-hq/synapse/pull/18751 changes).

`tests/storage/databases/main/test_metrics.py`
```python
        latest_metrics_response = generate_latest(REGISTRY)
        with open("/tmp/synapse-test-metrics", "wb") as f:
            f.write(latest_metrics_response)
```

After reverting the https://github.com/element-hq/synapse/pull/18751
changes, running the full test suite locally doesn't result in memory
spikes and seems to run normally.



### Dev notes

Discussion in the
[`#synapse-dev:matrix.org`](https://matrix.to/#/!vcyiEtMVHIhWXcJAfl:sw1v.org/$vkMATs04yqZggVVd6Noop5nU8M2DVoTkrAWshw7u1-w?via=matrix.org&via=node.marinchik.ink&via=element.io)
room.

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [ ] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [ ] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-08-06 22:14:40 +00:00
reivilibre
6514381b02
Implement the push rules for experimental MSC4306: Thread Subscriptions. (#18762)
Follows: #18756

Implements: MSC4306

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2025-08-06 15:33:52 +01:00
reivilibre
8306cee06a
Update implementation of MSC4306: Thread Subscriptions to include automatic subscription conflict prevention as introduced in later drafts. (#18756)
Follows: #18674

Implements new drafts of MSC4306

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
2025-08-05 18:22:53 +00:00
Eric Eastwood
076db0ab49
Fix LaterGauge metrics to collect from all servers (#18751)
Fix `LaterGauge` metrics to collect from all servers

Follow-up to https://github.com/element-hq/synapse/pull/18714

Previously, our `LaterGauge` metrics did include the `server_name` label
as expected but we were only seeing the last server being reported in
some cases. Any `LaterGauge` that we were creating multiple times was
only reporting the last instance.

This PR updates all `LaterGauge` to be created once and then we use
`LaterGauge.register_hook(...)` to add in the metric callback as before.
This works now because we store a list of callbacks instead of just one.

I noticed this problem thanks to some [tests in the Synapse Pro for
Small Hosts](https://github.com/element-hq/synapse-small-hosts/pull/173)
repo that sanity check all metrics to ensure that we can see each metric
includes data from multiple servers.


### Testing strategy

1. This is only noticeable when you run multiple Synapse instances in
the same process.
 1. TODO

(see test that was added)

### Dev notes

Previous non-global `LaterGauge`:

```
synapse_federation_send_queue_xxx
synapse_federation_transaction_queue_pending_destinations
synapse_federation_transaction_queue_pending_pdus
synapse_federation_transaction_queue_pending_edus
synapse_handlers_presence_user_to_current_state_size
synapse_handlers_presence_wheel_timer_size
synapse_notifier_listeners
synapse_notifier_rooms
synapse_notifier_users
synapse_replication_tcp_resource_total_connections
synapse_replication_tcp_command_queue
synapse_background_update_status
synapse_federation_known_servers
synapse_scheduler_running_tasks
```



### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-08-05 15:28:55 +00:00
Andrew Morgan
c7762cd55e
Prevent "Move labelled issues to correct projects" GitHub Actions workflow from failing when an issue is already on the project board (#18755) 2025-08-05 12:03:25 +01:00
Andrew Morgan
357b749bf3
Bump minimum supported rust version to 1.82.0 (#18757) 2025-08-05 12:02:57 +01:00
Erik Johnston
20615115fb
Make .sleep(..) return a coroutine (#18772)
This helps ensure that mypy can catch places where we don't await on it,
like in #18763.

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-08-05 09:30:52 +01:00
Eric Eastwood
ddbcd859aa
Improve order of validation and ratelimiting in room creation (#18723)
Spawning from looking at this stuff while reviewing
https://github.com/element-hq/synapse/pull/18721
2025-08-04 11:08:02 -05:00
Quentin Gliech
7ed55666b5
Stabilise MAS integration (#18759)
This can be reviewed commit by commit

There are a few improvements over the experimental support:

- authorisation of Synapse <-> MAS requests is simplified, with a single
shared secret, removing the need for provisioning a client on the MAS
side
- the tests actually spawn a real server, allowing us to test the rust
introspection layer
- we now check that the device advertised in introspection actually
exist, making it so that when a user logs out, the tokens are
immediately invalidated, even if the cache doesn't expire
- it doesn't rely on discovery anymore, rather on a static endpoint
base. This means users don't have to override the introspection endpoint
to avoid internet roundtrips
- it doesn't depend on `authlib` anymore, as we simplified a lot the
calls done from Synapse to MAS

We still have to update the MAS documentation about the Synapse setup,
but that can be done later.

---------

Co-authored-by: reivilibre <oliverw@element.io>
2025-08-04 15:48:45 +02:00
Ben Banfield-Zanin
8c71875195
Document that there can be multiple workers handling the receipts stream (#18760) 2025-08-04 13:23:15 +01:00
Ben Banfield-Zanin
bbe78c253c
Improve device lists documentation (#18761) 2025-08-04 13:19:34 +01:00
Erik Johnston
72cd5cccf7
Make room upgrades faster for rooms with many bans (#18574)
We do this by a) not pulling out all membership events, and b) batch
inserting bans.

One blocking concern is that this bypasses the `update_membership`
function, which otherwise all other membership events go via. In this
case it's fine (having audited what it is doing), but I'm hesitant to
set the precedent of bypassing it, given it has a lot of logic in there.

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-08-04 10:42:52 +01:00
Eric Eastwood
e16fbdcdcc
Update metrics linting to be able to handle custom metrics (#18733)
Part of https://github.com/element-hq/synapse/issues/18592
2025-08-01 15:34:11 -05:00
Eric Eastwood
e43a1cec84
Fix cache metrics to collect from all servers (#18748)
Follow-up to https://github.com/element-hq/synapse/pull/18604

Previously, our cache metrics did include the `server_name` label as
expected but we were only seeing the last server being reported. This
was caused because we would
`CACHE_METRIC_REGISTRY.register_hook(metric_name, metric.collect)` where
the `metric_name` only took into account the cache name so it would be
overwritten every time we spawn a new server.

This PR updates the register logic to include the `server_name` so we
have a hook for every cache on every server as expected.

I noticed this problem thanks to some [tests in the Synapse Pro for
Small Hosts](https://github.com/element-hq/synapse-small-hosts/pull/173)
repo that sanity check all metrics to ensure that we can see each metric
includes data from multiple servers.
2025-08-01 12:29:58 -05:00
Andrew Morgan
510924a2f6
Add missing await to sleep calls (#18763) 2025-08-01 16:00:30 +01:00
Andrew Morgan
3b5b6f6152 Merge branch 'master' into develop 2025-08-01 13:46:54 +01:00
Andrew Morgan
edac7a471f 1.135.0 2025-08-01 13:12:33 +01:00
Andrew Morgan
c15001d765 Run cargo update 2025-07-31 17:36:12 +01:00
Eric Eastwood
a6e326582f
Fix Failed to stop metrics warnings in request metrics (#18753)
```
Failed to stop metrics: TypeError("prometheus_client.metrics.MetricWrapperBase.labels() got multiple values for keyword argument 'server_name'")
```

Noticed while running and debugging some tests.

This bug was introduced in
https://github.com/element-hq/synapse/pull/18724
2025-07-31 10:31:45 -05:00
dependabot[bot]
cd339d52b6
Bump tokio from 1.46.1 to 1.47.0 (#18740) 2025-07-30 17:07:42 +01:00
dependabot[bot]
e7348406a3
Bump phonenumbers from 9.0.9 to 9.0.10 (#18741) 2025-07-30 17:06:48 +01:00
dependabot[bot]
4a01e2df47
Bump ruff from 0.12.4 to 0.12.5 (#18742) 2025-07-30 17:05:54 +01:00
dependabot[bot]
2465659942
Bump sentry-sdk from 2.32.0 to 2.33.2 (#18745) 2025-07-30 17:05:09 +01:00
dependabot[bot]
501b96134c
Bump mypy-zope from 1.0.12 to 1.0.13 (#18744) 2025-07-30 17:04:48 +01:00
dependabot[bot]
f8887a64e4
Bump gitpython from 3.1.44 to 3.1.45 (#18743) 2025-07-30 17:04:07 +01:00
Andrew Morgan
8551e0f0af
Allow suspended users to be auto-joined to server notice rooms (#18750) 2025-07-30 15:38:07 +00:00
Andrew Morgan
25289b6444 Fix trailing whitespace in build_rust.py, from #18700 2025-07-30 16:08:25 +01:00
Strac Consulting Engineers Pty Ltd
86370979d9
Minor improvements to README.rst (#18700) 2025-07-30 15:07:10 +01:00
Andrew Morgan
664f0e8938 Merge branch 'release-v1.135' into develop 2025-07-30 14:04:29 +01:00
reivilibre
ea87853188
Work around twisted.protocols.amp.TooLong error by reducing logging in some tests. (#18736)
Part of: https://github.com/element-hq/synapse/issues/18537

Works around: https://github.com/twisted/twisted/issues/12482

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-30 12:03:56 +00:00
Andrew Morgan
caf5f0110e Linkify GitHub PR ID in changelog 2025-07-30 12:57:20 +01:00
reivilibre
a31d53b28f
Use twisted.internet.testing module in tests instead of deprecated twisted.test.proto_helpers. (#18728)
Follows: #18727

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-30 12:32:10 +01:00
reivilibre
16a639e0fe
Remove some obsolete Twisted version checks. (#18729)
Follows: #18727
---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-30 12:31:55 +01:00
reivilibre
a2ba909ded
Remove obsolete /send_event replication endpoint. (#18730)
Fixes: #18441

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-30 12:30:40 +01:00
Andrew Morgan
c823d2e98a 1.135.0rc2 2025-07-30 12:19:34 +01:00
Andrew Morgan
7ae7468159
Improve performance of is_server_admin by adding a cache (#18747)
Fixes https://github.com/element-hq/synapse/issues/18738
2025-07-30 10:43:39 +00:00
Eric Eastwood
d4af2970f3
Refactor Histogram metrics to be homeserver-scoped (#18724)
Bulk refactor `Histogram` metrics to be homeserver-scoped. We also add
lints to make sure that new `Histogram` metrics don't sneak in without
using the `server_name` label (`SERVER_NAME_LABEL`).

Part of https://github.com/element-hq/synapse/issues/18592



### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the TODO metrics with the `server_name`
label

### Todo

- [x] Wait for https://github.com/element-hq/synapse/pull/18656 to merge


### Dev notes

```
LoggingDatabaseConnection
make_conn
make_pool
make_fake_db_pool
```

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-07-29 15:35:38 -05:00
Eric Eastwood
31a38f57f5
Resolve breaking change to run_as_background_process in module API (#18737)
Fix https://github.com/element-hq/synapse/issues/18735

In https://github.com/element-hq/synapse/pull/18670, we updated
`run_as_background_process` to add a `server_name` argument. Because
this function is directly exported from the Synapse module API, this is
a breaking change to any downstream Synapse modules that use
`run_as_background_process`.

This PR shims and deprecates the existing
`run_as_background_process(...)` for modules by providing a stub
`server_name` value and introduces a new
`ModuleApi.run_as_background_process(...)` that covers the `server_name`
logic automagically.
2025-07-29 14:29:38 -05:00
Travis Ralston
5b8b45a16d
Allow admins to see policy server-flagged events (#18585) 2025-07-29 19:57:33 +01:00
Eric Eastwood
3d683350e9
Refactor LaterGauge metrics to be homeserver-scoped (#18714)
Part of https://github.com/element-hq/synapse/issues/18592
2025-07-29 13:49:41 -05:00
Benjamin Bouvier
106afe4984
MSC4306: expose feature in the client version (#18722) 2025-07-29 13:39:11 -05:00
Eric Eastwood
5106818bd0
Refactor GaugeBucketCollector metrics to be homeserver-scoped (#18715)
Refactor `GaugeBucketCollector` metrics to be homeserver-scoped

Part of https://github.com/element-hq/synapse/issues/18592


### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Adjust the number of [`msecs` in the `looping_call` so that
`_read_forward_extremities`](a82b8a966a/synapse/storage/databases/main/metrics.py (L79))
runs immediately instead of after an hour.
1. Observe response includes the `synapse_forward_extremities` and
`synapse_excess_extremity_events` metrics with the `server_name` label
2025-07-29 11:46:21 -05:00
Eric Eastwood
f13a136396
Refactor Gauge metrics to be homeserver-scoped (#18725)
Bulk refactor `Gauge` metrics to be homeserver-scoped. We also add lints
to make sure that new `Gauge` metrics don't sneak in without using the
`server_name` label (`SERVER_NAME_LABEL`).

Part of https://github.com/element-hq/synapse/issues/18592



### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the TODO metrics with the `server_name`
label

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
2025-07-29 10:37:59 -05:00
Eric Eastwood
2c236be058
Refactor Counter metrics to be homeserver-scoped (#18656)
Bulk refactor `Counter` metrics to be homeserver-scoped. We also add
lints to make sure that new `Counter` metrics don't sneak in without
using the `server_name` label (`SERVER_NAME_LABEL`).

All of the "Fill in" commits are just bulk refactor.

Part of https://github.com/element-hq/synapse/issues/18592



### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the `synapse_user_registrations_total`,
`synapse_http_server_response_count_total`, etc metrics with the
`server_name` label
2025-07-25 14:58:47 -05:00
reivilibre
458e6410e8
Reduce database usage in Sliding Sync by not querying for background update completion after the update is known to be complete. (#18718)
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
Co-authored-by: Eric Eastwood <erice@element.io>
2025-07-24 14:58:39 +00:00
reivilibre
1dd5f68251
Bump minimum version bound on Twisted to 21.2.0. (#18727)
Distro packagers have been consulted and as far as has been answered so
far, the lowest version of Twisted on the distros' platforms is 22.1, so
this bump should be safe.

This gives us 2 notable things:

- contextvar propagation support, which would let us remove A LOT of
logcontext machinery
  and vastly simplify logcontext rules!
- The test helpers have moved to the new location, so no longer will you
import test helpers
from the 'correct' (non-deprecated) path and get told by CI (olddeps)
that your test
  doesn't exist.

Changelog entries for those are reproduced below:

> - twisted.internet.defer.inlineCallbacks and ensureDeferred will now
associate a contextvars.Context with the coroutines they run, meaning
that ContextVar objects will maintain their value within the same
coroutine, similarly to asyncio Tasks. This functionality requires
Python 3.7+, or the contextvars PyPI backport to be installed for Python
3.5-3.6. (#<!--- -->9719, #<!--- -->9826)
>
> - twisted.test.proto_helpers has moved to twisted.internet.testing.
twisted.test.proto_helpers has been deprecated. (#<!--- -->6435)

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-24 15:39:54 +01:00
reivilibre
8344c944b1
Add configurable rate limiting for the creation of rooms. (#18514)
Default values will be 1 room per minute, with a burst count of 10.

It's hard to imagine most users will be affected by this default rate,
but it's intentionally non-invasive in case of bots or other users that
need to create rooms at a large rate.
Server admins might want to down-tune this on their deployments.

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2025-07-24 14:08:02 +00:00
Benjamin Bouvier
b34342eedf
MSC4306: register the thread subscriptions servlet in the client servlet section (#18726)
The MSC4306 endpoints were never registered, and thus never made
available, even if the experimental feature flag was enabled.
2025-07-24 10:33:34 +00:00
Quentin Gliech
61e79a4cdf
Fix deactivation running off the main process (#18716)
Best reviewed commit by commit.

With the new dedicated MAS API
(https://github.com/element-hq/synapse/pull/18520), it's possible that
deactivation starts off the main process, which was not possible because
of a few calls.

I basically looked at everything that the deactivation handler was
doing, reviewed whether it could run on workers or not, and find a
workaround when possible

---------

Co-authored-by: Eric Eastwood <erice@element.io>
2025-07-24 08:43:58 +00:00
Eric Eastwood
b7e7f537f1
Refactor background process metrics to be homeserver-scoped (#18670)
Part of https://github.com/element-hq/synapse/issues/18592

Separated out of https://github.com/element-hq/synapse/pull/18656
because it's a bigger, unique piece of the refactor


### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the background processs metrics
(`synapse_background_process_start_count`,
`synapse_background_process_db_txn_count_total`, etc) with the
`server_name` label
2025-07-23 13:28:17 -05:00