Stabilize test suite: fix flaky timing tests, add rerun support, and handle Windows resource leaks by rodrigobnogueira · Pull Request #11992 · aio-libs/aiohttp

rodrigobnogueira · 2026-01-24T03:50:04Z

Description

What do these changes do?

Timing-based performance tests:
- Refactor test_import_time to use sys.version_info >= (3, 12) for relaxed thresholds, so the relaxed limit automatically applies to Python 3.12+ (including 3.14) without manual version additions. This fixes intermittent failures in CI.
- Similarly, stabilize test_regex_performance with adjusted thresholds and refactoring.
- Introduce a RerunThresholdParams NamedTuple and rerun_adjusted_threshold fixture for cleaner threshold handling.

Flaky test handling:
- Add the pytest-rerunfailures plugin to CI requirements and configure it to automatically rerun flaky tests, reducing spurious failures.

Resource leak and cleanup fixes (primarily Windows):

Are there changes in behavior for the user?

No user-facing changes. This only affects CI test behavior.

Is it a substantial burden for the maintainers to support this?

No — this actually reduces maintenance burden by making the test future-proof.
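As a rough illustration of why the `>=` comparison is future-proof, here is a minimal sketch. The function name and both limits are hypothetical and are not taken from the actual test; only the `sys.version_info >= (3, 12)` idea comes from the description above.

```python
import sys

# Hypothetical limits in milliseconds; the real test's values are not shown here.
BASE_LIMIT_MS = 200
RELAXED_LIMIT_MS = 300


def import_time_limit_ms(version_info=sys.version_info) -> int:
    """Pick the relaxed limit for Python 3.12 and every later release."""
    # Tuple comparison is lexicographic, so (3, 14) >= (3, 12) holds
    # without anyone having to list 3.13 or 3.14 explicitly.
    if tuple(version_info[:2]) >= (3, 12):
        return RELAXED_LIMIT_MS
    return BASE_LIMIT_MS
```

A membership check against an explicit list of versions would need a new entry for each release, which is the manual step this comparison avoids.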

Related issue number

Fixes flaky test failures in PR #11990.

Checklist

I think the code is well written
Unit tests for the changes exist
Documentation reflects the changes
If you provide code modification, please add yourself to CONTRIBUTORS.txt
Add a new news fragment into the CHANGES/ folder

codecov · 2026-01-24T03:57:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.75%. Comparing base (31fce7d) to head (0f3fb76).
⚠️ Report is 1 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #11992      +/-   ##
==========================================
- Coverage   98.76%   98.75%   -0.01%     
==========================================
  Files         127      127              
  Lines       44655    44674      +19     
  Branches     2367     2365       -2     
==========================================
+ Hits        44102    44117      +15     
- Misses        393      396       +3     
- Partials      160      161       +1

| Flag | Coverage Δ | |
|---|---|---|
| CI-GHA | `98.60% <98.03%> (-0.02%)` | ⬇️ |
| OS-Linux | `98.34% <96.07%> (-0.02%)` | ⬇️ |
| OS-Windows | `96.71% <90.19%> (+0.01%)` | ⬆️ |
| OS-macOS | `97.60% <88.23%> (+0.01%)` | ⬆️ |
| Py-3.10.11 | `97.14% <90.19%> (+0.01%)` | ⬆️ |
| Py-3.10.19 | `97.62% <96.07%> (-0.02%)` | ⬇️ |
| Py-3.11.14 | `97.82% <96.07%> (-0.02%)` | ⬇️ |
| Py-3.11.9 | `97.35% <90.19%> (+<0.01%)` | ⬆️ |
| Py-3.12.10 | `97.44% <88.23%> (+0.01%)` | ⬆️ |
| Py-3.12.12 | `97.92% <96.07%> (-0.01%)` | ⬇️ |
| Py-3.13.11 | `98.16% <96.07%> (-0.01%)` | ⬇️ |
| Py-3.14.2 | `98.14% <95.91%> (-0.01%)` | ⬇️ |
| Py-3.14.2t | `97.23% <95.91%> (-0.02%)` | ⬇️ |
| Py-pypy3.11.13-7.3.20 | `97.38% <88.23%> (+<0.01%)` | ⬆️ |
| VM-macos | `97.60% <88.23%> (+0.01%)` | ⬆️ |
| VM-ubuntu | `98.34% <96.07%> (-0.02%)` | ⬇️ |
| VM-windows | `96.71% <90.19%> (+0.01%)` | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

codspeed-hq · 2026-01-24T03:58:01Z

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing rodrigobnogueira:fix/import-time-test-python-3.14 (0f3fb76) with master (0cba798)

Summary

✅ 59 untouched benchmarks

Dreamsorcerer · 2026-01-24T15:07:20Z

I think the expectation was that import time would improve again in later versions. I think flaky failures today are mostly caused by regressions (e.g. idna: kjd/idna#188).

tests/test_imports.py

for more information, see https://pre-commit.ci

rodrigobnogueira · 2026-01-24T20:18:05Z

The CI failure on test_regex_performance popped up again in the macOS 3.12 job (as seen before—it's another timing-based test with a tight <10ms assertion).

I've updated it. Let me know if this looks good.

webknjaz

Can we try using https://pypi.org/p/pytest-rerunfailures? It should be more generic.

webknjaz · 2026-01-25T09:57:17Z

tests/test_client_middleware_digest_auth.py

@@ -1333,11 +1333,23 @@ async def handler(request: Request) -> Response:

 def test_regex_performance() -> None:


I wonder if this belongs in tests executed by codspeed 🤔

cc @bdraco

We could do, but the reason I put it here is that the exact performance is not important, it's designed to trip if there's a ReDoS issue, which is an order of magnitude problem. I just misjudged the initial value.

For reference, on my machine this now takes ~4ms, on the old regex it took closer to 60s. So, increasing the test threshold to 50ms is absolutely fine here.

We could also consider just skipping it on Mac OS, given that it seems to be 5-10x slower than Linux...
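To make the "order of magnitude" point concrete, here is a minimal sketch of a ReDoS tripwire. The pattern, the hostile input, and the helper name are hypothetical stand-ins, not the regex from the digest auth middleware; only the 50ms limit comes from the discussion above.

```python
import re
import time


def match_duration(pattern: re.Pattern, text: str) -> float:
    """Return the wall-clock seconds spent on one anchored match attempt."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


# Hypothetical key="value" pattern and an input with an unterminated quote,
# the kind of shape that makes a badly written regex backtrack heavily.
KEY_VALUE = re.compile(r'(\w+)="([^"]*)"')
hostile = 'realm="' + "a" * 10_000

elapsed = match_duration(KEY_VALUE, hostile)

# The limit is deliberately loose: a healthy regex finishes in a few
# milliseconds at most, while a ReDoS-prone one would take seconds.
LIMIT_SECONDS = 0.05
is_fast_enough = elapsed < LIMIT_SECONDS
```

Because the gap between the healthy and the pathological case is several orders of magnitude, the limit can sit well above the typical runtime without weakening the check.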

webknjaz · 2026-01-25T10:00:32Z

tests/test_imports.py

Another codspeed candidate?

I think bdraco mentioned that he might look at this one at some point. I'm not sure it's a trivial one to run in codspeed.

I imagine we could try their new CLI they've announced the other day (I think I got an email yesterday). But I agree this is not in the scope of this PR.

Dreamsorcerer · 2026-01-25T15:17:04Z

The CI failure on test_regex_performance popped up again in the macOS 3.12 job (as seen before—it's another timing-based test with a tight <10ms assertion).

I think we can just double that to 20ms. It's meant to highlight an issue as an order of magnitude, so I clearly misjudged the performance of Mac OS.

rodrigobnogueira · 2026-01-25T16:58:45Z

Thanks @webknjaz for suggesting pytest-rerunfailures.
A helper function was introduced for these time-based tests.
There are other tests that could be modified the same way if you want: test_forwarded_re_performance and test_cookie_pattern_performance. Both have a hardcoded 10ms threshold.

webknjaz

@bdraco any ideas on designing this better?

aiohttp/pytest_plugin.py

requirements/test-common.in

- Converted get_flaky_threshold function to a pytest fixture using indirect parametrization
- Renamed to rerun_adjusted_threshold for clarity
- Updated RST docstring with usage examples and rerun count logic
- Added proper type annotations (tuple[float, float]) for mypy compliance
- Updated test_imports.py and test_client_middleware_digest_auth.py to use new fixture
- Improved code readability by splitting long assertion lines
- Restored macOS timing observation comment (40-50ms)

tests/test_proxy_functional.py

tests/test_proxy_functional.py

tests/test_imports.py

webknjaz

I think it's good but let me know if you'd like to drop that helper function before merging.

Urgh.. looks like my browser cached the previous diff and I commented on the wrong thing.

tests/conftest.py

aiohttp/pytest_plugin.py

tests/test_client_middleware_digest_auth.py

tests/test_proxy_functional.py

requirements/test-common.in

rodrigobnogueira · 2026-01-30T02:22:39Z

Addressed intermittent CI failures on Windows with Python 3.10 and 3.11 in test_proxy_functional.py. They were caused by PytestUnraisableExceptionWarning (wrapping ResourceWarning: unclosed socket) raised from proxy.py's threaded mode during connection failures. This is related to known Windows quirks; see abhinavsingh/proxy.py#492 for context, though that issue primarily discusses threadless mode. The fix is a warning filter, which is not ideal since it's global, but it stabilizes the CI without masking unrelated issues and allows the tests to pass consistently.
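For readers unfamiliar with how a narrowly scoped filter avoids masking unrelated warnings, here is a small standalone sketch using only the standard `warnings` module. It is not the filter added in this PR (which targets PytestUnraisableExceptionWarning in the pytest configuration); the message pattern and the sample warnings are assumptions made up for illustration.

```python
import warnings


def collect_unfiltered() -> list:
    """Emit two ResourceWarnings and return the messages that get through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Ignore only ResourceWarnings whose message starts with a match for
        # this pattern; the filter is added last, so it is checked first.
        warnings.filterwarnings(
            "ignore",
            message=r"unclosed.*socket",
            category=ResourceWarning,
        )
        warnings.warn("unclosed <socket.socket fd=5>", ResourceWarning)
        warnings.warn("unclosed file <_io.TextIOWrapper>", ResourceWarning)
    return [str(w.message) for w in caught]


remaining = collect_unfiltered()
```

Only the socket warning is dropped here; the unrelated file warning is still recorded, which is the property the comment above relies on.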
All tests passed with the last commit.

…est thresholds

Replaces tuple-based threshold parameters with a self-documenting RerunThresholdParams NamedTuple containing 'base' and 'increment_per_rerun' fields for improved readability and maintainability.
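A minimal sketch of what such a NamedTuple could look like follows. The two field names come from the commit message above; the linear formula and the `adjusted_threshold` helper are assumptions for illustration, and how the fixture obtains the rerun count from pytest-rerunfailures is left out.

```python
from typing import NamedTuple


class RerunThresholdParams(NamedTuple):
    """Threshold settings for a timing test that may be rerun."""

    base: float  # limit in seconds for the first attempt
    increment_per_rerun: float  # extra seconds allowed on each rerun


def adjusted_threshold(params: RerunThresholdParams, rerun_count: int) -> float:
    """Assumed rule: widen the limit linearly with each rerun."""
    return params.base + params.increment_per_rerun * rerun_count


# Hypothetical values: 20ms on the first run, 10ms more per rerun.
params = RerunThresholdParams(base=0.02, increment_per_rerun=0.01)
first_attempt = adjusted_threshold(params, 0)
second_rerun = adjusted_threshold(params, 2)
```

Compared with a bare `tuple[float, float]`, the named fields make it clear at the call site which number is the starting limit and which is the per-rerun allowance.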

…ixture

rodrigobnogueira requested review from asvetlov and webknjaz as code owners January 24, 2026 03:50

psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Jan 24, 2026

rodrigobnogueira mentioned this pull request Jan 24, 2026

Skip benchmarks in ci when running in fork repositories #11737

Merged

5 tasks

Dreamsorcerer reviewed Jan 24, 2026

tests/test_imports.py (outdated, resolved)

Fix flaky import time test for Python 3.12+

cf37399

rodrigobnogueira force-pushed the fix/import-time-test-python-3.14 branch from 2a4a9d7 to cf37399 Compare January 24, 2026 19:19

[pre-commit.ci] auto fixes from pre-commit.com hooks

877749b

for more information, see https://pre-commit.ci

rodrigobnogueira force-pushed the fix/import-time-test-python-3.14 branch from 6fae85d to 2cda92f Compare January 24, 2026 20:14

rodrigobnogueira force-pushed the fix/import-time-test-python-3.14 branch 2 times, most recently from 5febb04 to cd1e76d Compare January 24, 2026 20:29

Fix flaky test_regex_performance timing test

553f63e

rodrigobnogueira force-pushed the fix/import-time-test-python-3.14 branch from be1fa7f to 553f63e Compare January 24, 2026 20:42

webknjaz reviewed Jan 25, 2026

webknjaz requested a review from bdraco January 25, 2026 10:00

rodrigo.nogueira and others added 2 commits January 25, 2026 13:49

Improve flaky test handling using pytest-rerunfailures

1eed494

[pre-commit.ci] auto fixes from pre-commit.com hooks

1e0e7b5

for more information, see https://pre-commit.ci

webknjaz reviewed Jan 26, 2026

aiohttp/pytest_plugin.py (outdated, resolved)

aiohttp/pytest_plugin.py (outdated, resolved)

requirements/test-common.in (resolved)

rodrigo.nogueira and others added 5 commits January 26, 2026 13:25

Fix socket leaks in TestShutdown suite for Windows CI

c1f3847

reverting the windows socket handling. The scope might be growing too…

5bc9395

… much

Merge branch 'master' into fix/import-time-test-python-3.14

6eaa28b

Regenerate test requirement pins to include pytest-rerunfailures

defe900

rodrigo.nogueira and others added 6 commits January 28, 2026 19:15

fix: increase Windows cleanup delay to 0.5s with multiple gc passes

85ffe07

The 0.1s delay was insufficient for proxy.py worker threads to release their sockets. Increased to 0.5s and added 3 gc.collect() passes to ensure all cyclic references are broken before pytest cleanup.

test: use extreme 5s delay to verify socket leak source

f905bc7

test: add delay after gc.collect() to test async finalization

f6a3a00

test: use thread polling instead of fixed sleep for Windows cleanup

7315a1b

Instead of waiting a fixed 5+ seconds, poll for proxy threads to finish with gc.collect() calls. This is faster on typical runs (exits as soon as threads are gone) while still having a 5s timeout for robustness.

[pre-commit.ci] auto fixes from pre-commit.com hooks

968d9bd

for more information, see https://pre-commit.ci

test: use baseline thread detection for Windows cleanup

e63fb6d

Instead of matching thread names (which was unreliable), capture the baseline set of threads before starting proxy.py and wait until all extra threads have finished.
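The baseline approach described in the last few commit messages can be sketched roughly as below. This is an assumption-laden illustration, not the fixture code from tests/test_proxy_functional.py: the function name, the polling interval, and the stand-in worker thread are hypothetical, while the 5s timeout and the gc.collect() calls follow the commit messages.

```python
import gc
import threading
import time


def wait_for_extra_threads(baseline, timeout=5.0, poll_interval=0.05) -> bool:
    """Return True once every non-baseline thread has finished, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        extra = [
            t for t in threading.enumerate()
            if t not in baseline and t.is_alive()
        ]
        if not extra:
            break
        if time.monotonic() >= deadline:
            return False
        gc.collect()  # help release sockets held by unreachable objects
        time.sleep(poll_interval)
    gc.collect()  # final pass after the loop
    return True


# Capture the baseline before starting anything, then start a stand-in worker.
baseline = set(threading.enumerate())
worker = threading.Thread(target=time.sleep, args=(0.2,))
worker.start()
finished = wait_for_extra_threads(baseline)
```

Comparing against a baseline set avoids depending on thread names, and polling lets the common case return as soon as the extra threads are gone instead of always paying a fixed delay.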

github-advanced-security bot found potential problems Jan 29, 2026

tests/test_proxy_functional.py (fixed)

tests/test_proxy_functional.py (fixed)

tests/test_proxy_functional.py (fixed)

fix: Improve thread cleanup in proxy test fixture by simplifying thre…

4d75c6a

…ad tracking and adding post-loop garbage collection.

github-advanced-security bot found potential problems Jan 29, 2026

tests/test_proxy_functional.py (fixed)

tests/test_proxy_functional.py (fixed)