Perf: Batch fetch emails to reduce IMAP round trips #107
Merged
Wh1isper merged 2 commits into ai-zerolab:main on Jan 19, 2026
Member
@avarun42 Thanks, I like this PR! I've made some changes to the parallelization. Do you think they're appropriate?
Previously, get_emails_metadata_stream() fetched headers one-by-one for ALL emails before sorting and paginating. For a mailbox with 25k emails, this meant 25k IMAP round trips taking 30+ minutes.

Now uses a two-phase batch approach:
1. Batch fetch INTERNALDATE for all UIDs (chunked at 5000 for Yahoo)
2. Sort by date in Python, then paginate
3. Batch fetch full headers for the requested page only (typically 10)

Uses INTERNALDATE (server receipt time) instead of Date header for sorting. Tested on 25k emails: max position difference of 20, average 0.1 vs Date header sorting - negligible for UX, but 40x faster to fetch.

Performance: 25k emails page 1 goes from 30+ min to ~2 seconds.
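A minimal sketch of the chunk, sort, then paginate flow described in the commit message above. It makes no IMAP connection: the `fake_server` dictionary stands in for the INTERNALDATE responses, and the helper names (`chunked`, `parse_internaldate`, `page_of_uids`) are illustrative assumptions, not the PR's actual functions.

```python
from datetime import datetime


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def parse_internaldate(raw):
    """Parse an IMAP date-time such as '19-Jan-2026 08:30:00 +0000'."""
    # IMAP may pad a single-digit day with a leading space, hence strip().
    return datetime.strptime(raw.strip(), "%d-%b-%Y %H:%M:%S %z")


def page_of_uids(dates_by_uid, page, page_size, newest_first=True):
    """Sort UIDs by date and return only the UIDs on the requested 1-based page."""
    ordered = sorted(dates_by_uid, key=dates_by_uid.get, reverse=newest_first)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


# Hypothetical stand-in for phase 1: dates a server might return per UID.
fake_server = {
    "101": "17-Jan-2026 09:00:00 +0000",
    "102": "19-Jan-2026 08:30:00 +0000",
    "103": "18-Jan-2026 23:15:00 +0100",
    "104": " 5-Jan-2026 12:00:00 +0000",
}

dates_by_uid = {}
for chunk in chunked(list(fake_server), 2):  # the PR chunks at 5000
    for uid in chunk:
        dates_by_uid[uid] = parse_internaldate(fake_server[uid])

# Only these UIDs would need a full header fetch for page 1.
first_page = page_of_uids(dates_by_uid, page=1, page_size=2)
second_page = page_of_uids(dates_by_uid, page=2, page_size=2)
```

The point of the design is that the expensive per-message data (full headers) is requested only for the UIDs that land on the page, while the cheap per-message data (dates) is requested for everything in a few large batches.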
- Parallelize _batch_fetch_dates using asyncio.gather for better performance
- Improve variable naming (t0/t1/t2 -> descriptive names)
- Restore helpful comments for code readability
- Fix type handling in _batch_fetch_headers (accept both bytes and str)
- Add comprehensive tests for batch methods

Co-Authored-By: Paintress <paintress@arcoer.com>
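To illustrate the first bullet, here is a rough sketch of fanning chunked fetches out with `asyncio.gather`. The coroutine `fetch_dates_for_chunk` is a hypothetical stand-in that returns placeholder strings instead of talking to a server, and `batch_fetch_dates` only mirrors the general shape of the idea, not the PR's `_batch_fetch_dates` implementation.

```python
import asyncio


async def fetch_dates_for_chunk(chunk):
    """Hypothetical stand-in for one batched FETCH; returns {uid: placeholder}."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {uid: f"date-for-{uid}" for uid in chunk}


async def batch_fetch_dates(uids, chunk_size):
    """Split UIDs into chunks and fetch all chunks concurrently."""
    chunks = [uids[i:i + chunk_size] for i in range(0, len(uids), chunk_size)]
    # gather() schedules the coroutines concurrently and returns their
    # results in the same order as the inputs.
    results = await asyncio.gather(*(fetch_dates_for_chunk(c) for c in chunks))
    merged = {}
    for partial in results:
        merged.update(partial)
    return merged


dates = asyncio.run(batch_fetch_dates(["1", "2", "3", "4", "5"], chunk_size=2))
```

Whether concurrent commands are actually safe on a single IMAP connection depends on the client library and server, so treat the concurrency here as an illustration of `asyncio.gather` rather than a claim about how the real method behaves.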
b413eb2 to 635d10e
jbkjr pushed a commit to jbkjr/mcp-email-server that referenced this pull request on Jan 25, 2026
Merged latest upstream changes including:
- verify_ssl option for SMTP connections (PR ai-zerolab#105)
- Batch fetch emails performance optimization (PR ai-zerolab#107)
- Pre-commit updates

Preserved local features:
- Folder management tools
- Label management tools (ProtonMail)
- Mark emails as read/unread
- Improved mailbox parameter descriptions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jbkjr pushed a commit to jbkjr/mcp-email-server that referenced this pull request on Jan 26, 2026
aioimaplib returns FETCH responses in 3 separate parts:
- i: b'N FETCH (BODY[HEADER] {size}' - contains BODY[HEADER]
- i+1: bytearray(...) - raw header content
- i+2: b' UID N)' - contains UID
The original code assumed UID was on the same line as BODY[HEADER], but
aioimaplib separates them. This caused list_emails_metadata to return
empty results when used with actual IMAP servers.
Also fixes test mocks to use correct response format.
Fixes batch fetch regression introduced in PR ai-zerolab#107.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
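The three-part layout described in this commit message can be handled by reading each FETCH item as a triple. The sketch below is an assumption-laden illustration: the `sample` list is synthetic data shaped like the description above, and `parse_header_fetch` is a hypothetical helper, not the function from the fix.

```python
import re
from email import message_from_bytes

UID_RE = re.compile(rb"UID (\d+)")


def parse_header_fetch(lines):
    """Map UID -> parsed header message from a three-part FETCH response list."""
    parsed = {}
    i = 0
    while i + 2 < len(lines):
        first, body, tail = lines[i], lines[i + 1], lines[i + 2]
        starts_item = isinstance(first, (bytes, bytearray)) and b"BODY[HEADER]" in first
        tail_is_bytes = isinstance(tail, (bytes, bytearray))
        if starts_item and isinstance(body, bytearray) and tail_is_bytes:
            # The UID lives on the line AFTER the raw header content.
            match = UID_RE.search(bytes(tail))
            if match:
                parsed[match.group(1).decode()] = message_from_bytes(bytes(body))
                i += 3
                continue
        i += 1
    return parsed


# Synthetic response shaped like the three parts listed above.
sample = [
    b"1 FETCH (BODY[HEADER] {39}",
    bytearray(b"Subject: Hello\r\nFrom: a@example.com\r\n\r\n"),
    b" UID 101)",
    b"2 FETCH (BODY[HEADER] {37}",
    bytearray(b"Subject: Bye\r\nFrom: b@example.com\r\n\r\n"),
    b" UID 102)",
    b"FETCH completed",
]

headers = parse_header_fetch(sample)
```

A parser that looks for the UID on the same line as `BODY[HEADER]` finds nothing in this layout, which matches the empty-results symptom the commit message reports.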
jbkjr pushed a commit to jbkjr/mcp-email-server that referenced this pull request on Jan 26, 2026
Remove TestParseHeaderToMetadata (uses non-existent _parse_header_to_metadata) and TestGetEmailsStreamWithSort (tests SORT capability from PR ai-zerolab#107) that were accidentally included during rebase. These test upstream's batch fetch implementation, not the folder management feature.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
For context, the state of the current main branch was completely unusable with my email account before this diff. It had a massive n+1 query which would take many minutes before timing out.
Summary
Why INTERNALDATE?
Server receipt time is 40x faster to fetch than parsing Date headers, with negligible sorting differences (tested: max 20 position difference on 25k emails).
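One way a "position difference" between two sort orders could be computed is sketched below. This is an assumption about the measurement, not the author's actual test script, and the two orderings are made-up toy data.

```python
def position_differences(order_a, order_b):
    """For each item, how many positions apart it sits in the two orderings."""
    index_in_b = {item: pos for pos, item in enumerate(order_b)}
    return [abs(pos - index_in_b[item]) for pos, item in enumerate(order_a)]


# Toy data: the same five messages sorted two different ways.
by_internaldate = ["e", "d", "c", "b", "a"]
by_date_header = ["e", "c", "d", "b", "a"]

diffs = position_differences(by_internaldate, by_date_header)
max_diff = max(diffs)
avg_diff = sum(diffs) / len(diffs)
```

In this toy case two neighbouring messages swap places, so the maximum displacement is 1. On the real mailbox the reported figures were a maximum of 20 and an average of 0.1.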
Performance
Test plan
_parse_headers, _batch_fetch_dates, _batch_fetch_headers