Perf: Use IMAP SORT and batch fetch for metadata retrieval #100
jbkjr wants to merge 24 commits into ai-zerolab:main
Add 6 new MCP tools for IMAP folder operations:
- list_folders: List all folders/mailboxes with flags
- move_emails: Move emails between folders (MOVE or COPY+DELETE fallback)
- copy_emails: Copy emails to folder (useful for labels in Proton Mail)
- create_folder: Create new folders
- delete_folder: Delete folders
- rename_folder: Rename folders

This enables full folder management through the MCP interface, with special consideration for Proton Mail Bridge compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
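The "MOVE or COPY+DELETE fallback" for move_emails can be sketched as a small planning function. This is only an illustration, not the PR's code: `move_plan` and its tuple format are hypothetical, and it assumes the server's capability list is already available as strings.

```python
from typing import Iterable, List, Tuple


def move_plan(uids: List[str], target: str, capabilities: Iterable[str]) -> List[Tuple[str, ...]]:
    """Return the IMAP commands needed to move `uids` into `target`.

    Hypothetical helper: uses UID MOVE (RFC 6851) when the server
    advertises MOVE, otherwise falls back to COPY + mark \\Deleted + EXPUNGE.
    """
    uid_set = ",".join(uids)
    if "MOVE" in {cap.upper() for cap in capabilities}:
        return [("UID MOVE", uid_set, target)]
    return [
        ("UID COPY", uid_set, target),
        ("UID STORE", uid_set, "+FLAGS (\\Deleted)"),
        # Plain EXPUNGE removes every message flagged \Deleted in the mailbox.
        ("EXPUNGE",),
    ]
```

One caveat of the fallback in this sketch: a plain EXPUNGE also removes any other messages already flagged \Deleted in the source folder.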
Addresses performance concern raised in ai-zerolab#98 review. The previous fix fetched headers for ALL emails before sorting, causing O(n) network calls for large mailboxes.

Changes:
- Add SORT capability detection (RFC 5256)
- When SORT supported: server-side sorting, fetch only page headers
- Fallback: batch fetch Date headers, sort, fetch page headers
- Reduces network calls from O(n) to a constant two batch calls (O(1)) for any mailbox size

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
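A minimal sketch of the two paths described in this commit, written as pure functions so it runs without a server. All names here (`has_sort_capability`, `sort_command_args`, `order_by_date`, `page_of_uids`) are illustrative assumptions and not necessarily the PR's helpers.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def has_sort_capability(capabilities: Iterable[str]) -> bool:
    """True if the server advertises the SORT extension (RFC 5256)."""
    # Exact match, so that e.g. "SORT=DISPLAY" alone does not count.
    return any(cap.upper() == "SORT" for cap in capabilities)


def sort_command_args(descending: bool = True) -> Tuple[str, str, str]:
    """Arguments for a server-side sort, i.e. SORT (REVERSE DATE) UTF-8 ALL."""
    key = "REVERSE DATE" if descending else "DATE"
    return (f"({key})", "UTF-8", "ALL")


def order_by_date(dates: Dict[str, datetime], descending: bool = True) -> List[str]:
    """Fallback path: sort UIDs client-side from one batch-fetched date map."""
    return sorted(dates, key=lambda uid: dates[uid], reverse=descending)


def page_of_uids(sorted_uids: List[str], page: int, page_size: int) -> List[str]:
    """Select the UIDs of one 1-based page; only these need a header fetch."""
    start = (page - 1) * page_size
    return sorted_uids[start:start + page_size]
```

In either path, only the UIDs returned by `page_of_uids` would be passed to the second batch call that fetches full headers.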
Address maintainer feedback on PR ai-zerolab#99:
- Add enable_folder_management config flag (disabled by default)
- All folder management tools now require explicit opt-in
- Add MCP_EMAIL_SERVER_ENABLE_FOLDER_MANAGEMENT env var support
- Add comprehensive tests for folder management (30 new tests)
- Update README documentation with new setting

Tests cover:
- Permission checks when disabled (6 tests)
- Tool functionality when enabled (6 tests)
- Handler method tests (6 tests)
- EmailClient IMAP operation tests (6 tests)
- Edge cases (3 tests)
- Config tests (3 tests)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
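The opt-in behaviour could look roughly like the sketch below. Only the environment variable name comes from the commit message; the accepted truthy values, the function names, and the use of PermissionError are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "MCP_EMAIL_SERVER_ENABLE_FOLDER_MANAGEMENT"


def folder_management_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Read the opt-in flag; anything unset or unrecognised means disabled."""
    env = os.environ if environ is None else environ
    # Assumed set of truthy spellings; the real parser may differ.
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes", "on"}


def require_folder_management(enabled: bool) -> None:
    """Guard called at the top of each folder tool (hypothetical)."""
    if not enabled:
        raise PermissionError(
            f"Folder management is disabled; set {ENV_VAR}=true to enable it."
        )
```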
for more information, see https://pre-commit.ci
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Tests for _has_sort_capability helper
- Tests for _parse_date_from_header helper
- Tests for _batch_fetch_dates with success, empty, and error cases
- Tests for _batch_fetch_headers with success, empty, and error cases
- Tests for _parse_header_to_metadata including CC handling
- Tests for SORT path in get_emails_metadata_stream
- Tests for SORT fallback when SORT command fails
- Tests for empty search results
- Tests for ascending order
- Tests for pagination
- Tests for date fetch fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Log exception in _parse_date_from_header instead of silent pass (S110)
- Add noqa: C901 for get_emails_metadata_stream complexity
- Use RuntimeError instead of Exception in test (TRY002)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Different IMAP servers return UID in different positions:
- Some include UID in the FETCH line: b'1 FETCH (UID 1 BODY[...]'
- Others (like Proton Bridge) return UID separately: b' UID 1)'

Updated _batch_fetch_dates and _batch_fetch_headers to handle both formats by tracking pending UID/data and emitting results when the pair is complete, regardless of order.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
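The "pending UID/data" pairing can be sketched as follows. This is an illustrative reconstruction, not the PR's implementation: `pair_uids_with_data` is a hypothetical name, and the sketch assumes the client returns protocol lines as `bytes` and literal payloads as `bytearray` in one flat list.

```python
import re
from typing import Dict, Iterable, Optional

UID_RE = re.compile(rb"UID (\d+)")


def pair_uids_with_data(lines: Iterable[object]) -> Dict[str, bytes]:
    """Pair each message's UID with its literal data, whichever arrives first."""
    results: Dict[str, bytes] = {}
    pending_uid: Optional[str] = None
    pending_data: Optional[bytes] = None
    for item in lines:
        if isinstance(item, bytearray):
            # Literal payload (e.g. the requested header block).
            pending_data = bytes(item)
        elif isinstance(item, bytes):
            if b"FETCH" in item:
                # A new message response starts: drop any incomplete state.
                pending_uid, pending_data = None, None
            match = UID_RE.search(item)
            if match:
                pending_uid = match.group(1).decode()
        else:
            # Skip non-bytes items such as None or ints.
            continue
        if pending_uid is not None and pending_data is not None:
            results[pending_uid] = pending_data
            pending_uid, pending_data = None, None
    return results
```

Emitting only when both halves are present is what makes the order irrelevant: the UID may sit in the FETCH line before the data or in the closing line after it.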
Extract _append_header_metadata helper to reduce _batch_fetch_headers complexity from 11 to under 10 (ruff C901). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 6 new MCP tools for managing ProtonMail labels:
- list_labels: List all labels (filters Labels/ prefix folders)
- apply_label: Apply label to emails (copy to Labels/X)
- remove_label: Remove label from emails (delete from Labels/X)
- get_email_labels: Get all labels for an email
- create_label: Create new label
- delete_label: Delete label

Labels in ProtonMail Bridge are exposed as IMAP folders under the Labels/ prefix. These tools provide semantic operations for label management while using the underlying folder operations.

Includes comprehensive tests (26 new tests, all passing).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
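Because labels are just folders under Labels/, the mapping between label names and folder paths is simple string handling. The helpers below are a hypothetical sketch of that mapping, not the tools' actual code.

```python
from typing import Iterable, List

LABEL_PREFIX = "Labels/"


def labels_from_folders(folders: Iterable[str]) -> List[str]:
    """Keep only folders under Labels/ and strip the prefix."""
    return [
        name[len(LABEL_PREFIX):]
        for name in folders
        if name.startswith(LABEL_PREFIX) and len(name) > len(LABEL_PREFIX)
    ]


def label_folder(label: str) -> str:
    """Folder path that applying or removing a label would operate on."""
    return f"{LABEL_PREFIX}{label}"
```

Under this model, applying a label amounts to copying the message into `label_folder(label)`, and removing it amounts to deleting the copy from that folder.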
Add three new tri-state filter parameters:
- seen: True=read (SEEN), False=unread (UNSEEN), None=all
- flagged: True=starred (FLAGGED), False=not starred (UNFLAGGED), None=all
- answered: True=replied (ANSWERED), False=not replied (UNANSWERED), None=all

These filters enable compound searches like "unread emails from SenderX in Labels/Y" by combining mailbox, from_address, and seen parameters.

Also updated mailbox parameter description to document label usage (e.g., 'Labels/LabelName' for ProtonMail).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
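One way to turn the tri-state parameters into IMAP SEARCH keys is shown below. The function names and the list-of-strings return shape are assumptions for illustration; only the parameter names and the SEARCH keywords come from the commit message.

```python
from typing import List, Optional


def flag_criteria(
    seen: Optional[bool] = None,
    flagged: Optional[bool] = None,
    answered: Optional[bool] = None,
) -> List[str]:
    """Map each tri-state filter to a SEARCH key; None adds no constraint."""
    criteria: List[str] = []
    for value, when_true, when_false in (
        (seen, "SEEN", "UNSEEN"),
        (flagged, "FLAGGED", "UNFLAGGED"),
        (answered, "ANSWERED", "UNANSWERED"),
    ):
        if value is True:
            criteria.append(when_true)
        elif value is False:
            criteria.append(when_false)
    return criteria


def build_search(from_address: Optional[str] = None, **flags: Optional[bool]) -> List[str]:
    """Combine flag filters with an optional FROM filter; default to ALL."""
    criteria = flag_criteria(**flags)
    if from_address:
        criteria += ["FROM", f'"{from_address}"']
    return criteria or ["ALL"]
```

For the compound example in the commit message, the mailbox (for instance Labels/Y) would be selected first and `build_search(from_address=..., seen=False)` would supply the criteria.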
Add clearer documentation for the mailbox parameter across all tools, explaining standard IMAP folders and provider-specific paths for Gmail ([Gmail]/...) and ProtonMail Bridge (Folders/<name>, Labels/<name>). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add functionality to mark emails as read or unread using IMAP \Seen flag:
- EmailMarkResponse model for operation results
- mark_emails abstract method in EmailHandler
- Implementation in EmailClient and ClassicEmailHandler
- MCP tool exposed via app.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
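At the protocol level, marking as read or unread is a STORE that adds or removes the \Seen flag. The helper below is a hypothetical sketch of how the arguments might be built; it does not reproduce the EmailClient implementation.

```python
from typing import List, Tuple


def seen_store_args(uids: List[str], mark_as_read: bool) -> Tuple[str, str, str]:
    """Arguments for UID STORE: +FLAGS adds \\Seen, -FLAGS removes it."""
    operation = "+FLAGS" if mark_as_read else "-FLAGS"
    return (",".join(uids), operation, r"(\Seen)")
```

For example, `seen_store_args(["4", "9"], mark_as_read=False)` corresponds to the command `UID STORE 4,9 -FLAGS (\Seen)`.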
Clarify that move_emails removes from source folder and apply_label only tags without removing from INBOX. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked test improvements from feature/folder-management:
- Comprehensive tests for folder management edge cases
- Tests for _parse_list_response exception handling
- Tests for EmailClient.delete_emails coverage
- Ruff RUF059 fixes for unused variables

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked test improvements from pr/mark-read-unread:
- Comprehensive tests for mark_emails functionality
- Logout error test for mark_emails coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Can you resolve the conflict?
- Add test for batch_fetch_dates with UID after data
- Add test for batch_fetch_headers with UID after data
- Add test for bytes without UID match (continue path)

These tests cover the alternate IMAP response format used by Proton Mail Bridge where UID comes after the data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add test for empty To header in _parse_header_to_metadata
- Add tests for non-bytes items (None, int) in _batch_fetch_dates loop
- Add tests for non-bytes items (None, int) in _batch_fetch_headers loop
- Achieves 100% branch coverage on batch fetch optimization code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Test empty SORT response handling (lines 451-453)
- Test empty page after pagination with SORT (line 463)
- Test empty email_ids after split (lines 494-495)
- Test empty page in fallback path (line 518)
- Test logout error handling (lines 534-535)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
a057364 to 813a274
Merge conflicts resolved - rebased onto main. Ready for review.
Looks like #107 addresses this with a similar approach using INTERNALDATE (which is probably better than parsing Date headers anyway - faster and more reliable).

One thing from this PR that might still be valuable: the IMAP SORT capability check. When the server supports the SORT extension (RFC 5256), it can return UIDs already sorted, avoiding the need to fetch dates entirely. That's the optimal path when available.

Happy to close this PR, or I could extract just the SORT optimization as a smaller enhancement to #107 if there's interest.
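For comparison, an INTERNALDATE-based ordering might be sketched like this. It is not taken from #107; the function names are hypothetical, and it assumes each FETCH response line carries both UID and INTERNALDATE in the standard "dd-Mon-yyyy hh:mm:ss +zzzz" form.

```python
import re
from datetime import datetime
from typing import Dict, Iterable, List, Optional, Tuple

UID_RE = re.compile(rb"UID (\d+)")
INTERNALDATE_RE = re.compile(rb'INTERNALDATE "([^"]+)"')


def parse_internaldate(line: bytes) -> Optional[Tuple[str, datetime]]:
    """Extract (uid, timezone-aware datetime) from one FETCH response line."""
    uid = UID_RE.search(line)
    stamp = INTERNALDATE_RE.search(line)
    if not uid or not stamp:
        return None
    # The day may be space-padded (" 1-Jan-2024 ..."), so strip first.
    text = stamp.group(1).decode().strip()
    return uid.group(1).decode(), datetime.strptime(text, "%d-%b-%Y %H:%M:%S %z")


def uids_newest_first(lines: Iterable[bytes]) -> List[str]:
    """Order UIDs by server receipt time without fetching any headers."""
    dates: Dict[str, datetime] = {}
    for line in lines:
        parsed = parse_internaldate(line)
        if parsed:
            dates[parsed[0]] = parsed[1]
    return sorted(dates, key=lambda uid: dates[uid], reverse=True)
```

Unlike a Date header, INTERNALDATE is assigned by the server in a fixed format, so no tolerant header parsing is needed.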
Closing as superseded by upstream's PR #107 which implemented a similar batch fetch optimization.
Summary
Addresses the performance concern raised by @Wh1isper in #98 review. The previous fix fetched headers for ALL emails before sorting, which could cause performance issues with large mailboxes.
Before: n individual IMAP FETCH calls for n emails — O(n)
After: 2 batch IMAP calls regardless of mailbox size — O(1)
Changes
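- _has_sort_capability() to detect the IMAP SORT extension (RFC 5256)
- _batch_fetch_dates() for efficient date-only header fetching
- _batch_fetch_headers() for batch full header fetching
- get_emails_metadata_stream() with two paths:
  - SORT supported: server-side sorting, then fetch headers for the requested page only
  - Fallback: batch fetch Date headers, sort, then fetch headers for the requested page
Performance Comparison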
Test plan
- test_get_emails_stream to verify batch fetch behavior

🤖 Generated with Claude Code