feat(ui): add download progress feedback with tqdm #65

Merged
jhamon merged 2 commits into main from jhamon/sdk-321-add-progress-feedback-to-dataset-io-operations
Feb 3, 2026

Conversation

@jhamon (Contributor) commented Feb 3, 2026

Problem

When loading datasets from cloud storage, users see no feedback during downloads. For large datasets, this creates a poor experience: users cannot tell whether the download is progressing or stalled, or how long it will take.

Solution

Added byte-level progress bars to dataset downloads using tqdm. The progress bar shows:

  • Total file size and bytes downloaded
  • Download speed (MB/s)
  • Estimated time remaining (ETA)
  • Progress percentage
  • Already-downloaded bytes when a download is resumed

Changes

pinecone_datasets/cache.py:

  • Import tqdm
  • Wrap _download_file method with tqdm progress bar
  • Display filename, file size, and download progress
  • Update progress bar on each chunk downloaded
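
The change described above can be sketched as follows. This is a minimal illustration, not the code from cache.py: the helper name `copy_with_progress`, its parameters, and the in-memory streams are assumptions made so the example is self-contained. Only the tqdm calls (`total`, `unit`, `unit_scale`, `desc`, `update`) are real library API.

```python
import io

from tqdm import tqdm


def copy_with_progress(src, dst, total_bytes, filename, chunk_size=8192):
    """Copy src to dst in chunks, advancing a byte-level tqdm bar per chunk.

    Hypothetical helper: stands in for the download loop in _download_file.
    """
    written = 0
    with tqdm(
        total=total_bytes,               # total file size in bytes
        unit="B",                        # report progress in bytes
        unit_scale=True,                 # auto-scale to k/M/G
        desc=f"Downloading {filename}",  # filename shown in the description
    ) as bar:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
            bar.update(len(chunk))       # advance by bytes actually written
    return written


# In-memory streams stand in for the remote object and the local cache file.
payload = b"x" * 100_000
src = io.BytesIO(payload)
dst = io.BytesIO()
copied = copy_with_progress(src, dst, len(payload), "part-0.parquet")
```

Updating the bar by `len(chunk)` rather than by `chunk_size` keeps the byte count accurate when the final chunk is short.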

Testing

  • ✅ All 190 tests pass (3 skipped)
  • ✅ No linter errors
  • ✅ Cache tests validate download functionality
  • ✅ Integration tests verify cloud storage operations

Example Output

Downloading part-0.parquet: 100%|████████| 1.23M/1.23M [00:02<00:00, 512kB/s]

When resuming an interrupted download:

Downloading part-0.parquet:  45%|████     | 512k/1.23M [00:01<00:01, 498kB/s]

Related

  • Closes SDK-321
  • Part of SDK-319 (Download Progress, Resumable Downloads, and Dataset Rebuilding)

Made with Cursor


Note

Low Risk
Low risk: changes are limited to wrapping file downloads with a tqdm progress bar; primary risk is minor output/TTY compatibility or overhead during downloads.

Overview
Adds byte-level download progress feedback to cached dataset downloads by wrapping CacheManager._download_file with a tqdm progress bar.

The progress bar is initialized with the remote file size and resume offset (initial=start_byte), updates on each chunk written, and displays the filename in the description while keeping existing resumable/metadata behavior intact.
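
A rough sketch of the resume behavior described here, assuming the partial file already holds the first `start_byte` bytes. The function name `resume_copy` and the in-memory streams are hypothetical; `initial` is tqdm's documented parameter for starting the counter at a non-zero value.

```python
import io

from tqdm import tqdm


def resume_copy(src, dst, total_bytes, start_byte, filename, chunk_size=8192):
    """Continue a partial copy, starting the bar at the bytes already present.

    Hypothetical helper: src.seek() stands in for a ranged remote read.
    """
    src.seek(start_byte)
    with tqdm(
        total=total_bytes,
        initial=start_byte,              # bar starts at the resume offset
        unit="B",
        unit_scale=True,
        desc=f"Downloading {filename}",
    ) as bar:
        # Read until the source returns an empty bytes object.
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(chunk)
            bar.update(len(chunk))
        return bar.n                     # initial offset plus bytes written now


payload = bytes(range(256)) * 400        # 102,400 bytes of sample data
start_byte = 40_000

src = io.BytesIO(payload)
dst = io.BytesIO()
dst.write(payload[:start_byte])          # simulate the earlier partial download

final_count = resume_copy(src, dst, len(payload), start_byte, "part-0.parquet")
```

Because the counter starts at `start_byte`, the percentage reflects the whole file rather than only the remaining portion.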

Written by Cursor Bugbot for commit f040bab.

Add byte-level progress bars to file downloads in CacheManager.
Shows download speed, ETA, and bytes transferred with proper
handling of resumed downloads.

- Import tqdm in cache.py
- Wrap download loop with tqdm progress bar
- Display file size, download speed, and ETA
- Show already-downloaded bytes when resuming
- All existing tests pass (190/190)

Related to SDK-321

Co-authored-by: Cursor <cursoragent@cursor.com>
@jhamon added the enhancement (New feature or request) label Feb 3, 2026
Co-authored-by: Cursor <cursoragent@cursor.com>
@jhamon merged commit 7ad9f4f into main Feb 3, 2026
11 checks passed
@jhamon deleted the jhamon/sdk-321-add-progress-feedback-to-dataset-io-operations branch February 3, 2026 19:28