Add bulk transcribe and bulk export operations #57

Draft
Copilot wants to merge 4 commits into main from copilot/add-bulk-transcribe-export-functionality

Copilot AI commented Feb 6, 2026

Adds batch processing capabilities for transcription and export workflows. Users can now transcribe multiple uploaded files with a single configuration and export multiple completed transcriptions as a ZIP archive.

Changes

UI (pages/home.py)

  • Bulk Transcribe button: enabled when ≥1 uploaded file selected
  • Bulk Export button: enabled when ≥2 completed files selected with matching output format
  • toggle_buttons() checks selection state and format compatibility before enabling either button
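
The enablement rules above can be sketched as a pure function, kept separate from the UI so it is easy to test. This is a minimal sketch, not the code in this PR: the function name bulk_button_states and the row keys "status" and "output_format" are assumptions about how table rows are represented.

```python
from typing import Dict, Iterable, Mapping


def bulk_button_states(selected_rows: Iterable[Mapping[str, str]]) -> Dict[str, bool]:
    """Return which bulk buttons should be enabled for the current selection.

    Hypothetical helper: row keys "status" and "output_format" are assumed.
    """
    rows = list(selected_rows)
    uploaded = [r for r in rows if r.get("status") == "Uploaded"]
    completed = [r for r in rows if r.get("status") == "Completed"]
    # All completed rows must share one output format (for example all SRT).
    formats = {r.get("output_format") for r in completed}
    return {
        "bulk_transcribe": len(uploaded) >= 1,
        "bulk_export": len(completed) >= 2 and len(formats) == 1,
    }
```

toggle_buttons() could call a helper like this on every selection change and apply the two booleans to the buttons.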

Core Logic (utils/common.py)

table_bulk_transcribe(table)

  • Filters selected rows for "Uploaded" status
  • Opens dialog with file count and standard transcription settings
  • Delegates to existing start_transcription() with row list

table_bulk_export(table) + execute_bulk_export()

  • Validates homogeneous output format (prevents mixing TXT/SRT)
  • Fetches transcription results via GET /api/v1/transcriber/{uuid}/result
  • Parses with SRTEditor (parse_srt/parse_txt based on format)
  • Exports to selected format (SRT/VTT for subtitles, TXT/JSON/RTF/CSV/TSV for transcriptions)
  • Creates in-memory ZIP with timestamped filename
  • Tracks per-file success/failure, continues on individual errors
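
The ZIP and error-tracking steps can be illustrated with the standard library alone. The sketch below assumes each file's export is supplied as a callable that returns text or raises; build_export_zip and its parameters are hypothetical names, and only the bulk_export_<timestamp>.zip naming comes from this description.

```python
import io
import zipfile
from datetime import datetime


def build_export_zip(exports, now=None):
    """Build an in-memory ZIP from (filename, producer) pairs.

    Hypothetical helper: each producer is a zero-argument callable that
    returns the exported text or raises on failure. A failure is recorded
    and the batch continues with the next file.
    """
    buffer = io.BytesIO()
    succeeded, failed = [], []
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, produce in exports:
            try:
                content = produce()
                archive.writestr(name, content)
                succeeded.append(name)
            except Exception as exc:  # keep going on individual errors
                failed.append((name, str(exc)))
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"bulk_export_{stamp}.zip", buffer.getvalue(), succeeded, failed
```

The returned bytes can then be handed to whatever download mechanism the page uses, and the two lists drive the success and failure notifications.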

Example export flow:

# User selects 3 completed SRT files
# Validation: all have output_format="SRT" ✓
# User selects VTT export format
# System:
#   - Fetches each result from API
#   - SRTEditor.parse_srt(result_data)
#   - SRTEditor.export_vtt() → content
#   - ZIP: interview.vtt, meeting.vtt, podcast.vtt
# Download: bulk_export_20260206_200400.zip
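
The parse-then-export step in the flow above could be written as a small dispatch helper. This is a sketch under stated assumptions: the method names come from this description, but the helper name export_with_editor, the assumption that the parse methods take the fetched data, and the assumption that the export methods take no arguments and return text are not confirmed by the source.

```python
# Parser method per job output format (method names from the PR description).
PARSERS = {"SRT": "parse_srt", "TXT": "parse_txt"}

# Allowed export formats per job output format, mapped to editor methods.
EXPORTERS = {
    "SRT": {"srt": "export_srt", "vtt": "export_vtt"},
    "TXT": {
        "txt": "export_txt",
        "json": "export_json",
        "rtf": "export_rtf",
        "csv": "export_csv",
        "tsv": "export_tsv",
    },
}


def export_with_editor(editor, output_format, export_format, data):
    """Parse `data` with `editor`, then export it in `export_format`.

    Hypothetical helper: `editor` is any object exposing the parse_* and
    export_* methods; their signatures here are assumptions.
    """
    try:
        exporter = EXPORTERS[output_format][export_format]
    except KeyError:
        raise ValueError(
            f"Cannot export {output_format!r} job as {export_format!r}"
        ) from None
    getattr(editor, PARSERS[output_format])(data)
    return getattr(editor, exporter)()
```

Keeping the allowed combinations in one table means the export dialog and the validation can share the same source of truth.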

Error Handling

  • Individual file failures don't abort batch
  • Success notification reports N of M files exported
  • Mixed format selection shows validation error before API calls
  • Empty ZIP prevented if all exports fail
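
A rough illustration of the two guards at either end of the batch: format validation before any API call, and the decision about whether there is anything to download afterwards. The function names and the "output_format" row key are assumptions, and the message strings are illustrative only.

```python
def validate_same_format(rows):
    """Return the single shared output format, or raise before any API call.

    Hypothetical helper: rows are assumed to be dicts with "output_format".
    """
    formats = {row.get("output_format") for row in rows}
    if len(formats) > 1:
        raise ValueError("Cannot bulk export files with different output formats.")
    return next(iter(formats), None)


def export_summary(succeeded, total):
    """Return the notification text, or None when no file was exported.

    A None result signals the caller to skip the download entirely, so an
    empty ZIP is never offered.
    """
    if succeeded == 0:
        return None
    return f"Exported {succeeded} of {total} files"
```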

Filename Processing

Strips known audio/video extensions (.mp3, .wav, .mp4, etc.) before appending the export extension. Falls back to generic extension stripping for unknown formats.
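
One way to implement this naming rule is sketched below. The extension tuple contains only the three extensions named above (the full list in the PR is not shown here), and export_filename is a hypothetical name.

```python
import os

# Only the extensions named in the description; the real list is longer ("etc.").
KNOWN_MEDIA_EXTENSIONS = (".mp3", ".wav", ".mp4")


def export_filename(original, export_ext):
    """Replace the media extension of `original` with `export_ext`."""
    lower = original.lower()
    for ext in KNOWN_MEDIA_EXTENSIONS:
        if lower.endswith(ext):
            # Known media extension: strip it case-insensitively.
            stem = original[: -len(ext)]
            break
    else:
        # Unknown format: fall back to generic extension stripping.
        stem, _ = os.path.splitext(original)
    return f"{stem}.{export_ext.lstrip('.')}"
```

For example, export_filename("interview.mp3", "vtt") yields "interview.vtt", matching the names in the example export flow.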

Original prompt

Add Bulk Transcribe and Bulk Export Functionality

Overview

Add the ability for users to transcribe multiple uploaded files at once and export multiple completed transcriptions at once from the home page.

Current Behavior

  • Users can only transcribe one file at a time via the "Transcribe" button that appears on each row
  • Users can only export one file at a time from within the SRT editor (/srt page)

Required Changes

1. Bulk Transcribe Feature

In pages/home.py:

  • Add a "Bulk Transcribe" button that becomes visible/enabled when one or more rows with status "Uploaded" are selected
  • The button should appear in the button row alongside the existing "Upload" and "Delete" buttons

In utils/common.py:

  • Create a new function table_bulk_transcribe(table: ui.table) that:
    • Gets all selected rows from the table
    • Filters to only include rows with status "Uploaded"
    • Opens a dialog similar to table_transcribe() but shows the count of files to be transcribed
    • Allows user to select language, number of speakers, and output format (Transcribed text or Subtitles)
    • On confirmation, calls start_transcription() with all selected rows (the existing function already accepts a list of rows)
    • Shows appropriate success/error notifications

2. Bulk Export Feature

In pages/home.py:

  • Add a "Bulk Export" button that becomes visible/enabled when multiple rows with status "Completed" are selected AND all selected rows have the same output_format (either all TXT or all SRT)
  • The button should appear in the button row alongside the other buttons

In utils/common.py:

  • Create a new function table_bulk_export(table: ui.table) that:
    • Gets all selected rows from the table
    • Filters to only include rows with status "Completed"
    • Validates that all selected rows have the same output_format (TXT or SRT) - if not, show an error notification
    • Opens an export dialog that allows selecting the export format:
      • For SRT files: offer SRT, VTT formats
      • For TXT files: offer TXT, JSON, RTF, CSV, TSV formats
    • Fetches the transcription result for each selected job from the API (GET /api/v1/transcriber/{uuid}/result)
    • Generates the export content for each file using the appropriate export method
    • Creates a ZIP file containing all exported files and triggers download
    • Shows progress during export and appropriate success/error notifications

API Integration:

  • Use the existing API endpoint GET /api/v1/transcriber/{uuid}/result to fetch each transcription result
  • The response contains the transcription data that can be parsed and exported

Export Logic:

  • For SRT format jobs, use the SRTEditor class methods: parse_srt(), export_srt(), export_vtt()
  • For TXT format jobs, use the SRTEditor class methods: parse_txt(), export_txt(), export_json(), export_rtf(), export_csv(), export_tsv()
  • Create a helper function to instantiate SRTEditor, load the data, and export in the desired format

ZIP File Creation:

  • Use Python's zipfile module to create an in-memory ZIP file
  • Name each file in the ZIP as {original_filename}.{export_extension}
  • Use ui.download() to trigger the ZIP file download

3. UI/UX Considerations

Button Visibility Logic in pages/home.py:

  • Update the toggle_buttons() function to also control:
    • "Bulk Transcribe" button: enabled when at least one selected row has status "Uploaded"
    • "Bulk Export" button: enabled when at least 2 selected rows have status "Completed" AND all completed selected rows have the same output_format

Styling:

  • Use consistent styling with existing buttons (use default_styles CSS classes)
  • "Bulk Transcribe" button should use the default-style class
  • "Bulk Export" button should use the button-default-style class

4. Files to Modify

  1. pages/home.py:

    • Import table_bulk_transcribe and table_bulk_export from utils.common
    • Add "Bulk Transcribe" and "Bulk Export" buttons
    • Update toggle_buttons() to handle new button states
  2. utils/common.py:

    • Add table_bulk_transcribe() function
    • Add table_bulk_export() function
    • Add helper function for fetching and exporting transcription results

5. Example Implementation Details

Bulk Transcribe Dialog should show:

Bulk Transcription Settings
---------------------------
Selected files: 5

Language: [dropdown]
Number of speakers: [number input]
Output format: [radio: Transcribed text / Subtitles]

[Cancel] [Start transcribing]

Bulk Export Dialog should show:

Bulk Export
-----------
Selected files: 3 (Subtitles)

Export format: [dropdown based on type]

[Cancel] [Export All]

6. Error Handling

  • If bulk transcribe is clicked with no "Uploaded" files selected, show a notification
  • If bulk export is clicked with mixed output formats, show error: "Cannot bulk export files with different output formats. Please select only Transcription...

This pull request was created from Copilot chat.


Copilot AI and others added 3 commits February 6, 2026 20:06
Co-authored-by: krihal <72446+krihal@users.noreply.github.com>
Co-authored-by: krihal <72446+krihal@users.noreply.github.com>
Co-authored-by: krihal <72446+krihal@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add bulk transcribe and bulk export functionality" to "Add bulk transcribe and bulk export operations" Feb 6, 2026
Copilot AI requested a review from krihal February 6, 2026 20:12

2 participants