
feat(cancell-job-on-delete): Cancel queue job after file deletion in db #1058

Open

MohdShoaib-18169 wants to merge 2 commits into main from cancell-job-on-delete

Conversation


@MohdShoaib-18169 (Contributor) commented on Oct 7, 2025

Description

Testing

Additional Notes

Summary by CodeRabbit

  • New Features
    • Background processing now auto-cancels for deleted collections, folders, and files; uploads and auto-created folders are assigned cancellation keys so in-progress jobs can be stopped (including PDF-specific processing).
  • Bug Fixes
    • Prevents lingering processing jobs after deletions, reducing delays and inconsistencies.
    • Improved reliability during uploads and folder auto-creation; adds per-item cancellation handling and preserves logging for success/failure.


coderabbitai bot commented Oct 7, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds an internal cancelProcessingJobs utility and integrates it into the collection/folder/file creation and deletion flows, assigning singleton keys (collection_<id>, folder_<id>, file_<id>) to queued processing jobs and cancelling the related jobs when items or collections are deleted.

Changes

All changes are in server/api/knowledgeBase.ts:

  • Cancellation utility: Adds an internal cancelProcessingJobs(itemsToDelete, collectionId?) that cancels jobs in FileProcessingQueue, PdfFileProcessingQueue, and collection-level jobs, then aggregates and logs cancellation results and errors.
  • Delete integrations: Calls cancelProcessingJobs after the DeleteCollectionApi and DeleteItemApi transactions succeed, cancelling in-flight processing for deleted items and collections.
  • Enqueue singleton keys (collections): Sets singletonKey: collection_<id> when enqueuing collection-level jobs in CreateCollectionApi.
  • Enqueue singleton keys (folders): Sets singletonKey: folder_<id> in CreateFolderApi and EnsureFolderPath, including folders auto-created during uploads.
  • Enqueue singleton keys (files): Sets singletonKey: file_<id> for each uploaded file in UploadFilesApi; routes PDFs to PdfFileProcessingQueue and other files to FileProcessingQueue using these keys.
  • Upload/batch ensure behavior: Applies the folder_<id> singletonKey during batch/path-based folder creation in upload flows and assigns per-file singleton keys during upload.
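
To make the walkthrough concrete, here is a minimal TypeScript sketch of such a utility, pieced together from the summary above and the review diff further down the thread. The import path is assumed, and the parameter shape is simplified (per the review diff, the real function also takes a userEmail for logging); only the queue names, the singleton-key format, and the boss.cancel calls are taken from the PR.

```typescript
// Sketch only: the import path and parameter shape are assumptions, not the PR's code.
import { boss, FileProcessingQueue, PdfFileProcessingQueue } from "../queue/api-server-queue"

interface CancelProcessingJobsParams {
  itemsToDelete: Array<{ id: string; type: string }>
  collectionId?: string
}

// Best-effort cancellation: every failure is swallowed so deletion never blocks on the queue.
async function cancelProcessingJobs({
  itemsToDelete,
  collectionId,
}: CancelProcessingJobsParams): Promise<void> {
  await Promise.all(
    itemsToDelete.map(async (item) => {
      const keyPrefix = item.type === "file" ? "file" : "folder"
      const singletonKey = `${keyPrefix}_${item.id}`
      if (item.type === "file") {
        // A file job may live in either queue, so cancel in both.
        await Promise.all([
          boss.cancel(FileProcessingQueue, singletonKey).catch(() => {}),
          boss.cancel(PdfFileProcessingQueue, singletonKey).catch(() => {}),
        ])
      } else {
        await boss.cancel(FileProcessingQueue, singletonKey).catch(() => {})
      }
    }),
  )

  if (collectionId) {
    // Collection-level jobs use the collection_<id> key; the target queue here is assumed.
    await boss.cancel(FileProcessingQueue, `collection_${collectionId}`).catch(() => {})
  }
}
```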

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as Client
  participant K as KnowledgeBase API
  participant DB as Database
  participant Q as Processing Queues

  rect rgba(230,245,255,0.6)
  note over U,K: Delete collection or items
  U->>K: DeleteCollection/DeleteItem request
  K->>DB: Delete records (collection/items)
  K->>Q: cancelProcessingJobs(items, collectionId?)
  note over K,Q: Cancels file, PDF, and collection jobs (singleton keys)
  K-->>U: Deletion success
  end
sequenceDiagram
  autonumber
  participant U as Client
  participant K as KnowledgeBase API
  participant DB as Database
  participant Q as Processing Queues

  rect rgba(240,255,240,0.6)
  note over U,K: Create and upload flows
  U->>K: CreateCollection/CreateFolder/UploadFiles
  K->>DB: Persist collection/folder/file
  K->>Q: Enqueue job with singletonKey (collection_<id>/folder_<id>/file_<id>)
  note over K,Q: Singleton keys enable targeted cancellation later
  Q-->>K: Job accepted
  K-->>U: Ack with job refs
  end
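
For the enqueue side shown in the second diagram, the pattern is roughly the following sketch. The boss, FileProcessingQueue, and PdfFileProcessingQueue names come from server/queue/api-server-queue.ts; the import path, the payload shape, and the MIME-type check are assumptions.

```typescript
// Sketch of the enqueue side; payload shape and PDF detection are assumptions.
import { boss, FileProcessingQueue, PdfFileProcessingQueue } from "../queue/api-server-queue"

async function enqueueFileProcessing(file: { id: string; mimeType: string }): Promise<void> {
  const queue =
    file.mimeType === "application/pdf" ? PdfFileProcessingQueue : FileProcessingQueue
  // pg-boss's send options accept a singletonKey; the deletion flow in this PR later
  // passes the same key to boss.cancel(queue, key) to target exactly this job.
  await boss.send(queue, { fileId: file.id }, { singletonKey: `file_${file.id}` })
}
```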

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I thump my paws—cancel, clear, align!
Singleton keys mark each burrow fine.
Files and folders hop in queue,
Remove a nest — their jobs undo.
A rabbit nods: queues hush on time. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title accurately mentions cancelling a queue job upon file deletion, which aligns with part of the changeset, but it overlooks the cancellation support for folders and collections and contains a typo in "cancell", making it less descriptive of the overall feature.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch cancell-job-on-delete

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eaa8ea3 and eb6b309.

📒 Files selected for processing (1)
  • server/api/knowledgeBase.ts (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/api/knowledgeBase.ts (1)
server/queue/api-server-queue.ts (3)
  • boss (9-12)
  • FileProcessingQueue (14-14)
  • PdfFileProcessingQueue (15-15)
🔇 Additional comments (5)
server/api/knowledgeBase.ts (5)

382-382: LGTM: Singleton key enables targeted job cancellation.

The singleton key format collection_${collection.id} is consistent with the pattern used for files and folders, enabling precise cancellation of collection processing jobs during deletion.


949-949: LGTM: Consistent singleton keys for all folder types.

Both user-created (line 949) and auto-created (line 1100) folders use the same singleton key format folder_${folder.id}, ensuring comprehensive cancellation coverage regardless of folder creation source.

Also applies to: 1100-1100


1487-1487: LGTM: File singleton key supports dual-queue cancellation.

The singleton key file_${item.id} is applied uniformly regardless of whether the file is queued to PdfFileProcessingQueue or FileProcessingQueue, which aligns with the cancellation logic that checks both queues for file jobs.


744-749: LGTM: Cancellation properly sequenced in deletion flow.

The cancellation calls are correctly placed:

  1. After database transactions complete (ensuring atomic DB operations)
  2. Before Vespa/storage cleanup (preventing race conditions)
  3. Errors in cancellation don't fail the deletion (appropriate, since the DB records are already deleted)

This sequencing ensures that:

  • Processing jobs don't try to access deleted database records
  • Deletion succeeds even if job cancellation encounters issues
  • The system maintains eventual consistency

Also applies to: 1675-1679
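
Schematically, the sequencing described in this comment looks like the following sketch, with the concrete transaction, cancellation, and cleanup calls injected as placeholders rather than taken from knowledgeBase.ts; only the ordering is the point.

```typescript
type DeletableItem = { id: string; type: "file" | "folder" }

// The three callbacks stand in for the real DB transaction, cancelProcessingJobs,
// and Vespa/storage cleanup; this only illustrates the order of operations.
async function deleteFlow(
  deleteFromDb: () => Promise<DeletableItem[]>,
  cancelProcessingJobs: (args: { itemsToDelete: DeletableItem[] }) => Promise<void>,
  cleanupExternal: (items: DeletableItem[]) => Promise<void>,
): Promise<void> {
  // 1. Commit the database transaction first.
  const itemsToDelete = await deleteFromDb()
  // 2. Cancel queued jobs; errors are swallowed so the deletion still succeeds.
  await cancelProcessingJobs({ itemsToDelete }).catch(() => {})
  // 3. External cleanup runs regardless of the cancellation outcome.
  await cleanupExternal(itemsToDelete)
}
```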


100-169: Validate pg-boss “job not found” error handling
In server/api/knowledgeBase.ts (lines 119, 127, 138, 151), the cancellation logic filters errors by matching message substrings (“not found”/“does not exist”), which is brittle. Verify pg-boss’s cancel() throws a specific error class/name or code for missing jobs and update your catch blocks to check against that instead of raw message text.
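
One way to act on this without guessing pg-boss internals is to centralize the check behind a single predicate, keeping the current message matching as a fallback until the real error shape is confirmed. The JOB_NOT_FOUND code below is purely hypothetical and must be replaced once pg-boss's behaviour is verified.

```typescript
// Hypothetical helper: the `code` value checked here is a guess, not documented pg-boss API.
function isJobNotFoundError(error: unknown): boolean {
  if (!(error instanceof Error)) return false
  const code = (error as Error & { code?: string }).code
  if (code === "JOB_NOT_FOUND") return true // replace with the verified code/class once known
  return /not found|does not exist/i.test(error.message) // current fallback used in the PR
}

// Usage sketch: swallow only "missing job" errors and surface everything else.
// await boss.cancel(FileProcessingQueue, key).catch((e) => {
//   if (!isJobNotFoundError(e)) throw e
// })
```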



@gemini-code-assist

Summary of Changes

Hello @MohdShoaib-18169, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a crucial enhancement to resource management by ensuring that background processing jobs are automatically canceled when their associated files, folders, or collections are deleted from the database. This prevents the system from performing unnecessary work on non-existent items, thereby improving overall efficiency and reducing computational overhead.

Highlights

  • New Utility Function: Introduced a new asynchronous utility function, cancelProcessingJobs, which is responsible for canceling pending background processing jobs for files, folders, and collections based on their IDs and types.
  • Job Cancellation Integration: Integrated the cancelProcessingJobs function into the DeleteCollectionApi and DeleteItemApi endpoints. This ensures that any active or pending processing jobs are canceled immediately after the corresponding items are successfully deleted from the database.
  • Singleton Key for Jobs: Modified job creation in CreateCollectionApi, CreateFolderApi, ensureFolderPath, and UploadFilesApi to include a singletonKey. This key, derived from the item's ID (e.g., file_ID, folder_ID, collection_ID), allows for precise identification and cancellation of specific processing jobs.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a mechanism to cancel pending background processing jobs when files, folders, or collections are deleted. This is achieved by adding a singletonKey to jobs upon creation and then using this key to cancel them upon deletion via a new cancelProcessingJobs utility function. The changes are logical and well-placed. My review includes suggestions to improve the implementation of the new utility function for better code clarity and robustness.


@coderabbitai (bot) left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
server/api/knowledgeBase.ts (2)

111-127: Simplify error handling.

The outer try-catch blocks (lines 125-127 and later 134-136) are redundant since each promise already has .catch(() => {}) handling. The outer try-catch won't catch anything because the inner catch prevents propagation.

Apply this diff to remove redundant error handling:

     ...itemsToDelete.map(async (item) => {
-      try {
         const keyPrefix = item.type === "file" ? "file" : "folder"
         const singletonKey = `${keyPrefix}_${item.id}`
         
         if (item.type === "file") {
           // For files, try both queues
           await Promise.all([
             boss.cancel(FileProcessingQueue, singletonKey).catch(() => {}),
             boss.cancel(PdfFileProcessingQueue, singletonKey).catch(() => {})
           ])
         } else {
           // For folders, only file-processing queue
           await boss.cancel(FileProcessingQueue, singletonKey).catch(() => {})
         }
-      } catch (error) {
-        // Ignore errors if job doesn't exist
-      }
     }),

Apply the same simplification to the collection cancellation block (lines 130-137).


101-105: Add explicit type for better clarity.

The function parameters accept itemsToDelete as a generic array with id and type, but in practice it receives CollectionItem[] objects. Consider making this explicit for better type safety and documentation.

 async function cancelProcessingJobs(params: {
-  itemsToDelete: Array<{ id: string; type: string }>
+  itemsToDelete: Array<Pick<CollectionItem, 'id' | 'type'>>
   collectionId?: string
   userEmail: string
 }) {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a80dbef and eaa8ea3.

📒 Files selected for processing (1)
  • server/api/knowledgeBase.ts (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/api/knowledgeBase.ts (1)
server/queue/api-server-queue.ts (3)
  • boss (9-12)
  • FileProcessingQueue (14-14)
  • PdfFileProcessingQueue (15-15)
🔇 Additional comments (4)
server/api/knowledgeBase.ts (4)

100-149: Well-structured cancellation utility.

The implementation correctly:

  • Handles both file and folder types appropriately
  • Attempts cancellation from both queues for files (safe even if only one was used)
  • Includes collection cancellation when provided
  • Uses parallel cancellation with Promise.all for efficiency
  • Provides good observability through logging

The best-effort error handling approach (catching and ignoring errors) is appropriate for this use case where cancellation failures shouldn't block deletion operations.


362-362: Consistent singletonKey implementation across all enqueue operations.

The singletonKey additions follow a consistent pattern:

  • collection_<id> for collections
  • folder_<id> for folders
  • file_<id> for files

This enables targeted cancellation and covers all job enqueue locations in the file. The format matches what the cancelProcessingJobs function expects.

Also applies to: 929-929, 1080-1080, 1467-1467


724-729: Proper integration of job cancellation in collection deletion.

The cancellation is correctly placed after the database transaction commits (line 722) but before external resource cleanup (line 731+). This ensures:

  1. Database changes are persisted first
  2. Jobs are canceled to prevent processing deleted items
  3. External cleanup proceeds regardless of cancellation success

Passing both collectionItemsToDelete and collectionId correctly cancels jobs for all items and the collection itself.


1655-1659: Correct job cancellation for item deletion.

The cancellation follows the same pattern as collection deletion - after DB commit, before external cleanup. Appropriately omits collectionId since we're only deleting items, not the collection itself.

The itemsToDelete array correctly includes both the target item and all descendants (if it's a folder), ensuring all related jobs are canceled.
