Decontamination References
Note: This page does not cover fastq cleaning with fastp, even though that cleaning happens inside the WDL's decontamination task. By default, fastp cleans the fastqs upstream of clockwork's decontamination step.
myco (all versions) uses clockwork for decontamination and variant calling. The way clockwork handles decontamination can be roughly summed up as:
- Align your fastqs to a decontamination reference containing stuff that is decidedly not TB (human, NTM, etc)
- Get rid of anything that aligned too closely to that decontamination reference
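The two steps above can be sketched in a few lines of Python. This is a simplified illustration and not clockwork's actual implementation: the `Alignment` record, the `reads_to_keep` function, and the rule "drop a pair if either mate hits a contaminant contig" are all assumptions made for the example, and the contig names are placeholders.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Alignment(NamedTuple):
    """Hypothetical, minimal stand-in for one aligned read."""
    read_name: str         # shared by both mates of a pair
    contig: Optional[str]  # contig of the best hit, or None if unmapped


def reads_to_keep(alignments: Iterable[Alignment],
                  contaminant_contigs: Set[str]) -> List[str]:
    """Return read names that survive decontamination, in first-seen order.

    Assumption for this sketch: a read (pair) is discarded as soon as any
    of its mates aligns to a contig flagged as a contaminant.
    """
    records = list(alignments)
    contaminated = {r.read_name for r in records
                    if r.contig in contaminant_contigs}
    kept: List[str] = []
    seen: Set[str] = set()
    for r in records:
        if r.read_name not in contaminated and r.read_name not in seen:
            seen.add(r.read_name)
            kept.append(r.read_name)
    return kept


# Placeholder contig names standing in for the "not TB" part of the reference.
contaminants = {"human_chr1", "ntm_contig_1"}
example = [
    Alignment("pair1", "tb_reference"), Alignment("pair1", "tb_reference"),
    Alignment("pair2", "human_chr1"),   Alignment("pair2", None),
    Alignment("pair3", None),           Alignment("pair3", None),
]
print(reads_to_keep(example, contaminants))  # ['pair1', 'pair3']
```

The point of the sketch is that the outcome depends entirely on which contigs are flagged as contaminants, which is why the contents of the decontamination reference matter.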
The contents of your decontamination reference are worth some consideration. clockwork helpfully provides workflows for generating these references. Rather than regenerating these references every time the workflow is run, I maintain premade Docker images containing the decontamination reference in order to save on cloud compute costs. These Docker images are what the decontamination task of myco (all versions) will actually run inside, so you do not need to pass a decontamination reference to these tasks, as long as you are happy with the options I've provided.
CDPH and I discussed this a few times during the pipeline's development, and we concluded that clockwork-v0.12.5's decontamination reference should be the default. This is also the decontamination reference I recommend to other users. This is intentionally not the same decontamination reference CDC's varpipe pipeline uses (as of 2024). Please see this documentation in the clockwork-wdl repo for more details about supported decontamination references.
If none of the provided options suit you, you can either roll your own Docker image and make that the Docker image myco's decontamination task runs in, or you can modify the code itself to take in the decontamination reference as a file. I strongly recommend going the Docker route, especially if you are running on Terra, because downloading a Docker image with huge files in it is essentially free, whereas localizing huge files is not. My clockwork-wdl repo has multiple Dockerfiles you can use as a starting point for creating your own.
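If you go the Docker route, the custom image is typically handed to the workflow through its inputs JSON. Below is a minimal sketch of building such an entry in Python. The helper `docker_override_inputs` is hypothetical, and the workflow name, input name, and image used in the example are placeholders: check the WDL itself for the real name of the input that sets the decontamination task's Docker image.

```python
import json
from typing import Dict


def docker_override_inputs(workflow_name: str, input_name: str,
                           image: str) -> Dict[str, str]:
    """Build a Cromwell-style inputs entry ("workflow.input": value).

    Hypothetical helper: input_name must be whatever the WDL actually
    calls the input that sets the decontamination task's Docker image.
    """
    # Require an explicit tag so runs stay reproducible; only the last
    # path component is checked, so a registry port is not mistaken for a tag.
    if ":" not in image.rsplit("/", 1)[-1]:
        raise ValueError("pin the image to an explicit tag, e.g. repo/name:tag")
    return {f"{workflow_name}.{input_name}": image}


# Placeholder names, for illustration only.
inputs = docker_override_inputs("myco", "decontam_docker_image",
                                "example/my-decontam-ref:custom")
print(json.dumps(inputs, indent=2))
```

Pinning an explicit tag is a design choice in this sketch rather than a requirement of myco: it makes it obvious later which decontamination reference a given run used.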