dsl2: metagenomics uncollapsed paired end #1098
Warning: A newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.1.2. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
dsl-2 metagenomics-pairedend
Ready for review. Basic overview: we can now retain PE information from unmerged input FASTQ files. PE samples going into metagenomics are input separately into metaphlan, kraken2, and krakenuniq. Malt does not accept PE reads, so it is not allowed with this parameter combination. Also fixed a small error in the parsing of metagenomics input data from bamfiltering --> metagenomics.
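The restriction on Malt described above can be sketched as a small parameter check. This is only an illustration in Python, not the pipeline's actual Nextflow validation code: the function name `check_profiler_allowed` and its arguments are hypothetical, and the flag is assumed to correspond to `--preprocessing_skippairmerging`.

```python
def check_profiler_allowed(tool: str, skip_pair_merging: bool) -> None:
    """Reject the malt + unmerged paired-end combination (hypothetical helper).

    `skip_pair_merging` is assumed to mirror --preprocessing_skippairmerging;
    `tool` is the name of the selected metagenomics profiler.
    """
    # Malt does not accept paired-end reads, so it cannot be combined
    # with input where pair merging was skipped.
    if skip_pair_merging and tool.lower() == "malt":
        raise ValueError(
            "malt does not accept paired-end reads; "
            "it cannot be used when pair merging is skipped"
        )


# Usage: the PE-capable tools pass, malt is rejected.
for name in ("metaphlan", "kraken2", "krakenuniq"):
    check_profiler_allowed(name, skip_pair_merging=True)
```

Failing early on the parameter combination keeps unmerged paired-end reads from ever reaching a tool that cannot use them.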
merszym
left a comment
Based on visual inspection, it looks good to me!
I would remove the one map-channel manipulation in metagenomics (see comment) as it is not necessary anymore
Also I have not tested anything, so I trust your tests for now :P
So: needs to be fixed up higher (e.g. in bamfiltering.nf, likely with a new adjustment to the SAMTOOLS_FASTQ_UNMAPPED, SAMTOOLS_FASTQ_MAPPED, and SAMTOOLS_VIEW_BAM_FILTERING modules)
ISSUE FOUND: while the outputting of PE reads is OK in bamfiltering.nf (fastq_mapped & fastq_unmapped), when overlap merging is not done, cat_fastq weirdly merges singletons into one PE file and the others into the other PE file, so then everything gets fucked up
That sounds like a hackathon-thing to do
Let me double-check this behavior and add more context on what we might need to do in a hackathon... I think this was more for my own reference and was not edited prior to opening the PR.
…tering --> fastq for metagenomics. improves clarity (IMO)
PR for metagenomics when paired-end merging is turned off, retaining PE information for metagenomic tools that have PE modes.
This is a QOL improvement for tools that can use paired-end information (kraken2, krakenuniq), and now allows for variable inputs into these tools (by splitting inputs into either PE or SE data and running separate instances).
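The PE/SE split described above can be sketched as follows. This is a minimal Python illustration, not the pipeline's Nextflow channel logic: the function `split_by_layout` and the sample records are hypothetical, and the `single_end` key is assumed to follow the nf-core meta-map convention.

```python
from typing import Dict, List, Tuple

# Hypothetical sample record: (meta map, list of FASTQ paths).
Sample = Tuple[Dict[str, object], List[str]]


def split_by_layout(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Partition samples into (paired_end, single_end) lists.

    Assumes each meta map carries a boolean `single_end` key; samples
    without the key are treated as single-end.
    """
    paired: List[Sample] = []
    single: List[Sample] = []
    for meta, reads in samples:
        if meta.get("single_end", True):
            single.append((meta, reads))
        else:
            # A paired-end sample must provide exactly two read files.
            if len(reads) != 2:
                raise ValueError(f"expected 2 read files for {meta.get('id')}")
            paired.append((meta, reads))
    return paired, single


# Illustrative input: one paired-end and one single-end sample.
samples = [
    ({"id": "sampleA", "single_end": False}, ["A_R1.fastq.gz", "A_R2.fastq.gz"]),
    ({"id": "sampleB", "single_end": True}, ["B.fastq.gz"]),
]
paired_end, single_end = split_by_layout(samples)
# Each group would then be passed to its own instance of the profiler.
```

Keeping the two groups apart means each tool instance sees only one read layout, so the paired-end mode can be enabled for one instance without affecting the other.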
Main issues resolved:
--preprocessing_skippairmerging used.
PR checklist
- scrape_software_versions.py
- nf-core lint .).
- nextflow run . -profile test,docker).
- docs/usage.md is updated.
- docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).