Results
bohra outputs both sequence-level and dataset-level files.
All results for each sequence are in a <sequence_id> folder in the directory you ran bohra in.
| Filename | Pipeline | Description |
|---|---|---|
| read_assessment.txt | all | combined output of seqkit stats, seqkit fx2tab, seqtk fqchk |
| kraken2.tab | all | raw output of kraken2 |
| species.txt | all | summary of top 3 species from kraken2.tab |
| snippy_qc.txt | snps, phylogeny, default, full | summary of snippy output |
| snps.* | snps, phylogeny, default, full | snippy outputs .vcf, .log, .fa |
| mlst.txt | amr_typing, full, default | results of MLST |
| typer.txt | amr_typing, full, default | results of the typer, where appropriate |
| resistome.txt | amr_typing, full, default | collated results of abritamr |
| abritamr | amr_typing, full, default | raw results of abritamr |
| plasmid.txt | amr_typing, full, default | collated results of mob_suite |
| plasmid_*.fa | amr_typing, full, default | fasta files of plasmids identified |
| <seq_ID>.txt | assemble, amr_typing, full, default | collated results of prokka |
| <seq_ID>.gff | assemble, amr_typing, full, default | gff output of prokka |
| contigs.fa | assemble, amr_typing, full, default | the assembly generated by bohra OR provided by the user |
| assembly_statistics.txt | assemble, amr_typing, full, default | summary of assembly statistics |
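As a quick sanity check after a run, the table above can be turned into a small lookup. The following is a minimal sketch, not part of bohra: the function name `missing_outputs` is hypothetical, and it assumes that the files sit directly inside the sequence folder and that `<seq_ID>` matches the folder name. Some files (for example typer.txt or plasmid_*.fa) may legitimately be absent for a given sequence.

```python
from pathlib import Path

# (filename pattern, pipelines that produce it), transcribed from the table.
# "all" means every pipeline produces the file.
SNP_PIPELINES = {"snps", "phylogeny", "default", "full"}
TYPING_PIPELINES = {"amr_typing", "full", "default"}
ASSEMBLY_PIPELINES = {"assemble", "amr_typing", "full", "default"}

EXPECTED_OUTPUTS = [
    ("read_assessment.txt", {"all"}),
    ("kraken2.tab", {"all"}),
    ("species.txt", {"all"}),
    ("snippy_qc.txt", SNP_PIPELINES),
    ("snps.*", SNP_PIPELINES),
    ("mlst.txt", TYPING_PIPELINES),
    ("typer.txt", TYPING_PIPELINES),      # only "where appropriate"
    ("resistome.txt", TYPING_PIPELINES),
    ("abritamr", TYPING_PIPELINES),
    ("plasmid.txt", TYPING_PIPELINES),
    ("plasmid_*.fa", TYPING_PIPELINES),   # only if plasmids were identified
    ("<seq_ID>.txt", ASSEMBLY_PIPELINES),
    ("<seq_ID>.gff", ASSEMBLY_PIPELINES),
    ("contigs.fa", ASSEMBLY_PIPELINES),
    ("assembly_statistics.txt", ASSEMBLY_PIPELINES),
]


def missing_outputs(seq_dir: Path, pipeline: str) -> list[str]:
    """Return the expected patterns that match nothing in seq_dir."""
    seq_id = seq_dir.name  # assumption: folder name equals the sequence ID
    missing = []
    for pattern, pipelines in EXPECTED_OUTPUTS:
        if "all" not in pipelines and pipeline not in pipelines:
            continue  # this file is not produced by the chosen pipeline
        concrete = pattern.replace("<seq_ID>", seq_id)
        if not any(seq_dir.glob(concrete)):
            missing.append(concrete)
    return missing
```

Calling `missing_outputs(Path("my_sequence"), "snps")` on a hypothetical sequence folder would list any snps-pipeline files that were not found there.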
All dataset-level results are stored in a folder called report. This folder contains collated versions of all the summary files found in the sample directories, the tree (if phylogeny was performed), and the outputs of panaroo (if the full pipeline was run). It also contains a report_<pipeline>.html file, a standalone file that can be viewed in a browser and shared with collaborators or other stakeholders. A copy of one can be found [here](need to add a link). If you rerun bohra repeatedly in the same folder and want to keep a record of past report folders, use the --keep Y flag: the existing report folder will be renamed and kept for archiving.
Nextflow is a workflow platform that allows you to run a pipeline: a series of processes or steps tied together by their dependencies on each other. For example, a tree can't be generated without a multi-sequence alignment of some sort, which in turn requires alignments, and so on back to reads and a reference. The bohra pipelines are modular and can be run independently or as a whole. You can also run a pipeline, check the results, add or remove sequences, and then re-run it: only the new sequences will be processed, and the pipeline will not be re-run on samples that already have results. Similarly, you can run a preview pipeline and later run the full pipeline, and the early steps will not be repeated.
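The dependency chain in the example above (reads and reference, then alignments, then a multi-sequence alignment, then a tree) can be illustrated with a toy ordering. This is only a sketch of the idea using Python's standard `graphlib` module; the step names are hypothetical labels, and it does not reflect how Nextflow or bohra define their processes.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
# Step names are illustrative only, not bohra process names.
DEPENDS_ON = {
    "alignment": {"reads", "reference"},
    "multi_sequence_alignment": {"alignment"},
    "tree": {"multi_sequence_alignment"},
}


def run_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return the steps in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # "tree" is always last; "reads" and "reference" come before "alignment".
    print(run_order(DEPENDS_ON))
```

A workflow engine applies the same principle at a larger scale: a step is only started once everything it depends on has finished.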
This 'caching' of results is achieved through the work folder. When you run bohra, a series of folders is created in the location where you run it: a set of sample directories, a report directory and a work directory. The work directory is where Nextflow looks to find out what has already been performed. It holds log files for each process that was run, as well as raw results. You can delete this folder if you want; there will be no explosions! Your sample directories and report folders will be unaffected. However, if you then rerun the same analysis for any reason, there will be no 'memory' of previous runs, so the whole pipeline will run again.
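The caching behaviour described above can be mimicked with a toy model. The sketch below is an assumption-laden illustration, not Nextflow's actual mechanism: `run_step` and `task_key` are hypothetical helpers that record each finished step as a marker file in a work directory, skip steps whose marker already exists, and run everything again once that directory has been deleted.

```python
import hashlib
import shutil
from pathlib import Path


def task_key(step: str, sample: str) -> str:
    """Derive a stable marker name from the step and sample (toy scheme)."""
    return hashlib.sha256(f"{step}:{sample}".encode()).hexdigest()[:16]


def run_step(work: Path, step: str, sample: str) -> bool:
    """Run a step unless its marker exists in work; return True if it ran."""
    marker = work / task_key(step, sample)
    if marker.exists():
        return False  # already done: the cached result is reused
    work.mkdir(parents=True, exist_ok=True)
    # A real pipeline would do the actual work here; we only record completion.
    marker.write_text(f"{step} finished for {sample}\n")
    return True


if __name__ == "__main__":
    work = Path("toy_work")
    print(run_step(work, "assemble", "seq1"))  # True: first run
    print(run_step(work, "assemble", "seq1"))  # False: skipped on re-run
    print(run_step(work, "assemble", "seq2"))  # True: new sequence only
    shutil.rmtree(work)                        # deleting work is safe...
    print(run_step(work, "assemble", "seq1"))  # True: ...but the step runs again
    shutil.rmtree(work)
```

The results written elsewhere (in this analogy, the sample and report folders) are untouched by removing the work directory; only the record of what has already run is lost.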