Open
Description
Following #11, one of the most "difficult" parts of the plugin is creating and maintaining the Java records for schemas.
It might be interesting for the plugin to offer some common schemas out of the box, for example FastQRecord (and also other, more generic records such as StringMap).
For example, a pipeline to convert raw FASTQ text to Parquet:
include { toParquet; toFastq } from 'plugin/nf-parquet'

workflow {
    channel.fromPath('data/HI.4549.004.index_10.ANN0830_R2.fastq')
        .splitFastq( record: true )
        .map { record ->
            toFastq(record)
        }
        .toParquet('data/HI.4549.004.index_10.ANN0830_R2.parquet', ['schema':'fastq'])
        .view { record -> record }
}
Here, the pipeline uses splitFastq to read a FASTQ file, converts each Nextflow Map record into an internal Java record, and writes the Java records to a Parquet file using the schema parameter instead of the record parameter.
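As a rough illustration of what the toFastq conversion step might do internally, here is a minimal Java sketch. The record name `FastqRecord` and the helper `fromMap` are hypothetical and not part of the plugin today; the four keys are the ones Nextflow's splitFastq emits with `record: true`.

```java
import java.util.Map;

// Hypothetical built-in record (not an existing plugin class). Field names
// mirror the keys produced by splitFastq(record: true) in Nextflow.
record FastqRecord(String readHeader, String readString,
                   String qualityHeader, String qualityString) {

    // Builds a record from a splitFastq map; keys that are absent become null.
    static FastqRecord fromMap(Map<String, ?> fields) {
        return new FastqRecord(
                asString(fields.get("readHeader")),
                asString(fields.get("readString")),
                asString(fields.get("qualityHeader")),
                asString(fields.get("qualityString")));
    }

    // Converts any non-null value to its string form.
    private static String asString(Object value) {
        return value == null ? null : value.toString();
    }
}
```

With something like this shipped in the plugin, users would not need to write or compile their own record for the FASTQ case.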
schema can be one of:
- stringMap
- fastq
- fasta
- ....
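One possible way to resolve the schema name to a built-in record type is a simple lookup table. The sketch below is only an assumption about how this could work: the class `BuiltInSchemas`, the method `resolve`, and the placeholder record shapes are all hypothetical.

```java
import java.util.Map;

// Hypothetical registry mapping schema names to built-in record types.
// None of these classes exist in the plugin; the record shapes are placeholders.
final class BuiltInSchemas {

    record StringMapRecord(Map<String, String> values) {}

    record FastqRecord(String readHeader, String readString,
                       String qualityHeader, String qualityString) {}

    record FastaRecord(String header, String sequence) {}

    private static final Map<String, Class<? extends Record>> BY_NAME = Map.of(
            "stringMap", StringMapRecord.class,
            "fastq", FastqRecord.class,
            "fasta", FastaRecord.class);

    private BuiltInSchemas() {}

    // Returns the record type for a schema name, or fails for unknown names
    // so that custom formats fall back to the existing record parameter.
    static Class<? extends Record> resolve(String name) {
        Class<? extends Record> type = BY_NAME.get(name);
        if (type == null) {
            throw new IllegalArgumentException(
                    "Unknown schema '" + name + "'; use the record parameter for custom formats");
        }
        return type;
    }
}
```

Failing fast on an unknown name keeps the built-in list closed while leaving the current record-based approach available for everything else.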
Custom/proprietary formats will be out of the scope of this issue and will still require the current approach.
@Hoohm, what do you think? Would this cover your use case?