C.-elegans-miRNA-Sequence-Analysis-Pipeline

This is an automated Bash pipeline for reproducible miRNA analysis from miRBase v22. I used C. elegans (lin-4/let-7) precursors using SeqKit for biochemical metric extraction.

Biological Context

MicroRNAs are small non-coding RNA molecules (~22nt) that play a critical role in post-transcriptional gene regulation. This pipeline focuses on:

cel-lin-4: The first miRNA discovered, essential for larval development.

cel-let-7: A highly conserved regulator of developmental timing.

The script automates the extraction of these precursors, converts the RNA sequences to DNA for downstream compatibility, and calculates key biochemical metrics such as GC Content.

Features

Reproducibility: Targets a static version of the miRBase database (v22).

Automation: One-touch execution from download to final report.

Efficiency: Optimized for multi-core processing (Intel i5-1235U) using SeqKit.

Data Transformation: Converts raw FASTA data into structured TSV tables.

Installation & Usage

Prerequisites

Ensure you have seqkit installed on your Ubuntu system:

Bash: sudo apt update && sudo apt install seqkit -y

Execution

Clone the repository:

Bash

git clone https://github.com/your-username/miRNA-analysis-pipeline.git cd miRNA-analysis-pipeline Set permissions and run:

Bash

chmod +x analyze_mirna.sh ./analyze_mirna.sh

Output

The pipeline generates a directory cel_analysis_v22/ containing:

hairpin.fa: The raw miRBase dataset.

cel_mirna_v22_results.tsv: A tab-separated file.

Output Preview

Figure 1: Summary Statistics of C. elegans miRNA (miRBase v22)

Built With

Bash: Shell scripting for process automation.

SeqKit: Ultra-fast toolkit for FASTA/Q manipulation.

AWK: For terminal report formatting.

Bibliographic Sources

Kozomara, A., et al. (2019). miRBase: from microRNA sequences to function. Nucleic Acids Research, 47(D1).

Shen, W. (2016). SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q Manipulation. PLOS ONE.

Ambros, V. (1993). The C. elegans heterochronic gene lin-4. Cell.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Summary_stats.png		Summary_stats.png
analyze_mirna.sh		analyze_mirna.sh
cel_mirna_v22_results.tsv		cel_mirna_v22_results.tsv
hairpin.fa		hairpin.fa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

C.-elegans-miRNA-Sequence-Analysis-Pipeline

Biological Context

Features

Installation & Usage

Prerequisites

Execution

Output

Output Preview

Built With

Bibliographic Sources

About

Uh oh!

Releases

Packages

Languages

License

CharlesDexterW/C.-elegans-miRNA-Sequence-Analysis-Pipeline

Folders and files

Latest commit

History

Repository files navigation

C.-elegans-miRNA-Sequence-Analysis-Pipeline

Biological Context

Features

Installation & Usage

Prerequisites

Execution

Output

Output Preview

Built With

Bibliographic Sources

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages