Endothelial Cell RNA-seq Data: Differential Expression and Functional Enrichment Analyses to Study Phenotypic Switching

This repository provides a user-friendly bioinformatics workflow that takes raw RNA sequencing data through to interpretable results. The workflow described here was performed using Ubuntu 20.04.2 LTS, a Linux distribution. A 64-bit machine with at least 32 GB of RAM is recommended for the majority of the steps in the workflow.
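
Before starting, it may help to confirm what the machine actually provides. The following is a minimal sketch, not part of the published protocol: the function name `check_machine` is hypothetical, and it assumes a Linux system where `/proc/meminfo` is available (as on Ubuntu).

```shell
# Print the CPU architecture and total RAM so they can be compared
# against the recommendation (64-bit, at least 32 GB RAM).
check_machine() {
    echo "Architecture: $(uname -m)"   # x86_64 or aarch64 indicates a 64-bit machine
    if [ -r /proc/meminfo ]; then
        # MemTotal is reported in kB; convert to GiB (integer division).
        mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
        echo "Total RAM: approximately $((mem_kb / 1024 / 1024)) GiB"
    else
        echo "Total RAM: unknown (/proc/meminfo not readable)"
    fi
}

check_machine
```

The reported figure is usually slightly lower than the installed memory, because part of it is reserved by the kernel and firmware.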

The published protocol can be found under Chapter 29 of Angiogenesis Methods and Protocols 2022.

Bioinformatics Workflow

The steps of the workflow are shown in the flowchart. The tools used are in yellow boxes, the data required/produced in white boxes and file formats in purple, blue, dark green, orange and grey boxes. Results obtained are in light green boxes.

Software and R Packages

Below is a list of the software, R packages and databases used in the workflow, with their corresponding URLs.

Software	URL
Ubuntu	https://ubuntu.com/
FastQC	https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Cutadapt	https://github.com/marcelm/cutadapt
STAR	https://github.com/alexdobin/STAR
Qualimap	http://qualimap.conesalab.org/
Subread (featureCounts)	http://subread.sourceforge.net/
R	https://www.r-project.org/
RStudio	https://www.rstudio.com/
DESeq2	https://bioconductor.org/packages/release/bioc/html/DESeq2.html
clusterProfiler	https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html
pathview	http://www.bioconductor.org/packages/release/bioc/html/pathview.html
ReactomePA	https://bioconductor.org/packages/release/bioc/html/ReactomePA.html
enrichplot	https://bioconductor.org/packages/release/bioc/html/enrichplot.html
biomaRt	https://bioconductor.org/packages/release/bioc/html/biomaRt.html
ggplot2	https://ggplot2.tidyverse.org/
GO	http://geneontology.org/
KEGG	https://www.genome.jp/kegg/
Reactome	https://reactome.org/
GSEA	https://www.gsea-msigdb.org/gsea/index.jsp

Workspace Preparation

The commands used in the workflow (see software_downloads and pipeline_commands) use file paths that assume the workspace layout described below. Throughout the workflow, when a path containing "user" is shown (e.g., /home/user/rnaseq_exp), "user" is a placeholder and should be replaced with your own username.

Key directories to be made prior to software installation and raw data download:

Change directory to 'user'
```
cd /home/user
```
Make a new directory called 'rnaseq_exp'
```
mkdir rnaseq_exp
```
Change directory to 'rnaseq_exp'
```
cd rnaseq_exp
```
Make new directories called 'output', 'raw_data', 'resources' and 'programs'
```
mkdir output raw_data resources programs
```
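
The four steps above can also be run as a single snippet. This is a sketch under one assumption: the variable `BASE` is introduced here for illustration and defaults to the current user's home directory, which stands in for /home/user.

```shell
# Create the workspace in one go. mkdir -p creates missing parent
# directories and does not fail if a directory already exists.
BASE="${BASE:-$HOME}"

for d in output raw_data resources programs; do
    mkdir -p "$BASE/rnaseq_exp/$d"
done

# List the result to confirm the layout.
ls "$BASE/rnaseq_exp"
```

Because `mkdir -p` is idempotent, the snippet can safely be re-run if the workspace is only partly set up.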

Software Installation

The required software and R packages can be installed by following the commands in the files within the software_downloads directory.

Refer to section 3.2 of the published protocol for more information.

Raw Reads Download

A publicly available HUVEC dataset from a published study was used: Andrade J et al (2021) Control of endothelial quiescence by FOXO-regulated metabolites. Nat Cell Biol 23(4):413–423.

The raw data in FASTQ format were obtained from the European Nucleotide Archive project PRJNA679567. On the project page, select the 'Download All' button above the 'FASTQ FTP' column and save the files in the raw_data directory created above.

Follow the commands in pipeline_commands/1_raw_data_decompression.txt to decompress the raw read files.
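
For orientation, decompression of this kind can look like the sketch below. It is not a copy of the repository's command file. It assumes the downloaded reads are gzip-compressed files ending in `.fastq.gz`, and the function name `decompress_fastq` and the variable `RAW_DIR` are hypothetical.

```shell
# Decompress every *.fastq.gz file in a directory, keeping the
# compressed originals and skipping files already decompressed.
decompress_fastq() {
    dir="$1"
    for gz in "$dir"/*.fastq.gz; do
        [ -e "$gz" ] || continue      # nothing matched: the pattern stays literal
        out="${gz%.gz}"               # sample.fastq.gz -> sample.fastq
        if [ -e "$out" ]; then
            echo "Skipping $gz ($out already exists)"
            continue
        fi
        gzip -dc "$gz" > "$out" && echo "Wrote $out"
    done
}

# Run on the workspace's raw_data directory if it exists.
RAW_DIR="${RAW_DIR:-$HOME/rnaseq_exp/raw_data}"
if [ -d "$RAW_DIR" ]; then
    decompress_fastq "$RAW_DIR"
fi
```

Writing to a new file with `gzip -dc` rather than decompressing in place keeps the downloads intact, at the cost of extra disk space.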

Refer to section 3.2.9 of the published protocol for more information.

Reference Genome Download

The reference genome in FASTA format and the annotation of the reference genome in GTF or GFF format are required.

Both can be obtained from Ensembl FTP Download via an FTP client.

See the figure below for which files to download, and save them in the resources directory created above.

Follow the commands in pipeline_commands/2_ref_genome_anno_decompression.txt to decompress the genome files.
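
After decompression, a quick check that both files are in place can save a failed run later. The sketch below is illustrative only: it assumes the decompressed genome ends in `.fa` and the annotation in `.gtf` (adjust the patterns for a GFF annotation), and the names `check_genome_files` and `RES_DIR` are hypothetical.

```shell
# Report whether a non-empty FASTA (*.fa) and annotation (*.gtf)
# file exist in the given directory. Returns 1 if either is missing.
check_genome_files() {
    dir="$1"
    status=0
    for pattern in '*.fa' '*.gtf'; do
        # -size +0 excludes empty files, e.g. from an interrupted download.
        found=$(find "$dir" -maxdepth 1 -type f -name "$pattern" -size +0 | head -n 1)
        if [ -n "$found" ]; then
            echo "Found: $found"
        else
            echo "Missing or empty: $pattern in $dir"
            status=1
        fi
    done
    return "$status"
}

# Run on the workspace's resources directory if it exists.
RES_DIR="${RES_DIR:-$HOME/rnaseq_exp/resources}"
if [ -d "$RES_DIR" ]; then
    check_genome_files "$RES_DIR" || true
fi
```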

Refer to section 3.2.10 of the published protocol for more information.

Begin!

Once the required software and R packages have been installed, the workspace created, and the raw reads and genome files downloaded and decompressed, the analysis can begin.

Follow the command files in pipeline_commands in conjunction with the published protocol to successfully complete the analysis.

Both the Notes section and the main text of the published protocol comment on errors that may arise throughout the workflow. These may help with troubleshooting.
