Designs third-generation HCR (HCRv3) split probes and attaches split initiator sequences for the following HCR amplifiers:
- B1-B5 (Choi 2014, ACS Nano)
- B7, B9-B10, B13-B15, B17 (Y. Wang 2020, bioRxiv)
Please see the user manual PDF file for a detailed guide.
- BLAST+: The NCBI BLAST+ suite must be installed and available in the system's PATH. You can download it from the NCBI website.
- R: The scripts are written in R (version 4.1.2 or later is recommended).
- RStudio (Recommended): An IDE for R, which simplifies package management and script execution.
- Clone the repository or download ZIP:

      git clone https://github.com/jefflee1103/HCRv3_probe_design.git
      cd HCRv3_probe_design

- Install R packages: Launch R or RStudio and run the following commands to install the required packages from CRAN, Bioconductor and GitHub.

      # Install packages from CRAN
      install.packages(c("tidyverse", "shiny", "shinythemes", "patchwork", "DT", "Rcpp", "scales", "valr"))

      # Install packages from Bioconductor
      if (!require("BiocManager", quietly = TRUE))
          install.packages("BiocManager")
      BiocManager::install("Biostrings")

      # Install BLAST helper
      # install.packages("devtools")
      devtools::install_github("HajkD/metablastr", build_vignettes = TRUE, dependencies = TRUE)
- Check BLAST installation: Check that R can find the installed BLAST; R must be able to locate your BLAST binary file. If it cannot, append the BLAST installation directory to the PATH seen by R using Sys.setenv(PATH = paste(Sys.getenv("PATH"), "path/to/blast/bin", sep = .Platform$path.sep)), replacing "path/to/blast/bin" with your own BLAST bin directory.

      Sys.which("blastn") # should return something like "/usr/local/ncbi/blast/bin/blastn"
- Prepare BLAST database and Tx2gene file: BLAST is used to filter out probes that may cross-hybridise with unintended RNA targets. Prepare a species-specific transcriptome BLAST database (an unzipped .fa or .fasta file) and a Tx2gene file, which is used to map transcript_ids (used in BLAST) to gene_ids. The Drosophila melanogaster BLAST database is based on Ensembl v99. The D. melanogaster, C. elegans and S. cerevisiae databases are small and are included in this repo under data/BLAST/.

  | Name | Species name | Ensembl Release | BLASTdb |
  | --- | --- | --- | --- |
  | Fruit Fly | Drosophila melanogaster | 99 | Dmel_BLASTdb_ens99.fa.gz |
  | Human | Homo sapiens | 99 | Zenodo link |
  | Mouse | Mus musculus | 99 | Zenodo link |
  | Zebrafish | Danio rerio | 114 | Zenodo link |
  | Frog | Xenopus tropicalis | 114 | Zenodo link |
  | Chicken | Gallus gallus | 114 | Zenodo link |
  | Yeast | Saccharomyces cerevisiae | 114 | Scer_BLASTdb_ens114.fa.gz |
  | Worm | Caenorhabditis elegans | 114 | Cele_BLASTdb_ens114.fa.gz |

  For larger transcriptomes, see the Zenodo database for BLASTdb files. Please unzip them before use. Tx2gene files for the above species are already included in this repo under data/BLAST/tx2gene/. To design probes for other species, you can build a custom BLAST database. A Python script and instructions for this are provided in scripts/custom_blast_databases/.
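Because the BLASTdb files are distributed gzipped and must be unzipped before use, here is a minimal base-R sketch for decompressing one without extra packages. The helper name `decompress_gz` is hypothetical and not part of this repo, and the example path is an assumption built from the file names in this README.

```r
# Hypothetical helper (not part of this repo): decompress a .gz file using
# base R connections only, streaming in chunks so large files need little memory.
decompress_gz <- function(gz_path, out_path = sub("\\.gz$", "", gz_path)) {
  stopifnot(file.exists(gz_path), out_path != gz_path)
  in_con <- gzfile(gz_path, open = "rb")
  on.exit(close(in_con), add = TRUE)
  out_con <- file(out_path, open = "wb")
  on.exit(close(out_con), add = TRUE)
  repeat {
    chunk <- readBin(in_con, what = "raw", n = 1e6)  # read up to 1e6 bytes
    if (length(chunk) == 0) break                    # end of file reached
    writeBin(chunk, out_con)
  }
  invisible(out_path)
}

# Example (assumed path; adjust to where your database file lives):
# decompress_gz("data/BLAST/Dmel_BLASTdb_ens99.fa.gz")
```

The decompressed file is written next to the original by default, with the `.gz` suffix dropped.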
This tool offers two workflows for designing HCR probes.
The interactive Shiny web application provides a user-friendly graphical interface for designing probes without writing any code.
- Launch the app: Open RStudio by clicking the HCR_probe_designer.Rproj file, and run the following command:

      shiny::runApp("shiny_webapp")
- Using the app:
  - The application will open in your default web browser.
  - Follow the instructions on the "HCRv3" tab.
  - Upload your target transcript FASTA file. The app accepts only one file (one gene) at a time.
  - Adjust the design parameters (e.g., probe length, GC content) as needed.
  - Follow the instructions in the side panel.
  - Once the design process is complete, you can review the results and download the output files, including the probe sequences, a detailed report, and a plot showing probe locations.
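Since the app takes one FASTA file per gene, a multi-record FASTA file has to be split before upload. The following is a minimal base-R sketch, assuming standard FASTA input with `>` header lines. The helper `split_fasta` and its file-naming scheme are hypothetical and not part of this repo.

```r
# Hypothetical helper (not part of this repo): write each record of a
# multi-record FASTA file to its own file, one file per sequence.
split_fasta <- function(fasta_path, out_dir = tempdir()) {
  lines <- readLines(fasta_path)
  lines <- lines[nzchar(trimws(lines))]        # drop blank lines
  is_header <- startsWith(lines, ">")
  stopifnot(length(lines) > 0, is_header[1])   # file must start with a header
  record_id <- cumsum(is_header)               # record index for every line
  records <- split(lines, record_id)
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  out_paths <- vapply(records, function(rec) {
    # Name the file after the first word of the header, sanitised.
    name <- strsplit(sub("^>\\s*", "", rec[1]), "\\s+")[[1]][1]
    name <- gsub("[^A-Za-z0-9._-]", "_", name)
    path <- file.path(out_dir, paste0(name, ".fa"))
    writeLines(rec, path)
    path
  }, character(1))
  unname(out_paths)
}

# Example: paths <- split_fasta("my_transcripts.fa", "single_fasta")
```

Records whose headers share the same first word would overwrite each other, so check the returned paths for duplicates before uploading.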
The script workflow is intended for users who wish to integrate probe design into custom scripts or automated pipelines. The core logic is parameterised in a Quarto document (HCR_probe_design_script-workflow.qmd). You can run the script line by line in RStudio, or use the quarto command-line interface to render the entire document, which executes the code and generates a report.
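For the command-line route, here is a minimal sketch that renders the workflow document from R by calling the Quarto CLI (`quarto render <file>`) through `system2`. The wrapper name `render_workflow` and its `dry_run` argument are hypothetical. The sketch assumes the `quarto` executable is on the PATH and that R is running in the repository root.

```r
# Hypothetical wrapper (not part of this repo) around `quarto render`.
render_workflow <- function(qmd = "HCR_probe_design_script-workflow.qmd",
                            dry_run = FALSE) {
  args <- c("render", shQuote(qmd))
  if (dry_run) return(args)                   # only report the arguments
  quarto <- unname(Sys.which("quarto"))
  if (!nzchar(quarto)) stop("quarto executable not found on PATH")
  if (!file.exists(qmd)) stop("cannot find ", qmd)
  status <- system2(quarto, args)             # exit status; 0 means success
  invisible(status)
}

# Example: render_workflow()
```

The equivalent shell command is `quarto render HCR_probe_design_script-workflow.qmd`.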
Upon successful completion, the output files listed below can be downloaded from the Shiny interface. The script workflow instead writes them to a new directory within the output/ folder, named with the run date and target name (e.g., output/YYYY-MM-DD_mygene-exon_HCRB1/).
- probes.txt: A table of the final selected probes, ready for ordering.
- probes.pdf: A plot visualising the location of the selected probes on the target transcript.
- details.csv: A detailed table of thermodynamics and other calculated values of the final probe set.
- params.txt: A record of the parameters used for the design run.
- rawblastoutput.csv: The raw output from the BLAST search for further inspection.
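In an automated pipeline it can help to confirm that a run produced every expected file. The following is a minimal base-R sketch, assuming that the output file names end with the names listed above (any prefix is allowed). The helper `check_run_outputs` is hypothetical and not part of this repo.

```r
# Hypothetical helper (not part of this repo): report which of the expected
# output files are present in a run directory, matching on file-name suffix.
check_run_outputs <- function(run_dir) {
  expected <- c("probes.txt", "probes.pdf", "details.csv",
                "params.txt", "rawblastoutput.csv")
  found <- list.files(run_dir)
  present <- vapply(expected,
                    function(suffix) any(endsWith(found, suffix)),
                    logical(1))
  data.frame(file = expected, present = unname(present))
}

# Example (directory name follows the pattern described in this README):
# check_run_outputs("output/YYYY-MM-DD_mygene-exon_HCRB1")
```

The result is a data frame with one row per expected file, so a pipeline can stop early when any `present` value is `FALSE`.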