RISoTTo (RIbonucleic acid Sequence design from TerTiary structure) is a parameter-free geometric deep learning approach that generates RNA sequences conditioned on both their backbone scaffolds and the surrounding molecular context. This repository contains the inference code for generating RNA sequences that are predicted to fold into a given target structure.
RISoTTo takes a 3D RNA structure (PDB format) as input and generates multiple RNA sequences that are predicted to fold into that structure.
Install dependencies:
pip install -r requirements.txt
Make sure to install versions of PyTorch and CUDA that are compatible with your system.
Basic usage:
python apply_model.py --pdb_filepath path/to/structure.pdb --output_dir ./output
Usage with all options:
python apply_model.py \
    --pdb_filepath path/to/structure.pdb \
    --output_dir ./output \
    --num_samples 5 \
    --imprint_ratio 0.5 \
    --sampling probabilistic \
    --device cuda:0
Test the installation with the provided example:
python apply_model.py --pdb_filepath=test_pdb/1csl.pdb --output_dir=test_pdb/ --num_samples=5 --imprint_ratio=0.5 --device=cuda --sampling=max_confidence
This will generate test_pdb/1csl_designs.fasta with the designed sequences.
--pdb_filepath (required): Path to the input PDB file containing the RNA structure
--output_dir (required): Directory in which to save the output FASTA file
--num_samples (default: 5): Number of additional sequences to generate beyond the max-confidence sequence
--imprint_ratio (default: 0.5): Fraction of residues to constrain during sampling (0.0-1.0)
--sampling (default: "max_confidence"): Sampling method ("probabilistic", "max_confidence")
--device (default: "cuda:0"): Device for computation ("cuda:0", "cpu", etc.)
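To run the tool over several structures, the command line can be assembled from Python. The following is a minimal sketch, not part of the repository: it uses only the flags documented above, and the helper names build_command and design_all are hypothetical. It assumes the script is launched from the repository root so that apply_model.py resolves.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdb_path, output_dir, num_samples=5, imprint_ratio=0.5,
                  sampling="max_confidence", device="cuda:0"):
    """Assemble the apply_model.py argument list for one structure.

    Hypothetical helper: defaults mirror the documented CLI defaults.
    """
    if not 0.0 <= imprint_ratio <= 1.0:
        raise ValueError("imprint_ratio must be between 0.0 and 1.0")
    if sampling not in ("probabilistic", "max_confidence"):
        raise ValueError(f"unknown sampling method: {sampling}")
    return [
        sys.executable, "apply_model.py",
        "--pdb_filepath", str(pdb_path),
        "--output_dir", str(output_dir),
        "--num_samples", str(num_samples),
        "--imprint_ratio", str(imprint_ratio),
        "--sampling", sampling,
        "--device", device,
    ]


def design_all(pdb_dir, output_dir, **options):
    """Run the CLI once per *.pdb file found in pdb_dir (hypothetical helper)."""
    for pdb_path in sorted(Path(pdb_dir).glob("*.pdb")):
        # check=True raises CalledProcessError if a run fails.
        subprocess.run(build_command(pdb_path, output_dir, **options), check=True)
```

For example, design_all("test_pdb", "test_pdb/", device="cpu") would process every PDB file in test_pdb on the CPU, one run per file.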
The tool generates a FASTA file containing designed sequences with metadata:
>seq_0 | recovery=0.536 confidence=0.937
GACGCCCGCGUAAUACAAUGGAGGGUUG
>seq_1 | recovery=0.500 confidence=0.972
GAAGCCCGCGUAAUACAAUGGAGGGUUG
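The designs can be ranked after generation by reading the metadata back from the FASTA headers. The sketch below is an illustration only: it assumes every header follows the `>name | recovery=X confidence=Y` pattern of the two records shown above, and the function name parse_designs is hypothetical.

```python
import re

# Header pattern inferred from the example records only (an assumption).
HEADER = re.compile(r"^>(\S+)\s*\|\s*recovery=([\d.]+)\s+confidence=([\d.]+)")


def parse_designs(text):
    """Parse designed sequences and their metadata from FASTA text."""
    designs = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"unexpected header: {line}")
            current = {
                "name": match.group(1),
                "recovery": float(match.group(2)),
                "confidence": float(match.group(3)),
                "sequence": "",
            }
            designs.append(current)
        elif current is not None:
            # Sequence lines are appended, so wrapped FASTA also works.
            current["sequence"] += line
    return designs


example = """>seq_0 | recovery=0.536 confidence=0.937
GACGCCCGCGUAAUACAAUGGAGGGUUG
>seq_1 | recovery=0.500 confidence=0.972
GAAGCCCGCGUAAUACAAUGGAGGGUUG
"""

designs = parse_designs(example)
# Pick the design with the highest reported confidence.
best = max(designs, key=lambda d: d["confidence"])
print(best["name"], best["sequence"])
```

To process a real run, pass the contents of the generated file instead, for example the text read from test_pdb/1csl_designs.fasta.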
To train the model, first download our ML-ready dataset from [https://drive.google.com/file/d/1Ihp-RgOw6GUoTmV1lKFPmRRjOEMJZBH8/view?usp=sharing] and store it in the datasets directory.
To start training, run:
python main.py
To run secondary-structure-based forward-folding validation, install EternaFold in the software directory from [https://github.com/eternagame/EternaFold].
Download the RibonanzaNet weights for chemical reactivity prediction from [https://drive.google.com/drive/folders/1rDMwn_CJ3usmBN0_V0dQU6xAaciFgTXT?usp=sharing].
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Bibekar, P., Krapp, L. F., & Dal Peraro, M. (2025). Context-aware geometric deep learning for RNA sequence design. bioRxiv. https://doi.org/10.1101/2025.06.21.660801

