Authors:
- Lorenzo Rosset (Ecole Normale Supérieure ENS, Sorbonne Université)
- Roberto Netti (Sorbonne Université)
- Anna Paola Muntoni (Politecnico di Torino)
- Martin Weigt (Sorbonne Université)
- Francesco Zamponi (Sapienza Università di Roma)
Maintainer: Anna Paola Muntoni
adabmDCA 2.0 is a flexible yet easy-to-use implementation of Direct Coupling Analysis (DCA) based on Boltzmann machine learning. This package provides tools for analyzing residue-residue contacts, predicting mutational effects, scoring sequence libraries, and generating artificial sequences, applicable to both protein and RNA families. The package is designed for flexibility and performance, supporting multiple programming languages (C++, Julia, Python) and architectures (single-core/multi-core CPUs and GPUs).
This repository contains the C/C++ version of adabmDCA, maintained by Anna Paola Muntoni.
The project's main repository can be found at adabmDCA 2.0.
- Direct Coupling Analysis (DCA) based on Boltzmann machine learning.
- Support for dense and sparse generative DCA models.
- Available on multiple architectures: single-core and multi-core CPUs, GPUs.
- Ready-to-use for residue-residue contact prediction, mutational-effect prediction, and sequence design.
- Compatible with protein and RNA family analysis.
In the src folder, run
make
This generates the executable file adabmDCA. In the main folder, also run chmod +x adabmDCA.sh to make the main script executable. For the basic runs, see
./adabmDCA.sh [ train | sample | energies | DMS | contacts ] --help
or consult the main page.
./adabmDCA -f <MSA file> -a <output folder> -k <label> -m <nsave> -L
- Output files are saved every nsave iterations, as set by the -m flag.
- The output folder takes the name given with the -a flag, and files are labeled with the argument of the -k flag.
- Use -w <file name> to supply an ad hoc weights file (optional).
- For RNA, set the flag -b n; for an ad hoc alphabet, set -b <alphabet>, where alphabet is a string.
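The -b option selects the alphabet used to read the MSA. As a rough illustration of what an alphabet string means in practice, the sketch below maps each aligned sequence to the integer position of its symbols in the alphabet. The two alphabet strings, the function name encode_sequence, and the choice to map unknown symbols to the gap are assumptions made for this example; they do not describe how the package itself parses its input.

```python
# Illustrative only: the alphabet strings and the handling of unknown
# symbols are assumptions, not the package's documented behavior.
PROTEIN_ALPHABET = "-ACDEFGHIKLMNPQRSTVWY"  # assumed: gap + 20 amino acids
RNA_ALPHABET = "-ACGU"                      # assumed: gap + 4 nucleotides


def encode_sequence(sequence, alphabet):
    """Map each symbol to its index in `alphabet`.

    Symbols that are not in the alphabet are mapped to the gap symbol '-'.
    """
    index_of = {symbol: i for i, symbol in enumerate(alphabet)}
    gap = index_of["-"]
    return [index_of.get(symbol, gap) for symbol in sequence.upper()]


encoded = encode_sequence("AC-GU", RNA_ALPHABET)
print(encoded)
```

An ad hoc alphabet would simply be another string passed as the second argument, with the position of each character defining its integer state.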
./adabmDCA -f <MSA file> -k <label> -a <output folder> -p <params> -c <convergence threshold> -x <required sparsity> -L
- The -A flag removes gauge invariance at the beginning of the training.
- Additional option: -V <drate>.
./adabmDCA -f <MSA file> -k <label> -a <output folder> -I 0. -Z -c <convergence threshold> -X <gsteps> -L -e <nsweep>
- The -Z flag inactivates all couplings at the beginning of the training.
- Setting -I 0. allows one to start from a profile model.
- Additional option: -U <factivate>.
- Training converges at the target Pearson, whatever the density.
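The convergence criterion for this run refers to a Pearson correlation coefficient. A minimal sketch of that statistic follows, assuming it is computed between two equally long lists of statistics, one measured on the data and one on samples from the model. Which statistics adabmDCA actually compares is not stated here, and the names pearson and has_converged as well as the 0.95 target are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def has_converged(data_stats, model_stats, target=0.95):
    """Hypothetical stopping rule: stop once the Pearson reaches `target`."""
    return pearson(data_stats, model_stats) >= target
```

A criterion of this kind depends only on how well the model statistics track the data statistics, not on how many couplings are active, which is consistent with convergence being declared "whatever the density".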
Add the flag --restore to restart the training from the checkpoint saved in the output folder.
To sample from a trained model, use
./adabmDCA -p <params> -f <MSA file> -i 0 -S -L -s <nconfig>
- Additional option: -W nmix (optional).
./adabmDCA --energies -p <params> -f <MSA file>
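The --energies run scores the sequences of the MSA file under the model stored in <params>. For orientation, a DCA model is a Potts model, whose energy is conventionally written E(a) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j). The sketch below evaluates this expression on a toy model held in memory. The data layout (nested lists for the fields, a dictionary keyed by site pairs for the couplings) and the name potts_energy are assumptions for illustration; the code does not read the package's parameter file.

```python
def potts_energy(seq, fields, couplings):
    """Energy of an integer-encoded sequence under a Potts model.

    seq:       list of integer states, one per site
    fields:    fields[i][a] is the field h_i(a)
    couplings: couplings[(i, j)][a][b] is J_ij(a, b) for i < j;
               pairs missing from the dictionary count as zero (sparse model)
    """
    energy = 0.0
    for i, a in enumerate(seq):
        energy -= fields[i][a]
    for (i, j), block in couplings.items():
        energy -= block[seq[i]][seq[j]]
    return energy


# Toy model: 3 sites, 2 states per site, a single active coupling.
fields = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.0]]
couplings = {(0, 1): [[0.0, 0.0], [0.0, 2.0]]}

print(potts_energy([1, 1, 0], fields, couplings))
```

Lower energy corresponds to higher model probability, so sequences can be ranked by this score.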
./adabmDCA --dms -p <params> -f <wild type>
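The --dms run takes a wild-type sequence as input. A deep mutational scan is commonly scored as the energy difference between each single-site mutant and the wild type, E(mutant) - E(wild type), with negative values marking mutants the model finds more favorable. The sketch below does this on a toy Potts model; the in-memory layout and the names potts_energy and single_mutant_scan are hypothetical, and the package's actual output format is not represented.

```python
def potts_energy(seq, fields, couplings):
    """E(a) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j) for a toy model."""
    energy = 0.0
    for i, a in enumerate(seq):
        energy -= fields[i][a]
    for (i, j), block in couplings.items():
        energy -= block[seq[i]][seq[j]]
    return energy


def single_mutant_scan(wild_type, fields, couplings):
    """Return {(site, new_state): delta_E} for every single-site substitution."""
    e_wt = potts_energy(wild_type, fields, couplings)
    n_states = len(fields[0])
    deltas = {}
    for site, old_state in enumerate(wild_type):
        for new_state in range(n_states):
            if new_state == old_state:
                continue  # skip the wild-type state itself
            mutant = list(wild_type)
            mutant[site] = new_state
            deltas[(site, new_state)] = potts_energy(mutant, fields, couplings) - e_wt
    return deltas


# Toy model: 2 sites, 2 states per site.
fields = [[0.0, 1.0], [0.5, 0.0]]
couplings = {(0, 1): [[0.0, 0.0], [0.0, 2.0]]}
scan = single_mutant_scan([1, 1], fields, couplings)
print(scan)
```

Recomputing the full energy for each mutant keeps the sketch short; a real implementation would typically update only the terms that involve the mutated site.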
For a complete list of options, see
./adabmDCA -h
This package is open-sourced under the MIT License.
If you use this code, even partially, please cite:
Rosset, L., Netti, R., Muntoni, A.P., Weigt, M., & Zamponi, F. (2024). adabmDCA 2.0: A flexible but easy-to-use package for Direct Coupling Analysis.
This work was developed in collaboration with Sorbonne Université, Sapienza Università di Roma, and Politecnico di Torino.