To use it in your python code: Use from DynaTMT_SB.DynaTMT import PD_input, plain_text_input
The DynaTMT tool can be used to analyze multiplexed enhanced protein dynamic mass spectrometry (mePROD) data, as well as mePRODmt (mitochondrial protein import proteomics) data. mePROD uses pulse SILAC combined with Tandem Mass Tag (TMT) labelling to profile newly synthesized proteins. Through a booster channel, that contains a fully heavy labelled digest, the identification rate of labelled peptides is greatly enhanced, compared to other pSILAC experiments. Through the multiplexing capacity of TMT reagents it is possible during the workflow to use the boost signal as a carrier that improves survey scan intensities, but does not interfere with quantification of the pulsed samples. This workflow makes labelling times of minutes (down to 15min in the original publication) possible.
mePRODmt extends this approach to specifically quantify mitochondrial protein import dynamics, allowing researchers to measure the rate of protein translocation into mitochondria under various conditions.
Additionally, mePROD utilizes a baseline channel, comprised of a non-SILAC labelled digest that serves as a proxy for isolation interference and greatly improves quantification dynamic range (note: baseline correction is required for MS2 data but not necessary for MS3 data due to reduced co-isolation interference in MS3). Quantification values of a heavy labelled peptide in that baseline channel are derived from co-fragmented heavy peptides and will be subtracted from the other quantifications.
The package can also be used to analyse any pSILAC/TMT dataset.
Current version: 2.9.2 (2024-06-03). See the CHANGELOG for details on what has changed.
If you use DynaTMT-py in your research, please cite the following publications:
-
Original mePROD Publication (2020):
Klann K, Tascher G, Münch C. Functional Translatome Proteomics Reveal Converging and Dose-Dependent Regulation by mTORC1 and eIF2α. Molecular Cell. 2020;77(4):913-925.e4. DOI: 10.1016/j.molcel.2019.11.010
-
Mitochondrial Protein Import Proteomics (2022):
Schäfer JA, Bozkurt S, Michaelis JB, Klann K, Münch C. Global mitochondrial protein import proteomics reveal distinct regulation by translation and translocation machinery. Molecular Cell. 2022;82(2):435-446.e7. DOI: 10.1016/j.molcel.2021.11.004
-
mePRODmt Proteomics and PBLMM Methods Paper (2024):
Bozkurt S, Parmar BS, Münch C. Quantifying mitochondrial protein import by mePRODmt proteomics. Methods in Enzymology. 2024;706:449-474. DOI: 10.1016/bs.mie.2024.07.017
-
Injection Time Adjustment Method:
Klann K, Münch C. Instrument Logic Increases Identifications during Multiplexed Translatome Measurements. Analytical Chemistry. 2020. DOI: 10.1021/acs.analchem.0c01749
- Installation
- Quick Start
- MS2 Workflow
- MS3 Workflow
- Functions Reference
- Loading Data
- Proteome Discoverer Setup
- API Documentation
- Python 3.8+
- Git (for installation from GitHub)
- JupyterLab or Jupyter Notebook (recommended for interactive analysis)
pip install --upgrade git+https://github.com/science64/DynaTMT-py-SB.git# Clone the repository
git clone https://github.com/science64/DynaTMT-py-SB.git
cd DynaTMT-py-SB
# Install the package
pip install .pip install pandas numpy matplotlib statsmodels scipyFor peptide-based linear mixed model (PBLMM) statistical analysis:
pip install --upgrade git+https://github.com/science64/PBLMM.gitimport DynaTMT_SB.DynaTMT as mePROD
import pandas as pd
# Load your PSMs file
psms = pd.read_csv("your_PSMs_file.txt", sep='\t', header=0)
# Remove booster channel (adjust column name as needed)
data = psms.drop('Abundance: 131C', axis=True)
# Initialize the processor
process = mePROD.PD_input(data)
# Filter and process
filtered = process.filter_PSMs(data)
normalized = process.total_intensity_normalisation(filtered)
heavy = process.extract_heavy(normalized)
# Continue with MS2 or MS3 specific workflow (see below)Use this workflow for MS2-based TMT quantitation data.
MS2-based quantification requires:
- Injection time (IT) adjustment - Required to correct for variable ion injection times
- Baseline correction - Required to correct for co-isolation interference
from datetime import date
import pandas as pd
import DynaTMT_SB.DynaTMT as mePROD
import PBLMM
# Configuration
wd = "Example data/MS2_data"
dataName = "20200724_SB_CCCP_ISRIB_Import_PSMs.txt"
conditions = ['Light', 'DMSO', 'DMSO', 'DMSO', 'CCCP', 'CCCP', 'CCCP', 'CCCP_ISRIB', 'CCCP_ISRIB', 'CCCP_ISRIB']
pairs = [['CCCP', 'DMSO'], ['CCCP_ISRIB', 'DMSO'], ['CCCP_ISRIB', 'CCCP']]
# Load and prepare data
psms = pd.read_csv(f'{wd}/{dataName}', sep='\t', header=0)
booster_removed = psms.drop('Abundance: 131C', axis=True) # Remove booster channel
# Initialize processor
process = mePROD.PD_input(booster_removed)
# Step 1: Filter PSMs (removes contaminants, shared peptides, isolation interference >50%)
filter_data = process.filter_PSMs(booster_removed)
# Step 2: IT adjustment (REQUIRED for MS2)
IT_adjusted = process.IT_adjustment(filter_data)
# Step 3: Normalization
sumNorm = process.total_intensity_normalisation(IT_adjusted)
# Step 4: Extract heavy peptides
heavy = process.extract_heavy(sumNorm)
# Step 5: Baseline correction (REQUIRED for MS2)
peptide_data = process.baseline_correction(heavy, threshold=15, i_baseline=0, random=True)
# Step 6: Statistical analysis with PBLMM
hypothesis_testing = PBLMM.HypothesisTesting()
resultFinal = hypothesis_testing.peptide_based_lmm(peptide_data, conditions=conditions, pairs=pairs)
resultFinal.reset_index(inplace=True)
resultFinal.rename(columns={'index': 'Accession'}, inplace=True)
# Export results
resultFinal.to_excel(f'{wd}/results_MS2_{date.today().strftime("%d.%m.%Y")}.xlsx', index=False)| Step | Function | Required | Notes |
|---|---|---|---|
| 1 | filter_PSMs() |
Yes | Removes contaminants, shared peptides, isolation interference >50% |
| 2 | IT_adjustment() |
Yes | Corrects for variable ion injection times |
| 3 | total_intensity_normalisation() |
Yes | Or use Median_normalisation() or TMM() |
| 4 | extract_heavy() |
Yes | Extracts heavy-labeled peptides |
| 5 | baseline_correction() |
Yes | Subtracts baseline channel; converts PSMs to peptides |
| 6 | PBLMM analysis | Recommended | Peptide-based linear mixed model |
Use this workflow for MS3-based TMT quantitation data.
MS3-based quantification has reduced co-isolation interference, therefore:
- Injection time adjustment - NOT required (skip this step)
- Baseline correction - NOT required (optional); remove baseline channel instead
from datetime import date
import pandas as pd
import DynaTMT_SB.DynaTMT as mePROD
import PBLMM
# Configuration
wd = "Example data/MS3_data"
dataName = "20240109_LU_LC2_MAA_DB_mePROD_MS3_PSMs.txt"
conditions = ['Cont', 'Cont', 'Cont', 'USP39', 'USP39', 'USP39']
pairs = [['USP39', 'Cont']]
# Load and prepare data
psms = pd.read_csv(f'{wd}/{dataName}', sep='\t', header=0)
booster_removed = psms.drop('Abundance 131C', axis=True) # Remove booster channel
baseline_removed = booster_removed.drop('Abundance 126', axis=True) # Remove baseline channel
# Initialize processor
process = mePROD.PD_input(baseline_removed)
# Step 1: Filter PSMs
filter_data = process.filter_PSMs(baseline_removed)
# Step 2: Normalization (NO IT adjustment for MS3!)
sumNorm = process.total_intensity_normalisation(filter_data)
# Step 3: Extract heavy peptides
heavy = process.extract_heavy(sumNorm)
# Step 4: Convert PSMs to Peptides (NO baseline correction for MS3!)
peptide_data = process.PSMs_to_Peptide(heavy)
# Step 5: Statistical analysis with PBLMM
hypothesis_testing = PBLMM.HypothesisTesting()
resultFinal = hypothesis_testing.peptide_based_lmm(peptide_data, conditions=conditions, pairs=pairs)
resultFinal.reset_index(inplace=True)
resultFinal.rename(columns={'index': 'Accession'}, inplace=True)
# Export results
resultFinal.to_excel(f'{wd}/results_MS3_{date.today().strftime("%d.%m.%Y")}.xlsx', index=False)| Step | Function | Required | Notes |
|---|---|---|---|
| 1 | filter_PSMs() |
Yes | Removes contaminants, shared peptides, isolation interference >50% |
| 2 | IT_adjustment() |
No | Skip for MS3 data |
| 3 | total_intensity_normalisation() |
Yes | Or use Median_normalisation() or TMM() |
| 4 | extract_heavy() |
Yes | Extracts heavy-labeled peptides |
| 5 | PSMs_to_Peptide() |
Yes | Combines PSMs into peptides |
| 6 | baseline_correction() |
No | Skip for MS3 data; remove baseline channel instead |
| 7 | PBLMM analysis | Recommended | Peptide-based linear mixed model |
| Feature | MS2 | MS3 |
|---|---|---|
| IT Adjustment | Required | Not needed |
| Baseline Correction | Required | Not needed (remove baseline channel) |
| Co-isolation Interference | Higher | Lower |
| Signal Intensity | Higher | Lower |
| PSM to Peptide | Via baseline_correction() |
Via PSMs_to_Peptide() |
| Function | Description |
|---|---|
get_channels(input) |
Returns list of unnormalized abundance column names |
filter_peptides(input) |
Filters peptide files: removes shared peptides, contaminants, NA values |
filter_PSMs(input) |
Filters PSM files: removes shared peptides, contaminants, isolation interference >50%, NA values |
IT_adjustment(input) |
Adjusts abundances for ion injection times (MS2 only) |
total_intensity_normalisation(input) |
Normalizes to total intensity across TMT channels |
Median_normalisation(input) |
Normalizes to median of TMT channels |
TMM(input) |
Trimmed Mean of M-values normalization |
extract_heavy(input) |
Extracts heavy-labeled (SILAC) peptides |
extract_light(input) |
Extracts light-labeled peptides |
baseline_correction(input, threshold, i_baseline, random) |
Subtracts baseline channel, filters by threshold, converts PSMs to peptides |
PSMs_to_Peptide(input) |
Combines PSMs into peptides without baseline correction |
protein_rollup(input, method) |
Aggregates peptides to protein level ('sum', 'mean', 'median') |
log2(input) |
Log2 transforms all TMT intensities |
DynaTMT uses ProteomeDiscoverer PSM or Peptide file outputs in tab-delimited text format:
import pandas as pd
# Load PSMs file
psms = pd.read_csv("PSMs.txt", sep='\t', header=0)
# Or load Peptides file
peptides = pd.read_csv("Peptides.txt", sep='\t', header=0)For custom file formats, use the plain_text_input class. Column order:
- Protein Accession
- Ion Injection Time (optional)
- Modifications
- TMT Abundances (all remaining columns)
from DynaTMT_SB.DynaTMT import plain_text_input
df = pd.read_csv("custom_data.txt", sep='\t', header=0)
processor = plain_text_input(df, it_adj=True) # Set it_adj=False if no injection time columnSince search engines cannot handle two modifications on the same residue, create custom modifications combining TMT and heavy lysine:
| Modification Name | Description | ΔM (monoisotopic) | ΔM (average) |
|---|---|---|---|
Label:13C(6)15N(4) |
Heavy Arginine (PD default) | +10.0083 | - |
TMTK8 |
TMT + Heavy Lysine (K8) | +237.1771 | +237.2061 |
TMTproK8 |
TMTpro + Heavy Lysine (K8) | +312.2213 | +312.2554 |
TMTK4 |
TMT + Heavy Lysine (K4) | - | - |
TMTK6 |
TMT + Heavy Lysine (K6) | - | - |
TMTproK4 |
TMTpro + Heavy Lysine (K4) | - | - |
TMTproK6 |
TMTpro + Heavy Lysine (K6) | - | - |
- Create an MS2 or MS3 reporter ion-based workflow depending on your acquisition method
- Use SequenceHT node for database searches
- Set enzyme to trypsin
- Add TMT as static modification at N-terminus and lysines
- Add TMTK8/TMTproK8 and Label:13C(6)15N(4) (Arg10) as dynamic modifications
- Use 1% FDR threshold at peptide and protein levels
- Use default Percolator settings
- Export PSMs file in text format
Class for analyzing ProteomeDiscoverer peptide/PSM output with default column names.
Initializes the PD_input class with input DataFrame. Automatically extracts TMT abundance channels.
Parameters:
input(DataFrame): ProteomeDiscoverer PSM or Peptide output
Identifies and returns abundance column names from the input DataFrame.
Returns:
- List of column names containing abundance data (excludes normalized columns)
Filters peptide-level data by removing:
- Shared peptides (non-unique quantification)
- Contaminant proteins
- Rows with NA values in abundance channels
Parameters:
filtered_input(DataFrame): Peptide data
Returns:
- DataFrame: Filtered peptide data
Filters PSM-level data by removing:
- Shared peptides (multiple protein accessions)
- Empty protein accessions
- Contaminant proteins
- PSMs with isolation interference >50%
- Rows with NA values in abundance channels
Parameters:
filtered_input(DataFrame): PSM data
Returns:
- DataFrame: Filtered PSM data
Adjusts TMT abundances for ion injection times. Required for MS2 data only.
Formula: Abundance_adjusted = (Abundance / Injection_Time) * 1000
Parameters:
input(DataFrame): Data with "Ion Inject Time" column
Returns:
- DataFrame: IT-adjusted data
Normalizes abundances to the total intensity across all TMT channels. The channel with the lowest total intensity is used as reference.
Parameters:
input(DataFrame): Data to normalize
Returns:
- DataFrame: Normalized data
Normalizes abundances to the median of each TMT channel. The channel with the lowest median is used as reference.
Parameters:
input(DataFrame): Data to normalize
Returns:
- DataFrame: Normalized data
Implements Trimmed Mean of M-values (TMM) normalization (Robinson & Oshlack, 2010, Genome Biology).
Parameters:
input(DataFrame): Data to normalize
Returns:
- DataFrame: TMM-normalized data
Extracts heavy-labeled peptides based on modification strings. Searches for: TMTK8, Label, TMTproK8, TMTK4, TMTK6, TMTproK4, TMTproK6
Parameters:
input(DataFrame): Data containing Modifications column
Returns:
- DataFrame: Heavy-labeled peptides only
Extracts light-labeled peptides (peptides WITHOUT heavy modifications).
Parameters:
input(DataFrame): Data containing Modifications column
Returns:
- DataFrame: Light-labeled peptides only
Combines PSMs into peptides by grouping on Annotated Sequence, Modifications, and Master Protein Accessions, then summing abundances.
Parameters:
input(DataFrame): PSM-level data
Returns:
- DataFrame: Peptide-level data
Subtracts baseline channel from all other channels and converts PSMs to peptides. Required for MS2 data.
Parameters:
input_file(DataFrame): Heavy peptide datathreshold(float): Minimum mean signal after baseline subtraction (default: 5)i_baseline(int): Index of baseline channel (default: 0, typically TMT 126)random(bool): If True, replaces zero/negative values with random values 0-1 (recommended for PBLMM)
Returns:
- DataFrame: Baseline-corrected peptide data
Notes:
- Negative values after subtraction are set to zero
- If
random=True, zero values are replaced with random values (0-1) to avoid zero-inflation in statistical models - Mean signal threshold filters low-quality peptides
- Automatically converts PSMs to peptides
Aggregates peptide-level data to protein-level by grouping on Master Protein Accessions.
Parameters:
input_file(DataFrame): Peptide-level datamethod(str): Aggregation method - 'sum', 'mean', or 'median' (default: 'sum')
Returns:
- DataFrame: Protein-level data (proteins as index)
Log2 transforms all TMT abundance values.
Parameters:
input(DataFrame): Data to transform
Returns:
- DataFrame: Log2-transformed data
Class for analyzing pSILAC data from custom text files where column identity is determined by position rather than name.
Column Order:
- Protein Accession
- Ion Injection Time (optional)
- Modifications
- All subsequent columns: TMT Abundances
Parameters:
input(DataFrame): Input datait_adj(bool): If True, expects injection time in column 2. If False, modifications are in column 2.
All other methods function identically to PD_input class.
The repository includes ready-to-use Jupyter notebooks:
| Notebook | Description |
|---|---|
mePROD LMM simplified_MS2.ipynb |
Simplified MS2 workflow with PBLMM |
mePROD LMM simplified_MS3.ipynb |
Simplified MS3 workflow with PBLMM |
mePROD LMM template.ipynb |
Full template with visualization and QC |
Example data is provided in the Example data/ folder:
MS2_data/- Example MS2 PSM data with conditions and pairs filesMS3_data/- Example MS3 PSM data with conditions and pairs files
-
Column not found error: Ensure your data uses standard ProteomeDiscoverer column names, or use
plain_text_inputclass. -
Empty result after filtering: Check that your data contains valid values and proper modification annotations.
-
Baseline correction produces all zeros: Adjust the threshold parameter to a lower value.
-
PBLMM installation fails: Ensure Git is installed and accessible from command line.
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
- Kevin Klann - Original development
- Süleyman Bozkurt - Maintenance and updates
Contact: sbozkurt.mbg@gmail.com
This work was developed in the Münch Lab at Goethe University Frankfurt.