
Frequently Asked Questions

Sarah Haynes edited this page Oct 7, 2019 · 41 revisions

Q: What are the hardware requirements for MSFragger?

A: This depends on the complexity of the search. In general, we recommend at least 8-16 GB of memory, but complex closed searches and open searches will require more. Adding variable modifications, or running semi-enzymatic or non-enzymatic/non-specific searches, will require additional memory and processing cores to keep search times short. (We do not recommend performing an open search with non-specific cleavage rules.) Performance scales well with the number of processor cores.

Q: Does MSFragger require decoys in the FASTA sequence database?

A: Yes. Philosopher can download and format a database for you (including the addition of reversed/decoy sequences). Custom databases can also be generated and formatted by Philosopher. See this page for more information & usage examples.

Q: How can I convert raw MS/MS files to mzML for MSFragger searches?

A: We recommend using the msconvert tool from ProteoWizard. A tutorial can be found here.

Q: I got a memory error when I tried to run MSFragger. What can I do?

A: First make sure a 64-bit (x64) version of the Java Runtime Environment (JRE) is installed. (The latest version of the JRE, which is 64-bit, can be downloaded here.) If you're running MSFragger from the command line, make sure to specify an amount of memory appropriate for your hardware configuration (e.g. -Xmx32G to use 32 GB). The database splitting option (available in FragPipe and through the command line) can be used to reduce the size of the in-memory fragment ion index that MSFragger generates. If you still run out of memory, you can reduce the size of the fragment ion index in a few ways: 1) use only reviewed sequences in your FASTA database, 2) decrease digest_max_length, 3) remove some variable modifications, or 4) try a fully enzymatic search.
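
As an illustration of where the -Xmx option goes, here is a minimal Python sketch that assembles a command line for an MSFragger run. The helper name build_msfragger_command and the file names MSFragger.jar, fragger.params and file1.mzML are placeholders assumed for this example; substitute the jar name of the version you downloaded and your own parameter and spectrum files.

```python
import shlex


def build_msfragger_command(memory_gb, jar_path, params_path, spectrum_files):
    """Return the argument list for a command-line MSFragger run.

    Hypothetical helper for illustration only; the jar and file names
    passed in below are placeholders, not fixed names.
    """
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    return [
        "java",
        f"-Xmx{memory_gb}G",  # maximum Java heap size, e.g. -Xmx32G for 32 GB
        "-jar",
        jar_path,
        params_path,
        *spectrum_files,
    ]


cmd = build_msfragger_command(32, "MSFragger.jar", "fragger.params", ["file1.mzML"])
print(shlex.join(cmd))  # shell-quoted form of the command
```

The list form can be passed directly to subprocess.run. Choose a memory value that leaves some headroom for the operating system rather than the machine's full RAM.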

Q: What is the difference between "precursor_mass_lower/upper" and "precursor_true_tolerance"?

A: The precursor_mass_lower/upper parameters are the precursor mass boundaries used in the search. For example, precursor_mass_lower = -20, precursor_mass_upper = 20, precursor_mass_units = 1 would result in a ±20 ppm precursor mass tolerance, the default for a closed search. On the other hand, precursor_mass_lower = -150, precursor_mass_upper = 500, precursor_mass_units = 0 would be an open search with a [-150, +500] Da precursor mass window.
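
To make the two unit settings concrete, the following Python sketch computes the mass range implied by the two examples above for a precursor of 1000 Da. The function precursor_window is a hypothetical helper written for this illustration and is not part of MSFragger; it only shows how ppm and Da boundaries translate into a window, not how MSFragger matches candidates internally.

```python
def precursor_window(mass, lower, upper, units):
    """Return (min_mass, max_mass) in Da around `mass`.

    `units` follows precursor_mass_units as used above: 1 = ppm, 0 = Da.
    Hypothetical helper for illustration only.
    """
    if units == 1:
        # ppm boundaries scale with the mass itself
        return (mass * (1 + lower / 1e6), mass * (1 + upper / 1e6))
    if units == 0:
        # Da boundaries are absolute offsets
        return (mass + lower, mass + upper)
    raise ValueError("units must be 0 (Da) or 1 (ppm)")


closed_window = precursor_window(1000.0, -20, 20, 1)   # closed search example
open_window = precursor_window(1000.0, -150, 500, 0)   # open search example
print(closed_window, open_window)
```

For the 1000 Da precursor, the closed search window spans roughly 999.98 to 1000.02 Da, while the open search window spans 850 to 1500 Da.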

The precursor_true_tolerance is used to break ties in open searches and is also used as precursor_mass_lower/upper in the first search (if applicable). For example, if precursor_true_tolerance = 20, precursor_true_units = 1, MSFragger would use precursor_mass_lower = -20, precursor_mass_upper = 20, precursor_mass_units = 1 in the first search no matter what the precursor_mass_lower/upper values were.

Q: Can low resolution (e.g. ion trap) MS/MS data be used in MSFragger?

A: Low resolution MS/MS spectra are suitable for closed searches, but we recommend that open searches be performed only on high mass accuracy MS/MS spectra (e.g. acquired on an Orbitrap or high-resolution TOF instrument).

Q: How should experiments with multiple fractions or experimental groups be handled in FragPipe?

A: Select 'Multi-Experiment Report' in the 'Report' tab of FragPipe, and assign labels to each experimental condition in the 'Select LC/MS Files' tab, e.g.:

File          Experiment/Group
file1.mzML    1_1
file2.mzML    1_1
file3.mzML    1_1
file4.mzML    1_2
file5.mzML    1_2
file6.mzML    1_2

where the numbers before _ indicate biological replicates and the numbers after _ indicate replicates in each biological replicate. In this example, the first three files will be combined in 1_1, and the other three in 1_2. Each result file (combined_peptide.tsv and combined_protein.tsv) will have just two columns (1_1 and 1_2) with spectral counts and/or intensity-based quantification.

If you want to obtain a separate result for each file, then you can assign separate labels for each, e.g.:

File          Experiment/Group
file1.mzML    1_1
file2.mzML    1_2
file3.mzML    1_3
file4.mzML    2_1
file5.mzML    2_2
file6.mzML    2_3

The summary tables will then have quantification separately for 1_1...1_3, 2_1...2_3.
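
The effect of the two labeling schemes can be sketched in a few lines of Python. This is only an illustration of how files sharing a label end up in the same report column; group_by_label is a hypothetical helper and does not reflect how FragPipe is implemented.

```python
from collections import defaultdict


def group_by_label(assignments):
    """Map each Experiment/Group label to the list of files assigned to it.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for filename, label in assignments:
        groups[label].append(filename)
    return dict(groups)


files = [f"file{i}.mzML" for i in range(1, 7)]

# First scheme: three files per label, so two report columns
combined = group_by_label(zip(files, ["1_1", "1_1", "1_1", "1_2", "1_2", "1_2"]))

# Second scheme: one label per file, so six report columns
separate = group_by_label(zip(files, ["1_1", "1_2", "1_3", "2_1", "2_2", "2_3"]))

print(sorted(combined))
print(sorted(separate))
```

Each key in the resulting dictionary corresponds to one quantification column in combined_peptide.tsv and combined_protein.tsv.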