Frequently Asked Questions
Q: What are the hardware requirements for MSFragger?
A: This depends on the complexity of the search. In general, we recommend at least 8-16 GB of memory, but complex closed searches and open searches will require more. Additional variable modifications, semi-enzymatic searches, and non-enzymatic/non-specific searches will require additional memory and processing cores to keep search times short. (We do not recommend performing an open search with non-specific cleavage rules.) Performance scales well with the number of processors.
Q: Does MSFragger require decoys in the FASTA sequence database?
A: Yes. Philosopher can download and format a database for you (including the addition of reversed/decoy sequences). Custom databases can also be generated and formatted by Philosopher. See this page for more information & usage examples.
Q: How can I convert raw MS/MS files to mzML for MSFragger searches?
A: We recommend using the msconvert tool from ProteoWizard. A tutorial can be found here.
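As a rough illustration, a conversion call can be assembled like this. It is a minimal sketch: `sample1.raw` and `mzml_out` are placeholder names, only the `--mzML` and `-o` (output directory) options of msconvert are used, and any filters recommended in the tutorial (for example peak picking) are deliberately left out.

```python
def msconvert_command(raw_file, out_dir):
    """Build an msconvert argument list that writes mzML files into out_dir.

    Only --mzML (output format) and -o (output directory) are set here;
    extra filters may be needed depending on the instrument and the tutorial.
    """
    return ["msconvert", raw_file, "--mzML", "-o", out_dir]


# Placeholder file and directory names, for illustration only.
cmd = msconvert_command("sample1.raw", "mzml_out")
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a machine where msconvert is on the PATH.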
Q: I got a memory error when I tried to run MSFragger. What can I do?
A: First make sure a 64-bit (x64) version of the Java runtime environment (JRE) is installed. (The latest version of JRE, which is 64-bit, can be downloaded here.) If you’re running MSFragger from the command line, make sure to specify an appropriate amount of memory for your hardware configuration (e.g. -Xmx32G to use 32 GB). The database splitting option (available in FragPipe and through the command line) can be used to reduce the size of the in-memory fragment ion index that MSFragger generates. If you still run out of memory, you can reduce the size of the fragment ion index in a few ways: 1) use only reviewed sequences in your FASTA database, 2) decrease digest_max_length, 3) remove some variable modifications, or 4) try a fully-enzymatic search.
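The memory setting mentioned above is passed to Java as a heap-size flag. The sketch below only assembles such a command line; `MSFragger.jar`, `fragger.params`, and the mzML file names are placeholders, and the argument order (parameter file followed by spectral files) is an assumption to check against the usage text of your MSFragger version.

```python
def msfragger_command(heap_gb, jar_path, params_path, spectra):
    """Build a java command line with an explicit maximum heap size.

    heap_gb is turned into the -Xmx flag (e.g. 32 -> -Xmx32G).
    """
    if heap_gb <= 0:
        raise ValueError("heap_gb must be a positive number of gigabytes")
    return ["java", f"-Xmx{heap_gb}G", "-jar", jar_path, params_path, *spectra]


# Placeholder paths, for illustration only.
cmd = msfragger_command(32, "MSFragger.jar", "fragger.params",
                        ["file1.mzML", "file2.mzML"])
print(" ".join(cmd))
```

Choose a heap size that leaves some memory for the operating system; requesting more than the machine has will not help.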
Q: What is the difference between "precursor_mass_lower/upper" and "precursor_true_tolerance"?
A: The precursor_mass_lower/upper parameters are the precursor mass boundaries used in the search. For example, precursor_mass_lower = -20, precursor_mass_upper = 20, precursor_mass_units = 1 would result in a ±20 ppm precursor mass tolerance, the default for a closed search. On the other hand, precursor_mass_lower = -150, precursor_mass_upper = 500, precursor_mass_units = 0 would be an open search with a [-150, +500] Da precursor mass window.
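To make the two examples concrete, the sketch below converts the three parameters into an absolute mass window around a reference mass. The function name and the 1000 Da reference mass are illustrative, and treating the bounds as offsets around a single reference mass is a simplifying assumption, not a description of MSFragger's internals.

```python
def precursor_window(mass_da, lower, upper, units):
    """Return the (min, max) accepted mass in Da around mass_da.

    units follows the FAQ examples: 1 means ppm, 0 means Da.
    """
    if units == 1:  # ppm: offsets scale with the reference mass
        return (mass_da * (1 + lower * 1e-6), mass_da * (1 + upper * 1e-6))
    if units == 0:  # Da: offsets are absolute
        return (mass_da + lower, mass_da + upper)
    raise ValueError("units must be 0 (Da) or 1 (ppm)")


closed = precursor_window(1000.0, -20, 20, 1)    # closed search, 20 ppm
opened = precursor_window(1000.0, -150, 500, 0)  # open search, Da window
print(f"closed: {closed[0]:.4f} to {closed[1]:.4f} Da")
print(f"open:   {opened[0]:.1f} to {opened[1]:.1f} Da")
```

At 1000 Da, 20 ppm corresponds to about 0.02 Da on each side, while the open-search window spans 650 Da.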
The precursor_true_tolerance is used to break ties in open searches and is also used as precursor_mass_lower/upper in the first search (if applicable). For example, if precursor_true_tolerance = 20, precursor_true_units = 1, MSFragger would use precursor_mass_lower = -20, precursor_mass_upper = 20, precursor_mass_units = 1 in the first search no matter what the precursor_mass_lower/upper values were.
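The override described here can be written out as a small function. This is a sketch of the rule as stated in the answer, with a plain dictionary standing in for the parameter file; the function name is hypothetical.

```python
def first_search_params(params):
    """Return a copy of params as they would apply in the first search.

    The precursor window is replaced by +/- precursor_true_tolerance,
    expressed in precursor_true_units, whatever the original window was.
    """
    out = dict(params)
    tol = params["precursor_true_tolerance"]
    out["precursor_mass_lower"] = -tol
    out["precursor_mass_upper"] = tol
    out["precursor_mass_units"] = params["precursor_true_units"]
    return out


# An open-search configuration using the values from the FAQ examples.
open_search = {
    "precursor_mass_lower": -150,
    "precursor_mass_upper": 500,
    "precursor_mass_units": 0,
    "precursor_true_tolerance": 20,
    "precursor_true_units": 1,
}
first = first_search_params(open_search)
print(first["precursor_mass_lower"], first["precursor_mass_upper"],
      first["precursor_mass_units"])
```

The original dictionary is left unchanged, so the wide open-search window is still available for the main search.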
Q: Can low resolution (e.g. ion trap) MS/MS data be used in MSFragger?
A: Low resolution MS/MS spectra are suitable for closed searches, but we recommend that open searches only be performed on high mass accuracy MS/MS spectra (e.g., acquired in an Orbitrap or high-res TOF).
Q: How should experiments with multiple fractions or experimental groups be handled in FragPipe?
A: Select 'Multi-Experiment Report' in the 'Report' tab of FragPipe, and assign labels to each experimental condition in the 'Select LC/MS Files' tab, e.g.:
| File | Experiment/Group |
|---|---|
| file1.mzML | 1_1 |
| file2.mzML | 1_1 |
| file3.mzML | 1_1 |
| file4.mzML | 1_2 |
| file5.mzML | 1_2 |
| file6.mzML | 1_2 |
where the number before the underscore indicates the biological replicate and the number after it indicates the replicate within that biological replicate. In this example, the first three files will be combined in 1_1, and the other three in 1_2. Each result file (combined_peptide.tsv and combined_protein.tsv) will have just two columns (1_1 and 1_2) with spectral counts and/or intensity-based quantification.
If you want to obtain a separate result for each file, then you can assign separate labels for each, e.g.:
| File | Experiment/Group |
|---|---|
| file1.mzML | 1_1 |
| file2.mzML | 1_2 |
| file3.mzML | 1_3 |
| file4.mzML | 2_1 |
| file5.mzML | 2_2 |
| file6.mzML | 2_3 |
The summary tables will then have quantification separately for 1_1...1_3, 2_1...2_3.
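The effect of the two labeling schemes can be sketched as a simple grouping step: files that share a label end up in one group, and each distinct label becomes one quantification column. The function below is illustrative only and is not part of FragPipe.

```python
from collections import defaultdict


def group_by_label(annotation):
    """Group file names by their Experiment/Group label, keeping first-seen order."""
    groups = defaultdict(list)
    for file_name, label in annotation:
        groups[label].append(file_name)
    return dict(groups)


files = [f"file{i}.mzML" for i in range(1, 7)]

# First table: three files share each label.
combined = group_by_label(zip(files, ["1_1"] * 3 + ["1_2"] * 3))
# Second table: every file has its own label.
separate = group_by_label(zip(files, ["1_1", "1_2", "1_3", "2_1", "2_2", "2_3"]))

print(list(combined))  # labels that become columns in the first scheme
print(list(separate))  # labels that become columns in the second scheme
```

With the first annotation there are two groups of three files each; with the second there are six groups of one file each.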