Hello de-goulash team,
First of all, thank you for developing this very interesting and powerful tool. I have read your paper, "Deconvoluting multi-person biological mixtures and accurate characterization and identification of separated contributors using non-targeted single-cell DNA sequencing," and I am very keen to apply the de-goulash pipeline to my own data.
I am currently setting up the configuration for the second part of the pipeline (Snakefile_analysis) and am having difficulty locating some of the required input files.
Specifically, I would be very grateful if you could provide guidance on how to obtain the following files:
- exome_96_remmapedto38.vcf.gz: This file appears to be a custom reference panel. Could you please advise if this file is publicly available, and if so, where I might be able to download it?
- 1000G_populations.txt: Could you possibly provide the version of this file that was used in your analysis, or provide guidance on how to generate it correctly?
- dirpath_1000G: The README helpfully points to the 1000 Genomes FTP server. However, the server contains a vast number of files. Could you clarify which specific files or subdirectories from the release are required for the analysis?
Any help or clarification you could provide would be greatly appreciated and would help me get the pipeline running.
Thank you for your time and for your great contribution to the field.
Best regards,
Li-Yaoting
2025.09.24