NIH Long COVID Computational Challenge -- Targeted Machine Learning Analysis Group
This is the formatted competition code for the L3C Challenge entry of the Targeted Machine Learning Analysis Group at UC Berkeley. See (TODO: maybe add writeup here for details of our analysis plan and results)
- obtain the synthetic data (contact @trberg for box access)
- extract the synthetic data:
    tar -xzf synthetic_data.tar.gz
- add the additional data files to the synthetic data folder:
    LL_concept_sets_fusion_everyone.csv
    LL_DO_NOT_DELETE_REQUIRED_concept_sets_all.csv
- build the docker container:
    utils/build.sh
- run the analysis:
    utils/do_analysis.sh
- fitted models and predictions will be in the output folder
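
Before building the container, it can help to confirm that the two additional concept-set files were actually copied into the synthetic data folder. The following is a minimal sketch and not part of this repository: the helper name `missing_files` is hypothetical, and the folder name `synthetic_data` used in the usage note is an assumption about where the archive extracts to.

```python
from pathlib import Path

# Concept-set files named in the setup steps above.
REQUIRED_FILES = [
    "LL_concept_sets_fusion_everyone.csv",
    "LL_DO_NOT_DELETE_REQUIRED_concept_sets_all.csv",
]


def missing_files(data_dir, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in `data_dir`.

    Hypothetical helper for illustration; the order of `required` is preserved.
    """
    folder = Path(data_dir)
    return [name for name in required if not (folder / name).is_file()]
```

For example, `missing_files("synthetic_data")` returns an empty list once both files are in place (assuming the folder is named `synthetic_data`), and otherwise returns the names that still need to be copied.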
The Python module format_code can process raw code exported from the enclave (as in the src_raw folder) and generate runnable Python code (as in the src folder). R code is not currently supported.