Draft: Train neural network for roman pots momentum reconstruction#10
…m reconstruction. Function creates a parametrized dense neural network and trains it with standardized input data.
…rdization parameters during inference
…erate model files and assign unique names based on the sha512 hash of concatenated hyperparameter values.
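A minimal sketch of the naming scheme described in the commit above: hash the concatenated hyperparameter values with sha512 and use part of the hex digest as the model name. The function name `model_hash`, the `_` separator, the sorted key order, and the truncation to 6 hex characters (suggested by the "6*4=24 bits" remark later in this thread) are assumptions for illustration, not necessarily what the PR implements.

```python
import hashlib


def model_hash(hyperparams: dict, n_hex_chars: int = 6) -> str:
    """Return a short, reproducible identifier for a set of hyperparameters.

    Hypothetical helper: the separator, key ordering and truncation length
    are illustrative choices, not taken from the PR.
    """
    # Sort keys so the same hyperparameters always give the same string,
    # regardless of the order in which the dict was built.
    concatenated = "_".join(str(hyperparams[key]) for key in sorted(hyperparams))
    digest = hashlib.sha512(concatenated.encode("utf-8")).hexdigest()
    # Each hex character carries 4 bits, so 6 characters give 24 bits.
    return digest[:n_hex_chars]


# Example values are placeholders, not the PR's configuration.
example = {"num_epochs": 100, "learning_rate": 0.01, "n_layers": 3}
name = model_hash(example)
```

Sorting the keys is one way to keep the name stable across runs; any fixed ordering would serve the same purpose.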
@veprbl I have a snakemake file where I use some global variables (DETECTOR_VERSION and DETECTOR_CONFIG) to define the directory structure. I am noticing that if I store more than one value as a list in these variables, snakemake fails, for example if I set DETECTOR_VERSION=["23.11.0","23.12.0"]. However, if I don't use that variable to define the directory structure, then it runs as expected. For example, this works fine: Does snakemake expect all outputs from an individual rule to go to a single directory?
Never mind. I think I messed up my combinatorics. Most likely not a snakemake issue.
…6*4=24 bits. The number of unique hashes that we can generate before the probability of hash collision between any 2 model hashes reaches 50% is sqrt(2*0.5*2^(24))=4096. We are unlikely to run training over hyperparameter space that large on a single iteration.
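The collision estimate in the commit above can be checked numerically with the standard birthday-problem approximation p ≈ 1 − exp(−n(n−1)/(2N)), where N = 2^24 is the number of possible hashes. The sketch below is illustrative only; the function names are hypothetical. Under this approximation, 4096 models correspond to a collision probability of roughly 39%, and the 50% point is nearer 4800, so the figure quoted above is a slightly conservative round number and the conclusion is unchanged.

```python
import math


def collision_probability(n_models: int, n_bits: int) -> float:
    """Approximate probability that at least two of n_models hashes collide."""
    space = 2 ** n_bits  # number of distinct hash values
    return 1.0 - math.exp(-n_models * (n_models - 1) / (2.0 * space))


def models_for_probability(p: float, n_bits: int) -> float:
    """Approximate number of hashes at which the collision probability reaches p."""
    space = 2 ** n_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


p_at_4096 = collision_probability(4096, 24)
n_at_half = models_for_probability(0.5, 24)
```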
…rom environment for creating directory structure
…s obeyed. Execute snakemake all1/2/3 in sequence to avoid errors
…ts rules. Parallelization at generation stage through use of wildcards.
…f hash following https://waterdata.usgs.gov/blog/snakemake-for-ml-experiments/. Establishes a DAG with greater parallelization of processes.
…he Low Pt model training
| num_epochs=MODEL_PZ["num_epochs"], | ||
| learning_rate=MODEL_PZ["learning_rate"], | ||
| size_input=MODEL_PZ["size_input"], | ||
| size_output=MODEL_PZ["size_output"], | ||
| n_layers=MODEL_PZ["n_layers"], | ||
| size_first_hidden_layer=MODEL_PZ["size_first_hidden_layer"], | ||
| multiplier=MODEL_PZ["multiplier"], | ||
| leak_rate=MODEL_PZ["leak_rate"] |
| num_epochs=MODEL_PZ["num_epochs"], | |
| learning_rate=MODEL_PZ["learning_rate"], | |
| size_input=MODEL_PZ["size_input"], | |
| size_output=MODEL_PZ["size_output"], | |
| n_layers=MODEL_PZ["n_layers"], | |
| size_first_hidden_layer=MODEL_PZ["size_first_hidden_layer"], | |
| multiplier=MODEL_PZ["multiplier"], | |
| leak_rate=MODEL_PZ["leak_rate"] | |
| **MODEL_PZ, |
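The suggestion above replaces eight explicit keyword arguments with dictionary unpacking. A small plain-Python sketch of why the two forms are equivalent follows; `collect_params` is a hypothetical stand-in for a `params:` block, and the values in `MODEL_PZ` are placeholders rather than the PR's settings.

```python
# Placeholder hyperparameters; the real values live in the PR's configuration.
MODEL_PZ = {
    "num_epochs": 100,
    "learning_rate": 0.01,
    "size_input": 4,
    "size_output": 1,
    "n_layers": 3,
    "size_first_hidden_layer": 128,
    "multiplier": 0.5,
    "leak_rate": 0.025,
}


def collect_params(**params):
    """Hypothetical stand-in for a block that accepts keyword arguments."""
    return params


# Explicit form: every key is repeated by hand.
explicit = collect_params(
    num_epochs=MODEL_PZ["num_epochs"],
    learning_rate=MODEL_PZ["learning_rate"],
    size_input=MODEL_PZ["size_input"],
    size_output=MODEL_PZ["size_output"],
    n_layers=MODEL_PZ["n_layers"],
    size_first_hidden_layer=MODEL_PZ["size_first_hidden_layer"],
    multiplier=MODEL_PZ["multiplier"],
    leak_rate=MODEL_PZ["leak_rate"],
)

# Unpacked form: one line, and new keys are picked up automatically.
unpacked = collect_params(**MODEL_PZ)
```

One consequence of the unpacked form is that every key in the dictionary is passed through, so the dictionary should contain only entries that are meant to become parameters.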
| detector_path=DETECTOR_PATH, | ||
| nevents_per_file=NEVENTS_PER_FILE, | ||
| detector_config=DETECTOR_CONFIG | ||
| output: | ||
| "results/"+DETECTOR_VERSION+"/"+DETECTOR_CONFIG+"/detector_benchmarks/"+SUBSYSTEM+"/"+BENCHMARK+"/raw_data/"+DETECTOR_VERSION+"_"+DETECTOR_CONFIG+"_{index}.edm4hep.root" | ||
| shell: | ||
| """ | ||
| npsim --steeringFile {input.script} \ | ||
| --compactFile {params.detector_path}/{params.detector_config}.xml \ |
I would drop global variables in the leaf rules, and use wildcards where possible
| detector_path=DETECTOR_PATH, | |
| nevents_per_file=NEVENTS_PER_FILE, | |
| detector_config=DETECTOR_CONFIG | |
| output: | |
| "results/"+DETECTOR_VERSION+"/"+DETECTOR_CONFIG+"/detector_benchmarks/"+SUBSYSTEM+"/"+BENCHMARK+"/raw_data/"+DETECTOR_VERSION+"_"+DETECTOR_CONFIG+"_{index}.edm4hep.root" | |
| shell: | |
| """ | |
| npsim --steeringFile {input.script} \ | |
| --compactFile {params.detector_path}/{params.detector_config}.xml \ | |
| detector_path=DETECTOR_PATH, | |
| nevents_per_file=NEVENTS_PER_FILE, | |
| output: | |
| "results/"+DETECTOR_VERSION+"/{DETECTOR_CONFIG}/detector_benchmarks/"+SUBSYSTEM+"/"+BENCHMARK+"/raw_data/"+DETECTOR_VERSION+"_{DETECTOR_CONFIG}_{index}.edm4hep.root" | |
| shell: | |
| """ | |
| npsim --steeringFile {input.script} \ | |
| --compactFile {params.detector_path}/{wildcards.DETECTOR_CONFIG}.xml \ |
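To illustrate the idea behind the suggestion above, here is a plain-Python emulation of how a path template with named placeholders yields one output per combination of values. It does not use Snakemake itself; `expand_targets`, the shortened template, and the config name `config_a` are hypothetical, while the two version strings come from the question earlier in this conversation.

```python
from itertools import product

DETECTOR_VERSIONS = ["23.11.0", "23.12.0"]  # values quoted in the conversation
DETECTOR_CONFIGS = ["config_a"]             # hypothetical config name
INDICES = range(2)                          # hypothetical file indices

# Shortened, illustrative version of the output pattern in the rule.
TEMPLATE = "results/{version}/{config}/raw_data/{version}_{config}_{index}.edm4hep.root"


def expand_targets(template, **values):
    """Fill the template once per combination of the supplied value lists."""
    keys = list(values)
    return [
        template.format(**dict(zip(keys, combo)))
        for combo in product(*(values[key] for key in keys))
    ]


targets = expand_targets(
    TEMPLATE, version=DETECTOR_VERSIONS, config=DETECTOR_CONFIGS, index=INDICES
)

# Concatenating a list-valued global into a path string, by contrast, fails.
try:
    "results/" + DETECTOR_VERSIONS + "/raw_data"
    concat_failed = False
except TypeError:
    concat_failed = True
```

With placeholders in the leaf rule, the list of values only needs to appear where the final targets are requested, which keeps the rule itself independent of how many versions or configs are run.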
| //------------------------- | ||
| // | ||
| // Hit reader to relate hits at Roman Pots to momentum vectors from MC. | ||
| // | ||
| // Input(s): output file from npsim particle gun for RP particles. | ||
| // | ||
| // Output(s): txt file with training information with px_mc, py_mc, pz_mc, x_rp, slope_xrp, y_rp, slope_yrp | ||
| // | ||
| // | ||
| // Author: Alex Jentsch | ||
| //------------------------ | ||
| //Low PT preprocessing added by David Ruth |
| //------------------------- | |
| // | |
| // Hit reader to relate hits at Roman Pots to momentum vectors from MC. | |
| // | |
| // Input(s): output file from npsim particle gun for RP particles. | |
| // | |
| // Output(s): txt file with training information with px_mc, py_mc, pz_mc, x_rp, slope_xrp, y_rp, slope_yrp | |
| // | |
| // | |
| // Author: Alex Jentsch | |
| //------------------------ | |
| //Low PT preprocessing added by David Ruth | |
| // Copyright 2023 - 2024, Alex Jentsch, David Ruth | |
| // SPDX-License-Identifier: LGPL-3.0-only | |
| //------------------------- | |
| // | |
| // Hit reader to relate hits at Roman Pots to momentum vectors from MC. | |
| // | |
| // Input(s): output file from npsim particle gun for RP particles. | |
| // | |
| // Output(s): txt file with training information with px_mc, py_mc, pz_mc, x_rp, slope_xrp, y_rp, slope_yrp | |
| // | |
| // | |
| // Author: Alex Jentsch | |
| //------------------------ | |
| //Low PT preprocessing added by David Ruth |
| @@ -0,0 +1,163 @@ | |||
| import pandas as pd | |||
| import pandas as pd | |
| # Copyright YYYY, NAME | |
| # SPDX-License-Identifier: LGPL-3.0-only |
Briefly, what does this PR introduce?
Train neural network for roman pots momentum reconstruction
What kind of change does this PR introduce?
Please check if this PR fulfills the following:
Does this PR introduce breaking changes? What changes might users need to make to their code?
Does this PR change default behavior?