Draft: Train neural network for roman pots momentum reconstruction by rahmans1 · Pull Request #10 · eic/detector_benchmarks

rahmans1 · 2024-01-10T18:08:37Z

Briefly, what does this PR introduce?

Train neural network for roman pots momentum reconstruction

What kind of change does this PR introduce?

Bug fix (issue #__)
New feature (issue #__)
Documentation update
Other: __

Please check if this PR fulfills the following:

Tests for the changes have been added
Documentation has been added / updated
Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

Does this PR change default behavior?

rahmans1 · 2024-01-19T17:01:18Z

@veprbl I have a Snakemake file that uses some global variables (DETECTOR_VERSION and DETECTOR_CONFIG) to define the directory structure:
Line 44
Line 53
Line 66

I am noticing that if I store more than one value as a list in these variables, Snakemake fails. For example, if I set DETECTOR_VERSION=["23.11.0","23.12.0"]:

MissingOutputException in rule roman_pots_generate_neural_network_configs in file /w/eic-scshelf2104/users/rahmans/RomanPotsML/detector_benchmarks/benchmarks/roman_pots/Snakefile, line 50:
Job 0 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
results/23.12.0/epic_craterlake/detector_benchmarks/roman_pots/ml/metadata/931f1764775d40f79ec4f5781f2e810ba0987e11064e7b89928373b5095f922395d301b8dfc228c696b3e5b8badfff29b43a191e1bb56caf2458f91ba634cfd3.txt
results/23.12.0/epic_craterlake/detector_benchmarks/roman_pots/ml/metadata/a94f560f93e5c337b2105bfe01d04ef354ab17e083ee609fd5fb75c69a4f49a876296d36fae5b83387b174a9f6879176884a36ff2c502acfd77543345b4d34e9.txt

However, if I don't use that variable to define the directory structure, it runs as expected. For example, this works fine:

output:
    expand("results/{detector_config}/detector_benchmarks/roman_pots/ml/metadata/{model_version}.txt",
           detector_config=DETECTOR_CONFIG,
           model_version=MODEL_VERSION)

Does Snakemake expect all outputs from an individual rule to go to a single directory?

rahmans1 · 2024-01-19T17:08:14Z

Never mind. I think I messed up my combinatorics. Most likely not a Snakemake issue.
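For reference, the combinatorics at play can be sketched in plain Python. By default, Snakemake's `expand` formats the pattern once for every combination (Cartesian product) of the supplied values, so a rule whose `output` expands over list-valued globals declares every combination as an output of a single job. The helper `expand_like`, the shortened path, and the model names below are hypothetical stand-ins, not code from this PR:

```python
from itertools import product

def expand_like(pattern, **wildcards):
    """Mimic the default behavior of Snakemake's expand(): format the
    pattern once for every combination of the supplied values."""
    # Promote scalars to one-element lists so product() treats them uniformly.
    values = {k: v if isinstance(v, (list, tuple)) else [v]
              for k, v in wildcards.items()}
    keys = list(values)
    return [pattern.format(**dict(zip(keys, combo)))
            for combo in product(*(values[k] for k in keys))]

outputs = expand_like(
    "results/{detector_version}/{detector_config}/metadata/{model_version}.txt",
    detector_version=["23.11.0", "23.12.0"],
    detector_config="epic_craterlake",
    model_version=["model_a", "model_b"],  # hypothetical placeholder names
)
for path in outputs:
    print(path)
```

Two versions, one configuration, and two models yield four paths, and a single job of that rule would be expected to create all four.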

…6*4=24 bits. The number of unique hashes that we can generate before the probability of hash collision between any 2 model hashes reaches 50% is sqrt(2*0.5*2^(24))=4096. We are unlikely to run training over hyperparameter space that large on a single iteration.
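The estimate above uses the small-probability form of the birthday bound, n ≈ sqrt(2·p·N) with N = 2^24. As a rough cross-check (a sketch under the assumption of uniformly distributed 24-bit hashes, not part of the PR), the snippet below evaluates that form next to the logarithmic approximation n ≈ sqrt(2·N·ln(1/(1−p))) and the exact collision probability for a given number of models. All function names are illustrative.

```python
import math

def n_small_p(bits, p):
    """Small-probability approximation: n ~ sqrt(2 * p * 2**bits)."""
    return math.sqrt(2 * p * 2**bits)

def n_log_form(bits, p):
    """Approximation that stays accurate for larger p:
    n ~ sqrt(2 * 2**bits * ln(1 / (1 - p)))."""
    return math.sqrt(2 * 2**bits * math.log(1 / (1 - p)))

def collision_probability(bits, n):
    """Exact probability that at least two of n uniformly random
    hashes of the given bit width collide."""
    space = 2**bits
    # Sum logs of (1 - i/space) to avoid underflow in the product.
    log_no_collision = sum(math.log1p(-i / space) for i in range(n))
    return 1 - math.exp(log_no_collision)

print(n_small_p(24, 0.5))               # the figure quoted in the commit message
print(n_log_form(24, 0.5))              # 50% point from the logarithmic form
print(collision_probability(24, 4096))  # exact probability at 4096 models
```

Under these assumptions the 50% point lies somewhat above 4096, so the conclusion that a single training iteration is unlikely to reach it holds with some margin.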

veprbl · 2024-02-12T16:05:36Z

benchmarks/roman_pots/Snakefile

+           num_epochs=MODEL_PZ["num_epochs"],
+           learning_rate=MODEL_PZ["learning_rate"],
+           size_input=MODEL_PZ["size_input"],
+           size_output=MODEL_PZ["size_output"],
+           n_layers=MODEL_PZ["n_layers"],
+           size_first_hidden_layer=MODEL_PZ["size_first_hidden_layer"],
+           multiplier=MODEL_PZ["multiplier"],
+           leak_rate=MODEL_PZ["leak_rate"]


Suggested change

-           num_epochs=MODEL_PZ["num_epochs"],
-           learning_rate=MODEL_PZ["learning_rate"],
-           size_input=MODEL_PZ["size_input"],
-           size_output=MODEL_PZ["size_output"],
-           n_layers=MODEL_PZ["n_layers"],
-           size_first_hidden_layer=MODEL_PZ["size_first_hidden_layer"],
-           multiplier=MODEL_PZ["multiplier"],
-           leak_rate=MODEL_PZ["leak_rate"]
+           **MODEL_PZ,
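The suggestion relies on Python's dictionary unpacking in a call: `**MODEL_PZ` passes each key of the dictionary as a keyword argument. A minimal sketch of the equivalence follows; `collect_kwargs` is a hypothetical stand-in for the function being called, and the hyperparameter values are made up (only the keys come from the diff):

```python
def collect_kwargs(**kwargs):
    """Hypothetical stand-in for any function that accepts keyword arguments."""
    return kwargs

# Made-up values for illustration; only the key names appear in the PR.
MODEL_PZ = {
    "num_epochs": [100],
    "learning_rate": [0.01],
    "size_input": [4],
    "size_output": [1],
    "n_layers": [3],
    "size_first_hidden_layer": [128],
    "multiplier": [0.5],
    "leak_rate": [0.025],
}

# Spelling out every key by hand ...
explicit = collect_kwargs(
    num_epochs=MODEL_PZ["num_epochs"],
    learning_rate=MODEL_PZ["learning_rate"],
    size_input=MODEL_PZ["size_input"],
    size_output=MODEL_PZ["size_output"],
    n_layers=MODEL_PZ["n_layers"],
    size_first_hidden_layer=MODEL_PZ["size_first_hidden_layer"],
    multiplier=MODEL_PZ["multiplier"],
    leak_rate=MODEL_PZ["leak_rate"],
)

# ... passes the same keyword arguments as unpacking the dictionary.
unpacked = collect_kwargs(**MODEL_PZ)
print(explicit == unpacked)
```

The two calls match only when the dictionary holds exactly the keys that were listed by hand; any extra key in `MODEL_PZ` would be forwarded as well.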

veprbl · 2024-02-12T16:08:28Z

benchmarks/roman_pots/Snakefile

+        detector_path=DETECTOR_PATH,
+        nevents_per_file=NEVENTS_PER_FILE,
+        detector_config=DETECTOR_CONFIG
+    output:
+        "results/"+DETECTOR_VERSION+"/"+DETECTOR_CONFIG+"/detector_benchmarks/"+SUBSYSTEM+"/"+BENCHMARK+"/raw_data/"+DETECTOR_VERSION+"_"+DETECTOR_CONFIG+"_{index}.edm4hep.root"
+    shell:
+      """
+      npsim --steeringFile {input.script} \
+            --compactFile {params.detector_path}/{params.detector_config}.xml \


I would drop global variables in the leaf rules, and use wildcards where possible

Suggested change

-        detector_path=DETECTOR_PATH,
-        nevents_per_file=NEVENTS_PER_FILE,
-        detector_config=DETECTOR_CONFIG
-    output:
-        "results/"+DETECTOR_VERSION+"/"+DETECTOR_CONFIG+"/detector_benchmarks/"+SUBSYSTEM+"/"+BENCHMARK+"/raw_data/"+DETECTOR_VERSION+"_"+DETECTOR_CONFIG+"_{index}.edm4hep.root"
-    shell:
-      """
-      npsim --steeringFile {input.script} \
-            --compactFile {params.detector_path}/{params.detector_config}.xml \
+        detector_path=DETECTOR_PATH,
+        nevents_per_file=NEVENTS_PER_FILE,
+    output:
+        "results/"+DETECTOR_VERSION+"/{DETECTOR_CONFIG}/detector_benchmarks/"+SUBSYSTEM+"/"+BENCHMARK+"/raw_data/"+DETECTOR_VERSION+"_{DETECTOR_CONFIG}_{index}.edm4hep.root"
+    shell:
+      """
+      npsim --steeringFile {input.script} \
+            --compactFile {params.detector_path}/{wildcards.DETECTOR_CONFIG}.xml \
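With wildcards in the output path, the value of `DETECTOR_CONFIG` is inferred from whichever target file is requested instead of being fixed by a global. The sketch below imitates that resolution with a regular expression; it is a simplified stand-in for Snakemake's matching, and `match_wildcards`, the shortened path, and the example values are hypothetical:

```python
import re

def match_wildcards(pattern, target):
    """Turn '{name}' placeholders into named regex groups and match a
    requested target path; a repeated name must match the same text."""
    parts, seen, pos = [], set(), 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        if name in seen:
            # Second occurrence: require the same value via a backreference.
            parts.append(f"(?P={name})")
        else:
            parts.append(f"(?P<{name}>.+)")
            seen.add(name)
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    found = re.fullmatch("".join(parts), target)
    return found.groupdict() if found else None

# Shortened, hypothetical version of the output pattern in the suggestion.
pattern = "results/23.12.0/{DETECTOR_CONFIG}/raw_data/23.12.0_{DETECTOR_CONFIG}_{index}.edm4hep.root"
target = "results/23.12.0/epic_craterlake/raw_data/23.12.0_epic_craterlake_7.edm4hep.root"

wildcards = match_wildcards(pattern, target)
print(wildcards)
```

The same rule can then serve any detector configuration, and one job is scheduled per requested file.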

veprbl · 2024-02-12T16:10:59Z

benchmarks/roman_pots/preprocess_model_training_data.cxx

+//-------------------------
+//
+// Hit reader to relate hits at Roman Pots to momentum vectors from MC.
+//
+// Input(s): output file from npsim particle gun for RP particles.
+//
+// Output(s): txt file with training information with px_mc, py_mc, pz_mc, x_rp, slope_xrp, y_rp, slope_yrp
+//
+//
+// Author: Alex Jentsch
+//------------------------
+//Low PT preprocessing added by David Ruth


Suggested change

+// Copyright 2023 - 2024, Alex Jentsch, David Ruth
+// SPDX-License-Identifier: LGPL-3.0-only
 //-------------------------
 //
 // Hit reader to relate hits at Roman Pots to momentum vectors from MC.
 //
 // Input(s): output file from npsim particle gun for RP particles.
 //
 // Output(s): txt file with training information with px_mc, py_mc, pz_mc, x_rp, slope_xrp, y_rp, slope_yrp
 //
 //
 // Author: Alex Jentsch
 //------------------------
 //Low PT preprocessing added by David Ruth

veprbl · 2024-02-12T16:11:45Z

benchmarks/roman_pots/train_dense_neural_network.py

@@ -0,0 +1,163 @@
+import pandas as pd


Suggested change

import pandas as pd
# Copyright YYYY, NAME
# SPDX-License-Identifier: LGPL-3.0-only

Add initial script for training neural network for roman pots momentum reconstruction. Function creates a parametrized dense neural network and trains it with standardized input data.

4a5c335

rahmans1 self-assigned this Jan 10, 2024

rahmans1 changed the title ~~Train neural network for roman pots momentum reconstruction~~ Draft: Train neural network for roman pots momentum reconstruction Jan 10, 2024

rahmans1 marked this pull request as draft January 10, 2024 18:13

rahmans1 added 13 commits January 11, 2024 22:29

Simplify layer description code using nn.ModuleList

77a117d

Calculate the sample mean and standard deviation to pass on as standardization parameters during inference

1a5d06e

Use number of epochs and learning rate as training hyper parameters

7398a0b

Update number format and precision in training progress message

c4e1703

Functions to run experiments with parametrized inputs to the neural network models

45ffea0

Import argparse and sys python modules

92bc46e

Fix typo in call to add_argument function

5ad5a79

Fix typo. Missing '--' in list of arguments.

e1b0786

Input file locations will be passed along with list of hyperparameters

c4310f3

Fetch the number of training inputs from hyperparameter list

6b7dae7

Fix typo in variable name multiplier

34eee45

Cast hyperparameters to the appropriate numerical datatypes

2e25da5

First commit of snakefile for roman pots neural network training. Generate model files and assign unique names based on sha512 hash of concatenated hyperparameter values.

65310c7

rahmans1 added 11 commits January 19, 2024 12:26

Unhash the detector config and detector version. Only hash the parameter string.

130ff48

Parametrise subsystem and model type for readability

08fa4c5

Introduce steering file to simulate events for model training purposes

81bf1d5

Add rule to generate training events. Use DETECTOR_VERSION inferred from environment for creating directory structure

1e66076

Use epic_ip6 geometry for faster processing of far-forward events

570f99d

Add rule to extract hit information necessary for model training

dd7e61b

Use realistic hyperparameter values

ec5d176

Add rule to train network and save models and artifacts.

d89bae8

Unnecessary because directory structure is automatically created by rule

f34159a

Variable is an array with one element. So, indexing needs to be used

d81ab52

rahmans1 and others added 4 commits January 25, 2024 18:14

Split up default target rule into 3 rules so that proper dependency is obeyed. Execute snakemake all1/2/3 in sequence to avoid errors

805c2cd

Simplify workflow. Build dependency between generate and process events rules. Parallelization at generation stage through use of wildcards.

2d1a65e

Use a nested directory approach to uniquely identify models instead of hash following https://waterdata.usgs.gov/blog/snakemake-for-ml-experiments/. Establishes a DAG with greater parallelization of processes.

acbb6c5

Changed Snakefile and preprocess_model_training_data.cxx to include the Low Pt model training

5496e3c

veprbl reviewed Feb 12, 2024

veprbl mentioned this pull request Mar 21, 2024

Demonstration of ML Integration in EICrecon eic/EICrecon#1340

Closed

rahmans1 added 2 commits June 4, 2024 08:47

Extract detector version

b0c5198

The default position of the planes updated to reflect current locations in epic

a695c51

rahmans1 wants to merge 31 commits into master from pr/train-neural-network-for-roman-pots-momentum-reconstruction
