Use torch._logging._internal.trace_structured_artifact to save TraceCtxs when TORCH_TRACE is set #2751
Pull Request Overview
This PR enables dumping of TraceCtx objects using PyTorch's torch._logging._internal.trace_structured_artifact API when the TORCH_TRACE environment variable is set. This provides enhanced observability for Thunder's compilation process when using thunder.dynamo.compiler.thunderfx.
- Adds support for logging compilation traces and graph modules to TORCH_TRACE artifacts
- Implements a backward-compatibility wrapper for trace_structured_artifact to handle different PyTorch versions
- Extends test coverage to validate the TORCH_TRACE functionality
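A backward-compatibility wrapper of the kind described above could look roughly like this. This is a sketch, not the PR's actual code; the assumed signature of trace_structured_artifact (a name, an encoding, and a string payload) should be checked against the installed PyTorch version:

```python
try:
    # Only available in recent PyTorch builds.
    from torch._logging._internal import trace_structured_artifact
except ImportError:
    trace_structured_artifact = None


def trace_structured_artifact_compat(name: str, encoding: str, payload: str) -> None:
    """Log a named artifact to the TORCH_TRACE structured log, if supported.

    On PyTorch versions that predate trace_structured_artifact this is a
    silent no-op, so call sites never need their own version checks.
    """
    if trace_structured_artifact is None:
        return
    trace_structured_artifact(name, encoding, payload)
```

Because the import check happens once at module load, the per-call overhead on old PyTorch versions is a single None comparison.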
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| thunder/dynamo/utils.py | Adds log_trace_or_graphmodule_to_torch_trace helper function and backward compatibility wrapper for trace_structured_artifact |
| thunder/dynamo/compiler.py | Integrates trace logging for graph modules and split reasons in the ThunderCompiler |
| thunder/__init__.py | Adds trace logging at various compilation stages (prologue, computation, epilogue, transforms, backward) |
| thunder/tests/test_dynamo.py | Adds parametrized test to validate functionality with and without TORCH_TRACE enabled |
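From the table, the log_trace_or_graphmodule_to_torch_trace helper presumably renders whatever object it receives (a thunder TraceCtx or a torch.fx.GraphModule) into text and forwards it as a named TORCH_TRACE artifact. A minimal, self-contained sketch of that idea follows; the body is illustrative rather than the PR's exact code, and it degrades to a no-op on PyTorch builds without the artifact API:

```python
def log_trace_or_graphmodule_to_torch_trace(name: str, obj: object) -> None:
    """Dump a readable rendering of obj into the TORCH_TRACE log.

    obj may be a thunder TraceCtx (which renders usefully via str()) or a
    torch.fx.GraphModule (which offers print_readable()).
    """
    try:
        from torch._logging._internal import trace_structured_artifact
    except ImportError:
        return  # PyTorch too old for structured artifacts: skip silently

    if hasattr(obj, "print_readable"):
        # GraphModule path: return the generated source instead of printing it.
        payload = obj.print_readable(print_output=False)
    else:
        payload = str(obj)
    trace_structured_artifact(name, "string", payload)
```

Dispatching on print_readable rather than isinstance keeps the helper importable even when torch.fx is unavailable.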
Comments suppressed due to low confidence (1)
thunder/dynamo/compiler.py:144
- This assignment to 'thunder_options' is unnecessary as it is redefined before this value is used.
thunder_options = _with_prologue_pruning_transform(
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
shino16 left a comment:
It's a very cool feature!
Signed-off-by: Masaki Kozuki <mkozuki@nvidia.com>
so that the fallback path can also correctly work with torch-trace
Signed-off-by: Masaki Kozuki <mkozuki@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Masato Shinokawa <40788005+shino16@users.noreply.github.com>
Force-pushed eb5bf55 to 3eeb517
@shino16 @kshitij12345 could you give this another look?
kshitij12345 left a comment:
LGTM, thanks @crcrpar
@KaelanDt could you review this?
What does this PR do?
Taking over #2182.
This enables us to dump TraceCtx objects by setting the TORCH_TRACE env var when using thunder.dynamo.compiler.thunderfx.
What is stored?
Ref: https://docs.pytorch.org/docs/stable/compile/programming_model.observability.html
I got the following as an artifact of
TORCH_TRACE="./trace-tests/003" python thunder/benchmarks/benchmark_litgpt.py --compile "thunder_dynamo" --n_layers 2
For
TORCH_TRACE="./trace-tests/012" python thunder/benchmarks/benchmark_inference.py --mode thunder,
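The directory that TORCH_TRACE produces is normally fed to tlparse, but it can also be inspected by hand. The sketch below pulls artifact names out of a single log file, assuming the common TORCH_TRACE layout: one JSON metadata object per line, with any payload written on indented continuation lines beneath it (verify against your PyTorch version before relying on this):

```python
import json
from pathlib import Path


def artifact_names(log_path: str) -> list[str]:
    """Collect the names of 'artifact' records from a TORCH_TRACE log file."""
    names = []
    for line in Path(log_path).read_text().splitlines():
        if line.startswith(("\t", " ")):  # indented payload continuation lines
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a metadata line; skip defensively
        artifact = record.get("artifact")
        if isinstance(artifact, dict) and "name" in artifact:
            names.append(artifact["name"])
    return names
```

With the PR applied, such a scan should surface the thunder trace artifacts alongside the usual Dynamo/Inductor records.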