Skip to content

Commit 28ff7fa

Browse files
author
Atticus Kuhn
committed
add flamegraph
1 parent 1a30085 commit 28ff7fa

File tree

6 files changed

+48
-0
lines changed

6 files changed

+48
-0
lines changed

docs/activeContext.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,8 @@ Latest changes:
4747
- `PartIiProject/Rust.lean`: generated `for` loops iterate via `.clone().into_iter()` to avoid moving maps used more than once.
4848
- Refactored the Rust AST to be DeBruijn-indexed (`Expr : Nat → Type`, vars are `Fin ctx`) and replaced stringly-typed runtime calls with `RuntimeFn`; updated `PartIiProject/CodegenRust.lean` accordingly.
4949
- Added a performance benchmarking runner `Performance.lean` (flake app `performanceComparison`) that compares runtime (ms) of `sdql-rs` binaries vs Lean-generated Rust binaries, including microbenchmarks and TPCH cases.
50+
- Added a flamegraph profiling runner `Flamegraph.lean` (flake app `flamegraph`) that generates per-TPCH SVGs for Lean-generated Rust binaries using `cargo flamegraph`.
51+
- Flamegraph runner now rewrites `TPCH_DATASET_PATH` and TPCH tiny load paths to absolute paths so profiling works from the generated cargo workspace.
5052
- Refined the optimisation benchmarking runner `OptimisationPerformanceComparison.lean` (flake app `optimisationPerformanceComparison`) to use `[SDQLProg2 { T }| ... ]` programs, drop CLI arg parsing, and increase default input sizes so optimisation effects are clearer.
5153
- Fixed a dependent-pattern-matching blocker in optimisation passes by refactoring `Term2.mul`/`Term2.proj` to carry typeclass witnesses (`has_tensor`/`has_proj`) instead of computed indices (`tensor` / `List.getD`) directly.
5254
- Added a small `Term2` optimisation framework (`PartIiProject/Optimisations/Apply.lean`) where each rewrite is a non-recursive `Optimisation` and `applyOptimisations{,Loc}` performs the recursive traversal + (fuel-bounded) fixpoint iteration.

docs/progress.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,8 @@ What works:
4545
- CI: GitHub Actions workflow builds the project and runs the test executable on pushes/PRs.
4646
- `nix run` support: wrapper script ensures datasets are present and runs tests with proper environment setup; reference binaries are built on-demand by the Lean test runner when missing.
4747
- Performance comparison: `Performance.lean` executable `performanceComparison` benchmarks runtime (ms) of `sdql-rs` reference binaries vs Lean-generated Rust binaries.
48+
- Flamegraph profiling: `Flamegraph.lean` executable `flamegraph` generates per-TPCH SVG flamegraphs for Lean-generated Rust binaries.
49+
- Flamegraph runner normalizes `TPCH_DATASET_PATH` and TPCH tiny load paths to absolute paths so dataset loading works from the profiling cargo workspace.
4850
- Optimisation performance comparison: `OptimisationPerformanceComparison.lean` executable `optimisationPerformanceComparison` benchmarks runtime (ms) of unoptimised vs optimised Lean-generated Rust binaries for the implemented SDQL optimisations.
4951
- Surface/core terms are DeBruijn-indexed: surface terms in `SurfaceCore2.lean`, core terms in `Term2.lean`, with lowering in `ToCore2`.
5052
- Optimisation-friendly `Term2` indices: `mul`/`proj` carry `has_tensor`/`has_proj` witnesses to avoid dependent-elimination failures when pattern-matching in optimisation passes.

docs/techContext.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,7 @@ How to run:
5454
- Run tests: `lake exe sdql-tests`.
5555
- Preferred: `nix run` (runs the full test suite; sdql-rs TPCH reference binaries are built on-demand by the Lean test runner via `cargo build --release --bin ...` when missing).
5656
- Performance comparison: `nix run .#performanceComparison` (times `sdql-rs` binaries vs Lean-generated Rust binaries; must be run from the project root).
57+
- Flamegraph profiling: `nix run .#flamegraph` (generates per-TPCH SVGs for Lean-generated Rust binaries; must be run from the project root).
5758
- Optimisation performance comparison: `nix run .#optimisationPerformanceComparison` (times unoptimised vs optimised Lean-generated Rust binaries for a small suite of SDQL optimisation patterns).
5859
- Explore: open the `.lean` files and evaluate examples with `#eval`.
5960
- Try the DSL: use `[SDQLProg2 { int }| 3 + 5 ]` (runs the full pipeline) or start from `[SDQL| 3 + 5 ]` and call `loadTermToSProg2` explicitly.

flake.nix

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -59,6 +59,11 @@
5959
src = ./.;
6060
};
6161

62+
sdqlFlamegraph = lake.mkPackage {
63+
name = "flamegraph";
64+
src = ./.;
65+
};
66+
6267
# Wrapper script that sets up datasets for tests and runs the Lean test runner.
6368
# TPCH reference binaries are built on-demand by `Tests/Main.lean`.
6469
sdqlTestsWithRef = pkgs.writeShellApplication {
@@ -123,6 +128,33 @@
123128
exec ${sdqlOptPerf}/bin/optimisationPerformanceComparison "$@"
124129
'';
125130
};
131+
132+
flamegraphRunner = pkgs.writeShellApplication {
133+
name = "flamegraph";
134+
runtimeInputs =
135+
[ rustToolchain pkgs.cargo-flamegraph ]
136+
++ pkgs.lib.optionals pkgs.stdenv.isLinux [ pkgs.linuxPackages.perf ];
137+
text = ''
138+
set -euo pipefail
139+
140+
if [ ! -f "sdql_runtime.rs" ]; then
141+
echo "Error: must be run from the project root directory (sdql_runtime.rs not found)" >&2
142+
exit 1
143+
fi
144+
145+
if [ ! -d "datasets/tpch-tiny" ]; then
146+
echo "Error: datasets/tpch-tiny not found" >&2
147+
exit 1
148+
fi
149+
150+
if [ ! -d "sdql-rs/datasets/tpch_datasets/SF_0.01" ]; then
151+
echo "Error: sdql-rs/datasets/tpch_datasets/SF_0.01 not found" >&2
152+
exit 1
153+
fi
154+
155+
exec ${sdqlFlamegraph}/bin/flamegraph "$@"
156+
'';
157+
};
126158
# Runtime tools shared by sdql reference test runners
127159
sdqlRefRuntimeInputs = with pkgs; [
128160
# JVM + Scala toolchain
@@ -218,6 +250,7 @@
218250
sdql-tests-bare = sdqlTests;
219251
performanceComparison = performanceComparison;
220252
optimisationPerformanceComparison = optimisationPerformanceComparison;
253+
flamegraph = flamegraphRunner;
221254
sdql-reference-tests = sdqlRefTestRunner;
222255
sdql-reference-tpch-0_01 = sdqlRefTPCH001;
223256
sdql-reference-tpch-1 = sdqlRefTPCH1;
@@ -257,6 +290,10 @@
257290
type = "app";
258291
program = "${optimisationPerformanceComparison}/bin/optimisationPerformanceComparison";
259292
};
293+
flamegraph = {
294+
type = "app";
295+
program = "${flamegraphRunner}/bin/flamegraph";
296+
};
260297
};
261298

262299
devShells.default = pkgs.mkShell {

lakefile.toml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,3 +28,7 @@ root = "Performance"
2828
[[lean_exe]]
2929
name = "optimisationPerformanceComparison"
3030
root = "OptimisationPerformanceComparison"
31+
32+
[[lean_exe]]
33+
name = "flamegraph"
34+
root = "Flamegraph"

todos.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,8 @@
1414
- start writing paper (in typst or latex), and add to CI/CD
1515
- optimisation pipeline is ugly, could this be improved by the use of a monad?
1616
- Current rust implementation is slowed down a lot by overused of `x.clone()` in BTreeMap. I'm not good enough with Rust to avoid this.
17+
- Use Rust profiling. Use a profiler for any serious performance analysis.
18+
1719
```bash
1820
[atticusk@nixos:~/coding/part_ii_project]$ find PartIiProject -name "*.lean" -exec wc -l {} + | sort -nr | head -n10
1921
3987 total

0 commit comments

Comments
 (0)