perf: default integer arrays to int32 for ~25% memory reduction #566

FBumann · 2026-02-02T08:25:44Z

Changes proposed in this Pull Request

Default internal integer arrays (labels, vars indices, _term coords) to int32 instead of int64. This cuts dataset memory by a factor of ~1.25 and improves build speed by ~10-35%.

What changed

linopy/constants.py: Added DEFAULT_LABEL_DTYPE = np.int32
linopy/model.py: Variable and constraint label assignment uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE) with an overflow guard that raises ValueError if labels exceed the int32 maximum (2,147,483,647, about 2.1 billion)
linopy/expressions.py: _term coord assignment and .astype(int) for vars arrays now use DEFAULT_LABEL_DTYPE
linopy/variables.py: ffill, bfill, sanitize use DEFAULT_LABEL_DTYPE instead of astype(int) (which widened labels back to int64); Variables.to_dataframe arange uses int32
linopy/constraints.py: Constraints.to_dataframe arange uses DEFAULT_LABEL_DTYPE
linopy/common.py: fill_missing_coords uses int32 arange; save_join outer-join fallback uses DEFAULT_LABEL_DTYPE instead of astype(int); polars schema infers Int32/Int64 based on actual array dtype
test/test_constraints.py: Updated dtype assertions to use np.issubdtype (compatible with both int32 and int64)
test/test_dtypes.py (new): Tests for int32 labels, expression vars, solve correctness, and overflow guard
dev-scripts/benchmark_lp_writer.py (new): Benchmark script supporting --phase memory|build|lp_write with --plot comparison mode
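
The label assignment and overflow guard listed above could look roughly like the following minimal sketch. `assign_labels` and its parameters are hypothetical names chosen for illustration; only `DEFAULT_LABEL_DTYPE = np.int32`, the `np.arange(..., dtype=DEFAULT_LABEL_DTYPE)` call, and the `ValueError` on overflow come from the description. The actual code in linopy/model.py may differ.

```python
import numpy as np

DEFAULT_LABEL_DTYPE = np.int32  # as added in linopy/constants.py


def assign_labels(start: int, size: int) -> np.ndarray:
    """Hypothetical helper: return `size` consecutive labels starting at `start`."""
    end = start + size
    max_label = np.iinfo(DEFAULT_LABEL_DTYPE).max  # 2_147_483_647 for int32
    # Guard before allocating: the largest label assigned is end - 1.
    if end - 1 > max_label:
        raise ValueError(
            f"Label {end - 1} exceeds the maximum of {max_label} "
            f"for dtype {np.dtype(DEFAULT_LABEL_DTYPE).name}."
        )
    return np.arange(start, end, dtype=DEFAULT_LABEL_DTYPE)


labels = assign_labels(0, 5)
print(labels.dtype, labels.nbytes)  # int32 20
```

Checking the bound with plain Python integers before calling `np.arange` avoids relying on NumPy's own overflow behaviour, which is one plausible reason to guard explicitly.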

Benchmark results

Reproduce with:

# Run on master
python dev-scripts/benchmark_lp_writer.py --phase memory --model basic -o bench_master_memory.json --label "master_org"
python dev-scripts/benchmark_lp_writer.py --phase build --model basic -o bench_master_build.json --label "master_org"

# Run on this branch
python dev-scripts/benchmark_lp_writer.py --phase memory --model basic -o bench_int32_memory.json --label "int32"
python dev-scripts/benchmark_lp_writer.py --phase build --model basic -o bench_int32_build.json --label "int32"

# Plot comparisons
python dev-scripts/benchmark_lp_writer.py --plot bench_master_memory.json bench_int32_memory.json
python dev-scripts/benchmark_lp_writer.py --plot bench_master_build.json bench_int32_build.json

Memory (dataset `.nbytes`)

Consistent 1.25x reduction across all problem sizes (e.g. 640 MB → 512 MB at 8M vars, i.e. 20% fewer bytes). The labels and vars arrays shrink by 50% (int64 → int32), while lower/upper/coeffs/rhs stay float64.
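
To make the arithmetic concrete: 640 / 512 = 1.25, which corresponds to 20% fewer bytes. The sketch below reproduces the same ratio on a toy layout of two integer arrays and three float64 arrays. That layout is an assumption picked for illustration, not linopy's actual dataset composition, and `toy_nbytes` is a hypothetical helper.

```python
import numpy as np

n = 1_000  # number of variables in the toy dataset


def toy_nbytes(int_dtype) -> int:
    """Total bytes of a toy layout: 2 integer arrays and 3 float64 arrays."""
    arrays = [
        np.zeros(n, dtype=int_dtype),   # stand-in for labels
        np.zeros(n, dtype=int_dtype),   # stand-in for vars
        np.zeros(n, dtype=np.float64),  # stand-in for lower
        np.zeros(n, dtype=np.float64),  # stand-in for upper
        np.zeros(n, dtype=np.float64),  # stand-in for coeffs
    ]
    return sum(a.nbytes for a in arrays)


before = toy_nbytes(np.int64)  # 5 * 8 * n bytes
after = toy_nbytes(np.int32)   # (2 * 4 + 3 * 8) * n bytes
print(before / after)              # 1.25
print(f"{1 - after / before:.0%}")  # 20%
```

Only the integer arrays shrink, so the overall saving depends on the share of integer data in the dataset.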

Build speed

Builds are consistently ~1.1-1.35x faster across all sizes (30 iterations with GC, tight error bars): about 10-20% faster for large models (170 ms → 153 ms at 8M vars) and up to 35% faster for small and medium models, where the fixed overhead of array allocation matters more relative to total time.
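
A comparable measurement can be sketched with the standard library's `timeit`. This is not the benchmark script from this PR: `build` is a hypothetical stand-in for a build step, and the array size is kept small so the sketch runs quickly. Note that `timeit` disables garbage collection by default, so it is re-enabled in `setup` to time with GC active.

```python
import timeit

import numpy as np

N = 100_000  # small size for a quick run; the PR reports sizes such as 8M vars


def build(dtype) -> np.ndarray:
    """Stand-in for a build step: allocate a label array and derive a second one."""
    labels = np.arange(N, dtype=dtype)
    return labels + 1


def median_seconds(dtype, repeat: int = 30) -> float:
    # timeit turns the garbage collector off; re-enable it in setup.
    timer = timeit.Timer(lambda: build(dtype), setup="import gc; gc.enable()")
    times = timer.repeat(repeat=repeat, number=1)
    return float(np.median(times))


t64 = median_seconds(np.int64)
t32 = median_seconds(np.int32)
print(f"int64: {t64 * 1e3:.3f} ms, int32: {t32 * 1e3:.3f} ms")
```

The absolute numbers depend on the machine and will not match the figures above; the sketch only shows the shape of the measurement.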

Similar results on a real PyPSA model.

No influence on the lp_write phase.

Checklist

Code changes are sufficiently documented; i.e. new functions contain docstrings and further explanations may be given in doc.
Unit tests for new features were added (if applicable).
A note for the release notes doc/release_notes.rst of the upcoming release is included.
I consent to the release of this PR's code under the MIT license.

linopy/constants.py: Added DEFAULT_LABEL_DTYPE = np.int32
linopy/model.py: Variable and constraint label assignment now uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE) with overflow guards that raise ValueError if labels exceed int32 max.
linopy/expressions.py: _term coord assignment and all .astype(int) for vars arrays now use DEFAULT_LABEL_DTYPE (int32).
linopy/common.py: fill_missing_coords uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE). Polars schema inference now checks array.dtype.itemsize instead of the old OS/numpy-version hack.
test/test_constraints.py: Updated 2 dtype assertions to use np.issubdtype instead of == int.
test/test_dtypes.py (new): 7 tests covering int32 labels, expression vars, solve correctness, and overflow guards.
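
Choosing the integer column type from `array.dtype.itemsize` could look like the sketch below. `infer_int_schema` is a hypothetical name, and the function returns the type name as a string rather than a Polars type (`pl.Int32` / `pl.Int64`) so that the example does not depend on Polars. The real logic in linopy/common.py may differ.

```python
import numpy as np


def infer_int_schema(array: np.ndarray) -> str:
    """Hypothetical helper: name the Polars integer type matching the array width."""
    if not np.issubdtype(array.dtype, np.integer):
        raise TypeError(f"expected an integer array, got {array.dtype}")
    # itemsize is the number of bytes per element: 4 for int32, 8 for int64.
    return "Int32" if array.dtype.itemsize <= 4 else "Int64"


print(infer_int_schema(np.arange(3, dtype=np.int32)))  # Int32
print(infer_int_schema(np.arange(3, dtype=np.int64)))  # Int64
```

Reading the width from the array itself sidesteps any assumption about the platform's default integer size.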

…k to int64 via astype(int), now use DEFAULT_LABEL_DTYPE. Also Variables.to_dataframe arange for map_labels.
- linopy/constraints.py: Constraints.to_dataframe arange for map_labels.
- linopy/common.py: save_join outer-join fallback was casting to int64.
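
The widening that these changes avoid can be reproduced in isolation. In this sketch, `astype(int)` casts to NumPy's default integer type, which is int64 on most platforms, while casting to the shared constant keeps the narrow dtype. The variable names are illustrative only.

```python
import numpy as np

DEFAULT_LABEL_DTYPE = np.int32

labels = np.arange(4, dtype=DEFAULT_LABEL_DTYPE)

# astype(int) casts to NumPy's default integer, which is int64 on most
# platforms, so the int32 labels are silently widened.
widened = labels.astype(int)

# Casting to the shared constant keeps the narrow dtype.
kept = labels.astype(DEFAULT_LABEL_DTYPE)

print(widened.dtype, kept.dtype)

# A width-agnostic check in the style of the updated tests:
assert np.issubdtype(widened.dtype, np.integer)
assert np.issubdtype(kept.dtype, np.integer)
```

Because the default integer width is platform dependent, assertions based on `np.issubdtype` stay valid for both int32 and int64.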

FBumann added 4 commits February 1, 2026 19:29

Add memory benchmark

b5df113

bench: improve benchmark_lp_writer.py

d0a8c74

FBumann changed the title ~~Perf/int32~~ perf: default integer arrays to int32 for ~25% memory reduction Feb 2, 2026

Add dtype tests

2f3e87e

FBumann mentioned this pull request Feb 2, 2026

Feat/benchmarks #567

Open

4 tasks
