@FBumann FBumann commented Feb 2, 2026

Changes proposed in this Pull Request

Shrink internal integer arrays (labels, vars indices, _term coords) by 50% by defaulting to int32 instead of int64. This cuts total dataset memory by a factor of ~1.25 (about 20%, e.g. 640 MB → 512 MB at 8M vars) and improves build speed by ~10-35%.

What changed

  • linopy/constants.py: Added DEFAULT_LABEL_DTYPE = np.int32
  • linopy/model.py: Variable and constraint label assignment uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE) with overflow guard that raises ValueError if labels exceed int32 max (~2.1 billion)
  • linopy/expressions.py: _term coord assignment and .astype(int) for vars arrays now use DEFAULT_LABEL_DTYPE
  • linopy/variables.py: ffill, bfill, sanitize use DEFAULT_LABEL_DTYPE instead of astype(int) (which widened labels back to int64); Variables.to_dataframe arange uses int32
  • linopy/constraints.py: Constraints.to_dataframe arange uses DEFAULT_LABEL_DTYPE
  • linopy/common.py: fill_missing_coords uses int32 arange; save_join outer-join fallback uses DEFAULT_LABEL_DTYPE instead of astype(int); polars schema infers Int32/Int64 based on actual array dtype
  • test/test_constraints.py: Updated dtype assertions to use np.issubdtype (compatible with both int32 and int64)
  • test/test_dtypes.py (new): Tests for int32 labels, expression vars, solve correctness, and overflow guard
  • dev-scripts/benchmark_lp_writer.py (new): Benchmark script supporting --phase memory|build|lp_write with --plot comparison mode
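The label assignment with an overflow guard described above could look roughly like the following. This is a minimal sketch, not the code in linopy/model.py: only the constant name DEFAULT_LABEL_DTYPE comes from this PR, while the helper `assign_labels`, its signature, and the error message are hypothetical.

```python
import numpy as np

# Same name as the constant added in linopy/constants.py; the rest is illustrative.
DEFAULT_LABEL_DTYPE = np.int32


def assign_labels(start: int, size: int) -> np.ndarray:
    """Return `size` consecutive labels beginning at `start` (hypothetical helper)."""
    end = start + size
    max_label = int(np.iinfo(DEFAULT_LABEL_DTYPE).max)
    # Compare plain Python ints before allocating, so the check itself cannot overflow.
    if end - 1 > max_label:
        raise ValueError(
            f"Label {end - 1} exceeds the maximum of {max_label} "
            f"for dtype {np.dtype(DEFAULT_LABEL_DTYPE).name}."
        )
    return np.arange(start, end, dtype=DEFAULT_LABEL_DTYPE)


labels = assign_labels(0, 5)
print(labels.dtype)  # int32
```

Checking the bound before calling np.arange means an oversized model fails with a clear error instead of silently wrapping around to negative labels.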

Benchmark results

Reproduce with:

# Run on master
python dev-scripts/benchmark_lp_writer.py --phase memory --model basic -o bench_master_memory.json --label "master_org"
python dev-scripts/benchmark_lp_writer.py --phase build --model basic -o bench_master_build.json --label "master_org"

# Run on this branch
python dev-scripts/benchmark_lp_writer.py --phase memory --model basic -o bench_int32_memory.json --label "int32"
python dev-scripts/benchmark_lp_writer.py --phase build --model basic -o bench_int32_build.json --label "int32"

# Plot comparisons
python dev-scripts/benchmark_lp_writer.py --plot bench_master_memory.json bench_int32_memory.json
python dev-scripts/benchmark_lp_writer.py --plot bench_master_build.json bench_int32_build.json

Memory (dataset .nbytes)

Consistent 1.25x reduction across all problem sizes (e.g. 640 MB → 512 MB at 8M vars). The labels and vars arrays shrink 50% (int64 → int32) while lower/upper/coeffs/rhs stay float64.
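The arithmetic behind the 1.25x figure can be sketched as follows. The array layout is an assumption chosen only because it reproduces the quoted numbers (4 integer arrays and 6 float64 arrays of 8M elements each, with MB meaning 10^6 bytes); the actual layout of the benchmark model is not stated in this PR.

```python
import numpy as np

N_VARS = 8_000_000
# Hypothetical layout, picked to match the 640 MB -> 512 MB figures above.
N_INT_ARRAYS = 4    # e.g. labels and vars arrays
N_FLOAT_ARRAYS = 6  # e.g. lower/upper/coeffs/rhs style arrays, always float64


def dataset_nbytes(int_dtype) -> int:
    """Total bytes for the assumed layout with the given integer dtype."""
    int_bytes = N_INT_ARRAYS * N_VARS * np.dtype(int_dtype).itemsize
    float_bytes = N_FLOAT_ARRAYS * N_VARS * np.dtype(np.float64).itemsize
    return int_bytes + float_bytes


before = dataset_nbytes(np.int64)
after = dataset_nbytes(np.int32)
print(f"{before / 1e6:.0f} MB -> {after / 1e6:.0f} MB")
print(f"ratio {before / after:.2f}x, saving {1 - after / before:.0%}")
```

Under this assumed layout the integer arrays halve while the float arrays are unchanged, so the total shrinks by a factor of 1.25, which is a 20% saving.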

[Figure: benchmark_memory_comparison]

Build speed

Consistently ~1.1-1.35x faster across all sizes (30 iterations with GC, tight error bars). The gain is 10-20% for large models (170 ms → 153 ms at 8M vars) and up to 35% for small and medium models, where the fixed overhead of array allocation matters more relative to total time.

[Figure: benchmark_build_comparison]

Results are similar on a real PyPSA model.

The lp_write phase shows no measurable change.

Checklist

  • Code changes are sufficiently documented; i.e. new functions contain docstrings and further explanations may be given in doc.
  • Unit tests for new features were added (if applicable).
  • A note for the release notes doc/release_notes.rst of the upcoming release is included.
  • I consent to the release of this PR's code under the MIT license.

  linopy/constants.py — Added DEFAULT_LABEL_DTYPE = np.int32

  linopy/model.py — Variable and constraint label assignment now uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE) with overflow guards that raise ValueError if labels exceed int32 max.

  linopy/expressions.py — _term coord assignment and all .astype(int) for vars arrays now use DEFAULT_LABEL_DTYPE (int32).

  linopy/common.py — fill_missing_coords uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE). Polars schema inference now checks array.dtype.itemsize instead of the old OS/numpy-version hack.

  test/test_constraints.py — Updated 2 dtype assertions to use np.issubdtype instead of == int.
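To illustrate why the assertions moved to np.issubdtype, here is a small standalone sketch. It is not taken from test/test_constraints.py, and the array names are made up for the example.

```python
import numpy as np

labels32 = np.arange(4, dtype=np.int32)
labels64 = np.arange(4, dtype=np.int64)

# An exact comparison against the builtin `int` only matches NumPy's
# platform default integer, so its result depends on the label width.
print(labels32.dtype == int, labels64.dtype == int)

# np.issubdtype accepts any integer width, so the same assertion
# holds whether labels are stored as int32 or int64.
assert np.issubdtype(labels32.dtype, np.integer)
assert np.issubdtype(labels64.dtype, np.integer)
```

A width-agnostic check keeps the tests valid if the default label dtype changes again.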

  test/test_dtypes.py (new) — 7 tests covering int32 labels, expression vars, solve correctness, and overflow guards.
…k to int64 via astype(int), now use DEFAULT_LABEL_DTYPE. Also Variables.to_dataframe arange for map_labels.
  - linopy/constraints.py: Constraints.to_dataframe arange for map_labels.
  - linopy/common.py: save_join outer-join fallback was casting to int64.
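The widening that these changes remove can be reproduced with plain NumPy. The snippet below is a generic illustration rather than linopy code; only the constant name DEFAULT_LABEL_DTYPE is taken from this PR.

```python
import numpy as np

DEFAULT_LABEL_DTYPE = np.int32

# -1 stands in for a masked or missing label in this example.
labels = np.array([0, 1, -1, 3], dtype=DEFAULT_LABEL_DTYPE)

# astype(int) casts to NumPy's platform default integer, which is
# int64 on most 64-bit setups, undoing the int32 saving.
widened = labels.astype(int)

# Casting to the shared constant keeps the narrow dtype.
kept = labels.astype(DEFAULT_LABEL_DTYPE)

print(widened.dtype, widened.nbytes)
print(kept.dtype, kept.nbytes)
```

Routing every cast through one constant also means the label width can be changed in a single place.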
@FBumann FBumann changed the title Perf/int32 perf: default integer arrays to int32 for ~25% memory reduction Feb 2, 2026
@FBumann FBumann mentioned this pull request Feb 2, 2026
4 tasks