NVIDIA
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
Lines changed: 9 additions & 3 deletions
diff --git a/.gitlab/ci/cuda-13-build.yml b/.gitlab/ci/cuda-13-build.yml
Lines changed: 2 additions & 3 deletions
diff --git a/.gitlab/ci/kit-extensions.yml b/.gitlab/ci/kit-extensions.yml
Lines changed: 0 additions & 1 deletion
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 67 additions & 50 deletions
diff --git a/PUBLICATIONS.md b/PUBLICATIONS.md
Lines changed: 3 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/VERSION.md b/VERSION.md
Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ mac-aarch64 build:
 
 linux-x86_64 cuda 13 build:
   stage: build
-  image: quay.io/pypa/manylinux_2_34_x86_64:latest
+  image: quay.io/pypa/manylinux_2_28_x86_64:latest
   extends:
     - .save_warp_bin_artifact
     - .ipp_lnx_x86_64_cpu_medium
@@ -369,7 +369,14 @@ linux-aarch64 test orin:
   image: ghcr.io/astral-sh/uv:bookworm-slim
   extends:
     - .save_test_report_artifact
-    - .basic_test_changes_rules
+  rules:
+    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
+    - if: $CI_PIPELINE_SOURCE == "schedule"
+    - if: $CI_COMMIT_TAG
+    - if: $CI_COMMIT_BRANCH =~ /^release-.*/
+    - if: $CI_PIPELINE_SOURCE == "web"
+    - when: manual # Can be triggered in all other scenarios
+      allow_failure: true
   before_script:
     - echo -e "\\e[0Ksection_start:`date +%s`:install_dependencies[collapsed=true]\\r\\e[0KInstalling dependencies"
     - df -h
@@ -914,7 +921,6 @@ publish wheels to gitlab package registry:
     - .ipp_lnx_x86_64_cpu_micro
   rules:
     - if: $CI_COMMIT_TAG
-    - if: $CI_COMMIT_BRANCH =~ /^release-.*/
     - when: manual # If not auto-triggered, allow any pipeline to run this job manually
       allow_failure: true
   before_script:
 
@@ -148,7 +148,7 @@ create pypi wheels:
   script:
     - printf '%s+cu13\n' "$(head -n1 VERSION.md | tr -d '\n\r')" > VERSION.md # Modify VERSION.md with +cu13
     - uv build --wheel -C--build-option=-Pwindows-x86_64
-    - uv build --wheel -C--build-option=-Plinux-x86_64 -C--build-option=-Mmanylinux_2_34
+    - uv build --wheel -C--build-option=-Plinux-x86_64 -C--build-option=-Mmanylinux_2_28
     - uv build --wheel -C--build-option=-Plinux-aarch64 -C--build-option=-Mmanylinux_2_34
     - find . -type f -exec chmod 664 {} +
     - find . -type d -exec chmod 775 {} +
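
Note on the `printf` step in the hunk above: it rewrites `VERSION.md` so that the CUDA 13 wheels carry a `+cu13` local version label. The sketch below restates that transformation in Python for readers less familiar with the shell pipeline. `with_local_label` is a hypothetical helper name used only for illustration; it is not part of the repository or of this change.

```python
def with_local_label(version_text: str, label: str = "cu13") -> str:
    """Return the first line of version_text with '+<label>' appended.

    Mirrors `head -n1 VERSION.md | tr -d '\\n\\r'` followed by
    `printf '%s+cu13\\n'`: keep only the first line, drop CR/LF,
    then append the local version label and a single newline.
    """
    first_line = version_text.split("\n", 1)[0].replace("\r", "")
    return f"{first_line}+{label}\n"


if __name__ == "__main__":
    # Illustrative input only; the real file content comes from VERSION.md.
    print(with_local_label("1.10.0\n"), end="")  # prints 1.10.0+cu13
```

Like the shell version, the sketch ignores anything after the first line and does not check that the input is a well-formed version string.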
@@ -188,7 +188,6 @@ publish wheels to gitlab package registry:
     - .ipp_lnx_x86_64_cpu_micro
   rules:
     - if: $PARENT_COMMIT_TAG
-    - if: $PARENT_COMMIT_BRANCH =~ /release-.*/
     - when: manual # Can be triggered in all other scenarios
       allow_failure: true
   before_script:
@@ -249,7 +248,7 @@ publish tag wheels to artifactory:
       'arch=aarch64;os=linux'
     - >
       jf rt sp --fail-no-op --url=$ARTIFACTORY_BASE_URL --access-token $ARTIFACTORY_SVC_ACCESS_TOKEN
-      sw-warp-pypi-local/warp/$PARENT_COMMIT_REF_NAME/*-manylinux_2_34_x86_64.whl
+      sw-warp-pypi-local/warp/$PARENT_COMMIT_REF_NAME/*-manylinux_2_28_x86_64.whl
       'arch=x86_64;os=linux'
     # Set additional common properties on all artifacts
     - >
 
@@ -256,7 +256,6 @@ publish to gitlab package registry:
     - .ipp_lnx_x86_64_cpu_micro
   rules:
     - if: $PARENT_COMMIT_TAG
-    - if: $PARENT_COMMIT_BRANCH =~ /release-.*/
     - when: manual # Can be triggered in all other scenarios
       allow_failure: true
   before_script:
 
@@ -1,6 +1,6 @@
 # Changelog
 
-## [Unreleased] - 2025-??
+## [1.10.0] - 2025-11-02
 
 ### Added
 
@@ -11,88 +11,108 @@
   ([GH-886](https://github.com/NVIDIA/warp/issues/886)).
 - Add support for negative indexing and improve slicing for the `wp.array()` type
   ([GH-504](https://github.com/NVIDIA/warp/issues/504)).
-- Add support for composite type tile indexed assignment and extraction ([GH-941](https://github.com/NVIDIA/warp/issues/941)).
-- Add `warp/examples/tile/example_tile_mcgp.py`, demonstrating how to implement a Monte Carlo Laplace solver. 
-- Add `wp.tile_full()` builtin, which fills a tile with a constant value.
+- Add `wp.cast()` to reinterpret a value as a different type while preserving its bit pattern
+  ([GH-789](https://github.com/NVIDIA/warp/issues/789)).
+- Add support for error functions: `wp.erf()`, `wp.erfc()`, `wp.erfinv()`, and `wp.erfcinv()`
+  ([GH-910](https://github.com/NVIDIA/warp/issues/910)).
+- Add `wp.tile_full()`, which fills a tile with a constant value ([GH-973](https://github.com/NVIDIA/warp/issues/973)).
+- Add axis-reduction overloads for `wp.tile_reduce()` and `wp.tile_sum()`
+  ([GH-835](https://github.com/NVIDIA/warp/issues/835)).
+- Add support for component-level indexing and assignment on tiles of composite types (e.g. `tile[i][1]` for
+  extracting vector components, `tile[i][1, 1]` for matrix elements)
+  ([GH-941](https://github.com/NVIDIA/warp/issues/941)).
+- Add `warp/examples/tile/example_tile_mcgp.py`, demonstrating how to implement a Monte Carlo Laplace solver.
 - Add support for recording and waiting for external events in CUDA graphs
   ([GH-983](https://github.com/NVIDIA/warp/issues/983)).
-- Add kernel-level functions `bsr_row_index()` and `bsr_block_index()` to `warp.sparse`
-  ([GH-895](https://github.com/NVIDIA/warp/issues/895)).
 - Add support for querying CPU memory information (requires `psutil` package)
   ([GH-985](https://github.com/NVIDIA/warp/issues/985)).
-- Add support for limiting the graph cache size of JAX callables ([GH-989](https://github.com/NVIDIA/warp/issues/989)).
-- Add support for JAX pmap ([GH-976](https://github.com/NVIDIA/warp/pull/976)).
-- Add support for `wp.erf()`, `wp.erfc()`, `wp.erfinv()`, and `wp.erfcinv()` ([GH-910](https://github.com/NVIDIA/warp/issues/910)).
-- Add axis reduction overloads for `wp.tile_reduce()` and `wp.tile_sum()`
-  ([GH-835](https://github.com/NVIDIA/warp/issues/835)).
-- Add adjoint for `wp.transform()` when constructing with individual scalars ([GH-1011](https://github.com/NVIDIA/warp/issues/1011)).
-- Add a double precision overload for `wp.intersect_tri_tri` ([GH-1015](https://github.com/NVIDIA/warp/issues/1015)).
 - Add `wp.get_cuda_supported_archs()` to query supported CUDA compute architectures for compilation targets
   ([GH-964](https://github.com/NVIDIA/warp/issues/964)).
-- Add `wp.cast()` to reinterpret a value as a different type while preserving its bit pattern
-  ([GH-789](https://github.com/NVIDIA/warp/issues/789)).
 - Add runtime version verification to detect native library mismatches.
   Version mismatches trigger warnings but allow execution to continue
   ([GH-1018](https://github.com/NVIDIA/warp/issues/1018)).
+- Add kernel-level functions `bsr_row_index()` and `bsr_block_index()` to `warp.sparse`
+  ([GH-895](https://github.com/NVIDIA/warp/issues/895)).
+- Add adjoint for `wp.transform()` when constructing with individual scalars
+  ([GH-1011](https://github.com/NVIDIA/warp/issues/1011)).
+- Add a double-precision overload for `wp.intersect_tri_tri()` ([GH-1015](https://github.com/NVIDIA/warp/issues/1015)).
+- Add support for `jax.pmap()` ([GH-976](https://github.com/NVIDIA/warp/pull/976)).
 - Add automatic differentiation support with `jax_kernel(enable_backward=True)`
   ([GH-912](https://github.com/NVIDIA/warp/pull/912), [GH-515](https://github.com/NVIDIA/warp/issues/515)).
+- Add support for limiting the graph cache size of JAX callables ([GH-989](https://github.com/NVIDIA/warp/issues/989)).
+- Add PyTorch-Warp interop deferred gradient allocation case study to documentation
+  ([GH-1046](https://github.com/NVIDIA/warp/issues/1046)).
 
 ### Removed
 
 - Remove `warp.sim` module and related examples. This module has been superseded by the Newton library, a separate
   package with a new API. For migration guidance, see the
   [Newton migration guide](https://newton-physics.github.io/newton/migration.html) and the original GitHub announcement
   ([GH-735](https://github.com/NVIDIA/warp/discussions/735)).
-- Remove support for passing lists, tuples, and other non-Warp array arguments when calling built-ins at the Python scope
-  (e.g: `wp.normalize([1.0, 2.0, 3.0])` should be written as `wp.normalize(wp.vec3(1.0, 2.0, 3.0))`).
-- Remove support for Intel-based macOS (x86_64). Apple Silicon-based Macs (ARM64) remain fully supported.
-  Users attempting to run Warp on Intel Macs will receive a `RuntimeError` directing them to use Warp 1.9.x or earlier
-  ([GH-1016](https://github.com/NVIDIA/warp/issues/1016))
+- Remove support for passing lists, tuples, and other non-Warp array arguments when calling built-ins at the Python
+  scope (deprecated since v0.11.0). Use explicit type constructors instead (e.g., `wp.normalize([1.0, 2.0, 3.0])`
+  should be `wp.normalize(wp.vec3(1.0, 2.0, 3.0))`).
+- Remove support for Intel-based macOS (x86_64). Apple Silicon-based Macs (ARM64) continue to be supported with the CPU
+  backend. Users on Intel Macs will receive a `RuntimeError` directing them to use Warp 1.9.x or earlier
+  ([GH-1016](https://github.com/NVIDIA/warp/issues/1016)).
 - Remove `wp.select()` (deprecated since 1.7). Use `wp.where(cond, value_if_true, value_if_false)` instead.
 - Remove the `wp.matrix(pos, quat, scale)` built-in function. Use `wp.transform_compose()` instead
   ([GH-980](https://github.com/NVIDIA/warp/issues/980)).
 
 ### Deprecated
 
-- Deprecate constructing a matrix from vectors at the Python scope (e.g.: `wp.mat22(wp.vec2(1, 2), wp.vec2(3, 4))` should become `wp.matrix_from_rows(wp.vec2(1, 2), wp.vec2(3, 4))`) ([GH-981](https://github.com/NVIDIA/warp/issues/981)).
+- Deprecate constructing a matrix from vectors at the Python scope (e.g. `wp.mat22(wp.vec2(1, 2), wp.vec2(3, 4))`
+  should become `wp.matrix_from_rows(wp.vec2(1, 2), wp.vec2(3, 4))`)
+  ([GH-981](https://github.com/NVIDIA/warp/issues/981)).
 
 ### Changed
 
-- Improve efficiency for `wp.bvh_query_aabb()`, `wp.mesh_query_aabb()` and `wp.bvh_query_ray()`.
-  This fixes a performance regression introduced in Warp 1.6.0 ([GH-758](https://github.com/NVIDIA/warp/issues/758)).
+- **Breaking:** Change the default implementation of `jax_kernel()` to be `wp.jax_experimental.ffi.jax_kernel()`.
+  The previous version is still available as `wp.jax_experimental.custom_call.jax_kernel()`, but it is not supported
+  with JAX v0.8 and newer ([GH-974](https://github.com/NVIDIA/warp/issues/974)).
+- **Breaking:** Raise `RuntimeError` from `wp.load_module()` when attempting to load a module that does not contain
+  any Warp kernels, functions, or structs ([GH-920](https://github.com/NVIDIA/warp/issues/920)).
+- Improve performance when calling built-in functions from the Python scope
+  ([GH-801](https://github.com/NVIDIA/warp/issues/801)).
 - Improve efficiency of struct instance creation and attribute access ([GH-968](https://github.com/NVIDIA/warp/issues/968)).
+- Add `leaf_size` parameter to `wp.Bvh` and `bvh_leaf_size` to `wp.Mesh` to control the number of primitives per leaf
+  for performance tuning. The default is now 1 for `wp.Bvh` and 4 for `wp.Mesh`, changed from a hardcoded value of
+  4 ([GH-994](https://github.com/NVIDIA/warp/issues/994)).
 - Make `warp.sparse` operations with `masked=True` consistent with `bsr_mm()` by preserving result matrix topology,
   enabling CUDA subgraph capture for `bsr_axpy()`, `bsr_assign()` and `bsr_set_transpose()`
   ([GH-987](https://github.com/NVIDIA/warp/issues/987)).
-- Add `max_new_nnz` argument to `wp.sparse.bsr_mm` providing a synchronization-free path without further assumptions about non-zero topology.
-- Improve performance when calling built-in functions from the Python scope
-  ([GH-801](https://github.com/NVIDIA/warp/issues/801)).
-- Building `warp.fem` geometry and function space partitions is now possible in CUDA graphs by passing an explicit upper-bound for the number of cells and nodes to `ExplicitGeometryPartition` and `make_space_partition`. Additionally, building fields and field restrictions is now synchronization-free by default ([GH-1021](https://github.com/NVIDIA/warp/issues/1021)).
-- Raise `RuntimeError` from `wp.load_module()` when attempting to load a module that does not contain any Warp kernels,
-  functions, or structs ([GH-920](https://github.com/NVIDIA/warp/issues/920)).
+- Add `max_new_nnz` argument to `wp.sparse.bsr_mm()` providing a synchronization-free path without further assumptions
+  about non-zero topology.
+- Building `warp.fem` geometry and function space partitions is now possible in CUDA graphs by passing an explicit
+  upper-bound for the number of cells and nodes to `ExplicitGeometryPartition` and `make_space_partition`.
+  Building fields and field restrictions is now synchronization-free by default
+  ([GH-1021](https://github.com/NVIDIA/warp/issues/1021)).
 - Default the `q` argument in `wp.transform()` to the identity quaternion at the kernel scope
   ([GH-923](https://github.com/NVIDIA/warp/issues/923)).
-- Add `leaf_size` parameter to `wp.Bvh` and `bvh_leaf_size` to `wp.Mesh` to control the number of primitives per leaf
-  for performance tuning. The default is now 1 for `wp.Bvh` and 4 for `wp.Mesh`, changed from a hardcoded value of
-  4 ([GH-994](https://github.com/NVIDIA/warp/issues/994)).
-- **Breaking:** Change the default implementation of `jax_kernel()` to be `wp.jax_experimental.ffi.jax_kernel()`.
-  The previous version is still available as `wp.jax_experimental.custom_call.jax_kernel()`, but it is not supported with JAX v0.8 and newer
-  ([GH-974](https://github.com/NVIDIA/warp/issues/974)).
+- Improve efficiency for `wp.bvh_query_aabb()`, `wp.mesh_query_aabb()` and `wp.bvh_query_ray()`.
+  This fixes a performance regression introduced in Warp 1.6.0 ([GH-758](https://github.com/NVIDIA/warp/issues/758)).
 
 ### Fixed
 
+- Fix segmentation faults on AArch64 CPUs when using tiles. The fix uses stack memory for tile storage
+  and is controlled by `wp.config.enable_tiles_in_stack_memory` (enabled by default)
+  ([GH-957](https://github.com/NVIDIA/warp/issues/957)).
 - Fix copying and filling arrays with large strides ([GH-929](https://github.com/NVIDIA/warp/issues/929)).
-- Fix graph deletion during capture ([GH-992](https://github.com/NVIDIA/warp/issues/992)).
-- Fix return type annotations for `struct()` and `overload()` decorators ([GH-971](https://github.com/NVIDIA/warp/pull/971))
-- Fix segmentation faults on AArch64 CPUs caused by referencing static memory. The LLVM JIT generates ADRP instructions
-  to address memory up to 4 GiB from the program counter, but the section for static memory may be further apart than
-  that. Work around it by reserving stack memory on kernel entry, tracked through the x28 register which is prevented
-  from being used as a scratch register. `wp.config.enable_tiles_in_stack_memory` can be used to enable (default)
-  or disable this new method ([GH-957](https://github.com/NVIDIA/warp/issues/957)).
-- Fix arithmetic operators not working when a scalar is on the lhs and an array on the rhs
-  ([GH-892](https://github.com/NVIDIA/warp/issues/892)).
-- Fix invalid keyword arguments not being detected in the `wp.transform()` constructor at Python scope
+- Fix incorrect results when filling arrays in CUDA graphs ([GH-1040](https://github.com/NVIDIA/warp/issues/1040)).
+- Defer CUDA graph deletion when graph captures are in progress ([GH-992](https://github.com/NVIDIA/warp/issues/992)).
+- Fix race conditions in CUDA graph destruction callbacks ([GH-1063](https://github.com/NVIDIA/warp/issues/1063)).
+- Fix arithmetic operators with scalars and arrays at the Python scope. Operations like `scalar * array`
+  now work correctly (previously only `array * scalar` worked) ([GH-892](https://github.com/NVIDIA/warp/issues/892)).
+- Fix `wp.atomic_add()` failing to accumulate `wp.int64` values ([GH-977](https://github.com/NVIDIA/warp/issues/977)).
+- Fix handling of multi-line lambda expressions and lambda expressions involving parentheses in `wp.map()`
+  ([GH-984](https://github.com/NVIDIA/warp/issues/984)).
+- Fix invalid keyword arguments not being detected in the `wp.transform()` constructor at the Python scope
   ([GH-975](https://github.com/NVIDIA/warp/issues/975)).
+- Fix return type annotations for `struct()` and `overload()` decorators
+  ([GH-971](https://github.com/NVIDIA/warp/pull/971)).
+- Suppress `TypeError` and `AttributeError` exceptions during Python interpreter shutdown when Warp objects are being
+  cleaned up, as these can be safely ignored during process termination
+  ([GH-1048](https://github.com/NVIDIA/warp/issues/1048)).
 
 ## [1.9.1] - 2025-10-01
 
@@ -123,8 +143,6 @@
 - Fix handling of generic kernels with `wp.jax_experimental.ffi.jax_kernel()`.
 - Update built-in documentation to accurately reflect their differentiability status
   ([GH-970](https://github.com/NVIDIA/warp/issues/970)).
-- Fix handling of multi-line lambda expressions and lambda expressions involving parentheses in `wp.map()` ([GH-984](https://github.com/NVIDIA/warp/issues/984)).
-- Fix `wp.atomic_add()` for int64 type ([GH-977](https://github.com/NVIDIA/warp/issues/977))
 
 ## [1.9.0] - 2025-09-04
 
@@ -217,8 +235,6 @@
 - Fix adding superfluous inactive nodes to tetrahedron polynomial function spaces in `warp.fem`.
 - Fix `#line` directives for Python↔CUDA source correlation not being emitted by default when a module is compiled in
   debug mode ([GH-901](https://github.com/NVIDIA/warp/issues/901)).
-- Fix 2D shared tile allocation/de-allocation bug inside Warp functions ([GH-877](https://github.com/NVIDIA/warp/issues/877)).
-- Fix loading "unique" modules using `wp.load_module()`.
 
 ## [1.8.1] - 2025-08-01
 
@@ -1923,7 +1939,8 @@
 
 - Initial publish for alpha testing
 
-[Unreleased]: https://github.com/NVIDIA/warp/compare/v1.9.0...HEAD
+[1.10.0]: https://github.com/NVIDIA/warp/releases/tag/v1.10.0
+[1.9.1]: https://github.com/NVIDIA/warp/releases/tag/v1.9.1
 [1.9.0]: https://github.com/NVIDIA/warp/releases/tag/v1.9.0
 [1.8.1]: https://github.com/NVIDIA/warp/releases/tag/v1.8.1
 [1.8.0]: https://github.com/NVIDIA/warp/releases/tag/v1.8.0
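
Note on the `wp.cast()` changelog entry above ("reinterpret a value as a different type while preserving its bit pattern"): the sketch below illustrates what bit-pattern reinterpretation means, using only the Python standard library. It does not call Warp and makes no claim about the signature of `wp.cast()`; the function names are hypothetical and exist only for this illustration.

```python
import struct


def reinterpret_f32_as_u32(x: float) -> int:
    """Return the 32-bit pattern of x (stored as float32) as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def reinterpret_u32_as_f32(bits: int) -> float:
    """Return the float32 value whose bit pattern is `bits`."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    # A value conversion changes the bits: int(1.0) == 1.
    # A reinterpretation keeps them: float32 1.0 is stored as 0x3F800000.
    print(hex(reinterpret_f32_as_u32(1.0)))    # prints 0x3f800000
    print(reinterpret_u32_as_f32(0x3F000000))  # prints 0.5
```

The contrast with an ordinary conversion is the point: `int(1.0)` yields `1`, whereas reinterpreting the same float32 yields `0x3F800000`.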
 
@@ -7,8 +7,11 @@ pull request on GitHub or email a link to your arXiv preprint (preferred) or DOI
 
 ## 2025
 
+- **Learning to Design Soft Hands using Reward Models**. *X. Bai, N. Hansen, A. Singh, M. T. Tolley, Y. Duan, P. Abbeel, X. Wang, S. Yi*. October 2025. [arXiv:2510.17086](https://arxiv.org/abs/2510.17086)
 - **Feedback Matters: Augmenting Autonomous Dissection with Visual and Topological Feedback**. *C. Wang, C. Chen, X. Liang, S. Atar, F. Richter, M. Yip*. October 2025. [arXiv:2510.04074](https://arxiv.org/abs/2510.04074)
 - **MPMAvatar: Learning 3D Gaussian Avatars with Accurate and Robust Physics-Based Dynamics**. *C. Lee, J. Lee, T. Kim*. October 2025. [arXiv:2510.01619](https://arxiv.org/abs/2510.01619)
+- **Phys4DRT: Physics-based 4D Generation for Real-Time Interaction with Time-Frequency Supervision**. *Y. Xiao, S. Zhang, Z. Zhang, J. Cui, Y. Wang, S. Li*. October 2025. [DOI:10.1145/3746027.3754827](https://doi.org/10.1145/3746027.3754827)
+- **An End-to-End Framework for Modelling Pneumatic Soft Robots Based on Differentiable Finite Element Methods**. *S. Zhong, Y. Yao, P. Maiolino, I. Posner*. October 2025. [DOI:10.1109/lra.2025.3625507](https://doi.org/10.1109/lra.2025.3625507)
 - **MechStyle: Augmenting Generative AI with Mechanical Simulation to Create Stylized and Structurally Viable 3D Models**. *F. Faruqi, A. Abdel-Rahman, L. Tejedor, M. Nisser, J. Li, V. Phadnis, V. Jampani, N. Gershenfeld, M. Hofmann, S. Mueller*. September 2025. [arXiv:2509.20571](https://arxiv.org/abs/2509.20571)
 - **AERO-MPPI: Anchor-Guided Ensemble Trajectory Optimization for Agile Mapless Drone Navigation**. *X. Chen, R. Huang, L. Tang, L. Zhao*. September 2025. [arXiv:2509.17340](https://arxiv.org/abs/2509.17340)
 - **Discovering neural elastoplasticity from kinematic observations**. *G. B. Gavris, W. Sun*. September 2025. [DOI:10.1073/pnas.2508732122](https://doi.org/10.1073/pnas.2508732122)
 
@@ -44,7 +44,7 @@ the `pip install` command, e.g.
 | Platform        | Install Command                                                                                                               |
 | --------------- | ----------------------------------------------------------------------------------------------------------------------------- |
 | Linux aarch64   | `pip install https://github.com/NVIDIA/warp/releases/download/v1.10.0/warp_lang-1.10.0+cu13-py3-none-manylinux_2_34_aarch64.whl` |
-| Linux x86-64    | `pip install https://github.com/NVIDIA/warp/releases/download/v1.10.0/warp_lang-1.10.0+cu13-py3-none-manylinux_2_34_x86_64.whl`  |
+| Linux x86-64    | `pip install https://github.com/NVIDIA/warp/releases/download/v1.10.0/warp_lang-1.10.0+cu13-py3-none-manylinux_2_28_x86_64.whl`  |
 | Windows x86-64  | `pip install https://github.com/NVIDIA/warp/releases/download/v1.10.0/warp_lang-1.10.0+cu13-py3-none-win_amd64.whl`             |
 
 The `--force-reinstall` option may need to be used to overwrite a previous installation.
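
Note on the wheel name change above: under PEP 600, a `manylinux_X_Y_<arch>` platform tag marks a wheel as built for systems with glibc X.Y or newer, so moving the x86-64 wheel from `manylinux_2_34` to `manylinux_2_28` lowers the minimum glibc it declares. The sketch below parses the tag from a wheel filename and compares it against a glibc version. The helper names are hypothetical, and the check deliberately ignores CPU architecture and every other compatibility rule that `pip` applies.

```python
import re

# Matches PEP 600 platform tags such as "manylinux_2_28_x86_64".
_MANYLINUX_TAG = re.compile(r"manylinux_(\d+)_(\d+)_(.+)")


def min_glibc_from_wheel(filename: str):
    """Return ((major, minor), arch) for a manylinux wheel, else None."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    # The platform tag is the last dash-separated field; it may hold
    # several dot-separated tags, so test each one.
    for tag in stem.split("-")[-1].split("."):
        match = _MANYLINUX_TAG.fullmatch(tag)
        if match:
            return (int(match.group(1)), int(match.group(2))), match.group(3)
    return None


def glibc_is_new_enough(filename: str, host_glibc: tuple) -> bool:
    """True if host_glibc meets the wheel's declared minimum glibc."""
    requirement = min_glibc_from_wheel(filename)
    return requirement is not None and host_glibc >= requirement[0]


if __name__ == "__main__":
    new_name = "warp_lang-1.10.0+cu13-py3-none-manylinux_2_28_x86_64.whl"
    old_name = "warp_lang-1.10.0+cu13-py3-none-manylinux_2_34_x86_64.whl"
    # Assumed host with glibc 2.31, chosen only for illustration.
    print(glibc_is_new_enough(new_name, (2, 31)))  # prints True
    print(glibc_is_new_enough(old_name, (2, 31)))  # prints False
```

On a real system, `platform.libc_ver()` from the standard library can supply the host glibc version.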
 
@@ -1 +1 @@
-1.10.0rc2
+1.10.0