# v2.0.0-rc.9 #319

decahedron1 announced in Announcements
## 🌴 Undo The Flattening (d4f82fc)
A previous `ort` release 'flattened' all exports, such that everything was exported at the crate root: `ort::{TensorElementType, Session, Value}`. This was done at a time when `ort` didn't export much, but now it exports a lot, so this was leading to some big, ugly `use` blocks. `rc.9` now has most exports behind their respective modules: `Session` is now imported as `ort::session::Session`, `Tensor` as `ort::value::Tensor`, etc. rust-analyzer and some quick searches on docs.rs can help you find the right paths to import.
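For example, a `use` block that previously pulled everything from the crate root now spells out the modules (the two paths below are the ones named above; find others via rust-analyzer or docs.rs):

```rust
// Before rc.9: everything was re-exported at the crate root.
// use ort::{Session, Tensor};

// From rc.9 on, items live under their respective modules:
use ort::session::Session;
use ort::value::Tensor;
```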
## 📦 Tensor `extract` optimization (1dbad54)

Previously, calling any of the `extract_tensor_*` methods had to call back to ONNX Runtime to determine the value's `ValueType` and ensure it was OK to extract. This involved a lot of FFI calls and a few allocations, which could have a notable performance impact in hot loops.

Since a value's type never changes after it is created, the `ValueType` is now created when the `Value` is constructed (i.e. via `Tensor::from_array`, or when returned from a session). This makes `extract_tensor_*` a lot cheaper!

Note that this does come with some breaking changes:
- Tensor dimensions are now returned as `&[i64]` instead of `Vec<i64>`.
- `Value::dtype()` and `Tensor::memory_info()` now return `&ValueType` and `&MemoryInfo` respectively, instead of their non-borrowed counterparts.
- `ValueType::Tensor` now has an extra field for symbolic dimensions, `dimension_symbols`, so you might have to update `match`es on `ValueType`.
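If you were matching on `ValueType::Tensor`, the new `dimension_symbols` field needs to be accounted for. A hedged sketch (the `dimension_symbols` field is from the release notes; the other field names and the module path are assumptions, so check docs.rs; a `..` rest pattern keeps the match resilient to further additions):

```rust
use ort::value::ValueType;

fn describe(ty: &ValueType) -> String {
    match ty {
        // `dimension_symbols` carries the names of symbolic (dynamic) dims.
        ValueType::Tensor { dimensions, dimension_symbols, .. } => {
            format!("tensor, dims = {dimensions:?}, symbols = {dimension_symbols:?}")
        }
        _ => "non-tensor value".to_string(),
    }
}
```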
## 🚥 Threading management (87577ef)

`2.0.0-rc.9` introduces a new trait: `ThreadManager`. This allows you to define custom thread create & join functions for session & environment thread pools! See the `thread_manager.rs` test for an example of how to create your own `ThreadManager` and apply it to a session, or to an environment's `GlobalThreadPoolOptions` (previously `EnvironmentGlobalThreadPoolOptions`).

Additionally, sessions may now opt out of the environment's global thread pool if one is configured.
## 🧠 Shape inference for custom operators (87577ef)
`ort` now provides `ShapeInferenceContext`, an interface for custom operators to give ONNX Runtime a hint about the shape of the operator's output tensors based on its inputs, which may open the door to memory optimizations.

See the updated `custom_operators.rs` example to see how it works.

## 📃 Session output refactor (8a16adb)
`SessionOutputs` has been slightly refactored to reduce memory usage and slightly increase performance. Most notably, it no longer derefs to a `&BTreeMap`.

The new `SessionOutputs` interface closely mirrors `BTreeMap`'s API, so most applications require no changes unless you were explicitly dereferencing to a `&BTreeMap`.
## 🛠️ LoRA Adapters (d877fb3)

ONNX Runtime v1.20.0 introduces a new `Adapter` format for supporting LoRA-like weight adapters, and now `ort` has it too!

An `Adapter` essentially functions as a map of tensors, loaded from disk or memory and copied to a device (typically whichever device the session resides on). When you add an `Adapter` to `RunOptions`, those tensors are automatically added as inputs (except faster, because they don't need to be copied anywhere!)

With some modification to your ONNX graph, you can add LoRA layers using optional inputs which `Adapter` can then override. (Hopefully ONNX Runtime will provide some documentation on how this can be done soon, but until then, it's ready to use in `ort`!)
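The flow described above, loading an adapter and attaching it to `RunOptions`, might look roughly like this. This is a hedged sketch: the constructor and method names (`Adapter::from_file`, `RunOptions::add_adapter`), the module paths, and the file name are all assumptions; consult docs.rs for the exact API:

```rust
// Hedged sketch only: names and signatures below are assumptions.
use ort::session::RunOptions;

fn options_with_adapter() -> ort::Result<RunOptions> {
    // Load the adapter's tensor map from disk (hypothetical constructor).
    let adapter = ort::adapter::Adapter::from_file("weights.adapter")?;

    // Attach it; its tensors are then supplied as inputs without copies.
    let mut options = RunOptions::new()?;
    options.add_adapter(&adapter)?;
    Ok(options)
}
```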
## 🗂️ Prepacked weights (87577ef)

`PrepackedWeights` allows multiple sessions to share the same weights. If you create multiple `Session`s from one model file, they can all share the same memory!

Currently, ONNX Runtime only supports prepacked weights for the CPU execution provider.
You can now override dynamic dimensions in a graph using `SessionBuilder::with_dimension_override`, allowing ONNX Runtime to perform more optimizations.
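For instance, pinning a symbolic batch dimension to a fixed size lets the optimizer treat it as static. A hedged sketch (the argument types of `with_dimension_override`, the dimension name, and the model path are assumptions):

```rust
use ort::session::Session;

fn build_session() -> ort::Result<Session> {
    Session::builder()?
        // Fix the symbolic "batch_size" dimension to 1 (illustrative values).
        .with_dimension_override("batch_size", 1)?
        .commit_from_file("model.onnx")
}
```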
## 🪶 Customizable workload type (87577ef)

Not all workloads need full performance all the time! If you're using `ort` to perform background tasks, you can now set a session's workload type to prioritize either efficiency (by lowering scheduling priority or utilizing more efficient CPU cores on some architectures), or performance (the default).
## Other features

- Updates to the `ortsys!` macro.
- `ort::api()` now returns `&ort_sys::OrtApi` instead of `NonNull<ort_sys::OrtApi>`.
- Added an `AsPointer` trait; types that previously exposed a `ptr()` method now implement `AsPointer` instead.
- `RunOptions`.
- Added the `ORT_CXX_STDLIB` environment variable (mirroring `CXXSTDLIB`) to allow changing the C++ standard library `ort` links to.
## Fixes

- Fixed `ValueRef` & `ValueRefMut` leaking value memory.
- Now uses `MemoryInfo`'s `DeviceType` instead of its allocation device to determine whether `Tensor`s can be extracted.
- Fixed `ORT_PREFER_DYNAMIC_LINK` to work even when `cuda` or `tensorrt` are enabled.
- `Sequence<T>`.

If you have any questions about this release, we're here to help:
#💬|ort-discussions

Thank you to Thomas, Johannes Laier, Yunho Cho, Phu Tran, Bartek, Noah, Matouš Kučera, Kevin Lacker, and Okabintaro, whose support made this release possible. If you'd like to support `ort` as well, consider contributing on Open Collective 💖
This discussion was created from the release v2.0.0-rc.9.