[OVEP] ORT 1.24 Release Patch #27238

Merged
adrianlizarraga merged 1 commit into microsoft:main from intel:ovep_1.24_patch
Feb 4, 2026

@ankitm3k ankitm3k commented Feb 4, 2026

Description

Re-use weight files and their underlying memory maps across shared contexts.

Motivation and Context

This reduces resident memory when different EP shared context sets reference the same weight file.

* Reuse weight files across shared contexts.

* fix format

ankitm3k commented Feb 4, 2026

@adrianlizarraga & @HectorSVC Please review & merge. FYI @MayureshV1

tianleiwu commented

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

azure-pipelines commented

Azure Pipelines successfully started running 4 pipeline(s).

@MayureshV1 MayureshV1 left a comment

Looks good!

MayureshV1 commented

@adrianlizarraga, @yuslepukhin, can you please review this and have it merged?
If there is any opportunity to include it in ORT 1.24 or a bug-fix release, please consider it.

@adrianlizarraga adrianlizarraga merged commit 8abbfda into microsoft:main Feb 4, 2026
92 checks passed
tianleiwu pushed a commit that referenced this pull request Feb 4, 2026
### Description
Re-use weight files and their underlying memory maps across shared
contexts.

### Motivation and Context
This reduces resident memory when different ep shared context sets
reference the same weight file.

Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>

Copilot AI left a comment

Pull request overview

This PR introduces memory optimization for the OpenVINO Execution Provider by implementing a WeightFileManager singleton that enables sharing of weight file instances and their underlying memory maps across multiple SharedContext instances. This reduces memory footprint when different execution provider shared contexts reference the same weight files.

Changes:

  • Introduced WeightFileManager singleton to globally manage weight file instances
  • Changed WeightsFile storage from unique_ptr to shared_ptr to enable sharing across contexts
  • Modified SharedContext constructor to accept bin_path by const reference instead of by value
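
The sharing pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `WeightsFileSketch`, `WeightFileRegistrySketch`, and `Acquire` are hypothetical names, the use of `std::weak_ptr` for the cache is an assumption, and the real `WeightsFile` opens and memory-maps the file rather than just recording its path.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the provider's WeightsFile. The real type would
// open the file and own its memory maps; here it only records the path.
struct WeightsFileSketch {
  explicit WeightsFileSketch(std::string p) : path(std::move(p)) {}
  std::string path;
};

// Hypothetical process-wide registry (assumed design, not the PR's API).
// Callers that ask for the same path receive the same shared instance.
class WeightFileRegistrySketch {
 public:
  static WeightFileRegistrySketch& Get() {
    // Function-local statics are initialized thread-safely since C++11.
    static WeightFileRegistrySketch instance;
    return instance;
  }

  std::shared_ptr<WeightsFileSketch> Acquire(const std::string& path) {
    assert(!path.empty());
    std::lock_guard<std::mutex> lock(mutex_);
    // Keys are compared as plain strings; a real implementation would
    // probably normalize paths first.
    auto& slot = files_[path];
    if (auto existing = slot.lock()) {
      return existing;  // Still alive: reuse the instance and its mappings.
    }
    auto created = std::make_shared<WeightsFileSketch>(path);
    slot = created;  // Cache weakly so the file is released with its last user.
    return created;
  }

 private:
  WeightFileRegistrySketch() = default;

  std::mutex mutex_;
  std::map<std::string, std::weak_ptr<WeightsFileSketch>> files_;
};
```

Holding `std::weak_ptr` entries in the registry means a weight file is freed once the last context drops it. A registry that stored `std::shared_ptr` directly would keep every file mapped for the life of the process, which may or may not be what the actual implementation does.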

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| onnxruntime/core/providers/openvino/ov_shared_context.h | Added WeightFileManager singleton class, updated SharedContext to use shared weight files, and adjusted constructor signature |
| onnxruntime/core/providers/openvino/ov_shared_context.cc | Updated constructor to initialize weight_file_manager_ and modified weight file acquisition to use the global manager |

Comment on lines +110 to +111
void
LoadTensorFromFile(

Copilot AI Feb 4, 2026

The return type and function name are split across two lines, which is inconsistent with the rest of the codebase. All other function declarations in this file (e.g., lines 62, 76-79) keep the return type and function name on the same line. This should be reformatted to match the existing code style.

Suggested change
- void
- LoadTensorFromFile(
+ void LoadTensorFromFile(

Comment on lines 89 to 107
@@ -104,7 +106,9 @@ class SharedContext : public std::enable_shared_from_this<SharedContext> {
std::map<std::string, MappingContainer> imported_device_tensors_;
};

Copilot AI Feb 4, 2026

The WeightsFile struct lacks thread synchronization for its member access. With the introduction of WeightFileManager, WeightsFile instances are now shared across multiple SharedContext objects (line 128-137). This means multiple threads can concurrently call LoadWeights() and TryGetOrCreateDeviceMapping() on the same WeightsFile instance. The LoadWeights() method performs non-thread-safe operations on the file_ member (seekg and read), and TryGetOrCreateDeviceMapping() modifies imported_device_tensors_ without synchronization. Add a mutex member to WeightsFile and protect all member accesses to prevent race conditions.
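
One way to apply the suggested mutex is sketched below, under stated assumptions. `GuardedWeightsSketch` and `GetOrCreateDeviceSlot` are hypothetical, a `std::istringstream` stands in for the real file stream, and the per-device cache is reduced to a slot index. The actual `WeightsFile` members and method signatures are not shown in this conversation and may differ.

```cpp
#include <cstddef>
#include <ios>
#include <map>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplification of a weights file shared between contexts.
class GuardedWeightsSketch {
 public:
  explicit GuardedWeightsSketch(std::string bytes) : file_(std::move(bytes)) {}

  // Seek and read happen under a single lock, so two threads cannot
  // interleave one caller's seekg with another caller's read.
  bool LoadWeights(std::size_t offset, std::vector<char>& out) {
    std::lock_guard<std::mutex> lock(mutex_);
    file_.clear();  // Reset eof/fail state left behind by an earlier short read.
    file_.seekg(static_cast<std::streamoff>(offset));
    file_.read(out.data(), static_cast<std::streamsize>(out.size()));
    return static_cast<std::size_t>(file_.gcount()) == out.size();
  }

  // Returns the slot index for a device, assigning the next free index on
  // first use. The map is only touched while the lock is held, and the
  // result is returned by value so no reference escapes the lock.
  std::size_t GetOrCreateDeviceSlot(const std::string& device) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto result = slots_.try_emplace(device, slots_.size());
    return result.first->second;
  }

 private:
  std::mutex mutex_;
  std::istringstream file_;  // Stand-in for the real file stream.
  std::map<std::string, std::size_t> slots_;
};
```

A single mutex is the simplest option and serializes file reads with cache updates. If that contention matters, separate locks for the stream and the device map would be a possible refinement.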

tianleiwu pushed a commit that referenced this pull request Feb 5, 2026