[OV][ITT][GPU Plugin] Enable default ITT markers for inference and op submission #33313
base: master
Conversation
- Enables default ITT markers for higher level operations such as inference pass, op preparation and submission - Follows the same guidelines to standardize the conventions for namespaces: ov::phases::gpu::inference ov::op::gpu - Supports both synchronous and asynchronous operations Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
+ Adding base markers for node execution instead of a prepare() and execute() Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
```cpp
void SyncInferRequest::infer() {
    OV_ITT_SCOPED_TASK(itt::domains::intel_gpu_plugin, "SyncInferRequest::infer");
    // String can be constructed once in the constructor
    OV_ITT_SCOPED_TASK_BASE(itt::domains::intel_gpu_inference, "SyncInferRequestGPU::infer" + m_graph->get_runtime_model()->get_name());
```
Out of curiosity, is it OK to change the name here? You commented not to change the name in another place: [Warning] The strings in ITT_SCOPED_TASK_BASE should NOT be deleted or edited!
@isanghao Please suggest what changes you need. I tried to make the strings in TASK_BASE similar across all plugins.
I mean, you changed the name from SyncInferRequest to SyncInferRequestGPU. If it is OK to change the name, why did you leave a comment saying that we should not change the name?
@isanghao The previous task definition was incomplete. Also, it was not enabled by default. I enabled the task by default using the OV_ITT_SCOPED_TASK_BASE macro and renamed the string to indicate that it is executed by the GPU plugin. This new string cannot be changed, as it acts as a SW contract: all tools that consume this data rely on the strings being invariant.
build_jenkins
- This update fixes the GPU runtime performance regressions due to increased submission times. By caching the required string for each inference, the overheads are eliminated. Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
build_jenkins
```cpp
#include "intel_gpu/runtime/debug_configuration.hpp"

#include <algorithm>
#include <functional>
```
Please do not include style fixes in the PR. It makes review difficult.
You don't need to reformat the whole file. You can just change the problematic line to pass the test.
It should be consistent with the current style in GPU. @isanghao has greater experience in this component; I would follow his suggestions.
Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
```cpp
}

// Helper function to return the model name for ITT tracing
std::string get_model_name() const {
```
Suggested change:
```diff
-    std::string get_model_name() const {
+    std::string_view get_model_name() const {
```
Use std::string_view instead of returning a copy of the string; a copy can still be created where it is actually needed.
@isanghao
Is this member required, or can the model name be obtained via a different member?
@isanghao Please suggest other ways to get the model name, if there is one. Using get_runtime_model()->get_name() caused the perf regression earlier, so I am caching the compiled model name under the assumption that it does not change while the model is loaded.
@praasz In this particular case, the number of string constructors invoked should be the same, so we don't gain much?
std::string_view better hides the internal implementation.
@isanghao
Can compiled_model->get_runtime_model()->get_friendly_name() be used instead of introducing this member? Or would it give a different name?
@praasz If there is a way of getting the "friendly name" with low cost, it makes sense. compiled_model->get_runtime_model() goes through a chain of calls, so I am not sure it is low cost.
std::string_view better hides the internal implementation.
It has been changed to string_view :)
- Moving string constructor to where it is used Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
@praasz @aobolensk @isanghao Any progress or next steps?
```cpp
// Helper function to return the model name for ITT tracing
std::string_view get_model_name() const {
    return m_model_name;
}
```
Is it possible to reuse compiled_model->get_runtime_model()->get_friendly_name() instead of adding this new method?
@isanghao The get_runtime_model() call is what causes the performance regressions, so the answer is no. Are there alternatives?
@aobolensk @isanghao If there are no other changes, can this be merged?
@praasz do you have any further comment?
build_jenkins
Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
- Also update the strings to be consistent across all plugins Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
@isanghao The style changes have been removed as requested. It should be easily reviewable now.
aobolensk
left a comment
Overall LGTM
Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
build_jenkins
isanghao
left a comment
LGTM for GPU
```cpp
OV_ITT_DOMAIN(intel_gpu_inference, "ov::phases::gpu::inference");
// Domain namespace for all of the operators
OV_ITT_DOMAIN(intel_gpu_op, "ov::op::gpu");
OV_ITT_DOMAIN(intel_gpu_plugin);
```
It would be beneficial to define this domain name as ov::intel_gpu_plugin to maintain consistency with the other domains. This ensures the ov:: prefix is present, making it easier for profiling tools to filter and capture all ov events.
Suggested change:
```diff
-OV_ITT_DOMAIN(intel_gpu_plugin);
+OV_ITT_DOMAIN(intel_gpu_plugin, "ov::intel_gpu_plugin");
```
Enabling default GPU ITT markers using standard convention - Part 3
This PR is the third in a series of PRs to standardize the ITT markers in OpenVINO that will be enabled by default through host-side instrumentation.
Summary of the current PR (PR#3):
- Uses the same convention standardized in PR#33639
- Ensures the namespace for GPU Plugin activity falls under:
  - ov::phases::gpu::inference
  - ov::op::gpu
Details:
GPU support is enabled with default ITT markers that support both synchronous and asynchronous execution. This PR ensures a standardized convention is followed for the namespaces used.
Tickets:
CVS-179230
@isanghao Please review this as you are generally aware of what was discussed