[SYCL][L0][CUDA][HIP] Fix PI_KERNEL_GROUP_INFO_GLOBAL_WORK_SIZE queries#8769
steffenlarsen merged 8 commits into intel:sycl
I have a question: is the max global work size independent of the global work size set in the host program for a kernel?
/verify with intel/llvm-test-suite#1694
@abagusetty, FYI: the "verify with" command does not validate on CUDA/HIP platforms.
Thanks, I stumbled upon that too and looked at the wording in the spec, which made me think it could be the max global limits.
The global work sizes from the query will be the same for any kernel, right?
Yes, since the descriptor is a kernel_device_specific one: any kernel (for a custom device type, or a built-in kernel) returns the device-specific global work sizes, which in turn should be the same for all kernels, IMO.
…m device-types appropriately
sycl/plugins/cuda/pi_cuda.hpp
#include <vector>

// Helper for one-liner validation
#define PI_ASSERT(condition, error) \
It's a bit misleading, as it does not assert on the condition, maybe consider renaming it?
PI_ASSERT to PI_ERR_CHECK
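To illustrate why the reviewer found the original name misleading, here is a minimal sketch of an "error check" style macro. The macro body in the snippet above is truncated, so everything here is an assumption: `SKETCH_ERR_CHECK`, `Result`, and `queryWorkSize` are hypothetical names, not the plugin's actual code. The point is only that such a helper returns an error code from the enclosing function instead of aborting the way `assert` does.

```cpp
#include <cstdint>

// Hypothetical result codes standing in for the plugin's result type.
enum class Result : std::int32_t { Success = 0, InvalidValue = 1 };

// Hypothetical helper: returns `error` from the enclosing function when
// `condition` is false. It does not abort, so it is a check, not an assert.
#define SKETCH_ERR_CHECK(condition, error) \
  do {                                     \
    if (!(condition))                      \
      return (error);                      \
  } while (0)

// Hypothetical query entry point that validates its arguments up front.
Result queryWorkSize(const std::uint64_t *out, std::uint64_t bytes) {
  SKETCH_ERR_CHECK(out != nullptr, Result::InvalidValue);
  SKETCH_ERR_CHECK(bytes >= 3 * sizeof(std::uint64_t), Result::InvalidValue);
  return Result::Success;
}
```

The `do { ... } while (0)` wrapper is a common idiom that keeps the macro safe to use inside an unbraced `if`/`else`.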
Gentle ping @smaslov-intel @jchlanda
steffenlarsen left a comment
Sorry for the delay. I think these changes look good. I am a little curious what built-in kernels they would apply to, but I assume CUDA, HIP and L0 guarantee full possible work-sizes either way.
Thanks for the feedback on the built-ins. I stumbled over that a bit too, and convinced myself that they see the complete device limits.
intel#8769 Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Address kernel query global_work_size for L0, CUDA, HIP from PI_KERNEL_GROUP_INFO_GLOBAL_WORK_SIZE
Fixes #8766
For instance (for X-dimension)
L0: maxGroupSizeX * maxGroupCountX
CUDA: CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X * CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X
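The per-dimension product described above can be sketched as follows. This is a standalone illustration rather than the plugin code: `DeviceLimits` and `maxGlobalWorkSize` are hypothetical names, and the limits are passed in as plain numbers instead of being queried from a driver. It uses 64-bit arithmetic because a product such as max block dim times max grid dim may not fit in 32 bits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical container for per-dimension device limits; field names are
// illustrative and do not come from the plugin sources.
struct DeviceLimits {
  std::array<std::uint64_t, 3> maxGroupSize;  // e.g. max block dim on CUDA
  std::array<std::uint64_t, 3> maxGroupCount; // e.g. max grid dim on CUDA
};

// Max global work size per dimension = max group size * max group count.
std::array<std::uint64_t, 3> maxGlobalWorkSize(const DeviceLimits &limits) {
  std::array<std::uint64_t, 3> result{};
  for (std::size_t dim = 0; dim < 3; ++dim)
    result[dim] = limits.maxGroupSize[dim] * limits.maxGroupCount[dim];
  return result;
}
```

In an actual backend the two inputs would come from the device queries named above (for example the CUDA block and grid dimension attributes); the values used to exercise this sketch are illustrative only.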