Labels: performance (issues related to performance regressions)
### Describe the issue
Description
We observed a performance regression in the Cast operator when converting float32 to double (float64) between ONNXRuntime v1.18.0 and v1.19.0.
Affected Operator
Cast
- Opset Version: 21
- Source Type: float32
- Target Type: double (float64)
- Attribute: to=11 (DOUBLE), saturate=1
- Regression: +10.4% kernel slowdown
Test Case Details
Test Case: cast_cast_21_cast_float32_to_double
Inputs:
- input tensor:
  - Data type: float32 (type=1)
  - Shape: [4, 64, 100] (25,600 elements)
Attributes:
- to: 11 (DOUBLE)
- saturate: 1
Output:
- Data type: double (float64)
- Shape: [4, 64, 100]
Performance:
- v1.18.0: 0.0055 ms (kernel time)
- v1.19.0: 0.0060 ms (kernel time)
- Kernel regression: +10.4% slowdown
- Confirmation: the regression was reproduced in 5 of 10 validation runs
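
As a sanity check on the percentage, the sketch below recomputes the slowdown from the two kernel times as printed. It assumes the printed values were rounded to the nearest 0.0001 ms, which the report does not state; the helper name `relative_slowdown` is illustrative only. Under that assumption the rounded values give about +9.1%, and the reported +10.4% falls inside the range the rounding allows, so it was presumably computed from unrounded timings.

```python
def relative_slowdown(baseline_ms: float, candidate_ms: float) -> float:
    """Percentage by which candidate is slower than baseline."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


# Kernel times exactly as printed in the report.
V1_18_MS = 0.0055
V1_19_MS = 0.0060
# Assumption: values were rounded to the nearest 0.0001 ms,
# so each true value lies within half a step of the printed one.
HALF_STEP = 0.00005

nominal = relative_slowdown(V1_18_MS, V1_19_MS)
# Smallest and largest slowdown compatible with the printed values.
lowest = relative_slowdown(V1_18_MS + HALF_STEP, V1_19_MS - HALF_STEP)
highest = relative_slowdown(V1_18_MS - HALF_STEP, V1_19_MS + HALF_STEP)

print(f"nominal: {nominal:.1f}%  range: {lowest:.1f}% to {highest:.1f}%")
```

At this scale the difference between the two versions is about 0.5 µs per call, which may help explain why only half of the validation runs confirmed it.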
Regression Characteristics
Affected Configuration
- Source type: float32
- Target type: double (float64)
- Tensor size: Medium (25,600 elements)
Key Characteristics
- Type conversion specific: float32 → double
- Opset version: 21
- Saturate attribute: Enabled (saturate=1)
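
For context on what this conversion computes: every float32 value is exactly representable as a double, so the cast is a pure widening with no rounding or clamping. As I read the ONNX operator documentation, `saturate` only applies to float8 target types, so it should not change results here. The sketch below illustrates the widening with Python's standard `struct` module only; it does not call ONNX Runtime, and the sample values are arbitrary.

```python
import struct

# Arbitrary samples: an inexact decimal, negative zero,
# the largest finite float32, and the smallest float32 subnormal.
samples = [0.1, -0.0, 3.4028234663852886e38, 1.401298464324817e-45]

lossless = []
for x in samples:
    bits32 = struct.pack("<f", x)             # round x to float32 and keep its bytes
    widened = struct.unpack("<f", bits32)[0]  # Python floats are doubles: float32 -> double
    # Narrowing the double back to float32 must reproduce the same bytes.
    lossless.append(struct.pack("<f", widened) == bits32)

# The widened value is the float32 neighbour of 0.1, not the double 0.1.
widened_tenth = struct.unpack("<f", struct.pack("<f", 0.1))[0]

print(all(lossless), widened_tenth != 0.1)
```

Because the result values are fully determined by the input, any difference between the two releases for this case should be in throughput only, not in output.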
### To reproduce
```
python script_profiling.py cast_cast_21_cast_float32_to_double 1.18.0 1.19.0
```
[Archive.zip](https://github.com/user-attachments/files/24907619/Archive.zip)
### Urgency
_No response_
### Platform
Linux
### OS Version
Ubuntu 24.04.3 LTS
### ONNX Runtime Installation
Released Package
### ONNX Runtime Version or Commit ID
1.19.0
### ONNX Runtime API
Python
### Architecture
X64
### Execution Provider
Default CPU
### Execution Provider Library Version
_No response_
### Model File
_No response_
### Is this a quantized model?
Yes