[Performance] Performance regression in Cast operator for float32 to double conversion between v1.18.0 and v1.19.0 #27189

@junghyunpark2001

Describe the issue

Description

We observed a performance regression in the Cast operator when converting float32 to double (float64): the kernel is slower in ONNX Runtime v1.19.0 than in v1.18.0.

Affected Operator

Cast

  • Opset Version: 21
  • Source Type: float32
  • Target Type: double (float64)
  • Attribute: to=11 (DOUBLE), saturate=1
  • Regression: +10.4% kernel slowdown

Test Case Details

Test Case: cast_cast_21_cast_float32_to_double

Inputs:

  • input tensor:
    • Data type: float32 (type=1)
    • Shape: [4, 64, 100] (25,600 elements)

Attributes:

  • to: 11 (DOUBLE)
  • saturate: 1

Output:

  • Data type: double (float64)
  • Shape: [4, 64, 100]
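
For scale, the memory footprint of this test case follows directly from the shape and the two element types. The short sketch below only does that arithmetic; the variable names are illustrative and not taken from the attached script.

```python
import math
import struct

shape = (4, 64, 100)
elements = math.prod(shape)

# Element sizes come from the standard IEEE 754 layouts.
input_bytes = elements * struct.calcsize("<f")   # float32: 4 bytes each
output_bytes = elements * struct.calcsize("<d")  # double: 8 bytes each
total_bytes = input_bytes + output_bytes

print(f"{elements} elements, {input_bytes} B in, {output_bytes} B out, "
      f"{total_bytes / 1024:.0f} KiB touched per call")
```

The kernel reads about 100 KiB and writes about 200 KiB per call, so the measurement is probably sensitive to cache state and to per-call overhead as well as to the conversion loop itself.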

Performance:

  • v1.18.0: 0.0055 ms (kernel time)
  • v1.19.0: 0.0060 ms (kernel time)
  • Kernel regression: +10.4% slowdown
  • Confirmation: regression reproduced in 5 of 10 validation runs
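
As a sanity check on these figures, the slowdown can be recomputed from the two kernel times as printed. Because those values are rounded to four decimal places, the result comes out near 9.1% rather than 10.4%; the reported percentage was presumably computed from the unrounded measurements (an assumption, since the raw data is only in the attached archive).

```python
# Kernel times as printed in the report (milliseconds, already rounded).
t_old_ms = 0.0055  # v1.18.0
t_new_ms = 0.0060  # v1.19.0
elements = 4 * 64 * 100

slowdown_pct = (t_new_ms - t_old_ms) / t_old_ms * 100.0
delta_ns = (t_new_ms - t_old_ms) * 1e6          # 1 ms = 1e6 ns
ns_per_element_old = t_old_ms * 1e6 / elements
ns_per_element_new = t_new_ms * 1e6 / elements

print(f"slowdown from rounded values: {slowdown_pct:.1f}%")
print(f"absolute difference: {delta_ns:.0f} ns per call")
print(f"per element: {ns_per_element_old:.3f} ns -> {ns_per_element_new:.3f} ns")
```

The absolute gap is roughly half a microsecond per call, which is small enough that timer resolution and run-to-run noise matter when confirming it.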

Regression Characteristics

Affected Configuration

  • Source type: float32
  • Target type: double (float64)
  • Tensor size: Medium (25K elements)

Key Characteristics

  • Type conversion specific: float32 → double
  • Opset version: 21
  • Saturate attribute: Enabled (saturate=1)

To reproduce

python script_profiling.py cast_cast_21_cast_float32_to_double 1.18.0 1.19.0

[Archive.zip](https://github.com/user-attachments/files/24907619/Archive.zip)
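
For readers without the archive, the following is a minimal sketch of how a comparable measurement could be set up. It is not the contents of `script_profiling.py`, which is not shown here. The function names (`build_cast_model`, `kernel_times_us`, `profile_cast`) and the node name are hypothetical. The sketch assumes the ONNX Runtime profiler writes a JSON trace whose per-node events have category `Node`, a name ending in `_kernel_time`, and a `dur` field in microseconds.

```python
import json


def build_cast_model():
    """Build a one-node Cast(float32 -> double) model matching the report."""
    # Third-party imports are local so kernel_times_us() works without them.
    from onnx import TensorProto, helper

    node = helper.make_node(
        "Cast", inputs=["input"], outputs=["output"], name="cast_node",
        to=TensorProto.DOUBLE,  # enum value 11
        saturate=1,
    )
    graph = helper.make_graph(
        [node], "cast_float32_to_double",
        [helper.make_tensor_value_info("input", TensorProto.FLOAT, [4, 64, 100])],
        [helper.make_tensor_value_info("output", TensorProto.DOUBLE, [4, 64, 100])],
    )
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 21)])
    model.ir_version = 10  # assumption: an IR version both runtimes accept
    return model


def kernel_times_us(events):
    """Pick per-call kernel durations (microseconds) out of profile events."""
    return [
        e["dur"] for e in events
        if e.get("cat") == "Node" and e.get("name", "").endswith("_kernel_time")
    ]


def profile_cast(runs=1000):
    """Run the model `runs` times with profiling on; return kernel times."""
    import numpy as np
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.enable_profiling = True
    sess = ort.InferenceSession(
        build_cast_model().SerializeToString(), opts,
        providers=["CPUExecutionProvider"],
    )
    x = np.random.rand(4, 64, 100).astype(np.float32)
    for _ in range(runs):
        sess.run(None, {"input": x})
    profile_path = sess.end_profiling()  # returns the path of the JSON trace
    with open(profile_path) as f:
        return kernel_times_us(json.load(f))
```

Calling `profile_cast()` once in an environment with `onnxruntime==1.18.0` and once with `onnxruntime==1.19.0`, then comparing the medians of the returned lists, would give a comparison in the same spirit as the one reported above. Many iterations are advisable because each call lasts only a few microseconds.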

### Urgency

_No response_

### Platform

Linux

### OS Version

Ubuntu 24.04.3 LTS

### ONNX Runtime Installation

Released Package

### ONNX Runtime Version or Commit ID

1.19.0

### ONNX Runtime API

Python

### Architecture

X64

### Execution Provider

Default CPU

### Execution Provider Library Version

_No response_

### Model File

_No response_

### Is this a quantized model?

Yes

### Labels

performance (issues related to performance regressions)
