Skip to content

8375633: C2: Missed Ideal optimization opportunity with ConvD2F/ConvF2HF and -XX:+StressIncrementalInlining#29570

Open
krk wants to merge 2 commits intoopenjdk:masterfrom
krk:opt-d2f-8375633
Open

8375633: C2: Missed Ideal optimization opportunity with ConvD2F/ConvF2HF and -XX:+StressIncrementalInlining#29570
krk wants to merge 2 commits intoopenjdk:masterfrom
krk:opt-d2f-8375633

Conversation

@krk
Copy link

@krk krk commented Feb 4, 2026

ConvD2FNode and ConvF2HFNode ::Ideal already optimize depth-2 patterns (e.g. ConvD2F(SqrtD(ConvF2D(x))) → SqrtF(x)), but with StressIncrementalInlining, the inner conversion may be inlined after IGVN has already visited the outer one.

The intermediate node (SqrtD) doesn't propagate the change to its users, so the optimization is missed.

The fix adds worklist propagation rules in add_users_of_use_to_worklist, when a ConvF2D node changes and its user is SqrtD, push ConvD2F users of that SqrtD onto the worklist. Similarly, when ConvHF2F changes and its user is a float binary op, push ConvF2HF users.

The intermediate node checks (SqrtD, is_float32_binary_oper) match exactly what the current Ideal methods look for.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8375633: C2: Missed Ideal optimization opportunity with ConvD2F/ConvF2HF and -XX:+StressIncrementalInlining (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/29570/head:pull/29570
$ git checkout pull/29570

Update a local copy of the PR:
$ git checkout pull/29570
$ git pull https://git.openjdk.org/jdk.git pull/29570/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 29570

View PR using the GUI difftool:
$ git pr show -t 29570

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/29570.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 4, 2026

👋 Welcome back krk! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Feb 4, 2026

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Feb 4, 2026
@openjdk
Copy link

openjdk bot commented Feb 4, 2026

@krk The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 4, 2026
@mlbridge
Copy link

mlbridge bot commented Feb 4, 2026

Webrevs

@marc-chevalier
Copy link
Member

Thanks for working on this.

I'm still seeing some "Missed Ideal optimization opportunity in PhaseIterGVN for ConvF2HF" in testing, for instance on

  • compiler/floatingpoint/TestSubNodeFloatDoubleNegation.java

Missed Ideal optimization (can_reshape=false):
The node was replaced by Ideal.
Old node:
dist dump
---------------------------------------------
   1  264  SubF  === _ 417 262  [[ 277 300 ]]  !jvms: Float16::subtract @ bci:8 (line 1188) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:9 (line 73)
   0  300  ConvF2HF  === _ 264  [[ 310 328 ]]  #short !jvms: Float16::valueOf @ bci:5 (line 364) Float16::subtract @ bci:9 (line 1188) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:9 (line 73)
The result after Ideal:
dist dump
---------------------------------------------
   1  542  SubHF  === _ 541 540  [[ 543 ]]  !jvms: Number::<init> @ bci:1 (line 59) Float16::<init> @ bci:1 (line 128) Float16::shortBitsToFloat16 @ bci:5 (line 985) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:1 (line 72)
   0  543  ReinterpretHF2S  === _ 542  [[ ]]  !jvms: Number::<init> @ bci:1 (line 59) Float16::<init> @ bci:1 (line 128) Float16::shortBitsToFloat16 @ bci:5 (line 985) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:1 (line 72)
  • compiler/vectorization/TestFloat16VectorOperations.java
Missed Ideal optimization (can_reshape=false):
The node was replaced by Ideal.
Old node:
dist dump
---------------------------------------------
   1  380  SubF  === _ 360 641  [[ 392 415 ]]  !jvms: Float16::subtract @ bci:8 (line 1188) TestFloat16VectorOperations::vectorSubConstInputFloat16 @ bci:26 (line 355)
   0  415  ConvF2HF  === _ 380  [[ 454 291 265 280 ]]  #short !jvms: Float16::valueOf @ bci:5 (line 364) Float16::subtract @ bci:9 (line 1188) TestFloat16VectorOperations::vectorSubConstInputFloat16 @ bci:26 (line 355)
The result after Ideal:
dist dump
---------------------------------------------
   1  645  SubHF  === _ 643 644  [[ 646 ]]  !jvms: Float16::subtract @ bci:5 (line 1188) TestFloat16VectorOperations::vectorSubConstInputFloat16 @ bci:26 (line 355)
   0  646  ReinterpretHF2S  === _ 645  [[ ]]  !jvms: Float16::subtract @ bci:5 (line 1188) TestFloat16VectorOperations::vectorSubConstInputFloat16 @ bci:26 (line 355)

both with flags -XX:-TieredCompilation -XX:VerifyIterativeGVN=1110 -XX:+UnlockDiagnosticVMOptions -XX:+StressIncrementalInlining.

If I understand well, I don't think your fix should check n_op. Indeed n is the (single) parameter n of PhaseIterGVN::add_users_to_worklist, which is the old node, that is the node on which Ideal, Value and Identity are called, and not the nodes returned by Ideal or Identity. See

Node* k = n;
DEBUG_ONLY(dead_loop_check(k);)
DEBUG_ONLY(bool is_new = (k->outcnt() == 0);)
C->remove_modified_node(k);
Node* i = apply_ideal(k, /*can_reshape=*/true);
assert(i != k || is_new || i->outcnt() > 0, "don't return dead nodes");
#ifndef PRODUCT
verify_step(k);
#endif
DEBUG_ONLY(uint loop_count = 1;)
while (i != nullptr) {
#ifdef ASSERT
if (loop_count >= K + C->live_nodes()) {
dump_infinite_loop_info(i, "PhaseIterGVN::transform_old");
}
#endif
assert((i->_idx >= k->_idx) || i->is_top(), "Idealize should return new nodes, use Identity to return old nodes");
// Made a change; put users of original Node on worklist
add_users_to_worklist(k);
// Replacing root of transform tree?
if (k != i) {
// Make users of old Node now use new.
subsume_node(k, i);
k = i;
}
DEBUG_ONLY(dead_loop_check(k);)
// Try idealizing again
DEBUG_ONLY(is_new = (k->outcnt() == 0);)
C->remove_modified_node(k);
i = apply_ideal(k, /*can_reshape=*/true);
assert(i != k || is_new || (i->outcnt() > 0), "don't return dead nodes");
#ifndef PRODUCT
verify_step(k);
#endif
DEBUG_ONLY(loop_count++;)
}
// If brand new node, make space in type array.
ensure_type_or_null(k);
// See what kind of values 'k' takes on at runtime
const Type* t = k->Value(this);
assert(t != nullptr, "value sanity");
// Since I just called 'Value' to compute the set of run-time values
// for this Node, and 'Value' is non-local (and therefore expensive) I'll
// cache Value. Later requests for the local phase->type of this Node can
// use the cached Value instead of suffering with 'bottom_type'.
if (type_or_null(k) != t) {
#ifndef PRODUCT
inc_new_values();
set_progress();
#endif
set_type(k, t);
// If k is a TypeNode, capture any more-precise type permanently into Node
k->raise_bottom_type(t);
// Move users of node to worklist
add_users_to_worklist(k);
}
// If 'k' computes a constant, replace it with a constant
if (t->singleton() && !k->is_Con()) {
NOT_PRODUCT(set_progress();)
Node* con = makecon(t); // Make a constant
add_users_to_worklist(k);
subsume_node(k, con); // Everybody using k now uses con
return con;
}
// Now check for Identities
i = k->Identity(this); // Look for a nearby replacement
if (i != k) { // Found? Return replacement!
NOT_PRODUCT(set_progress();)
add_users_to_worklist(k);
subsume_node(k, i); // Everybody using k now uses i
return i;
}
// Global Value Numbering
i = hash_find_insert(k); // Check for pre-existing node
if (i && (i != k)) {
// Return the pre-existing node if it isn't dead
NOT_PRODUCT(set_progress();)
add_users_to_worklist(k);
subsume_node(k, i); // Everybody using k now uses i
return i;
}
// Return Idealized original
return k;

As far as I know, there is no way to see the new node from add_users_to_worklist, and we shouldn't try to do so. I think the rationale behind it is that most optimizations don't care about the type of the leaves: either they are looking for pattern with unconstrained leaves, or the constraint is non-trivial enough that we shouldn't have it also here. Moreover, idealization is a loop: seeing that n is being replaced by a specific type N of node at some iteration doesn't guarantee that the uses of the replacer of n will see an N in there input.

I think if you remove the declaration of n_op and the conditions on it, it might be good.

// ConvD2F::Ideal matches ConvD2F(SqrtD(ConvF2D(x))) => SqrtF(x).
// ConvF2HF::Ideal matches ConvF2HF(binopF(ConvHF2F(...))) => FP16BinOp(...).
// Notify them when the inner conversion changes.
int n_op = n->Opcode();
Copy link
Contributor

@benoitmaillard benoitmaillard Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marc-chevalier is right, you should not use the opcode of n directly. Node n may be a node that is about to get replaced, so might run into false negatives. I made this mistake in the past, see this PR

Copy link
Contributor

@benoitmaillard benoitmaillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this. In addition to my comment on using the opcode of n, I think this PR needs a regression test that (ideally) covers each pattern added in add_users_of_node_to_worklist.

@krk
Copy link
Author

krk commented Feb 6, 2026

Thank you for the insightful comments, fixed the n usage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

3 participants