Skip to content

Conversation

@adityamukho
Copy link
Contributor

@adityamukho adityamukho marked this pull request as ready for review January 28, 2026 03:36
Comment on lines 199 to 203
Throwable rootEx = unwrapRootCause(e);

if (!(rootEx instanceof NodeStoppingException)) {
log.error("Failed to unregister metrics source {}", e, src.name());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use ExceptionUtils.hasCause().

Also, how can an exception caused by a NodeStoppingException be thrown by this code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's thrown by the busy lock in this fragment of code, then let's just use an idiom 'try to take a lock and if we're not successful, don't do anything' instead of producing the exception and immediately catching it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MessagingService-related metrics, when being unregistered, were triggering the NodeStoppingException in certain integration tests. While not a fatal error in itself (since the cluster and node were anyway shutting down), it was polluting test console output (and would likely pollute production output as well). I added this check to prevent that, while still logging any other, unexpected cause.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use ExceptionUtils.hasCause().
👍

registry.unregisterSource(src);
});
try {
if (metricSources().contains(src)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a left-over artifact from an intermediate development stage. Will remove.

registry.unregisterSource(srcName);
});
}
} catch (Exception e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern doesn't look very beautiful, and there is also duplication on top. Let's discuss it higher first and come back here after we are done there

ChannelTypeRegistry channelTypeRegistry,
IgniteLogger log
IgniteLogger log,
@Nullable MetricManager metricManager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are they nullable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CriticalStripedExecutors may be used outside of the messaging service in the future, and metrics may not be so important elsewhere as to be made mandatory.

IgniteMessageServiceThreadFactory.create(nodeName, "MessagingService-outbound", LOG, NOTHING_ALLOWED)
);
IgniteMessageServiceThreadFactory.create(nodeName, "MessagingService-outbound", LOG, NOTHING_ALLOWED));
outboundExecutor.initMetricSource(metricManager, "network.messaging.default.executor.outbound",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does 'default' stand here for?

private void handleHandshakeError(Throwable ex, UUID nodeId, ChannelType type, InetSocketAddress addr) {
if (ex != null) {
if (hasCause(ex, CriticalHandshakeException.class)) {
metrics.incrementConnectionFailures();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we even need a metric for this case as the case should be eliminated

Comment on lines +705 to +709
if (fut.isCompletedExceptionally()) {
metrics.incrementInvokeTimeouts();
} else {
fut.complete(response);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a race between checking and completing. Also, why is it about timeouts?

* @param name the name of the metric.
* @param description the metric description.
*/
public void initMetricSource(MetricManager metricManager, String name, String description) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about doing this in the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doable, but I wanted to keep the initialization similar to that of the CriticalStripedThreadPoolExecutor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants