-
Notifications
You must be signed in to change notification settings - Fork 25.4k
Add metrics max queue latency #133959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add metrics max queue latency #133959
Conversation
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
safeAwait(delayLatch); | ||
safeAwait(delayLatch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need a barrier? I think countdown is enough to block thread-pool and let one task sit in queue.
var latch = new CountDownLatch(1);
range(0, threads + 1).forEach(i -> pool.execute(() -> safeAwait(latch)));
var queueTimeMs = between(100,200);
safeSleep(queueTimeMs);
latch.countDown();
I don't mind having queue metrics in DesiredBalanceMetrics, at least, everything related to allocation in one place. Also allocators and balancer tend to use same metrics. |
…info" This reverts commit 075be86.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new metric that publishes the maximum queue latency for the write thread pool to support write load constraint decisions. The metric is collected by TransportNodeUsageStatsForThreadPoolsAction
and exposed through the existing DesiredBalanceMetrics
system.
- Refactors
DesiredBalanceMetrics
to be injected as a dependency instead of created internally - Adds max queue latency metric registration and collection functionality
- Updates all test constructors to use the new
DesiredBalanceMetrics.NOOP
instance
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
DesiredBalanceMetrics.java |
Adds NOOP instance and method to register write load decider max latency gauge |
DesiredBalanceShardsAllocator.java |
Changes constructor to accept DesiredBalanceMetrics instead of creating internally |
ClusterModule.java |
Creates and injects DesiredBalanceMetrics instance |
TransportNodeUsageStatsForThreadPoolsAction.java |
Adds max queue latency metric collection and registration |
WriteLoadConstraintDeciderIT.java |
Adds integration test for the new metric |
Test files | Updates all test constructors to use DesiredBalanceMetrics.NOOP |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
...asticsearch/action/admin/cluster/node/usage/TransportNodeUsageStatsForThreadPoolsAction.java
Show resolved
Hide resolved
...asticsearch/action/admin/cluster/node/usage/TransportNodeUsageStatsForThreadPoolsAction.java
Show resolved
Hide resolved
.../java/org/elasticsearch/cluster/routing/allocation/decider/WriteLoadConstraintDeciderIT.java
Show resolved
Hide resolved
…max_queue_latency # Conflicts: # server/src/internalClusterTest/java/org/elasticsearch/cluster/routing/allocation/decider/WriteLoadConstraintDeciderIT.java
…max_queue_latency
8c35a43
to
5f4cf5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We already publish the 50th, 90th and 99th percentile queue latencies for time tracking thread pools, this PR adds a metric that publishes the most recent max queue latency observed by
org.elasticsearch.action.admin.cluster.node.usage.TransportNodeUsageStatsForThreadPoolsAction
I added a new metrics class to make it available to the action, I'm open to the idea of putting these metrics somewhere else, it didn't seem like we had an obvious home for them currently. There isDesiredBalanceMetrics
but that strikes me as being more focused on the balancer rather than the deciders.Relates: ES-12631