
Conversation

@ywangd (Member) commented Sep 2, 2025:

This PR adds allocation explain output to the logs when there are unassigned shards after a converged DesiredBalance computation. The allocation explain prefers an unassigned primary over a replica. The logging keeps track of one unassigned shard and does not log again until it becomes assigned.

Relates: ES-12797
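As a rough illustration of the selection behaviour described above, here is a minimal sketch of the primary-over-replica preference; `UnassignedShard` and `pickShardToExplain` are hypothetical names for illustration, not the PR's actual API:

```java
import java.util.List;

// Hypothetical sketch: pick one unassigned shard to run allocation explain on,
// preferring a primary over a replica, as the PR description says.
record UnassignedShard(String index, int shardId, boolean primary) {}

class ShardToExplainPicker {
    /** Returns the unassigned shard to explain, or null when nothing is unassigned. */
    static UnassignedShard pickShardToExplain(List<UnassignedShard> unassigned) {
        UnassignedShard fallback = null;
        for (UnassignedShard shard : unassigned) {
            if (shard.primary()) {
                return shard; // an unassigned primary always wins over a replica
            }
            if (fallback == null) {
                fallback = shard; // remember the first replica in case no primary turns up
            }
        }
        return fallback;
    }
}
```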

@ywangd ywangd requested a review from DaveCTurner September 2, 2025 06:18
@ywangd ywangd requested a review from a team as a code owner September 2, 2025 06:18
@ywangd ywangd added the :Distributed Coordination/Allocation and v9.2.0 labels Sep 2, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination label Sep 2, 2025
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@ywangd (Member Author) commented Sep 2, 2025:

This PR adds a logging message similar to the following for desired balance computation.

```
unassigned shard [[test-index][0], node[null], [R], recovery_source[peer recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2025-09-02T06:00:07.478Z], delayed=false, allocation_status[no_attempt]], failed_attempts[0]] with allocation decision {"node_allocation_decision":{"can_allocate":"no","allocate_explanation":"Elasticsearch isn't allowed to allocate this shard to any of the nodes in the cluster. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there.","node_allocation_decisions":[{"node_id":"node-0","node_name":"","transport_address":"0.0.0.0:5","roles":["data"],"node_decision":"no","weight_ranking":1,"deciders":[{"decider":"same_shard","decision":"NO","explanation":"a copy of this shard is already allocated to this node [[test-index][0], node[node-0], [P], s[STARTED], a[id=uwInxw8VQf-7Vh-s6s15XA], failed_attempts[0], expected_shard_size[0]]"}]}]}}
```

It is inspired by the investigation of ES-12797. Although that specific investigation has concluded and no longer needs it, this feels like generally useful information for future investigations, hence this PR.

In ES-12797, it was suggested to log something similar in DesiredBalanceReconciler. I am not sure whether that is still necessary; my impression is that unassigned shards are usually already unassigned in the DesiredBalance. In case we do still want a similar message for the reconciler, I think a separate follow-up PR is more reasonable.

@ywangd ywangd removed the request for review from a team September 2, 2025 06:25
@DaveCTurner (Contributor) left a comment:

Looks ok as is, I just left a question about an alternative design.

```java
@@ -98,6 +99,10 @@ public OptionalDouble getForecastedWriteLoad(IndexMetadata indexMetadata) {
    public void refreshLicense() {}
};

public static final DesiredBalanceShardsAllocator.ShardAllocationExplainer DUMMY_EXPLAINER = (
```
Contributor:

naming nit: maybe TEST_EXPLAINER or TEST_ONLY_EXPLAINER or something like that?

Member Author:
Sure, pushed e1b6207

```java
@@ -77,15 +86,27 @@ public class DesiredBalanceComputer {
    private long lastConvergedTimeMillis;
    private long lastNotConvergedLogMessageTimeMillis;
    private Level convergenceLogMsgLevel;
    private final FrequencyCappedAction logAllocationExplainForUnassigned;
```
Contributor:
Rather than a simple frequency cap, WDYT about keeping track of the shard we logged and not logging anything while that same shard remains unassigned? I.e. we reset the state only when the unassigned shard becomes assigned or is deleted. Otherwise, if there are multiple unassigned shards, we may get reports about a different shard each time, which will be very hard to interpret.

Might also be nice to record the reset in the logs; that way we can see how long a shard might be blocked from assignment.
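A minimal sketch of this suggested stateful approach, with hypothetical names (this is not the code later pushed in 9a4e07f):

```java
// Hypothetical sketch: remember the shard we explained, stay quiet while it
// remains unassigned, and record a reset when it becomes assigned or is deleted
// so the logs show how long the shard was blocked from assignment.
class UnassignedShardLogTracker {
    private String trackedShardId; // e.g. "[test-index][0]"; null means nothing tracked

    /** Returns true when a fresh explanation should be logged for this shard. */
    synchronized boolean shouldLogExplanation(String unassignedShardId) {
        if (trackedShardId == null) {
            trackedShardId = unassignedShardId;
            return true; // start tracking and emit one explanation
        }
        return false; // an earlier report is still outstanding; suppress further ones
    }

    /** Called when the shard becomes assigned or its index is deleted. */
    synchronized void onShardResolved(String shardId, long unassignedMillis) {
        if (shardId.equals(trackedShardId)) {
            trackedShardId = null; // reset so the next unassigned shard gets reported
            System.out.printf("shard %s resolved after %d ms unassigned%n", shardId, unassignedMillis);
        }
    }
}
```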

Member Author:

Sounds like a good idea. I pushed 9a4e07f for it.

@ywangd ywangd requested a review from DaveCTurner September 3, 2025 03:22
@DaveCTurner (Contributor) left a comment:

LGTM, I like it.

```java
    RoutingNodes routingNodes,
    RoutingAllocation routingAllocation
) {
    if (allocationExplainLogger.isDebugEnabled()) {
```
@DiannaHohensee (Contributor) commented Sep 3, 2025:

Is there a plan to temporarily turn debug on for the new logger in serverless? Turning it on after the fact, during an incident, won't be useful, I expect.

I'd be inclined to log at INFO. CONVERGED with unassigned shards shouldn't happen (except for our mystery bug), unless decider settings are too strict (not expected in serverless), and in that case the same shard will stay unassigned and shouldn't flood the logs.

Member Author:

Yes, the plan is to enable the debug log in serverless; I raised ES-12842 for it. I prefer the DEBUG level since it affects stateful as well.
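For context on the `isDebugEnabled()` guard in the diff above: computing an allocation explanation is comparatively expensive, so it should only run when the dedicated logger will actually emit it. A minimal sketch of that pattern, assuming an illustrative logger name and helper method (not the PR's actual code):

```java
import java.util.function.Supplier;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Illustrative sketch: only build the expensive allocation explanation when
// DEBUG is actually enabled for the dedicated logger.
class AllocationExplainLogging {
    private static final Logger allocationExplainLogger = LogManager.getLogger("allocation_explain");

    static void maybeLogExplanation(Supplier<String> expensiveExplain) {
        if (allocationExplainLogger.isDebugEnabled()) {
            // the Supplier defers the costly explain computation until we know it will be emitted
            allocationExplainLogger.debug("unassigned shard allocation explanation: {}", expensiveExplain.get());
        }
    }
}
```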

@ywangd (Member Author) commented Sep 3, 2025:

@elasticmachine update branch

@ywangd ywangd added the auto-merge-without-approval label Sep 3, 2025
@elasticsearchmachine elasticsearchmachine merged commit 99ff870 into elastic:main Sep 4, 2025
33 checks passed
@ywangd ywangd deleted the log-allocation-explain-for-unassigned branch September 4, 2025 00:55
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Sep 11, 2025