Conversation

hongkunxu (Contributor):

Why I'm doing:

Under the current rebalance + deletion mechanism, when a new BE node is added, data cannot be successfully migrated to it.

Here’s why:

  • During rebalance, a tablet replica is cloned from a source BE to the new BE.
  • Until the source replica is cleaned up, the source BE still holds it, so the tablet temporarily has more replicas than required and the system immediately treats the situation as redundant.
  • The source BE's replica is already recorded in cachedReplicaId and scheduled for deletion, but the redundancy check does not know this, so the cloned replica on the new BE is also incorrectly judged as redundant and removed.

As a result, the new BE node never keeps the migrated data, making the rebalance ineffective.
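
To make the failure concrete, here is a minimal Java sketch of the pre-fix decision, with hypothetical Replica/Tablet shapes; only the decision logic mirrors the description above:

import java.util.List;

// Hypothetical shapes; the real scheduler code is far more involved.
class PreFixRedundancyCheck {
    record Replica(long id, long backendId) {}

    static final int REPLICATION_NUM = 3;

    // Right after the clone, the tablet briefly has REPLICATION_NUM + 1
    // replicas. This check knows nothing about cachedReplicaId, so the
    // replica it picks for deletion can be the freshly cloned one on the
    // new BE, which undoes the migration.
    static Replica pickRedundantReplica(List<Replica> replicas) {
        if (replicas.size() > REPLICATION_NUM) {
            return replicas.get(replicas.size() - 1); // may be the new clone
        }
        return null;
    }
}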

What I'm doing:

I added an extra check to the redundant replica detection logic (a sketch follows the list):

  1. Before deciding whether a replica is redundant, we now verify whether its tabletId already exists in cachedReplicaId.
  2. If it does, we skip the redundant-replica judgment, because the source replica is already scheduled for cleanup by the rebalancer.
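
A minimal sketch of the added guard; class and method names here are illustrative, and only cachedReplicaId and the tabletId key come from this PR:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the extra check; names other than cachedReplicaId are illustrative.
class RedundantReplicaChecker {
    // tabletId -> replicaId of the source replica the rebalancer will delete
    private final Map<Long, Long> cachedReplicaId = new ConcurrentHashMap<>();

    boolean shouldSkipRedundantJudgment(long tabletId) {
        // If the tablet is already tracked here, its source replica is
        // scheduled for cleanup after the clone succeeds, so we skip the
        // generic redundancy judgment and let the cloned replica survive.
        return cachedReplicaId.containsKey(tabletId);
    }
}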

Fixes #62541

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This PR needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport PR

Bugfix cherry-pick branch check:

  • I have checked the version labels for the target branches this PR will be auto-backported to
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Signed-off-by: Hongkun Xu <xuhongkun666@163.com>
hongkunxu requested a review from a team as a code owner, August 31, 2025 07:31

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)


[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

gengjun-git self-assigned this, Sep 1, 2025
@@ -1185,7 +1185,13 @@ private boolean deleteLocationMismatchReplica(TabletSchedCtx tabletCtx, boolean
        if (backend != null) {
            Pair<String, String> location = backend.getSingleLevelLocationKV();
            if (location != null && matchedLocations.contains(location)) {
                dupReplica = replica;
wyb (Contributor) commented on this line:

How about iterating the replicas backwards and removing the replica with the duplicate location?
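
For concreteness, the shape such a backwards scan could take (illustrative only; the real code works on tablet replicas and matchedLocations):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Iterate from the end so removal by index stays safe; drop any replica
// whose location has already been seen.
class BackwardDedup {
    record Replica(long id, String location) {}

    static void removeDuplicateLocations(List<Replica> replicas) {
        Set<String> seen = new HashSet<>();
        for (int i = replicas.size() - 1; i >= 0; i--) {
            if (!seen.add(replicas.get(i).location())) {
                replicas.remove(i);
            }
        }
    }
}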

hongkunxu (Contributor, Author):

Hi @wyb, iterating the replicas backwards and removing the replica with the duplicate location is indeed one solution, but wouldn't that mean that

// tabletId -> replicaId
// used to delete src replica after copy task success
private final Map<Long, Long> cachedReplicaId = new ConcurrentHashMap<>();

becomes useless? This put operation only happens in completeSchedCtx, inside createBalanceTask. And since rebalance inevitably produces duplicate replicas, if we clean them up directly here, doesn't that mean the subsequent deleteReplicaChosenByRebalancer could also be removed?
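
For context, a simplified sketch of the lifecycle being discussed; completeSchedCtx, createBalanceTask, and deleteReplicaChosenByRebalancer are named above, while the method bodies here are reduced, illustrative bookkeeping:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RebalancerSketch {
    // tabletId -> replicaId, used to delete the src replica after the copy succeeds
    private final Map<Long, Long> cachedReplicaId = new ConcurrentHashMap<>();

    // Written once, when the balance (clone) task is created
    // (in the real code: completeSchedCtx inside createBalanceTask).
    void onBalanceTaskCreated(long tabletId, long srcReplicaId) {
        cachedReplicaId.put(tabletId, srcReplicaId);
    }

    // Consumed later: remove the recorded source replica, then drop the entry.
    void deleteReplicaChosenByRebalancer(long tabletId) {
        Long srcReplicaId = cachedReplicaId.remove(tabletId);
        if (srcReplicaId != null) {
            deleteReplica(tabletId, srcReplicaId);
        }
    }

    private void deleteReplica(long tabletId, long replicaId) {
        // placeholder for the actual replica-deletion metadata update
    }
}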

wyb (Contributor):

The current implementation doesn't guarantee that the source replica will be deleted via deleteReplicaChosenByRebalancer after balancing.

hongkunxu requested a review from wyb, September 1, 2025 06:15

Successfully merging this pull request may close these issues.

[Bug] New BE added to an existing rack cannot get balanced data