Skip to content

Conversation

devanbenz
Copy link

We were seeing a state in which a shard would not perform full compactions leading to a build up of level 4 TSM files.

File state:

-rw-r--r--.  1 root root 2.1G Aug  5 20:59 000016684-000000007.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:02 000016684-000000008.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:04 000016684-000000009.tsm
-rw-r--r--.  1 root root 376M Aug  5 21:05 000016684-000000010.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000016812-000000004.tsm
-rw-r--r--.  1 root root 1.4G Aug  5 18:00 000016812-000000005.tsm
-rw-r--r--.  1 root root 1.3G Aug  5 21:21 000016844-000000002.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000016948-000000004.tsm
-rw-r--r--.  1 root root 1.4G Aug  5 18:00 000016948-000000005.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000017076-000000004.tsm

There is a rouge level 2 file packed within fully compacted files

-rw-r--r--.  1 root root 2.1G Aug  5 20:59 000016684-000000007.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:02 000016684-000000008.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:04 000016684-000000009.tsm
-rw-r--r--.  1 root root 376M Aug  5 21:05 000016684-000000010.tsm

and level 4 files

-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000016948-000000004.tsm
-rw-r--r--.  1 root root 1.4G Aug  5 18:00 000016948-000000005.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000017076-000000004.tsm

The area of our code that would cause this state to be skipped would be here

// step is how may files to compact in a group. We want to clamp it at 4 but also stil
// return groups smaller than 4.
step := 4
if step > end {
step = end
}
// slice off the generations that we'll examine
generations = generations[start:end]
// Loop through the generations in groups of size step and see if we can compact all (or
// some of them as group)
groups := []tsmGenerations{}
for i := 0; i < len(generations); i += step {
var skipGroup bool
startIndex := i
for j := i; j < i+step && j < len(generations); j++ {
gen := generations[j]
lvl := gen.level()
// Skip compacting this group if there happens to be any lower level files in the
// middle. These will get picked up by the level compactors.
if lvl <= 3 {
skipGroup = true
break
}
// Skip the file if it's over the max size and it contains a full block
if gen.size() >= uint64(tsdb.MaxTSMFileSize) && gen.files[0].FirstBlockCount >= tsdb.DefaultMaxPointsPerBlock && !gen.hasTombstones() {
startIndex++
continue
}
}
if skipGroup {
continue
}
endIndex := i + step
if endIndex > len(generations) {
endIndex = len(generations)
}
if endIndex-startIndex > 0 {
groups = append(groups, generations[startIndex:endIndex])
}
}
if len(groups) == 0 {
return nil, 0
}

We need to add some sort of escape mechanism that would allow for compactions to occur or simplify this logic.

This PR adds a test for the bugged file listing where we experience the issue. I'm still trying to find the way files get in to this state, but, this should resolve the issue if and when it occurs. I've added the check in PlanOptimize() as to not change any of the core logic in Plan().

Closes #

Describe your proposed changes here.

  • I've read the contributing section of the project README.
  • Signed CLA (if not already signed).

We were seeing a state in which a shard would not perform full compactions leading to a build up of level 4 TSM files.

File state:

```
-rw-r--r--.  1 root root 2.1G Aug  5 20:59 000016684-000000007.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:02 000016684-000000008.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:04 000016684-000000009.tsm
-rw-r--r--.  1 root root 376M Aug  5 21:05 000016684-000000010.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000016812-000000004.tsm
-rw-r--r--.  1 root root 1.4G Aug  5 18:00 000016812-000000005.tsm
-rw-r--r--.  1 root root 1.3G Aug  5 21:21 000016844-000000002.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000016948-000000004.tsm
-rw-r--r--.  1 root root 1.4G Aug  5 18:00 000016948-000000005.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000017076-000000004.tsm
```

There is a rouge level 2 file packed within fully compacted files

```
-rw-r--r--.  1 root root 2.1G Aug  5 20:59 000016684-000000007.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:02 000016684-000000008.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 21:04 000016684-000000009.tsm
-rw-r--r--.  1 root root 376M Aug  5 21:05 000016684-000000010.tsm
```

and level 4 files

```
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000016948-000000004.tsm
-rw-r--r--.  1 root root 1.4G Aug  5 18:00 000016948-000000005.tsm
-rw-r--r--.  1 root root 2.1G Aug  5 18:00 000017076-000000004.tsm
```

The area of our code that would cause this state to be skipped would be here

https://github.com/influxdata/influxdb/blob/22bec4f046a28e3f1fa815705362767151407e1b/tsdb/engine/tsm1/compact.go#L620-L670

We need to add some sort of escape mechanism that would allow for compactions to occur or simplify this logic.

This PR adds a test for the bugged file listing where we experience the issue. I'm still trying to find the way files get in to this
state, but, this should resolve the issue if and when it occurs. I've added the check in PlanOptimize() as to not change any of the core
logic in Plan().
@davidby-influx
Copy link
Contributor

The test fails without the second condition check? || cur.Len() < 2

@devanbenz
Copy link
Author

The test fails without the second condition check? || cur.Len() < 2

Correct, I should have pushed up a commit with just the test but if you remove the check it fails.

@davidby-influx
Copy link
Contributor

davidby-influx commented Aug 30, 2025 via email

@devanbenz devanbenz marked this pull request as ready for review September 2, 2025 14:21
@devanbenz devanbenz requested review from davidby-influx and gwossum and removed request for davidby-influx September 2, 2025 14:21
@devanbenz devanbenz self-assigned this Sep 2, 2025
* 2 level 2 files
* 3 level 2 files
* 4 level 2 files
* 5 level 2 files
…k at the files on disk a lot of these

l4 files are not at the max points per block, just at the max file size of 2.1 GB.
…we don't have nested lower level

files in our plans
@@ -640,7 +642,7 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {

// Skip compacting this group if there happens to be any lower level files in the
// middle. These will get picked up by the level compactors.
if lvl <= 3 {
if lvl <= 3 && len(gen.files) > 4 {
Copy link
Author

@devanbenz devanbenz Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it would be better to ensure that the files are nested within high level files. For example, lets say we have the following levels listed

5
6
7
8
4
5
2
2
4
5
4
5

We can assume that the 2 level 2 files need to be compacted as a group with our high level files.

But if we have

5
6
7
8
4
5
2
2

since the 2 level 2 files are at the end of the entire list of files we should not compact them with our higher groups

I found the following test case:

	data := []tsm1.FileStat{
		{
			Path: "01-04.tsm1",
			Size: 64 * 1024 * 1024,
		},
		{
			Path: "02-04.tsm1",
			Size: 64 * 1024 * 1024,
		},
		{
			Path: "03-04.tsm1",
			Size: 64 * 1024 * 1024,
		},
		{
			Path: "04-04.tsm1",
			Size: 64 * 1024 * 1024,
		},
		{
			Path: "05-02.tsm1",
			Size: 2 * 1024 * 1024,
		},
	}

Which is found here:

func TestDefaultPlanner_Plan_CompactsMiddleSteps(t *testing.T) {
data := []tsm1.FileStat{
{
Path: "01-04.tsm1",
Size: 64 * 1024 * 1024,
},
{
Path: "02-04.tsm1",
Size: 64 * 1024 * 1024,
},
{
Path: "03-04.tsm1",
Size: 64 * 1024 * 1024,
},
{
Path: "04-04.tsm1",
Size: 64 * 1024 * 1024,
},
{
Path: "05-02.tsm1",
Size: 2 * 1024 * 1024,
},
}
cp := tsm1.NewDefaultPlanner(
newFakeFileStore(withFileStats(data)),
tsdb.DefaultCompactFullWriteColdDuration,
)
expFiles := []tsm1.FileStat{data[0], data[1], data[2], data[3]}
tsm, pLen := cp.Plan(time.Now())
if exp, got := len(expFiles), len(tsm[0]); got != exp {
t.Fatalf("tsm file length mismatch: got %v, exp %v", got, exp)
} else if pLen != int64(len(tsm)) {
t.Fatalf("tsm file plan length mismatch: got %v, exp %v", pLen, exp)
}
for i, p := range expFiles {
if got, exp := tsm[0][i], p.Path; got != exp {
t.Fatalf("tsm file mismatch: got %v, exp %v", got, exp)
}
}
}

With this change does not pass. I would expect for the first four files to get compacted but the last one to be left out since its at the end of the file list.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not pick up the level 2 files at the end? They will get the benefits of the better compression from inclusion with higher level files. Are you concerned it will be extra work on a hot shard?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retracting the above comment after discussion.

@davidby-influx
Copy link
Contributor

@devanbenz - I think we need to create a new test case file. In it, we should have many, many permutations of compaction levels and generations. Here are a few examples:

4,5,4,5,4
2,4,5,4,5,4
3,2,4,5,4,5,4
3,2,4,5,4,5,4,2
2,2,3,2,4,5,4,5,4,2
2,2,3,2,4,5,2,4,5,4,2
2,2,3,2,4,5,2,2,4,5,3,4,2
4,5,2,2,4,5,3,5,4,2,2,2
4,5,2,2,4,5,3,5,2,3,4,2,2,2,5,4,5,6

We want to test

  • leading low levels with runs from 1 to 9.
  • trailing low levels,
  • nested low levels (with varying run lengths from 1 to 9),
  • multiple nested low levels with varying run lengths
  • leading low levels and multiple nested low levels with varying run lengths
  • leading low levels and multiple nested low levels with varying run lengths and trailing low levels
  • etc.

Create compact_case_test.go, generate all the test cases (maybe statically with Claude, maybe programmatically at run time) with labels (don't forget to vary points per block and file size). This will be a large set of permutations, so test them as you go along with whatever algorithm choices you have made to the planner.

Claude generated the test cases and I modified the results as to what I would expect.
@devanbenz devanbenz marked this pull request as draft September 3, 2025 16:34
@devanbenz
Copy link
Author

@devanbenz - I think we need to create a new test case file. In it, we should have many, many permutations of compaction levels and generations. Here are a few examples:

4,5,4,5,4
2,4,5,4,5,4
3,2,4,5,4,5,4
3,2,4,5,4,5,4,2
2,2,3,2,4,5,4,5,4,2
2,2,3,2,4,5,2,4,5,4,2
2,2,3,2,4,5,2,2,4,5,3,4,2
4,5,2,2,4,5,3,5,4,2,2,2
4,5,2,2,4,5,3,5,2,3,4,2,2,2,5,4,5,6

We want to test

* leading low levels with runs from 1 to 9.

* trailing low levels,

* nested low levels (with varying run lengths from 1 to 9),

* multiple nested low levels with varying run lengths

* leading low levels and multiple nested low levels with varying run lengths

* leading low levels and multiple nested low levels with varying run lengths and trailing low levels

* etc.

Create compact_case_test.go, generate all the test cases (maybe statically with Claude, maybe programmatically at run time) with labels (don't forget to vary points per block and file size). This will be a large set of permutations, so test them as you go along with whatever algorithm choices you have made to the planner.

I've found the static test file to work better this AM while working on this. I'm going to add the same tests duplicated but using different block counts to see how it impacts the planner + expected results.

Copy link
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly confused by the changes. It may be my problem.

@@ -640,7 +653,7 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {

// Skip compacting this group if there happens to be any lower level files in the
// middle. These will get picked up by the level compactors.
if lvl <= 3 {
if lvl <= 3 && len(gen.files) > 4 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the length check here. Why are big groups of lower level files skipped?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidby-influx davidby-influx changed the title feat: Adds detection for orphaned TSM files while running PlanOptimize feat: Adds detection for orphaned TSM files while planning compaction Sep 4, 2025
Copy link
Contributor

@philjb philjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor perf comment and one comment that needs to be addressed (or I'm confused) -- Either there should be a test that exercises the change or the condition should be removed (I think it can be removed).

for i, g := range generations {
if g.level() <= 3 {
// Track the last high-level generation
if g.level() > 3 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

level() calls parsefilename -- I think we should store off the lvl since you're calling it twice now and we know we spend a lot of cycles on parsing tsm file names (to be improved more substantially in other work).

lvl := g.level() ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

if lvl <= 3 {
// Skip compacting this group if there happens to be more than 3 lower level
// files in the middle. These will get picked up by the level compactors.
if lvl <= 3 && len(gen.files) > 3 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this check. The loop over generations (line 585) breaks when a generation of level 3 or less with 4 or more files is encountered so the remaining generations in consideration for compaction will never contain a group that matches lvl <= 3 && len(gen.files) > 3 as that's the same as lvl <= 3 && len(g.files) >= 4 above.

The tests in compact_test.go with coverage show that the inner part of this if isn't ever run. So the code is confusing now or at least just as confusing as before; as the current master-1.x code can not hit the inner part of this if block either as the loop over generations broke when any lvl <= 3 was hit.

I recommend adding a test that proves this if-chunk is necessary or remove it.

As an aside, the if check here and the comment imply to me that this compaction code used to be resilient against "orphaned" low level tsm files surrounded by fully compacted files, but looking at the git history that doesn't appear to be the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, another test to hit this conditional would be good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, commenting out the check on line 655 does cause test failures:

			// Skip compacting this group if there happens to be more than 3 lower level
			// files in the middle. These will get picked up by the level compactors.
			if lvl <= 3 { // && len(gen.files) > 3 {
				skipGroup = true
=== RUN   TestEnginePlanCompactions/nested_4_5_2_2_4_5
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/nested_4_5_2_2_4_5
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/nested_4_5_2_2_4_5 (0.00s)

=== RUN   TestEnginePlanCompactions/complex_nested_4_5_2_4_5_2_4
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/complex_nested_4_5_2_4_5_2_4
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/complex_nested_4_5_2_4_5_2_4 (0.00s)

=== RUN   TestEnginePlanCompactions/complex_nested_4_5_2_2_1_2_4_5
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/complex_nested_4_5_2_2_1_2_4_5
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/complex_nested_4_5_2_2_1_2_4_5 (0.00s)
~/src/github.com/influxdata/influxdb/tsdb/engine/tsm1$ git diff
diff --git a/tsdb/engine/tsm1/compact.go b/tsdb/engine/tsm1/compact.go
index a7c318c54f..8de97a3fc5 100644
--- a/tsdb/engine/tsm1/compact.go
+++ b/tsdb/engine/tsm1/compact.go
@@ -652,7 +652,7 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {
 
                        // Skip compacting this group if there happens to be more than 3 lower level
                        // files in the middle. These will get picked up by the level compactors.
-                       if lvl <= 3 && len(gen.files) > 3 {
+                       if lvl <= 3 { // && len(gen.files) > 3 {
                                skipGroup = true
                                break
                        }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And commenting out the other check on line 592 also causes errors:

		// Skip low-level generations with too many files for level compaction
		if g.level() <= 3 { // && len(g.files) >= 4 {
			break
		}
	}

=== RUN   TestEnginePlanCompactions/Mixed_generations_with_varying_file_sizes
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/Mixed_generations_with_varying_file_sizes
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/Mixed_generations_with_varying_file_sizes (0.00s)

=== RUN   TestEnginePlanCompactions/Mixed_generations_with_2_level_2_files
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/Mixed_generations_with_2_level_2_files
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/Mixed_generations_with_2_level_2_files (0.00s)

=== RUN   TestEnginePlanCompactions/Mixed_generations_with_3_level_2_files
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/Mixed_generations_with_3_level_2_files
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/Mixed_generations_with_3_level_2_files (0.00s)

=== RUN   TestEnginePlanCompactions/Mixed_generations_with_4_level_2_files
--- PASS: TestEnginePlanCompactions/Mixed_generations_with_4_level_2_files (0.00s)
=== RUN   TestEnginePlanCompactions/Mixed_generations_with_5_level_2_files
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/Mixed_generations_with_5_level_2_files
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/Mixed_generations_with_5_level_2_files (0.00s)

=== RUN   TestEnginePlanCompactions/First_file_is_lower_level_than_next_files
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/First_file_is_lower_level_than_next_files
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/First_file_is_lower_level_than_next_files (0.00s)

=== RUN   TestEnginePlanCompactions/basic_high_level_4_5_4_5_4
--- PASS: TestEnginePlanCompactions/basic_high_level_4_5_4_5_4 (0.00s)
=== RUN   TestEnginePlanCompactions/leading_low_2_4_5_4_5_4
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/leading_low_2_4_5_4_5_4
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/leading_low_2_4_5_4_5_4 (0.00s)

=== RUN   TestEnginePlanCompactions/leading_low_run_2_2_4_5_4_5_4
    compact_test.go:4528: 
        	Error Trace:	/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4528
        	            				/home/davidby/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact_test.go:4515
        	Error:      	"[]" should have 1 item(s), but has 0
        	Test:       	TestEnginePlanCompactions/leading_low_run_2_2_4_5_4_5_4
        	Messages:   	unexpected level 4 Group  collection length mismatch
--- FAIL: TestEnginePlanCompactions/leading_low_run_2_2_4_5_4_5_4 (0.00s)
~/src/github.com/influxdata/influxdb/tsdb/engine/tsm1$ git diff
diff --git a/tsdb/engine/tsm1/compact.go b/tsdb/engine/tsm1/compact.go
index a7c318c54f..5369924181 100644
--- a/tsdb/engine/tsm1/compact.go
+++ b/tsdb/engine/tsm1/compact.go
@@ -589,7 +589,7 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {
                }
 
                // Skip low-level generations with too many files for level compaction
-               if g.level() <= 3 && len(g.files) >= 4 {
+               if g.level() <= 3 { // && len(g.files) >= 4 {
                        break
                }
        }

Copy link
Contributor

@philjb philjb Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you only commented out half of the conditional. I'm saying the full conditional is never true in any test.

I get no test failures when I comment the 655 conditional:

diff --git a/tsdb/engine/tsm1/compact.go b/tsdb/engine/tsm1/compact.go
index a7c318c54f..2a0c02c6c0 100644
--- a/tsdb/engine/tsm1/compact.go
+++ b/tsdb/engine/tsm1/compact.go
@@ -648,14 +648,14 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {

                for j := i; j < i+step && j < len(generations); j++ {
                        gen := generations[j]
-                       lvl := gen.level()
+                       //lvl := gen.level()

                        // Skip compacting this group if there happens to be more than 3 lower level
                        // files in the middle. These will get picked up by the level compactors.
-                       if lvl <= 3 && len(gen.files) > 3 {
-                               skipGroup = true
-                               break
-                       }
+                       //if lvl <= 3 && len(gen.files) > 3 {
+                       //      skipGroup = true
+                       //      break
+                       //}

                        // Skip the file if it's over the max size and it contains a full block
                        if gen.size() >= uint64(tsdb.MaxTSMFileSize) && gen.files[0].FirstBlockCount >= tsdb.DefaultMaxPointsPerBlock && !gen.hasTombstones() {

and to prove the point, if I instead comment out the conditional at 592 I also get no test failures.

diff --git a/tsdb/engine/tsm1/compact.go b/tsdb/engine/tsm1/compact.go
index a7c318c54f..d89b4c2cee 100644
--- a/tsdb/engine/tsm1/compact.go
+++ b/tsdb/engine/tsm1/compact.go
@@ -589,9 +589,9 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {
                }

                // Skip low-level generations with too many files for level compaction
-               if g.level() <= 3 && len(g.files) >= 4 {
-                       break
-               }
+               //if g.level() <= 3 && len(g.files) >= 4 {
+               //      break
+               //}
        }

        // If we have high-level files, only include generations up to the last high-level file

And way more interesting, if I comment out BOTH conditionals, I still get no test failures.

I expected the first two because I think they cover the same situations but removing both and not seeing any test failures means a gap in the testing of the changes in the pr.

Copy link
Author

@devanbenz devanbenz Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philjb It looks like by commenting out

 diff --git a/tsdb/engine/tsm1/compact.go b/tsdb/engine/tsm1/compact.go
index a7c318c54f..d89b4c2cee 100644
--- a/tsdb/engine/tsm1/compact.go
+++ b/tsdb/engine/tsm1/compact.go
@@ -589,9 +589,9 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {
                }

                // Skip low-level generations with too many files for level compaction
-               if g.level() <= 3 && len(g.files) >= 4 {
-                       break
-               }
+               //if g.level() <= 3 && len(g.files) >= 4 {
+               //      break
+               //}
        }

        // If we have high-level files, only include generations up to the last high-level file

The tests still pass as it's a bit redundant since later on we will skip these groups anyway:

for j := i; j < i+step && j < len(generations); j++ {
gen := generations[j]
lvl := gen.level()
// Skip compacting this group if there happens to be any lower level files in the
// middle. These will get picked up by the level compactors.
if lvl <= 3 {
skipGroup = true
break
}
// Skip the file if it's over the max size and it contains a full block
if gen.size() >= uint64(tsdb.MaxTSMFileSize) && gen.files[0].FirstBlockCount >= tsdb.DefaultMaxPointsPerBlock && !gen.hasTombstones() {
startIndex++
continue
}
}

I wonder if this check can be removed to simplify 🤔

I'm tracing through the rest of the code to see if we can consolidate/simplify things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philjb - I commented out the part of the conditional that was added in the PR, leaving the previous code. That's why it was half of each removed.

@philjb
Copy link
Contributor

philjb commented Sep 4, 2025

Related to commit: 06226d6

},
},
{
name: "Mixed generations with varying file sizes, small amount of files",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and all proceeding tests are for checking correctness in the edge case that Phil pointed out. I have this checking against a single lower level TSM file, 3 lower level TSM files (< 4) and 5 lower level TSM files (> 4). All tests appear to plan the nested lower level files for full compaction as expected.

}

// If we have high-level files, only include generations up to the last high-level file
if c.enableNestedCompactor && lastHighLevelIndex >= 0 {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be needed since we set lastHighLevelIndex to -1 so if the feature flag is disabled it will just always be -1.. I want to remain consistent and indicate that this code is only supposed to be enabled when the feature flag is on so I decided to keep it.

start = i
break
if c.enableNestedCompactor {
if g.size()*2 < generations[i-1].size() && generations[i-1].level() >= generations[i].level() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to add a comment about why this check needs to be different with the feature flag on. It's a bit confusing to just look at without knowing context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment would be great.

I still think that this logic will "leave behind" a non fully compacted file, but i think this is likely ok. For example

gen-lvl size index
6-6 6.1 0
5-5 5.1 1
4-4 4.1 2
3-2 2 3
2-2 .9 4
1-7 .9 5
0-1 .9 6

start will be 4, end will be 6. The 3-2 file won't be picked up.

@@ -252,6 +252,7 @@ func NewEngine(id uint64, idx tsdb.Index, path string, walPath string, sfile *ts

var planner CompactionPlanner = NewDefaultPlanner(fs, time.Duration(opt.Config.CompactFullWriteColdDuration))
planner.SetAggressiveCompactionPointsPerBlock(int(opt.Config.AggressivePointsPerBlock))
planner.SetNestedCompactor(opt.Config.NestedCompactorEnabled)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps log here if the config is set to true. I think that will be helpful confirmation. Sometimes we get logs but not a config file (although with Cloud1 we have direct access to both).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely log. Info level. Great suggestion.

"000016684-000000007.tsm",
"000016812-000000004.tsm",
"000016853-000000002.tsm",
"000016948-000000004.tsm",
Copy link
Contributor

@philjb philjb Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is wrong. This compaction group will merge these tsm files but that will produce incorrect overwrite semantics as the files in the group aren't adjacent.

The level 2 group looks right --- it pulled 4 level 2 files together into a group that are sequential. They will be compacted into a level 3 generation that will be correct for overwrites, producing a 000016844-3

But if the weighting of the scheduler picks this level4group to be run, it'll compact 684-7, 812-5, 853-2, 948-4 together and produce a 000016684-8 which will be older than the one from the level 2 group (16684 < 16844) and overwrites will be broken.

So you might need to check there isn't a run of non-fully compacted generations longer than 4 (which is what the level 2 & 3 compactions look for.

			if i - lastHighLevelIndex > 3 {
				break
			}

Copy link
Contributor

@philjb philjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found an issue in one of the tests -- makes me wonder if other tests are incorrectly combining tsm files that aren't adjacent. I think each test case could have a check that given the input files/generations and the expected output, all generations in all groups are adjacent -- just to verify the test is accurate too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants