MergeTree Parts Not Merging Automatically with Frequent Inserts, Seeking Optimization Methods #365

@yidigo

Hello chdb team,

I am using chdb for a use case that involves periodically inserting small batches of data into a MergeTree table. I have noticed that over time, each insert creates a new, small data part (a separate file/directory on disk).

The Problem:
Unlike a standard ClickHouse server, these small parts do not seem to be automatically merged in the background. This results in a large number of small data parts for a single table. Consequently, the query performance on this table degrades significantly, as the query engine has to read from many different files.
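To make the symptom concrete, the part count can be checked with a query against `system.parts`. The following is a minimal sketch, assuming the `chdb.session` API and that `system.parts` behaves as it does on a standard ClickHouse server. The `demo.events` table and the helper names are hypothetical and chosen only for illustration.

```python
def active_parts_sql(database: str, table: str) -> str:
    """Build a query that counts the active data parts of one table."""
    return (
        "SELECT count() FROM system.parts "
        f"WHERE database = '{database}' AND table = '{table}' AND active"
    )


def count_active_parts(sess, database: str, table: str) -> int:
    """Run the count query on a session and parse the single CSV value."""
    result = sess.query(active_parts_sql(database, table), "CSV")
    return int(str(result).strip())


def demo() -> None:
    """Reproduce the pattern: many small inserts, then count the parts."""
    # Assumption: chdb is installed. Imported here so the helpers above
    # remain usable (and testable) without it.
    from chdb import session as chs

    sess = chs.Session()  # temporary session directory
    sess.query("CREATE DATABASE IF NOT EXISTS demo", "CSV")
    sess.query(
        "CREATE TABLE demo.events (ts DateTime, v UInt32) "
        "ENGINE = MergeTree ORDER BY ts",
        "CSV",
    )
    for i in range(20):
        # One small insert per iteration, mirroring the periodic batches.
        sess.query(f"INSERT INTO demo.events VALUES (now(), {i})", "CSV")
    print(count_active_parts(sess, "demo", "events"))
```

Calling `demo()` prints the number of active parts after 20 single-row inserts. If no merges happen, that number should stay close to the number of inserts.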

My Questions
Does chdb have an automatic background merge process for MergeTree tables like a standard ClickHouse server? If so, is there any configuration required to enable or tune it?

Alternatively, is there a way to manually trigger a merge of all the data parts in a table?

Are there any recommended best practices or MergeTree settings for write-intensive scenarios with frequent, small inserts in chdb?
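For reference, on a standard ClickHouse server a manual merge is requested with `OPTIMIZE TABLE ... FINAL`. The sketch below shows what the equivalent call through a chdb session would look like. Whether chdb honors it in the same way is part of what is being asked here, so treat it as an assumption. The helper names and the `demo.events` table are hypothetical.

```python
def optimize_final_sql(database: str, table: str) -> str:
    """Build the statement that asks ClickHouse to merge a table's parts."""
    return f"OPTIMIZE TABLE {database}.{table} FINAL"


def merge_all_parts(sess, database: str, table: str) -> None:
    """Send the OPTIMIZE ... FINAL statement through a session object.

    `sess` is expected to expose query(sql, fmt), as chdb sessions do.
    """
    sess.query(optimize_final_sql(database, table), "CSV")


def demo() -> None:
    """Request a manual merge on an existing table."""
    # Assumption: chdb is installed and demo.events already exists
    # in the session directory given below (a hypothetical path).
    from chdb import session as chs

    sess = chs.Session("./chdb_data")
    merge_all_parts(sess, "demo", "events")
```

Comparing the active part count in `system.parts` before and after the call would show whether the merge took effect.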

The following are some screenshots:

Image Image


Labels: question (Further information is requested)
