stdpain
Contributor

@stdpain stdpain commented Aug 26, 2025

Why I'm doing:

  1. Refactor fixed_length_column: rename the const overload of get_data() to immutable_data().
  2. Change the return type of that accessor from Buffer& to ImmBuffer.
  3. Introduce the config enable_zero_copy_from_page_cache to control the behaviour.
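
To illustrate the three points above, here is a minimal C++ sketch of the idea, not the code in this PR. `ImmBufferSketch`, `PageSketch`, `FixedLengthColumnSketch`, `attach_page()` and `references_page()` are hypothetical names invented for the example, and the `enable_zero_copy` flag only stands in for `enable_zero_copy_from_page_cache`. The sketch assumes that a read-only accessor can hand out a view over memory owned by a cached page, and that the mutable accessor copies the data before allowing writes. The PR description does not say how the real implementation handles mutation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical read-only view over contiguous values (not the real ImmBuffer).
template <typename T>
class ImmBufferSketch {
public:
    ImmBufferSketch(const T* data, std::size_t size) : _data(data), _size(size) {}
    const T* data() const { return _data; }
    std::size_t size() const { return _size; }
    const T& operator[](std::size_t i) const { return _data[i]; }

private:
    const T* _data;
    std::size_t _size;
};

// Hypothetical stand-in for a decoded page held by the page cache.
// Shared ownership keeps the bytes alive while a column references them.
using PageSketch = std::shared_ptr<const std::vector<int32_t>>;

class FixedLengthColumnSketch {
public:
    // With zero copy enabled, keep a reference to the page; otherwise copy its values.
    void attach_page(PageSketch page, bool enable_zero_copy) {
        if (enable_zero_copy) {
            _page = std::move(page);
            _owned.clear();
        } else {
            _owned.assign(page->begin(), page->end());
            _page.reset();
        }
    }

    // Read-only access works for both owned and referenced storage.
    ImmBufferSketch<int32_t> immutable_data() const {
        if (_page) return {_page->data(), _page->size()};
        return {_owned.data(), _owned.size()};
    }

    // Mutable access needs owned storage, so referenced data is copied first
    // (an assumption of this sketch).
    std::vector<int32_t>& get_data() {
        if (_page) {
            _owned.assign(_page->begin(), _page->end());
            _page.reset();
        }
        return _owned;
    }

    bool references_page() const { return _page != nullptr; }

private:
    std::vector<int32_t> _owned; // storage owned by the column
    PageSketch _page;            // non-null while the column borrows page memory
};
```

In this sketch, read-only callers switch to `immutable_data()` and never trigger a copy, while a caller that asks for `get_data()` pays for one copy and leaves the cached page untouched.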

What I'm doing:

Fixes #62289

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This PR needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport PR

Bugfix cherry-pick branch check:

  • I have checked the version labels for the target branches to which this PR will be auto-backported
    • 4.0
    • 3.5
    • 3.4
    • 3.3

@stdpain stdpain requested review from a team as code owners August 26, 2025 09:00
@stdpain stdpain marked this pull request as draft August 26, 2025 09:00
@stdpain stdpain force-pushed the support_column_zero_copy_from_page_cache branch 4 times, most recently from d1e0bf0 to e235289 Compare August 28, 2025 07:00
@stdpain stdpain marked this pull request as ready for review August 28, 2025 09:12
@alvin-celerdata
Contributor

@cursor review

cursor[bot]

This comment was marked as outdated.

@stdpain stdpain changed the title from "[Feature] support column zero copy read from page cache" to "[Enhancement] support column zero copy read from page cache" Aug 29, 2025
@stdpain stdpain force-pushed the support_column_zero_copy_from_page_cache branch 7 times, most recently from 9a32e30 to a501441 Compare September 2, 2025 06:24
@alvin-celerdata
Contributor

@cursor review

cursor[bot]

This comment was marked as outdated.

Seaven previously approved these changes Sep 4, 2025
@stdpain stdpain force-pushed the support_column_zero_copy_from_page_cache branch from 7763fcf to cad7a47 Compare September 4, 2025 03:05
@stdpain stdpain force-pushed the support_column_zero_copy_from_page_cache branch from cad7a47 to 6b93084 Compare September 4, 2025 03:07
satanson previously approved these changes Sep 4, 2025
Signed-off-by: stdpain <drfeng08@gmail.com>
@stdpain stdpain force-pushed the support_column_zero_copy_from_page_cache branch from 6b93084 to 0f53100 Compare September 4, 2025 07:01
@alvin-celerdata
Contributor

@cursor review

Signed-off-by: stdpain <drfeng08@gmail.com>

github-actions bot commented Sep 4, 2025

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)


github-actions bot commented Sep 4, 2025

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)


github-actions bot commented Sep 4, 2025

[BE Incremental Coverage Report]

pass : 954 / 1063 (89.75%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/storage/rowset/plain_page.h 0 3 00.00% [244, 245, 246]
🔵 be/src/column/adaptive_nullable_column.cpp 0 3 00.00% [38, 56, 274]
🔵 be/src/storage/push_utils.cpp 0 4 00.00% [141, 152, 154, 155]
🔵 be/src/column/adaptive_nullable_column.h 0 2 00.00% [163, 349]
🔵 be/src/cache/lrucache_engine.h 0 1 00.00% [26]
🔵 be/src/exec/partition/bucket_aware_partition.cpp 0 2 00.00% [44, 45]
🔵 be/src/exprs/agg/exchange_perf.h 0 2 00.00% [68, 88]
🔵 be/src/storage/convert_helper.cpp 0 1 00.00% [1577]
🔵 be/src/exprs/agg/covariance.h 2 6 33.33% [181, 182, 184, 185]
🔵 be/src/exprs/agg/window_funnel.h 5 14 35.71% [465, 498, 502, 524, 529, 531, 559, 561, 563]
🔵 be/src/storage/rowset/column_writer.cpp 1 2 50.00% [1056]
🔵 be/src/exprs/agg/intersect_count.h 1 2 50.00% [152]
🔵 be/src/exprs/agg/percentile_cont.h 7 13 53.85% [274, 309, 311, 314, 548, 640]
🔵 be/src/exprs/agg/avg.h 9 15 60.00% [106, 108, 110, 199, 201, 203]
🔵 be/src/exprs/agg/retention.h 4 6 66.67% [65, 75]
🔵 be/src/column/stream_chunk.cpp 2 3 66.67% [104]
🔵 be/src/util/json_flattener.cpp 2 3 66.67% [160]
🔵 be/src/storage/rowset/column_iterator.cpp 2 3 66.67% [63]
🔵 be/src/exprs/agg/ds_theta_count_distinct.h 2 3 66.67% [86]
🔵 be/src/exprs/agg/hll_ndv.h 2 3 66.67% [86]
🔵 be/src/exprs/agg/ds_hll_count_distinct.h 2 3 66.67% [92]
🔵 be/src/column/column_helper.h 12 17 70.59% [135, 136, 662, 664, 671]
🔵 be/src/exprs/agg/boolor.h 6 8 75.00% [141, 162]
🔵 be/src/column/object_column.h 3 4 75.00% [45]
🔵 be/src/exprs/agg/bitmap_agg.h 3 4 75.00% [89]
🔵 be/src/exprs/bitmap_functions.cpp 6 8 75.00% [434, 435]
🔵 be/src/column/map_column.cpp 46 56 82.14% [81, 82, 83, 207, 572, 575, 578, 579, 623, 624]
🔵 be/src/exprs/agg/percentile_approx.h 5 6 83.33% [136]
🔵 be/src/exec/pipeline/table_function_operator.cpp 5 6 83.33% [218]
🔵 be/src/exec/parquet_reader.cpp 6 7 85.71% [198]
🔵 be/src/exprs/binary_function.h 29 33 87.88% [101, 102, 103, 104]
🔵 be/src/exprs/agg/sum.h 8 9 88.89% [71]
🔵 be/src/column/fixed_length_column_base.cpp 101 114 88.60% [338, 341, 342, 353, 357, 360, 361, 365, 406, 410, 414, 415, 449]
🔵 be/src/column/fixed_length_column_base.h 54 60 90.00% [70, 71, 124, 125, 126, 127]
🔵 be/src/exprs/runtime_filter.h 20 22 90.91% [869, 871]
🔵 be/src/exprs/string_functions.cpp 20 22 90.91% [2063, 2065]
🔵 be/src/column/nullable_column.cpp 21 23 91.30% [447, 470]
🔵 be/src/column/array_column.cpp 66 67 98.51% [553]
🔵 be/src/exprs/array_functions.tpp 63 64 98.44% [1668]
🔵 be/src/exec/stream/aggregate/stream_aggregator.cpp 1 1 100.00% []
🔵 be/src/exprs/agg/map_agg.h 3 3 100.00% []
🔵 be/src/storage/column_predicate_dict_conjuct.cpp 1 1 100.00% []
🔵 be/src/formats/csv/decimalv3_converter.cpp 2 2 100.00% []
🔵 be/src/exprs/agg/stream/retract_maxmin.h 2 2 100.00% []
🔵 be/src/formats/csv/boolean_converter.cpp 3 3 100.00% []
🔵 be/src/formats/csv/decimalv2_converter.cpp 1 1 100.00% []
🔵 be/src/exprs/array_map_expr.cpp 3 3 100.00% []
🔵 be/src/column/nullable_column.h 2 2 100.00% []
🔵 be/src/cache/object_cache/page_cache.h 1 1 100.00% []
🔵 be/src/udf/java/java_native_method.cpp 3 3 100.00% []
🔵 be/src/formats/csv/array_converter.cpp 6 6 100.00% []
🔵 be/src/exprs/agg/approx_top_k.h 7 7 100.00% []
🔵 be/src/udf/java/java_data_converter.cpp 6 6 100.00% []
🔵 be/src/storage/rowset/dictcode_column_iterator.cpp 2 2 100.00% []
🔵 be/src/runtime/global_dict/decoder.cpp 4 4 100.00% []
🔵 be/src/exprs/agg/array_agg.h 3 3 100.00% []
🔵 be/src/formats/parquet/level_builder.cpp 6 6 100.00% []
🔵 be/src/formats/orc/orc_file_writer.cpp 2 2 100.00% []
🔵 be/src/storage/rowset/binary_plain_page.cpp 3 3 100.00% []
🔵 be/src/exprs/agg/group_concat.h 1 1 100.00% []
🔵 be/src/storage/column_aggregate_func.cpp 1 1 100.00% []
🔵 be/src/exec/chunks_sorter.h 1 1 100.00% []
🔵 be/src/exprs/map_element_expr.cpp 8 8 100.00% []
🔵 be/src/exec/spill/spill_components.cpp 1 1 100.00% []
🔵 be/src/column/decimalv3_column.cpp 4 4 100.00% []
🔵 be/src/exec/join/join_hash_map.tpp 14 14 100.00% []
🔵 be/src/exprs/agg/nullable_aggregate.h 8 8 100.00% []
🔵 be/src/column/array_view_column.cpp 22 22 100.00% []
🔵 be/src/exec/sorted_streaming_aggregator.cpp 2 2 100.00% []
🔵 be/src/exec/tablet_sink.cpp 2 2 100.00% []
🔵 be/src/storage/rowset/bitshuffle_page.h 2 2 100.00% []
🔵 be/src/runtime/agg_state_desc.h 1 1 100.00% []
🔵 be/src/exprs/table_function/subdivide_bitmap.h 3 3 100.00% []
🔵 be/src/exprs/agg/distinct.h 6 6 100.00% []
🔵 be/src/exec/aggregate/compress_serializer.cpp 1 1 100.00% []
🔵 be/src/storage/rowset/parsed_page.cpp 2 2 100.00% []
🔵 be/src/exec/partition/partition_hash_map.h 4 4 100.00% []
🔵 be/src/storage/rowset/page_decoder.h 1 1 100.00% []
🔵 be/src/exprs/agg/array_union_agg.h 1 1 100.00% []
🔵 be/src/exprs/map_apply_expr.cpp 1 1 100.00% []
🔵 be/src/exprs/agg/maxmin_by.h 10 10 100.00% []
🔵 be/src/exec/join/join_hash_map_method.hpp 16 16 100.00% []
🔵 be/src/util/arrow/starrocks_column_to_arrow.cpp 12 12 100.00% []
🔵 be/src/formats/csv/float_converter.cpp 1 1 100.00% []
🔵 be/src/storage/column_in_predicate.cpp 2 2 100.00% []
🔵 be/src/formats/csv/date_converter.cpp 1 1 100.00% []
🔵 be/src/exprs/decimal_cast_expr.h 1 1 100.00% []
🔵 be/src/formats/csv/numeric_converter.cpp 3 3 100.00% []
🔵 be/src/formats/csv/string_converter.cpp 4 4 100.00% []
🔵 be/src/exprs/agg/maxmin.h 4 4 100.00% []
🔵 be/src/exec/sorting/sort_permute.cpp 2 2 100.00% []
🔵 be/src/exprs/array_functions.cpp 42 42 100.00% []
🔵 be/src/storage/rowset/segment_iterator.cpp 3 3 100.00% []
🔵 be/src/exprs/utility_functions.cpp 5 5 100.00% []
🔵 be/src/exec/aggregate/agg_hash_set.h 4 4 100.00% []
🔵 be/src/column/container_resource.h 31 31 100.00% []
🔵 be/src/formats/parquet/scalar_column_reader.cpp 1 1 100.00% []
🔵 be/src/exprs/agg/aggregate_traits.h 1 1 100.00% []
🔵 be/src/exprs/function_helper.h 15 15 100.00% []
🔵 be/src/exec/join/join_key_constructor.h 15 15 100.00% []
🔵 be/src/column/column_helper.cpp 4 4 100.00% []
🔵 be/src/exprs/function_helper.cpp 5 5 100.00% []
🔵 be/src/exprs/agg/count.h 2 2 100.00% []
🔵 be/src/formats/csv/datetime_converter.cpp 1 1 100.00% []
🔵 be/src/formats/csv/map_converter.cpp 3 3 100.00% []
🔵 be/src/exec/sorting/compare_column.cpp 11 11 100.00% []
🔵 be/src/exec/chunks_sorter_heap_sort.cpp 1 1 100.00% []
🔵 be/src/formats/parquet/encoding_dict.h 1 1 100.00% []
🔵 be/src/exec/join/join_key_constructor.hpp 9 9 100.00% []
🔵 be/src/column/column.h 1 1 100.00% []
🔵 be/src/exec/sorting/merge_column.cpp 2 2 100.00% []
🔵 be/src/exec/sorting/sort_column.cpp 13 13 100.00% []
🔵 be/src/formats/csv/nullable_converter.cpp 4 4 100.00% []
🔵 be/src/column/column.cpp 1 1 100.00% []
🔵 be/src/exec/join/join_key_constructor.cpp 6 6 100.00% []
🔵 be/src/exprs/agg/combinator/agg_state_if.h 5 5 100.00% []
🔵 be/src/storage/column_predicate_rewriter.cpp 1 1 100.00% []
🔵 be/src/runtime/global_dict/parser.cpp 4 4 100.00% []
🔵 be/src/exec/aggregate/agg_hash_map.h 8 8 100.00% []
🔵 be/src/cache/object_cache/page_cache.cpp 8 8 100.00% []
🔵 be/src/exprs/math_functions.cpp 8 8 100.00% []
🔵 be/src/exprs/agg/bitmap_union_int.h 2 2 100.00% []
🔵 be/src/exprs/map_functions.cpp 16 16 100.00% []
🔵 be/src/exprs/min_max_predicate.h 3 3 100.00% []

Development

Successfully merging this pull request may close these issues.

Feature Request: Optimize StarRocks Read Process with Direct Page References
4 participants