
Conversation

Contributor

@fvaleye fvaleye commented Aug 21, 2025

Which issue does this PR close?

What changes are included in this PR?

Implement a physical execution repartition node that selects the appropriate DataFusion partitioning strategy based on the Iceberg table schema and metadata.

  • Implement hash partitioning for partitioned/bucketed tables
  • Use round-robin partitioning for unpartitioned tables
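The two cases above can be sketched roughly as follows. This is a minimal illustration using hypothetical stand-in types (`TableInfo`, `Strategy`), not the actual iceberg-rust or DataFusion APIs:

```rust
/// Hypothetical stand-in for the table metadata the node inspects.
struct TableInfo {
    /// Names of columns appearing in the partition spec.
    partition_columns: Vec<String>,
}

/// Simplified stand-in for DataFusion's partitioning choice.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// Hash-partition on the given columns.
    Hash(Vec<String>),
    /// Distribute batches evenly across output partitions.
    RoundRobin,
}

/// Hash when the table is partitioned/bucketed; otherwise fall back to
/// round-robin for even load distribution.
fn determine_strategy(table: &TableInfo) -> Strategy {
    if table.partition_columns.is_empty() {
        Strategy::RoundRobin
    } else {
        Strategy::Hash(table.partition_columns.clone())
    }
}
```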

Minor change: I created a new schema_ref() helper method.

Are these changes tested?

Yes, with unit tests

…the best partition strategy for Iceberg for writing

- Implement hash partitioning for partitioned/bucketed tables
- Use round-robin partitioning for unpartitioned tables
- Support range distribution mode approximation via sort columns
@fvaleye fvaleye changed the title feat(datafusion): implement repartition node for DataFusion with feat(datafusion): implement the partitioning node for DataFusion to define the partitioning Aug 21, 2025
///
/// If no suitable hash columns are found (e.g., unpartitioned, non-bucketed table),
/// falls back to round-robin batch partitioning for even load distribution.
fn determine_partitioning_strategy(
Contributor

This is interesting to see. At first I thought there were just two cases:

  1. If it's a partitioned table, we should just hash partition.
  2. If it's not partitioned, we should just use round-robin partitioning.

However, this reminds me of another case: range-only partitioning, e.g. where we only have partitions like date or time. I think in this case we should also use round-robin partitioning, since most data is concentrated in a few partitions.

Also, I don't think we should take write.distribution-mode into account for now. The examples you use are for Spark, and are not applicable to DataFusion.

Contributor Author

@fvaleye fvaleye Aug 25, 2025

However, this reminds me of another case: range-only partitioning, e.g. where we only have partitions like date or time. I think in this case we should also use round-robin partitioning, since most data is concentrated in a few partitions.

Hmm. You are right. Range partitions concentrate data in the most recent partitions, making hash partitioning counterproductive (e.g. a date column with a temporal partition transform).
Since DataFusion doesn't provide range partitioning, the fallback is round-robin rather than hash.

Briefly:

  • Hash partition: Only on bucket columns (partition spec + sort order)
  • Round-robin: Everything else (unpartitioned, range, identity, temporal transforms)
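That refined rule can be sketched with a simplified stand-in for Iceberg's partition transforms (the type and function names below are illustrative, not the real iceberg-rust types):

```rust
/// Simplified stand-in for Iceberg partition transforms.
#[allow(dead_code)]
enum Transform {
    Bucket(u32),
    Identity,
    /// Temporal transforms (year/month/day/hour) all behave alike here.
    Day,
}

#[derive(Debug, PartialEq)]
enum PartitionStrategy {
    Hash(Vec<String>),
    RoundRobin,
}

/// Hash only on bucket-transformed columns; everything else
/// (unpartitioned, identity, temporal/range-like transforms) falls back
/// to round-robin, since those concentrate rows in a few partitions.
fn pick_strategy(spec: &[(String, Transform)]) -> PartitionStrategy {
    let bucket_cols: Vec<String> = spec
        .iter()
        .filter(|(_, t)| matches!(t, Transform::Bucket(_)))
        .map(|(name, _)| name.clone())
        .collect();
    if bucket_cols.is_empty() {
        PartitionStrategy::RoundRobin
    } else {
        PartitionStrategy::Hash(bucket_cols)
    }
}
```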

Also, I don't think we should take write.distribution-mode into account for now. The examples you use are for Spark, and are not applicable to DataFusion.

Oh, good point, I misunderstood this. I thought it was an iceberg-rust table property.

…robin for range partitions

Signed-off-by: Florian Valeye <florian.valeye@gmail.com>
…rtitioning strategy

Signed-off-by: Florian Valeye <florian.valeye@gmail.com>
@fvaleye fvaleye force-pushed the feature/implement-repartition-node-for-insert-into-datafusion branch from e8eb255 to 48ffd26 Compare August 29, 2025 12:10
let idx = input_schema
    .index_of(name)
    .map_err(|e| datafusion::error::DataFusionError::Plan(e.to_string()))?;
Ok(Arc::new(Column::new(name, idx)))
Contributor

Is this correct? I assume the project node should happen before repartition.

Contributor Author

Adding more information to the error log.

Contributor

What I'm saying is that when inserting into a partitioned table in Iceberg, the physical plan should look like

WriteExec
    |
RepartitionExec
    |
ProjectExec
    |
TableScanExec

Here ProjectExec adds an extra column called _partition, holding the partition value of each row. When RepartitionExec chooses hash partitioning, it should repartition on the _partition field rather than on the source columns.
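A toy sketch of that idea: the projection stage derives a `_partition` value per row (here, a day-truncated timestamp), and the repartition stage routes rows by hashing that derived value rather than the source column. The helper names below are hypothetical, not part of the actual plan nodes:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// What a ProjectExec-style node would compute per row as `_partition`:
/// here, a day transform over a timestamp in seconds.
fn partition_value(ts_seconds: i64) -> i64 {
    ts_seconds.div_euclid(86_400)
}

/// What a hash RepartitionExec would then do: route each row by hashing
/// the derived `_partition` value, so all rows belonging to one Iceberg
/// partition land in the same output stream.
fn target_stream(ts_seconds: i64, num_streams: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    partition_value(ts_seconds).hash(&mut hasher);
    (hasher.finish() as usize) % num_streams
}
```

Two timestamps falling on the same day produce the same `_partition` value and therefore hash to the same stream, which is exactly the property the source field alone would not guarantee.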

Contributor Author

Clearer, thanks!
Indeed, this part needs to be reworked when ProjectExec is merged.

Contributor

So we should resolve #1602 first?

Contributor Author

Yes!

@fvaleye fvaleye force-pushed the feature/implement-repartition-node-for-insert-into-datafusion branch from f24ba1a to 77ca951 Compare September 1, 2025 18:20
@fvaleye fvaleye force-pushed the feature/implement-repartition-node-for-insert-into-datafusion branch from 77ca951 to 1a8d65d Compare September 1, 2025 18:28

Successfully merging this pull request may close these issues.

Implement Repartition Node: Decide when the partitioning mode for the best parallelism