Rashampreet4114

Rationale for this change

This PR adds documentation for the PyIceberg ⇄ PyArrow data type mapping, in the form of a conversion table.
The Iceberg spec currently defines type mappings only for Avro, Parquet, and ORC; it has no equivalent table for Arrow.
This change fills that gap by providing a clear reference for Python developers working with PyIceberg and PyArrow.

The documentation:

  • Lists all supported PyIceberg → PyArrow conversions, derived from the _ConvertToArrowSchema visitor class.
  • Provides the natural PyArrow → PyIceberg reverse mapping for round-tripping.
  • Highlights important details: field IDs, field documentation metadata, decimal precision, large string/binary types, and timestamp precision and time zones.

Are these changes tested?

No. The change is documentation only, so no tests were added.

Are there any user-facing changes?

Yes.

  • New Markdown documentation: pyiceberg_pyarrow_mapping.md (with aligned tables and Python type classes).
  • No changes to runtime behavior; this PR is purely documentation.
