
Zip64 compatibility with MS Excel and DuckDB Excel #894

@FlipperPA


I'm trying to add 64-bit ZIP support to DuckDB's Excel extension, which uses minizip-ng to create the xlsx file (which is just a ZIP archive). This is my first time working with minizip-ng, so apologies in advance for anything obvious I may have missed.

My work-in-progress PR is here: https://github.com/duckdb/duckdb-excel/pull/64/files

With MZ_ZIP64_DISABLE, creating the xlsx file works well: results pulled from a database into the xlsx file are readable by several Excel readers I tested, including Microsoft Excel, Google Sheets, and Python polars. OpenPyXL can't read the file because it lacks xl/sharedStrings.xml, but that's a shortcoming of OpenPyXL.

When I compile with MZ_ZIP64_AUTO instead, it seems to create a Zip64 archive regardless of whether one is needed. The resulting file then fails to open in Microsoft Excel, with an error saying the file is invalid.

[Image: screenshot of the Microsoft Excel error]
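For reference on "whether it is needed": going by the ZIP format specification, Zip64 records are only required when a value no longer fits its classic field. Here is a minimal sketch of that check. The function and constant names are hypothetical and this is not minizip-ng code; treating a value equal to the all-ones sentinel as needing Zip64 is my reading of the spec.

```python
# Hypothetical helper, not part of minizip-ng: decides whether an entry or
# archive needs Zip64 records according to the classic ZIP field widths.
ZIP32_MAX = 0xFFFFFFFF        # 32-bit size/offset fields; also the Zip64 sentinel
ZIP32_MAX_ENTRIES = 0xFFFF    # 16-bit entry-count field; also a sentinel


def needs_zip64(uncompressed_size: int,
                compressed_size: int,
                header_offset: int,
                entry_count: int) -> bool:
    """Return True if any value cannot be stored in the classic fields."""
    # A value equal to the sentinel is ambiguous, so treat it as needing Zip64.
    too_large = any(
        value >= ZIP32_MAX
        for value in (uncompressed_size, compressed_size, header_offset)
    )
    too_many = entry_count >= ZIP32_MAX_ENTRIES
    return too_large or too_many


if __name__ == "__main__":
    # A small worksheet in a small archive: classic ZIP fields are enough.
    print(needs_zip64(5_000_000, 800_000, 0, 12))
    # A 5 GiB uncompressed sheet does not fit a 32-bit size field.
    print(needs_zip64(5 * 1024**3, 900_000_000, 0, 12))
```

A writer that streams data and only learns the sizes after compression cannot run this check before writing the local header, which may be relevant to what an automatic mode decides.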

Here is the head of the "bad" version:

$ head -c 64 new_export.xlsx | xxd
00000000: 504b 0304 2d00 0800 0800 0000 0000 0000 PK..-...........
00000010: 0000 ffff ffff ffff ffff 1800 1400 786c ..............xl
00000020: 2f77 6f72 6b73 6865 6574 732f 7368 6565 /worksheets/shee
00000030: 7431 2e78 6d6c 0100 1000 0000 0000 0000 t1.xml..........

Here is the head of the "good" version:

$ head -c 64 v1.3.0_export.xlsx | xxd
00000000: 504b 0304 1400 0800 0800 0000 0000 0000 PK..............
00000010: 0000 0000 0000 0000 0000 1800 0000 786c ..............xl
00000020: 2f77 6f72 6b73 6865 6574 732f 7368 6565 /worksheets/shee
00000030: 7431 2e78 6d6c bc9d 5f93 1c37 72ed 9faf t1.xml.._..7r...
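To make the difference between the two dumps easier to see, here is a small sketch that decodes the 30-byte local file header from each. The field layout follows the ZIP format specification; the helper name and dictionary keys are my own, and the byte strings are transcribed from the xxd output above.

```python
import struct

# First 64 bytes of each file, transcribed from the xxd dumps.
BAD_HEAD = bytes.fromhex(
    "504b03042d0008000800000000000000"
    "0000ffffffffffffffff18001400786c"
    "2f776f726b7368656574732f73686565"
    "74312e786d6c01001000000000000000"
)
GOOD_HEAD = bytes.fromhex(
    "504b0304140008000800000000000000"
    "0000000000000000000018000000786c"
    "2f776f726b7368656574732f73686565"
    "74312e786d6cbc9d5f931c3772ed9faf"
)

# Local file header: signature, version needed, flags, method, mod time,
# mod date, CRC-32, compressed size, uncompressed size, name length,
# extra field length (all little-endian, 30 bytes in total).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def parse_local_header(head: bytes) -> dict:
    """Decode the first local file header found at the start of `head`."""
    (sig, version, flags, method, _time, _date, _crc,
     comp_size, uncomp_size, name_len, extra_len) = LOCAL_HEADER.unpack_from(head, 0)
    if sig != b"PK\x03\x04":
        raise ValueError("not a local file header")
    name_end = LOCAL_HEADER.size + name_len
    first_extra_id = None
    # Only read the first extra-field header ID if it is inside the sample.
    if extra_len >= 4 and len(head) >= name_end + 4:
        first_extra_id, _size = struct.unpack_from("<HH", head, name_end)
    return {
        "version_needed": version,
        "data_descriptor": bool(flags & 0x0008),
        "method": method,
        "comp_size": comp_size,
        "uncomp_size": uncomp_size,
        "name": head[LOCAL_HEADER.size:name_end].decode("ascii"),
        "extra_len": extra_len,
        "first_extra_id": first_extra_id,
    }


if __name__ == "__main__":
    for label, head in (("bad", BAD_HEAD), ("good", GOOD_HEAD)):
        print(label, parse_local_header(head))
```

Both headers set the data descriptor flag and use deflate. The "bad" one additionally asks for version 4.5, stores 0xFFFFFFFF in both size fields, and carries a 20-byte extra field starting with header ID 0x0001 (the Zip64 extended information field), while the "good" one asks for version 2.0 and has no extra field. Whether that combination is what Excel rejects is not something this sketch establishes.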

Is there something I'm doing wrong while creating the file in zip_file.cpp? Thanks for your efforts creating this library, and for any insight you can send my way. It is much appreciated!

Labels: not a bug