v0.6.0 Release Notes
This release adds a new best-effort libreoffice extraction mode for
non-COM environments and extends shape/chart metadata with provenance fields.
Highlights
- Added `mode="libreoffice"` across the Python API, CLI, and MCP server.
- Added early validation for `.xls` + `mode="libreoffice"` with a clear error.
- Added extraction-only validation for `mode="libreoffice"`:
  - rejects PDF/PNG rendering
  - rejects auto page-break export
- Added `FallbackReason.LIBREOFFICE_UNAVAILABLE` and `FallbackReason.LIBREOFFICE_PIPELINE_FAILED`.
- Added backend metadata to shapes/charts:
  - `provenance`
  - `approximation_level`
  - `confidence`
  - serialized output now keeps these fields opt-in via `include_backend_metadata`
- Added OOXML-based best-effort reconstruction for:
  - shapes
  - connectors
  - charts
- Added a LibreOffice runtime helper so server/Linux/macOS environments can opt into rich extraction without Excel COM.
- Added bundled bridge compatibility probing for LibreOffice Python runtime selection, including fail-fast handling for incompatible `EXSTRUCT_LIBREOFFICE_PYTHON_PATH` overrides.
- Added a required Linux GitHub Actions smoke job that installs LibreOffice and `python3-uno` and runs the `pytest.mark.libreoffice` sample smoke test.
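
The opt-in behavior of the backend metadata fields can be illustrated with a small sketch. This is not the library's serializer: `ShapeRecord` and `serialize` are hypothetical stand-ins, and the field values are placeholders. Only the three field names and the `include_backend_metadata` switch come from the notes above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional

# Field names taken from the release notes; everything else is illustrative.
BACKEND_METADATA_FIELDS = ("provenance", "approximation_level", "confidence")


@dataclass
class ShapeRecord:
    """Hypothetical stand-in for a shape/chart record (not the real model)."""

    id: int
    text: str
    provenance: Optional[str] = None
    approximation_level: Optional[str] = None
    confidence: Optional[float] = None


def serialize(shape: ShapeRecord, include_backend_metadata: bool = False) -> dict[str, Any]:
    """Return a dict, dropping backend metadata unless the caller opts in."""
    data = asdict(shape)
    if not include_backend_metadata:
        for name in BACKEND_METADATA_FIELDS:
            data.pop(name, None)
    return data


if __name__ == "__main__":
    shape = ShapeRecord(
        id=1,
        text="Start",
        provenance="example-backend",        # placeholder value
        approximation_level="example-level",  # placeholder value
        confidence=0.5,                       # placeholder value
    )
    print(serialize(shape))                                  # metadata omitted
    print(serialize(shape, include_backend_metadata=True))   # metadata kept
```

Keeping the fields opt-in means existing consumers of the serialized output see no new keys unless they ask for them.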
Notes
- `libreoffice` is available for `.xlsx`/`.xlsm` only.
- `libreoffice` is best-effort and not a strict subset of COM output.
- v1 does not add LibreOffice PDF/PNG rendering or auto page-break extraction.
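
The constraints in these notes amount to a small set of up-front checks. The sketch below shows one way such early validation could look. It is an assumption-laden illustration, not the project's code: `validate_libreoffice_request` and its keyword arguments (`render_pdf`, `render_images`, `auto_page_breaks`) are hypothetical names, and the real API may raise a different exception type.

```python
from pathlib import Path

# Per the notes: libreoffice mode supports .xlsx/.xlsm only.
SUPPORTED_SUFFIXES = {".xlsx", ".xlsm"}


def validate_libreoffice_request(
    path: str,
    mode: str,
    *,
    render_pdf: bool = False,        # hypothetical flag name
    render_images: bool = False,     # hypothetical flag name
    auto_page_breaks: bool = False,  # hypothetical flag name
) -> None:
    """Fail fast on combinations that libreoffice mode does not support."""
    if mode != "libreoffice":
        return  # other modes are out of scope for this sketch

    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"mode='libreoffice' supports .xlsx/.xlsm only, got {suffix!r}"
        )
    if render_pdf or render_images:
        raise ValueError("mode='libreoffice' is extraction-only: no PDF/PNG rendering")
    if auto_page_breaks:
        raise ValueError("mode='libreoffice' does not support auto page-break export")


if __name__ == "__main__":
    validate_libreoffice_request("book.xlsx", "libreoffice")  # passes silently
    try:
        validate_libreoffice_request("legacy.xls", "libreoffice")
    except ValueError as exc:
        print(exc)
```

Running the checks before any LibreOffice process starts keeps the error message specific and avoids a slower pipeline failure later.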