About
ExStruct is a Python library and CLI tool that extracts structured data from Excel workbooks. It parses cells, identifies table candidates, and captures shapes, charts, SmartArt, merged cell ranges, print areas, and auto page-break areas. Three output modes (light, standard, verbose) and tunable table-detection heuristics control how much is extracted, which makes ExStruct well suited to preparing complex Excel data for large language model (LLM) and retrieval-augmented generation (RAG) applications.