01 Extract plain text, HTML, or XML from any document
02 Retrieve all document metadata or specific fields (e.g., author, dates, page count)
03 Detect MIME type and natural language of document content
04 Fetch and extract content from remote documents via URL
05 List all supported MIME types and available Tika parsers
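
The capabilities listed above correspond closely to the REST endpoints that Apache Tika Server exposes. The sketch below is a minimal illustration, not this project's actual implementation: it assumes a Tika Server instance at `http://localhost:9998` (Tika Server's default port), and the names `TIKA_URL`, `build_request`, and `fetch_bytes` are hypothetical helpers introduced only for this example. Endpoint paths should be checked against the Tika Server version in use.

```python
import urllib.request

# Assumption: a Tika Server instance is reachable at this address.
TIKA_URL = "http://localhost:9998"

# Accept header that selects the output format of the /tika endpoint.
ACCEPT = {"text": "text/plain", "html": "text/html", "xml": "text/xml"}


def build_request(operation, body=None, fmt="text", base_url=TIKA_URL):
    """Build (but do not send) the HTTP request for one listed operation."""
    routes = {
        "extract": ("PUT", "/tika", {"Accept": ACCEPT[fmt]}),
        "metadata": ("PUT", "/meta", {"Accept": "application/json"}),
        "detect_mime": ("PUT", "/detect/stream", {}),
        "detect_language": ("PUT", "/language/stream", {}),
        "mime_types": ("GET", "/mime-types", {"Accept": "application/json"}),
        "parsers": ("GET", "/parsers", {"Accept": "application/json"}),
    }
    method, path, headers = routes[operation]
    return urllib.request.Request(
        base_url + path, data=body, headers=headers, method=method
    )


def fetch_bytes(url):
    """Download a remote document so its bytes can be passed as a request body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def send(request):
    """Send a prepared request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With a server running, `send(build_request("extract", body=fetch_bytes(url), fmt="html"))` would cover the remote-URL case by downloading the document first and then submitting its bytes. Keeping request construction separate from sending makes the endpoint mapping easy to inspect without network access.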