Does this skill support scanned documents?

Yes, it includes OCR capabilities using Tesseract to extract text from image-based PDFs and scanned documents.

Why use this instead of Claude's built-in PDF reader?

Claude's built-in tools often time out on files larger than 5 MB. This skill handles very large files by extracting their text locally and splitting it into manageable chunks for analysis.

How do I handle a PDF with hundreds of pages?

Use the --split flag to divide the document into smaller chunks of pages, so Claude can process and summarize each chunk in sequence without overflowing its context window.
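To make the chunking idea concrete, here is a minimal sketch of how a page count could be divided into fixed-size page ranges. It is an illustration only, not the skill's actual implementation: the function name `page_chunks` and the default of 50 pages per chunk are assumptions chosen for the example.

```python
def page_chunks(total_pages: int, chunk_size: int = 50) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges covering a document.

    Hypothetical helper for illustration; the skill's own splitting logic may differ.
    """
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Step through the document chunk_size pages at a time; the last chunk
    # is clipped so it never runs past the final page.
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


if __name__ == "__main__":
    # A 230-page document yields four full chunks and one shorter final chunk.
    for first, last in page_chunks(230):
        print(f"pages {first}-{last}")
```

Each range can then be extracted and summarized on its own, which keeps any single request well below the context limit.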

Can I convert Microsoft Word files to Markdown?

Absolutely. The skill preserves document structure, including headings and tables, when converting DOCX files to Markdown.

Document Processor

Name: Document Processor
Author: adnanmueller

Content Management

Extracts and processes text from large PDF and DOCX files using OCR and structured Markdown conversion.

The Document Processor skill lets Claude Code handle large or complex documents that would otherwise exceed context limits or cause direct reads to time out. Using dedicated extraction scripts, it processes PDF and DOCX files of any size and offers page-by-page splitting, OCR for scanned images, and structure-preserving Markdown conversion. It suits developers and researchers who need to analyze technical documentation, research papers, or legal contracts while keeping the document structure intact and avoiding context overflow.

Key Features

01 Granular extraction of specific page ranges or individual pages

02 OCR support for scanned documents and image-based PDFs using Tesseract

03 Seamless integration with Obsidian and other Markdown-based note-taking workflows

04 Structural conversion of DOCX files into clean, usable Markdown format

05 Large file processing with page-by-page extraction and chunk splitting
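As an illustration of extracting specific page ranges or individual pages, the sketch below expands a page selection into a sorted list of page numbers. The selection syntax ("1-3, 7, 10-12") and the function name `parse_page_spec` are hypothetical and chosen only for this example; the skill's actual interface for selecting pages is not documented here and may differ.

```python
def parse_page_spec(spec: str, total_pages: int) -> list[int]:
    """Expand a selection such as "1-3, 7" into sorted, unique 1-based page numbers.

    Hypothetical helper for illustration; not part of the skill's documented interface.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas and surrounding whitespace
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
        else:
            first = last = int(part)
        # Reject ranges that are reversed or fall outside the document.
        if first < 1 or last > total_pages or first > last:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(first, last + 1))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_spec("1-3, 7, 10-12", total_pages=20))
```

Validating the selection against the page count up front means a bad range fails immediately rather than partway through a long extraction.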

Use Cases

01 Converting complex technical manuals or DOCX contracts into Markdown for project documentation

02 Extracting text from scanned research papers using OCR for AI-assisted summarization

03 Processing massive PDF documents in manageable 50-page chunks to avoid context window limits

Install with 🐟 Skill.Fish

npx skillfish add adnanmueller/am-dev-plugins document-processor

A downloadable version of the skill is available for use in Claude.ai and ChatGPT.