Does this skill support command-line PDF tools?

Yes, it provides integration patterns for popular CLI tools like qpdf, pdftk, and poppler-utils for efficient batch processing directly from the terminal.
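
As a rough illustration of batch use, the sketch below drives qpdf from Python to merge every PDF in a folder. The function names `qpdf_merge_command` and `merge_batch` are hypothetical and not part of the skill. It assumes qpdf is installed and on PATH. The equivalent pdftk call would be `pdftk a.pdf b.pdf cat output merged.pdf`.

```python
import shutil
import subprocess
from pathlib import Path


def qpdf_merge_command(inputs, output):
    """Build the argument list for merging whole files with qpdf.

    Mirrors the shell form: qpdf --empty --pages a.pdf b.pdf -- merged.pdf
    """
    return ["qpdf", "--empty", "--pages", *map(str, inputs), "--", str(output)]


def merge_batch(folder, output):
    """Merge every PDF in `folder` (sorted by file name) into `output`.

    Hypothetical helper: write `output` outside `folder`, otherwise a
    second run would pick up the previous result as an input.
    """
    inputs = sorted(Path(folder).glob("*.pdf"))
    if not inputs:
        raise ValueError(f"no PDF files found in {folder}")
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf is not installed or not on PATH")
    # check=True raises CalledProcessError if qpdf exits with an error code.
    subprocess.run(qpdf_merge_command(inputs, output), check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.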

Which Python libraries are supported for PDF creation?

This skill primarily utilizes ReportLab for creating new PDFs, offering both basic canvas drawing and complex layout tools like Platypus for multi-page documents.
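
A minimal sketch of the Platypus approach might look like the following. The helpers `safe_markup` and `build_report` are illustrative names, not part of the skill. It assumes ReportLab is installed. Because `Paragraph` parses XML-like markup, plain text is escaped first.

```python
from xml.sax.saxutils import escape


def safe_markup(text):
    """Escape &, < and > so plain text is not read as Paragraph markup."""
    return escape(text)


def build_report(path, title, paragraphs):
    """Write a multi-page PDF with a title followed by body paragraphs.

    Hypothetical helper; ReportLab is imported lazily so this module
    can still be loaded on machines where it is not installed.
    """
    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

    styles = getSampleStyleSheet()
    story = [Paragraph(safe_markup(title), styles["Title"]), Spacer(1, 12)]
    for text in paragraphs:
        story.append(Paragraph(safe_markup(text), styles["BodyText"]))
        story.append(Spacer(1, 6))
    # Platypus flows the story across as many pages as it needs.
    SimpleDocTemplate(str(path), pagesize=letter).build(story)
```

For one-off drawings at fixed coordinates, the lower-level `reportlab.pdfgen.canvas` API is usually the simpler choice. Platypus is the better fit when content length is not known in advance.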

Can I use this to extract tables into a spreadsheet?

Absolutely. The skill uses the pdfplumber library to identify tables and provides patterns to convert them into Pandas DataFrames for easy export to Excel or CSV.
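
The table-to-DataFrame pattern could be sketched as below. The function names are hypothetical. The sketch assumes the first non-empty row of each table is the header, which will not hold for every PDF. It needs pdfplumber and pandas installed.

```python
def clean_table(table):
    """Split a pdfplumber table (a list of rows) into (header, body).

    pdfplumber returns None for empty cells, so cells are normalised
    to stripped strings and fully empty rows are dropped.
    Assumption: the first remaining row is the header.
    """
    rows = [[(cell or "").strip() for cell in row] for row in table]
    rows = [row for row in rows if any(row)]
    if not rows:
        return [], []
    return rows[0], rows[1:]


def pdf_tables_to_frames(pdf_path):
    """Return one pandas DataFrame per table detected in the PDF."""
    import pandas as pd
    import pdfplumber

    frames = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                header, body = clean_table(table)
                if header:
                    frames.append(pd.DataFrame(body, columns=header))
    return frames
```

Each resulting frame can then be saved with `frame.to_csv("table.csv", index=False)`, or with `frame.to_excel("table.xlsx", index=False)` if an Excel engine such as openpyxl is available.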

Can this skill handle scanned PDF documents?

Yes, the skill includes guidance on using pytesseract and pdf2image to perform OCR (Optical Character Recognition) on scanned images within PDFs to retrieve text.
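
One way the OCR flow might look is to render each page to an image with pdf2image and pass it to pytesseract. The names `ocr_pdf` and `join_pages` are hypothetical, as are the default `dpi` and `lang` values. pdf2image needs Poppler, and pytesseract needs the Tesseract binary installed.

```python
def ocr_pdf(pdf_path, dpi=300, lang="eng"):
    """Return a list with the recognised text of each page.

    Hypothetical helper; third-party imports are deferred so the
    module loads even where the OCR stack is not installed.
    """
    import pytesseract
    from pdf2image import convert_from_path

    # Each page is rasterised to a PIL image before recognition.
    images = convert_from_path(str(pdf_path), dpi=dpi)
    return [pytesseract.image_to_string(image, lang=lang) for image in images]


def join_pages(page_texts):
    """Combine per-page text into one string with page markers."""
    parts = []
    for number, text in enumerate(page_texts, start=1):
        parts.append(f"--- page {number} ---\n{text.strip()}")
    return "\n\n".join(parts)
```

A higher `dpi` generally helps with small print, at the cost of slower rendering and more memory.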

PDF Processing Toolkit

Name: PDF Processing Toolkit
Author: cdeistopened

Content Management

Manipulates, generates, and extracts data from PDF documents using a suite of Python libraries and command-line tools.

The PDF Processing Skill gives Claude a toolkit for programmatic document management and automated data extraction. It covers text and table extraction, document merging and splitting, metadata manipulation, and the creation of new PDF files from scratch. Built on pypdf, pdfplumber, and reportlab, it lets Claude automate document workflows, handle scanned PDFs via OCR, and fill in forms, which makes it useful for data processing, administrative automation, and reporting.
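
To illustrate the splitting side of this, here is a hedged sketch using pypdf that cuts a document into fixed-size chunks. The helper names and the output file naming scheme are assumptions made for this example, not the skill's own code.

```python
from pathlib import Path


def chunk_ranges(page_count, chunk_size):
    """Return (start, stop) page index pairs, with stop exclusive."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, page_count))
        for start in range(0, page_count, chunk_size)
    ]


def split_pdf(src, out_dir, chunk_size):
    """Write `src` out as several PDFs of at most `chunk_size` pages each."""
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(str(src))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for start, stop in chunk_ranges(len(reader.pages), chunk_size):
        writer = PdfWriter()
        for index in range(start, stop):
            writer.add_page(reader.pages[index])
        # File names use 1-based page numbers, e.g. report_1-10.pdf
        target = out_dir / f"{Path(src).stem}_{start + 1}-{stop}.pdf"
        with open(target, "wb") as handle:
            writer.write(handle)
        outputs.append(target)
    return outputs
```

Merging works the other way round: add the pages of several readers to a single `PdfWriter` and write it once.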

Key Features

01 Programmatic PDF generation and multi-page report creation

02 OCR capabilities for extracting content from scanned documents

03 Security features including password protection and metadata management

04 Advanced text and table extraction with layout preservation

05 Document manipulation including merging, splitting, and rotation

GitHub stars: 1

Use Cases

01 Generating custom-branded PDF reports, invoices, and certificates dynamically from application data

02 Automating the extraction of financial data from PDF invoices into structured formats like Excel or CSV

03 Batch processing document archives to merge related files, split chapters, or add watermarks

Install with 🐟 Skill.Fish

npx skillfish add cdeistopened/skill-stack pdf

For use in Claude.ai and ChatGPT