Does it support specific embedding models?

Yes, it provides optimized chunk size and token limit recommendations for models including text-embedding-3-small/large, Voyage-2, and Cohere embed-v3.

How does it help with retrieval errors?

It identifies 'warning signs' of bad chunking, such as answers being truncated or irrelevant content polluting results, and provides specific fixes for each symptom.
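As an illustration of what checking for such warning signs could look like, here is a minimal sketch. It is not the skill's own diagnostic logic: the function name `diagnose_chunks`, the two heuristics, and the `min_chars` threshold are all assumptions made for this example.

```python
def diagnose_chunks(chunks: list[str], min_chars: int = 50) -> dict[str, list[int]]:
    """Flag chunk indices that show simple symptoms of bad chunking.

    Hypothetical heuristics (assumptions, not the skill's rules):
    - a chunk that does not end in sentence punctuation may have been cut
      mid-sentence, which tends to produce truncated answers;
    - a very short chunk carries little context and may add noise to results.
    """
    report: dict[str, list[int]] = {"cut_mid_sentence": [], "too_short": []}
    for index, chunk in enumerate(chunks):
        stripped = chunk.strip()
        if stripped and not stripped.endswith((".", "!", "?")):
            report["cut_mid_sentence"].append(index)
        if len(stripped) < min_chars:
            report["too_short"].append(index)
    return report
```

A report like this only points at suspicious chunks; whether they actually hurt retrieval still has to be confirmed against real queries.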

What is the primary purpose of the Chunking Advisor skill?

It provides domain-specific guidance on how to break down large documents into smaller pieces (chunks) to maximize the effectiveness of Retrieval-Augmented Generation (RAG) systems.
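For readers new to chunking, the simplest form of the idea is a fixed-size window with overlap. The sketch below is a generic illustration, not code from the skill; `chunk_with_overlap` and its default sizes are assumptions, and it counts characters rather than tokens to stay dependency-free.

```python
def chunk_with_overlap(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a boundary appears in full in at least one chunk, as long as it is shorter than the overlap.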

Can I use this for source code or technical docs?

Absolutely. It includes specific strategies for technical content, recommending semantic chunking by function or class with specific token ranges to preserve syntax integrity.
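To make "chunking by function or class" concrete, the following sketch splits Python source along top-level definitions using the standard-library `ast` module. It is one possible approach under stated assumptions, not the skill's implementation: the function name `chunk_python_source` and the dictionary layout are hypothetical, and token ranges are not enforced here.

```python
import ast


def chunk_python_source(source: str) -> list[dict[str, str]]:
    """Return one chunk per top-level function or class in `source`.

    Each chunk holds the complete definition, so no chunk ends in the
    middle of a statement. Module-level code outside definitions is skipped,
    and decorator lines are not included in the extracted text.
    """
    tree = ast.parse(source)
    chunks: list[dict[str, str]] = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(
                {
                    "name": node.name,
                    "kind": type(node).__name__,
                    # Exact source text of the definition, body included.
                    "text": ast.get_source_segment(source, node) or "",
                }
            )
    return chunks
```

A class that exceeds the target chunk size could be split further by applying the same idea to its methods.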

Chunking Advisor

Name: Chunking Advisor
Author: davicqueiroz


Data Science and ML

Optimizes RAG pipeline performance by recommending tailored document chunking strategies based on content type and embedding models.

The Chunking Advisor skill for Claude Code streamlines the preparation of datasets for Retrieval-Augmented Generation (RAG) by providing expert guidance on document segmentation. It analyzes specific content types—ranging from source code and legal contracts to structured tables—to suggest optimal chunk sizes, overlap ratios, and separator hierarchies. By tailoring recommendations to specific embedding models like OpenAI's text-embedding-3 or Voyage-2, this skill helps developers resolve common retrieval issues such as lost context, excessive noise, or truncated answers, ensuring high-precision search results in AI applications.
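A "separator hierarchy" can be read as: try the coarsest boundary first and fall back to finer ones only when a piece is still too large. The sketch below illustrates that idea in plain Python. The function name `split_by_hierarchy`, the default separators, and the character-based limit are assumptions for this example, not values recommended by the skill.

```python
def split_by_hierarchy(
    text: str,
    max_chars: int = 200,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Split on the coarsest separator first, recursing into finer ones.

    Separators that fall on a chunk boundary are dropped from the output.
    """
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to hard cuts.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the same chunk
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) > max_chars:
            # This piece alone is too big: retry with finer separators.
            chunks.extend(split_by_hierarchy(piece, max_chars, finer))
            current = ""
        else:
            current = piece
    if current.strip():
        chunks.append(current)
    return chunks
```

Paragraph breaks are preferred over line breaks, and line breaks over spaces, so chunks tend to end at the most natural boundary that still fits the limit.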

Key Features

01 Model-specific optimization parameters for OpenAI, Voyage, and Cohere embedding models.

02 Ready-to-use Python implementation snippets using LangChain and custom NLP logic.

03 Diagnostic checklists to identify and fix common retrieval pitfalls like 'boundary loss' and 'chunk noise'.

04 Context-aware decision tree for selecting strategies like semantic, hierarchical, or fixed-size chunking.

05 Automatic overlap calculation based on document sensitivity and narrative flow.
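The listing does not publish the decision tree itself, so the sketch below only shows the general shape such a selector might take. Every rule in it is a hypothetical placeholder, apart from the pairing of source code with semantic chunking, which the FAQ on this page mentions. The names `pick_strategy`, `content_type`, and `has_headings` are assumptions.

```python
def pick_strategy(content_type: str, has_headings: bool = False) -> str:
    """Pick a chunking strategy from coarse document traits.

    Hypothetical rules for illustration only; the skill's actual
    decision tree is not documented on this page.
    """
    if content_type == "code":
        # Source code: split along functions and classes.
        return "semantic"
    if has_headings:
        # Assumption: documents with a heading structure keep it.
        return "hierarchical"
    # Assumption: unstructured text falls back to fixed-size windows.
    return "fixed-size"
```

A real selector would likely weigh more inputs, such as the embedding model's token limit and the kinds of queries the system must answer.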

Use Cases

01 Migrating between embedding models that have different token limits and optimal density requirements.

02 Configuring a new RAG system for diverse datasets including PDFs, Markdown, and technical documentation.

03 Troubleshooting poor retrieval quality where search results are irrelevant or lack necessary context.
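For the migration use case, a first step is often to find existing chunks that no longer fit the new model's input limit. The sketch below assumes the caller supplies that limit from the vendor's documentation. The names `approx_token_count` and `oversized_chunks` are hypothetical, and the four-characters-per-token estimate is a rough rule of thumb for English text, not an exact tokenizer.

```python
from typing import Callable


def approx_token_count(text: str) -> int:
    """Rough estimate: about four characters per token (an assumption)."""
    return (len(text) + 3) // 4


def oversized_chunks(
    chunks: list[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> list[int]:
    """Return the indices of chunks whose token count exceeds `max_tokens`."""
    return [i for i, chunk in enumerate(chunks) if count_tokens(chunk) > max_tokens]
```

Passing the model's real tokenizer as `count_tokens` gives exact counts; the flagged chunks are the ones to re-split before re-embedding.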


Install with 🐟 Skill.Fish

npx skillfish add davicqueiroz/claude-rag-skills chunking-advisor

The skill can also be downloaded for use in Claude.ai and ChatGPT.