How much can I save with Claude prompt caching?

Users can typically see a reduction in latency of up to 85% and a reduction in costs by up to 90% for content that is repeatedly read from the cache.

Which Claude models support this skill?

Prompt caching is supported on the latest generation of models, including Claude 3.5 Sonnet, Claude 3.5 Haiku, and Claude 3 Opus.

What is a cache breakpoint?

A cache breakpoint is a flag added to your API request that tells Claude exactly which part of the prompt (like a large document or a list of tools) should be stored in memory for future use.

Is there a minimum token limit for caching?

Yes, caching generally requires a minimum of 1024 to 4096 tokens depending on the model, making it ideal for large system prompts and long documents.

How long does a cached prompt last?

The default TTL (Time-To-Live) is 5 minutes, which is perfect for interactive chat. For batch processing, you can extend the TTL to 1 hour.

Claude Prompt Caching

Name: Claude Prompt Caching
Author: Lobbi-Docs

byLobbi-Docs

0•

Developer Tools

Optimizes API performance and reduces operational costs by caching static prompt segments and conversation history.

The Prompt Caching skill enables developers to leverage Anthropic's sophisticated caching mechanisms to achieve up to an 85% reduction in latency and a 90% reduction in API costs. By strategically implementing cache breakpoints within tool definitions, system prompts, and message histories, this skill allows for the persistence of large contexts like RAG documents, complex codebases, and long-running interactive sessions. It provides standardized implementation patterns and monitoring utilities to ensure your Claude integrations are both lightning-fast and highly cost-efficient.

Key Features

01Reduces latency by up to 85% for repeated content segments

02Cuts API costs by up to 90% with discounted cache-read pricing

03Built-in utilities for monitoring cache hit rates and calculating actual cost savings

040 GitHub stars

05Supports flexible TTL options for interactive sessions or batch processing

06Strategic breakpoint placement for tools, system instructions, and messages

Use Cases

01Maintaining context across long, multi-turn AI conversations without re-sending history

02Executing high-volume batch processing tasks that share a common prefix or context

03Analyzing massive codebases or technical documentation within RAG systems

What are Skills?·How to Install

Install with 🐟 Skill.Fish

npx skillfish add lobbi-docs/claude prompt-caching

For use in Claude.ai and ChatGPT

Download Skill