Implements path-independent, auto-invalidating file caching using SHA-256 content hashes to optimize expensive processing tasks.
This skill provides a robust implementation pattern for caching the results of resource-intensive file operations, such as PDF parsing, OCR, and image analysis. By utilizing SHA-256 content hashes as cache keys rather than file paths, it ensures that renamed or moved files still result in cache hits, while any modification to the file content triggers an automatic invalidation. It promotes clean architecture through a service layer wrapper that keeps processing functions pure and follows the Single Responsibility Principle.
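As a minimal sketch of the content-hash key described above, the function below hashes a file in fixed-size chunks so large inputs never need to be loaded into memory at once. The name `file_sha256` and the 64 KiB chunk size are illustrative assumptions, not part of the skill itself.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file's bytes (hypothetical helper).

    The file is read chunk_size bytes at a time, so memory use stays
    bounded regardless of file size.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the digest depends only on the bytes, two copies of the same file under different names produce the same key, while changing a single byte produces a different one.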
Key Features
01 0 GitHub stars
02 Service layer separation to maintain pure processing functions
03 Memory-efficient chunked hashing for large file processing
04 SHA-256 content-based indexing for path-independent caching
05 O(1) file-based lookup avoiding the need for a central index
06 Automatic cache invalidation triggered by file content changes
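One way these features could fit together is sketched below, assuming each cached result is stored as a JSON file named after the content hash. The names `process_with_cache`, `_content_key`, and the `cache_dir` layout are hypothetical and chosen for illustration; the processing function is passed in and stays unaware of the cache.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def _content_key(path: Path, chunk_size: int = 65536) -> str:
    """Chunked SHA-256 of the file contents (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def process_with_cache(
    path: Path,
    process: Callable[[Path], dict],
    cache_dir: Path,
    use_cache: bool = True,
) -> dict:
    """Service-layer wrapper: `process` stays pure, caching lives here.

    Assumes `process` returns a JSON-serializable dict.
    """
    if not use_cache:
        return process(path)

    # The hash is the file name, so a lookup is one path check with no index.
    entry = cache_dir / f"{_content_key(path)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))

    result = process(path)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry.write_text(json.dumps(result), encoding="utf-8")
    return result
```

In this sketch invalidation needs no explicit step: edited content hashes to a new file name, so the old entry is simply never looked up again. Stale entries do accumulate on disk, and cleaning them up is left out here.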
Use Cases
01 Optimizing document processing pipelines for PDF and text extraction
02 Building CLI tools that require efficient --cache/--no-cache functionality
03 Reducing latency in image analysis and metadata processing workflows
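For the CLI use case, a paired --cache/--no-cache flag can be sketched with the standard library's `argparse.BooleanOptionalAction` (available in Python 3.9 and later). The parser description, the `file` argument, and the default of caching enabled are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a paired --cache/--no-cache flag (illustrative)."""
    parser = argparse.ArgumentParser(
        description="Process a file, optionally reusing cached results."
    )
    parser.add_argument("file", help="path of the file to process")
    # BooleanOptionalAction generates both --cache and --no-cache from one option.
    parser.add_argument(
        "--cache",
        action=argparse.BooleanOptionalAction,
        default=True,  # assumption: caching is on unless --no-cache is given
        help="reuse cached results keyed by content hash",
    )
    return parser
```

The parsed `args.cache` boolean can then be handed to whatever service-layer function performs the lookup, so the processing code itself never needs to know about the flag.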