Provides comprehensive API documentation and implementation patterns for Apple's Vision framework across Apple platforms.
This skill serves as a specialized technical reference for integrating Apple's Vision framework into iOS, iPadOS, macOS, and visionOS applications. It offers detailed guidance on high-level computer vision tasks such as subject segmentation (subject lifting), 2D and 3D hand and body pose detection, face analysis, and OCR. By providing precise API signatures, coordinate system mappings, and performance best practices, it helps developers implement advanced visual features like gesture recognition, background removal, and document scanning within their Claude Code workflow.
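To illustrate the kind of coordinate system mapping mentioned above: Vision reports points and bounding boxes in a normalized space (0 to 1) with the origin at the lower-left corner, while UIKit and most pixel-buffer code use a top-left origin. The following is a minimal sketch of that conversion in plain Swift; the `NormalizedPoint` and `PixelPoint` types and the `toTopLeftPixels` function are hypothetical names used only for illustration and are not part of the Vision API.

```swift
// Hypothetical value types for illustration; Vision itself uses CGPoint.
struct NormalizedPoint {
    var x: Double  // 0...1, measured from the left edge
    var y: Double  // 0...1, measured from the BOTTOM edge (Vision convention)
}

struct PixelPoint: Equatable {
    var x: Double  // pixels from the left edge
    var y: Double  // pixels from the TOP edge (UIKit convention)
}

/// Converts a Vision-style normalized point (lower-left origin) into
/// pixel coordinates with a top-left origin.
func toTopLeftPixels(_ point: NormalizedPoint,
                     imageWidth: Double,
                     imageHeight: Double) -> PixelPoint {
    // Scale x directly; flip y because the two origins sit on opposite edges.
    PixelPoint(x: point.x * imageWidth,
               y: (1.0 - point.y) * imageHeight)
}

// Example: a landmark at 25% across and 75% up in a 400 x 200 image.
let mapped = toTopLeftPixels(NormalizedPoint(x: 0.25, y: 0.75),
                             imageWidth: 400, imageHeight: 200)
print(mapped)
```

In a real app the framework's own helper `VNImagePointForNormalizedPoint` handles the scaling step; the vertical flip still has to be applied whenever the destination view uses a top-left origin.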
Key Features
01 Detailed 21-point hand and 18-point body pose landmark mapping
02 233 GitHub stars
03 Reference for OCR using VNRecognizeTextRequest and barcode detection using VNDetectBarcodesRequest
04 Multi-person segmentation and hit-testing logic for pixel buffers
05 Compatibility with the latest OS releases, including 3D body pose and visionOS support
06 Implementation guides for VisionKit 'Subject Lifting' and instance masking
Use Cases
01 Building high-performance document scanners and OCR data extractors
02 Implementing gesture-based controls using real-time hand pose tracking
03 Creating photo editing tools with automatic subject extraction and segmentation
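As a rough illustration of the gesture-control use case, a pinch can be approximated by checking how close the thumb tip and index fingertip are in normalized coordinates. The sketch below is plain Swift and assumes the two joint locations have already been extracted from a hand pose observation; the `JointLocation` type, the `isPinching` function, and the 0.05 threshold are hypothetical choices for illustration, not values taken from the skill or from Apple's documentation.

```swift
import Foundation

// Hypothetical container for a joint's normalized (0...1) location.
struct JointLocation {
    var x: Double
    var y: Double
}

/// Returns true when the thumb tip and index fingertip are closer than
/// `threshold` in normalized image space. The default threshold is an
/// assumed starting value and would need tuning per camera setup.
func isPinching(thumbTip: JointLocation,
                indexTip: JointLocation,
                threshold: Double = 0.05) -> Bool {
    // Euclidean distance between the two fingertips.
    let distance = hypot(thumbTip.x - indexTip.x, thumbTip.y - indexTip.y)
    return distance < threshold
}

// Example: fingertips nearly touching versus far apart.
let touching = isPinching(thumbTip: JointLocation(x: 0.50, y: 0.50),
                          indexTip: JointLocation(x: 0.52, y: 0.51))
let apart = isPinching(thumbTip: JointLocation(x: 0.20, y: 0.20),
                       indexTip: JointLocation(x: 0.60, y: 0.60))
print(touching, apart)
```

A production implementation would typically also discard joints with low confidence and smooth the result over several frames to avoid flicker.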