Implements advanced computer vision capabilities including OCR, face detection, and object tracking for modern iOS applications using Swift 6.3.
This skill provides specialized patterns for integrating Apple's Vision framework into iOS 26+ applications, supporting both the modern Swift-native API and legacy VNRequest patterns. It streamlines the implementation of on-device machine learning tasks such as text recognition (OCR), barcode scanning, face detection, and image segmentation. Developers can leverage this skill to implement complex vision tasks like 3D body pose detection, object tracking, and structured document analysis, while ensuring proper coordinate system mapping and performance optimization for real-time camera feeds.
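The coordinate system mapping mentioned above can be sketched as follows. Vision reports bounding boxes in normalized coordinates (0 to 1) with the origin at the bottom-left, while UIKit views use a top-left origin in points. The helper name `convertToTopLeft` and its parameters are hypothetical, chosen for illustration. This minimal sketch assumes the image fills the target area exactly, and it ignores image orientation and aspect-fit or aspect-fill scaling.

```swift
import Foundation

/// Converts a normalized, bottom-left-origin rect (as Vision reports it)
/// into a top-left-origin rect scaled to `size`.
/// Hypothetical helper: assumes the image fills `size` exactly, with no
/// rotation and no letterboxing.
func convertToTopLeft(normalized: CGRect, in size: CGSize) -> CGRect {
    // Scale the normalized extent to the target size.
    let width = normalized.size.width * size.width
    let height = normalized.size.height * size.height
    let x = normalized.origin.x * size.width
    // Flip the Y axis: the rect's top edge in Vision space sits at
    // origin.y + height, measured up from the bottom.
    let y = (1 - normalized.origin.y - normalized.size.height) * size.height
    return CGRect(x: x, y: y, width: width, height: height)
}
```

For a live camera preview, the preview layer's own conversion methods are usually a better fit than manual math, because they account for video gravity and orientation.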
Key Features
01 Modern Swift-native Vision API implementation (iOS 18+)
02 VisionKit DataScannerViewController and Core ML integration
03 Advanced Image and Instance Segmentation for masking
04 506 GitHub stars
05 Real-time Face, Barcode, and Object detection/tracking
06 High-accuracy OCR and structured Document Scanning
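As an illustration of the OCR capability listed above, here is a minimal sketch using the legacy `VNRequest` pattern. The function name `recognizeText(in:)` is hypothetical and is not taken from the skill itself. It assumes a `CGImage` is already available and runs synchronously, so it should be called off the main thread.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: returns the top recognized string for each
/// detected text region in `image`.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    // Favor accuracy over speed; use .fast for real-time camera feeds.
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true

    // perform(_:) blocks until the request has finished.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Each observation carries ranked candidates; keep the best one.
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Each observation also exposes a normalized `boundingBox`, which needs conversion before it can be drawn in a UIKit view.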
Use Cases
01 Building a document scanner with table extraction and layout understanding
02 Adding on-device face detection or person-masking for privacy-focused photo apps
03 Implementing real-time barcode and QR code scanning for retail or logistics