Is it compatible with custom Core ML models?

Yes, the skill covers integration patterns for VNCoreMLRequest, allowing you to use custom-trained models within the Vision pipeline.

How does it handle complex document layouts?

It uses RecognizeDocumentsRequest (iOS 26+), which provides structured access to the paragraphs, tables, lists, and titles within a document.

Can I use this for real-time camera scanning?

Absolutely. It provides specific guidance for VisionKit's DataScannerViewController and for tuning recognition settings to keep up with real-time camera feeds.

Does it help with coordinate system conversion?

Yes, it includes logic for converting Vision's normalized coordinate system (origin bottom-left) to the UIKit/SwiftUI top-left origin system for UI overlays.
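
The conversion described above can be sketched as follows. This is a minimal illustration, not the skill's own code: `convertToTopLeft` is a hypothetical helper name, and it assumes the image fills the target view exactly (no aspect-fit or aspect-fill letterboxing, which would need an extra offset and scale).

```swift
import Foundation

/// Maps a normalized, bottom-left-origin rectangle (Vision's convention)
/// into a top-left-origin rectangle measured in points.
/// `convertToTopLeft` is a hypothetical helper name, not part of Vision.
func convertToTopLeft(_ normalized: CGRect, in viewSize: CGSize) -> CGRect {
    // Scale the normalized 0...1 values up to the view's dimensions.
    let width = normalized.size.width * viewSize.width
    let height = normalized.size.height * viewSize.height
    let x = normalized.origin.x * viewSize.width
    // Flip vertically: the rectangle's top edge sits at (origin.y + height)
    // in bottom-left space, so its distance from the top is 1 minus that.
    let y = (1 - normalized.origin.y - normalized.size.height) * viewSize.height
    return CGRect(x: x, y: y, width: width, height: height)
}

// Example: a box in the lower-middle of the image, shown in a 200 x 100 view.
let box = CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.25)
let overlay = convertToTopLeft(box, in: CGSize(width: 200, height: 100))
print(overlay.origin.x, overlay.origin.y, overlay.size.width, overlay.size.height)
```

With these inputs the box maps to an overlay frame at (50, 50) with size 100 × 25. Only the y-axis is flipped; x, width, and height are simply scaled.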

Does this skill support older iOS versions?

Yes, it includes code patterns for both the modern iOS 18+ API (ImageProcessingRequest) and legacy VNRequest patterns for backward compatibility.
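
One common way to support both API families is an availability check that picks a code path at runtime. The sketch below is an assumption about how such a switch might look, not the skill's actual implementation; `VisionBackend` and `selectVisionBackend` are hypothetical names, and the real request code would go where each case is handled.

```swift
/// Hypothetical marker for which Vision API family to call.
enum VisionBackend {
    case modern   // Swift-native request types (iOS 18+)
    case legacy   // VNRequest-based types for older systems
}

/// Picks the API family based on the OS version the app is running on.
func selectVisionBackend() -> VisionBackend {
    if #available(iOS 18.0, macOS 15.0, *) {
        return .modern
    } else {
        return .legacy
    }
}

let backend = selectVisionBackend()
switch backend {
case .modern:
    print("Using the Swift-native Vision API")
case .legacy:
    print("Falling back to VNRequest patterns")
}
```

Keeping the decision in one function means the rest of the app can depend on a single entry point and does not need to repeat the availability check.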

iOS Vision Framework Integration

Name: iOS Vision Framework Integration
Author: dpearson2699

GitHub stars: 506
Category: Mobile Development

Implements advanced computer vision capabilities including OCR, face detection, and object tracking for modern iOS applications using Swift 6.3.

This skill provides specialized patterns for integrating Apple's Vision framework into iOS 26+ applications, supporting both the modern Swift-native API and legacy VNRequest patterns. It streamlines the implementation of on-device machine learning tasks such as text recognition (OCR), barcode scanning, face detection, and image segmentation. Developers can leverage this skill to implement complex vision tasks like 3D body pose detection, object tracking, and structured document analysis, while ensuring proper coordinate system mapping and performance optimization for real-time camera feeds.

Key Features

01 Modern Swift-native Vision API implementation (iOS 18+)

02 VisionKit DataScannerViewController and Core ML integration

03 Advanced Image and Instance Segmentation for masking

04 506 GitHub stars

05 Real-time Face, Barcode, and Object detection/tracking

06 High-accuracy OCR and structured Document Scanning

Use Cases

01 Building a document scanner with table extraction and layout understanding

02 Adding on-device face detection or person-masking for privacy-focused photo apps

03 Implementing real-time barcode and QR code scanning for retail or logistics
