Provides standardized API patterns and implementation guidance for Meta's Segment Anything Model 3 (SAM3) across image and video tasks.
The SAM3 Image Segmentation skill enables Claude to provide expert-level guidance for implementing Segment Anything Model 3, a unified foundation model for promptable segmentation. It covers essential patterns for text prompts (open-vocabulary), geometric bounding boxes, and SAM1-style interactive point prompts. Developers can leverage this skill to implement complex computer vision workflows, including multi-GPU video tracking, efficient batched inference using the DataPoint API, and performance optimizations for modern GPU architectures.
Key Features
- Advanced video tracking implementation for consistent object identification across frames
- Efficient batched inference patterns using DataPoint and FindQueryLoaded structures
- Interactive mask refinement techniques using logit-based feedback loops
- Comprehensive patterns for text, box, and point-based promptable segmentation
- GPU optimization guidance including bfloat16 and TensorFloat32 configurations
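To make the interactive-refinement pattern concrete, here is a minimal sketch of the logit-feedback loop. The `Sam3Predictor` class below is a hypothetical mock, not the real SAM3 API; it imitates the segment-anything convention of `(point_coords, point_labels, mask_input)` arguments and `(mask, score, low_res_logits)` returns, where each round's logits are fed back as `mask_input` in the next round.

```python
class Sam3Predictor:
    """Mock predictor standing in for a real model (hypothetical API).

    Real SAM-family predictors return binary masks, quality scores, and
    low-resolution logits; feeding those logits back via `mask_input`
    lets the model refine its previous estimate instead of starting over.
    """

    def predict(self, point_coords, point_labels, mask_input=None):
        # Track how many refinement rounds have occurred via the logits dict.
        refinement = 0 if mask_input is None else mask_input["round"] + 1
        mask = {"points": list(point_coords), "round": refinement}
        score = min(0.5 + 0.1 * refinement, 0.99)  # toy "quality" score
        low_res_logits = {"round": refinement}
        return mask, score, low_res_logits


def refine(predictor, clicks, rounds=3):
    """Iteratively refine a mask: each round adds one more user click and
    passes the previous round's logits back into the predictor."""
    logits, mask, score = None, None, 0.0
    for i in range(1, rounds + 1):
        coords = [c for c, _ in clicks[:i]]
        labels = [l for _, l in clicks[:i]]
        mask, score, logits = predictor.predict(coords, labels, mask_input=logits)
    return mask, score


# Label 1 = foreground click, 0 = background click (segment-anything convention).
clicks = [((120, 80), 1), ((200, 150), 1), ((40, 40), 0)]
mask, score = refine(Sam3Predictor(), clicks)
```

The key design point is that refinement is stateful only through the logits: the caller accumulates clicks and replays the full prompt each round, which keeps the predictor itself stateless.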
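The bfloat16/TensorFloat32 guidance can be sketched as a standard PyTorch configuration fragment. These switches (`allow_tf32`, `torch.autocast`) are real PyTorch APIs; whether SAM3's reference code uses exactly this recipe is an assumption.

```python
import torch

# Allow TensorFloat32 matmuls and convolutions: a large speedup on
# Ampere-and-newer GPUs at slightly reduced precision.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# Run inference under bfloat16 autocast. bf16 keeps fp32's dynamic range,
# so unlike fp16 it rarely needs loss scaling or overflow handling.
with torch.autocast("cuda", dtype=torch.bfloat16):
    with torch.inference_mode():
        ...  # model forward passes / predictor calls go here
```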
Use Cases
- Building interactive annotation tools for medical, satellite, or consumer imagery
- Implementing zero-shot object detection and segmentation using natural language
- Developing high-performance video processing pipelines for automated tracking and masking
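The zero-shot, text-prompted use case follows a simple find-then-filter pattern. The `segment_by_text` helper, the `find` method, and the detection dictionaries below are hypothetical illustrations of that pattern, not the actual SAM3 interface.

```python
def segment_by_text(model, image, phrase, score_threshold=0.5):
    """Return every detected instance matching the open-vocabulary
    `phrase` whose confidence clears `score_threshold`."""
    detections = model.find(image, phrase)  # hypothetical model call
    return [d for d in detections if d["score"] >= score_threshold]


class MockModel:
    """Stand-in model emitting canned detections for the demo."""

    def find(self, image, phrase):
        # A real model would return per-instance masks and scores.
        return [
            {"phrase": phrase, "score": 0.92, "mask": "mask-a"},
            {"phrase": phrase, "score": 0.31, "mask": "mask-b"},
        ]


hits = segment_by_text(MockModel(), image=None, phrase="red car")
# Only the high-confidence instance survives the threshold.
```

Thresholding on the model's confidence score is what makes the pattern practical for automated pipelines: downstream tracking or masking stages only ever see instances the detector is reasonably sure about.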