What is the Computer Vision Expert skill for Claude Code?

It is a specialized capability that provides guidance on designing and implementing state-of-the-art vision pipelines, focusing on 2026 technologies like YOLO26 and Segment Anything 3.

Does this skill support edge hardware deployment?

Yes. It includes patterns for optimizing vision models for edge devices: exporting to ONNX and TensorRT, and choosing NPU-friendly architectures such as the NMS-free YOLO26.

Does it support text-based image interaction?

Yes, it leverages SAM 3's text-to-mask capabilities and VLMs like Florence-2 for visual grounding and natural language scene understanding.

Can it help with 3D scene reconstruction?

Absolutely. It covers Depth Anything V2 for monocular depth estimation and SAM 3D for reconstructing objects and scenes from visual inputs.
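As a rough illustration of how a monocular depth map becomes 3D geometry, the sketch below back-projects each pixel through a pinhole camera model. It is not taken from the skill itself: the function name `backproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for the example, and it assumes the depth values are already metric (relative depth predictions would need to be scaled first).

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Turn a depth map into camera-frame 3D points (pinhole model).

    depth[v][u] is the assumed metric depth at pixel column u, row v.
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth for this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with one invalid pixel (hypothetical values).
    demo_depth = [[2.0, 2.0], [0.0, 4.0]]
    for point in backproject_depth(demo_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(point)
```

In practice the intrinsics would come from camera calibration, and the resulting points could be passed to a point-cloud or meshing library of your choice.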

How does it handle real-time detection performance?

It emphasizes NMS-free architectures, which remove the non-maximum suppression post-processing step and so reduce inference latency, and the MuSGD optimizer, which is aimed at improving training convergence for real-time applications.
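To make concrete what an NMS-free detector leaves out, here is a minimal sketch of classic greedy non-maximum suppression, the post-processing step that conventional detectors run after inference. It is a generic textbook version in plain Python, not code from YOLO26 or from this skill; the names `iou` and `greedy_nms` and the 0.5 threshold are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corners


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,
) -> List[int]:
    """Return indices of boxes kept by greedy NMS, highest score first.

    A box is kept only if it does not overlap any already-kept box by
    more than iou_threshold. The pairwise comparisons are the extra,
    data-dependent work that an NMS-free detector avoids.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Two heavily overlapping boxes and one separate box (made-up values).
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    print(greedy_nms(demo_boxes, demo_scores))
```

Because the amount of work depends on how many candidate boxes overlap, this step adds variable latency and can be awkward to run on fixed-function accelerators, which is the motivation usually given for end-to-end, NMS-free designs.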

Computer Vision Expert (SOTA 2026)

Name: Computer Vision Expert (SOTA 2026)
Author: sickn33

By sickn33 • 31,722 GitHub stars • Data Science & Machine Learning

Implements and optimizes state-of-the-art vision systems using YOLO26, SAM 3, and advanced visual language models.

This skill transforms Claude into an advanced vision systems architect, providing expert guidance for designing high-performance detection, segmentation, and 3D spatial intelligence pipelines. It focuses on 2026 industry standards, including NMS-free YOLO26 architectures for low-latency edge deployment, text-to-mask segmentation with SAM 3, and complex visual reasoning using foundation models. Whether you are building autonomous robotics, industrial inspection tools, or real-time spatial analysis systems, this skill bridges the gap between classical geometric calibration and cutting-edge deep learning.

Key Features

01 Promptable text-to-mask segmentation and 3D reconstruction via SAM 3

02 End-to-end real-time object detection using YOLO26 NMS-free architectures

03 31,722 GitHub stars

04 Edge device optimization for ONNX, TensorRT, and specialized NPUs

05 Advanced visual reasoning and grounding using Vision Language Models (VLMs)

06 High-precision spatial analysis including monocular depth and sub-pixel calibration

Use Cases

01 Developing high-speed industrial inspection systems with low-latency edge deployment

02 Building autonomous robotics pipelines for real-time spatial mapping and navigation

03 Implementing zero-shot visual search and semantic scene understanding for media analysis

Install with 🐟 Skill.Fish

npx skillfish add sickn33/antigravity-awesome-skills computer-vision-expert

For use in Claude.ai and ChatGPT
