Discover Agent Skills for data science & ML. Browse 53 skills for Claude, ChatGPT & Codex.
Implements and optimizes Mamba-based Selective State Space Models for high-efficiency sequence modeling and long-context AI research.
Quantizes Large Language Models to 4-bit or 8-bit formats to reduce GPU memory usage by up to 75% with minimal accuracy loss.
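The listing doesn't pin a library; a minimal sketch of 4-bit loading via bitsandbytes through Hugging Face Transformers, where the model id is a placeholder:

```python
# Sketch: load a causal LM with 4-bit NF4 weights via bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 weight format
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed model id; any causal LM works
    quantization_config=bnb_config,
    device_map="auto",
)
```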
Implements Anthropic's Constitutional AI method to train harmless, helpful models through self-critique and automated AI feedback.
Optimizes large-scale AI model training using PyTorch Fully Sharded Data Parallelism for efficient memory management and scaling.
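A minimal FSDP sketch, assuming the script is launched with `torchrun` so that distributed initialization succeeds; the toy model and data are illustrative:

```python
# Sketch: wrap a module in FSDP to shard params, grads, and optimizer state.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")  # assumes launch via `torchrun --nproc_per_node=N`
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True).cuda()
model = FSDP(model)  # parameters are sharded across ranks

optim = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 32, 512, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optim.step()
```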
Implements language-independent subword tokenization using BPE and Unigram algorithms for advanced AI model development.
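The blurb doesn't name a library; SentencePiece is one natural fit for language-independent BPE/Unigram training (the corpus path and vocab size below are assumptions):

```python
import sentencepiece as spm

# Train a Unigram model directly on raw text (no pre-tokenization needed).
spm.SentencePieceTrainer.train(
    input="corpus.txt",          # placeholder path to a raw-text corpus
    model_prefix="spm_unigram",
    vocab_size=8000,
    model_type="unigram",        # or "bpe"
)
sp = spm.SentencePieceProcessor(model_file="spm_unigram.model")
print(sp.encode("language-independent subword tokenization", out_type=str))
```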
Implements and trains minimalist GPT architectures for educational and research purposes using Andrej Karpathy's clean, hackable codebase.
Transcribes audio, translates speech to English, and automates multilingual audio processing using OpenAI's Whisper models.
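A minimal sketch with the openai-whisper package; the audio file path is a placeholder:

```python
import whisper

model = whisper.load_model("base")
# transcribe() auto-detects the spoken language; pass task="translate"
# to emit English text instead of a same-language transcript.
result = model.transcribe("meeting.mp3")
print(result["text"])
```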
Serves Large Language Models with maximum throughput and efficiency using vLLM's PagedAttention and continuous batching.
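A minimal offline-batch sketch with vLLM's Python API; the model id is an assumption:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # assumed model id
params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Explain PagedAttention in one sentence."], params)
print(outputs[0].outputs[0].text)
```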
Deploys and optimizes LLM inference on CPU, Apple Silicon, and consumer hardware using GGUF quantization.
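One way to run a GGUF model locally is through the llama-cpp-python bindings; the model path is a placeholder for any quantized GGUF file:

```python
from llama_cpp import Llama

llm = Llama(model_path="models/mistral-7b-q4_k_m.gguf", n_ctx=4096)
out = llm("Q: Why quantize to GGUF? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```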
Facilitates mechanistic interpretability research by providing tools to inspect, cache, and manipulate transformer model activations via HookPoints.
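The HookPoint API described here is TransformerLens; a minimal activation-caching sketch:

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")
logits, cache = model.run_with_cache("The Eiffel Tower is in")
# Cached activations are keyed by HookPoint name:
resid = cache["blocks.0.hook_resid_post"]  # residual stream after block 0
print(resid.shape)                          # [batch, seq, d_model]
```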
Processes large-scale datasets for machine learning workloads using distributed streaming execution across CPU and GPU clusters.
Provides high-performance, Rust-optimized text tokenization for NLP research and production-grade machine learning pipelines.
Simplifies large language model alignment using reference-free preference optimization to improve model performance without the reference-model overhead of PPO or DPO.
Streamlines the fine-tuning process for over 100 large language models using the LLaMA-Factory framework and QLoRA techniques.
Optimizes LLM serving and structured generation using RadixAttention prefix caching for high-performance agentic workflows.
Tracks machine learning experiments and manages model lifecycles with real-time visualization and collaborative tools.
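The listing doesn't name a tracker; as one concrete possibility, a minimal MLflow sketch (the parameter and metric values are illustrative):

```python
import mlflow

with mlflow.start_run():
    mlflow.log_param("lr", 3e-4)
    for step in range(10):
        mlflow.log_metric("loss", 1.0 / (step + 1), step=step)
```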
Accelerates LLM inference by up to 3.6x using advanced decoding techniques such as Medusa heads and lookahead decoding.
Optimizes AI models for efficient local inference using the GGUF format and llama.cpp quantization techniques.
Implements Meta AI's foundation model for high-precision zero-shot image segmentation using points, boxes, and masks.
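A minimal point-prompt sketch with the official segment-anything package; the checkpoint must be downloaded separately, and the zero image below is a stand-in for a real RGB array:

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder HxWx3 RGB image
predictor.set_image(image)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),  # 1 = foreground click
)
```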
Builds complex AI systems using Stanford's declarative programming framework to optimize prompts and create modular RAG systems automatically.
Enables zero-shot image classification and semantic image search by connecting visual concepts with natural language.
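This describes CLIP-style models; a zero-shot classification sketch via Transformers, with a blank stand-in image:

```python
import numpy as np
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.fromarray(np.zeros((224, 224, 3), dtype=np.uint8))  # stand-in image
labels = ["a photo of a cat", "a photo of a dog"]
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)  # zero-shot label scores
```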
Decomposes complex neural network activations into sparse, interpretable features to understand and steer model behavior.
Accelerates LLM fine-tuning workflows with Unsloth to achieve up to 5x faster training speeds and 80% reduced memory consumption.
Performs declarative causal interventions and mechanistic interpretability experiments on PyTorch models.
Connects LLMs to private data sources through advanced document ingestion, vector indexing, and retrieval-augmented generation (RAG) pipelines.
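As one possible shape for such a pipeline, a minimal LlamaIndex sketch (recent 0.10+ API; the folder path is a placeholder and an LLM/embedding backend must be configured, e.g. via an OpenAI API key):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs = SimpleDirectoryReader("./private_docs").load_data()
index = VectorStoreIndex.from_documents(docs)  # chunks, embeds, and indexes
response = index.as_query_engine().query("Summarize the Q3 roadmap.")
print(response)
```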
Integrates Salesforce's BLIP-2 framework to enable advanced image captioning, visual question answering, and multimodal reasoning within AI workflows.
Simplifies PyTorch distributed training by providing a unified API for DDP, DeepSpeed, and FSDP with minimal code changes.
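This is the Hugging Face Accelerate pattern; a minimal training-loop sketch with a toy model and synthetic data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # backend (DDP/FSDP/DeepSpeed) set via `accelerate config`
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(torch.randn(256, 16), torch.randn(256, 1)),
                    batch_size=32)

model, optimizer, loader = accelerator.prepare(model, optimizer, loader)
for x, y in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```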
Generates high-quality images and performs advanced image transformations using Stable Diffusion models and the HuggingFace Diffusers library.
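A minimal text-to-image sketch with Diffusers; the checkpoint id and prompt are illustrative, and a CUDA GPU is assumed:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint id
    torch_dtype=torch.float16,
).to("cuda")
image = pipe("a watercolor fox in a misty forest", num_inference_steps=30).images[0]
image.save("fox.png")
```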
Accelerates genomic interval analysis and machine learning preprocessing using a high-performance Rust toolkit with Python bindings.
Performs comprehensive bioinformatics analysis including sequence manipulation, phylogenetics, and microbial ecology statistics within Python.
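The blurb doesn't name a package; for the sequence-manipulation piece, a minimal Biopython sketch (the sequence is the standard tutorial example):

```python
from Bio.Seq import Seq

seq = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGA")
print(seq.reverse_complement())
print(seq.translate())  # protein translation; stop codons render as '*'
```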