Discover Agent Skills for data science & ML. Browse 61 skills for Claude, ChatGPT & Codex.
Evaluates Large Language Models across 60+ academic benchmarks to measure reasoning, coding, and mathematical capabilities using industry-standard metrics.
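A minimal sketch of what such an evaluation run can look like, assuming EleutherAI's lm-evaluation-harness (`lm_eval`); the entry does not name a harness, and the checkpoint and task names below are illustrative.

```python
# Sketch: scoring a Hugging Face model on academic benchmarks,
# assuming the lm-evaluation-harness package (pip install lm-eval).
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",  # illustrative checkpoint
    tasks=["arc_easy", "hellaswag"],                 # illustrative benchmark tasks
    num_fewshot=0,
    batch_size=8,
)

# Per-task metrics such as accuracy and normalized accuracy.
for task, metrics in results["results"].items():
    print(task, metrics)
```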
Implements Group Relative Policy Optimization (GRPO) using the TRL library to enhance model reasoning and structured output capabilities.
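A minimal GRPO sketch, assuming a recent TRL release that ships `GRPOTrainer`; the base model, dataset, and toy length reward are illustrative placeholders, not the skill's own configuration.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Any dataset with a "prompt" column works; this one is illustrative.
dataset = load_dataset("trl-lib/tldr", split="train")

def reward_len(completions, **kwargs):
    # Toy group-relative reward: prefer completions near 100 characters.
    return [-abs(100 - len(c)) for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",   # illustrative small model
    reward_funcs=reward_len,
    args=GRPOConfig(output_dir="grpo-demo"),
    train_dataset=dataset,
)
trainer.train()
```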
Manages high-performance vector search and storage for production RAG and AI applications using Pinecone's serverless infrastructure.
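A minimal sketch of the serverless workflow, assuming the Pinecone v3+ Python SDK; the index name, dimension, and vectors are illustrative.

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# Create a serverless index sized to your embedding model.
pc.create_index(
    name="rag-demo",
    dimension=384,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
index = pc.Index("rag-demo")

# Upsert embedded chunks with metadata for retrieval.
index.upsert(vectors=[
    {"id": "doc-1", "values": [0.1] * 384, "metadata": {"source": "faq.md"}},
    {"id": "doc-2", "values": [0.2] * 384, "metadata": {"source": "guide.md"}},
])

# Query with an embedded question vector.
matches = index.query(vector=[0.15] * 384, top_k=2, include_metadata=True)
print(matches)
```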
Implements Meta AI's foundation model for high-precision zero-shot image segmentation using points, boxes, and masks.
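A minimal point-prompted segmentation sketch, assuming Meta's `segment-anything` package and a downloaded ViT-H checkpoint; the image path and click coordinates are illustrative.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground click (x, y); label 1 = foreground, 0 = background.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,   # return several candidate masks with quality scores
)
print(masks.shape, scores)
```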
Generates state-of-the-art text and image embeddings for RAG, semantic search, and clustering tasks.
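A minimal embedding sketch using the sentence-transformers library; the model name is an illustrative choice, not necessarily the one this skill uses.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = ["Vector databases store embeddings.", "Paris is the capital of France."]
query = "Where are embeddings stored?"

# Normalized embeddings so dot product equals cosine similarity.
doc_emb = model.encode(docs, normalize_embeddings=True)
query_emb = model.encode(query, normalize_embeddings=True)

# Cosine similarity between the query and each document.
print(util.cos_sim(query_emb, doc_emb))
```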
Performs declarative causal interventions and mechanistic interpretability experiments on PyTorch models.
Implements Anthropic's Constitutional AI method to train harmless, helpful models through self-critique and automated AI feedback.
Optimizes large-scale language model training using NVIDIA Megatron-Core with advanced 3D and expert parallelism strategies.
Accelerates Large Language Model inference on NVIDIA GPUs using state-of-the-art optimization techniques for maximum throughput and minimal latency.
Implements programmable safety rails and validation for LLM applications to prevent jailbreaks, hallucinations, and PII leaks.
Implements and trains minimalist GPT architectures for educational and research purposes using Andrej Karpathy's clean, hackable codebase.
Simplifies Large Language Model implementation, training, and fine-tuning using clean, production-ready LitGPT architectures.
Streamlines the fine-tuning process for over 100 large language models using the LLaMA-Factory framework and QLoRA techniques.
Optimizes LLM serving and structured generation using RadixAttention prefix caching for high-performance agentic workflows.
Deploys high-performance Reinforcement Learning from Human Feedback (RLHF) workflows using Ray and vLLM acceleration for large-scale model alignment.
Facilitates mechanistic interpretability research by providing tools to inspect, cache, and manipulate transformer model activations via HookPoints.
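A minimal sketch of activation caching and a hook-based intervention, assuming the `transformer_lens` package; the model choice and the specific hook names are illustrative.

```python
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")
tokens = model.to_tokens("The Eiffel Tower is in")

# Run once and cache every HookPoint activation.
logits, cache = model.run_with_cache(tokens)
print(cache["blocks.0.attn.hook_z"].shape)   # per-head attention outputs

# Zero-ablate layer 0's attention output via a forward hook.
def zero_attn_out(value, hook):
    return torch.zeros_like(value)

ablated_logits = model.run_with_hooks(
    tokens,
    fwd_hooks=[("blocks.0.hook_attn_out", zero_attn_out)],
)
```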
Simplifies large language model alignment using reference-free preference optimization to improve model performance without the overhead of PPO or DPO.
Quantizes Large Language Models to 4/3/2-bit precision without calibration data for faster inference and reduced memory footprint.
Fine-tunes large language models using LoRA, QLoRA, and other parameter-efficient methods to drastically reduce memory and compute requirements.
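A minimal LoRA setup sketch using the `peft` library; the base model and hyperparameters are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

lora_config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()   # typically well under 1% of total parameters
# Train `model` with transformers.Trainer or a custom loop as usual.
```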
Builds LLM-powered applications using agents, retrieval-augmented generation (RAG), and modular chains.
Implements and optimizes Mixture of Experts (MoE) architectures to scale model capacity while reducing training and inference costs.
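An illustrative from-scratch sketch of a top-k gated MoE layer in PyTorch, showing the routing idea; it is not tied to any particular MoE framework this skill may wrap.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=256, d_ff=1024, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)   # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.gate(x)                   # router logits per expert
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e        # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
print(layer(torch.randn(4, 256)).shape)   # torch.Size([4, 256])
```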
Orchestrates distributed machine learning training across clusters to scale PyTorch, TensorFlow, and Hugging Face models.
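A minimal sketch of distributed PyTorch training with Ray Train; the toy model, data, and worker count are illustrative.

```python
import torch
import torch.nn as nn
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer, prepare_model

def train_loop_per_worker(config):
    model = prepare_model(nn.Linear(10, 1))   # wraps the model for DDP
    optim = torch.optim.SGD(model.parameters(), lr=1e-2)
    for _ in range(config["epochs"]):
        x, y = torch.randn(32, 10), torch.randn(32, 1)
        loss = nn.functional.mse_loss(model(x), y)
        optim.zero_grad()
        loss.backward()
        optim.step()

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"epochs": 3},
    scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
)
result = trainer.fit()
```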
Optimizes Large Language Models using 4-bit post-training quantization to reduce memory usage and accelerate inference on consumer GPUs.
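The entry does not name a quantization backend; the sketch below assumes 4-bit NF4 quantization at load time via transformers + bitsandbytes, with an illustrative model name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants too
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

inputs = tokenizer("Quantization reduces memory by", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```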
Extracts structured, type-safe data from Large Language Models using Pydantic validation and automatic retries.
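A minimal sketch of type-safe extraction with Pydantic validation and automatic retries; the entry does not name a library, so this assumes the `instructor` package patching an OpenAI client, with an illustrative model name.

```python
from pydantic import BaseModel, Field
import instructor
from openai import OpenAI

class Person(BaseModel):
    name: str
    age: int = Field(ge=0, le=130)   # validation bound; failures trigger a retry

client = instructor.from_openai(OpenAI())

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,           # output parsed and validated against Person
    max_retries=2,                   # re-ask the model on validation errors
    messages=[{"role": "user", "content": "Extract: Ada Lovelace was 36."}],
)
print(person)   # Person(name='Ada Lovelace', age=36)
```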
Curates high-quality datasets for LLM training using GPU-accelerated deduplication, filtering, and PII redaction.
Implements language-independent subword tokenization using BPE and Unigram algorithms for advanced AI model development.
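A minimal sketch of training and using a BPE tokenizer with the sentencepiece library; the corpus path and vocabulary size are illustrative.

```python
import sentencepiece as spm

# Train a BPE model from a plain-text corpus (one sentence per line).
spm.SentencePieceTrainer.train(
    input="corpus.txt",
    model_prefix="tok",          # writes tok.model and tok.vocab
    vocab_size=8000,
    model_type="bpe",            # or "unigram"
    character_coverage=0.9995,
)

sp = spm.SentencePieceProcessor(model_file="tok.model")
print(sp.encode("Subword tokenization is language independent.", out_type=str))
print(sp.encode("Subword tokenization is language independent.", out_type=int))
```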
Deploys and optimizes LLM inference on CPU, Apple Silicon, and consumer hardware using GGUF quantization.
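A minimal sketch of local inference on a GGUF checkpoint, assuming the llama-cpp-python bindings; the model path is illustrative.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-3-8b-instruct.Q4_K_M.gguf",  # 4-bit GGUF file
    n_ctx=4096,          # context window
    n_gpu_layers=-1,     # offload all layers to Metal/GPU if available; 0 = CPU only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF quantization in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```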
Guarantees valid, type-safe JSON and structured outputs from Large Language Models using grammar-based constraints.
Queries the NCBI Gene database to retrieve comprehensive genetic information, sequences, and functional annotations for biological research.
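A minimal sketch of querying NCBI Gene through Biopython's Entrez utilities; the contact email and gene term are illustrative (NCBI asks for a real email with E-utilities requests).

```python
from Bio import Entrez

Entrez.email = "you@example.com"

# Find the human BRCA1 gene record.
handle = Entrez.esearch(db="gene", term="BRCA1[Gene Name] AND Homo sapiens[Organism]")
gene_ids = Entrez.read(handle)["IdList"]
handle.close()

# Fetch the record summary (symbol, description, chromosomal location, ...).
handle = Entrez.esummary(db="gene", id=gene_ids[0])
summary = Entrez.read(handle)
handle.close()
print(summary)
```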
Builds and deploys specialized machine learning models for clinical healthcare data and electronic health records.