Explore our complete collection of Claude skills to extend the capabilities of AI agents.
Streamlines deep learning development by decoupling research code from engineering boilerplate for automated distributed training and hardware scaling.
Implements and optimizes RWKV architectures, a hybrid RNN-Transformer design offering linear-time inference, constant memory use, and effectively unbounded context length.
Accelerates large-scale similarity search and clustering for dense vectors using Facebook AI's high-performance library.
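As a minimal sketch of exact nearest-neighbour search with FAISS (the dimensionality and random vectors below are purely illustrative):

    import numpy as np
    import faiss

    d = 128                                              # vector dimensionality (assumed)
    xb = np.random.random((10000, d)).astype("float32")  # database vectors
    xq = np.random.random((5, d)).astype("float32")      # query vectors

    index = faiss.IndexFlatL2(d)                         # exact L2 index
    index.add(xb)                                        # index the database vectors
    distances, neighbors = index.search(xq, 5)           # 5 nearest neighbours per query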
Performs declarative causal interventions and mechanistic interpretability experiments on PyTorch models.
Implements Anthropic's Constitutional AI method to train harmless, helpful models through self-critique and automated AI feedback.
Optimizes large-scale language model training using NVIDIA Megatron-Core with advanced 3D and expert parallelism strategies.
Accelerates Large Language Model inference on NVIDIA GPUs using state-of-the-art optimization techniques for maximum throughput and minimal latency.
Implements programmable safety rails and validation for LLM applications to prevent jailbreaks, hallucinations, and PII leaks.
Implements and trains minimalist GPT architectures for educational and research purposes using Andrej Karpathy's clean, hackable codebase.
Simplifies Large Language Model implementation, training, and fine-tuning using clean, production-ready LitGPT architectures.
Streamlines the fine-tuning process for over 100 large language models using the LLaMA-Factory framework and QLoRA techniques.
Optimizes LLM serving and structured generation using RadixAttention prefix caching for high-performance agentic workflows.
Integrates comprehensive tracing, evaluation, and monitoring tools to debug and optimize Large Language Model (LLM) applications.
Deploys high-performance Reinforcement Learning from Human Feedback (RLHF) workflows using Ray and vLLM acceleration for large-scale model alignment.
Facilitates mechanistic interpretability research by providing tools to inspect, cache, and manipulate transformer model activations via HookPoints.
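A minimal sketch of caching activations with TransformerLens (the model and hook shown are illustrative choices):

    from transformer_lens import HookedTransformer

    model = HookedTransformer.from_pretrained("gpt2")     # small model for illustration
    logits, cache = model.run_with_cache("Hello, world")  # forward pass, caching every HookPoint
    attn_pattern = cache["pattern", 0]                    # layer-0 attention patterns from the cache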
Simplifies large language model alignment using reference-free preference optimization, improving model quality without PPO's reward-model rollouts or DPO's reference-model overhead.
Monitors, traces, and evaluates LLM applications using an open-source, OpenTelemetry-based observability platform.
Quantizes Large Language Models to 4/3/2-bit precision without calibration data for faster inference and reduced memory footprint.
Fine-tunes large language models using LoRA, QLoRA, and other parameter-efficient methods to drastically reduce memory and compute requirements.
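A minimal LoRA sketch with Hugging Face PEFT (the base model and hyperparameters are placeholders, not recommendations):

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")      # any causal LM works here
    lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      task_type="CAUSAL_LM")                 # low-rank adapter configuration
    model = get_peft_model(base, lora)                        # base weights stay frozen
    model.print_trainable_parameters()                        # only the adapters are trainable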
Builds LLM-powered applications using agents, retrieval-augmented generation (RAG), and modular chains.
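A minimal sketch of a modular chain in LangChain's expression language (the OpenAI model name is a placeholder and an API key is assumed):

    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser

    prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
    chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()   # prompt -> model -> parser
    print(chain.invoke({"text": "RAG grounds LLM answers in retrieved documents."}))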
Implements and optimizes Mixture of Experts (MoE) architectures to scale model capacity while reducing training and inference costs.
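As a conceptual illustration only (a toy sketch, not any particular library's implementation), a top-k routed MoE layer in PyTorch might look like:

    import torch
    import torch.nn as nn

    class ToyMoE(nn.Module):
        def __init__(self, dim, num_experts=4, top_k=2):
            super().__init__()
            self.router = nn.Linear(dim, num_experts)          # gating network
            self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
            self.top_k = top_k

        def forward(self, x):                                  # x: (tokens, dim)
            weights, idx = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
            out = torch.zeros_like(x)
            for k in range(self.top_k):                        # each token visits its top-k experts
                for e, expert in enumerate(self.experts):
                    mask = idx[:, k] == e
                    if mask.any():
                        out[mask] += weights[mask, k, None] * expert(x[mask])
            return out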
Orchestrates distributed machine learning training across clusters to scale PyTorch, TensorFlow, and Hugging Face models.
Optimizes Large Language Models using 4-bit post-training quantization to reduce memory usage and accelerate inference on consumer GPUs.
Extracts structured, type-safe data from Large Language Models using Pydantic validation and automatic retries.
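A minimal sketch with the instructor library (the model name and schema fields are assumptions for illustration):

    import instructor
    from openai import OpenAI
    from pydantic import BaseModel

    class Person(BaseModel):
        name: str
        age: int

    client = instructor.from_openai(OpenAI())          # patch the client to accept response_model
    person = client.chat.completions.create(
        model="gpt-4o-mini",                           # placeholder model
        response_model=Person,                         # validated against the schema, retried on failure
        messages=[{"role": "user", "content": "Ada Lovelace was 36 years old."}],
    )
    print(person.name, person.age)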
Curates high-quality datasets for LLM training using GPU-accelerated deduplication, filtering, and PII redaction.
Implements language-independent subword tokenization using BPE and Unigram algorithms for advanced AI model development.
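A minimal sketch of training and using a SentencePiece BPE model (the corpus path and vocabulary size are placeholders):

    import sentencepiece as spm

    spm.SentencePieceTrainer.train(                    # train a BPE model on a raw text corpus
        input="corpus.txt", model_prefix="bpe",
        vocab_size=8000, model_type="bpe",
    )
    sp = spm.SentencePieceProcessor(model_file="bpe.model")
    ids = sp.encode("Subword tokenization is language independent.", out_type=int)
    print(sp.decode(ids))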
Deploys and optimizes LLM inference on CPU, Apple Silicon, and consumer hardware using GGUF quantization.
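One common route is the llama-cpp-python bindings; a minimal sketch, with the GGUF path and prompt as placeholders:

    from llama_cpp import Llama

    llm = Llama(model_path="models/model.Q4_K_M.gguf", n_ctx=4096)   # 4-bit GGUF weights
    out = llm("Q: Why quantize to GGUF?\nA:", max_tokens=64, stop=["\n"])
    print(out["choices"][0]["text"])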
Guarantees valid, type-safe JSON and structured outputs from Large Language Models using grammar-based constraints.
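A minimal sketch, assuming the pre-1.0 Outlines API (outlines.generate.json); the model choice and schema are placeholders:

    import outlines
    from pydantic import BaseModel

    class Flight(BaseModel):
        origin: str
        destination: str

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, Flight)   # decoding constrained to the schema's grammar
    flight = generator("Book me a flight from Paris to Tokyo.")
    print(flight.origin, flight.destination)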
Transcribes audio, translates speech to English, and automates multilingual audio processing using OpenAI's Whisper models.
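A minimal sketch with the openai-whisper package (the audio path and model size are placeholders):

    import whisper

    model = whisper.load_model("small")
    result = model.transcribe("meeting.m4a")                     # transcribe in the spoken language
    print(result["text"])
    english = model.transcribe("meeting.m4a", task="translate")  # translate the speech to English
    print(english["text"])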
Integrates Salesforce's BLIP-2 framework to enable advanced image captioning, visual question answering, and multimodal reasoning within AI workflows.
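A minimal visual question answering sketch with the Hugging Face BLIP-2 classes (the checkpoint, image path, and question are placeholders):

    from PIL import Image
    from transformers import Blip2Processor, Blip2ForConditionalGeneration

    processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
    model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

    image = Image.open("photo.jpg")
    inputs = processor(images=image,
                       text="Question: what is in the picture? Answer:",
                       return_tensors="pt")
    ids = model.generate(**inputs, max_new_tokens=30)
    print(processor.decode(ids[0], skip_special_tokens=True))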