Explore our complete collection of Claude skills that extend the capabilities of AI agents.
Moderates LLM inputs and outputs using Meta's specialized LlamaGuard models to ensure safety and policy compliance across six critical categories.
Implements Group Relative Policy Optimization (GRPO) for reasoning and task-specific model alignment using the TRL library.
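As a minimal illustration of the group-relative idea behind GRPO (not TRL's actual `GRPOTrainer` API): each reward is normalized against the mean and standard deviation of the other completions sampled for the same prompt. The helper name below is hypothetical.

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Sketch of GRPO's group-relative advantage: normalize each
    scalar reward against the mean and std of the G completions
    sampled from the same prompt, so no learned value model is needed."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions for one prompt; the best completion gets a
# positive advantage, the worst a negative one.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

In the real TRL pipeline these advantages weight a clipped policy-gradient objective; this sketch only shows the normalization step.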
Visualizes machine learning training metrics, model architectures, and performance profiles using Google's TensorBoard toolkit.
Optimizes Large Language Models using 4-bit activation-aware weight quantization to achieve 3x faster inference with minimal accuracy loss.
Evaluates Large Language Models across 60+ academic benchmarks using standardized prompts and metrics for reproducible research.
Ensures guaranteed valid JSON, XML, and type-safe code generation from LLMs using constrained token sampling and Pydantic models.
Evaluates Large Language Models across 100+ industry-standard benchmarks using NVIDIA's enterprise-grade containerized architecture.
Accelerates LLM data curation using GPU-powered deduplication, quality filtering, and PII redaction at scale.
Accelerates large-scale LLM pretraining using PyTorch-native 4D parallelism and Float8 optimization.
Optimizes large language models for efficient local inference using GGUF format and llama.cpp quantization techniques.
Evaluates AI code generation models using industry-standard benchmarks and pass@k metrics.
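The pass@k metric mentioned above has a standard unbiased estimator (introduced with the HumanEval benchmark): given n sampled completions of which c pass the tests, it estimates the probability that at least one of k draws is correct. A small self-contained version:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: probability that at least one of
    k samples drawn without replacement from n completions
    (c of which pass) is correct."""
    if n - c < k:
        # Fewer than k failing samples exist, so any draw of k
        # must include a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 3 of 10 samples passing, pass@1 reduces to the raw pass rate.
rate = pass_at_k(10, 3, 1)  # 0.3
```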
Deploys and manages high-performance RLHF training pipelines for large-scale language models using Ray and vLLM acceleration.
Optimizes large-scale model training using DeepSpeed configurations, ZeRO optimization stages, and high-performance I/O management.
Interprets and manipulates neural network internals across local and remote models using the nnsight library and NDIF execution.
Monitors, debugs, and evaluates large language model applications with comprehensive tracing and systematic testing tools.
Implements programmable safety rails and runtime validation for LLM applications using NVIDIA's NeMo Guardrails framework.
Manages high-performance vector similarity search and scalable storage for production RAG and semantic search systems.
Quantizes Large Language Models to 8-bit or 4-bit formats to reduce memory usage by up to 75% with minimal accuracy loss.
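A toy sketch of the idea behind 8-bit quantization (a deliberately simplified stand-in, not the library's actual kernels): a float tensor is mapped to signed 8-bit integers with one shared scale, cutting storage roughly 4x versus float32 while keeping reconstruction error bounded by the scale.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats into
    [-127, 127] using a single scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

q, scale = quantize_int8([0.5, -1.0, 0.25])
recovered = dequantize_int8(q, scale)
```

Production schemes add per-channel scales, outlier handling, and 4-bit formats, but the round-trip above captures the core memory/accuracy trade-off.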
Implements, fine-tunes, and deploys high-performance Large Language Models using Lightning AI's LitGPT framework.
Implements advanced PyTorch FSDP2 sharding and distributed checkpointing for efficient large-scale model training.
Combines multiple fine-tuned AI models into a single high-performance model without requiring additional training or expensive GPU resources.
Implements and manages RWKV architectures for efficient, linear-time AI inference and long-context processing.
Provisions and manages high-performance GPU infrastructure on Lambda Labs for machine learning training and inference workflows.
Deploys and scales machine learning workloads on high-performance serverless GPUs using a Python-native framework.
Simplifies PyTorch distributed training across multiple GPUs, TPUs, and nodes with minimal code changes and a unified API.
Manages the complete machine learning lifecycle including experiment tracking, model versioning, and deployment using the MLflow framework.
Manages high-performance vector embeddings and metadata for RAG applications and semantic search using the open-source Chroma database.
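At its core, the semantic search these vector stores provide is nearest-neighbor retrieval by cosine similarity. A brute-force sketch in plain Python (databases like Chroma replace this linear scan with approximate indexes and persistent storage at scale; the function names are illustrative):

```python
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def top_k(query, docs, k=2):
    """Return the ids of the k stored vectors most similar to the
    query, by exhaustive cosine-similarity scan."""
    ranked = sorted(docs.items(),
                    key=lambda kv: cosine_sim(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

hits = top_k([1.0, 0.0], {"a": [1.0, 0.0],
                          "b": [0.0, 1.0],
                          "c": [0.9, 0.1]})
```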
Compresses large language models to 4-bit precision to enable high-speed inference and deployment on consumer-grade hardware.
Accelerates LLM inference speed by up to 3.6x using speculative decoding, Medusa heads, and lookahead techniques without sacrificing model quality.
Streamlines the fine-tuning of 100+ large language models using LLaMA-Factory with support for QLoRA and multimodal architectures.