How do I filter for specific errors in my LangSmith runs?

You can use the 'runs get-latest' command with the '--failed' flag or use '--grep' with regex patterns to search for specific error messages within run outputs.
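The client-side half of this, a regex search over run records that have already been parsed from JSON, can be sketched roughly as follows. This is a minimal illustration and not the skill's own implementation: the `filter_runs` helper and the `id`, `error`, and `outputs` field names are assumptions chosen for the example.

```python
import json
import re


def filter_runs(runs, pattern, failed_only=False):
    """Return runs whose error text or serialized outputs match the regex."""
    regex = re.compile(pattern)
    matches = []
    for run in runs:
        error = run.get("error") or ""
        if failed_only and not error:
            continue  # skip successful runs when only failures are wanted
        # Search the error message and the JSON-serialized outputs together.
        haystack = error + "\n" + json.dumps(run.get("outputs") or {})
        if regex.search(haystack):
            matches.append(run)
    return matches


# Hypothetical records standing in for parsed JSON output (field names assumed).
sample_runs = [
    {"id": "a1", "error": "RateLimitError: 429 Too Many Requests", "outputs": None},
    {"id": "b2", "error": None, "outputs": {"text": "ok"}},
    {"id": "c3", "error": "TimeoutError: request timed out", "outputs": None},
]

rate_limited = filter_runs(sample_runs, r"RateLimit|429", failed_only=True)
print([run["id"] for run in rate_limited])
```

Filtering on already-parsed records like this is only practical when the CLI emits JSON rather than formatted tables.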

Can I manage LangSmith prompt templates with this tool?

Yes, the skill provides full support for listing, pulling, and versioning prompts from your LangSmith repository.

How does the cache-first workflow work?

The skill encourages checking local caches with 'runs cache list' before querying the API. Searches and analysis on previously downloaded project data then run locally, with no network round trip.
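The general cache-first pattern can be sketched as below. This illustrates the idea only and does not describe how the CLI stores its cache: the `load_runs` helper, the one-JSON-file-per-project layout, and the `fake_fetch` stand-in for an API call are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def load_runs(project, cache_dir, fetch):
    """Return (runs, source) for a project, preferring the local cache."""
    cache_file = Path(cache_dir) / f"{project}.json"
    if cache_file.exists():
        # Cache hit: no network call is made.
        return json.loads(cache_file.read_text()), "cache"
    runs = fetch(project)  # only reached on a cache miss
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(runs))
    return runs, "api"


def fake_fetch(project):
    # Hypothetical stand-in for a real API request.
    return [{"id": "a1", "project": project}]


with tempfile.TemporaryDirectory() as tmp:
    first = load_runs("demo", tmp, fake_fetch)   # miss: fetches and stores
    second = load_runs("demo", tmp, fake_fetch)  # hit: served from disk
    print(first[1], second[1])
```

The second call never touches `fake_fetch`, which is the property that makes repeated searches over the same project data cheap.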

Can I use this skill to monitor LLM costs?

Yes, the skill includes specialized commands for token usage analysis, allowing you to break down costs by model, provider, or specific project timeframes.
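As a rough picture of what such a breakdown involves, the sketch below groups parsed run records by a chosen field and sums tokens and cost. The `cost_by` helper, the field names (`model`, `provider`, `total_tokens`, `total_cost`), and the sample values are assumptions for illustration, not the CLI's actual output schema.

```python
from collections import defaultdict


def cost_by(runs, key):
    """Sum token counts and cost per distinct value of `key`."""
    totals = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
    for run in runs:
        group = run.get(key, "unknown")
        totals[group]["tokens"] += run.get("total_tokens", 0)
        totals[group]["cost"] += run.get("total_cost", 0.0)
    return dict(totals)


# Hypothetical run records (field names and values assumed).
sample_runs = [
    {"model": "model-a", "provider": "provider-x", "total_tokens": 100, "total_cost": 0.5},
    {"model": "model-a", "provider": "provider-x", "total_tokens": 200, "total_cost": 0.25},
    {"model": "model-b", "provider": "provider-y", "total_tokens": 50, "total_cost": 0.125},
]

by_model = cost_by(sample_runs, "model")
by_provider = cost_by(sample_runs, "provider")
print(by_model)
```

Passing a different key, such as `"provider"`, reuses the same aggregation for another view of the same data.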

Why is the --json flag required for this skill?

The --json flag is mandatory because it makes the CLI emit machine-readable output. Without it, the CLI prints rich terminal tables, which are difficult for AI agents to parse and interpret.

LangSmith CLI

Name: LangSmith CLI
Author: gigaverse-app


Analytics & Monitoring

Inspects and manages LangSmith traces, datasets, and prompts using a context-efficient CLI optimized for Claude Code.

The LangSmith CLI skill provides a lightweight, on-demand alternative to heavy MCP servers for developers using LangSmith. It enables seamless debugging of AI chains, management of evaluation datasets, and detailed token cost analysis directly within Claude. By implementing a cache-first architecture, the skill ensures high-performance searching and offline analysis of runs, while advanced filtering capabilities—including regex and FQL—allow for surgical data extraction. This skill is essential for developers who need to monitor production LLM applications, compare experiment results, and maintain high-quality prompt repositories without leaving their terminal workflow.

Key Features

01 Advanced filtering using FQL and client-side regex search across inputs and outputs.

02 Efficient debugging of AI traces and run history with JSON-first output.

03 Local cache-first workflow for instant searching and reduced API latency.

04 Comprehensive management of LangSmith datasets, prompts, and feedback.

05 8 GitHub stars

06 Detailed token usage and cost analysis broken down by model and provider.

Use Cases

01 Debugging failed LLM traces and inspecting nested chain outputs in real time.

02 Analyzing production token costs and identifying expensive model usage patterns.

03 Managing evaluation datasets and pulling prompt templates for model testing.
