Overview
Enables the construction of high-performance, real-time chat interfaces by providing a standardized way to stream incremental LLM responses. This skill abstracts the differences between provider SDKs (OpenAI, Anthropic, Google, and Ollama) behind a single async iterator pattern. It simplifies handling chunk deltas, tracking token usage, and detecting why a stream finished, making it a useful tool for developers building responsive AI applications that need immediate feedback and low perceived latency.
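As a minimal sketch of the async iterator pattern described above: each provider's raw event stream can be normalized into a single chunk type carrying the text delta, the finish reason, and any reported token usage. The `StreamChunk`, `stream_chat`, and `fake_provider_events` names below are illustrative assumptions, not this skill's actual API; the fake event generator stands in for a real provider SDK.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional

@dataclass
class StreamChunk:
    """Hypothetical unified chunk shape, normalized across providers."""
    delta: str                    # incremental text for this chunk
    finish_reason: Optional[str]  # e.g. "stop" once the stream completes
    tokens_used: Optional[int]    # cumulative token count, if reported

async def fake_provider_events() -> AsyncIterator[dict]:
    # Stand-in for a provider SDK's raw event stream.
    for word in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)    # yield control, as a network read would
        yield {"text": word, "done": False}
    yield {"text": "", "done": True, "finish": "stop", "tokens": 4}

async def stream_chat(events: AsyncIterator[dict]) -> AsyncIterator[StreamChunk]:
    # Normalize raw provider events into the unified StreamChunk shape,
    # so calling code never touches provider-specific event formats.
    async for ev in events:
        yield StreamChunk(
            delta=ev["text"],
            finish_reason=ev.get("finish") if ev["done"] else None,
            tokens_used=ev.get("tokens"),
        )

async def main() -> str:
    parts = []
    async for chunk in stream_chat(fake_provider_events()):
        parts.append(chunk.delta)       # render delta to the UI as it arrives
        if chunk.finish_reason is not None:
            break                        # stream completion detected
    return "".join(parts)

print(asyncio.run(main()))  # → Hello, world!
```

In a real integration, only `stream_chat` would change per provider (mapping that SDK's event fields onto `StreamChunk`), while the consuming loop stays identical across backends.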