This skill provides production-ready patterns for handling real-time LLM responses, focusing on improving user experience through incremental token delivery. It covers standard OpenAI-style streaming, asynchronous implementations, FastAPI backend integration via SSE, and robust frontend consumption strategies. Additionally, it addresses complex scenarios like streaming tool calls, handling backpressure, and managing partial JSON parsing to ensure smooth, responsive AI interfaces even during long-running generation tasks.
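One of the trickier scenarios the intro mentions is partial JSON parsing: while a model streams a JSON payload (for example, tool-call arguments), the buffer is almost always an incomplete document. A common approach is a best-effort parser that closes any open strings, objects, and arrays before attempting `json.loads`. The sketch below is an illustrative helper (the function name `parse_partial_json` is ours, not from any library):

```python
import json

def parse_partial_json(text):
    """Best-effort parse of an incomplete JSON document as it streams in.

    Returns the parsed value, or None if even a repaired prefix is invalid.
    """
    try:
        return json.loads(text)  # fast path: the buffer is already complete
    except json.JSONDecodeError:
        pass
    # Scan the buffer, tracking unclosed strings, braces, and brackets.
    stack, in_string, escape = [], False, False
    for ch in text:
        if in_string:
            if escape:
                escape = False
            elif ch == "\\":
                escape = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    # Append the missing closers and retry.
    candidate = text + ('"' if in_string else "") + "".join(reversed(stack))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None
```

This lets a UI render a tool call's arguments progressively instead of waiting for the final token.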
Key Features
- Real-time token streaming with AsyncOpenAI
- FastAPI Server-Sent Events (SSE) endpoint patterns
- Advanced tool call streaming and accumulation
- Frontend SSE consumer implementation with AbortController
- Backpressure management for slow consumers
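The core consumption pattern behind token streaming: iterate over the stream with `async for` and append each chunk's `delta.content`. In production the stream comes from `AsyncOpenAI`'s `client.chat.completions.create(..., stream=True)`; the sketch below substitutes a fake stream with the same chunk shape so it runs standalone:

```python
import asyncio
from types import SimpleNamespace

async def collect_tokens(stream):
    """Accumulate streamed delta content into the full response text.

    In production, `stream` would be the async iterator returned by
    AsyncOpenAI's `chat.completions.create(..., stream=True)`; here the
    chunk shape is mimicked for illustration.
    """
    parts = []
    async for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # final chunks typically carry delta.content == None
            parts.append(delta)
            # A real UI would flush `delta` to the client here.
    return "".join(parts)

async def fake_stream(tokens):
    # Stand-in for the API stream: wraps tokens in OpenAI-like chunk objects.
    for tok in tokens:
        yield SimpleNamespace(
            choices=[SimpleNamespace(delta=SimpleNamespace(content=tok))]
        )

text = asyncio.run(collect_tokens(fake_stream(["Hel", "lo", ", world"])))
```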
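For the FastAPI SSE endpoint, the essential detail is the wire format: each event is one or more `data:` lines terminated by a blank line, and the response uses `media_type="text/event-stream"` with `StreamingResponse`. The helpers below (names ours) format frames and can be yielded from an async generator passed to `StreamingResponse`:

```python
import json

def sse_event(data, event=None):
    """Format one Server-Sent Events frame: an optional `event:` line,
    a `data:` line with a JSON payload, and the blank line that ends it."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"

def sse_done():
    # Conventional sentinel so the consumer knows the stream is complete.
    return "data: [DONE]\n\n"
```

In an endpoint, you would wrap the token generator, e.g. `return StreamingResponse(gen(), media_type="text/event-stream")`, yielding `sse_event({"token": t})` per token and `sse_done()` at the end.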
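Tool call streaming needs accumulation because the OpenAI-style API delivers each call in fragments: the `id` and function `name` arrive once, while the `arguments` JSON string is split across many deltas keyed by `index`. A sketch of the merge step, using plain dicts in place of SDK objects (the shape mirrors the API's tool-call deltas, but this helper is illustrative):

```python
def accumulate_tool_calls(deltas):
    """Merge streamed tool-call deltas into complete calls.

    Each delta carries an `index` plus optional `id`, `name`, and
    `arguments` fragments; `arguments` must be concatenated across
    deltas before it can be JSON-parsed.
    """
    calls = {}
    for d in deltas:
        call = calls.setdefault(d["index"], {"id": None, "name": None, "arguments": ""})
        if d.get("id"):
            call["id"] = d["id"]
        if d.get("name"):
            call["name"] = d["name"]
        if d.get("arguments"):
            call["arguments"] += d["arguments"]
    # Return calls in index order.
    return [calls[i] for i in sorted(calls)]

deltas = [
    {"index": 0, "id": "call_1", "name": "get_weather", "arguments": ""},
    {"index": 0, "arguments": '{"city": '},
    {"index": 0, "arguments": '"Paris"}'},
]
merged = accumulate_tool_calls(deltas)
```

Only after the stream finishes (or a partial-JSON parser succeeds) should `arguments` be decoded and dispatched to the actual tool.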
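Backpressure for slow consumers can be implemented with a bounded `asyncio.Queue`: `await queue.put(...)` suspends the producer once the queue is full, so a fast LLM stream is automatically paced to the client's consumption rate. A minimal, runnable sketch of the pattern (sizes and names are illustrative):

```python
import asyncio

async def produce(queue, tokens):
    """Producer side: blocks on put() when the bounded queue is full."""
    for tok in tokens:
        await queue.put(tok)  # suspends here once maxsize is reached
    await queue.put(None)  # sentinel: stream finished

async def consume(queue, out):
    """Consumer side: drains tokens, simulating a slow network write."""
    while True:
        tok = await queue.get()
        if tok is None:
            break
        await asyncio.sleep(0)  # stand-in for a slow write to the client
        out.append(tok)

async def main(tokens):
    queue = asyncio.Queue(maxsize=4)  # the bound is what creates backpressure
    out = []
    await asyncio.gather(produce(queue, tokens), consume(queue, out))
    return out

result = asyncio.run(main(list("stream")))
```

Without the `maxsize` bound, a stalled consumer would let the queue grow without limit, buffering the whole generation in memory.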