Does this skill support local LLM models?

Yes, it includes detailed patterns for setting up and optimizing local inference using Ollama, specifically tailored for high-performance models like DeepSeek-R1 and Qwen2.5-Coder.

How does it handle function calling reliability?

The skill implements 'strict mode' tool definitions, which is a 2026 best practice to ensure the LLM adheres strictly to JSON schemas, reducing validation errors in production.

What fine-tuning methods are included?

The skill covers Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA and QLoRA, including dataset preparation and evaluation metrics to ensure model quality.

Can I use this for streaming AI responses?

Absolutely. It provides production-ready implementations for Server-Sent Events (SSE) and structured streaming, including backpressure handling and partial JSON parsing.

LLM Integration Patterns

Name: LLM Integration Patterns
Author: yonatangross

byyonatangross

•

116

•

データサイエンスとML

Implements production-ready LLM patterns for function calling, streaming, local inference with Ollama, and model fine-tuning.

This skill provides a comprehensive toolkit for building robust LLM-powered applications, offering standardized patterns for strict function calling, real-time SSE streaming, and local model deployment using Ollama. It guides developers through complex tasks like LoRA/QLoRA fine-tuning, context window optimization, and advanced prompt engineering using frameworks like DSPy. Whether you are reducing costs by moving to local inference or enhancing reliability with structured tool use, this skill ensures production-grade implementation across multiple providers and hardware profiles.

主な機能

01Strict mode function calling schemas for high-reliability tool use

02Local inference setup for Ollama featuring DeepSeek and Qwen models

03Parameter-efficient fine-tuning workflows using LoRA and QLoRA

04Advanced prompt engineering patterns including CoT, ReAct, and DSPy

05Real-time SSE and WebSocket streaming implementations for FastAPI

06116 GitHub stars

ユースケース

01Building real-time AI chatbots with streaming responses and parallel tool access

02Migrating cloud-based LLM workflows to local inference for cost savings and privacy

03Fine-tuning open-source models on domain-specific datasets for specialized tasks

What are Skills?·How to Install

Install with 🐟 Skill.Fish

npx skillfish add yonatangross/orchestkit llm-integration

For use in Claude.ai and ChatGPT

Download Skill