- Advanced model loading patterns, including BitsAndBytes quantization
- Fine-tuning implementations using the Trainer API and LoRA/PEFT
- Inference optimization techniques, including AMP and ONNX export
- Comprehensive tokenization workflows for diverse NLP architectures
- High-level Pipeline configurations for rapid task deployment