Overview
This skill provides specialized implementation patterns and best practices for extending the context limits of Large Language Models (LLMs) to 128k+ tokens. It covers advanced positional encoding methods such as Rotary Position Embeddings (RoPE), YaRN, and ALiBi, enabling developers to adapt pre-trained models like Llama or Mistral for long-form document analysis, extensive codebase processing, and complex reasoning tasks. By leveraging position interpolation and efficient attention biases, users can extend context windows far beyond the original pre-training length while keeping additional fine-tuning and compute costs modest.
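As a rough illustration of the position-interpolation idea mentioned above, the sketch below rescales RoPE position indices so that a longer target sequence is squeezed back into the position range the model saw during pre-training. The training length (4096) and target length (131072) are hypothetical values chosen for the example, not parameters taken from this skill.

```python
import torch

def rope_inverse_frequencies(dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies, one per pair of embedding dimensions.
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))

def interpolated_rope_angles(
    seq_len: int,
    dim: int,
    train_len: int = 4096,      # assumed pre-training context length
    target_len: int = 131072,   # assumed extended context length (128k)
    base: float = 10000.0,
) -> torch.Tensor:
    # Linear position interpolation: scale positions by train_len / target_len
    # so indices up to target_len map into the range [0, train_len).
    scale = train_len / target_len
    positions = torch.arange(seq_len, dtype=torch.float32) * scale
    inv_freq = rope_inverse_frequencies(dim, base)
    # Rotation angles of shape (seq_len, dim // 2), later used to build
    # the cos/sin tables applied to query and key vectors.
    return torch.outer(positions, inv_freq)

# Example: angles for the first 8 positions of a 128-dimensional head.
angles = interpolated_rope_angles(seq_len=8, dim=128)
print(angles.shape)  # torch.Size([8, 64])
```

Methods such as YaRN refine this uniform scaling by treating low- and high-frequency dimensions differently, but the core mechanism of remapping positions before computing the rotary angles is the same.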