Overview
This skill provides guidance for implementing lightweight GPT-2 inference from scratch, aimed at code golf and resource-constrained environments. It covers parsing TensorFlow binary checkpoints, implementing byte-pair encoding (BPE) tokenization without external libraries, and running an efficient forward pass while maintaining numerical stability. It emphasizes a 'verify-before-optimize' workflow to help developers avoid common pitfalls, such as endianness issues, incorrect BPE merge order, and overflow in softmax layers, and arrive at a solution that is both functional and compact.
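The three pitfalls named above can each be pinned down with a small reference helper before any golfing starts. The following is a minimal sketch, not a definitive implementation. It assumes the weights are stored as little-endian float32 (an assumption about the checkpoint that should be checked against known values), and the names `read_f32_le`, `softmax`, `bpe`, and `ranks` are hypothetical, introduced only for illustration. The BPE loop is a simplified version that merges one pair at a time.

```python
import math
import struct


def read_f32_le(buf, count, offset=0):
    """Decode `count` float32 values from bytes, forcing little-endian order.

    The explicit "<" avoids depending on the host's native byte order.
    """
    return list(struct.unpack_from("<%df" % count, buf, offset))


def softmax(xs):
    """Softmax with the maximum subtracted first, so exp() never sees a
    large positive argument and cannot overflow."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def bpe(symbols, ranks):
    """Merge adjacent symbols, always choosing the pair with the lowest rank.

    `ranks` maps a (left, right) pair to its position in the merge list;
    a lower rank means the merge was learned earlier and must be applied first.
    """
    symbols = list(symbols)
    while len(symbols) > 1:
        # Score every adjacent pair; unknown pairs get rank infinity.
        pairs = [
            (ranks.get((a, b), math.inf), i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
        ]
        rank, i = min(pairs)  # ties resolve to the leftmost pair
        if rank == math.inf:
            break  # no applicable merges remain
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols
```

In a verify-before-optimize workflow, helpers like these would serve as the reference: compare their output against a known-good implementation on a few inputs first, then shorten the code while re-running the same checks.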