Introduction
Token Compressor is a two-stage pipeline that reduces Large Language Model (LLM) token usage without losing semantic intent. In the first stage, a local LLM rewrites each prompt to its semantic minimum while preserving all conditionals and negations. The second stage validates the compression with embedding similarity: if the cosine similarity between the compressed and original prompts falls below a configured threshold, the pipeline falls back to the original prompt, ensuring no critical meaning is lost. The result is shorter prompts, lower operational costs, and consistent LLM response quality.
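The validate-or-fall-back stage can be sketched as follows. This is a minimal illustration, not the project's actual API: `choose_prompt`, `toy_embed`, and the 0.85 threshold are assumptions, and the bag-of-words "embedding" stands in for a real embedding model.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def choose_prompt(original: str, compressed: str, embed, threshold: float = 0.85) -> str:
    """Stage 2: keep the compressed prompt only if it stays semantically
    close to the original; otherwise fall back to the original."""
    sim = cosine_similarity(embed(original), embed(compressed))
    return compressed if sim >= threshold else original

# Toy bag-of-words "embedding" used here purely for demonstration;
# a real deployment would call an embedding model instead.
VOCAB = ["deploy", "not", "if", "tests", "pass", "staging", "please", "the", "to", "only"]

def toy_embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

original = "please deploy to staging only if the tests pass"
good = "deploy to staging only if tests pass"   # keeps the conditional
bad = "deploy to staging"                       # drops the conditional

print(choose_prompt(original, good, toy_embed))  # high similarity: compressed wins
print(choose_prompt(original, bad, toy_embed))   # low similarity: falls back to original
```

Because the fallback compares embeddings rather than surface text, aggressive rewrites are accepted as long as they stay semantically close, while compressions that drop a conditional or negation drift far enough to trigger the fallback.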