Introduction
This skill provides comprehensive guidance for tuning vector search infrastructure, focusing on HNSW parameter optimization, quantization strategies, and scaling patterns. It helps developers navigate the trade-offs between search speed, memory footprint, and retrieval accuracy (recall). With production-ready templates for Python-based benchmarking, vector compression techniques (Scalar, Product, and Binary Quantization), and database-specific configurations such as Qdrant, the skill helps keep RAG systems and LLM applications performant at scale.
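The central trade-off described above is usually measured as recall@k: the fraction of the true nearest neighbors that an approximate (faster, smaller) index still returns. Below is a minimal, self-contained sketch of such a benchmark on synthetic data, comparing exact full-precision search against sign-based binary quantization ranked by Hamming distance. All names and the dataset are illustrative, not part of any specific database's API.

```python
import numpy as np

def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the true top-k neighbors recovered by the approximate search."""
    return len(set(exact_ids[:k]) & set(approx_ids[:k])) / k

rng = np.random.default_rng(42)
corpus = rng.normal(size=(5000, 128)).astype(np.float32)  # synthetic embeddings
query = rng.normal(size=(128,)).astype(np.float32)
k = 10

# Exact ground truth: brute-force cosine similarity over full-precision vectors.
corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)
exact = np.argsort(-corpus_n @ query_n)[:k]

# Binary quantization: keep only the sign of each dimension (1 bit per dim, a
# 32x memory reduction vs float32), then rank by Hamming distance -- a simple
# stand-in for a database's binary-quantized index.
corpus_bits = corpus > 0
query_bits = query > 0
hamming = (corpus_bits != query_bits).sum(axis=1)
approx = np.argsort(hamming)[:k]

print(f"recall@{k} with binary quantization: {recall_at_k(exact, approx, k):.2f}")
```

The same harness extends naturally to real indexes: swap the Hamming ranking for a call into your database, and sweep index parameters while logging recall@k and query latency side by side.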