Introduction
This skill offers guidance and implementation templates for tuning vector database indexes, focusing on the trade-offs between latency, recall, and memory consumption. It helps developers select an index type, configure HNSW parameters such as M and efConstruction, and apply quantization techniques such as scalar quantization (INT8) or product quantization (PQ). Whether you are scaling a RAG application to millions of vectors or reducing search latency for real-time recommendations, it supplies the benchmarking scripts and configuration logic needed to tune vector search efficiency.
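To make the memory side of these trade-offs concrete, here is a rough back-of-the-envelope sketch in Python. It is not one of the skill's own templates, and the helper name `estimate_index_memory_bytes` and its parameters are hypothetical. It assumes 4-byte neighbor IDs, a bottom HNSW layer holding up to 2*M links per vector (the convention used by common implementations such as hnswlib), and 8-bit PQ codes. It ignores upper graph layers, PQ codebooks, and any metadata, so treat the result as a lower bound.

```python
def estimate_index_memory_bytes(num_vectors, dim, m=16, storage="float32",
                                pq_subvectors=None):
    """Rough lower-bound estimate of HNSW index memory in bytes.

    Hypothetical helper for illustration only; real engines add overhead
    for upper graph layers, codebooks, and per-vector metadata.
    """
    if storage == "float32":
        vector_bytes = dim * 4          # 4 bytes per dimension
    elif storage == "int8":
        vector_bytes = dim              # scalar quantization: 1 byte per dimension
    elif storage == "pq":
        if pq_subvectors is None:
            raise ValueError("pq_subvectors is required when storage='pq'")
        vector_bytes = pq_subvectors    # assumed: one 8-bit code per subvector
    else:
        raise ValueError(f"unknown storage type: {storage}")

    # Assumed: bottom layer keeps up to 2*M neighbor links, 4 bytes per ID.
    link_bytes = 2 * m * 4
    return num_vectors * (vector_bytes + link_bytes)


def to_gib(num_bytes):
    """Convert a byte count to GiB for readability."""
    return num_bytes / (1024 ** 3)
```

For example, calling `to_gib(estimate_index_memory_bytes(1_000_000, 768, storage="int8"))` gives the approximate footprint of one million 768-dimensional vectors under scalar quantization. Note that once vectors are compressed, the graph links (which scale with M) account for a larger share of the total.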