RAG Server FAQs

Question 1

Does RAG Server require external API calls or network access?

Accepted Answer

No, not after the initial setup. RAG Server runs embedding inference locally using `@xenova/transformers`. Once the embedding model has been downloaded, all operations, including indexing and search, run entirely on your local machine, so your code stays private and no network connection is needed.

Question 2

How does RAG Server integrate with other LLM tools or IDEs?

Accepted Answer

It works with any client that speaks the Model Context Protocol (MCP). This includes GitHub Copilot Agent mode in Visual Studio and VS Code, the official MCP Inspector, and custom MCP-aware tooling, so an LLM can call the server's tools from within your development environment.
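
To make the protocol side concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` envelope come from the MCP specification; the `buildToolCall` helper and the `query` argument name are assumptions for illustration, not the server's documented schema.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The "query" argument name is an assumption; check the tool's input schema
// (for example in MCP Inspector) for the real parameter names.
const exampleRequest = buildToolCall(1, "rag_query", {
  query: "where is the chunking logic?",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice an MCP client library builds this message for you; the sketch only shows what travels over the wire.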

Question 3

What is RAG Server?

Accepted Answer

RAG Server (mcp-rag-server) is a lightweight, local-first Retrieval-Augmented Generation (RAG) server. It indexes your source code and documentation, building a local vector index to provide semantic search and file access capabilities to LLMs via the Model Context Protocol (MCP).
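
As a rough illustration of what semantic search over a local vector index involves, the sketch below ranks stored chunks by cosine similarity to a query vector. It is not RAG Server's actual implementation: the `IndexedChunk` type and the `cosineSimilarity` and `topK` functions are hypothetical names, and real embeddings would come from the embedding model rather than hand-written arrays.

```typescript
// Hypothetical in-memory index entry: a text chunk plus its embedding vector.
type IndexedChunk = { file: string; text: string; vector: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Score every chunk against the query and keep the k best matches.
function topK(query: number[], index: IndexedChunk[], k: number) {
  return index
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A brute-force scan like this is usually adequate for a single repository; larger indexes typically switch to an approximate nearest-neighbor structure.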

Question 4

Can I customize the indexing process for my repository?

Accepted Answer

Yes. RAG Server is highly configurable. You can define allowed file extensions, exclude specific folders (e.g., `node_modules`, `dist`), adjust chunk sizes, and select a different embedding model (e.g., `jinaai/jina-embeddings-v2-base-code` for code or `Xenova/bge-base-en-v1.5` for docs) to suit your project.
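
The sketch below shows how settings like these could drive indexing: a filter that applies the extension allowlist and folder exclusions, and a fixed-size chunker. All names here (`IndexConfig`, `shouldIndex`, `chunkText`, and the `chunkOverlap` setting) are assumptions for illustration; the server's real configuration keys and chunking strategy may differ.

```typescript
// Hypothetical configuration shape; not the server's actual option names.
interface IndexConfig {
  allowedExtensions: string[]; // e.g. [".ts", ".md"]
  excludedDirs: string[];      // e.g. ["node_modules", "dist"]
  chunkSize: number;           // characters per chunk
  chunkOverlap: number;        // characters shared by neighboring chunks (assumed)
}

// Decide whether a repo-relative path (using "/" separators) should be indexed.
function shouldIndex(relPath: string, cfg: IndexConfig): boolean {
  const parts = relPath.split("/");
  const dirs = parts.slice(0, -1);
  if (dirs.some((dir) => cfg.excludedDirs.includes(dir))) return false;
  const name = parts[parts.length - 1];
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  return cfg.allowedExtensions.includes(ext);
}

// Split text into fixed-size character windows with optional overlap.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Smaller chunks tend to give more precise matches at the cost of a larger index, which is one reason chunk size is worth tuning per repository.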

Question 5

What kind of tools and functionalities does RAG Server expose to LLMs?

Accepted Answer

RAG Server exposes three core MCP tools: `rag_query` for semantic search over your repository, `read_file` for securely accessing specific file content (with optional line ranges), and `list_files` for listing directory contents with filtering and recursion, all constrained to your defined repository root (`REPO_ROOT`).
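
One common way to enforce a root constraint like this is to resolve every requested path against the root and reject anything that lands outside it. The sketch below uses Node's built-in `path` module; `resolveInsideRoot` is a hypothetical helper, and this is an assumption about how such a check could look rather than the server's actual code.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve a client-supplied path against the repository
// root and refuse anything that would escape it.
function resolveInsideRoot(repoRoot: string, requested: string): string {
  const root = path.resolve(repoRoot);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  // rel is ".." or starts with "../" when the target is above the root;
  // on Windows it can be absolute when the target is on another drive.
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`Path escapes repository root: ${requested}`);
  }
  return target;
}
```

This check is purely lexical. It does not follow symbolic links, so a stricter version would also compare real paths (for example via `fs.realpathSync`) before reading.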
