RAG App on AWS
Deploys a complete AWS backend infrastructure for retrieval-augmented generation applications, integrating with Gemini Pro and a Streamlit UI.
About
This tool provides a Terraform-based solution for deploying a complete AWS backend service, ideal for building Retrieval-Augmented Generation (RAG) applications. By integrating Google's free-tier Gemini Pro and Embedding models, it allows for AI-powered document querying. Key components include a Streamlit UI for interaction and a Remote Streamable HTTP-based Web Search MCP Server. It offers a streamlined approach to setting up the necessary infrastructure for document processing, semantic search, and AI-driven question answering.
Key Features
- Infrastructure as Code (IaC) with Terraform for consistent deployments
- Integrates with Google's Gemini Pro and Embedding models
- Includes a Streamlit UI with token-based authentication
- Leverages AWS Lambda for serverless compute
- Utilizes PostgreSQL RDS with pgvector for vector storage
- 19 GitHub stars
Use Cases
- Deploying scalable RAG infrastructure on AWS
- Creating a secure and authenticated document processing pipeline
- Building AI-powered document querying applications