Introduction
Retrieval-Augmented Generation (RAG) is a powerful technique that enhances large language models (LLMs) by grounding their responses in external knowledge. This blog post walks you through building a RAG application using Langchain, Next.js, OpenAI, and Pinecone.
What is RAG?
RAG combines the strengths of information retrieval and text generation. Given a user query, it first retrieves the most relevant documents from a knowledge base and then passes those documents to the language model as context, producing a more informed and accurate response.
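In code terms, the flow reduces to a retrieval call followed by a generation call. Here is a minimal, self-contained sketch; both helper functions are hypothetical stand-ins that the rest of this post implements for real with Pinecone and OpenAI:

```javascript
// Minimal sketch of the RAG flow. Both helpers are hypothetical stand-ins;
// Steps 2-3 below implement them with Pinecone (retrieval) and OpenAI (generation).
async function retrieveRelevantDocs(query) {
  // Real version: embed the query and run a similarity search in a vector DB.
  return [`snippet related to "${query}"`];
}

async function generateAnswer(query, context) {
  // Real version: prompt an LLM with the retrieved context plus the question.
  return `Answer to "${query}", grounded in: ${context}`;
}

async function answerWithRAG(query) {
  const docs = await retrieveRelevantDocs(query); // 1. Retrieve
  return generateAnswer(query, docs.join('\n')); // 2. Generate
}

answerWithRAG('example question').then(console.log);
```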
Technologies Used
- Langchain: A framework for developing applications powered by language models.
- Next.js: A React framework for building performant and scalable web applications.
- OpenAI: Provides powerful language models like GPT-3.5 or GPT-4.
- Pinecone: A vector database for efficient similarity search.
Prerequisites
Before you begin, make sure you have the following:
- Node.js and npm installed
- An OpenAI API key
- A Pinecone API key and environment, plus a Pinecone index (this guide assumes one named `rag-index` with dimension 1536, which matches OpenAI's default embedding model)
Step 1: Setting Up Your Next.js Project
1. Create a new Next.js project:

   ```bash
   npx create-next-app rag-app
   cd rag-app
   ```

2. Install the necessary dependencies:

   ```bash
   npm install langchain openai @pinecone-database/pinecone dotenv
   ```
Step 2: Loading and Indexing Your Data
1. Create a `.env.local` file in your project root and add your API keys:

   ```
   OPENAI_API_KEY=your_openai_api_key
   PINECONE_API_KEY=your_pinecone_api_key
   PINECONE_ENVIRONMENT=your_pinecone_environment
   ```
2. Create a script to load and index your data. For example, create a file named `scripts/index-data.js`:

   ```javascript
   // filepath: scripts/index-data.js
   import { PineconeClient } from '@pinecone-database/pinecone';
   import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
   import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
   import { Document } from 'langchain/document';
   import * as fs from 'fs';
   import * as dotenv from 'dotenv';

   // dotenv reads .env by default, so point it at .env.local explicitly
   dotenv.config({ path: '.env.local' });

   const pinecone = new PineconeClient();

   async function indexData() {
     await pinecone.init({
       apiKey: process.env.PINECONE_API_KEY,
       environment: process.env.PINECONE_ENVIRONMENT,
     });

     const indexName = 'rag-index';
     const index = pinecone.Index(indexName);

     // Load your data (e.g., from a text file)
     const text = fs.readFileSync('data/my-data.txt', 'utf8');

     // Split the text into overlapping chunks so each one fits comfortably
     // in an embedding request
     const textSplitter = new RecursiveCharacterTextSplitter({
       chunkSize: 1000,
       chunkOverlap: 200,
     });
     const docs = await textSplitter.splitDocuments([
       new Document({ pageContent: text }),
     ]);

     // Create embeddings using OpenAI
     const embeddings = new OpenAIEmbeddings({
       openAIApiKey: process.env.OPENAI_API_KEY,
     });
     const vectors = await embeddings.embedDocuments(
       docs.map((doc) => doc.pageContent)
     );

     // Upsert the embeddings into Pinecone in batches
     const batchSize = 100;
     for (let i = 0; i < docs.length; i += batchSize) {
       const batch = docs.slice(i, i + batchSize);
       const vectorBatch = vectors.slice(i, i + batchSize);
       await index.upsert({
         upsertRequest: {
           vectors: batch.map((doc, j) => ({
             id: `${i + j}`,
             values: vectorBatch[j],
             metadata: { content: doc.pageContent },
           })),
         },
       });
       console.log(`Upserted ${Math.min(i + batchSize, docs.length)} vectors`);
     }

     console.log('Data indexing complete!');
   }

   indexData();
   ```
   Make sure to create a `data/my-data.txt` file with your data.

3. Run the indexing script:

   ```bash
   node scripts/index-data.js
   ```

   Because the script uses ES module `import` syntax, either add `"type": "module"` to your `package.json` or rename the script to `index-data.mjs` before running it.
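Note that the indexing script assumes the `rag-index` index already exists in your Pinecone project. If it doesn't, you can create it from the Pinecone console, or with a one-off script like the sketch below (the filename is just a suggestion, and the `createIndex` call matches the v0-style client used above; check your client version's docs if it differs):

```javascript
// filepath: scripts/create-index.js (hypothetical helper, not part of the original guide)
import { PineconeClient } from '@pinecone-database/pinecone';
import * as dotenv from 'dotenv';

dotenv.config({ path: '.env.local' });

const pinecone = new PineconeClient();

async function createIndex() {
  await pinecone.init({
    apiKey: process.env.PINECONE_API_KEY,
    environment: process.env.PINECONE_ENVIRONMENT,
  });

  // 1536 matches OpenAI's default embedding model (text-embedding-ada-002),
  // which OpenAIEmbeddings uses unless configured otherwise.
  await pinecone.createIndex({
    createRequest: {
      name: 'rag-index',
      dimension: 1536,
      metric: 'cosine',
    },
  });

  console.log('Index created');
}

createIndex();
```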
Step 3: Creating the API Endpoint
1. Create an API endpoint in `pages/api/query.js`:

   ```javascript
   // filepath: pages/api/query.js
   import { PineconeClient } from '@pinecone-database/pinecone';
   import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
   import { OpenAI } from 'langchain/llms/openai';

   const pinecone = new PineconeClient();

   export default async function handler(req, res) {
     if (req.method !== 'POST') {
       res.status(405).json({ error: 'Method not allowed' });
       return;
     }

     const { query } = req.body;

     await pinecone.init({
       apiKey: process.env.PINECONE_API_KEY,
       environment: process.env.PINECONE_ENVIRONMENT,
     });

     const indexName = 'rag-index';
     const index = pinecone.Index(indexName);

     // Create an embedding for the query
     const embeddings = new OpenAIEmbeddings({
       openAIApiKey: process.env.OPENAI_API_KEY,
     });
     const queryVector = await embeddings.embedQuery(query);

     // Search Pinecone for the most relevant documents
     const searchResult = await index.query({
       queryRequest: {
         vector: queryVector,
         topK: 3,
         includeMetadata: true,
       },
     });

     // Extract the content from the retrieved documents
     const context = searchResult.matches
       .map((match) => match.metadata.content)
       .join('\n');

     // Use OpenAI to generate an answer grounded in the retrieved context
     const llm = new OpenAI({
       openAIApiKey: process.env.OPENAI_API_KEY,
     });
     const prompt = `Answer the following question based on the context provided:\n\nContext:\n${context}\n\nQuestion: ${query}`;
     const answer = await llm.call(prompt);

     res.status(200).json({ answer });
   }
   ```

   Next.js loads `.env.local` automatically for API routes, so no dotenv setup is needed here.
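Once the dev server is running (see Step 5), you can smoke-test the endpoint outside the browser. A minimal sketch, assuming Node 18+ for the built-in `fetch`, saved under a hypothetical filename `test-query.mjs` so top-level `await` works:

```javascript
// filepath: test-query.mjs (hypothetical helper for testing, not part of the app)
// Run with `node test-query.mjs` while `npm run dev` is active.
const response = await fetch('http://localhost:3000/api/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'What is this document about?' }),
});

console.log(await response.json());
```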
Step 4: Building the User Interface
1. Modify `pages/index.js` to include a search bar and display the answer:

   ```javascript
   // filepath: pages/index.js
   import { useState } from 'react';

   export default function Home() {
     const [query, setQuery] = useState('');
     const [answer, setAnswer] = useState('');

     const handleSubmit = async (e) => {
       e.preventDefault();
       const response = await fetch('/api/query', {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
         body: JSON.stringify({ query }),
       });
       const data = await response.json();
       setAnswer(data.answer);
     };

     return (
       <div>
         <h1>RAG Application</h1>
         <form onSubmit={handleSubmit}>
           <input
             type="text"
             value={query}
             onChange={(e) => setQuery(e.target.value)}
             placeholder="Ask me anything!"
           />
           <button type="submit">Search</button>
         </form>
         {answer && (
           <div>
             <h2>Answer:</h2>
             <p>{answer}</p>
           </div>
         )}
       </div>
     );
   }
   ```
Step 5: Running Your Application
1. Start the Next.js development server:

   ```bash
   npm run dev
   ```

2. Open your browser and go to `http://localhost:3000`.

3. Enter your query in the search bar and click "Search".

4. The answer generated by the RAG application will be displayed.
Conclusion
Congratulations! You have successfully built a RAG application using Langchain, Next.js, OpenAI, and Pinecone. This example provides a basic framework that you can extend and customize to fit your specific needs.