SUTRA Document RAG Chatbot - Streamlit App with LangChain
This is a Streamlit-based SUTRA Document RAG Chatbot application, which leverages the SUTRA LLM via LangChain to provide a Retrieval-Augmented Generation (RAG) chatbot. The app allows users to upload PDF documents, extract their text, and ask questions that are answered from the document content using the SUTRA LLM.
Overview
The SUTRA Document RAG Chatbot app enables users to upload PDF files, process them to extract text, and interact with a chatbot powered by the SUTRA LLM. The app uses LangChain for document processing and retrieval, Streamlit for the UI, and supports streaming responses for a dynamic user experience. Chat history is maintained using Streamlit's session state.
Prerequisites
To run this Streamlit app, ensure you have the following installed:
- Python (v3.8 or higher)
- pip (Python package manager)
- Streamlit (`pip install streamlit`)
- LangChain (`pip install langchain`)
- LangChain OpenAI (`pip install langchain-openai`)
- LangChain Community (`pip install langchain-community`)
- PyPDF2 (`pip install PyPDF2`)
- FAISS (`pip install faiss-cpu`)
- python-dotenv (`pip install python-dotenv`)
- A valid Sutra API key (set in a `.env` file as `SUTRA_API_KEY`)
- A modern web browser (e.g., Chrome, Firefox)
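If you prefer a requirements file, the same dependencies can be captured in a `requirements.txt` (a minimal sketch; version pins are omitted and can be added as needed):

```
streamlit
langchain
langchain-openai
langchain-community
PyPDF2
faiss-cpu
python-dotenv
```

Install them all at once with `pip install -r requirements.txt`.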
Setup Instructions
1. **Install Dependencies**:

   ```
   pip install streamlit langchain langchain-openai langchain-community PyPDF2 faiss-cpu python-dotenv
   ```

2. **Set Up Environment Variables**: Create a `.env` file in the project directory with the following content:

   ```
   SUTRA_API_KEY=your_sutra_api_key_here
   ```

3. **Save the Code**: Save the provided code in a file named `app.py`.

4. **Run the Application**:

   ```
   streamlit run app.py
   ```

   This will start the Streamlit server, and the app will be accessible at `http://localhost:8501`.
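Optionally, before launching the app you can confirm that the key loads from `.env`. Below is a minimal sketch of a standalone check; the filename `check_env.py` is a hypothetical helper, not part of the app:

```python
# check_env.py - hypothetical helper to confirm the Sutra API key is readable
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
if os.getenv("SUTRA_API_KEY"):
    print("SUTRA_API_KEY loaded successfully.")
else:
    print("SUTRA_API_KEY missing - check your .env file.")
```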
Streamlit App Code
Below is the complete Python code for the Streamlit app, which implements the Document RAG Chatbot functionality.
```python
import streamlit as st
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain.callbacks.base import BaseCallbackHandler
import PyPDF2
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
api_key = os.getenv("SUTRA_API_KEY")

# Page configuration
st.set_page_config(
    page_title="Sutra Document RAG Chatbot",
    page_icon="📚",
    layout="wide"
)

# Streaming callback handler
class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs):
        self.text += token
        self.container.markdown(self.text)

# Initialize the ChatOpenAI model - base instance for caching
@st.cache_resource
def get_base_chat_model():
    return ChatOpenAI(
        api_key=os.getenv("SUTRA_API_KEY"),
        base_url="https://api.two.ai/v2",
        model="sutra-v2",
        temperature=0.7,
    )

# Create a streaming version of the model with callback handler
def get_streaming_chat_model(callback_handler=None):
    return ChatOpenAI(
        api_key=os.getenv("SUTRA_API_KEY"),
        base_url="https://api.two.ai/v2",
        model="sutra-v2",
        temperature=0.7,
        streaming=True,
        callbacks=[callback_handler] if callback_handler else None
    )

# Sidebar for document upload
st.sidebar.image("https://framerusercontent.com/images/3Ca34Pogzn9I3a7uTsNSlfs9Bdk.png", use_container_width=True)
with st.sidebar:
    st.title("📚 Sutra RAG Chatbot")
    uploaded_file = st.file_uploader("Upload a PDF document", type="pdf")
    st.divider()
    st.markdown("### About Sutra RAG Chatbot")
    st.markdown("This app uses Retrieval-Augmented Generation to answer questions based on uploaded PDF documents using the Sutra LLM.")

# Main app title
st.markdown(
    '<h1><img src="https://framerusercontent.com/images/9vH8BcjXKRcC5OrSfkohhSyDgX0.png" width="60"/> Sutra Document RAG Chatbot 🤖</h1>',
    unsafe_allow_html=True
)

# Initialize session state for messages and vector store
if "messages" not in st.session_state:
    st.session_state.messages = []
if "vector_store" not in st.session_state:
    st.session_state.vector_store = None

# Process uploaded PDF
if uploaded_file is not None:
    try:
        # Read and extract text from PDF
        pdf_reader = PyPDF2.PdfReader(uploaded_file)
        text = ""
        for page in pdf_reader.pages:
            text += page.extract_text() or ""

        # Split text into chunks
        text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
        documents = text_splitter.split_text(text)

        # Create embeddings and vector store
        embeddings = OpenAIEmbeddings(api_key=os.getenv("SUTRA_API_KEY"), base_url="https://api.two.ai/v2")
        st.session_state.vector_store = FAISS.from_texts(documents, embeddings)
        st.success("Document processed successfully!")
    except Exception as e:
        st.error(f"Error processing PDF: {str(e)}")

# Display chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# Chat input
user_input = st.chat_input("Ask a question about the document...")

# Process user input
if user_input and st.session_state.vector_store is not None:
    # Add user message to chat
    st.session_state.messages.append({"role": "user", "content": user_input})

    # Display user message
    with st.chat_message("user"):
        st.write(user_input)

    # Generate response
    try:
        # Create message placeholder
        with st.chat_message("assistant"):
            response_placeholder = st.empty()

            # Create a stream handler
            stream_handler = StreamHandler(response_placeholder)

            # Get streaming model with handler
            chat = get_streaming_chat_model(stream_handler)

            # Set up RetrievalQA chain
            qa_chain = RetrievalQA.from_chain_type(
                llm=chat,
                chain_type="stuff",
                retriever=st.session_state.vector_store.as_retriever()
            )

            # Generate response using RAG
            response = qa_chain.invoke({"query": user_input})
            answer = response["result"]

            # Add assistant response to chat history
            st.session_state.messages.append({"role": "assistant", "content": answer})
    except Exception as e:
        st.error(f"Error: {str(e)}")
        if "API key" in str(e):
            st.error("Please check your Sutra API key in the environment variables.")
elif user_input and st.session_state.vector_store is None:
    st.error("Please upload a PDF document first.")
```
Functionality
- **Document Upload**: A sidebar file uploader allows users to upload PDF documents for processing.
- **Text Extraction and Indexing**: The app uses `PyPDF2` to extract text from PDFs, `CharacterTextSplitter` to chunk the text, and `FAISS` with `OpenAIEmbeddings` to create a vector store for retrieval.
- **RAG Pipeline**: The app uses LangChain's `RetrievalQA` chain to retrieve relevant document chunks and generate answers using the Sutra LLM (`sutra-v2` model); a standalone sketch of this pipeline follows the list.
- **Chat Interface**: Users input questions via a `st.chat_input` field, and responses are displayed using `st.chat_message` for a clean chat UI.
- **Streaming Responses**: A custom `StreamHandler` class streams responses from the Sutra LLM, providing a dynamic typing effect.
- **Session State**: Persists chat history and the vector store using `st.session_state`.
- **Error Handling**: Displays errors for invalid API keys, PDF processing issues, or missing document uploads.
- **Styling**: Uses custom HTML for the title with an icon and includes a logo in the sidebar for branding.
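The same retrieval pipeline can be exercised outside Streamlit, which is handy for debugging retrieval quality in isolation. The sketch below reuses the app's LangChain components and endpoint; the sample text and question are placeholders, and it makes the same assumption as the app that the Sutra endpoint serves embeddings:

```python
# rag_check.py - minimal, non-Streamlit sketch of the app's RAG pipeline (placeholder text/question)
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import RetrievalQA

load_dotenv()
api_key = os.getenv("SUTRA_API_KEY")

# Placeholder text standing in for content extracted from a PDF
text = "Paragraph one of a sample document.\n\nParagraph two with more detail."

# Same chunking and indexing steps as the app
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_text(text)
store = FAISS.from_texts(chunks, OpenAIEmbeddings(api_key=api_key, base_url="https://api.two.ai/v2"))

# Same RetrievalQA chain, without the streaming callback
llm = ChatOpenAI(api_key=api_key, base_url="https://api.two.ai/v2", model="sutra-v2", temperature=0.7)
qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=store.as_retriever())
print(qa_chain.invoke({"query": "What does the document describe?"})["result"])
```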
Running the App
- Ensure all dependencies are installed and the `.env` file contains a valid `SUTRA_API_KEY`.
- Save the code as `app.py`.
- Run `streamlit run app.py`.
- Open `http://localhost:8501` in your browser.
- Upload a PDF document via the sidebar.
- Ask a question about the document in the chat input, and receive a response based on the document's content.
Notes
- **API Key**: The app requires a valid Sutra API key. Obtain one from Two AI's API service and set it in the `.env` file.
- **Document Processing**: The app supports PDF files only. Large or complex PDFs may require additional error handling or chunking adjustments.
- **RAG Limitations**: The quality of answers depends on the document's content and the Sutra LLM's capabilities. Adjust `chunk_size` and `chunk_overlap` in `CharacterTextSplitter` for better performance if needed (see the sketch after this list).
- **Streaming**: The `StreamHandler` class ensures smooth streaming of responses, enhancing the user experience.
- **Styling**: The app uses minimal custom CSS and HTML for branding. Further styling can be added using Streamlit's `st.markdown` with CSS.
- **Dependencies**: Ensure `faiss-cpu` is used for local development. For GPU support, install `faiss-gpu` instead.
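For example, smaller chunks with heavier overlap can improve recall on dense documents. The values below are illustrative only, not recommendations:

```python
from langchain.text_splitter import CharacterTextSplitter

# Illustrative tuning: halve the chunk size and overlap relative to the app's defaults
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=100)

# Stand-in for text extracted from a PDF (paragraph breaks let the splitter work)
sample_text = "\n\n".join(f"Paragraph {i}: some extracted PDF text." for i in range(40))
documents = text_splitter.split_text(sample_text)
print(f"Created {len(documents)} chunks")
```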
Example Usage
- Upload a PDF document (e.g., a research paper) via the sidebar.
- Wait for the “Document processed successfully!” message.
- Type a question like “What is the main topic of the document?” in the chat input.
- The app retrieves relevant document chunks and responds with an answer generated by the Sutra LLM (e.g., “The document discusses advancements in AI for multilingual processing.”).
- The conversation is stored in `st.session_state.messages` and displayed in the chat interface.
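Each stored message is a plain dict with `role` and `content` keys, matching how the app appends them, so after the exchange above `st.session_state.messages` would hold something like:

```python
[
    {"role": "user", "content": "What is the main topic of the document?"},
    {"role": "assistant", "content": "The document discusses advancements in AI for multilingual processing."},
]
```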