SUTRA Document RAG Chatbot - Streamlit App with LangChain

The SUTRA Document RAG Chatbot is a Streamlit application that uses the SUTRA LLM through LangChain to provide Retrieval-Augmented Generation (RAG) over uploaded documents. Users upload a PDF, the app extracts and indexes its text, and the SUTRA LLM answers questions based on the document content.

Overview

The app uses PyPDF2 to extract text, LangChain for text splitting, embedding, and retrieval, FAISS as the vector store, and Streamlit for the UI. Responses stream token by token, and chat history is kept in Streamlit's session state.

Prerequisites

To run this Streamlit app, ensure you have the following installed:

  • Python (v3.9 or higher recommended; recent LangChain releases have dropped 3.8)
  • pip (Python package manager)
  • Streamlit (pip install streamlit)
  • LangChain (pip install langchain)
  • LangChain OpenAI (pip install langchain-openai)
  • LangChain Community (pip install langchain-community)
  • PyPDF2 (pip install PyPDF2)
  • FAISS (pip install faiss-cpu)
  • python-dotenv (pip install python-dotenv)
  • A valid Sutra API key (set in a .env file as SUTRA_API_KEY)
  • A modern web browser (e.g., Chrome, Firefox)
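
Before starting the app, it can help to confirm that the packages above are importable. The following is a minimal sketch using only the standard library; the helper name missing_modules and the REQUIRED_MODULES list are illustrative and not part of the app. Note that some import names differ from the pip package names.

```python
from importlib.util import find_spec

# Import names, which differ from some pip package names
# (faiss-cpu -> faiss, python-dotenv -> dotenv, langchain-openai -> langchain_openai).
REQUIRED_MODULES = [
    "streamlit",
    "langchain",
    "langchain_openai",
    "langchain_community",
    "PyPDF2",
    "faiss",
    "dotenv",
]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

Run it with the same Python interpreter that will run Streamlit, since packages installed in a different environment will be reported as missing.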

Setup Instructions

  1. Install Dependencies:

    pip install streamlit langchain langchain-openai langchain-community PyPDF2 faiss-cpu python-dotenv

  2. Set Up Environment Variables: Create a .env file in the project directory with the following content:

    SUTRA_API_KEY=your_sutra_api_key_here

  3. Save the Code: Save the provided code in a file named app.py.

  4. Run the Application:

    streamlit run app.py

    This will start the Streamlit server, and the app will be accessible at http://localhost:8501.
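
A missing or placeholder API key otherwise only surfaces when the first request fails. As a hedged sketch, a small check like the one below could run right after load_dotenv(); the function name require_api_key is hypothetical and does not appear in the app code.

```python
import os


def require_api_key(env=None, name="SUTRA_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # The placeholder string matches the example .env content in step 2.
    if not key or key == "your_sutra_api_key_here":
        raise RuntimeError(f"{name} is not set; add it to the .env file.")
    return key
```

Passing a plain dict as env makes the check easy to try without touching the real environment, for example require_api_key({"SUTRA_API_KEY": "abc123"}).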

Streamlit App Code

Below is the complete Python code for the Streamlit app, which implements the Document RAG Chatbot functionality.

import os

import PyPDF2
import streamlit as st
from dotenv import load_dotenv
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chains import RetrievalQA
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Load environment variables
load_dotenv()
api_key = os.getenv("SUTRA_API_KEY")

# Page configuration
st.set_page_config(
    page_title="Sutra Document RAG Chatbot",
    page_icon="📚",
    layout="wide"
)

# Streaming callback handler
class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs):
        self.text += token
        self.container.markdown(self.text)

# Initialize the ChatOpenAI model - base instance for caching
@st.cache_resource
def get_base_chat_model():
    return ChatOpenAI(
        api_key=os.getenv("SUTRA_API_KEY"),
        base_url="https://api.two.ai/v2",
        model="sutra-v2",
        temperature=0.7,
    )

# Create a streaming version of the model with callback handler
def get_streaming_chat_model(callback_handler=None):
    return ChatOpenAI(
        api_key=os.getenv("SUTRA_API_KEY"),
        base_url="https://api.two.ai/v2",
        model="sutra-v2",
        temperature=0.7,
        streaming=True,
        callbacks=[callback_handler] if callback_handler else None
    )

# Sidebar for document upload
st.sidebar.image("https://framerusercontent.com/images/3Ca34Pogzn9I3a7uTsNSlfs9Bdk.png", use_container_width=True)
with st.sidebar:
    st.title("📚 Sutra RAG Chatbot")
    uploaded_file = st.file_uploader("Upload a PDF document", type="pdf")
    st.divider()
    st.markdown("### About Sutra RAG Chatbot")
    st.markdown("This app uses Retrieval-Augmented Generation to answer questions based on uploaded PDF documents using the Sutra LLM.")

# Main app title
st.markdown(
    '<h1><img src="https://framerusercontent.com/images/9vH8BcjXKRcC5OrSfkohhSyDgX0.png" width="60"/> Sutra Document RAG Chatbot 🤖</h1>',
    unsafe_allow_html=True
)

# Initialize session state for messages and vector store
if "messages" not in st.session_state:
    st.session_state.messages = []
if "vector_store" not in st.session_state:
    st.session_state.vector_store = None

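# Process uploaded PDF once per file; Streamlit reruns this script on every interaction
file_id = (uploaded_file.name, uploaded_file.size) if uploaded_file is not None else None
if uploaded_file is not None and st.session_state.get("processed_file") != file_id:
    try:
        # Read and extract text from PDF
        pdf_reader = PyPDF2.PdfReader(uploaded_file)
        text = ""
        for page in pdf_reader.pages:
            text += page.extract_text() or ""
        if not text.strip():
            raise ValueError("no extractable text found (the PDF may contain only scanned images)")

        # Split text into chunks
        text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
        documents = text_splitter.split_text(text)

        # Create embeddings and vector store
        embeddings = OpenAIEmbeddings(api_key=os.getenv("SUTRA_API_KEY"), base_url="https://api.two.ai/v2")
        st.session_state.vector_store = FAISS.from_texts(documents, embeddings)
        st.session_state.processed_file = file_id  # skip re-embedding on later reruns

        st.success("Document processed successfully!")
    except Exception as e:
        st.error(f"Error processing PDF: {str(e)}")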

# Display chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# Chat input
user_input = st.chat_input("Ask a question about the document...")

# Process user input
if user_input and st.session_state.vector_store is not None:
    # Add user message to chat
    st.session_state.messages.append({"role": "user", "content": user_input})

    # Display user message
    with st.chat_message("user"):
        st.write(user_input)

    # Generate response
    try:
        # Create message placeholder
        with st.chat_message("assistant"):
            response_placeholder = st.empty()

            # Create a stream handler
            stream_handler = StreamHandler(response_placeholder)

            # Get streaming model with handler
            chat = get_streaming_chat_model(stream_handler)

            # Set up RetrievalQA chain
            qa_chain = RetrievalQA.from_chain_type(
                llm=chat,
                chain_type="stuff",
                retriever=st.session_state.vector_store.as_retriever()
            )

            # Generate response using RAG
            response = qa_chain.invoke({"query": user_input})
            answer = response["result"]

            # Render the final answer in case no tokens were streamed
            response_placeholder.markdown(answer)

            # Add assistant response to chat history
            st.session_state.messages.append({"role": "assistant", "content": answer})

    except Exception as e:
        st.error(f"Error: {str(e)}")
        if "API key" in str(e):
            st.error("Please check your Sutra API key in the environment variables.")
elif user_input and st.session_state.vector_store is None:
    st.error("Please upload a PDF document first.")
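
The streaming pattern used by StreamHandler can be tried without Streamlit or LangChain. In the sketch below, FakeContainer and TokenAccumulator are hypothetical stand-ins for the st.empty() placeholder and for StreamHandler; the token list is made up for illustration.

```python
class FakeContainer:
    """Stand-in for a Streamlit placeholder; records each render call."""

    def __init__(self):
        self.renders = []

    def markdown(self, text):
        self.renders.append(text)


class TokenAccumulator:
    """Mirrors StreamHandler's logic without the LangChain base class."""

    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token, **kwargs):
        # Append the token, then re-render the full text received so far.
        self.text += token
        self.container.markdown(self.text)


container = FakeContainer()
handler = TokenAccumulator(container)
for token in ["RAG ", "combines ", "retrieval ", "and ", "generation."]:
    handler.on_llm_new_token(token)

# Each render contains everything received up to that point.
print(container.renders[-1])
```

Because the placeholder is overwritten with the whole accumulated string on every token, the user sees the answer grow in place rather than as a series of separate fragments.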

Functionality

  • Document Upload: A sidebar file uploader allows users to upload PDF documents for processing.
  • Text Extraction and Indexing: The app uses PyPDF2 to extract text from PDFs, CharacterTextSplitter to chunk the text, and FAISS with OpenAIEmbeddings to create a vector store for retrieval.
  • RAG Pipeline: The app uses LangChain’s RetrievalQA chain to retrieve relevant document chunks and generate answers using the Sutra LLM (sutra-v2 model).
  • Chat Interface: Users enter questions in an st.chat_input field, and each turn is displayed with st.chat_message for a clean chat UI.
  • Streaming Responses: A custom StreamHandler class streams responses from the Sutra LLM, providing a dynamic typing effect.
  • Session State: Persists chat history and the vector store using st.session_state.
  • Error Handling: Displays errors for invalid API keys, PDF processing issues, or missing document uploads.
  • Styling: Uses custom HTML for the title with an icon and includes a logo in the sidebar for branding.
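
To make the retrieve-then-generate flow concrete, here is a toy sketch in plain Python. It is not how FAISS or RetrievalQA are implemented: the three-dimensional vectors, the cosine-similarity ranking, the prompt wording, and the function names retrieve and build_prompt are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vector, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, chunks):
    """'Stuff' every retrieved chunk into one prompt for a single LLM call."""
    context = "\n\n".join(chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"


# Toy vectors; a real embedding model produces much larger ones.
indexed_chunks = [
    ("The paper studies multilingual AI models.", [0.9, 0.1, 0.0]),
    ("The invoice total is 42 USD.", [0.0, 0.2, 0.9]),
    ("Multilingual models are evaluated on translation.", [0.8, 0.3, 0.1]),
]
query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the question

top_chunks = retrieve(query_vector, indexed_chunks, k=2)
prompt = build_prompt("What is the main topic of the document?", top_chunks)
print(prompt)
```

The unrelated invoice chunk is ranked last and never reaches the prompt, which is the point of retrieval: the LLM only sees the passages most likely to contain the answer.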

Running the App

  1. Ensure all dependencies are installed and the .env file contains a valid SUTRA_API_KEY.
  2. Save the code as app.py.
  3. Run streamlit run app.py.
  4. Open http://localhost:8501 in your browser.
  5. Upload a PDF document via the sidebar.
  6. Ask a question about the document in the chat input, and receive a response based on the document’s content.

Notes

  • API Key: The app requires a valid Sutra API key. Obtain one from Two AI's API service and set it in the .env file.
  • Document Processing: The app supports PDF files only. Large or complex PDFs may require additional error handling or chunking adjustments.
  • RAG Limitations: The quality of answers depends on the document’s content and the Sutra LLM’s capabilities. If answers miss relevant passages, adjust chunk_size and chunk_overlap in CharacterTextSplitter: smaller chunks make retrieval more precise but carry less context each, and a larger overlap reduces the chance of splitting a relevant passage across two chunks.
  • Streaming: The StreamHandler class appends each new token to the text received so far and re-renders the placeholder, so the answer appears incrementally instead of all at once.
  • Styling: The app uses a small amount of inline HTML for branding. Further styling can be added using Streamlit’s st.markdown with CSS.
  • Dependencies: faiss-cpu is sufficient for local development. For GPU-accelerated search, install faiss-gpu instead.
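
The effect of chunk_size and chunk_overlap can be seen with a simplified fixed-window splitter. This is only a sketch: CharacterTextSplitter splits on a separator and then merges the pieces, so its chunk boundaries will differ, and the function name chunk_text is hypothetical.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for size, overlap in [(10, 2), (10, 5)]:
    pieces = chunk_text(sample, chunk_size=size, chunk_overlap=overlap)
    print(size, overlap, len(pieces), pieces)
```

With the same chunk size, raising the overlap from 2 to 5 increases the number of chunks from 3 to 5, which means more embeddings to compute and store in exchange for fewer passages cut at a boundary.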

Example Usage

  1. Upload a PDF document (e.g., a research paper) via the sidebar.
  2. Wait for the “Document processed successfully!” message.
  3. Type a question like “What is the main topic of the document?” in the chat input.
  4. The app retrieves relevant document chunks and responds with an answer generated by the Sutra LLM (e.g., “The document discusses advancements in AI for multilingual processing.”).
  5. The conversation is stored in st.session_state.messages and displayed in the chat interface.
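
The shape of the stored conversation can be sketched with a plain dict standing in for st.session_state. The helper names add_message and render_history are hypothetical; the app itself appends to the list directly and replays it with st.chat_message on each rerun.

```python
def add_message(state, role, content):
    """Append one chat turn, creating the history list on first use."""
    state.setdefault("messages", []).append({"role": role, "content": content})


def render_history(state):
    """Return the lines a chat UI would replay on a rerun."""
    return [f'{m["role"]}: {m["content"]}' for m in state.get("messages", [])]


state = {}  # plain dict standing in for st.session_state
add_message(state, "user", "What is the main topic of the document?")
add_message(
    state,
    "assistant",
    "The document discusses advancements in AI for multilingual processing.",
)
print("\n".join(render_history(state)))
```

Each entry is a dict with a role and a content key, which is the structure the display loop in the app reads back.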

Resources