Create a Langchain app with multiple vector stores the easy way

Representation of Langchain using multiple vector stores for RAG

Learn, in simple steps, how to create an LLM app using Langchain and Streamlit with multiple vector stores for RAG use cases.


Imagine you are creating a SaaS RAG application that allows people to analyze and ask questions about their own documents. Very soon you will reach the point where you need to separate the documents into topics or domains. At that point, you will have to question the architecture of your application and wonder whether you need multiple vector stores.

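This blog post shows how to integrate multiple vector stores and how to use them at the same time. More precisely, it covers what a multi vector store strategy is, how to initialize the work environment, how to modify the code for a second vector store index, and the limits and possible improvements.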

Let’s begin then.


What is a multi vector store strategy?

A vector store for large language models (LLMs) is essentially a database or storage system designed to efficiently store and retrieve vectors. Basically, it allows you to retrieve the data that is semantically closest to the question you give.
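
To make the idea concrete, here is a minimal sketch of what a vector store does under the hood, in plain Python. The tiny hand-written vectors and the `ToyVectorStore` class are assumptions made for illustration; a real store such as FAISS works on embeddings produced by a model and uses much faster search structures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, text) pairs and ranks them."""

    def __init__(self):
        self.items = []

    def add(self, vector, text):
        self.items.append((vector, text))

    def search(self, query_vector, k=1):
        # Sort stored texts from most to least similar to the query vector.
        ranked = sorted(
            self.items,
            key=lambda item: cosine_similarity(query_vector, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
# Made-up 2D "embeddings": first axis ~ energy, second axis ~ politics.
store.add([0.9, 0.1], "Solar panels convert sunlight into electricity.")
store.add([0.1, 0.9], "The president addressed Congress last night.")

closest = store.search([0.8, 0.2], k=1)
print(closest)
```

The query vector points mostly along the "energy" axis, so the energy sentence is ranked first.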

But very soon, you will see that you need to separate your data into logical groupings like topics or domains. At this point, you will need different vector stores or different indexes inside your vector store.

To be more precise, the vector store is the specialized database system that handles storage and retrieval, and the indexes are the separate, optimized stores of your data, each independent from the others.

When you have multiple indexes and want to query more than one at the same time, there are some points to be careful about.

In short, the relevance of the data retrieved from your indexes is the key.
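
One way to keep relevance in check when querying several indexes is to merge the scored results and keep only the best ones. The sketch below assumes each index returns `(text, score)` pairs where a higher score means more relevant, and that scores are comparable across indexes. The function name `merge_by_relevance` and the `min_score` threshold are hypothetical; they are not part of Langchain or FAISS.

```python
def merge_by_relevance(results_per_index, k=3, min_score=0.5):
    """Merge scored hits from several indexes and keep the k most relevant.

    results_per_index maps an index name to a list of (text, score) pairs.
    Assumption: higher score = more relevant, and scores are comparable
    across indexes (e.g. same embedding model and same similarity metric).
    """
    merged = []
    for index_name, hits in results_per_index.items():
        for text, score in hits:
            if score >= min_score:  # drop weak matches from any index
                merged.append((score, index_name, text))
    # Best matches first, regardless of which index they came from.
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return [(index_name, text) for _, index_name, text in merged[:k]]


results = {
    "speeches": [("Text about the economy", 0.82), ("Text about sports", 0.30)],
    "energy": [("Text about wind farms", 0.91), ("Text about solar", 0.64)],
}
top = merge_by_relevance(results, k=2)
print(top)
```

Keeping the index name next to each chunk also makes it possible to tell the model where each piece of context comes from.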

Initialize the work environment

We will use the same setup as the previous post, meaning Streamlit, Langchain, the FAISS vector store, and Pipenv to manage the virtual environment. For better readability, I will create a new folder called RAG-pipeline-multi-vector-store-langchain-app and copy into it all the files I need from the previous post (and rename some).

cp -R RAG-pipeline-langchain-openai RAG-pipeline-multi-vector-store-langchain-app
cd RAG-pipeline-multi-vector-store-langchain-app

Now we need to install the Pipenv virtual environment:

pipenv install

You can now check that the web app is launching with this:

pipenv run streamlit run

Modify the code for a second vector store index

OK, now let’s add the code needed to upload another file in the app. This file will become our second vector store index.

First let’s change the prompt to accept another index:

template = """Answer the question based only on the following contexts:
{context1}

{context2}

Question: {question}
"""

Here we have called the data retrieved from the 2 indexes context1 and context2, but this naming can be improved.
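
As an illustration of how the naming could be improved, the placeholders can describe what each index holds. The names `speeches_context` and `energy_context` below are hypothetical, as are the two section labels, and plain `str.format` is used only to show how the template gets filled. If you rename the placeholders, the keys used when building the chain have to be renamed to match.

```python
# Hypothetical placeholder names that describe the content of each index.
template = """Answer the question based only on the following contexts:

Political speeches:
{speeches_context}

Clean energy documents:
{energy_context}

Question: {question}
"""

# Plain str.format stands in for the prompt template to show the filled result.
filled = template.format(
    speeches_context="(chunks retrieved from the first index)",
    energy_context="(chunks retrieved from the second index)",
    question="What was said about clean energy?",
)
print(filled)
```

Labelling each context inside the prompt gives the model a hint about which source a chunk comes from.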

Now let’s add 2 distinct components to accept 2 files and generate the 2 vector store indexes using FAISS:

first_retriever = None
second_retriever = None

st.title("First vector store index")

first_index_uploaded_file = st.file_uploader("Choose a text file", type="txt", key="first_index")

if first_index_uploaded_file is not None:
    string_data = first_index_uploaded_file.getvalue().decode("utf-8")

    splitted_data = string_data.split("\n\n")

    # Embed each chunk and store the vectors in a FAISS index
    first_vectorstore = FAISS.from_texts(splitted_data, embedding=embedding)
    first_retriever = first_vectorstore.as_retriever()

st.title("Second vector store index")

second_vector_uploaded_file = st.file_uploader("Choose a text file", type="txt", key="second_index")

if second_vector_uploaded_file is not None:
    string_data = second_vector_uploaded_file.getvalue().decode("utf-8")

    splitted_data = string_data.split("\n\n")

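    # Embed each chunk and store the vectors in a second, independent FAISS index
    second_vectorstore = FAISS.from_texts(splitted_data, embedding=embedding)
    second_retriever = second_vectorstore.as_retriever()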

This is still really simple: we just duplicated the first component and changed some variable names.
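
If the duplication becomes a problem, the shared logic can move into a helper. The sketch below only covers the decoding and splitting step so that it runs without Streamlit or an API key. The function name `split_uploaded_text` is hypothetical, and dropping empty chunks is an extra step that the code above does not perform.

```python
def split_uploaded_text(raw_bytes, separator="\n\n"):
    """Decode an uploaded text file and split it into paragraph chunks.

    raw_bytes is what Streamlit's uploaded_file.getvalue() returns.
    Empty chunks are dropped (an extra step, not in the original code).
    """
    text = raw_bytes.decode("utf-8")
    chunks = [chunk.strip() for chunk in text.split(separator)]
    return [chunk for chunk in chunks if chunk]


# The same helper can serve both uploaders.
first_chunks = split_uploaded_text(b"First paragraph.\n\nSecond paragraph.\n\n")
second_chunks = split_uploaded_text(b"Only one paragraph.")
print(first_chunks, second_chunks)
```

Each list of chunks can then be passed to `FAISS.from_texts` to build its own index.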

Now let’s add the chain that will generate the answer:

if first_retriever is not None and second_retriever is not None:
    chain = (
        {"context1": first_retriever,
         "context2": second_retriever,
         "question": RunnablePassthrough()}
        | prompt
        | model
        | StrOutputParser()
    )

    question = st.text_input("Input your question for the uploaded document")

    # Only call the model once the user has typed a question
    if question:
        result = chain.invoke(question)
        st.write(result)


As you can see, we added the 2 contexts from the retrievers and passed them through the chain. Simple, right?
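
To see what the dictionary at the start of the chain does, here is a plain-Python sketch of the same data flow, without Langchain. The `fan_out` function and the two fake retrievers are hypothetical stand-ins: in the real chain, each retriever receives the question and returns documents, while `RunnablePassthrough()` forwards the question unchanged.

```python
def fan_out(question, steps):
    """Call every step with the same input and collect results by key."""
    return {key: step(question) for key, step in steps.items()}


# Fake retrievers: each returns the "documents" its index would find.
def first_retriever(question):
    return ["chunk from the first index"]


def second_retriever(question):
    return ["chunk from the second index"]


def passthrough(value):
    # Stand-in for RunnablePassthrough(): returns its input unchanged.
    return value


prompt_inputs = fan_out(
    "What is said about clean energy?",
    {
        "context1": first_retriever,
        "context2": second_retriever,
        "question": passthrough,
    },
)
print(prompt_inputs)
```

The resulting dictionary has exactly the keys the prompt template expects, which is why the placeholder names and the dictionary keys must match.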

Here’s the full code in case you missed something:

import os

from dotenv import load_dotenv
import streamlit as st
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Load the variables from .env
load_dotenv()

st.title("Hello, Metadocs readers!")

template = """Answer the question based only on the following contexts:
{context1}

{context2}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(temperature=0, model_name="gpt-4", openai_api_key=os.environ["OPENAI_KEY"])
embedding = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_KEY"])
first_retriever = None
second_retriever = None

st.title("First vector store index")

first_index_uploaded_file = st.file_uploader("Choose a text file", type="txt", key="first_index")

if first_index_uploaded_file is not None:
    string_data = first_index_uploaded_file.getvalue().decode("utf-8")

    splitted_data = string_data.split("\n\n")

    # Embed each chunk and store the vectors in a FAISS index
    first_vectorstore = FAISS.from_texts(splitted_data, embedding=embedding)
    first_retriever = first_vectorstore.as_retriever()

st.title("Second vector store index")

second_vector_uploaded_file = st.file_uploader("Choose a text file", type="txt", key="second_index")

if second_vector_uploaded_file is not None:
    string_data = second_vector_uploaded_file.getvalue().decode("utf-8")

    splitted_data = string_data.split("\n\n")

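    # Embed each chunk and store the vectors in a second, independent FAISS index
    second_vectorstore = FAISS.from_texts(splitted_data, embedding=embedding)
    second_retriever = second_vectorstore.as_retriever()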

if first_retriever is not None and second_retriever is not None:
    chain = (
        {"context1": first_retriever,
         "context2": second_retriever,
         "question": RunnablePassthrough()}
        | prompt
        | model
        | StrOutputParser()
    )

    question = st.text_input("Input your question for the uploaded document")

    # Only call the model once the user has typed a question
    if question:
        result = chain.invoke(question)
        st.write(result)


Now let’s try this new RAG app. Launch the app using the following command:

pipenv run streamlit run

A new tab should open in your browser with your new app. You can use the following files for the test: state_of_the_unions.txt and generated_clean_energy_discourse.txt (you can also find them in the Git repo of this tutorial).
You can use the following question to test your app:

Here’s what you should have then:

As you can see, the app is capable of answering questions and works very well. It even refuses to answer when the question is not related to any of the given documents.

Congratulations, you now have a nice multi vector store index app on your hands. This is really a toy app, but it can be so much more, and we will see some tips on how to make it better.

Limits and improvements


As you can see, this is really a tutorial project and there is so much more to do, but I think this is a good enough beginning.
RAG is extremely complex because you are going to use real-world data, which is inherently messy and hard to understand. This is exactly why it can provide so much value. It is a very deep subject, so it is perfectly normal if this takes time. Enjoy the learning!


I hope this tutorial helped you and taught you many things. I will update this post with more nuggets from time to time. Don’t forget to check my other posts, as I write a lot of cool posts on practical stuff in AI.

Cheers!

