Building a knowledge base with OpenAI, LangChain, OpenSearch, and Unstructured

We’ve designed this project to help Voiceflow users build a knowledge base using custom APIs. The project uses OpenAI, LangChain, Redis, OpenSearch, and Unstructured to fetch content from various sources such as URLs, sitemaps, text, PDFs, PowerPoints, Notion docs (Markdown), and even images (via OCR).

These sources of information are then turned into embeddings/vectors and saved in a local OpenSearch database. This knowledge base can then be used to generate context and answer questions. And the best part? Because it’s an API, you can use it within your Voiceflow Assistant with the help of the API Step.
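
To make the embeddings idea concrete, here is a minimal sketch of how vector similarity can pick the most relevant chunks as context. In the project itself, OpenAI generates the vectors and OpenSearch stores and searches them; the tiny three-dimensional vectors, the `StoredChunk` type, and the choice of cosine similarity below are assumptions made purely for illustration.

```typescript
// Toy illustration of embedding-based retrieval. The vectors are made up;
// real embeddings have hundreds or thousands of dimensions.
interface StoredChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: close to 1 means the vectors point the same way,
// close to 0 means they are unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, ai, i) => sum + ai * (b[i] ?? 0), 0);
  const norm = (v: number[]): number => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Return the k chunks whose embeddings are most similar to the query embedding.
function topKChunks(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}

const sampleChunks: StoredChunk[] = [
  { text: "Pricing plans", embedding: [0.9, 0.1, 0.0] },
  { text: "API Step usage", embedding: [0.1, 0.9, 0.2] },
  { text: "Team billing", embedding: [0.8, 0.2, 0.1] },
];

// A query vector pointing mostly along the first axis, like the two
// pricing-related chunks above.
const retrievedTexts = topKChunks([1, 0, 0], sampleChunks, 2).map((c) => c.text);
console.log(retrievedTexts);
```

The selected chunks are what would be handed to the language model as context for answering a question.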

You’ll need Node.js 18+ to run this code. You can download it from nodejs.org.

You'll also need to have Docker Compose installed.

To get started, copy the `.env` file and set up required environment variables:

To create the containers, install the required dependencies, and launch the server, run:

This will create the following containers:

  • Redis (cache)
  • Unstructured (handles images, PPT, text, markdown)
  • OpenSearch (search engine)
  • OpenSearch-dashboards (search engine dashboard)

The OpenSearch dashboard can be accessed at http://localhost:5601.

API documentation

There are several API endpoints available for various tasks:

Add content to OpenSearch: `POST /api/add`

Get a response using a live webpage as context: `POST /api/live`
Get a response using the vector store: `POST /api/question`

Clear Redis cache: `GET /api/clearcache`
Delete a collection: `DELETE /api/collection`
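
As a rough sketch, the endpoints above can be wrapped in a small typed helper. The methods and paths come from the list; the base URL `http://localhost:3000`, the helper names, and the assumption that every endpoint answers with JSON are illustrative guesses, so adjust them to match your setup.

```typescript
// Assumed base URL: use whatever host and port your server listens on.
const KB_BASE_URL = "http://localhost:3000";

type KbAction = "add" | "live" | "question" | "clearcache" | "collection";
type KbMethod = "GET" | "POST" | "DELETE";

// Methods and paths as listed in the API overview.
const KB_ENDPOINTS: Record<KbAction, { method: KbMethod; path: string }> = {
  add: { method: "POST", path: "/api/add" },
  live: { method: "POST", path: "/api/live" },
  question: { method: "POST", path: "/api/question" },
  clearcache: { method: "GET", path: "/api/clearcache" },
  collection: { method: "DELETE", path: "/api/collection" },
};

interface PreparedRequest {
  url: string;
  method: KbMethod;
  headers: Record<string, string>;
  body?: string;
}

// Build the request without sending it, so it can be inspected or logged first.
function prepareKbRequest(action: KbAction, payload?: unknown): PreparedRequest {
  const { method, path } = KB_ENDPOINTS[action];
  const prepared: PreparedRequest = { url: KB_BASE_URL + path, method, headers: {} };
  // GET requests cannot carry a body, so a payload is attached only otherwise.
  if (method !== "GET" && payload !== undefined) {
    prepared.headers["Content-Type"] = "application/json";
    prepared.body = JSON.stringify(payload);
  }
  return prepared;
}

// Send a prepared request with the fetch API built into Node.js 18+.
// Parsing the response as JSON is an assumption about the server.
async function sendKbRequest(prepared: PreparedRequest): Promise<unknown> {
  const { url, ...init } = prepared;
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`${init.method} ${url} failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the request-building step separate from the network call makes it easy to check what will be sent before pointing the helper at a running server.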

Find more detailed API documentation in our repo.

Using live data

You can also use the `/api/live` endpoint to get a response using a live webpage as context without vectorizing the content.
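
Here is a minimal sketch of what a call to `/api/live` might look like from Node.js 18+. The request fields `url` and `question`, the base URL, the 30-second timeout, and the JSON response are all assumptions for illustration; check the API documentation for the actual request schema.

```typescript
// Hypothetical request shape: field names are assumptions, not the documented schema.
interface LiveQuery {
  url: string;
  question: string;
}

// Build the JSON body, validating the page URL before anything is sent.
function buildLiveRequestBody(query: LiveQuery): string {
  // new URL() throws a TypeError on malformed input.
  const pageUrl = new URL(query.url);
  return JSON.stringify({ url: pageUrl.toString(), question: query.question.trim() });
}

// Ask a question about a live page. Fetching and reading a page can be slow,
// so the request is aborted if it exceeds timeoutMs.
async function askLivePage(
  query: LiveQuery,
  baseUrl = "http://localhost:3000", // assumed host and port
  timeoutMs = 30000,
): Promise<unknown> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(`${baseUrl}/api/live`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: buildLiveRequestBody(query),
      signal: controller.signal,
    });
    if (!response.ok) {
      throw new Error(`POST /api/live failed with status ${response.status}`);
    }
    return await response.json();
  } finally {
    clearTimeout(timer);
  }
}
```

With the server running, a call such as `askLivePage({ url: "https://example.com/pricing", question: "What plans are available?" })` would return whatever the endpoint responds with.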

Using the knowledge base

Once you've added content to OpenSearch, you can use the `/api/question` endpoint to get answers based on your knowledge base.
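
The add-then-ask flow can be sketched as two sequential requests. The payload fields `url` and `question`, the base URL, and the assumption that `/api/add` has finished indexing by the time it responds are illustrative guesses rather than documented behavior.

```typescript
// One step of the flow: which endpoint to call and with what JSON payload.
interface KbStep {
  path: string;
  payload: Record<string, string>;
}

// Plan the two calls: index a source first, then query the vector store.
// Field names are hypothetical; check the API documentation for the real ones.
function planAddThenAsk(sourceUrl: string, question: string): KbStep[] {
  return [
    { path: "/api/add", payload: { url: sourceUrl } },
    { path: "/api/question", payload: { question } },
  ];
}

// Run the steps in order and return the parsed response of the last one.
async function runKbSteps(
  steps: KbStep[],
  baseUrl = "http://localhost:3000", // assumed host and port
): Promise<unknown> {
  let lastResult: unknown = undefined;
  for (const step of steps) {
    const response = await fetch(baseUrl + step.path, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(step.payload),
    });
    if (!response.ok) {
      throw new Error(`POST ${step.path} failed with status ${response.status}`);
    }
    lastResult = await response.json();
  }
  return lastResult;
}
```

In practice you would usually add content once and then send many questions, so the `/api/add` step only needs to run when your sources change.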

Now what?

You can now set up your knowledge base and call it from the API Step in your Voiceflow Assistants to answer your users’ questions.
