Prompt Chain

Augment GPT-4 with External Knowledge

Name: Augment GPT-4 with External Knowledge
Availability: OnlineOnly
Author: OpenAI Cookbook

A prompt workflow that connects GPT-4 to Pinecone vector database to retrieve relevant context from LangChain documentation, reducing hallucinations by

Copy chain

Works with openaipineconelangchain

OpenAI Cookbook

Maintainer?

Spark score

out of 100

Updated 3 months ago

Version 1.0.0

Models

gpt 4ogpt 4

Add to Favorites

Why it matters

Enhance GPT-4's responses by retrieving relevant information from a Pinecone vector database, reducing hallucinations and grounding answers in factual data.

Outcomes

What it gets done

Scrape and process documentation from web sources.

Embed text data into vectors using OpenAI's embedding models.

Index embeddings in a Pinecone vector database for efficient retrieval.

Query Pinecone for relevant context and feed it to GPT-4 for augmented generation.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/oai-gpt4retrievalaugmentation | bash

Steps

Steps in the chain

Preparing the Data

Download the LangChain docs from langchain.readthedocs.io/. Get all .html files located on the site and download them into the `rtdocs` directory. Use LangChain's `ReadTheDocsLoader` to process these docs into hundreds of processed doc pages.

Process Documents into Chunks

Chunk the processed documents into ~400 token chunks using langchain and tiktoken. Create a data list containing the plaintext page content and source information from each document.

Initialize Embedding Model

Use `text-embedding-3-small` as the embedding model to embed text. Apply this embedding logic to the langchain docs dataset. Each vector embedding will contain 1536 dimensions (the output dimensionality of the `text-embedding-3-small` model).

Initialize Pinecone Index

Get a free API key from Pinecone and initialize your connection to Pinecone. Create a new index to store the embeddings and enable efficient vector search through them.

Populate Index with Embeddings

Populate the Pinecone index with OpenAI `text-embedding-3-small` built embeddings of all langchain docs. This adds all documents to the index for later retrieval.

Retrieval

Create a query vector `xq` from your search query. Use `xq` to retrieve the most relevant chunks from the LangChain docs stored in Pinecone.

Retrieval Augmented Generation with GPT-4

Pass the retrieved document chunks into GPT-4 via the `ChatCompletions` endpoint. Add the retrieved information into the model by passing it into user prompts alongside the original query to generate answers backed by real data sources.

Overview

Retrieval Augmentation for GPT-4 using Pinecone

What it does

This workflow connects GPT-4 to a Pinecone vector database to implement retrieval-augmented generation.

How it connects

Use this when you need GPT-4 to answer questions about specific documentation or knowledge bases where accuracy matters and hallucinations must be minimized. It's ideal for technical documentation Q&A, customer support systems, or any scenario where responses must be grounded in verifiable, up-to-date source material that you control.

Source README

Retrieval Augmentation for GPT-4 using Pinecone

Fixing LLMs that Hallucinate

In this notebook we will learn how to query relevant contexts to our queries from Pinecone, and pass these to a GPT-4 model to generate an answer backed by real data sources.

GPT-4 is a big step up from previous OpenAI completion models. It also exclusively uses the ChatCompletion endpoint, so we must use it in a slightly different way to usual. However, the power of the model makes the change worthwhile, particularly when augmented with an external knowledge base like the Pinecone vector database.

Required installs for this notebook are:

Preparing the Data

In this example, we will download the LangChain docs from langchain.readthedocs.io/. We get all .html files located on the site like so:

This downloads all HTML into the rtdocs directory. Now we can use LangChain itself to process these docs. We do this using the ReadTheDocsLoader like so:

This leaves us with hundreds of processed doc pages. Let's take a look at the format each one contains:

We access the plaintext page content like so:

We can also find the source of each document:

We can use these to create our data list:

It's pretty ugly but it's good enough for now. Let's see how we can process all of these. We will chunk everything into ~400 token chunks, we can do this easily with langchain and tiktoken:

Process the data into more chunks using this approach.

Our chunks are ready so now we move onto embedding and indexing everything.

Initialize Embedding Model

We use text-embedding-3-small as the embedding model. We can embed text like so:

In the response res we will find a JSON-like object containing our new embeddings within the 'data' field.

Inside 'data' we will find two records, one for each of the two sentences we just embedded. Each vector embedding contains 1536 dimensions (the output dimensionality of the text-embedding-3-small model.

We will apply this same embedding logic to the langchain docs dataset we've just scraped. But before doing so we must create a place to store the embeddings.

Initializing the Index

Now we need a place to store these embeddings and enable a efficient vector search through them all. To do that we use Pinecone, we can get a free API key and enter it below where we will initialize our connection to Pinecone and create a new index.

We can see the index is currently empty with a total_vector_count of 0. We can begin populating it with OpenAI text-embedding-3-small built embeddings like so:

Now we've added all of our langchain docs to the index. With that we can move on to retrieval and then answer generation using GPT-4.

Retrieval

To search through our documents we first need to create a query vector xq. Using xq we will retrieve the most relevant chunks from the LangChain docs, like so:

With retrieval complete, we move on to feeding these into GPT-4 to produce answers.

Retrieval Augmented Generation

GPT-4 is currently accessed via the ChatCompletions endpoint of OpenAI. To add the information we retrieved into the model, we need to pass it into our user prompts alongside our original query. We can do that like so:

Now we ask the question:

To display this response nicely, we will display it in markdown.

Let's compare this to a non-augmented query...

If we drop the "I don't know" part of the primer?

Step 1: Preparing the Data

Download the LangChain docs from langchain.readthedocs.io/. Get all .html files located on the site and download them into the `rtdocs` directory. Use LangChain's `ReadTheDocsLoader` to process these docs into hundreds of processed doc pages.

Step 2: Process Documents into Chunks

Chunk the processed documents into ~400 token chunks using langchain and tiktoken. Create a data list containing the plaintext page content and source information from each document.

Step 3: Initialize Embedding Model

Use `text-embedding-3-small` as the embedding model to embed text. Apply this embedding logic to the langchain docs dataset. Each vector embedding will contain 1536 dimensions (the output dimensionality of the `text-embedding-3-small` model).

Step 4: Initialize Pinecone Index

Get a free API key from Pinecone and initialize your connection to Pinecone. Create a new index to store the embeddings and enable efficient vector search through them.

Step 5: Populate Index with Embeddings

Populate the Pinecone index with OpenAI `text-embedding-3-small` built embeddings of all langchain docs. This adds all documents to the index for later retrieval.

Step 6: Retrieval

Create a query vector `xq` from your search query. Use `xq` to retrieve the most relevant chunks from the LangChain docs stored in Pinecone.

Step 7: Retrieval Augmented Generation with GPT-4

Pass the retrieved document chunks into GPT-4 via the `ChatCompletions` endpoint. Add the retrieved information into the model by passing it into user prompts alongside the original query to generate answers backed by real data sources.

Discussion

Augment GPT-4 with External Knowledge

What it gets done

Add it to your toolbox

Steps in the chain

Retrieval Augmentation for GPT-4 using Pinecone

What it does

How it connects

Retrieval Augmentation for GPT-4 using Pinecone

Fixing LLMs that Hallucinate

Preparing the Data

Initialize Embedding Model

Initializing the Index

Retrieval

Retrieval Augmented Generation

Step 1: Preparing the Data

Step 2: Process Documents into Chunks

Step 3: Initialize Embedding Model

Step 4: Initialize Pinecone Index

Step 5: Populate Index with Embeddings

Step 6: Retrieval

Step 7: Retrieval Augmented Generation with GPT-4

Questions & comments · 0