Prompt Chain

Augment Search with AI for Smarter Answers

Name: Augment Search with AI for Smarter Answers
Availability: OnlineOnly
Author: OpenAI Cookbook

Enhance search with AI: generate queries, re-rank results using embeddings, and get cited answers. Integrates with any search API.

Works with: Slack, Elasticsearch

Updated 3 months ago

Version 1.0.0

Models

GPT-4o, Gemini 2.0

Why it matters

Enhance existing search systems by integrating AI to improve the relevance and quality of search results, leading to more accurate and context-aware answers.

Outcomes

What it gets done

Generate diverse search queries from user questions.

Execute searches in parallel across multiple sources.

Re-rank search results using semantic similarity to a hypothetical ideal answer.

Synthesize top results into a concise, referenced answer.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/oai-questionansweringusingasearchapi | bash

Steps

Steps in the chain

Step 1: Search

User asks a question. GPT generates a list of potential queries. Search queries are executed in parallel.

Step 2: Re-rank

Embeddings for each result are used to calculate semantic similarity to a generated hypothetical ideal answer to the user question. Results are ranked and filtered based on this similarity metric.

Step 3: Answer

Given the top search results, the model generates an answer to the user's question, including references and links.

Overview

Question answering using a search API and re-ranking

What it does

This prompt chain enhances search by generating potential queries from a user's question, executing these queries, and then re-ranking the results using embeddings to identify semantically similar content. The top results are then used to generate a final answer, complete with references and links.

How it connects

Use this to improve the quality of search results. This approach can be implemented on top of any existing search system, like the Slack search API, or an internal Elasticsearch instance with private data. It offers a middle ground between mimicking human browsing (which can be slow) and retrieval with embeddings (which requires embedding your entire knowledge base in advance and maintaining a vector database).

Source README

Question answering using a search API and re-ranking

Searching for relevant information can sometimes feel like looking for a needle in a haystack, but don’t despair: GPTs can do a lot of this work for us. In this guide we explore a way to augment existing search systems with various AI techniques, helping us sift through the noise.

Two ways of retrieving information for GPT are:

Mimicking Human Browsing: GPT triggers a search, evaluates the results, and modifies the search query if necessary. It can also follow up on specific search results to form a chain of thought, much like a human user would do.
Retrieval with Embeddings: Calculate embeddings for your content and a user query, and then retrieve the content most related as measured by cosine similarity. This technique is used heavily by search engines like Google.

These approaches are both promising, but each has its shortcomings: the first can be slow due to its iterative nature, and the second requires embedding your entire knowledge base in advance, continuously embedding new content, and maintaining a vector database.

By combining these approaches, and drawing inspiration from re-ranking methods, we identify an approach that sits in the middle. This approach can be implemented on top of any existing search system, like the Slack search API, or an internal Elasticsearch instance with private data. Here’s how it works:

Step 1: Search

User asks a question.
GPT generates a list of potential queries.
Search queries are executed in parallel.

Step 2: Re-rank

Embeddings for each result are used to calculate semantic similarity to a generated hypothetical ideal answer to the user question.
Results are ranked and filtered based on this similarity metric.

Step 3: Answer

Given the top search results, the model generates an answer to the user’s question, including references and links.

This hybrid approach offers relatively low latency and can be integrated into any existing search endpoint, without requiring the upkeep of a vector database. Let's dive into it! We will use the News API as an example domain to search over.

Setup

In addition to your OPENAI_API_KEY, you'll need a NEWS_API_KEY in your environment. You can get an API key from the News API website.
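
As a minimal sketch of the setup check, the snippet below reads both keys and fails early if either is missing. The helper name `load_api_keys` is an assumption made for illustration; only the two environment variable names come from the text above.

```python
import os

# The two variable names come from the guide; everything else is illustrative.
REQUIRED_KEYS = ("OPENAI_API_KEY", "NEWS_API_KEY")


def load_api_keys(env=None):
    """Return the required keys from `env` (defaults to os.environ).

    Raises RuntimeError listing every missing or empty variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Calling `load_api_keys()` at the top of a script surfaces a missing key before any request is made.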

1. Search

It all starts with a user question.

Now, in order to be as exhaustive as possible, we use the model to generate a list of diverse queries based on this question.
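
One way this step could look is sketched below. The prompt wording, the JSON response shape `{"queries": [...]}`, and the helper names are assumptions made for illustration, not the prompt used in the original notebook. The model call itself is left out: send the prompt to your chat model of choice and pass its raw text reply to `parse_queries`.

```python
import json


def build_query_prompt(question, n_queries=5):
    """Build a prompt asking the model for diverse search queries as JSON."""
    return (
        f"Generate up to {n_queries} diverse search queries that would help "
        "answer the user's question. Vary the keywords and phrasing. "
        'Respond with JSON of the form {"queries": ["query 1", "query 2"]}.\n\n'
        f"User question: {question}"
    )


def parse_queries(raw_response, original_question=None):
    """Parse the model's JSON reply into a clean, de-duplicated query list.

    Optionally appends the original question so it is searched as well.
    """
    data = json.loads(raw_response)
    candidates = list(data.get("queries", []))
    if original_question:
        candidates.append(original_question)

    queries, seen = [], set()
    for query in candidates:
        query = str(query).strip()
        key = query.lower()  # case-insensitive de-duplication
        if query and key not in seen:
            seen.add(key)
            queries.append(query)
    return queries
```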

The queries look good, so let's run the searches.
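
Running the searches in parallel can be sketched with a thread pool, since each search is an independent, I/O-bound HTTP call. Here `search_fn` is a hypothetical callable that takes one query string and returns a list of result dictionaries with a `url` key; wrap your own search API (News API, Slack, Elasticsearch) to match that shape.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all(queries, search_fn, max_workers=8):
    """Run search_fn(query) for every query concurrently and merge the results.

    Results are de-duplicated by URL, keeping the first occurrence.
    """
    if not queries:
        return []

    workers = min(max_workers, len(queries))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns batches in the same order as `queries`.
        batches = list(pool.map(search_fn, queries))

    merged, seen_urls = [], set()
    for batch in batches:
        for item in batch:
            url = item.get("url")
            if url not in seen_urls:
                seen_urls.add(url)
                merged.append(item)
    return merged
```

De-duplicating here matters because diverse queries frequently return the same article more than once.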

Search queries often return a large number of results, many of which are not relevant to the user's original question. To improve the quality of the final answer, we use embeddings to re-rank and filter the results.

2. Re-rank

Drawing inspiration from HyDE (Gao et al.), we first generate a hypothetical ideal answer to compare our results against and re-rank them. This helps prioritize results that look like good answers, rather than those similar to our question. Here’s the prompt we use to generate our hypothetical answer.
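
The snippet below is a stand-in for such a prompt, written for illustration only; its wording and the helper name are assumptions and should not be read as the original notebook's prompt. The idea it tries to capture is that the hypothetical answer only needs the shape and vocabulary of a good answer, not verified facts.

```python
def build_hypothetical_answer_prompt(question):
    """Build a prompt asking for a plausible, answer-shaped text (illustrative)."""
    return (
        "Write a short, plausible answer to the user's question. "
        "It does not need to be factually verified; use placeholders such as "
        "NAME or EVENT for specifics you do not know. "
        "Return only the answer text.\n\n"
        f"Question: {question}"
    )
```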

Now, let's generate embeddings for the search results and the hypothetical answer. We then calculate the cosine similarity between these embeddings, giving us a semantic similarity metric. Note that we can simply calculate the dot product instead of doing a full cosine similarity calculation, since OpenAI embeddings are returned normalized by the API.
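
The shortcut can be checked with a small, dependency-free sketch. The two-dimensional vectors are toy stand-ins for real embeddings: cosine similarity divides the dot product by both vector lengths, so when both lengths are 1 the division changes nothing.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Toy vectors standing in for embeddings.
a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

# For unit-length vectors the two measures agree.
assert math.isclose(dot(a, b), cosine_similarity(a, b))

# Without normalization they generally differ.
assert not math.isclose(dot([3.0, 4.0], [4.0, 3.0]),
                        cosine_similarity([3.0, 4.0], [4.0, 3.0]))
```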

Finally, we use these similarity scores to sort and filter the results.
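
Sorting and filtering can be sketched as follows. The function name, the `top_k` default, and the optional `min_score` cutoff are assumptions made for illustration; `scores[i]` is taken to be the similarity between result `i` and the hypothetical answer.

```python
def rerank(results, scores, top_k=5, min_score=None):
    """Return up to top_k (result, score) pairs, highest score first.

    If min_score is given, pairs scoring below it are dropped.
    """
    if len(results) != len(scores):
        raise ValueError("results and scores must have the same length")

    ranked = sorted(zip(results, scores), key=lambda pair: pair[1], reverse=True)
    if min_score is not None:
        ranked = [pair for pair in ranked if pair[1] >= min_score]
    return ranked[:top_k]
```

A fixed `top_k` keeps the final prompt small, while a score cutoff avoids passing weak matches along when few results are relevant.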

Awesome! These results look a lot more relevant to our original query. Now, let's use the top 5 results to generate a final answer.

3. Answer
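
A minimal sketch of the answer step is to number the top results and ask the model to cite them. The prompt wording, the helper names, and the result fields (`title`, `description`, `url`) are assumptions made for illustration; adapt them to whatever your search API returns. The resulting string is what you would send to the chat model.

```python
def format_sources(results):
    """Render results as numbered blocks so the model can cite them by number."""
    blocks = []
    for index, result in enumerate(results, start=1):
        blocks.append(
            f"[{index}] {result['title']}\n"
            f"{result.get('description', '')}\n"
            f"URL: {result['url']}"
        )
    return "\n\n".join(blocks)


def build_answer_prompt(question, results):
    """Build the final prompt from the question and the top-ranked results."""
    return (
        "Answer the user's question using only the numbered search results "
        "below. Cite the results you rely on by number and include their URLs.\n\n"
        f"Search results:\n{format_sources(results)}\n\n"
        f"Question: {question}"
    )
```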
