Prompt Chain

Enhance Query Engine with Fuzzy Citations

Name: Enhance Query Engine with Fuzzy Citations
Availability: OnlineOnly
Author: LlamaIndex

A LlamaIndex query engine that post-processes responses to identify and cite source sentences using fuzzy string matching, with metadata mapping each response sentence to its source chunk.

Works with GitHub

LlamaIndex

Updated 4 days ago

Version 0.14.22

Models

Llama 3

Why it matters

Wrap your existing query engine so that each response sentence is matched against the retrieved source text with fuzzy matching, yielding citation metadata with character-level positions.

Outcomes

What it gets done

Post-process query responses to identify source sentences.

Use fuzzy matching (fuzz.ratio) to match response sentences to source text.

Attach citation metadata (start/end char indices, node) to response objects.

Configure fuzzy matching threshold for desired precision.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/li-pack-packs-fuzzy-citation | bash

Steps

Steps in the chain

Download the FuzzyCitationEnginePack

Download the FuzzyCitationEnginePack using llamaindex-cli with the command: llamaindex-cli download-llamapack FuzzyCitationEnginePack --download-dir ./fuzzy_citation_pack. Inspect the files at ./fuzzy_citation_pack and use them as a template for your own project.

Import required modules and download pack

Import Document and VectorStoreIndex from llama_index.core, and download_llama_pack from llama_index.core.llama_pack. Download and install the FuzzyCitationEnginePack to ./fuzzy_citation_pack directory using download_llama_pack().

Create index from documents

Create a VectorStoreIndex from documents using VectorStoreIndex.from_documents([Document.example()]). This prepares the document data for querying.

Initialize query engine

Get a query engine from the index using index.as_query_engine(). This creates the base query engine that will be wrapped by the fuzzy citation engine.

Create FuzzyCitationEnginePack instance

Instantiate FuzzyCitationEnginePack with the query engine and a threshold parameter: fuzzy_engine = FuzzyCitationEnginePack(query_engine, threshold=50). The threshold (default 50) is compared against the fuzz.ratio() score, which ranges from 0 to 100; raising it makes citations stricter, lowering it admits looser matches.
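
To get a feel for what a threshold on a 0 to 100 similarity score does, here is a minimal sketch. It does not use fuzz.ratio() itself: it swaps in difflib.SequenceMatcher from the Python standard library as a stand-in, so the exact scores will differ from the pack's. The helper names ratio and passes are hypothetical.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> float:
    """Similarity on a 0-100 scale (difflib stand-in, NOT fuzz.ratio itself)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def passes(response_sentence: str, source_chunk: str, threshold: float = 50) -> bool:
    """Assumed rule: keep a candidate citation if its score reaches the threshold."""
    return ratio(response_sentence, source_chunk) >= threshold


sentence = "LLMs are trained on large text corpora."
chunk = "LLMs are trained on large corpora of text."

print(f"score: {ratio(sentence, chunk):.1f}")
for threshold in (50, 90):
    print(f"threshold={threshold}: kept={passes(sentence, chunk, threshold)}")
```

A paraphrased sentence clears a permissive threshold but is rejected by a strict one, which is the precision trade-off the threshold parameter controls.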

Run query through fuzzy engine

Call the run() function on the fuzzy engine with your query: response = fuzzy_engine.run('What can you tell me about LLMs?'). The run() function is a light wrapper around query_engine.query() that attaches metadata with fuzzy citations.
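
The steps so far can be strung together roughly as follows. This is a sketch assembled from the calls named in the steps, not a tested recipe: it assumes llama-index is installed, that an LLM and embedding model are configured (for example through an API key in the environment), and that download_llama_pack takes the pack name followed by the download directory. The function names build_fuzzy_engine and main are hypothetical.

```python
def build_fuzzy_engine(threshold: int = 50):
    """Wrap a base query engine in the fuzzy citation pack (sketch)."""
    # Imported lazily so this module loads even where llama-index is absent.
    from llama_index.core import Document, VectorStoreIndex
    from llama_index.core.llama_pack import download_llama_pack

    # Assumed signature: (pack name, download directory).
    FuzzyCitationEnginePack = download_llama_pack(
        "FuzzyCitationEnginePack", "./fuzzy_citation_pack"
    )

    # Index a sample document and get the base query engine.
    index = VectorStoreIndex.from_documents([Document.example()])
    query_engine = index.as_query_engine()

    return FuzzyCitationEnginePack(query_engine, threshold=threshold)


def main() -> None:
    """Run one query and show the response plus its citation keys."""
    fuzzy_engine = build_fuzzy_engine()
    response = fuzzy_engine.run("What can you tell me about LLMs?")
    print(str(response))
    print(response.metadata.keys())
```

Calling main() performs the download, builds the index, and issues the query, so it needs network access and model credentials.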

Extract and print source citations

Access the fuzzy citations from response.metadata. Print response.metadata.keys() to see available citations, or print response.metadata to view full source sentence info including mappings of (response_sentence, source_chunk) with start/end character indexes.
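
As an illustration of reading that metadata, the sketch below walks a dictionary keyed by (response_sentence, source_chunk) tuples. The value keys start_char_idx, end_char_idx, and node are assumptions about the pack's output, as is the idea that the indices are offsets into the node's text. FakeNode and the sample data are hypothetical stand-ins so the example runs without llama-index.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Hypothetical stand-in for a LlamaIndex node; only get_content() is used."""

    text: str

    def get_content(self) -> str:
        return self.text


def format_citations(metadata: dict) -> list[str]:
    """Render one line per (response_sentence, source_chunk) citation."""
    lines = []
    for (response_sentence, source_chunk), info in metadata.items():
        # Assumed keys; offsets are taken to index into the node's text.
        start, end = info["start_char_idx"], info["end_char_idx"]
        span = info["node"].get_content()[start:end]
        lines.append(f"{response_sentence!r} <- chars {start}-{end}: {span!r}")
    return lines


# Hand-built sample shaped like the assumed metadata layout.
source_text = "Large language models are trained on text. They can answer questions."
start = source_text.index("They")
sample_metadata = {
    ("They answer questions.", source_text[start:]): {
        "start_char_idx": start,
        "end_char_idx": len(source_text),
        "node": FakeNode(source_text),
    }
}

for line in format_citations(sample_metadata):
    print(line)
```

With a real response, passing response.metadata to format_citations should work the same way if the assumed keys hold.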

Overview

Fuzzy Citation Query Engine Pack

What it does

A LlamaIndex pack that creates a CustomQueryEngine (FuzzCitationQueryEngine) to post-process query responses and identify source sentences through fuzzy string matching, storing citation metadata with character-level precision.
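
The idea of locating a response sentence inside source text can be sketched with a sliding window. This is an illustration of the general technique only, not the pack's implementation: the window size, step, and tie-breaking are assumptions, difflib.SequenceMatcher stands in for fuzz.ratio(), and the names ratio and locate are hypothetical.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> float:
    """Similarity on a 0-100 scale (difflib stand-in, NOT fuzz.ratio itself)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def locate(sentence: str, source: str, threshold: float = 50.0):
    """Slide a sentence-sized window over source.

    Returns (start, end, score) for the best-scoring window,
    or None if no window reaches the threshold.
    """
    width = len(sentence)
    if width == 0 or not source:
        return None
    step = max(1, width // 4)  # assumed stride: a quarter of the sentence length
    best = None
    for start in range(0, max(1, len(source) - width + 1), step):
        end = min(len(source), start + width)
        score = ratio(sentence, source[start:end])
        if best is None or score > best[2]:
            best = (start, end, score)
    if best is None or best[2] < threshold:
        return None
    return best


source = (
    "Fuzzy matching tolerates small edits. "
    "LLMs are trained on large corpora of text. "
    "They can cite sources."
)
sentence = "LLMs are trained on large text corpora."

hit = locate(sentence, source)
if hit is not None:
    start, end, score = hit
    print(f"chars {start}-{end} (score {score:.1f}): {source[start:end]!r}")
```

A smaller step gives tighter character offsets at the cost of more comparisons.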

How it connects

Use this when building RAG applications that require transparent source attribution, allowing you to trace each response sentence back to its originating text chunk with exact character positions in the source nodes.
