Integrate Google Search Results with LlamaIndex
LlamaIndex reader that fetches organic Google Search results via Zyte API, returning top search result URLs for any query to feed downstream document loaders.
Why it matters
Use Zyte API's Google Search support to bring real-time organic search results into your LlamaIndex applications. This reader programmatically fetches the top result URLs for a given query, so downstream loaders can ingest current web content.
Outcomes
What it gets done
Fetch Google search result URLs using Zyte API
Integrate Zyte's search capabilities into LlamaIndex pipelines
Extract relevant content from fetched URLs using ZyteWebReader
Build RAG systems with up-to-date web data
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/li-reader-readers-zyte-serp | bash
Capabilities
What this skill does
Searches the web and retrieves relevant sources.
Pulls structured data fields from unstructured text.
Chunks, embeds, and indexes documents for semantic retrieval.
Overview
LlamaIndex Readers Integration: Zyte-Serp
What it does
ZyteSerpReader is a LlamaIndex integration that retrieves organic Google Search results through the Zyte API. It accepts a search query and returns the top result URLs as documents. The reader supports two extraction modes (httpResponseBody or browserHtml) and can be chained with ZyteWebReader to fetch full article content from discovered URLs.
How it connects
Use this reader when you need to augment your LlamaIndex pipeline with current web search results, such as building RAG systems that require fresh information from Google Search. It's particularly useful when combined with content extractors to create search-then-extract workflows for question answering or research applications.
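A search-then-extract workflow of this kind could be sketched as follows. This is a minimal illustration, not part of the package: the function name search_then_extract and the max_urls parameter are hypothetical. It only assumes what the README example shows, namely that the search reader's load_data(query=...) returns documents whose text is a result URL, and that the content reader's load_data accepts a list of URLs. The readers are passed in as arguments, so the helper makes no network calls by itself.

```python
from typing import Any, List


def search_then_extract(
    serp_reader: Any,
    web_reader: Any,
    query: str,
    max_urls: int = 5,  # hypothetical cap, not a library parameter
) -> List[Any]:
    """Search for `query`, then load content for the first unique result URLs."""
    results = serp_reader.load_data(query=query)

    # Each search result document carries its URL in `.text`.
    urls: List[str] = []
    for result in results:
        url = result.text.strip()
        if url and url not in urls:  # skip blanks and duplicates, keep order
            urls.append(url)

    urls = urls[:max_urls]
    if not urls:
        return []
    return web_reader.load_data(urls)
```

In practice serp_reader would be a ZyteSerpReader and web_reader a ZyteWebReader in "article" mode; capping the number of URLs keeps the number of follow-up requests predictable.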
Source README
LlamaIndex Readers Integration: Zyte-Serp
ZyteSerp can be used to add organic Google Search results to a LlamaIndex pipeline. It takes a query and returns the URLs of the top search results.
Instructions for ZyteSerpReader
Setup and Installation
pip install llama-index-readers-zyte-serp
Secure an API key from Zyte to access the Zyte services.
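Rather than hard-coding the key in source files, one option is to read it from an environment variable. The sketch below is an assumption-laden example: the helper get_zyte_api_key and the variable name ZYTE_API_KEY are choices made for this illustration, not something this README specifies.

```python
import os


def get_zyte_api_key(env_var: str = "ZYTE_API_KEY") -> str:
    """Return the Zyte API key stored in an environment variable.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Zyte API key."
        )
    return key
```

The returned value can then be passed as api_key when constructing the reader, which keeps the secret out of version control.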
Using ZyteSerpReader
Initialization: Initialize the ZyteSerpReader by providing the API key and the option for extraction ("httpResponseBody" or "browserHtml").
from llama_index.readers.zyte_serp import ZyteSerpReader

zyte_serp = ZyteSerpReader(
    api_key="your_api_key_here",
    extract_from="httpResponseBody",  # or "browserHtml"
)
Loading Data: To load search results, use the load_data method with the query you wish to search.
documents = zyte_serp.load_data(query="llama index docs")
Example Usage
Here is an example demonstrating how to initialize the ZyteSerpReader and get top search URLs.
The content of these URLs can then be loaded with ZyteWebReader in "article" mode to obtain just the article content of each webpage.
from llama_index.readers.zyte_serp import ZyteSerpReader
from llama_index.readers.web.zyte.base import ZyteWebReader

# Initialize the ZyteSerpReader with your API key
zyte_serp = ZyteSerpReader(
    api_key="your_api_key_here",  # Replace with your actual API key
)

# Get the search results (URLs from Google Search results)
search_urls = zyte_serp.load_data(query="llama index docs")

# Display the results
print(search_urls)

# Each returned document holds one result URL in its text
urls = [result.text for result in search_urls]

# Initialize the ZyteWebReader to load the content from search results
zyte_web = ZyteWebReader(
    api_key="your_api_key_here",  # Replace with your actual API key
    mode="article",
)

documents = zyte_web.load_data(urls)
print(documents)
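Before passing the URLs to ZyteWebReader, it may be worth restricting them to sites you trust, since every URL fetched is another request. The helper below is a hypothetical sketch using only the Python standard library; filter_urls and allowed_domains are names invented for this example and are not part of either reader.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def filter_urls(urls: Iterable[str], allowed_domains: Iterable[str]) -> List[str]:
    """Keep only URLs whose host is an allowed domain or one of its subdomains."""
    allowed = {domain.lower() for domain in allowed_domains}
    kept: List[str] = []
    for url in urls:
        # hostname is None for strings that are not absolute URLs
        host = (urlparse(url).hostname or "").lower()
        if any(host == d or host.endswith("." + d) for d in allowed):
            kept.append(url)
    return kept
```

With the example above, something like urls = filter_urls(urls, ["llamaindex.ai"]) would run just before zyte_web.load_data(urls).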