Retrieve Web Content with AgentSearch
A LlamaIndex retriever pack that queries terabytes of internet-indexed data via the agent-search API or hosted search engines like Bing for RAG workflows.
Why it matters
Leverage the AgentSearch dataset or hosted search APIs to retrieve general content from the internet based on user queries.
Outcomes
What it gets done
Integrate with AgentSearch's terabytes of indexed data.
Utilize search providers like Bing or AgentSearch.
Retrieve relevant nodes from indexed internet content.
Query general information using natural language.
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/li-pack-packs-agent-search-retriever | bash
Steps
Steps in the chain
Use llamaindex-cli to download the pack: llamaindex-cli download-llamapack AgentSearchRetrieverPack --download-dir ./agent_search_pack
Optionally set the SCIPHI_API_KEY environment variable: import os; os.environ["SCIPHI_API_KEY"] = "..."
Import RetrieverQueryEngine from llama_index.core.query_engine and download_llama_pack from llama_index.core.llama_pack
Download the pack using download_llama_pack("AgentSearchRetrieverPack", "./agent_search_pack") and instantiate with api_key, similarity_top_k, and search_provider parameters
Access the retriever from the pack and call retrieve() with a query string to get source nodes
Wrap the retriever in a RetrieverQueryEngine using RetrieverQueryEngine.from_args(retriever)
Call query_engine.query() with a query string to get responses, or use agent_search_pack.run() as a wrapper around retriever.retrieve()
Overview
Agent-Search Retrieval Pack
What it does
A custom retriever that integrates with the agent-search API to access terabytes of indexed internet content for retrieval-augmented generation workflows.
How it connects
Use this pack when you need to retrieve information from large-scale internet-indexed datasets without building your own web crawling infrastructure, and want to integrate that retrieval capability into LlamaIndex pipelines.
Source README
Step 1: Download the AgentSearchRetrieverPack
Use llamaindex-cli to download the pack: llamaindex-cli download-llamapack AgentSearchRetrieverPack --download-dir ./agent_search_pack
Step 2: Set API key (optional)
Optionally set the SCIPHI_API_KEY environment variable: import os; os.environ["SCIPHI_API_KEY"] = "..."
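A minimal sketch of this step, assuming you would rather not hardcode the key in source. The helper name resolve_api_key is hypothetical and not part of the pack; only the SCIPHI_API_KEY variable name comes from the step above.

```python
import os
from typing import Optional

ENV_VAR = "SCIPHI_API_KEY"


def resolve_api_key(explicit_key: Optional[str] = None) -> Optional[str]:
    """Return the key to use, preferring an explicit value over the environment.

    Hypothetical helper: if a key is found it is also written back to the
    environment so later code can read it from SCIPHI_API_KEY. Returns None
    when no key is available, since the step above treats the key as optional.
    """
    key = explicit_key or os.environ.get(ENV_VAR)
    if key:
        os.environ[ENV_VAR] = key
    return key
```

Calling resolve_api_key() with no argument simply reports whatever is already exported in the shell, which keeps secrets out of the script itself.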
Step 3: Import required modules
Import RetrieverQueryEngine from llama_index.core.query_engine and download_llama_pack from llama_index.core.llama_pack
Step 4: Download and initialize the pack
Download the pack using download_llama_pack("AgentSearchRetrieverPack", "./agent_search_pack") and instantiate with api_key, similarity_top_k, and search_provider parameters
Step 5: Use the retriever directly
Access the retriever from the pack and call retrieve() with a query string to get source nodes
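Steps 4 and 5 can be sketched roughly as follows. The function names pack_kwargs, load_pack, and top_snippets are hypothetical helpers, and the values similarity_top_k=4 and search_provider="agent-search" are assumptions chosen for illustration. The sketch also assumes the pack exposes its retriever as a retriever attribute and that returned nodes offer score and get_content(), as LlamaIndex scored nodes usually do. The LlamaIndex import is done lazily so the helpers can be defined without the library installed.

```python
from typing import Any, Dict, List, Optional, Tuple


def pack_kwargs(
    api_key: Optional[str] = None,
    similarity_top_k: int = 4,               # assumed example value
    search_provider: str = "agent-search",   # assumed value; Bing is the other provider named above
) -> Dict[str, Any]:
    """Collect the three constructor parameters named in step 4."""
    if similarity_top_k < 1:
        raise ValueError("similarity_top_k must be at least 1")
    kwargs: Dict[str, Any] = {
        "similarity_top_k": similarity_top_k,
        "search_provider": search_provider,
    }
    if api_key is not None:  # the key is optional per step 2
        kwargs["api_key"] = api_key
    return kwargs


def load_pack(download_dir: str = "./agent_search_pack", **kwargs: Any) -> Any:
    """Download the pack class and instantiate it (requires llama-index and network access)."""
    from llama_index.core.llama_pack import download_llama_pack

    pack_cls = download_llama_pack("AgentSearchRetrieverPack", download_dir)
    return pack_cls(**kwargs)


def top_snippets(retriever: Any, query: str, max_chars: int = 200) -> List[Tuple[Any, str]]:
    """Call retrieve() and return (score, truncated text) pairs for quick inspection."""
    nodes = retriever.retrieve(query)
    return [(node.score, node.get_content()[:max_chars]) for node in nodes]
```

With the library installed, usage would look something like pack = load_pack(**pack_kwargs()) followed by top_snippets(pack.retriever, "your question").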
Step 6: Create a query engine
Wrap the retriever in a RetrieverQueryEngine using RetrieverQueryEngine.from_args(retriever)
Step 7: Execute queries
Call query_engine.query() with a query string to get responses, or use agent_search_pack.run() as a wrapper around retriever.retrieve()
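The last two steps might look like the sketch below. Both function names are hypothetical. answer_with_engine assumes llama-index is installed and that an LLM is configured for response synthesis, which the steps above do not cover; retrieve_with_pack relies only on the statement that run() wraps retriever.retrieve().

```python
from typing import Any, List


def retrieve_with_pack(pack: Any, query: str) -> List[Any]:
    """Return source nodes via the pack's run() wrapper, without generating an answer."""
    return pack.run(query)


def answer_with_engine(retriever: Any, question: str) -> str:
    """Wrap the retriever in a query engine and return the synthesized answer as text."""
    # Imported lazily so this module can be loaded without llama-index installed.
    from llama_index.core.query_engine import RetrieverQueryEngine

    engine = RetrieverQueryEngine.from_args(retriever)
    return str(engine.query(question))
```

Use retrieve_with_pack when you only need the raw nodes, for example to inspect sources, and answer_with_engine when you want a generated response grounded in those nodes.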