LlamaIndex Tools Integration: Airweave
This tool connects your LlamaIndex agent to [Airweave](https://airweave.ai/), an open-source platform that makes any app searchable by syncing data from various sources with minimal configuration.
Installation
pip install llama-index-tools-airweave llama-index-llms-openai
Prerequisites
- An Airweave account and API key
- At least one collection set up with synced data
Get started at Airweave
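The usage example in this README reads the key from the `AIRWEAVE_API_KEY` environment variable, so a missing variable surfaces as a bare `KeyError`. The following is a minimal sketch of a friendlier check. `require_api_key` is a hypothetical helper written for illustration and is not part of `llama-index-tools-airweave`.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "AIRWEAVE_API_KEY",
) -> str:
    """Return the API key from the environment, or raise a readable error.

    Hypothetical helper for illustration; not part of the Airweave package.
    """
    # Fall back to the process environment when no mapping is supplied.
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Create an API key in Airweave and export it "
            "before starting the agent."
        )
    return key
```

The returned string can then be passed as `api_key` when constructing `AirweaveToolSpec`.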
Usage
Basic Usage
import os
import asyncio

from llama_index.tools.airweave import AirweaveToolSpec
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

# Initialize the Airweave tool
airweave_tool = AirweaveToolSpec(
    api_key=os.environ["AIRWEAVE_API_KEY"],
)

# Create an agent with the Airweave tools
agent = FunctionAgent(
    tools=airweave_tool.to_tool_list(),
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="""You are a helpful assistant that can search through
    Airweave collections to answer questions about your organization's data.""",
)

# Use the agent to search your data
async def main():
    response = await agent.run(
        "Search the finance-data collection for Q4 revenue reports"
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(main())
Available Tools
search_collection
Simple search in a collection with default settings (most common use case).
Parameters:
- collection_id (str): The readable ID of the collection
- query (str): Your search query
- limit (int, optional): Max results to return (default: 10)
- offset (int, optional): Pagination offset (default: 0)
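The `limit` and `offset` parameters can be combined to walk through a result set page by page. The sketch below shows one way to do that. `iter_search_results` is a hypothetical helper, and it assumes the search callable returns a list, as the Direct Tool Usage example suggests.

```python
from typing import Any, Callable, Iterator, List


def iter_search_results(
    search: Callable[..., List[Any]],
    collection_id: str,
    query: str,
    page_size: int = 10,
    max_results: int = 50,
) -> Iterator[Any]:
    """Yield results page by page using the limit/offset parameters.

    Hypothetical helper: `search` is any callable that accepts
    collection_id, query, limit and offset and returns a list.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    offset = 0
    while offset < max_results:
        # Never request more than the remaining budget.
        limit = min(page_size, max_results - offset)
        page = search(
            collection_id=collection_id, query=query, limit=limit, offset=offset
        )
        yield from page
        if len(page) < limit:
            break  # A short page means there is nothing left to fetch.
        offset += len(page)
```

With the real tool, `search` would be `airweave_tool.search_collection`. The `max_results` cap keeps an agent or script from paging through an entire collection by accident.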
advanced_search_collection
Advanced search with full control over retrieval parameters.
Parameters:
- collection_id (str): The readable ID of the collection
- query (str): Your search query
- limit (int, optional): Max results to return (default: 10)
- offset (int, optional): Pagination offset (default: 0)
- retrieval_strategy (str, optional): "hybrid", "neural", or "keyword"
- temporal_relevance (float, optional): Weight recent content (0.0-1.0)
- expand_query (bool, optional): Generate query variations
- interpret_filters (bool, optional): Extract filters from natural language
- rerank (bool, optional): Use LLM-based reranking
- generate_answer (bool, optional): Generate natural language answer
Returns:
Dictionary with documents list and optional answer field.
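Because `advanced_search_collection` takes several constrained options, it can help to validate them before making a call. The sketch below checks the documented values for `retrieval_strategy` and the 0.0-1.0 range for `temporal_relevance`, then reads the optional answer from the result. Both helpers are hypothetical, and the dictionary keys `"documents"` and `"answer"` are assumed from the description above rather than confirmed against the package.

```python
from typing import Any, Dict, List, Optional, Tuple

# Values listed in the parameter description above.
RETRIEVAL_STRATEGIES = {"hybrid", "neural", "keyword"}
BOOLEAN_FLAGS = {"expand_query", "interpret_filters", "rerank", "generate_answer"}


def build_advanced_search_kwargs(
    collection_id: str,
    query: str,
    retrieval_strategy: Optional[str] = None,
    temporal_relevance: Optional[float] = None,
    **flags: bool,
) -> Dict[str, Any]:
    """Validate options and return keyword arguments for an advanced search.

    Hypothetical helper for illustration; not part of the Airweave package.
    """
    kwargs: Dict[str, Any] = {"collection_id": collection_id, "query": query}
    if retrieval_strategy is not None:
        if retrieval_strategy not in RETRIEVAL_STRATEGIES:
            raise ValueError(
                f"retrieval_strategy must be one of {sorted(RETRIEVAL_STRATEGIES)}"
            )
        kwargs["retrieval_strategy"] = retrieval_strategy
    if temporal_relevance is not None:
        if not 0.0 <= temporal_relevance <= 1.0:
            raise ValueError("temporal_relevance must be between 0.0 and 1.0")
        kwargs["temporal_relevance"] = temporal_relevance
    unknown = sorted(set(flags) - BOOLEAN_FLAGS)
    if unknown:
        raise TypeError(f"unknown options: {unknown}")
    kwargs.update(flags)
    return kwargs


def unpack_result(result: Dict[str, Any]) -> Tuple[List[Any], Optional[str]]:
    """Split a result dictionary into (documents, answer).

    Assumes the keys are "documents" and "answer"; answer is None when absent.
    """
    return result.get("documents", []), result.get("answer")
```

The validated dictionary could then be passed as `airweave_tool.advanced_search_collection(**kwargs)`. Options left unset are omitted so the tool's own defaults apply.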
search_and_generate_answer
Convenience method that searches and returns a direct natural language answer (RAG-style).
Parameters:
- collection_id (str): The readable ID of the collection
- query (str): Your question in natural language
- limit (int, optional): Max results to consider (default: 10)
- use_reranking (bool, optional): Use reranking (default: True)
Returns:
Natural language answer string.
list_collections
List all collections in your organization.
Parameters:
- skip (int, optional): Pagination skip (default: 0)
- limit (int, optional): Max collections to return (default: 100)
get_collection_info
Get detailed information about a specific collection.
Parameters:
- collection_id (str): The readable ID of the collection
Advanced Examples
Direct Tool Usage
You can use the tools directly without an agent:
from llama_index.tools.airweave import AirweaveToolSpec

airweave_tool = AirweaveToolSpec(api_key="your-key")

# List collections
collections = airweave_tool.list_collections()
print(f"Found {len(collections)} collections")

# Simple search
results = airweave_tool.search_collection(
    collection_id="finance-data", query="Q4 revenue reports", limit=5
)
for doc in results:
    print(f"Score: {doc.metadata.get('score', 'N/A')}")
    print(f"Text: {doc.text[:200]}...")
Advanced Search Options
# Advanced search with all options
result = airwea