Skill

Integrate Box with LlamaIndex for RAG

Connect LlamaIndex to Box.com to ingest and process documents for RAG and LLM applications, with options for native text extraction and Box AI features.

Works with Box

Spark score: 91 out of 100
Updated 4 months ago
Version 1.0.0

Why it matters

Bring Box.com documents into LlamaIndex so your LLM applications can use them for Retrieval Augmented Generation (RAG) and data querying.

Outcomes

What it gets done

01 Read files directly from Box.com

02 Extract text using Box's native text representation

03 Leverage Box AI for advanced text and structured data extraction (Enterprise Plus customers)

04 Index Box content for efficient RAG queries

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/li-reader-readers-box | bash

Overview

LlamaIndex: Box Readers

What it does

This open-source integration brings the capabilities of Box.com to LlamaIndex for developers building Retrieval Augmented Generation (RAG) and other LLM applications.

Source README

This README guides you through installation and usage, and explores the functionality of each reader.

Installation

pip install llama-index-readers-box

Available readers

We provide multiple readers, including:

  • Box Reader - Implementation of the SimpleReader interface to read files from Box.
  • Box Text Extraction - Uses Box text representation to extract text from documents.
  • Box AI Prompt - Uses Box AI to extract context from documents.
  • Box AI Extraction - Uses Box AI to extract structured data from documents.

Authentication

Client Credentials Grant (CCG)

Create a new application in the Box Developer Console and generate a new client ID and client secret.
Create a .env file with the following content:

### CCG settings
BOX_CLIENT_ID = YOUR_CLIENT_ID
BOX_CLIENT_SECRET = YOUR_CLIENT_SECRET

### Common Settings
BOX_ENTERPRISE_ID = YOUR_BOX_ENTERPRISE_ID
BOX_USER_ID = YOUR_BOX_USER_ID  # optional

By default the CCG client will use a service account associated with the application. Depending on how the files are shared, the service account may not have access to all the files.

If you want to use a different user, specify the user ID in the .env file. In this case, make sure your application is allowed to impersonate users and/or generate user access tokens in its scopes.

Check out the Box CCG Guide for more information on how to set up CCG.
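Before building a client, it can help to confirm that the .env file actually contains the CCG settings listed above. The following is a minimal sketch using only the standard library; the helper names parse_env_text and missing_ccg_settings are hypothetical and not part of this package, and in practice a library such as python-dotenv would normally do the parsing.

```python
# Hypothetical helpers for illustration; not part of llama-index-readers-box.
REQUIRED_CCG_KEYS = ("BOX_CLIENT_ID", "BOX_CLIENT_SECRET", "BOX_ENTERPRISE_ID")


def parse_env_text(text: str) -> dict:
    """Parse KEY = VALUE lines, ignoring blank lines and # comment lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment and any surrounding quotes.
        value = value.split(" #", 1)[0].strip().strip("'\"")
        settings[key.strip()] = value
    return settings


def missing_ccg_settings(settings: dict) -> list:
    """Return the required CCG keys that are absent or empty."""
    return [key for key in REQUIRED_CCG_KEYS if not settings.get(key)]


sample = """
### CCG settings
BOX_CLIENT_ID = abc123
BOX_CLIENT_SECRET = s3cret

### Common Settings
BOX_ENTERPRISE_ID = 12345
"""
settings = parse_env_text(sample)
print(missing_ccg_settings(settings))  # prints []
print(settings.get("BOX_USER_ID"))     # prints None (the user ID is optional)
```

An empty list means all three required values are present; BOX_USER_ID is left out of the check because it is optional.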

JSON Web Tokens (JWT)

Create a new application in the Box Developer Console and generate a new .config.json file.
Create a .env file with the following content:

### Common settings
BOX_ENTERPRISE_ID = YOUR_BOX_ENTERPRISE_ID
BOX_USER_ID = YOUR_BOX_USER_ID  # optional

### JWT Settings
JWT_CONFIG_PATH = /path/to/your/.config.json

By default the JWT client will use a service account associated with the application. Depending on how the files are shared, the service account may not have access to all the files.

If you want to use a different user, specify the user ID in the .env file. In this case, make sure your application is allowed to impersonate users and/or generate user access tokens in its scopes.

Check out the Box JWT Guide for more information on how to set up JWT.

JWT authentication requires the Box SDK installed with its jwt extra:

pip install "box-sdk-gen[jwt]"
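A quick structural check of the JWT configuration file can catch a wrong JWT_CONFIG_PATH before any authentication attempt. The sketch below assumes the JSON layout commonly produced by the Box Developer Console (a boxAppSettings object with clientID, clientSecret and an appAuth block, plus a top-level enterpriseID); that layout and the helper names are assumptions, so compare them against your own file.

```python
import json

# Hypothetical helpers; the field names below are an assumed layout of the
# JSON file downloaded from the Box Developer Console.


def missing_jwt_config_fields(config: dict) -> list:
    """Return dotted names of expected fields that are missing or empty."""
    app = config.get("boxAppSettings", {})
    app_auth = app.get("appAuth", {})
    checks = {
        "boxAppSettings.clientID": app.get("clientID"),
        "boxAppSettings.clientSecret": app.get("clientSecret"),
        "boxAppSettings.appAuth.publicKeyID": app_auth.get("publicKeyID"),
        "boxAppSettings.appAuth.privateKey": app_auth.get("privateKey"),
        "boxAppSettings.appAuth.passphrase": app_auth.get("passphrase"),
        "enterpriseID": config.get("enterpriseID"),
    }
    return [name for name, value in checks.items() if not value]


def check_jwt_config_file(path: str) -> list:
    """Load the file at JWT_CONFIG_PATH and report missing fields."""
    with open(path, encoding="utf-8") as handle:
        return missing_jwt_config_fields(json.load(handle))


example = {
    "boxAppSettings": {
        "clientID": "abc",
        "clientSecret": "def",
        "appAuth": {"publicKeyID": "k1", "privateKey": "PEM DATA", "passphrase": ""},
    },
    "enterpriseID": "12345",
}
print(missing_jwt_config_fields(example))
# prints ['boxAppSettings.appAuth.passphrase']
```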

Box Client

To work with the Box readers, you need to provide a Box Client.
The Box Client can be created using Client Credentials Grant (CCG), JSON Web Tokens (JWT), OAuth 2.0, or a developer token.
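One way to support several of these methods in the same application is to pick the method from whichever settings are present. The sketch below is an illustration under stated assumptions: choose_auth_method and build_box_client are hypothetical helpers, BOX_DEVELOPER_TOKEN is an assumed variable name that does not appear in this README, and OAuth 2.0 is left out because it needs an interactive authorization step.

```python
def choose_auth_method(env: dict) -> str:
    """Pick an auth method from the settings that are present (assumed priority)."""
    if env.get("JWT_CONFIG_PATH"):
        return "jwt"
    if env.get("BOX_CLIENT_ID") and env.get("BOX_CLIENT_SECRET"):
        return "ccg"
    if env.get("BOX_DEVELOPER_TOKEN"):  # assumed variable name
        return "developer_token"
    raise ValueError("No Box credentials found in the provided settings")


def build_box_client(env: dict):
    """Build a BoxClient for the chosen method (requires box-sdk-gen)."""
    # Imported lazily so the selection logic works without the SDK installed.
    from box_sdk_gen import (
        BoxCCGAuth,
        BoxClient,
        BoxDeveloperTokenAuth,
        BoxJWTAuth,
        CCGConfig,
        JWTConfig,
    )

    method = choose_auth_method(env)
    if method == "jwt":
        config = JWTConfig.from_config_file(config_file_path=env["JWT_CONFIG_PATH"])
        auth = BoxJWTAuth(config)
    elif method == "ccg":
        config = CCGConfig(
            client_id=env["BOX_CLIENT_ID"],
            client_secret=env["BOX_CLIENT_SECRET"],
            enterprise_id=env.get("BOX_ENTERPRISE_ID"),
            user_id=env.get("BOX_USER_ID"),
        )
        auth = BoxCCGAuth(config)
    else:
        auth = BoxDeveloperTokenAuth(token=env["BOX_DEVELOPER_TOKEN"])
    return BoxClient(auth)


print(choose_auth_method({"BOX_CLIENT_ID": "id", "BOX_CLIENT_SECRET": "secret"}))
# prints ccg
```

In an application, build_box_client(dict(os.environ)) would return a client that can be passed to any of the readers. The priority order (JWT, then CCG, then developer token) is a design choice of this sketch, not something the package prescribes.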

Using CCG authentication

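from box_sdk_gen import CCGConfig, BoxCCGAuth, BoxClient
from llama_index.readers.box import BoxReader

# Placeholders: replace with the values from your Box application.
config = CCGConfig(
    client_id="your_client_id",
    client_secret="your_client_secret",
    enterprise_id="your_enterprise_id",
    user_id="your_ccg_user_id",  # Optional
)
auth = BoxCCGAuth(config)
if config.user_id:
    # with_user_subject returns a new auth object, so keep the result.
    auth = auth.with_user_subject(config.user_id)
client = BoxClient(auth)

# Pass the authenticated client to the reader.
reader = BoxReader(box_client=client)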
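Once a reader exists, the next step is to load documents from it. The sketch below assumes that the reader's load_data method accepts file_ids and folder_id keyword arguments; treat that signature as an assumption and check it against the installed package. load_box_documents and FakeReader are hypothetical names, and FakeReader only stands in for a real reader so the example runs without a Box account.

```python
def load_box_documents(reader, file_ids=None, folder_id=None):
    """Load documents by explicit file IDs or from a folder (assumed load_data signature)."""
    if not file_ids and not folder_id:
        raise ValueError("Provide file_ids or folder_id")
    if file_ids:
        return reader.load_data(file_ids=file_ids)
    return reader.load_data(folder_id=folder_id)


class FakeReader:
    """Hypothetical stand-in for a Box reader, used only for a dry run."""

    def load_data(self, file_ids=None, folder_id=None):
        ids = file_ids or [f"folder:{folder_id}"]
        return [f"document for {item}" for item in ids]


docs = load_box_documents(FakeReader(), file_ids=["111", "222"])
print(docs)  # prints ['document for 111', 'document for 222']
```

With a real reader, the returned documents could then be passed to a LlamaIndex index such as VectorStoreIndex.from_documents for RAG queries.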

Using JWT authentication

from box_sdk_gen import JWTConfig, BoxJWTAuth, BoxClient

#### Using manual configuration
config = JWTConfig(
    client_id="YOUR_BOX_CLIENT_ID",
    client_secret="YOUR_BOX_CLIENT_SECRET",
    jwt_key_id="YOUR_BOX_JWT_KEY_ID",
    private_key="YOUR_BOX_PRIVATE_KEY",
    private_key_passphrase="YOUR_BOX_PRIVATE_KEY_PASSPHRASE",
    enterprise_id="YOUR_BOX_EN
