Access and Analyze Hugging Face Datasets
An MCP server that connects to the Hugging Face Dataset Viewer API for browsing, searching, filtering, and analyzing datasets hosted on the Hugging Face Hub.
Why it matters
Effortlessly interact with Hugging Face datasets. View, analyze, search, and filter datasets hosted on the Hugging Face Hub using a dedicated MCP server.
Outcomes
What it gets done
Validate dataset existence and accessibility
Retrieve and paginate dataset content
Search and filter dataset rows using SQL-like conditions
Download datasets in Parquet format
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/vb-dataset-viewer | bash
Capabilities
Tools your agent gets
Checks whether a dataset exists and is accessible
Retrieves detailed information about a dataset
Retrieves dataset content with pagination
Retrieves the first rows from a dataset split
Retrieves statistics for a dataset split
Searches for text in a dataset
Filters rows using SQL-like conditions
Downloads the full dataset in Parquet format
Overview
Dataset Viewer MCP Server
What it does
This MCP server provides access to the Hugging Face Dataset Viewer API, enabling dataset exploration and analysis through tools for validation, information retrieval, content access, statistics, searching, filtering, and Parquet downloads.
How it connects
Use this connector when you need to work with datasets hosted on Hugging Face Hub, whether public or private, and want to inspect metadata, retrieve specific rows, search content, apply filters, or download complete datasets in Parquet format.
Source README
MCP server for working with the Hugging Face Dataset Viewer API, providing capabilities to view, analyze, search, and filter datasets hosted on Hugging Face Hub.
Installation
From Source
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
uv venv
source .venv/bin/activate
uv add -e .
Configuration
Claude Desktop
{
"mcpServers": {
"dataset-viewer": {
"command": "uv",
"args": [
"--directory",
"parent_to_repo/dataset-viewer",
"run",
"dataset-viewer"
]
}
}
}
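If the server needs to reach private datasets, the token has to be visible to the process that Claude Desktop launches. The sketch below generates the entry shown above and, optionally, adds an `env` block carrying `HUGGINGFACE_TOKEN`. The helper name `claude_desktop_entry` is hypothetical, and passing the token through `env` is an assumption about how you choose to supply it; the placeholder path `parent_to_repo/dataset-viewer` is kept as-is and must be replaced with your own checkout location.

```python
import json
from typing import Optional


def claude_desktop_entry(repo_dir: str, token: Optional[str] = None) -> dict:
    """Build the mcpServers entry for this server (hypothetical helper)."""
    server = {
        "command": "uv",
        "args": ["--directory", repo_dir, "run", "dataset-viewer"],
    }
    if token:
        # Assumption: the token is handed to the server via the entry's env block.
        server["env"] = {"HUGGINGFACE_TOKEN": token}
    return {"mcpServers": {"dataset-viewer": server}}


if __name__ == "__main__":
    # Prints JSON that can be merged into the Claude Desktop configuration file.
    print(json.dumps(claude_desktop_entry("parent_to_repo/dataset-viewer"), indent=2))
```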
Available Tools
| Tool | Description |
|---|---|
| validate | Checks whether a dataset exists and is accessible |
| get_info | Retrieves detailed information about a dataset |
| get_rows | Retrieves dataset content with pagination |
| get_first_rows | Retrieves the first rows from a dataset split |
| get_statistics | Retrieves statistics for a dataset split |
| search_dataset | Searches for text in a dataset |
| filter | Filters rows using SQL-like conditions |
| get_parquet | Downloads the full dataset in Parquet format |
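To make the tools more concrete, the following sketch builds the Hugging Face Dataset Viewer API URLs that each tool plausibly corresponds to. The endpoint paths are those of the public Dataset Viewer API; the mapping from tool names to endpoints, the helper `build_url`, and the example dataset, config, split, and column are assumptions for illustration, not taken from this server's source.

```python
from urllib.parse import urlencode

# Public base URL of the Hugging Face Dataset Viewer API.
BASE_URL = "https://datasets-server.huggingface.co"

# Assumed mapping from this server's tool names to Dataset Viewer endpoints.
TOOL_ENDPOINTS = {
    "validate": "/is-valid",
    "get_info": "/info",
    "get_rows": "/rows",
    "get_first_rows": "/first-rows",
    "get_statistics": "/statistics",
    "search_dataset": "/search",
    "filter": "/filter",
    "get_parquet": "/parquet",
}


def build_url(tool: str, **params) -> str:
    """Return the Dataset Viewer URL a tool would plausibly request."""
    # Drop unset parameters so optional arguments can be passed as None.
    query = {key: value for key, value in params.items() if value is not None}
    return f"{BASE_URL}{TOOL_ENDPOINTS[tool]}?{urlencode(query)}"


if __name__ == "__main__":
    print(build_url("validate", dataset="stanfordnlp/imdb"))
    # In a filter condition, column names go in double quotes
    # and string values in single quotes.
    print(build_url(
        "filter",
        dataset="stanfordnlp/imdb",
        config="plain_text",
        split="train",
        where='"label"=1',
        length=10,
    ))
```

Building the URL separately from sending the request keeps the sketch free of network access, so the query encoding can be checked before any call is made.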
Features
- Uses the dataset:// URI scheme for accessing Hugging Face datasets
- Supports dataset configurations and splits
- Provides paginated access to dataset content
- Handles authentication for private datasets
- Supports searching and filtering dataset content
- Provides dataset statistics and analysis
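Paginated access can be sketched as a sequence of (offset, length) windows over a split. The Dataset Viewer `/rows` endpoint returns at most 100 rows per request, so larger splits need several calls. The helper `page_windows` is hypothetical, and whether this server exposes offsets or page numbers to the agent is an assumption to verify against its tool schema.

```python
from typing import Iterator, Tuple

# The Dataset Viewer /rows endpoint returns at most 100 rows per request.
MAX_PAGE_LENGTH = 100


def page_windows(total_rows: int, page_size: int = MAX_PAGE_LENGTH) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that together cover total_rows rows."""
    if not 1 <= page_size <= MAX_PAGE_LENGTH:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_LENGTH}")
    for offset in range(0, total_rows, page_size):
        # The final window is shortened so it does not run past the last row.
        yield offset, min(page_size, total_rows - offset)


if __name__ == "__main__":
    for offset, length in page_windows(250):
        print(f"offset={offset} length={length}")
```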
Environment Variables
Optional
HUGGINGFACE_TOKEN - Your Hugging Face API token for accessing private datasets
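A minimal sketch of how an optional token is typically turned into request headers: Hugging Face APIs accept the token as a bearer credential, and public datasets need no header at all. The function name `auth_headers` is hypothetical and is not claimed to match this server's internals.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return an Authorization header if HUGGINGFACE_TOKEN is set, else no headers."""
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACE_TOKEN", "").strip()
    # An unset or blank token means anonymous access to public datasets only.
    return {"Authorization": f"Bearer {token}"} if token else {}


if __name__ == "__main__":
    headers = auth_headers()
    print("authenticated" if headers else "anonymous (public datasets only)")
```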
Notes
Requires Python 3.12 or higher and the uv package installer. MIT License.