MCP

Access and Analyze Hugging Face Datasets

An MCP server that connects to the Hugging Face Dataset Viewer API for browsing, searching, filtering, and analyzing datasets hosted on the Hugging Face Hub.

Works with huggingface

Updated 4 months ago
Version 1.0.0
Why it matters

View, analyze, search, and filter datasets hosted on the Hugging Face Hub through a dedicated MCP server, without writing Dataset Viewer API requests by hand.

Outcomes

What it gets done

01  Validate dataset existence and accessibility
02  Retrieve and paginate dataset content
03  Search and filter dataset rows using SQL-like conditions
04  Download datasets in Parquet format

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/vb-dataset-viewer | bash

Capabilities

Tools your agent gets

validate

Checks whether a dataset exists and is accessible

get_info

Retrieves detailed information about a dataset

get_rows

Retrieves dataset content with pagination

get_first_rows

Retrieves the first rows from a dataset split

get_statistics

Retrieves statistics for a dataset split

search_dataset

Searches for text in a dataset

filter

Filters rows using SQL-like conditions

get_parquet

Downloads the full dataset in Parquet format

Overview

Dataset Viewer MCP Server

What it does

This MCP server provides access to the Hugging Face Dataset Viewer API, enabling dataset exploration and analysis through tools for validation, information retrieval, content access, statistics, searching, filtering, and Parquet downloads.
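To make the relationship to the Dataset Viewer API more concrete, here is a minimal sketch of the direct requests each tool plausibly corresponds to. The endpoint paths are those of the public Dataset Viewer API; the mapping from this server's tool names to those endpoints is an assumption, not something the README states, and `build_url` is a hypothetical helper. The dataset name is only an example.

```python
from urllib.parse import urlencode

# Base URL of the public Hugging Face Dataset Viewer API.
API_BASE = "https://datasets-server.huggingface.co"

# ASSUMPTION: which Dataset Viewer endpoint each tool wraps.
# The server's source may map them differently.
TOOL_ENDPOINTS = {
    "validate": "/is-valid",
    "get_info": "/info",
    "get_rows": "/rows",
    "get_first_rows": "/first-rows",
    "get_statistics": "/statistics",
    "search_dataset": "/search",
    "filter": "/filter",
    "get_parquet": "/parquet",
}


def build_url(tool: str, **params) -> str:
    """Return the URL of the equivalent direct Dataset Viewer request.

    Parameters set to None are dropped from the query string.
    """
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{API_BASE}{TOOL_ENDPOINTS[tool]}?{query}"


if __name__ == "__main__":
    print(build_url("validate", dataset="stanfordnlp/imdb"))
    print(build_url("get_rows", dataset="stanfordnlp/imdb",
                    config="plain_text", split="train", offset=0, length=10))
```

The sketch only builds URLs; it sends no requests, so it can be run without network access or a token.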

How it connects

Use this connector when you need to work with datasets hosted on Hugging Face Hub, whether public or private, and want to inspect metadata, retrieve specific rows, search content, apply filters, or download complete datasets in Parquet format.

Source README

MCP server for working with the Hugging Face Dataset Viewer API, providing capabilities to view, analyze, search, and filter datasets hosted on Hugging Face Hub.

Installation

From Source

git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
uv venv
source .venv/bin/activate
uv add -e .

Configuration

Claude Desktop

{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "parent_to_repo/dataset-viewer",
        "run",
        "dataset-viewer"
      ]
    }
  }
}

Available Tools

Tool             Description
validate         Checks whether a dataset exists and is accessible
get_info         Retrieves detailed information about a dataset
get_rows         Retrieves dataset content with pagination
get_first_rows   Retrieves the first rows from a dataset split
get_statistics   Retrieves statistics for a dataset split
search_dataset   Searches for text in a dataset
filter           Filters rows using SQL-like conditions
get_parquet      Downloads the full dataset in Parquet format
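The "SQL-like conditions" accepted by filter can be illustrated with a small sketch. It assumes the conditions follow the Dataset Viewer API convention of double-quoted column names and single-quoted string values, and that embedded quotes are escaped by doubling them (standard SQL, not confirmed for this server). The helper names `condition` and `where_clause` are hypothetical, as is the name of the tool argument that would receive the resulting string.

```python
# Comparison operators this sketch allows; the API may accept more.
ALLOWED_OPS = {"=", "<>", ">", ">=", "<", "<="}


def quote_column(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded ones."""
    return '"' + name.replace('"', '""') + '"'


def quote_value(value) -> str:
    """Render a Python value as a literal: numbers bare, strings single-quoted."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def condition(column: str, op: str, value) -> str:
    """Build one comparison such as "label" = 1."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    return f"{quote_column(column)} {op} {quote_value(value)}"


def where_clause(conditions) -> str:
    """Combine conditions so that all of them must hold."""
    return " AND ".join(conditions)


if __name__ == "__main__":
    print(where_clause([condition("label", "=", 1),
                        condition("text", "<>", "")]))
```

Building the string from validated parts rather than by hand keeps quoting mistakes out of the condition, which is the most common reason a filter request is rejected.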

Features

  • Uses the dataset:// URI scheme for accessing Hugging Face datasets
  • Supports dataset configurations and splits
  • Provides paginated access to dataset content
  • Handles authentication for private datasets
  • Supports searching and filtering dataset content
  • Provides dataset statistics and analysis
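Paginated access can be sketched as a sequence of (offset, length) windows over a split. The Dataset Viewer API returns at most 100 rows per request; whether get_rows exposes offsets, page numbers, or something else is not stated in the README, so `page_windows` below is a hypothetical helper that only shows the arithmetic.

```python
from typing import Iterator, Tuple

# The Dataset Viewer API returns at most 100 rows per request.
MAX_PAGE = 100


def page_windows(total_rows: int, page_size: int = MAX_PAGE) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that together cover total_rows rows."""
    if not 1 <= page_size <= MAX_PAGE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE}")
    for offset in range(0, total_rows, page_size):
        # The last window is shortened so it does not run past the end.
        yield offset, min(page_size, total_rows - offset)


if __name__ == "__main__":
    print(list(page_windows(250)))
    # [(0, 100), (100, 100), (200, 50)]
```

The total row count for a split would come from the dataset's metadata or statistics rather than being guessed.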

Environment Variables

Optional

  • HUGGINGFACE_TOKEN - Your Hugging Face API token for accessing private datasets
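One way to supply the token is through the `env` field that Claude Desktop supports for each entry under `mcpServers`. The sketch below generates such a configuration. The README does not show this variant, so treat it as an assumption about how the server picks up the variable; `claude_desktop_config` is a hypothetical helper, and both the path and the token are placeholders.

```python
import json
from typing import Optional


def claude_desktop_config(repo_dir: str, token: Optional[str] = None) -> dict:
    """Build a Claude Desktop config entry for the dataset-viewer server.

    The token is optional and only needed for private datasets.
    """
    server = {
        "command": "uv",
        "args": ["--directory", repo_dir, "run", "dataset-viewer"],
    }
    if token:
        # ASSUMPTION: the server reads HUGGINGFACE_TOKEN from its environment.
        server["env"] = {"HUGGINGFACE_TOKEN": token}
    return {"mcpServers": {"dataset-viewer": server}}


if __name__ == "__main__":
    # Placeholder path and token; substitute real values before use.
    config = claude_desktop_config("parent_to_repo/dataset-viewer", "hf_your_token_here")
    print(json.dumps(config, indent=2))
```

Keeping the token in the client configuration rather than in the repository avoids committing it by accident.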

Notes

Requires Python 3.12 or higher and the uv package installer. MIT License.
