AutoML MCP Server
An automated machine learning server that exposes data analysis, preprocessing, model selection, and hyperparameter tuning as Model Context Protocol (MCP) tools.
Installation
From Source Code
git clone https://github.com/emircansoftware/AutoML.git
cd AutoML
pip install -r requirements.txt
pip install uv
Configuration
Claude Desktop
{
  "mcpServers": {
    "AutoML": {
      "command": "uv",
      "args": [
        "--directory",
        "C:\\YOUR\\PROJECT\\PATH\\AutoML",
        "run",
        "main.py"
      ]
    }
  }
}
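Backslashes in Windows paths must be doubled inside JSON, which is easy to get wrong when editing the file by hand. The following is a minimal sketch, not part of the project, that builds the same fragment in Python and lets `json.dumps` handle the escaping. The helper name `build_automl_config` is hypothetical, and the path is the placeholder from the configuration above.

```python
import json


def build_automl_config(project_dir: str) -> dict:
    """Return a Claude Desktop config fragment for the AutoML server.

    Hypothetical helper: it mirrors the JSON shown above and is not
    shipped with the repository.
    """
    return {
        "mcpServers": {
            "AutoML": {
                "command": "uv",
                "args": ["--directory", project_dir, "run", "main.py"],
            }
        }
    }


# Placeholder path; replace it with the directory where AutoML was cloned.
config = build_automl_config(r"C:\YOUR\PROJECT\PATH\AutoML")

# json.dumps escapes each backslash, so the output is valid JSON.
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed fragment can then be merged into the Claude Desktop configuration by hand.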
Available Tools
| Tool | Description |
|---|---|
| `information_about_data` | Provides detailed information about the data |
| `reading_csv` | Reads a CSV file |
| `visualize_correlation_num` | Visualizes correlation matrix for numerical columns |
| `visualize_correlation_cat` | Visualizes correlation matrix for categorical columns |
| `visualize_correlation_final` | Visualizes correlation matrix after preprocessing |
| `visualize_outliers` | Visualizes outliers in the data |
| `visualize_outliers_final` | Visualizes outliers after preprocessing |
| `preprocessing_data` | Preprocesses data (removing outliers, filling missing values, etc.) |
| `prepare_data` | Prepares data for models (encoding, scaling, etc.) |
| `models` | Selects and evaluates models based on task type |
| `visualize_accuracy_matrix` | Visualizes confusion matrix for predictions |
| `best_model_hyperparameter` | Tunes hyperparameters of the best model |
| `test_external_data` | Tests external data with the best model and returns predictions |
| `predict_value` | Predicts target column value for new input data |
| `feature_importance_analysis` | Analyzes feature importance in the data using XGBoost |
Features
- Comprehensive dataset statistics including size, memory usage, data types, and missing values
- Efficient CSV file reading with pandas and pyarrow support
- Correlation analysis and visualization for numerical and categorical variables
- Outlier detection and visualization
- Automated preprocessing with missing value handling, categorical feature encoding, and scaling
- Regression algorithms including Linear Regression, Ridge, Lasso, ElasticNet, Random Forest, XGBoost, SVR, KNN, and CatBoost
- Classification algorithms including Logistic Regression, Ridge Classifier, Random Forest, XGBoost, SVM, KNN, Decision Tree, Naive Bayes, and CatBoost
- Performance metrics for regression (R², MAE, MSE) and classification (Accuracy, F1-Score)
- Confusion matrix visualization for classification tasks
- Model comparison capabilities
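For readers unfamiliar with the metrics listed above, here is a small pure-Python sketch of their standard definitions. It is illustrative only and is not the server's implementation. The function names are hypothetical, and the F1 function assumes a binary task with a single positive label, since the averaging scheme the project uses is not stated.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for one positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, made up purely for illustration.
reg_true, reg_pred = [3.0, 5.0, 7.0], [2.0, 5.0, 8.0]
clf_true, clf_pred = [1, 0, 1, 1], [1, 0, 0, 1]

print("MAE:", mae(reg_true, reg_pred))
print("MSE:", mse(reg_true, reg_pred))
print("R2:", r2(reg_true, reg_pred))
print("Accuracy:", accuracy(clf_true, clf_pred))
print("F1:", f1_binary(clf_true, clf_pred))
```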
Usage Examples
Analyze dataset statistics and missing values for heart.csv
Preprocess data by handling missing values and outliers for target column
Train and compare multiple classification models on heart disease dataset
Visualize correlation matrix for numerical features in the dataset
Optimize hyperparameters for RandomForestClassifier with 100 trials
Notes
Requires Python 3.8+. Before running the server, update the data path in utils/read_csv_file.py to match your project directory, and set the correct local project path in the Claude Desktop configuration. The repository includes 16 sample datasets from Kaggle for testing.
