Ended 23 months ago
43 participants
131 submissions

Materials (6 MB)

Datasets for local testing
Several datasets for local model testing
3 MB

Dataset groups

To give a realistic overview of AutoML system performance while staying compatible with other major AutoML results, we design our benchmark around groups of datasets. We start with the following dataset groups:

  • OpenML CC18. A total of 36 datasets on binary and multiclass classification tasks.
  • Finance datasets. To be released in October. A group of ~30 datasets on various industrial tasks from the finance industry, covering all three major task types: regression, binary classification, and multiclass classification.
  • ODS crowdsource. To be released in November. A group of ~40 datasets on various tasks from different industries and Data Science domains.

Submission format

Each solution is an archive with code that runs in a Docker container environment. Solution archives are submitted to the automatic testing system for evaluation.

Each solution receives the following information:

  • task_type: “binary” for binary classification, “multiclass” for multiclass classification, or “reg” for regression
  • train_data: path to the training dataset
  • test_data: path to the test dataset, without the target variable
  • output_path: path where the system must save predictions on the test_data
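The four inputs above can be wired into a solution roughly as follows. This is only a minimal sketch, not the official interface: the page does not say how the values are passed in or what file format is used, so the command-line flags, the CSV format with a header row, the `--target` flag, and the single `prediction` output column are all assumptions made for illustration. The "model" is a constant baseline (mean for "reg", most frequent label otherwise) that stands in for a real AutoML system.

```python
import argparse
import csv
from collections import Counter


def fit_baseline(task_type, targets):
    """Constant baseline: mean for "reg", most frequent label otherwise."""
    if task_type == "reg":
        values = [float(t) for t in targets]
        return sum(values) / len(values)
    if task_type in ("binary", "multiclass"):
        return Counter(targets).most_common(1)[0][0]
    raise ValueError(f"unknown task_type: {task_type!r}")


def run(task_type, train_data, test_data, output_path, target="target"):
    # Assumption: both datasets are CSV files with a header row.
    with open(train_data, newline="") as f:
        targets = [row[target] for row in csv.DictReader(f)]
    prediction = fit_baseline(task_type, targets)

    # test_data has no target column, so only its row count is needed here.
    with open(test_data, newline="") as f:
        n_rows = sum(1 for _ in csv.DictReader(f))

    # Assumption: one prediction per test row, in a single "prediction" column.
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prediction"])
        writer.writerows([prediction] for _ in range(n_rows))
    return n_rows


def main(argv):
    # Hypothetical flags; the testing system may deliver these values differently.
    parser = argparse.ArgumentParser()
    parser.add_argument("--task_type", required=True,
                        choices=["binary", "multiclass", "reg"])
    parser.add_argument("--train_data", required=True)
    parser.add_argument("--test_data", required=True)
    parser.add_argument("--output_path", required=True)
    parser.add_argument("--target", default="target")
    args = parser.parse_args(argv)
    return run(args.task_type, args.train_data, args.test_data,
               args.output_path, target=args.target)
```

An entry script would call `main(sys.argv[1:])`. Keeping `run` separate from argument parsing makes it easy to swap in whatever invocation mechanism the testing system actually uses.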

Datasets for local testing

  • dresses-sales: binary, target - 'Class'
  • internet-advertisements: binary, target - 'Class'
  • eucalyptus: multiclass, target - 'Utility'
  • bioresponse: binary, target - 'target'
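For local testing, the list above can be kept as a small lookup table so that each dataset is run with the right task type and target column. This is a sketch under assumptions: the directory layout (`<data_dir>/<name>/train.csv` and `test.csv`), the output file name, and the helper `local_run_args` are hypothetical and not taken from the downloadable materials.

```python
import os

# Task type and target column for each local dataset, as listed above.
LOCAL_DATASETS = {
    "dresses-sales": ("binary", "Class"),
    "internet-advertisements": ("binary", "Class"),
    "eucalyptus": ("multiclass", "Utility"),
    "bioresponse": ("binary", "target"),
}


def local_run_args(name, data_dir, out_dir):
    """Build the inputs a solution would receive for one local dataset."""
    task_type, target = LOCAL_DATASETS[name]
    return {
        "task_type": task_type,
        "target": target,
        # Hypothetical file layout; adjust to how the archive is unpacked.
        "train_data": os.path.join(data_dir, name, "train.csv"),
        "test_data": os.path.join(data_dir, name, "test.csv"),
        "output_path": os.path.join(out_dir, name + "_predictions.csv"),
    }
```

Looping over `LOCAL_DATASETS` and passing each result to the solution gives a quick smoke test across both classification task types before submitting.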
