An open AutoML benchmark run as a container-based competition on both academic (OpenML CC18) and industrial (Finance and ODS) datasets.
The goal of this benchmark is to provide an open, interactive interface for evaluating AutoML systems on a wide range of tasks and datasets. The benchmark covers both academic datasets and real-world industrial datasets in order to give a better picture of the current state of AutoML system performance. It is extensible: additional dataset groups, and updated versions of existing ones, will be added according to the benchmark roadmap.
Benchmark solutions are end-to-end AutoML systems that both automatically build ML models on a given dataset and use the best-fitted model for inference on that dataset's test data. Solutions are submitted to the automatic testing system and evaluated on groups of datasets (see Dataset section).
Solution evaluation consists of 3 phases:
The complete scoring process of each AutoML solution consists of the following 3 steps:
Step 1. For each dataset in every dataset group, evaluate the respective metric_value on the test data predictions.
Step 2. For each dataset, calculate its dataset_score as the metric value relative to the metric value of the linear baseline:
dataset_score = metric_value / metric_baseline
Step 3. For each dataset group calculate its group_score as the average dataset_score within this group.
total_score is the average dataset_score across all datasets in the current benchmark.
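The three scoring steps above can be sketched in a few lines of Python. This is only an illustration, not the benchmark's actual scoring code: the function name score_benchmark, the nested-dictionary layout, and all group names, dataset names, and metric values below are assumptions made up for the example.

```python
from statistics import mean


def score_benchmark(metric_values, baseline_values):
    """Compute per-group scores and the total score.

    Both arguments are hypothetical nested dicts of the form
    {group_name: {dataset_name: metric_value}}.
    """
    group_scores = {}
    all_dataset_scores = []
    for group, datasets in metric_values.items():
        # Step 2: dataset_score = metric_value / metric_baseline
        dataset_scores = [
            value / baseline_values[group][name]
            for name, value in datasets.items()
        ]
        # Step 3: group_score is the average dataset_score within the group
        group_scores[group] = mean(dataset_scores)
        all_dataset_scores.extend(dataset_scores)
    # total_score averages over all datasets, not over the group scores
    total_score = mean(all_dataset_scores)
    return group_scores, total_score


# Illustrative numbers only (Step 1 results are assumed to be given).
solution_metrics = {
    "academic": {"dataset_a": 0.9, "dataset_b": 0.8},
    "industrial": {"dataset_c": 0.6},
}
baseline_metrics = {
    "academic": {"dataset_a": 0.75, "dataset_b": 0.8},
    "industrial": {"dataset_c": 0.5},
}

group_scores, total_score = score_benchmark(solution_metrics, baseline_metrics)
print(group_scores, total_score)
```

Because total_score is averaged over datasets, groups with more datasets carry more weight in it than they would in a plain average of the group scores.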
Official support channel: #automl_benchmark in the ODS.ai Slack. If you are not registered, please join the community.