Ended 10 months ago
137 participants
1628 submissions

Materials (15 MB)

  • train.csv - 11 MB
  • test.csv - 3 MB
  • sample_submission.csv - 1 MB

The dataset was collected from one of the most popular map services. It contains reviews with ratings from 1 to 5, and the task is to predict the rating from the review text.

  • train.csv - The training set, comprising the rate and text of each review. rate is the target for the competition.
  • test.csv - The test set, containing only the text of each review.
  • sample_submission.csv - A submission file in the correct format.
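
As a quick sanity check after downloading, the training file can be inspected with the standard library alone. This is a minimal sketch that assumes train.csv has a header row with a column named rate holding integer values; the column name matches the submission format, but the exact layout of train.csv is not confirmed here, and the helper rating_distribution is a hypothetical name used only for illustration.

```python
import csv
from collections import Counter


def rating_distribution(path):
    """Count how many reviews carry each rating in a CSV file.

    Assumes a header row with a `rate` column of integers
    (an assumption about the file layout, not a confirmed schema).
    """
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[int(row["rate"])] += 1
    return counts


if __name__ == "__main__":
    dist = rating_distribution("train.csv")
    for rate in sorted(dist):
        print(rate, dist[rate])
```

A heavily skewed distribution would suggest that class imbalance deserves attention before choosing a baseline.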

You can download the dataset by following the link.

Evaluation

Submissions are scored using the F1-score.
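
The averaging applied across the five classes is not stated here. As a rough illustration only, the sketch below computes a per-class F1 and takes the unweighted mean (macro averaging). Both the averaging choice and the function name macro_f1 are assumptions, not the competition's confirmed scorer.

```python
def macro_f1(y_true, y_pred, labels=(1, 2, 3, 4, 5)):
    """Unweighted mean of per-class F1 over the given labels.

    Macro averaging is an assumption; the actual scorer may differ.
    """
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Under this assumption every rating counts equally, so a model that always predicts the most frequent rating would score poorly even if its accuracy looks reasonable.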

Submission File

For each row in the test set, you need to predict one of the 5 rates, from 1 to 5. The file should contain a header and have the following format:

index,rate
0,5
1,5
2,5
3,5
...
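
A file in this format can be produced with the standard csv module. The sketch below is a minimal example that assumes index is the zero-based position of the review in test.csv, as the sample rows suggest; the function name write_submission and the default output path are hypothetical.

```python
import csv


def write_submission(predictions, path="submission.csv"):
    """Write predictions as `index,rate` rows, one per test review.

    Assumes `index` is the zero-based row position in test.csv.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["index", "rate"])  # required header
        for index, rate in enumerate(predictions):
            if rate not in (1, 2, 3, 4, 5):
                raise ValueError(f"rate must be an integer from 1 to 5, got {rate!r}")
            writer.writerow([index, rate])
```

The range check is there to catch stray values, such as 0 or a float, before the file is uploaded.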

Usage

  1. Clone the repo: git clone https://github.com/e0xextazy/nlp_huawei_new2_task.git
  2. Enter the repo directory: cd nlp_huawei_new2_task/
  3. Create a virtual environment: python3.7 -m venv venv
  4. Activate the virtual environment: source venv/bin/activate
  5. Set up your baseline:
    1. TF-IDF + Logistic Regression: ./setup/setup_tf_idf_logreg.sh
    2. Catboost: ./setup/setup_catboost.sh
    3. LSTM: ./setup/setup_lstm.sh
    4. Transformers: ./setup/setup_transformers.sh
  6. Download data: ./setup/download_data.sh
  7. Enjoy!

Authors
