
An application for controlling the keyboard and mouse using gestures. [GitHub](https://github.com/Samoed/MyFirstDatascienceProject)


Description

An application for controlling the keyboard and mouse using gestures.

Installation

  1. Clone the repository
    git clone https://github.com/Samoed/MyFirstDataScienceProject
  2. Install dependencies
    pip install -r requirements.txt
  3. Run the application
    python ui_app.py
    If the camera window doesn't appear when launching the application, run ui_app.py with the -d or --device flag to switch the camera:
    python ui_app.py -d 1

Issue with Building the Application in Docker

During development, the idea came up to build the application in Docker using PyInstaller. However, a problem appeared with the Pynput library, which interacts with the keyboard and mouse on Linux through X libraries and evdev.

To use Pynput inside a Docker container, linux-headers must be installed, and this has to be done separately for each type of host system. For Debian-like systems, installing the header files via apt is straightforward, but for others, such as Arch-like systems, adding linux-headers is more difficult.

One possible workaround is to copy the header files into the container, but this does not seem optimal.

Working Principle

The video stream from the camera is obtained using opencv and passed to MediaPipe, which then detects hand landmarks. These landmarks are then passed to a model that predicts the gesture. Based on the predicted gesture, a specific action is taken (mouse movement or mouse/keyboard button press).
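
The loop below is a minimal sketch of that pipeline, not the project's actual code: the model file name, the gesture labels, and the gesture-to-action mapping are illustrative assumptions.

    import cv2
    import joblib
    import mediapipe as mp
    import numpy as np
    from pynput.mouse import Button, Controller

    mouse = Controller()
    model = joblib.load("gesture_model.joblib")  # hypothetical path to the trained classifier
    hands = mp.solutions.hands.Hands(max_num_hands=1)

    cap = cv2.VideoCapture(0)  # camera index, switchable via -d/--device in the real app
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB images
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Flatten the 21 landmarks into one feature vector for the classifier
            features = np.array([[p.x, p.y, p.z] for p in lm]).reshape(1, -1)
            gesture = model.predict(features)[0]
            if gesture == "palm":    # assumed label: move the cursor to the wrist position
                h, w = frame.shape[:2]
                mouse.position = (int(lm[0].x * w), int(lm[0].y * h))
            elif gesture == "fist":  # assumed label: left click
                mouse.click(Button.left)
        cv2.imshow("camera", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()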

Experiments

Data

The data was taken from a Kaggle dataset, from which 11 hand gesture classes were selected. Additionally, videos were recorded for 4 more gestures for training purposes. In total, over 15,000 photos were collected, around 1,000 for each class.

Gesture photos were processed with MediaPipe, and the extracted hand landmarks were saved as numpy arrays.
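
A rough sketch of that preprocessing step is given below; the directory layout and output file names are assumptions made for illustration.

    from pathlib import Path

    import cv2
    import mediapipe as mp
    import numpy as np

    DATASET_DIR = Path("dataset")  # assumed layout: dataset/<gesture_name>/<photo>.jpg
    hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

    features, labels = [], []
    for image_path in DATASET_DIR.glob("*/*.jpg"):
        image = cv2.imread(str(image_path))
        if image is None:
            continue
        results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            continue  # skip photos where no hand was detected
        lm = results.multi_hand_landmarks[0].landmark
        # 21 landmarks with x, y, z coordinates -> 63 features per photo
        features.append([coord for p in lm for coord in (p.x, p.y, p.z)])
        labels.append(image_path.parent.name)

    np.save("features.npy", np.array(features, dtype=np.float32))
    np.save("labels.npy", np.array(labels))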

Model Selection

Experiments were conducted with several models. Accuracy was used as the evaluation metric, since the classes are balanced, but F1 was also tracked. The following models were tried: Logistic Regression, Support Vector Machine, Random Forest, XGBoost, CatBoost, and a neural network.

The code for training the models is located in the experiments folder. mlflow was used to store the experiment results.
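
As an illustration only (the actual scripts live in the experiments folder), a run could be logged to mlflow roughly as below, assuming the landmark features were saved as numpy arrays as in the sketch above; the parameter values are placeholders, not the tuned ones.

    import mlflow
    import numpy as np
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = np.load("features.npy"), np.load("labels.npy")
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    with mlflow.start_run(run_name="svm"):
        params = {"C": 10.0, "kernel": "rbf"}  # placeholder values, not the tuned ones
        model = SVC(**params).fit(X_train, y_train)
        preds = model.predict(X_test)
        mlflow.log_params(params)
        mlflow.log_metric("accuracy", accuracy_score(y_test, preds))
        mlflow.log_metric("f1", f1_score(y_test, preds, average="macro"))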

Optimal hyperparameters were tuned for each model using the optuna library.
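
A hedged sketch of such a study for the SVM is shown below; the search ranges and trial count are assumptions, not the settings used in the experiments.

    import numpy as np
    import optuna
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = np.load("features.npy"), np.load("labels.npy")

    def objective(trial):
        params = {
            "C": trial.suggest_float("C", 1e-2, 1e2, log=True),
            "gamma": trial.suggest_float("gamma", 1e-4, 1e1, log=True),
            "kernel": trial.suggest_categorical("kernel", ["rbf", "poly"]),
        }
        # Cross-validated accuracy, matching the main evaluation metric
        return cross_val_score(SVC(**params), X, y, cv=5, scoring="accuracy").mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=50)
    print(study.best_params)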

Results

After training, SVM was chosen as the best model, as it achieved the highest accuracy (0.991).

Model                     Accuracy   F1
Logistic Regression       0.942      0.942
Support Vector Machine    0.991      0.99
Random Forest             0.969      0.969
XGBoost                   0.989      0.989
CatBoost                  0.987      0.987
Neural Network            0.978      0.978

[MLflow experiment results table]

Confusion matrix for SVM: [confusion matrix image]

Gesture classes:

  0. two_fingers_near
  1. one ☝
  2. two ✌
  3. three
  4. four
  5. five
  6. ok 👌
  7. C
  8. heavy 🤟
  9. hang 🤙
  10. palm ✋
  11. L
  12. like 👍
  13. dislike 👎
  14. fist ✊

Test accuracy for the neural network model: [training accuracy plot]

Demo

  1. Gesture Set
  2. Interaction with Miro
