What is CML?
Continuous Machine Learning (CML) is an open-source library for implementing continuous integration & delivery (CI/CD) in machine learning projects via popular CI systems like GitHub Actions & GitLab CI. Use CI to automate parts of your development workflow, including model training and evaluation, comparing ML experiments across your project history, and monitoring changing datasets.
If you want to learn more about CI as a philosophy and a practice, and how it relates to ML, check out this talk.
We built CML with these principles in mind:
- GitFlow for data science. Use GitLab or GitHub to manage ML experiments and to track who trained ML models or modified data, and when. Codify data and models with DVC instead of pushing them to a Git repo.
- Auto reports for ML experiments. Auto-generate reports with metrics and plots in each Git Pull Request. Rigorous engineering practices help your team make informed, data-driven decisions.
- No additional services. Build your own ML platform using just GitHub or GitLab and your favorite cloud services: AWS, Azure, GCP. No databases, services or complex setup needed.
On every pull request, CML helps you automatically train and evaluate models, then generates a visual report with results and metrics. Above, an example report for a neural style transfer model.
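To make the report step concrete, here is a minimal sketch of how a CI job might turn evaluation metrics into a Markdown report. It is not CML's own implementation: the function name `build_report`, the metric names, and the table layout are all assumptions chosen for illustration, and only the Python standard library is used.

```python
from __future__ import annotations


def build_report(
    metrics: dict[str, float],
    baseline: dict[str, float] | None = None,
) -> str:
    """Render evaluation metrics as a Markdown table (hypothetical helper).

    `metrics` holds the values measured on the current pull request;
    `baseline` optionally holds values from the main branch for comparison.
    """
    baseline = baseline or {}
    lines = [
        "## Model evaluation",
        "",
        "| Metric | This PR | Baseline | Change |",
        "| --- | --- | --- | --- |",
    ]
    # Sort metric names so the report is stable from run to run.
    for name in sorted(metrics):
        value = metrics[name]
        if name in baseline:
            base = baseline[name]
            row = f"| {name} | {value:.4f} | {base:.4f} | {value - base:+.4f} |"
        else:
            # No baseline recorded for this metric, so no change can be shown.
            row = f"| {name} | {value:.4f} | n/a | n/a |"
        lines.append(row)
    return "\n".join(lines) + "\n"
```

In a workflow, the training script would call something like `build_report({"accuracy": 0.91}, {"accuracy": 0.89})`, write the returned string to a file, and leave it to the CI job to post that file on the pull request. The numbers here are placeholders, not results from a real model.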
How to contribute
CML is a new project and we welcome contributions. The best way to familiarize yourself with the project is to read our docs and tutorial, then explore some of our use cases (all are available for GitHub Actions & GitLab CI). We are also building a YouTube tutorial series!
The Issues list in our repo is a good place to look for contribution ideas. Here are some broad areas where we welcome users to dig in:
- Creating a new use case for CI in an ML project
- Finding bugs, issues, or barriers to CML adoption in your projects
- Leveraging advanced compute resources, such as Kubernetes, for running CI jobs (resource orchestration is one of CML's main strengths)
- Creating wrappers to integrate CML with your favorite ML tools; for example, we have a wrapper for TensorBoard
- Improving the docs (we can always use help here!)
Additionally, the concept of CI/CD is only beginning to be explored in the context of machine learning. We welcome blog posts about your thoughts and experiments with CML. At this stage, a great blog post or tutorial could be as influential and helpful as a pull request.