Code Mining


Code Mining is an industry-oriented field focused on analyzing source code and code artifacts with every means available to the data science community. In other words, we represent Data Analysis for Software Engineering.

The track first appeared in the OpenDataScience community in 2019 at Data Fest Siberia 2, in the form of two overview talks: "Programmers – write, robots – read. Why does business need automatic code analysis?" and "Source code analysis: an overview of tasks, recent articles, and developments".

Since then, at various conferences on software engineering and data analysis, we have seen an increasing interest in the tasks being solved by specialists in the field:

  • auto-completion and its life these days;
  • automated code review;
  • code summarization and code generation;
  • source code similarity analysis (clone search);
  • error search, vulnerability search;
  • evaluation of source code quality criteria;
  • source code classification and structuring;
  • evaluation of source code authors;
  • analysis of development history;
  • code artifact analysis;
  • and many others.

Here we discuss tools, new approaches, and challenges that arise in this area. We hold our own educational tracks as part of ODS conferences, such as the track at Data Fest Online 2020.
Some quick stats: 8 speakers from 4 companies, more than 100 participants at the event, and more than 1,500 afterward on YouTube.

We also run hackathons related to data analysis for software engineering. The first one was a lighthearted event called Scary Code, with up to 40 participants, timed to coincide with ODS Data Halloween 2020.

Here you can see a short summary of 2020.

Plans for 2021

  • more educational content at;
  • more educational and promo posts/papers related to the ODS community;
  • more serious hackathons (feel free to suggest your own topics as well);
  • more networking and involvement of project participants;
  • automated quality checking and tracking of ODS Community (Pet) Projects in a single place.


Telegram channel:


Code Mining project organizer – company. To participate or to support any kind of activity, feel free to contact Alexey Smirnov @ ODS Slack / Telegram - @alsmirn or text to