DataFest — Belgrade, May 31

FON, University of Belgrade


12:00 — Guest Registration & Welcome

13:05 — Opening remarks · Salavat Gariffullin, ODS Serbia

13:15 — Welcome Address

Prof. Dragan Vukmirović, PhD (FON University) · From Big Data to AI-native: 7V, Synthetic Data and the New Role of Data Science in Industry



🏛 Stage 0 · Читаоница · 140 seats

(Library, Ground Floor)

🤖 Industry & Robots

| Time | Speaker | Company | Talk | Description |
|---|---|---|---|---|
| 13:30 | Fedor Kurdov | Yandex | RL for Real-World Robot Motion Planning | How RL without imitation learning was built from scratch for Yandex sidewalk delivery rovers. |
| 14:00 | Aleksey Postnikov | Sber Robotics Lab | Physical AI: Status and the Road Ahead | Broad overview of Physical AI: synthetic data, sim-to-real, RL over behavior cloning, and learning from human videos. |
| 14:30 | Dmitrii Iunovidov | LogicYield | Making Industrial CV Fly on Edge CPUs | Running industrial computer vision on edge CPUs in harsh factory conditions using inference optimization and neuro-symbolic methods. |
| 15:00 | Fedor Konovalenko | MIL Team (MIPT) | From Model Compression to Local Inference Platform | How a model compression tool evolved into a local GenAI inference platform with OpenAI-compatible API and multi-engine support. |

15:30 — 16:30 · Break


🔬 Career & Quality

| Time | Speaker | Company | Talk | Description |
|---|---|---|---|---|
| 16:30 | Maksim Artemev | Morphic | How (Not) to Find a Job | Common candidate mistakes, hiring mechanics, and salary negotiation from the perspective of applicant, manager, and recruiter. |
| 17:00 | Vladimir Kukushkin | Independent Researcher | Beyond Funnels: Advanced UX Analytics | How to study user behavior deeper than traditional funnels using advanced UX analytics tools and journey analysis. |
| 17:30 | Alexey Vasilev | Sber AI Lab | SplitLight: RecSys Evaluation Toolkit | Open-source toolkit for analyzing datasets and split strategies in RecSys to make offline evaluation transparent and reproducible. |
| 18:00 | Alexey Korotkov & Timofey Garaev | MIPT AI Institute | SHARP: Span-level Hallucination Annotation for Reasoning Paths | New span-level dataset for hallucination detection in LLM reasoning paths, yielding better downstream quality for PRM models. |


🏛 Stage 1 · Амфитеатар 1 · 180 seats

(Lecture Hall, 1st Floor)

🛡️ Trust in AI

| Time | Speaker | Company | Talk | Description |
|---|---|---|---|---|
| 13:30 | Oleg Sekachev | Yandex | Vector DB & Specialist LLM in Labelling Pipelines | Using a vector DB to reuse accumulated annotation data, plus a domain-tuned LLM pipeline as a cost-quality compromise. |
| 14:00 | Anastasiia Margolina | Banco Plata | How we (didn't) build an AutoEval | A story about evaluating AI when the answers involve real money, and how a single prompt turned into a full methodology. |
| 14:30 | Stefan Hačko | Foursquare | LLM-Powered Harmonization of 100M+ Places | How Foursquare uses LLMs and vector embeddings to clean, match, and unify massive third-party venue datasets at scale. |
| 15:00 | Michael Diskin | Independent Researcher | When Models Should Stay Silent | Measuring model uncertainty, calibrating confidence, and implementing rejection mechanisms for reliable LLMs in production. |

15:30 — 16:30 · Break


⚡ Agents & LLMs

| Time | Speaker | Company | Talk | Description |
|---|---|---|---|---|
| 16:30 | Ksenija Blažević | Lemana Pro | 4 Anti-Patterns in Agentic AI | Common anti-patterns in agentic AI and how to replace them with leaner, cost-efficient architectures. |
| 17:00 | Dmitrii Krasnov | Zencoder | Orchestrating Coding Models | Comparison of sequential, parallel, and OSS orchestration for coding models and their impact on SWE-bench-like benchmarks. |
| 17:30 | Nikita Severin | Independent Researcher | Knowledge Transfer from Pre-trained LLMs to Recommender Models | Efficient knowledge transfer from LLMs to recommender models without costly serving-time inference or architectural changes. |
| 18:00 | Ivan Bushmarinov | Perplexity | User-Guided LLM Answer Quality Evaluation | Leveraging thread-style user feedback and small trained models to evaluate frontier LLM answers and enable scalable benchmarking. |


🏛 Stage 2 · Амфитеатар 2 · 180 seats

(Lecture Hall, 2nd Floor)

💳 Ranking & Banking

| Time | Speaker | Company | Talk | Description |
|---|---|---|---|---|
| 13:30 | Alexander Eroshenko | Yandex | LLM-Powered Item-to-Item Recs in Lavka | Practical case of deploying a compact LLM (Gemma ~270M) for item-to-item recommendations of substitutes and complements. |
| 14:00 | Boris Tseitlin | Banco Plata | Learning from Unstructured Sequences in 2026 | Overview of self-supervised and foundation-model approaches to embeddings from transactions, events, and unstructured sequences. |
| 14:30 | Mikhail Sysoev | Banco Plata | PV Models in Retail Lending | PV models and approaches to optimizing product parameters in card-based fintech products. |
| 15:00 | Victor Barbarich | Banco Plata | Transformers Replace Feature Engineering in Scoring | Moving from manual feature engineering in credit scoring to transformers that learn directly from raw account and employment histories. |

15:30 — 16:30 · Break


🎙️ The Voice of AI

| Time | Speaker | Company | Talk | Description |
|---|---|---|---|---|
| 16:30 | Aleksandr Nikolić | Independent Researcher | Borealis — how to train an audio LLM for the price of a MacBook | Practical story on training an audio LLM with a limited budget using the example of Borealis. |
| 17:00 | Pavel Mazaev | Yandex | Device-Directed Speech Detection for Alice | Production system for detecting speech directed at Yandex Alice to enable natural dialogue without a constant wake word. |
| 17:30 | Ilya Shigabeev | Langswap.app | TTS in 2026: Open Source vs Big Tech | Practical view on modern TTS modeling, competing with big tech, and why small open-source models still matter. |
| 18:00 | Pavel Guliaev | Independent Researcher | Video2Text: Industry State & Practical Choices | State of the Video2Text industry: what works, current limitations, and how to pick solutions for production load and budget. |


18:30 — Closing remarks · Salavat Gariffullin, ODS Serbia

18:45 — After-party 🎉
