PyData Eindhoven 2024

Enhancing Event Analysis at Scale: Leveraging Tracking Data in Sports.
07-11, 14:45–15:15 (Europe/Amsterdam), If (1.1)

Learn how to automate the generation of contextual metrics from tracking data to enrich event analysis, and how to handle the daily influx of games efficiently by scaling out the entire architecture.


In the dynamic landscape of sports analytics, the integration of tracking data has opened new frontiers for in-depth event analysis. Yet the use of this data remains a bottleneck, particularly when dealing with a large volume of games: the computation is either too expensive or too slow. The presentation will focus on automating the generation of contextual metrics from tracking data at scale, and on their use by professionals and decision-makers.
The presentation will showcase an architecture and an automated pipeline designed to handle the influx of games. Leveraging Python and cloud computing services such as message queues, we efficiently manage incoming game data by scaling the infrastructure based on the workload, ensuring optimal performance during peak periods while minimizing costs during quieter times. The presentation will strike a balance between technical depth and practical application. Attendees will gain insights into the architecture required to efficiently process hundreds of games weekly, while accommodating the thousands already present in the database. The advantage granted by this method will be quantified in terms of time and resources, to show data scientists and data engineers the efficiency they could reach.
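
To make the idea of scaling the infrastructure based on the workload concrete, here is a minimal sketch. It is not the pipeline presented in the talk: the function name `desired_workers`, its parameters, and the default values are hypothetical, and an in-process `queue.Queue` stands in for a cloud message queue. It only illustrates one common rule, which is to size the worker pool from the number of games waiting in the queue.

```python
import math
import queue


def desired_workers(queue_depth: int, games_per_worker: int = 5,
                    min_workers: int = 0, max_workers: int = 20) -> int:
    """Return how many workers to run for a given backlog of games.

    All names and default values are illustrative assumptions.
    """
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    if games_per_worker <= 0:
        raise ValueError("games_per_worker must be positive")
    # One worker per `games_per_worker` queued games, rounded up.
    needed = math.ceil(queue_depth / games_per_worker)
    # Clamp: scale down to min_workers when idle, cap at max_workers at peak.
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    # Stand-in for a cloud message queue receiving one message per game.
    games = queue.Queue()
    for game_id in range(12):
        games.put(f"game-{game_id}")
    print(desired_workers(games.qsize()))  # prints 3
```

With `min_workers=0`, an empty queue scales the pool down to zero, which corresponds to minimizing costs during quieter times; `max_workers` bounds spending during peak periods.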


Prior Knowledge Expected

No prior knowledge expected

I am a French Data Scientist, holding an engineering diploma from Telecom Paris and a Master's degree from Institut Polytechnique de Paris in Applied Mathematics and Data Science.
At the end of my studies, I completed a Data Science internship at Parma Calcio 1913. I now serve as a full-time Data Scientist at the club, working on leveraging tracking data.