Vincent D. Warmerdam PyData Eindhoven 2024

Vincent D. Warmerdam

Vincent is a senior data professional and recovering consultant who has worked as an engineer, researcher, team lead, and educator. He is especially interested in understanding algorithmic systems so that failures can be prevented. As such, he prefers simpler solutions that scale and worries more about data quality than about the number of tensors thrown at a problem. He's also well known for creating calmcode as well as about a dozen open-source packages.

He's currently employed at probabl where he works together with scikit-learn core maintainers to improve the ecosystem of tooling.

Sessions

07-11

12:00

30min

Scikit-Learn can do THAT?!

Vincent D. Warmerdam

Many of us know scikit-learn for its ability to construct pipelines that can do .fit().predict(). It's an amazing feature for sure. But once you dive into the codebase ... you realise that there is just so much more.

This talk will attempt to demonstrate some extra features in scikit-learn, and its ecosystem, that are less common but deserve to be in the spotlight.

In particular I hope to discuss these things that scikit-learn can do:

sparse datasets and models
larger than memory datasets
sample weight techniques
image classification via embeddings
tabular embeddings/vectorisation
data deduplication
pipeline caching
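
As a rough illustration of a few items on this list (sparse inputs, sample weights, larger-than-memory training, and pipeline caching), here is a minimal sketch. It is not taken from the talk: the synthetic data, the weighting scheme, the batch size, and the choice of estimators are all assumptions made for the example.

```python
import tempfile

import numpy as np
from scipy import sparse
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic data (assumption): 200 rows, 50 features, roughly 95% zeros.
X_dense = rng.random((200, 50))
X_dense[X_dense < 0.95] = 0.0
y = rng.integers(0, 2, size=200)

# Sparse datasets: a scipy CSR matrix can be passed straight to a linear model.
X_sparse = sparse.csr_matrix(X_dense)

# Sample weights: later rows count more here (an arbitrary illustrative choice).
weights = np.linspace(0.1, 1.0, num=200)
sparse_model = LogisticRegression().fit(X_sparse, y, sample_weight=weights)

# Larger-than-memory datasets: estimators with partial_fit learn batch by batch,
# so only one batch needs to be in memory at a time.
stream_model = SGDClassifier(random_state=0)
for start in range(0, 200, 50):
    batch = slice(start, start + 50)
    stream_model.partial_fit(X_sparse[batch], y[batch], classes=np.array([0, 1]))

# Pipeline caching: with `memory` set, fitted transformers are cached on disk
# and reused when the same step is refit on the same data.
with tempfile.TemporaryDirectory() as cache_dir:
    cached_pipe = make_pipeline(
        PCA(n_components=5), LogisticRegression(), memory=cache_dir
    )
    cached_pipe.fit(X_dense, y)
    preds = cached_pipe.predict(X_dense)
```

In a real out-of-core setting the batches would come from disk or a database rather than from slices of an in-memory matrix; the slicing above only stands in for that stream.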

If time allows I may also touch on extra topics.

Else (1.3)
