Vincent D. Warmerdam
Vincent is a senior data professional and recovering consultant who has worked as an engineer, researcher, team lead, and educator. He is especially interested in understanding algorithmic systems so that failures can be prevented. As such, he prefers simpler solutions that scale and worries more about data quality than about the number of tensors thrown at a problem. He is also well known for creating calmcode as well as about a dozen open-source packages.
He's currently employed at probabl where he works together with scikit-learn core maintainers to improve the ecosystem of tooling.

Sessions
Many of us know scikit-learn for its ability to construct pipelines that can do .fit().predict(). It's an amazing feature for sure. But once you dive into the codebase ... you realise that there is just so much more.
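For readers who have not seen the pattern, here is a minimal sketch of a pipeline that does .fit().predict(). The synthetic dataset and the choice of StandardScaler and LogisticRegression are assumptions made for illustration, not material from the talk.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used here only as a stand-in for a real dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Chain a preprocessing step and a model into a single estimator.
pipe = make_pipeline(StandardScaler(), LogisticRegression())

# fit() returns the pipeline itself, so the calls can be chained.
predictions = pipe.fit(X, y).predict(X)
```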
This talk will be an attempt at demonstrating some extra features in scikit-learn, and its ecosystem, that are less common but deserve to be in the spotlight.
In particular, I hope to cover the following things that scikit-learn can do:
- sparse datasets and models
- larger than memory datasets
- sample weight techniques
- image classification via embeddings
- tabular embeddings/vectorisation
- data deduplication
- pipeline caching
If time allows, I may also touch on extra topics.
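To give a flavour of three of the items above (sparse data, larger-than-memory datasets, and sample weights), here is a rough sketch of incremental learning with partial_fit. The toy batches, the iter_batches helper, and the uniform weights are hypothetical and exist only for illustration; a real setup would stream batches from disk.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier


def iter_batches():
    # Hypothetical stand-in for reading chunks of a file too large for memory.
    yield ["good movie", "great film"], [1, 1]
    yield ["bad movie", "awful film"], [0, 0]
    yield ["great movie", "awful movie"], [1, 0]


# HashingVectorizer is stateless and outputs a scipy sparse matrix,
# so it can transform each batch without a prior pass over all data.
vectorizer = HashingVectorizer(n_features=2**10)
model = SGDClassifier(random_state=0)

# partial_fit needs the full set of classes up front.
classes = np.array([0, 1])

for texts, labels in iter_batches():
    X_batch = vectorizer.transform(texts)
    # Uniform weights here (an assumption); in practice these could
    # reflect label quality or recency.
    weights = np.ones(len(labels))
    model.partial_fit(X_batch, labels, classes=classes, sample_weight=weights)

prediction = model.predict(vectorizer.transform(["great film"]))
```

Only one batch is held in memory at a time, and the features stay sparse from the vectoriser through to the model.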