I'm Thomas van Dongen, head of AI engineering at Springer Nature and co-founder of Minish, an open-source ML lab focused on efficient, eco-friendly models and packages.

Some of the projects I work on:
- model2vec: a library for creating state-of-the-art static embedding models by distilling sentence transformers.
- semhash: a library for multimodal deduplication, outlier detection, and representative filtering.
- pyversity: a library for retrieval result diversification.
- vicinity: a library for fast, lightweight nearest-neighbor search with flexible indexing backends.
- tokenlearn: a library for pre-training static embedding models.
- model2vec-rs: a Rust port of Model2Vec.
- agentcheck: a Go CLI tool that checks what an AI agent can access before you run it.