Pringled/README.md

Hi there 👋

I'm Thomas van Dongen. I'm head of AI engineering at Springer Nature and co-founder of Minish, an open-source ML lab focused on efficient, eco-friendly models and packages.

Open-source projects:

  • model2vec: a library for creating state-of-the-art static embedding models by distilling sentence transformers.
  • semhash: a library for multimodal deduplication, outlier detection, and representative filtering.
  • pyversity: a library for retrieval result diversification.
  • vicinity: a library for fast and lightweight nearest neighbor search, with flexible indexing backends.
  • tokenlearn: a library for pre-training static embedding models.
  • model2vec-rs: a Rust port of Model2Vec.
  • agentcheck: a Go CLI tool that checks what an AI agent can access before you run it.

Pinned repositories:

  1. MinishLab/model2vec (Public)

    Fast State-of-the-Art Static Embeddings

    Python · 2k stars · 116 forks

  2. MinishLab/semhash (Public)

    Fast Multimodal Semantic Deduplication & Filtering

    Python · 899 stars · 55 forks

  3. pyversity (Public)

    Fast Diversification for Search & Retrieval

    Python · 485 stars · 27 forks

  4. MinishLab/vicinity (Public)

    Lightweight Nearest Neighbors with Flexible Backends

    Python · 335 stars · 10 forks

  5. agentcheck (Public)

    Check what an AI agent can access before you run it

    Go · 25 stars · 2 forks