Phonetic vectorization? #13

@dkbarn

This is more of a question than a bug report:

I have a somewhat different use case from the ones covered in the library's documentation. I want to search for similar-sounding syllables rather than match text character by character, so my plan is to apply some sort of phonetic encoding (e.g. Soundex or Metaphone) to my corpus. But I am not certain how to do this in a way that is compatible with neofuzz's Process: scikit-learn doesn't appear to provide an out-of-the-box vectorizer for phonetic encoding of text, and I'm not sure whether the SubWordVectorizer could somehow be leveraged for this.

Any pointers on how to achieve this with neofuzz?
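
To make the idea concrete, here is a minimal sketch of the kind of phonetic preprocessing I have in mind. It uses a hand-written American Soundex, and the names `soundex` and `phonetic_analyzer` are hypothetical helpers of my own, not part of neofuzz or scikit-learn. It also works per word rather than per syllable, so it only approximates my use case.

```python
import re

# Letter groups of American Soundex; vowels, H, W and Y carry no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character American Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (first + "".join(digits) + "000")[:4]


def phonetic_analyzer(text: str) -> list:
    """Hypothetical analyzer: turn a string into one Soundex token per word."""
    return [soundex(token) for token in re.findall(r"[A-Za-z]+", text)]


print(phonetic_analyzer("Robert Smith"))
print(phonetic_analyzer("Rupert Smyth"))
```

scikit-learn's `CountVectorizer` and `TfidfVectorizer` accept a callable for their `analyzer` parameter, so something like `TfidfVectorizer(analyzer=phonetic_analyzer)` should yield vectors over phonetic tokens. I am assuming such a vectorizer could then be passed to neofuzz's Process in place of the ones shown in the documentation, but I have not confirmed that.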
