levenshtein-distance – Tarik Billa

Levenshtein: MySQL + PHP

December 31, 2023 by Tarik

Improving search result using Levenshtein distance in Java

December 28, 2023 by Tarik

Without understanding the meaning of the words, as @DrYap suggests, the next logical unit for comparing two words (if you are not looking for misspellings) is the syllable. It is very easy to modify Levenshtein to compare syllables instead of characters. The hard part is breaking the words into syllables. There is a Java implementation, TeXHyphenator-J …
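
As a rough illustration of comparing syllables instead of characters, the sketch below runs the usual dynamic-programming recurrence over any two sequences, so lists of syllables work as well as strings. The function name sequence_levenshtein and the hand-split syllable lists are hypothetical; the hyphenation step itself (the hard part) is assumed to have been done elsewhere.

```python
def sequence_levenshtein(a, b):
    """Edit distance between two sequences (strings, or lists of syllables)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]  # deleting the first i items of a to reach the empty prefix of b
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute x with y (free if equal)
            ))
        prev = cur
    return prev[-1]


# Hypothetical, hand-split syllables; a real hyphenator would produce these.
print(sequence_levenshtein(["le", "ven", "shtein"], ["le", "ven", "stein"]))  # 1
```

With syllable lists, one differing syllable costs 1 regardless of how many characters it contains.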

Most efficient way to calculate Levenshtein distance

December 25, 2023 by Tarik

The Wikipedia entry on Levenshtein distance has useful suggestions for optimizing the computation. The most applicable one in your case is that if you can put a bound k on the maximum distance of interest (anything beyond that might as well be infinity!), you can reduce the computation to O(n times k) instead of …
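
One way to realise the bound k is to fill in only the diagonal band of the table where |i - j| <= k, which gives roughly n rows of 2k + 1 cells each. The following is a minimal sketch under that assumption; the function name bounded_levenshtein and the choice to return None for "more than k" are illustrative, not taken from the original answer.

```python
def bounded_levenshtein(a, b, k):
    """Return the Levenshtein distance of a and b if it is <= k, else None."""
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the length difference alone already exceeds the bound
    inf = k + 1      # stands in for "more than k"
    width = 2 * k + 1
    # Band index d maps to column j = i + d - k of row i.
    prev = [d - k if 0 <= d - k <= m else inf for d in range(width)]  # row 0
    for i in range(1, n + 1):
        cur = [inf] * width
        for d in range(width):
            j = i + d - k
            if j < 0 or j > m:
                continue  # outside the table
            if j == 0:
                cur[d] = i  # first column: delete i characters
                continue
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = prev[d] + cost                   # substitute / match
            if d + 1 < width:
                best = min(best, prev[d + 1] + 1)   # delete from a
            if d > 0:
                best = min(best, cur[d - 1] + 1)    # insert into a
            cur[d] = min(best, inf)
        if min(cur) > k:
            return None  # every cell in the band already exceeds k
        prev = cur
    result = prev[m - n + k]
    return result if result <= k else None


print(bounded_levenshtein("kitten", "sitting", 3))  # 3
print(bounded_levenshtein("kitten", "sitting", 2))  # None
```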

How can I optimize this Python code to generate all words with word-distance 1?

December 25, 2023 by Tarik

If your wordlist is very long, might it be more efficient to generate all possible 1-letter-differences from ‘word’, then check which ones are in the list? I don’t know any Python, but there should be a suitable data structure for the wordlist allowing for log-time lookups. I suggest this because if your words are reasonable …
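
The idea can be sketched in Python roughly as follows. Two assumptions are made here that the answer does not state: "1-letter difference" is read as Levenshtein distance 1 (one deletion, substitution, or insertion), and the wordlist is held in a built-in set, whose hash lookups are constant-time on average rather than log-time. The function name neighbours and the lowercase ASCII alphabet are illustrative.

```python
import string


def neighbours(word, wordlist, alphabet=string.ascii_lowercase):
    """Return the words in `wordlist` (a set) at edit distance 1 from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    substitutions = {
        left + c + right[1:] for left, right in splits if right for c in alphabet
    }
    inserts = {left + c + right for left, right in splits for c in alphabet}
    candidates = (deletes | substitutions | inserts) - {word}
    return candidates & wordlist  # one hash lookup per candidate


words = {"cat", "cot", "cart", "at", "dog", "cats"}
print(sorted(neighbours("cat", words)))  # ['at', 'cart', 'cats', 'cot']
```

The number of candidates grows with the word length and alphabet size, not with the size of the wordlist, which is where the saving comes from on long lists.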

How to add levenshtein function in mysql?

December 6, 2023 by Tarik

I connected to my MySQL server and simply executed this statement in MySQL Workbench, and it worked: I now have a new function, levenshtein(). For example, this works as expected: SELECT levenshtein('abcde', 'abced'); returns 2.

Best machine learning technique for matching product strings

December 6, 2023 by Tarik

My first thought is to try to parse the names into a description of features (company LG, size 42 Inch, resolution 1080p, type LCD HDTV). Then you can match these descriptions against each other for compatibility; it’s okay to omit a product number but bad to have different sizes. Simple are-the-common-attributes-compatible might be enough, or …

Fast Levenshtein distance in R?

December 5, 2023 by Tarik

Levenshtein distance: how to better handle words swapping positions?

September 12, 2023 by Tarik

Text clustering with Levenshtein distances

September 1, 2023 by Tarik

Implementing a simple Trie for efficient Levenshtein Distance calculation – Java

August 17, 2023 by Tarik

From what I can tell, you don’t need to improve the efficiency of Levenshtein distance; you need to store your strings in a structure that saves you from running distance computations so many times, i.e. by pruning the search space. Since Levenshtein distance is a metric, you can use any of the metric spaces …
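
The excerpt cuts off before naming a particular structure. Purely as an illustration of pruning with the metric property, the sketch below uses a BK-tree, one well-known index that relies on the triangle inequality; it is not necessarily the structure the answer goes on to recommend, and the class and function names are hypothetical.

```python
def levenshtein(a, b):
    """Plain dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


class BKTree:
    """Each node is (word, children), where children maps distance -> node."""

    def __init__(self, distance=levenshtein):
        self.distance = distance
        self.root = None

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query, max_dist):
        """Return sorted (distance, word) pairs within max_dist of query."""
        results = []
        stack = [self.root] if self.root else []
        while stack:
            word, children = stack.pop()
            d = self.distance(query, word)
            if d <= max_dist:
                results.append((d, word))
            # Triangle inequality: only subtrees whose edge label lies in
            # [d - max_dist, d + max_dist] can contain a match.
            for edge, child in children.items():
                if d - max_dist <= edge <= d + max_dist:
                    stack.append(child)
        return sorted(results)


tree = BKTree()
for w in ["book", "books", "cake", "boo", "cape", "cart", "boon", "cook"]:
    tree.add(w)
print(tree.search("bok", 1))  # [(1, 'boo'), (1, 'book')]
```

Subtrees whose edge label falls outside the allowed range are skipped without computing any distances for the words inside them.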