Implementing a simple Trie for efficient Levenshtein Distance calculation – Java

From what I can tell you don’t need to improve the efficiency of Levenshtein Distance, you need to store your strings in a structure that stops you needing to run distance computations so many times i.e by pruning the search space.

Since Levenshtein distance is a metric, you can use any of the metric spaces indices which take advantage of triangle inequality – you mentioned BK-Trees, but there are others eg. Vantage Point Trees, Fixed-Queries Trees, Bisector Trees, Spatial Approximation Trees. Here are their descriptions:

Burkhard-Keller Tree

Nodes are inserted into the tree as follows:
For the root node pick an arbitary element
from the space; add unique edge-labeled
children such that the value of each edge is
the distance from the pivot to that
element; apply recursively, selecting the
child as the pivot when an edge already
exists.

Fixed-Queries Tree

As with BKTs except: Elements are stored
at leaves; Each leaf has multiple elements;
For each level of the tree the same pivot is
used.

Bisector Tree

Each node contains two pivot elements
with their covering radius (maximum
distance between the centre element and
any of its subtree elements); Filter into two
sets those elements which are closest to
the first pivot and those closest to the
second, and recursively build two subtrees
from these sets.

Spatial Approximation Tree

Initially all elements are in a bag; Choose
an arbitrary element to be the pivot; Build
a collection of nearest neighbours within
range of the pivot; Put each remaining
element into the bag of the nearest
element to it from collection just built;
Recursively form a subtree from each
element of this collection.

Vantage Point Tree

Choose a pivot from the set abitrarily;
Calculate the median distance between this
pivot and each element of the remaining
set; Filter elements from the set into left
and right recursive subtrees such that
those with distances less than or equal to
the median form the left and those greater
form the right.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)