B4A Library ABSimMetrics 'Fuzzy' Library

A library for 'fuzzy' string matching. It measures the similarity between two strings, or searches a table of strings for the best match to a given string.

More information on the algorithms can be found at Algorithms

Implemented algorithms:

CHAPMAN_MATCHING_SOUNDEX
CHAPMAN_MEAN_LENGTH
CHAPMAN_ORDERED_NAME_COMPOUND_SIMILARITY
JARO
JARO_WINKLER
LEVENSHTEIN_DISTANCE
MONGE_ELKAN
NEEDLEMAN_WUNCH
QGRAMS_DISTANCE
SMITH_WATERMAN
SMITH_WATERMAN_GOTOH
SMITH_WATERMAN_GOTOH_WINDOWED_AFFINE
SOUNDEX
TAGLINK
TAGLINK_TOKEN

Functions

ABFindBestMatchAll(myList, SearchTerm) as ABFoundMatch

Finds the best match of a string in a table of strings using all available algorithms.

Returns an ABFoundMatch object

Example:
B4X:
Dim sim as ABSimMetrics
Dim mat as ABFoundMatch
Dim mylist(3) as String
 
mylist(0) = "Alain"
mylist(1) = "Aldo"
mylist(2) = "Albrecht"

mat = sim.ABFindBestMatchAll(myList, "Albert")
 
msgbox (mat.FoundString, "")

ABFindBestMatch(myList, SearchTerm, algorithm) as ABFoundMatch

Finds the best match of a string in a table of strings using a specific algorithm.

Returns an ABFoundMatch object

Example:
B4X:
Dim sim as ABSimMetrics
Dim mat as ABFoundMatch
Dim mylist(3) as String

mylist(0) = "Alain"
mylist(1) = "Aldo"
mylist(2) = "Albrecht"

mat = sim.ABFindBestMatch(myList, "Albert", sim.LEVENSHTEIN_DISTANCE)

msgbox (mat.FoundString, "")

ABGetSimilarity(string1, string2, algorithm) as float

Gives the percentage by which one string matches another, using a specific algorithm.

Returns a float (percentage)

Example:
B4X:
dim sim as ABSimMetrics
dim perc as float
perc = sim.ABGetSimilarity("Albrecht", "Albert", sim.LEVENSHTEIN_DISTANCE)

msgbox (perc, "")
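
As a rough sketch, ABGetSimilarity could also be called once per element to score every entry of a table, not only the best one. The loop below uses only the function and constant shown above; the array name names, the loop variable i and the use of Log for output are choices made for illustration.

B4X:
Dim sim As ABSimMetrics
Dim perc As Float
Dim i As Int
Dim names(3) As String

names(0) = "Alain"
names(1) = "Aldo"
names(2) = "Albrecht"

' Score each entry against the search term, one call per element
For i = 0 To names.Length - 1
    perc = sim.ABGetSimilarity(names(i), "Albert", sim.LEVENSHTEIN_DISTANCE)
    Log(names(i) & ": " & perc)
Next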

ABFoundMatch Object

Properties:

Percentage as int
FoundString as String
UsedAlgorithm as int
UsedAlgorithmName as String
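
A minimal sketch of reading these properties after a search, assuming they are plain readable fields as listed above (the array name names and the Log output are for illustration only):

B4X:
Dim sim As ABSimMetrics
Dim mat As ABFoundMatch
Dim names(3) As String

names(0) = "Alain"
names(1) = "Aldo"
names(2) = "Albrecht"

mat = sim.ABFindBestMatchAll(names, "Albert")

' Report the best match, its score, and the algorithm that produced it
Log(mat.FoundString & " (" & mat.Percentage & "%) using " & mat.UsedAlgorithmName)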
 

Attachments

  • ABSimMetricsTest.zip (5.9 KB)
  • ABSimMetrics.zip (116.3 KB)

latcc

In the demo app you can search Array 'MyStr'.

sim.ABFindBestMatch(MyStr, "search-for-this-string", 1)

This returns only 1 record: the best match from within Array 'MyStr'.

How can you find other matches from within the array?



The obvious method would be using a counter ...

for counter = 1 to end-array-value
sim.ABFindBestMatch(MyStr(counter), "search-for-this-string", 1)
next counter

But MyStr(counter) returns an error!

So how?
 