Matching measures
To compare one attribute of two records, you can use any of the implemented matching functions, such as Exact, Levenshtein and Jaro-Winkler, or a custom matching algorithm that you have created.
You can also compare two records on many attributes. For two records to match, the following two conditions must hold:
- When using the T-Swoosh algorithm, the score of each matching function in the match rule must meet its threshold, if one is specified. By default, the threshold is set to 1. With most matching functions, a threshold of 1 requires the two values to be identical; the exceptions are Exact - ignore case and, potentially, any custom matching function.
- The global score, computed as a weighted average of the scores of the different matching functions, must exceed the match threshold. The score is equal to Σ(w_i × s_i(r1, r2)) / Σ w_i, where w_i is the confidence weight of matching function i and s_i(r1, r2) is the score of matching function i over records r1 and r2.
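
The two conditions above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names weighted_score and records_match are hypothetical, the per-function scores are assumed to be already computed and to lie between 0 and 1, and both threshold comparisons are assumed to be inclusive (a score equal to the threshold passes).

```python
from typing import Optional, Sequence


def weighted_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Global score: sum(w_i * s_i) / sum(w_i)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the confidence weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total_weight


def records_match(
    scores: Sequence[float],
    weights: Sequence[float],
    match_threshold: float,
    attribute_thresholds: Optional[Sequence[float]] = None,
) -> bool:
    """Apply the per-function check, then the global check."""
    # Default per-function threshold of 1, as described in the first condition.
    if attribute_thresholds is None:
        attribute_thresholds = [1.0] * len(scores)
    if len(attribute_thresholds) != len(scores):
        raise ValueError("one threshold is expected per matching function")

    # Condition 1: every matching function must meet its own threshold
    # (inclusive comparison is an assumption of this sketch).
    if any(s < t for s, t in zip(scores, attribute_thresholds)):
        return False

    # Condition 2: the weighted global score must meet the match threshold
    # (inclusive comparison is an assumption of this sketch).
    return weighted_score(scores, weights) >= match_threshold


# Two matching functions: the first scored 1.0, the second 0.8,
# with confidence weights 1 and 3.
scores = [1.0, 0.8]
weights = [1, 3]
print(weighted_score(scores, weights))
print(records_match(scores, weights, 0.8, attribute_thresholds=[1.0, 0.7]))
print(records_match(scores, weights, 0.8))
```

In this example the global score is (1 × 1.0 + 3 × 0.8) / (1 + 3), about 0.85. The records match when the second function's threshold is lowered to 0.7, but not with the default threshold of 1, because the second function scored only 0.8.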