Complementary Tools for Fuzzy Matching

Several other functions can help produce meaningful fuzzy matches, especially with tricky data:

INCLUDE() and EXCLUDE() limit the data being matched to the desired characters, such as numbers only or alphabetic characters only. These functions are useful for fuzzy matching on coded financial data such as invoice numbers and purchase order numbers, and even on phone numbers.
ARRANGE() re-orders data in descending order and is useful for address matching. For example, you might use SORTNORMALIZE() to perform an initial fuzzy match on addresses. Because there may be multiple addresses near a certain location or within the same building, you can then use ARRANGE() with INCLUDE() to isolate and re-order only the numeric portion of each address for an additional exact match, which eliminates (or reduces) the likelihood of false positive address matches.
FIND() and LISTFIND() are useful for case- and position-insensitive searches of strings, where the existence of keywords anywhere in a string warrants investigation. FIND() can be used to perform a Google-type search on one or more keywords within a character field, while LISTFIND() is well suited to searching a character field for items in a large list (typically stored in a variable array).
LEFT() and RIGHT() are helpful when you want to explore matches based only on specific leading or trailing characters. Examples include a phone number for which you want only the area code, or a phone number without the area or country code that precedes it.
SUBSTRING() is a commonly used tool for selecting a portion of a larger string by specifying the starting position and length of the data to be captured. It is useful for fuzzy matching on a meaningful portion of a string rather than the whole string, and also for capturing the month or year in a date string or isolating the area code in a 10-digit phone number.
FORMAT() and MAP() enable fuzzy matching based on the arrangement of data (like a zip code or phone number), isolating those items whose format meets specific criteria (for example, a phone number of the form 99-999-999-999 for international calls vs. 999-999-9999 for local North American calls). Conversely, these functions can be used to locate items that fail to conform to an expected arrangement.
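
The ideas behind several of these functions can be sketched outside the product. The Python below is a rough illustration only, not the product's own syntax or implementation: include_chars, exclude_chars, format_mask, numeric_key and matched_keywords are hypothetical helper names chosen for this example, and their behavior is an assumption based on the descriptions above (character filtering, format masking, isolating and re-ordering the numeric part of an address, and case-insensitive keyword search).

```python
# Illustrative stand-ins only; every name here is hypothetical and is not
# part of the product's function library.

DIGITS = "0123456789"


def include_chars(value: str, allowed: str) -> str:
    """Keep only the characters of `value` that appear in `allowed`."""
    return "".join(ch for ch in value if ch in allowed)


def exclude_chars(value: str, unwanted: str) -> str:
    """Drop every character of `value` that appears in `unwanted`."""
    return "".join(ch for ch in value if ch not in unwanted)


def format_mask(value: str) -> str:
    """Reduce a value to its shape: digits become '9', letters become 'X'."""
    return "".join(
        "9" if ch.isdigit() else "X" if ch.isalpha() else ch
        for ch in value
    )


def numeric_key(address: str) -> str:
    """Isolate the digits of an address and sort them in descending order."""
    return "".join(sorted(include_chars(address, DIGITS), reverse=True))


def matched_keywords(text: str, keywords: list[str]) -> list[str]:
    """Return the keywords found anywhere in `text`, ignoring case."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


if __name__ == "__main__":
    # Invoice number reduced to digits only, ready for matching.
    print(include_chars("INV-00123/A", DIGITS))        # 00123
    # Phone number stripped of punctuation.
    print(exclude_chars("(604) 555-0199", " ()-"))     # 6045550199
    # Shape of a phone number, for checking against an expected pattern.
    print(format_mask("604-555-0199"))                 # 999-999-9999
    # Two spellings of one address share a numeric key; a neighbor does not.
    print(numeric_key("123 Main St, Suite 400"))       # 432100
    print(numeric_key("Suite 400 - 123 Main Street"))  # 432100
    print(numeric_key("125 Main St, Suite 400"))       # 542100
    # Keywords found regardless of case or position.
    print(matched_keywords("Payment to ACME Consulting (cash)",
                           ["consulting", "gift", "Cash"]))
```

Note that a digit-sorted key like numeric_key deliberately ignores order, so distinct numbers with the same digits (123 and 321, for instance) collide. That is why it suits a second-pass check on records that already fuzzy-matched on the rest of the address, rather than a standalone match.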