You are here: Functions > REGEXFIND()

REGEXFIND()

The REGEXFIND() function allows you to construct a regular expression to find matching patterns in data. Regular expressions are powerful and flexible search strings that combine literal characters and meta-characters, which are special characters that perform a wide variety of search operations. See Regular Expressions for specific details on the full list of special meta-characters.

The REGEXFIND() function returns a logical value indicating whether the pattern specified by a regular expression occurs in a string.

Note: The current implementation of regular expressions in Analyzer does not fully support searching languages other than English.

Constructing regular expressions can be tricky, especially if you are new to using regular expression syntax. You may be able to achieve your search goals using simpler search functions such as FIND(), LISTFIND(), MATCH(), or MAP(). If your search requirements exceed the capabilities of these simpler functions, regular expressions provide almost unlimited flexibility in constructing search strings.

If you want to replace matching patterns with a new string using regular expressions, see REGEXREPLACE() function.

Function Syntax

REGEXFIND(string, pattern)

string - the character field, expression, or literal value to test for a matching pattern
pattern - the matching pattern (regular expression) being searched for within the string. The pattern parameter can contain literal characters, meta-characters, or a combination of the two. Literal characters include all alphanumeric characters, some punctuation characters, and blanks. For a list of meta-characters, see Regular Expressions. The pattern parameter must be enclosed in quotation marks.

Note: The search is case-sensitive, which means that uppercase and lowercase alpha characters must be explicitly specified.

Examples

The first simple example uses the group, alternation, and quantifier meta-characters to create a regular expression that finds “Smith”, “Smyth”, “Smithe”, or “Smythe” in the Last_Name field.

REGEXFIND(Last_Name,"Sm(i|y)the{0,1}")

Matching between the string and pattern parameters is performed sequentially. In the example above, “S” is matched against the first position in the Last_Name field, “m” is matched against the second position, “i” and “y” are matched against the third position, “t” is matched against the fourth position, and so on.

Spaces (blanks) are treated as characters in both the string and pattern parameters, so you should exercise care when dealing with spaces. In the pattern parameter, you can indicate a space either literally, by typing a space, or by using the meta-character \s. Using the meta-character makes spaces easier to read, and less likely to be overlooked, in a regular expression.

The second example demonstrates concatenating two or more fields in the string parameter when you want to search across multiple fields simultaneously.

REGEXFIND(Vendor_Name+Vendor_Street,"Hardware.*Main")

searches both the Vendor_Name and the Vendor_Street fields for the words “Hardware” and “Main” separated by zero or more characters. A business with the word “Hardware” in its name, located on a street called “Main”, would match the regular expression. So would a business called “Hardware on Main”. The concatenated fields are treated like a single field that includes leading and trailing spaces from the individual fields, unless you use the ALLTRIM() function to remove leading and/or trailing spaces.

Because REGEXFIND() searches for the characters in the pattern parameter in the order in which you specify them, the order in which you concatenate the fields has an effect. If you reversed Vendor_Name and Vendor_Street in the expression above, you would be less likely to get any results.

The following table lists some additional examples of using the REGEXFIND() function:

Example

Return value

REGEXFIND(City,"Chicago|Boston|New York")

Returns T for cities that contain the exact words "Chicago"¯, "Boston", or "New York".

Returns F otherwise.

REGEXFIND(First_Name,"Damian|Damien|Damion ")

Returns T for first names that contain the exact words "Damian", "Damien", or "Damion".

Returns F otherwise.

REGEXFIND(Last_Name,"Sm(i|y)the{0,2}")

Returns T for last names that contain a word starting with "Smi"¯ or "Smy"¯ followed by "th", "the", or "thee"¯.

For example: "Smith", "Smyth", "Smithe", "Smythe", "Smithee", "Smythee", "Jones-Smith, etc.

Returns F otherwise.

REGEXFIND(Last_Name,"^Sm(i|y)the?\b")

Returns T for only those last names starting from the first position that are solely "Smith", "Smyth", "Smithe", or "Smythe".

Returns F otherwise.

REGEXFIND(Last_Name,"Sm(i|y)the?\b")

Returns T for last names that contain a word starting with "Smi"¯ or "Smy"¯ followed by "th" or "the".

For example: "Smith", "Smyth", "Smithe", "Smythe", "Jones-Smith", Jones-Smithe", Jones-Smyth", Jones-Smythe", etc.

Returns F otherwise.

REGEXFIND(Purchase_Order,"[a-zA-Z]{3}-\d{4}")

Returns T for all records with Purchase Orders that start with 3 alphabetic characters, followed by a hyphen and 4 digits.

Returns F otherwise.

REGEXFIND(Purchase_Order,"\b\d{2,}-[a-zA-Z]{5}")

Returns T for all records with Purchase Orders that start with 2 or more numeric digits, followed by a hyphen and 5 or more letters.

Returns F otherwise.

REGEXFIND(Purchase_Order,"62")

Returns T for all records with Purchase Orders that contain "62".

Returns F otherwise.

REGEXFIND(Purchase_Order,"\b62")

Returns T for all records with Purchase Orders that begin with "62"¯.

Returns F otherwise.

REGEXFIND(Purchase_Order,"62\b")

Returns T for all records with Purchase Orders that end with "62"¯.

Returns F otherwise.

REGEXFIND(Purchase_Order,"\b\d\d\d\d62")

Returns T for all records with Purchase Orders that contain "62"¯ in the 5th and 6th positions.

Returns F otherwise.

REGEXFIND(Purchase_Order,"\b\d{4}62")

Returns T for all records with Purchase Orders that contain "62"¯ in the 5th and 6th positions.

Returns F otherwise.

REGEXFIND(Purchase_Order, "\b\w{4}62")

Returns T for all records with alphanumeric Purchase Order identifiers that contain "62"¯ in the 5th and 6th positions.

Returns F otherwise.

REGEXFIND(Purchase_Order, "\b.{4}98")

Returns T for all records with Purchase Order identifiers that contain any characters in the first four positions , and "98"¯ in the 5th and 6th positions

Returns F otherwise.

REGEXFIND(Purchase_Order, "\b.{1,3}62")

Returns T for all records with Purchase Order identifiers that contain "62"¯ preceded by 1 to 3 initial characters.

Returns F otherwise.

REGEXFIND(Purchase_Order, "\b.{3}[14]62")

Returns T for all records with Purchase Order identifiers that contain:

any character in the first three positions, and "1"¯ or "4"¯ in the 4th position, and "98" in the 5th and 6th positions

Returns F otherwise.