Regex extract
Function: Regex Extract
This action helps you find and pull out specific pieces of text from a larger body of text using a powerful pattern-matching tool called a "Regular Expression" (Regex). It's perfect for when you need to extract data that follows a consistent format, like email addresses, phone numbers, or specific codes.
Input
- Text: The main block of text from which you want to extract information.
- Regex: The pattern (regular expression) that defines what you are looking for within the text.
Output
This action creates a new variable (by default named 'RESULT', but you can choose a different name) which will contain a list of all extracted text pieces that match your specified pattern. If your Regex includes "capture groups" (parts of the pattern enclosed in parentheses), the list will contain the text matched by those specific groups.
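The output rule above can be tried outside the tool with a short Python sketch. It assumes the action behaves like Python's `re.findall`, which may not be the engine the action actually uses. The function name `regex_extract` and the sample text are made up for illustration.

```python
import re


def regex_extract(text: str, pattern: str) -> list:
    """Hypothetical stand-in for the action: return every match as a list."""
    return re.findall(pattern, text)


sample = "Ticket A-101 was merged into ticket B-202."

# No capture group: each list item is the whole match.
whole_matches = regex_extract(sample, r"[A-Z]-\d{3}")

# One capture group: each list item is only the text inside the parentheses.
group_matches = regex_extract(sample, r"[A-Z]-(\d{3})")

print(whole_matches)  # ['A-101', 'B-202']
print(group_matches)  # ['101', '202']
```

Note that with more than one capture group, Python returns a tuple per match; whether the action does the same is not stated here.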
Real-Life Examples
Example 1: Extracting Email Addresses
Imagine you have a customer feedback form where users sometimes include their email addresses in the comments section, and you want to automatically pull these out.
- Inputs:
- Text: "Thanks for the great service! You can reach me at [email protected] or my work email [email protected] if needed."
- Regex: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
- Result Variable Name: CustomerEmails
- Result: A new variable named CustomerEmails will be created, containing a list: ["[email protected]", "[email protected]"].
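To try this example outside the tool, here is a minimal Python sketch. The two addresses are made-up placeholders, not the ones from the feedback text. The last character class is written as `[A-Za-z]`, because a `|` inside square brackets is a literal pipe character, not an "or".

```python
import re

# Email pattern from Example 1, with the final class written as [A-Za-z].
EMAIL_PATTERN = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"

# Placeholder addresses, assumed for illustration only.
text = (
    "Thanks for the great service! You can reach me at "
    "jane@example.com or my work email jane.doe@example.org if needed."
)

customer_emails = re.findall(EMAIL_PATTERN, text)
print(customer_emails)  # ['jane@example.com', 'jane.doe@example.org']
```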
Example 2: Finding Order Numbers
You receive daily reports as plain text, and you need to quickly identify all order numbers, which always start with "ORD-" followed by five digits.
- Inputs:
- Text: "Report for 2023-10-26. New orders: ORD-12345, ORD-67890. Returns: RET-001. Another order: ORD-54321."
- Regex: ORD-(\d{5})
- Result Variable Name: FoundOrderNumbers
- Result: A new variable named FoundOrderNumbers will be created, containing a list: ["12345", "67890", "54321"]. (Note: The parentheses in the regex (\d{5}) create a capture group, so only the digits are extracted, not "ORD-".)
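The effect of the capture group is easiest to see by running the pattern with and without the parentheses. The sketch below uses Python's `re.findall` as an assumed stand-in for the action.

```python
import re

report = (
    "Report for 2023-10-26. New orders: ORD-12345, ORD-67890. "
    "Returns: RET-001. Another order: ORD-54321."
)

# With a capture group, only the five digits are returned.
digits_only = re.findall(r"ORD-(\d{5})", report)

# Without the parentheses, the whole match is returned, prefix included.
full_numbers = re.findall(r"ORD-\d{5}", report)

print(digits_only)   # ['12345', '67890', '54321']
print(full_numbers)  # ['ORD-12345', 'ORD-67890', 'ORD-54321']
```

If you need the "ORD-" prefix in your results, leave the parentheses out of the pattern.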
Example 3: Extracting Product Codes from a Description
You have product descriptions that sometimes include a specific product code format: three letters, a hyphen, and four numbers (e.g., "ABC-1234"). You want to list all such codes.
- Inputs:
- Text: "This is a new product (XYZ-9876) with improved features. Also check out item DEF-5432. Old model GHI-1122 is discontinued."
- Regex: [A-Z]{3}-\d{4}
- Result Variable Name: ProductCodes
- Result: A new variable named ProductCodes will be created, containing a list: ["XYZ-9876", "DEF-5432", "GHI-1122"].
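Here is the same example as a Python sketch, again assuming `re.findall` behaves like the action. It also shows a caveat: without word boundaries, the pattern can match part of a longer code. The "ABCD-12345" string is a made-up input for illustration.

```python
import re

description = (
    "This is a new product (XYZ-9876) with improved features. "
    "Also check out item DEF-5432. Old model GHI-1122 is discontinued."
)

product_codes = re.findall(r"[A-Z]{3}-\d{4}", description)
print(product_codes)  # ['XYZ-9876', 'DEF-5432', 'GHI-1122']

# The plain pattern also matches inside a longer token.
loose = re.findall(r"[A-Z]{3}-\d{4}", "Serial ABCD-12345")

# Adding \b on both sides requires the code to stand alone.
strict = re.findall(r"\b[A-Z]{3}-\d{4}\b", "Serial ABCD-12345")

print(loose)   # ['BCD-1234']
print(strict)  # []
```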