Introduction
Regex is short for regular expressions. Regular expressions are patterns used to search for or match text in strings. They exist in many programming languages, but our focus is on JavaScript. In JavaScript, they are used for matching and manipulating strings and for input validation. Regular expressions use a well-defined set of special characters to describe these patterns.
Table of Contents
What is Regex?
Flags
Patterns
Regex literal
Regex constructor
Metacharacters
Character classes
Negated character classes
Ranges
Matching a password as an example
Conclusion
Prerequisite:
Familiarity with JavaScript is required to understand and apply the information in this article.
What is Regex?
Regex are patterns used for searching and matching strings in JavaScript. They are used in input validation, password validation, and other string manipulations.
const myString = "In here somewhere lies my name. Daniel is the name"
const findMyName = /Daniel/g
console.log(myString.match(findMyName))
//Output: ["Daniel"]
In the code above, myString is a line of text. findMyName is a regex, written by enclosing the pattern in forward slashes. The g after the closing slash is called a flag.
Several string methods accept a regex as an argument. In the above code, the match method is used.
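Besides match, a few other built-in methods work with a regex. The sketch below reuses the sentence from the example above; the variable names are only illustrative.

```javascript
const sentence = "In here somewhere lies my name. Daniel is the name";
const namePattern = /Daniel/;

// RegExp.prototype.test returns true or false
const found = namePattern.test(sentence);

// String.prototype.search returns the index of the first match, or -1 if none
const position = sentence.search(namePattern);

// String.prototype.replace swaps the first match for a new string
const replaced = sentence.replace(namePattern, "Ada");

console.log(found);
console.log(position);
console.log(replaced);
```

test is usually the simplest choice when only a yes-or-no answer is needed.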
Flags
Flags are special characters placed after the closing forward slash of a regex that modify how the pattern matching is performed. Six commonly used flags in JavaScript are:
i - case-insensitive matching
g - global, for finding all matches and not just the first
m - multiline, so that ^ and $ match at the start and end of each line rather than only the whole string
s - dot-all, so that the dot also matches newline characters
u - unicode, for handling Unicode characters correctly
y - sticky, for matching only at the position stored in the regex's lastIndex property
The x (verbose) flag found in some other regex flavors is not supported in JavaScript.
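To see the effect of a few flags, here is a small sketch using a made-up three-line string. Note that without the s flag the dot does not match a newline, and without the m flag ^ matches only at the very start of the string.

```javascript
const text = "Cat\ncat\nCAT";

// i ignores case, g collects every match
const allCats = text.match(/cat/gi);      // ["Cat", "cat", "CAT"]

// Without m, ^ matches only at the start of the whole string
const startOnly = text.match(/^cat/gi);   // ["Cat"]

// With m, ^ also matches at the start of each line
const everyLine = text.match(/^cat/gim);  // ["Cat", "cat", "CAT"]

// Without s the dot stops at a newline; with s it matches it
const dotWithoutS = /Cat.cat/.test(text); // false
const dotWithS = /Cat.cat/s.test(text);   // true

console.log(allCats, startOnly, everyLine, dotWithoutS, dotWithS);
```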
Patterns
Patterns are sequences of characters that define how a search or match is performed. They are built from elements such as literals, character sets, and quantifiers.
For example:
const findMyName = /Daniel/g
In the above code, Daniel is a literal pattern: it matches the exact text Daniel.
Regex literal
A regex literal is a pattern enclosed in forward slashes.
The previous code snippet is an example of a regex literal.
Regex constructor
The regex constructor is JavaScript's RegExp constructor, which creates a regex from a string. The constructor takes two arguments: the pattern to be matched and, optionally, the flags.
const findMyName = new RegExp("Daniel", "g")
In the above code, the RegExp constructor takes the string Daniel as the pattern to be matched and the global flag so that all matches are found.
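The constructor is most useful when the pattern is only known at runtime. A minimal sketch follows, assuming the text comes from user input; escapeForRegex is a hypothetical helper written for this example, not a built-in function.

```javascript
// Hypothetical helper: escapes characters that have a special meaning in a regex
function escapeForRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "a.b";

// Unescaped: the dot is a metacharacter and matches any character
const unescaped = new RegExp(userInput);

// Escaped: the dot is treated as a literal dot
const escaped = new RegExp(escapeForRegex(userInput));

console.log(unescaped.test("axb")); // true
console.log(escaped.test("axb"));   // false
console.log(escaped.test("a.b"));   // true
```

Escaping matters because any metacharacter in the input would otherwise change what the pattern matches.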
Metacharacters
Metacharacters are characters that have special meanings. They include:
'.', '^', '+', '*', '$', '?', '|', '()', '{}', etc
const findMe = /^Me/
//Matches if the string starts with "Me"
const findHim = /Him$/
//Matches if the string ends with "Him"
const findMeOrHim = /Me|Him/
//Matches "Me" or "Him" in the string
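The three patterns above can be checked with the test method. The sample strings in this sketch are made up for illustration.

```javascript
const startsWithMe = /^Me/;
const endsWithHim = /Him$/;
const meOrHim = /Me|Him/;

console.log(startsWithMe.test("Me and Him")); // true
console.log(startsWithMe.test("Him and Me")); // false
console.log(endsWithHim.test("Me and Him"));  // true
console.log(meOrHim.test("Only Him here"));   // true
console.log(meOrHim.test("Nobody"));          // false
```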
Some metacharacters are called quantifiers because they specify how many times the preceding character or group may occur. The '?', '+', '*', and '{}' characters fall into this category.
Example:
const findZeroOrMoreA = /a*/
//Matches zero or more occurrences of a
const findOneOrMoreB = /b+/
//Matches one or more occurrences of b
const find_4_To_5_digits = /\d{4,5}/
//Matches 4 to 5 consecutive digits
const findZeroOrOneA = /a?/
//Matches zero or one occurrence of a
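Here is a short sketch of the quantifiers applied to sample strings (the strings are illustrative). Quantifiers are greedy by default, so they match as many characters as the pattern allows.

```javascript
// a* may match zero characters, so it succeeds with an empty match
const zeroOrMore = "bbb".match(/a*/)[0];

// b+ needs at least one b and takes the whole run
const oneOrMore = "abbbc".match(/b+/)[0];

// {4,5} takes five digits when five are available
const fourToFive = "1234567".match(/\d{4,5}/)[0];

// u? makes the u optional, so both spellings match
const optionalU = "color colour".match(/colou?r/g);

console.log(zeroOrMore);  // ""
console.log(oneOrMore);   // "bbb"
console.log(fourToFive);  // "12345"
console.log(optionalU);   // ["color", "colour"]
```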
Character classes
Character classes are used to match any one character from a set of characters. The characters searched for are enclosed in square brackets.
Example:
const findVowel = /[aeiou]/
//Matches any lowercase vowel
const findDigit = /[0-9]/
//Matches any digit
const findLetter = /[A-Za-z]/
//Matches any uppercase or lowercase letter
Some of the classes can be represented in shorthand form. For example:
const findDigit =/\d/
//for /[0-9]/
const findNonDigit = /\D/
//for non-digits
const findWord = /\w/
//for word character: [A-Za-z0-9_]
const findNonWord = /\W/
//for non-word character: [^A-Za-z0-9_]
const findWhiteSpace = /\s/
//for whitespace
const findNonWhiteSpace = /\S/
//for non-white space
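Most metacharacters lose their special meaning inside a character class. The exceptions are - (which shows a range), ^ (which negates the class when placed first), ] (which closes the class), and \ (which escapes a character).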
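As a rough illustration, the dot behaves differently inside and outside a class, and the shorthand \w can be combined with a quantifier to pull out whole words. The sample strings are made up.

```javascript
// Outside a class, the dot matches any character
const dotOutside = /a.c/.test("abc");      // true

// Inside a class, the dot matches only a literal dot
const dotInside = /a[.]c/.test("abc");     // false
const literalDot = /a[.]c/.test("a.c");    // true

// \w+ matches runs of letters, digits, and underscores
const words = "id_42 ok!".match(/\w+/g);   // ["id_42", "ok"]

console.log(dotOutside, dotInside, literalDot, words);
```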
Negated character classes
The ^ character is used to negate a character class. Placed first inside the square brackets, it means "any character not in this set". This is different from the ^ metacharacter used outside brackets, which matches the start of the string. Example:
const findNonDigit = /[^0-9]/
//Matches any non-digit character
It can also be used with the shorthand character classes.
const findNonDigit = /[^\d]/
//Matches any non-digit character
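A common use of a negated class is stripping unwanted characters. This sketch removes everything except digits from a made-up phone number and shows that [^0-9] and \D give the same result here.

```javascript
const phone = "+1 (555) 010-9999";

// Replace every non-digit character with an empty string
const digitsOnly = phone.replace(/[^0-9]/g, "");

// \D is the shorthand for the same negated class
const digitsOnlyShorthand = phone.replace(/\D/g, "");

console.log(digitsOnly);                          // "15550109999"
console.log(digitsOnly === digitsOnlyShorthand);  // true
```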
Ranges
A range uses the - character to specify a span of characters in a character class. It is used inside the [ ] brackets.
For example:
const findLetter = /[a-zA-Z]/
//Matches any letter from a to z regardless of case
Ranges can be combined with quantifiers to specify how many characters from the class must appear in a row.
const findFourLetters = /^[a-zA-Z]{4}/
//Matches four letters at the start of the string
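The sketch below tries the four-letter pattern on a few sample strings. It also adds a $ anchor to require exactly four letters in total, and shows a common range mistake: [A-z] also covers the punctuation characters that sit between Z and a in the character table.

```javascript
const startsWithFourLetters = /^[a-zA-Z]{4}/;
console.log(startsWithFourLetters.test("Regex101")); // true
console.log(startsWithFourLetters.test("Re9ex"));    // false

// Adding $ means the whole string must be exactly four letters
const exactlyFourLetters = /^[a-zA-Z]{4}$/;
console.log(exactlyFourLetters.test("Rege"));  // true
console.log(exactlyFourLetters.test("Regex")); // false

// Pitfall: [A-z] is wider than it looks and includes the underscore
console.log(/[A-z]/.test("_"));    // true
console.log(/[A-Za-z]/.test("_")); // false
```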
Matching a password as an example
Let us apply what we have learned so far. In this exercise, we are going to create a password-matching pattern. Here are the criteria:
It must be at least 8 characters long, using only letters, digits, and the special characters @#$%&?.
It must contain at least one uppercase letter.
It must contain at least one lowercase letter.
It must contain at least one digit.
It must contain at least one special character.
const checkPassword = /^(?=.*[a-z])(?=.*\d)(?=.*[A-Z])(?=.*[@#$%&?])[A-Za-z\d@#$%&?]{8,}$/
const password = "gaGndhhh$6"
console.log(password.match(checkPassword))
//Returns [ 'gaGndhhh$6', index: 0, input: 'gaGndhhh$6', groups: undefined ] if matched or null if not matched
The above code matches the password string against the regex. If the result is not null, the match was successful. This is just one of the applications of regex; it can also be used to check phone numbers and emails, perform advanced string manipulations, etc.
💡 Tip: To understand the above code, here is more on Lookaheads
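Each (?=...) group in the password pattern is a lookahead that checks one rule without consuming any characters. One way to see what each rule does is to split them into separate patterns. The sketch below takes that alternative approach; failedRules and the rule names are hypothetical, written only for this illustration.

```javascript
// Hypothetical helper: returns the names of the rules a password fails
function failedRules(password) {
  const rules = [
    ["lowercase letter", /[a-z]/],
    ["uppercase letter", /[A-Z]/],
    ["digit", /\d/],
    ["special character", /[@#$%&?]/],
    ["8 or more allowed characters", /^[A-Za-z\d@#$%&?]{8,}$/],
  ];
  return rules
    .filter(([, pattern]) => !pattern.test(password))
    .map(([name]) => name);
}

console.log(failedRules("gaGndhhh$6")); // []
console.log(failedRules("gagndhhh6"));  // ["uppercase letter", "special character"]
```

Separate patterns are longer than a single regex, but they make it possible to tell the user which rule was not met.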
Conclusion
This article explains the basics of regex in JS. I believe with the information provided so far, you can take up a more advanced study of regex. I recommend checking out RegExr for an interactive study of regex.