A better RegEx pattern for matching e-mail addresses
UPDATE: Also read Stefan Esser’s post: Holes in most
A few weeks ago, I posted a regular expression pattern for matching e-mail addresses.
Below is a more refined version.
Just as with the previous pattern, this one will match most valid e-mail addresses including:
- Addresses with periods and plus signs (e.g. ‘tiffany.brown’ or ‘hotc0derch1ck+todolist’)
- British and Australian second-level domain names such as ‘.co.uk’ and ‘.com.au’
- New top-level domains such as ‘.museum’ and ‘.travel’
This pattern takes advantage of the \w character type. It’s a simpler way of saying “a – z (both upper and lower case), 0 – 9 and the underscore character” (though in many languages, \w matches any alphanumeric character).
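To make the two meanings of \w concrete, here is a small sketch in Python, where \w matches any Unicode word character by default and the re.ASCII flag narrows it to a – z, A – Z, 0 – 9 and the underscore. The names ascii_word, unicode_word and is_word are made up for this illustration, and other regex engines may behave differently.

```python
import re

# \w restricted to a-z, A-Z, 0-9 and the underscore
ascii_word = re.compile(r"\w+", re.ASCII)

# Python 3 default for str patterns: \w also matches non-ASCII letters and digits
unicode_word = re.compile(r"\w+")


def is_word(pattern, text):
    """Return True if the whole of text consists of \\w characters."""
    return pattern.fullmatch(text) is not None


print(is_word(ascii_word, "hotc0derch1ck_"))   # letters, digits, underscore
print(is_word(ascii_word, "tiffany.brown"))    # the period is not a \w character
print(is_word(ascii_word, "josé"))             # non-ASCII letter, ASCII mode
print(is_word(unicode_word, "josé"))           # non-ASCII letter, Unicode mode
```

Note that \w on its own does not cover periods or plus signs, so a pattern that accepts ‘tiffany.brown’ has to allow those characters explicitly.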
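It also checks whether a user or domain name contains at least one, but no more than 64, alphanumeric characters. Sixty-four is the maximum length of the user name (the part before the @) under SMTP. The pattern applies the same cap to domain names, although strictly speaking each dot-separated part of a domain name is limited to 63 characters and the whole domain to 255.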
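As a rough illustration of the ideas described above (the \w character type plus a 1 to 64 character length check), here is a minimal sketch in Python. It is not the pattern from this post, only an assumption about how such a pattern could be put together; EMAIL_SKETCH and looks_like_email are hypothetical names, and the exact character classes are my own choice.

```python
import re

# Illustrative sketch only, not the original pattern from the post.
EMAIL_SKETCH = re.compile(
    r"""
    [\w.+-]{1,64}        # user name: 1 to 64 word characters, periods, plus signs, hyphens
    @
    (?:[\w-]{1,64}\.)+   # one or more domain parts, each followed by a period
    [A-Za-z]{2,}         # top-level domain such as uk, au, museum or travel
    """,
    re.VERBOSE | re.ASCII,  # re.ASCII keeps \w to a-z, A-Z, 0-9 and the underscore
)


def looks_like_email(address):
    """Return True if the entire string matches the sketch pattern."""
    return EMAIL_SKETCH.fullmatch(address) is not None


for candidate in (
    "tiffany.brown@example.co.uk",
    "hotc0derch1ck+todolist@example.com.au",
    "curator@example.museum",
    "a" * 65 + "@example.com",  # user name longer than 64 characters
    "user@example",             # no top-level domain
):
    print(candidate[:40], looks_like_email(candidate))
```

Using fullmatch means the whole string has to fit the pattern, so no explicit start and end anchors are needed. Like any pattern of this kind, it matches most common addresses but does not cover every address the e-mail standards allow.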
This pattern should work with most regular expression engines.
Recommended reading: Mastering Regular Expressions by Jeffrey E. F. Friedl