How to validate email format using Perl regex?
Validating email addresses using Perl regex requires balancing between correctness and practicality. The full email specification (RFC 5322) is complex, so most Perl scripts use simplified regular expressions that catch common valid forms while rejecting obvious invalid ones.
Perl regexes use the match operator =~ with patterns inside slashes /.../. Common modifiers include i for case-insensitive matching. Email domains are case-insensitive, although a pattern built from \w and [a-zA-Z] already accepts both upper- and lowercase letters.
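As a minimal sketch of the match operator and the i modifier, the snippet below checks whether an address ends in a given domain, with and without case-insensitive matching. The helper name `matches_domain` and the sample address are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the address ends in "@example.com".
# The second argument switches case-insensitive matching on or off.
sub matches_domain {
    my ($address, $ignore_case) = @_;
    if ($ignore_case) {
        # /i lets "Example.COM" match "example.com"
        return $address =~ /\@example\.com\z/i ? 1 : 0;
    }
    # Without /i the letters must match exactly
    return $address =~ /\@example\.com\z/ ? 1 : 0;
}

my $address = 'User@Example.COM';    # made-up sample input
print "with /i:    ", matches_domain($address, 1), "\n";
print "without /i: ", matches_domain($address, 0), "\n";
```

The at-sign is written as \@ here so that Perl can never mistake it for the start of an array name inside the pattern.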
Key points about email format
- An email generally consists of a local-part, the literal at-sign @, and a domain.
- The local part can contain letters, digits, dots, plus signs, underscores, and hyphens.
- The domain part contains labels separated by dots, each label using letters, digits, and hyphens, finishing with a top-level domain (TLD). The regex in this article assumes a TLD of 2 to 6 letters, although real TLDs can be longer.
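The three rules above can also be checked piece by piece instead of with one pattern. The following is a rough sketch under those rules only; the function names `split_email` and `looks_like_email` are hypothetical, and unlike a \w-based pattern it rejects underscores in the domain, because the rules list only letters, digits, and hyphens there.

```perl
use strict;
use warnings;

# Hypothetical helper: split at the last at-sign.
# Returns an empty list when either side would be empty.
sub split_email {
    my ($email) = @_;
    my $at = rindex($email, '@');
    return if $at < 1;                      # no at-sign, or nothing before it
    my $local  = substr($email, 0, $at);
    my $domain = substr($email, $at + 1);
    return if $domain eq '';                # nothing after the at-sign
    return ($local, $domain);
}

# Hypothetical checker that applies the three rules one at a time.
sub looks_like_email {
    my ($email) = @_;
    my ($local, $domain) = split_email($email) or return 0;

    # Local part: letters, digits, dots, plus signs, underscores, hyphens
    return 0 unless $local =~ /\A[A-Za-z0-9._+-]+\z/;

    # Domain: dot-separated labels; the -1 limit keeps empty trailing fields
    my @labels = split /\./, $domain, -1;
    return 0 if @labels < 2;

    # Last label is the TLD: 2 to 6 letters, as assumed in this article
    my $tld = pop @labels;
    return 0 unless $tld =~ /\A[A-Za-z]{2,6}\z/;

    # Remaining labels: letters, digits, hyphens, and never empty
    for my $label (@labels) {
        return 0 unless $label =~ /\A[A-Za-z0-9-]+\z/;
    }
    return 1;
}

print looks_like_email('user@example.com') ? "ok\n" : "rejected\n";
print looks_like_email('user@.com')        ? "ok\n" : "rejected\n";
```

Checking the parts separately makes it easier to report which rule failed, at the cost of more code than a single regex.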
Example Perl code to validate email format with regex
#!/usr/bin/perl
use strict;
use warnings;

sub validate_email {
    my ($email) = @_;
    # This regex balances simplicity and common practical usage.
    return $email =~ /^[\w.+-]+@[\w-]+(\.[\w-]+)*\.[a-zA-Z]{2,6}$/i;
}

my @test_emails = (
    'user@example.com',
    'user.name+tag+sorting@example.co.uk',
    'user_name@example-domain.com',
    'user@localserver',   # Invalid - no TLD
    'invalid-email@',     # Invalid - no domain
    'justtext',           # Invalid - no @
    'user@.com',          # Invalid - domain cannot start with dot
);

for my $email (@test_emails) {
    if (validate_email($email)) {
        print "'$email' is a valid email format.\n";
    } else {
        print "'$email' is NOT a valid email format.\n";
    }
}
How the regex works
- ^[\w.+-]+ : One or more word characters (\w = letters, digits, underscore), dots (.), plus signs (+), or hyphens (-) at the start, forming the local part.
- @ : Literal at-sign separating the local part and the domain.
- [\w-]+ : One or more word characters or hyphens for the first domain label.
- (\.[\w-]+)* : Zero or more additional dot-separated domain labels.
- \.[a-zA-Z]{2,6}$ : Ends with a dot and an alphabetic TLD of 2 to 6 letters.
- The i modifier makes matching case-insensitive. It adds little here, because \w and [a-zA-Z] already match both upper- and lowercase letters.
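The same breakdown can be kept next to the pattern itself by using Perl's x modifier, which ignores whitespace outside character classes and allows comments inside the regex. This is a sketch of the article's pattern laid out that way; the names `$EMAIL_RE` and `validate_email_x` are made up for the example, and the at-sign is escaped as \@ purely as a precaution.

```perl
use strict;
use warnings;

# The article's pattern, spread over several lines with /x.
# Comments inside the pattern avoid sigils, since variables interpolate there.
my $EMAIL_RE = qr{
    ^ [\w.+-]+          # local part: word chars, dot, plus, hyphen
    \@                  # literal at-sign
    [\w-]+              # first domain label
    ( \. [\w-]+ )*      # zero or more further dot-separated labels
    \. [a-zA-Z]{2,6}    # final dot and a TLD of 2 to 6 letters
    $                   # end of string
}xi;

# Hypothetical wrapper that normalizes the result to 1 or 0.
sub validate_email_x {
    my ($email) = @_;
    return $email =~ $EMAIL_RE ? 1 : 0;
}

for my $email ('user@example.com', 'user@localserver') {
    printf "%-20s %s\n", $email, validate_email_x($email) ? 'valid' : 'NOT valid';
}
```

Returning an explicit 1 or 0 also sidesteps a quirk of returning a bare match with a capture group, which yields the captured text rather than a boolean when the function is called in list context.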
Common pitfalls and notes
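- This regex accepts several forms that RFC 5322 and hostname rules disallow, such as consecutive dots in the local part (a..b@example.com), a leading or trailing dot in the local part, and domain labels that start or end with a hyphen or contain underscores.
- The $ anchor also matches just before a trailing newline, so an unchomped input line passes. Use \z to require the true end of the string.
- The {2,6} limit rejects addresses under longer TLDs, and \w can match non-ASCII letters and digits when the input is a decoded Unicode string.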
- It does not support quoted local parts or internationalized domains.
- Domain existence or DNS validation is not checked here, only format.
- For stricter validation, consider using CPAN modules like Email::Valid, but these are outside core Perl.
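To make these pitfalls concrete, here is one possible tightened variant that still uses only core Perl, shown next to the article's pattern. It is a sketch, not an RFC 5322 validator: the names `$ATOM`, `$LABEL`, and `validate_email_strict` are hypothetical, the sample addresses are made up, and quoted local parts and internationalized domains remain unsupported.

```perl
use strict;
use warnings;

# The article's pattern, normalized to return 1 or 0 for comparison.
sub validate_email {
    my ($email) = @_;
    return $email =~ /^[\w.+-]+@[\w-]+(\.[\w-]+)*\.[a-zA-Z]{2,6}$/i ? 1 : 0;
}

# Hypothetical building blocks, ASCII-only on purpose.
my $ATOM  = qr/[A-Za-z0-9_+-]+/;                           # dot-free run in the local part
my $LABEL = qr/[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?/;  # no leading or trailing hyphen

# Assumed tightening: \A and \z anchors, no empty dot-separated pieces,
# no underscores in the domain, and no upper limit on TLD length.
sub validate_email_strict {
    my ($email) = @_;
    return $email =~ /\A${ATOM}(?:\.${ATOM})*\@${LABEL}(?:\.${LABEL})*\.[A-Za-z]{2,}\z/ ? 1 : 0;
}

my @cases = (
    [ 'consecutive dots', 'a..b@example.com' ],
    [ 'trailing newline', "user\@example.com\n" ],
    [ 'leading hyphen',   'user@-example.com' ],
    [ 'long TLD',         'user@example.photography' ],
);

for my $case (@cases) {
    my ($label, $email) = @$case;
    printf "%-18s original=%d strict=%d\n",
        $label, validate_email($email), validate_email_strict($email);
}
```

The first three cases pass the original pattern but fail the stricter one, while the long-TLD case goes the other way.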
Overall, this regex provides a fast, easy way to filter obviously malformed email formats in Perl scripts without extra dependencies, using core Perl features compatible with all modern Perl 5 versions.
Verified Code
Executed in a sandbox to capture real output. • v5.34.1 • 7ms
'user@example.com' is a valid email format.
'user.name+tag+sorting@example.co.uk' is a valid email format.
'user_name@example-domain.com' is a valid email format.
'user@localserver' is NOT a valid email format.
'invalid-email@' is NOT a valid email format.
'justtext' is NOT a valid email format.
'user@.com' is NOT a valid email format.