How to use character classes [a-z] in Perl regex?

Character classes in Perl regular expressions are a fundamental way to match any one character from a set of characters. A character class is defined inside square brackets [ ]. For example, [a-z] matches any lowercase letter from a to z.

Using character classes like [a-z] allows you to specify a range of characters to match. Here’s how to understand and use it effectively in Perl regex:

Basics of [a-z] in Perl Regex

  • [a-z] matches exactly one character in the range from lowercase a through z.
  • The match is case-sensitive by default, so [a-z] does not match uppercase letters like A or Z.
  • You can combine multiple ranges and individual characters within a character class, like [a-zA-Z0-9] to match letters and digits.
  • A few characters are special inside a character class: - forms a range, ^ negates the class when it comes first, and ] closes the class. Escape them or position them carefully to match them literally.
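
The points above can be checked with a small test loop. This is a minimal sketch: the sample inputs and the %result hash are made up for illustration, and the ^ and $ anchors are added only so that each one-character string must match the class as a whole.

```perl
use strict;
use warnings;

# Hypothetical sample inputs, chosen only for illustration.
my @samples = ('q', 'Q', '7', '-', '_');

# Anchoring with ^ and $ forces the whole string to be one class member.
my %result;
for my $s (@samples) {
    $result{$s} = {
        lower    => ($s =~ /^[a-z]$/       ? 1 : 0),  # lowercase ASCII letter only
        alnum    => ($s =~ /^[a-zA-Z0-9]$/ ? 1 : 0),  # several ranges combined
        word_ish => ($s =~ /^[a-z_-]$/     ? 1 : 0),  # trailing - is a literal dash
    };
    printf "%-3s lower=%d alnum=%d word_ish=%d\n",
        "'$s'", @{ $result{$s} }{qw(lower alnum word_ish)};
}
```

Only 'q' sets lower to 1, both letters and the digit set alnum to 1, and the dash and underscore match only the third class.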

Perl Regex Context and Sigils

In Perl, regular expressions are typically used with the binding operator =~. The pattern is enclosed by slashes:

$string =~ /[a-z]+/;

The pattern /[a-z]+/ matches one or more consecutive lowercase letters. The + quantifier means "1 or more".
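
To see what the + quantifier changes, it may help to capture the matched text. The sketch below uses a made-up input string and variable names ($single, $run, @runs) purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical input string, used only for illustration.
my $string = "abc123def";

# Without a quantifier, the class consumes exactly one character.
my ($single) = $string =~ /([a-z])/;

# With +, the match extends over the whole run of lowercase letters.
my ($run) = $string =~ /([a-z]+)/;

# In list context, the g modifier returns every run in the string.
my @runs = $string =~ /[a-z]+/g;

print "single: $single\n";
print "run:    $run\n";
print "runs:   @runs\n";
```

Here $single holds a, $run holds abc, and @runs holds abc and def.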

Example: Using [a-z] to Check for Lowercase Letters

The following Perl script demonstrates matching and extracting lowercase letters from a string:


use strict;
use warnings;

my $text = "Hello World 123 perl";

print "Text: $text\n";

# Check if the string contains any lowercase letter
if ($text =~ /[a-z]/) {
    print "String contains at least one lowercase letter.\n";
} else {
    print "No lowercase letters found.\n";
}

# Extract all sequences of lowercase letters
print "Lowercase letter sequences found:\n";
while ($text =~ /([a-z]+)/g) {
    print " - $1\n";
}

Output:

Text: Hello World 123 perl
String contains at least one lowercase letter.
Lowercase letter sequences found:
 - ello
 - orld
 - perl

Important Perl Regex Gotchas with Character Classes

  • Case Sensitivity: [a-z] is lowercase only. Use [A-Za-z] to match letters of either case, or add the i modifier:

    if ($text =~ /[a-z]+/i) {
        print "Matches letters in any case.\n";
    }
    
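  • Unicode Awareness: [a-z] matches only the 26 ASCII lowercase letters, so it will not match characters such as é or ñ. To match lowercase letters in any script, use a Unicode property like \p{Ll} (lowercase letter) on properly decoded text. The use utf8; pragma only tells Perl that the source file itself is UTF-8 encoded; input read from files or STDIN must still be decoded, for example with an :encoding(UTF-8) layer.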
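  • Dash - Characters: To match a literal dash inside a character class, put it at the start or end, or escape it: [-a-z], [a-z-], or [a-z\-]. Note that [a\-z] is different: it matches only a, -, or z, because the escaped dash no longer forms a range.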
  • Negated Classes: Use [^a-z] to match any character that is not lowercase a-z.
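
The dash, Unicode, and negation points can be seen side by side in a short script. This is a sketch under stated assumptions: the sample text is invented, and the é is written as the escape \x{E9} so the script does not depend on the source file's encoding.

```perl
use strict;
use warnings;

# Encode output as UTF-8 so the non-ASCII character prints cleanly.
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical sample text; \x{E9} is the lowercase letter e with acute accent.
my $text = "caf\x{E9}-bar X9";

# A trailing dash inside the class is literal, so dashes join the ASCII runs.
my @ascii_runs = $text =~ /[a-z-]+/g;

# \p{Ll} matches lowercase letters in any script, including \x{E9}.
my @unicode_runs = $text =~ /[\p{Ll}-]+/g;

# The negated class matches each character that is not in a-z.
my @not_lower = $text =~ /[^a-z]/g;

print "ASCII runs:   @ascii_runs\n";
print "Unicode runs: @unicode_runs\n";
print "Not a-z:      ", join('|', @not_lower), "\n";
```

The ASCII class splits the text at the accented letter, giving caf and -bar, while the \p{Ll} version keeps café-bar as one run. The negated class picks up five characters: the accented letter, the dash, the space, X, and 9.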

Summary

Character classes like [a-z] are simple yet powerful for matching ranges of characters in Perl regex. Remember:

  • Square brackets create a character class matching one character from inside.
  • [a-z] matches any single lowercase ASCII letter.
  • Add quantifiers like + to match longer sequences.
  • Use case modifiers or expand ranges to include uppercase letters.
  • Watch out for special characters inside classes, especially dash -.

Mastering character classes is one of the quickest ways to become proficient with Perl regular expressions and text processing.
