How to Compare Strings in Perl (eq/ne, lt/gt, cmp, case-insensitive, Unicode)
Question
How to compare strings in Perl
Category: Perl
Difficulty: Beginner
Tags: perl, strings, comparison, operators, cmp, case-insensitive, unicode, sorting, best-practices, pitfalls
In Perl, comparing strings correctly is mostly about choosing the string operators (like eq and cmp) instead of the numeric operators (like == and <=>). Perl is flexible about converting values between strings and numbers, which is powerful—but it also means that using the wrong operator can silently give you the wrong result. This guide explains all the standard ways to compare strings in Perl, when to use each one, and the common pitfalls (numeric-vs-string mixups, case sensitivity, whitespace, undef warnings, Unicode, and locale issues).
1) The core string comparison operators
Perl has a parallel set of operators for string comparisons. Use these when you want lexicographic (dictionary-like) comparison based on character ordering:
| What you want | Use | Meaning |
|---|---|---|
| Equal | eq | Strings are exactly identical |
| Not equal | ne | Strings differ |
| Less than | lt | Lexicographically smaller |
| Greater than | gt | Lexicographically larger |
| Less or equal | le | Lexicographically ≤ |
| Greater or equal | ge | Lexicographically ≥ |
| 3-way compare | cmp | Returns negative/0/positive (like “strcmp”) |
Key idea: eq/ne/lt/gt/le/ge/cmp compare based on character order, not numeric value. That means "10" lt "2" is true because "1" comes before "2".
What does cmp return?
`$a cmp $b` returns:
- -1 if `$a` is lexicographically less than `$b`
- 0 if they are equal
- 1 if `$a` is lexicographically greater than `$b`
Perl's operator documentation (perlop) specifies exactly -1, 0, or 1 for cmp. Custom comparators passed to sort only need to return a negative, zero, or positive value, so when you consume a comparator's result, test its sign rather than its magnitude.
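As a quick illustration of the two points above (character-by-character ordering and the three-way result), here is a minimal sketch. The variable names are only for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# String comparison walks the characters left to right, so "10" sorts before "2".
my $string_lt  = ("10" lt "2") ? 1 : 0;   # true: the character "1" precedes "2"
my $numeric_lt = (10 < 2)      ? 1 : 0;   # false: ten is not less than two

# cmp gives a three-way result; a caller normally only inspects the sign.
my @results = (
    "apple"  cmp "banana",   # negative: "apple" comes first
    "banana" cmp "banana",   # zero: identical strings
    "cherry" cmp "banana",   # positive: "cherry" comes later
);

print "\"10\" lt \"2\": $string_lt\n";
print "10 < 2: $numeric_lt\n";
print "cmp results: @results\n";
```

The same input gives opposite answers under lt and <, which is why the operator has to match the meaning of the data.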
2) Runnable Example 1: Basic string comparisons
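```perl
#!/usr/bin/env perl
use strict;
use warnings;
use v5.16;

# Lexical $a/$b work here, but avoid them in code that also calls sort,
# because sort blocks rely on the package variables $a and $b.
my ($a, $b) = ("Perl", "perl");

# Equality and three-way comparison are case-sensitive.
print "a eq b? ", ($a eq $b ? "yes" : "no"), "\n";
print "a ne b? ", ($a ne $b ? "yes" : "no"), "\n";
print "a cmp b = ", ($a cmp $b), "\n";

# Lowercasing both sides gives a simple case-insensitive comparison.
print "Case-insensitive (lc): ", (lc($a) eq lc($b) ? "equal" : "different"), "\n";

# Ordering operators compare lexicographically.
my $x = "apple";
my $y = "banana";
print "$x lt $y? ", ($x lt $y ? "yes" : "no"), "\n";
print "$x gt $y? ", ($x gt $y ? "yes" : "no"), "\n";
print "$x le $y? ", ($x le $y ? "yes" : "no"), "\n";
print "$x ge $y? ", ($x ge $y ? "yes" : "no"), "\n";
```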
Expected output:
```
a eq b? no
a ne b? yes
a cmp b = -1
Case-insensitive (lc): equal
apple lt banana? yes
apple gt banana? no
apple le banana? yes
apple ge banana? no
```
3) Sorting strings: cmp and custom comparisons
Perl’s sort uses string comparison by default (it behaves like sorting by cmp). If you want to control ordering (for example, case-insensitive sorting), you pass a block that returns a cmp-style result.
Runnable Example 2: Sort strings and show numeric-vs-string pitfalls
```perl
#!/usr/bin/env perl
use strict;
use warnings;
use v5.16;

my @words = qw(pear Apple banana);
my @sorted_ci = sort { lc($a) cmp lc($b) } @words;
print "Sorted case-insensitive: @sorted_ci\n";

my @nums_as_strings = qw(2 10 1);
my @lex = sort @nums_as_strings;                 # string sort
my @num = sort { $a <=> $b } @nums_as_strings;   # numeric sort
print "Default sort (string cmp): @lex\n";
print "Numeric sort (<=>): @num\n";
```
Expected output:
```
Sorted case-insensitive: Apple banana pear
Default sort (string cmp): 1 10 2
Numeric sort (<=>): 1 2 10
```
Best practice: if values are conceptually numeric, compare/sort them numerically with ==, <=>, <, >, etc. If they are conceptually strings (IDs, filenames, tokens), compare them with eq / cmp.
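When values arrive as text but are meant to be numbers, it can help to validate them before comparing numerically. The following is a minimal sketch, not a definitive approach: `numeric_cmp` is a hypothetical helper name, and `looks_like_number` comes from the core Scalar::Util module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: numeric three-way compare that rejects non-numeric input
# instead of silently treating it as 0.
sub numeric_cmp {
    my ($left, $right) = @_;
    for my $value ($left, $right) {
        die "not a number: '$value'\n" unless looks_like_number($value);
    }
    return $left <=> $right;
}

my @ids     = qw(2 10 1);
my @as_text = sort { $a cmp $b } @ids;             # string order: 1 10 2
my @as_nums = sort { numeric_cmp($a, $b) } @ids;   # numeric order: 1 2 10

# A non-numeric value is rejected rather than compared as 0.
my $rejected = eval { numeric_cmp("ten", 2); 1 } ? 0 : 1;

print "as text: @as_text\n";
print "as numbers: @as_nums\n";
print "rejected 'ten': $rejected\n";
```

Dying on bad input is only one possible policy; skipping or logging the value may suit other programs better.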
4) Case-insensitive comparisons
Many real-world comparisons should ignore case: usernames, HTTP header names, command keywords, etc. The simplest approach is to normalize both sides and then compare:
- `lc($a) eq lc($b)` for ASCII-ish case-insensitive comparisons
- `fc($a) eq fc($b)` for Unicode-aware “case folding” (recommended when Unicode matters; `fc` requires Perl 5.16 or newer and is enabled by `use feature 'fc'` or `use v5.16`)
Why not always lc? Unicode has special casing rules. For example, German ß case-folds to ss in many contexts. lc does not fully capture this, while fc is designed for caseless matching.
Runnable Example 3: Unicode case folding with fc
```perl
#!/usr/bin/env perl
use strict;
use warnings;
use v5.16;      # enables fc
use utf8;       # the source file contains UTF-8 string literals
binmode STDOUT, ":encoding(UTF-8)";

my $s1 = "straße";
my $s2 = "STRASSE";
print "lc equal? ", (lc($s1) eq lc($s2) ? "yes" : "no"), "\n";
print "fc equal? ", (fc($s1) eq fc($s2) ? "yes" : "no"), "\n";
```
Expected output:
```
lc equal? no
fc equal? yes
```
5) Comparing against patterns (regex) vs comparing strings
Sometimes you don’t want to know whether two strings are equal—you want to know whether a string matches a pattern. That is not string comparison; that is regular expression matching:
- `$s =~ /pattern/` tests whether `$s` matches
- `$s !~ /pattern/` tests whether it does not match
For example, “is this input exactly "yes"?” is a string comparison. “does this input look like an email?” is pattern matching.
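The difference can be sketched with a small example. The input values and variable names here are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $input = "Yes please";

# Exact comparison: true only when the whole string is exactly "yes".
my $is_exact_yes = ($input eq "yes") ? 1 : 0;

# Pattern match: true for any string that starts with "y" or "Y".
my $starts_with_y = ($input =~ /^y/i) ? 1 : 0;

# An anchored pattern can imitate eq, but $ also matches just before a
# trailing newline, whereas \z matches only at the very end of the string.
my $dollar_anchor = ("yes\n" =~ /^yes$/)   ? 1 : 0;
my $z_anchor      = ("yes\n" =~ /\Ayes\z/) ? 1 : 0;

print "exact yes: $is_exact_yes\n";
print "starts with y: $starts_with_y\n";
print "/^yes\$/ matches \"yes\\n\": $dollar_anchor\n";
print "/\\Ayes\\z/ matches \"yes\\n\": $z_anchor\n";
```

If the question really is "are these two strings the same?", eq states that intent more directly than an anchored regex.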
6) Best practices
- Use `use strict; use warnings;` so mistakes like comparing undef or using the wrong operator are easier to catch.
- Pick operators by meaning: numeric operators for numbers, string operators for strings.
- Normalize before comparing when you need “logical” equality:
  - Case-insensitive: `fc` (Unicode) or `lc` (basic)
  - Trim whitespace: consider removing leading/trailing whitespace if input is user-provided
  - Line endings: use `chomp` when reading lines from files
- Be explicit with undef: comparing undef with `eq` can warn (“Use of uninitialized value…”). Decide your policy:
  - Require defined: `die` or handle missing values
  - Coerce: `($a // "") eq ($b // "")` if treating undef as empty string is acceptable
- Use `cmp` for sorting keys and combine comparisons with `||` for multi-key sorts, e.g. `lc($a) cmp lc($b) || $a cmp $b` (case-insensitive primary key, deterministic tie-breaker).
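Several of these practices can be combined in a short sketch. The helper names `same_text` and `same_text_strict` are hypothetical, chosen only for this example, and the code assumes Perl 5.16 or newer for `fc`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'fc';   # fc is available from Perl 5.16

# Hypothetical helper: caseless equality that treats undef as the empty string.
sub same_text {
    my ($left, $right) = @_;
    return fc($left // "") eq fc($right // "") ? 1 : 0;
}

# Hypothetical helper: strict policy that dies instead of coercing undef.
sub same_text_strict {
    my ($left, $right) = @_;
    die "both values must be defined\n"
        unless defined($left) && defined($right);
    return fc($left) eq fc($right) ? 1 : 0;
}

my $coerced  = same_text(undef, "");            # undef is treated as ""
my $caseless = same_text("Hello", "HELLO");     # case is ignored
my $rejected = eval { same_text_strict(undef, "x"); 1 } ? 0 : 1;

# Multi-key sort: caseless primary key, exact string comparison as tie-breaker.
my @names  = qw(bob Alice alice Bob carol);
my @sorted = sort { fc($a) cmp fc($b) || $a cmp $b } @names;

print "coerced=$coerced caseless=$caseless rejected=$rejected\n";
print "@sorted\n";
```

Which undef policy is right depends on the data: coercion is convenient for optional fields, while the strict variant surfaces missing values early.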
7) Common pitfalls (and how to avoid them)
Pitfall A: Using numeric operators on strings
If you write `"foo" == "bar"` (a numeric comparison), Perl converts both sides to numbers. A string with no leading numeric part converts to 0, so this comparison is true, and under `use warnings` Perl reports that the argument isn't numeric. Use eq for exact string equality.
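The pitfall is easy to reproduce. This sketch captures the warnings in an array (via a `__WARN__` handler) so that the result and the warnings can be inspected together; the variable names are illustrative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($left, $right) = ("foo", "bar");

# Numeric comparison: both strings convert to 0, so they look "equal".
my $numeric_equal = ($left == $right) ? 1 : 0;

# String comparison: the strings differ, as expected.
my $string_equal = ($left eq $right) ? 1 : 0;

print "== says equal: $numeric_equal\n";
print "eq says equal: $string_equal\n";
print "warning: $_" for @warnings;
```

Without `use warnings`, the numeric comparison would succeed silently, which is what makes this mistake hard to spot.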
Pitfall B: Lexicographic ordering is not numeric ordering
"10" lt "2" is true because it compares character by character. If you mean numeric, use < or <=> after validating the strings are numeric.
Pitfall C: Hidden whitespace and newlines
Input from files often includes trailing newlines. Comparing a line read from a file directly against a literal can fail unexpectedly:
- Read: `my $line = <STDIN>;` (includes newline)
- Fix: `chomp($line);` then compare
Similarly, user input can include leading/trailing spaces; decide whether to treat those as significant.
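A small sketch of both cleanups follows. The line is simulated with a string literal rather than read from STDIN so the example is self-contained. Whether to trim spaces is an assumption about your data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Simulate a line as it would arrive from <STDIN> or a file.
my $line = "  yes \n";

# Raw comparison fails because of the newline and the spaces.
my $raw_match = ($line eq "yes") ? 1 : 0;

# chomp removes only the trailing newline; the spaces remain.
chomp(my $chomped = $line);
my $chomped_match = ($chomped eq "yes") ? 1 : 0;

# Trim leading and trailing whitespace, if it is not significant for your data.
(my $trimmed = $chomped) =~ s/^\s+|\s+$//g;
my $trimmed_match = ($trimmed eq "yes") ? 1 : 0;

print "raw: $raw_match, chomped: $chomped_match, trimmed: $trimmed_match\n";
```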
Pitfall D: Locale and “alphabetical order”
By default, string comparisons use Perl’s internal character ordering (code point order for decoded character strings), which is not necessarily what humans consider alphabetical in a given language. If you need language-aware collation (e.g., Swedish ordering), you need locale support or a collation library. This is an advanced topic: `use locale` affects comparisons, but it depends on the environment's configuration and can introduce surprises. For robust human-facing sorting, prefer a dedicated collation module, such as the core Unicode::Collate module or an ICU-based solution, over incidental locale settings.
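As one possible illustration, the core Unicode::Collate module implements the Unicode Collation Algorithm. The sketch below uses only its default settings and ASCII words; language-specific tailoring (such as Swedish) is outside what this example shows.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Unicode::Collate;   # ships with core Perl

my @words = qw(banana Apple cherry apple Banana);

# Plain sort uses code point order, so every uppercase ASCII letter
# sorts before every lowercase one.
my @by_codepoint = sort @words;

# The collator compares letters first and case only as a later tie-breaker,
# so "apple" and "Apple" end up next to each other.
my $collator = Unicode::Collate->new;
my @collated = $collator->sort(@words);

print "code point: @by_codepoint\n";
print "collated:   @collated\n";
```

The collator object also offers a `cmp` method that can be used inside a custom sort block when several keys are involved.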
Pitfall E: Unicode encoding mismatches
String comparison assumes both strings represent the same sequence of characters. If one string is decoded properly and the other is raw bytes (or decoded with a different encoding), comparisons can fail. In real programs, standardize: decode input to Perl’s internal character strings (e.g., read with an encoding layer or use Encode), and write output with an explicit encoding layer.
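The mismatch can be shown with a short sketch using the core Encode module. The two strings below spell the same word, once as decoded characters and once as raw UTF-8 bytes; the variable names are illustrative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode);

my $chars = "caf\x{e9}";      # character string: c, a, f, U+00E9
my $bytes = "caf\xC3\xA9";    # the same word as raw UTF-8 bytes

# Comparing characters against undecoded bytes fails: 4 characters vs 5 bytes.
my $raw_equal = ($chars eq $bytes) ? 1 : 0;

# After decoding the bytes, both sides are character strings and compare equal.
my $decoded       = decode('UTF-8', $bytes);
my $decoded_equal = ($chars eq $decoded) ? 1 : 0;

print "lengths: ", length($chars), " vs ", length($bytes), "\n";
print "raw equal: $raw_equal\n";
print "decoded equal: $decoded_equal\n";
```

In a real program the decoding usually happens at the input boundary, for example with an `:encoding(UTF-8)` layer on the filehandle, so that every comparison sees character strings.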
8) Quick “when to use what” cheat sheet
- Exact equality: `$a eq $b`
- Not equal: `$a ne $b`
- Lexicographic ordering: `lt`, `gt`, `le`, `ge`
- 3-way comparison / sort callback: `$a cmp $b`
- Case-insensitive (basic): `lc($a) eq lc($b)`
- Case-insensitive (Unicode-correct): `fc($a) eq fc($b)`
- Numeric comparison: `==` and `<=>`
- Pattern match (not comparison): `$s =~ /.../`
If you internalize one rule, make it this: use eq for strings and == for numbers. From there, everything else (sorting, case folding, Unicode) becomes a matter of selecting the right normalization and operator for your intent.