How to convert Windows line endings to Unix in Perl?
Question
When working with text files across different operating systems, line endings often cause headaches. Windows uses CRLF (\r\n) as line endings, whereas Unix, Linux, and macOS use just LF (\n). If you have a file with Windows-style line endings, you might want to convert them to Unix-style in Perl to avoid problems with tools that expect Unix line endings.
Understanding Line Endings and Perl Concepts
In Perl strings, line endings appear as \r\n for Windows and \n for Unix. Perl variables use sigils to denote types, for example:
- $scalar for scalar variables (strings, numbers)
- @array for arrays
- %hash for hashes
When processing line endings, you'll be working primarily with scalars (strings) and possibly arrays of lines. Perl's "There's More Than One Way To Do It" (TMTOWTDI) philosophy means you have several approaches for this task.
Converting Windows Line Endings to Unix in Perl
The simplest automated way is to remove carriage return characters \r from your input, leaving behind only the newline \n. This commonly involves a substitution with a regular expression.
Here is a complete, runnable Perl script that reads lines from standard input, removes Windows-style endings, and prints Unix-style endings:
#!/usr/bin/env perl
use strict;
use warnings;

# Read line by line from STDIN, or from files named on the command line
while (<>) {
    # Remove trailing carriage return if it exists (Windows CRLF -> Unix LF)
    s/\r$//;
    # Print the modified line (with Unix line ending)
    print;
}
You can use this script like this in the terminal (assuming the script is saved as crlf_to_unix.pl):
perl crlf_to_unix.pl < windows_file.txt > unix_file.txt
Why This Works
- The input operator <> reads a line including the line ending (which is \r\n for Windows files).
- The substitution s/\r$//; removes the trailing carriage return (\r) from the end of the line.
- Printing the line outputs the line with the remaining Unix newline \n.
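To see those three steps on concrete values, here is a small sketch that applies the same substitution to in-memory strings rather than STDIN. The helper names strip_cr and visible are hypothetical and exist only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: remove one carriage return that sits at the end
# of the line, whether or not a newline follows it.
sub strip_cr {
    my ($line) = @_;
    $line =~ s/\r$//;    # $ matches at end of string or just before a final \n
    return $line;
}

# Hypothetical helper: make control characters visible for printing.
sub visible {
    my ($s) = @_;
    $s =~ s/\r/\\r/g;
    $s =~ s/\n/\\n/g;
    return $s;
}

# A CRLF line, an LF line, and a line with a carriage return in the middle.
for my $line ("windows\r\n", "unix\n", "a\rb\r\n") {
    printf "%-14s -> %s\n", visible($line), visible(strip_cr($line));
}
```

The third string shows why the pattern is anchored: the \r in the middle of the line is left alone, and only the one before the newline is removed.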
Notes and Common Pitfalls
- Don't strip all \r indiscriminately: Only remove trailing carriage returns to avoid modifying content that legitimately uses \r inside the text.
- Beware of input mode: On Windows, Perl opens filehandles in text mode by default, which translates \r\n to \n on input and \n back to \r\n on output. To produce LF-only output there, call binmode on the output handle (for example binmode(STDOUT)). Handling the carriage return explicitly then gives consistent results across platforms.
- Binary modes and large files: Reading line by line keeps memory use low regardless of file size. For byte-exact processing, call binmode on both handles or open the files with the :raw layer so that no translation happens.
- Perl versions: This approach works in all Perl 5 versions. chomp removes whatever string the input record separator $/ holds, so setting $/ to "\r\n" lets chomp strip CRLF endings in any Perl 5 as well.
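The points about text mode and binmode can be sketched as a file-to-file conversion that opens both handles with the :raw layer. This is a minimal sketch, not a definitive implementation: the helper name convert_file and the script name convert.pl are assumptions made for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst, turning CRLF into LF.
# The :raw layer turns off Perl's own CRLF translation on Windows,
# so the same bytes are read and written on every platform.
sub convert_file {
    my ($src, $dst) = @_;
    open my $in,  '<:raw', $src or die "Cannot read $src: $!";
    open my $out, '>:raw', $dst or die "Cannot write $dst: $!";
    while (my $line = <$in>) {    # one line at a time, so large files are fine
        $line =~ s/\r$//;         # drop only a trailing carriage return
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
    return;
}

# File names come from the command line, for example:
#   perl convert.pl windows_file.txt unix_file.txt
convert_file(@ARGV) if @ARGV == 2;
```

Writing to a separate output file keeps the original intact if something goes wrong partway through.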
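Alternative Approach Using chomp
chomp removes the current value of the input record separator $/ from the end of a string. If you set $/ = "\r\n";, each record read by <> ends at a CRLF and chomp removes both characters. This works in any Perl 5 release, not only recent ones.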
#!/usr/bin/env perl
use strict;
use warnings;

local $/ = "\r\n"; # Input record separator is CRLF

while (<>) {
    chomp;
    print "$_\n"; # Add Unix newline explicitly
}
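This method explicitly chomps Windows line endings, then prints with Unix line endings. Because records are split only on \r\n, a line that ends in a bare \n stays embedded in the surrounding record, and a final line with no line ending still gets a \n appended.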
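As a variation on the chomp approach, the following sketch works on an in-memory string and uses chomp's return value (the number of characters removed) to decide whether to write a newline, so an unterminated final line stays unterminated. The helper name crlf_records_to_lf is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: read $text as CRLF-terminated records and
# return the same text with each CRLF replaced by LF.
sub crlf_records_to_lf {
    my ($text) = @_;
    local $/ = "\r\n";    # the change to $/ is undone when the sub returns
    open my $in, '<:raw', \$text
        or die "Cannot open in-memory handle: $!";
    my $result = '';
    while (my $record = <$in>) {
        my $removed = chomp($record);    # 2 if a CRLF was removed, else 0
        $result .= $record;
        $result .= "\n" if $removed;     # only re-add a line ending that was there
    }
    close $in;
    return $result;
}

print crlf_records_to_lf("first\r\nsecond\r\nlast without ending");
```

Using local inside a subroutine also keeps the modified $/ from affecting other code that reads lines later in the program.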
Summary
To convert Windows line endings to Unix in Perl:
- Read lines from input (using <>)
- Remove trailing carriage return with s/\r$//;
- Print lines normally (which end with Unix newline \n)
This approach is straightforward, fast, and works across Perl versions without external dependencies.