How to merge multiple text files into one in Perl?

How to Merge Multiple Text Files into One in Perl

Merging multiple text files into a single output file is a common task that Perl handles elegantly due to its powerful text-processing capabilities and flexible I/O functions.

At its core, merging means reading several input files in sequence and writing their contents to a new (or existing) output file. Perl’s philosophy of "There’s More Than One Way To Do It" (TMTOWTDI) applies here, so several approaches work, but the principle is always the same: open each input file, read its contents, and write them to the output.
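As one illustration of TMTOWTDI, the sketch below merges files with Perl's diamond operator <>, which reads every file named in @ARGV in turn. The helper name merge_with_diamond is hypothetical and chosen only for this example. Treat it as a minimal sketch rather than a recommended replacement for explicit open calls.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# merge_with_diamond() is a hypothetical helper name used for illustration.
# It writes the contents of all @input_files, in order, to $output_file.
sub merge_with_diamond {
    my ($output_file, @input_files) = @_;

    # With an empty @ARGV the diamond operator reads STDIN, so bail out early.
    return 0 unless @input_files;

    # <> reads from the files listed in @ARGV; localize it for this call.
    local @ARGV = @input_files;

    open(my $out_fh, '>', $output_file)
        or die "Cannot open $output_file for writing: $!";

    # A file that cannot be opened triggers a warning and is skipped.
    while (my $line = <>) {
        print $out_fh $line;
    }

    close($out_fh) or die "Cannot close $output_file: $!";
    return 1;
}

merge_with_diamond('merged.txt', @ARGV);
```

Note that <> opens each name with two-argument open semantics, so it is best kept for trusted file names. Perl 5.22 and later also offer <<>>, which treats every argument as a plain file name.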

Key Perl Concepts Used

  • open function for filehandle management.
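  • Scalar and list context when reading: in scalar context <$fh> returns one line, while in list context it returns all remaining lines.
  • Use of lexical filehandles and the three-argument open for safety and clarity.
  • The readline operator <$fh> for reading and print $fh ... for writing. The '<' and '>' passed to open are mode strings, not part of the filehandle.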
  • Error handling with die to catch failure to open files.
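The difference between reading in scalar and list context can be seen in a small, self-contained sketch. It creates its own throwaway file with File::Temp (a core module), so the file name and its three lines are assumptions made only for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh "first\nsecond\nthird\n";
close($tmp_fh) or die "Cannot close $tmp_name: $!";

open(my $fh, '<', $tmp_name) or die "Cannot open $tmp_name for reading: $!";

# Scalar context: <$fh> returns only the next line.
my $first_line = <$fh>;

# List context: <$fh> returns all remaining lines at once.
my @remaining = <$fh>;

close($fh);

print "Scalar context read: $first_line";
print "List context read ", scalar(@remaining), " more lines\n";
```

Reading in list context loads every remaining line into memory, which is why the line-by-line while loop is the safer default for large files.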

Simple Example: Merging Multiple Files

This script merges a list of input text files passed as command line arguments into a single output file called merged.txt. It reads line by line to handle large files efficiently.

#!/usr/bin/perl
use strict;
use warnings;

# Name of the output file
my $output_file = 'merged.txt';

# Open output file for writing (overwrites if exists)
open(my $out_fh, '>', $output_file) or die "Cannot open $output_file for writing: $!";

# Loop over each input file given as argument
foreach my $input_file (@ARGV) {
    # Open the input file for reading
    open(my $in_fh, '<', $input_file) or die "Cannot open $input_file for reading: $!";

    # Print a header before each file's content (optional)
    print $out_fh "\n--- Contents of $input_file ---\n";

    # Read and write each line
    while (my $line = <$in_fh>) {
        print $out_fh $line;
    }

    close($in_fh);
}

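# Check close on the output: buffered write errors (e.g. a full disk) surface here
close($out_fh) or die "Cannot close $output_file: $!";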
print "Files merged successfully into '$output_file'.\n";

How It Works

  • The script uses @ARGV to accept any number of input files. This makes it flexible—just pass file names after the script.
  • The output filehandle $out_fh is opened once for writing.
  • Each input file is opened, read line-by-line, then written into the output filehandle.
  • We add a small header before each file’s contents, which you can remove or customize as needed.
  • All filehandles are properly closed.

Running the Script

Assuming you save this script as merge.pl, run it in the terminal as:

perl merge.pl file1.txt file2.txt file3.txt

This will create merged.txt containing the combined contents of the input files.

Common Pitfalls to Avoid

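  • Not specifying the mode in open: Always use the three-argument form, e.g., open(my $fh, '<', $filename). The two-argument form takes the mode from the filename string itself, so a name beginning with '>' or ending with '|' can truncate a file or run a command.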
  • Failing to check for open errors: Always add or die "message: $!" to detect issues like missing files or permission errors.
  • Reading all files at once: Avoid slurping huge files into memory unless you are sure the files are small; reading line-by-line is efficient and scalable.
  • Overwriting files accidentally: Be careful if the output file is also in the list of input files. Because the output is opened with '>' before any input is read, its original contents are truncated and lost.
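One way to guard against the last pitfall is to filter the input list before the output file is opened. The sketch below uses a hypothetical helper, same_file, that compares device and inode numbers from stat. It assumes a Unix-like filesystem where inode numbers are meaningful, and other platforms may need a path-based comparison instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# same_file() is a hypothetical helper, not part of the script above.
# It returns 1 when both paths exist and share a device and inode number,
# which assumes a Unix-like filesystem where inode numbers are meaningful.
sub same_file {
    my ($path_a, $path_b) = @_;
    my @stat_a = stat($path_a) or return 0;   # path does not exist (yet)
    my @stat_b = stat($path_b) or return 0;
    return ($stat_a[0] == $stat_b[0] && $stat_a[1] == $stat_b[1]) ? 1 : 0;
}

my $output_file = 'merged.txt';

# Run this check before opening the output with '>', which truncates it.
my @input_files;
foreach my $candidate (@ARGV) {
    if (same_file($candidate, $output_file)) {
        warn "Skipping $candidate: it is the output file\n";
        next;
    }
    push @input_files, $candidate;
}

print "Inputs to merge: @input_files\n";
```

Comparing inodes rather than names also catches the same file reached through a different relative path or a symbolic link.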

More Advanced Tips

  • You can enhance the script to take output filename as a command-line option.
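  • For very large files, reading fixed-size blocks with read instead of lines, or delegating to a system utility such as cat, may be faster if speed is critical.
  • Perl 5.10 introduced the say function (enabled with use feature 'say';), which appends a newline automatically. It suits status messages; lines read from a file already end in a newline, so print remains the right choice for copying them.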
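The first tip, taking the output filename as a command-line option, could be sketched with the core Getopt::Long module. The option names --output and -o, the helper name parse_merge_args, and the usage text are assumptions made for this example, and the script only reports what it would do rather than merging.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# parse_merge_args() is a hypothetical helper name used for illustration.
# It returns the chosen output file followed by the remaining input files.
sub parse_merge_args {
    my @args = @_;
    my $output_file = 'merged.txt';   # default when no option is given

    # Parsed options are removed from @args; the input file names remain.
    GetOptionsFromArray(\@args, 'output|o=s' => \$output_file)
        or die "Usage: merge.pl [--output FILE] file1 [file2 ...]\n";

    return ($output_file, @args);
}

my ($output_file, @input_files) = parse_merge_args(@ARGV);
print "Would merge ", scalar(@input_files), " file(s) into '$output_file'\n";
```

With this in place, an invocation such as perl merge.pl -o all.txt file1.txt file2.txt would select all.txt as the output, and the merge loop from the simple example could then iterate over @input_files instead of @ARGV.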

This simple example illustrates Perl’s ease in handling text files by focusing on readable, idiomatic code that beginners can easily adapt.

Verified Code

Executed in a sandbox to capture real output. • v5.34.1 • 12ms

STDOUT
Files merged successfully into 'merged.txt'.
STDERR
(empty)
