How to parse HTTP response headers in Perl?
Parsing HTTP response headers in Perl can be done in several ways, depending on whether you want to use built-in/core modules or handle raw HTTP response data manually. HTTP headers follow a relatively simple structure: a status line followed by key-value header lines, each separated by CRLF (carriage return + line feed).
Here’s a comprehensive approach to parsing HTTP response headers in Perl:
1. Understanding HTTP Response Headers
An HTTP response typically looks like this:
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 138
Connection: keep-alive

<html>...
It starts with a status line (HTTP/1.1 200 OK) followed by zero or more header lines in Key: Value format. A blank line separates the headers from the body.
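As a minimal sketch of how the status line itself breaks down, the snippet below splits it into protocol version, status code, and reason phrase. The helper name parse_status_line is hypothetical (not from any module), and the regex assumes the usual "HTTP-version SP code SP reason" layout.

```perl
use strict;
use warnings;

# Hypothetical helper: split a status line into version, code, and reason.
sub parse_status_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop a trailing line ending, if any

    # HTTP-version, space, 3-digit status code, then an optional reason phrase
    if ($line =~ m{\A(HTTP/\d+(?:\.\d+)?)[ ](\d{3})(?:[ ](.*))?\z}) {
        my %parts = (
            version => $1,
            code    => $2,
            reason  => defined $3 ? $3 : '',
        );
        return \%parts;
    }
    return;    # not a valid status line
}

my $status = parse_status_line("HTTP/1.1 200 OK\r\n");
print "$status->{version} | $status->{code} | $status->{reason}\n" if $status;
# prints: HTTP/1.1 | 200 | OK
```

Returning nothing for an unrecognized line lets the caller decide whether to die or fall back, rather than guessing at a status code.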
2. Parsing headers manually
If you receive a raw HTTP response (perhaps from a socket), you can parse headers by:
- Splitting by line endings (\r\n or just \n)
- Extracting the status line and each header line
- Collecting headers in a hash for easy access
Key Perl concepts used here include:
- The split function to break the text into lines, and a regex to separate each header's key and value
- Hashes with string keys (header names normalized)
- Possibly handling multiple headers with the same name (like Set-Cookie) by using arrays
3. Example: Parsing HTTP Headers from a raw HTTP response
use strict;
use warnings;
# Simulated raw HTTP response header block (no actual body)
my $raw_response = <<'END_RESPONSE';
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 138
Connection: keep-alive
Set-Cookie: id=12345; Path=/
Set-Cookie: token=abcdef; Secure
END_RESPONSE
# Split response lines (handle CRLF or LF)
my @lines = split /\r?\n/, $raw_response;
# Parse the status line
my $status_line = shift @lines or die "No response status line";
print "Status line: $status_line\n";
# Hash to hold headers
my %headers;
# To handle multiple headers with same names, store values in arrayref
while (@lines) {
    my $line = shift @lines;
    last if $line eq ''; # blank line indicates end of headers
    # Parse header line: key: value
    if ($line =~ /^([^:]+):\s*(.*)$/) {
        my ($field, $value) = (lc $1, $2); # lowercase keys for uniformity
        # If header seen before, push into array, else store scalar
        if (exists $headers{$field}) {
            # Convert to arrayref if needed
            if (ref $headers{$field} eq 'ARRAY') {
                push @{ $headers{$field} }, $value;
            } else {
                $headers{$field} = [ $headers{$field}, $value ];
            }
        } else {
            $headers{$field} = $value;
        }
    } else {
        warn "Malformed header line: $line\n";
    }
}
# Print parsed headers
print "Parsed headers:\n";
for my $key (sort keys %headers) {
    my $val = $headers{$key};
    if (ref $val eq 'ARRAY') {
        print " $key => [", join(", ", @$val), "]\n";
    } else {
        print " $key => $val\n";
    }
}
Output:
Status line: HTTP/1.1 200 OK
Parsed headers:
connection => keep-alive
content-length => 138
content-type => text/html; charset=UTF-8
set-cookie => [id=12345; Path=/, token=abcdef; Secure]
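Because a parsed value can be either a plain string or an array reference, every lookup has to check which one it got. One way to hide that is a small accessor that always returns a list. This is only a sketch: header_values is a hypothetical helper name, and the hash below is hand-built to mirror the shape the parser produces.

```perl
use strict;
use warnings;

# Same shape the parser produces: lowercase keys, a plain string for a
# single value, an array reference for a repeated field.
my %headers = (
    'content-type' => 'text/html; charset=UTF-8',
    'set-cookie'   => [ 'id=12345; Path=/', 'token=abcdef; Secure' ],
);

# Hypothetical helper: return every value of a field as a list,
# looking the name up case-insensitively.
sub header_values {
    my ($headers, $name) = @_;
    my $value = $headers->{ lc $name };
    return ()      unless defined $value;
    return @$value if ref $value eq 'ARRAY';
    return ($value);
}

my @cookies = header_values(\%headers, 'Set-Cookie');
my ($type)  = header_values(\%headers, 'CONTENT-TYPE');
print scalar(@cookies), " cookies; type is $type\n";
# prints: 2 cookies; type is text/html; charset=UTF-8
```

An alternative design is to store every field as an array reference from the start, which removes the ref check at the cost of slightly noisier access for single-valued headers.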
4. Notes and Pitfalls
- Line endings: HTTP headers should use \r\n, but sometimes only \n is present. The regex /\r?\n/ handles both.
- Multiple headers with the same name: Some headers can appear multiple times (e.g., Set-Cookie). Storing these as array references keeps all values.
- Case insensitivity: HTTP header field names are case-insensitive, so normalizing keys to lowercase is good practice.
- Continued headers: Older HTTP/1.1 messages may fold a header across lines (continuation lines starting with a space or tab). RFC 7230 deprecated this, but a robust parser may still encounter it. This example does not handle it.
- TMTOWTDI: Perl offers multiple ways to parse headers: regexes, splitting, or ready-made modules.
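The folding case mentioned in the notes can be handled with a small pre-pass before the header regex runs. The following is a minimal sketch, not a complete parser: unfold_header_lines is a hypothetical helper, X-Long-Header is a made-up field name, and the input is assumed to be the header lines only (already split, and stopping before the blank line).

```perl
use strict;
use warnings;

# Hypothetical helper: merge folded continuation lines into the line before.
sub unfold_header_lines {
    my @lines = @_;
    my @unfolded;
    for my $line (@lines) {
        if ($line =~ /\A[ \t]+(.*)\z/s && @unfolded) {
            # Continuation line: append it to the previous header,
            # replacing the leading whitespace with a single space
            $unfolded[-1] .= " $1";
        } else {
            push @unfolded, $line;
        }
    }
    return @unfolded;
}

my @lines = unfold_header_lines(
    'X-Long-Header: first part',
    "\tsecond part",
    'Content-Length: 138',
);
print "$_\n" for @lines;
# prints:
# X-Long-Header: first part second part
# Content-Length: 138
```

Running this before the main loop keeps the header regex unchanged, since each element then holds one complete header.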
5. Using Core Modules
For more robust parsing, you can use the HTTP::Response or HTTP::Headers modules from CPAN, but they’re not core. Instead, the Perl core includes HTTP::Tiny (since 5.14), which handles HTTP requests and responses, including header parsing, for you. It lowercases header names and returns a repeated header as an array reference.
Here’s a tiny example using HTTP::Tiny to fetch a URL’s headers:
use strict;
use warnings;
use HTTP::Tiny;
my $http = HTTP::Tiny->new;
my $response = $http->head("http://www.perl.org/");
if ($response->{success}) {
    print "Status: $response->{status} $response->{reason}\n";
    print "Headers:\n";
    while (my ($k, $v) = each %{ $response->{headers} }) {
        # A repeated header (e.g. set-cookie) is returned as an array reference
        $v = join(', ', @$v) if ref $v eq 'ARRAY';
        print " $k: $v\n";
    }
} else {
    die "Failed to fetch headers: $response->{status} $response->{reason}\n";
}
This method abstracts away the parsing, but if you want to process a raw HTTP response yourself (e.g., from a socket), the first example is the fundamental way to do so.
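When the raw data does include a body, a reasonable first step is to cut the response at the first blank line, so that header parsing never touches body content. Below is a minimal sketch under that assumption. The response string is simulated rather than read from a real socket, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Simulated raw response as it might arrive from a socket (CRLF line endings)
my $raw = join "\r\n",
    'HTTP/1.1 200 OK',
    'Content-Type: text/plain',
    'Content-Length: 5',
    '',
    'hello';

# Split once, at the first blank line; the limit of 2 keeps the body intact
# even if the body itself contains blank lines
my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
$body = '' unless defined $body;

# The first line of the head is the status line, the rest are header lines
my ($status_line, @header_lines) = split /\r?\n/, $head;

print "Status: $status_line\n";
print "Header lines: ", scalar(@header_lines), "\n";
print "Body: $body\n";
# prints:
# Status: HTTP/1.1 200 OK
# Header lines: 2
# Body: hello
```

The @header_lines array can then be fed to the same header loop as in the first example.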
Summary
- HTTP response headers are key-value pairs following a status line.
- Manually parse by splitting lines, extracting key/values into a hash (handle duplicates).
- Be aware of line endings, case insensitivity, and multi-valued headers.
- Use HTTP::Tiny for simpler, higher-level HTTP header fetching/parsing if you do not need to parse raw data.
With these principles, you can parse and use HTTP response headers effectively in Perl.