How to parse response content from LWP::UserAgent in Perl?

When working with web requests in Perl, LWP::UserAgent is one of the most popular modules for fetching HTTP content. Every request returns an HTTP::Response object, even when the request fails, and from it you can extract the content (the "body") of the response for further processing or parsing.

Basic Workflow to Parse Response Content with LWP::UserAgent

The typical steps are:

  • Create a LWP::UserAgent object
  • Make a GET or POST request using the user agent
  • Check if the request succeeded using the response object’s is_success method
  • Access the response content with decoded_content or content
  • Parse that content as needed (e.g., HTML, JSON, XML)

The key method for getting the response body is $response->decoded_content, which undoes any Content-Encoding (such as gzip) and, for textual content types, decodes the charset into a Perl character string. Plain $response->content returns the raw bytes as received, which you would have to decode yourself.
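The difference between the two methods is easiest to see on a body that contains non-ASCII text. The following is a minimal sketch that runs offline: instead of making a real request, it assumes a hand-built HTTP::Response whose body is the UTF-8 byte sequence for "café".

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built response standing in for what $ua->get(...) would return.
# The body is the UTF-8 byte sequence for "café" (5 bytes, 4 characters).
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/plain; charset=UTF-8' ],
    "caf\xC3\xA9",
);

my $raw  = $response->content;          # undecoded bytes
my $text = $response->decoded_content;  # Perl character string

printf "content:         %d bytes\n", length $raw;
printf "decoded_content: %d characters\n", length $text;
```

The raw body is 5 bytes long, while the decoded string is 4 characters long. If you print decoded text that contains non-ASCII characters, set an output encoding first, for example with binmode(STDOUT, ':encoding(UTF-8)').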

Important Perl Learning Points

  • Sigils: Scalars like response objects and content use $.
  • Context: The content is a scalar string, so you just assign or print it directly.
  • TMTOWTDI: While you could interact with LWP in several ways, decoded_content is usually the best choice for parsing textual responses.
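  • Version: decoded_content has shipped with libwww-perl for many years, so any current 6.x install has it. LWP comes from CPAN rather than the Perl core, so the Perl version alone does not guarantee it is installed.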
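If you are unsure what is installed, a quick check along these lines may help. It is only a sketch: it reports the LWP::UserAgent version and asks HTTP::Response whether it can call decoded_content.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Report the installed LWP version and whether the method is present.
my $lwp_version = LWP::UserAgent->VERSION;
my $has_decoded = HTTP::Response->can('decoded_content') ? 'yes' : 'no';

print "LWP::UserAgent version: $lwp_version\n";
print "decoded_content available: $has_decoded\n";
```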

Common Pitfalls

  • Not checking $response->is_success before accessing content
  • Using content without decoding, which can cause garbled output if encoding is UTF-8 or compressed
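  • Ignoring non-success statuses: 4xx/5xx codes, or a 3xx redirect that LWP did not follow (it follows GET and HEAD redirects automatically, up to max_redirect)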
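One way to avoid these pitfalls is to branch on the response status before touching the body. The sketch below uses a hypothetical helper, describe_response, and hand-built HTTP::Response objects in place of real requests so that it runs offline.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: summarize a response instead of blindly reading its body.
sub describe_response {
    my ($response) = @_;

    if ($response->is_success) {
        return 'ok: ' . length($response->decoded_content) . ' characters';
    }

    # A 3xx seen here means LWP did not (or could not) follow the redirect.
    if ($response->is_redirect) {
        my $target = $response->header('Location') // '(no Location header)';
        return "unfollowed redirect to $target";
    }

    return 'failed: ' . $response->status_line;
}

# Stand-ins for the responses a real $ua->get(...) might return.
my @samples = (
    HTTP::Response->new(200, 'OK', [ 'Content-Type' => 'text/plain' ], 'hello'),
    HTTP::Response->new(302, 'Found', [ 'Location' => 'http://httpbin.org/get' ]),
    HTTP::Response->new(404, 'Not Found'),
);

print describe_response($_), "\n" for @samples;
```

For the three samples this prints "ok: 5 characters", "unfollowed redirect to http://httpbin.org/get", and "failed: 404 Not Found".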

Runnable Example

use strict;
use warnings;
use LWP::UserAgent;

# Create a user agent
my $ua = LWP::UserAgent->new();

# Make a GET request
my $response = $ua->get('http://httpbin.org/get');

# Check if the request was successful
if ($response->is_success) {
    # Use decoded_content to get properly decoded text content
    my $content = $response->decoded_content;

    # Print the content to STDOUT
    print "Response content:\n";
    print $content;
} else {
    # Abort and report the HTTP status line
    die "HTTP GET error: ", $response->status_line, "\n";
}

This example fetches a simple GET response from http://httpbin.org/get and prints the JSON output to the terminal.

To parse JSON further, you could add use JSON; and decode the content, but since this example focuses on LWP response content, it sticks with simple printing.
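As a rough illustration of that next step, the sketch below decodes a JSON body into a Perl data structure. It makes two assumptions that are not in the example above: it uses the core JSON::PP module, which offers the same decode_json function as JSON, and it builds the response by hand (a trimmed copy of the httpbin.org reply) so that it runs offline. Because decode_json expects UTF-8 bytes, the body is read with decoded_content(charset => 'none'), which undoes any Content-Encoding but skips charset decoding.

```perl
use strict;
use warnings;
use HTTP::Response;
use JSON::PP qw(decode_json);

# Hand-built stand-in for the httpbin.org reply, so the sketch runs offline.
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'application/json' ],
    '{"args": {}, "headers": {"Host": "httpbin.org"}, "url": "http://httpbin.org/get"}',
);

die 'HTTP error: ', $response->status_line, "\n" unless $response->is_success;

# decode_json wants UTF-8 bytes: undo Content-Encoding, skip charset decoding.
my $data = decode_json($response->decoded_content(charset => 'none'));

print "url:  $data->{url}\n";
print "host: $data->{headers}{Host}\n";
```

With a real request, replace the hand-built object with the result of $ua->get(...). The decoded structure is an ordinary hash reference, so nested fields are reached with the usual arrow syntax.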

With this approach, you can fetch any HTTP resource and then parse its content with the appropriate tool, whether that is an HTML page, a JSON API reply, or another textual or binary payload.

Verified Code

Executed in a sandbox with Perl v5.34.1 (1789 ms) to capture real output.

STDOUT
Response content:
{
  "args": {}, 
  "headers": {
    "Host": "httpbin.org", 
    "User-Agent": "libwww-perl/6.44", 
    "X-Amzn-Trace-Id": "Root=1-69534f1e-356e9f656380096609ab4b3c"
  }, 
  "origin": "107.167.18.100", 
  "url": "http://httpbin.org/get"
}
STDERR
(empty)
