
Recipe 20.2 Automating Form Submission

20.2.1 Problem

You want to submit form values to a CGI script from your program. For example, you want to write a program that searches Amazon and notifies you when new books with a particular keyword in the title or new books by a particular author appear.

20.2.2 Solution

If you're submitting form values with GET, build the URL with query_form and fetch it with the get function from LWP::Simple:

use LWP::Simple;
use URI::URL;

$url = url("http://www.amazon.com/exec/obidos/search-handle-url/index=books");
$url->query_form("field-author" => "Larry Wall"); # more params if needed
$page = get($url);

If you're using the POST method, create your own user agent and encode the content appropriately:

use LWP::UserAgent;

$ua = LWP::UserAgent->new( );
# post needs an absolute URL, including the http:// scheme
$resp = $ua->post("http://www.amazon.com/exec/obidos/search-handle-form",
                  { "url"            => "index-books",
                    "field-keywords" => "perl" });
$content = $resp->content;

20.2.3 Discussion

For simple operations, the procedural interface of the LWP::Simple module is sufficient. For fancier ones, the LWP::UserAgent module provides a virtual browser object, which you manipulate using method calls.

The format of a query string is:

field1=value1&field2=value2&field3=value3

In GET requests, this is encoded in the URL being requested:

script.cgi?field1=value1&field2=value2&field3=value3

Fields must still be properly escaped, so setting the arg form parameter to "this isn't <EASY> & <FUN>" would yield:

http://www.site.com/path/to/
        script.cgi?arg=%22this+isn%27t+%3CEASY%3E+%26+%3CFUN%3E%22

The query_form method called on a URL object correctly escapes the form values for you, or you could apply the URI::Escape::uri_escape function to each name and value yourself. (CGI's escapeHTML function is not a substitute: it escapes text for inclusion in HTML, not in URLs.) In POST requests, the query string is in the body of the HTTP request sent to the CGI script.
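As a rough sketch of both escaping routes, the following builds the same query once with query_form and once by hand with uri_escape. It uses the URI class directly rather than URI::URL (an assumption made for brevity; both provide query_form), and the script URL is the placeholder from the example above, not a real endpoint.

```perl
use strict;
use warnings;
use URI;
use URI::Escape qw(uri_escape uri_unescape);

# Placeholder endpoint, borrowed from the example URL above.
my $url = URI->new("http://www.site.com/path/to/script.cgi");

# The value includes the double quotes, so they are escaped as well.
my $value = q{"this isn't <EASY> & <FUN>"};

# query_form escapes each name and value and joins them into a query string.
$url->query_form(arg => $value);
print "$url\n";

# Called without arguments in list context, query_form parses the query
# back into unescaped name/value pairs.
my %form = $url->query_form;

# Escaping by hand: uri_escape percent-encodes one component at a time.
# It encodes a space as %20 rather than +.
my $escaped = uri_escape($value);
my $query   = "arg=$escaped";
print "$query\n";
```

Either form should decode to the same value on the server side, since form decoders generally treat + and %20 in a query string alike.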

You can use the LWP::Simple module to submit data in a GET request, but there is no corresponding LWP::Simple interface for POST requests. Instead, the $ua->post method creates and submits the request in one fell swoop.
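To see where the form data ends up in a POST, one option is to build the request with HTTP::Request::Common and inspect it before sending. This is only a sketch: the endpoint and the submit helper are hypothetical names introduced for illustration, and the network call is left commented out.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# Hypothetical endpoint, for illustration only.
my $endpoint = "http://www.example.com/cgi-bin/search";

# An array reference keeps the fields in the order given.
my $req = POST($endpoint, [ "field-keywords" => "perl",
                            "url"            => "index-books" ]);

# The form data travels in the request body, not in the URL.
print $req->method, " ", $req->uri, "\n";
print $req->header("Content-Type"), "\n";
print $req->content, "\n";

# Hypothetical helper: send the request and return the page,
# or die with the HTTP status line on failure.
sub submit {
    my ($ua, $request) = @_;
    my $resp = $ua->request($request);
    die "POST failed: ", $resp->status_line, "\n" unless $resp->is_success;
    return $resp->decoded_content;
}

# my $page = submit(LWP::UserAgent->new, $req);   # uncomment to hit the network
```

Checking is_success before using the content avoids mistaking an error page for search results.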

If you need to go through a proxy, construct your user agent and tell it to use a proxy this way:

$ua->proxy('http' => 'http://proxy.myorg.com:8081');

If a proxy handles multiple protocols, pass an array reference as the first argument:

$ua->proxy(['http', 'ftp'] => 'http://proxy.myorg.com:8081');

That says that HTTP and FTP requests through this user agent should be routed through the proxy on port 8081 at proxy.myorg.com.

20.2.4 See Also

The documentation for the CPAN modules LWP::Simple, LWP::UserAgent, HTTP::Request::Common, URI::Escape, and URI::URL; Recipe 20.1; Perl & LWP
