
21.5 Web Streams

Rather than reading from a stream provided by a custom server, you can just as easily read from any web page on the Internet.

A WebRequest is an object that requests the resource identified by a Uniform Resource Identifier (URI), such as the URL for a web page. Calling GetResponse( ) on your WebRequest object sends the request and returns a WebResponse object that encapsulates whatever the URI points to (e.g., a web page). You can then ask that WebResponse object for a Stream object by calling GetResponseStream( ), which returns a stream over the contents of that resource (e.g., the HTML of the web page).
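The request, response, and stream steps can be sketched without a network connection by pointing a WebRequest at a local file. This is only an illustration: the ChainSketch class, its Fetch( ) helper, and the temporary file are assumptions made for the example, and the Create( ) method it calls is described later in this section. On recent .NET versions the compiler may also warn that WebRequest is obsolete.

```csharp
using System;
using System.IO;
using System.Net;

public class ChainSketch
{
   // Fetches whatever the URI points to and returns it as text
   public static string Fetch(Uri uri)
   {
      // step 1: a request for the URI
      WebRequest request = WebRequest.Create(uri);

      // step 2: the response encapsulates the resource
      // step 3: the response hands back a stream over its contents
      using (WebResponse response = request.GetResponse( ))
      using (Stream stream = response.GetResponseStream( ))
      using (StreamReader reader = new StreamReader(stream))
      {
         return reader.ReadToEnd( );
      }
   }

   public static void Main( )
   {
      // hypothetical stand-in for a web page: a temporary local file
      string path = Path.GetTempFileName( );
      File.WriteAllText(path, "<html><body>Hello</body></html>");
      try
      {
         // an absolute file path becomes a file:// URI
         Console.WriteLine(Fetch(new Uri(path)));
      }
      finally
      {
         File.Delete(path);
      }
   }
}
```

The same Fetch( ) helper would work with an http:// URI; only the concrete request and response classes behind the WebRequest and WebResponse references change.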

The next example retrieves the contents of a web page as a stream. To get a web page, you'll want to use HttpWebRequest. HttpWebRequest derives from WebRequest and provides additional support for interacting with the HTTP protocol.

To create the HttpWebRequest, cast the WebRequest returned from the static Create( ) method of the WebRequest class:

HttpWebRequest webRequest = 
    (HttpWebRequest) WebRequest.Create
    ("http://www.libertyassociates.com/book_edit.htm");

Create( ) is a static method of WebRequest. It is overloaded to accept the URI either as a string or as a Uri object, and it returns a different derived type depending on the scheme of that URI. For example, if you pass in a URI that begins with http:// or https://, an instance of HttpWebRequest is created. The declared return type, however, is WebRequest, and so you must cast the returned value to HttpWebRequest.
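As a rough illustration of how Create( ) picks the derived type, the following sketch prints the name of the class chosen for two URIs with different schemes. The CreateSketch class, the RequestTypeFor( ) helper, and the example.htm file name are hypothetical; no connection is made and the file does not need to exist, because the requests are only constructed.

```csharp
using System;
using System.IO;
using System.Net;

public class CreateSketch
{
   // Returns the name of the concrete request class that
   // Create( ) chooses for the given URI
   public static string RequestTypeFor(string uri)
   {
      WebRequest request = WebRequest.Create(uri);
      return request.GetType( ).Name;
   }

   public static void Main( )
   {
      // an http:// URI
      Console.WriteLine(RequestTypeFor(
         "http://www.libertyassociates.com/book_edit.htm"));

      // a file:// URI built from a hypothetical local path
      string fileUri = new Uri(
         Path.Combine(Path.GetTempPath( ), "example.htm")).AbsoluteUri;
      Console.WriteLine(RequestTypeFor(fileUri));
   }
}
```

The first line printed should be HttpWebRequest and the second FileWebRequest, which is why the cast in the text is safe only when you know the scheme of the URI you passed in.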

Creating the HttpWebRequest does not by itself contact the server; the request is sent when you call GetResponse( ). What you get back from the host is encapsulated in an HttpWebResponse object, which is an HTTP-specific subclass of the more general WebResponse class:

HttpWebResponse webResponse = 
    (HttpWebResponse) webRequest.GetResponse( );

You can now open a StreamReader on that page by calling the GetResponseStream( ) method of the WebResponse object:

StreamReader streamReader = new StreamReader(
    webResponse.GetResponseStream( ), Encoding.ASCII);

You can read from that stream exactly as you read from the network stream. Example 21-14 shows the complete listing.
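One detail worth checking when you read a page this way is the encoding passed to the StreamReader. Encoding.ASCII suits a page that contains only 7-bit characters; anything else cannot be decoded by it. The sketch below is an illustration under stated assumptions: the EncodingSketch class and ReadAll( ) helper are hypothetical, a MemoryStream stands in for the response stream, and the page content is made up.

```csharp
using System;
using System.IO;
using System.Text;

public class EncodingSketch
{
   // Decodes the given bytes through a StreamReader that
   // uses the given encoding
   public static string ReadAll(byte[] bytes, Encoding encoding)
   {
      using (StreamReader reader =
         new StreamReader(new MemoryStream(bytes), encoding))
      {
         return reader.ReadToEnd( );
      }
   }

   public static void Main( )
   {
      // hypothetical page content with one non-ASCII character,
      // stored as UTF-8 bytes
      byte[] page = Encoding.UTF8.GetBytes("<p>caf\u00e9</p>");

      // the accented letter is lost when decoded as ASCII
      Console.WriteLine(ReadAll(page, Encoding.ASCII));

      // the accented letter survives when decoded as UTF-8
      Console.WriteLine(ReadAll(page, Encoding.UTF8));
   }
}
```

If the pages you read may contain characters outside the ASCII range, passing an encoding that matches the page is the safer choice.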

Example 21-14. Reading a web page as an HTML stream
using System;
using System.Net;
using System.IO;
using System.Text;

public class Client
{
    
   static public void Main( string[] Args )
   {

      // create a webRequest for a particular page
      HttpWebRequest webRequest = 
         (HttpWebRequest) WebRequest.Create
         ("http://www.libertyassociates.com/book_edit.htm");

      try
      {
         // ask the web request for a webResponse encapsulating
         // that page, then get the streamReader from the response;
         // the using blocks close both even if an exception is thrown
         using (HttpWebResponse webResponse = 
            (HttpWebResponse) webRequest.GetResponse( ))
         using (StreamReader streamReader = new StreamReader(
            webResponse.GetResponseStream( ), Encoding.ASCII))
         {
            string outputString = streamReader.ReadToEnd( );
            Console.WriteLine(outputString);
         }
      }
      catch (WebException e)
      {
         // the request itself failed (e.g., host not found)
         Console.WriteLine(
            "Exception requesting web page: {0}", e.Message);
      }
      catch (IOException e)
      {
         Console.WriteLine(
            "Exception reading from web page: {0}", e.Message);
      }

   }
}

Output (excerpt):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>

<head>
<title>Books &amp; Resources</title>
</head>

<body bgcolor="#ffffff" vlink="#808080" 
alink="#800000" topmargin="0" leftmargin="0">
<table border="0" cellpadding="0" cellspacing="0" width="454" bgcolor="#ffffff">

  <tr>
&quot;More
      than just about any other writer, Jesse Liberty 
      is brilliant at communicating what it's really 
      like to work on a programming project.&quot;
      </font></b><font face="times new roman, times, 
      serif" size="3"><b>
 </b> Barnes &amp; Noble</font></i><font size="3"><br>

The output shows that what is sent through the stream is the HTML of the page you requested. You might use this capability for screen scraping: reading a page from a site into a buffer and then extracting the information you need.

All examples of screen scraping in this book assume that you are reading a site for which you have copyright permission.
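As a sketch of the extraction step in screen scraping, the following example pulls the title out of a string of HTML such as the one read in Example 21-14. The ScrapeSketch class and ExtractTitle( ) helper are hypothetical, and a regular expression is just one simple way to do this; it is not a general HTML parser.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public class ScrapeSketch
{
   // Returns the decoded text of the first <title> element,
   // or null if the page has none
   public static string ExtractTitle(string html)
   {
      Match match = Regex.Match(html, @"<title>(.*?)</title>",
         RegexOptions.IgnoreCase | RegexOptions.Singleline);
      if (!match.Success)
      {
         return null;
      }

      // turn entities such as &amp; back into characters
      return WebUtility.HtmlDecode(match.Groups[1].Value.Trim( ));
   }

   public static void Main( )
   {
      // a fragment of the page shown in the output above
      string html = "<html>\n<head>\n" +
         "<title>Books &amp; Resources</title>\n" +
         "</head>\n</html>";

      Console.WriteLine(ExtractTitle(html));
   }
}
```

For the fragment shown, the title printed is Books & Resources. In a real program you would pass the string returned by ReadToEnd( ) to ExtractTitle( ) instead of a literal.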
