The details of how proxying works differ from service to service. Some services provide proxying easily or automatically; for those services, you set up proxying by making configuration changes to normal servers. For most services, however, proxying requires appropriate proxy server software on the server side. On the client side, it needs one of the following:
Custom client software: With this approach, the software must know how to contact the proxy server instead of the real server when a user makes a request (for example, for FTP or Telnet), and how to tell the proxy server which real server to connect to.
Custom user procedures: With this approach, the user uses standard client software to talk to the proxy server and tells it to connect to the real server, instead of connecting to the real server directly.
The first approach is to use custom client software for proxying. There are a few problems associated with this approach.
Appropriate custom client software is often available only for certain platforms. If it's not available for one of your platforms, your users are pretty much out of luck. For example, the Igateway package from Sun (written by Jim Thompson) is a proxy package for FTP and TELNET, but you can only use it on Sun machines because it provides only precompiled Sun binaries. If you're going to use proxy software, you obviously need to choose software that's available for the needed platforms.
Even if software is available for your platforms, it may not be software your users want. For example, on the Macintosh, there are dozens of FTP client programs. Some of them have really impressive graphical user interfaces. Others offer different useful features; for example, Anarchie is a program that combines an Archie client and an FTP client into a single program, so that you can search for a file with Archie and then retrieve it with FTP, all with a single consistent user interface. You're out of luck if the particular client you want to use, for whatever reason, doesn't support your particular proxy server mechanism. In some cases, you may be able to modify clients to support your proxy server, but doing so requires that you have the source code for the client, as well as the tools and the ability to recompile it. Few client programs come with support for any form of proxying.
The happy exception to this rule is WWW client programs, like Mosaic. Many of these programs support proxies of various sorts (typically SOCKS and the CERN HTTP daemon). Most of these programs are fairly new, and were thus written after firewalls and proxy systems had become common on the Internet; recognizing the environment they would be working in, their authors chose to support proxying by design, right from the start.
Using client changes for proxying does not make proxying completely transparent to users. Most sites will use the unchanged clients for internal connections and the modified ones only to make external connections; users need to remember to use the modified program in order to make external connections. Following procedures they've become accustomed to using elsewhere, or procedures that are written in books, may leave them mystified at apparently intermittent results as internal connections succeed and external ones fail. (Using the modified clients internally will work, but it introduces unnecessary dependencies on the proxy server, which is why most sites avoid it.)
In addition to having to choose the right program, users may find themselves doing extra configuration, because the proxy client needs to know how to contact the proxy server. This shouldn't represent a major burden, but it provides an extra place for things to go wrong.
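The extra configuration mentioned above usually amounts to telling the client where the proxy server is and, ideally, which hosts count as internal. The following Python fragment is only a minimal sketch of that idea, not the behavior of any particular proxy package. The names `ProxyConfig`, `is_internal`, and `connection_target`, the host `bastion.example.com`, and the domain `example.com` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProxyConfig:
    """Settings a proxy-aware client would need (hypothetical layout)."""
    proxy_host: str                    # where the proxy server listens
    proxy_port: int
    internal_suffixes: Tuple[str, ...]  # domains reachable without the proxy


def is_internal(host: str, config: ProxyConfig) -> bool:
    """Return True if host is one of the internal domains or inside one."""
    name = host.lower().rstrip(".")
    return any(
        name == suffix or name.endswith("." + suffix)
        for suffix in config.internal_suffixes
    )


def connection_target(host: str, port: int, config: ProxyConfig) -> Tuple[str, int]:
    """Pick the address the client should actually open a connection to.

    Internal hosts are contacted directly, so they do not depend on the
    proxy server; everything else goes to the proxy.
    """
    if is_internal(host, config):
        return (host, port)
    return (config.proxy_host, config.proxy_port)
```

A client built this way makes the internal-versus-external choice itself, which is the decision users otherwise have to make by picking the right program. A mistake in any of these settings is exactly the kind of extra failure point described above.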
With the custom procedure approach, the proxy servers are designed to work with standard client software; however, they require the users of the software to follow custom procedures. The user tells the client to connect to the proxy server and then tells the proxy server which host to connect to. Because few protocols are designed to pass this kind of information, the user needs to remember not only what the name of the proxy server is, but also what special means are used to pass the name of the other host.
How does this work? You need to teach your users specific procedures to follow for each protocol. Let's look at FTP. Suppose a user wants to retrieve a file from an anonymous FTP server (e.g., ftp.greatcircle.com). Here's what the user does:
Using any FTP client, the user connects to your proxy server (which is probably running on the bastion host, the gateway to the Internet) instead of directly to the anonymous FTP server.
At the user name prompt, in addition to specifying the name he wants to use, the user also specifies the name of the real server he wants to connect to. If he wants to access the anonymous FTP server on ftp.greatcircle.com, for example, then instead of simply typing "anonymous" at the prompt generated by the proxy server, he'll type "anonymous@ftp.greatcircle.com".
For a more complete example, see the discussion of the TIS Internet Firewall Toolkit later in this section.
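The two steps above can be sketched in Python. This is an illustration under stated assumptions, not the implementation of any real proxy: it assumes the proxy accepts the user@host convention described above, and the helper names (`proxy_login_name`, `split_proxy_login`, `fits_client_field`, `open_via_proxy`) are hypothetical. Only the `ftplib` calls (`FTP.connect` and `FTP.login`) come from the Python standard library.

```python
from ftplib import FTP
from typing import Tuple


def proxy_login_name(user: str, real_host: str) -> str:
    """Build what the user types at the proxy's user name prompt."""
    if not user or not real_host:
        raise ValueError("both a user name and a real server are required")
    return f"{user}@{real_host}"


def split_proxy_login(login: str) -> Tuple[str, str]:
    """Recover (user, real_host), as a proxy server might (assumed logic).

    Splitting on the last "@" keeps user names that contain "@" intact.
    """
    user, sep, host = login.rpartition("@")
    if not sep or not user or not host:
        raise ValueError("expected a login of the form user@host")
    return user, host


def fits_client_field(login: str, max_length: int) -> bool:
    """Check whether a client's user name field can hold the combined name."""
    return len(login) <= max_length


def open_via_proxy(proxy_host: str, user: str, real_host: str, password: str) -> FTP:
    """Connect to the proxy (not the real server) and log in as user@host."""
    ftp = FTP()
    ftp.connect(proxy_host)  # step 1: talk to the proxy server
    ftp.login(user=proxy_login_name(user, real_host), passwd=password)  # step 2
    return ftp
```

For the example in the text, `proxy_login_name("anonymous", "ftp.greatcircle.com")` produces the string the user would type by hand. `fits_client_field` shows why a client with a short user name field cannot be used with this procedure.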
Just as using custom software requires some modification of user procedures, using custom procedures places limitations on which clients you can use. Some clients try to do anonymous FTP automatically; they won't know how to go through the proxy server. Some clients may interfere in simpler ways, e.g., by providing a graphical user interface that doesn't allow you to type a user name long enough to hold both the user name and the hostname.
The main problem with using custom procedures, however, is that you have to teach them to your users. If you have a small, technically adept user base, this may not be a problem. However, if you have 10,000 users spread across four continents, it's going to be a problem. On one side, you have arrayed hundreds of books, thousands of magazine articles, and tens of thousands of Usenet news postings, not to mention whatever previous training or experience the users might have had, all of which attempt to teach users the standard way to use basic Internet services like FTP. On the other side is your tiny voice, telling them how to use a procedure that is at odds with all the other information they're getting. On top of that, your users will have to remember the name of your gateway and the details of how to use it. In any organization of a reasonable size, this approach can't be relied upon.