12.11 Web Services

Web Services is yet another distributed computing architecture. As such, all of the general guidelines for efficient client/server systems from previous sections also apply to improving the performance of Web Services. Table 12-2 lists the equivalent standards for Web Services, CORBA, and Java RMI.
The simplicity of the Web Services model has both advantages and disadvantages for performance (see Table 12-3). Web Services is too simple for many distributed application requirements. The many additional features in CORBA and RMI are not whimsical; they are there in response to recognized needs. This implies that as these needs are transferred to Web Services, the Web Services standards will evolve to support additional functionality. From a performance point of view, this is problematic. Typically, the more functionality that is added to the standard, the worse performance becomes because the architecture needs to handle more and more options. So consider the performance impact of each function added to the Web Services standards.
12.11.1 Measuring Web Services Performance

As I write this, there is a market opportunity for Web Services profiling and measurement tools. You can use web measurement tools, such as load-testing tools and web-server monitoring tools, but these provide only the most basic statistics for Web Services, and are not normally sufficient to determine where bottlenecks lie. For developers, this means that you cannot easily obtain a Web Services profiling tool, and consequently breaking down the end-to-end performance of a Web Service and finding bottlenecks may be challenging. Currently the best way to measure the component parts of Web Services seems to be to explicitly add logging points (see, for example, Steve Souza's Java Application Monitor at http://www.JavaPerformanceTuning.com/tools/jamon/index.shtml).

The major Web Services component times to measure are the time taken by the server service, the time taken by the server marshalling, the time taken by the client marshalling, and the time taken to transport the message. Ideally you would like to measure times:
It is important (but difficult) to determine the time taken in marshalling and unmarshalling and the time taken for network transportation, so that you know where to focus your tuning effort. Of course, if you are worried only about the Web Service itself and you have arbitrary Web Service clients connecting to your service, as is the expected scenario, then you are interested in points 4 to 13. Note that I include these points because the client perception of your service is affected not only by how long the server takes to process the request but also by any delays in the server receiving the message, and because the time taken to receive the reply depends on the size of the returned message. Specifically, if the TCP data has arrived at the server (or starts to arrive at the server if it requires several TCP packets) but the server does not start reading because it is busy, this service wait time is an overhead that adds to the time taken to service the request. In the same way, the larger the returned data, the more time it may take to be assembled on the client side before unmarshalling can begin, which again adds overhead to the total service time.

In practice, what tends to get measured is either the full round-trip time (client to server and back) with no breakdown, or only the server-side method call. But there are a number of different ways to infer some of the intermediate measurements. The following sections detail various ways to directly measure or infer some Web Service request times.

12.11.1.1 Measuring server-side method execution time

Server-side method execution is the simplest measurement to take. Simply wrap the original method with a timer.
For example, if the server method is getBlah(params), then rename it to _getBlah(params) and implement getBlah(params) as:

    public whatever getBlah(params) {
        Thread t;
        Log.start(t = Thread.currentThread(), "getBlah");
        // Delegate to the renamed original method, not to this wrapper
        whatever returnValue = _getBlah(params);
        Log.end(t, "getBlah");
        return returnValue;
    }

12.11.1.2 Measuring the full round-trip time

To measure the full round-trip time, employ the wrapping technique that we just described, but this time in the client.

12.11.1.3 Inferring round-trip overhead

To infer round-trip overhead, simply measure the time taken to execute a call to an "echo" Web Service, i.e., the Web Service implemented as:

    public String echo(String val) {
        return val;
    }

12.11.1.4 Inferring network communication time

You can infer the combined time taken to transfer the data to and from the server by executing the Web Service in two configurations: across the network, and with both client and server executing on the local machine. Be sure to use the numeric IP address in both cases to specify the service (e.g., 10.20.21.22 rather than myservice.myhost.mycomp.com) to eliminate DNS lookup costs. Note that since this is likely to be communication over the Internet, you can measure only average times or daily profile times. You should repeat the measurements many times and either take the average or generate a profile of transport times at different times of the day.

12.11.1.5 Inferring DNS lookup time

To find out how long DNS lookups are taking, compare the times measured using the numeric IP address with the times measured using the name for the service (e.g., using 10.20.21.22 versus using myservice.myhost.mycomp.com). DNS lookup time can vary depending on network congestion and DNS server availability, so averages are helpful.
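Several of the measurements above depend on repeating a call many times and averaging the result. The following is a minimal sketch of such a helper, not a definitive implementation; the class name CallTimer and the method averageNanos are hypothetical names introduced here for illustration.

```java
import java.util.function.Supplier;

/** Hypothetical helper: times a call repeatedly and reports the mean. */
final class CallTimer {
    private CallTimer() {}

    /**
     * Invokes the call the given number of times and returns the
     * mean elapsed wall-clock time per invocation, in nanoseconds.
     */
    static <T> double averageNanos(Supplier<T> call, int repetitions) {
        if (repetitions <= 0) {
            throw new IllegalArgumentException("repetitions must be positive");
        }
        long total = 0;
        for (int i = 0; i < repetitions; i++) {
            long start = System.nanoTime();
            call.get();                          // the call being measured
            total += System.nanoTime() - start;  // accumulate elapsed time
        }
        return (double) total / repetitions;
    }
}
```

A client could, for example, pass a lambda that invokes the echo service through its generated proxy, once addressed by numeric IP address and once by hostname, and compare the two averages to estimate DNS lookup cost.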
12.11.1.6 Inferring marshalling time

From the previous measurements, you can subtract network communication time, DNS time, and server-side method execution time from the total round-trip time to obtain the remaining overhead time, which includes marshalling and other actions such as object resolution, proxy method invocation, etc. The majority of this overhead time is expected to come from marshalling.

If your Web Service is layered behind a web server that runs a Java servlet, you can add logging to the web server layer in the doGet( ) and doPost( ) methods. Since these servlet methods are called before any marshalling is performed, they provide more direct measurements of marshalling and unmarshalling times.

In addition to measuring individual calls, you should also load-test the Web Service, testing it as if multiple, separate clients were making requests. It is not difficult to create a client to run multiple requests to the Web Service, but there are also free load-testing utilities that you can use, such as Load (available from http://www.pushtotest.com).
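As a rough illustration of a home-grown load-test client, the sketch below runs the same request from several concurrent simulated clients and collects the per-request times. The class name SimpleLoadTest and its method are hypothetical, and the request is passed in as a Runnable so that any client-side call to the Web Service can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical load-test driver: N concurrent clients, M requests each. */
final class SimpleLoadTest {
    private SimpleLoadTest() {}

    /** Returns the elapsed time of every request, in nanoseconds. */
    static long[] run(Runnable request, int clients, int requestsPerClient) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<long[]>> pending = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                // Each task plays the role of one separate client
                pending.add(pool.submit(() -> {
                    long[] times = new long[requestsPerClient];
                    for (int i = 0; i < requestsPerClient; i++) {
                        long start = System.nanoTime();
                        request.run();
                        times[i] = System.nanoTime() - start;
                    }
                    return times;
                }));
            }
            long[] all = new long[clients * requestsPerClient];
            int pos = 0;
            for (Future<long[]> f : pending) {
                long[] times = f.get();  // wait for this client to finish
                System.arraycopy(times, 0, all, pos, times.length);
                pos += times.length;
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load test interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The returned times can then be averaged or sorted to look at percentiles; a dedicated load-testing tool is likely to be the better choice for anything beyond a quick check.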
12.11.2 High-Performance Web Services

It is worth emphasizing that the previous sections of this chapter, as well as other chapters in this book, also apply to performance-tuning Web Services. As with all distributed computing, caching is especially important and should be applied to data and metadata such as WSDL (Web Services Description Language) files. The generation and parsing of XML is a Web Service overhead that you should try to minimize by using specialized XML processors. Additionally, a few techniques are particularly effective for high-performance Web Services: service granularity, load balancing, and asynchronous processing.
These techniques are discussed in the following sections.

12.11.2.1 Service granularity

If you read the "Message Reduction" section, it should come as no surprise that Web Service methods should have a large granularity. A Web Service should provide monolithic methods that do as much work as possible rather than many methods that perform small services. The intention is to reduce the number of client/server requests required to satisfy the client's requirements. For example, the classic example of a Web Service is providing the current share price of a company quoted on a stock exchange:

    public interface IStockQuoteService {
        public String getQuote(String exchangeSymbol);
        public String getSymbol(String companyName);
    }

Amusingly, this "classic" example is bad; it is too fine-grained for optimal efficiency. If you want to create a Web Service that provides share price quotes, you are far better off providing a service that can return multiple quotes in one request, as it is likely that anyone requesting one share price will also want others. Here is a more efficient interface:

    public interface IStockQuoteService {
        public String[] getQuotes(String[] exchangeSymbols);
        public String[] getSymbols(String[] companyNames);
        public String[] getQuotesIfResolved(String[] companyNames);
    }

Note that there are three changes to this interface. First, as already explained, I have changed the methods to accept and return an array of Strings so that multiple prices for multiple companies can be obtained in one request. Second, I have not retained the previous methods that handle only one company at a time. This is a deliberate attempt to influence the thinking of developers using the service. I want developers of clients using this Web Service to immediately think in terms of multiple companies per request so that they build their client more efficiently.
As the server Web Services manager, this benefits me twice over: once by influencing clients to be more efficient, ultimately giving my service a better reputation, and again by reducing the number of requests sent to my Web Service. Note that if a client is determined to be inefficient, he can still send one request per company, but at least I've tried my best to influence his thinking. The third change I've made is to add a new method. The original interface had two methods: one to get quotes using the company symbol and the other to get the company symbol using the company name. In case you are unfamiliar with stock market exchanges, I should explain that a company may have several recognizable names (for example, Big Comp., Big Company, Big Company Inc., The Big Company). The stock exchange assigns one unique symbol to identify the company (for example, BIGC). The getSymbol( ) method provides a mechanism to get the unique symbol from one of the many alternative company names. With only the two methods, if a client has a company name without the symbol, it needs to make two requests to the server to obtain the share price: a request for the unique symbol and a request for the price. By adding a third method that gives a price directly from one of the various valid company names, I've provided the option to reduce requests for those clients that need this service. Think through the service you provide, and try to design a service that minimizes client requests. Similarly, if you are writing a Web Services client and the service provides alternative ways to get the information you need, use the methods that minimize the number of requests required. Think in terms of individual methods that do a lot of work and return a lot of information rather than the recommended object-oriented methodology of many small methods that each do a little bit and combine to do a lot. 
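To make the saving concrete, here is a minimal in-memory sketch of the coarse-grained interface. The interface is the one given in the text; the implementing class, its addCompany method, and the sample data are hypothetical and exist only for illustration. The point is that getQuotesIfResolved( ) performs the name-to-symbol and symbol-to-price lookups on the server, so a client holding only company names needs one request rather than two.

```java
import java.util.HashMap;
import java.util.Map;

// The coarse-grained interface from the text
interface IStockQuoteService {
    String[] getQuotes(String[] exchangeSymbols);
    String[] getSymbols(String[] companyNames);
    String[] getQuotesIfResolved(String[] companyNames);
}

// Hypothetical in-memory implementation, for illustration only
final class InMemoryStockQuoteService implements IStockQuoteService {
    private final Map<String, String> symbolsByName = new HashMap<>();
    private final Map<String, String> quotesBySymbol = new HashMap<>();

    /** Registers one symbol, its quote, and any number of alternative names. */
    void addCompany(String symbol, String quote, String... names) {
        quotesBySymbol.put(symbol, quote);
        for (String name : names) {
            symbolsByName.put(name, symbol);
        }
    }

    public String[] getQuotes(String[] exchangeSymbols) {
        String[] quotes = new String[exchangeSymbols.length];
        for (int i = 0; i < exchangeSymbols.length; i++) {
            quotes[i] = quotesBySymbol.get(exchangeSymbols[i]); // null if unknown
        }
        return quotes;
    }

    public String[] getSymbols(String[] companyNames) {
        String[] symbols = new String[companyNames.length];
        for (int i = 0; i < companyNames.length; i++) {
            symbols[i] = symbolsByName.get(companyNames[i]); // null if unresolved
        }
        return symbols;
    }

    public String[] getQuotesIfResolved(String[] companyNames) {
        // Both lookups happen server-side, within a single client request
        return getQuotes(getSymbols(companyNames));
    }
}
```

In this sketch an unresolved name simply yields a null entry in the result array; a real service would need to decide how to report unresolved names to the client.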
Unfortunately, you also need to be aware that if the interface is too complex, developers may use a competing Web Service provider with a simpler (but less efficient) interface that they can more easily understand.

12.11.2.2 Load balancing

The most efficient architecture for maximal scalability is a load-balanced server system. This architecture allows the client to connect to a frontend load balancer, which performs the minimum of activity and whose main job is to pass the request on to one of several backend servers (or clusters of servers) that perform the real work. Load balancing is discussed in more detail in Chapter 10.

Since Web Services already leverages the successful HTTP protocol, you can immediately use a web-server load balancer without altering any other aspect of the Web Service. A typical load-balancing Web Service would have the client connect to a frontend load balancer, which is a proxy web server, and have that load balancer pass on requests to a farm of backend Web Services.

The main alternative to this architecture is to use round-robin DNS, where the DNS server supplies a different IP address from a list of servers for each request to resolve a hostname. Each client is then automatically directed to an effectively arbitrary server in a farm of replicated Web Services.

A different load-balancing scheme is possible by controlling the WSDL document and sending WSDL containing different binding addresses (that is, different URLs for the Web Service location). In fact, all three of the load-balancing schemes mentioned here can be used simultaneously if necessary to scale the load balancing and reduce failure points in the system.

Where even load balancing is insufficient to provide the necessary throughput to efficiently handle all Web Service requests, priority levels should be added to Web Service requests. Higher-priority requests should be handled first, leaving lower-priority requests queued until server processing power is available.
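One way to picture priority handling is a queue that always hands the server the most urgent pending request. The sketch below uses the standard java.util.concurrent.PriorityBlockingQueue; the class names, the convention that a larger number means higher priority, and the first-come-first-served tie-break are all assumptions made for illustration.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical request wrapper; a larger priority value is more urgent. */
final class PrioritizedRequest implements Comparable<PrioritizedRequest> {
    private static final AtomicLong SEQUENCE = new AtomicLong();

    final int priority;
    final String name;
    private final long arrival = SEQUENCE.getAndIncrement();

    PrioritizedRequest(int priority, String name) {
        this.priority = priority;
        this.name = name;
    }

    @Override
    public int compareTo(PrioritizedRequest other) {
        // Higher priority sorts first; equal priorities keep arrival order
        int byPriority = Integer.compare(other.priority, priority);
        return byPriority != 0 ? byPriority : Long.compare(arrival, other.arrival);
    }
}

/** Hypothetical queue from which server worker threads take their next request. */
final class RequestQueue {
    private final PriorityBlockingQueue<PrioritizedRequest> queue =
            new PriorityBlockingQueue<>();

    void submit(int priority, String name) {
        queue.add(new PrioritizedRequest(priority, name));
    }

    /** Returns the most urgent pending request, or null if none is queued. */
    PrioritizedRequest next() {
        return queue.poll();
    }
}
```

Note that a strict scheme like this can starve low-priority requests under sustained load, so a production design might also age requests or reserve some capacity for them.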
12.11.2.3 Asynchronous processing

There are a number of characteristics of Web Services that suggest that asynchronous messaging may be required to use Web Services optimally.

HTTP is a best-efforts delivery service. This means that requests can be dropped, typically because of network congestion or server overload. The client Web Service will get an error in this situation, but nevertheless needs to handle it and retry.

Traffic on the Internet follows a distinct usage pattern and regularly provides better service at certain times. Web Service usage is likely to follow this pattern, as times of peak congestion are also likely to be peak Web Service usage (unless your service is targeted at an off-peak activity). This means that at peak times the average Web Service gets a double hit of a congested network and a higher number of requests reaching the service.

Many client/server projects over the years have shown that if your application can put up with increased latency, asynchronous messaging maximizes the throughput of the system.

Requiring synchronous processing over the Internet is a heavy overhead. Consider that synchronous calls are most likely to fail from congestion when other synchronous calls are also failing. The response for a synchronous protocol, such as TCP, is simply to send more attempts to complete the synchronous call. The repeated attempts only increase congestion, as they occur in addition to all the new synchronous calls that are now starting up.

Consequently, supporting asynchronous requests, especially for large, complicated services, is a good design option. You can do this using an underlying messaging protocol, such as JMS, or independently of the transport protocol using the design of the Web Service. The latter option means that you need to provide an interface that accepts requests and stores the results of processing the request for later retrieval by the client.
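The accept-now, retrieve-later design can be sketched as follows. This is one possible shape under stated assumptions, not a prescribed interface: the class name AsyncRequestService, the ticket-number scheme, and the single background worker are all hypothetical. The client calls submit( ) and gets a ticket back immediately, then calls retrieve( ) later to collect the result once processing has finished.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical service that accepts requests and stores results for later pickup. */
final class AsyncRequestService {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Map<Long, String> results = new ConcurrentHashMap<>();
    private final AtomicLong nextTicket = new AtomicLong(1);
    private final Function<String, String> processor;

    /** The processor does the real work; it must not return null. */
    AsyncRequestService(Function<String, String> processor) {
        this.processor = processor;
    }

    /** Queues the request and returns at once with a ticket for later retrieval. */
    long submit(String request) {
        long ticket = nextTicket.getAndIncrement();
        worker.execute(() -> results.put(ticket, processor.apply(request)));
        return ticket;
    }

    /** Returns and removes the stored result, or empty if it is not ready yet. */
    Optional<String> retrieve(long ticket) {
        return Optional.ofNullable(results.remove(ticket));
    }

    /** Stops accepting work; already queued requests still complete. */
    void shutdown() {
        worker.shutdown();
    }
}
```

A real service would also need to expire results that are never collected and to distinguish an unknown ticket from one that is still being processed; both are omitted here to keep the sketch short.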
Similarly, the client of the Web Service should strive to use an asynchronous model where possible.

Finally, some Web Services combine other Web Services in some value-added way to provide what are called aggregation services. Aggregation services should try to retrieve the data they require from other services during off-peak hours in large, coarse-grained requests.