17.5 Performance and Scalability Gotchas

This section presents several well-known issues that may affect performance and scalability for your Struts applications. This section is not meant to be exhaustive, but rather to single out a few of the more serious concerns.

17.5.1 Request Scope Versus Session

Memory is limited. You can purchase more memory for a machine, but at some point you'll stop receiving the same return on your investment. It's very common to store objects and data in the user's HttpSession. In some cases, this might be the only way to achieve a particular requirement. However, you must consider the effect that storing objects and data in the session has on an application.

The more information and objects you store in the session, the more you need to worry about scalability. Say that you store 0.5 MB worth of data for a single user. If the application were loaded with 1000 concurrent users, session data alone would consume 500 MB (roughly 0.5 GB) of memory.
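The arithmetic is simple enough to script when you want to try different session sizes and user counts. The following is only a rough sketch; the class and method names are made up for illustration and are not part of Struts or the servlet API.

```java
// Back-of-the-envelope estimate of the memory consumed by session data.
// SessionMemoryEstimate is a hypothetical helper, not a Struts class.
class SessionMemoryEstimate {

    /** Returns the combined session footprint in megabytes. */
    static double totalMegabytes(double megabytesPerSession, int concurrentSessions) {
        return megabytesPerSession * concurrentSessions;
    }

    public static void main(String[] args) {
        // The figures used in the text: 0.5 MB per user, 1000 concurrent users.
        double total = totalMegabytes(0.5, 1000);
        System.out.println(total + " MB for 1000 sessions"); // prints "500.0 MB for 1000 sessions"
    }
}
```

The estimate covers session data only, so treat the result as a lower bound on what the application needs.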

Don't forget that there are other resources taking up memory, not just user sessions. There's application-scope data, the rest of the Struts framework, your application components, the container itself, the JVM, and so on. You must consider all of these factors.

As you can see, using the session to store data can quickly eat up memory. A better alternative is to use the HttpServletRequest to temporarily store data that other components need during the request. Once the request completes, the data becomes eligible for garbage collection. With request-scoped data, the responsibility for cleaning up falls to the container and the JVM, not the application.

17.5.2 Using the synchronized Keyword

Synchronization is used to control the access of multiple threads to a shared resource. There are many situations where synchronization makes sense and is absolutely necessary to keep multiple threads from interfering with one another, which can lead to significant application errors. Struts applications are inherently multithreaded, and you might think that certain parts of the web application should be synchronized. However, using the synchronized keyword inside your Struts applications forces request threads to wait on one another, which can cause significant performance problems and reduce the overall scalability of the applications.

We've all heard at one time or another that servlets should not contain any instance variables. This is because there may be only a single instance of a servlet running, with multiple client threads executing the same instance concurrently. If you store the state of one client thread in an instance variable and a different client thread comes along at the same time, it may overwrite the previous thread's state information. This is true for Struts Action classes and session-scoped ActionForms, too.

You must be sure to code in a thread-safe manner throughout your application. In other words, you must design and code your application so that multiple client threads can run through it concurrently without interfering with one another. If you need to control access to a shared resource, try to use a pool of resources instead of synchronizing on a single object.

Also, keep in mind that the HttpSession is not synchronized. If you have multiple threads reading and writing to objects in the user's session, you may experience severe problems that are very difficult to track down. It's up to the programmer to protect shared resources stored in the user's session.
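One way to picture the rule about instance variables is a single handler object shared by many threads. The sketch below uses only the standard library; GreetingHandler is a hypothetical stand-in for an Action class, and the thread pool plays the role of the container's request threads. Per-request state is kept in parameters and local variables, so no synchronization is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a Struts Action: one instance serves every thread.
class GreetingHandler {

    // UNSAFE alternative (deliberately not used): an instance field such as
    //     private String userName;
    // would be shared, and could be overwritten, by concurrent requests.

    // SAFE: per-request state lives in the parameter and a local variable,
    // both of which are confined to the calling thread.
    String handle(String userName) {
        String greeting = "Hello, " + userName;
        return greeting;
    }
}

class GreetingDemo {

    /** Sends the given number of simulated requests through one shared handler. */
    static List<String> runConcurrently(int requests) {
        final GreetingHandler shared = new GreetingHandler();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String user = "user" + i;
                pending.add(pool.submit(() -> shared.handle(user)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get()); // results come back in submission order
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> results = runConcurrently(100);
        System.out.println(results.get(0) + " ... " + results.get(99));
    }
}
```

Because the handler holds no mutable fields, every simulated request receives its own greeting regardless of how the threads interleave.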

17.5.2.1 Using java.util.Vector and java.util.Hashtable

You must also be careful about which Java classes you use throughout your Struts applications, especially when it comes to selecting a collection class. The java.util.Vector and java.util.Hashtable classes, for example, are synchronized internally. If you are using Vector or Hashtable within your Struts applications, every call to their methods acquires a lock, which may have the same effect as using the synchronized keyword explicitly.

You should avoid using these classes unless you are absolutely sure that you need to. Instead of using Vector, for example, you can use java.util.ArrayList. Instead of Hashtable, use the java.util.HashMap class. Both of these classes provide similar functionality without the synchronization overhead.
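As a small illustration, the sketch below builds a lookup structure with HashMap and ArrayList. It assumes the collections are created and used within a single request and never shared between threads; ProductCatalog and its method are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the collections are local to one request thread,
// so the unsynchronized classes are sufficient.
class ProductCatalog {

    /** Groups {category, productName} rows by category. */
    static Map<String, List<String>> groupByCategory(String[][] rows) {
        Map<String, List<String>> byCategory = new HashMap<>();   // instead of Hashtable
        for (String[] row : rows) {
            byCategory
                .computeIfAbsent(row[0], k -> new ArrayList<>())  // instead of Vector
                .add(row[1]);
        }
        return byCategory;
    }

    public static void main(String[] args) {
        String[][] rows = { {"books", "Struts"}, {"books", "JSP"}, {"music", "Jazz"} };
        System.out.println(groupByCategory(rows).get("books")); // prints "[Struts, JSP]"
    }
}
```

If a collection really is shared between threads, the unsynchronized classes are no longer safe on their own, and access to them has to be protected in some other way.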

17.5.3 Using Too Many Custom Tags

JSP custom tags are great at what they do. Using them instead of coding Java directly in your JSP pages is recommended by almost everyone who has used both approaches. You have to be careful, however, when using too many custom tags in a single JSP page. Some containers are not very efficient at pooling tag handlers, and some may generate poorly written Java code.

If your JSP pages are performing slowly, one possible solution is to move some of the code to another JSP page and use the JSP include mechanism. A second approach is to simply reduce the number of tags in the page, although this is less practical. If these solutions don't work, try a different container. Each container may deal with tags differently: while one may be slow with your application, another may be fast.

17.5.4 Improperly Tuning the JVM

The JVM supports many different options for tuning and configuring its runtime parameters. Sometimes it's necessary to adjust these options to achieve better performance from your application.

The two most important options when trying to increase performance or scalability for your application are the -Xms(size) and -Xmx(size) options. The -Xms option allows you to set the initial size of the application heap. The -Xmx option allows you to set the maximum size for the heap.

The heap is the memory storage area for the application. The larger the storage area, the more memory the application can use. You might ask, "Why not just set it to the maximum size allowed by the physical memory?" The problem with that approach is that the entire heap is an area that the garbage collector has to clean up. The garbage collector in the JVM runs periodically and attempts to reclaim any unused memory.

The garbage collector has to search through all of the memory assigned to an application. Each time the garbage collector runs, the application will pause. The longer it takes for the garbage collector to do its job, the longer the users will have to wait during a collection cycle.

It's very important to set the heap size correctly. Unfortunately, there's no general way to determine the correct heap size for an application. Each application is different, and each one creates and destroys objects at a different rate. The best that you can do is to set the values to standard starting points and make changes incrementally. You will eventually reach a point where performance or scalability gets worse as the values get higher. Lower the values again and leave them alone. In general, you should start with these values:

-Xms256M
-Xmx256M

Many sources recommend setting the initial and maximum heap sizes to the same value, so that the JVM doesn't have to pause the application when it needs to acquire more memory. This, in turn, should help to improve performance.
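To confirm that the heap options actually took effect, you can ask the running JVM what it sees. The following sketch relies only on the standard java.lang.Runtime class; HeapReport is a hypothetical name, and the value reported by maxMemory() may come out slightly below the -Xmx setting on some JVMs.

```java
// Prints the heap limits as seen by the running JVM.
// HeapReport is a hypothetical utility, not part of Struts.
class HeapReport {

    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (governed by -Xmx).
        System.out.println("maximum heap: " + toMegabytes(rt.maxMemory()) + " MB");
        // Heap currently reserved by the JVM (starts near -Xms).
        System.out.println("current heap: " + toMegabytes(rt.totalMemory()) + " MB");
        // Unused portion of the currently reserved heap.
        System.out.println("free in heap: " + toMegabytes(rt.freeMemory()) + " MB");
    }
}
```

Running it as java -Xms256M -Xmx256M HeapReport should report maximum and current heap sizes close to 256 MB.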

On older JVMs, the default minimum value is 2 MB and the default maximum value is 64 MB; newer JVMs typically choose defaults based on the memory available on the machine. The letter after the number in the option can be:

· k or K for kilobytes

· m or M for megabytes

To see a list of other supported JVM options, type java -X on the command line. Assuming that your path is set up correctly for the Java executable, you should see something similar to Figure 17-7.

Figure 17-7. Typing java -X displays the options available for the JVM


17.5.5 Using Too Many Remote Calls

When accessing remote components such as EJBs from your application, you may find that the overhead of network communication starts to cause performance problems. One thing to look at is the "granularity" of your remote calls. If you find that you are making many calls that retrieve a small amount of information on each page, try bundling a related set of calls into fewer remote invocations. For example, if you are displaying a product list, querying the products, and then requesting the details of each product as separate remote calls, this would be a good candidate for one aggregated call that returns all the product information and details at once. You can also improve the performance of an application that uses remote references by caching the remote reference. See Chapter 13 for more information.
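The difference between fine-grained and aggregated calls can be sketched without any EJB machinery. In the example below, ProductService is a hypothetical interface standing in for a remote component, and CountingProductService is an in-memory fake that simply counts how many times it is invoked; in a real application, each invocation would be a network round trip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical remote interface; each method call represents one round trip.
interface ProductService {
    List<String> findProductIds();
    String findDetails(String productId);   // fine-grained: one call per product
    List<String> findAllDetails();          // aggregated: everything in one call
}

// In-memory fake that counts invocations instead of crossing the network.
class CountingProductService implements ProductService {
    private final List<String> ids = Arrays.asList("p1", "p2", "p3");
    int roundTrips = 0;

    public List<String> findProductIds() {
        roundTrips++;
        return ids;
    }

    public String findDetails(String productId) {
        roundTrips++;
        return "details for " + productId;
    }

    public List<String> findAllDetails() {
        roundTrips++;
        List<String> all = new ArrayList<>();
        for (String id : ids) {
            all.add("details for " + id);
        }
        return all;
    }
}

class RemoteCallDemo {

    /** Chatty approach: one call for the list, then one call per product. */
    static List<String> loadChatty(ProductService service) {
        List<String> details = new ArrayList<>();
        for (String id : service.findProductIds()) {
            details.add(service.findDetails(id));
        }
        return details;
    }

    /** Aggregated approach: a single call returns all of the details. */
    static List<String> loadAggregated(ProductService service) {
        return service.findAllDetails();
    }

    public static void main(String[] args) {
        CountingProductService chatty = new CountingProductService();
        loadChatty(chatty);
        CountingProductService coarse = new CountingProductService();
        loadAggregated(coarse);
        // prints "chatty: 4 round trips, aggregated: 1"
        System.out.println("chatty: " + chatty.roundTrips
                + " round trips, aggregated: " + coarse.roundTrips);
    }
}
```

Both approaches return the same data; the chatty version grows by one call for every product in the list, while the aggregated version stays at a single call.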

17.5.6 Using Too Many Graphics

When a web page contains graphics, they are downloaded separately from the HTML content. Each image requires its own HTTP request and may use a separate connection. Even when the performance problems are related to images, a user may have the impression that the entire application is slow. Don't use too many images, and especially stay away from large images. This is one sure way to improve the performance of your application.