but
it also makes it impossible for a client to distinguish between a
server that is slow and one that has crashed. In either case, the
client does not receive an RPC reply before the RPC timeout period
expires. Clients can't tell why a server appears slow, either:
packets could be dropped by the network and never reach the server,
or the server could simply be overloaded. Using NFS performance
figures alone, it is hard to distinguish a slow server from an
unreliable network. Users complain that "the system is
slow," but there are several areas that contribute to system
sluggishness.
An overloaded server responds to all packets that it
enqueues for its nfsd daemons, perhaps dropping some incoming
packets due to the high load. Those requests that are received
generate a response, albeit a response that arrives sometime after
the client has retransmitted the request. If the network itself is to
blame, then packets may not make it from the client or server onto
the wire, or they may vanish in transit between the two hosts.
16.4.2. Throughput
The next two sections summarize NFS throughput issues.
16.4.2.1. NFS writes (NFS Version 2 versus NFS Version 3)
Write operations over NFS Version 2 are synchronous, forcing
servers to flush data to disk before a reply to the NFS client
can be generated. This
severely limits the speed at which synchronous write requests can be
generated by the NFS client, since it has to wait for acknowledgment
from the server before it can generate the next request. NFS Version
3 overcomes this limitation by introducing a two-phase commit write
operation. The NFS Version 3 client generates asynchronous write
requests, allowing the server to acknowledge the requests without
requiring it to flush the data to disk. This results in a reduction
of the round-trip time between the client and server, allowing
requests to be sent more quickly. Since the server no longer flushes
the data to disk before it replies, the data may be lost if the
server crashes or reboots unexpectedly. The NFS Version 3 client
assumes the responsibility of recovering from these conditions by
caching a copy of the data. The client must first issue a commit
operation for the data to the server before it can flush its cached
copy of the data. In response to the commit request, the server
either ensures the data has been written to disk and responds
affirmatively, or in the case of a crash, responds with an error
causing the client to synchronously retransmit the cached copy of the
data to the server. In short, the client is still responsible for
holding on to the data until it receives acknowledgment from the
server indicating that the data has been flushed to disk.
For all practical purposes, the NFS Version 3 protocol removes any
limitations on the size of the data block that can be transmitted,
although the data block size may still be limited by the underlying
transport. Most NFS Version 3 implementations use a 32 KB data block
size. The larger NFS writes reduce protocol overhead and disk seek
time, resulting in much higher sequential file access throughput.
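If you want to see the difference on a live system, one rough
approach is to mount the same filesystem twice, once per protocol
version, and time a large sequential write through each mount. This
is only a sketch; the server name bigboy and the mount points are
placeholders:
# mount -o vers=2 bigboy:/export/home /mnt/v2
# mount -o vers=3 bigboy:/export/home /mnt/v3
% time dd if=/dev/zero of=/mnt/v2/testfile bs=32k count=2048
% time dd if=/dev/zero of=/mnt/v3/testfile bs=32k count=2048
On most systems the Version 3 mount finishes the same 64 MB write
noticeably faster, since its writes are acknowledged without waiting
for the disk.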
16.4.2.2. NFS/TCP versus NFS/UDP
TCP handles retransmissions and flow
control for NFS, requiring only individual
packets to be retransmitted in case of loss and making NFS practical
over lossy and wide-area networks. In contrast, UDP requires
the whole NFS operation to be retransmitted if one or more packets
are lost, making it impractical over lossy networks. TCP also allows
the read and write transfer sizes to be increased from 8 KB to 32 KB.
By default,
Solaris clients will attempt to mount NFS filesystems using NFS
Version 3 over TCP when supported by the server. Note that workloads
that mainly access attributes or consist of short reads will benefit
less from the larger transfer size, and as such you may want to
reduce the default read block size by using the
rsize=n option of the
mount
command. This is explored in more detail in
Chapter 18, "Client-Side Performance Tuning".
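For example, to drop the read size on a particular mount to 8 KB
(the server name bigboy is a placeholder):
# mount -o rsize=8192 bigboy:/export/home /mnt/home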
16.4.3. Locating bottlenecks
Given all of the areas in which NFS can
break
down, it is hard to pick a starting point for performance analysis.
Inspecting server behavior, for example, may not tell you anything if
the network is overly congested or dropping packets. One approach is
to start with a typical NFS client, and evaluate its view of the
network's services. Tools that examine the local network
interface, the network load perceived by the client, and NFS timeout
and retransmission statistics indicate whether the bulk of your
performance problems are due to the network or the NFS servers.
In this and the next two chapters, we look at performance problems
from excessive server loading to network congestion, and offer
suggestions for easing constraints at each of the problem areas
outlined above. However, you may want to get a rough idea of whether
your NFS servers or your network is the biggest contributor to
performance problems before walking through all diagnostic steps. On
a typical NFS client, use the
nfsstat tool to
compare the retransmission and duplicate reply rates:
% nfsstat -rc
Client rpc:
Connection oriented:
calls      badcalls  badxids   timeouts  newcreds  badverfs
1753584    1412      18        64        0         0
timers     cantconn  nomem     interrupts
0          1317      0         18
Connectionless:
calls      badcalls  retrans   badxids   timeouts  newcreds
12443      41        334       80        166       0
badverfs   timers    nomem     cantsend
0          4321      0         206
The
timeout value indicates the number of NFS
RPC calls that did not complete within the RPC timeout period. Divide
timeout by
calls to
determine the
retransmission rate for this
client. We'll look at an equation for calculating the maximum
allowable retransmission rate on each client in
Section 18.1.3, "Retransmission rate thresholds".
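Using the connectionless statistics above, that works out to
166/12443, or roughly 1.3 percent. As a quick sketch, you can compute
the rate directly with awk; the field positions assume nfsstat output
laid out like the sample above, so adjust them for your version of
nfsstat:
% nfsstat -rc | awk '
    /^Connectionless/           { cl = 1; next }
    cl && /calls/ && /timeouts/ { getline vals; split(vals, v);
                                  printf "retrans rate = %.2f%%\n", 100*v[5]/v[1];
                                  exit }'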
If the client-side RPC counts for
timeout and
badxid are close in value, the network is
healthy. Requests are making it to the server but the server cannot
handle them and generate replies before the client's RPC call
times out. The server eventually works its way through the backlog of
requests, generating duplicate replies that increment the
badxid count. In this case, the emphasis should
be on improving server response time.
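If the server is the suspect, the server-side view from the same tool
is a reasonable first check. Running nfsstat -s on the server shows
the RPC call and badcalls counts from the server's perspective, and
uptime shows its load average:
% nfsstat -s
% uptime
A steadily climbing calls count with few badcalls suggests the server
is simply busy rather than misbehaving.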
Alternatively,
nfsstat may show that
timeout is large while
badxid is zero or negligible. In this case,
packets are never making it to the server, and the network interfaces
of client and server, as well as the network itself, should be
examined. NFS does not query the lower protocol layers to determine
where packets are being consumed; to NFS the entire RPC and transport
mechanisms are a black box. Note that NFS is like
spray in this regard -- it doesn't
matter whether it's the local host's interface, network
congestion, or the remote host's interface that dropped the
packet -- the packets
are simply lost. To eliminate all
network-related effects, you must examine each of
these areas.
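Two quick checks cover most of these areas; the syntax shown is
Solaris, and the server name bigboy is a placeholder:
% netstat -i
% ping -s bigboy 8192 100
netstat -i reports input and output errors and collisions on each
local interface, while this form of ping sends a stream of large
packets (here, 100 packets of 8192 bytes each) to the server and
reports how many were lost in transit.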