8.3 MTU Throughout the Network

The size of the largest data packet that can pass along any particular section of network is called the Maximum Transmission Unit (MTU). Suppose, for example, that a network contains both Ethernet and Token Ring segments. The default MTU for Ethernet is 1,500 bytes. For a 16Mbps Token Ring, the maximum MTU is 18,200 bytes. If a packet travels from one of these media to the other, the network will have to find a compromise.

There are two main ways to resolve MTU mismatch problems. The network can either fragment the large packets, or it can force everything to use the smaller value. In most cases, the network will fragment packets if it can, and it will negotiate the largest MTU that every leg of the path supports only if it is not allowed to fragment. Both approaches consume network resources. However, fragmentation has to be done with every oversized packet, while MTU negotiation is done primarily during session establishment. MTU negotiation also happens if the path changes and the new path contains a leg with a lower MTU value.

In TCP sessions, the Path MTU Discovery process starts when the sender transmits a packet with the Don't Fragment (DF) bit set in the IP header. This bit literally instructs the network not to fragment the packet. When the bit is not set, routers are free to fragment, and this is the default behavior. Suppose a TCP packet passes through a network and reaches a router that needs to break it up to send it to the next hop on its path. If the DF bit in the IP header is not set, then the router simply breaks the packet into as many pieces as necessary and sends them along. When the fragments reach the ultimate destination, they are reassembled.
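The fragmentation arithmetic can be sketched in a few lines of Python. This is only an illustration, not a router implementation. It assumes a 20-byte IPv4 header with no options, and the function name fragment_sizes is hypothetical. The one detail worth noticing is that every fragment except the last must carry a multiple of 8 data bytes, because the IP header expresses the fragment offset in 8-byte units.

```python
def fragment_sizes(total_length, mtu, ip_header=20):
    """Return the total length (header plus data) of each IPv4 fragment.

    Assumes a fixed header of ip_header bytes with no IP options.
    """
    if mtu - ip_header < 8:
        raise ValueError("MTU too small to carry any fragment data")
    if total_length <= mtu:
        return [total_length]  # fits as is, no fragmentation needed

    payload = total_length - ip_header
    # Fragment offsets are counted in 8-byte units, so round down.
    max_data = (mtu - ip_header) // 8 * 8

    sizes = []
    while payload > 0:
        chunk = min(max_data, payload)
        sizes.append(chunk + ip_header)  # every fragment repeats the header
        payload -= chunk
    return sizes


print(fragment_sizes(4000, 1500))  # [1500, 1500, 1040]
```

A 4,000-byte packet crossing a 1,500-byte Ethernet leg becomes three packets, and the repeated headers add 40 bytes to what the network must carry.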

If the DF bit is set, then the router drops the packet and sends back a special ICMP packet explaining the situation: a Destination Unreachable message with the code "fragmentation needed and DF set." This packet tells the sender that the packet has been dropped because it could not be fragmented. It also tells the sender the MTU of the next hop, which is the largest packet it could have sent. Doing so allows the sender to shorten all future packets to this Path MTU value.
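The discovery loop can be simulated with a short Python sketch. It is a simplified model under stated assumptions: the path is just a list of link MTUs, every router reports its next-hop MTU in the ICMP message, and no ICMP messages are filtered along the way. The function name discover_path_mtu is hypothetical.

```python
def discover_path_mtu(link_mtus):
    """Simulate a sender probing a path with the DF bit set.

    link_mtus lists the MTU of each link in order, starting with the
    sender's own segment. Returns (path_mtu, probes_sent).
    """
    size = link_mtus[0]  # start with the largest packet the local link allows
    probes = 0
    while True:
        probes += 1
        for mtu in link_mtus:
            if size > mtu:
                # The router in front of this link drops the packet and
                # reports the link MTU back; the sender shrinks to fit.
                size = mtu
                break
        else:
            # No link rejected the packet, so it reached the destination.
            return size, probes


# Token Ring segment, then Ethernet, then a hypothetical 576-byte WAN leg.
print(discover_path_mtu([18200, 1500, 576]))  # (576, 3)
```

Each low-MTU leg costs the sender one dropped packet, but only once per session or path change, which is why this is cheaper than fragmenting every oversized packet.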

Note that it is more efficient in general to reassemble at the ultimate destination rather than at the other end of the link with a lower MTU. This is because it is possible that the packet will encounter another low MTU segment later in the path. Since there is significant overhead in both fragmentation and reassembly, if the network has to do it, it should do it only once.

Many protocols do not have a Path MTU Discovery mechanism. In particular, it is not possible to negotiate an end-to-end MTU for a UDP application. Thus, whenever a large UDP packet is sent through a network segment with a lower MTU value, it must be fragmented. Then the receiver has to carefully buffer and reassemble the pieces. However, most UDP applications deliberately keep their packets small to avoid fragmentation.

If the network is noisy or congested, it is possible to lose some fragments. This loss results in two efficiency problems. First, the device that reassembles the packet from the fragments must buffer the fragments and hold them in its memory until it decides it can no longer wait for the missing pieces. This is not only a resource issue on the device, but it also results in serious latency and jitter problems. The second problem can actually be more serious. If any fragment is lost, then the entire packet must be resent, including the fragments that were received properly. If the fragments were lost because of congestion, these retransmissions add still more traffic and make the congestion considerably worse.
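A rough calculation shows how quickly this second problem grows with the number of fragments. The sketch below assumes that each fragment is lost independently with the same probability, which is a simplification because real losses tend to come in bursts. The function name packet_delivery_probability is hypothetical.

```python
def packet_delivery_probability(fragment_count, fragment_loss_rate):
    """Probability that every fragment of one packet arrives.

    Assumes each fragment is lost independently at fragment_loss_rate.
    """
    return (1.0 - fragment_loss_rate) ** fragment_count


# An 18,200-byte Token Ring packet needs 13 fragments on a 1,500-byte leg
# (assuming 20-byte IP headers). With 1% fragment loss:
p = packet_delivery_probability(13, 0.01)
print(round(p, 3))  # 0.878
```

Under these assumptions, a 1 percent fragment loss rate means that roughly 12 percent of the original packets must be resent in full.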

Obviously, it is better if the network doesn't have to fragment packets. Thus, in a multiprotocol network it is often better to configure a common MTU manually throughout all end-device segments.

This configuration is not always practical for Token Ring segments that run IBM protocols. Suppose a tunneling protocol such as Data Link Switching (DLSw) connects two Token Ring segments through an Ethernet infrastructure. Generally, it is most efficient to use the greatest MTU possible. In this case, however, the tunnel offers an important advantage. The DLSw protocol is TCP based and operates as a tunnel between two routers. These routers can discover a smaller Path MTU between them. They can then simply hide the fragmentation and reassembly from the end devices, so that they appear to pass full-sized Token Ring frames.

Even here, the routers suffer from additional memory utilization, and there will be latency and jitter issues on the end-to-end session. If at all possible, it is better to reduce the Token Ring MTU to match the lower Ethernet value.