Ethernet flow control allows for a receiving node to temporarily stop the transmission of data from the sending node. As defined by IEEE 802.3x this is accomplished via the PAUSE frame.
Flow control is useful in cases where a node on the network is transmitting data faster than the receiving node on the network can accept it; the goal is to properly handle input buffer congestion while preventing packet loss. The PAUSE frame is used to tell the sending node on the link to temporarily stop or ‘pause’ as the receiving node cannot handle the rate at which the data is being sent. The receiving node sends a PAUSE frame to the sending node which then halts the transmission of further data for a specified period of time.
An Ethernet frame is used to carry the ‘PAUSE’ command, with the ‘Ethertype’ field always set to ‘0x8808’ and the ‘Opcode’ field always set to ‘0x0001’. When a node becomes overwhelmed with traffic from the other end of the link, it sends a PAUSE frame to the reserved 48-bit destination multicast address ‘01-80-C2-00-00-01’. Because this address is fixed, the node does not need to discover and store the address of the node at the other end of the link. Without flow control enabled on the switch, the overloaded device will drop packets. The PAUSE frame has the structure shown below.
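To make the field layout concrete, here is a minimal Python sketch that assembles the bytes of a PAUSE frame from the values described above. The function name `build_pause_frame` and the example source MAC are my own placeholders. The zero padding to 60 bytes (the minimum Ethernet frame size without the 4-byte FCS) is not discussed in this post, and the FCS itself is left out because the NIC normally appends it.

```python
import struct

# Values taken from the text above.
PAUSE_DST = bytes.fromhex("0180C2000001")  # reserved multicast destination
ETHERTYPE_MAC_CONTROL = 0x8808
OPCODE_PAUSE = 0x0001

# Assumption (not from the post): minimum frame size excluding the FCS.
MIN_FRAME_NO_FCS = 60


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Return the bytes of a PAUSE frame, without the trailing FCS."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be exactly 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in a 16-bit unsigned integer")
    header = PAUSE_DST + src_mac + struct.pack("!H", ETHERTYPE_MAC_CONTROL)
    # Opcode and pause time are both 2-byte big-endian fields.
    payload = struct.pack("!HH", OPCODE_PAUSE, quanta)
    # Pad with zeros up to the minimum frame length.
    return (header + payload).ljust(MIN_FRAME_NO_FCS, b"\x00")


# Hypothetical locally administered source MAC, maximum pause time.
frame = build_pause_frame(bytes.fromhex("020000000001"), 0xFFFF)
print(frame.hex())
```

Calling the same function with a pause time of 0 produces the frame that tells the other end to resume.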
The PAUSE frame includes a two-byte unsigned integer (0 through 65535, or 0x0000 through 0xFFFF) which tells the sending node how long to pause. A value of ‘0’ tells the end device to resume transmission. The pause time is measured in units of pause ‘quanta’, where each unit is equal to 512 bit times. To give an example, with Gigabit Ethernet a pause time of 0xFFFF (65535) equates to 33.55 msec. If you would like to understand the details of this calculation further, please see the ‘Ethernet Flow Control Pause Frame (IEEE 802.3x)’ page at http://wiki.networksecuritytoolkit.org.
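The arithmetic behind that figure is short enough to sketch in Python. One quantum is 512 bit times, and one bit time is the reciprocal of the link speed, so the pause duration is quanta × 512 / link speed. The function name `pause_seconds` is just an illustrative choice.

```python
QUANTUM_BIT_TIMES = 512  # one pause quantum, as described above


def pause_seconds(quanta: int, link_bps: float) -> float:
    """Convert a pause time in quanta to seconds for a given link speed."""
    return quanta * QUANTUM_BIT_TIMES / link_bps


# Maximum pause time on Gigabit Ethernet (1e9 bits per second).
print(f"{pause_seconds(0xFFFF, 1e9) * 1e3:.2f} ms")  # prints "33.55 ms"
```

Because the quantum is defined in bit times, the same value of 0xFFFF pauses a faster link for a proportionally shorter time.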
It’s important to note that if an additional PAUSE frame arrives before the pause time has expired from the prior PAUSE frame, its pause time parameter replaces the prior pause time; this is why a PAUSE frame with a value of ‘0’ for the pause time causes the data transmission to resume immediately.
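The replace-not-add behavior can be sketched as a tiny timer model in Python. This is only an illustration of the rule described above, not how any particular NIC or switch implements it; the class name `PauseTimer` and the use of explicit `now` timestamps in seconds are my own assumptions.

```python
class PauseTimer:
    """Toy model of a sender's pause timer; all times are in seconds."""

    def __init__(self, link_bps: float):
        self.link_bps = link_bps
        self.resume_at = 0.0  # sender may transmit once now >= resume_at

    def on_pause_frame(self, quanta: int, now: float) -> None:
        # The new pause time REPLACES whatever was left of the old one.
        self.resume_at = now + quanta * 512 / self.link_bps

    def may_transmit(self, now: float) -> bool:
        return now >= self.resume_at


timer = PauseTimer(link_bps=1e9)
timer.on_pause_frame(0xFFFF, now=0.0)   # pause for about 33.55 ms
print(timer.may_transmit(now=0.010))    # False: still paused at 10 ms
timer.on_pause_frame(0, now=0.010)      # pause time 0 overrides the timer
print(timer.may_transmit(now=0.010))    # True: transmission resumes at once
```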
The lab diagram below demonstrates how flow control, a mechanism that employs PAUSE frames to control packet loss, works under congestion conditions. The server has a 1 Gb NIC and is receiving traffic from two PCs, each also with a 1 Gb NIC, at a faster rate than it can handle.
If the server is congested and flow control is not configured on the switch, the traffic that PC 1 and PC 2 send to the server will simply be dropped. However, if I have flow control enabled and it is supported on all devices, the server will inform the switch via PAUSE frames to ‘pause transmission’ until notified to proceed. It is important to note, as shown above, that PAUSE frames are a direct-link mechanism: they do not propagate directly from link to link. Instead, while the switch is paused toward the server, frames arriving from the PCs start to build a queue in the switch, and once that queue reaches a certain threshold, the switch is forced to send a PAUSE frame to the PC to avoid dropping frames. By this mechanism, PAUSE frames are propagated indirectly.
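The indirect propagation described above can be modeled with a very small Python sketch. This is a toy model, not how the S50N or any real switch is implemented; the class `SwitchQueue`, its method names, and the threshold and capacity numbers are all hypothetical and chosen only to show the sequence of events.

```python
from collections import deque


class SwitchQueue:
    """Toy model of the switch queue feeding the server-facing port."""

    def __init__(self, pause_threshold: int, capacity: int):
        self.frames = deque()
        self.pause_threshold = pause_threshold  # hypothetical threshold
        self.capacity = capacity                # hypothetical buffer size
        self.egress_paused = False        # server has paused the switch
        self.pause_sent_upstream = False  # switch has paused the PCs
        self.dropped = 0

    def receive_pause_from_server(self) -> None:
        self.egress_paused = True

    def enqueue(self, frame) -> None:
        """A frame arrives from a PC."""
        if len(self.frames) >= self.capacity:
            self.dropped += 1
            return
        self.frames.append(frame)
        # Once the queue reaches the threshold, pause the upstream PCs.
        if len(self.frames) >= self.pause_threshold:
            self.pause_sent_upstream = True

    def dequeue(self):
        """Forward one frame to the server, unless the port is paused."""
        if self.egress_paused or not self.frames:
            return None
        return self.frames.popleft()


queue = SwitchQueue(pause_threshold=3, capacity=5)
queue.receive_pause_from_server()   # server sends PAUSE to the switch
for n in range(3):
    queue.enqueue(f"frame-{n}")     # PCs keep sending
print(queue.pause_sent_upstream)    # True: switch now pauses the PCs
print(queue.dropped)                # 0: nothing was dropped
```

The server's PAUSE frame never reaches the PCs; they are paused by a separate frame that the switch generates once its own queue fills.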
Since I am using a Dell Force10 S50N as my switch, below I show how to quickly configure flow control on the switch.
Finally, an important note to mention here is that the IEEE 802.3x flow control defined in 1997 and discussed here causes the entire link to pause traffic under congestion. This is not an ideal result for networks carrying multiple types of traffic with different priorities. It is for this reason that Quality of Service (QoS) fails to operate properly with this flow control/PAUSE frame mechanism.
Fortunately, the follow-on priority-based flow control (PFC), defined in the IEEE 802.1Qbb standard approved in 2011, provides a link-level flow control mechanism that can be controlled independently for each Class of Service (CoS), as defined by the IEEE P802.1p group. PFC will be the mechanism used as part of the data center bridging (DCB) protocol to ensure zero loss under congestion for the converged networks of the future. Switches employing DCB solutions will be at a minimum 10 GbE switches, such as the Dell Force10 S4810, that allow for the bandwidth requirements of large converged data center solutions. I will discuss DCB and PFC in greater detail in a future blog.