Distributed Core Network Architecture

By now you have probably heard about the importance of supporting ‘east-west’ traffic flow in large data center networks. So why does it matter, and what does it mean? And when Gartner claims that 80 percent of data center network traffic now travels from server to server, just what impact does this have on network design and architecture?

The fact is, nearly all of this has to do with the increased use of virtualization, cloud computing services, and distributed computing platforms such as Hadoop. Today, most cloud computing services and platforms like Hadoop generate a large volume of data transfer within the data center, from server to server within a cluster or between multiple clusters.

Take Hadoop as an example. This distributed computing platform takes enormous amounts of data, sometimes referred to as ‘Big Data’, and distributes it across possibly hundreds or even thousands of servers inside a data center to perform resource-intensive computations in parallel. To give you an idea of the amount of data we could be talking about here, Sameet Agarwal, a Director of Engineering at Facebook, was quoted last year in Technology Review magazine as stating that Facebook has one Hadoop store of more than 100 petabytes (that’s 100 million gigabytes).

This massive internal server-to-server communication in today’s data centers is what has been dubbed ‘east-west’ traffic. As mentioned, the majority of traffic in the data center today is server-to-server, or ‘east-west’, rather than the ‘north-south’ traffic that legacy architectures were built to support. This new traffic trend demands a new approach to network architecture. The traditional three-layer north-south hierarchy, with spanning tree protocol (STP) blocking redundant paths so that they serve only as backup, is no longer efficient for the kind of workloads seen in today’s large data center applications.

Let’s look at an example. The diagram below shows the classic 3-tier architecture, with spanning tree employed to provide redundancy while preventing network loops.

Traditional 3-Tier Network Architecture

As you can see above, half the ports and half the bandwidth are lost because STP blocks the redundant paths. To make matters worse, the core is composed of huge chassis systems that are power-hungry and, because of their size, don’t scale well. Additionally, this ‘north-south’ architecture is poorly suited to ‘east-west’ traffic flows. For instance, in a large data center, the majority of traffic has to travel all the way up to the core and back down to reach other servers.
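To make the bandwidth loss concrete, here is a minimal Python sketch. It assumes a simplified model that is not taken from the diagram: a single spanning tree instance, with each access switch having several equal-speed uplinks of which STP leaves only one forwarding. The function name and the example figures (two 10 GbE uplinks) are hypothetical, chosen only for illustration.

```python
def usable_uplink_gbps(uplinks: int, link_gbps: float, stp: bool) -> float:
    """Return the forwarding uplink bandwidth of one switch, in Gbps.

    Simplified model (an assumption for illustration):
    - with STP, only one uplink forwards and the others are blocked,
      held in reserve as backup paths;
    - without blocking (all links active), every uplink carries traffic.
    """
    if uplinks < 1:
        raise ValueError("a switch needs at least one uplink")
    active_links = 1 if stp else uplinks
    return active_links * link_gbps


if __name__ == "__main__":
    # Hypothetical access switch with two 10 GbE uplinks.
    print("STP, one uplink blocked:", usable_uplink_gbps(2, 10, stp=True), "Gbps")
    print("All uplinks forwarding: ", usable_uplink_gbps(2, 10, stp=False), "Gbps")
```

With two uplinks, the blocked design forwards on half the links, which matches the "half the ports and half the bandwidth" figure above.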

Now take a look at the distributed core architecture shown below and immediately you’ll notice some differences.

Distributed Core Network Architecture

As you can see above, the most noticeable differences are the more compact, flatter design and the absence of blocked ports. A design with full bandwidth and no blocked ports is obviously preferable to one where you lose half the ports and half the bandwidth. Further, the switches that make up the spine and leaf of the distributed core are not monster chassis systems but smaller, more scalable, and more cost-effective switches. Despite their size, these switches are purpose-built for distributed core networks and pack a powerful punch. Such is the case with the Dell Force10 Z9000, a 2 RU switch with 32 x 40 GbE ports that can be broken out via breakout cables into 128 x 10 GbE ports if desired.

Dell Force10 Z9000

This compact powerhouse can deliver 2.5 Tbps of switching capacity and can scale up to 160 Tbps in a spine-and-leaf architecture. At 128 ports of 10 GbE per switch, and with a maximum of 64 spines and 128 leafs, the distributed core fabric can have up to 8192 fabric ports! Large chassis systems are usually anywhere from 12 RU to 24 RU. At the high end, that is more than half of a standard 42 RU data center cabinet. Within 24 RU you could fit 12 Z9000s!
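The figures above can be reproduced with a short Python sketch. It rests on one assumption the article does not state explicitly: each leaf splits its 128 x 10 GbE ports evenly, 64 as uplinks to the spines and 64 as server-facing fabric ports. The function and constant names are hypothetical.

```python
Z9000_PORTS_10G = 128        # 32 x 40 GbE, each broken out into 4 x 10 GbE
Z9000_CAPACITY_TBPS = 2.5    # switching capacity of one Z9000
Z9000_HEIGHT_RU = 2          # rack units occupied by one Z9000


def fabric_size(spines: int, leaves: int) -> dict:
    """Estimate fabric ports and aggregate spine capacity.

    Assumption (not stated in the article): every leaf uses half of its
    10 GbE ports as uplinks and the other half as fabric (server) ports.
    """
    uplinks_per_leaf = Z9000_PORTS_10G // 2
    access_ports_per_leaf = Z9000_PORTS_10G - uplinks_per_leaf
    if spines > uplinks_per_leaf:
        raise ValueError("each leaf needs at least one uplink per spine")
    if leaves > Z9000_PORTS_10G:
        raise ValueError("each spine needs at least one port per leaf")
    return {
        "fabric_ports": leaves * access_ports_per_leaf,
        "spine_capacity_tbps": spines * Z9000_CAPACITY_TBPS,
    }


def switches_per_rack_space(rack_units: int) -> int:
    """Number of 2 RU switches that fit in the given rack space."""
    return rack_units // Z9000_HEIGHT_RU


if __name__ == "__main__":
    print(fabric_size(spines=64, leaves=128))
    print(switches_per_rack_space(24))
```

Under that even split, 128 leafs x 64 fabric ports gives 8192 ports, 64 spines x 2.5 Tbps gives 160 Tbps, and 24 RU holds 12 switches, in line with the numbers quoted above.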

Yes, you should be impressed, especially considering that the Z9000 uses only about 800 watts. Thanks to its low power requirements and small size, it scales more easily than a chassis-based system. Dell Force10 also has the lower-priced 1 RU S4810 switch, which can be used as a spine or leaf switch in a smaller-scale distributed core network. It has 48 x 10 GbE ports and 4 x 40 GbE ports (each of which can also be broken out into 4 x 10 GbE ports if desired).



3 Responses to “Distributed Core Network Architecture”

  1. bbooher says:

    I have of course heard this story from Force10 for a few years and it sounds great. But from what I’ve heard, they can only do this now with Equal-Cost Multi-Path (ECMP) routing, where each leaf and spine is a separate subnet. That would be a problem if your data center was composed mainly of VM hosts and guests where the VMs all need to have access to the same VLANs.

    Force10 talks about the imminent arrival of TRILL and apparently their silicon will support TRILL when the standard is finalized – if it ever is. But I’m concerned TRILL will never materialize as the world’s chasing OpenFlow now.



  2. Humair says:

    Hi Brian,

    The design of the network will definitely depend on individual customer needs. There are many possibilities to design a somewhat similar network with different goals in mind. Using the network setup in this blog as an example, it is not a requirement to use the leaf node as the ToR switch. To allow for more VM mobility, you can easily add another row of switches below the leaf layer within a L2 domain and employ Dell Force10 VLT technology which gives you L2 multipathing capability.

    Again, it will depend on the exact needs and how much VM mobility is required. Yes, TRILL has not seen widespread adoption in the industry just yet, but many vendors, including Dell, have vendor-specific L2 multipathing capabilities (e.g., Dell has VLT, Brocade has MCT, Cisco has MEC and vPC).

    So another design to accomplish more VM mobility with Dell Force10 switches could be to simply create a large L2 domain up to the core using VLT. Incredibly dense switches such as the Z9000 allow you to build very large L2 domains with technologies such as VLT. Also, a future option could be to use network overlay technologies such as VXLAN or NVGRE.


  3. Chi Paige says:

    Thanks for sharing this interesting post. Could you please reply with how many watts a chassis-based system requires? You mentioned above that the Z9000 consumes about 800 watts.
