- A TechNote on The Next Generation
- Jim Metzler
- Distinguished Research Fellow and Co-Founder
- Webtorials Analyst Division
For over a decade, data center LAN design has been rather staid. The majority of IT organizations have deployed a three-tier switched architecture comprising access, distribution and core switches. This approach came into common use because, until recently, the relatively low port densities of LAN switches made it impossible to support a large data center LAN with fewer than three tiers.
These traditional data center LANs were designed primarily to support client-server applications that generated "north-south" traffic moving between users' client devices and servers. In most cases, these applications were neither bandwidth-sensitive nor particularly delay-sensitive.
Other typical characteristics of these LANs included the following:
- The use of Ethernet on a best-effort basis (i.e., packets may be dropped when the network is busy)
- The application of policies such as quality of service (QoS) settings and access control lists (ACLs) based on physical ports
- A high oversubscription rate on uplinks
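To make the last characteristic concrete, the oversubscription rate of a switch can be estimated as its total server-facing capacity divided by its total uplink capacity. The sketch below is only an illustration: the function name and the port counts (48 x 1 Gbps server ports, 4 x 1 Gbps uplinks) are hypothetical and do not come from any specific product.

```python
def oversubscription_ratio(downlink_ports, downlink_gbps, uplink_ports, uplink_gbps):
    """Return total downlink capacity divided by total uplink capacity."""
    downlink_capacity = downlink_ports * downlink_gbps
    uplink_capacity = uplink_ports * uplink_gbps
    return downlink_capacity / uplink_capacity


# Hypothetical access switch: 48 x 1 Gbps server ports, 4 x 1 Gbps uplinks.
ratio = oversubscription_ratio(48, 1, 4, 1)
print(f"{ratio:.1f}:1")  # prints 12.0:1
```

A ratio this high is tolerable only when servers rarely transmit at line rate at the same time, which was a reasonable assumption for the client-server applications described above.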
Loop-Free, Redundant Topology for Early Bridged LANs
A critical component of data center LANs for many years has been the spanning tree protocol (STP), created by Radia Perlman and standardized in 1990 as IEEE 802.1D. In the early 1990s, a typical state-of-the-art LAN contained bridges used to connect shared 10Mbps LAN segments, a network architecture that seems prehistoric by today's standards. One of the design goals for STP was to ensure a loop-free topology for any bridged Ethernet LAN. Another was to allow for spare or redundant links that could provide an automatic backup path if an active link failed.
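The two design goals can be illustrated with a small sketch. It is not the real protocol: actual STP elects a root and chooses ports by exchanging BPDUs and comparing bridge IDs, path costs and port IDs. Here, as a simplifying assumption, the root is the lowest bridge ID, every link has equal cost, and the tree is built by a breadth-first search from the root. All names are hypothetical.

```python
from collections import deque


def spanning_tree(bridges, links):
    """Return (root, active_links, blocked_links) for an undirected bridged LAN.

    Simplification: equal link costs, root = lowest bridge ID, and ties
    follow breadth-first visiting order rather than real STP comparisons.
    """
    root = min(bridges)
    neighbors = {b: set() for b in bridges}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    active = set()
    visited = {root}
    queue = deque([root])
    while queue:
        bridge = queue.popleft()
        for peer in sorted(neighbors[bridge]):
            if peer not in visited:
                visited.add(peer)
                active.add(frozenset((bridge, peer)))
                queue.append(peer)

    # Every physical link that is not part of the tree is held in reserve.
    blocked = {frozenset(link) for link in links} - active
    return root, active, blocked


def show(link_set):
    """Format a set of links as a sorted list of (low, high) tuples."""
    return sorted(tuple(sorted(link)) for link in link_set)


# Three bridges wired in a triangle: one link must be blocked to avoid a loop.
root, active, blocked = spanning_tree([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
print("active:", show(active))    # active: [(1, 2), (1, 3)]
print("blocked:", show(blocked))  # blocked: [(2, 3)]

# If link 1-3 fails, recomputing the tree brings the spare link 2-3 into service.
_, active_after, blocked_after = spanning_tree([1, 2, 3], [(1, 2), (2, 3)])
print("active after failure:", show(active_after))  # active after failure: [(1, 2), (2, 3)]
```

The blocked link is what provides the automatic backup path, and it is also capacity that carries no traffic while the topology is healthy.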
Twenty years later, virtually all of the traditional data center LAN assumptions described above are being questioned. For example, modular data center switches are currently available with up to 768 non-blocking 10-gigabit Ethernet ports or 192 40-gigabit Ethernet ports. In addition, due to server virtualization, most data center traffic now travels between servers, running "east-west" rather than "north-south."
Major Limitations in Today's LAN Designs
While STP's design goals and functionality were appropriate two decades ago, the protocol has some significant limitations in the current environment. For one thing, after a failure or change in link status, STP can take seconds to reconfigure the network. A delay of that length, perfectly acceptable in most environments in the 1990s, is unacceptable in most scenarios today. An even bigger limitation of STP is that it allows for only a single active path between any two network nodes, which can severely limit the scalability of the LAN.
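Both limitations can be put into rough numbers. The sketch below assumes the default timers of the original IEEE 802.1D protocol (max age 20 seconds, forward delay 15 seconds); Rapid Spanning Tree and vendor tuning converge much faster, so treat these figures as the classic worst case rather than what any given network will see. The function names and the 4 x 10 Gbps uplink example are hypothetical.

```python
def classic_stp_convergence_seconds(max_age=20, forward_delay=15, direct_failure=False):
    """Estimate reconvergence time with classic 802.1D timers.

    A port moving to forwarding spends forward_delay in listening and again
    in learning. For an indirect failure, the bridge first waits max_age for
    the stored root information to expire.
    """
    listening_and_learning = 2 * forward_delay
    if direct_failure:
        return listening_and_learning
    return max_age + listening_and_learning


def usable_uplink_gbps(uplinks, gbps_per_uplink, multipath):
    """Return uplink capacity that actually forwards traffic.

    With STP only one uplink toward the root forwards; a multipath
    layer 2 protocol (assumed here) can use all of them.
    """
    active = uplinks if multipath else 1
    return active * gbps_per_uplink


print(classic_stp_convergence_seconds(direct_failure=True))  # 30
print(classic_stp_convergence_seconds())                     # 50
print(usable_uplink_gbps(4, 10, multipath=False))            # 10
print(usable_uplink_gbps(4, 10, multipath=True))             # 40
```

Under these assumptions, a switch with four 10 Gbps uplinks to different upstream switches forwards on only one of them with STP, which is the scalability limit the paragraph above describes.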
Alternatives Not Yet Ready for Prime Time
To determine the interest that IT organizations have in replacing STP, attendees at the October 2011 Interop trade show and conference were polled as to what would likely be the most common layer 2 Ethernet protocol in their data center LANs in two years' time. Possible answers included STP, emerging standards-based alternatives and some proprietary approaches.
Of 455 respondents, about 40% indicated that they would either retain STP or adopt an emerging alternative such as Transparent Interconnect of Lots of Links (TRILL) or Shortest Path Bridging (SPB). However, 60% of the survey respondents said they "don't know" what layer 2 Ethernet protocol they'll be using two years down the road.
It is highly unusual to see a majority of those surveyed respond in this way to a question about what can be considered a mainstream networking technology. IT organizations, it seems, are confused about how to proceed.
Is STP on borrowed time? The answer is yes, but it will take a long time before most IT organizations are ready to replace it. One reason is that, because most alternative technologies are still under development, many IT organizations remain uncertain about which of the alternatives they will implement.
Hi Jim,
Great article. STP has bothered us for too long already and cannot provide true resiliency for today's enterprises. Please let me point out that Avaya (ex-Nortel, ex-Bay Networks) did away with Spanning Tree back in 2001, when Switch Clustering was introduced. This is an IP-protected technology that is still open, in that dual-homed, active-active links to the virtualised core can be connected using any trunking protocol, such as MultiLink Trunking, EtherChannel or 802.3ad. This also works for servers and increases resiliency straight away. These networks have a deterministic failover time of less than a second. This immediately eliminated the need for a distribution layer (where it was there just to create resiliency), saving customers a lot of money while increasing uptime. The industry has followed that innovation, and others have come out with similar solutions.
Avaya is now the first to have a working SPB solution (in GA software and GA hardware) in the market today, and customers have already migrated and continue to migrate to it. The great value of it is that we can support it on our key platform without a hardware upgrade. On top of that, SPB can be rolled out alongside the current infrastructure (like ships in the night), and traffic/services can be migrated at the customer's pace. That makes it very easy to migrate in a controlled fashion. The little tweak Avaya provides in its SPB implementation is that it is switch-clustering aware, so two service nodes in SPB will know they run dual-homed, active-active connections outside the cloud. STP has been on borrowed time for a long time. Just choose the right vendor and you can get rid of it end to end today.