Let me start by saying that spanning tree is a Good Thing. It saves you from loops, which will completely shut down a network.
But it has to be configured properly to work properly. I can’t count the number of times I’ve had a client call me, desperate with a terribly broken network, and I’ve responded, “Sounds like a spanning tree problem.”
There are many ways things can go wrong with spanning tree. In this article, I’ve collected six of the recurring themes.
1. Not configuring spanning tree at all
As I said, spanning tree is a good thing. But for some reason, a lot of switch vendors disable it by default. So out of the box, you might have to enable the protocol.
Sometimes people deliberately disable spanning tree. The most common reason is that the original 802.1D Spanning Tree Protocol (STP) imposes a fairly lengthy wait from the time a port becomes electrically active to when it starts to pass traffic. This waiting period, typically 30 seconds with default timers (15 seconds each in the listening and learning states), is long enough that DHCP can give up trying to get an IP address for the new device.
One solution to the problem is to simply disable spanning tree on the switch. This is the wrong solution.
The right solution is to configure a feature called PortFast on Cisco switches. (Most switch vendors have a similar feature.) You configure the command “spanning-tree portfast” on all the ports connecting to end devices like workstations. They then automatically bypass the waiting period and DHCP works properly.
It’s important to only configure this command on ports that connect to end devices though. Ports connecting to other switches need to exchange spanning tree information.
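On Cisco switches, a minimal sketch looks like this (the interface range is a hypothetical example; substitute your actual host-facing ports):

```
! Enable PortFast on host-facing access ports only -- never on inter-switch links
interface range GigabitEthernet1/0/1 - 24
 spanning-tree portfast
exit
!
! Or enable it globally for all non-trunk ports in one command
spanning-tree portfast default
```

You can verify the result with "show spanning-tree interface GigabitEthernet1/0/1 portfast".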
2. Letting the network pick your root bridge
As the name suggests, spanning tree resolves loops in your network by creating a logical tree structure between the switches. One switch becomes the root of the tree, and is called the root bridge. All other switches then figure out the best path to get to the root bridge.
If there are multiple paths, then on each switch, spanning tree selects the best path and puts all the other ports into a blocking state. In this way, there’s a single path between any two devices on the network, although it might be rather circuitous.
Every switch taking part in spanning tree has a bridge priority. The switch with the lowest priority becomes the root bridge. If there’s a tie, then the switch with the lowest bridge ID number wins. The ID number is typically derived from a MAC address on the switch.
The problem is that, by default, every switch has the same priority value (32768). So if you don’t manually configure a better (lower) bridge priority value on a particular switch, the network will simply select a root for you. Then Murphy’s Law applies. The resulting root bridge could be some tiny edge switch with slow uplinks and limited backplane resources.
To make matters worse, a bad choice of root bridge can make the network less stable. If there’s a connectivity problem that takes any random switch off the network, spanning tree heals rather quickly. But if the root bridge goes down, or if the failure means that some switches no longer have a path to the root bridge, this constitutes a major topology change. A new root bridge needs to be selected. The entire network will freeze during this time and no packets can be forwarded.
I always recommend making the core switch the root bridge. I also like to select a backup root bridge. If there are dual redundant core switches, then one is the root bridge and the other becomes my backup.
Set the bridge priority on the primary root bridge to the best usable value, 4096, and the backup root bridge to the next best value, 8192. Why these funny numbers? The short version is that the low-order bits of the priority field are used for another purpose (they carry the VLAN ID as part of the extended system ID), so configurable priorities must be multiples of 4096.
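On Cisco switches running per-VLAN spanning tree, this is a one-liner on each core switch. A sketch, with the VLAN list as a hypothetical example:

```
! On the primary root bridge (core switch 1)
spanning-tree vlan 1-4094 priority 4096
!
! On the backup root bridge (core switch 2)
spanning-tree vlan 1-4094 priority 8192
```

Cisco also offers the "spanning-tree vlan 1-4094 root primary" macro, which computes a priority for you, but explicit values make the design easier to audit.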
3. Using legacy 802.1D
The first open standard for spanning tree is called 802.1D. It’s one of the earliest standards in the IEEE 802 series of standards that includes the specifications for every type of Ethernet and Wi-Fi as well as a bunch of other protocols. It works well despite its age, and you’ll find this type of spanning tree on just about every switch. Any switch that doesn’t support 802.1D is only useful in small isolated environments, and should never be connected to any other switches.
But there have been several important advancements to spanning tree since 802.1D. These improvements allow sub-second convergence following a link failure, as well as the ability to scale to larger networks and the ability to actually have different spanning tree topologies and different root bridges for different VLANs. So it makes a whole lot of sense to use them.
Most modern Cisco switches default to a protocol called Per-VLAN RSTP (Rapid PVST+), which applies the Rapid Spanning Tree Protocol separately per VLAN. It automatically operates a separate spanning tree domain, with a separate root bridge, on every VLAN. In practice, though, it's common to make the same switch the root bridge on all or most of the VLANs.
The rapid feature of RSTP is what you'll probably find most useful. It allows the network to recover from most failures in times on the order of 1 to 2 seconds. Multiple Spanning Tree, or MST, is similar to RSTP. The main difference is that you can designate groups of VLANs that are all part of the same tree structure with a single common root bridge. However, I recommend using Per-VLAN RSTP in most cases because it's easier to configure. I've also encountered interoperability problems with MST between different switch vendors.
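On a Cisco switch, selecting one of the modern modes is a single global command, repeated on every switch (mixing modes is covered in the next section):

```
! Rapid Per-VLAN spanning tree (the usual recommendation)
spanning-tree mode rapid-pvst
!
! Or Multiple Spanning Tree, if you need grouped VLAN instances
! spanning-tree mode mst
```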
4. Mixing spanning tree types
It should be pretty clear from the descriptions of 802.1D, RSTP, and MST in the previous section that mixing them could get messy. The RSTP and MST protocols have rules for how to deal with this mixing, and in general, it involves creating separate zones within the network for groups of switches running different flavors of spanning tree. This rarely results in the most efficient paths being selected between devices.
The only really valid reason to mix spanning tree types is to allow the inclusion of legacy equipment that doesn’t support the more modern protocols. As time goes by, there should be fewer and fewer of these legacy devices, and the number of places where it makes sense to mix the protocols should become smaller.
I recommend picking one, preferably RSTP or MST, and just using that in a consistent manner across all of your switches.
5. Using MST with pruned trunks
Because MST allows a single spanning tree structure that supports multiple VLANs, you need to be extremely careful about your inter-switch trunks.
I once had a client with a large complicated network involving many switches and many VLANs. They were running MST. For simplicity, they had designated a single MST instance, meaning that all VLANs were controlled by the same root bridge.
The problem for this client arose when they decided that certain VLANs should only exist on certain switches for security reasons. All perfectly reasonable. So they removed the VLAN from the main inter-switch trunks and added new special trunks just for these secure VLANs. And everything broke.
MST considered all VLANs to be part of the same tree, and it selected which trunks to block and which to forward based on that assumption. But in this case, because some VLANs were only present on some trunks and other VLANs were present on the other trunks, blocking a trunk meant only passing some of the VLANs. Blocking the other trunk meant only passing the other set of VLANs. For the blocked VLANs there was simply no path to the root bridge at all.
So, if you’re going to use MST, you need to either ensure that all VLANs are passed on all trunks, or you need to carefully and manually create different MST instances for each group of VLANs with special topological requirements. In other words, you have to do careful analysis and design the network properly. Or you could take the easy way out and run Per-VLAN RSTP.
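If you do go the MST route, the instance-to-VLAN mapping is done in a configuration submode. This sketch assumes two hypothetical VLAN groups; note that the region name, revision, and instance mappings must match exactly on every switch in the region:

```
spanning-tree mode mst
spanning-tree mst configuration
 name CAMPUS
 revision 1
 instance 1 vlan 10-99
 instance 2 vlan 100-199
exit
```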
6. Conflicting root bridge and HSRP/VRRP
Another common topological problem with spanning tree networks involves the way that Layer 2 and 3 redundancy mechanisms sometimes interact.
Suppose I have a network core consisting of two Layer 3 switches. On each segment I want these core switches to act as redundant default gateways. And I want to connect all of the downstream switches redundantly to both core switches and make spanning tree remove the loops.
In this scenario, the spanning tree root bridge for a particular VLAN might be on one of these core switches, and the active HSRP/VRRP default gateway on the other. Then an Ethernet frame originating on one of the downstream switches and destined for the default gateway needs to take an extra hop, going first to the root bridge, and then to the secondary core switch that currently owns the default gateway IP.
Normally this isn’t a problem, but imagine that I’m passing packets between two VLANs, both with Core Switch A as the root bridge and Core Switch B as the default gateway. Every packet must go up to Core Switch A, and cross the backbone link to get routed on Core Switch B.
Then it has to cross the backbone link again to go back to Core Switch A to be delivered to its destination. All of the return packets must also cross the backbone link twice. This creates a massive traffic burden on the backbone link where every packet in both directions must cross twice. It also incurs a latency penalty as every packet needs to be serialized and transmitted twice. Even on 10Gbps links, this will typically cost a couple of microseconds in both directions, which could add up for particularly sensitive applications.
Suppose instead that the default gateway was on the same switch as the root bridge. Now the packet goes up to the root bridge, Core Switch A, gets routed between the VLANs, and immediately switched out to the downstream device. It doesn’t cross the backbone at all in either direction.
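The fix is simply to make the same switch both the root bridge and the active HSRP gateway for each VLAN. A sketch for Core Switch A, using hypothetical VLAN and addressing values:

```
! Core Switch A: root bridge for VLAN 10...
spanning-tree vlan 10 priority 4096
!
! ...and also the active HSRP gateway for VLAN 10
interface Vlan10
 ip address 10.0.10.2 255.255.255.0
 standby 10 ip 10.0.10.1
 standby 10 priority 110
 standby 10 preempt
```

Core Switch B would get spanning tree priority 8192 and a lower standby priority (say, 100) for the same VLAN.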
Spanning tree is a terrifically important protocol. It allows us to build redundancy into inter-switch connections. It saves us from catastrophic loops when somebody accidentally connects things they shouldn’t.
It’s true spanning tree can be misconfigured with bad consequences, but this possibility shouldn’t discourage you from using it. The solution is to be careful and deliberate about your network design.
Great assist, Kevin!!! Tons of websites out there tell me how to set up spanning tree, and some convoluted sites explain why to and the risks of going without, but this one truly confirms my suspicion that my network admin definitely didn't set this up right. Thanks!
I happened to see this article from a coworker. Coming from a CCIE, this article contains many valid points and concepts that people forget, but it also leaves out many more, like root guard, BPDU guard, loop guard, UDLD, and the reasons some vendors disable it by default.
There are enough other technologies these days that one shouldn't be relying on spanning tree for its redundancy aspects within core/distribution infrastructure, or even distribution to the edge: multi-chassis EtherChannel, VSS, VLT, vPC, backup ports, and routing protocols, for example. If you are using technology that isn't compatible in any way with these protocols and you are trying to achieve critical redundancy, then you are mixing consumer-grade solutions with enterprise-grade solutions.
Spanning tree's main job these days is just to prevent outside factors from inadvertently tanking the network. However, even at this job it is still, say, only 80% effective. Loops can be caused simply by looping a local switch with itself and then connecting it into the upstream infrastructure. Upstream spanning tree won't detect it, because to it there is no loop in the upstream. Yet the broadcasts will still forward upstream, causing significant performance issues there. This is where storm control comes in. Arguably, storm control makes a case, in the right configuration, for disabling spanning tree even at the edge.
In the end, there are many cases where you can spend more time protecting a spanning tree implementation from being abused than the value you'll ever receive out of it. These are the reasons why spanning tree gets turned off by default by some vendors. There are simply better options, and one needs to be aware of how they intend to use it, the value they'll receive from it, and how to configure it properly anyway. It still has value on user-exposed edge ports, even with PortFast turned on, but you can easily get away with it disabled elsewhere with proper designs and feature implementations.
Again, this doesn’t mean I am running to turn it off or do away with it, but I definitely use it more to protect myself from end-users rather than anything else.
Hi,
Thanks for your excellent post, really great.
On your point 6. Can you please clarify and confirm if the data must always past through the root bridge? Is the root bridge only used to manage STP or all the user/data traffic also passes through the root bridge? I hope it’s clear what I am asking. Thanks.
That’s an excellent question. There is actually more subtlety to it than I implied in the blog post. In reality, each bridge has its own MAC table, which it uses to make forwarding decisions. If a bridge’s MAC table contains the destination MAC, then it forwards the packet according to that information. If it doesn’t, then it forwards the packet out its root port. This is true regardless of whether the source and destination devices are directly attached or many hops away across the spanning tree topology.
If it doesn't contain the destination MAC, does it broadcast the ARP in the whole VLAN to obtain the destination MAC? I'm just confused by your saying, “If it doesn’t, then it forwards the packet out its root port. This is true regardless of whether the source and destination devices are directly attached or many hops away across the spanning tree topology.”
Hi Chris,
You're right. If the MAC isn't in the MAC table, the switch will flood the packet to all ports in the VLAN (except the one it arrived on). Ultimately, the root bridge's MAC table will include all MACs.
– Kevin
Hello Kevin,
First of all, thank you for your post, it is really good!
I have a question regarding the priority values in your second point. Could you explain a little more about why we must use 4096 and 8192, and why not start from 0 for the root bridge and 4096 for the backup root?
Thanks!
Hi Bernardo,
We don’t use the priority value 0 because it indicates that the bridge must never become the root. It is considered by the Spanning Tree Protocol, in effect, to be infinity.
– Kevin
Spanning tree is considered a management protocol. The reason it is disabled by default is that it should really only be enabled on ports that are connected to other switches or that need management privileges. If it is on every port, then someone can relatively easily plug into your network and brute-force the VLAN domain or the configuration if you have not used strong passwords. The issue is that when you are talking about VoIP convergence, these Layer 2 management protocols have to be enabled. This is where the importance of Bridge Protocol Data Units (BPDUs) comes into play. This protocol allows us to limit which switches are actually members of the VTP domain.
Food for thought.
This is not true. In the case of Cisco switches, you should enable a feature called ‘bpduguard’ on ports connecting to hosts, along with ‘portfast’. So, portfast will allow the port to go into forwarding mode immediately. But bpduguard acts to protect the network from external loops, or adding ‘unauthorized’ switches. All ports send out bpdus, which hosts ignore. However, when a port with bpduguard configured receives a bpdu, it assumes that either a loop has been created, or that a switch has been connected to the port. When this happens, the port will immediately shut down, in error-disable mode, to protect the network.
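A minimal Cisco sketch of that pairing (the interface name is hypothetical):

```
interface GigabitEthernet1/0/5
 spanning-tree portfast
 spanning-tree bpduguard enable
!
! Or apply BPDU guard globally to every PortFast-enabled port
spanning-tree portfast bpduguard default
```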
Hello Kevin, Like everybody else, thanks so much for your article. It was clear, and you nailed points 2 and 6. I make my core switch, which is L3, the root bridge. But regarding point 6 (both core switches sharing a virtual IP), what about a complex, large ring topology with more than 20 switches managed by these two core switches? The main concern is what happens in the event that one of those 20 switches loses power, due to an outage, etc. Will the whole ring break, or should RSTP keep the ring alive? (All switches have the same speed, full duplex/1Gbps over fiber.) Thank you very much!
George.
Greetings
Hi George,
It's a little difficult to give a definite answer without looking at the network topology. The short answer is that as long as there is a path that connects every switch to the root bridge, and as long as the network doesn't violate the maximum diameter rule for the spanning tree implementation, it should be able to reconverge and continue working. The theoretical maximum diameter for RSTP is 40 bridges. If the failure results in any bridges being more than 40 hops apart, it will not be able to reconverge.
If the switch that fails is the root bridge, then the network must elect a new root bridge, which can take several seconds. After that, the network should work normally. But as I indicated in the article, the extra hops could cause latency problems.
– Kevin
Hi, in case I add a new switch to an existing STP topology, what will the impact be, and how should I plan to minimize it?
My thought is to increase the priority value on the new switch so that it never becomes the root bridge. Is that right?
That's right. You'll want to set the priority in such a way that the switch will not become the root bridge. You can also look at enabling Root Guard on the new switch, if it's capable, to prevent it from becoming the root bridge.
Thanks much
Kevin! Nice article. I have a question relating to your example where you have 2 core switches at 4096 and 8192. What bridge priority should the other distribution switches connecting to the core be? The distribution switches are configured similarly to the core, in that there are two of them and all hosts are bonded.
Also, should I increment the bridge priority as I add more distribution switches? I have done this and have reached the max bridge priority allowed. heh
I am thinking the distribution switches should all have the same priorities, say 12288 and 16384?
However, bigger issue: I screwed up and made core switch 1 bridge priority 0. Will need to fix.
Thank you!!!!!!!
Hi Eddie. The first goal is to ensure that, under normal conditions, the root is in the core. The second goal is to ensure that if that switch fails you have selected a successor that makes sense. In general, the other switches can be left with the default priority of 32768. The priorities of the two core switches are set so that one will always be the root bridge and the other will take over only if the first one fails. If you have a failure affecting both of the core switches, then the network is probably completely fragmented and there may not be much of an advantage to selecting the best root bridge for each of the isolated parts of the network.
Exception, if you have 3 or 4 core switches, you’ll probably want to include them in your scheme. But even here I probably wouldn’t worry too much about distribution switch priorities.
Hi Kevin, I'm struggling to find info on a LAN with multiple switches but no redundancy, running PVST+ spanning tree. If my root bridge gets removed, will I see uplink interfaces stop transmitting frames while a new root bridge is elected?
Cheers
The short answer is "probably." Certainly, if the root bridge is in between two other switches, then you'll lose connectivity between those devices. If instead the root bridge is at the end of a long daisy chain of switches, the most important question is how long the outage will be. In that case, I would actually recommend setting up an after-hours test and measuring it on your network.
Hi Kevin,
I have a query about point number 5. Let me explain my topology.
We have 2 core switches interconnected with a single Gigabit interface, and below the core, 2 L2 switches, both of which are connected to Core 1 and Core 2 with TenGig ports.
MSTP is configured with the default instance 0, and the following is the MST configuration on all 4 switches (2 core and 2 L2 switches):
#sh spanning-tree mst configuration
Name []
Revision 0 Instances configured 1
Instance Vlans mapped
——– ———————————————————————
0 1-4094
——————————————————————————-
The following L2 VLANs are present on the respective switches:
Core switches: 32,33,35,36,38,39,40,41,136,138,139,142,1000,2000 (one of the core switches has priority 0 for MST instance 0, so it is the root bridge)
First L2 switch: 32,33,35,40,41,142
Second L2 switch: 32,33,36,38,39,41,138,139,142
So my query is:
1. Am I supposed to pass all the unique VLANs on all the interconnected trunk ports (between the cores and between core and L2), even if they are not used on some of the switches?
2. Since the two core switches are connected via a Gigabit port, traffic from Core 2 to Core 1 is passing via one of the L2 switches over the TenGig ports. How can I resolve this issue? Should I manually configure a lower cost on the port connecting the cores, and if yes, how do I do it? I need help with the commands.
Hi Vinit,
You should map out what will happen in case one link or one switch fails. In your example, it doesn’t sound like traffic that originates on one of your L2 switches will ever traverse the other L2 switch before reaching the core. If the uplink to Core 1 fails, it will instead use the Core 2 path. Then, as long as the trunk between Core 1 and Core 2 carries all VLANs, you’re OK.
The guidance from the article is intended for more general cases where you could be forced to traverse a link that doesn’t carry the required VLAN. For example, consider a 4 switch network with 2 cores (CORE1 and CORE2) and 2 distribution switches (DIST1 and DIST2). If these 4 switches are connected in a square (CORE1 -> DIST1 -> DIST2 -> CORE2 -> CORE1) then there are two paths from DIST2 to CORE1: DIST2->CORE2->CORE1 and DIST2->DIST1->CORE1. In this case, the uplink from DIST1 -> CORE1 must contain all of the VLANs that DIST2 uses because there will be times when traffic must use that path. And, by extension, all of the trunks and all of the switches must include all VLANs, even the ones that are not directly required on that switch.
So, while it sounds like your network is probably OK right now, it would be very easy to get into trouble in the future if you add more switches and more links. If you have a single MSTP instance, I strongly recommend including all VLANs in all inter-switch trunks.
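On the second question, if the direct Gigabit core-to-core link is losing the cost comparison to the TenGig path through an L2 switch, you can manually lower the MST path cost on the inter-core interfaces. A sketch, with a hypothetical interface name and cost value, applied on both core switches:

```
! The Core1-Core2 link; pick a cost lower than the TenGig path's total
interface GigabitEthernet0/1
 spanning-tree mst 0 cost 2000
```

You can then confirm which ports are forwarding with "show spanning-tree mst 0".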
Hi Kevin,
Thank you for your post. I have a quick question about migrating the root bridge from an old switch running STP to a new core switch running RSTP.
Can you tell me how long convergence will take when I change the priority on the new switch? Will it be 2 seconds or up to 45 seconds?
Thanks again.
John
That’s a good question. The problem is not as simple as it sounds, though. All of the switches in your network will need to be converted from STP to RSTP. If you convert some switches then you’ll basically get a region of RSTP that borders a region of STP for backward compatibility. So the switches that are still running STP will learn about a new root bridge using STP, which will take the longer time, while the ones running RSTP will discover the new Root Bridge quickly.
I'll take STP over multi-chassis EtherChannel, VSS, VLT, vPC, backup ports, and routing protocols any day. Although those technologies would, of course, complement the switched network with an STP design.
1. Multi-chassis EtherChannel – reduces links from many to one, making it easier to set up STP.
2. VSS – still needs STP to connect to dist/access switches unless you're using something like Cisco fabric extenders ($$$$$).
3. VLT/vPC – see 2.
4. Backup ports – yes, STP handles backup ports. One could argue that backup ports are what STP is for.
5. Routing protocols – I assume that this is referring to running subnets on switches locally. The network vendors love this because they sell more Layer 3 switches and fewer Layer 2 ones. But technically, it is hard to support organizations that require the flexibility of having VLANs wherever they are needed. Layer 3 on access switches kills that mobility, which is why even the big orgs (50,000+) continue to use STP for spoke locations, some of them of considerable size and scope.
So STP is still a very significant contributor to productivity, flexibility, cost containment, and resilience in a switched network environment, if you're smart enough to configure it properly. Which seems to be the bigger hurdle for a lot of network engineers and wannabes.
Hi Kevin,
Great info.
An interesting thing I found on Extreme switches is that MSTP is enabled only on trunk ports (ports that connect to a distribution switch); access switch ports and user ports are not involved in MSTP at all. I believe Cisco switches are not configured like this. Cisco enables STP/RSTP/MSTP on all switch ports. Correct me if I'm wrong?
Also, tell me whether it's good practice to configure STP on only the ring ports.
Kevin, I think you’ve confused spanning tree priority with OSPF designated root priority. Priority 0 is valid and will be preferred by spanning tree. Give it a test in a lab environment and confirm for yourself.
I have 1 core switch and 20 access switches connected directly to it. I have given the core a bridge priority of 4096, and all the others are at the default of 32768. Do I need to assign a unique priority to each of the access switches? I don't have any redundant links, by design.