In QoS: Essentials, Part I, we discussed what QoS is, classifying/marking traffic, and trust boundaries. In Part II, we will get into the actual types of marking, do an overview of NBAR, and finally get into Congestion management/Queuing. Ready?
Types of Marking:
There are several different ways to mark, and each one is suited to a specific situation. For example, if you’re running QoS over Frame Relay, you’d use the Frame Relay DE (discard eligibility) bit, whereas if you’re using ATM, you’d opt for the CLP (cell loss priority) bit, etc. For now, we are simply going to discuss the different types. It is important to note ahead of time that some of these markings are used at Layer 2, and some at Layer 3. Here are the breakdowns:

- CoS (Class of Service): CoS is very common in a LAN environment, as it is marked at Layer 2. In the graphic below, we have an 802.1Q frame, with the PRI field (used for CoS) inside a 4-byte tag. The tag starts with the Tag Protocol ID (TPID), which will always be 0x8100 in order to identify the frame as an IEEE 802.1Q frame. Next we have the PRI field, which is 3 bits long (8 total values, 0-7), then the 1-bit CFI, and the 12-bit VLAN ID. The key part here is to remember that the PRI field is 3 bits, and can have 8 possible values. Thus our CoS values can be anywhere from zero up to seven. To give you some perspective on this, most Cisco IP phones will tag their voice traffic with a CoS of 5 by default, putting it into the critical category. This makes sense, since VoIP traffic is very sensitive to delay/jitter.

- DE bit: The Discard Eligibility bit is used in Frame Relay environments. Here’s the concept. The USPS mail guy does his usual mail run, and heads back to the office to pick up more mail. Upon arriving, he realizes he has entirely too much to take, so he takes only the unmarked pieces of mail, and leaves the ones with a red marking behind..or drops them. Essentially, all that happens with the DE bit is that you are telling nodes along the path that this frame *can* be dropped before others in times of network congestion, or when the router cannot handle all of the traffic. The other end does not have to act on this bit at all. If it does choose to, the frames with the DE bit set will be dropped before those without it.
- CLP (Cell Loss Priority): The CLP bit works the same as the DE bit in concept, except that it is used in ATM cells.
- IP Precedence: Now we’re talking about Layer 3. In 1981, the ToS byte was used to set a certain level of service for that packet. Inside the byte was IP Precedence (3 bits, the same as CoS), a ToS field (yes, a ToS field within a ToS byte, which was 4 bits), then the remaining 1 bit was unused. IP Precedence is fine, but DiffServ is quickly becoming standard, with engineers opting for DSCP marking, as it can be more granular. Instead of IP Precedence levels 0-7, you have 0-63 with DSCP. That being said, DSCP is backward compatible with IP Precedence..there are 8 DSCP values that map to IP Precedence values. If a network running IP Precedence receives a packet marked with DSCP, it will simply read the first 3 bits of the DSCP, which it thinks is just a regular IP Precedence mark. That’s another time and place, however!
- DSCP (Differentiated Services Code Point): The ToS byte has been redefined as the DS field, with the 6 most significant bits making up the DSCP value, and the last two bits being the ECN, or Explicit Congestion Notification, bits. As I said, DSCP is backward compatible with IP Precedence, so if a system that only understands IP Precedence receives an IP packet with a DSCP value, remember that it will only read the most significant 3 bits, and treat them as IP Precedence. With DSCP, you set the DSCP value, which in turn causes a DiffServ node to act in a certain way towards that packet..this is called Per-Hop Behavior. In a nutshell, the node reads the DSCP, recognizes that the packet is part of a group (or behavior aggregate..BA for short), and treats it the same way as the rest of the packets belonging to that BA.
- MPLS EXP: Ok, this one is kind of odd. Without diving into MPLS too deep, here is a breakdown. MPLS packets can be thought of as a regular IP packet with a 4-byte (or more) MPLS header placed in front of it. The MPLS header and IP packet are then encapsulated in a Layer 2 protocol, such as Ethernet, and sent. Because the MPLS header sits between the Layer 2 header and the Layer 3 packet, MPLS can almost be considered Layer 2.5. The MPLS header consists of only 4 fields: the label (which is basically like a color that is marked on the packet), the EXP bits (3 bits to be exact), the bottom-of-stack bit, and the TTL. The EXP bits give you the same range of values (0-7) as you have for CoS or IP Precedence.
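To make the bit layouts above concrete, here is a minimal Python sketch that pulls the marking fields out of raw header values. The function names are made up for illustration, and the field widths for the MPLS label (20 bits) and TTL (8 bits) come from the standard label stack entry layout rather than from the text above.

```python
def split_tos_byte(tos):
    """Split the 8-bit ToS/DS byte into DSCP, ECN, and the legacy precedence view."""
    dscp = tos >> 2          # 6 most significant bits
    ecn = tos & 0b11         # 2 least significant bits
    precedence = tos >> 5    # what a precedence-only device reads: top 3 bits
    return dscp, ecn, precedence


def split_8021q_tci(tci):
    """Split the 16 bits that follow the 0x8100 TPID in an 802.1Q tag."""
    pri = tci >> 13              # 3-bit PRI field, i.e. the CoS value
    cfi = (tci >> 12) & 0b1      # 1-bit CFI
    vlan_id = tci & 0x0FFF       # 12-bit VLAN ID
    return pri, cfi, vlan_id


def split_mpls_entry(entry):
    """Split one 32-bit MPLS header (label stack entry)."""
    label = entry >> 12              # 20-bit label
    exp = (entry >> 9) & 0b111       # 3-bit EXP
    bottom = (entry >> 8) & 0b1      # bottom-of-stack bit
    ttl = entry & 0xFF               # 8-bit TTL
    return label, exp, bottom, ttl


# DSCP 46 with ECN 0 is ToS byte 0xB8; an IP Precedence device sees 5.
print(split_tos_byte(0xB8))                 # (46, 0, 5)
# CoS 5 on VLAN 100.
print(split_8021q_tci((5 << 13) | 100))     # (5, 0, 100)
# Label 1000, EXP 5, bottom of stack, TTL 64.
print(split_mpls_entry((1000 << 12) | (5 << 9) | (1 << 8) | 64))  # (1000, 5, 1, 64)
```

Notice that the same value, 5, shows up as the CoS, the IP Precedence, and the EXP value. That is the 3-bit compatibility the bullets above describe.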
NBAR: Digging deep…
Prepare to be amazed! NBAR, also known as Network Based Application Recognition..is incredible. NBAR is a feature found in Cisco IOS that lets you check traffic statistics, run protocol discovery, and classify your traffic…for you! Let’s say you decide you want to implement QoS on your network. The first step is to identify traffic and requirements, right? Well, with NBAR, you can simply issue the following on the interface you wish to monitor:
SGTccie(config-if)#ip nbar protocol-discovery
In order to actually see the traffic statistics, we’d then issue the following command from enable mode:
SGTccie#show ip nbar protocol-discovery
It is worth mentioning that CEF is required to run NBAR. Also, the “show ip nbar protocol-discovery” command will show you all interfaces unless you add “interface X” after it. NBAR can also save you a lot of time, as you will see once we get to QoS configuration. The old way of doing things was to configure extended ACLs listing port numbers, IPs, and so on. Instead of “access-list 101 permit tcp any host 192.168.1.1 eq www”, we now use “match protocol http”. Nice!
Congestion Management/Queuing…waiting in line…
Ahh, congestion management. Running fiber everywhere along with 1 Gbps Ethernet everywhere is great..but congestion still happens. Why? Many reasons, really..poor QoS implementation (or none!), poorly designed networks, outdated equipment, etc..the list goes on and on. Generally, however, the point of congestion is almost always where traffic from multiple sources aggregates onto a single link. Picture 10 access-layer switches connecting to one distribution-layer switch, which only has a 100 Mbps link to the core. You could easily have 400 users’ traffic flowing to the core on that one link. Another scenario would be where you have a slow WAN link (pretty common!). Another way you could think of it is: congestion occurs when the rate of incoming traffic exceeds the rate of output. In English? When going from high speed interfaces down to low speed interfaces you are prone to congestion. It’s no different than a theatre filled with people trying to get out of two doors at once..they can only move so fast!
Queuing is a temporary form of congestion management. It will ease some issues with congestion, but the long-term fix is fairly obvious: getting more capacity. This is not always feasible, unfortunately. So what can we do? We can alter the order in which traffic leaves the node, so that the high-priority traffic (VoIP, critical applications, etc.) is sent first, and the low-priority traffic is what waits, or gets dropped, when the queues fill up. By default, however, you will experience FIFO (First In, First Out) on interfaces that are faster than 2.048 Mbps. Weighted Fair Queuing is used on interfaces slower than 2.048 Mbps by default..but we’ll get into that in a bit. Depicted below is the way FIFO software queuing works. It is key to mention that there is only one hardware queue..and it uses FIFO. When we discuss creating new queues, and assigning traffic to certain queues, we are discussing the software queue only. As you can see below, FIFO treats all traffic equally, meaning the sensitive VoIP traffic will have to wait in line behind the web traffic. Not ideal!
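To put a number on "waiting in line", here is a small Python sketch of a FIFO queue. The packet sizes and the 2.048 Mbps link are made-up example values, and the function name is hypothetical. It only counts serialization delay, which is the time it takes to clock the packets ahead of you onto the wire.

```python
from collections import deque


def fifo_wait_ms(queue, link_bps):
    """Milliseconds the LAST packet in a FIFO queue waits before it starts sending.

    queue holds (label, size_in_bytes) tuples, oldest first.
    """
    bytes_ahead = sum(size for _, size in list(queue)[:-1])
    return bytes_ahead * 8 / link_bps * 1000


# Ten full-size web packets arrive just before one small voice packet.
queue = deque([("web", 1500)] * 10 + [("voip", 200)])
print(round(fifo_wait_ms(queue, link_bps=2_048_000), 1))  # 58.6
```

Almost 59 ms of delay at a single hop is a big bite out of a voice call’s delay budget, which is why FIFO alone is not enough for delay-sensitive traffic.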
Priority Queuing
Priority Queuing, or PQ, consists of four queues: high, medium, normal, and low. By default, all packets will be assigned to the normal queue when using PQ. PQ is a pretty harsh queuing method, which generally leaves lower-priority queues starved. PQ works by always giving the high priority queue the right of way, so to speak. If there is something in the high queue, it is sent before any other traffic. Only when the high queue is empty will it check the medium queue and send one packet from there; it then goes straight back to the high queue and starts the cycle over. The normal queue is served only when high and medium are both empty, and the low queue only when all three above it are empty. What you get is the possibility of the queues below high not getting enough bandwidth, since the high queue is taking it all. The idea is almost right (treating the high priority traffic as such), but the implementation is a little off. Let’s look at some better options.
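The PQ scheduling logic can be sketched in a few lines of Python. This is a simplified model, not how IOS implements it; the packet labels and the `slots` parameter (how many packets the link can send) are invented for the example.

```python
from collections import deque

LEVELS = ("high", "medium", "normal", "low")


def pq_drain(queues, slots):
    """Send up to `slots` packets, restarting the scan at 'high' after every packet."""
    sent = []
    for _ in range(slots):
        for level in LEVELS:
            if queues[level]:
                sent.append(queues[level].popleft())
                break          # one packet sent, go back to the high queue
        else:
            break              # every queue is empty
    return sent


queues = {
    "high": deque(["voice-1", "voice-2", "voice-3"]),
    "medium": deque(["sql-1"]),
    "normal": deque(["web-1", "web-2"]),
    "low": deque(["backup-1"]),
}
print(pq_drain(queues, slots=4))
# ['voice-1', 'voice-2', 'voice-3', 'sql-1']
```

With only four transmit slots, the normal and low queues never get a turn. If the high queue keeps refilling, they never will. That is the starvation problem.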
Round Robin (RR)
Round robin is a stark contrast to Priority Queuing. The round robin process sends one packet from each queue in turn, dividing the bandwidth almost equally. This assumes the packets are roughly the same size, however. If one queue consistently has packets much larger than the rest, it will take more bandwidth than the rest. RR does a good job of dealing with queue starvation, but does not prioritize at all. It can also be somewhat unpredictable as to actual queue usage.
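Here is a minimal Python sketch of round robin that shows the packet-size problem. The queue names and packet sizes are example values only.

```python
from collections import deque


def round_robin(queues, rounds):
    """Send one packet per non-empty queue per round; return bytes sent per queue."""
    sent_bytes = {name: 0 for name in queues}
    for _ in range(rounds):
        for name, q in queues.items():
            if q:
                sent_bytes[name] += q.popleft()   # queue items are packet sizes
    return sent_bytes


queues = {
    "bulk": deque([1500] * 10),    # large file-transfer packets
    "voice": deque([200] * 10),    # small voice packets
}
print(round_robin(queues, rounds=10))
# {'bulk': 15000, 'voice': 2000}
```

Both queues sent exactly ten packets, so nobody starved, but the bulk queue ended up with 7.5 times the bandwidth.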
Weighted Round Robin (WRR)
WRR is a modification of RR, where each queue receives a weight, and as a result of the weight, receives that portion of the bandwidth. WRR allows you to prioritize to some degree, but can also be somewhat unpredictable, as some queues may use more bandwidth than planned (for example, when their packets are larger than the rest).
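A rough Python sketch of WRR follows, assuming the simplest packet-counting variant where a weight of 3 means "up to 3 packets per round". Real platforms may count bytes instead, and the queue names and sizes here are invented.

```python
from collections import deque


def weighted_round_robin(queues, weights, rounds):
    """Each round, a queue may send up to weights[name] packets. Returns bytes sent."""
    sent_bytes = {name: 0 for name in queues}
    for _ in range(rounds):
        for name, q in queues.items():
            for _ in range(weights[name]):
                if not q:
                    break
                sent_bytes[name] += q.popleft()   # queue items are packet sizes
    return sent_bytes


weights = {"gold": 3, "bronze": 1}

# Same packet size in both queues: bandwidth follows the 3:1 weights.
same = {"gold": deque([500] * 40), "bronze": deque([500] * 40)}
print(weighted_round_robin(same, weights, rounds=10))
# {'gold': 15000, 'bronze': 5000}

# Bronze sends 1500-byte packets: it now matches gold despite the lower weight.
mixed = {"gold": deque([500] * 40), "bronze": deque([1500] * 40)}
print(weighted_round_robin(mixed, weights, rounds=10))
# {'gold': 15000, 'bronze': 15000}
```

The second run is the unpredictability in action. The weights were 3:1, but the actual bandwidth split came out 1:1.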
Weighted Fair Queuing (WFQ)
As I mentioned before, WFQ is the default queuing method used on interfaces that are slower than 2.048 Mbps. WFQ is important to know because, as we’ll find later, it is implemented in both LLQ and CBWFQ..which are popular methods of queuing these days. WFQ is flow-based, meaning that each flow it sees is assigned to its own FIFO queue. A flow consists of packets that have the same source IP, destination IP, Layer 4 protocol (TCP/UDP), IP Precedence, and TCP/UDP source and destination ports. WFQ creates queues on the fly for each flow, so the number of queues can vary greatly.
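The flow classification step can be sketched in Python like this. It only shows how packets are sorted into per-flow FIFO queues, not the weighting WFQ applies when it services them. The `Packet` fields and addresses are made up, and a plain dictionary stands in for however the router actually tracks its dynamic queues.

```python
from collections import defaultdict, deque, namedtuple

Packet = namedtuple("Packet", "src dst proto sport dport precedence size")


def flow_key(pkt):
    """Everything that must match for two packets to belong to the same flow."""
    return (pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport, pkt.precedence)


def classify(packets):
    """Create one FIFO queue per flow, on the fly, as packets arrive."""
    queues = defaultdict(deque)
    for pkt in packets:
        queues[flow_key(pkt)].append(pkt)
    return queues


packets = [
    Packet("10.0.0.1", "10.0.0.9", "tcp", 40000, 80, 0, 1500),
    Packet("10.0.0.2", "10.0.0.9", "udp", 16384, 16384, 5, 200),
    Packet("10.0.0.1", "10.0.0.9", "tcp", 40000, 80, 0, 1500),
]
queues = classify(packets)
print(len(queues))                         # 2
print([len(q) for q in queues.values()])   # [2, 1]
```

Three packets, two flows, two queues. Add a thousand users and you can see how the queue count varies greatly.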
Class-Based Weighted Fair Queueing (CBWFQ)
CBWFQ divides traffic into classes (that are configured by the user), each of which is assigned its own queue. Although each queue can use more bandwidth than it is configured for, it can have a minimum bandwidth guarantee, so that even in times of congestion, it will get that amount of bandwidth. CBWFQ can create up to 64 queues, with each one being a FIFO queue. It is worth noting that you can configure the class-default queue to be a WFQ. The class-default queue is used for all undefined traffic. Bear in mind that while CBWFQ functions with WFQ as a whole, once the traffic has been divided up into its respective queues, those queues are FIFO. Think of it like this: you are sending traffic into separate lines based on preference, but once in that line, all packets are considered equal. CBWFQ is a big improvement over previous queuing methods; however, it still falls short as it relates to voice or video applications. You’ll note that CBWFQ provides no method of identifying a priority queue..this can hurt applications sensitive to delay. To solve these issues we move on to LLQ!
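To illustrate the "minimum guarantee, but can use more" idea, here is a rough Python model of how bandwidth might be divided on a congested link. It is a sketch under assumptions, not the IOS algorithm: it assumes spare bandwidth is handed out in proportion to each class’s configured guarantee, and the class names and kbps figures are invented.

```python
def cbwfq_shares(link_kbps, guarantees, demands):
    """Return kbps allocated per class, given minimum guarantees and offered load."""
    # Step 1: every class gets what it asks for, up to its guarantee.
    alloc = {c: min(demands[c], guarantees[c]) for c in guarantees}
    spare = link_kbps - sum(alloc.values())
    hungry = {c for c in guarantees if demands[c] > alloc[c]}

    # Step 2: share what is left among classes that still want more
    # (assumption: in proportion to their configured guarantees).
    while spare > 1e-9 and hungry:
        total_weight = sum(guarantees[c] for c in hungry)
        if total_weight == 0:
            break
        handed_out = 0.0
        for c in list(hungry):
            offer = spare * guarantees[c] / total_weight
            take = min(offer, demands[c] - alloc[c])
            alloc[c] += take
            handed_out += take
            if demands[c] - alloc[c] <= 1e-9:
                hungry.discard(c)
        spare -= handed_out
    return alloc


guarantees = {"signaling": 100, "business": 500, "class-default": 400}
demands = {"signaling": 50, "business": 900, "class-default": 900}
shares = cbwfq_shares(1000, guarantees, demands)
print({c: round(v) for c, v in shares.items()})
# {'signaling': 50, 'business': 528, 'class-default': 422}
```

The signaling class only needed 50 kbps, so the other two classes picked up its unused 50 kbps on top of their own guarantees. Nobody dropped below their minimum even though the link was oversubscribed.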
Low Latency Queuing (LLQ)
At this point we can agree that what we need is a queuing method that will give priority to delay-sensitive traffic, but at the same time not leave all other queues starved for bandwidth. Do you remember the issue with priority queuing? PQ gives priority to one queue, which is great, but leaves the other queues starved in times of congestion. WFQ is good, as it doesn’t leave flows starved, but it also provides no guarantee to any particular queue. LLQ solves these issues. LLQ is essentially CBWFQ with at least one strict-priority queue. What does this mean? It means one queue receives priority, but that queue is policed, meaning that in times of congestion it cannot use more bandwidth than is configured.
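Here is a simplified Python model of one scheduling interval on a congested link with LLQ. Everything about it is an assumption made for illustration: the byte budgets, the packet labels, dropping as the policing action, and plain round robin standing in for the CBWFQ scheduling of the non-priority classes.

```python
from collections import deque


def llq_interval(priority_q, class_qs, link_bytes, priority_limit):
    """Serve the policed priority queue first, then the remaining class queues.

    Packets are (label, size_in_bytes) tuples. Assumes priority_limit <= link_bytes.
    """
    sent, dropped = [], []
    used = 0
    # 1. Strict priority, but policed: traffic above the limit is dropped.
    while priority_q:
        pkt = priority_q.popleft()
        if used + pkt[1] <= priority_limit:
            sent.append(pkt)
            used += pkt[1]
        else:
            dropped.append(pkt)
    # 2. Whatever the priority queue left over goes to the other classes.
    remaining = link_bytes - used
    progress = True
    while progress:
        progress = False
        for q in class_qs.values():
            if q and q[0][1] <= remaining:
                pkt = q.popleft()
                sent.append(pkt)
                remaining -= pkt[1]
                progress = True
    return sent, dropped


priority_q = deque([("voice", 200)] * 5)
class_qs = {
    "web": deque([("web", 1500)] * 3),
    "mail": deque([("mail", 1500)] * 2),
}
sent, dropped = llq_interval(priority_q, class_qs, link_bytes=5000, priority_limit=600)
print([label for label, _ in sent])   # ['voice', 'voice', 'voice', 'web', 'mail']
print(len(dropped))                   # 2
```

Voice went out first, but because it was capped at 600 bytes for the interval, web and mail still got served. Compare that to plain PQ, where an overloaded high queue would have taken everything.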
Ahhh..sigh of relief!
Here we are, at the end of Part II! As you have noticed by now, QoS can definitely be daunting, but if you take the time to tackle the theory behind it, it really isn’t that difficult. The difficulty (for me at least), has always been in the theory as opposed to the implementation! In Part III we will discuss Traffic shaping/policing, link efficiency, and congestion avoidance. Look forward to seeing you!