

2.4.2.5 CAR queuing discipline  

As its name tells us, CAR (Committed Access Rate) is a router's commitment to forward traffic up to some predefined throughput under some predefined conditions, and to forward traffic above this level under another, almost surely different, condition.
Cisco is not so generous in explaining CAR's inner workings (a diagram would help a lot), but reading their documents you can deduce that CAR is not a simple, mortal queuing discipline; it is instead a rather sophisticated scheme in which several specialized devices work together to get the job done. As they themselves say: "CAR is a multifaceted feature that implement both classification service and policing through rate limiting". CAR also has marking capabilities, offering us the possibility of building a partially compliant Differentiated Services architecture.
To do all this it must implement these devices:
  • Classifier: an entity which selects packets based on the content of packet headers according to defined rules; a multi-field (MF) classifier selects packets based on the content of some arbitrary number of header fields; typically some combination of source address, destination address, DS field, protocol ID, source port and destination port.
  • Meter: metering is the process of measuring the temporal properties (e.g., rate) of a traffic stream selected by a classifier. The instantaneous state of this process may be used to affect the operation of a marker, shaper, or dropper, and/or may be used for accounting and measurement purposes; a meter is a device that performs metering.
  • Dropper: dropping is the process of discarding packets based on specified rules; a dropper is a device that performs dropping.

  • Policer: policing is the process of discarding packets -by a dropper- within a traffic stream in accordance with the state of a corresponding meter enforcing a traffic profile; a policer is a device that performs policing.
  • Marker: marking is the process of setting the DS codepoint in a packet based on defined rules; a marker is a device that performs marking (see note 2 below).
Also, according to Cisco documents, CAR implements some kind of probabilistic mechanism to evaluate bursts of packets and decide whether they conform to some predefined requirements.
Notes:
  1. Definitions in italics are taken from [3]; see References at the end of this document.
  2. Cisco CAR does not set the DS codepoint; just the TOS-byte precedence bits.
As I said above, Cisco's explanation is a little bland, but picking through the documents we can build an idea of what we are dealing with and, more importantly, of how to use CAR for implementing QoS services. Reading Cisco we get these premises about CAR:
  1. CAR can only be used with IP traffic. It does not work with Appletalk, IPX, SNA, DECNet.
  2. Using CAR we can limit the input or output transmission rate on an interface or subinterface based on a flexible set of criteria.
  3. Selection criteria can be: 
    1. Incoming interface. 
    2. IP precedence. 
    3. IP access lists (standard or extended).
    4. MAC address. 
    5. TOS byte. 
    6. Packet direction, incoming or outgoing.
  4. CAR works by policies. A policy is a combination of factors:
    1. The definition of what is known as in-profile and out-of-profile flows or, using Cisco terminology, flows that conform and flows that do not conform to a requirement. When a flow conforms to the requirement we say it is an in-profile flow; when it does not, we say it is an out-of-profile flow. CAR treats flows going through it differently depending on whether they are in-profile or out-of-profile.
    2. An average or committed rate which determines the long-term average of the transmission rate. Flows whose average throughput is below or equal to this predefined level are in-profile; otherwise they are out-of-profile.
    3. A normal burst size which determines how large a traffic burst can be before some traffic is considered to be out-of-profile.
    4. An excess burst size such that traffic that falls between the normal burst size and the excess burst size is considered to be out-of-profile with a probability that increases as the burst size increases.
    5. An action which establishes how the flow will be treated when it is in-profile and how it will be treated when it is out-of-profile.
  5. Each interface can have multiple CAR policies corresponding to different types of traffic. Each policy can be independent or, alternatively, policies can be cascaded: a flow can be compared to different policies in succession. Policies are checked in the same order they were created.
  6. The actions that can be taken, on either conform or exceed, are as follows:
    1. continue, that means, evaluate the next policy. 
    2. drop, that means, just stop and drop the packet. 
    3. set-prec-continue, that means, set the packet header IP precedence field and continue evaluating the next policy. 
    4. set-prec-transmit, that means, set the packet header IP precedence field and forward the packet. 
    5. transmit, that means, forward the packet. 
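The way cascaded policies and their actions interact can be sketched in a few lines of Python. This is only an illustration of the rules listed above, not Cisco's implementation: the `ByteBudget` meter is a deliberately crude stand-in (a fixed byte allowance with no refill and no burst handling), and the names `Policy`, `ByteBudget` and `evaluate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ByteBudget:
    """Stand-in meter (assumption): a fixed byte allowance that is never refilled."""

    def __init__(self, allowance: int) -> None:
        self.remaining = allowance

    def conforms(self, size: int) -> bool:
        # A packet conforms while the remaining allowance can still cover it.
        if size <= self.remaining:
            self.remaining -= size
            return True
        return False


@dataclass
class Policy:
    matches: Callable[[dict], bool]   # classifier: does this policy apply to the packet?
    meter: ByteBudget                 # decides conform versus exceed
    conform: str                      # action for in-profile packets
    exceed: str                       # action for out-of-profile packets
    precedence: Optional[int] = None  # used only by the set-prec-* actions


def evaluate(packet: dict, policies: list) -> str:
    """Return "transmit" or "drop" for one packet, marking it on the way if asked."""
    for policy in policies:
        if not policy.matches(packet):
            continue  # policy does not apply, try the next one
        in_profile = policy.meter.conforms(packet["size"])
        action = policy.conform if in_profile else policy.exceed
        if action.startswith("set-prec"):
            packet["precedence"] = policy.precedence
        if action in ("transmit", "set-prec-transmit"):
            return "transmit"
        if action == "drop":
            return "drop"
        # "continue" and "set-prec-continue" fall through to the next policy
    return "transmit"  # no policy decided, so the default action applies
```

With a first policy whose exceed action is `continue` and a catch-all second policy whose exceed action is `drop`, a packet that exceeds the first policy gets a second chance under the catch-all before it can be dropped.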
  Well, later on we will see some examples to make all this information clearer. CAR is one of Cisco's powerful weapons for implementing QoS services. We can use it to limit the transmission rate on an interface, which is exactly what we want for controlling the behavior of flows interacting with our domain. Better yet, we can use a flexible set of criteria to apply our rules. CAR can then be used as a "guard" to control and protect our domain frontiers. Also, by using its set-prec-continue and set-prec-transmit marking actions, a Differentiated Services architecture could be implemented.
Okay, fellows. Too many words in this dissertation. Let's talk less and do more by presenting two selected examples using CAR: the first uses the simple network we know well from previous examples; the second is a first attempt at converting our domain into a Differentiated Services domain.
For our first example we will use the same scheme we used before when studying CQ. Have a look at figure #8 and table #3 earlier in this document. We will repeat the router RT1 configuration, but using CAR this time. We have to rebuild our table to calculate actual throughput based on an E1 speed of 2.048 Mbps and the bandwidth shares indicated in table #3.
We will also calculate the excess burst based on a maximum desired latency of 50 ms, and the normal burst such that the excess burst is 150% of the normal burst. There's nothing scientific behind these calculations; they are just a rule of thumb that comes from searching Cisco documentation plus some experience and common sense. Our new table will be:

For example, the calculation for the SQL server would be (taking 1 Mbps as 1024 * 1024 bps):
SQL server throughput = 0.30 * 2.048 Mbps (E1) * 1024 * 1024 ~ 644245 bps.
SQL server excess burst = (644245 bps / 8) * 0.050 s (50 ms latency) ~ 4000 bytes.
SQL server normal burst = 4000 bytes / 1.5 ~ 2600 bytes.
Bursts could be increased to allow bursty flows if you want; it's just a matter of running some live tests to find your best choice.
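The same rule of thumb can be written as a small Python helper, which is handy for filling in the rest of the table. The function name `car_parameters` and the constants are hypothetical; the formula is just the one used for the SQL server above, with the same 1024 * 1024 convention for the E1 speed. Note that the unrounded results for a 30% share are roughly 4027 and 2684 bytes; the text rounds them down to convenient figures.

```python
E1_BPS = 2.048 * 1024 * 1024   # link speed, using the text's 1024 * 1024 convention
LATENCY_S = 0.050              # maximum desired latency (50 ms)
EXCESS_TO_NORMAL = 1.5         # excess burst is 150% of normal burst


def car_parameters(share: float) -> tuple:
    """Return (rate in bps, normal burst in bytes, excess burst in bytes) for a bandwidth share."""
    rate_bps = share * E1_BPS
    excess_burst = rate_bps / 8 * LATENCY_S       # bytes that fit in one latency window
    normal_burst = excess_burst / EXCESS_TO_NORMAL
    return rate_bps, normal_burst, excess_burst


rate, normal, excess = car_parameters(0.30)       # SQL server share
print(int(rate), round(normal), round(excess))
```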
Next we continue with the access list commands:

Access lists 10, 20 and 30 permit access to the SQL, MAIL and FTP servers respectively. Access list 40 denies SQL server access; we will see later why we need this rule. Access lists 100 and 110 permit access to the WWW and DNS servers respectively.
To configure CAR we first have to enter interface configuration mode for the serial interface we want to apply CAR to; then:

And next we configure the different CAR policies; some comments to clarify what we did are given below (the router prompt shown is not what you would really see in interface configuration mode, but it doesn't matter; also, each command should be typed on a single line):

Okay. How does this work? The first policy to be checked is for SQL server packets. This rule admits SQL traffic up to 644245 bps. Above this, the action is not to drop packets but to continue with the next policy. This way SQL server traffic is not really limited to 644245 bps; its packets have another opportunity to survive when the basic rate (30% of bandwidth) is surpassed.
The next rule is for MAIL traffic. This time we don't want any traffic above 322122 bps; excess traffic is dropped. The same goes for the FTP server: its maximum throughput is limited to 322122 bps.
The WWW server policy comes next. Because we are matching flows by TCP port 80, the maximum throughput includes Internet traffic through the Linux router. Traffic is limited to 429496 bps, but a second opportunity is given to these flows using the continue action.
Now it is the turn of the DNS server. Again the traffic is limited, this time to 214748 bps (perhaps too much for this type of service; it doesn't matter, adjust yours as you like). Any excess is dropped.
Now it is the turn of the SQL server deny access list; this policy permits any kind of traffic except SQL server traffic, up to 214748 bps. Above this limit the traffic is invited to check the next policy using the continue action. "Other" traffic escapes by this path, and so does some TCP 80 traffic, but together they are limited to 214748 bps.
The next policy saves additional packets from our most important traffic, the SQL server's. This time traffic from this server is permitted to flow up to 1288490 bps, which is the most we admit. Above this the traffic is dropped.
There are no more rules and the default action is to transmit, so the rest of the traffic that reaches this stage is transmitted as long as enough bandwidth is still available.
To understand CAR behavior it is very interesting to work out some traffic combinations and confront them with our CAR beast to see what happens. Let's suppose, then, this traffic combination:

Expected throughput is higher than available throughput. To estimate a distribution we have to assume some sharing between TCP and UDP flows when they are competing for bandwidth. Generally, when both sources generate the same throughput, UDP is stronger than TCP because it does not implement a congestion avoidance mechanism to throttle itself when some packets are dropped. UDP tries to maintain its throughput. TCP instead adjusts its throughput to network conditions, trying to avoid congestion. So, when competing for bandwidth, UDP flows starve TCP flows.
But don't believe TCP is harmless. It is constantly probing the network to get more bandwidth from it. In the end, when available bandwidth is less than required, the fight between the protocols is hard and both of them will probably oscillate. When TCP tries to get more bandwidth its throughput curve goes up and, the total bandwidth being limited, the UDP throughput curve goes down. When some TCP packet is dropped, its congestion control mechanism fires, reducing the flow's throughput. Then its throughput curve goes down and UDP takes over, gaining bandwidth at TCP's expense. But next TCP is probably in slow-start state, insisting and probing the network again for more bandwidth, and the process is repeated again and again.
If you want to see a graph of this fight, have a look at the next figure, a screenshot taken on my Linux box. The output is from the ns-2 software (Network Simulator 2) and it is plotted with xgraph. Here I'm simulating a 1.7 Mbps WAN link where one TCP flow (red) and one UDP flow (green) are competing for the available bandwidth. Initially the UDP flow (CBR source) is started at time 0.1 seconds with a rate of 0.45 Mbps. Observe that the green flow is very steady. The problem begins when the TCP flow (FTP source) is started at time 1.0 seconds. TCP fights to grab the available bandwidth and both flows begin to oscillate. If the UDP flow rate were higher (1.2-1.5 Mbps), the TCP flow would be in really big trouble. It could be completely starved. At time 3.0 seconds the TCP flow ends and the UDP flow gets back to its normal steady-state behavior.
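The sawtooth that TCP draws in such a plot can be imitated with a toy model in Python. This is not ns-2 and not a faithful TCP: it only assumes additive increase and multiplicative decrease (halving) of the TCP rate, and it holds the UDP source at a constant offered rate instead of modelling what UDP actually gets delivered. The function name `simulate_aimd` and the 50 kbps step are arbitrary choices made for illustration.

```python
def simulate_aimd(capacity_bps: float, udp_bps: float, ticks: int,
                  increase_bps: float = 50_000) -> list:
    """Return the TCP sending rate at each tick on a link shared with a constant UDP source."""
    tcp_bps = increase_bps
    history = []
    for _ in range(ticks):
        if tcp_bps + udp_bps > capacity_bps:
            tcp_bps /= 2             # loss detected: multiplicative decrease
        else:
            tcp_bps += increase_bps  # no loss: keep probing for more bandwidth
        history.append(tcp_bps)
    return history


# Same link and UDP rate as in the simulation described above.
rates = simulate_aimd(capacity_bps=1_700_000, udp_bps=450_000, ticks=100)
print(max(rates), min(rates[30:]))
```

Raising `udp_bps` towards the link capacity shrinks the room TCP oscillates in, which is the starvation effect described above.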

Anyway, to have an answer to our question we will assume that when TCP and UDP are competing for bandwidth in our domain their share will be 30% for TCP and 70% for UDP. Again, we are assuming this just to have an answer; I'm not saying that this will be the real final situation.
Distributing our flows we get the next table:

Policies act as follows:
  • Policy 1 matches 644245 bps of SQL traffic. 
  • Policy 2 matches 189000 bps corresponding to the total of MAIL traffic.
  • Policy 3 matches 322122 bps of FTP traffic. Above this all FTP traffic is dropped.
  • Policy 4 matches 429496 bps of WWW traffic.
  • Policy 5 matches 96000 bps corresponding to the total of DNS traffic.
  • Policy 6 matches 214748 bps of everything except SQL traffic. Because MAIL, FTP and DNS traffic is already dealt with (MAIL and DNS are exhausted and the FTP excess is dropped), this throughput is distributed 70% to UDP and 30% to TCP traffic (as we assumed above). We also assume that the TCP share is split equally between WWW and Other-TCP traffic.
  • Policy 7 matches 245755 bps corresponding to the rest of the SQL traffic.
  • After Policy 7 is matched, as little as 7117 bps is left, which is distributed between Other-UDP traffic (70%), some WWW traffic (15%) and Other-TCP traffic (15%).
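The 70/30 split used in the last bullets is easy to reproduce. The short Python sketch below applies it to the 214748 bps of Policy 6 and to the 7117 bps left at the end; the function name `split_leftover` is hypothetical and the shares are the assumptions made above, not measured values.

```python
def split_leftover(bps: float, udp_share: float = 0.70) -> dict:
    """Split leftover bandwidth: udp_share to Other-UDP, the rest halved between WWW and Other-TCP."""
    tcp_bps = bps * (1 - udp_share)
    return {
        "Other-UDP": bps * udp_share,
        "WWW": tcp_bps / 2,
        "Other-TCP": tcp_bps / 2,
    }


for leftover in (214748, 7117):
    print(leftover, split_leftover(leftover))
```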
The last two columns of our table show the total bandwidth consumption for each type of traffic and how much of the total traffic demanded is actually fulfilled. Our class-1 SQL traffic is satisfied 100%.
I think it would be a very interesting exercise for the reader to do this:
  • Repeat the process of creating distribution tables for different scenarios, that is, different combinations of expected throughput.
  • For some of these scenarios check how to improve the response for a chosen type of traffic. Select a scenario and a type of traffic (for example, SQL) and try changing the CAR rules to see what happens.
  • Choosing any of the scenarios, try to create two additional tables for the case where a FIFO or a WFQ queuing discipline is implemented on the router instead of CAR.
 

Okay, friends, let's continue with our second goal. As I said before, now it's time to start our first attempt at converting our domain into a Differentiated Services domain. How do we do that? Let's begin by drawing the domain again:

Very nice figure. Colored routers RT1, RT2, RT3 and RT4 are called "edge routers" because they are located at the domain frontiers and connect it to the rest of the world. Internal routers are called "core routers" and are represented in the figure by gray dots tied together by internal links. Inside the domain there are many networks to which end users are connected.
First we have to list the services we want to prioritize. Selection will be made by port number because we don't want to depend on IP addresses that could change. Our prioritized services table will be as follows:

Well, among other services that will be treated as best-effort (priority 0), we have some we want to treat better. Microsoft SQL Server and IBM DB2 Server services will have the highest priority (number 5). Interactive services such as SSH, TELNET and WWW are assigned priority 4. The Windows Name Server (WINS) will have priority 3. E-mail entering our domain through POP3 will be treated with priority 2. And finally, e-mail through SMTP and the FTP service will be assigned priority 1.
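As a quick illustration of classifying by port, the priorities just listed can be expressed as a lookup table in Python. The port numbers below are assumptions taken from the IANA well-known and registered port assignments (only DB2's port 523 is mentioned explicitly in this text), and the names `PRECEDENCE_BY_TCP_PORT` and `precedence_for` are hypothetical.

```python
# Assumed IANA port numbers mapped to the IP precedence chosen in the text.
PRECEDENCE_BY_TCP_PORT = {
    1433: 5,  # Microsoft SQL Server
    523: 5,   # IBM DB2
    22: 4,    # SSH
    23: 4,    # TELNET
    80: 4,    # WWW
    1512: 3,  # WINS
    110: 2,   # POP3
    25: 1,    # SMTP
    21: 1,    # FTP (control connection)
}


def precedence_for(port: int) -> int:
    """Return the precedence for a TCP port; anything unlisted is best-effort (0)."""
    return PRECEDENCE_BY_TCP_PORT.get(port, 0)


print(precedence_for(1433), precedence_for(80), precedence_for(6000))
```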
Precedence values are defined in RFC 791 and are listed in this figure I took from Cisco:

Priorities 6 and 7 are reserved for network control information and should not be used.
We want to do something simple that works. Our first decision will be to configure WFQ queues in every core router; something really very easy. We know that WFQ is a very friendly, automatic queue that, being precedence-aware, will prioritize the traffic within the domain based on each packet's IP precedence bits. The higher the IP precedence field, the higher the priority given to that traffic when it is forwarded. We are going to take advantage of this fact.
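To get a feeling for how much precedence matters, here is a small Python sketch. It assumes the weighting described in Cisco's WFQ documentation, where a flow's share of the link is proportional to its precedence plus one; that formula is not stated in this text, so treat it as an assumption, and the function name `wfq_shares` as hypothetical.

```python
from fractions import Fraction


def wfq_shares(precedences) -> list:
    """Return each flow's fraction of the link, assuming share is proportional to precedence + 1."""
    weights = [p + 1 for p in precedences]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]


# One flow at each of the precedences used in this example (0 to 5).
for prec, share in zip(range(6), wfq_shares(range(6))):
    print(prec, share, float(share))
```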
Now we can concentrate our effort on the edge routers. We will configure router RT1; the rest of them (RT2 to RT4) get the same configuration. We begin with the access-list commands as follows:

The router prompt is really RT1(config)# but let's skip that for now. Lists 101 to 105 correspond to precedences 1 to 5 respectively. List 200 allows any IP traffic; it will be used for precedence 0 (best-effort) flows.
Next we sketch a useful table with the maximum rates per router we will accept for each type of traffic, classified by precedence. Assuming a maximum bandwidth of 2 Mbps (E1) on every edge router serial interface and reserving 25% of this for departing flows, we create the table based on 1500000 bps per router as follows:

Of course, you can select your own configuration, adjusting shares as you like; also, some live tests could be run to check burstiness and avoid starvation, and then, if required, buffers could be increased depending on the router's available memory.
Observe two things. First, we are ensuring that our administrative and general ledger system, based on SQL servers located outside the domain, receives a guaranteed bandwidth of 4 * 525000 = 2100000 bps for traffic of this type entering the domain. Also, because by using CAR we will mark this traffic as precedence 5, it will be treated a lot better (except, perhaps, compared with network control traffic) within the domain by every core router that has WFQ implemented. This way we are sure that this mission-critical system is not being starved by those guys looking for fun on adult sites or those music lovers downloading heavy files from Kazaa. As you see, we are liberals. We are not blocking them, just putting each piece in its place. Everyone can be happy, but business goes first.
Second, our policy defines precisely what type of traffic we want; any other type of traffic will be confined to a maximum of 13% of the available bandwidth when the traffic we want is exercising its rights and using its guaranteed bandwidth.
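The figures quoted in the last paragraphs can be checked with a few lines of Python. Only the numbers that appear in the text are used (2 Mbps links, 25% reserved, 525000 bps for precedence 5, four edge routers, 13% for other traffic); the variable and function names are hypothetical.

```python
LINK_BPS = 2_000_000            # E1 taken as 2 Mbps, as in the text
RESERVED_FOR_DEPARTING = 0.25   # share kept for flows leaving the domain
EDGE_ROUTERS = 4                # RT1 to RT4

budget_bps = LINK_BPS * (1 - RESERVED_FOR_DEPARTING)   # per-router budget for entering traffic


def share_of_budget(rate_bps: float) -> float:
    """Return the fraction of the per-router budget that a rate represents."""
    return rate_bps / budget_bps


sql_domain_total_bps = EDGE_ROUTERS * 525_000   # guaranteed SQL bandwidth over the whole domain
best_effort_bps = 0.13 * budget_bps             # ceiling for traffic outside our list

print(budget_bps, share_of_budget(525_000), sql_domain_total_bps, best_effort_bps)
```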
Okay, now we can configure our edge routers as follows:

Again, forgive my laziness: the router prompt is not what is shown; also remember that each command must be typed on a single line.
We haven't finished yet, but this part sets the required precedence fields and transmits the flows that do not exceed the previously agreed values. Observe that for every kind of traffic we left the door open for more throughput by using the action "continue". Why do we do this instead of using "drop"? Well, we want to distribute excess bandwidth as well as possible between flows when some of them use less than expected. Every flow beyond expectation will have a second opportunity to be transmitted, if some bandwidth is still available, but this time marked with a precedence of zero (best-effort). Then our second part will be as follows:

Now we really have finished. Observe that the first rule of this part opens the door to any flow up to 195000 bps. This rule is very important. Not only does it hold the 13% we reserved for best-effort traffic; it is also an insurance policy for any flow that belongs to one of our privileged services (SQL, WWW, DNS, TELNET, etc.) but is not using the expected port. This is not an issue with WWW, DNS, TELNET, etc., which are simple applications that always use their expected ports, but it can, and in fact does, happen with some applications.
For example, the list of services tells us that IBM DB2 reserves port number 523. This is one of the so-called "IANA well-known port numbers". Okay, but do you know well how DB2 works? Could you take for granted that only port 523 will be used? And just TCP port 523, as we assumed above? It should be, but, because the devil knows more from being old than from being the devil, perhaps other control flows on ports other than TCP 523 are required to manage DB2. So you have to take precautions to allow these probable flows. Do you remember we said somewhere above that applications to be included in QoS implementations should be studied carefully? Imagine the fool… you know, guys, with this new QoS implementation our SQL server will work a lot better… and when it's ready, the server cannot be accessed. I really wouldn't like to be there when this happens.
So our rule through access-group 200 guarantees 195000 bps per router for any traffic, even when privileged flows reach their top levels. Another solution, even easier and safer, is to apply our rule for SQL servers using the servers' IP addresses. This way we are really sure; every flow coming from the SQL servers will be caught by our classifier. Well, we leave it to you to find the right answer.
Continuing with the policies: assuming that some privileged flows don't use their expected bandwidth, with the next two rules we grant an extra bonus, as best-effort, of 500000 bps for SQL traffic and 250000 bps for interactive traffic per router. And finally, the default action being to transmit, any remaining traffic that reaches this stage is transmitted as long as some bandwidth is still available. These flows will fight to get some share of the rest of the cake.
What is very nice about this new configuration? That we never use "drop". We want to protect the life of the packets. Some of them will be dropped at random only when congestion occurs. But this time our policies never explicitly drop a packet when throughput reaches some level.
As you see when working on QoS projects, what you really have is a tool, but the possibilities are endless. For example, we haven't talked about routing. What about routing? Which routing protocol are you using? RIP? OSPF? Are these protocols automatically taken into account when you configure QoS, or do you have to plan something for them? Studying the router's specifications carefully is the answer to this question. It doesn't make any sense to implement an incredible QoS project if you forget something as vital as routing and leave your network cut off.

 

 

