|
Previous
|
Content |
Next
|
|
|
2.8.- HTB queuing
discipline |
|
 |
|
| Finally, after doing some test, we decided to
use HTB instead of CBQ to implement our Differentiated Service on Linux
HOWTO. The original Differentiated Service on Linux [10]
implementation was developed using CBQ when it was necessary. Basically two
of the example scripts that were included in the original implementation
were based on CBQ: AFCBQ that implements an example of the DS Assure
Forwarding PHB (AF-PHB) and EFCBQ that implements an example of the DS
Expedited Forwarding PHB (EF-PHB). |
| Then, why do we decide to base our HOWTO on HTB
instead of CBQ? Perhaps reading a ittle from Lartc [7] we can give an
answer to this question. Let's see how. |
| |
| People from Lartc [7] don't like too much CBQ
queuing discipline; let's read what they say about this: |
| |
| As said before, CBQ is the most complex
qdisc available, the most hyped, the least understood, and probably the
trickiest one to get right. This is not because the authors are evil or
incompetent, far from it, it’s just that the CBQ algorithm isn’t all that
precise and doesn’t really match the way Linux works. |
| |
| Besides being classful, CBQ is also a shaper
and it is in that aspect that it really doesn’t work very well. It should
work like this. If you try to shape a 10mbit/s connection to 1mbit/s, the
link should be idle 90% of the time. If it isn’t, we need to throttle so
that it IS idle 90% of the time. |
| |
| This is pretty hard to measure, so CBQ
instead derives the idle time from the number of microseconds that elapse
between requests from the hardware layer for more data. Combined, this can
be used to approximate how full or empty the link is. |
| |
| This is rather circumspect and doesn’t
always arrive at proper results. For example, what if the actual link speed
of an interface that is not really able to transmit the full 100mbit/s of
data, perhaps because of a badly implemented driver? A PCMCIA network card
will also never achieve 100mbit/s because of the way the bus is designed -
again, how do we calculate the idle time? |
| |
| It gets even worse if we consider
not-quite-real network devices like PPP over Ethernet or PPTP over TCP/IP.
The effective bandwidth in that case is probably determined by the
efficiency of pipes to userspace - which is huge. |
| |
| People who have done measurements discover
that CBQ is not always very accurate and sometimes completely misses the
mark. |
| |
| Love enough , don't you think? Initially, when
this HOWTO was started, I decided to base the document on CBQ. My decision
was based in the following arguments: |
|
|
|
|
- CBQ was already part of the standard kernel. HTB not yet.
- Differentiated Service on Linux designers used CBQ to implement their
models. When required DS examples in the original Diffserv package were
based on CBQ.
|
| But, finally, at last, HTB was included in the
standard Linux kernel. Immediatlely I wrote this note. |
| Note: Reading again from
HTB user guide
I discovered (too late, I admit this) that beginning from kernel 2.4.20,
HTB is on standard Linux kernel already. Being the case it would be really a
lot better to re-write Differentiated Service scripts to porting them from
CBQ to HTB. I really prefere to use HTB. Its design is clearer, command are
simpler and support and a very good user documentation exists thank to
Martin Devera, the designer. Also Stef Coene does an invaluable work
supporting HTB through the LARTC list. I´m going to evaluate all this to study
the possibility to step back this HOWTO and trying to use HTB instead of CBQ
for implementing DS. |
| I did my work implementing AFCBQ and EFCBQ
using HTB. The results were excellent. General behavior was much better
using HTB than using CBQ. Then I renamed AFCBQ and EFCBQ as AFHTB and EFHTB.
This HOWTO is based on these scripts instead of the original ones. |
| What are the real problems with CBQ? First, its
configuration is very complicated. Too many parameters make the work
tedious. Second, it isn't accurate. Almost everyone agree that HTB is better
than CBQ to make the work they are intended to do. In fact, Alexey
Kuznetsov, the CBQ Linux implementation designer admitted (I read this in
Lartc list some time ago), that HTB did the work better. Martin Devera (the
HTB designer) was very happy that day. |
| HTB (and CBQ too) is what is known as a hierachical
link-sharing queuing discipline. If you want to know more about this type of
queuing discipline it is a good idea to have a read to the paper
Link-sharing and Resource Management Models for Packet Networks written
by Sally Floyd and Van Jacobson in 1.995. |
| Some concepts taken from this document are very
interesting to understand the model behind HTB design. They wrote, for
example: |
| One requirement for link-sharing is to share
bandwidth on a link between multiple
organizations, where each organization wants to
receive a guaranteed share of the link bandwidth during
congestion, but where bandwidth that is not being used by one
organization should be available to other organizations sharing
the link. |
| Well, the idea is the same. It is very
difficult to be always overprovisioned. Time will come when you will be
underprovisioned and perhaps congested. When this occur you would like to
have some mechanism to guarantee a share of the available bandwidth to the
organizations that receive the service. Organizations share the link and
after having some agreement you are responsible for guaranteeing the respect
to the terms of such agreement. |
| Floyd & Jacobson recognize two type of
structures to give bandwidth distribution to the organizations interested to
be served. The first of them is know as flat link-sharing structure.
Let's take some paragraphs from the paper to clear our knowledge.
They wrote: |
| For a flat link-sharing structure such as in
Figure 2.8.1, the link-sharing
requirements are fairly straightforward. A link-sharing
bandwidth is allocated to each class (expressed in Figure
2.8.1 as a percentage of
the overall link bandwidth). These link-sharing
allocations could be either static (permanently assigned by the
network administrator) or dynamic (varying in response to current
conditions on the network, according to some predetermined
algorithm). The first link-sharing goal is that each class
with sufficient demand should be able to receive roughly its
allocated bandwidth, over some interval of time,
in times of congestion. |
| As a consequence of this link-sharing goal,
in times of congestion some classes might be
restricted to their link-sharing bandwidth. For a
class with a link-sharing allocation of zero, such
as the mail class in Figure 2.8.1, the bandwidth
received by this class is determined by the other
scheduling mechanisms at the gateway; the
link-sharing mechanisms do not “guarantee” any
bandwidth to this class in times of congestion. |
|

|
| The second type is known as hierarchical
link-sharing structure. Again, Floyd & Jacobson explain this as follows: |
| Multiple link-sharing constraints at a
gateway can be expressed by a hierarchical
link-sharing structure such as in Figure 2.8.2.
The link-sharing structure in Figure
2.8.2 illustrates link-sharing between
organizations, between protocol families, between service classes,
and between individual connections within a service class; this
is not meant to imply that all link-sharing structures at all links
should include all of these forms of link-sharing. All arriving
packets at the gateway are assigned to one of the leaf classes;
the interior classes are used to designate guidelines about how
`excess' bandwidth should be allocated. Thus, the goal is that
the three service classes for agency A should collectively receive
50% of the link bandwidth over appropriate time intervals, given
sufficient demand. If the real-time class for agency A has little
data to send, the hierarchical link-sharing structure specifies that
the `excess' bandwidth should be allocated to other subclasses
of agency A. |
|

|
The link-sharing goals can be summarized as
follows:
- Each interior or leaf class should receive roughly its allocated
link-sharing bandwidth over appropriate time intervals, given
sufficient demand.
- If all leaf and interior classes with sufficient demand have
received at least their allocated link-sharing bandwidth, the
distribution of any `excess' bandwidth should not be arbitrary, but
should follow some set of reasonable guidelines.
|
| Observe that this structure is more flexible
than the flat one. In the flat approach you have only one level of bandwidth
distribution. In the hierarchical approach you can have multiple levels of
bandwidth distribution increasing the flexibility to distribute the
available bandwidth between the classes to be served. |
| Have a look to the hierarchical structure.
Observe that the classes could be organizations or agencies (first level in
the figure); protocol families (second level in the B agency); traffic
types, as is shown in the second level of A and C agencies and the third
level of B agency; or connections as is shown in the bottom level of A
agency. |
| To implement a hierarchical link-sharing
structure you have to put a lot of resources in the game because the model
consumes CPU cycles generously. With the current technology managing
high-speed WAN links using hierarchical approach is yet non-practical. For
example, Cisco doesn't have yet this type of queuing discipline implemented
in its routers. In this case Linux is one step ahead from Cisco because two
hierarchical queuing disciplines are already implemented: CBQ and HTB. |
| Let us suppose we want to
implement a hierarchical link-sharing structure using HTB.
HTB is so nice that it is not necessary to begin with a long explanation
about parameters and all those complicated matters. We can go directly, on a
fast-track explanation. Let's begin by drawing our
network example scheme: |
|

|
| Two subnetworks, A and
B, are going to be fed from a 2.048Mbps Internet
connection through the 192.168.1.254 interface of
our green Linux box depicted in the figure.
A HTB queuing discipline
will be installed on ethernet interface eth0 to control bandwidth
distribution. Subnetwork A will have 60% of the available bandwidth
guaranteed; subnetwork B the rest (40%). Within each subnetwork the
guarantee flows for each type of service are as is indicated in the
figure. |
| |
| Next we draw our traditional
queuing discipline diagram: |
| |
|

|
| |
|
|
|
|
| Yellow figure represents our HTB queuing
discipline named 1:0. To permit borrowing between classes (more about this
is explained below) HTB qdisc requires to create a root class depicted in
the figure as a green rectangle; the root class is named 1:1. From this
class we set two subclasses: class 1:2 will be used to control traffic to
the network 192.168.1/26 and class 1:3 will be used to control traffic to
the network 192.168.129/26. These classes represent the second level
of hierarchy in our implementation. The third level is represented by the
subclasses 1:21, 1:22, 1:23 and 1:24 belonging to the class 1:2 and the
subclasses 1:31, 1:32, 1:33 and 1:34 belonging to the subclass 1:3. These
subclasses will be used to distribute bandwidth at the service (third)
level. For each of these subclasses a pfifo queuing discipline is configured
for packet enqueuing. Final network, sub-networks and services bandwidth
requeriments are shown in parenthesis in kilobits per second (kbit). |
| Our next step will be to configure the HTB
queuing discipline. We depict each command first and then a brief
explanation is given to clear what we are doing. |
|

|
| This command creates the root HTB queuing
discipline on ethernet interface eth0. Nothing more is required for our
example. The queuing discipline is called 1:0. |
|

|
| Next we create the root class 1:1. The rate
parameter establishes the maximum bandwidth that will be permitted by the
queuing discipline. |
|

|
| Now we create the class 1:2 to control flows to
the A network (192.168.1/26) and the class 1:3 to control flows to the B
network (192.168.129/26). The rate parameter is the minimum bandwidth
that will be guaranteed to a class when all the classes in the queuing
discipline are underprovisioned; this means, when flows reclaim rates
that are equal or above the values set by the rate parameters. Let's
explain this better. Being the A network flows equal or above 1228kbit and
(simultaneously) the B network flows equal or above 820kbit, the minimum
bandwidth to be guaranteed to classes 1:2 and 1:3 will be 1228kbit and
820kbit respectively. |
|
| HTB behavior is explained in HTB User Guide
[17] as follows: HTB ensures that the amount of service provided
to each class is at least the minimum of the amount it requests and the
amount assigned to it. When a class requests less than the amount
assigned, the remaining (excess) bandwidth is distributed to other classes
which request service. |
| What about if a class is overprovisioned? Then
the ceil parameter enters the game. For example, being the class 1:3
overprovisioned (consuming less than 820kbit), the class 1:2 can reclaim
the excess left by the 1:3 class and uses this excess up to the value set by the
ceil parameter, this means, up to 2048kbit. But, and this is very
important, to allowing this borrowing (between classes 1:2 and
1:3), the HTB queuing discipline has to be configured including the root
class 1:1. |
| But, what about if we don't want this
borrowing? For example, let's suppose A and B network owners pay a fixed
rate for their connections. They don't want that when he/she is not using
his/her bandwidth another people uses it to take advantage of the share he/she
pays. To avoid this problem we can re-write the last two commands using the
value of the rate parameter as the ceil parameter; this way: |
|

|
| Our next step will be to configure the service
classes on each network. Let's begin with network A: |
|

|
| Observe here that we set the rates defined for
each service but also we define a ceil parameter equal to the share
permitted to the network A. This is another way to limit the maximum amount
to be assigned to network A. Being the ceil parameter of classes 1:2
and 1:3 set to 2048kbit will be not matter because the top level will be
controlled by the ceil parameter on service level subclasses. |
| The commands are very similar to configure the
network B: |
|

|
| With these commands we have configured the part
corresponding to the HTB queuing discipline. As you see HTB configuration is
a very easy and friendly process. Having defined your class hierarchy you
have to deal only with the rate and ceil parameters to get the
job done. Really HTB has some other parameters to refine its behavior. But
because this is the Differentiated Service HOWTO and not the HTB HOWTO, I
suggest that those of you interested on having more HTB information have a
look to the
HTB user guide. As
was told before Martin Devera (HTB designer) and Stef Coene (one of the
Lartc maintainer) have done an excellent work supporting HTB through the
Lartc list. |
| To continue let's add a pfifo queuing
discipline to each of the service classes; the commands are as follows: |
|

|
|
Well, fellows, we are almost ready. But, first,
we have to configure the filter that place the packets on each class. Let's
see how to do this beginning with network A: |
|

|
| |
| All these filter elements (the filter is just
one, the prio1 filter) match to destination ip address range
192.168.1/26 corresponding to network A. The flowid parameter
indicates the class where the packets matching the corresponding filter
element will be placed. Source ports 23, 80, 20 and 21 correspond to service
telnet, www, ftp and ftp-data respectively from the remote servers. Last
filter element matches any packet belonging to the network 192.168.1/26 not
being previously matched by upper elements (in fact, the rest of A network
flows). Because 192.168.1/24 is a private address range some kind of NAT
server has to be implemented on the Linux box in front to the Internet
connection. But this is not a matter to this HOWTO. |
| |
| Commands to configure B network filter elements
are very similar: |
| |
|

|
| |
| Okay, estimable lectors, we have finished certainly with HTB. Our next step
will be one of the Differentiated Service key queuing disciplines: the
DSMARK queuing discipline in charge of marking DS packets. But this is a
matter for another section. Let us go on. |
|
|
|
|
|
|
|
|
|
Previous
|
Content |
Next
|