Дополнение к Руководству пользователя и Инструкции;pdf

Lower than Best Effort: a Design and Implementation
Ken Carlberg
Panos Gevros
Jon Crowcroft
Dept. of Computer Science
University College London
Gower Str. WC1E 6BT
Dept. of Computer Science
University College London
Gower Str. WC1E 6BT
Dept. of Computer Science
University College London
Gower Str. WC1E 6BT
[email protected]
[email protected]
[email protected]
In reent years, the Internet arhiteture has been augmented
so that Better-than-Best-Eort (BBE) servies, in the form
of reserved resoures for spei ows, an be provided by
the network. To date, this has been realized through two different and sequentially developed eorts. The rst is known
as Integrated Servies and fouses on spei bounds on
bandwidth and/or delay for spei ows. The Dierentiated Servie model was later introdued, whih presented
a more aggregated and loal perspetive regarding the forwarding of traÆ. A diretion that is missing in today's
work on servie models is a dened shema used to purposely degrade ertain traÆ to various levels below that
of Best Eort. In a sense, a new diretion that provides a
balaning eet in the deployment of BBE servie. This is
partiularly evident with ontinual and parallel short transation ows (like that used for web appliations) over low
bandwidth links that are not subjet to any bako penalty
inurred by ongestion beause state does not persist. In
a more indiret perspetive, our model orrelates degraded
servie with the appliation of usage and seurity poliies {
administrative deisions that an operate in tandem or disjointly from onditions of the network. This paper attempts
to address these and other issues and presents the design
and implementation of suh a new degraded servie model
and queuing mehanism used to support it.
The default servie model of the Internet is known as Best
Eort. This servie model is based on the design priniple
that a network does not set aside or reserve resoures in forwarding data pakets hop-by-hop to their destination. Failure in delivery an our when resoures along a path are
exhausted or when there are breaks in onnetivity.
In reent years, the IP servie model has been augmented
so that Better-than-Best-Eort servie, in the form of reserved resoures for spei ows, an be provided by the
network. To date, this augmentation has been realized through
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
SIGCOMM Workshop 2000, Costa Rica
Copyright 2001 ACM 0-89791-88-6/97/05 ...$5.00.
two dierent sequentially developed eorts. The rst is
known as Integrated Servies and is realized as either Control Load [15℄ or Guaranteed Servie [13℄ { the former fousing on bandwidth levels, and the latter addressing bandwidth and delay bounds. Two important features of this
new servie is that its original design is based on a relatively
granular sope (Soure, Destination, Port tuple), and that
the servie was meant to be instantiated on an end-to-end
In tandem with the denition of a new servie model, the
Internet Engineering Task Fore (IETF) dened a resoure
reservation signaling protool titled RSVP [16℄ . This protool ats as a onduit for propagating reservation requests
via an exhange of sender initiated Path messages, and orresponding reeiver initiated Resv messages. The protool
is designed to support uniast and multiast ows on an
end-to-end basis. Reent work with RSVP has involved augmenting its design to support an aggregation and/or hierarhial struture in its signaling . An outome in using RSVP
to install the end-to-end attributes of a ow is that state
is not stored in the paket header, but installed in routers
along the soure-to-destination data path by an out-of-band
signaling mehanism.
After initial experiene was obtained in dening and implementing the Integrated Servie model, a dierent model
was introdued into the IETF. This new model is known
as Dierentiated Servies and takes a more abstrat and loal view on resoure alloation [2℄ . Speially, it fouses
on servie agreements regarding bandwidth and/or delay
between neighboring routing domains. A separate entity,
e.g., Bandwidth Broker, an be used to alter the sope of
the agreement so that resoures an be set aside to support hanges in overall aggregate levels of traÆ . Per Hop
Behaviors an be used to help determine the sope and subsequent hanges of the agreement between two domains.
One distintive aspet of this new type of servie model
is that the harateristis of ingress traÆ are ompared
against a pre-established prole. The prole represents the
servie level used by the next-hop transit domain to aept
data for forwarding. If the ingress traÆ does not onform to
the prole, it is marked aordingly. The marked paket is
plaed in a virtual 'penalty box' and is then either dropped,
or marked with lower preedene and forwarded to the next
hop. This form of degraded servie involves a binary deision
proess that is dependent on the ongestive state of a node
- i.e., degraded servie only ours with out-of-prole traf and with no onsideration of previous history of a nononformant ow. This is reasonable to understand within
the ontext of 'fairness' to all traÆ ows
A omplimentary eort used to ahieve Better-than-BestEort servie is realized through the various queuing mehanisms that have been developed over the last several years.
Two of the more notable designs have ome in the form of
Class Based Queuing (CBQ) [8℄ and Weighted Fair Queuing
(WFQ) [5℄ . CBQ denes lasses, assigns a queue per lass
of traÆ, and sets aside resoures in an a priori manner to
eah lass. Unused resoures an be borrowed in either a
peer or hierarhial manner by other lasses. Eah lass an
be dened using dierent levels of granularity.
In a general sense, WFQ establishes individual queues for
eah ow that traverses a node. Weights are assigned to
assoiate a measure of managed bias in assigning resoures
to ertain types of ows. 1 In both ases, CBQ and WFQ
involve per-ow queuing, where there exists a 1-to-1 relationship between ow/lass and queue. In the ase where
ows are able to borrow unused resoures, eah mehanism
is said to be work-onserving.
Independent of the sheduling of traÆ, Random Early
Detetion (RED) was designed as a ongestion avoidane
mehanism that an ompliment a queuing mehanism. It
uses a weighted moving average of queue size to determine
when it should randomly drop a paket. Weighted RED is
an extension that plaes pre-seleted weights in prioritizing
the dropping of ertain types of traÆ [11℄ . The oupling
of RED with per-ow queuing follows the attempt by researhers to support the lassi notion of fairness with respet to best eort servie, in that all users obtain an equal
share of network resoures.
1.1 The Problem
One initial observation about fairness is that ation to
ahieve it is based on the state of urrent ows. For diserv, ation to maintain fairness is triggered by out-of-prole
traÆ. Queuing mehanisms rely on ongestive state to ativate heuristis to enfore fairness and ompensate for exess
load. The use of exponentially weighted moving average
adds a measure of historial state to a ow, but the history has a deay fator and is generally based on the same
soure/destination pair. With respet to fairness on an endto-end basis, TCP employs ow ontrol algorithms per onnetion that adaptively responds to the ongestive state of
the network.
A type of appliation that irumvents, in various degrees,
the fairness ahieved by the above is the web browser. This
is beause browsers typially establish a series of parallel
short transation TCP onnetions from a single soure to
a variety of destinations. This parallel establishment allows
the end user to irumvent the slow start algorithm and obtain more than its fair share of network resoures through
multiple onnetions. Thus, these umulative ows an ontribute to network ongestion without being subjet to the
same degree of bak-o ow ontrol experiened by long
transation ows. Further, the per-ow queuing that exists
today applies fairness to queues independently from other,
and only takes ation to ahieve fairness when ongestion ours. This implies that if per-ow queuing is dened to the
granularity of a TCP onnetion, then fairness from queuing
will not take into aount the ulmination of onnetions
per appliation session. Feldman [6℄ presents an example
And of ourse, various implementations exist that ombine
the features of both CBQ and WFQ.
of how muh traÆ an be generated by web browsers and
shows traÆ measurements taken within AT&T's Worldnet,
where approximatelly 72% of ows and 56% of bytes were
web based.
The example of web browsers brings up the subjet of usage poliy { an aounting of whether and to what extent
a servie an be used. Until now, weights were the only
means by whih seletive bias, a type of administrative poliy, ould be supported by queuing. However, these weights
are stati in nature, in that their seletion is deided independent of hanges in types of traÆ, network usage, and
seurity onditions. The importane of usage poliies an be
seen in the need to ontrol unrestrited use of appliations
that are apable of bypassing the fairness supported by TCP
and per-ow queuing. This is partiularly aute in networks
that are saturated with traÆ, or are onstrained by relatively low bandwidth links to other networks. Reliane on
dierentiated servies to reet usage poliy an be problemati beause of the limited number of ode points that
an be manipulated by downstream transit providers; e.g.,
Assured Forwarding (AF) provides 4 traÆ lasses, with 3
levels of Drop Preedene per lass. Hene, in the ase of
where traÆ traverses multiple transit providers, subsequent
provider/diserv regions may need additional per-ow aounting in order to provide a lower shedule lass than that
whih an be marked in a paket.
There is a dynami to adaptive TCP-like ows in that
one annot just onsider what apaity is available now, but
what share a given ow (lass) is urrently using, and will
ontinue to use in the future. In other words, in ases like
parallel short transation ows that our suessively, a
punishment/penalty box annot eetively penalize unless
some aounting state persists.
To ahieve a more dynami support of usage poliies by
the network as well as address gaps in supporting fairness,
we propose the integration of per-ow aounting within
a queuing mehanism. From this aounting, various levels
of servie an be applied per ow, regardless of whether
a single FIFO queue, or multi-queue, mehanism is used.
These levels an reet degraded or upgraded servie and
an be subjet to monetary onsiderations (e.g., lower prie
orrelating to a more restritive poliy), as well as seurity
onditions. By this latter example, we mean that the level of
ondene that a ow represents a seurity breah (e.g., in
the form of denial of servie) ould be diretly proportional
to the degree that a usage poliy penalizes a ow.
A onsequene of using poliy and seurity riteria to degrade servie within a queuing mehanism is that non-work
onserving behavior may exist. But this is understandable
given that seurity and adminstrative poliy are orthogonal
to fairness and ongestion.
This paper presents new work that introdues the onept of per-ow aounting into queuing to realize degraded
servie as it pertains to usage and seurity poliies. We foresee degraded servie as a balane to the inlusion of betterthan-best-eort servie, and as a means of disouraging the
use of ertain appliations. This latter point an be helpful in resoure hallenged networks suseptible to ongestion. Setion 2 desribes a general and modular design that
an support various algorithms and usage poliies. Setion
3 presents an initial implementation of the design, termed
Persistent Class Based Queuing (P-CBQ), that represents
the integration of per-ow aounting to per-lass queuing.
Speially, we fous on penalty algorithms that an degrade
the servie of a ow. Setion 4 desribes experimental results
with P-CBQ using one type of penalty algorithm. Setion 5
lists areas of future researh, and Setion 6 presents related
The fous of our design involves per-ow aounting within
a queuing mehanism to help ahieve degraded servie. This
new servie an be based on usage poliies that determines
when, how, and to what extent ows are degraded on a datadriven basis. This last feature makes the design adaptive to
hanging onditions, suh as traÆ load, or lak thereof.
In taking a broad perspetive of usage based poliies, we
divide the omponents that dene per-ow aounting into
the following ategories:
1. Flow Identiation
2. Resoure Management (RM) Algorithm 2 .
3. Poliy Database
4. Persistent Information
The rst item involves the identity of elds and level of
granularity used to dene a ow. Traditionally, whenever
an address (masked or fully qualied) is inluded for ow
identiation, both soure and destination are used. In our
design, we present the option of identifying either soure or
destination so that information an be aquired for sets of
parallel onnetions established by an appliation/user (e.g.,
a web browser) to various loations.
The seond item represents a type of Per Hop Behavior
and speies the harateristis of a penalty algorithm applied to a ow. The speiation inludes the onditions and
degree by whih the algorithm operates (e.g., the rate and
level at whih a ow may be degraded), and thereby manages the resoures of a node. Our initial design and implementation fouses on algorithms used to penalize a partiular ow. A subjet for future researh would entail dening
algorithms to upgrade ows and improve their performane.
The third item identies the type of poliy assoiated with
a given ow, thereby mapping the above items 1 and 2. In
our implementation, we developed and tested a `usage' based
poliy that foused on plaing limitations and penalties in
the form of degraded servie for spei ows. Usage poliies
ould be further dened in terms of priing and the degree
by whih a user has ontrated for additional fair share of
resoures (i.e., additional ost orrelating to lower penalities). Other types of poliies ould involve seurity and the
pereption or possibility that a ow represents a form of
attak (e.g., denial or servie or intrusion). Both of these
renements are topis for future researh.
The fourth item gathers state for a ow. This is the
one dynami element whose information triggers the ations
taken by the RM algorithm. The retention of state an be
persistant for an indenite period, or it an be treated as
soft-state, implying that there exists a temporary quality
that must be renewed in some manner. The exat harateristis attributed to the persistent information is speied
by Item 2 above.
In this paper, an RM is also refered to as a Penalty Algorithm, sine the only algorithms dened so far involve servie
One should note that per-ow aounting an be aomplished with a single FIFO queuing mehanism, or with more
omplex per-ow queuing shemas like CBQ or WFQ. Its
appliation and measure of salibility is dependent upon the
dened granularity of a ow. Having presented the omponents that dene per-ow aounting, we now disuss our
view of Degraded Servie in more detail.
Degraded servie involves purposely reduing the level of
servie of a ow below that whih an be oered. So if a
ow initially exhibits end-to-end best eort servie, then degrading it at some point along a path hanges the ow so
that it experienes less than best eort servie. This degradation is aomplished independent of any existing ow ontrol mehanism that exists on an end-to-end basis, suh as
that employed by TCP. Servie degradation an be done by
Inreasing the delay in forwarding pakets,
Dropping pakets,
Dereasing the preedene value of pakets 3 .
Separating the ability to degrade servie from existing
end-to-end ow ontrol allows the degraded servie model
to be applied to any appliation, whether it generates long
or short transation-based ows. In addition, the servie
model an be applied to those appliations that have no inherent ow ontrol; e.g., UDP-based multiast. These ows
an have detrimental eet to TCP traÆ and an ause
ongestion ollapse [7℄, and they are also known to have
aused problems with traÆ ontrol mehanisms like CBQ
[?℄. And yet, it is assumed that degraded servie is probably
not appliable or needed for all types of ows that traverse
the network. For example, non-interative appliations that
an generate long transation TCP-based ows, whih are
not generated in a suessive repetitive manner, may not
be primary andidates for degraded servie. Examples of
suh appliations would inlude le transfer and email, in
the form of FTP and SMTP, respetively. On the other
hand, an ideal target for degraded servie would be multiast UDP ows that have no inherent ow ontrol, short
transation based TCP ows (as evidened by HTTP appliations), and ows exhibiting suspiious behavior that an
be interpreted as an attak (e.g., high levels of ICMP ehoreply messages). Another andidate might be appliations
like Napster, whih are predominantly used as non-ritial
entertainment at university and business network domains.
In fat, the use of Napster has aused institutions like UC
Berkeley to redue the priority of Napster in order to redue its load on the ampus network without eliminating its
In our design, we further qualify the term Degraded Servie as a potentially gradual funtion that is traÆ driven
and an our as a funtion of time, paket ount, and/or
servie rate. Then, after the Persistent Information has triggered a Penalty Algorithm, the servie degrades at some
speied rate to some dened target level. The ativation of
the degraded servie is onsidered traÆ driven beause the
redution of a ows servie is linked to its resoure usage:
e.g., as the soure sends more traÆ beyond a threshold,
This third approah is tied to urrent strategies in implementing Dierentiated Servies and is a subjet for future
the level or quality of the servie dereases in a manner presribed by the penalty funtion. Finally, the rate at whih
one an degrade servie an be ongured to be either gradual or very aggresive depending on the speied parameters
of the Penalty Algorithm. This ability to speify a variane
provides us with a exible mehanism that an also produe
a smooth man-in-the-loop pereption of degraded servie.
As mentioned previously, degrading servie an be a manifestation of a usage poliy aimed at disouraging the ontinual use of an appliation (like web surng). Aomplishing
this goal frees resoures, whih are then made available to
other users of the lass or other lasses of traÆ traversing
the node. Thus, if an appliation lass is given best eort
servie, then it stands to reason that the servie will remain
'good' for existing ows if users are disouraged from its
ontinual use after some threshold is reahed. Similar design philosophies for operating systems have been integrated
on time sharing omputer systems of the 70's.
In this setion, we desribe an implementation of a queuing mehanism that has been augmented with per-ow aounting to support the Degraded Servie model. The implementation is a derivative of CBQ that is termed Persistent Class Based Queuing (P-CBQ). Like its anteedent,
the sheduler of P-CBQ bases its seletion and subsequent
ation on dierent lasses of traÆ. Hene, we deide in
an a priori manner whih lasses of traÆ (i.e., ow) will
and will not be subjet to degraded servie. However, a
paradigm shift is introdued with P-CBQ in that state for
degraded ows remains persistent for a period of time so
that umulative information an be used to determine the
level of degraded servie that is to be applied. Further, umulative state is retained for either soure or destination, as
opposed to the more ommon fsoure/destination, address,
port, protoolg tuple used to identify a ow. The reason for
basing state on either soure or destination, though not neessarily both, is that the approah allows us to aumulate
state for a servie with a topologial diretion oupled with
a wildard endpoint. This allows us to aumulate state for
web-like appliations, in whih users tend to aess a multitude of random loations during a given session (i.e., time
spent using an appliation over time).
Figure 1 presents two examples in deploying P-CBQ within
the ontext of inter-domain onnetivity. In Example-A,
the P-CBQ router is plaed at the edge of the transit network, where it onnets a ustomer stub network. The PCBQ router is ongured to retain persistent state based on
the lass (e.g., port & protool) and destination address of
the traÆ 4 forwarded on the T-1 (1.5 Mb/s) link - a relatively small link that is suseptible to ongestion. When the
penalty phase is ativated and degraded servie is applied to
a ow, resoures of the T-1 are released and made available
to other ows traversing the link. From the perspetive of
the transit network, the degraded servie an be treated as a
value added servie for its stub ustomer in that it attempts
to minimize seemingly non-ritial servie on a data-driven
manner. In Example-B, the P-CBQ router is at the edge
For our purposes, we separate and highlight the address
from the denition of a lass. In general, a dened penalty
lass will not be based on both soure and destination,
though there is nothing to prelude this ation.
access network
access network
Telco/LAN Router
Telco/LAN Router
Telco/LAN Router
Telco/LAN Router
Telco/LAN Router
Telco/LAN Router
stub network
stub network
PCBQ router
traffic subject to penalty
Example A
Example B
Figure 1: Two examples of P-CBQ deployment
of the stub and is ongured to retain state based on lass
and soure address. This ation allows the stub domain to
ahieve degraded servie on egress traÆ (suh as requests
for data) independent of the transit network and the servies
it provides.
The urrent implementation of P-CBQ stores penalty state
based on the full length of an IPv4 address. The next version
will add the ability to dene sets of masks used to aggregate addresses. In addition, exlusionary state will be added
so that ertain address/masks an be exluded from being
subjet to degraded servie.
The development of P-CBQ was done under FreeBSD using the ALTQ framework [1℄ [4℄ . The implementation uses
one of two dierent algorithms to enfore degraded servie.
Both algorithms are polynomial-based and generate drop
probabilities for those lasses of traÆ that are subjet to
degraded servie. Note that our implementation degrades
traÆ by dropping pakets, as opposed to inreasing their
delay in the queue, or in altering their priority as dened
and realized in a Di-Serv environment.
A number of parameters an be set in a onguration le.
Some of the parameters are unique to a spei algorithm
used to enfore degraded servie. Other parameters, as per
the following, are appliable to all strategies. These being :
lass of traÆ, threshold value, degradation algorithm and
queue limit .
The rst onguration parameter identies the lass of
traÆ (HTTP, TCP et.) for a spei interfae subjet
to degraded servie. The seond parameter identies the
threshold used to trigger degraded servie. The value of
the threshold an refer to paket ounts or servie rate {
the spei meaning based on the identity of the seleted
degradation algorithm. When the threshold is surpassed,
then the degradation phase is ativated. It is this feature
that signies the data-driven nature of our design.
The third onguration parameter identies the penalty
algorithm used for servie degradation. Currently, two algorithms have been implemented. The rst is based on a
umulative paket ount of a soure or destination address
and is referred to as the PC algorithm. The paket ount
(pkt) is ompared against the threshold value (thresh), and
if the ount is greater, then the paket is dropped with a dynamially alulated drop probability. As the paket ount
inreases, the drop probability (dp) inreases in a proportional manner. This an be represented in the form of:
dp =
if pkt < thresh
if pkt > thresh
where n speies the ongured base for the rate of degradation (how fast the drop probability inreases), if n is high
then the penalty rate is gradual, otherwise the rate is onsidered steep. Aording to the above formula it is possible
that the drop probability grows without bound, in order to
ontrol this we use a ongured Max dp and never allow the
drop probability to exeed this value. Several values of n are
shown in Figure 2; one representing an aggresive degradation of servie, another (n=12) representing a more gradual
drop rate, and the third (n=14) an even more gradual drop
of referene by whih other penalty algorithms and test senarios an be designed, tested, and further rened. Below
we present the results of several test senarios. These senarios use the topology shown in Figure 3, where there exist
three nodes: Node-1 ating as the soure(s), Node-2 ating as the router with P-CBQ enabled on one interfae and
FIFO enabled in the other, and Node-3 ating as the destination. The mahines were PC routers running FreeBSD
equipped with ATM ards, whose links between them were
ATM PVCs. The apaity from Node-1 to Node-2 was set
to 10 Mb/s, and the other link was set to 1.5Mb/s.
Penalty Class1 mask = 10
Penalty Class2 mask = 12
Penalty Class3 mask = 14
throughput (Mbit/sec)
Figure 3: Test Topology
80 90 100 110 120 130 140 150 160 170 180
time (sec)
Figure 2: Dierent Drop Probability Rates
One aspet unique to the PC algorithm is that if no pakets have been reeived for a period of time, we presume that
the user has been \disouraged" from using the appliation
lass. Thus, the paket ount, or penalty state is leared
so that at a later time the user an start using the appliation without penalty until the threshold or paket ount
is reahed one again. The following setion presents some
initial results in testing this algorithm.
The seond algorithm, referred to as the SR algorithm,
is based on the servie rate of a lass of traÆ. If the rate
at whih pakets are being servied from the input interfae is above the ongured threshold value, then a drop
probability is generated to determine if the paket should
be disarded. The probability of dropping a paket is proportional to inreases or dereases of the servie rate. One
the average servie rate drops below the threshold value, the
SR algorithm terminates the generation of a drop probability - whih orresponds to a termination of a degradation
of servie for that soure or destination address. In order
to redue the potential existene of severe osillations, an
exponentially weighted moving average is used to determine
the servie rate over time. Results of this approah are to
be presented at a later date.
Finally, the fourth onguration parameter, the queue
limit, is used by both the SR and PC algorithms as an additional renement in determining when to start using degraded servie. In the ase of SR, the omparison of threshold and servie rate is done when the queue size is above a
ongured limit. In the ase of the PC algorithm, only the
pakets above the ongured limit are subjet to degraded
This setion presents some initial results in testing P-CBQ
using the PC algorithm. Our objetive is to present a point
4.1 UDP Tests
The rst set of tests involves the transmission of a steady
stream of UDP traÆ from two soures on Node-1 to Node3. Eah soure sent 6000 pakets, with eah paket being
1400 bytes in length. Eah soure also represented a lass
of traÆ; one was subjet to degraded servie and the other
under Best Eort servie. The Penalty Class had a ongured threshold of 600 pakets, meaning that degraded servie would not ommene until the paket ount equals the
threshold ount. The Penalty Class was also ongured to
experiene a maximum drop probability of 45%, suh that
the probability of dropping a newly reeived paket would
never exeed that value. Finally, eah lass was alloated
49% of available resoures (bandwidth).
The left graph in Figure 4 represents a ontrol example
that shows the dierene between a a normal UDP ow and
one that is penalized. Initially, both UDP ows ahieve
a throughput of approximately 700 Kbits/s whih is their
nominal alloation. The usage poliy for Default Class has
no penalty assoiated with it, and thus maintains this steady
level of throughput. The Penalty Class, on the other hand,
has a penalty poliy, whih takes eet in a gradual manner
after the threshold has been reahed. At time around 57 seonds into the test, the throughput of the Penalty Class displays a more level behavior with some measure of variane,
whih is due to the fat that the deision to drop pakets is
probabilisti in nature on a per paket basis. Given that the
left graph represents a ontrol senario, it represents a point
of referene and thus is not indiative of how we envision
the deployment of P-CBQ with respet to usage poliy.
The right graph of Figure 4 is the same UDP test as that
of the left graph, but now the Default lass is allowed to borrow unused bandwidth from the Penalty Class, as the latter
is subjeted to Degraded Servie. This seond test represents one example of how we envision usage poliy would be
deployed with P-CBQ. One an notie that in this seond
example that the Default Class is able to nish its transmission earlier than the ontrol example beause of the additional bandwidth that was made available to it. One an
Default Class
Penalty Class
Root Class
throughput (Mbit/sec)
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100105110115120125130
time (sec)
Default Class
Penalty Class
Root Class
throughput (Mbit/sec)
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100105110115120125130
time (sec)
Figure 4: UDP TraÆ with and without Borrow.
also see that the utilisation of the output link (i.e., throughput of the Root lass) drops due to the fat that for the
given level of degraded throughput in the Penalty lass, the
Default lass did not have enough demand (rate) to sustain
the throughput on the output link.
4.2 TCP Tests
We have also onduted tests with TCP ows and observed the eets of PC algorithm both on individual ows
and on aggregates. TCP traÆ today aounts for almost
90% of the total Internet traÆ and uses a window ow
ontrol algorithm with timeout and retransmission. The algorithms for managing the window for ongestion avoidane
and ontrol or the way the retransmission timeout (RTO) is
alulated are far too ompliated to be desribed here, see
[14℄ for details. It is widely aeptable though that TCP's
window-based ow ontrol does not allow ne grained rate
ontrol. Moreover TCP's throughput degrades with loss
probability in a non-linear fashion; a slight inrease in the
steady state paket loss probability (in the range from 1%
to 10%) an ause a preipitous drop in throughput. From
the latest work in TCP modeling [12℄ we have the follwing
equation :
Throughput RTT
+ o( p )
where RTT is the round trip time of the onnetion, b is
the number of pakets that are aknowledged by a reeived
ACK and p is the probability that a paket is lost.
Although this equation only models the steady state through-
put of TCP when loss indiations are only 3 dupliate ACKs
without aounting for timeouts it is possible that just a
few losses ombined with the exponential bako of the retransmission timer [9℄ may lead TCP to long timeouts (idle
periods) and an overall throughput that is very hard to be
predited, see [12℄ for the detailed equation. When TCP
experienes a series of timeouts it an be thought from a
user's perspetive as ompletely starved and the onnetion
may be aborted.
It is hard to penalize a single TCP ow with the PC algorithm and still maintain a preditable minimum throughput
like in the UDP ase. Short ows in partiular do not usually have large enough ongestion windows to allow them to
use fast retransmit and fast reovery (requires a window of
at least 4) 5 and an be better thought of as a series of slow
start periods followed by one (or more as it was often the
ase with P-CBQ) retransmission timeouts. After a single
TCP ow was subjeted to penalty, it was behaving like an
on/o soure and its transmission intervals were governed
by the TCP internal timeout alulation algorithm (exponential bako) and the expiration of router penalty state.
Dening the parameters and algorithms to aurately ontrol TCP throughput by probabilisti paket drop methods
is still an open area of researh.
The PC algorithm an provide drop probabilities suÆient
for ontrolling aggregates of TCP ows beause the losses
are then distributed over a larger number of onnetions and
the eet is that individual ows experiene gradual servie
degradation that is roughly proportinal to the number of
ows in the lass (up to a ertain limit). In our work, the
fous is on the degradation algorithm itself and the servie it
provides, not on the interations between paket drop probability and the number of TCP ows.
Eah of the graphs in Figure 5 show a series of short TCP
ows (like HTTP replies): the Default lass ows unharmed,
and the Penalty lass ows subjeted to degraded servie.
As in the ase of the previous UDP results, the left graph
represents a ontrol example that an be ompared to the
right graph, whih shows the benets of realloating the
resoures of the penalized ow. The penalty state at the
router is kept for the duration of the test (i.e., the timeout value does not expire), the threshold that ativates the
penalty is set to 1000 pakets, the maximum drop probability is 45% and the parameter n (\mask") that ontrols
the rate of degradation is set to 12 so that the degradation
is relatively \gradual". All the graphs show aggregate lass
In the left, or ontrol graph of Figure 5, the Default lass
is not allowed to borrow, so that the highest throughput
is limited by the lass bandwidth alloation (approximately
750 Kbits/s). In omparison, the Default Class in the right
graph is allowed to borrow any unused bandwidth from the
Penalty lass, so its highest throughput in this onguration
is limited by the link apaity if the Penalty Class is idle.
Typially after the rst three TCP sequenes, the PC
penalty algorithm is ativated and the Default lass is able
Most TCP ows in the Internet today transfer HTTP
data and are usually short in duration, the typial MSS
is 1460 bytes and the majority of the onnetions transfer
around 10 Kbytes whih means less than 7 segments. Deployment of Persistent HTTP in HTTP 1.1 is expeted to
inrease the amount of data transferred in a TCP onnetion.
Series of HTTP like replies through the Default (without borrow) and Penalty TCP classes
Default Class
Penalty Class
Root Class
throughput (Mbit/sec)
time (sec)
Series of HTTP like replies through the Default (with borrow) and Penalty TCP classes
Default Class
Penalty Class
Root Class
throughput (Mbit/sec)
time (sec)
Figure 5: TCP TraÆ with and without Borrow.
to use the resoures denied to the Penalty lass. Some TCPs
in the Penalty lass either took too long to nish and there
were oasions where the onnetions were dropped after
waiting for long periods (note that 12 onseutive retransmission timeouts for the same onnetion, in the BSD TCP
implementation, are not that rare when the paket drop
probability s 45%). More importantly, this ondition was
aomplished beause of the umulative and persistent information used by P-CBQ to penalize subsequent transations.
In the TCP ase, aomplishing drop probability with an
algorithm like Paket Count, is less straightforward than
with UDP, and the onlusion was that one must be rather
areful when designing degradation poliies for TCP ows,
or ows that use timeouts for ongestion ontrol in general.
Depending on the poliy the PC algorithm would be ideal
for streaming media type ows that do ongestion ontrol
by baking o their rate suÆiently low.
In the previous setions, we have presented an approah
that supports a degraded servie model for IP networks using P-CBQ. For its initial design, we implemented a degradation algorithm that was based on pakets ounts and a orresponding drop probability as a means of seletively penalizing a spei type of ow. This initial approah provides
a foundation that an be evolved in a variety of ways. One
suh diretion involves the development of dierent types of
degradation algorithms. For instane, one derivation of the
PC algorithm ould involve the inlusion of a token buket
to smoothen the rate of traÆ one a maximum degraded
servie has been ahieved. Beyond this, a new type of drop
algorithm ould be designed that would base its ations of
the servie rate of a queue. As the servie rate exeeds a
given threshold, its drop rate ould inrease proportionally.
Another area for future researh involves the integration
of P-CBQ into the Di-Serv arhiteture. Currently, our
implementation realizes its degradation funtion by dropping pakets. This ation ould easily be rened so that IP
pakets an be expliitly marked as lower priority. In addition, the riteria for marking or dropping pakets ould
reet a harging model that satises the users measure of
payment: X-amount for degraded servie, X+Y for best effort servie, and X+Y+Z for better than best eort servie.
A potentially third area of future researh fouses on a more
oordinated interation between eonomially based harging models with degraded servie. In [10℄, Kelly presents
the onept of Shadow Priing and its integration with rate
ontrol algorithms in order to ahieve stable onvergene in
establishing a per unit harge of resoure used to forward
traÆ. The ost assoiated with a resoure is in relation to
the other ows sharing the same resoure. As the demand
inreases, either by new ows or added demand by existing
ows, the ost rises. By inreasing the ost in relation to an
inrease of demand, the system provides an equalizing inuene between the diametri fores of supply and demand.
If we substitute the ow ontrol algorithms of TCP with
the penalty algorithms of P-CBQ, we an potentially use
Shadow Priing as a means of altering the ations taken
upon a penalized ow or lass of servie.
Finally, the ability to inur a variane in degraded servie has the potential to operate symbiotially with seurity
and orresponding levels of ondene that an attak (e.g.,
intrusion or denial or servie) is ouring . The use of systems like, Baysian Networks, Geneti Algorithms, and Fuzzy
Logi to detet seurity attaks has the tendeny to present
non-denitive results. By integrating a range of answers
with a orresponding variane in penalty, we have the potential to go beyond binary deisions { i.e., ontinue forward
pakets, or drop all pakets for a 'suspiious' ow.
As noted above, one of the fundamental features of CBQ
is it's a priori denition of lasses and division of resoures
independent of existing traÆ onditions or aquired knowledge of previous traÆ types and load. The ability to borrow
unused resoures presents a powerful feature that allows the
lassiation proess to be reative to existing traÆ load
traversing a node.
The use of a data-driven design is also the foundation of
Flow Random Early Detetion (FRED) [11℄ , a modied
version of RED. In the approah, the designers introdue a
per-ative-ow aounting of load to penalize ows based on
their buer usage - the penalty being in the form of dropped
pakets. The objetive in this design is to be more fair in
relation to existing ows exhibiting dierent harateristis
(e.g., two TCP ows with dierent soure or destinations
that have dierent load due to dierent Round Trip Time
delays). This realization of fairness also implies that some
ows may be penalized at a higher rate than others based
on their usage of resoures, versus an arbitrary pre-seleted
Reent work in degraded servie is found in [3℄ , whih as
title says disusses lower than best eort per-hop behavior.
This eort is meant to work in tandem with the Dierentiated Servies model to provide a measure of protetion of
best eort servie in the presene of ongestive state due to
non-onformant data ows. A key aspet that distinguishes
this work from that proposed in this paper is that ation is
never taken unless a ongestive state ours within a node
. Further, all pakets that are marked as reeiving lower
than best eort servie are degraded uniformly - the form of
the degradation being in the form of more aggressive paket
drop. Previous load onditions, or temporal state, are not
taken into onsideration.
The design of a per-ow aounting mehanism to support degraded servie through represents a new diretion in
the treatment of ows traversing a node. Three motivations
behind this involve a desire to support usage poliies, ow
ontrol for ontinual and parallel short transation onnetions, and a need to degrade servie for di-serv pakets
whose priority has been set to the lowest value possible by
previous entities/domains.
The resoures of penalized ows are made available to
other ows of a given lass, or other lasses of traÆ, traversing the node. The fat that we aumulate information
on subsequent ows allows us to penalize short transation
ows like those generated by series of HTTP requests. Depending on its deployment and onguration, the amount
of potential state makes our initial implementation more attrative at the leaf edges of networks. However, future work
in aggregating penalty will allow us to apply our approah
to the Internet as a whole.
A related feature involving our approah in degrading servie is that the rate and severity of the penalization proess
an be applied to disouraging user behavior, in the form of
ontinual and prolonged usage, of a given appliation. Coupled with an eonomially based harging model, one has
the potential of harging for maintaining Best Eort servie
to avoid restritive usage poliies plaed on appliations like
web browsers.
Our initial results have been based on a single penalty
algorithm based on paket ounts. This algorithm operates
independent of existing end-to-end ow ontrol algorithms,
the rate at whih pakets are servied/forwarded through a
node, and uses a probabilisti means of determining whether
a paket is to be dropped. Applied to UDP ows without
ow ontrol applied to upper level RTP pakets, the PC type
of algorithm allows us to inur a relatively smooth degradation of servie upon a ow. However, an area that will require additional investigation and researh involves the development and renement of penalty algorithms that take
into aount the ow ontrol algorithms inherent in TCP.
[1℄ ALTQ: Alternate Queueing for FreeBSD. Available
[2℄ S. Blake, D. Blak, M. Carlson, E. Davies, Z. Wang,
and W. Weiss. \An Arhiteture for Dierentiated
Servies". RFC (Informational) 2475, IETF, De.
[3℄ R. Bless and K. Werle. A Lower than Best Eort
Per-Hop Behavior Internet Draft - Work in Progress,
Sept. 1999.
[4℄ K. Cho. A Framework for Alternate Queueing:
Towards TraÆ Management by PC-UNIX based
Routers. In USENIX Annual Tehnial Conferene,
New Orleans, Louisiana, 1998. Available from
[5℄ A. Demers, S. Keshav, and S. Shenker. \Analysis and
Simulation of a Fair Queueing Algorithm". In Pro. of
the ACM SIGCOMM, pages 1{12, Austin, Texas,
Sept. 1989.
[6℄ A. Feldman. Presentation at IETF Plenary Session,
Nov. 1999.
[7℄ S. Floyd and K. Fall. \Promoting the Use of
End-to-End Congestion Control in the Internet".
IEEE/ACM Transations on Networking, Aug. 1999.
[8℄ S. Floyd and V. Jaobson. \Link-sharing and
Resoure Management Models for Paket Networks".
IEEE/ACM Transations on Networking,
3(4):365{386, Aug. 1995.
[9℄ P. Karn and C. Partridge. \Improving Round-trip
Time Estimates in Reliable Transport Protools".
ACM Computer Communiation Review, 25(1):67{74,
Jan. 1995.
[10℄ F. Kelly, A. Maulloo, and D. Tan. \Rate ontrol for
ommuniation networks: shadow pries, proportional
fairness and stability". Journal of the Operational
Researh Soiety, 49(3):237{252, Mar. 1998.
[11℄ D. Lin and R. Morris. "Dynamis of Random Early
Detetion". In Pro. of the ACM SIGCOMM, pages
127{137, Cannes, Frane, 1997.
[12℄ J. Padhye, V. Firoiu, D. Towsley, and J. Kurose.
\Modeling TCP Throughput: A Simple Model and its
Empirial Validation". ACM Computer
Communiation Review, 28(4):303{314, Sept. 1998.
[13℄ S. Shenker, C. Partridge, and R. Guerin.
\Speiation of Guaranteed Quality of Servie". RFC
2212, IETF, Sept. 1997.
[14℄ W. R. Stevens. TCP/IP Illustrated Volume 1: The
Protools. Addison-Wesley, Reading, Massahusetts,
[15℄ J. Wrolawski. Speiation of the Controlled-Load
Network Element Servie. Request for Comments
2211, IETF, Sept. 1997.
[16℄ L. Zhang, S. Deering, D. Estrin, S. Shenker, and
D. Zappala. \RSVP: a new resoure ReSerVation
Protool". IEEE Network, 7(5):8{18, Sept. 1993.