Lower than Best Effort: a Design and Implementation Ken Carlberg Panos Gevros Jon Crowcroft Dept. of Computer Science University College London Gower Str. WC1E 6BT Dept. of Computer Science University College London Gower Str. WC1E 6BT Dept. of Computer Science University College London Gower Str. WC1E 6BT [email protected] [email protected] [email protected] ABSTRACT In reent years, the Internet arhiteture has been augmented so that Better-than-Best-Eort (BBE) servies, in the form of reserved resoures for spei ows, an be provided by the network. To date, this has been realized through two different and sequentially developed eorts. The rst is known as Integrated Servies and fouses on spei bounds on bandwidth and/or delay for spei ows. The Dierentiated Servie model was later introdued, whih presented a more aggregated and loal perspetive regarding the forwarding of traÆ. A diretion that is missing in today's work on servie models is a dened shema used to purposely degrade ertain traÆ to various levels below that of Best Eort. In a sense, a new diretion that provides a balaning eet in the deployment of BBE servie. This is partiularly evident with ontinual and parallel short transation ows (like that used for web appliations) over low bandwidth links that are not subjet to any bako penalty inurred by ongestion beause state does not persist. In a more indiret perspetive, our model orrelates degraded servie with the appliation of usage and seurity poliies { administrative deisions that an operate in tandem or disjointly from onditions of the network. This paper attempts to address these and other issues and presents the design and implementation of suh a new degraded servie model and queuing mehanism used to support it. 1. INTRODUCTION The default servie model of the Internet is known as Best Eort. This servie model is based on the design priniple that a network does not set aside or reserve resoures in forwarding data pakets hop-by-hop to their destination. Failure in delivery an our when resoures along a path are exhausted or when there are breaks in onnetivity. In reent years, the IP servie model has been augmented so that Better-than-Best-Eort servie, in the form of reserved resoures for spei ows, an be provided by the network. To date, this augmentation has been realized through Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCOMM Workshop 2000, Costa Rica Copyright 2001 ACM 0-89791-88-6/97/05 ...$5.00. two dierent sequentially developed eorts. The rst is known as Integrated Servies and is realized as either Control Load [15℄ or Guaranteed Servie [13℄ { the former fousing on bandwidth levels, and the latter addressing bandwidth and delay bounds. Two important features of this new servie is that its original design is based on a relatively granular sope (Soure, Destination, Port tuple), and that the servie was meant to be instantiated on an end-to-end basis. In tandem with the denition of a new servie model, the Internet Engineering Task Fore (IETF) dened a resoure reservation signaling protool titled RSVP [16℄ . This protool ats as a onduit for propagating reservation requests via an exhange of sender initiated Path messages, and orresponding reeiver initiated Resv messages. The protool is designed to support uniast and multiast ows on an end-to-end basis. Reent work with RSVP has involved augmenting its design to support an aggregation and/or hierarhial struture in its signaling . An outome in using RSVP to install the end-to-end attributes of a ow is that state is not stored in the paket header, but installed in routers along the soure-to-destination data path by an out-of-band signaling mehanism. After initial experiene was obtained in dening and implementing the Integrated Servie model, a dierent model was introdued into the IETF. This new model is known as Dierentiated Servies and takes a more abstrat and loal view on resoure alloation [2℄ . Speially, it fouses on servie agreements regarding bandwidth and/or delay between neighboring routing domains. A separate entity, e.g., Bandwidth Broker, an be used to alter the sope of the agreement so that resoures an be set aside to support hanges in overall aggregate levels of traÆ . Per Hop Behaviors an be used to help determine the sope and subsequent hanges of the agreement between two domains. One distintive aspet of this new type of servie model is that the harateristis of ingress traÆ are ompared against a pre-established prole. The prole represents the servie level used by the next-hop transit domain to aept data for forwarding. If the ingress traÆ does not onform to the prole, it is marked aordingly. The marked paket is plaed in a virtual 'penalty box' and is then either dropped, or marked with lower preedene and forwarded to the next hop. This form of degraded servie involves a binary deision proess that is dependent on the ongestive state of a node - i.e., degraded servie only ours with out-of-prole traf and with no onsideration of previous history of a nononformant ow. This is reasonable to understand within the ontext of 'fairness' to all traÆ ows A omplimentary eort used to ahieve Better-than-BestEort servie is realized through the various queuing mehanisms that have been developed over the last several years. Two of the more notable designs have ome in the form of Class Based Queuing (CBQ) [8℄ and Weighted Fair Queuing (WFQ) [5℄ . CBQ denes lasses, assigns a queue per lass of traÆ, and sets aside resoures in an a priori manner to eah lass. Unused resoures an be borrowed in either a peer or hierarhial manner by other lasses. Eah lass an be dened using dierent levels of granularity. In a general sense, WFQ establishes individual queues for eah ow that traverses a node. Weights are assigned to assoiate a measure of managed bias in assigning resoures to ertain types of ows. 1 In both ases, CBQ and WFQ involve per-ow queuing, where there exists a 1-to-1 relationship between ow/lass and queue. In the ase where ows are able to borrow unused resoures, eah mehanism is said to be work-onserving. Independent of the sheduling of traÆ, Random Early Detetion (RED) was designed as a ongestion avoidane mehanism that an ompliment a queuing mehanism. It uses a weighted moving average of queue size to determine when it should randomly drop a paket. Weighted RED is an extension that plaes pre-seleted weights in prioritizing the dropping of ertain types of traÆ [11℄ . The oupling of RED with per-ow queuing follows the attempt by researhers to support the lassi notion of fairness with respet to best eort servie, in that all users obtain an equal share of network resoures. 1.1 The Problem One initial observation about fairness is that ation to ahieve it is based on the state of urrent ows. For diserv, ation to maintain fairness is triggered by out-of-prole traÆ. Queuing mehanisms rely on ongestive state to ativate heuristis to enfore fairness and ompensate for exess load. The use of exponentially weighted moving average adds a measure of historial state to a ow, but the history has a deay fator and is generally based on the same soure/destination pair. With respet to fairness on an endto-end basis, TCP employs ow ontrol algorithms per onnetion that adaptively responds to the ongestive state of the network. A type of appliation that irumvents, in various degrees, the fairness ahieved by the above is the web browser. This is beause browsers typially establish a series of parallel short transation TCP onnetions from a single soure to a variety of destinations. This parallel establishment allows the end user to irumvent the slow start algorithm and obtain more than its fair share of network resoures through multiple onnetions. Thus, these umulative ows an ontribute to network ongestion without being subjet to the same degree of bak-o ow ontrol experiened by long transation ows. Further, the per-ow queuing that exists today applies fairness to queues independently from other, and only takes ation to ahieve fairness when ongestion ours. This implies that if per-ow queuing is dened to the granularity of a TCP onnetion, then fairness from queuing will not take into aount the ulmination of onnetions per appliation session. Feldman [6℄ presents an example 1 And of ourse, various implementations exist that ombine the features of both CBQ and WFQ. of how muh traÆ an be generated by web browsers and shows traÆ measurements taken within AT&T's Worldnet, where approximatelly 72% of ows and 56% of bytes were web based. The example of web browsers brings up the subjet of usage poliy { an aounting of whether and to what extent a servie an be used. Until now, weights were the only means by whih seletive bias, a type of administrative poliy, ould be supported by queuing. However, these weights are stati in nature, in that their seletion is deided independent of hanges in types of traÆ, network usage, and seurity onditions. The importane of usage poliies an be seen in the need to ontrol unrestrited use of appliations that are apable of bypassing the fairness supported by TCP and per-ow queuing. This is partiularly aute in networks that are saturated with traÆ, or are onstrained by relatively low bandwidth links to other networks. Reliane on dierentiated servies to reet usage poliy an be problemati beause of the limited number of ode points that an be manipulated by downstream transit providers; e.g., Assured Forwarding (AF) provides 4 traÆ lasses, with 3 levels of Drop Preedene per lass. Hene, in the ase of where traÆ traverses multiple transit providers, subsequent provider/diserv regions may need additional per-ow aounting in order to provide a lower shedule lass than that whih an be marked in a paket. There is a dynami to adaptive TCP-like ows in that one annot just onsider what apaity is available now, but what share a given ow (lass) is urrently using, and will ontinue to use in the future. In other words, in ases like parallel short transation ows that our suessively, a punishment/penalty box annot eetively penalize unless some aounting state persists. To ahieve a more dynami support of usage poliies by the network as well as address gaps in supporting fairness, we propose the integration of per-ow aounting within a queuing mehanism. From this aounting, various levels of servie an be applied per ow, regardless of whether a single FIFO queue, or multi-queue, mehanism is used. These levels an reet degraded or upgraded servie and an be subjet to monetary onsiderations (e.g., lower prie orrelating to a more restritive poliy), as well as seurity onditions. By this latter example, we mean that the level of ondene that a ow represents a seurity breah (e.g., in the form of denial of servie) ould be diretly proportional to the degree that a usage poliy penalizes a ow. A onsequene of using poliy and seurity riteria to degrade servie within a queuing mehanism is that non-work onserving behavior may exist. But this is understandable given that seurity and adminstrative poliy are orthogonal to fairness and ongestion. This paper presents new work that introdues the onept of per-ow aounting into queuing to realize degraded servie as it pertains to usage and seurity poliies. We foresee degraded servie as a balane to the inlusion of betterthan-best-eort servie, and as a means of disouraging the use of ertain appliations. This latter point an be helpful in resoure hallenged networks suseptible to ongestion. Setion 2 desribes a general and modular design that an support various algorithms and usage poliies. Setion 3 presents an initial implementation of the design, termed Persistent Class Based Queuing (P-CBQ), that represents the integration of per-ow aounting to per-lass queuing. Speially, we fous on penalty algorithms that an degrade the servie of a ow. Setion 4 desribes experimental results with P-CBQ using one type of penalty algorithm. Setion 5 lists areas of future researh, and Setion 6 presents related work. 2. A DESIGN FOR DEGRADED SERVICE The fous of our design involves per-ow aounting within a queuing mehanism to help ahieve degraded servie. This new servie an be based on usage poliies that determines when, how, and to what extent ows are degraded on a datadriven basis. This last feature makes the design adaptive to hanging onditions, suh as traÆ load, or lak thereof. In taking a broad perspetive of usage based poliies, we divide the omponents that dene per-ow aounting into the following ategories: 1. Flow Identiation 2. Resoure Management (RM) Algorithm 2 . 3. Poliy Database 4. Persistent Information The rst item involves the identity of elds and level of granularity used to dene a ow. Traditionally, whenever an address (masked or fully qualied) is inluded for ow identiation, both soure and destination are used. In our design, we present the option of identifying either soure or destination so that information an be aquired for sets of parallel onnetions established by an appliation/user (e.g., a web browser) to various loations. The seond item represents a type of Per Hop Behavior and speies the harateristis of a penalty algorithm applied to a ow. The speiation inludes the onditions and degree by whih the algorithm operates (e.g., the rate and level at whih a ow may be degraded), and thereby manages the resoures of a node. Our initial design and implementation fouses on algorithms used to penalize a partiular ow. A subjet for future researh would entail dening algorithms to upgrade ows and improve their performane. The third item identies the type of poliy assoiated with a given ow, thereby mapping the above items 1 and 2. In our implementation, we developed and tested a `usage' based poliy that foused on plaing limitations and penalties in the form of degraded servie for spei ows. Usage poliies ould be further dened in terms of priing and the degree by whih a user has ontrated for additional fair share of resoures (i.e., additional ost orrelating to lower penalities). Other types of poliies ould involve seurity and the pereption or possibility that a ow represents a form of attak (e.g., denial or servie or intrusion). Both of these renements are topis for future researh. The fourth item gathers state for a ow. This is the one dynami element whose information triggers the ations taken by the RM algorithm. The retention of state an be persistant for an indenite period, or it an be treated as soft-state, implying that there exists a temporary quality that must be renewed in some manner. The exat harateristis attributed to the persistent information is speied by Item 2 above. 2 In this paper, an RM is also refered to as a Penalty Algorithm, sine the only algorithms dened so far involve servie degradation. One should note that per-ow aounting an be aomplished with a single FIFO queuing mehanism, or with more omplex per-ow queuing shemas like CBQ or WFQ. Its appliation and measure of salibility is dependent upon the dened granularity of a ow. Having presented the omponents that dene per-ow aounting, we now disuss our view of Degraded Servie in more detail. Degraded servie involves purposely reduing the level of servie of a ow below that whih an be oered. So if a ow initially exhibits end-to-end best eort servie, then degrading it at some point along a path hanges the ow so that it experienes less than best eort servie. This degradation is aomplished independent of any existing ow ontrol mehanism that exists on an end-to-end basis, suh as that employed by TCP. Servie degradation an be done by either: Inreasing the delay in forwarding pakets, Dropping pakets, Dereasing the preedene value of pakets 3 . Separating the ability to degrade servie from existing end-to-end ow ontrol allows the degraded servie model to be applied to any appliation, whether it generates long or short transation-based ows. In addition, the servie model an be applied to those appliations that have no inherent ow ontrol; e.g., UDP-based multiast. These ows an have detrimental eet to TCP traÆ and an ause ongestion ollapse [7℄, and they are also known to have aused problems with traÆ ontrol mehanisms like CBQ [?℄. And yet, it is assumed that degraded servie is probably not appliable or needed for all types of ows that traverse the network. For example, non-interative appliations that an generate long transation TCP-based ows, whih are not generated in a suessive repetitive manner, may not be primary andidates for degraded servie. Examples of suh appliations would inlude le transfer and email, in the form of FTP and SMTP, respetively. On the other hand, an ideal target for degraded servie would be multiast UDP ows that have no inherent ow ontrol, short transation based TCP ows (as evidened by HTTP appliations), and ows exhibiting suspiious behavior that an be interpreted as an attak (e.g., high levels of ICMP ehoreply messages). Another andidate might be appliations like Napster, whih are predominantly used as non-ritial entertainment at university and business network domains. In fat, the use of Napster has aused institutions like UC Berkeley to redue the priority of Napster in order to redue its load on the ampus network without eliminating its usage. In our design, we further qualify the term Degraded Servie as a potentially gradual funtion that is traÆ driven and an our as a funtion of time, paket ount, and/or servie rate. Then, after the Persistent Information has triggered a Penalty Algorithm, the servie degrades at some speied rate to some dened target level. The ativation of the degraded servie is onsidered traÆ driven beause the redution of a ows servie is linked to its resoure usage: e.g., as the soure sends more traÆ beyond a threshold, 3 This third approah is tied to urrent strategies in implementing Dierentiated Servies and is a subjet for future researh. the level or quality of the servie dereases in a manner presribed by the penalty funtion. Finally, the rate at whih one an degrade servie an be ongured to be either gradual or very aggresive depending on the speied parameters of the Penalty Algorithm. This ability to speify a variane provides us with a exible mehanism that an also produe a smooth man-in-the-loop pereption of degraded servie. As mentioned previously, degrading servie an be a manifestation of a usage poliy aimed at disouraging the ontinual use of an appliation (like web surng). Aomplishing this goal frees resoures, whih are then made available to other users of the lass or other lasses of traÆ traversing the node. Thus, if an appliation lass is given best eort servie, then it stands to reason that the servie will remain 'good' for existing ows if users are disouraged from its ontinual use after some threshold is reahed. Similar design philosophies for operating systems have been integrated on time sharing omputer systems of the 70's. 3. IMPLEMENTATION In this setion, we desribe an implementation of a queuing mehanism that has been augmented with per-ow aounting to support the Degraded Servie model. The implementation is a derivative of CBQ that is termed Persistent Class Based Queuing (P-CBQ). Like its anteedent, the sheduler of P-CBQ bases its seletion and subsequent ation on dierent lasses of traÆ. Hene, we deide in an a priori manner whih lasses of traÆ (i.e., ow) will and will not be subjet to degraded servie. However, a paradigm shift is introdued with P-CBQ in that state for degraded ows remains persistent for a period of time so that umulative information an be used to determine the level of degraded servie that is to be applied. Further, umulative state is retained for either soure or destination, as opposed to the more ommon fsoure/destination, address, port, protoolg tuple used to identify a ow. The reason for basing state on either soure or destination, though not neessarily both, is that the approah allows us to aumulate state for a servie with a topologial diretion oupled with a wildard endpoint. This allows us to aumulate state for web-like appliations, in whih users tend to aess a multitude of random loations during a given session (i.e., time spent using an appliation over time). Figure 1 presents two examples in deploying P-CBQ within the ontext of inter-domain onnetivity. In Example-A, the P-CBQ router is plaed at the edge of the transit network, where it onnets a ustomer stub network. The PCBQ router is ongured to retain persistent state based on the lass (e.g., port & protool) and destination address of the traÆ 4 forwarded on the T-1 (1.5 Mb/s) link - a relatively small link that is suseptible to ongestion. When the penalty phase is ativated and degraded servie is applied to a ow, resoures of the T-1 are released and made available to other ows traversing the link. From the perspetive of the transit network, the degraded servie an be treated as a value added servie for its stub ustomer in that it attempts to minimize seemingly non-ritial servie on a data-driven manner. In Example-B, the P-CBQ router is at the edge 4 For our purposes, we separate and highlight the address from the denition of a lass. In general, a dened penalty lass will not be based on both soure and destination, though there is nothing to prelude this ation. access network access network T1 Telco/LAN Router T1 Telco/LAN Router Telco/LAN Router Telco/LAN Router Telco/LAN Router Telco/LAN Router stub network stub network PCBQ router traffic subject to penalty Example A Example B Figure 1: Two examples of P-CBQ deployment of the stub and is ongured to retain state based on lass and soure address. This ation allows the stub domain to ahieve degraded servie on egress traÆ (suh as requests for data) independent of the transit network and the servies it provides. The urrent implementation of P-CBQ stores penalty state based on the full length of an IPv4 address. The next version will add the ability to dene sets of masks used to aggregate addresses. In addition, exlusionary state will be added so that ertain address/masks an be exluded from being subjet to degraded servie. The development of P-CBQ was done under FreeBSD using the ALTQ framework [1℄ [4℄ . The implementation uses one of two dierent algorithms to enfore degraded servie. Both algorithms are polynomial-based and generate drop probabilities for those lasses of traÆ that are subjet to degraded servie. Note that our implementation degrades traÆ by dropping pakets, as opposed to inreasing their delay in the queue, or in altering their priority as dened and realized in a Di-Serv environment. A number of parameters an be set in a onguration le. Some of the parameters are unique to a spei algorithm used to enfore degraded servie. Other parameters, as per the following, are appliable to all strategies. These being : lass of traÆ, threshold value, degradation algorithm and queue limit . The rst onguration parameter identies the lass of traÆ (HTTP, TCP et.) for a spei interfae subjet to degraded servie. The seond parameter identies the threshold used to trigger degraded servie. The value of the threshold an refer to paket ounts or servie rate { the spei meaning based on the identity of the seleted degradation algorithm. When the threshold is surpassed, then the degradation phase is ativated. It is this feature that signies the data-driven nature of our design. The third onguration parameter identies the penalty algorithm used for servie degradation. Currently, two algorithms have been implemented. The rst is based on a umulative paket ount of a soure or destination address and is referred to as the PC algorithm. The paket ount (pkt) is ompared against the threshold value (thresh), and if the ount is greater, then the paket is dropped with a dynamially alulated drop probability. As the paket ount inreases, the drop probability (dp) inreases in a proportional manner. This an be represented in the form of: dp = 0 (pkt thresh)=2n if pkt < thresh if pkt > thresh where n speies the ongured base for the rate of degradation (how fast the drop probability inreases), if n is high then the penalty rate is gradual, otherwise the rate is onsidered steep. Aording to the above formula it is possible that the drop probability grows without bound, in order to ontrol this we use a ongured Max dp and never allow the drop probability to exeed this value. Several values of n are shown in Figure 2; one representing an aggresive degradation of servie, another (n=12) representing a more gradual drop rate, and the third (n=14) an even more gradual drop rate. of referene by whih other penalty algorithms and test senarios an be designed, tested, and further rened. Below we present the results of several test senarios. These senarios use the topology shown in Figure 3, where there exist three nodes: Node-1 ating as the soure(s), Node-2 ating as the router with P-CBQ enabled on one interfae and FIFO enabled in the other, and Node-3 ating as the destination. The mahines were PC routers running FreeBSD equipped with ATM ards, whose links between them were ATM PVCs. The apaity from Node-1 to Node-2 was set to 10 Mb/s, and the other link was set to 1.5Mb/s. 0.6 Penalty Class1 mask = 10 Penalty Class2 mask = 12 Penalty Class3 mask = 14 0.5 throughput (Mbit/sec) 0.4 0.3 0.2 Figure 3: Test Topology 0.1 0 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 time (sec) Figure 2: Dierent Drop Probability Rates One aspet unique to the PC algorithm is that if no pakets have been reeived for a period of time, we presume that the user has been \disouraged" from using the appliation lass. Thus, the paket ount, or penalty state is leared so that at a later time the user an start using the appliation without penalty until the threshold or paket ount is reahed one again. The following setion presents some initial results in testing this algorithm. The seond algorithm, referred to as the SR algorithm, is based on the servie rate of a lass of traÆ. If the rate at whih pakets are being servied from the input interfae is above the ongured threshold value, then a drop probability is generated to determine if the paket should be disarded. The probability of dropping a paket is proportional to inreases or dereases of the servie rate. One the average servie rate drops below the threshold value, the SR algorithm terminates the generation of a drop probability - whih orresponds to a termination of a degradation of servie for that soure or destination address. In order to redue the potential existene of severe osillations, an exponentially weighted moving average is used to determine the servie rate over time. Results of this approah are to be presented at a later date. Finally, the fourth onguration parameter, the queue limit, is used by both the SR and PC algorithms as an additional renement in determining when to start using degraded servie. In the ase of SR, the omparison of threshold and servie rate is done when the queue size is above a ongured limit. In the ase of the PC algorithm, only the pakets above the ongured limit are subjet to degraded servie. 4. RESULTS This setion presents some initial results in testing P-CBQ using the PC algorithm. Our objetive is to present a point 4.1 UDP Tests The rst set of tests involves the transmission of a steady stream of UDP traÆ from two soures on Node-1 to Node3. Eah soure sent 6000 pakets, with eah paket being 1400 bytes in length. Eah soure also represented a lass of traÆ; one was subjet to degraded servie and the other under Best Eort servie. The Penalty Class had a ongured threshold of 600 pakets, meaning that degraded servie would not ommene until the paket ount equals the threshold ount. The Penalty Class was also ongured to experiene a maximum drop probability of 45%, suh that the probability of dropping a newly reeived paket would never exeed that value. Finally, eah lass was alloated 49% of available resoures (bandwidth). The left graph in Figure 4 represents a ontrol example that shows the dierene between a a normal UDP ow and one that is penalized. Initially, both UDP ows ahieve a throughput of approximately 700 Kbits/s whih is their nominal alloation. The usage poliy for Default Class has no penalty assoiated with it, and thus maintains this steady level of throughput. The Penalty Class, on the other hand, has a penalty poliy, whih takes eet in a gradual manner after the threshold has been reahed. At time around 57 seonds into the test, the throughput of the Penalty Class displays a more level behavior with some measure of variane, whih is due to the fat that the deision to drop pakets is probabilisti in nature on a per paket basis. Given that the left graph represents a ontrol senario, it represents a point of referene and thus is not indiative of how we envision the deployment of P-CBQ with respet to usage poliy. The right graph of Figure 4 is the same UDP test as that of the left graph, but now the Default lass is allowed to borrow unused bandwidth from the Penalty Class, as the latter is subjeted to Degraded Servie. This seond test represents one example of how we envision usage poliy would be deployed with P-CBQ. One an notie that in this seond example that the Default Class is able to nish its transmission earlier than the ontrol example beause of the additional bandwidth that was made available to it. One an Default Class Penalty Class Root Class 1.5 1.4 1.3 1.2 throughput (Mbit/sec) 1.1 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100105110115120125130 time (sec) Default Class Penalty Class Root Class 1.5 1.4 1.3 1.2 throughput (Mbit/sec) 1.1 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100105110115120125130 time (sec) Figure 4: UDP TraÆ with and without Borrow. also see that the utilisation of the output link (i.e., throughput of the Root lass) drops due to the fat that for the given level of degraded throughput in the Penalty lass, the Default lass did not have enough demand (rate) to sustain the throughput on the output link. 4.2 TCP Tests We have also onduted tests with TCP ows and observed the eets of PC algorithm both on individual ows and on aggregates. TCP traÆ today aounts for almost 90% of the total Internet traÆ and uses a window ow ontrol algorithm with timeout and retransmission. The algorithms for managing the window for ongestion avoidane and ontrol or the way the retransmission timeout (RTO) is alulated are far too ompliated to be desribed here, see [14℄ for details. It is widely aeptable though that TCP's window-based ow ontrol does not allow ne grained rate ontrol. Moreover TCP's throughput degrades with loss probability in a non-linear fashion; a slight inrease in the steady state paket loss probability (in the range from 1% to 10%) an ause a preipitous drop in throughput. From the latest work in TCP modeling [12℄ we have the follwing equation : 1 Throughput RTT r 1 3 + o( p ) 2bp p where RTT is the round trip time of the onnetion, b is the number of pakets that are aknowledged by a reeived ACK and p is the probability that a paket is lost. Although this equation only models the steady state through- put of TCP when loss indiations are only 3 dupliate ACKs without aounting for timeouts it is possible that just a few losses ombined with the exponential bako of the retransmission timer [9℄ may lead TCP to long timeouts (idle periods) and an overall throughput that is very hard to be predited, see [12℄ for the detailed equation. When TCP experienes a series of timeouts it an be thought from a user's perspetive as ompletely starved and the onnetion may be aborted. It is hard to penalize a single TCP ow with the PC algorithm and still maintain a preditable minimum throughput like in the UDP ase. Short ows in partiular do not usually have large enough ongestion windows to allow them to use fast retransmit and fast reovery (requires a window of at least 4) 5 and an be better thought of as a series of slow start periods followed by one (or more as it was often the ase with P-CBQ) retransmission timeouts. After a single TCP ow was subjeted to penalty, it was behaving like an on/o soure and its transmission intervals were governed by the TCP internal timeout alulation algorithm (exponential bako) and the expiration of router penalty state. Dening the parameters and algorithms to aurately ontrol TCP throughput by probabilisti paket drop methods is still an open area of researh. The PC algorithm an provide drop probabilities suÆient for ontrolling aggregates of TCP ows beause the losses are then distributed over a larger number of onnetions and the eet is that individual ows experiene gradual servie degradation that is roughly proportinal to the number of ows in the lass (up to a ertain limit). In our work, the fous is on the degradation algorithm itself and the servie it provides, not on the interations between paket drop probability and the number of TCP ows. Eah of the graphs in Figure 5 show a series of short TCP ows (like HTTP replies): the Default lass ows unharmed, and the Penalty lass ows subjeted to degraded servie. As in the ase of the previous UDP results, the left graph represents a ontrol example that an be ompared to the right graph, whih shows the benets of realloating the resoures of the penalized ow. The penalty state at the router is kept for the duration of the test (i.e., the timeout value does not expire), the threshold that ativates the penalty is set to 1000 pakets, the maximum drop probability is 45% and the parameter n (\mask") that ontrols the rate of degradation is set to 12 so that the degradation is relatively \gradual". All the graphs show aggregate lass throughput. In the left, or ontrol graph of Figure 5, the Default lass is not allowed to borrow, so that the highest throughput is limited by the lass bandwidth alloation (approximately 750 Kbits/s). In omparison, the Default Class in the right graph is allowed to borrow any unused bandwidth from the Penalty lass, so its highest throughput in this onguration is limited by the link apaity if the Penalty Class is idle. Typially after the rst three TCP sequenes, the PC penalty algorithm is ativated and the Default lass is able 5 Most TCP ows in the Internet today transfer HTTP data and are usually short in duration, the typial MSS is 1460 bytes and the majority of the onnetions transfer around 10 Kbytes whih means less than 7 segments. Deployment of Persistent HTTP in HTTP 1.1 is expeted to inrease the amount of data transferred in a TCP onnetion. Series of HTTP like replies through the Default (without borrow) and Penalty TCP classes Default Class Penalty Class Root Class 1.4 throughput (Mbit/sec) 1.2 1 0.8 0.6 0.4 0.2 0 0 25 50 75 100 125 time (sec) 150 175 200 225 250 Series of HTTP like replies through the Default (with borrow) and Penalty TCP classes Default Class Penalty Class Root Class 1.4 throughput (Mbit/sec) 1.2 1 0.8 0.6 0.4 0.2 0 0 25 50 75 100 125 time (sec) 150 175 200 225 250 Figure 5: TCP TraÆ with and without Borrow. to use the resoures denied to the Penalty lass. Some TCPs in the Penalty lass either took too long to nish and there were oasions where the onnetions were dropped after waiting for long periods (note that 12 onseutive retransmission timeouts for the same onnetion, in the BSD TCP implementation, are not that rare when the paket drop probability s 45%). More importantly, this ondition was aomplished beause of the umulative and persistent information used by P-CBQ to penalize subsequent transations. In the TCP ase, aomplishing drop probability with an algorithm like Paket Count, is less straightforward than with UDP, and the onlusion was that one must be rather areful when designing degradation poliies for TCP ows, or ows that use timeouts for ongestion ontrol in general. Depending on the poliy the PC algorithm would be ideal for streaming media type ows that do ongestion ontrol by baking o their rate suÆiently low. 5. FUTURE WORK In the previous setions, we have presented an approah that supports a degraded servie model for IP networks using P-CBQ. For its initial design, we implemented a degradation algorithm that was based on pakets ounts and a orresponding drop probability as a means of seletively penalizing a spei type of ow. This initial approah provides a foundation that an be evolved in a variety of ways. One suh diretion involves the development of dierent types of degradation algorithms. For instane, one derivation of the PC algorithm ould involve the inlusion of a token buket to smoothen the rate of traÆ one a maximum degraded servie has been ahieved. Beyond this, a new type of drop algorithm ould be designed that would base its ations of the servie rate of a queue. As the servie rate exeeds a given threshold, its drop rate ould inrease proportionally. Another area for future researh involves the integration of P-CBQ into the Di-Serv arhiteture. Currently, our implementation realizes its degradation funtion by dropping pakets. This ation ould easily be rened so that IP pakets an be expliitly marked as lower priority. In addition, the riteria for marking or dropping pakets ould reet a harging model that satises the users measure of payment: X-amount for degraded servie, X+Y for best effort servie, and X+Y+Z for better than best eort servie. A potentially third area of future researh fouses on a more oordinated interation between eonomially based harging models with degraded servie. In [10℄, Kelly presents the onept of Shadow Priing and its integration with rate ontrol algorithms in order to ahieve stable onvergene in establishing a per unit harge of resoure used to forward traÆ. The ost assoiated with a resoure is in relation to the other ows sharing the same resoure. As the demand inreases, either by new ows or added demand by existing ows, the ost rises. By inreasing the ost in relation to an inrease of demand, the system provides an equalizing inuene between the diametri fores of supply and demand. If we substitute the ow ontrol algorithms of TCP with the penalty algorithms of P-CBQ, we an potentially use Shadow Priing as a means of altering the ations taken upon a penalized ow or lass of servie. Finally, the ability to inur a variane in degraded servie has the potential to operate symbiotially with seurity and orresponding levels of ondene that an attak (e.g., intrusion or denial or servie) is ouring . The use of systems like, Baysian Networks, Geneti Algorithms, and Fuzzy Logi to detet seurity attaks has the tendeny to present non-denitive results. By integrating a range of answers with a orresponding variane in penalty, we have the potential to go beyond binary deisions { i.e., ontinue forward pakets, or drop all pakets for a 'suspiious' ow. 6. RELATED WORK As noted above, one of the fundamental features of CBQ is it's a priori denition of lasses and division of resoures independent of existing traÆ onditions or aquired knowledge of previous traÆ types and load. The ability to borrow unused resoures presents a powerful feature that allows the lassiation proess to be reative to existing traÆ load traversing a node. The use of a data-driven design is also the foundation of Flow Random Early Detetion (FRED) [11℄ , a modied version of RED. In the approah, the designers introdue a per-ative-ow aounting of load to penalize ows based on their buer usage - the penalty being in the form of dropped pakets. The objetive in this design is to be more fair in relation to existing ows exhibiting dierent harateristis (e.g., two TCP ows with dierent soure or destinations that have dierent load due to dierent Round Trip Time delays). This realization of fairness also implies that some ows may be penalized at a higher rate than others based on their usage of resoures, versus an arbitrary pre-seleted lassiation. Reent work in degraded servie is found in [3℄ , whih as title says disusses lower than best eort per-hop behavior. This eort is meant to work in tandem with the Dierentiated Servies model to provide a measure of protetion of best eort servie in the presene of ongestive state due to non-onformant data ows. A key aspet that distinguishes this work from that proposed in this paper is that ation is never taken unless a ongestive state ours within a node . Further, all pakets that are marked as reeiving lower than best eort servie are degraded uniformly - the form of the degradation being in the form of more aggressive paket drop. Previous load onditions, or temporal state, are not taken into onsideration. 7. CONCLUSIONS AND SUMMARY The design of a per-ow aounting mehanism to support degraded servie through represents a new diretion in the treatment of ows traversing a node. Three motivations behind this involve a desire to support usage poliies, ow ontrol for ontinual and parallel short transation onnetions, and a need to degrade servie for di-serv pakets whose priority has been set to the lowest value possible by previous entities/domains. The resoures of penalized ows are made available to other ows of a given lass, or other lasses of traÆ, traversing the node. The fat that we aumulate information on subsequent ows allows us to penalize short transation ows like those generated by series of HTTP requests. Depending on its deployment and onguration, the amount of potential state makes our initial implementation more attrative at the leaf edges of networks. However, future work in aggregating penalty will allow us to apply our approah to the Internet as a whole. A related feature involving our approah in degrading servie is that the rate and severity of the penalization proess an be applied to disouraging user behavior, in the form of ontinual and prolonged usage, of a given appliation. Coupled with an eonomially based harging model, one has the potential of harging for maintaining Best Eort servie to avoid restritive usage poliies plaed on appliations like web browsers. Our initial results have been based on a single penalty algorithm based on paket ounts. This algorithm operates independent of existing end-to-end ow ontrol algorithms, the rate at whih pakets are servied/forwarded through a node, and uses a probabilisti means of determining whether a paket is to be dropped. Applied to UDP ows without ow ontrol applied to upper level RTP pakets, the PC type of algorithm allows us to inur a relatively smooth degradation of servie upon a ow. However, an area that will require additional investigation and researh involves the development and renement of penalty algorithms that take into aount the ow ontrol algorithms inherent in TCP. 8. REFERENCES [1℄ ALTQ: Alternate Queueing for FreeBSD. Available from http://www.sl.sony.o.jp/person/kj/software.html. [2℄ S. Blake, D. Blak, M. Carlson, E. Davies, Z. Wang, and W. Weiss. \An Arhiteture for Dierentiated Servies". RFC (Informational) 2475, IETF, De. 1998. [3℄ R. Bless and K. Werle. A Lower than Best Eort Per-Hop Behavior Internet Draft - Work in Progress, Sept. 1999. [4℄ K. Cho. A Framework for Alternate Queueing: Towards TraÆ Management by PC-UNIX based Routers. In USENIX Annual Tehnial Conferene, New Orleans, Louisiana, 1998. Available from ftp://ftp.sl.sony.o.jp/pub/kj/papers/altq98.ps.gz. [5℄ A. Demers, S. Keshav, and S. Shenker. \Analysis and Simulation of a Fair Queueing Algorithm". In Pro. of the ACM SIGCOMM, pages 1{12, Austin, Texas, Sept. 1989. [6℄ A. Feldman. Presentation at IETF Plenary Session, Nov. 1999. [7℄ S. Floyd and K. Fall. \Promoting the Use of End-to-End Congestion Control in the Internet". IEEE/ACM Transations on Networking, Aug. 1999. [8℄ S. Floyd and V. Jaobson. \Link-sharing and Resoure Management Models for Paket Networks". IEEE/ACM Transations on Networking, 3(4):365{386, Aug. 1995. [9℄ P. Karn and C. Partridge. \Improving Round-trip Time Estimates in Reliable Transport Protools". ACM Computer Communiation Review, 25(1):67{74, Jan. 1995. [10℄ F. Kelly, A. Maulloo, and D. Tan. \Rate ontrol for ommuniation networks: shadow pries, proportional fairness and stability". Journal of the Operational Researh Soiety, 49(3):237{252, Mar. 1998. [11℄ D. Lin and R. Morris. "Dynamis of Random Early Detetion". In Pro. of the ACM SIGCOMM, pages 127{137, Cannes, Frane, 1997. [12℄ J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. \Modeling TCP Throughput: A Simple Model and its Empirial Validation". ACM Computer Communiation Review, 28(4):303{314, Sept. 1998. [13℄ S. Shenker, C. Partridge, and R. Guerin. \Speiation of Guaranteed Quality of Servie". RFC 2212, IETF, Sept. 1997. [14℄ W. R. Stevens. TCP/IP Illustrated Volume 1: The Protools. Addison-Wesley, Reading, Massahusetts, 1994. [15℄ J. Wrolawski. Speiation of the Controlled-Load Network Element Servie. Request for Comments 2211, IETF, Sept. 1997. [16℄ L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala. \RSVP: a new resoure ReSerVation Protool". IEEE Network, 7(5):8{18, Sept. 1993.
© Copyright 2021 DropDoc