Wednesday, September 9, 2009
Core-Stateless Fair Queuing
The authors evaluate this mechanism over several simulated topologies with variously behaved competing flows. They show that the performance of CSFQ is often competitive with a full per-flow-state implementation of fair queuing (Deficit Round Robin). Its absolute performance is usually slightly worse than DRR's, with the exception that it does better in cases where flows would otherwise suffer from not being able to temporarily exceed their "fair" share of the buffer to accommodate burstiness.
CSFQ has some parameters that seem to be set arbitrarily and others for which it is not clear how to choose values in general. The arbitrary parameters include the thresholds used to adjust for buffering effects. For the exponential smoothing constants, the authors recommend values substantially larger than the delay jitter and shorter than the average duration of a flow. But with traffic like web pages, which are in fact short-lived flows, the average flow duration is probably quite short; how should these constants be set when some flows traverse the globe (with presumably high delay jitter) before exiting the network while others only reach the local metropolis's peering point (with hopefully very low delay jitter)?
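For concreteness, here is a minimal Python sketch of the two pieces those constants govern, as I read the paper: the exponentially averaged per-flow rate estimate (computed at the edge) and the probabilistic drop decision (made in the core). The names are mine, and the fair-share estimation itself, which is the harder part, is assumed to be given.

```python
import math
import random

def estimate_rate(prev_rate, pkt_len, interarrival, K):
    """Exponentially averaged per-flow arrival-rate estimate, using an
    averaging constant K like the one discussed above (seconds)."""
    w = math.exp(-interarrival / K)
    return (1.0 - w) * (pkt_len / interarrival) + w * prev_rate

def core_forward(rate_label, fair_share):
    """Core-router decision: drop with probability max(0, 1 - fair_share/label);
    otherwise forward, relabeling the packet down to the fair-share estimate."""
    drop_prob = max(0.0, 1.0 - fair_share / rate_label)
    if random.random() < drop_prob:
        return None                          # packet dropped
    return min(rate_label, fair_share)       # new label on the forwarded packet
```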
Analysis and Simulation of a Fair Queueing Algorithm
The authors evaluated their fair queuing algorithm (in simulation) against first-come first-serve and DECbit (which gives explicit congestion notifications to flows over their fair share), running against continuous ("FTP") and intermittent ("telnet") flow-controlled streams (both traditional TCP-style and with explicit congestion notification) and against misbehaved flows. In contrast to the other algorithms, their fair queuing algorithm gives well-behaved flows good performance in the presence of an ill-behaved flow and severely punishes the ill-behaved flow (which they assume to be loss-sensitive). Even in the normal cases, they show that they provide lower latency to the intermittent flow (rewarding it for using less than its fair share of bandwidth) than the DECbit scheme does.
The incidental punishment of ill-behaved flows is strange; as the authors note, it is a result of charging flows for packets that cannot be queued, but it would be nicer to make the degree of punishment more explicit. (Since, surely, well-behaved hosts will sometimes send somewhat above their fair share while adjusting or due to network perturbation, and should not necessarily lose throughput for it.)
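To make the mechanism concrete, here is a rough Python sketch of the finish-number bookkeeping at the heart of their algorithm; the virtual-time handling is deliberately simplified, and the names are mine, not the authors'.

```python
import heapq

class FairQueue:
    """Sketch of Demers/Keshav/Shenker-style fair queueing: each packet is
    stamped with a finish number and packets are served in that order.
    The real algorithm tracks the virtual time R(t) exactly; here it is
    crudely approximated by the finish number of the packet last served."""

    def __init__(self):
        self.last_finish = {}   # per-flow finish number of the newest packet
        self.heap = []          # (finish_number, flow_id, pkt_len)
        self.virtual_time = 0.0

    def enqueue(self, flow_id, pkt_len):
        start = max(self.last_finish.get(flow_id, 0.0), self.virtual_time)
        finish = start + pkt_len          # equal weights; divide by weight otherwise
        self.last_finish[flow_id] = finish
        heapq.heappush(self.heap, (finish, flow_id, pkt_len))

    def dequeue(self):
        if not self.heap:
            return None
        finish, flow_id, pkt_len = heapq.heappop(self.heap)
        self.virtual_time = finish        # simplified stand-in for R(t)
        return flow_id, pkt_len
```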
Sunday, September 6, 2009
Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks
Primarily evaluated are algorithms where, at each time t, the new load is a(signal) + b(signal)*(old load), where signal is the single-bit indicator of whether the network was over the desired load at time t-1. The appropriate values of a and b for each case are analyzed in light of convergence to both the desired level of load and the desired fairness. The main result is that additive increase/multiplicative decrease (i.e. a(underloaded) > 0, b(underloaded) = 1, a(overloaded) = 0, 0 < b(overloaded) < 1) is the class of a and b functions that achieves convergence on both criteria. With respect to the rate of convergence, there is a tradeoff between quick convergence and low levels of oscillation around the goal load.
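A tiny Python sketch of that linear control, with parameter values chosen by me purely for illustration rather than taken from the paper:

```python
def aimd_update(load, congested, a_increase=1.0, b_decrease=0.5):
    """One step of the linear control new = a(signal) + b(signal) * old, with
    a > 0, b = 1 when underloaded and a = 0, 0 < b < 1 when overloaded."""
    if congested:
        return b_decrease * load      # multiplicative decrease
    return load + a_increase          # additive increase (b = 1)

# Two users sharing one link: repeated AIMD steps drive them toward equal shares.
x, y, capacity = 10.0, 1.0, 12.0
for _ in range(50):
    congested = (x + y) > capacity
    x, y = aimd_update(x, congested), aimd_update(y, congested)
print(round(x, 2), round(y, 2))       # the two allocations end up close together
```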
For me, the most interesting part of the paper is the idealized throughput and response-time curves. TCP-style congestion control has a boolean signal of a sort (a dropped packet) — but where on that throughput/response-time curve does it fall?
Congestion Avoidance and Control
- reaching the equilibrium
- slow start — ramp up sending speed slowly so the correct sending speed won't be overshot. Prior systems would not recover from a mismatch between their local network's rate and the least capable link on the path to the end host, because they would continuously overshoot.
- linear window size increase — be conservative about increasing the amount of data in flight to avoid massively overdriving the network, from which it is hard to recover.
- don't retransmit until the old packet is out of the network
- RTT variance estimate — because of queues, etc., round-trip time varies a lot in the network. This must be taken into account, which the authors do by creating an estimator of the RTT variance (a sketch follows this list).
- back off enough after a retransmission to avoid more losses
- exponential backoff — motivated by some theoretical results (about what sort of backoff is required for stability) and a desire to be as general as possible about which network topologies they support, the authors choose exponential backoff.
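Here is a minimal Python sketch of the mean/deviation RTT estimator referred to above; the gains and the x4 deviation multiplier are the values commonly used in implementations, not necessarily the exact constants from the paper (which also uses integer shift arithmetic omitted here).

```python
def update_rtt_estimate(srtt, rttvar, sample, g=0.125, h=0.25):
    """Update the smoothed RTT and its mean deviation from a new RTT sample,
    then derive a retransmit timeout that accounts for RTT variance."""
    err = sample - srtt
    srtt = srtt + g * err                      # smoothed round-trip time
    rttvar = rttvar + h * (abs(err) - rttvar)  # smoothed mean deviation
    rto = srtt + 4.0 * rttvar                  # timeout with variance headroom
    return srtt, rttvar, rto
```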
Despite the authors' attempt at generality in these algorithms (as they note, there were TCP implementations for systems ranging from 56 Kbit/s to hundreds of Mbit/s), there are some interesting assumptions about the environment. The most glaring one is that losses due to damage are rare, which seems a strange assumption when one considers microwave links and the like. It is hard to see, though, how this sort of congestion control (acting only at the ends, with no way to distinguish loss types) could function very well without assuming that at least most losses are due to congestion.
Wednesday, September 2, 2009
Understanding BGP Misconfiguration
Perhaps most remarkable about this paper is that most of these problems do not adversely affect connectivity. In this sense, BGP appears to have partially achieved the old goal of survivability, if only by accident. (Indeed, it seems to perform better than the registries' databases at providing information about how to reach networks.) The most serious effect seems to come from the added load these background misconfigurations place on the system and the more serious possibilities they reflect. They are reminders that configuration is not very automated or well understood, so future globally disruptive network events still seem quite possible.
Interdomain Internet Routing
Much of the interesting part of inter-network routing, how these routing decisions are made and which offers are made or withdrawn, is not part of BGP. Typically, these policies are determined by the economic relationship between the two networks. A provider wants to maximize utilization of links that make it money, so it has an incentive to advertise any available routes using such links. When deciding between multiple routes to the same destination, a provider will similarly be motivated to configure their software to prefer routes using links for which they are paid, and may then decide based on traditional “technical” criteria like path length. (Of course, explicit contractual arrangements also influence these decisions.)
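A toy Python sketch of that economically driven preference ordering; the relationship-to-LOCAL_PREF mapping, the field names, and the AS numbers are illustrative only, not taken from any real router configuration.

```python
from dataclasses import dataclass, field
from typing import List

# Higher LOCAL_PREF wins; these particular values are made up.
LOCAL_PREF = {"customer": 300, "peer": 200, "provider": 100}

@dataclass
class Route:
    learned_from: str               # "customer", "peer", or "provider"
    as_path: List[int] = field(default_factory=list)

def preference_key(route: Route):
    """Prefer routes that earn money (learned from customers), then peers,
    then providers; break ties on shorter AS path."""
    return (-LOCAL_PREF[route.learned_from], len(route.as_path))

def best_route(routes):
    return min(routes, key=preference_key)

# A shorter path through a provider still loses to a longer customer route.
print(best_route([Route("provider", [7018, 3356]),
                  Route("customer", [64512, 64513, 64514])]).learned_from)
```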
BGP does seem to have been designed for a smaller, friendlier Internet. The routing state and its rate of change are not trivial and are likely to get worse as more people sensibly want an Internet connection, or a backup one. As administered, there is a lot of trust, revealed through the poor filtering illustrated by hijacked blocks and the general lack of route authentication in the protocol. And, despite the flexibility it gives route announcers to choose their policy, some policies seem hard to enforce technically, such as a preference that peers use their own networks for cross-country or cross-sea transit. (Presumably, this could be prevented by advertising routes only through the preferred entry point, except that this would break reachability during glitches on the peer's network, so it is easier to contractually require honoring MED when needed?)
Monday, August 31, 2009
The Design Philosophy of the DARPA Internet Protocols
This article gives a design rationale for the DARPA Internet protocols. The primary motivation derives from connecting a variety of separately administered networks and supporting a variety of applications on top of them. The targeted applications tended to be non-jitter-sensitive ones with varying bandwidth requirements, so packet switching was chosen over circuit switching. Another important goal was tolerance of failures; fault-tolerance, combined with separate administration and the least common denominator of a variety of networks, led to the datagram model: nothing was to be assumed of the intermediate hosts in a connection, so all the state of the connection would be kept at its end points. The IP layer, then, provides exactly this service, on top of which end-user services would be built with modifications only at the end points (TCP being the initial example).
Probably the biggest tradeoff mentioned in the article is one of end-host complexity. Because the end hosts rely on such a minimalist network layer and must tolerate fail-stop failures of the network between them, implementing the protocols on the end hosts is not easy (compared to, e.g., network architectures that explicitly signal failures or flow-control limits).
Somehow, despite placing high importance on resilience to explicit attack, the design considerations mentioned fail to account for the presence of an adversary on the network who can cause failures that are not fail-stop. The article recognizes this as a loss of robustness:
"In this respect, the goal of robustness, which lead to the method of fate-sharing, which led to host-resident algorithms, contributes to a loss of robustness if the host misbehaves."
But the article fails to mention the apparent design assumption that the separately administered hosts are reasonably well-behaved.
It is interesting that the article describes the routers' lack of per-connection state as a robustness feature. At our current scale, it seems more like a scalability feature. Though the article talks about performance considerations (particularly the difficulty of specifying them to defense contractors) and mentions network-performance-sensitive activities like teleconferencing, performance generally and scalability in particular do not appear to have been major goals. (The closest among the enumerated goals is “[permitting] host attachment with a low level of effort”, which was observed to be not well achieved because of the protocol reimplementation effort.)