• Buffer size of router

    Data buffer

    Buffers are typically used when there is a difference between the rate at which data is received and the
    rate at which it can be processed, or in the case that these rates are variable.
    A buffer often adjusts timing by implementing a queue algorithm in memory, simultaneously writing
    data into the queue at one rate and reading it at another rate.
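
    As a rough illustration of the queue described above, here is a minimal Python sketch. The
    function name simulate_buffer, the fixed capacity, and the rates (3 items written and 2 read
    per tick) are assumptions made up for this example; they are not taken from any router.

```python
from collections import deque

def simulate_buffer(write_rate, read_rate, capacity, ticks):
    """Simulate a FIFO buffer that is written and read at different rates.

    Returns (queue occupancy after each tick, number of items dropped on overflow).
    """
    queue = deque()
    dropped = 0
    next_item = 0
    occupancy = []
    for _ in range(ticks):
        # Producer side: write `write_rate` items, dropping any that do not fit.
        for _ in range(write_rate):
            if len(queue) < capacity:
                queue.append(next_item)
            else:
                dropped += 1
            next_item += 1
        # Consumer side: read at most `read_rate` items from the head of the queue.
        for _ in range(min(read_rate, len(queue))):
            queue.popleft()
        occupancy.append(len(queue))
    return occupancy, dropped

occupancy, dropped = simulate_buffer(write_rate=3, read_rate=2, capacity=5, ticks=6)
print(occupancy, dropped)
```

    Because the writer is faster than the reader, the queue grows until it reaches its capacity,
    after which the excess items are dropped on every tick.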


    Bufferbloat is a phenomenon in a packet-switched computer network whereby excess buffering of packets
    inside the network causes high latency and jitter, as well as reducing the overall network throughput.
    This problem is caused mainly by router and switch manufacturers making incorrect assumptions and
    buffering packets for too long in cases where they should be dropped.
    On older routers, buffers were fairly small, so they filled quickly and packets began to drop shortly
    after the link became saturated.

    High-Speed Routers

    Loss can be reduced by making buffers large enough.
    Jitter: fluctuation in queueing delay, caused by fluctuation in buffer occupancy.
    The buffer size in core Internet routers is typically chosen according to a rule of thumb which says:
    provide at least one round trip time's worth of buffering.
    Recent theoretical work has challenged the rule of thumb: it seems that a buffer of just 20 packets
    should be sufficient.
    The buffer in an Internet router has several roles. It accommodates transient bursts in traffic,
    without having to drop packets. It keeps a reserve of packets, so that the link doesn't go idle.
    It also introduces queueing delay and jitter.
    The network model comprises two parts: one part describing the dynamics of TCP, another
    describing the dynamics of the queue. [1]

    Optimal choice of the buffer size

    Data packets of an Internet connection travel from a source node to a destination node via a
    series of routers. Some routers, particularly edge routers, experience periods of congestion
    when packets spend a non-negligible time waiting in the router buffers to be transmitted
    over the next hop.
    Congestion signals can be either packet losses or Explicit Congestion Notifications.
    At the present state of the Internet, nearly all congestion signals are generated by packet losses.
    Packets can be dropped either when the router buffer is full or when an Active Queue Management
    (AQM) scheme is employed. Because the choice of AQM parameters is ambiguous, AQM has so far
    rarely been used in practice. In basic Drop Tail routers, on the other hand, the buffer size
    is the only parameter to tune apart from the router capacity.
    The first proposed rule of thumb for the choice of the router buffer size was to choose the buffer
    size equal to the Bandwidth-Delay Product (BDP) of the outgoing link.
    It was observed that the utilization of a link improves very fast with the increase of the buffer size
    until a certain threshold value. After that threshold value the further increase of the buffer size
    does not improve the link utilization but increases the queueing delay. [2]
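
    The trade-off above can be made concrete with a small Python sketch. The link speed (1 Gbps)
    and round-trip time (250 ms) are assumed example values, and the function names are
    illustrative only. It shows that a buffer of one BDP can add up to one RTT of queueing delay
    when it is full.

```python
def bdp_bits(capacity_bps, rtt_s):
    """Bandwidth-Delay Product of the outgoing link, in bits."""
    return capacity_bps * rtt_s

def max_queueing_delay_s(buffer_bits, capacity_bps):
    """Time needed to drain a completely full buffer at line rate, in seconds."""
    return buffer_bits / capacity_bps

capacity_bps = 1e9   # assumed: 1 Gbps outgoing link
rtt_s = 0.25         # assumed: 250 ms round-trip time

buffer_bits = bdp_bits(capacity_bps, rtt_s)
delay_s = max_queueing_delay_s(buffer_bits, capacity_bps)
print(buffer_bits / 8e6, "MB of buffer,", delay_s, "s worst-case queueing delay")
```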

    Sizing Router Buffer

    A link with n flows requires no more than:

        B = (RTT * C) / sqrt(n)

    for long-lived or short-lived TCP flows, where C is the link capacity and RTT is the
    round-trip time.
    The consequences on router design are enormous: a 2.5Gbps link carrying 10,000 flows
    could reduce its buffer by 99% with negligible difference in throughput, and a 10Gbps link
    carrying 50,000 flows requires only 10Mbits of buffering, which can easily be implemented
    using fast, on-chip SRAM.
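
    The two examples can be checked with a short Python sketch of the RTT * C / sqrt(n) rule.
    The 250 ms round-trip time is an assumption (borrowed from the rule-of-thumb figure quoted
    in these notes), and the function names are illustrative.

```python
import math

def rule_of_thumb_bits(capacity_bps, rtt_s):
    """Classic rule of thumb: one round-trip time's worth of buffering."""
    return capacity_bps * rtt_s

def small_buffer_bits(capacity_bps, rtt_s, n_flows):
    """Buffer size RTT * C / sqrt(n) for a link carrying n flows."""
    return capacity_bps * rtt_s / math.sqrt(n_flows)

RTT_S = 0.250  # assumed round-trip time of 250 ms

old = rule_of_thumb_bits(2.5e9, RTT_S)
new = small_buffer_bits(2.5e9, RTT_S, 10_000)
reduction = 1 - new / old
print(f"2.5 Gbps, 10,000 flows: buffer reduced by {reduction:.0%}")

mbits = small_buffer_bits(10e9, RTT_S, 50_000) / 1e6
print(f"10 Gbps, 50,000 flows: {mbits:.1f} Mbit of buffering")
```

    Under this assumed RTT the second case comes out at roughly 11 Mbit, the same order of
    magnitude as the 10Mbits quoted above.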
    If these buffers fill up, they cause queueing delay and delay-variance; when they overflow they
    cause packet loss, and when they underflow they can degrade throughput.
    Router buffers are sized today based on a rule-of-thumb commonly attributed to a 1994 paper
    by Villamizar and Song. Network operators follow the rule-of-thumb and require that router
    manufacturers provide 250ms (or more) of buffering.



    a 10Gbps router linecard needs buffers of about 200,000 packets (300MB)
    a 1Gbps router linecard needs buffers of about 20,000 packets (30MB)
    a 100Mbps router linecard needs buffers of about 2,000 packets (3MB)
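
    These figures follow from 250 ms of buffering at line rate. A quick Python check, assuming
    1500-byte packets (the packet size is an assumption, not stated in these notes; the function
    name linecard_buffer is illustrative):

```python
def linecard_buffer(capacity_bps, buffering_s=0.250, packet_bytes=1500):
    """Return (buffer size in bytes, buffer size in packets) for a linecard."""
    size_bytes = capacity_bps * buffering_s / 8
    return size_bytes, size_bytes / packet_bytes

for rate_bps in (10e9, 1e9, 100e6):
    size_bytes, packets = linecard_buffer(rate_bps)
    print(f"{rate_bps / 1e9:g} Gbps: {size_bytes / 1e6:.1f} MB, about {packets:,.0f} packets")
```

    The exact values (for example about 312 MB and 208,000 packets at 10Gbps) round to the
    figures listed above.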


    Today (2005), backbone links commonly operate at 2.5Gbps or 10Gbps and carry over
    10,000 flows. We believe that significantly smaller buffers could be used in backbone routers
    (e.g., by removing 99% of the buffers) without a loss in network utilization.
    With declining memory prices, why not just overbuffer routers?
    First, it complicates the design of high-speed routers, leading to higher power consumption,
    more board space, and lower density. Second, overbuffering increases end-to-end delay in the
    presence of congestion. Large buffers conflict with the low-latency needs of real-time applications.


    We know from the central limit theorem that the aggregate window size does converge to a
    Gaussian process.

    The graph above shows the probability distribution of the sum of the congestion windows of all
    flows, with different propagation times and start times.
    How does the shape of the Gaussian, and thus our buffer, depend on the number of flows?
    If we have more flows, we would expect more statistical multiplexing and thus a narrower
    Gaussian. In fact, the central limit theorem tells us that if we increase the number of flows n,
    the width of the Gaussian (or, more formally, its standard deviation) should decrease with 1/sqrt(n).

    The role of the buffer is to absorb the fluctuation in the total window size. If the standard deviation
    of the total window size decreases with 1/sqrt(n), we would expect the required amount of buffer
    to do the same.
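
    The 1/sqrt(n) scaling can be illustrated with a small Monte Carlo sketch in Python. It rests
    on simplifying assumptions that are not part of these notes: the flows are independent, the
    total mean window is held fixed while n varies, and each flow's window is drawn uniformly
    between 2/3 and 4/3 of its mean (a crude stand-in for the TCP sawtooth). The function name
    and parameters are illustrative.

```python
import random
import statistics

def total_window_std(n_flows, total_mean=10_000.0, samples=2_000, seed=1):
    """Estimate the standard deviation of the sum of n independent flow windows."""
    rng = random.Random(seed)
    per_flow_mean = total_mean / n_flows
    # Assumed per-flow distribution: uniform between 2/3 and 4/3 of the mean.
    lo, hi = per_flow_mean * 2 / 3, per_flow_mean * 4 / 3
    totals = [
        sum(rng.uniform(lo, hi) for _ in range(n_flows))
        for _ in range(samples)
    ]
    return statistics.pstdev(totals)

std_100 = total_window_std(100)
std_400 = total_window_std(400)
print(std_100, std_400, std_100 / std_400)
```

    With four times as many flows, the standard deviation of the total window should come out at
    roughly half, as the 1/sqrt(n) argument predicts.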

    For short flows, the size of the buffer does not depend on the line-rate, the propagation delay of
    the flows, or the number of flows; it only depends on the load of the link and the length of the bursts.

    Our model predicts that for 98% utilization a buffer of RTT * C / sqrt(n) should be sufficient.
    Decreasing the latency of a TCP flow will always increase the loss rate. The loss rate of a TCP flow
    is a function of the flow's window size w and can be approximated as l = 0.76/w^2. If we reduce buffers,
    we decrease the RTT of the flow, and therefore decrease the average w and thus increase loss.
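
    A short Python sketch of the approximation l = 0.76/w^2 shows how quickly loss grows as the
    window shrinks. The window sizes used here are arbitrary example values, and the function
    name loss_rate is illustrative.

```python
def loss_rate(window_packets):
    """Approximate TCP loss rate for an average window of w packets: l = 0.76 / w^2."""
    return 0.76 / window_packets ** 2

for w in (40, 20, 10):
    print(f"w = {w:2d} packets -> loss rate of about {loss_rate(w):.4%}")
```

    Halving the average window quadruples the loss rate under this approximation.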


    The rule of thumb for buffer sizing derives from the following goal: If capacity is limited, it's
    desirable to make sure the link never goes idle. Therefore there should always be some
    packets in the buffer. By reasoning about how TCP responds to loss, we can work out how
    big the buffer needs to be.
    The buffer needs to be at least Wmax/2 to avoid going empty during the pause. The key to
    sizing the buffer is to make sure that while the sender pauses, the router buffer doesn't go
    empty and force the bottleneck link to go idle. (The link was originally full; after Wmax/2 of
    data has flowed out, the router needs to replenish at least Wmax/2 of data in order to keep the
    link fully utilized.) It turns out that this is equal to the distance (in bytes) between the
    peak and trough of the "sawtooth" representing the TCP window size.
    Bursts from short flows do have an effect. However, it is very small, and the buffer size is,
    in fact, dictated by the number of long flows.
    If the buffer never goes empty, the router must be sending packets onto the bottleneck link at
    constant rate C. This in turn means that ACKs arrive at the sender at rate C. The sender therefore
    pauses for exactly (Wmax/2)/C seconds for the Wmax/2 packets to be acknowledged. It then resumes
    sending, and starts increasing its window size again.
    The buffer will just avoid going empty if the first packet from the sender shows up at the buffer just as
    it hits empty, i.e., (Wmax/2)/C <= B/C or B >= Wmax/2.
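
    The condition (Wmax/2)/C <= B/C can be written as a small Python check. The units (packets
    and packets per second) and the numbers in the example calls are assumptions chosen for
    illustration, as is the function name buffer_stays_busy.

```python
def buffer_stays_busy(w_max, buffer_size, capacity):
    """Return True if a full buffer of size B outlasts the sender's pause.

    w_max and buffer_size are in packets; capacity C is in packets per second.
    """
    pause_s = (w_max / 2) / capacity   # sender waits for Wmax/2 packets to be ACKed
    drain_s = buffer_size / capacity   # time for a full buffer to empty at rate C
    return pause_s <= drain_s

print(buffer_stays_busy(w_max=200, buffer_size=100, capacity=1000))  # B = Wmax/2
print(buffer_stays_busy(w_max=200, buffer_size=80, capacity=1000))   # B < Wmax/2
```

    The first call sits exactly on the boundary B = Wmax/2, so the link just stays busy; in the
    second the buffer empties before the sender resumes and the link goes idle.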


    When diff >= beta, we judge the router to be congested.
    A 2.5Gbps link typically carries over 10000 flows at a time.
    Network operators follow the rule-of-thumb and require that router
    manufacturers provide 250ms (or more) of buffering. [3]


    Flows are not synchronized in a backbone router carrying thousands of flows with varying RTTs.
    Small variations in RTT or processing time are sufficient to prevent synchronization; and the absence
    of synchronization has been demonstrated in real networks. Likewise, we found in our simulations
    and experiments that while in-phase synchronization is common for under 100 concurrent flows,
    it is very rare above 500 concurrent flows.


