Dear readers, our apologies: this document is written in English; please excuse any inconvenience.
The code was written at a commercial company and ships in a commercial product, so it cannot be open sourced; again, our apologies.
This presentation will highlight our efforts on optimizing the
Linux TCP/IP stack for providing networking in an
OpenStack environment, as deployed at our industrial customers.
Our primary goal is to provide a high-quality and highly performant TCP/IP stack.
To achieve this, we had to identify the performance bottlenecks in
the Linux TCP/IP stack for networking in OpenStack. We performed extensive
Linux TCP/IP stack performance tuning, covering the NIC, CPU cache hit rate, spin locks,
memory allocation, and more. However, our measurements showed that conntrack NAT
consumes too much CPU, for example in the ipt_do_table function.
Linux conntrack is very capable, but it is too heavy for our use case, and many of its features go unused.
Instead, we implemented FAST NAT in the Linux TCP/IP stack.
We will present our efforts to reduce these performance costs.
First, FAST NAT protects the connection table with a spin lock per hash bucket instead of one global lock,
which greatly reduces CPU waiting time, and user policies are stored in a hash table rather than in a list.
Both the connection table and the user policy table are allocated per NUMA node, which avoids CPU time
wasted on cross-node QPI traffic and the extra latency it adds. A sketch of this layout follows.
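As an illustration only (the production code cannot be open sourced), the following kernel-style C sketch shows one possible shape of a per-NUMA connection table with per-bucket spin locks. The names fastnat_bucket, fastnat_tables, fastnat_tables_init and the table size are assumptions made for this sketch, not the real implementation:

    #include <linux/spinlock.h>
    #include <linux/list.h>
    #include <linux/slab.h>
    #include <linux/nodemask.h>
    #include <linux/topology.h>
    #include <linux/errno.h>

    #define FASTNAT_HASH_SIZE 4096          /* assumed size; power of two so the hash can be masked */

    struct fastnat_bucket {
        spinlock_t lock;                    /* per-bucket lock, no global lock */
        struct hlist_head chain;            /* entries hashing to this bucket */
    };

    /* one table per NUMA node, indexed by numa_node_id() on the fast path */
    static struct fastnat_bucket *fastnat_tables[MAX_NUMNODES];

    static int fastnat_tables_init(void)
    {
        int node, i;

        for_each_online_node(node) {
            /* allocate the table from memory local to this node */
            struct fastnat_bucket *tbl =
                kmalloc_node(FASTNAT_HASH_SIZE * sizeof(*tbl), GFP_KERNEL, node);

            if (!tbl)
                return -ENOMEM;             /* error unwinding omitted in this sketch */
            for (i = 0; i < FASTNAT_HASH_SIZE; i++) {
                spin_lock_init(&tbl[i].lock);
                INIT_HLIST_HEAD(&tbl[i].chain);
            }
            fastnat_tables[node] = tbl;
        }
        return 0;
    }

Because each table is allocated with kmalloc_node() from node-local memory, lookups performed by CPUs on that node stay off the QPI links.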
Second, FAST NAT does not track TCP state; it only records a tuple with the connection
information needed for NAT forwarding.
This removes many of the per-packet checks on the forwarding path.
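The per-connection record can therefore stay very small. The sketch below, with illustrative field names, keeps only the original tuple, the translated address and port, and an expiration deadline:

    #include <linux/types.h>
    #include <linux/list.h>

    struct fastnat_entry {
        struct hlist_node node;     /* linkage into a per-NUMA hash bucket */
        /* original tuple as seen on ingress */
        __be32 src_ip, dst_ip;
        __be16 src_port, dst_port;
        u8     proto;               /* IPPROTO_TCP or IPPROTO_UDP only */
        /* translated address/port applied when forwarding */
        __be32 nat_ip;
        __be16 nat_port;
        unsigned long expires;      /* jiffies; see the expiration notes below */
    };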
An entry in the connection table can be set to expire either at an absolute
expiration time or on a relative expiration time basis.
A relative expiration time is extended by every forwarded packet.
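Continuing the sketch above (the timeout value and helper names are again assumptions), relative expiry can be implemented by refreshing a jiffies-based deadline on each matched packet:

    #include <linux/jiffies.h>

    #define FASTNAT_REL_TIMEOUT (30 * HZ)   /* assumed idle timeout */

    /* called for every packet that matches a relative-expiry entry */
    static inline void fastnat_entry_refresh(struct fastnat_entry *e)
    {
        /* push the deadline forward; absolute-expiry entries skip this step */
        e->expires = jiffies + FASTNAT_REL_TIMEOUT;
    }

    static inline bool fastnat_entry_expired(const struct fastnat_entry *e)
    {
        return time_after(jiffies, e->expires);
    }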
To reduce locking, the per-NUMA connection tables are not synchronized with each other. As a result,
the two directions of a single TCP stream may end up in different per-NUMA connection tables. If we use an
Intel ixgbe NIC with Flow Director in ATR mode, the incoming and outgoing packets of a stream are steered
to the same queue index across the multiple queues, and the limitation above disappears.
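Putting the sketch together, the fast path looks up only the table of the local NUMA node; with ATR steering both directions of a flow to the same queue, both directions land on the same node and find the same entry. The hash function choice and the absence of reference counting below are simplifications:

    #include <linux/jhash.h>

    static struct fastnat_entry *fastnat_lookup(__be32 saddr, __be32 daddr,
                                                __be16 sport, __be16 dport, u8 proto)
    {
        /* each CPU consults only its own node's table: no cross-node locking */
        struct fastnat_bucket *tbl = fastnat_tables[numa_node_id()];
        u32 hash = jhash_3words((__force u32)saddr, (__force u32)daddr,
                                ((u32)(__force u16)sport << 16) | (__force u16)dport,
                                proto) & (FASTNAT_HASH_SIZE - 1);
        struct fastnat_bucket *b = &tbl[hash];
        struct fastnat_entry *e;

        spin_lock_bh(&b->lock);             /* per-bucket lock only */
        hlist_for_each_entry(e, &b->chain, node) {
            if (e->proto == proto &&
                e->src_ip == saddr && e->dst_ip == daddr &&
                e->src_port == sport && e->dst_port == dport &&
                !fastnat_entry_expired(e)) {
                spin_unlock_bh(&b->lock);
                return e;                   /* refcounting/RCU omitted in this sketch */
            }
        }
        spin_unlock_bh(&b->lock);
        return NULL;
    }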
A limitation of FAST NAT is that only TCP and UDP are supported.
Although some limitations exist, our work has paid off, resulting in a 15-20 percent improvement in packets per second (pps).