net: optimize csum_replace2()

When replacing one 16-bit value with another in an IP header, we can
adjust the IP checksum with the simple operation described in RFC 1624,
as David reminded me.

csum_partial() is a complex function on x86_64, not really suited to
checksumming a small number of bytes.

I spotted csum_partial() among the 20 most CPU-consuming functions
(more than 1%) in a GRO workload, which was rather unexpected.

The caller was inet_gro_complete(), which calls csum_replace2() when
building the new IP header for the GRO packet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet 2014-03-23 19:51:36 -07:00 committed by David S. Miller
parent 860b4042dd
commit 99f0b958b1

@@ -69,6 +69,19 @@ static inline __wsum csum_sub(__wsum csum, __wsum addend)
 	return csum_add(csum, ~addend);
 }
 
+static inline __sum16 csum16_add(__sum16 csum, __be16 addend)
+{
+	u16 res = (__force u16)csum;
+
+	res += (__force u16)addend;
+	return (__force __sum16)(res + (res < (__force u16)addend));
+}
+
+static inline __sum16 csum16_sub(__sum16 csum, __be16 addend)
+{
+	return csum16_add(csum, ~addend);
+}
+
 static inline __wsum
 csum_block_add(__wsum csum, __wsum csum2, int offset)
 {
@@ -112,9 +125,15 @@ static inline void csum_replace4(__sum16 *sum, __be32 from, __be32 to)
 	*sum = csum_fold(csum_partial(diff, sizeof(diff), ~csum_unfold(*sum)));
 }
 
-static inline void csum_replace2(__sum16 *sum, __be16 from, __be16 to)
+/* Implements RFC 1624 (Incremental Internet Checksum)
+ * 3. Discussion states :
+ *     HC' = ~(~HC + ~m + m')
+ *  m : old value of a 16bit field
+ *  m' : new value of a 16bit field
+ */
+static inline void csum_replace2(__sum16 *sum, __be16 old, __be16 new)
 {
-	csum_replace4(sum, (__force __be32)from, (__force __be32)to);
+	*sum = ~csum16_add(csum16_sub(~(*sum), old), new);
 }
 
 struct sk_buff;
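
The RFC 1624 update used by the new csum_replace2() can be checked outside the kernel. The following is a minimal userspace sketch, not the kernel code itself: it assumes plain uint16_t in place of __sum16/__be16, host-order 16-bit words (the one's-complement sum does not depend on byte order), and a parameter named new_val instead of new. The helpers csum_full() and rfc1624_demo() and the sample header are hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement add: fold the carry-out back into the low bit. */
static uint16_t csum16_add(uint16_t csum, uint16_t addend)
{
	uint16_t res = csum;

	res += addend;
	return (uint16_t)(res + (res < addend));
}

/* Subtracting m in one's-complement arithmetic is adding ~m. */
static uint16_t csum16_sub(uint16_t csum, uint16_t addend)
{
	return csum16_add(csum, (uint16_t)~addend);
}

/* RFC 1624: HC' = ~(~HC + ~m + m') */
static void csum_replace2(uint16_t *sum, uint16_t old, uint16_t new_val)
{
	*sum = (uint16_t)~csum16_add(csum16_sub((uint16_t)~*sum, old), new_val);
}

/* Reference (hypothetical helper): full checksum over n 16-bit words. */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t acc = 0;
	size_t i;

	for (i = 0; i < n; i++)
		acc += words[i];
	while (acc >> 16)
		acc = (acc & 0xffff) + (acc >> 16);
	return (uint16_t)~acc;
}

/* Returns 1 when the incremental update matches a full recomputation. */
static int rfc1624_demo(void)
{
	/* Sample 20-byte IPv4 header as 16-bit words; word 5 is the checksum. */
	uint16_t hdr[10] = {
		0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
		0x0000, 0xc0a8, 0x0001, 0xc0a8, 0x00c7,
	};
	uint16_t incremental, recomputed;

	hdr[5] = csum_full(hdr, 10);
	incremental = hdr[5];

	/* Change the total-length word, as GRO does when merging segments. */
	csum_replace2(&incremental, hdr[1], 0x0100);
	hdr[1] = 0x0100;

	hdr[5] = 0;
	recomputed = csum_full(hdr, 10);
	return incremental == recomputed;
}
```

The incremental path touches only three 16-bit values, whatever the header length, which is why it avoids the fixed overhead of csum_partial() for a two-byte change.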