linux.git/include/net/ipv6.h, branch v3.10.100

ipv6: distinguish frag queues by device for multicast and link-local packets

2016-01-23T03:47:52+00:00

[ Upstream commit 264640fc2c5f4f913db5c73fa3eb1ead2c45e9d7 ]

If a fragmented multicast packet is received on an ethernet device which
has an active macvlan on top of it, each fragment is duplicated and
received both on the underlying device and the macvlan. If some
fragments for macvlan are processed before the whole packet for the
underlying device is reassembled, the "overlapping fragments" test in
ip6_frag_queue() discards the whole fragment queue.

To resolve this, add the device ifindex to the search key and require it
to match when reassembling multicast packets and packets to link-local
addresses.
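
The matching rule described above can be sketched in userspace C as
follows. This is a minimal illustration, not the kernel code: struct
frag_key, dst_is_device_scoped() and frag_key_match() are hypothetical
names, and only the ifindex rule is taken from the message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical lookup key; field names are illustrative, not the kernel's. */
struct frag_key {
    uint32_t id;            /* fragment identification */
    struct in6_addr src;
    struct in6_addr dst;
    int iif;                /* ifindex of the receiving device */
};

/* True if the destination makes the receiving device significant. */
static bool dst_is_device_scoped(const struct in6_addr *dst)
{
    return IN6_IS_ADDR_MULTICAST(dst) || IN6_IS_ADDR_LINKLOCAL(dst);
}

/* Compare an existing queue key q against the key of an incoming fragment. */
static bool frag_key_match(const struct frag_key *q, const struct frag_key *arg)
{
    if (q->id != arg->id ||
        memcmp(&q->src, &arg->src, sizeof(q->src)) != 0 ||
        memcmp(&q->dst, &arg->dst, sizeof(q->dst)) != 0)
        return false;
    /* The device must match only for multicast and link-local destinations. */
    return q->iif == arg->iif || !dst_is_device_scoped(&arg->dst);
}
```

With this rule, copies of a multicast fragment seen on the underlying
device and on the macvlan land in separate queues, while fragments to
other destinations still share one queue regardless of the device.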

Note: a similar patch was already submitted by Yoshifuji Hideaki in

  http://patchwork.ozlabs.org/patch/220979/

but got lost and forgotten for some reason.

Signed-off-by: Michal Kubecek 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

inetpeer: get rid of ip_id_count

2014-08-14T01:24:15+00:00

[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

Linux kernels used the inet_peer cache for this purpose, but this had a
huge cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If the server deals with many TCP flows, we have a high probability
   of not finding the inet_peer, allocating a fresh one, and inserting it
   in the tree with the same initial ip_id_count (cf. secure_ip_id()).

5) We garbage collect inet_peer aggressively.

IP ID generation does not have to be 'perfect'.

The goal is to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkins hash using the dst
IP as a key.
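
A minimal userspace sketch of the idea, under stated assumptions: the
table size, the salt parameter and the names idents, jenkins_oaat() and
select_ident_segs() are illustrative, and the Jenkins one-at-a-time
variant is used here as a stand-in for whichever Jenkins hash the kernel
calls.

```c
#include <stdatomic.h>
#include <stdint.h>

#define IDENTS_SZ 2048u   /* assumed table size (power of two) for this sketch */

/* One ID generator per hash slot; zero-initialized at program start. */
static _Atomic uint32_t idents[IDENTS_SZ];

/* Jenkins one-at-a-time hash over the 4 bytes of dst, seeded with a salt. */
static uint32_t jenkins_oaat(uint32_t dst, uint32_t salt)
{
    uint32_t h = salt;

    for (int i = 0; i < 4; i++) {
        h += (dst >> (8 * i)) & 0xffu;
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/*
 * Reserve `segs` consecutive IDs from the generator selected by dst and
 * return the first one, truncated to the 16-bit IP ID width.
 */
static uint16_t select_ident_segs(uint32_t dst, uint32_t salt, uint32_t segs)
{
    uint32_t slot = jenkins_oaat(dst, salt) & (IDENTS_SZ - 1);

    return (uint16_t)atomic_fetch_add(&idents[slot], segs);
}
```

Destinations that hash to the same slot share a generator, so IDs are
not unique per destination; the sketch only aims at avoiding duplicates
within a short period, as the message describes.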

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() are no longer needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions

2013-12-08T15:29:25+00:00

[ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ]

Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.

As this no longer happens, we have to pass addr_len to those functions
as well and set it to the size of the corresponding sockaddr.
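
The calling convention described above can be sketched as follows.
demo_recv_error() is a hypothetical userspace stand-in, not one of the
kernel functions; it only shows addr_len being passed in and set when
the name buffer is written.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for an IPv6 recv_error-style helper. */
static int demo_recv_error(struct msghdr *msg, int *addr_len)
{
    struct sockaddr_in6 *sin6 = msg->msg_name;

    if (sin6) {
        memset(sin6, 0, sizeof(*sin6));
        sin6->sin6_family = AF_INET6;
        /* Report the length only when the name buffer was actually written. */
        *addr_len = sizeof(*sin6);
    }
    return 0;
}
```

A caller that passes no msg_name buffer keeps its addr_len untouched,
so no stale length is reported back.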

The earlier commit broke traceroute and similar tools.

Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
Reported-by: Brad Spengler 
Reported-by: Tom Labanowski
Cc: mpb 
Cc: David S. Miller 
Cc: Eric Dumazet 
Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipv6: implement RFC3168 5.3 (ecn protection) for ipv6 fragmentation handling

2013-03-24T21:16:30+00:00

Once patch 1 is accepted to net-next, a patch will also be sent to
netfilter-devel to make the corresponding changes to the netfilter
reassembly logic.

This patch also ensures that INET_ECN_CE is propagated if one fragment
had the codepoint set.
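
A minimal sketch of reassembly-time ECN handling in the spirit of
RFC 3168 section 5.3. The names ecn_seen_add(), ecn_reassembled() and
ECN_DROP are hypothetical. Two choices are assumptions of this sketch:
dropping any mix of Not-ECT with other codepoints, and reporting ECT(1)
when both ECT codepoints were seen.

```c
#include <stdint.h>

/* ECN codepoints as they appear in the two low bits of the traffic class. */
enum { ECN_NOT_ECT = 0, ECN_ECT_1 = 1, ECN_ECT_0 = 2, ECN_CE = 3 };

#define ECN_DROP 0xff   /* sentinel chosen for this sketch */

/* Record one fragment's codepoint in a 4-bit "seen" mask. */
static uint8_t ecn_seen_add(uint8_t seen, uint8_t codepoint)
{
    return (uint8_t)(seen | (1u << (codepoint & 3u)));
}

/* Derive the codepoint of the reassembled packet, or ECN_DROP. */
static uint8_t ecn_reassembled(uint8_t seen)
{
    uint8_t not_ect = seen & (1u << ECN_NOT_ECT);
    uint8_t others = seen & (uint8_t)~(1u << ECN_NOT_ECT);

    /* Not-ECT mixed with any ECN-capable codepoint: drop (sketch choice). */
    if (not_ect && others)
        return ECN_DROP;
    /* CE on any fragment is propagated to the whole packet. */
    if (seen & (1u << ECN_CE))
        return ECN_CE;
    if (seen & (1u << ECN_ECT_1))
        return ECN_ECT_1;
    if (seen & (1u << ECN_ECT_0))
        return ECN_ECT_0;
    return ECN_NOT_ECT;
}
```

The mask is updated once per received fragment, and the final codepoint
is computed only when the packet is complete.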

Cc: Eric Dumazet 
Cc: Jesper Dangaard Brouer 
Cc: YOSHIFUJI Hideaki 
Signed-off-by: Hannes Frederic Sowa 
Acked-by: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller

ipv6: introduce __ipv6_addr_needs_scope_id and ipv6_iface_scope_id helper functions

2013-03-08T17:29:22+00:00

__ipv6_addr_needs_scope_id checks if an ipv6 address needs to supply
a 'sin6_scope_id != 0'. 'sin6_scope_id != 0' was enforced in case
of link-local addresses. To support interface-local multicast these
checks had to be enhanced and are now consolidated into these new helper
functions.

v2:
a) migrated to struct ipv6_addr_props

v3:
a) reverted changes for ipv6_addr_props
b) test for address type instead of comparing scope

v4:
a) unchanged
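
The address-type test can be approximated in userspace with the
standard netinet/in.h macros. addr_needs_scope_id() is a hypothetical
name, and the exact set of address types is an assumption drawn from
the description above (link-local addresses plus interface-local
multicast), not a copy of the kernel helper.

```c
#include <stdbool.h>
#include <netinet/in.h>

/*
 * True if the address is only meaningful together with an interface,
 * i.e. the caller should supply a non-zero sin6_scope_id.
 */
static bool addr_needs_scope_id(const struct in6_addr *a)
{
    return IN6_IS_ADDR_LINKLOCAL(a) ||      /* fe80::/10 */
           IN6_IS_ADDR_MC_LINKLOCAL(a) ||   /* multicast, link-local scope */
           IN6_IS_ADDR_MC_NODELOCAL(a);     /* multicast, interface-local scope */
}
```

Global unicast addresses and wider-scope multicast addresses return
false here, so a zero scope id stays acceptable for them.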

Suggested-by: YOSHIFUJI Hideaki 
Cc: YOSHIFUJI Hideaki 
Acked-by: YOSHIFUJI Hideaki 
Signed-off-by: Hannes Frederic Sowa 
Acked-by: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller

ipv6 flowlabel: add __rcu annotations

2013-03-07T21:33:10+00:00

Commit 18367681a10b (ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.)
omitted proper __rcu annotations.

Signed-off-by: Eric Dumazet 
Cc: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller

ipv6: use a stronger hash for tcp

2013-02-21T23:15:58+00:00

It looks like it's possible to open thousands of TCP IPv6
sessions on a server, all landing in a single slot of the TCP hash
table. Incoming packets then have to look up sockets in a very
long list.

We should hash all bits from foreign IPv6 addresses, using
a salt and hash mix, not a simple XOR.

inet6_ehashfn() can also separately use the ports, instead
of xoring them.
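
To illustrate why a plain XOR is weak, the sketch below contrasts an
XOR fold of the four address words with a salted hash over all sixteen
bytes. Both functions are hypothetical; the Jenkins one-at-a-time mix
is an assumed stand-in and not the hash used by inet6_ehashfn().

```c
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Weak fold: XOR of the four 32-bit words; word order does not matter. */
static uint32_t xor_fold(const struct in6_addr *a)
{
    uint32_t w[4];

    memcpy(w, a->s6_addr, sizeof(w));
    return w[0] ^ w[1] ^ w[2] ^ w[3];
}

/* Salted Jenkins one-at-a-time hash over all 16 address bytes. */
static uint32_t salted_hash(const struct in6_addr *a, uint32_t salt)
{
    uint32_t h = salt;

    for (size_t i = 0; i < sizeof(a->s6_addr); i++) {
        h += a->s6_addr[i];
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}
```

With xor_fold(), a remote peer can produce many addresses that map to
the same value simply by permuting or pairing words; a secret salt and
a mixing step make such collisions hard to construct on purpose.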

Reported-by: Neal Cardwell 
Signed-off-by: Eric Dumazet 
Cc: Yuchung Cheng 
Signed-off-by: David S. Miller

ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.

2013-01-31T03:41:13+00:00

Signed-off-by: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller

ipv6 flowlabel: Convert hash list to RCU.

2013-01-31T03:41:13+00:00

Signed-off-by: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller

net: frag helper functions for mem limit tracking

2013-01-29T18:36:24+00:00

This change is primarily a preparation to ease the extension of memory
limit tracking.

The change does reduce the number of atomic operations during freeing
of a frag queue.  This introduces some performance improvement, as
these atomic operations are at the core of the performance problems
seen on NUMA systems.
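
A minimal userspace sketch of the pattern, using C11 atomics. The
struct and helper names are illustrative assumptions; the point shown
is that sizes are summed locally while freeing, so the shared counter
sees one atomic subtraction instead of one per fragment.

```c
#include <stdatomic.h>

/* Hypothetical memory accounting counter shared by all frag queues. */
struct frag_mem {
    atomic_int mem;
};

static int frag_mem_limit(struct frag_mem *nf)
{
    return atomic_load(&nf->mem);
}

static void add_frag_mem_limit(struct frag_mem *nf, int i)
{
    atomic_fetch_add(&nf->mem, i);
}

static void sub_frag_mem_limit(struct frag_mem *nf, int i)
{
    atomic_fetch_sub(&nf->mem, i);
}

/* Free a queue: sum the sizes locally, then do one atomic subtraction. */
static void free_queue(struct frag_mem *nf, const int *truesize, int n,
                       int queue_size)
{
    int sum = queue_size;

    for (int k = 0; k < n; k++)
        sum += truesize[k];
    sub_frag_mem_limit(nf, sum);
}
```

Keeping all accesses behind helpers also means a later change to how
the limit is tracked only has to touch these few functions.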

Signed-off-by: Jesper Dangaard Brouer 
Signed-off-by: David S. Miller