linux.git/net/ipv4/igmp.c, branch v4.3

IGMP: Inhibit reports for local multicast groups

2015-08-28T20:28:47+00:00

The range of addresses between 224.0.0.0 and 224.0.0.255 inclusive, is
reserved for the use of routing protocols and other low-level topology
discovery or maintenance protocols, such as gateway discovery and
group membership reporting.  Multicast routers should not forward any
multicast datagram with destination addresses in this range,
regardless of its TTL.

Currently, IGMP reports are generated for this reserved range of
addresses even though a router will ignore this information since it
has no purpose.  However, the presence of reserved group addresses in
an IGMP membership report uses up network bandwidth and can also
obscure addresses of interest when inspecting membership reports using
packet inspection or debug messages.

Although the RFCs for the various version of IGMP (e.g.RFC 3376 for
v3) do not specify that the reserved addresses be excluded from
membership reports, it should do no harm in doing so.  In particular
there should be no adverse effect in any IGMP snooping functionality
since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
Snooping Switches Considerations) section 2.1.2. Data Forwarding
Rules:

    2) Packets with a destination IP (DIP) address in the 224.0.0.X
       range which are not IGMP must be forwarded on all ports.

IGMP reports for local multicast groups can now be optionally
inhibited by means of a system control variable (by setting the value
to zero) e.g.:
    echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports

To retain backwards compatibility the previous behaviour is retained
by default on system boot or reverted by setting the value back to
non-zero e.g.:
    echo 1 >  /proc/sys/net/ipv4/igmp_link_local_mcast_reports

Signed-off-by: Philip Downey 
Signed-off-by: David S. Miller

net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code

2015-08-14T00:08:39+00:00

The recent refactoring of the IGMP and MLD parsing code into
ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
BUG() invocation for bridges:

I wrongly assumed that skb_get() could be used as a simple reference
counter for an skb which is not the case. skb_get() bears additional
semantics, a user count. This leads to a BUG() invocation in
pskb_expand_head() / kernel panic if pskb_may_pull() is called on an skb
with a user count greater than one - unfortunately the refactoring did
just that.

Fixing this by removing the skb_get() call and changing the API: The
caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
additionally check whether the returned skb_trimmed is a clone.

Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Brenden Blanco 
Signed-off-by: Linus Lüssing 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

net: Export IGMP/MLD message validation code

2015-05-04T18:49:23+00:00

With this patch, the IGMP and MLD message validation functions are moved
from the bridge code to IPv4/IPv6 multicast files. Some small
refactoring was done to enhance readibility and to iron out some
differences in behaviour between the IGMP and MLD parsing code (e.g. the
skb-cloning of MLD messages is now only done if necessary, just like the
IGMP part always did).

Finally, these IGMP and MLD message validation functions are exported so
that not only the bridge can use it but batman-adv later, too.

Signed-off-by: Linus Lüssing 
Signed-off-by: David S. Miller

ipv4: coding style: comparison for inequality with NULL

2015-04-03T16:11:15+00:00

The ipv4 code uses a mixture of coding styles. In some instances check
for non-NULL pointer is done as x != NULL and sometimes as x. x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris 
Signed-off-by: David S. Miller

ipv4: coding style: comparison for equality with NULL

2015-04-03T16:11:15+00:00

The ipv4 code uses a mixture of coding styles. In some instances check
for NULL pointer is done as x == NULL and sometimes as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris 
Signed-off-by: David S. Miller

ipv4: hash net ptr into fragmentation bucket selection

2015-03-25T18:07:04+00:00

As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.

Cc: Eric Dumazet 
Cc: Flavio Leitner 
Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop}

2015-03-19T02:05:09+00:00

in favor of their inner __ ones, which doesn't grab rtnl.

As these functions need to operate on a locked socket, we can't be
grabbing rtnl by then. It's too late and doing so causes reversed
locking.

So this patch:
- move rtnl handling to callers instead while already fixing some
  reversed locking situations, like on vxlan and ipvs code.
- renames __ ones to not have the __ mark:
  __ip_mc_{join,leave}_group -> ip_mc_{join,leave}_group
  __ipv6_sock_mc_{join,drop} -> ipv6_sock_mc_{join,drop}

Signed-off-by: Marcelo Ricardo Leitner 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

multicast: Extend ip address command to enable multicast group join/leave on

2015-02-27T21:25:25+00:00

Joining multicast group on ethernet level via "ip maddr" command would
not work if we have an Ethernet switch that does igmp snooping since
the switch would not replicate multicast packets on ports that did not
have IGMP reports for the multicast addresses.

Linux vxlan interfaces created via "ip link add vxlan" have the group option
that enables then to do the required join.

By extending ip address command with option "autojoin" we can get similar
functionality for openvswitch vxlan interfaces as well as other tunneling
mechanisms that need to receive multicast traffic. The kernel code is
structured similar to how the vxlan driver does a group join / leave.

example:
ip address add 224.1.1.10/24 dev eth5 autojoin
ip address del 224.1.1.10/24 dev eth5

Signed-off-by: Madhu Challa 
Signed-off-by: David S. Miller

igmp: add __ip_mc_{join|leave}_group()

2015-02-20T20:24:04+00:00

There is a need to perform igmp join/leave operations while RTNL is
held.

Make ip_mc_{join|leave}_group() wrappers around
__ip_mc_{join|leave}_group() to avoid the proliferation of work queues.

For example, vxlan_igmp_join() could possibly be removed.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller

ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs

2014-11-06T03:12:30+00:00

It has been reported that generating an MLD listener report on
devices with large MTUs (e.g. 9000) and a high number of IPv6
addresses can trigger a skb_over_panic():

skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20
head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0
dev:port1
 ------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:100!
invalid opcode: 0000 [#1] SMP
Modules linked in: ixgbe(O)
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4
[...]
Call Trace:
 
 [] ? skb_put+0x3a/0x3b
 [] ? add_grhead+0x45/0x8e
 [] ? add_grec+0x394/0x3d4
 [] ? mld_ifc_timer_expire+0x195/0x20d
 [] ? mld_dad_timer_expire+0x45/0x45
 [] ? call_timer_fn.isra.29+0x12/0x68
 [] ? run_timer_softirq+0x163/0x182
 [] ? __do_softirq+0xe0/0x21d
 [] ? irq_exit+0x4e/0xd3
 [] ? smp_apic_timer_interrupt+0x3b/0x46
 [] ? apic_timer_interrupt+0x6a/0x70

mld_newpack() skb allocations are usually requested with dev->mtu
in size, since commit 72e09ad107e7 ("ipv6: avoid high order allocations")
we have changed the limit in order to be less likely to fail.

However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
macros, which determine if we may end up doing an skb_put() for
adding another record. To avoid possible fragmentation, we check
the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
assumption as the actual max allocation size can be much smaller.

The IGMP case doesn't have this issue as commit 57e1ab6eaddc
("igmp: refine skb allocations") stores the allocation size in
the cb[].

Set a reserved_tailroom to make it fit into the MTU and use
skb_availroom() helper instead. This also allows to get rid of
igmp_skb_size().

Reported-by: Wei Liu 
Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations")
Signed-off-by: Daniel Borkmann 
Cc: Eric Dumazet 
Cc: Hannes Frederic Sowa 
Cc: David L Stevens 
Acked-by: Eric Dumazet 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller