<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/core/dev.c, branch v3.15</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>net: fix wrong mac_len calculation for vlans</title>
<updated>2014-06-02T02:39:13+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@redhat.com</email>
</author>
<published>2014-05-28T16:03:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=4b9b1cdf83c4facba89e0646aeac8ead679851b8'/>
<id>4b9b1cdf83c4facba89e0646aeac8ead679851b8</id>
<content type='text'>
After 1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol") skb-&gt;mac_len is used as a start of the
calculation in skb_network_protocol() but that is not always correct. If
skb-&gt;protocol == 8021Q/AD, usually the vlan header is already inserted
in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters
dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the
destination mac address (skb-&gt;data + VLAN_HLEN) as a type in
skb_network_protocol() and return vlan_depth == 4. In the case where TSO is
off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so
skb_network_protocol() returns a type from inside the packet and
offset == 22. Also make vlan_depth unsigned as suggested before.
As suggested by Eric Dumazet, move the while() loop in the if() so we
can avoid additional testing in fast path.

Here are few netperf tests + debug printk's to illustrate:
cat netperf.tso-on.reorder-on.bugged
- Vlan -&gt; device (reorder on, default, this case is okay)
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    7111.54
[   81.605435] skb-&gt;len 65226 skb-&gt;gso_size 1448 skb-&gt;proto 0x800
skb-&gt;mac_len 0 vlan_depth 0 type 0x800

- Vlan -&gt; device (reorder off, bad)
cat netperf.tso-on.reorder-off.bugged
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00     241.35
[  204.578332] skb-&gt;len 1518 skb-&gt;gso_size 0 skb-&gt;proto 0x8100
skb-&gt;mac_len 0 vlan_depth 4 type 0x5301
0x5301 are the last two bytes of the destination mac.

And if we stop TSO, we may get even the following:
[   83.343156] skb-&gt;len 2966 skb-&gt;gso_size 1448 skb-&gt;proto 0x8100
skb-&gt;mac_len 18 vlan_depth 22 type 0xb84
Because mac_len already accounts for VLAN_HLEN.

After the fix:
cat netperf.tso-on.reorder-off.fixed
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.01    5001.46
[   81.888489] skb-&gt;len 65230 skb-&gt;gso_size 1448 skb-&gt;proto 0x8100
skb-&gt;mac_len 0 vlan_depth 18 type 0x800

CC: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
CC: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Daniel Borkman &lt;dborkman@redhat.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;

Fixes:1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After 1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol") skb-&gt;mac_len is used as a start of the
calculation in skb_network_protocol() but that is not always correct. If
skb-&gt;protocol == 8021Q/AD, usually the vlan header is already inserted
in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters
dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the
destination mac address (skb-&gt;data + VLAN_HLEN) as a type in
skb_network_protocol() and return vlan_depth == 4. In the case where TSO is
off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so
skb_network_protocol() returns a type from inside the packet and
offset == 22. Also make vlan_depth unsigned as suggested before.
As suggested by Eric Dumazet, move the while() loop in the if() so we
can avoid additional testing in fast path.

Here are few netperf tests + debug printk's to illustrate:
cat netperf.tso-on.reorder-on.bugged
- Vlan -&gt; device (reorder on, default, this case is okay)
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    7111.54
[   81.605435] skb-&gt;len 65226 skb-&gt;gso_size 1448 skb-&gt;proto 0x800
skb-&gt;mac_len 0 vlan_depth 0 type 0x800

- Vlan -&gt; device (reorder off, bad)
cat netperf.tso-on.reorder-off.bugged
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00     241.35
[  204.578332] skb-&gt;len 1518 skb-&gt;gso_size 0 skb-&gt;proto 0x8100
skb-&gt;mac_len 0 vlan_depth 4 type 0x5301
0x5301 are the last two bytes of the destination mac.

And if we stop TSO, we may get even the following:
[   83.343156] skb-&gt;len 2966 skb-&gt;gso_size 1448 skb-&gt;proto 0x8100
skb-&gt;mac_len 18 vlan_depth 22 type 0xb84
Because mac_len already accounts for VLAN_HLEN.

After the fix:
cat netperf.tso-on.reorder-off.fixed
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.01    5001.46
[   81.888489] skb-&gt;len 65230 skb-&gt;gso_size 1448 skb-&gt;proto 0x8100
skb-&gt;mac_len 0 vlan_depth 18 type 0x800

CC: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
CC: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Daniel Borkman &lt;dborkman@redhat.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;

Fixes:1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: Fix stacked device detection in arp monitoring</title>
<updated>2014-05-17T02:29:05+00:00</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-05-16T21:20:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=44a4085538c844e79d6ee6bcf46fabf7c57a9a38'/>
<id>44a4085538c844e79d6ee6bcf46fabf7c57a9a38</id>
<content type='text'>
Prior to commit fbd929f2dce460456807a51e18d623db3db9f077
	bonding: support QinQ for bond arp interval

the arp monitoring code allowed for proper detection of devices
stacked on top of vlans.  Since the above commit, the
code can still detect a device stacked on top of single
vlan, but not a device stacked on top of Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device.  However, this is not always the
case, as it is possible to extend the stacked configuration.

With this patch it is possible to provision devices on
top Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.

For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan

Note:  This patch limites the number of stacked VLANs to 2,
just like before.  The original, however had another issue
in that if we had more then 2 levels of VLANs, we would end
up generating incorrectly tagged traffic.  This is no longer
possible.

Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
CC: Veaceslav Falico &lt;vfalico@redhat.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
CC: Patric McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Prior to commit fbd929f2dce460456807a51e18d623db3db9f077
	bonding: support QinQ for bond arp interval

the arp monitoring code allowed for proper detection of devices
stacked on top of vlans.  Since the above commit, the
code can still detect a device stacked on top of single
vlan, but not a device stacked on top of Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device.  However, this is not always the
case, as it is possible to extend the stacked configuration.

With this patch it is possible to provision devices on
top Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.

For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan

Note:  This patch limites the number of stacked VLANs to 2,
just like before.  The original, however had another issue
in that if we had more then 2 levels of VLANs, we would end
up generating incorrectly tagged traffic.  This is no longer
possible.

Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
CC: Veaceslav Falico &lt;vfalico@redhat.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
CC: Patric McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vlan: Fix lockdep warning with stacked vlan devices.</title>
<updated>2014-05-17T02:14:49+00:00</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-05-16T21:04:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d38569ab2bba6e6b3233acfc3a84cdbcfbd1f79f'/>
<id>d38569ab2bba6e6b3233acfc3a84cdbcfbd1f79f</id>
<content type='text'>
This reverts commit dc8eaaa006350d24030502a4521542e74b5cb39f.
	vlan: Fix lockdep warning when vlan dev handle notification

Instead we use the new new API to find the lock subclass of
our vlan device.  This way we can support configurations where
vlans are interspersed with other devices:
  bond -&gt; vlan -&gt; macvlan -&gt; vlan

Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit dc8eaaa006350d24030502a4521542e74b5cb39f.
	vlan: Fix lockdep warning when vlan dev handle notification

Instead we use the new new API to find the lock subclass of
our vlan device.  This way we can support configurations where
vlans are interspersed with other devices:
  bond -&gt; vlan -&gt; macvlan -&gt; vlan

Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Find the nesting level of a given device by type.</title>
<updated>2014-05-17T02:14:49+00:00</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-05-16T21:04:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=4085ebe8c31face855fd01ee40372cb4aab1df3a'/>
<id>4085ebe8c31face855fd01ee40372cb4aab1df3a</id>
<content type='text'>
Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.

For example:
  eth0 &lt;- vlan0.10 &lt;- macvlan0 &lt;- vlan1.20

The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.

Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.

For example:
  eth0 &lt;- vlan0.10 &lt;- macvlan0 &lt;- vlan1.20

The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.

Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: gro: make sure skb-&gt;cb[] initial content has not to be zero</title>
<updated>2014-05-16T21:24:54+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-05-16T18:34:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=29e98242783ed3ba569797846a606ba66f781625'/>
<id>29e98242783ed3ba569797846a606ba66f781625</id>
<content type='text'>
Starting from linux-3.13, GRO attempts to build full size skbs.

Problem is the commit assumed one particular field in skb-&gt;cb[]
was clean, but it is not the case on some stacked devices.

Timo reported a crash in case traffic is decrypted before
reaching a GRE device.

Fix this by initializing NAPI_GRO_CB(skb)-&gt;last at the right place,
this also removes one conditional.

Thanks a lot to Timo for providing full reports and bisecting this.

Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb")
Bisected-by: Timo Teras &lt;timo.teras@iki.fi&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Tested-by: Timo Teräs &lt;timo.teras@iki.fi&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Starting from linux-3.13, GRO attempts to build full size skbs.

Problem is the commit assumed one particular field in skb-&gt;cb[]
was clean, but it is not the case on some stacked devices.

Timo reported a crash in case traffic is decrypted before
reaching a GRE device.

Fix this by initializing NAPI_GRO_CB(skb)-&gt;last at the right place,
this also removes one conditional.

Thanks a lot to Timo for providing full reports and bisecting this.

Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb")
Bisected-by: Timo Teras &lt;timo.teras@iki.fi&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Tested-by: Timo Teräs &lt;timo.teras@iki.fi&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rtnetlink: wait for unregistering devices in rtnl_link_unregister()</title>
<updated>2014-05-15T19:30:33+00:00</updated>
<author>
<name>Cong Wang</name>
<email>cwang@twopensource.com</email>
</author>
<published>2014-05-12T22:11:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=200b916f3575bdf11609cb447661b8d5957b0bbf'/>
<id>200b916f3575bdf11609cb447661b8d5957b0bbf</id>
<content type='text'>
From: Cong Wang &lt;cwang@twopensource.com&gt;

commit 50624c934db18ab90 (net: Delay default_device_exit_batch until no
devices are unregistering) introduced rtnl_lock_unregistering() for
default_device_exit_batch(). Same race could happen we when rmmod a driver
which calls rtnl_link_unregister() as we call dev-&gt;destructor without rtnl
lock.

For long term, I think we should clean up the mess of netdev_run_todo()
and net namespce exit code.

Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
From: Cong Wang &lt;cwang@twopensource.com&gt;

commit 50624c934db18ab90 (net: Delay default_device_exit_batch until no
devices are unregistering) introduced rtnl_lock_unregistering() for
default_device_exit_batch(). Same race could happen we when rmmod a driver
which calls rtnl_link_unregister() as we call dev-&gt;destructor without rtnl
lock.

For long term, I think we should clean up the mess of netdev_run_todo()
and net namespce exit code.

Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "net: core: introduce netif_skb_dev_features"</title>
<updated>2014-05-07T19:49:07+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2014-05-05T13:00:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c1e756bfcbcac838a86a23f3e4501b556a961e3c'/>
<id>c1e756bfcbcac838a86a23f3e4501b556a961e3c</id>
<content type='text'>
This reverts commit d206940319c41df4299db75ed56142177bb2e5f6,
there are no more callers.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit d206940319c41df4299db75ed56142177bb2e5f6,
there are no more callers.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vlan: Fix lockdep warning when vlan dev handle notification</title>
<updated>2014-04-18T21:48:30+00:00</updated>
<author>
<name>dingtianhong</name>
<email>dingtianhong@huawei.com</email>
</author>
<published>2014-04-17T10:40:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=dc8eaaa006350d24030502a4521542e74b5cb39f'/>
<id>dc8eaaa006350d24030502a4521542e74b5cb39f</id>
<content type='text'>
When I open the LOCKDEP config and run these steps:

modprobe 8021q
vconfig add eth2 20
vconfig add eth2.20 30
ifconfig eth2 xx.xx.xx.xx

then the Call Trace happened:

[32524.386288] =============================================
[32524.386293] [ INFO: possible recursive locking detected ]
[32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G           O
[32524.386302] ---------------------------------------------
[32524.386306] ifconfig/3103 is trying to acquire lock:
[32524.386310]  (&amp;vlan_netdev_addr_lock_key/1){+.....}, at: [&lt;ffffffff814275f4&gt;] dev_mc_sync+0x64/0xb0
[32524.386326]
[32524.386326] but task is already holding lock:
[32524.386330]  (&amp;vlan_netdev_addr_lock_key/1){+.....}, at: [&lt;ffffffff8141af83&gt;] dev_set_rx_mode+0x23/0x40
[32524.386341]
[32524.386341] other info that might help us debug this:
[32524.386345]  Possible unsafe locking scenario:
[32524.386345]
[32524.386350]        CPU0
[32524.386352]        ----
[32524.386354]   lock(&amp;vlan_netdev_addr_lock_key/1);
[32524.386359]   lock(&amp;vlan_netdev_addr_lock_key/1);
[32524.386364]
[32524.386364]  *** DEADLOCK ***
[32524.386364]
[32524.386368]  May be due to missing lock nesting notation
[32524.386368]
[32524.386373] 2 locks held by ifconfig/3103:
[32524.386376]  #0:  (rtnl_mutex){+.+.+.}, at: [&lt;ffffffff81431d42&gt;] rtnl_lock+0x12/0x20
[32524.386387]  #1:  (&amp;vlan_netdev_addr_lock_key/1){+.....}, at: [&lt;ffffffff8141af83&gt;] dev_set_rx_mode+0x23/0x40
[32524.386398]
[32524.386398] stack backtrace:
[32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G           O 3.14.0-rc2-0.7-default+ #35
[32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[32524.386414]  ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8
[32524.386421]  ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0
[32524.386428]  000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000
[32524.386435] Call Trace:
[32524.386441]  [&lt;ffffffff814f68a2&gt;] dump_stack+0x6a/0x78
[32524.386448]  [&lt;ffffffff810a35fb&gt;] __lock_acquire+0x7ab/0x1940
[32524.386454]  [&lt;ffffffff810a323a&gt;] ? __lock_acquire+0x3ea/0x1940
[32524.386459]  [&lt;ffffffff810a4874&gt;] lock_acquire+0xe4/0x110
[32524.386464]  [&lt;ffffffff814275f4&gt;] ? dev_mc_sync+0x64/0xb0
[32524.386471]  [&lt;ffffffff814fc07a&gt;] _raw_spin_lock_nested+0x2a/0x40
[32524.386476]  [&lt;ffffffff814275f4&gt;] ? dev_mc_sync+0x64/0xb0
[32524.386481]  [&lt;ffffffff814275f4&gt;] dev_mc_sync+0x64/0xb0
[32524.386489]  [&lt;ffffffffa0500cab&gt;] vlan_dev_set_rx_mode+0x2b/0x50 [8021q]
[32524.386495]  [&lt;ffffffff8141addf&gt;] __dev_set_rx_mode+0x5f/0xb0
[32524.386500]  [&lt;ffffffff8141af8b&gt;] dev_set_rx_mode+0x2b/0x40
[32524.386506]  [&lt;ffffffff8141b3cf&gt;] __dev_open+0xef/0x150
[32524.386511]  [&lt;ffffffff8141b177&gt;] __dev_change_flags+0xa7/0x190
[32524.386516]  [&lt;ffffffff8141b292&gt;] dev_change_flags+0x32/0x80
[32524.386524]  [&lt;ffffffff8149ca56&gt;] devinet_ioctl+0x7d6/0x830
[32524.386532]  [&lt;ffffffff81437b0b&gt;] ? dev_ioctl+0x34b/0x660
[32524.386540]  [&lt;ffffffff814a05b0&gt;] inet_ioctl+0x80/0xa0
[32524.386550]  [&lt;ffffffff8140199d&gt;] sock_do_ioctl+0x2d/0x60
[32524.386558]  [&lt;ffffffff81401a52&gt;] sock_ioctl+0x82/0x2a0
[32524.386568]  [&lt;ffffffff811a7123&gt;] do_vfs_ioctl+0x93/0x590
[32524.386578]  [&lt;ffffffff811b2705&gt;] ? rcu_read_lock_held+0x45/0x50
[32524.386586]  [&lt;ffffffff811b39e5&gt;] ? __fget_light+0x105/0x110
[32524.386594]  [&lt;ffffffff811a76b1&gt;] SyS_ioctl+0x91/0xb0
[32524.386604]  [&lt;ffffffff815057e2&gt;] system_call_fastpath+0x16/0x1b

========================================================================

The reason is that all of the addr_lock_key for vlan dev have the same class,
so if we change the status for vlan dev, the vlan dev and its real dev will
hold the same class of addr_lock_key together, so the warning happened.

we should distinguish the lock depth for vlan dev and its real dev.

v1-&gt;v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which
	could support to add 8 vlan id on a same vlan dev, I think it is enough for current
	scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id,
	and the vlan dev would not meet the same class key with its real dev.

	The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan
	dev could get a suitable class key.

v2-&gt;v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev
	and its real dev, but it make no sense, because the difference for subclass in the
	lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth
	to distinguish the different depth for every vlan dev, the same depth of the vlan dev
	could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h,
	I think it is enough here, the lockdep should never exceed that value.

v3-&gt;v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method,
	we could use _nested() variants to fix the problem, calculate the depth for every vlan dev,
	and use the depth as the subclass for addr_lock_key.

Signed-off-by: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When I open the LOCKDEP config and run these steps:

modprobe 8021q
vconfig add eth2 20
vconfig add eth2.20 30
ifconfig eth2 xx.xx.xx.xx

then the Call Trace happened:

[32524.386288] =============================================
[32524.386293] [ INFO: possible recursive locking detected ]
[32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G           O
[32524.386302] ---------------------------------------------
[32524.386306] ifconfig/3103 is trying to acquire lock:
[32524.386310]  (&amp;vlan_netdev_addr_lock_key/1){+.....}, at: [&lt;ffffffff814275f4&gt;] dev_mc_sync+0x64/0xb0
[32524.386326]
[32524.386326] but task is already holding lock:
[32524.386330]  (&amp;vlan_netdev_addr_lock_key/1){+.....}, at: [&lt;ffffffff8141af83&gt;] dev_set_rx_mode+0x23/0x40
[32524.386341]
[32524.386341] other info that might help us debug this:
[32524.386345]  Possible unsafe locking scenario:
[32524.386345]
[32524.386350]        CPU0
[32524.386352]        ----
[32524.386354]   lock(&amp;vlan_netdev_addr_lock_key/1);
[32524.386359]   lock(&amp;vlan_netdev_addr_lock_key/1);
[32524.386364]
[32524.386364]  *** DEADLOCK ***
[32524.386364]
[32524.386368]  May be due to missing lock nesting notation
[32524.386368]
[32524.386373] 2 locks held by ifconfig/3103:
[32524.386376]  #0:  (rtnl_mutex){+.+.+.}, at: [&lt;ffffffff81431d42&gt;] rtnl_lock+0x12/0x20
[32524.386387]  #1:  (&amp;vlan_netdev_addr_lock_key/1){+.....}, at: [&lt;ffffffff8141af83&gt;] dev_set_rx_mode+0x23/0x40
[32524.386398]
[32524.386398] stack backtrace:
[32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G           O 3.14.0-rc2-0.7-default+ #35
[32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[32524.386414]  ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8
[32524.386421]  ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0
[32524.386428]  000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000
[32524.386435] Call Trace:
[32524.386441]  [&lt;ffffffff814f68a2&gt;] dump_stack+0x6a/0x78
[32524.386448]  [&lt;ffffffff810a35fb&gt;] __lock_acquire+0x7ab/0x1940
[32524.386454]  [&lt;ffffffff810a323a&gt;] ? __lock_acquire+0x3ea/0x1940
[32524.386459]  [&lt;ffffffff810a4874&gt;] lock_acquire+0xe4/0x110
[32524.386464]  [&lt;ffffffff814275f4&gt;] ? dev_mc_sync+0x64/0xb0
[32524.386471]  [&lt;ffffffff814fc07a&gt;] _raw_spin_lock_nested+0x2a/0x40
[32524.386476]  [&lt;ffffffff814275f4&gt;] ? dev_mc_sync+0x64/0xb0
[32524.386481]  [&lt;ffffffff814275f4&gt;] dev_mc_sync+0x64/0xb0
[32524.386489]  [&lt;ffffffffa0500cab&gt;] vlan_dev_set_rx_mode+0x2b/0x50 [8021q]
[32524.386495]  [&lt;ffffffff8141addf&gt;] __dev_set_rx_mode+0x5f/0xb0
[32524.386500]  [&lt;ffffffff8141af8b&gt;] dev_set_rx_mode+0x2b/0x40
[32524.386506]  [&lt;ffffffff8141b3cf&gt;] __dev_open+0xef/0x150
[32524.386511]  [&lt;ffffffff8141b177&gt;] __dev_change_flags+0xa7/0x190
[32524.386516]  [&lt;ffffffff8141b292&gt;] dev_change_flags+0x32/0x80
[32524.386524]  [&lt;ffffffff8149ca56&gt;] devinet_ioctl+0x7d6/0x830
[32524.386532]  [&lt;ffffffff81437b0b&gt;] ? dev_ioctl+0x34b/0x660
[32524.386540]  [&lt;ffffffff814a05b0&gt;] inet_ioctl+0x80/0xa0
[32524.386550]  [&lt;ffffffff8140199d&gt;] sock_do_ioctl+0x2d/0x60
[32524.386558]  [&lt;ffffffff81401a52&gt;] sock_ioctl+0x82/0x2a0
[32524.386568]  [&lt;ffffffff811a7123&gt;] do_vfs_ioctl+0x93/0x590
[32524.386578]  [&lt;ffffffff811b2705&gt;] ? rcu_read_lock_held+0x45/0x50
[32524.386586]  [&lt;ffffffff811b39e5&gt;] ? __fget_light+0x105/0x110
[32524.386594]  [&lt;ffffffff811a76b1&gt;] SyS_ioctl+0x91/0xb0
[32524.386604]  [&lt;ffffffff815057e2&gt;] system_call_fastpath+0x16/0x1b

========================================================================

The reason is that all of the addr_lock_key for vlan dev have the same class,
so if we change the status for vlan dev, the vlan dev and its real dev will
hold the same class of addr_lock_key together, so the warning happened.

we should distinguish the lock depth for vlan dev and its real dev.

v1-&gt;v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which
	could support to add 8 vlan id on a same vlan dev, I think it is enough for current
	scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id,
	and the vlan dev would not meet the same class key with its real dev.

	The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan
	dev could get a suitable class key.

v2-&gt;v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev
	and its real dev, but it make no sense, because the difference for subclass in the
	lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth
	to distinguish the different depth for every vlan dev, the same depth of the vlan dev
	could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h,
	I think it is enough here, the lockdep should never exceed that value.

v3-&gt;v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method,
	we could use _nested() variants to fix the problem, calculate the depth for every vlan dev,
	and use the depth as the subclass for addr_lock_key.

Signed-off-by: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Start with correct mac_len in skb_network_protocol</title>
<updated>2014-04-14T22:58:58+00:00</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-04-14T21:37:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1e785f48d29a09b6cf96db7b49b6320dada332e1'/>
<id>1e785f48d29a09b6cf96db7b49b6320dada332e1</id>
<content type='text'>
Sometimes, when the packet arrives at skb_mac_gso_segment()
its skb-&gt;mac_len already accounts for some of the mac lenght
headers in the packet.  This seems to happen when forwarding
through and OpenSSL tunnel.

When we start looking for any vlan headers in skb_network_protocol()
we seem to ignore any of the already known mac headers and start
with an ETH_HLEN.  This results in an incorrect offset, dropped
TSO frames and general slowness of the connection.

We can start counting from the known skb-&gt;mac_len
and return at least that much if all mac level headers
are known and accounted for.

Fixes: 53d6471cef17262d3ad1c7ce8982a234244f68ec (net: Account for all vlan headers in skb_mac_gso_segment)
CC: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Daniel Borkman &lt;dborkman@redhat.com&gt;
Tested-by: Martin Filip &lt;nexus+kernel@smoula.net&gt;
Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sometimes, when the packet arrives at skb_mac_gso_segment()
its skb-&gt;mac_len already accounts for some of the mac lenght
headers in the packet.  This seems to happen when forwarding
through and OpenSSL tunnel.

When we start looking for any vlan headers in skb_network_protocol()
we seem to ignore any of the already known mac headers and start
with an ETH_HLEN.  This results in an incorrect offset, dropped
TSO frames and general slowness of the connection.

We can start counting from the known skb-&gt;mac_len
and return at least that much if all mac level headers
are known and accounted for.

Fixes: 53d6471cef17262d3ad1c7ce8982a234244f68ec (net: Account for all vlan headers in skb_mac_gso_segment)
CC: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Daniel Borkman &lt;dborkman@redhat.com&gt;
Tested-by: Martin Filip &lt;nexus+kernel@smoula.net&gt;
Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netdev: remove potentially harmful checks</title>
<updated>2014-04-07T19:52:07+00:00</updated>
<author>
<name>Veaceslav Falico</name>
<email>vfalico@redhat.com</email>
</author>
<published>2014-04-07T09:25:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6859e7df6d9045a461412777e63bd8cef12f9705'/>
<id>6859e7df6d9045a461412777e63bd8cef12f9705</id>
<content type='text'>
Currently we're checking a variable for != NULL after actually
dereferencing it, in netdev_lower_get_next_private*().

It's counter-intuitive at best, and can lead to faulty usage (as it implies
that the variable can be NULL), so fix it by removing the useless checks.

Reported-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Eric Dumazet &lt;edumazet@google.com&gt;
CC: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
CC: Jiri Pirko &lt;jiri@resnulli.us&gt;
CC: stephen hemminger &lt;stephen@networkplumber.org&gt;
CC: Jerry Chu &lt;hkchu@google.com&gt;
Signed-off-by: Veaceslav Falico &lt;vfalico@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we're checking a variable for != NULL after actually
dereferencing it, in netdev_lower_get_next_private*().

It's counter-intuitive at best, and can lead to faulty usage (as it implies
that the variable can be NULL), so fix it by removing the useless checks.

Reported-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Eric Dumazet &lt;edumazet@google.com&gt;
CC: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
CC: Jiri Pirko &lt;jiri@resnulli.us&gt;
CC: stephen hemminger &lt;stephen@networkplumber.org&gt;
CC: Jerry Chu &lt;hkchu@google.com&gt;
Signed-off-by: Veaceslav Falico &lt;vfalico@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
