<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/sched/sch_api.c, branch v4.3</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>net: sched: don't break line in tc_classify loop notification</title>
<updated>2015-08-28T21:01:15+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-08-28T16:46:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c1b3b19923a371c9e099c30372d376b02fe66088'/>
<id>c1b3b19923a371c9e099c30372d376b02fe66088</id>
<content type='text'>
Just some minor noise follow-up to address some stylistic issues of
commit 3b3ae880266d ("net: sched: consolidate tc_classify{,_compat}").
Accidentally v1 instead of v2 of that commit got applied, so this
patch adds the relative diff.

Suggested-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Just some minor noise follow-up to address some stylistic issues of
commit 3b3ae880266d ("net: sched: consolidate tc_classify{,_compat}").
Accidentally v1 instead of v2 of that commit got applied, so this
patch adds the relative diff.

Suggested-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: sched: register noqueue qdisc</title>
<updated>2015-08-28T00:14:30+00:00</updated>
<author>
<name>Phil Sutter</name>
<email>phil@nwl.cc</email>
</author>
<published>2015-08-27T19:21:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d66d6c3152e8d5a6db42a56bf7ae1c6cae87ba48'/>
<id>d66d6c3152e8d5a6db42a56bf7ae1c6cae87ba48</id>
<content type='text'>
This way users can attach noqueue just like any other qdisc using tc
without having to mess with tx_queue_len first.

Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This way users can attach noqueue just like any other qdisc using tc
without having to mess with tx_queue_len first.

Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: sched: consolidate tc_classify{,_compat}</title>
<updated>2015-08-27T21:18:48+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-08-26T21:00:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=3b3ae880266d148bf73a573a766bc9b78c08d805'/>
<id>3b3ae880266d148bf73a573a766bc9b78c08d805</id>
<content type='text'>
For classifiers getting invoked via tc_classify(), we always need an
extra function call into tc_classify_compat(), as both are being
exported as symbols and tc_classify() itself doesn't do much except
handling of reclassifications when tp-&gt;classify() returned with
TC_ACT_RECLASSIFY.

CBQ and ATM are the only qdiscs that directly call into tc_classify_compat(),
all others use tc_classify(). When tc actions are being configured
out in the kernel, tc_classify() effectively does nothing besides
delegating.

We could spare this layer and consolidate both functions. pktgen on
single CPU constantly pushing skbs directly into the netif_receive_skb()
path with a dummy classifier on ingress qdisc attached, improves
slightly from 22.3Mpps to 23.1Mpps.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For classifiers getting invoked via tc_classify(), we always need an
extra function call into tc_classify_compat(), as both are being
exported as symbols and tc_classify() itself doesn't do much except
handling of reclassifications when tp-&gt;classify() returned with
TC_ACT_RECLASSIFY.

CBQ and ATM are the only qdiscs that directly call into tc_classify_compat(),
all others use tc_classify(). When tc actions are being configured
out in the kernel, tc_classify() effectively does nothing besides
delegating.

We could spare this layer and consolidate both functions. pktgen on
single CPU constantly pushing skbs directly into the netif_receive_skb()
path with a dummy classifier on ingress qdisc attached, improves
slightly from 22.3Mpps to 23.1Mpps.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next</title>
<updated>2015-06-24T23:49:49+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-06-24T23:49:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=e0456717e483bb8a9431b80a5bdc99a928b9b003'/>
<id>e0456717e483bb8a9431b80a5bdc99a928b9b003</id>
<content type='text'>
Pull networking updates from David Miller:

 1) Add TX fast path in mac80211, from Johannes Berg.

 2) Add TSO/GRO support to ibmveth, from Thomas Falcon

 3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

 4) Lots of new rhashtable tests, from Thomas Graf.

 5) Run ingress qdisc lockless, from Alexei Starovoitov.

 6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting.  From Eric Dumazet.

 7) Add mode parameter to pktgen, for testing receive.  From Alexei
    Starovoitov.

 8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

 9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

12) Extern flow dissector to be programmable and use it in new "Flower"
    classifier.  From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics.  From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion.  From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev-&gt;ifindex SKB metadata,
    from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accomodate situations
    like the XEN driver and wireless aggregation.  From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
  bridge: vlan: flush the dynamically learned entries on port vlan delete
  bridge: multicast: add a comment to br_port_state_selection about blocking state
  net: inet_diag: export IPV6_V6ONLY sockopt
  stmmac: troubleshoot unexpected bits in des0 &amp; des1
  net: ipv4 sysctl option to ignore routes when nexthop link is down
  net: track link-status of ipv4 nexthops
  net: switchdev: ignore unsupported bridge flags
  net: Cavium: Fix MAC address setting in shutdown state
  drivers: net: xgene: fix for ACPI support without ACPI
  ip: report the original address of ICMP messages
  net/mlx5e: Prefetch skb data on RX
  net/mlx5e: Pop cq outside mlx5e_get_cqe
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Remove extra spaces
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Use skb_shinfo(skb)-&gt;gso_segs rather than counting them
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking updates from David Miller:

 1) Add TX fast path in mac80211, from Johannes Berg.

 2) Add TSO/GRO support to ibmveth, from Thomas Falcon

 3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

 4) Lots of new rhashtable tests, from Thomas Graf.

 5) Run ingress qdisc lockless, from Alexei Starovoitov.

 6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting.  From Eric Dumazet.

 7) Add mode parameter to pktgen, for testing receive.  From Alexei
    Starovoitov.

 8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

 9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

12) Extern flow dissector to be programmable and use it in new "Flower"
    classifier.  From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics.  From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion.  From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev-&gt;ifindex SKB metadata,
    from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accomodate situations
    like the XEN driver and wireless aggregation.  From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
  bridge: vlan: flush the dynamically learned entries on port vlan delete
  bridge: multicast: add a comment to br_port_state_selection about blocking state
  net: inet_diag: export IPV6_V6ONLY sockopt
  stmmac: troubleshoot unexpected bits in des0 &amp; des1
  net: ipv4 sysctl option to ignore routes when nexthop link is down
  net: track link-status of ipv4 nexthops
  net: switchdev: ignore unsupported bridge flags
  net: Cavium: Fix MAC address setting in shutdown state
  drivers: net: xgene: fix for ACPI support without ACPI
  ip: report the original address of ICMP messages
  net/mlx5e: Prefetch skb data on RX
  net/mlx5e: Pop cq outside mlx5e_get_cqe
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Remove extra spaces
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Use skb_shinfo(skb)-&gt;gso_segs rather than counting them
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-06-23T01:57:44+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-06-23T01:57:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=43224b96af3154cedd7220f7b90094905f07ac78'/>
<id>43224b96af3154cedd7220f7b90094905f07ac78</id>
<content type='text'>
Pull timer updates from Thomas Gleixner:
 "A rather largish update for everything time and timer related:

   - Cache footprint optimizations for both hrtimers and timer wheel

   - Lower the NOHZ impact on systems which have NOHZ or timer migration
     disabled at runtime.

   - Optimize run time overhead of hrtimer interrupt by making the clock
     offset updates smarter

   - hrtimer cleanups and removal of restrictions to tackle some
     problems in sched/perf

   - Some more leap second tweaks

   - Another round of changes addressing the 2038 problem

   - First step to change the internals of clock event devices by
     introducing the necessary infrastructure

   - Allow constant folding for usecs/msecs_to_jiffies()

   - The usual pile of clockevent/clocksource driver updates

  The hrtimer changes contain updates to sched, perf and x86 as they
  depend on them plus changes all over the tree to cleanup API changes
  and redundant code, which got copied all over the place.  The y2038
  changes touch s390 to remove the last non 2038 safe code related to
  boot/persistant clock"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
  clocksource: Increase dependencies of timer-stm32 to limit build wreckage
  timer: Minimize nohz off overhead
  timer: Reduce timer migration overhead if disabled
  timer: Stats: Simplify the flags handling
  timer: Replace timer base by a cpu index
  timer: Use hlist for the timer wheel hash buckets
  timer: Remove FIFO "guarantee"
  timers: Sanitize catchup_timer_jiffies() usage
  hrtimer: Allow hrtimer::function() to free the timer
  seqcount: Introduce raw_write_seqcount_barrier()
  seqcount: Rename write_seqcount_barrier()
  hrtimer: Fix hrtimer_is_queued() hole
  hrtimer: Remove HRTIMER_STATE_MIGRATE
  selftest: Timers: Avoid signal deadlock in leap-a-day
  timekeeping: Copy the shadow-timekeeper over the real timekeeper last
  clockevents: Check state instead of mode in suspend/resume path
  selftests: timers: Add leap-second timer edge testing to leap-a-day.c
  ntp: Do leapsecond adjustment in adjtimex read path
  time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
  ntp: Introduce and use SECS_PER_DAY macro instead of 86400
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull timer updates from Thomas Gleixner:
 "A rather largish update for everything time and timer related:

   - Cache footprint optimizations for both hrtimers and timer wheel

   - Lower the NOHZ impact on systems which have NOHZ or timer migration
     disabled at runtime.

   - Optimize run time overhead of hrtimer interrupt by making the clock
     offset updates smarter

   - hrtimer cleanups and removal of restrictions to tackle some
     problems in sched/perf

   - Some more leap second tweaks

   - Another round of changes addressing the 2038 problem

   - First step to change the internals of clock event devices by
     introducing the necessary infrastructure

   - Allow constant folding for usecs/msecs_to_jiffies()

   - The usual pile of clockevent/clocksource driver updates

  The hrtimer changes contain updates to sched, perf and x86 as they
  depend on them plus changes all over the tree to cleanup API changes
  and redundant code, which got copied all over the place.  The y2038
  changes touch s390 to remove the last non 2038 safe code related to
  boot/persistant clock"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
  clocksource: Increase dependencies of timer-stm32 to limit build wreckage
  timer: Minimize nohz off overhead
  timer: Reduce timer migration overhead if disabled
  timer: Stats: Simplify the flags handling
  timer: Replace timer base by a cpu index
  timer: Use hlist for the timer wheel hash buckets
  timer: Remove FIFO "guarantee"
  timers: Sanitize catchup_timer_jiffies() usage
  hrtimer: Allow hrtimer::function() to free the timer
  seqcount: Introduce raw_write_seqcount_barrier()
  seqcount: Rename write_seqcount_barrier()
  hrtimer: Fix hrtimer_is_queued() hole
  hrtimer: Remove HRTIMER_STATE_MIGRATE
  selftest: Timers: Avoid signal deadlock in leap-a-day
  timekeeping: Copy the shadow-timekeeper over the real timekeeper last
  clockevents: Check state instead of mode in suspend/resume path
  selftests: timers: Add leap-second timer edge testing to leap-a-day.c
  ntp: Do leapsecond adjustment in adjtimex read path
  time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
  ntp: Introduce and use SECS_PER_DAY macro instead of 86400
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-06-02T05:51:30+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-06-02T05:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=dda922c831d1661c11a3ae1051b7160236f6ffb0'/>
<id>dda922c831d1661c11a3ae1051b7160236f6ffb0</id>
<content type='text'>
Conflicts:
	drivers/net/phy/amd-xgbe-phy.c
	drivers/net/wireless/iwlwifi/Kconfig
	include/net/mac80211.h

iwlwifi/Kconfig and mac80211.h were both trivial overlapping
changes.

The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
the bug fix that happened on the 'net' side is already integrated
into the rest of the amd-xgbe driver.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/phy/amd-xgbe-phy.c
	drivers/net/wireless/iwlwifi/Kconfig
	include/net/mac80211.h

iwlwifi/Kconfig and mac80211.h were both trivial overlapping
changes.

The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
the bug fix that happened on the 'net' side is already integrated
into the rest of the amd-xgbe driver.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net_sched: invoke -&gt;attach() after setting dev-&gt;qdisc</title>
<updated>2015-05-27T18:09:55+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2015-05-26T23:08:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=86e363dc3b50bfd50a1f315934583fbda673ab8d'/>
<id>86e363dc3b50bfd50a1f315934583fbda673ab8d</id>
<content type='text'>
For mq qdisc, we add per tx queue qdisc to root qdisc
for display purpose, however, that happens too early,
before the new dev-&gt;qdisc is finally set, this causes
q-&gt;list points to an old root qdisc which is going to be
freed right before assigning with a new one.

Fix this by moving -&gt;attach() after setting dev-&gt;qdisc.

For the record, this fixes the following crash:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 975 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()
 list_del corruption. prev-&gt;next should be ffff8800d1998ae8, but was 6b6b6b6b6b6b6b6b
 CPU: 1 PID: 975 Comm: tc Not tainted 4.1.0-rc4+ #1019
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000009 ffff8800d73fb928 ffffffff81a44e7f 0000000047574756
  ffff8800d73fb978 ffff8800d73fb968 ffffffff810790da ffff8800cfc4cd20
  ffffffff814e725b ffff8800d1998ae8 ffffffff82381250 0000000000000000
 Call Trace:
  [&lt;ffffffff81a44e7f&gt;] dump_stack+0x4c/0x65
  [&lt;ffffffff810790da&gt;] warn_slowpath_common+0x9c/0xb6
  [&lt;ffffffff814e725b&gt;] ? __list_del_entry+0x5a/0x98
  [&lt;ffffffff81079162&gt;] warn_slowpath_fmt+0x46/0x48
  [&lt;ffffffff81820eb0&gt;] ? dev_graft_qdisc+0x5e/0x6a
  [&lt;ffffffff814e725b&gt;] __list_del_entry+0x5a/0x98
  [&lt;ffffffff814e72a7&gt;] list_del+0xe/0x2d
  [&lt;ffffffff81822f05&gt;] qdisc_list_del+0x1e/0x20
  [&lt;ffffffff81820cd1&gt;] qdisc_destroy+0x30/0xd6
  [&lt;ffffffff81822676&gt;] qdisc_graft+0x11d/0x243
  [&lt;ffffffff818233c1&gt;] tc_get_qdisc+0x1a6/0x1d4
  [&lt;ffffffff810b5eaf&gt;] ? mark_lock+0x2e/0x226
  [&lt;ffffffff817ff8f5&gt;] rtnetlink_rcv_msg+0x181/0x194
  [&lt;ffffffff817ff72e&gt;] ? rtnl_lock+0x17/0x19
  [&lt;ffffffff817ff72e&gt;] ? rtnl_lock+0x17/0x19
  [&lt;ffffffff817ff774&gt;] ? __rtnl_unlock+0x17/0x17
  [&lt;ffffffff81855dc6&gt;] netlink_rcv_skb+0x4d/0x93
  [&lt;ffffffff817ff756&gt;] rtnetlink_rcv+0x26/0x2d
  [&lt;ffffffff818544b2&gt;] netlink_unicast+0xcb/0x150
  [&lt;ffffffff81161db9&gt;] ? might_fault+0x59/0xa9
  [&lt;ffffffff81854f78&gt;] netlink_sendmsg+0x4fa/0x51c
  [&lt;ffffffff817d6e09&gt;] sock_sendmsg_nosec+0x12/0x1d
  [&lt;ffffffff817d8967&gt;] sock_sendmsg+0x29/0x2e
  [&lt;ffffffff817d8cf3&gt;] ___sys_sendmsg+0x1b4/0x23a
  [&lt;ffffffff8100a1b8&gt;] ? native_sched_clock+0x35/0x37
  [&lt;ffffffff810a1d83&gt;] ? sched_clock_local+0x12/0x72
  [&lt;ffffffff810a1fd4&gt;] ? sched_clock_cpu+0x9e/0xb7
  [&lt;ffffffff810def2a&gt;] ? current_kernel_time+0xe/0x32
  [&lt;ffffffff810b4bc5&gt;] ? lock_release_holdtime.part.29+0x71/0x7f
  [&lt;ffffffff810ddebf&gt;] ? read_seqcount_begin.constprop.27+0x5f/0x76
  [&lt;ffffffff810b6292&gt;] ? trace_hardirqs_on_caller+0x17d/0x199
  [&lt;ffffffff811b14d5&gt;] ? __fget_light+0x50/0x78
  [&lt;ffffffff817d9808&gt;] __sys_sendmsg+0x42/0x60
  [&lt;ffffffff817d9838&gt;] SyS_sendmsg+0x12/0x1c
  [&lt;ffffffff81a50e97&gt;] system_call_fastpath+0x12/0x6f
 ---[ end trace ef29d3fb28e97ae7 ]---

For long term, we probably need to clean up the qdisc_graft() code
in case it hides other bugs like this.

Fixes: 95dc19299f74 ("pkt_sched: give visibility to mq slave qdiscs")
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For mq qdisc, we add per tx queue qdisc to root qdisc
for display purpose, however, that happens too early,
before the new dev-&gt;qdisc is finally set, this causes
q-&gt;list points to an old root qdisc which is going to be
freed right before assigning with a new one.

Fix this by moving -&gt;attach() after setting dev-&gt;qdisc.

For the record, this fixes the following crash:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 975 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()
 list_del corruption. prev-&gt;next should be ffff8800d1998ae8, but was 6b6b6b6b6b6b6b6b
 CPU: 1 PID: 975 Comm: tc Not tainted 4.1.0-rc4+ #1019
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000009 ffff8800d73fb928 ffffffff81a44e7f 0000000047574756
  ffff8800d73fb978 ffff8800d73fb968 ffffffff810790da ffff8800cfc4cd20
  ffffffff814e725b ffff8800d1998ae8 ffffffff82381250 0000000000000000
 Call Trace:
  [&lt;ffffffff81a44e7f&gt;] dump_stack+0x4c/0x65
  [&lt;ffffffff810790da&gt;] warn_slowpath_common+0x9c/0xb6
  [&lt;ffffffff814e725b&gt;] ? __list_del_entry+0x5a/0x98
  [&lt;ffffffff81079162&gt;] warn_slowpath_fmt+0x46/0x48
  [&lt;ffffffff81820eb0&gt;] ? dev_graft_qdisc+0x5e/0x6a
  [&lt;ffffffff814e725b&gt;] __list_del_entry+0x5a/0x98
  [&lt;ffffffff814e72a7&gt;] list_del+0xe/0x2d
  [&lt;ffffffff81822f05&gt;] qdisc_list_del+0x1e/0x20
  [&lt;ffffffff81820cd1&gt;] qdisc_destroy+0x30/0xd6
  [&lt;ffffffff81822676&gt;] qdisc_graft+0x11d/0x243
  [&lt;ffffffff818233c1&gt;] tc_get_qdisc+0x1a6/0x1d4
  [&lt;ffffffff810b5eaf&gt;] ? mark_lock+0x2e/0x226
  [&lt;ffffffff817ff8f5&gt;] rtnetlink_rcv_msg+0x181/0x194
  [&lt;ffffffff817ff72e&gt;] ? rtnl_lock+0x17/0x19
  [&lt;ffffffff817ff72e&gt;] ? rtnl_lock+0x17/0x19
  [&lt;ffffffff817ff774&gt;] ? __rtnl_unlock+0x17/0x17
  [&lt;ffffffff81855dc6&gt;] netlink_rcv_skb+0x4d/0x93
  [&lt;ffffffff817ff756&gt;] rtnetlink_rcv+0x26/0x2d
  [&lt;ffffffff818544b2&gt;] netlink_unicast+0xcb/0x150
  [&lt;ffffffff81161db9&gt;] ? might_fault+0x59/0xa9
  [&lt;ffffffff81854f78&gt;] netlink_sendmsg+0x4fa/0x51c
  [&lt;ffffffff817d6e09&gt;] sock_sendmsg_nosec+0x12/0x1d
  [&lt;ffffffff817d8967&gt;] sock_sendmsg+0x29/0x2e
  [&lt;ffffffff817d8cf3&gt;] ___sys_sendmsg+0x1b4/0x23a
  [&lt;ffffffff8100a1b8&gt;] ? native_sched_clock+0x35/0x37
  [&lt;ffffffff810a1d83&gt;] ? sched_clock_local+0x12/0x72
  [&lt;ffffffff810a1fd4&gt;] ? sched_clock_cpu+0x9e/0xb7
  [&lt;ffffffff810def2a&gt;] ? current_kernel_time+0xe/0x32
  [&lt;ffffffff810b4bc5&gt;] ? lock_release_holdtime.part.29+0x71/0x7f
  [&lt;ffffffff810ddebf&gt;] ? read_seqcount_begin.constprop.27+0x5f/0x76
  [&lt;ffffffff810b6292&gt;] ? trace_hardirqs_on_caller+0x17d/0x199
  [&lt;ffffffff811b14d5&gt;] ? __fget_light+0x50/0x78
  [&lt;ffffffff817d9808&gt;] __sys_sendmsg+0x42/0x60
  [&lt;ffffffff817d9838&gt;] SyS_sendmsg+0x12/0x1c
  [&lt;ffffffff81a50e97&gt;] system_call_fastpath+0x12/0x6f
 ---[ end trace ef29d3fb28e97ae7 ]---

For long term, we probably need to clean up the qdisc_graft() code
in case it hides other bugs like this.

Fixes: 95dc19299f74 ("pkt_sched: give visibility to mq slave qdiscs")
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: sched: use counter to break reclassify loops</title>
<updated>2015-05-13T19:08:14+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2015-05-11T17:50:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=e578d9c02587d57bfa7b560767c698a668a468c6'/>
<id>e578d9c02587d57bfa7b560767c698a668a468c6</id>
<content type='text'>
Seems all we want here is to avoid endless 'goto reclassify' loop.
tc_classify_compat even resets this counter when something other
than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
break hypothetical loops induced by something other than perpetual
TC_ACT_RECLASSIFY return values.

skb_act_clone is now identical to skb_clone, so just use that.

Tested with following (bogus) filter:
tc filter add dev eth0 parent ffff: \
 protocol ip u32 match u32 0 0 police rate 10Kbit burst \
 64000 mtu 1500 action reclassify

Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Acked-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Seems all we want here is to avoid endless 'goto reclassify' loop.
tc_classify_compat even resets this counter when something other
than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
break hypothetical loops induced by something other than perpetual
TC_ACT_RECLASSIFY return values.

skb_act_clone is now identical to skb_clone, so just use that.

Tested with following (bogus) filter:
tc filter add dev eth0 parent ffff: \
 protocol ip u32 match u32 0 0 police rate 10Kbit burst \
 64000 mtu 1500 action reclassify

Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Acked-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: sched: Use hrtimer_resolution instead of hrtimer_get_res()</title>
<updated>2015-04-22T15:06:48+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-04-14T21:08:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1e3176885cce8e0137d6f4072c5910bfa00901ed'/>
<id>1e3176885cce8e0137d6f4072c5910bfa00901ed</id>
<content type='text'>
No point in converting a timespec now that the value is directly
accessible.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Preeti U Murthy &lt;preeti@linux.vnet.ibm.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Link: http://lkml.kernel.org/r/20150414203500.720623028@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No point in converting a timespec now that the value is directly
accessible.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Preeti U Murthy &lt;preeti@linux.vnet.ibm.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Link: http://lkml.kernel.org/r/20150414203500.720623028@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net_sched: destroy proto tp when all filters are gone</title>
<updated>2015-03-09T19:35:55+00:00</updated>
<author>
<name>Cong Wang</name>
<email>cwang@twopensource.com</email>
</author>
<published>2015-03-06T19:47:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1e052be69d045c8d0f82ff1116fd3e5a79661745'/>
<id>1e052be69d045c8d0f82ff1116fd3e5a79661745</id>
<content type='text'>
Kernel automatically creates a tp for each
(kind, protocol, priority) tuple, which has handle 0,
when we add a new filter, but it still is left there
after we remove our own, unless we don't specify the
handle (literally means all the filters under
the tuple). For example this one is left:

  # tc filter show dev eth0
  filter parent 8001: protocol arp pref 49152 basic

The user-space is hard to clean up these for kernel
because filters like u32 are organized in a complex way.
So kernel is responsible to remove it after all filters
are gone.  Each type of filter has its own way to
store the filters, so each type has to provide its
way to check if all filters are gone.

Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Jamal Hadi Salim&lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kernel automatically creates a tp for each
(kind, protocol, priority) tuple, which has handle 0,
when we add a new filter, but it still is left there
after we remove our own, unless we don't specify the
handle (literally means all the filters under
the tuple). For example this one is left:

  # tc filter show dev eth0
  filter parent 8001: protocol arp pref 49152 basic

The user-space is hard to clean up these for kernel
because filters like u32 are organized in a complex way.
So kernel is responsible to remove it after all filters
are gone.  Each type of filter has its own way to
store the filters, so each type has to provide its
way to check if all filters are gone.

Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Jamal Hadi Salim&lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
