<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/core/dev.c, branch v5.18</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>net: fix dev_fill_forward_path with pppoe + bridge</title>
<updated>2022-05-16T10:58:55+00:00</updated>
<author>
<name>Felix Fietkau</name>
<email>nbd@nbd.name</email>
</author>
<published>2022-05-09T12:26:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=cf2df74e202d81b09f09d84c2d8903e0e87e9274'/>
<id>cf2df74e202d81b09f09d84c2d8903e0e87e9274</id>
<content type='text'>
When calling dev_fill_forward_path on a pppoe device, the provided destination
address is invalid. In order for the bridge fdb lookup to succeed, the pppoe
code needs to update ctx-&gt;daddr to the correct value.
Fix this by storing the address inside struct net_device_path_ctx

Fixes: f6efc675c9dd ("net: ppp: resolve forwarding path for bridge pppoe devices")
Signed-off-by: Felix Fietkau &lt;nbd@nbd.name&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When calling dev_fill_forward_path on a pppoe device, the provided destination
address is invalid. In order for the bridge fdb lookup to succeed, the pppoe
code needs to update ctx-&gt;daddr to the correct value.
Fix this by storing the address inside struct net_device_path_ctx

Fixes: f6efc675c9dd ("net: ppp: resolve forwarding path for bridge pppoe devices")
Signed-off-by: Felix Fietkau &lt;nbd@nbd.name&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Use this_cpu_inc() to increment net-&gt;core_stats</title>
<updated>2022-04-27T00:32:30+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2022-04-25T16:39:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6510ea973d8d9d4a0cb2fb557b36bd1ab3eb49f6'/>
<id>6510ea973d8d9d4a0cb2fb557b36bd1ab3eb49f6</id>
<content type='text'>
The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes
netdev_core_stats_alloc() to return a per-CPU pointer.
netdev_core_stats_alloc() will allocate memory on its first invocation
which breaks on PREEMPT_RT because it requires non-atomic context for
memory allocation.

This can be avoided by enabling preemption in netdev_core_stats_alloc()
assuming the caller always disables preemption.

It might be better to replace local_inc() with this_cpu_inc() now that
dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does
not rely on already disabled preemption. This results in less
instructions on x86-64:
local_inc:
|          incl %gs:__preempt_count(%rip)  # __preempt_count
|          movq    488(%rdi), %rax # _1-&gt;core_stats, _22
|          testq   %rax, %rax      # _22
|          je      .L585   #,
|          add %gs:this_cpu_off(%rip), %rax        # this_cpu_off, tcp_ptr__
|  .L586:
|          testq   %rax, %rax      # _27
|          je      .L587   #,
|          incq (%rax)            # _6-&gt;a.counter
|  .L587:
|          decl %gs:__preempt_count(%rip)  # __preempt_count

this_cpu_inc(), this patch:
|         movq    488(%rdi), %rax # _1-&gt;core_stats, _5
|         testq   %rax, %rax      # _5
|         je      .L591   #,
| .L585:
|         incq %gs:(%rax) # _18-&gt;rx_dropped

Use unsigned long as type for the counter. Use this_cpu_inc() to
increment the counter. Use a plain read of the counter.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/YmbO0pxgtKpCw4SY@linutronix.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes
netdev_core_stats_alloc() to return a per-CPU pointer.
netdev_core_stats_alloc() will allocate memory on its first invocation
which breaks on PREEMPT_RT because it requires non-atomic context for
memory allocation.

This can be avoided by enabling preemption in netdev_core_stats_alloc()
assuming the caller always disables preemption.

It might be better to replace local_inc() with this_cpu_inc() now that
dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does
not rely on already disabled preemption. This results in less
instructions on x86-64:
local_inc:
|          incl %gs:__preempt_count(%rip)  # __preempt_count
|          movq    488(%rdi), %rax # _1-&gt;core_stats, _22
|          testq   %rax, %rax      # _22
|          je      .L585   #,
|          add %gs:this_cpu_off(%rip), %rax        # this_cpu_off, tcp_ptr__
|  .L586:
|          testq   %rax, %rax      # _27
|          je      .L587   #,
|          incq (%rax)            # _6-&gt;a.counter
|  .L587:
|          decl %gs:__preempt_count(%rip)  # __preempt_count

this_cpu_inc(), this patch:
|         movq    488(%rdi), %rax # _1-&gt;core_stats, _5
|         testq   %rax, %rax      # _5
|         je      .L591   #,
| .L585:
|         incq %gs:(%rax) # _18-&gt;rx_dropped

Use unsigned long as type for the counter. Use this_cpu_inc() to
increment the counter. Use a plain read of the counter.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/YmbO0pxgtKpCw4SY@linutronix.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: move net_unlink_todo() out of the header</title>
<updated>2022-03-28T21:40:08+00:00</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2022-03-25T21:50:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=f32404ae1bb9a7428a3c77419672a28895d185bf'/>
<id>f32404ae1bb9a7428a3c77419672a28895d185bf</id>
<content type='text'>
There's no reason for this to be in netdevice.h, it's all
just used in dev.c. Also make it no longer inline and let
the compiler decide to do that by itself.

Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Link: https://lore.kernel.org/r/20220325225023.f49b9056fe1c.I6b901a2df00000837a9bd251a8dd259bd23f5ded@changeid
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's no reason for this to be in netdevice.h, it's all
just used in dev.c. Also make it no longer inline and let
the compiler decide to do that by itself.

Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Link: https://lore.kernel.org/r/20220325225023.f49b9056fe1c.I6b901a2df00000837a9bd251a8dd259bd23f5ded@changeid
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Revert the softirq will run annotation in ____napi_schedule().</title>
<updated>2022-03-21T19:07:05+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2022-03-21T09:22:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=351bdbb6419ca988802882badadc321d384d0254'/>
<id>351bdbb6419ca988802882badadc321d384d0254</id>
<content type='text'>
The lockdep annotation lockdep_assert_softirq_will_run() expects that
either hard or soft interrupts are disabled because both guaranty that
the "raised" soft-interrupts will be processed once the context is left.

This triggers in flush_smp_call_function_from_idle() but it this case it
explicitly calls do_softirq() in case of pending softirqs.

Revert the "softirq will run" annotation in ____napi_schedule() and move
the check back to __netif_rx() as it was. Keep the IRQ-off assert in
____napi_schedule() because this is always required.

Fixes: fbd9a2ceba5c7 ("net: Add lockdep asserts to ____napi_schedule().")
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Link: https://lore.kernel.org/r/YjhD3ZKWysyw8rc6@linutronix.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The lockdep annotation lockdep_assert_softirq_will_run() expects that
either hard or soft interrupts are disabled because both guaranty that
the "raised" soft-interrupts will be processed once the context is left.

This triggers in flush_smp_call_function_from_idle() but it this case it
explicitly calls do_softirq() in case of pending softirqs.

Revert the "softirq will run" annotation in ____napi_schedule() and move
the check back to __netif_rx() as it was. Keep the IRQ-off assert in
____napi_schedule() because this is always required.

Fixes: fbd9a2ceba5c7 ("net: Add lockdep asserts to ____napi_schedule().")
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Link: https://lore.kernel.org/r/YjhD3ZKWysyw8rc6@linutronix.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: set default rss queues num to physical cores / 2</title>
<updated>2022-03-18T19:00:35+00:00</updated>
<author>
<name>Íñigo Huguet</name>
<email>ihuguet@redhat.com</email>
</author>
<published>2022-03-15T09:18:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=046e1537a3cf0adc68fe865b5dc9a7e731cc63b3'/>
<id>046e1537a3cf0adc68fe865b5dc9a7e731cc63b3</id>
<content type='text'>
Network drivers can call to netif_get_num_default_rss_queues to get the
default number of receive queues to use. Right now, this default number
is min(8, num_online_cpus()).

Instead, as suggested by Jakub, use the number of physical cores divided
by 2 as a way to avoid wasting CPU resources and to avoid using both CPU
threads, but still allowing to scale for high-end processors with many
cores.

As an exception, select 2 queues for processors with 2 cores, because
otherwise it won't take any advantage of RSS despite being SMP capable.

Tested: Processor Intel Xeon E5-2620 (2 sockets, 6 cores/socket, 2
threads/core). NIC Broadcom NetXtreme II BCM57810 (10GBps). Ran some
tests with `perf stat iperf3 -R`, with parallelisms of 1, 8 and 24,
getting the following results:
- Number of queues: 6 (instead of 8)
- Network throughput: not affected
- CPU usage: utilized 0.05-0.12 CPUs more than before (having 24 CPUs
  this is only 0.2-0.5% higher)
- Reduced the number of context switches by 7-50%, being more noticeable
  when using a higher number of parallel threads.

Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Íñigo Huguet &lt;ihuguet@redhat.com&gt;
Link: https://lore.kernel.org/r/20220315091832.13873-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Network drivers can call to netif_get_num_default_rss_queues to get the
default number of receive queues to use. Right now, this default number
is min(8, num_online_cpus()).

Instead, as suggested by Jakub, use the number of physical cores divided
by 2 as a way to avoid wasting CPU resources and to avoid using both CPU
threads, but still allowing to scale for high-end processors with many
cores.

As an exception, select 2 queues for processors with 2 cores, because
otherwise it won't take any advantage of RSS despite being SMP capable.

Tested: Processor Intel Xeon E5-2620 (2 sockets, 6 cores/socket, 2
threads/core). NIC Broadcom NetXtreme II BCM57810 (10GBps). Ran some
tests with `perf stat iperf3 -R`, with parallelisms of 1, 8 and 24,
getting the following results:
- Number of queues: 6 (instead of 8)
- Network throughput: not affected
- CPU usage: utilized 0.05-0.12 CPUs more than before (having 24 CPUs
  this is only 0.2-0.5% higher)
- Reduced the number of context switches by 7-50%, being more noticeable
  when using a higher number of parallel threads.

Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Íñigo Huguet &lt;ihuguet@redhat.com&gt;
Link: https://lore.kernel.org/r/20220315091832.13873-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add lockdep asserts to ____napi_schedule().</title>
<updated>2022-03-14T10:09:28+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2022-03-11T15:03:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=fbd9a2ceba5c74bbfa19cf257ae4b4b2c820860d'/>
<id>fbd9a2ceba5c74bbfa19cf257ae4b4b2c820860d</id>
<content type='text'>
____napi_schedule() needs to be invoked with disabled interrupts due to
__raise_softirq_irqoff (in order not to corrupt the per-CPU list).
____napi_schedule() needs also to be invoked from an interrupt context
so that the raised-softirq is processed while the interrupt context is
left.

Add lockdep asserts for both conditions.
While this is the second time the irq/softirq check is needed, provide a
generic lockdep_assert_softirq_will_run() which is used by both caller.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
____napi_schedule() needs to be invoked with disabled interrupts due to
__raise_softirq_irqoff (in order not to corrupt the per-CPU list).
____napi_schedule() needs also to be invoked from an interrupt context
so that the raised-softirq is processed while the interrupt context is
left.

Add lockdep asserts for both conditions.
While this is the second time the irq/softirq check is needed, provide a
generic lockdep_assert_softirq_will_run() which is used by both caller.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add per-cpu storage and net-&gt;core_stats</title>
<updated>2022-03-12T07:17:24+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2022-03-11T05:14:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=625788b5844511cf4c30cffa7fa0bc3a69cebc82'/>
<id>625788b5844511cf4c30cffa7fa0bc3a69cebc82</id>
<content type='text'>
Before adding yet another possibly contended atomic_long_t,
it is time to add per-cpu storage for existing ones:
 dev-&gt;tx_dropped, dev-&gt;rx_dropped, and dev-&gt;rx_nohandler

Because many devices do not have to increment such counters,
allocate the per-cpu storage on demand, so that dev_get_stats()
does not have to spend considerable time folding zero counters.

Note that some drivers have abused these counters which
were supposed to be only used by core networking stack.

v4: should use per_cpu_ptr() in dev_get_stats() (Jakub)
v3: added a READ_ONCE() in netdev_core_stats_alloc() (Paolo)
v2: add a missing include (reported by kernel test robot &lt;lkp@intel.com&gt;)
    Change in netdev_core_stats_alloc() (Jakub)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: jeffreyji &lt;jeffreyji@google.com&gt;
Reviewed-by: Brian Vazquez &lt;brianvv@google.com&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Link: https://lore.kernel.org/r/20220311051420.2608812-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before adding yet another possibly contended atomic_long_t,
it is time to add per-cpu storage for existing ones:
 dev-&gt;tx_dropped, dev-&gt;rx_dropped, and dev-&gt;rx_nohandler

Because many devices do not have to increment such counters,
allocate the per-cpu storage on demand, so that dev_get_stats()
does not have to spend considerable time folding zero counters.

Note that some drivers have abused these counters which
were supposed to be only used by core networking stack.

v4: should use per_cpu_ptr() in dev_get_stats() (Jakub)
v3: added a READ_ONCE() in netdev_core_stats_alloc() (Paolo)
v2: add a missing include (reported by kernel test robot &lt;lkp@intel.com&gt;)
    Change in netdev_core_stats_alloc() (Jakub)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: jeffreyji &lt;jeffreyji@google.com&gt;
Reviewed-by: Brian Vazquez &lt;brianvv@google.com&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Link: https://lore.kernel.org/r/20220311051420.2608812-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: remove exports for netdev_name_node_alt_create() and destroy</title>
<updated>2022-03-12T06:55:09+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-03-10T22:39:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=2387834dd228c9efa60dbe118e0eff051033176b'/>
<id>2387834dd228c9efa60dbe118e0eff051033176b</id>
<content type='text'>
netdev_name_node_alt_create() and netdev_name_node_alt_destroy()
are only called by rtnetlink, so no need for exports.

Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20220310223952.558779-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
netdev_name_node_alt_create() and netdev_name_node_alt_destroy()
are only called by rtnetlink, so no need for exports.

Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20220310223952.558779-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: dev: use kfree_skb_reason() for __netif_receive_skb_core()</title>
<updated>2022-03-04T12:17:11+00:00</updated>
<author>
<name>Menglong Dong</name>
<email>imagedong@tencent.com</email>
</author>
<published>2022-03-04T06:00:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6c2728b7c14164928cb7cb9c847dead101b2d503'/>
<id>6c2728b7c14164928cb7cb9c847dead101b2d503</id>
<content type='text'>
Add reason for skb drops to __netif_receive_skb_core() when packet_type
not found to handle the skb. For this purpose, the drop reason
SKB_DROP_REASON_PTYPE_ABSENT is introduced. Take ether packets for
example, this case mainly happens when L3 protocol is not supported.

Signed-off-by: Menglong Dong &lt;imagedong@tencent.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add reason for skb drops to __netif_receive_skb_core() when packet_type
not found to handle the skb. For this purpose, the drop reason
SKB_DROP_REASON_PTYPE_ABSENT is introduced. Take ether packets for
example, this case mainly happens when L3 protocol is not supported.

Signed-off-by: Menglong Dong &lt;imagedong@tencent.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: dev: use kfree_skb_reason() for sch_handle_ingress()</title>
<updated>2022-03-04T12:17:11+00:00</updated>
<author>
<name>Menglong Dong</name>
<email>imagedong@tencent.com</email>
</author>
<published>2022-03-04T06:00:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=a568aff26ac03ee9eb1482683514914a5ec3b4c3'/>
<id>a568aff26ac03ee9eb1482683514914a5ec3b4c3</id>
<content type='text'>
Replace kfree_skb() used in sch_handle_ingress() with
kfree_skb_reason(). Following drop reasons are introduced:

SKB_DROP_REASON_TC_INGRESS

Signed-off-by: Menglong Dong &lt;imagedong@tencent.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace kfree_skb() used in sch_handle_ingress() with
kfree_skb_reason(). Following drop reasons are introduced:

SKB_DROP_REASON_TC_INGRESS

Signed-off-by: Menglong Dong &lt;imagedong@tencent.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
