Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 39f6eed4cb209643f3f8633291854ed7375d7264 ]
Previously the IPv6 addresses in the key were clobbered and the mask was
left unset.
I haven't tested this; I noticed it while skimming the code to
understand an unrelated issue.
Fixes: cfab6dbd0ecf ("netfilter: flowtable: add tunnel match offload support")
Cc: wenxu <wenxu@ucloud.cn>
Signed-off-by: Will Mortensen <willmo@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c95c07836fa4c1767ed11d8eca0769c652760e32 ]
We are changing expire_nodest_conn to work even for reused connections when
conn_reuse_mode=0, just as what was done with commit dc7b3eb900aa ("ipvs:
Fix reuse connection if real server is dead").
For controlled and persistent connections, the new connection will get the
needed real server depending on the rules in ip_vs_check_template().
Fixes: d752c3645717 ("ipvs: allow rescheduling of new connections when port reuse is detected")
Co-developed-by: Chuanqi Liu <legend050709@qq.com>
Signed-off-by: Chuanqi Liu <legend050709@qq.com>
Signed-off-by: yangxingwu <xingwu.yang@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 77522ff02f333434612bd72df9b376f8d3836e4d ]
And be consistent in error management for both orig/reply filtering
Fixes: cb8aa9a3affb ("netfilter: ctnetlink: add kernel side filtering for dump")
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ad81d4daf6a3f4769a346e635d5e1e967ca455d9 ]
filter->orig_flags was used for a reply context.
Fixes: cb8aa9a3affb ("netfilter: ctnetlink: add kernel side filtering for dump")
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 1c743127cc54b112b155f434756bd4b5fa565a99 upstream.
When we try to add an IPv6 nexthop and IPv6 is not enabled
(!CONFIG_IPV6) we'll hit a NULL pointer dereference[1] in the error path
of nh_create_ipv6() due to calling ipv6_stub->fib6_nh_release. The bug
has been present since the beginning of IPv6 nexthop gateway support.
Commit 1aefd3de7bc6 ("ipv6: Add fib6_nh_init and release to stubs") tells
us that only fib6_nh_init has a dummy stub because fib6_nh_release should
not be called if fib6_nh_init returns an error, but the commit below added
a call to ipv6_stub->fib6_nh_release in its error path. To fix it return
the dummy stub's -EAFNOSUPPORT error directly without calling
ipv6_stub->fib6_nh_release in nh_create_ipv6()'s error path.
[1]
Output is a bit truncated, but it clearly shows the error.
BUG: kernel NULL pointer dereference, address: 000000000000000000
#PF: supervisor instruction fetch in kernel modede
#PF: error_code(0x0010) - not-present pagege
PGD 0 P4D 0
Oops: 0010 [#1] PREEMPT SMP NOPTI
CPU: 4 PID: 638 Comm: ip Kdump: loaded Not tainted 5.16.0-rc1+ #446
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffff888109f5b8f0 EFLAGS: 00010286^Ac
RAX: 0000000000000000 RBX: ffff888109f5ba28 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8881008a2860
RBP: ffff888109f5b9d8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff888109f5b978 R11: ffff888109f5b948 R12: 00000000ffffff9f
R13: ffff8881008a2a80 R14: ffff8881008a2860 R15: ffff8881008a2840
FS: 00007f98de70f100(0000) GS:ffff88822bf00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000100efc000 CR4: 00000000000006e0
Call Trace:
<TASK>
nh_create_ipv6+0xed/0x10c
rtm_new_nexthop+0x6d7/0x13f3
? check_preemption_disabled+0x3d/0xf2
? lock_is_held_type+0xbe/0xfd
rtnetlink_rcv_msg+0x23f/0x26a
? check_preemption_disabled+0x3d/0xf2
? rtnl_calcit.isra.0+0x147/0x147
netlink_rcv_skb+0x61/0xb2
netlink_unicast+0x100/0x187
netlink_sendmsg+0x37f/0x3a0
? netlink_unicast+0x187/0x187
sock_sendmsg_nosec+0x67/0x9b
____sys_sendmsg+0x19d/0x1f9
? copy_msghdr_from_user+0x4c/0x5e
? rcu_read_lock_any_held+0x2a/0x78
___sys_sendmsg+0x6c/0x8c
? asm_sysvec_apic_timer_interrupt+0x12/0x20
? lockdep_hardirqs_on+0xd9/0x102
? sockfd_lookup_light+0x69/0x99
__sys_sendmsg+0x50/0x6e
do_syscall_64+0xcb/0xf2
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f98dea28914
Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 48 8d 05 e9 5d 0c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 41 89 d4 55 48 89 f5 53
RSP: 002b:00007fff859f5e68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e2e
RAX: ffffffffffffffda RBX: 00000000619cb810 RCX: 00007f98dea28914
RDX: 0000000000000000 RSI: 00007fff859f5ed0 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000008
R10: fffffffffffffce6 R11: 0000000000000246 R12: 0000000000000001
R13: 000055c0097ae520 R14: 000055c0097957fd R15: 00007fff859f63a0
</TASK>
Modules linked in: bridge stp llc bonding virtio_net
Cc: stable@vger.kernel.org
Fixes: 53010f991a9f ("nexthop: Add support for IPv6 gateways")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 563fbefed46ae4c1f70cffb8eb54c02df480b2c2 upstream.
If the userspace tools switch from NL80211_IFTYPE_P2P_GO to
NL80211_IFTYPE_ADHOC via send_msg(NL80211_CMD_SET_INTERFACE), it
does not call the cleanup cfg80211_stop_ap(), this leads to the
initialization of in-use data. For example, this path re-init the
sdata->assigned_chanctx_list while it is still an element of
assigned_vifs list, and makes that linked list corrupt.
Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com>
Reported-by: syzbot+bbf402b783eeb6d908db@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20211027173722.777287-1-phind.uet@gmail.com
Cc: stable@vger.kernel.org
Fixes: ac800140c20e ("cfg80211: .stop_ap when interface is going down")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3e6db079751afd527bf3db32314ae938dc571916 upstream.
kmemdup can return a null pointer so need to check for it, otherwise
the null key will be dereferenced later in tipc_crypto_key_xmit as
can be seen in the trace [1].
Cc: tipc-discussion@lists.sourceforge.net
Cc: stable@vger.kernel.org # 5.15, 5.14, 5.10
[1] https://syzkaller.appspot.com/bug?id=bca180abb29567b189efdbdb34cbf7ba851c2a58
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20211115160143.5099-1-tadeusz.struk@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 48b71a9e66c2eab60564b1b1c85f4928ed04e406 ]
There are two sites that calls queue_work() after the
destroy_workqueue() and lead to possible UAF.
The first site is nci_send_cmd(), which can happen after the
nci_close_device as below
nfcmrvl_nci_unregister_dev | nfc_genl_dev_up
nci_close_device |
flush_workqueue |
del_timer_sync |
nci_unregister_device | nfc_get_device
destroy_workqueue | nfc_dev_up
nfc_unregister_device | nci_dev_up
device_del | nci_open_device
| __nci_request
| nci_send_cmd
| queue_work !!!
Another site is nci_cmd_timer, awaked by the nci_cmd_work from the
nci_send_cmd.
... | ...
nci_unregister_device | queue_work
destroy_workqueue |
nfc_unregister_device | ...
device_del | nci_cmd_work
| mod_timer
| ...
| nci_cmd_timer
| queue_work !!!
For the above two UAF, the root cause is that the nfc_dev_up can race
between the nci_unregister_device routine. Therefore, this patch
introduce NCI_UNREG flag to easily eliminate the possible race. In
addition, the mutex_lock in nci_close_device can act as a barrier.
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20211116152732.19238-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f011539e723c737b74876ac47345e40270a3c384 ]
If there is a NCI command in work queue after closing the NCI device at
nci_unregister_device, The NCI command timer starts at flush_workqueue
function and then NCI command timeout handler would be called 5 second
after flushing the NCI command work queue and destroying the queue.
At that time, the timeout handler would try to use NCI command work queue
that is destroyed already. it will causes the problem. To avoid this
abnormal situation, change the sequence to prevent the NCI command timeout
handler from being called after destroying the NCI command work queue.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3e3b5dfcd16a3e254aab61bd1e8c417dd4503102 ]
There is a potential UAF between the unregistration routine and the NFC
netlink operations.
The race that cause that UAF can be shown as below:
(FREE) | (USE)
nfcmrvl_nci_unregister_dev | nfc_genl_dev_up
nci_close_device |
nci_unregister_device | nfc_get_device
nfc_unregister_device | nfc_dev_up
rfkill_destory |
device_del | rfkill_blocked
... | ...
The root cause for this race is concluded below:
1. The rfkill_blocked (USE) in nfc_dev_up is supposed to be placed after
the device_is_registered check.
2. Since the netlink operations are possible just after the device_add
in nfc_register_device, the nfc_dev_up() can happen anywhere during the
rfkill creation process, which leads to data race.
This patch reorder these actions to permit
1. Once device_del is finished, the nfc_dev_up cannot dereference the
rfkill object.
2. The rfkill_register need to be placed after the device_add of nfc_dev
because the parent device need to be created first. So this patch keeps
the order but inject device_lock to prevent the data race.
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Fixes: be055b2f89b5 ("NFC: RFKILL support")
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20211116152652.19217-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 86cdf8e38792545161dbe3350a7eced558ba4d15 ]
There is a possible data race as shown below:
thread-A in nci_request() | thread-B in nci_close_device()
| mutex_lock(&ndev->req_lock);
test_bit(NCI_UP, &ndev->flags); |
... | test_and_clear_bit(NCI_UP, &ndev->flags)
mutex_lock(&ndev->req_lock); |
|
This race will allow __nci_request() to be awaked while the device is
getting removed.
Similar to commit e2cb6b891ad2 ("bluetooth: eliminate the potential race
condition when removing the HCI controller"). this patch alters the
function sequence in nci_request() to prevent the data races between the
nci_close_device().
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Link: https://lore.kernel.org/r/20211115145600.8320-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f799ada6bf2397c351220088b9b0980125c77280 ]
Without dropping dst, the packets sent from local mirred/redirected
to ingress will may still use the old dst. ip_rcv() will drop it as
the old dst is for output and its .input is dst_discard.
This patch is to fix by also dropping dst for those packets that are
mirred or redirected from egress to ingress in act_mirred.
Note that we don't drop it for the direction change from ingress to
egress, as on which there might be a user case attaching a metadata
dst by act_tunnel_key that would be used later.
Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cf4f5530bb55ef7d5a91036b26676643b80b1616 ]
The link_id is supposed to be unique, but smcr_next_link_id() doesn't
skip the used link_id as expected. So the patch fixes this.
Fixes: 026c381fb477 ("net/smc: introduce link_idx for link group array")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 938cca9e4109b30ee1d476904538225a825e54eb ]
sk_clone_lock() needs to call sock_inuse_add(1) before entering the
sk_free_unlock_clone() error path, for __sk_free() from sk_free() from
sk_free_unlock_clone() calls sock_inuse_add(-1).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 648845ab7e200993 ("sock: Move the socket inuse to namespace.")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bbc20b70424aeb3c84f833860f6340adda5141fc ]
Rework initial test to jump over init code
if memory allocation has failed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210127152731.748663-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 271351d255b09e39c7f6437738cba595f9b235be ]
The MSG_CRYPTO msgs are always encrypted and sent to other nodes
for keys' deployment. But when receiving in peers, if those nodes
do not validate it and make sure it's encrypted, one could craft
a malicious MSG_CRYPTO msg to deploy its key with no need to know
other nodes' keys.
This patch is to do that by checking TIPC_SKB_CB(skb)->decrypted
and discard it if this packet never got decrypted.
Note that this is also a supplementary fix to CVE-2021-43267 that
can be triggered by an unencrypted malicious MSG_CRYPTO msg.
Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange")
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 70701b83e208767f2720d8cd3e6a62cddafb3a30 ]
TCP Receive zerocopy iterates through the SKB queue via
tcp_recv_skb(), acquiring a pointer to an SKB and an offset within
that SKB to read from. From there, it iterates the SKB frags array to
determine which offset to start remapping pages from.
However, this is built on the assumption that the offset read so far
within the SKB is smaller than the SKB length. If this assumption is
violated, we can attempt to read an invalid frags array element, which
would cause a fault.
tcp_recv_skb() can cause such an SKB to be returned when the TCP FIN
flag is set. Therefore, we must guard against this occurrence inside
skb_advance_frag().
One way that we can reproduce this error follows:
1) In a receiver program, call getsockopt(TCP_ZEROCOPY_RECEIVE) with:
char some_array[32 * 1024];
struct tcp_zerocopy_receive zc = {
.copybuf_address = (__u64) &some_array[0],
.copybuf_len = 32 * 1024,
};
2) In a sender program, after a TCP handshake, send the following
sequence of packets:
i) Seq = [X, X+4000]
ii) Seq = [X+4000, X+5000]
iii) Seq = [X+4000, X+5000], Flags = FIN | URG, urgptr=1000
(This can happen without URG, if we have a signal pending, but URG is
a convenient way to reproduce the behaviour).
In this case, the following event sequence will occur on the receiver:
tcp_zerocopy_receive():
-> receive_fallback_to_copy() // copybuf_len >= inq
-> tcp_recvmsg_locked() // reads 5000 bytes, then breaks due to URG
-> tcp_recv_skb() // yields skb with skb->len == offset
-> tcp_zerocopy_set_hint_for_skb()
-> skb_advance_to_frag() // will returns a frags ptr. >= nr_frags
-> find_next_mappable_frag() // will dereference this bad frags ptr.
With this patch, skb_advance_to_frag() will no longer return an
invalid frags pointer, and will return NULL instead, fixing the issue.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
Link: https://lore.kernel.org/r/20211111235215.2605384-1-arjunroy.kdev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7fba5309efe24e4f0284ef4b8663cdf401035e72 ]
Refactor skb frag fast-forwarding for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
skb_advance_to_frag(), given a skb and an offset into the skb,
iterates from the first frag for the skb until we're at the frag
specified by the offset. Assuming the offset provided refers to how
many bytes in the skb are already read, the returned frag points to
the next frag we may read from, while offset_frag is set to the number
of bytes from this frag that we have already read.
If frag is not null and offset_frag is equal to 0, then we may be able
to map this frag's page into the process address space with
vm_insert_page(). However, if offset_frag is not equal to 0, then we
cannot do so.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 18fb76ed53865c1b5d5f0157b1b825704590beb5 ]
When TCP receive zerocopy does not successfully map the entire
requested space, it outputs a 'hint' that the caller should recvmsg().
Augment zerocopy to accept a user buffer that it tries to copy this
hint into - if it is possible to copy the entire hint, it will do so.
This elides a recvmsg() call for received traffic that isn't exactly
page-aligned in size.
This was tested with RPC-style traffic of arbitrary sizes. Normally,
each received message required at least one getsockopt() call, and one
recvmsg() call for the remaining unaligned data.
With this change, almost all of the recvmsg() calls are eliminated,
leading to a savings of about 25%-50% in number of system calls
for RPC-style workloads.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit ea7a1019d8baf8503ecd6e3ec8436dec283569e6 upstream.
The premise of commit 6f9f17287e78 ("SUNRPC: Mitigate cond_resched() in
xprt_transmit()") was that cond_resched() is expensive and unnecessary
when there has been just a single send.
The point of cond_resched() is to ensure that tasks that should pre-empt
this one get a chance to do so when it is safe to do so. The code prior
to commit 6f9f17287e78 failed to take into account that it was keeping a
rpc_task pinned for longer than it needed to, and so rather than doing a
full revert, let's just move the cond_resched.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 27eb4c3144f7a5ebef3c9a261d80cb3e1fa784dc upstream.
Link: https://lkml.kernel.org/r/99338965-d36c-886e-cd0e-1d8fff2b4746@gmail.com
Reported-by: syzbot+06472778c97ed94af66d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3dc20f4762c62d3b3f0940644881ed818aa7b2f5 ]
Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT
state and NTF_USE flag with a dynamic NUD state from a user space control plane.
Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing
neighbor entry in combination with NTF_USE flag.
This is due to the latter directly calling into neigh_event_send() without any
meta data updates as happening in __neigh_update(). Thus, to enable this use
case, extend the latter with a NEIGH_UPDATE_F_USE flag where we break the
NUD_PERMANENT state in particular so that a latter neigh_event_send() is able
to re-resolve a neighbor entry.
Before fix, NUD_PERMANENT -> NUD_* & NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
As can be seen, despite the admin-triggered replace, the entry remains in the
NUD_PERMANENT state.
After fix, NUD_PERMANENT -> NUD_* & NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
[...]
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
After the fix, the admin-triggered replace switches to a dynamic state from
the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can
transition back from there, if needed, into NUD_PERMANENT.
Similar before/after behavior can be observed for below transitions:
Before fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
After fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[..]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e5d5aadcf3cd59949316df49c27cb21788d7efe4 ]
We got the following WARNING when running ab/nginx
test with RDMA link flapping (up-down-up).
The reason is when smc_sock fallback and at linkdown
happens simultaneously, we may got the following situation:
__smc_lgr_terminate()
--> smc_conn_kill()
--> smc_close_active_abort()
smc_sock->sk_state = SMC_CLOSED
sock_put(smc_sock)
smc_sock was set to SMC_CLOSED and sock_put() been called
when terminate the link group. But later application call
close() on the socket, then we got:
__smc_release():
if (smc_sock->fallback)
smc_sock->sk_state = SMC_CLOSED
sock_put(smc_sock)
Again we set the smc_sock to CLOSED through it's already
in CLOSED state, and double put the refcnt, so the following
warning happens:
refcount_t: underflow; use-after-free.
WARNING: CPU: 5 PID: 860 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
Modules linked in:
CPU: 5 PID: 860 Comm: nginx Not tainted 5.10.46+ #403
Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
RIP: 0010:refcount_warn_saturate+0x8d/0xf0
Code: 05 5c 1e b5 01 01 e8 52 25 bc ff 0f 0b c3 80 3d 4f 1e b5 01 00 75 ad 48
RSP: 0018:ffffc90000527e50 EFLAGS: 00010286
RAX: 0000000000000026 RBX: ffff8881300df2c0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffff88813bd58040 RDI: ffff88813bd58048
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000001
R10: ffff8881300df2c0 R11: ffffc90000527c78 R12: ffff8881300df340
R13: ffff8881300df930 R14: ffff88810b3dad80 R15: ffff8881300df4f8
FS: 00007f739de8fb80(0000) GS:ffff88813bd40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000a01b008 CR3: 0000000111b64003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
smc_release+0x353/0x3f0
__sock_release+0x3d/0xb0
sock_close+0x11/0x20
__fput+0x93/0x230
task_work_run+0x65/0xa0
exit_to_user_mode_prepare+0xf9/0x100
syscall_exit_to_user_mode+0x27/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This patch adds check in __smc_release() to make
sure we won't do an extra sock_put() and set the
socket to CLOSED when its already in CLOSED state.
Fixes: 51f1de79ad8e (net/smc: replace sock_put worker by socket refcounting)
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c7cd82b90599fa10915f41e3dd9098a77d0aa7b6 ]
Currently vosck_connect() increments sock refcount for nonblocking
socket each time it's called, which can lead to memory leak if
it's called multiple times because connect timeout function decrements
sock refcount only once.
Fixes it by making vsock_connect() return -EALREADY immediately when
sock state is already SS_CONNECTING.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6dc25401cba4d428328eade8ceae717633fdd702 ]
1) if q->tk_offset == TK_OFFS_MAX, then get_tcp_tstamp() calls
ktime_mono_to_any() with out-of-bound value.
2) if q->tk_offset is changed in taprio_parse_clockid(),
taprio_get_time() might also call ktime_mono_to_any()
with out-of-bound value as sysbot found:
UBSAN: array-index-out-of-bounds in kernel/time/timekeeping.c:908:27
index 3 is out of range for type 'ktime_t *[3]'
CPU: 1 PID: 25668 Comm: kworker/u4:0 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
__ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
ktime_mono_to_any+0x1d4/0x1e0 kernel/time/timekeeping.c:908
get_tcp_tstamp net/sched/sch_taprio.c:322 [inline]
get_packet_txtime net/sched/sch_taprio.c:353 [inline]
taprio_enqueue_one+0x5b0/0x1460 net/sched/sch_taprio.c:420
taprio_enqueue+0x3b1/0x730 net/sched/sch_taprio.c:485
dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3785
__dev_xmit_skb net/core/dev.c:3869 [inline]
__dev_queue_xmit+0x1f6e/0x3630 net/core/dev.c:4194
batadv_send_skb_packet+0x4a9/0x5f0 net/batman-adv/send.c:108
batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:393 [inline]
batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:421 [inline]
batadv_iv_send_outstanding_bat_ogm_packet+0x6d7/0x8e0 net/batman-adv/bat_iv_ogm.c:1701
process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
kthread+0x405/0x4f0 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Fixes: 7ede7b03484b ("taprio: make clock reference conversions easier")
Fixes: 54002066100b ("taprio: Adjust timestamps for TCP packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vedang Patel <vedang.patel@intel.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/20211108180815.1822479-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e0dc3b93bd7bcff8c3813d1df43e0908499c7cf0 ]
Strparser is reusing the qdisc_skb_cb struct to stash the skb message handling
progress, e.g. offset and length of the skb. First this is poorly named and
inherits a struct from qdisc that doesn't reflect the actual usage of cb[] at
this layer.
But, more importantly strparser is using the following to access its metadata.
(struct _strp_msg *)((void *)skb->cb + offsetof(struct qdisc_skb_cb, data))
Where _strp_msg is defined as:
struct _strp_msg {
struct strp_msg strp; /* 0 8 */
int accum_len; /* 8 4 */
/* size: 12, cachelines: 1, members: 2 */
/* last cacheline: 12 bytes */
};
So we use 12 bytes of ->data[] in struct. However in BPF code running parser
and verdict the user has read capabilities into the data[] array as well. Its
not too problematic, but we should not be exposing internal state to BPF
program. If its really needed then we can use the probe_read() APIs which allow
reading kernel memory. And I don't believe cb[] layer poses any API breakage by
moving this around because programs can't depend on cb[] across layers.
In order to fix another issue with a ctx rewrite we need to stash a temp
variable somewhere. To make this work cleanly this patch builds a cb struct
for sk_skb types called sk_skb_cb struct. Then we can use this consistently
in the strparser, sockmap space. Additionally we can start allowing ->cb[]
write access after this.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-5-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b8b8315e39ffaca82e79d86dde26e9144addf66b ]
We do not need to handle unhash from BPF side we can simply wait for the
close to happen. The original concern was a socket could transition from
ESTABLISHED state to a new state while the BPF hook was still attached.
But, we convinced ourself this is no longer possible and we also improved
BPF sockmap to handle listen sockets so this is no longer a problem.
More importantly though there are cases where unhash is called when data is
in the receive queue. The BPF unhash logic will flush this data which is
wrong. To be correct it should keep the data in the receive queue and allow
a receiving application to continue reading the data. This may happen when
tcp_abort() is received for example. Instead of complicating the logic in
unhash simply moving all this to tcp_close() hook solves this.
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-3-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 563bcbae3ba233c275c244bfce2efe12938f5363 ]
The real_dev of a vlan net_device may be freed after
unregister_vlan_dev(). Access the real_dev continually by
vlan_dev_real_dev() will trigger the UAF problem for the
real_dev like following:
==================================================================
BUG: KASAN: use-after-free in vlan_dev_real_dev+0xf9/0x120
Call Trace:
kasan_report.cold+0x83/0xdf
vlan_dev_real_dev+0xf9/0x120
is_eth_port_of_netdev_filter.part.0+0xb1/0x2c0
is_eth_port_of_netdev_filter+0x28/0x40
ib_enum_roce_netdev+0x1a3/0x300
ib_enum_all_roce_netdevs+0xc7/0x140
netdevice_event_work_handler+0x9d/0x210
...
Freed by task 9288:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0xfc/0x130
slab_free_freelist_hook+0xdd/0x240
kfree+0xe4/0x690
kvfree+0x42/0x50
device_release+0x9f/0x240
kobject_put+0x1c8/0x530
put_device+0x1b/0x30
free_netdev+0x370/0x540
ppp_destroy_interface+0x313/0x3d0
...
Move the put_device(real_dev) to vlan_dev_free(). Ensure
real_dev not be freed before vlan_dev unregistered.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+e4df4e1389e28972e955@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1aabe578dd86e9f2867c4db4fba9a15f4ba1825d ]
ETHTOOL_A_PAUSE_STAT_MAX is the MAX attribute id,
so we need to subtract non-stats and add one to
get a count (IOW -2+1 == -1).
Otherwise we'll see:
ethnl cmd 21: calculated reply length 40, but consumed 52
Fixes: 9a27a33027f2 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5648b5e1169ff1d6d6a46c35c0b5fbebd2a5cbb2 ]
On 64bit platforms the MAC header is set to 0xffff on allocation and
also when a helper like skb_unset_mac_header() is called.
dev_parse_header may call skb_mac_header() which assumes valid mac offset:
BUG: KASAN: use-after-free in eth_header_parse+0x75/0x90
Read of size 6 at addr ffff8881075a5c05 by task nf-queue/1364
Call Trace:
memcpy+0x20/0x60
eth_header_parse+0x75/0x90
__nfqnl_enqueue_packet+0x1a61/0x3380
__nf_queue+0x597/0x1300
nf_queue+0xf/0x40
nf_hook_slow+0xed/0x190
nf_hook+0x184/0x440
ip_output+0x1c0/0x2a0
nf_reinject+0x26f/0x700
nfqnl_recv_verdict+0xa16/0x18b0
nfnetlink_rcv_msg+0x506/0xe70
The existing code only works if the skb has a mac header.
Fixes: 2c38de4c1f8da7 ("netfilter: fix looped (broad|multi)cast's MAC handling")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9b6e27d01adcec58e046c624874f8a124e8b07ec ]
Dan Carpenter says:
The patch d20c11d86d8f: "nfsd: Protect session creation and client
confirm using client_lock" from Jul 30, 2014, leads to the following
Smatch static checker warning:
net/sunrpc/addr.c:178 rpc_parse_scope_id()
warn: sleeping in atomic context
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: d20c11d86d8f ("nfsd: Protect session creation and client...")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 42dcfd850e514b229d616a53dec06d0f2533217c ]
Commit c6af0c227a22 ("ip: support SO_MARK cmsg")
added propagation of SO_MARK from cmsg to skb->mark.
For IPv4 and raw sockets the mark also affects route
lookup, but in case of IPv6 the flow info is
initialized before cmsg is parsed.
Fixes: c6af0c227a22 ("ip: support SO_MARK cmsg")
Reported-and-tested-by: Xintong Hu <huxintong@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cf12e6f9124629b18a6182deefc0315f0a73a199 ]
v1: Implement a more general statement as recommended by Eric Dumazet. The
sequence number will be advanced, so this check will fix the FIN case and
other cases.
A customer reported sockets stuck in the CLOSING state. A Vmcore revealed that
the write_queue was not empty as determined by tcp_write_queue_empty() but the
sk_buff containing the FIN flag had been freed and the socket was zombied in
that state. Corresponding pcaps show no FIN from the Linux kernel on the wire.
Some instrumentation was added to the kernel and it was found that there is a
timing window where tcp_sendmsg() can run after tcp_send_fin().
tcp_sendmsg() will hit an error, for example:
1269 ▹ if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))↩
1270 ▹ ▹ goto do_error;↩
tcp_remove_empty_skb() will then free the FIN sk_buff as "skb->len == 0". The
TCP socket is now wedged in the FIN-WAIT-1 state because the FIN is never sent.
If the other side sends a FIN packet the socket will transition to CLOSING and
remain that way until the system is rebooted.
Fix this by checking for the FIN flag in the sk_buff and don't free it if that
is the case. Testing confirmed that fixed the issue.
Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reported-by: Monir Zouaoui <Monir.Zouaoui@mail.schwarz>
Reported-by: Simon Stier <simon.stier@mail.schwarz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 24bcbe1cc69fa52dc4f7b5b2456678ed464724d8 ]
sk_stream_kill_queues() can be called on close when there are
still outstanding skbs to transmit. Those skbs may try to queue
notifications to the error queue (e.g. timestamps).
If sk_stream_kill_queues() purges the queue without taking
its lock the queue may get corrupted, and skbs leaked.
This shows up as a warning about an rmem leak:
WARNING: CPU: 24 PID: 0 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x...
The leak is always a multiple of 0x300 bytes (the value is in
%rax on my builds, so RAX: 0000000000000300). 0x300 is truesize of
an empty sk_buff. Indeed if we dump the socket state at the time
of the warning the sk_error_queue is often (but not always)
corrupted. The ->next pointer points back at the list head,
but not the ->prev pointer. Indeed we can find the leaked skb
by scanning the kernel memory for something that looks like
an skb with ->sk = socket in question, and ->truesize = 0x300.
The contents of ->cb[] of the skb confirms the suspicion that
it is indeed a timestamp notification (as generated in
__skb_complete_tx_timestamp()).
Removing purging of sk_error_queue should be okay, since
inet_sock_destruct() does it again once all socket refs
are gone. Eric suggests this may cause sockets that go
thru disconnect() to maintain notifications from the
previous incarnations of the socket, but that should be
okay since the race was there anyway, and disconnect()
is not exactly dependable.
Thanks to Jonathan Lemon and Omar Sandoval for help at various
stages of tracing the issue.
Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 19757cebf0c5016a1f36f7fe9810a9f0b33c0832 ]
Use of percpu_counter structure to track count of orphaned
sockets is causing problems on modern hosts with 256 cpus
or more.
Stefan Bach reported a serious spinlock contention in real workloads,
that I was able to reproduce with a netfilter rule dropping
incoming FIN packets.
53.56% server [kernel.kallsyms] [k] queued_spin_lock_slowpath
|
---queued_spin_lock_slowpath
|
--53.51%--_raw_spin_lock_irqsave
|
--53.51%--__percpu_counter_sum
tcp_check_oom
|
|--39.03%--__tcp_close
| tcp_close
| inet_release
| inet6_release
| sock_close
| __fput
| ____fput
| task_work_run
| exit_to_usermode_loop
| do_syscall_64
| entry_SYSCALL_64_after_hwframe
| __GI___libc_close
|
--14.48%--tcp_out_of_resources
tcp_write_timeout
tcp_retransmit_timer
tcp_write_timer_handler
tcp_write_timer
call_timer_fn
expire_timers
__run_timers
run_timer_softirq
__softirqentry_text_start
As explained in commit cf86a086a180 ("net/dst: use a smaller percpu_counter
batch for dst entries accounting"), default batch size is too big
for the default value of tcp_max_orphans (262144).
But even if we reduce batch sizes, there would still be cases
where the estimated count of orphans is beyond the limit,
and where tcp_too_many_orphans() has to call the expensive
percpu_counter_sum_positive().
One solution is to use plain per-cpu counters, and have
a timer to periodically refresh this cache.
Updating this cache every 100ms seems about right, tcp pressure
state is not radically changing over shorter periods.
percpu_counter was nice 15 years ago while hosts had less
than 16 cpus, not anymore by current standards.
v2: Fix the build issue for CONFIG_CRYPTO_DEV_CHELSIO_TLS=m,
reported by kernel test robot <lkp@intel.com>
Remove unused socket argument from tcp_too_many_orphans()
Fixes: dd24c00191d5 ("net: Use a percpu_counter for orphan_count")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Stefan Bach <sfb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|