summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2021-09-18Bluetooth: avoid circular locks in sco_sock_connectDesmond Cheong Zhi Xi1-23/+16
[ Upstream commit 734bc5ff783115aa3164f4e9dd5967ae78e0a8ab ] In a future patch, calls to bh_lock_sock in sco.c should be replaced by lock_sock now that none of the functions are run in IRQ context. However, doing so results in a circular locking dependency: ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc4-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.2/14867 is trying to acquire lock: ffff88803e3c1120 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1613 [inline] ffff88803e3c1120 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at: sco_conn_del+0x12a/0x2a0 net/bluetooth/sco.c:191 but task is already holding lock: ffffffff8d2dc7c8 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1497 [inline] ffffffff8d2dc7c8 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xda/0x260 net/bluetooth/hci_conn.c:1608 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (hci_cb_list_lock){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:959 [inline] __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104 hci_connect_cfm include/net/bluetooth/hci_core.h:1482 [inline] hci_remote_features_evt net/bluetooth/hci_event.c:3263 [inline] hci_event_packet+0x2f4d/0x7c50 net/bluetooth/hci_event.c:6240 hci_rx_work+0x4f8/0xd30 net/bluetooth/hci_core.c:5122 process_one_work+0x98d/0x1630 kernel/workqueue.c:2276 worker_thread+0x658/0x11f0 kernel/workqueue.c:2422 kthread+0x3e5/0x4d0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 -> #1 (&hdev->lock){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:959 [inline] __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104 sco_connect net/bluetooth/sco.c:245 [inline] sco_sock_connect+0x227/0xa10 net/bluetooth/sco.c:601 __sys_connect_file+0x155/0x1a0 net/socket.c:1879 __sys_connect+0x161/0x190 net/socket.c:1896 __do_sys_connect net/socket.c:1906 [inline] __se_sys_connect net/socket.c:1903 [inline] __x64_sys_connect+0x6f/0xb0 net/socket.c:1903 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}: check_prev_add kernel/locking/lockdep.c:3051 [inline] check_prevs_add kernel/locking/lockdep.c:3174 [inline] validate_chain kernel/locking/lockdep.c:3789 [inline] __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5015 lock_acquire kernel/locking/lockdep.c:5625 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590 lock_sock_nested+0xca/0x120 net/core/sock.c:3170 lock_sock include/net/sock.h:1613 [inline] sco_conn_del+0x12a/0x2a0 net/bluetooth/sco.c:191 sco_disconn_cfm+0x71/0xb0 net/bluetooth/sco.c:1202 hci_disconn_cfm include/net/bluetooth/hci_core.h:1500 [inline] hci_conn_hash_flush+0x127/0x260 net/bluetooth/hci_conn.c:1608 hci_dev_do_close+0x528/0x1130 net/bluetooth/hci_core.c:1778 hci_unregister_dev+0x1c0/0x5a0 net/bluetooth/hci_core.c:4015 vhci_release+0x70/0xe0 drivers/bluetooth/hci_vhci.c:340 __fput+0x288/0x920 fs/file_table.c:280 task_work_run+0xdd/0x1a0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0xbd4/0x2a60 kernel/exit.c:825 do_group_exit+0x125/0x310 kernel/exit.c:922 get_signal+0x47f/0x2160 kernel/signal.c:2808 arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:865 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:209 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302 ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:288 other info that might help us debug this: Chain exists of: sk_lock-AF_BLUETOOTH-BTPROTO_SCO --> &hdev->lock --> hci_cb_list_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hci_cb_list_lock); lock(&hdev->lock); lock(hci_cb_list_lock); lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO); *** DEADLOCK *** The issue is that the lock hierarchy should go from &hdev->lock --> hci_cb_list_lock --> sk_lock-AF_BLUETOOTH-BTPROTO_SCO. For example, one such call trace is: hci_dev_do_close(): hci_dev_lock(); hci_conn_hash_flush(): hci_disconn_cfm(): mutex_lock(&hci_cb_list_lock); sco_disconn_cfm(): sco_conn_del(): lock_sock(sk); However, in sco_sock_connect, we call lock_sock before calling hci_dev_lock inside sco_connect, thus inverting the lock hierarchy. We fix this by pulling the call to hci_dev_lock out from sco_connect. Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18Bluetooth: schedule SCO timeouts with delayed_workDesmond Cheong Zhi Xi1-6/+29
[ Upstream commit ba316be1b6a00db7126ed9a39f9bee434a508043 ] struct sock.sk_timer should be used as a sock cleanup timer. However, SCO uses it to implement sock timeouts. This causes issues because struct sock.sk_timer's callback is run in an IRQ context, and the timer callback function sco_sock_timeout takes a spin lock on the socket. However, other functions such as sco_conn_del and sco_conn_ready take the spin lock with interrupts enabled. This inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} lock usage could lead to deadlocks as reported by Syzbot [1]: CPU0 ---- lock(slock-AF_BLUETOOTH-BTPROTO_SCO); <Interrupt> lock(slock-AF_BLUETOOTH-BTPROTO_SCO); To fix this, we use delayed work to implement SCO sock timouts instead. This allows us to avoid taking the spin lock on the socket in an IRQ context, and corrects the misuse of struct sock.sk_timer. As a note, cancel_delayed_work is used instead of cancel_delayed_work_sync in sco_sock_set_timer and sco_sock_clear_timer to avoid a deadlock. In the future, the call to bh_lock_sock inside sco_sock_timeout should be changed to lock_sock to synchronize with other functions using lock_sock. However, since sco_sock_set_timer and sco_sock_clear_timer are sometimes called under the locked socket (in sco_connect and __sco_sock_close), cancel_delayed_work_sync might cause them to sleep until an sco_sock_timeout that has started finishes running. But sco_sock_timeout would also sleep until it can grab the lock_sock. Using cancel_delayed_work is fine because sco_sock_timeout does not change from run to run, hence there is no functional difference between: 1. waiting for a timeout to finish running before scheduling another timeout 2. scheduling another timeout while a timeout is running. Link: https://syzkaller.appspot.com/bug?id=9089d89de0502e120f234ca0fc8a703f7368b31e [1] Reported-by: syzbot+2f6d7c28bb4bf7e82060@syzkaller.appspotmail.com Tested-by: syzbot+2f6d7c28bb4bf7e82060@syzkaller.appspotmail.com Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18Bluetooth: skip invalid hci_sync_conn_complete_evtDesmond Cheong Zhi Xi1-0/+15
[ Upstream commit 92fe24a7db751b80925214ede43f8d2be792ea7b ] Syzbot reported a corrupted list in kobject_add_internal [1]. This happens when multiple HCI_EV_SYNC_CONN_COMPLETE event packets with status 0 are sent for the same HCI connection. This causes us to register the device more than once which corrupts the kset list. As this is forbidden behavior, we add a check for whether we're trying to process the same HCI_EV_SYNC_CONN_COMPLETE event multiple times for one connection. If that's the case, the event is invalid, so we report an error that the device is misbehaving, and ignore the packet. Link: https://syzkaller.appspot.com/bug?extid=66264bf2fd0476be7e6c [1] Reported-by: syzbot+66264bf2fd0476be7e6c@syzkaller.appspotmail.com Tested-by: syzbot+66264bf2fd0476be7e6c@syzkaller.appspotmail.com Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18flow_dissector: Fix out-of-bounds warningsGustavo A. R. Silva1-4/+8
[ Upstream commit 323e0cb473e2a8706ff162b6b4f4fa16023c9ba7 ] Fix the following out-of-bounds warnings: net/core/flow_dissector.c: In function '__skb_flow_dissect': >> net/core/flow_dissector.c:1104:4: warning: 'memcpy' offset [24, 39] from the object at '<unknown>' is out of the bounds of referenced subobject 'saddr' with type 'struct in6_addr' at offset 8 [-Warray-bounds] 1104 | memcpy(&key_addrs->v6addrs, &iph->saddr, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1105 | sizeof(key_addrs->v6addrs)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/ipv6.h:5, from net/core/flow_dissector.c:6: include/uapi/linux/ipv6.h:133:18: note: subobject 'saddr' declared here 133 | struct in6_addr saddr; | ^~~~~ >> net/core/flow_dissector.c:1059:4: warning: 'memcpy' offset [16, 19] from the object at '<unknown>' is out of the bounds of referenced subobject 'saddr' with type 'unsigned int' at offset 12 [-Warray-bounds] 1059 | memcpy(&key_addrs->v4addrs, &iph->saddr, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1060 | sizeof(key_addrs->v4addrs)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/ip.h:17, from net/core/flow_dissector.c:5: include/uapi/linux/ip.h:103:9: note: subobject 'saddr' declared here 103 | __be32 saddr; | ^~~~~ The problem is that the original code is trying to copy data into a couple of struct members adjacent to each other in a single call to memcpy(). So, the compiler legitimately complains about it. As these are just a couple of members, fix this by copying each one of them in separate calls to memcpy(). This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/d5ae2e65-1f18-2577-246f-bada7eee6ccd@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18ipv4: ip_output.c: Fix out-of-bounds warning in ip_copy_addrs()Gustavo A. R. Silva1-2/+3
[ Upstream commit 6321c7acb82872ef6576c520b0e178eaad3a25c0 ] Fix the following out-of-bounds warning: In function 'ip_copy_addrs', inlined from '__ip_queue_xmit' at net/ipv4/ip_output.c:517:2: net/ipv4/ip_output.c:449:2: warning: 'memcpy' offset [40, 43] from the object at 'fl' is out of the bounds of referenced subobject 'saddr' with type 'unsigned int' at offset 36 [-Warray-bounds] 449 | memcpy(&iph->saddr, &fl4->saddr, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 450 | sizeof(fl4->saddr) + sizeof(fl4->daddr)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem is that the original code is trying to copy data into a couple of struct members adjacent to each other in a single call to memcpy(). This causes a legitimate compiler warning because memcpy() overruns the length of &iph->saddr and &fl4->saddr. As these are just a couple of struct members, fix this by using direct assignments, instead of memcpy(). This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/d5ae2e65-1f18-2577-246f-bada7eee6ccd@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18ethtool: improve compat ioctl handlingArnd Bergmann2-140/+121
[ Upstream commit dd98d2895de6485c884a9cb42de69fed02826fa4 ] The ethtool compat ioctl handling is hidden away in net/socket.c, which introduces a couple of minor oddities: - The implementation may end up diverging, as seen in the RXNFC extension in commit 84a1d9c48200 ("net: ethtool: extend RXNFC API to support RSS spreading of filter matches") that does not work in compat mode. - Most architectures do not need the compat handling at all because u64 and compat_u64 have the same alignment. - On x86, the conversion is done for both x32 and i386 user space, but it's actually wrong to do it for x32 and cannot work there. - On 32-bit Arm, it never worked for compat oabi user space, since that needs to do the same conversion but does not. - It would be nice to get rid of both compat_alloc_user_space() and copy_in_user() throughout the kernel. None of these actually seems to be a serious problem that real users are likely to encounter, but fixing all of them actually leads to code that is both shorter and more readable. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18netlink: Deal with ESRCH error in nlmsg_notify()Yajun Deng1-1/+3
[ Upstream commit fef773fc8110d8124c73a5e6610f89e52814637d ] Yonghong Song report: The bpf selftest tc_bpf failed with latest bpf-next. The following is the command to run and the result: $ ./test_progs -n 132 [ 40.947571] bpf_testmod: loading out-of-tree module taints kernel. test_tc_bpf:PASS:test_tc_bpf__open_and_load 0 nsec test_tc_bpf:PASS:bpf_tc_hook_create(BPF_TC_INGRESS) 0 nsec test_tc_bpf:PASS:bpf_tc_hook_create invalid hook.attach_point 0 nsec test_tc_bpf_basic:PASS:bpf_obj_get_info_by_fd 0 nsec test_tc_bpf_basic:PASS:bpf_tc_attach 0 nsec test_tc_bpf_basic:PASS:handle set 0 nsec test_tc_bpf_basic:PASS:priority set 0 nsec test_tc_bpf_basic:PASS:prog_id set 0 nsec test_tc_bpf_basic:PASS:bpf_tc_attach replace mode 0 nsec test_tc_bpf_basic:PASS:bpf_tc_query 0 nsec test_tc_bpf_basic:PASS:handle set 0 nsec test_tc_bpf_basic:PASS:priority set 0 nsec test_tc_bpf_basic:PASS:prog_id set 0 nsec libbpf: Kernel error message: Failed to send filter delete notification test_tc_bpf_basic:FAIL:bpf_tc_detach unexpected error: -3 (errno 3) test_tc_bpf:FAIL:test_tc_internal ingress unexpected error: -3 (errno 3) The failure seems due to the commit cfdf0d9ae75b ("rtnetlink: use nlmsg_notify() in rtnetlink_send()") Deal with ESRCH error in nlmsg_notify() even the report variable is zero. Reported-by: Yonghong Song <yhs@fb.com> Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Link: https://lore.kernel.org/r/20210719051816.11762-1-yajun.deng@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18tipc: keep the skb in rcv queue until the whole data is readXin Long1-9/+27
[ Upstream commit f4919ff59c2828064b4156e3c3600a169909bcf4 ] Currently, when userspace reads a datagram with a buffer that is smaller than this datagram, the data will be truncated and only part of it can be received by users. It doesn't seem right that users don't know the datagram size and have to use a huge buffer to read it to avoid the truncation. This patch to fix it by keeping the skb in rcv queue until the whole data is read by users. Only the last msg of the datagram will be marked with MSG_EOR, just as TCP/SCTP does. Note that this will work as above only when MSG_EOR is set in the flags parameter of recvmsg(), so that it won't break any old user applications. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18sunrpc: Fix return value of get_srcport()Anna Schumaker1-1/+1
[ Upstream commit 5d46dd04cb68771f77ba66dbf6fd323a4a2ce00d ] Since bc1c56e9bbe9 transport->srcport may by unset, causing get_srcport() to return 0 when called. Fix this by querying the port from the underlying socket instead of the transport. Fixes: bc1c56e9bbe9 (SUNRPC: prevent port reuse on transports which don't request it) Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18SUNRPC query transport's source portOlga Kornievskaia1-0/+7
[ Upstream commit a8482488a7d6d320f63a9ee1912dbb5ae5b80a61 ] Provide ability to query transport's source port. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18SUNRPC/xprtrdma: Fix reconnection lockingTrond Myklebust2-6/+7
[ Upstream commit f99fa50880f5300fbbb3c0754ddc7f8738d24fe7 ] The xprtrdma client code currently relies on the task that initiated the connect to hold the XPRT_LOCK for the duration of the connection attempt. If the task is woken early, due to some other event, then that lock could get released early. Avoid races by using the same mechanism that the socket code uses of transferring lock ownership to the RDMA connect worker itself. That frees us to call rpcrdma_xprt_disconnect() directly since we're now guaranteed exclusion w.r.t. other callers. Fixes: 4cf44be6f1e8 ("xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18SUNRPC: Fix potential memory corruptionTrond Myklebust1-2/+4
[ Upstream commit c2dc3e5fad13aca5d7bdf4bcb52b1a1d707c8555 ] We really should not call rpc_wake_up_queued_task_set_status() with xprt->snd_task as an argument unless we are certain that is actually an rpc_task. Fixes: 0445f92c5d53 ("SUNRPC: Fix disconnection races") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-189p/xen: Fix end of loop tests for list_for_each_entryHarshvardhan Jha1-2/+2
commit 732b33d0dbf17e9483f0b50385bf606f724f50a2 upstream. This patch addresses the following problems: - priv can never be NULL, so this part of the check is useless - if the loop ran through the whole list, priv->client is invalid and it is more appropriate and sufficient to check for the end of list_for_each_entry loop condition. Link: http://lkml.kernel.org/r/20210727000709.225032-1-harshvardhan.jha@oracle.com Signed-off-by: Harshvardhan Jha <harshvardhan.jha@oracle.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Tested-by: Stefano Stabellini <sstabellini@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-16Revert "Bluetooth: Move shutdown callback before flushing tx and rx queue"Sasha Levin1-8/+0
This reverts commit 471128476819777a7b71f2b90bc868ccf74b185f. Botched backport, dropping to rework for next release. Reported-by: Pavel Machek <pavel@denx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ipv4: fix endianness issue in inet_rtm_getroute_build_skb()Eric Dumazet1-1/+1
[ Upstream commit 92548b0ee220e000d81c27ac9a80e0ede895a881 ] The UDP length field should be in network order. This removes the following sparse error: net/ipv4/route.c:3173:27: warning: incorrect type in assignment (different base types) net/ipv4/route.c:3173:27: expected restricted __be16 [usertype] len net/ipv4/route.c:3173:27: got unsigned long Fixes: 404eb77ea766 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15net: sched: Fix qdisc_rate_table refcount leak when get tcf_block failedXiyu Yang1-1/+1
[ Upstream commit c66070125837900163b81a03063ddd657a7e9bfb ] The reference counting issue happens in one exception handling path of cbq_change_class(). When failing to get tcf_block, the function forgets to decrease the refcount of "rtab" increased by qdisc_put_rtab(), causing a refcount leak. Fix this issue by jumping to "failure" label when get tcf_block failed. Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/1630252681-71588-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ipv4: make exception cache less predictibleEric Dumazet1-16/+30
[ Upstream commit 67d6d681e15b578c1725bad8ad079e05d1c48a8e ] Even after commit 6457378fe796 ("ipv4: use siphash instead of Jenkins in fnhe_hashfun()"), an attacker can still use brute force to learn some secrets from a victim linux host. One way to defeat these attacks is to make the max depth of the hash table bucket a random value. Before this patch, each bucket of the hash table used to store exceptions could contain 6 items under attack. After the patch, each bucket would contains a random number of items, between 6 and 10. The attacker can no longer infer secrets. This is slightly increasing memory size used by the hash table, by 50% in average, we do not expect this to be a problem. This patch is more complex than the prior one (IPv6 equivalent), because IPv4 was reusing the oldest entry. Since we need to be able to evict more than one entry per update_or_create_fnhe() call, I had to replace fnhe_oldest() with fnhe_remove_oldest(). Also note that we will queue extra kfree_rcu() calls under stress, which hopefully wont be a too big issue. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Reviewed-by: David Ahern <dsahern@kernel.org> Tested-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ipv6: make exception cache less predictibleEric Dumazet1-1/+4
[ Upstream commit a00df2caffed3883c341d5685f830434312e4a43 ] Even after commit 4785305c05b2 ("ipv6: use siphash in rt6_exception_hash()"), an attacker can still use brute force to learn some secrets from a victim linux host. One way to defeat these attacks is to make the max depth of the hash table bucket a random value. Before this patch, each bucket of the hash table used to store exceptions could contain 6 items under attack. After the patch, each bucket would contains a random number of items, between 6 and 10. The attacker can no longer infer secrets. This is slightly increasing memory size used by the hash table, we do not expect this to be a problem. Following patch is dealing with the same issue in IPv4. Fixes: 35732d01fe31 ("ipv6: introduce a hash table to store dst cache") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Cc: Wei Wang <weiwan@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15Bluetooth: add timeout sanity check to hci_inquiryPavel Skripkin1-0/+6
[ Upstream commit f41a4b2b5eb7872109723dab8ae1603bdd9d9ec1 ] Syzbot hit "task hung" bug in hci_req_sync(). The problem was in unreasonable huge inquiry timeout passed from userspace. Fix it by adding sanity check for timeout value to hci_inquiry(). Since hci_inquiry() is the only user of hci_req_sync() with user controlled timeout value, it makes sense to check timeout value in hci_inquiry() and don't touch hci_req_sync(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+be2baed593ea56c6a84c@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15mac80211: Fix insufficient headroom issue for AMSDUChih-Kang Chang1-1/+3
[ Upstream commit f50d2ff8f016b79a2ff4acd5943a1eda40c545d4 ] ieee80211_amsdu_realloc_pad() fails to account for extra_tx_headroom, the original reserved headroom might be eaten. Add the necessary extra_tx_headroom. Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support") Signed-off-by: Chih-Kang Chang <gary.chang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://lore.kernel.org/r/20210816085128.10931-2-pkshih@realtek.com [fix indentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15Bluetooth: Move shutdown callback before flushing tx and rx queueKai-Heng Feng1-0/+8
[ Upstream commit 0ea53674d07fb6db2dd7a7ec2fdc85a12eb246c2 ] Commit 0ea9fd001a14 ("Bluetooth: Shutdown controller after workqueues are flushed or cancelled") introduced a regression that makes mtkbtsdio driver stops working: [ 36.593956] Bluetooth: hci0: Firmware already downloaded [ 46.814613] Bluetooth: hci0: Execution of wmt command timed out [ 46.814619] Bluetooth: hci0: Failed to send wmt func ctrl (-110) The shutdown callback depends on the result of hdev->rx_work, so we should call it before flushing rx_work: -> btmtksdio_shutdown() -> mtk_hci_wmt_sync() -> __hci_cmd_send() -> wait for BTMTKSDIO_TX_WAIT_VND_EVT gets cleared -> btmtksdio_recv_event() -> hci_recv_frame() -> queue_work(hdev->workqueue, &hdev->rx_work) -> clears BTMTKSDIO_TX_WAIT_VND_EVT So move the shutdown callback before flushing TX/RX queue to resolve the issue. Reported-and-tested-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Guenter Roeck <linux@roeck-us.net> Fixes: 0ea9fd001a14 ("Bluetooth: Shutdown controller after workqueues are flushed or cancelled") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15devlink: Clear whole devlink_flash_notify structLeon Romanovsky1-2/+2
[ Upstream commit ed43fbac717882165a2a4bd64f7b1f56f7467bb7 ] The { 0 } doesn't clear all fields in the struct, but tells to the compiler to set all fields to zero and doesn't touch any sub-fields if they exists. The {} is an empty initialiser that instructs to fully initialize whole struct including sub-fields, which is error-prone for future devlink_flash_notify extensions. Fixes: 6700acc5f1fe ("devlink: collect flash notify params into a struct") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15Bluetooth: fix repeated calls to sco_sock_killDesmond Cheong Zhi Xi1-5/+1
[ Upstream commit e1dee2c1de2b4dd00eb44004a4bda6326ed07b59 ] In commit 4e1a720d0312 ("Bluetooth: avoid killing an already killed socket"), a check was added to sco_sock_kill to skip killing a socket if the SOCK_DEAD flag was set. This was done after a trace for a use-after-free bug showed that the same sock pointer was being killed twice. Unfortunately, this check prevents sco_sock_kill from running on any socket. sco_sock_kill kills a socket only if it's zapped and orphaned, however sock_orphan announces that the socket is dead before detaching it. i.e., orphaned sockets have the SOCK_DEAD flag set. To fix this, we remove the check for SOCK_DEAD, and avoid repeated calls to sco_sock_kill by removing incorrect calls in: 1. sco_sock_timeout. The socket should not be killed on timeout as further processing is expected to be done. For example, sco_sock_connect sets the timer then waits for the socket to be connected or for an error to be returned. 2. sco_conn_del. This function should clean up resources for the connection, but the socket itself should be cleaned up in sco_sock_release. 3. sco_sock_close. Calls to sco_sock_close in sco_sock_cleanup_listen and sco_sock_release are followed by sco_sock_kill. Hence the duplicated call should be removed. Fixes: 4e1a720d0312 ("Bluetooth: avoid killing an already killed socket") Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15Bluetooth: increase BTNAMSIZ to 21 chars to fix potential buffer overflowColin Ian King1-1/+1
[ Upstream commit 713baf3dae8f45dc8ada4ed2f5fdcbf94a5c274d ] An earlier commit replaced using batostr to using %pMR sprintf for the construction of session->name. Static analysis detected that this new method can use a total of 21 characters (including the trailing '\0') so we need to increase the BTNAMSIZ from 18 to 21 to fix potential buffer overflows. Addresses-Coverity: ("Out-of-bounds write") Fixes: fcb73338ed53 ("Bluetooth: Use %pMR in sprintf/seq_printf instead of batostr") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15devlink: Break parameter notification sequence to be before/after ↵Leon Romanovsky1-12/+20
unload/load driver [ Upstream commit 05a7f4a8dff19999ca8a83a35ff4782689de7bfc ] The change of namespaces during devlink reload calls to driver unload before it accesses devlink parameters. The commands below causes to use-after-free bug when trying to get flow steering mode. * ip netns add n1 * devlink dev reload pci/0000:00:09.0 netns n1 ================================================================== BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] Read of size 4 at addr ffff888009d04308 by task devlink/275 CPU: 6 PID: 275 Comm: devlink Not tainted 5.12.0-rc2+ #2853 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x93/0xc2 print_address_description.constprop.0+0x18/0x140 ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] kasan_report.cold+0x7c/0xd8 ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] devlink_nl_param_fill+0x1c8/0xe80 ? __free_pages_ok+0x37a/0x8a0 ? devlink_flash_update_timeout_notify+0xd0/0xd0 ? lock_acquire+0x1a9/0x6d0 ? fs_reclaim_acquire+0xb7/0x160 ? lock_is_held_type+0x98/0x110 ? 0xffffffff81000000 ? lock_release+0x1f9/0x6c0 ? fs_reclaim_release+0xa1/0xf0 ? lock_downgrade+0x6d0/0x6d0 ? lock_is_held_type+0x98/0x110 ? lock_is_held_type+0x98/0x110 ? memset+0x20/0x40 ? __build_skb_around+0x1f8/0x2b0 devlink_param_notify+0x6d/0x180 devlink_reload+0x1c3/0x520 ? devlink_remote_reload_actions_performed+0x30/0x30 ? mutex_trylock+0x24b/0x2d0 ? devlink_nl_cmd_reload+0x62b/0x1070 devlink_nl_cmd_reload+0x66d/0x1070 ? devlink_reload+0x520/0x520 ? devlink_get_from_attrs+0x1bc/0x260 ? devlink_nl_pre_doit+0x64/0x4d0 genl_family_rcv_msg_doit+0x1e9/0x2f0 ? mutex_lock_io_nested+0x1130/0x1130 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240 ? security_capable+0x51/0x90 genl_rcv_msg+0x27f/0x4a0 ? genl_get_cmd+0x3c0/0x3c0 ? lock_acquire+0x1a9/0x6d0 ? devlink_reload+0x520/0x520 ? lock_release+0x6c0/0x6c0 netlink_rcv_skb+0x11d/0x340 ? genl_get_cmd+0x3c0/0x3c0 ? netlink_ack+0x9f0/0x9f0 ? lock_release+0x1f9/0x6c0 genl_rcv+0x24/0x40 netlink_unicast+0x433/0x700 ? netlink_attachskb+0x730/0x730 ? _copy_from_iter_full+0x178/0x650 ? __alloc_skb+0x113/0x2b0 netlink_sendmsg+0x6f1/0xbd0 ? netlink_unicast+0x700/0x700 ? lock_is_held_type+0x98/0x110 ? netlink_unicast+0x700/0x700 sock_sendmsg+0xb0/0xe0 __sys_sendto+0x193/0x240 ? __x64_sys_getpeername+0xb0/0xb0 ? do_sys_openat2+0x10b/0x370 ? __up_read+0x1a1/0x7b0 ? do_user_addr_fault+0x219/0xdc0 ? __x64_sys_openat+0x120/0x1d0 ? __x64_sys_open+0x1a0/0x1a0 __x64_sys_sendto+0xdd/0x1b0 ? syscall_enter_from_user_mode+0x1d/0x50 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc69d0af14a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c RSP: 002b:00007ffc1d8292f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fc69d0af14a RDX: 0000000000000038 RSI: 0000555f57c56440 RDI: 0000000000000003 RBP: 0000555f57c56410 R08: 00007fc69d17b200 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 146: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x99/0xc0 mlx5_init_fs+0xf0/0x1c50 [mlx5_core] mlx5_load+0xd2/0x180 [mlx5_core] mlx5_init_one+0x2f6/0x450 [mlx5_core] probe_one+0x47d/0x6e0 [mlx5_core] pci_device_probe+0x2a0/0x4a0 really_probe+0x20a/0xc90 driver_probe_device+0xd8/0x380 device_driver_attach+0x1df/0x250 __driver_attach+0xff/0x240 bus_for_each_dev+0x11e/0x1a0 bus_add_driver+0x309/0x570 driver_register+0x1ee/0x380 0xffffffffa06b8062 do_one_initcall+0xd5/0x410 do_init_module+0x1c8/0x760 load_module+0x6d8b/0x9650 __do_sys_finit_module+0x118/0x1b0 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 275: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x102/0x140 slab_free_freelist_hook+0x74/0x1b0 kfree+0xd7/0x2a0 mlx5_unload+0x16/0xb0 [mlx5_core] mlx5_unload_one+0xae/0x120 [mlx5_core] mlx5_devlink_reload_down+0x1bc/0x380 [mlx5_core] devlink_reload+0x141/0x520 devlink_nl_cmd_reload+0x66d/0x1070 genl_family_rcv_msg_doit+0x1e9/0x2f0 genl_rcv_msg+0x27f/0x4a0 netlink_rcv_skb+0x11d/0x340 genl_rcv+0x24/0x40 netlink_unicast+0x433/0x700 netlink_sendmsg+0x6f1/0xbd0 sock_sendmsg+0xb0/0xe0 __sys_sendto+0x193/0x240 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff888009d04300 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 8 bytes inside of 128-byte region [ffff888009d04300, ffff888009d04380) The buggy address belongs to the page: page:0000000086a64ecc refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888009d04000 pfn:0x9d04 head:0000000086a64ecc order:1 compound_mapcount:0 flags: 0x4000000000010200(slab|head) raw: 4000000000010200 ffffea0000203980 0000000200000002 ffff8880050428c0 raw: ffff888009d04000 000000008020001d 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888009d04200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888009d04280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888009d04300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888009d04380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888009d04400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The right solution to devlink reload is to notify about deletion of parameters, unload driver, change net namespaces, load driver and notify about addition of parameters. Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload") Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15Bluetooth: mgmt: Fix wrong opcode in the response for add_adv cmdTedd Ho-Jeong An1-1/+1
[ Upstream commit a25fca4d3c18766b6f7a3c95fa8faec23ef464c5 ] This patch fixes the MGMT add_advertising command repsones with the wrong opcode when it is trying to return the not supported error. Fixes: cbbdfa6f33198 ("Bluetooth: Enable controller RPA resolution using Experimental feature") Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15net: cipso: fix warnings in netlbl_cipsov4_add_stdPavel Skripkin1-4/+4
[ Upstream commit 8ca34a13f7f9b3fa2c464160ffe8cc1a72088204 ] Syzbot reported warning in netlbl_cipsov4_add(). The problem was in too big doi_def->map.std->lvl.local_size passed to kcalloc(). Since this value comes from userpace there is no need to warn if value is not correct. The same problem may occur with other kcalloc() calls in this function, so, I've added __GFP_NOWARN flag to all kcalloc() calls there. Reported-and-tested-by: syzbot+cdd51ee2e6b0b2e18c0d@syzkaller.appspotmail.com Fixes: 96cb8e3313c7 ("[NetLabel]: CIPSOv4 and Unlabeled packet integration") Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15tcp: seq_file: Avoid skipping sk during tcp_seek_last_posMartin KaFai Lau1-2/+3
[ Upstream commit 525e2f9fd0229eb10cb460a9e6d978257f24804e ] st->bucket stores the current bucket number. st->offset stores the offset within this bucket that is the sk to be seq_show(). Thus, st->offset only makes sense within the same st->bucket. These two variables are an optimization for the common no-lseek case. When resuming the seq_file iteration (i.e. seq_start()), tcp_seek_last_pos() tries to continue from the st->offset at bucket st->bucket. However, it is possible that the bucket pointed by st->bucket has changed and st->offset may end up skipping the whole st->bucket without finding a sk. In this case, tcp_seek_last_pos() currently continues to satisfy the offset condition in the next (and incorrect) bucket. Instead, regardless of the offset value, the first sk of the next bucket should be returned. Thus, "bucket == st->bucket" check is added to tcp_seek_last_pos(). The chance of hitting this is small and the issue is a decade old, so targeting for the next tree. Fixes: a8b690f98baf ("tcp: Fix slowness in read /proc/net/tcp") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210701200541.1033917-1-kafai@fb.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-156lowpan: iphc: Fix an off-by-one check of array indexColin Ian King1-1/+2
[ Upstream commit 9af417610b6142e826fd1ee8ba7ff3e9a2133a5a ] The bounds check of id is off-by-one and the comparison should be >= rather >. Currently the WARN_ON_ONCE check does not stop the out of range indexing of &ldev->ctx.table[id] so also add a return path if the bounds are out of range. Addresses-Coverity: ("Illegal address computation"). Fixes: 5609c185f24d ("6lowpan: iphc: add support for stateful compression") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15Bluetooth: sco: prevent information leak in sco_conn_defer_accept()Dan Carpenter1-0/+5
[ Upstream commit 59da0b38bc2ea570ede23a3332ecb3e7574ce6b2 ] Smatch complains that some of these struct members are not initialized leading to a stack information disclosure: net/bluetooth/sco.c:778 sco_conn_defer_accept() warn: check that 'cp.retrans_effort' doesn't leak information This seems like a valid warning. I've added a default case to fix this issue. Fixes: 2f69a82acf6f ("Bluetooth: Use voice setting in deferred SCO connection request") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-12netfilter: nftables: clone set element expression templatePablo Neira Ayuso1-8/+28
commit 4d8f9065830e526c83199186c5f56a6514f457d2 upstream. memcpy() breaks when using connlimit in set elements. Use nft_expr_clone() to initialize the connlimit expression list, otherwise connlimit garbage collector crashes when walking on the list head copy. [ 493.064656] Workqueue: events_power_efficient nft_rhash_gc [nf_tables] [ 493.064685] RIP: 0010:find_or_evict+0x5a/0x90 [nf_conncount] [ 493.064694] Code: 2b 43 40 83 f8 01 77 0d 48 c7 c0 f5 ff ff ff 44 39 63 3c 75 df 83 6d 18 01 48 8b 43 08 48 89 de 48 8b 13 48 8b 3d ee 2f 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 03 48 83 [ 493.064699] RSP: 0018:ffffc90000417dc0 EFLAGS: 00010297 [ 493.064704] RAX: 0000000000000000 RBX: ffff888134f38410 RCX: 0000000000000000 [ 493.064708] RDX: 0000000000000000 RSI: ffff888134f38410 RDI: ffff888100060cc0 [ 493.064711] RBP: ffff88812ce594a8 R08: ffff888134f38438 R09: 00000000ebb9025c [ 493.064714] R10: ffffffff8219f838 R11: 0000000000000017 R12: 0000000000000001 [ 493.064718] R13: ffffffff82146740 R14: ffff888134f38410 R15: 0000000000000000 [ 493.064721] FS: 0000000000000000(0000) GS:ffff88840e440000(0000) knlGS:0000000000000000 [ 493.064725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 493.064729] CR2: 0000000000000008 CR3: 00000001330aa002 CR4: 00000000001706e0 [ 493.064733] Call Trace: [ 493.064737] nf_conncount_gc_list+0x8f/0x150 [nf_conncount] [ 493.064746] nft_rhash_gc+0x106/0x390 [nf_tables] Reported-by: Laura Garcia Liebana <nevola@gmail.com> Fixes: 409444522976 ("netfilter: nf_tables: add elements with stateful expressions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-12netfilter: nf_tables: initialize set before expression setupPablo Neira Ayuso1-22/+24
commit ad9f151e560b016b6ad3280b48e42fa11e1a5440 upstream. nft_set_elem_expr_alloc() needs an initialized set if expression sets on the NFT_EXPR_GC flag. Move set fields initialization before expression setup. [4512935.019450] ================================================================== [4512935.019456] BUG: KASAN: null-ptr-deref in nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019487] Read of size 8 at addr 0000000000000070 by task nft/23532 [4512935.019494] CPU: 1 PID: 23532 Comm: nft Not tainted 5.12.0-rc4+ #48 [...] [4512935.019502] Call Trace: [4512935.019505] dump_stack+0x89/0xb4 [4512935.019512] ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019536] ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019560] kasan_report.cold.12+0x5f/0xd8 [4512935.019566] ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019590] nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019615] nf_tables_newset+0xc7f/0x1460 [nf_tables] Reported-by: syzbot+ce96ca2b1d0b37c6422d@syzkaller.appspotmail.com Fixes: 65038428b2c6 ("netfilter: nf_tables: allow to specify stateful expression in set definition") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-12netfilter: nftables: avoid potential overflows on 32bit archesEric Dumazet2-7/+10
commit 6c8774a94e6ad26f29ef103c8671f55c255c6201 upstream. User space could ask for very large hash tables, we need to make sure our size computations wont overflow. nf_tables_newset() needs to double check the u64 size will fit into size_t field. Fixes: 0ed6389c483d ("netfilter: nf_tables: rename set implementations") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-12igmp: Add ip_mc_list lock in ip_check_mc_rcuLiu Jian1-0/+2
commit 23d2b94043ca8835bd1e67749020e839f396a1c2 upstream. I got below panic when doing fuzz test: Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 4056 Comm: syz-executor.3 Tainted: G B 5.14.0-rc1-00195-gcff5c4254439-dirty #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl+0x7a/0x9b panic+0x2cd/0x5af end_report.cold+0x5a/0x5a kasan_report+0xec/0x110 ip_check_mc_rcu+0x556/0x5d0 __mkroute_output+0x895/0x1740 ip_route_output_key_hash_rcu+0x2d0/0x1050 ip_route_output_key_hash+0x182/0x2e0 ip_route_output_flow+0x28/0x130 udp_sendmsg+0x165d/0x2280 udpv6_sendmsg+0x121e/0x24f0 inet6_sendmsg+0xf7/0x140 sock_sendmsg+0xe9/0x180 ____sys_sendmsg+0x2b8/0x7a0 ___sys_sendmsg+0xf0/0x160 __sys_sendmmsg+0x17e/0x3c0 __x64_sys_sendmmsg+0x9e/0x100 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x462eb9 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f3df5af1c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462eb9 RDX: 0000000000000312 RSI: 0000000020001700 RDI: 0000000000000007 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3df5af26bc R13: 00000000004c372d R14: 0000000000700b10 R15: 00000000ffffffff It is one use-after-free in ip_check_mc_rcu. In ip_mc_del_src, the ip_sf_list of pmc has been freed under pmc->lock protection. But access to ip_sf_list in ip_check_mc_rcu is not protected by the lock. Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-03net: don't unconditionally copy_from_user a struct ifreq for socket ioctlsPeter Collingbourne1-1/+5
commit d0efb16294d145d157432feda83877ae9d7cdf37 upstream. A common implementation of isatty(3) involves calling a ioctl passing a dummy