summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2022-04-20ax25: Fix UAF bugs in ax25 timersDuoming Zhou1-0/+5
commit 82e31755e55fbcea6a9dfaae5fe4860ade17cbc0 upstream. There are race conditions that may lead to UAF bugs in ax25_heartbeat_expiry(), ax25_t1timer_expiry(), ax25_t2timer_expiry(), ax25_t3timer_expiry() and ax25_idletimer_expiry(), when we call ax25_release() to deallocate ax25_dev. One of the UAF bugs caused by ax25_release() is shown below: (Thread 1) | (Thread 2) ax25_dev_device_up() //(1) | ... | ax25_kill_by_device() ax25_bind() //(2) | ax25_connect() | ... ax25_std_establish_data_link() | ax25_start_t1timer() | ax25_dev_device_down() //(3) mod_timer(&ax25->t1timer,..) | | ax25_release() (wait a time) | ... | ax25_dev_put(ax25_dev) //(4)FREE ax25_t1timer_expiry() | ax25->ax25_dev->values[..] //USE| ... ... | We increase the refcount of ax25_dev in position (1) and (2), and decrease the refcount of ax25_dev in position (3) and (4). The ax25_dev will be freed in position (4) and be used in ax25_t1timer_expiry(). The fail log is shown below: ============================================================== [ 106.116942] BUG: KASAN: use-after-free in ax25_t1timer_expiry+0x1c/0x60 [ 106.116942] Read of size 8 at addr ffff88800bda9028 by task swapper/0/0 [ 106.116942] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-06123-g0905eec574 [ 106.116942] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-14 [ 106.116942] Call Trace: ... [ 106.116942] ax25_t1timer_expiry+0x1c/0x60 [ 106.116942] call_timer_fn+0x122/0x3d0 [ 106.116942] __run_timers.part.0+0x3f6/0x520 [ 106.116942] run_timer_softirq+0x4f/0xb0 [ 106.116942] __do_softirq+0x1c2/0x651 ... This patch adds del_timer_sync() in ax25_release(), which could ensure that all timers stop before we deallocate ax25_dev. Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Paolo Abeni <pabeni@redhat.com> [OP: backport to 5.10: adjust context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: Fix NULL pointer dereferences in ax25 timersDuoming Zhou2-8/+16
commit fc6d01ff9ef03b66d4a3a23b46fc3c3d8cf92009 upstream. The previous commit 7ec02f5ac8a5 ("ax25: fix NPD bug in ax25_disconnect") move ax25_disconnect into lock_sock() in order to prevent NPD bugs. But there are race conditions that may lead to null pointer dereferences in ax25_heartbeat_expiry(), ax25_t1timer_expiry(), ax25_t2timer_expiry(), ax25_t3timer_expiry() and ax25_idletimer_expiry(), when we use ax25_kill_by_device() to detach the ax25 device. One of the race conditions that cause null pointer dereferences can be shown as below: (Thread 1) | (Thread 2) ax25_connect() | ax25_std_establish_data_link() | ax25_start_t1timer() | mod_timer(&ax25->t1timer,..) | | ax25_kill_by_device() (wait a time) | ... | s->ax25_dev = NULL; //(1) ax25_t1timer_expiry() | ax25->ax25_dev->values[..] //(2)| ... ... | We set null to ax25_cb->ax25_dev in position (1) and dereference the null pointer in position (2). The corresponding fail log is shown below: =============================================================== BUG: kernel NULL pointer dereference, address: 0000000000000050 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc6-00794-g45690b7d0 RIP: 0010:ax25_t1timer_expiry+0x12/0x40 ... Call Trace: call_timer_fn+0x21/0x120 __run_timers.part.0+0x1ca/0x250 run_timer_softirq+0x2c/0x60 __do_softirq+0xef/0x2f3 irq_exit_rcu+0xb6/0x100 sysvec_apic_timer_interrupt+0xa2/0xd0 ... This patch moves ax25_disconnect() before s->ax25_dev = NULL and uses del_timer_sync() to delete timers in ax25_disconnect(). If ax25_disconnect() is called by ax25_kill_by_device() or ax25->ax25_dev is NULL, the reason in ax25_disconnect() will be equal to ENETUNREACH, it will wait all timers to stop before we set null to s->ax25_dev in ax25_kill_by_device(). Fixes: 7ec02f5ac8a5 ("ax25: fix NPD bug in ax25_disconnect") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> [OP: backport to 5.10: adjust context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: fix NPD bug in ax25_disconnectDuoming Zhou1-1/+1
commit 7ec02f5ac8a5be5a3f20611731243dc5e1d9ba10 upstream. The ax25_disconnect() in ax25_kill_by_device() is not protected by any locks, thus there is a race condition between ax25_disconnect() and ax25_destroy_socket(). when ax25->sk is assigned as NULL by ax25_destroy_socket(), a NULL pointer dereference bug will occur if site (1) or (2) dereferences ax25->sk. ax25_kill_by_device() | ax25_release() ax25_disconnect() | ax25_destroy_socket() ... | if(ax25->sk != NULL) | ... ... | ax25->sk = NULL; bh_lock_sock(ax25->sk); //(1) | ... ... | bh_unlock_sock(ax25->sk); //(2)| This patch moves ax25_disconnect() into lock_sock(), which can synchronize with ax25_destroy_socket() in ax25_release(). Fail log: =============================================================== BUG: kernel NULL pointer dereference, address: 0000000000000088 ... RIP: 0010:_raw_spin_lock+0x7e/0xd0 ... Call Trace: ax25_disconnect+0xf6/0x220 ax25_device_event+0x187/0x250 raw_notifier_call_chain+0x5e/0x70 dev_close_many+0x17d/0x230 rollback_registered_many+0x1f1/0x950 unregister_netdevice_queue+0x133/0x200 unregister_netdev+0x13/0x20 ... Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> [OP: backport to 5.10: adjust context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: fix UAF bug in ax25_send_control()Duoming Zhou1-4/+4
commit 5352a761308397a0e6250fdc629bb3f615b94747 upstream. There are UAF bugs in ax25_send_control(), when we call ax25_release() to deallocate ax25_dev. The possible race condition is shown below: (Thread 1) | (Thread 2) ax25_dev_device_up() //(1) | | ax25_kill_by_device() ax25_bind() //(2) | ax25_connect() | ... ax25->state = AX25_STATE_1 | ... | ax25_dev_device_down() //(3) (Thread 3) ax25_release() | ax25_dev_put() //(4) FREE | case AX25_STATE_1: | ax25_send_control() | alloc_skb() //USE | The refcount of ax25_dev increases in position (1) and (2), and decreases in position (3) and (4). The ax25_dev will be freed before dereference sites in ax25_send_control(). The following is part of the report: [ 102.297448] BUG: KASAN: use-after-free in ax25_send_control+0x33/0x210 [ 102.297448] Read of size 8 at addr ffff888009e6e408 by task ax25_close/602 [ 102.297448] Call Trace: [ 102.303751] ax25_send_control+0x33/0x210 [ 102.303751] ax25_release+0x356/0x450 [ 102.305431] __sock_release+0x6d/0x120 [ 102.305431] sock_close+0xf/0x20 [ 102.305431] __fput+0x11f/0x420 [ 102.305431] task_work_run+0x86/0xd0 [ 102.307130] get_signal+0x1075/0x1220 [ 102.308253] arch_do_signal_or_restart+0x1df/0xc00 [ 102.308253] exit_to_user_mode_prepare+0x150/0x1e0 [ 102.308253] syscall_exit_to_user_mode+0x19/0x50 [ 102.308253] do_syscall_64+0x48/0x90 [ 102.308253] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 102.308253] RIP: 0033:0x405ae7 This patch defers the free operation of ax25_dev and net_device after all corresponding dereference sites in ax25_release() to avoid UAF. Fixes: 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Paolo Abeni <pabeni@redhat.com> [OP: backport to 5.10: adjust dev_put_track()->dev_put()] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: Fix refcount leaks caused by ax25_cb_del()Duoming Zhou1-3/+11
commit 9fd75b66b8f68498454d685dc4ba13192ae069b0 upstream. The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") and commit feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") increase the refcounts of ax25_dev and net_device in ax25_bind() and decrease the matching refcounts in ax25_kill_by_device() in order to prevent UAF bugs, but there are reference count leaks. The root cause of refcount leaks is shown below: (Thread 1) | (Thread 2) ax25_bind() | ... | ax25_addr_ax25dev() | ax25_dev_hold() //(1) | ... | dev_hold_track() //(2) | ... | ax25_destroy_socket() | ax25_cb_del() | ... | hlist_del_init() //(3) | | (Thread 3) | ax25_kill_by_device() | ... | ax25_for_each(s, &ax25_list) { | if (s->ax25_dev == ax25_dev) //(4) | ... | Firstly, we use ax25_bind() to increase the refcount of ax25_dev in position (1) and increase the refcount of net_device in position (2). Then, we use ax25_cb_del() invoked by ax25_destroy_socket() to delete ax25_cb in hlist in position (3) before calling ax25_kill_by_device(). Finally, the decrements of refcounts in ax25_kill_by_device() will not be executed, because no s->ax25_dev equals to ax25_dev in position (4). This patch adds decrements of refcounts in ax25_release() and use lock_sock() to do synchronization. If refcounts decrease in ax25_release(), the decrements of refcounts in ax25_kill_by_device() will not be executed and vice versa. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Fixes: 87563a043cef ("ax25: fix reference count leaks of ax25_dev") Fixes: feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") Reported-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> [OP: backport to 5.10: adjust dev_put_track()->dev_put()] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: fix UAF bugs of net_device caused by rebinding operationDuoming Zhou1-1/+4
commit feef318c855a361a1eccd880f33e88c460eb63b4 upstream. The ax25_kill_by_device() will set s->ax25_dev = NULL and call ax25_disconnect() to change states of ax25_cb and sock, if we call ax25_bind() before ax25_kill_by_device(). However, if we call ax25_bind() again between the window of ax25_kill_by_device() and ax25_dev_device_down(), the values and states changed by ax25_kill_by_device() will be reassigned. Finally, ax25_dev_device_down() will deallocate net_device. If we dereference net_device in syscall functions such as ax25_release(), ax25_sendmsg(), ax25_getsockopt(), ax25_getname() and ax25_info_show(), a UAF bug will occur. One of the possible race conditions is shown below: (USE) | (FREE) ax25_bind() | | ax25_kill_by_device() ax25_bind() | ax25_connect() | ... | ax25_dev_device_down() | ... | dev_put_track(dev, ...) //FREE ax25_release() | ... ax25_send_control() | alloc_skb() //USE | the corresponding fail log is shown below: =============================================================== BUG: KASAN: use-after-free in ax25_send_control+0x43/0x210 ... Call Trace: ... ax25_send_control+0x43/0x210 ax25_release+0x2db/0x3b0 __sock_release+0x6d/0x120 sock_close+0xf/0x20 __fput+0x11f/0x420 ... Allocated by task 1283: ... __kasan_kmalloc+0x81/0xa0 alloc_netdev_mqs+0x5a/0x680 mkiss_open+0x6c/0x380 tty_ldisc_open+0x55/0x90 ... Freed by task 1969: ... kfree+0xa3/0x2c0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 tty_ldisc_kill+0x3e/0x80 ... In order to fix these UAF bugs caused by rebinding operation, this patch adds dev_hold_track() into ax25_bind() and corresponding dev_put_track() into ax25_kill_by_device(). Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> [OP: backport to 5.10: adjust dev_put_track()->dev_put() and dev_hold_track()->dev_hold()] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: fix reference count leaks of ax25_devDuoming Zhou3-16/+36
commit 87563a043cef044fed5db7967a75741cc16ad2b1 upstream. The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") introduces refcount into ax25_dev, but there are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(), ax25_rt_add(), ax25_rt_del() and ax25_rt_opt(). This patch uses ax25_dev_put() and adjusts the position of ax25_addr_ax25dev() to fix reference cout leaks of ax25_dev. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> [OP: backport to 5.10: adjust context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: add refcount in ax25_dev to avoid UAF bugsDuoming Zhou3-2/+15
commit d01ffb9eee4af165d83b08dd73ebdf9fe94a519b upstream. If we dereference ax25_dev after we call kfree(ax25_dev) in ax25_dev_device_down(), it will lead to concurrency UAF bugs. There are eight syscall functions suffer from UAF bugs, include ax25_bind(), ax25_release(), ax25_connect(), ax25_ioctl(), ax25_getname(), ax25_sendmsg(), ax25_getsockopt() and ax25_info_show(). One of the concurrency UAF can be shown as below: (USE) | (FREE) | ax25_device_event | ax25_dev_device_down ax25_bind | ... ... | kfree(ax25_dev) ax25_fillin_cb() | ... ax25_fillin_cb_from_dev() | ... | The root cause of UAF bugs is that kfree(ax25_dev) in ax25_dev_device_down() is not protected by any locks. When ax25_dev, which there are still pointers point to, is released, the concurrency UAF bug will happen. This patch introduces refcount into ax25_dev in order to guarantee that there are no pointers point to it when ax25_dev is released. Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> [OP: backport to 5.10: adjusted context] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ipv6: fix panic when forwarding a pkt with no in6 devNicolas Dichtel1-1/+1
commit e3fa461d8b0e185b7da8a101fe94dfe6dd500ac0 upstream. kongweibin reported a kernel panic in ip6_forward() when input interface has no in6 dev associated. The following tc commands were used to reproduce this panic: tc qdisc del dev vxlan100 root tc qdisc add dev vxlan100 root netem corrupt 5% CC: stable@vger.kernel.org Fixes: ccd27f05ae7b ("ipv6: fix 'disable_policy' for fwd packets") Reported-by: kongweibin <kongweibin2@huawei.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20nl80211: correctly check NL80211_ATTR_REG_ALPHA2 sizeJohannes Berg1-1/+2
commit 6624bb34b4eb19f715db9908cca00122748765d7 upstream. We need this to be at least two bytes, so we can access alpha2[0] and alpha2[1]. It may be three in case some userspace used NUL-termination since it was NLA_STRING (and we also push it out with NUL-termination). Cc: stable@vger.kernel.org Reported-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220411114201.fd4a31f06541.Ie7ff4be2cf348d8cc28ed0d626fc54becf7ea799@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20nfc: nci: add flush_workqueue to prevent uafLin Ma1-0/+4
[ Upstream commit ef27324e2cb7bb24542d6cb2571740eefe6b00dc ] Our detector found a concurrent use-after-free bug when detaching an NCI device. The main reason for this bug is the unexpected scheduling between the used delayed mechanism (timer and workqueue). The race can be demonstrated below: Thread-1 Thread-2 | nci_dev_up() | nci_open_device() | __nci_request(nci_reset_req) | nci_send_cmd | queue_work(cmd_work) nci_unregister_device() | nci_close_device() | ... del_timer_sync(cmd_timer)[1] | ... | Worker nci_free_device() | nci_cmd_work() kfree(ndev)[3] | mod_timer(cmd_timer)[2] In short, the cleanup routine thought that the cmd_timer has already been detached by [1] but the mod_timer can re-attach the timer [2], even it is already released [3], resulting in UAF. This UAF is easy to trigger, crash trace by POC is like below [ 66.703713] ================================================================== [ 66.703974] BUG: KASAN: use-after-free in enqueue_timer+0x448/0x490 [ 66.703974] Write of size 8 at addr ffff888009fb7058 by task kworker/u4:1/33 [ 66.703974] [ 66.703974] CPU: 1 PID: 33 Comm: kworker/u4:1 Not tainted 5.18.0-rc2 #5 [ 66.703974] Workqueue: nfc2_nci_cmd_wq nci_cmd_work [ 66.703974] Call Trace: [ 66.703974] <TASK> [ 66.703974] dump_stack_lvl+0x57/0x7d [ 66.703974] print_report.cold+0x5e/0x5db [ 66.703974] ? enqueue_timer+0x448/0x490 [ 66.703974] kasan_report+0xbe/0x1c0 [ 66.703974] ? enqueue_timer+0x448/0x490 [ 66.703974] enqueue_timer+0x448/0x490 [ 66.703974] __mod_timer+0x5e6/0xb80 [ 66.703974] ? mark_held_locks+0x9e/0xe0 [ 66.703974] ? try_to_del_timer_sync+0xf0/0xf0 [ 66.703974] ? lockdep_hardirqs_on_prepare+0x17b/0x410 [ 66.703974] ? queue_work_on+0x61/0x80 [ 66.703974] ? lockdep_hardirqs_on+0xbf/0x130 [ 66.703974] process_one_work+0x8bb/0x1510 [ 66.703974] ? lockdep_hardirqs_on_prepare+0x410/0x410 [ 66.703974] ? pwq_dec_nr_in_flight+0x230/0x230 [ 66.703974] ? rwlock_bug.part.0+0x90/0x90 [ 66.703974] ? _raw_spin_lock_irq+0x41/0x50 [ 66.703974] worker_thread+0x575/0x1190 [ 66.703974] ? process_one_work+0x1510/0x1510 [ 66.703974] kthread+0x2a0/0x340 [ 66.703974] ? kthread_complete_and_exit+0x20/0x20 [ 66.703974] ret_from_fork+0x22/0x30 [ 66.703974] </TASK> [ 66.703974] [ 66.703974] Allocated by task 267: [ 66.703974] kasan_save_stack+0x1e/0x40 [ 66.703974] __kasan_kmalloc+0x81/0xa0 [ 66.703974] nci_allocate_device+0xd3/0x390 [ 66.703974] nfcmrvl_nci_register_dev+0x183/0x2c0 [ 66.703974] nfcmrvl_nci_uart_open+0xf2/0x1dd [ 66.703974] nci_uart_tty_ioctl+0x2c3/0x4a0 [ 66.703974] tty_ioctl+0x764/0x1310 [ 66.703974] __x64_sys_ioctl+0x122/0x190 [ 66.703974] do_syscall_64+0x3b/0x90 [ 66.703974] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 66.703974] [ 66.703974] Freed by task 406: [ 66.703974] kasan_save_stack+0x1e/0x40 [ 66.703974] kasan_set_track+0x21/0x30 [ 66.703974] kasan_set_free_info+0x20/0x30 [ 66.703974] __kasan_slab_free+0x108/0x170 [ 66.703974] kfree+0xb0/0x330 [ 66.703974] nfcmrvl_nci_unregister_dev+0x90/0xd0 [ 66.703974] nci_uart_tty_close+0xdf/0x180 [ 66.703974] tty_ldisc_kill+0x73/0x110 [ 66.703974] tty_ldisc_hangup+0x281/0x5b0 [ 66.703974] __tty_hangup.part.0+0x431/0x890 [ 66.703974] tty_release+0x3a8/0xc80 [ 66.703974] __fput+0x1f0/0x8c0 [ 66.703974] task_work_run+0xc9/0x170 [ 66.703974] exit_to_user_mode_prepare+0x194/0x1a0 [ 66.703974] syscall_exit_to_user_mode+0x19/0x50 [ 66.703974] do_syscall_64+0x48/0x90 [ 66.703974] entry_SYSCALL_64_after_hwframe+0x44/0xae To fix the UAF, this patch adds flush_workqueue() to ensure the nci_cmd_work is finished before the following del_timer_sync. This combination will promise the timer is actually detached. Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20sctp: Initialize daddr on peeled off socketPetr Malat1-1/+1
[ Upstream commit 8467dda0c26583547731e7f3ea73fc3856bae3bf ] Function sctp_do_peeloff() wrongly initializes daddr of the original socket instead of the peeled off socket, which makes getpeername() return zeroes instead of the primary address. Initialize the new socket instead. Fixes: d570ee490fb1 ("[SCTP]: Correctly set daddr for IPv6 sockets during peeloff") Signed-off-by: Petr Malat <oss@malat.biz> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/20220409063611.673193-1-oss@malat.biz Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20net/smc: Fix NULL pointer dereference in smc_pnet_find_ib()Karsten Graul1-2/+3
[ Upstream commit d22f4f977236f97e01255a80bca2ea93a8094fc8 ] dev_name() was called with dev.parent as argument but without to NULL-check it before. Solve this by checking the pointer before the call to dev_name(). Fixes: af5f60c7e3d5 ("net/smc: allow PCI IDs as ib device names in the pnet table") Reported-by: syzbot+03e3e228510223dabd34@syzkaller.appspotmail.com Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20cfg80211: hold bss_lock while updating nontrans_listRameshkumar Sundaram1-0/+2
[ Upstream commit a5199b5626cd6913cf8776a835bc63d40e0686ad ] Synchronize additions to nontrans_list of transmitting BSS with bss_lock to avoid races. Also when cfg80211_add_nontrans_list() fails __cfg80211_unlink_bss() needs bss_lock to be held (has lockdep assert on bss_lock). So protect the whole block with bss_lock to avoid races and warnings. Found during code review. Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com> Link: https://lore.kernel.org/r/1649668071-9370-1-git-send-email-quic_ramess@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20net/sched: taprio: Check if socket flags are validBenedikt Spranger1-1/+2
[ Upstream commit e8a64bbaaad1f6548cec5508297bc6d45e8ab69e ] A user may set the SO_TXTIME socket option to ensure a packet is send at a given time. The taprio scheduler has to confirm, that it is allowed to send a packet at that given time, by a check against the packet time schedule. The scheduler drop the packet, if the gates are closed at the given send time. The check, if SO_TXTIME is set, may fail since sk_flags are part of an union and the union is used otherwise. This happen, if a socket is not a full socket, like a request socket for example. Add a check to verify, if the union is used for sk_flags. Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20net/sched: fix initialization order when updating chain 0 headMarcelo Ricardo Leitner1-1/+1
[ Upstream commit e65812fd22eba32f11abe28cb377cbd64cfb1ba0 ] Currently, when inserting a new filter that needs to sit at the head of chain 0, it will first update the heads pointer on all devices using the (shared) block, and only then complete the initialization of the new element so that it has a "next" element. This can lead to a situation that the chain 0 head is propagated to another CPU before the "next" initialization is done. When this race condition is triggered, packets being matched on that CPU will simply miss all other filters, and will flow through the stack as if there were no other filters installed. If the system is using OVS + TC, such packets will get handled by vswitchd via upcall, which results in much higher latency and reordering. For other applications it may result in packet drops. This is reproducible with a tc only setup, but it varies from system to system. It could be reproduced with a shared block amongst 10 veth tunnels, and an ingress filter mirroring packets to another veth. That's because using the last added veth tunnel to the shared block to do the actual traffic, it makes the race window bigger and easier to trigger. The fix is rather simple, to just initialize the next pointer of the new filter instance (tp) before propagating the head change. The fixes tag is pointing to the original code though this issue should only be observed when using it unlocked. Fixes: 2190d1d0944f ("net: sched: introduce helpers to work with filter chains") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/b97d5f4eaffeeb9d058155bcab63347527261abf.1649341369.git.marcelo.leitner@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20net/sched: flower: fix parsing of ethertype following VLAN headerVlad Buslov2-5/+14
[ Upstream commit 2105f700b53c24aa48b65c15652acc386044d26a ] A tc flower filter matching TCA_FLOWER_KEY_VLAN_ETH_TYPE is expected to match the L2 ethertype following the first VLAN header, as confirmed by linked discussion with the maintainer. However, such rule also matches packets that have additional second VLAN header, even though filter has both eth_type and vlan_ethtype set to "ipv4". Looking at the code this seems to be mostly an artifact of the way flower uses flow dissector. First, even though looking at the uAPI eth_type and vlan_ethtype appear like a distinct fields, in flower they are all mapped to the same key->basic.n_proto. Second, flow dissector skips following VLAN header as no keys for FLOW_DISSECTOR_KEY_CVLAN are set and eventually assigns the value of n_proto to last parsed header. With these, such filters ignore any headers present between first VLAN header and first "non magic" header (ipv4 in this case) that doesn't result FLOW_DISSECT_RET_PROTO_AGAIN. Fix the issue by extending flow dissector VLAN key structure with new 'vlan_eth_type' field that matches first ethertype following previously parsed VLAN header. Modify flower classifier to set the new flow_dissector_key_vlan->vlan_eth_type with value obtained from TCA_FLOWER_KEY_VLAN_ETH_TYPE/TCA_FLOWER_KEY_CVLAN_ETH_TYPE uAPIs. Link: https://lore.kernel.org/all/Yjhgi48BpTGh6dig@nanopsycho/ Fixes: 9399ae9a6cb2 ("net_sched: flower: Add vlan support") Fixes: d64efd0926ba ("net/sched: flower: Add supprt for matching on QinQ vlan headers") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec()Trond Myklebust1-1/+3
[ Upstream commit b056fa070814897be32d83b079dbc311375588e7 ] The allocation is done with GFP_KERNEL, but it could still fail in a low memory situation. Fixes: 4a85a6a3320b ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC: Handle low memory situations in call_status()Trond Myklebust1-0/+5
[ Upstream commit 9d82819d5b065348ce623f196bf601028e22ed00 ] We need to handle ENFILE, ENOBUFS, and ENOMEM, because xprt_wake_pending_tasks() can be called with any one of these due to socket creation failures. Fixes: b61d59fffd3e ("SUNRPC: xs_tcp_connect_worker{4,6}: merge common code") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC: Handle ENOMEM in call_transmit_status()Trond Myklebust1-0/+2
[ Upstream commit d3c15033b240767d0287f1c4a529cbbe2d5ded8a ] Both call_transmit() and call_bc_transmit() can now return ENOMEM, so let's make sure that we handle the errors gracefully. Fixes: 0472e4766049 ("SUNRPC: Convert socket page send code to use iov_iter()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13bpf: Support dual-stack sockets in bpf_tcp_check_syncookieMaxim Mikityanskiy1-4/+13
[ Upstream commit 2e8702cc0cfa1080f29fd64003c00a3e24ac38de ] bpf_tcp_gen_syncookie looks at the IP version in the IP header and validates the address family of the socket. It supports IPv4 packets in AF_INET6 dual-stack sockets. On the other hand, bpf_tcp_check_syncookie looks only at the address family of the socket, ignoring the real IP version in headers, and validates only the packet size. This implementation has some drawbacks: 1. Packets are not validated properly, allowing a BPF program to trick bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4 socket. 2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end up receiving a SYNACK with the cookie, but the following ACK gets dropped. This patch fixes these issues by changing the checks in bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. IP version from the header is taken into account, and it is validated properly with address family. Fixes: 399040847084 ("bpf: add helper to check for a valid SYN cookie") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Arthur Fabre <afabre@cloudflare.com> Link: https://lore.kernel.org/bpf/20220406124113.2795730-1-maximmi@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13rxrpc: fix a race in rxrpc_exit_net()Eric Dumazet1-1/+1
[ Upstream commit 1946014ca3b19be9e485e780e862c375c6f98bad ] Current code can lead to the following race: CPU0 CPU1 rxrpc_exit_net() rxrpc_peer_keepalive_worker() if (rxnet->live) rxnet->live = false; del_timer_sync(&rxnet->peer_keepalive_timer); timer_reduce(&rxnet->peer_keepalive_timer, jiffies + delay); cancel_work_sync(&rxnet->peer_keepalive_work); rxrpc_exit_net() exits while peer_keepalive_timer is still armed, leading to use-after-free. syzbot report was: ODEBUG: free active (active state 0) object type: timer_list hint: rxrpc_peer_keepalive_timeout+0x0/0xb0 WARNING: CPU: 0 PID: 3660 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Modules linked in: CPU: 0 PID: 3660 Comm: kworker/u4:6 Not tainted 5.17.0-syzkaller-13993-g88e6c0207623 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd 00 1c 26 8a 4c 89 ee 48 c7 c7 00 10 26 8a e8 b1 e7 28 05 <0f> 0b 83 05 15 eb c5 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3 RSP: 0018:ffffc9000353fb00 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: ffff888029196140 RSI: ffffffff815efad8 RDI: fffff520006a7f52 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815ea4ae R11: 0000000000000000 R12: ffffffff89ce23e0 R13: ffffffff8a2614e0 R14: ffffffff816628c0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe1f2908924 CR3: 0000000043720000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __debug_check_no_obj_freed lib/debugobjects.c:992 [inline] debug_check_no_obj_freed+0x301/0x420 lib/debugobjects.c:1023 kfree+0xd6/0x310 mm/slab.c:3809 ops_free_list.part.0+0x119/0x370 net/core/net_namespace.c:176 ops_free_list net/core/net_namespace.c:174 [inline] cleanup_net+0x591/0xb00 net/core/net_namespace.c:598 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 </TASK> Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: linux-afs@lists.infradead.org Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13net: openvswitch: fix leak of nested actionsIlya Maximets1-5/+90
[ Upstream commit 1f30fb9166d4f15a1aa19449b9da871fe0ed4796 ] While parsing user-provided actions, openvswitch module may dynamically allocate memory and store pointers in the internal copy of the actions. So this memory has to be freed while destroying the actions. Currently there are only two such actions: ct() and set(). However, there are many actions that can hold nested lists of actions and ovs_nla_free_flow_actions() just jumps over them leaking the memory. For example, removal of the flow with the following actions will lead to a leak of the memory allocated by nf_ct_tmpl_alloc(): actions:clone(ct(commit),0) Non-freed set() action may also leak the 'dst' structure for the tunnel info including device references. Under certain conditions with a high rate of flow rotation that may cause significant memory leak problem (2MB per second in reporter's case). The problem is also hard to mitigate, because the user doesn't have direct control over the datapath flows generated by OVS. Fix that by iterating over all the nested actions and freeing everything that needs to be freed recursively. New build time assertion should protect us from this problem if new actions will be added in the future. Unfortunately, openvswitch module doesn't use NLA_F_NESTED, so all attributes has to be explicitly checked. sample() and clone() actions are mixing extra attributes into the user-provided action list. That prevents some code generalization too. Fixes: 34ae932a4036 ("openvswitch: Make tunnel set action attach a metadata dst") Link: https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392922.html Reported-by: Stéphane Graber <stgraber@ubuntu.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13net: openvswitch: don't send internal clone attribute to the userspace.Ilya Maximets2-2/+4
[ Upstream commit 3f2a3050b4a3e7f32fc0ea3c9b0183090ae00522 ] 'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for performance optimization inside the kernel. It's added by the kernel while parsing user-provided actions and should not be sent during the flow dump as it's not part of the uAPI. The issue doesn't cause any significant problems to the ovs-vswitchd process, because reported actions are not really used in the application lifecycle and only supposed to be shown to a human via ovs-dpctl flow dump. However, the action list is still incorrect and causes the following error if the user wants to look at the datapath flows: # ovs-dpctl add-dp system@ovs-system # ovs-dpctl add-flow "<flow match>" "clone(ct(commit),0)" # ovs-dpctl dump-flows <flow match>, packets:0, bytes:0, used:never, actions:clone(bad length 4, expected -1 for: action0(01 00 00 00), ct(commit),0) With the fix: # ovs-dpctl dump-flows <flow match>, packets:0, bytes:0, used:never, actions:clone(ct(commit),0) Additionally fixed an incorrect attribute name in the comment. Fixes: b233504033db ("openvswitch: kernel datapath clone action") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20220404104150.2865736-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13ipv6: Fix stats accounting in ip6_pkt_dropDavid Ahern1-1/+1
[ Upstream commit 1158f79f82d437093aeed87d57df0548bdd68146 ] VRF devices are the loopbacks for VRFs, and a loopback can not be assigned to a VRF. Accordingly, the condition in ip6_pkt_drop should be '||' not '&&'. Fixes: 1d3fd8a10bed ("vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach") Reported-by: Pudak, Filip <Filip.Pudak@windriver.com> Reported-by: Xiao, Jiguang <Jiguang.Xiao@windriver.com> Signed-off-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220404150908.2937-1-dsahern@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13net: ipv4: fix route with nexthop object delete warningNikolay Aleksandrov1-1/+6
[ Upstream commit 6bf92d70e690b7ff12b24f4bfff5e5434d019b82 ] FRR folks have hit a kernel warning[1] while deleting routes[2] which is caused by trying to delete a route pointing to a nexthop id without specifying nhid but matching on an interface. That is, a route is found but we hit a warning while matching it. The warning is from fib_info_nh() in include/net/nexthop.h because we run it on a fib_info with nexthop object. The call chain is: inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a nexthop fib_info and also with fc_oif set thus calling fib_info_nh on the fib_info and triggering the warning). The fix is to not do any matching in that branch if the fi has a nexthop object because those are managed separately. I.e. we should match when deleting without nh spec and should fail when deleting a nexthop route with old-style nh spec because nexthop objects are managed separately, e.g.: $ ip r show 1.2.3.4/32 1.2.3.4 nhid 12 via 192.168.11.2 dev dummy0 $ ip r del 1.2.3.4/32 $ ip r del 1.2.3.4/32 nhid 12 <both should work> $ ip r del 1.2.3.4/32 dev dummy0 <should fail with ESRCH> [1] [ 523.462226] ------------[ cut here ]------------ [ 523.462230] WARNING: CPU: 14 PID: 22893 at include/net/nexthop.h:468 fib_nh_match+0x210/0x460 [ 523.462236] Modules linked in: dummy rpcsec_gss_krb5 xt_socket nf_socket_ipv4 nf_socket_ipv6 ip6table_raw iptable_raw bpf_preload xt_statistic ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_mark nf_tables xt_nat veth nf_conntrack_netlink nfnetlink xt_addrtype br_netfilter overlay dm_crypt nfsv3 nfs fscache netfs vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack 8021q garp mrp ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bridge stp llc rfcomm snd_seq_dummy snd_hrtimer rpcrdma rdma_cm iw_cm ib_cm ib_core ip6table_filter xt_comment ip6_tables vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) qrtr bnep binfmt_misc xfs vfat fat squashfs loop nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) nvidia(POE) intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi btusb btrtl iwlmvm uvcvideo btbcm snd_hda_intel edac_mce_amd [ 523.462274] videobuf2_vmalloc videobuf2_memops btintel snd_intel_dspcfg videobuf2_v4l2 snd_intel_sdw_acpi bluetooth snd_usb_audio snd_hda_codec mac80211 snd_usbmidi_lib joydev snd_hda_core videobuf2_common kvm_amd snd_rawmidi snd_hwdep snd_seq videodev ccp snd_seq_device libarc4 ecdh_generic mc snd_pcm kvm iwlwifi snd_timer drm_kms_helper snd cfg80211 cec soundcore irqbypass rapl wmi_bmof i2c_piix4 rfkill k10temp pcspkr acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel nvme sp5100_tco r8169 nvme_core wmi ipmi_devintf ipmi_msghandler fuse [ 523.462300] CPU: 14 PID: 22893 Comm: ip Tainted: P OE 5.16.18-200.fc35.x86_64 #1 [ 523.462302] Hardware name: Micro-Star International Co., Ltd. MS-7C37/MPG X570 GAMING EDGE WIFI (MS-7C37), BIOS 1.C0 10/29/2020 [ 523.462303] RIP: 0010:fib_nh_match+0x210/0x460 [ 523.462304] Code: 7c 24 20 48 8b b5 90 00 00 00 e8 bb ee f4 ff 48 8b 7c 24 20 41 89 c4 e8 ee eb f4 ff 45 85 e4 0f 85 2e fe ff ff e9 4c ff ff ff <0f> 0b e9 17 ff ff ff 3c 0a 0f 85 61 fe ff ff 48 8b b5 98 00 00 00 [ 523.462306] RSP: 0018:ffffaa53d4d87928 EFLAGS: 00010286 [ 523.462307] RAX: 0000000000000000 RBX: ffffaa53d4d87a90 RCX: ffffaa53d4d87bb0 [ 523.462308] RDX: ffff9e3d2ee6be80 RSI: ffffaa53d4d87a90 RDI: ffffffff920ed380 [ 523.462309] RBP: ffff9e3d2ee6be80 R08: 0000000000000064 R09: 0000000000000000 [ 523.462310] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000031 [ 523.462310] R13: 0000000000000020 R14: 0000000000000000 R15: ffff9e3d331054e0 [ 523.462311] FS: 00007f245517c1c0(0000) GS:ffff9e492ed80000(0000) knlGS:0000000000000000 [ 523.462313] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 523.462313] CR2: 000055e5dfdd8268 CR3: 00000003ef488000 CR4: 0000000000350ee0 [ 523.462315] Call Trace: [ 523.462316] <TASK> [ 523.462320] fib_table_delete+0x1a9/0x310 [ 523.462323] inet_rtm_delroute+0x93/0x110 [ 523.462325] rtnetlink_rcv_msg+0x133/0x370 [ 523.462327] ? _copy_to_iter+0xb5/0x6f0 [ 523.462330] ? rtnl_calcit.isra.0+0x110/0x110 [ 523.462331] netlink_rcv_skb+0x50/0xf0 [ 523.462334] netlink_unicast+0x211/0x330 [ 523.462336] netlink_sendmsg+0x23f/0x480 [ 523.462338] sock_sendmsg+0x5e/0x60 [ 523.462340] ____sys_sendmsg+0x22c/0x270 [ 523.462341] ? import_iovec+0x17/0x20 [ 523.462343] ? sendmsg_copy_msghdr+0x59/0x90 [ 523.462344] ? __mod_lruvec_page_state+0x85/0x110 [ 523.462348] ___sys_sendmsg+0x81/0xc0 [ 523.462350] ? netlink_seq_start+0x70/0x70 [ 523.462352] ? __dentry_kill+0x13a/0x180 [ 523.462354] ? __fput+0xff/0x250 [ 523.462356] __sys_sendmsg+0x49/0x80 [ 523.462358] do_syscall_64+0x3b/0x90 [ 523.462361] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 523.462364] RIP: 0033:0x7f24552aa337 [ 523.462365] Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 523.462366] RSP: 002b:00007fff7f05a838 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 523.462368] RAX: ffffffffffffffda RBX: 000000006245bf91 RCX: 00007f24552aa337 [ 523.462368] RDX: 0000000000000000 RSI: 00007fff7f05a8a0 RDI: 0000000000000003 [ 523.462369] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 523.462370] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000001 [ 523.462370] R13: 00007fff7f05ce08 R14: 0000000000000000 R15: 000055e5dfdd1040 [ 523.462373] </TASK> [ 523.462374] ---[ end trace ba537bc16f6bf4ed ]--- [2] https://github.com/FRRouting/frr/issues/6412 Fixes: 4c7e8084fd46 ("ipv4: Plumb support for nexthop object in a fib_info") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13net/tls: fix slab-out-of-bounds bug in decrypt_internalZiyang Xuan1-1/+1
[ Upstream commit 9381fe8c849cfbe50245ac01fc077554f6eaa0e2 ] The memory size of tls_ctx->rx.iv for AES128-CCM is 12 setting in tls_set_sw_offload(). The return value of crypto_aead_ivsize() for "ccm(aes)" is 16. So memcpy() require 16 bytes from 12 bytes memory space will trigger slab-out-of-bounds bug as following: ================================================================== BUG: KASAN: slab-out-of-bounds in decrypt_internal+0x385/0xc40 [tls] Read of size 16 at addr ffff888114e84e60 by task tls/10911 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x5e/0x5db ? decrypt_internal+0x385/0xc40 [tls] kasan_report+0xab/0x120 ? decrypt_internal+0x385/0xc40 [tls] kasan_check_range+0xf9/0x1e0 memcpy+0x20/0x60 decrypt_internal+0x385/0xc40 [tls] ? tls_get_rec+0x2e0/0x2e0 [tls] ? process_rx_list+0x1a5/0x420 [tls] ? tls_setup_from_iter.constprop.0+0x2e0/0x2e0 [tls] decrypt_skb_update+0x9d/0x400 [tls] tls_sw_recvmsg+0x3c8/0xb50 [tls] Allocated by task 10911: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 tls_set_sw_offload+0x2eb/0xa20 [tls] tls_setsockopt+0x68c/0x700 [tls] __sys_setsockopt+0xfe/0x1b0 Replace the crypto_aead_ivsize() with prot->iv_size + prot->salt_size when memcpy() iv value in TLS_1_3_VERSION scenario. Fixes: f295b3ae9f59 ("net/tls: Add support of AES128-CCM based ciphers") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC: Fix socket waits for write buffer spaceTrond Myklebust1-15/+39
[ Upstream commit 7496b59f588dd52886fdbac7633608097543a0a5 ] The socket layer requires that we use the socket lock to protect changes to the sock->sk_write_pending field and others. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC: remove scheduling boost for "SWAPPER" tasks.NeilBrown2-18/+0
[ Upstream commit a80a8461868905823609be97f91776a26befe839 ] Currently, tasks marked as "swapper" tasks get put to the front of non-priority rpc_queues, and are sorted earlier than non-swapper tasks on the transport's ->xmit_queue. This is pointless as currently *all* tasks for a mount that has swap enabled on *any* file are marked as "swapper" tasks. So the net result is that the non-priority rpc_queues are reverse-ordered (LIFO). This scheduling boost is not necessary to avoid deadlocks, and hurts fairness, so remove it. If there were a need to expedite some requests, the tk_priority mechanism is a more appropriate tool. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC/xprt: async tasks mustn't block waiting for memoryNeilBrown2-2/+5
[ Upstream commit a721035477fb5fb8abc738fbe410b07c12af3dc5 ] When memory is short, new worker threads cannot be created and we depend on the minimum one rpciod thread to be able to handle everything. So it must not block waiting for memory. xprt_dynamic_alloc_slot can block indefinitely. This can tie up all workqueue threads and NFS can deadlock. So when called from a workqueue, set __GFP_NORETRY. The rdma alloc_slot already does not block. However it sets the error to -EAGAIN suggesting this will trigger a sleep. It does not. As we can see in call_reserveresult(), only -ENOMEM causes a sleep. -EAGAIN causes immediate retry. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13SUNRPC/call_alloc: async tasks mustn't block waiting for memoryNeilBrown2-2/+6