summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2021-12-22virtio/vsock: fix the transport to work with VMADDR_CID_ANYWei Wang1-1/+2
[ Upstream commit 1db8f5fc2e5c66a5c51e1f6488e0ba7d45c29ae4 ] The VMADDR_CID_ANY flag used by a socket means that the socket isn't bound to any specific CID. For example, a host vsock server may want to be bound with VMADDR_CID_ANY, so that a guest vsock client can connect to the host server with CID=VMADDR_CID_HOST (i.e. 2), and meanwhile, a host vsock client can connect to the same local server with CID=VMADDR_CID_LOCAL (i.e. 1). The current implementation sets the destination socket's svm_cid to a fixed CID value after the first client's connection, which isn't an expected operation. For example, if the guest client first connects to the host server, the server's svm_cid gets set to VMADDR_CID_HOST, then other host clients won't be able to connect to the server anymore. Reproduce steps: 1. Run the host server: socat VSOCK-LISTEN:1234,fork - 2. Run a guest client to connect to the host server: socat - VSOCK-CONNECT:2:1234 3. Run a host client to connect to the host server: socat - VSOCK-CONNECT:1:1234 Without this patch, step 3. above fails to connect, and socat complains "socat[1720] E connect(5, AF=40 cid:1 port:1234, 16): Connection reset by peer". With this patch, the above works well. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Signed-off-by: Wei Wang <wei.w.wang@intel.com> Link: https://lore.kernel.org/r/20211126011823.1760-1-wei.w.wang@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-22mac80211: track only QoS data frames for admission controlJohannes Berg1-3/+10
[ Upstream commit d5e568c3a4ec2ddd23e7dc5ad5b0c64e4f22981a ] For admission control, obviously all of that only works for QoS data frames, otherwise we cannot even access the QoS field in the header. Syzbot reported (see below) an uninitialized value here due to a status of a non-QoS nullfunc packet, which isn't even long enough to contain the QoS header. Fix this to only do anything for QoS data packets. Reported-by: syzbot+614e82b88a1a4973e534@syzkaller.appspotmail.com Fixes: 02219b3abca5 ("mac80211: add WMM admission control support") Link: https://lore.kernel.org/r/20211122124737.dad29e65902a.Ieb04587afacb27c14e0de93ec1bfbefb238cc2a0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-22mac80211: validate extended element ID is presentJohannes Berg1-1/+6
commit 768c0b19b50665e337c96858aa2b7928d6dcf756 upstream. Before attempting to parse an extended element, verify that the extended element ID is present. Fixes: 41cbb0f5a295 ("mac80211: add support for HE") Reported-by: syzbot+59bdff68edce82e393b6@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20211211201023.f30a1b128c07.I5cacc176da94ba316877c6e10fe3ceec8b4dbd7d@changeid Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22mac80211: send ADDBA requests using the tid/queue of the aggregation sessionFelix Fietkau1-1/+1
commit 1fe98f5690c4219d419ea9cc190f94b3401cf324 upstream. Sending them out on a different queue can cause a race condition where a number of packets in the queue may be discarded by the receiver, because the ADDBA request is sent too early. This affects any driver with software A-MPDU setup which does not allocate packet seqno in hardware on tx, regardless of whether iTXQ is used or not. The only driver I've seen that explicitly deals with this issue internally is mwl8k. Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20211202124533.80388-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22mac80211: mark TX-during-stop for TX in in_reconfigJohannes Berg1-1/+4
commit db7205af049d230e7e0abf61c1e74c1aab40f390 upstream. Mark TXQs as having seen transmit while they were stopped if we bail out of drv_wake_tx_queue() due to reconfig, so that the queue wake after this will make them catch up. This is particularly necessary for when TXQs are used for management packets since those TXQs won't see a lot of traffic that'd make them catch up later. Cc: stable@vger.kernel.org Fixes: 4856bfd23098 ("mac80211: do not call driver wake_tx_queue op during reconfig") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211129152938.4573a221c0e1.I0d1d5daea3089be3fc0dccc92991b0f8c5677f0c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22mac80211: fix regression in SSN handling of addba txFelix Fietkau2-2/+3
commit 73111efacd3c6d9e644acca1d132566932be8af0 upstream. Some drivers that do their own sequence number allocation (e.g. ath9k) rely on being able to modify params->ssn on starting tx ampdu sessions. This was broken by a change that modified it to use sta->tid_seq[tid] instead. Cc: stable@vger.kernel.org Fixes: 31d8bb4e07f8 ("mac80211: agg-tx: refactor sending addba") Reported-by: Eneas U de Queiroz <cotequeiroz@gmail.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20211124094024.43222-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-17ethtool: do not perform operations on net devices being unregisteredAntoine Tenart1-0/+3
commit dde91ccfa25fd58f64c397d91b81a4b393100ffa upstream. There is a short period between a net device starts to be unregistered and when it is actually gone. In that time frame ethtool operations could still be performed, which might end up in unwanted or undefined behaviours[1]. Do not allow ethtool operations after a net device starts its unregistration. This patch targets the netlink part as the ioctl one isn't affected: the reference to the net device is taken and the operation is executed within an rtnl lock section and the net device won't be found after unregister. [1] For example adding Tx queues after unregister ends up in NULL pointer exceptions and UaFs, such as: BUG: KASAN: use-after-free in kobject_get+0x14/0x90 Read of size 1 at addr ffff88801961248c by task ethtool/755 CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014 Call Trace: dump_stack_lvl+0x57/0x72 print_address_description.constprop.0+0x1f/0x140 kasan_report.cold+0x7f/0x11b kobject_get+0x14/0x90 kobject_add_internal+0x3d1/0x450 kobject_init_and_add+0xba/0xf0 netdev_queue_update_kobjects+0xcf/0x200 netif_set_real_num_tx_queues+0xb4/0x310 veth_set_channels+0x1c3/0x550 ethnl_set_channels+0x524/0x610 Fixes: 041b1c5d4a53 ("ethtool: helper functions for netlink interface") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Antoine Tenart <atenart@kernel.org> Link: https://lore.kernel.org/r/20211203101318.435618-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-17bpf: Fix integer overflow in argument calculation for bpf_map_area_allocBui Quang Minh1-1/+1
commit 7dd5d437c258bbf4cc15b35229e5208b87b8b4e0 upstream. In 32-bit architecture, the result of sizeof() is a 32-bit integer so the expression becomes the multiplication between 2 32-bit integer which can potentially leads to integer overflow. As a result, bpf_map_area_alloc() allocates less memory than needed. Fix this by casting 1 operand to u64. Fixes: 0d2c4f964050 ("bpf: Eliminate rlimit-based memory accounting for sockmap and sockhash maps") Fixes: 99c51064fb06 ("devmap: Use bpf_map_area_alloc() for allocating hash buckets") Fixes: 546ac1ffb70d ("bpf: add devmap, a map for storing net device references") Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210613143440.71975-1-minhquangbui99@gmail.com Signed-off-by: Connor O'Brien <connoro@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-17net: netlink: af_netlink: Prevent empty skb by adding a check on len.Harshit Mogalapalli1-0/+5
[ Upstream commit f123cffdd8fe8ea6c7fded4b88516a42798797d0 ] Adding a check on len parameter to avoid empty skb. This prevents a division error in netem_enqueue function which is caused when skb->len=0 and skb->data_len=0 in the randomized corruption step as shown below. skb->data[prandom_u32() % skb_headlen(skb)] ^= 1<<(prandom_u32() % 8); Crash Report: [ 343.170349] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0 [ 343.216110] netem: version 1.3 [ 343.235841] divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 343.236680] CPU: 3 PID: 4288 Comm: reproducer Not tainted 5.16.0-rc1+ [ 343.237569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 343.238707] RIP: 0010:netem_enqueue+0x1590/0x33c0 [sch_netem] [ 343.239499] Code: 89 85 58 ff ff ff e8 5f 5d e9 d3 48 8b b5 48 ff ff ff 8b 8d 50 ff ff ff 8b 85 58 ff ff ff 48 8b bd 70 ff ff ff 31 d2 2b 4f 74 <f7> f1 48 b8 00 00 00 00 00 fc ff df 49 01 d5 4c 89 e9 48 c1 e9 03 [ 343.241883] RSP: 0018:ffff88800bcd7368 EFLAGS: 00010246 [ 343.242589] RAX: 00000000ba7c0a9c RBX: 0000000000000001 RCX: 0000000000000000 [ 343.243542] RDX: 0000000000000000 RSI: ffff88800f8edb10 RDI: ffff88800f8eda40 [ 343.244474] RBP: ffff88800bcd7458 R08: 0000000000000000 R09: ffffffff94fb8445 [ 343.245403] R10: ffffffff94fb8336 R11: ffffffff94fb8445 R12: 0000000000000000 [ 343.246355] R13: ffff88800a5a7000 R14: ffff88800a5b5800 R15: 0000000000000020 [ 343.247291] FS: 00007fdde2bd7700(0000) GS:ffff888109780000(0000) knlGS:0000000000000000 [ 343.248350] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.249120] CR2: 00000000200000c0 CR3: 000000000ef4c000 CR4: 00000000000006e0 [ 343.250076] Call Trace: [ 343.250423] <TASK> [ 343.250713] ? memcpy+0x4d/0x60 [ 343.251162] ? netem_init+0xa0/0xa0 [sch_netem] [ 343.251795] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.252443] netem_enqueue+0xe28/0x33c0 [sch_netem] [ 343.253102] ? stack_trace_save+0x87/0xb0 [ 343.253655] ? filter_irq_stacks+0xb0/0xb0 [ 343.254220] ? netem_init+0xa0/0xa0 [sch_netem] [ 343.254837] ? __kasan_check_write+0x14/0x20 [ 343.255418] ? _raw_spin_lock+0x88/0xd6 [ 343.255953] dev_qdisc_enqueue+0x50/0x180 [ 343.256508] __dev_queue_xmit+0x1a7e/0x3090 [ 343.257083] ? netdev_core_pick_tx+0x300/0x300 [ 343.257690] ? check_kcov_mode+0x10/0x40 [ 343.258219] ? _raw_spin_unlock_irqrestore+0x29/0x40 [ 343.258899] ? __kasan_init_slab_obj+0x24/0x30 [ 343.259529] ? setup_object.isra.71+0x23/0x90 [ 343.260121] ? new_slab+0x26e/0x4b0 [ 343.260609] ? kasan_poison+0x3a/0x50 [ 343.261118] ? kasan_unpoison+0x28/0x50 [ 343.261637] ? __kasan_slab_alloc+0x71/0x90 [ 343.262214] ? memcpy+0x4d/0x60 [ 343.262674] ? write_comp_data+0x2f/0x90 [ 343.263209] ? __kasan_check_write+0x14/0x20 [ 343.263802] ? __skb_clone+0x5d6/0x840 [ 343.264329] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.264958] dev_queue_xmit+0x1c/0x20 [ 343.265470] netlink_deliver_tap+0x652/0x9c0 [ 343.266067] netlink_unicast+0x5a0/0x7f0 [ 343.266608] ? netlink_attachskb+0x860/0x860 [ 343.267183] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.267820] ? write_comp_data+0x2f/0x90 [ 343.268367] netlink_sendmsg+0x922/0xe80 [ 343.268899] ? netlink_unicast+0x7f0/0x7f0 [ 343.269472] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.270099] ? write_comp_data+0x2f/0x90 [ 343.270644] ? netlink_unicast+0x7f0/0x7f0 [ 343.271210] sock_sendmsg+0x155/0x190 [ 343.271721] ____sys_sendmsg+0x75f/0x8f0 [ 343.272262] ? kernel_sendmsg+0x60/0x60 [ 343.272788] ? write_comp_data+0x2f/0x90 [ 343.273332] ? write_comp_data+0x2f/0x90 [ 343.273869] ___sys_sendmsg+0x10f/0x190 [ 343.274405] ? sendmsg_copy_msghdr+0x80/0x80 [ 343.274984] ? slab_post_alloc_hook+0x70/0x230 [ 343.275597] ? futex_wait_setup+0x240/0x240 [ 343.276175] ? security_file_alloc+0x3e/0x170 [ 343.276779] ? write_comp_data+0x2f/0x90 [ 343.277313] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.277969] ? write_comp_data+0x2f/0x90 [ 343.278515] ? __fget_files+0x1ad/0x260 [ 343.279048] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.279685] ? write_comp_data+0x2f/0x90 [ 343.280234] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.280874] ? sockfd_lookup_light+0xd1/0x190 [ 343.281481] __sys_sendmsg+0x118/0x200 [ 343.281998] ? __sys_sendmsg_sock+0x40/0x40 [ 343.282578] ? alloc_fd+0x229/0x5e0 [ 343.283070] ? write_comp_data+0x2f/0x90 [ 343.283610] ? write_comp_data+0x2f/0x90 [ 343.284135] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.284776] ? ktime_get_coarse_real_ts64+0xb8/0xf0 [ 343.285450] __x64_sys_sendmsg+0x7d/0xc0 [ 343.285981] ? syscall_enter_from_user_mode+0x4d/0x70 [ 343.286664] do_syscall_64+0x3a/0x80 [ 343.287158] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 343.287850] RIP: 0033:0x7fdde24cf289 [ 343.288344] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 db 2c 00 f7 d8 64 89 01 48 [ 343.290729] RSP: 002b:00007fdde2bd6d98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 343.291730] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fdde24cf289 [ 343.292673] RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000004 [ 343.293618] RBP: 00007fdde2bd6e20 R08: 0000000100000001 R09: 0000000000000000 [ 343.294557] R10: 0000000100000001 R11: 0000000000000246 R12: 0000000000000000 [ 343.295493] R13: 0000000000021000 R14: 0000000000000000 R15: 00007fdde2bd7700 [ 343.296432] </TASK> [ 343.296735] Modules linked in: sch_netem ip6_vti ip_vti ip_gre ipip sit ip_tunnel geneve macsec macvtap tap ipvlan macvlan 8021q garp mrp hsr wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic curve25519_x86_64 libcurve25519_generic libchacha xfrm_interface xfrm6_tunnel tunnel4 veth netdevsim psample batman_adv nlmon dummy team bonding tls vcan ip6_gre ip6_tunnel tunnel6 gre tun ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_security iptable_raw ebtable_filter ebtables rfkill ip6table_filter ip6_tables iptable_filter ppdev bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper cec parport_pc drm joydev floppy parport sg syscopyarea sysfillrect sysimgblt i2c_piix4 qemu_fw_cfg fb_sys_fops pcspkr [ 343.297459] ip_tables xfs virtio_net net_failover failover sd_mod sr_mod cdrom t10_pi ata_generic pata_acpi ata_piix libata virtio_pci virtio_pci_legacy_dev serio_raw virtio_pci_modern_dev dm_mirror dm_region_hash dm_log dm_mod [ 343.311074] Dumping ftrace buffer: [ 343.311532] (ftrace buffer empty) [ 343.312040] ---[ end trace a2e3db5a6ae05099 ]--- [ 343.312691] RIP: 0010:netem_enqueue+0x1590/0x33c0 [sch_netem] [ 343.313481] Code: 89 85 58 ff ff ff e8 5f 5d e9 d3 48 8b b5 48 ff ff ff 8b 8d 50 ff ff ff 8b 85 58 ff ff ff 48 8b bd 70 ff ff ff 31 d2 2b 4f 74 <f7> f1 48 b8 00 00 00 00 00 fc ff df 49 01 d5 4c 89 e9 48 c1 e9 03 [ 343.315893] RSP: 0018:ffff88800bcd7368 EFLAGS: 00010246 [ 343.316622] RAX: 00000000ba7c0a9c RBX: 0000000000000001 RCX: 0000000000000000 [ 343.317585] RDX: 0000000000000000 RSI: ffff88800f8edb10 RDI: ffff88800f8eda40 [ 343.318549] RBP: ffff88800bcd7458 R08: 0000000000000000 R09: ffffffff94fb8445 [ 343.319503] R10: ffffffff94fb8336 R11: ffffffff94fb8445 R12: 0000000000000000 [ 343.320455] R13: ffff88800a5a7000 R14: ffff88800a5b5800 R15: 0000000000000020 [ 343.321414] FS: 00007fdde2bd7700(0000) GS:ffff888109780000(0000) knlGS:0000000000000000 [ 343.322489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.323283] CR2: 00000000200000c0 CR3: 000000000ef4c000 CR4: 00000000000006e0 [ 343.324264] Kernel panic - not syncing: Fatal exception in interrupt [ 343.333717] Dumping ftrace buffer: [ 343.334175] (ftrace buffer empty) [ 343.334653] Kernel Offset: 0x13600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 343.336027] Rebooting in 86400 seconds.. Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20211129175328.55339-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-17nfc: fix segfault in nfc_genl_dump_devices_doneTadeusz Struk1-2/+4
commit fd79a0cbf0b2e34bcc45b13acf962e2032a82203 upstream. When kmalloc in nfc_genl_dump_devices() fails then nfc_genl_dump_devices_done() segfaults as below KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 25 Comm: kworker/0:1 Not tainted 5.16.0-rc4-01180-g2a987e65025e-dirty #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-6.fc35 04/01/2014 Workqueue: events netlink_sock_destruct_work RIP: 0010:klist_iter_exit+0x26/0x80 Call Trace: <TASK> class_dev_iter_exit+0x15/0x20 nfc_genl_dump_devices_done+0x3b/0x50 genl_lock_done+0x84/0xd0 netlink_sock_destruct+0x8f/0x270 __sk_destruct+0x64/0x3b0 sk_destruct+0xa8/0xd0 __sk_free+0x2e8/0x3d0 sk_free+0x51/0x90 netlink_sock_destruct_work+0x1c/0x20 process_one_work+0x411/0x710 worker_thread+0x6fd/0xa80 Link: https://syzkaller.appspot.com/bug?id=fc0fa5a53db9edd261d56e74325419faf18bd0df Reported-by: syzbot+f9f76f4a0766420b4a02@syzkaller.appspotmail.com Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211208182742.340542-1-tadeusz.struk@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14net, neigh: clear whole pneigh_entry at alloc timeEric Dumazet1-2/+1
commit e195e9b5dee6459d8c8e6a314cc71a644a0537fd upstream. Commit 2c611ad97a82 ("net, neigh: Extend neigh->flags to 32 bit to allow for extensions") enables a new KMSAM warning [1] I think the bug is actually older, because the following intruction only occurred if ndm->ndm_flags had NTF_PROXY set. pn->flags = ndm->ndm_flags; Let's clear all pneigh_entry fields at alloc time. [1] BUG: KMSAN: uninit-value in pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593 pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593 pneigh_dump_table net/core/neighbour.c:2715 [inline] neigh_dump_info+0x1e3f/0x2c60 net/core/neighbour.c:2832 netlink_dump+0xaca/0x16a0 net/netlink/af_netlink.c:2265 __netlink_dump_start+0xd1c/0xee0 net/netlink/af_netlink.c:2370 netlink_dump_start include/linux/netlink.h:254 [inline] rtnetlink_rcv_msg+0x181b/0x18c0 net/core/rtnetlink.c:5534 netlink_rcv_skb+0x447/0x800 net/netlink/af_netlink.c:2491 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5589 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x1095/0x1360 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x16f3/0x1870 net/netlink/af_netlink.c:1916 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] sock_write_iter+0x594/0x690 net/socket.c:1057 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0x1318/0x2030 fs/read_write.c:590 ksys_write+0x28c/0x520 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae Uninit was created at: slab_post_alloc_hook mm/slab.h:524 [inline] slab_alloc_node mm/slub.c:3251 [inline] slab_alloc mm/slub.c:3259 [inline] __kmalloc+0xc3c/0x12d0 mm/slub.c:4437 kmalloc include/linux/slab.h:595 [inline] pneigh_lookup+0x60f/0xd70 net/core/neighbour.c:766 arp_req_set_public net/ipv4/arp.c:1016 [inline] arp_req_set+0x430/0x10a0 net/ipv4/arp.c:1032 arp_ioctl+0x8d4/0xb60 net/ipv4/arp.c:1232 inet_ioctl+0x4ef/0x820 net/ipv4/af_inet.c:947 sock_do_ioctl net/socket.c:1118 [inline] sock_ioctl+0xa3f/0x13e0 net/socket.c:1235 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl+0x2df/0x4a0 fs/ioctl.c:860 __x64_sys_ioctl+0xd8/0x110 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae CPU: 1 PID: 20001 Comm: syz-executor.0 Not tainted 5.16.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 62dd93181aaa ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211206165329.1049835-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14net/sched: fq_pie: prevent dismantle issueEric Dumazet1-0/+1
commit 61c2402665f1e10c5742033fce18392e369931d7 upstream. For some reason, fq_pie_destroy() did not copy working code from pie_destroy() and other qdiscs, thus causing elusive bug. Before calling del_timer_sync(&q->adapt_timer), we need to ensure timer will not rearm itself. rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 0-....: (4416 ticks this GP) idle=60d/1/0x4000000000000000 softirq=10433/10434 fqs=2579 (t=10501 jiffies g=13085 q=3989) NMI backtrace for cpu 0 CPU: 0 PID: 13 Comm: ksoftirqd/0 Not tainted 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343 print_cpu_stall kernel/rcu/tree_stall.h:627 [inline] check_cpu_stall kernel/rcu/tree_stall.h:711 [inline] rcu_pending kernel/rcu/tree.c:3878 [inline] rcu_sched_clock_irq.cold+0x9d/0x746 kernel/rcu/tree.c:2597 update_process_times+0x16d/0x200 kernel/time/timer.c:1785 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428 __run_hrtimer kernel/time/hrtimer.c:1685 [inline] __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 RIP: 0010:write_comp_data kernel/kcov.c:221 [inline] RIP: 0010:__sanitizer_cov_trace_const_cmp1+0x1d/0x80 kernel/kcov.c:273 Code: 54 c8 20 48 89 10 c3 66 0f 1f 44 00 00 53 41 89 fb 41 89 f1 bf 03 00 00 00 65 48 8b 0c 25 40 70 02 00 48 89 ce 4c 8b 54 24 08 <e8> 4e f7 ff ff 84 c0 74 51 48 8b 81 88 15 00 00 44 8b 81 84 15 00 RSP: 0018:ffffc90000d27b28 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff888064bf1bf0 RCX: ffff888011928000 RDX: ffff888011928000 RSI: ffff888011928000 RDI: 0000000000000003 RBP: ffff888064bf1c28 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff875d8295 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8880783dd300 R14: 0000000000000000 R15: 0000000000000000 pie_calculate_probability+0x405/0x7c0 net/sched/sch_pie.c:418 fq_pie_timer+0x170/0x2a0 net/sched/sch_fq_pie.c:383 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421 expire_timers kernel/time/timer.c:1466 [inline] __run_timers.part.0+0x675/0xa20 kernel/time/timer.c:1734 __run_timers kernel/time/timer.c:1715 [inline] run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1747 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 run_ksoftirqd kernel/softirq.c:921 [inline] run_ksoftirqd+0x2d/0x60 kernel/softirq.c:913 smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Cc: Sachin D. Patil <sdp.sachin@gmail.com> Cc: V. Saicharan <vsaicharan1998@gmail.com> Cc: Mohit Bhasi <mohitbhasi1998@gmail.com> Cc: Leslie Monis <lesliemonis@gmail.com> Cc: Gautam Ramakrishnan <gautamramk@gmail.com> Link: https://lore.kernel.org/r/20211209084937.3500020-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14devlink: fix netns refcount leak in devlink_nl_cmd_reload()Eric Dumazet1-8/+8
commit 4dbb0dad8e63fcd0b5a117c2861d2abe7ff5f186 upstream. While preparing my patch series adding netns refcount tracking, I spotted bugs in devlink_nl_cmd_reload() Some error paths forgot to release a refcount on a netns. To fix this, we can reduce the scope of get_net()/put_net() section around the call to devlink_reload(). Fixes: ccdf07219da6 ("devlink: Add reload action option to devlink reload command") Fixes: dc64cc7c6310 ("devlink: Add devlink reload limit option") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Moshe Shemesh <moshe@mellanox.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211205192822.1741045-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14netfilter: conntrack: annotate data-races around ct->timeoutEric Dumazet3-6/+6
commit 802a7dc5cf1bef06f7b290ce76d478138408d6b1 upstream. (struct nf_conn)->timeout can be read/written locklessly, add READ_ONCE()/WRITE_ONCE() to prevent load/store tearing. BUG: KCSAN: data-race in __nf_conntrack_alloc / __nf_conntrack_find_get write to 0xffff888132e78c08 of 4 bytes by task 6029 on cpu 0: __nf_conntrack_alloc+0x158/0x280 net/netfilter/nf_conntrack_core.c:1563 init_conntrack+0x1da/0xb30 net/netfilter/nf_conntrack_core.c:1635 resolve_normal_ct+0x502/0x610 net/netfilter/nf_conntrack_core.c:1746 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline] tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680 __tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864 tcp_push_pending_frames include/net/tcp.h:1897 [inline] tcp_data_snd_check+0x62/0x2e0 net/ipv4/tcp_input.c:5452 tcp_rcv_established+0x880/0x10e0 net/ipv4/tcp_input.c:5947 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 sk_stream_wait_memory+0x435/0x700 net/core/stream.c:145 tcp_sendmsg_locked+0xb85/0x25a0 net/ipv4/tcp.c:1402 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1440 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:644 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] __sys_sendto+0x21e/0x2c0 net/socket.c:2036 __do_sys_sendto net/socket.c:2048 [inline] __se_sys_sendto net/socket.c:2044 [inline] __x64_sys_sendto+0x74/0x90 net/socket.c:2044 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888132e78c08 of 4 bytes by task 17446 on cpu 1: nf_ct_is_expired include/net/netfilter/nf_conntrack.h:286 [inline] ____nf_conntrack_find net/netfilter/nf_conntrack_core.c:776 [inline] __nf_conntrack_find_get+0x1c7/0xac0 net/netfilter/nf_conntrack_core.c:807 resolve_normal_ct+0x273/0x610 net/netfilter/nf_conntrack_core.c:1734 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 __tcp_send_ack+0x1fd/0x300 net/ipv4/tcp_output.c:3956 tcp_send_ack+0x23/0x30 net/ipv4/tcp_output.c:3962 __tcp_ack_snd_check+0x2d8/0x510 net/ipv4/tcp_input.c:5478 tcp_ack_snd_check net/ipv4/tcp_input.c:5523 [inline] tcp_rcv_established+0x8c2/0x10e0 net/ipv4/tcp_input.c:5948 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 tcp_sendpage+0x94/0xb0 net/ipv4/tcp.c:1114 inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833 rds_tcp_xmit+0x376/0x5f0 net/rds/tcp_send.c:118 rds_send_xmit+0xbed/0x1500 net/rds/send.c:367 rds_send_worker+0x43/0x200 net/rds/threads.c:200 process_one_work+0x3fc/0x980 kernel/workqueue.c:2298 worker_thread+0x616/0xa70 kernel/workqueue.c:2445 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 value changed: 0x00027cc2 -> 0x00000000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 17446 Comm: kworker/u4:5 Tainted: G W 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: krdsd rds_send_worker Note: I chose an arbitrary commit for the Fixes: tag, because I do not think we need to backport this fix to very old kernels. Fixes: e37542ba111f ("netfilter: conntrack: avoid possible false sharing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14udp: using datalen to cap max gso segmentsJianguo Wu1-1/+1
commit 158390e45612ef0fde160af0826f1740c36daf21 upstream. The max number of UDP gso segments is intended to cap to UDP_MAX_SEGMENTS, this is checked in udp_send_skb(): if (skb->len > cork->gso_size * UDP_MAX_SEGMENTS) { kfree_skb(skb); return -EINVAL; } skb->len contains network and transport header len here, we should use only data len instead. Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/900742e5-81fb-30dc-6e0b-375c6cdd7982@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14seg6: fix the iif in the IPv6 socket control blockAndrea Mayer1-0/+8
commit ae68d93354e5bf5191ee673982251864ea24dd5c upstream. When an IPv4 packet is received, the ip_rcv_core(...) sets the receiving interface index into the IPv4 socket control block (v5.16-rc4, net/ipv4/ip_input.c line 510): IPCB(skb)->iif = skb->skb_iif; If that IPv4 packet is meant to be encapsulated in an outer IPv6+SRH header, the seg6_do_srh_encap(...) performs the required encapsulation. In this case, the seg6_do_srh_encap function clears the IPv6 socket control block (v5.16-rc4 net/ipv6/seg6_iptunnel.c line 163): memset(IP6CB(skb), 0, sizeof(*IP6CB(skb))); The memset(...) was introduced in commit ef489749aae5 ("ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation") a long time ago (2019-01-29). Since the IPv6 socket control block and the IPv4 socket control block share the same memory area (skb->cb), the receiving interface index info is lost (IP6CB(skb)->iif is set to zero). As a side effect, that condition triggers a NULL pointer dereference if commit 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev") is applied. To fix that issue, we set the IP6CB(skb)->iif with the index of the receiving interface once again. Fixes: ef489749aae5 ("ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation") Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211208195409.12169-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_doneKrzysztof Kozlowski1-2/+4
commit 4cd8371a234d051f9c9557fcbb1f8c523b1c0d10 upstream. The done() netlink callback nfc_genl_dump_ses_done() should check if received argument is non-NULL, because its allocation could fail earlier in dumpit() (nfc_genl_dump_ses()). Fixes: ac22ac466a65 ("NFC: Add a GET_SE netlink API") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211209081307.57337-1-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groupsStefano Brivio1-1/+1
commit b7e945e228d7df1b1473ef6fd2cdec67433065fb upstream. The sixth byte of packet data has to be looked up in the sixth group, not in the seventh one, even if we load the bucket data into ymm6 (and not ymm5, for convenience of tracking stalls). Without this fix, matching on a MAC address as first field of a set, if 8-bit groups are selected (due to a small set size) would fail, that is, the given MAC address would never match. Reported-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Cc: <stable@vger.kernel.org> # 5.6.x Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-By: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08net/tls: Fix authentication failure in CCM modeTianjia Zhang1-2/+2
commit 5961060692f8b17cd2080620a3d27b95d2ae05ca upstream. When the TLS cipher suite uses CCM mode, including AES CCM and SM4 CCM, the first byte of the B0 block is flags, and the real IV starts from the second byte. The XOR operation of the IV and rec_seq should be skip this byte, that is, add the iv_offset. Fixes: f295b3ae9f59 ("net/tls: Add support of AES128-CCM based ciphers") Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Cc: Vakul Garg <vakul.garg@nxp.com> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08net/smc: Keep smc_close_final rc during active closeTony Lu1-2/+6
commit 00e158fb91dfaff3f94746f260d11f1a4853506e upstream. When smc_close_final() returns error, the return code overwrites by kernel_sock_shutdown() in smc_close_active(). The return code of smc_close_final() is more important than kernel_sock_shutdown(), and it will pass to userspace directly. Fix it by keeping both return codes, if smc_close_final() raises an error, return it or kernel_sock_shutdown()'s. Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Suggested-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08net/rds: correct socket tunable error in rds_tcp_tune()William Kucharski1-1/+1
commit 19f36edf14bcdb783aef3af8217df96f76a8ce34 upstream. Correct an error where setting /proc/sys/net/rds/tcp/rds_tcp_rcvbuf would instead modify the socket's sk_sndbuf and would leave sk_rcvbuf untouched. Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Signed-off-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08net/smc: fix wrong list_del in smc_lgr_cleanup_earlyDust Li1-4/+3
commit 789b6cc2a5f9123b9c549b886fdc47c865cfe0ba upstream. smc_lgr_cleanup_early() meant to delete the link group from the link group list, but it deleted the list head by mistake. This may cause memory corruption since we didn't remove the real link group from the list and later memseted the link group structure. We got a list corruption panic when testing: [  231.277259] list_del corruption. prev->next should be ffff8881398a8000, but was 0000000000000000 [  231.278222] ------------[ cut here ]------------ [  231.278726] kernel BUG at lib/list_debug.c:53! [  231.279326] invalid opcode: 0000 [#1] SMP NOPTI [  231.279803] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.10.46+ #435 [  231.280466] Hardware name: Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014 [  231.281248] Workqueue: events smc_link_down_work [  231.281732] RIP: 0010:__list_del_entry_valid+0x70/0x90 [  231.282258] Code: 4c 60 82 e8 7d cc 6a 00 0f 0b 48 89 fe 48 c7 c7 88 4c 60 82 e8 6c cc 6a 00 0f 0b 48 89 fe 48 c7 c7 c0 4c 60 82 e8 5b cc 6a 00 <0f> 0b 48 89 fe 48 c7 c7 00 4d 60 82 e8 4a cc 6a 00 0f 0b cc cc cc [  231.284146] RSP: 0018:ffffc90000033d58 EFLAGS: 00010292 [  231.284685] RAX: 0000000000000054 RBX: ffff8881398a8000 RCX: 0000000000000000 [  231.285415] RDX: 0000000000000001 RSI: ffff88813bc18040 RDI: ffff88813bc18040 [  231.286141] RBP: ffffffff8305ad40 R08: 0000000000000003 R09: 0000000000000001 [  231.286873] R10: ffffffff82803da0 R11: ffffc90000033b90 R12: 0000000000000001 [  231.287606] R13: 0000000000000000 R14: ffff8881398a8000 R15: 0000000000000003 [  231.288337] FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [  231.289160] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  231.289754] CR2: 0000000000e72058 CR3: 000000010fa96006 CR4: 00000000003706f0 [  231.290485] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [  231.291211] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [  231.291940] Call Trace: [  231.292211]  smc_lgr_terminate_sched+0x53/0xa0 [  231.292677]  smc_switch_conns+0x75/0x6b0 [  231.293085]  ? update_load_avg+0x1a6/0x590 [  231.293517]  ? ttwu_do_wakeup+0x17/0x150 [  231.293907]  ? update_load_avg+0x1a6/0x590 [  231.294317]  ? newidle_balance+0xca/0x3d0 [  231.294716]  smcr_link_down+0x50/0x1a0 [  231.295090]  ? __wake_up_common_lock+0x77/0x90 [  231.295534]  smc_link_down_work+0x46/0x60 [  231.295933]  process_one_work+0x18b/0x350 Fixes: a0a62ee15a829 ("net/smc: separate locks for SMCD and SMCR link group lists") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08ipv4: convert fib_num_tclassid_users to atomic_tEric Dumazet3-5/+5
commit 213f5f8f31f10aa1e83187ae20fb7fa4e626b724 upstream. Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4.fib_num_tclassid_users were protected by RTNL. After the change, this is no longer the case, as free_fib_info_rcu() runs after rcu grace period, without rtnl being held. Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08net: annotate data-races on txq->xmit_lock_ownerEric Dumazet1-1/+4
commit 7a10d8c810cfad3e79372d7d1c77899d86cd6662 upstream. syzbot found that __dev_queue_xmit() is reading txq->xmit_lock_owner without annotations. No serious issue there, let's document what is happening there. BUG: KCSAN: data-race in __dev_queue_xmit / __dev_queue_xmit write to 0xffff888139d09484 of 4 bytes by interrupt on cpu 0: __netif_tx_unlock include/linux/netdevice.h:4437 [inline] __dev_queue_xmit+0x948/0xf70 net/core/dev.c:4229 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline] macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567 __netdev_start_xmit include/linux/netdevice.h:4987 [inline] netdev_start_xmit include/linux/netdevice.h:5001 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3590 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259 neigh_hh_output include/net/neighbour.h:511 [inline] neigh_output include/net/neighbour.h:525 [inline] ip6_finish_output2+0x995/0xbb0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421 expire_timers+0x116/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x410 kernel/time/timer.c:1734 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747 __do_softirq+0x158/0x2de kernel/softirq.c:558 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x37/0x70 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x3e/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 read to 0xffff888139d09484 of 4 bytes by interrupt on cpu 1: __dev_queue_xmit+0x5e3/0xf70 net/core/dev.c:4213 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline] macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567 __netdev_start_xmit include/linux/netdevice.h:4987 [inline] netdev_start_xmit include/linux/netdevice.h:5001 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3590 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259 neigh_resolve_output+0x3db/0x410 net/core/neighbour.c:1523 neigh_output include/net/neighbour.h:527 [inline] ip6_finish_output2+0x9be/0xbb0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421 expire_timers+0x116/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x410 kernel/time/timer.c:1734 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747 __do_softirq+0x158/0x2de kernel/softirq.c:558 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x37/0x70 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x8d/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 kcsan_setup_watchpoint+0x94/0x420 kernel/kcsan/core.c:443 folio_test_anon include/linux/page-flags.h:581 [inline] PageAnon include/linux/page-flags.h:586 [inline] zap_pte_range+0x5ac/0x10e0 mm/memory.c:1347 zap_pmd_range mm/memory.c:1467 [inline] zap_pud_range mm/memory.c:1496 [inline] zap_p4d_range mm/memory.c:1517 [inline] unmap_page_range+0x2dc/0x3d0 mm/memory.c:1538 unmap_single_vma+0x157/0x210 mm/memory.c:1583 unmap_vmas+0xd0/0x180 mm/memory.c:1615 exit_mmap+0x23d/0x470 mm/mmap.c:3170 __mmput+0x27/0x1b0 kernel/fork.c:1113 mmput+0x3d/0x50 kernel/fork.c:1134 exit_mm+0xdb/0x170 kernel/exit.c:507 do_exit+0x608/0x17a0 kernel/exit.c:819 do_group_exit+0xce/0x180 kernel/exit.c:929 get_signal+0xfc3/0x1550 kernel/signal.c:2852 arch_do_signal_or_restart+0x8c/0x2e0 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x113/0x190 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0xffffffff Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G W 5.16.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20211130170155.2331929-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer()Eiichi Tsukata1-5/+9
commit beacff50edbd6c9659a6f15fc7f6126909fade29 upstream. Need to call rxrpc_put_local() for peer candidate before kfree() as it holds a ref to rxrpc_local. [DH: v2: Changed to abstract the peer freeing code out into a function] Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/all/20211121041608.133740-2-eiichi.tsukata@nutanix.com/ # v1 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle()Eiichi Tsukata1-5/+9
commit ca77fba821351190777b236ce749d7c4d353102e upstream. Need to call rxrpc_put_peer() for bundle candidate before kfree() as it holds a ref to rxrpc_peer. [DH: v2: Changed to abstract out the bundle freeing code into a function] Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager") Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20211121041608.133740-1-eiichi.tsukata@nutanix.com/ # v1 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08net: mpls: Fix notifications when deleting a deviceBenjamin Poirier1-16/+52
commit 7d4741eacdefa5f0475431645b56baf00784df1f upstream. There are various problems related to netlink notifications for mpls route changes in response to interfaces being deleted: * delete interface of only nexthop DELROUTE notification is missing RTA_OIF attribute * delete interface of non-last nexthop NEWROUTE notification is missing entirely * delete interface of last nexthop DELROUTE notification is missing nexthop All of these problems stem from the fact that existing routes are modified in-place before sending a notification. Restructure mpls_ifdown() to avoid changing the route in the DELROUTE cases and to create a copy in the NEWROUTE case. Fixes: f8efb73c97e2 ("mpls: multipath route support") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08wireguard: device: reset peer src endpoint when netns exitsJason A. Donenfeld1-0/+19
commit 20ae1d6aa159eb91a9bf09ff92ccaa94dbea92c2 upstream. Each peer's endpoint contains a dst_cache entry that takes a reference to another netdev. When the containing namespace exits, we take down the socket and prevent future sockets from being created (by setting creating_net to NULL), which removes that potential reference on the netns. However, it doesn't release references to the netns that a netdev cached in dst_cache might be taking, so the netns still might fail to exit. Since the socket is gimp