summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2025-03-22net/mlx5: Bridge, fix the crash caused by LAG state checkJianbo Liu1-7/+5
[ Upstream commit 4b8eeed4fb105770ce6dc84a2c6ef953c7b71cbb ] When removing LAG device from bridge, NETDEV_CHANGEUPPER event is triggered. Driver finds the lower devices (PFs) to flush all the offloaded entries. And mlx5_lag_is_shared_fdb is checked, it returns false if one of PF is unloaded. In such case, mlx5_esw_bridge_lag_rep_get() and its caller return NULL, instead of the alive PF, and the flush is skipped. Besides, the bridge fdb entry's lastuse is updated in mlx5 bridge event handler. But this SWITCHDEV_FDB_ADD_TO_BRIDGE event can be ignored in this case because the upper interface for bond is deleted, and the entry will never be aged because lastuse is never updated. To make things worse, as the entry is alive, mlx5 bridge workqueue keeps sending that event, which is then handled by kernel bridge notifier. It causes the following crash when accessing the passed bond netdev which is already destroyed. To fix this issue, remove such checks. LAG state is already checked in commit 15f8f168952f ("net/mlx5: Bridge, verify LAG state when adding bond to bridge"), driver still need to skip offload if LAG becomes invalid state after initialization. Oops: stack segment: 0000 [#1] SMP CPU: 3 UID: 0 PID: 23695 Comm: kworker/u40:3 Tainted: G OE 6.11.0_mlnx #1 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5_bridge_wq mlx5_esw_bridge_update_work [mlx5_core] RIP: 0010:br_switchdev_event+0x2c/0x110 [bridge] Code: 44 00 00 48 8b 02 48 f7 00 00 02 00 00 74 69 41 54 55 53 48 83 ec 08 48 8b a8 08 01 00 00 48 85 ed 74 4a 48 83 fe 02 48 89 d3 <4c> 8b 65 00 74 23 76 49 48 83 fe 05 74 7e 48 83 fe 06 75 2f 0f b7 RSP: 0018:ffffc900092cfda0 EFLAGS: 00010297 RAX: ffff888123bfe000 RBX: ffffc900092cfe08 RCX: 00000000ffffffff RDX: ffffc900092cfe08 RSI: 0000000000000001 RDI: ffffffffa0c585f0 RBP: 6669746f6e690a30 R08: 0000000000000000 R09: ffff888123ae92c8 R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888123ae9c60 R13: 0000000000000001 R14: ffffc900092cfe08 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f15914c8734 CR3: 0000000002830005 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __die_body+0x1a/0x60 ? die+0x38/0x60 ? do_trap+0x10b/0x120 ? do_error_trap+0x64/0xa0 ? exc_stack_segment+0x33/0x50 ? asm_exc_stack_segment+0x22/0x30 ? br_switchdev_event+0x2c/0x110 [bridge] ? sched_balance_newidle.isra.149+0x248/0x390 notifier_call_chain+0x4b/0xa0 atomic_notifier_call_chain+0x16/0x20 mlx5_esw_bridge_update+0xec/0x170 [mlx5_core] mlx5_esw_bridge_update_work+0x19/0x40 [mlx5_core] process_scheduled_works+0x81/0x390 worker_thread+0x106/0x250 ? bh_worker+0x110/0x110 kthread+0xb7/0xe0 ? kthread_park+0x80/0x80 ret_from_fork+0x2d/0x50 ? kthread_park+0x80/0x80 ret_from_fork_asm+0x11/0x20 </TASK> Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741644104-97767-6-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net/mlx5: Lag, Check shared fdb before creating MultiPort E-SwitchShay Drory3-3/+5
[ Upstream commit 32966984bee1defd9f5a8f9be274d7c32f911ba1 ] Currently, MultiPort E-Switch is requesting to create a LAG with shared FDB without checking the LAG is supporting shared FDB. Add the check. Fixes: a32327a3a02c ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741644104-97767-5-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net/mlx5: Fix incorrect IRQ pool usage when releasing IRQsShay Drory5-7/+16
[ Upstream commit 32d2724db5b2361ab293427ccd5c24f4f2bcca14 ] mlx5_irq_pool_get() is a getter for completion IRQ pool only. However, after the cited commit, mlx5_irq_pool_get() is called during ctrl IRQ release flow to retrieve the pool, resulting in the use of an incorrect IRQ pool. Hence, use the newly introduced mlx5_irq_get_pool() getter to retrieve the correct IRQ pool based on the IRQ itself. While at it, rename mlx5_irq_pool_get() to mlx5_irq_table_get_comp_irq_pool() which accurately reflects its purpose and improves code readability. Fixes: 0477d5168bbb ("net/mlx5: Expose SFs IRQs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741644104-97767-4-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net/mlx5: HWS, Rightsize bwc matcher priorityVlad Dogaru1-1/+1
[ Upstream commit 521992337f67f71ce4436b98bc32563ddb1a5ce3 ] The bwc layer was clamping the matcher priority from 32 bits to 16 bits. This didn't show up until a matcher was resized, since the initial native matcher was created using the correct 32 bit value. The fix also reorders fields to avoid some padding. Fixes: 2111bb970c78 ("net/mlx5: HWS, added backward-compatible API handling") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741644104-97767-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22rtase: Fix improper release of ring list entries in rtase_sw_resetJustin Lai1-0/+10
[ Upstream commit 415f135ace7fd824cde083184a922e39156055b5 ] Since rtase_init_ring, which is called within rtase_sw_reset, adds ring entries already present in the ring list back into the list, it causes the ring list to form a cycle. This results in list_for_each_entry_safe failing to find an endpoint during traversal, leading to an error. Therefore, it is necessary to remove the previously added ring_list nodes before calling rtase_init_ring. Fixes: 079600489960 ("rtase: Implement net_device_ops") Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306070510.18129-1-justinlai0215@realtek.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22bonding: fix incorrect MAC address setting to receive NS messagesHangbin Liu1-8/+47
[ Upstream commit 0c5e145a350de3b38cd5ae77a401b12c46fb7c1d ] When validation on the backup slave is enabled, we need to validate the Neighbor Solicitation (NS) messages received on the backup slave. To receive these messages, the correct destination MAC address must be added to the slave. However, the target in bonding is a unicast address, which we cannot use directly. Instead, we should first convert it to a Solicited-Node Multicast Address and then derive the corresponding MAC address. Fix the incorrect MAC address setting on both slave_set_ns_maddr() and slave_set_ns_maddrs(). Since the two function names are similar. Add some description for the functions. Also only use one mac_addr variable in slave_set_ns_maddr() to save some code and logic. Fixes: 8eb36164d1a6 ("bonding: add ns target multicast address to slave device") Acked-by: Jay Vosburgh <jv@jvosburgh.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306023923.38777-2-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22eth: bnxt: fix memory leak in queue resetTaehee Yoo1-0/+1
[ Upstream commit 87dd2850835dd7886726b428a8ef7d73a60520c7 ] When the queue is reset, the bnxt_alloc_one_tpa_info() is called to allocate tpa_info for the new queue. And then the old queue's tpa_info should be removed by the bnxt_free_one_tpa_info(), but it is not called. So memory leak occurs. It adds the bnxt_free_one_tpa_info() in the bnxt_queue_mem_free(). unreferenced object 0xffff888293cc0000 (size 16384): comm "ncdevmem", pid 2076, jiffies 4296604081 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 40 75 78 93 82 88 ff ff ........@ux..... 40 75 78 93 02 00 00 00 00 00 00 00 00 00 00 00 @ux............. backtrace (crc 5d7d4798): ___kmalloc_large_node+0x10d/0x1b0 __kmalloc_large_node_noprof+0x17/0x60 __kmalloc_noprof+0x3f6/0x520 bnxt_alloc_one_tpa_info+0x5f/0x300 [bnxt_en] bnxt_queue_mem_alloc+0x8e8/0x14f0 [bnxt_en] netdev_rx_queue_restart+0x233/0x620 net_devmem_bind_dmabuf_to_queue+0x2a3/0x600 netdev_nl_bind_rx_doit+0xc00/0x10a0 genl_family_rcv_msg_doit+0x1d4/0x2b0 genl_rcv_msg+0x3fb/0x6c0 netlink_rcv_skb+0x12c/0x360 genl_rcv+0x24/0x40 netlink_unicast+0x447/0x710 netlink_sendmsg+0x712/0xbc0 __sys_sendto+0x3fd/0x4d0 __x64_sys_sendto+0xdc/0x1b0 Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250309134219.91670-7-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}Taehee Yoo1-0/+6
[ Upstream commit f09af5fdfbd9b0fcee73aab1116904c53b199e97 ] When qstats-get operation is executed, callbacks of netdev_stats_ops are called. The bnxt_get_queue_stats{rx | tx} collect per-queue stats from sw_stats in the rings. But {rx | tx | cp}_ring are allocated when the interface is up. So, these rings are not allocated when the interface is down. The qstats-get is allowed even if the interface is down. However, the bnxt_get_queue_stats{rx | tx}() accesses cp_ring and tx_ring without null check. So, it needs to avoid accessing rings if the interface is down. Reproducer: ip link set $interface down ./cli.py --spec netdev.yaml --dump qstats-get OR ip link set $interface down python ./stats.py Splat looks like: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1680fa067 P4D 1680fa067 PUD 16be3b067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 UID: 0 PID: 1495 Comm: python3 Not tainted 6.14.0-rc4+ #32 5cd0f999d5a15c574ac72b3e4b907341 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 RIP: 0010:bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en] Code: c6 87 b5 18 00 00 02 eb a2 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 01 RSP: 0018:ffffabef43cdb7e0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffffc04c8710 RCX: 0000000000000000 RDX: ffffabef43cdb858 RSI: 0000000000000000 RDI: ffff8d504e850000 RBP: ffff8d506c9f9c00 R08: 0000000000000004 R09: ffff8d506bcd901c R10: 0000000000000015 R11: ffff8d506bcd9000 R12: 0000000000000000 R13: ffffabef43cdb8c0 R14: ffff8d504e850000 R15: 0000000000000000 FS: 00007f2c5462b080(0000) GS:ffff8d575f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000167fd0000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x20/0x70 ? page_fault_oops+0x15a/0x460 ? sched_balance_find_src_group+0x58d/0xd10 ? exc_page_fault+0x6e/0x180 ? asm_exc_page_fault+0x22/0x30 ? bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en cdd546fd48563c280cfd30e9647efa420db07bf1] netdev_nl_stats_by_netdev+0x2b1/0x4e0 ? xas_load+0x9/0xb0 ? xas_find+0x183/0x1d0 ? xa_find+0x8b/0xe0 netdev_nl_qstats_get_dumpit+0xbf/0x1e0 genl_dumpit+0x31/0x90 netlink_dump+0x1a8/0x360 Fixes: af7b3b4adda5 ("eth: bnxt: support per-queue statistics") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20250309134219.91670-6-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22eth: bnxt: do not update checksum in bnxt_xdp_build_skb()Taehee Yoo3-12/+5
[ Upstream commit c03e7d05aa0e2f7e9a9ce5ad8a12471a53f941dc ] The bnxt_rx_pkt() updates ip_summed value at the end if checksum offload is enabled. When the XDP-MB program is attached and it returns XDP_PASS, the bnxt_xdp_build_skb() is called to update skb_shared_info. The main purpose of bnxt_xdp_build_skb() is to update skb_shared_info, but it updates ip_summed value too if checksum offload is enabled. This is actually duplicate work. When the bnxt_rx_pkt() updates ip_summed value, it checks if ip_summed is CHECKSUM_NONE or not. It means that ip_summed should be CHECKSUM_NONE at this moment. But ip_summed may already be updated to CHECKSUM_UNNECESSARY in the XDP-MB-PASS path. So the by skb_checksum_none_assert() WARNS about it. This is duplicate work and updating ip_summed in the bnxt_xdp_build_skb() is not needed. Splat looks like: WARNING: CPU: 3 PID: 5782 at ./include/linux/skbuff.h:5155 bnxt_rx_pkt+0x479b/0x7610 [bnxt_en] Modules linked in: bnxt_re bnxt_en rdma_ucm rdma_cm iw_cm ib_cm ib_uverbs veth xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_] CPU: 3 UID: 0 PID: 5782 Comm: socat Tainted: G W 6.14.0-rc4+ #27 Tainted: [W]=WARN Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 RIP: 0010:bnxt_rx_pkt+0x479b/0x7610 [bnxt_en] Code: 54 24 0c 4c 89 f1 4c 89 ff c1 ea 1f ff d3 0f 1f 00 49 89 c6 48 85 c0 0f 84 4c e5 ff ff 48 89 c7 e8 ca 3d a0 c8 e9 8f f4 ff ff <0f> 0b f RSP: 0018:ffff88881ba09928 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000c7590303 RCX: 0000000000000000 RDX: 1ffff1104e7d1610 RSI: 0000000000000001 RDI: ffff8881c91300b8 RBP: ffff88881ba09b28 R08: ffff888273e8b0d0 R09: ffff888273e8b070 R10: ffff888273e8b010 R11: ffff888278b0f000 R12: ffff888273e8b080 R13: ffff8881c9130e00 R14: ffff8881505d3800 R15: ffff888273e8b000 FS: 00007f5a2e7be080(0000) GS:ffff88881ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff2e708ff8 CR3: 000000013e3b0000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <IRQ> ? __warn+0xcd/0x2f0 ? bnxt_rx_pkt+0x479b/0x7610 ? report_bug+0x326/0x3c0 ? handle_bug+0x53/0xa0 ? exc_invalid_op+0x14/0x50 ? asm_exc_invalid_op+0x16/0x20 ? bnxt_rx_pkt+0x479b/0x7610 ? bnxt_rx_pkt+0x3e41/0x7610 ? __pfx_bnxt_rx_pkt+0x10/0x10 ? napi_complete_done+0x2cf/0x7d0 __bnxt_poll_work+0x4e8/0x1220 ? __pfx___bnxt_poll_work+0x10/0x10 ? __pfx_mark_lock.part.0+0x10/0x10 bnxt_poll_p5+0x36a/0xfa0 ? __pfx_bnxt_poll_p5+0x10/0x10 __napi_poll.constprop.0+0xa0/0x440 net_rx_action+0x899/0xd00 ... Following ping.py patch adds xdp-mb-pass case. so ping.py is going to be able to reproduce this issue. Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20250309134219.91670-5-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logicTaehee Yoo1-2/+2
[ Upstream commit 661958552eda5bf64bfafb4821cbdded935f1f68 ] When a queue is restarted, it sets MRU to 0 for stopping packet flow. MRU variable is a member of vnic_info[], the first vnic_info is default and the second is ntuple. Only when ntuple is enabled(ethtool -K eth0 ntuple on), vnic_info for ntuple is allocated in init logic. The bp->nr_vnics indicates how many vnic_info are allocated. However bnxt_queue_{start | stop}() accesses vnic_info[BNXT_VNIC_NTUPLE] regardless of ntuple state. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Fixes: b9d2956e869c ("bnxt_en: stop packet flow during bnxt_queue_stop/start") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250309134219.91670-4-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()Taehee Yoo1-0/+3
[ Upstream commit ca2456e073957781e1184de68551c65161b2bd30 ] The bnxt_queue_mem_alloc() is called to allocate new queue memory when a queue is restarted. It internally accesses rx buffer descriptor corresponding to the index. The rx buffer descriptor is allocated and set when the interface is up and it's freed when the interface is down. So, if queue is restarted if interface is down, kernel panic occurs. Splat looks like: BUG: unable to handle page fault for address: 000000000000b240 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en] Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15 RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202 RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000 RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808 RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000 R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000 FS: 00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x20/0x70 ? page_fault_oops+0x15a/0x460 ? exc_page_fault+0x6e/0x180 ? asm_exc_page_fault+0x22/0x30 ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7] ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7] netdev_rx_queue_restart+0xc5/0x240 net_devmem_bind_dmabuf_to_queue+0xf8/0x200 netdev_nl_bind_rx_doit+0x3a7/0x450 genl_family_rcv_msg_doit+0xd9/0x130 genl_rcv_msg+0x184/0x2b0 ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 ... Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250309134219.91670-3-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22eth: bnxt: fix truesize for mb-xdp-pass caseTaehee Yoo2-1/+9
[ Upstream commit 9f7b2aa5034e24d3c49db73d5f760c0435fe31c2 ] When mb-xdp is set and return is XDP_PASS, packet is converted from xdp_buff to sk_buff with xdp_update_skb_shared_info() in bnxt_xdp_build_skb(). bnxt_xdp_build_skb() passes incorrect truesize argument to xdp_update_skb_shared_info(). The truesize is calculated as BNXT_RX_PAGE_SIZE * sinfo->nr_frags but the skb_shared_info was wiped by napi_build_skb() before. So it stores sinfo->nr_frags before bnxt_xdp_build_skb() and use it instead of getting skb_shared_info from xdp_get_shared_info_from_buff(). Splat looks like: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 0 at net/core/skbuff.c:6072 skb_try_coalesce+0x504/0x590 Modules linked in: xt_nat xt_tcpudp veth af_packet xt_conntrack nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xfrm_user xt_addrtype nft_coms CPU: 2 UID: 0 PID: 0 Comm: swapper/2 Not tainted 6.14.0-rc2+ #3 RIP: 0010:skb_try_coalesce+0x504/0x590 Code: 4b fd ff ff 49 8b 34 24 40 80 e6 40 0f 84 3d fd ff ff 49 8b 74 24 48 40 f6 c6 01 0f 84 2e fd ff ff 48 8d 4e ff e9 25 fd ff ff <0f> 0b e99 RSP: 0018:ffffb62c4120caa8 EFLAGS: 00010287 RAX: 0000000000000003 RBX: ffffb62c4120cb14 RCX: 0000000000000ec0 RDX: 0000000000001000 RSI: ffffa06e5d7dc000 RDI: 0000000000000003 RBP: ffffa06e5d7ddec0 R08: ffffa06e6120a800 R09: ffffa06e7a119900 R10: 0000000000002310 R11: ffffa06e5d7dcec0 R12: ffffe4360575f740 R13: ffffe43600000000 R14: 0000000000000002 R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffffa0755f700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f147b76b0f8 CR3: 00000001615d4000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <IRQ> ? __warn+0x84/0x130 ? skb_try_coalesce+0x504/0x590 ? report_bug+0x18a/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? skb_try_coalesce+0x504/0x590 inet_frag_reasm_finish+0x11f/0x2e0 ip_defrag+0x37a/0x900 ip_local_deliver+0x51/0x120 ip_sublist_rcv_finish+0x64/0x70 ip_sublist_rcv+0x179/0x210 ip_list_rcv+0xf9/0x130 How to reproduce: <Node A> ip link set $interface1 xdp obj xdp_pass.o ip link set $interface1 mtu 9000 up ip a a 10.0.0.1/24 dev $interface1 <Node B> ip link set $interfac2 mtu 9000 up ip a a 10.0.0.2/24 dev $interface2 ping 10.0.0.1 -s 65000 Following ping.py patch adds xdp-mb-pass case. so ping.py is going to be able to reproduce this issue. Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250309134219.91670-2-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net/mlx5: handle errors in mlx5_chains_create_table()Wentao Liang1-0/+5
[ Upstream commit eab0396353be1c778eba1c0b5180176f04dd21ce ] In mlx5_chains_create_table(), the return value of mlx5_get_fdb_sub_ns() and mlx5_get_flow_namespace() must be checked to prevent NULL pointer dereferences. If either function fails, the function should log error message with mlx5_core_warn() and return error pointer. Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities") Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250307021820.2646-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22Drivers: hv: vmbus: Don't release fb_mmio resource in vmbus_free_mmio()Michael Kelley1-0/+13
[ Upstream commit 73fe9073c0cc28056cb9de0c8a516dac070f1d1f ] The VMBus driver manages the MMIO space it owns via the hyperv_mmio resource tree. Because the synthetic video framebuffer portion of the MMIO space is initially setup by the Hyper-V host for each guest, the VMBus driver does an early reserve of that portion of MMIO space in the hyperv_mmio resource tree. It saves a pointer to that resource in fb_mmio. When a VMBus driver requests MMIO space and passes "true" for the "fb_overlap_ok" argument, the reserved framebuffer space is used if possible. In that case it's not necessary to do another request against the "shadow" hyperv_mmio resource tree because that resource was already requested in the early reserve steps. However, the vmbus_free_mmio() function currently does no special handling for the fb_mmio resource. When a framebuffer device is removed, or the driver is unbound, the current code for vmbus_free_mmio() releases the reserved resource, leaving fb_mmio pointing to memory that has been freed. If the same or another driver is subsequently bound to the device, vmbus_allocate_mmio() checks against fb_mmio, and potentially gets garbage. Furthermore a second unbind operation produces this "nonexistent resource" error because of the unbalanced behavior between vmbus_allocate_mmio() and vmbus_free_mmio(): [ 55.499643] resource: Trying to free nonexistent resource <0x00000000f0000000-0x00000000f07fffff> Fix this by adding logic to vmbus_free_mmio() to recognize when MMIO space in the fb_mmio reserved area would be released, and don't release it. This filtering ensures the fb_mmio resource always exists, and makes vmbus_free_mmio() more parallel with vmbus_allocate_mmio(). Fixes: be000f93e5d7 ("drivers:hv: Track allocations of children of hv_vmbus in private resource tree") Signed-off-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20250310035208.275764-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250310035208.275764-1-mhklinux@outlook.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22fbdev: hyperv_fb: Allow graceful removal of framebufferSaurabh Sengar1-5/+15
[ Upstream commit ea2f45ab0e53b255f72c85ccd99e2b394fc5fceb ] When a Hyper-V framebuffer device is unbind, hyperv_fb driver tries to release the framebuffer forcefully. If this framebuffer is in use it produce the following WARN and hence this framebuffer is never released. [ 44.111220] WARNING: CPU: 35 PID: 1882 at drivers/video/fbdev/core/fb_info.c:70 framebuffer_release+0x2c/0x40 < snip > [ 44.111289] Call Trace: [ 44.111290] <TASK> [ 44.111291] ? show_regs+0x6c/0x80 [ 44.111295] ? __warn+0x8d/0x150 [ 44.111298] ? framebuffer_release+0x2c/0x40 [ 44.111300] ? report_bug+0x182/0x1b0 [ 44.111303] ? handle_bug+0x6e/0xb0 [ 44.111306] ? exc_invalid_op+0x18/0x80 [ 44.111308] ? asm_exc_invalid_op+0x1b/0x20 [ 44.111311] ? framebuffer_release+0x2c/0x40 [ 44.111313] ? hvfb_remove+0x86/0xa0 [hyperv_fb] [ 44.111315] vmbus_remove+0x24/0x40 [hv_vmbus] [ 44.111323] device_remove+0x40/0x80 [ 44.111325] device_release_driver_internal+0x20b/0x270 [ 44.111327] ? bus_find_device+0xb3/0xf0 Fix this by moving the release of framebuffer and assosiated memory to fb_ops.fb_destroy function, so that framebuffer framework handles it gracefully. While we fix this, also replace manual registrations/unregistration of framebuffer with devm_register_framebuffer. Fixes: 68a2d20b79b1 ("drivers/video: add Hyper-V Synthetic Video Frame Buffer Driver") Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1740845791-19977-3-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1740845791-19977-3-git-send-email-ssengar@linux.microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22fbdev: hyperv_fb: Simplify hvfb_putmemSaurabh Sengar1-6/+6
[ Upstream commit f5e728a50bb17336a20803dde488515b833ecd1d ] The device object required in 'hvfb_release_phymem' function for 'dma_free_coherent' can also be obtained from the 'info' pointer, making 'hdev' parameter in 'hvfb_putmem' redundant. Remove the unnecessary 'hdev' argument from 'hvfb_putmem'. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1740845791-19977-2-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1740845791-19977-2-git-send-email-ssengar@linux.microsoft.com> Stable-dep-of: ea2f45ab0e53 ("fbdev: hyperv_fb: Allow graceful removal of framebuffer") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22fbdev: hyperv_fb: Fix hang in kdump kernel when on Hyper-V Gen 2 VMsMichael Kelley1-7/+13
[ Upstream commit 304386373007aaca9236a3f36afac0bbedcd2bf0 ] Gen 2 Hyper-V VMs boot via EFI and have a standard EFI framebuffer device. When the kdump kernel runs in such a VM, loading the efifb driver may hang because of accessing the framebuffer at the wrong memory address. The scenario occurs when the hyperv_fb driver in the original kernel moves the framebuffer to a different MMIO address because of conflicts with an already-running efifb or simplefb driver. The hyperv_fb driver then informs Hyper-V of the change, which is allowed by the Hyper-V FB VMBus device protocol. However, when the kexec command loads the kdump kernel into crash memory via the kexec_file_load() system call, the system call doesn't know the framebuffer has moved, and it sets up the kdump screen_info using the original framebuffer address. The transition to the kdump kernel does not go through the Hyper-V host, so Hyper-V does not reset the framebuffer address like it would do on a reboot. When efifb tries to run, it accesses a non-existent framebuffer address, which traps to the Hyper-V host. After many such accesses, the Hyper-V host thinks the guest is being malicious, and throttles the guest to the point that it runs very slowly or appears to have hung. When the kdump kernel is loaded into crash memory via the kexec_load() system call, the problem does not occur. In this case, the kexec command builds the screen_info table itself in user space from data returned by the FBIOGET_FSCREENINFO ioctl against /dev/fb0, which gives it the new framebuffer location. This problem was originally reported in 2020 [1], resulting in commit 3cb73bc3fa2a ("hyperv_fb: Update screen_info after removing old framebuffer"). This commit solved the problem by setting orig_video_isVGA to 0, so the kdump kernel was unaware of the EFI framebuffer. The efifb driver did not try to load, and no hang occurred. But in 2024, commit c25a19afb81c ("fbdev/hyperv_fb: Do not clear global screen_info") effectively reverted 3cb73bc3fa2a. Commit c25a19afb81c has no reference to 3cb73bc3fa2a, so perhaps it was done without knowing the implications that were reported with 3cb73bc3fa2a. In any case, as of commit c25a19afb81c, the original problem came back again. Interestingly, the hyperv_drm driver does not have this problem because it never moves the framebuffer. The difference is that the hyperv_drm driver removes any conflicting framebuffers *before* allocating an MMIO address, while the hyperv_fb drivers removes conflicting framebuffers *after* allocating an MMIO address. With the "after" ordering, hyperv_fb may encounter a conflict and move the framebuffer to a different MMIO address. But the conflict is essentially bogus because it is removed a few lines of code later. Rather than fix the problem with the approach from 2020 in commit 3cb73bc3fa2a, instead slightly reorder the steps in hyperv_fb so conflicting framebuffers are removed before allocating an MMIO address. Then the default framebuffer MMIO address should always be available, and there's never any confusion about which framebuffer address the kdump kernel should use -- it's always the original address provided by the Hyper-V host. This approach is already used by the hyperv_drm driver, and is consistent with the usage guidelines at the head of the module with the function aperture_remove_conflicting_devices(). This approach also solves a related minor problem when kexec_load() is used to load the kdump kernel. With current code, unbinding and rebinding the hyperv_fb driver could result in the framebuffer moving back to the default framebuffer address, because on the rebind there are no conflicts. If such a move is done after the kdump kernel is loaded with the new framebuffer address, at kdump time it could again have the wrong address. This problem and fix are described in terms of the kdump kernel, but it can also occur with any kernel started via kexec. See extensive discussion of the problem and solution at [2]. [1] https://lore.kernel.org/linux-hyperv/20201014092429.1415040-1-kasong@redhat.com/ [2] https://lore.kernel.org/linux-hyperv/BLAPR10MB521793485093FDB448F7B2E5FDE92@BLAPR10MB5217.namprd10.prod.outlook.com/ Reported-by: Thomas Tai <thomas.tai@oracle.com> Fixes: c25a19afb81c ("fbdev/hyperv_fb: Do not clear global screen_info") Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250218230130.3207-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250218230130.3207-1-mhklinux@outlook.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22drm/hyperv: Fix address space leak when Hyper-V DRM device is removedMichael Kelley1-0/+2
[ Upstream commit aed709355fd05ef747e1af24a1d5d78cd7feb81e ] When a Hyper-V DRM device is probed, the driver allocates MMIO space for the vram, and maps it cacheable. If the device removed, or in the error path for device probing, the MMIO space is released but no unmap is done. Consequently the kernel address space for the mapping is leaked. Fix this by adding iounmap() calls in the device removal path, and in the error path during device probing. Fixes: f1f63cbb705d ("drm/hyperv: Fix an error handling path in hyperv_vmbus_probe()") Fixes: a0ab5abced55 ("drm/hyperv : Removing the restruction of VRAM allocation with PCI bar size") Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20250210193441.2414-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250210193441.2414-1-mhklinux@outlook.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net: mctp i2c: Copy headers if clonedMatt Johnston1-0/+5
[ Upstream commit df8ce77ba8b7c012a3edd1ca7368b46831341466 ] Use skb_cow_head() prior to modifying the TX SKB. This is necessary when the SKB has been cloned, to avoid modifying other shared clones. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver") Link: https://patch.msgid.link/20250306-matt-mctp-i2c-cow-v1-1-293827212681@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net: mctp i3c: Copy headers if clonedMatt Johnston1-0/+5
[ Upstream commit 26db9c9ee19c36a97dbb1cfef007a3c189c4c874 ] Use skb_cow_head() prior to modifying the tx skb. This is necessary when the skb has been cloned, to avoid modifying other shared clones. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Fixes: c8755b29b58e ("mctp i3c: MCTP I3C driver") Link: https://patch.msgid.link/20250306-matt-i3c-cow-head-v1-1-d5e6a5495227@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net: dsa: mv88e6xxx: Verify after ATU Load opsJoseph Huang1-11/+48
[ Upstream commit dc5340c3133a3ebe54853fd299116149e528cfaa ] ATU Load operations could fail silently if there's not enough space on the device to hold the new entry. When this happens, the symptom depends on the unknown flood settings. If unknown multicast flood is disabled, the multicast packets are dropped when the ATU table is full. If unknown multicast flood is enabled, the multicast packets will be flooded to all ports. Either way, IGMP snooping is broken when the ATU Load operation fails silently. Do a Read-After-Write verification after each fdb/mdb add operation to make sure that the operation was really successful, and return -ENOSPC otherwise. Fixes: defb05b9b9b4 ("net: dsa: mv88e6xxx: Add support for fdb_add, fdb_del, and fdb_getnext") Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250306172306.3859214-1-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22net/mlx5: Fill out devlink dev info only for PFsJiri Pirko1-0/+3
[ Upstream commit d749d901b2168389f060b654fdaa08acf6b367d2 ] Firmware version query is supported on the PFs. Due to this following kernel warning log is observed: [ 188.590344] mlx5_core 0000:08:00.2: mlx5_fw_version_query:816:(pid 1453): fw query isn't supported by the FW Fix it by restricting the query and devlink info to the PF. Fixes: 8338d9378895 ("net/mlx5: Added devlink info callback") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Link: https://patch.msgid.link/20250306212529.429329-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22wifi: iwlwifi: mvm: fix PNVM timeout for non-MSI-X platformsEmmanuel Grumbach1-3/+3
[ Upstream commit b8c8a03e9b7bfc06f366b75daf3d0812400e7123 ] When MSI-X is not enabled, we mask all the interrupts in the interrupt handler and re-enable them when the interrupt thread runs. If STATUS_INT_ENABLED is not set, we won't re-enable in the thread. In order to get the ALIVE interrupt, we allow the ALIVE interrupt itself, and RX as well in order to receive the ALIVE notification (which is received as an RX from the firmware. The problem is that STATUS_INT_ENABLED is clear until the op_mode calls trans_fw_alive which means that until trans_fw_alive is called, any notification from the firmware will not be received. This became a problem when we inserted the pnvm_load exactly between the ALIVE and trans_fw_alive. Fix that by calling trans_fw_alive before loading the PNVM. This will allow to get the notification from the firmware about PNVM load being complete and continue the flow normally. This didn't happen on MSI-X because we don't disable the interrupts in the ISR when MSI-X is available. The error in the log looks like this: iwlwifi 0000:00:03.0: Timeout waiting for PNVM load! iwlwifi 0000:00:03.0: Failed to start RT ucode: -110 iwlwifi 0000:00:03.0: WRT: Collecting data: ini trigger 13 fired (delay=0ms). Fixes: 70d3ca86b025 ("iwlwifi: mvm: ring the doorbell and wait for PNVM load completion") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250306122425.0f2cf207aae1.I025d8f724b44f52eadf6c19069352eb9275613a8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22ice: Fix switchdev slow-path in LAGMarcin Szycik2-1/+30
[ Upstream commit dce97cb0a3e34204c0b99345418a714eac85953f ] Ever since removing switchdev control VSI and using PF for port representor Tx/Rx, switchdev slow-path has been working improperly after failover in SR-IOV LAG. LAG assumes that the first uplink to be added to the aggregate will own VFs and have switchdev configured. After failing-over to the other uplink, representors are still configured to Tx through the uplink they are set up on, which fails because that uplink is now down. On failover, update all PRs on primary uplink to use the currently active uplink for Tx. Call netif_keep_dst(), as the secondary uplink might not be in switchdev mode. Also make sure to call ice_eswitch_set_target_vsi() if uplink is in LAG. On the Rx path, representors are already working properly, because default Tx from VFs is set to PF owning the eswitch. After failover the same PF is receiving traffic from VFs, even though link is down. Fixes: defd52455aee ("ice: do Tx through PF netdev in slow-path") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22ice: fix memory leak in aRFS after resetGrzegorz Nitka1-1/+1
[ Upstream commit 23d97f18901ef5e4e264e3b1777fe65c760186b5 ] Fix aRFS (accelerated Receive Flow Steering) structures memory leak by adding a checker to verify if aRFS memory is already allocated while configuring VSI. aRFS objects are allocated in two cases: - as part of VSI initialization (at probe), and - as part of reset handling However, VSI reconfiguration executed during reset involves memory allocation one more time, without prior releasing already allocated resources. This led to the memory leak with the following signature: [root@os-delivery ~]# cat /sys/kernel/debug/kmemleak unreferenced object 0xff3c1ca7252e6000 (size 8192): comm "kworker/0:0", pid 8, jiffies 4296833052 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 0): [<ffffffff991ec485>] __kmalloc_cache_noprof+0x275/0x340 [<ffffffffc0a6e06a>] ice_init_arfs+0x3a/0xe0 [ice] [<ffffffffc09f1027>] ice_vsi_cfg_def+0x607/0x850 [ice] [<ffffffffc09f244b>] ice_vsi_setup+0x5b/0x130 [ice] [<ffffffffc09c2131>] ice_init+0x1c1/0x460 [ice] [<ffffffffc09c64af>] ice_probe+0x2af/0x520 [ice] [<ffffffff994fbcd3>] local_pci_probe+0x43/0xa0 [<ffffffff98f07103>] work_for_cpu_fn+0x13/0x20 [<ffffffff98f0b6d9>] process_one_work+0x179/0x390 [<ffffffff98f0c1e9>] worker_thread+0x239/0x340 [<ffffffff98f14abc>] kthread+0xcc/0x100 [<ffffffff98e45a6d>] ret_from_fork+0x2d/0x50 [<ffffffff98e083ba>] ret_from_fork_asm+0x1a/0x30 ... Fixes: 28bf26724fdb ("ice: Implement aRFS") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22ice: do not configure destination override for switchdevLarysa Zaremba3-28/+0
[ Upstream commit 3be83ee9de0298f8321aa0b148d8f9995102e40f ] After switchdev is enabled and disabled later, LLDP packets sending stops, despite working perfectly fine before and during switchdev state. To reproduce (creating/destroying VF is what triggers the reconfiguration): devlink dev eswitch set pci/<address> mode switchdev echo '2' > /sys/class/net/<ifname>/device/sriov_numvfs echo '0' > /sys/class/net/<ifname>/device/sriov_numvfs This happens because LLDP relies on the destination override functionality. It needs to 1) set a flag in the descriptor, 2) set the VSI permission to make it valid. The permissions are set when the PF VSI is first configured, but switchdev then enables it for the uplink VSI (which is always the PF) once more when configured and disables when deconfigured, which leads to software-generated LLDP packets being blocked. Do not modify the destination override permissions when configuring switchdev, as the enabled state is the default configuration that is never modified. Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22pinctrl: nuvoton: npcm8xx: Add NULL check in npcm8xx_gpio_fwCharles Han1-0/+3
[ Upstream commit acf40ab42799e4ae1397ee6f5c5941092d66f999 ] devm_kasprintf() calls can return null pointers on failure. But the return values were not checked in npcm8xx_gpio_fw(). Add NULL check in npcm8xx_gpio_fw(), to handle kernel NULL pointer dereference error. Fixes: acf4884a5717 ("pinctrl: nuvoton: add NPCM8XX pinctrl and GPIO driver") Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://lore.kernel.org/20250212100532.4317-1-hanchunchao@inspur.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22pinctrl: bcm281xx: Fix incorrect regmap max_registers valueArtur Weber1-1/+1
[ Upstream commit 68283c1cb573143c0b7515e93206f3503616bc10 ] The max_registers value does not take into consideration the stride; currently, it's set to the number of the last pin, but this does not accurately represent the final register. Fix this by multiplying the current value by 4. Fixes: 54b1aa5a5b16 ("ARM: pinctrl: Add Broadcom Capri pinctrl driver") Signed-off-by: Artur Weber <aweber.kernel@gmail.com> Link: https://lore.kernel.org/20250207-bcm21664-pinctrl-v1-2-e7cfac9b2d3b@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22fbdev: hyperv_fb: iounmap() the correct memory when removing a deviceMichael Kelley1-1/+1
[ Upstream commit 7241c886a71797cc51efc6fadec7076fcf6435c2 ] When a Hyper-V framebuffer device is removed, or the driver is unbound from a device, any allocated and/or mapped memory must be released. In particular, MMIO address space that was mapped to the framebuffer must be unmapped. Current code unmaps the wrong address, resulting in an error like: [ 4093.980597] iounmap: bad address 00000000c936c05c followed by a stack dump. Commit d21987d709e8 ("video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver") changed the kind of address stored in info->screen_base, and the iounmap() call in hvfb_putmem() was not updated accordingly. Fix this by updating hvfb_putmem() to unmap the correct address. Fixes: d21987d709e8 ("video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver") Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20250209235252.2987-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250209235252.2987-1-mhklinux@outlook.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutexAlexey Kardashevskiy1-24/+39
commit 3e385c0d6ce88ac9916dcf84267bd5855d830748 upstream. Compared to the SNP Guest Request, the "Extended" version adds data pages for receiving certificates. If not enough pages provided, the HV can report to the VM how much is needed so the VM can reallocate and repeat. Commit ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") moved handling of the allocated/desired pages number out of scope of said mutex and create a possibility for a race (multiple instances trying to trigger Extended request in a VM) as there is just one instance of snp_msg_desc per /dev/sev-guest and no locking other than snp_cmd_mutex. Fix the issue by moving the data blob/size and the GHCB input struct (snp_req_data) into snp_guest_req which is allocated on stack now and accessed by the GHCB caller under that mutex. Stop allocating SEV_FW_BLOB_MAX_SIZE in snp_msg_alloc() as only one of four callers needs it. Free the received blob in get_ext_report() right after it is copied to the userspace. Possible future users of snp_send_guest_request() are likely to have different ideas about the buffer size anyways. Fixes: ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307013700.437505-3-aik@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-13wifi: iwlwifi: pcie: Fix TSO preparationIlan Peer1-5/+6
commit bbb18f7e23a3f5f56d5c8b4ee0f78f00edb3b1b2 upstream. The allocation of the scatter gather data structure should be done based on the number of memory chunks that need to be mapped, and it is not dependent on the overall payload length. Fix it. In addition, as the skb_to_sgvec() function returns an 'int' do