summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2021-09-03IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()Tuo Li1-5/+4
[ Upstream commit cbe71c61992c38f72c2b625b2ef25916b9f0d060 ] kmalloc_array() is called to allocate memory for tx->descp. If it fails, the function __sdma_txclean() is called: __sdma_txclean(dd, tx); However, in the function __sdma_txclean(), tx-descp is dereferenced if tx->num_desc is not zero: sdma_unmap_desc(dd, &tx->descp[0]); To fix this possible null-pointer dereference, assign the return value of kmalloc_array() to a local variable descp, and then assign it to tx->descp if it is not NULL. Otherwise, go to enomem. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-03RDMA/bnxt_re: Add missing spin lock initializationNaresh Kumar PBS1-0/+1
[ Upstream commit 17f2569dce1848080825b8336e6b7c6900193b44 ] Add the missing initialization of srq lock. Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Link: https://lore.kernel.org/r/1629343553-5843-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-18net/mlx5: Synchronize correct IRQ when destroying CQShay Drory2-5/+2
[ Upstream commit 563476ae0c5e48a028cbfa38fa9d2fc0418eb88f ] The CQ destroy is performed based on the IRQ number that is stored in cq->irqn. That number wasn't set explicitly during CQ creation and as expected some of the API users of mlx5_core_create_cq() forgot to update it. This caused to wrong synchronization call of the wrong IRQ with a number 0 instead of the real one. As a fix, set the IRQ number directly in the mlx5_core_create_cq() and update all users accordingly. Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Fixes: ef1659ade359 ("IB/mlx5: Add DEVX support for CQ events") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-12RDMA/mlx5: Delay emptying a cache entry when a new MR is added to it recentlyAharon Landau1-2/+2
[ Upstream commit d6793ca97b76642b77629dd0783ec64782a50bdb ] Fixing a typo that causes a cache entry to shrink immediately after adding to it new MRs if the entry size exceeds the high limit. In doing so, the cache misses its purpose to prevent the creation of new mkeys on the runtime by using the cached ones. Fixes: b9358bdbc713 ("RDMA/mlx5: Fix locking in MR cache work queue") Link: https://lore.kernel.org/r/fcb546986be346684a016f5ca23a0567399145fa.1627370131.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-04RDMA/bnxt_re: Fix stats countersNaresh Kumar PBS3-7/+8
[ Upstream commit 0c23af52ccd1605926480b5dfd1dd857ef604611 ] Statistical counters are not incrementing in some adapter versions with newer FW. This is due to the stats context length mismatch between FW and driver. Since the L2 driver updates the length correctly, use the stats length from L2 driver while allocating the DMA'able memory and creating the stats context. Fixes: 9d6b648c3112 ("bnxt_en: Update firmware interface spec to 1.10.1.65.") Link: https://lore.kernel.org/r/1626010296-6076-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19RDMA/cma: Fix rdma_resolve_route() memory leakGerd Rausch1-1/+2
[ Upstream commit 74f160ead74bfe5f2b38afb4fcf86189f9ff40c9 ] Fix a memory leak when "mda_resolve_route() is called more than once on the same "rdma_cm_id". This is possible if cma_query_handler() triggers the RDMA_CM_EVENT_ROUTE_ERROR flow which puts the state machine back and allows rdma_resolve_route() to be called again. Link: https://lore.kernel.org/r/f6662b7b-bdb7-2706-1e12-47c61d3474b6@oracle.com Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19IB/isert: Align target max I/O size to initiator sizeMax Gurtovoy2-5/+2
[ Upstream commit 109d19a5eb3ddbdb87c43bfd4bcf644f4569da64 ] Since the Linux iser initiator default max I/O size set to 512KB and since there is no handshake procedure for this size in iser protocol, set the default max IO size of the target to 512KB as well. For changing the default values, there is a module parameter for both drivers. Link: https://lore.kernel.org/r/20210524085215.29005-1-mgurtovoy@nvidia.com Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19RDMA/rxe: Don't overwrite errno from ib_umem_get()Xiao Yang1-1/+1
[ Upstream commit 20ec0a6d6016aa28b9b3299be18baef1a0f91cd2 ] rxe_mr_init_user() always returns the fixed -EINVAL when ib_umem_get() fails so it's hard for user to know which actual error happens in ib_umem_get(). For example, ib_umem_get() will return -EOPNOTSUPP when trying to pin pages on a DAX file. Return actual error as mlx4/mlx5 does. Link: https://lore.kernel.org/r/20210621071456.4259-1-ice_yangxiao@163.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19RDMA/cxgb4: Fix missing error code in create_qp()Jiapeng Chong1-0/+1
[ Upstream commit aeb27bb76ad8197eb47890b1ff470d5faf8ec9a5 ] The error code is missing in this code scenario so 0 will be returned. Add the error code '-EINVAL' to the return value 'ret'. Eliminates the follow smatch warning: drivers/infiniband/hw/cxgb4/qp.c:298 create_qp() warn: missing error code 'ret'. Link: https://lore.kernel.org/r/1622545669-20625-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19RDMA/rtrs: Change MAX_SESS_QUEUE_DEPTHGioh Kim1-5/+8
[ Upstream commit 3a98ea7041b7d18ac356da64823c2ba2f8391b3e ] Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS) and the minimum chunk size is 4096 (2^12). Therefore the maximum sess_queue_depth is 65536 (2^16). Link: https://lore.kernel.org/r/20210528113018.52290-6-jinpu.wang@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/core: Always release restrack objectLeon Romanovsky1-1/+1
[ Upstream commit 3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc ] Change location of rdma_restrack_del() to fix the bug where task_struct was acquired but not released, causing to resource leak. ucma_create_id() { ucma_alloc_ctx(); rdma_create_user_id() { rdma_restrack_new(); rdma_restrack_set_name() { rdma_restrack_attach_task.part.0(); <--- task_struct was gotten } } ucma_destroy_private_ctx() { ucma_put_ctx(); rdma_destroy_id() { _destroy_id() <--- id_priv was freed } } } Fixes: 889d916b6f8a ("RDMA/core: Don't access cm_id after its destruction") Link: https://lore.kernel.org/r/073ec27acb943ca8b6961663c47c5abe78a5c8cc.1624948948.git.leonro@nvidia.com Reported-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/mlx5: Don't access NULL-cleared mpi pointerLeon Romanovsky1-1/+1
[ Upstream commit 4a754d7637026b42b0c9ba5787ad5ee3bc2ff77f ] The "dev->port[i].mp.mpi" is set to NULL during mlx5_ib_unbind_slave_port() execution, however that field is needed to add device to unaffiliated list. Such flow causes to the following kernel panic while unloading mlx5_ib module in multi-port mode, hence the device should be added to the list prior to unbind call. RPC: Unregistered rdma transport module. RPC: Unregistered rdma backchannel transport module. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP NOPTI CPU: 4 PID: 1904 Comm: modprobe Not tainted 5.13.0-rc7_for_upstream_min_debug_2021_06_24_12_08 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5_ib_cleanup_multiport_master+0x18b/0x2d0 [mlx5_ib] Code: 00 04 0f 85 c4 00 00 00 48 89 df e8 ef fa ff ff 48 8b 83 40 0d 00 00 48 8b 15 b9 e8 05 00 4a 8b 44 28 20 48 89 05 ad e8 05 00 <48> c7 00 d0 57 c5 a0 48 89 50 08 48 89 02 39 ab 88 0a 00 00 0f 86 RSP: 0018:ffff888116ee3df8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffff8881154f6000 RCX: 0000000000000080 RDX: ffffffffa0c557d0 RSI: ffff88810b69d200 RDI: 000000000002d8a0 RBP: 0000000000000002 R08: ffff888110780408 R09: 0000000000000000 R10: ffff88812452e1c0 R11: fffffffffff7e028 R12: 0000000000000000 R13: 0000000000000080 R14: ffff888102c58000 R15: 0000000000000000 FS: 00007f884393a740(0000) GS:ffff8882f5a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001249f6004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_ib_stage_init_cleanup+0x16/0xd0 [mlx5_ib] __mlx5_ib_remove+0x33/0x90 [mlx5_ib] mlx5r_remove+0x22/0x30 [mlx5_ib] auxiliary_bus_remove+0x18/0x30 __device_release_driver+0x177/0x220 driver_detach+0xc4/0x100 bus_remove_driver+0x58/0xd0 auxiliary_driver_unregister+0x12/0x20 mlx5_ib_cleanup+0x13/0x897 [mlx5_ib] __x64_sys_delete_module+0x154/0x230 ? exit_to_user_mode_prepare+0x104/0x140 do_syscall_64+0x3f/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f8842e095c7 Code: 73 01 c3 48 8b 0d d9 48 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 48 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc68f6e758 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 00005638207929c0 RCX: 00007f8842e095c7 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000563820792a28 RBP: 00005638207929c0 R08: 00007ffc68f6d701 R09: 0000000000000000 R10: 00007f8842e82880 R11: 0000000000000206 R12: 0000563820792a28 R13: 0000000000000001 R14: 0000563820792a28 R15: 00007ffc68f6fb40 Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(-) mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core mlx5_core ptp pps_core [last unloaded: rpcrdma] CR2: 0000000000000000 ---[ end trace a0bb7e20804e9e9b ]--- Fixes: 7ce6095e3bff ("RDMA/mlx5: Don't add slave port to unaffiliated list") Link: https://lore.kernel.org/r/899ac1b33a995be5ec0e16a4765c4e43c2b1ba5b.1624956444.git.leonro@nvidia.com Reviewed-by: Itay Aveksis <itayav@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/cma: Fix incorrect Packet Lifetime calculationHåkon Bugge1-2/+4
[ Upstream commit e84045eab69c625bc0b0bf24d8e05bc65da1eed1 ] An approximation for the PacketLifeTime is half the local ACK timeout. The encoding for both timers are logarithmic. If the local ACK timeout is set, but zero, it means the timer is disabled. In this case, we choose the CMA_IBOE_PACKET_LIFETIME value, since 50% of infinite makes no sense. Before this commit, the PacketLifeTime became 255 if local ACK timeout was zero (not running). Fixed by explicitly testing for timeout being zero. Fixes: e1ee1e62bec4 ("RDMA/cma: Use ACK timeout for RoCE packetLifeTime") Link: https://lore.kernel.org/r/1624371207-26710-1-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/cma: Protect RMW with qp_mutexHåkon Bugge1-1/+17
[ Upstream commit ca0c448d2b9f43e3175835d536853854ef544e22 ] The struct rdma_id_private contains three bit-fields, tos_set, timeout_set, and min_rnr_timer_set. These are set by accessor functions without any synchronization. If two or all accessor functions are invoked in close proximity in time, there will be Read-Modify-Write from several contexts to the same variable, and the result will be intermittent. Fixed by protecting the bit-fields by the qp_mutex in the accessor functions. The consumer of timeout_set and min_rnr_timer_set is in rdma_init_qp_attr(), which is called with qp_mutex held for connected QPs. Explicit locking is added for the consumers of tos and tos_set. This commit depends on ("RDMA/cma: Remove unnecessary INIT->INIT transition"), since the call to rdma_init_qp_attr() from cma_init_conn_qp() does not hold the qp_mutex. Fixes: 2c1619edef61 ("IB/cma: Define option to set ack timeout and pack tos_set") Fixes: 3aeffc46afde ("IB/cma: Introduce rdma_set_min_rnr_timer()") Link: https://lore.kernel.org/r/1624369197-24578-3-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs-srv: Set minimal max_send_wr and max_recv_wrJack Wang1-13/+25
[ Upstream commit 5e91eabf66c854f16ca2e954e5c68939bc81601e ] Currently rtrs when create_qp use a coarse numbers (bigger in general), which leads to hardware create more resources which only waste memory with no benefits. For max_send_wr, we don't really need alway max_qp_wr size when creating qp, reduce it to cq_size. For max_recv_wr, cq_size is enough. With the patch when sess_queue_depth=128, per session (2 paths) memory consumption reduced from 188 MB to 65MB When always_invalidate is enabled, we need send more wr, so treat it special. Fixes: 9cb837480424e ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20210614090337.29557-2-jinpu.wang@ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rxe: Fix qp reference counting for atomic opsBob Pearson2-3/+0
[ Upstream commit 15ae1375ea91ae2dee6f12d71a79d8c0a10a30bf ] Currently the rdma_rxe driver attempts to protect atomic responder resources by taking a reference to the qp which is only freed when the resource is recycled for a new read or atomic operation. This means that in normal circumstances there is almost always an extra qp reference once an atomic operation has been executed which prevents cleaning up the qp and associated pd and cqs when the qp is destroyed. This patch removes the call to rxe_add_ref() in send_atomic_ack() and the call to rxe_drop_ref() in free_rd_atomic_resource(). If the qp is destroyed while a peer is retrying an atomic op it will cause the operation to fail which is acceptable. Link: https://lore.kernel.org/r/20210604230558.4812-1-rpearsonhpe@gmail.com Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com> Fixes: 86af61764151 ("IB/rxe: remove unnecessary skb_clone") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/mlx5: Don't add slave port to unaffiliated listLeon Romanovsky1-2/+2
[ Upstream commit 7ce6095e3bff8e20ce018b050960b527e298f7df ] The mlx5_ib_bind_slave_port() doesn't remove multiport device from the unaffiliated list, but mlx5_ib_unbind_slave_port() did it. This unbalanced flow caused to the situation where mlx5_ib_unaffiliated_port_list was changed during iteration. Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Link: https://lore.kernel.org/r/2726e6603b1e6ecfe76aa5a12a063af72173bcf7.1622477058.git.leonro@nvidia.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rxe: Fix failure during driver loadKamal Heib1-3/+7
[ Upstream commit 32a25f2ea690dfaace19f7a3a916f5d7e1ddafe8 ] To avoid the following failure when trying to load the rdma_rxe module while IPv6 is disabled, add a check for EAFNOSUPPORT and ignore the failure, also delete the needless debug print from rxe_setup_udp_tunnel(). $ modprobe rdma_rxe modprobe: ERROR: could not insert 'rdma_rxe': Operation not permitted Fixes: dfdd6158ca2c ("IB/rxe: Fix kernel panic in udp_setup_tunnel") Link: https://lore.kernel.org/r/20210603090112.36341-1-kamalheib1@gmail.com Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/core: Sanitize WQ state received from the userspaceLeon Romanovsky3-13/+23
[ Upstream commit f97442887275d11c88c2899e720fe945c1f61488 ] The mlx4 and mlx5 implemented differently the WQ input checks. Instead of duplicating mlx4 logic in the mlx5, let's prepare the input in the central place. The mlx5 implementation didn't check for validity of state input. It is not real bug because our FW checked that, but still worth to fix. Fixes: f213c0527210 ("IB/uverbs: Add WQ support") Link: https://lore.kernel.org/r/ac41ad6a81b095b1a8ad453dcf62cf8d3c5da779.1621413310.git.leonro@nvidia.com Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs-clt: Fix memory leak of not-freed sess->stats and stats->pcpu_statsGioh Kim1-0/+6
[ Upstream commit 7ecd7e290bee0ab9cf75b79a367a4cc113cf8292 ] sess->stats and sess->stats->pcpu_stats objects are freed when sysfs entry is removed. If something wrong happens and session is closed before sysfs entry is created, sess->stats and sess->stats->pcpu_stats objects are not freed. This patch adds freeing of them at three places: 1. When client uses wrong address and session creation fails. 2. When client fails to create a sysfs entry. 3. When client adds wrong address via sysfs add_path. Fixes: 215378b838df0 ("RDMA/rtrs: client: sysfs interface functions") Link: https://lore.kernel.org/r/20210528113018.52290-21-jinpu.wang@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs-clt: Check if the queue_depth has changed during a reconnectionMd Haris Iqbal1-4/+15
[ Upstream commit 5b73b799c25c68a4703cd6c5ac4518006d9865b8 ] The queue_depth is a module parameter for rtrs_server. It is used on the client side to determing the queue_depth of the request queue for the RNBD virtual block device. During a reconnection event for an already mapped device, in case the rtrs_server module queue_depth has changed, fail the reconnect attempt. Also stop further auto reconnection attempts. A manual reconnect via sysfs has to be triggerred. Fixes: 6a98d71daea18 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210528113018.52290-20-jinpu.wang@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs-srv: Fix memory leak when having multiple sessionsJack Wang1-0/+1
[ Upstream commit 6bb97a2c1aa5278a30d49abb6186d50c34c207e2 ] Gioh notice memory leak below unreferenced object 0xffff8880acda2000 (size 2048): comm "kworker/4:1", pid 77, jiffies 4295062871 (age 1270.730s) hex dump (first 32 bytes): 00 20 da ac 80 88 ff ff 00 20 da ac 80 88 ff ff . ....... ...... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000e85d85b5>] rtrs_srv_rdma_cm_handler+0x8e5/0xa90 [rtrs_server] [<00000000e31a988a>] cma_ib_req_handler+0xdc5/0x2b50 [rdma_cm] [<000000000eb02c5b>] cm_process_work+0x2d/0x100 [ib_cm] [<00000000e1650ca9>] cm_req_handler+0x11bc/0x1c40 [ib_cm] [<000000009c28818b>] cm_work_handler+0xe65/0x3cf2 [ib_cm] [<000000002b53eaa1>] process_one_work+0x4bc/0x980 [<00000000da3499fb>] worker_thread+0x78/0x5c0 [<00000000167127a4>] kthread+0x191/0x1e0 [<0000000060802104>] ret_from_fork+0x3a/0x50 unreferenced object 0xffff88806d595d90 (size 8): comm "kworker/4:1H", pid 131, jiffies 4295062972 (age 1269.720s) hex dump (first 8 bytes): 62 6c 61 00 6b 6b 6b a5 bla.kkk. backtrace: [<000000004447d253>] kstrdup+0x2e/0x60 [<0000000047259793>] kobject_set_name_vargs+0x2f/0xb0 [<00000000c2ee3bc8>] dev_set_name+0xab/0xe0 [<000000002b6bdfb1>] rtrs_srv_create_sess_files+0x260/0x290 [rtrs_server] [<0000000075d87bd7>] rtrs_srv_info_req_done+0x71b/0x960 [rtrs_server] [<00000000ccdf1bb5>] __ib_process_cq+0x94/0x100 [ib_core] [<00000000cbcb60cb>] ib_cq_poll_work+0x32/0xc0 [ib_core] [<000000002b53eaa1>] process_one_work+0x4bc/0x980 [<00000000da3499fb>] worker_thread+0x78/0x5c0 [<00000000167127a4>] kthread+0x191/0x1e0 [<0000000060802104>] ret_from_fork+0x3a/0x50 unreferenced object 0xffff88806d6bb100 (size 256): comm "kworker/4:1H", pid 131, jiffies 4295062972 (age 1269.720s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 00 59 4d 86 ff ff ff ff .........YM..... backtrace: [<00000000a18a11e4>] device_add+0x74d/0xa00 [<00000000a915b95f>] rtrs_srv_create_sess_files.cold+0x49/0x1fe [rtrs_server] [<0000000075d87bd7>] rtrs_srv_info_req_done+0x71b/0x960 [rtrs_server] [<00000000ccdf1bb5>] __ib_process_cq+0x94/0x100 [ib_core] [<00000000cbcb60cb>] ib_cq_poll_work+0x32/0xc0 [ib_core] [<000000002b53eaa1>] process_one_work+0x4bc/0x980 [<00000000da3499fb>] worker_thread+0x78/0x5c0 [<00000000167127a4>] kthread+0x191/0x1e0 [<0000000060802104>] ret_from_fork+0x3a/0x50 The problem is we increase device refcount by get_device in process_info_req for each path, but only does put_deice for last path, which lead to memory leak. To fix it, it also calls put_device when dev_ref is not 0. Fixes: e2853c49477d1 ("RDMA/rtrs-srv-sysfs: fix missing put_device") Link: https://lore.kernel.org/r/20210528113018.52290-19-jinpu.wang@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs-srv: Fix memory leak of unfreed rtrs_srv_stats objectGioh Kim1-0/+1
[ Upstream commit 2371c40354509746e4a4dad09a752e027a30f148 ] When closing a session, currently the rtrs_srv_stats object in the closing session is freed by kobject release. But if it failed to create a session by various reasons, it must free the rtrs_srv_stats object directly because kobject is not created yet. This problem is found by kmemleak as below: 1. One client machine maps /dev/nullb0 with session name 'bla': root@test1:~# echo "sessname=bla path=ip:192.168.122.190 \ device_path=/dev/nullb0" > /sys/devices/virtual/rnbd-client/ctl/map_device 2. Another machine failed to create a session with the same name 'bla': root@test2:~# echo "sessname=bla path=ip:192.168.122.190 \ device_path=/dev/nullb1" > /sys/devices/virtual/rnbd-client/ctl/map_device -bash: echo: write error: Connection reset by peer 3. The kmemleak on server machine reported an error: unreferenced object 0xffff888033cdc800 (size 128): comm "kworker/2:1", pid 83, jiffies 4295086585 (age 2508.680s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000a72903b2>] __alloc_sess+0x1d4/0x1250 [rtrs_server] [<00000000d1e5321e>] rtrs_srv_rdma_cm_handler+0xc31/0xde0 [rtrs_server] [<00000000bb2f6e7e>] cma_ib_req_handler+0xdc5/0x2b50 [rdma_cm] [<00000000e896235d>] cm_process_work+0x2d/0x100 [ib_cm] [<00000000b6866c5f>] cm_req_handler+0x11bc/0x1c40 [ib_cm] [<000000005f5dd9aa>] cm_work_handler+0xe65/0x3cf2 [ib_cm] [<00000000610151e7>] process_one_work+0x4bc/0x980 [<00000000541e0f77>] worker_thread+0x78/0x5c0 [<00000000423898ca>] kthread+0x191/0x1e0 [<000000005a24b239>] ret_from_fork+0x3a/0x50 Fixes: 39c2d639ca183 ("RDMA/rtrs-srv: Set .release function for rtrs srv device during device init") Link: https://lore.kernel.org/r/20210528113018.52290-18-jinpu.wang@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs: Do not reset hb_missed_max after re-connectionGioh Kim1-1/+0
[ Upstream commit 64bce1ee978491a779eb31098b21c57d4e431d6a ] When re-connecting, it resets hb_missed_max to 0. Before the first re-connecting, client will trigger re-connection when it gets hb-ack more than 5 times. But after the first re-connecting, clients will do re-connection whenever it does not get hb-ack because hb_missed_max is 0. There is no need to reset hb_missed_max when re-connecting. hb_missed_max should be kept until closing the session. Fixes: c0894b3ea69d3 ("RDMA/rtrs: core: lib functions shared between client and server modules") Link: https://lore.kernel.org/r/20210528113018.52290-16-jinpu.wang@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its statsMd Haris Iqbal1-0/+3
[ Upstream commit 41db63a7efe1c8c2dd282c1849a6ebfbbedbaf67 ] When get_next_path_min_inflight is called to select the next path, it iterates over the list of available rtrs_clt_sess (paths). It then reads the number of inflight IOs for that path to select one which has the least inflight IO. But it may so happen that rtrs_clt_sess (path) is no longer in the connected state because closing or error recovery paths can change the status of the rtrs_clt_Sess. For example, the client sent the heart-beat and did not get the response, it would change the session status and stop IO processing. The added checking of this patch can prevent accessing the broken path and generating duplicated error messages. It is ok if the status is changed after checking the status because the error recovery path does not free memory and only tries to reconnection. And also it is ok if the session is closed after checking the status because closing the session changes the session status and flush all IO beforing free memory. If the session is being accessed for IO processing, the closing session will wait. Fixes: 6a98d71daea18 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210528113018.52290-13-jinpu.wang@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/srp: Fix a recently introduced memory leakBart Van Assche1-7/+6
[ Upstream commit 7ec2e27a3afff64c96bfe7a77685c33619db84be ] Only allocate a memory registration list if it will be used and if it will be freed. Link: https://lore.kernel.org/r/20210524041211.9480-5-bvanassche@acm.org Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Fixes: f273ad4f8d90 ("RDMA/srp: Remove support for FMR memory registration") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-07RDMA/mlx5: Block FDB rules when not in switchdev modeMark Bloch1-0/+7
commit edc0b0bccc9c80d9a44d3002dcca94984b25e7cf upstream. Allow creating FDB steering rules only when in switchdev mode. The only software model where a userspace application can manipulate FDB entries is when it manages the eswitch. This is only possible in switchdev mode where we expose a single RDMA device with representors for all the vports that are connected to the eswitch. Fixes: 52438be44112 ("RDMA/mlx5: Allow inserting a steering rule to the FDB") Link: https://lore.kernel.org/r/e928ae7c58d07f104716a2a8d730963d1bd01204.1623052923.git.leonro@nvidia.com Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> [sudip: use old mlx5_eswitch_mode] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16IB/mlx5: Fix initializing CQ fragments bufferAlaa Hleihel1-5/+4
commit 2ba0aa2feebda680ecfc3c552e867cf4d1b05a3a upstream. The function init_cq_frag_buf() can be called to initialize the current CQ fragments buffer cq->buf, or the temporary cq->resize_buf that is filled during CQ resize operation. However, the offending commit started to use function get_cqe() for getting the CQEs, the issue with this change is that get_cqe() always returns CQEs from cq->buf, which leads us to initialize the wrong buffer, and in case of enlarging the CQ we try to access elements beyond the size of the current cq->buf and eventually hit a kernel panic. [exception RIP: init_cq_frag_buf+103] [ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib] [ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core] [ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt] [ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt] [ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt] [ffff9f799ddcbec8] kthread at ffffffffa66c5da1 [ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that takes the correct source buffer as a parameter. Fixes: 388ca8be0037 ("IB/mlx5: Implement fragmented completion queue (CQ)") Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com Signed-off-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16RDMA/mlx4: Do not map the core_clock page to user space unless enabledShay Drory1-4/+1
commit 404e5a12691fe797486475fe28cc0b80cb8bef2c upstream. Currently when mlx4 maps the hca_core_clock page to the user space there are read-modifiable registers, one of which is semaphore, on this page as well as the clock counter. If user reads the wrong offset, it can modify the semaphore and hang the device. Do not map the hca_core_clock page to the user space unless the device has been put in a backwards compatibility mode to support this feature. After this patch, mlx4 core_clock won't be mapped to user space on the majority of existing devices and the uverbs device time feature in ibv_query_rt_values_ex() will be disabled. Fixes: 52033cfb5aab ("IB/mlx4: Add mmap call to map the hardware clock") Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16RDMA/ipoib: Fix warning caused by destroying non-initial netnsKamal Heib1-0/+1
commit a3e74fb9247cd530dca246699d5eb5a691884d32 upstream. After the commit 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces"), if the IPoIB device is moved to non-initial netns, destroying that netns lets the device vanish instead of moving it back to the initial netns, This is happening because default_device_exit() skips the interfaces due to having rtnl_link_ops set. Steps to reporoduce: ip netns add foo ip link set mlx5_ib0 netns foo ip netns delete foo WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50 Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d fuse CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S W 5.13.0-rc1+ #1 Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016 Workqueue: netns cleanup_net RIP: 0010:netdev_exit+0x3f/0x50 Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48 8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206 RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00 RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00 R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620 R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20 FS: 0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ops_exit_list.isra.9+0x36/0x70 cleanup_net+0x234/0x390 process_one_work+0x1cb/0x360 ? process_one_work+0x360/0x360 worker_thread+0x30/0x370 ? process_one_work+0x360/0x360 kthread+0x116/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 To avoid the above warning and later on the kernel panic that could happen on shutdown due to a NULL pointer dereference, make sure to set the netns_refund flag that was introduced by commit 3a5ca857079e ("can: dev: Move device back to init netns on owning netns delete") to properly restore the IPoIB interfaces to the initial netns. Fixes: 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces") Link: https://lore.kernel.org/r/20210525150134.139342-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26RDMA/uverbs: Fix a NULL vs IS_ERR() bugDan Carpenter1-2/+2
[ Upstream commit 463a3f66473b58d71428a1c3ce69ea52c05440e5 ] The uapi_get_object() function returns error pointers, it never returns NULL. Fixes: 149d3845f4a5 ("RDMA/uverbs: Add a method to introspect handles in a context") Link: https://lore.kernel.org/r/YJ6Got+U7lz+3n9a@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/mlx5: Fix query DCT via DEVXMaor Gottlieb1-4/+2
[ Upstream commit cfa3b797118eda7d68f9ede9b1a0279192aca653 ] When executing DEVX command to query QP object, we need to take the QP type from the mlx5_ib_qp struct which hold the driver specific QP types as well, such as DC. Fixes: 34613eb1d2ad ("IB/mlx5: Enable modify and query verbs objects via DEVX") Link: https://lore.kernel.org/r/6eee15d63f09bb70787488e0cf96216e2957f5aa.1621413654.git.leonro@nvidia.com Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/core: Don't access cm_id after its destructionShay Drory1-2/+3
[ Upstream commit 889d916b6f8a48b8c9489fffcad3b78eedd01a51 ] restrack should only be attached to a cm_id while the ID has a valid device pointer. It is set up when the device is first loaded, but not cleared when the device is removed. There is also two copies of the device pointer, one private and one in the public API, and these were left out of sync. Make everything go to NULL together and manipulate restrack right around the device assignments. Found by syzcaller: BUG: KASAN: wild-memory-access in __list_del include/linux/list.h:112 [inline] BUG: KASAN: wild-memory-access in __list_del_entry include/linux/list.h:135 [inline] BUG: KASAN: wild-memory-access in list_del include/linux/list.h:146 [inline] BUG: KASAN: wild-memory-access in cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 Write of size 8 at addr dead000000000108 by task syz-executor716/334 CPU: 0 PID: 334 Comm: syz-executor716 Not tainted 5.11.0+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0xbe/0xf9 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 __list_del include/linux/list.h:112 [inline] __list_del_entry include/linux/list.h:135 [inline] list_del include/linux/list.h:146 [inline] cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 _destroy_id+0x29/0x460 drivers/infiniband/core/cma.c:1862 ucma_close_id+0x36/0x50 drivers/infiniband/core/ucma.c:185 ucma_destroy_private_ctx+0x58d/0x5b0 drivers/infiniband/core/ucma.c:576 ucma_close+0x91/0xd0 drivers/infiniband/core/ucma.c:1797 __fput+0x169/0x540 fs/file_table.c:280 task_work_run+0xb7/0x100 kernel/task_work.c:140 exit_task_work include/linux/task_work.h:30 [inline] do_exit+0x7da/0x17f0 kernel/exit.c:825 do_group_exit+0x9e/0x190 kernel/exit.c:922 __do_sys_exit_group kernel/exit.c:933 [inline] __se_sys_exit_group kernel/exit.c:931 [inline] __x64_sys_exit_group+0x2d/0x30 kernel/exit.c:931 do_syscall_64+0x2d/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 255d0c14b375 ("RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count") Link: https://lore.kernel.org/r/3352ee288fe34f2b44220457a29bfc0548686363.1620711734.git.leonro@nvidia.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/mlx5: Recover from fatal event in dual port modeMaor Gottlieb1-0/+1
[ Upstream commit 97f30d324ce6645a4de4ffb71e4ae9b8ca36ff04 ] When there is fatal event on the slave port, the device is marked as not active. We need to mark it as active again when the slave is recovered to regain full functionality. Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue") Link: https://lore.kernel.org/r/8906754455bb23019ef223c725d2c0d38acfb80b.1620711734.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/rxe: Clear all QP fields if creation failedLeon Romanovsky1-0/+7
[ Upstream commit 67f29896fdc83298eed5a6576ff8f9873f709228 ] rxe_qp_do_cleanup() relies on valid pointer values in QP for the properly created ones, but in case rxe_qp_from_init() failed it was filled with garbage and caused tot the following error. refcount_t: underflow; use-after-free. WARNING: CPU: 1 PID: 12560 at lib/refcount.c:28 refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Modules linked in: CPU: 1 PID: 12560 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Code: e9 db fe ff ff 48 89 df e8 2c c2 ea fd e9 8a fe ff ff e8 72 6a a7 fd 48 c7 c7 e0 b2 c1 89 c6 05 dc 3a e6 09 01 e8 ee 74 fb 04 <0f> 0b e9 af fe ff ff 0f 1f 84 00 00 00 00 00 41 56 41 55 41 54 55 RSP: 0018:ffffc900097ceba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815bb075 RDI: fffff520012f9d67 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815b4eae R11: 0000000000000000 R12: ffff8880322a4800 R13: ffff8880322a4940 R14: ffff888033044e00 R15: 0000000000000000 FS: 00007f6eb2be3700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdbe5d41000 CR3: 000000001d181000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] kref_put include/linux/kref.h:64 [inline] rxe_qp_do_cleanup+0x96f/0xaf0 drivers/infiniband/sw/rxe/rxe_qp.c:805 execute_in_process_context+0x37/0x150 kernel/workqueue.c:3327 rxe_elem_release+0x9f/0x180 drivers/infiniband/sw/rxe/rxe_pool.c:391 kref_put include/linux/kref.h:65 [inline] rxe_create_qp+0x2cd/0x310 drivers/infiniband/sw/rxe/rxe_verbs.c:425 _ib_create_qp drivers/infiniband/core/core_priv.h:331 [inline] ib_create_named_qp+0x2ad/0x1370 drivers/infiniband/core/verbs.c:1231 ib_create_qp include/rdma/ib_verbs.h:3644 [inline] create_mad_qp+0x177/0x2d0 drivers/infiniband/core/mad.c:2920 ib_mad_port_open drivers/infiniband/core/mad.c:3001 [inline] ib_mad_init_device+0xd6f/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:717 enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331 ib_register_device drivers/infiniband/core/device.c:1413 [inline] ib_register_device+0x7c7/0xa50 drivers/infiniband/core/device.c:1365 rxe_register_device+0x3d5/0x4a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1147 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:503 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x550 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/7bf8d548764d406dbbbaf4b574960ebfd5af8387.1620717918.git.leonro@nvidia.com Reported-by: syzbot+36a7f280de4e11c6f04e@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/core: Prevent divide-by-zero error triggered by the userLeon Romanovsky1-0/+3
[ Upstream commit 54d87913f147a983589923c7f651f97de9af5be1 ] The user_entry_size is supplied by the user and later used as a denominator to calculate number of entries. The zero supp