linux.git - Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Age	Commit message (Collapse)	Author	Files	Lines
2021-11-18	scsi: dc395: Fix error case unwinding	Tong Zhang	1	-0/+1
	[ Upstream commit cbd9a3347c757383f3d2b50cf7cfd03eb479c481 ] dc395x_init_one()->adapter_init() might fail. In this case, the acb is already cleaned up by adapter_init(), no need to do that in adapter_uninit(acb) again. [ 1.252251] dc395x: adapter init failed [ 1.254900] RIP: 0010:adapter_uninit+0x94/0x170 [dc395x] [ 1.260307] Call Trace: [ 1.260442] dc395x_init_one.cold+0x72a/0x9bb [dc395x] Link: https://lore.kernel.org/r/20210907040702.1846409-1-ztong0001@gmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	scsi: qla2xxx: Fix unmap of already freed sgl	Dmitry Bogdanov	1	-9/+5
	[ Upstream commit 4a8f71014b4d56c4fb287607e844c0a9f68f46d9 ] The sgl is freed in the target stack in target_release_cmd_kref() before calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that causes a panic if sgl is not yet DMA unmapped: NIP dma_direct_unmap_sg+0xdc/0x180 LR dma_direct_unmap_sg+0xc8/0x180 Call Trace: ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable) dma_unmap_sg_attrs+0x54/0xf0 qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx] qlt_free_cmd+0x124/0x1d0 [qla2xxx] tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx] target_put_sess_cmd+0x198/0x370 [target_core_mod] transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod] tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx] The sgl may be left unmapped in error cases of response sending. For instance, qlt_rdy_to_xfer() maps sgl and exits when session is being deleted keeping the sgl mapped. This patch removes use-after-free of the sgl and ensures that the sgl is unmapped for any command that was not sent to firmware. Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	scsi: qla2xxx: Return -ENOMEM if kzalloc() fails	Zheyu Ma	1	-1/+1
	[ Upstream commit 06634d5b6e923ed0d4772aba8def5a618f44c7fe ] The driver probing function should return < 0 for failure, otherwise kernel will treat value > 0 as success. Link: https://lore.kernel.org/r/1634522181-31166-1-git-send-email-zheyuma97@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	scsi: qla2xxx: Fix use after free in eh_abort path	Quinn Tran	1	-3/+5
	commit 3d33b303d4f3b74a71bede5639ebba3cfd2a2b4d upstream. In eh_abort path driver prematurely exits the call to upper layer. Check whether command is aborted / completed by firmware before exiting the call. 9 [ffff8b1ebf803c00] page_fault at ffffffffb0389778 [exception RIP: qla2x00_status_entry+0x48d] RIP: ffffffffc04fa62d RSP: ffff8b1ebf803cb0 RFLAGS: 00010082 RAX: 00000000ffffffff RBX: 00000000000e0000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 00000000000013d8 RDI: fffff3253db78440 RBP: ffff8b1ebf803dd0 R8: ffff8b1ebcd9b0c0 R9: 0000000000000000 R10: ffff8b1e38a30808 R11: 0000000000001000 R12: 00000000000003e9 R13: 0000000000000000 R14: ffff8b1ebcd9d740 R15: 0000000000000028 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 10 [ffff8b1ebf803cb0] enqueue_entity at ffffffffafce708f 11 [ffff8b1ebf803d00] enqueue_task_fair at ffffffffafce7b88 12 [ffff8b1ebf803dd8] qla24xx_process_response_queue at ffffffffc04fc9a6 [qla2xxx] 13 [ffff8b1ebf803e78] qla24xx_msix_rsp_q at ffffffffc04ff01b [qla2xxx] 14 [ffff8b1ebf803eb0] __handle_irq_event_percpu at ffffffffafd50714 Link: https://lore.kernel.org/r/20210908164622.19240-10-njavali@marvell.com Fixes: f45bca8c5052 ("scsi: qla2xxx: Fix double scsi_done for abort path") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Co-developed-by: David Jeffery <djeffery@redhat.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Co-developed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	scsi: qla2xxx: Fix kernel crash when accessing port_speed sysfs file	Arun Easi	1	-2/+22
	commit 3ef68d4f0c9e7cb589ae8b70f07d77f528105331 upstream. Kernel crashes when accessing port_speed sysfs file. The issue happens on a CNA when the local array was accessed beyond bounds. Fix this by changing the lookup. BUG: unable to handle kernel paging request at 0000000000004000 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 15 PID: 455213 Comm: sosreport Kdump: loaded Not tainted 4.18.0-305.7.1.el8_4.x86_64 #1 RIP: 0010:string_nocheck+0x12/0x70 Code: 00 00 4c 89 e2 be 20 00 00 00 48 89 ef e8 86 9a 00 00 4c 01 e3 eb 81 90 49 89 f2 48 89 ce 48 89 f8 48 c1 fe 30 66 85 f6 74 4f <44> 0f b6 0a 45 84 c9 74 46 83 ee 01 41 b8 01 00 00 00 48 8d 7c 37 RSP: 0018:ffffb5141c1afcf0 EFLAGS: 00010286 RAX: ffff8bf4009f8000 RBX: ffff8bf4009f9000 RCX: ffff0a00ffffff04 RDX: 0000000000004000 RSI: ffffffffffffffff RDI: ffff8bf4009f8000 RBP: 0000000000004000 R08: 0000000000000001 R09: ffffb5141c1afb84 R10: ffff8bf4009f9000 R11: ffffb5141c1afce6 R12: ffff0a00ffffff04 R13: ffffffffc08e21aa R14: 0000000000001000 R15: ffffffffc08e21aa FS: 00007fc4ebfff700(0000) GS:ffff8c717f7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000004000 CR3: 000000edfdee6006 CR4: 00000000001706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: string+0x40/0x50 vsnprintf+0x33c/0x520 scnprintf+0x4d/0x90 qla2x00_port_speed_show+0xb5/0x100 [qla2xxx] dev_attr_show+0x1c/0x40 sysfs_kf_seq_show+0x9b/0x100 seq_read+0x153/0x410 vfs_read+0x91/0x140 ksys_read+0x4f/0xb0 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca Link: https://lore.kernel.org/r/20210908164622.19240-7-njavali@marvell.com Fixes: 4910b524ac9e ("scsi: qla2xxx: Add support for setting port speed") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18	scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()	Tadeusz Struk	1	-2/+0
	commit 703535e6ae1e94c89a9c1396b4c7b6b41160ef0c upstream. No need to deduce command size in scsi_setup_scsi_cmnd() anymore as appropriate checks have been added to scsi_fill_sghdr_rq() function and the cmd_len should never be zero here. The code to do that wasn't correct anyway, as it used uninitialized cmd->cmnd, which caused a null-ptr-deref if the command size was zero as in the trace below. Fix this by removing the unneeded code. KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 1822 Comm: repro Not tainted 5.15.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 Call Trace: blk_mq_dispatch_rq_list+0x7c7/0x12d0 __blk_mq_sched_dispatch_requests+0x244/0x380 blk_mq_sched_dispatch_requests+0xf0/0x160 __blk_mq_run_hw_queue+0xe8/0x160 __blk_mq_delay_run_hw_queue+0x252/0x5d0 blk_mq_run_hw_queue+0x1dd/0x3b0 blk_mq_sched_insert_request+0x1ff/0x3e0 blk_execute_rq_nowait+0x173/0x1e0 blk_execute_rq+0x15c/0x540 sg_io+0x97c/0x1370 scsi_ioctl+0xe16/0x28e0 sd_ioctl+0x134/0x170 blkdev_ioctl+0x362/0x6e0 block_ioctl+0xb0/0xf0 vfs_ioctl+0xa7/0xf0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae ---[ end trace 8b086e334adef6d2 ]--- Kernel panic - not syncing: Fatal exception Link: https://lore.kernel.org/r/20211103170659.22151-2-tadeusz.struk@linaro.org Fixes: 2ceda20f0a99 ("scsi: core: Move command size detection out of the fast path") Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: <linux-scsi@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Cc: <stable@vger.kernel.org> # 5.15, 5.14, 5.10 Reported-by: syzbot+5516b30f5401d4dcbcae@syzkaller.appspotmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-06	scsi: core: Put LLD module refcnt after SCSI device is released	Ming Lei	2	-1/+12
	commit f2b85040acec9a928b4eb1b57a989324e8e38d3f upstream. SCSI host release is triggered when SCSI device is freed. We have to make sure that the low-level device driver module won't be unloaded before SCSI host instance is released because shost->hostt is required in the release handler. Make sure to put LLD module refcnt after SCSI device is released. Fixes a kernel panic of 'BUG: unable to handle page fault for address' reported by Changhui and Yi. Link: https://lore.kernel.org/r/20211008050118.1440686-1-ming.lei@redhat.com Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reported-by: Changhui Zhong <czhong@redhat.com> Reported-by: Yi Zhang <yi.zhang@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-02	scsi: ufs: ufs-exynos: Correct timeout value setting registers	Chanho Park	1	-3/+3
	[ Upstream commit 282da7cef078a87b6d5e8ceba8b17e428cf0e37c ] PA_PWRMODEUSERDATA0 -> DL_FC0PROTTIMEOUTVAL PA_PWRMODEUSERDATA1 -> DL_TC0REPLAYTIMEOUTVAL PA_PWRMODEUSERDATA2 -> DL_AFC0REQTIMEOUTVAL Link: https://lore.kernel.org/r/20211018062841.18226-1-chanho61.park@samsung.com Fixes: a967ddb22d94 ("scsi: ufs: ufs-exynos: Apply vendor-specific values for three timeouts") Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Kiwoong Kim <kwmad.kim@samsung.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Chanho Park <chanho61.park@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-27	scsi: core: Fix shost->cmd_per_lun calculation in scsi_add_host_with_dma()	Dexuan Cui	1	-1/+2
	commit 50b6cb3516365cb69753b006be2b61c966b70588 upstream. After commit ea2f0f77538c ("scsi: core: Cap scsi_host cmd_per_lun at can_queue"), a 416-CPU VM running on Hyper-V hangs during boot because the hv_storvsc driver sets scsi_driver.can_queue to an integer value that exceeds SHRT_MAX, and hence scsi_add_host_with_dma() sets shost->cmd_per_lun to a negative "short" value. Use min_t(int, ...) to work around the issue. Link: https://lore.kernel.org/r/20211008043546.6006-1-decui@microsoft.com Fixes: ea2f0f77538c ("scsi: core: Cap scsi_host cmd_per_lun at can_queue") Cc: stable@vger.kernel.org Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-27	scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()	Joy Gu	1	-1/+1
	[ Upstream commit 7fb223d0ad801f633c78cbe42b1d1b55f5d163ad ] Commit 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()"), intended to change: bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN to: bsg_job->request->msgcode != FC_BSG_RPT_ELS but changed it to: bsg_job->request->msgcode == FC_BSG_RPT_ELS instead. Change the == to a != to avoid leaking the fcport structure or freeing unallocated memory. Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com Fixes: 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Joy Gu <jgu@purestorage.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-27	scsi: iscsi: Fix set_param() handling	Mike Christie	1	-2/+0
	[ Upstream commit 187a580c9e7895978dcd1e627b9c9e7e3d13ca96 ] In commit 9e67600ed6b8 ("scsi: iscsi: Fix race condition between login and sync thread") we meant to add a check where before we call ->set_param() we make sure the iscsi_cls_connection is bound. The problem is that between versions 4 and 5 of the patch the deletion of the unchecked set_param() call was dropped so we ended up with 2 calls. As a result we can still hit a crash where we access the unbound connection on the first call. This patch removes that first call. Fixes: 9e67600ed6b8 ("scsi: iscsi: Fix race condition between login and sync thread") Link: https://lore.kernel.org/r/20211010161904.60471-1-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Li Feng <fengli@smartx.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-17	scsi: virtio_scsi: Fix spelling mistake "Unsupport" -> "Unsupported"	Colin Ian King	1	-2/+2
	[ Upstream commit cced4c0ec7c06f5230a2958907a409c849762293 ] There are a couple of spelling mistakes in pr_info and pr_err messages. Fix them. Link: https://lore.kernel.org/r/20210924230330.143785-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-17	scsi: ses: Fix unsigned comparison with less than zero	Jiapeng Chong	1	-1/+1
	[ Upstream commit dd689ed5aa905daf4ba4c99319a52aad6ea0a796 ] Fix the following coccicheck warning: ./drivers/scsi/ses.c:137:10-16: WARNING: Unsigned expression compared with zero: result > 0. Link: https://lore.kernel.org/r/1632477113-90378-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-09	scsi: ses: Retry failed Send/Receive Diagnostic commands	Wen Xiong	1	-4/+18
	[ Upstream commit fbdac19e642899455b4e64c63aafe2325df7aafa ] Setting SCSI logging level with error=3, we saw some errors from enclosues: [108017.360833] ses 0:0:9:0: tag#641 Done: NEEDS_RETRY Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s [108017.360838] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00 [108017.427778] ses 0:0:9:0: Power-on or device reset occurred [108017.427784] ses 0:0:9:0: tag#641 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s [108017.427788] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00 [108017.427791] ses 0:0:9:0: tag#641 Sense Key : Unit Attention [current] [108017.427793] ses 0:0:9:0: tag#641 Add. Sense: Bus device reset function occurred [108017.427801] ses 0:0:9:0: Failed to get diagnostic page 0x1 [108017.427804] ses 0:0:9:0: Failed to bind enclosure -19 [108017.427895] ses 0:0:10:0: Attached Enclosure device [108017.427942] ses 0:0:10:0: Attached scsi generic sg18 type 13 Retry if the Send/Receive Diagnostic commands complete with a transient error status (NOT_READY or UNIT_ATTENTION with ASC 0x29). Link: https://lore.kernel.org/r/1631849061-10210-2-git-send-email-wenxiong@linux.ibm.com Reviewed-by: Brian King <brking@linux.ibm.com> Reviewed-by: James Bottomley <jejb@linux.ibm.com> Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-09	scsi: sd: Free scsi_disk device via put_device()	Ming Lei	1	-4/+5
	[ Upstream commit 265dfe8ebbabae7959060bd1c3f75c2473b697ed ] After a device is initialized via device_initialize() it should be freed via put_device(). sd_probe() currently gets this wrong, fix it up. Link: https://lore.kernel.org/r/20210906090112.531442-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-06	scsi: csiostor: Add module softdep on cxgb4	Rahul Lakkireddy	1	-0/+1
	[ Upstream commit 79a7482249a7353bc86aff8127954d5febf02472 ] Both cxgb4 and csiostor drivers run on their own independent Physical Function. But when cxgb4 and csiostor are both being loaded in parallel via modprobe, there is a race when firmware upgrade is attempted by both the drivers. When the cxgb4 driver initiates the firmware upgrade, it halts the firmware and the chip until upgrade is complete. When the csiostor driver is coming up in parallel, the firmware mailbox communication fails with timeouts and the csiostor driver probe fails. Add a module soft dependency on cxgb4 driver to ensure loading csiostor triggers cxgb4 to load first when available to avoid the firmware upgrade race. Link: https://lore.kernel.org/r/1632759248-15382-1-git-send-email-rahul.lakkireddy@chelsio.com Fixes: a3667aaed569 ("[SCSI] csiostor: Chelsio FCoE offload driver") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-06	scsi: ufs: Fix illegal offset in UPIU event trace	Jonathan Hsu	1	-2/+1
	commit e8c2da7e329ce004fee748b921e4c765dc2fa338 upstream. Fix incorrect index for UTMRD reference in ufshcd_add_tm_upiu_trace(). Link: https://lore.kernel.org/r/20210924085848.25500-1-jonathan.hsu@mediatek.com Fixes: 4b42d557a8ad ("scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs") Cc: stable@vger.kernel.org Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jonathan Hsu <jonathan.hsu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-06	scsi: qla2xxx: Changes to support kdump kernel for NVMe BFS	Saurav Kashyap	3	-23/+20
	[ Upstream commit 4a0a542fe5e4273baf9228459ef3f75c29490cba ] The MSI-X and MSI calls fails in kdump kernel. Because of this qla2xxx_create_qpair() fails leading to .create_queue callback failure. The fix is to return existing qpair instead of allocating new one and allocate a single hw queue. [ 19.975838] qla2xxx [0000:d8:00.1]-00c7:11: MSI-X: Failed to enable support, giving up -- 16/-28. [ 19.984885] qla2xxx [0000:d8:00.1]-0037:11: Falling back-to MSI mode -- ret=-28. [ 19.992278] qla2xxx [0000:d8:00.1]-0039:11: Falling back-to INTa mode -- ret=-28. .. .. .. [ 21.141518] qla2xxx [0000:d8:00.0]-2104:2: qla_nvme_alloc_queue: handle 00000000e7ee499d, idx =1, qsize 32 [ 21.151166] qla2xxx [0000:d8:00.0]-0181:2: FW/Driver is not multi-queue capable. [ 21.158558] qla2xxx [0000:d8:00.0]-2122:2: Failed to allocate qpair [ 21.164824] nvme nvme0: NVME-FC{0}: reset: Reconnect attempt failed (-22) [ 21.171612] nvme nvme0: NVME-FC{0}: Reconnect attempt in 2 seconds Link: https://lore.kernel.org/r/20210810043720.1137-13-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-30	scsi: lpfc: Use correct scnprintf() limit	Dan Carpenter	1	-1/+2
	[ Upstream commit 6dacc371b77f473770ec646e220303a84fe96c11 ] The limit should be "PAGE_SIZE - len" instead of "PAGE_SIZE". We're not going to hit the limit so this fix will not affect runtime. Link: https://lore.kernel.org/r/20210916132331.GE25094@kili Fixes: 5b9e70b22cc5 ("scsi: lpfc: raise sg count for nvme to use available sg resources") Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-30	scsi: qla2xxx: Restore initiator in dual mode	Dmitry Bogdanov	1	-1/+2
	[ Upstream commit 5f8579038842d77e6ce05e1df6bf9dd493b0e3ef ] In dual mode in case of disabling the target, the whole port goes offline and initiator is turned off too. Fix restoring initiator mode after disabling target in dual mode. Link: https://lore.kernel.org/r/20210915153239.8035-1-d.bogdanov@yadro.com Fixes: 0645cb8350cd ("scsi: qla2xxx: Add mode control for each physical port") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-30	scsi: iscsi: Adjust iface sysfs attr detection	Baokun Li	1	-4/+4
	[ Upstream commit 4e28550829258f7dab97383acaa477bd724c0ff4 ] ISCSI_NET_PARAM_IFACE_ENABLE belongs to enum iscsi_net_param instead of iscsi_iface_param so move it to ISCSI_NET_PARAM. Otherwise, when we call into the driver, we might not match and return that we don't want attr visible in sysfs. Found in code review. Link: https://lore.kernel.org/r/20210901085336.2264295-1-libaokun1@huawei.com Fixes: e746f3451ec7 ("scsi: iscsi: Fix iface sysfs attr detection") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-30	scsi: sd_zbc: Ensure buffer size is aligned to SECTOR_SIZE	Naohiro Aota	1	-3/+3
	commit 7215e909814fed7cda33c954943a4050d8348204 upstream. Reporting zones on a SCSI device sometimes fail with the following error: [76248.516390] ata16.00: invalid transfer count 131328 [76248.523618] sd 15:0:0:0: [sda] REPORT ZONES start lba 536870912 failed The error (from drivers/ata/libata-scsi.c:ata_scsi_zbc_in_xlat()) indicates that buffer size is not aligned to SECTOR_SIZE. This happens when the __vmalloc() failed. Consider we are reporting 4096 zones, then we will have "bufsize = roundup((4096 + 1) * 64, SECTOR_SIZE)" = (513 * 512) = 262656. Then, __vmalloc() failure halves the bufsize to 131328, which is no longer aligned to SECTOR_SIZE. Use rounddown() to ensure the size is always aligned to SECTOR_SIZE and fix the comment as well. Link: https://lore.kernel.org/r/20210906140642.2267569-1-naohiro.aota@wdc.com Fixes: 23a50861adda ("scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer()") Cc: stable@vger.kernel.org # 5.5+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18	scsi: qla2xxx: Sync queue idx with queue_pair_map idx	Saurav Kashyap	1	-2/+3
	commit c8fadf019964d0eb1da410ba8b629494d3339db9 upstream. The first invocation of function find_first_zero_bit will return 0 and queue_id gets set to 0. An index of queue_pair_map also gets set to 0. qpair_id = find_first_zero_bit(ha->qpair_qid_map, ha->max_qpairs); set_bit(qpair_id, ha->qpair_qid_map); ha->queue_pair_map[qpair_id] = qpair; In the alloc_queue callback driver checks the map, if queue is already allocated: ha->queue_pair_map[qidx] This works fine as long as max_qpairs is greater than nvme_max_hw_queues(8) since the size of the queue_pair_map is equal to max_qpair. In case nr_cpus is less than 8, max_qpairs is less than 8. This creates wrong value returned as qpair. [ 1572.353669] qla2xxx [0000:24:00.3]-2121:6: Returning existing qpair of 4e00000000000000 for idx=2 [ 1572.354458] general protection fault: 0000 [#1] SMP PTI [ 1572.354461] CPU: 1 PID: 44 Comm: kworker/1:1H Kdump: loaded Tainted: G IOE --------- - - 4.18.0-304.el8.x86_64 #1 [ 1572.354462] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 03/01/2013 [ 1572.354467] Workqueue: kblockd blk_mq_run_work_fn [ 1572.354485] RIP: 0010:qla_nvme_post_cmd+0x92/0x760 [qla2xxx] [ 1572.354486] Code: 84 24 5c 01 00 00 00 00 b8 0a 74 1e 66 83 79 48 00 0f 85 a8 03 00 00 48 8b 44 24 08 48 89 ee 4c 89 e7 8b 50 24 e8 5e 8e 00 00 <f0> 41 ff 47 04 0f ae f0 41 f6 47 24 04 74 19 f0 41 ff 4f 04 b8 f0 [ 1572.354487] RSP: 0018:ffff9c81c645fc90 EFLAGS: 00010246 [ 1572.354489] RAX: 0000000000000001 RBX: ffff8ea3e5070138 RCX: 0000000000000001 [ 1572.354490] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8ea4c866b800 [ 1572.354491] RBP: ffff8ea4c866b800 R08: 0000000000005010 R09: ffff8ea4c866b800 [ 1572.354492] R10: 0000000000000001 R11: 000000069d1ca3ff R12: ffff8ea4bc460000 [ 1572.354493] R13: ffff8ea3e50702b0 R14: ffff8ea4c4c16a58 R15: 4e00000000000000 [ 1572.354494] FS: 0000000000000000(0000) GS:ffff8ea4dfd00000(0000) knlGS:0000000000000000 [ 1572.354495] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1572.354496] CR2: 000055884504fa58 CR3: 00000005a1410001 CR4: 00000000000606e0 [ 1572.354497] Call Trace: [ 1572.354503] ? check_preempt_curr+0x62/0x90 [ 1572.354506] ? dma_direct_map_sg+0x72/0x1f0 [ 1572.354509] ? nvme_fc_start_fcp_op.part.32+0x175/0x460 [nvme_fc] [ 1572.354511] ? blk_mq_dispatch_rq_list+0x11c/0x730 [ 1572.354515] ? __switch_to_asm+0x35/0x70 [ 1572.354516] ? __switch_to_asm+0x41/0x70 [ 1572.354518] ? __switch_to_asm+0x35/0x70 [ 1572.354519] ? __switch_to_asm+0x41/0x70 [ 1572.354521] ? __switch_to_asm+0x35/0x70 [ 1572.354522] ? __switch_to_asm+0x41/0x70 [ 1572.354523] ? __switch_to_asm+0x35/0x70 [ 1572.354525] ? entry_SYSCALL_64_after_hwframe+0xb9/0xca [ 1572.354527] ? __switch_to_asm+0x41/0x70 [ 1572.354529] ? __blk_mq_sched_dispatch_requests+0xc6/0x170 [ 1572.354531] ? blk_mq_sched_dispatch_requests+0x30/0x60 [ 1572.354532] ? __blk_mq_run_hw_queue+0x51/0xd0 [ 1572.354535] ? process_one_work+0x1a7/0x360 [ 1572.354537] ? create_worker+0x1a0/0x1a0 [ 1572.354538] ? worker_thread+0x30/0x390 [ 1572.354540] ? create_worker+0x1a0/0x1a0 [ 1572.354541] ? kthread+0x116/0x130 [ 1572.354543] ? kthread_flush_work_fn+0x10/0x10 [ 1572.354545] ? ret_from_fork+0x35/0x40 Fix is to use index 0 for admin and first IO queue. Link: https://lore.kernel.org/r/20210810043720.1137-14-njavali@marvell.com Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18	scsi: qla2xxx: Changes to support kdump kernel	Saurav Kashyap	1	-0/+6
	commit 62e0dec59c1e139dab55aff5aa442adc97804271 upstream. Avoid allocating firmware dump and only allocate a single queue for a kexec kernel. Link: https://lore.kernel.org/r/20210810043720.1137-12-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18	scsi: BusLogic: Fix missing pr_cont() use	Maciej W. Rozycki	1	-2/+2
	commit 44d01fc86d952f5a8b8b32bdb4841504d5833d95 upstream. Update BusLogic driver's messaging system to use pr_cont() for continuation lines, bringing messy output: pci 0000:00:13.0: PCI->APIC IRQ transform: INT A -> IRQ 17 scsi: *** BusLogic SCSI Driver Version 2.1.17 of 12 September 2013 * scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com> scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter scsi0: Firmware Version: 5.07B, I/O Address: 0x7000, IRQ Channel: 17/Level scsi0: PCI Bus: 0, Device: 19, Address: 0xE0012000, Host Adapter SCSI ID: 7 scsi0: Parity Checking: Enabled, Extended Translation: Enabled scsi0: Synchronous Negotiation: Ultra, Wide Negotiation: Enabled scsi0: Disconnect/Reconnect: Enabled, Tagged Queuing: Enabled scsi0: Scatter/Gather Limit: 128 of 8192 segments, Mailboxes: 211 scsi0: Driver Queue Depth: 211, Host Adapter Queue Depth: 192 scsi0: Tagged Queue Depth: Automatic , Untagged Queue Depth: 3 scsi0: SCSI Bus Termination: Both Enabled , SCAM: Disabled scsi0: * BusLogic BT-958 Initialized Successfully * scsi host0: BusLogic BT-958 back to order: pci 0000:00:13.0: PCI->APIC IRQ transform: INT A -> IRQ 17 scsi: * BusLogic SCSI Driver Version 2.1.17 of 12 September 2013 * scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com> scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter scsi0: Firmware Version: 5.07B, I/O Address: 0x7000, IRQ Channel: 17/Level scsi0: PCI Bus: 0, Device: 19, Address: 0xE0012000, Host Adapter SCSI ID: 7 scsi0: Parity Checking: Enabled, Extended Translation: Enabled scsi0: Synchronous Negotiation: Ultra, Wide Negotiation: Enabled scsi0: Disconnect/Reconnect: Enabled, Tagged Queuing: Enabled scsi0: Scatter/Gather Limit: 128 of 8192 segments, Mailboxes: 211 scsi0: Driver Queue Depth: 211, Host Adapter Queue Depth: 192 scsi0: Tagged Queue Depth: Automatic, Untagged Queue Depth: 3 scsi0: SCSI Bus Termination: Both Enabled, SCAM: Disabled scsi0: * BusLogic BT-958 Initialized Successfully * scsi host0: BusLogic BT-958 Also diagnostic output such as with the BusLogic=TraceConfiguration parameter is affected and becomes vertical and therefore hard to read. This has now been corrected, e.g.: pci 0000:00:13.0: PCI->APIC IRQ transform: INT A -> IRQ 17 blogic_cmd(86) Status = 30: 4 ==> 4: FF 05 93 00 blogic_cmd(95) Status = 28: (Modify I/O Address) blogic_cmd(91) Status = 30: 1 ==> 1: 01 blogic_cmd(04) Status = 30: 4 ==> 4: 41 41 35 30 blogic_cmd(8D) Status = 30: 14 ==> 14: 45 DC 00 20 00 00 00 00 00 40 30 37 42 1D scsi: * BusLogic SCSI Driver Version 2.1.17 of 12 September 2013 *** scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com> blogic_cmd(04) Status = 30: 4 ==> 4: 41 41 35 30 blogic_cmd(0B) Status = 30: 3 ==> 3: 00 08 07 blogic_cmd(0D) Status = 30: 34 ==> 34: 03 01 07 04 00 00 00 00 00 00 00 00 00 00 00 00 FF 42 44 46 FF 00 00 00 00 00 00 00 00 00 FF 00 FF 00 blogic_cmd(8D) Status = 30: 14 ==> 14: 45 DC 00 20 00 00 00 00 00 40 30 37 42 1D blogic_cmd(84) Status = 30: 1 ==> 1: 37 blogic_cmd(8B) Status = 30: 5 ==> 5: 39 35 38 20 20 blogic_cmd(85) Status = 30: 1 ==> 1: 42 blogic_cmd(86) Status = 30: 4 ==> 4: FF 05 93 00 blogic_cmd(91) Status = 30: 64 ==> 64: 41 46 3E 20 39 35 38 20 20 00 C4 00 04 01 07 2F 07 04 35 FF FF FF FF FF FF FF FF FF FF 01 00 FE FF 08 FF FF 00 00 00 00 00 00 00 01 00 01 00 00 FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 FC scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter etc. Link: https://lore.kernel.org/r/alpine.DEB.2.21.2104201940430.44318@angie.orcam.me.uk Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") Cc: stable@vger.kernel.org # v4.9+ Acked-by: Khalid Aziz <khalid@gonehiking.org> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18	scsi: ufs: ufs-exynos: Fix static checker warning	Alim Akhtar	2	-3/+3
	[ Upstream commit 313bf281f2091552f509fd05a74172c70ce7572f ] clk_get_rate() returns unsigned long and currently this driver stores the return value in u32 type, resulting the below warning: Fixed smatch warnings: drivers/scsi/ufs/ufs-exynos.c:286 exynos_ufs_get_clk_info() warn: wrong type for 'ufs->mclk_rate' (should be 'ulong') drivers/scsi/ufs/ufs-exynos.c:287 exynos_ufs_get_clk_info() warn: wrong type for 'pclk_rate' (should be 'ulong') Link: https://lore.kernel.org/r/20210819171131.55912-1-alim.akhtar@samsung.com Fixes: 55f4b1f73631 ("scsi: ufs: ufs-exynos: Add UFS host support for Exynos SoCs") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	scsi: qedf: Fix error codes in qedf_alloc_global_queues()	Dan Carpenter	1	-5/+5
	[ Upstream commit ccc89737aa6b9f248cf1623014038beb6c2b7f56 ] This driver has some left over "return 1" on failure style code mixed with "return negative error codes" style code. The caller doesn't care so we should just convert everything to return negative error codes. Then there was a problem that there were two variables used to store error codes which just resulted in confusion. If qedf_alloc_bdq() returned a negative error code, we accidentally returned success instead of propagating the error code. So get rid of the "rc" variable and use "status" every where. Also remove the "status = 0" initialization so that these sorts of bugs will be detected by the compiler in the future. Link: https://lore.kernel.org/r/20210810085023.GA23998@kili Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	scsi: qedi: Fix error codes in qedi_alloc_global_queues()	Dan Carpenter	1	-7/+7
	[ Upstream commit 4dbe57d46d54a847875fa33e7d05877bb341585e ] This function had some left over code that returned 1 on error instead negative error codes. Convert everything to use negative error codes. The caller treats all non-zero returns the same so this does not affect run time. A couple places set "rc" instead of "status" so those error paths ended up returning success by mistake. Get rid of the "rc" variable and use "status" everywhere. Remove the bogus "status = 0" initialization, as a future proofing measure so the compiler will warn about uninitialized error codes. Link: https://lore.kernel.org/r/20210810084753.GD23810@kili Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	scsi: smartpqi: Fix an error code in pqi_get_raid_map()	Dan Carpenter	1	-0/+1
	[ Upstream commit d1f6581a6796c4e9fd8a4a24e8b77463d18f0df1 ] Return -EINVAL on failure instead of success. Link: https://lore.kernel.org/r/20210810084613.GB23810@kili Fixes: a91aaae0243b ("scsi: smartpqi: allow for larger raid maps") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	scsi: fdomain: Fix error return code in fdomain_probe()	Wei Li	1	-1/+3
	[ Upstream commit 632c4ae6da1d629eddf9da1e692d7617c568c256 ] If request_region() fails the return value is not set. Return -EBUSY on error. Link: https://lore.kernel.org/r/20210715032625.1395495-1-liwei391@huawei.com Fixes: 8674a8aa2c39 ("scsi: fdomain: Add PCMCIA support") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	scsi: ufs: Fix memory corruption by ufshcd_read_desc_param()	Bart Van Assche	1	-3/+5
	[ Upstream commit d3d9c4570285090b533b00946b72647361f0345b ] If param_offset > buff_len then the memcpy() statement in ufshcd_read_desc_param() corrupts memory since it copies 256 + buff_len - param_offset bytes into a buffer with size buff_len. Since param_offset < 256 this results in writing past the bound of the output buffer. Link: https://lore.kernel.org/r/20210722033439.26550-2-bvanassche@acm.org Fixes: cbe193f6f093 ("scsi: ufs: Fix potential NULL pointer access during memcpy") Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	scsi: BusLogic: Use %X for u32 sized integer rather than %lX	Colin Ian King	1	-1/+1
	[ Upstream commit 2127cd21fb78c6e22d92944253afd967b0ff774d ] An earlier fix changed the print format specifier for adapter->bios_addr to use %lX. However, the integer is a u32 so the fix was wrong. Fix this by using the correct %X format specifier. Link: https://lore.kernel.org/r/20210730095031.26981-1-colin.king@canonical.com Fixes: 43622697117c ("scsi: BusLogic: use %lX for unsigned long rather than %X") Acked-by: Khalid Aziz <khalid@gonehiking.org> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Addresses-Coverity: ("Invalid type in argument") Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-03	scsi: core: Fix hang of freezing queue between blocking and running device	Li Jinlin	1	-3/+6
	commit 02c6dcd543f8f051973ee18bfbc4dc3bd595c558 upstream. We found a hang, the steps to reproduce are as follows: 1. blocking device via scsi_device_set_state() 2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10 3. echo none > /sys/block/sda/queue/scheduler 4. echo "running" >/sys/block/sda/device/state Step 3 and 4 should complete after step 4, but they hang. CPU#0 CPU#1 CPU#2 --------------- ---------------- ---------------- Step 1: blocking device Step 2: dd xxxx ^^^^^^ get request q_usage_counter++ Step 3: switching scheculer elv_iosched_store elevator_switch blk_mq_freeze_queue blk_freeze_queue > blk_freeze_queue_start ^^^^^^ mq_freeze_depth++ > blk_mq_run_hw_queues ^^^^^^ can't run queue when dev blocked > blk_mq_freeze_queue_wait ^^^^^^ Hang here!!! wait q_usage_counter==0 Step 4: running device store_state_field scsi_rescan_device scsi_attach_vpd scsi_vpd_inquiry __scsi_execute blk_get_request blk_mq_alloc_request blk_queue_enter ^^^^^^ Hang here!!! wait mq_freeze_depth==0 blk_mq_run_hw_queues ^^^^^^ dispatch IO, q_usage_counter will reduce to zero blk_mq_unfreeze_queue ^^^^^ mq_freeze_depth-- To fix this, we need to run queue before rescanning device when the device state changes to SDEV_RUNNING. Link: https://lore.kernel.org/r/20210824025921.3277629-1-lijinlin3@huawei.com Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Li Jinlin <lijinlin3@huawei.com> Signed-off-by: Qiu Laibin <qiulaibin@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-26	scsi: core: Fix capacity set to zero after offlinining device	lijinlin	1	-3/+6
	[ Upstream commit f0f82e2476f6adb9c7a0135cfab8091456990c99 ] After adding physical volumes to a volume group through vgextend, the kernel will rescan the partitions. This in turn will cause the device capacity to be queried. If the device status is set to offline through sysfs at this time, READ CAPACITY command will return a result which the host byte is DID_NO_CONNECT, and the capacity of the device will be set to zero in read_capacity_error(). After setting device status back to running, the capacity of the device will remain stuck at zero. Fix this issue by rescanning device when the device state changes to SDEV_RUNNING. Link: https://lore.kernel.org/r/20210727034455.1494960-1-lijinlin3@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: lijinlin <lijinlin3@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-26	scsi: core: Avoid printing an error if target_alloc() returns -ENXIO	Sreekanth Reddy	1	-1/+2
	[ Upstream commit 70edd2e6f652f67d854981fd67f9ad0f1deaea92 ] Avoid printing a 'target allocation failed' error if the driver target_alloc() callback function returns -ENXIO. This return value indicates that the corresponding H:C:T:L entry is empty. Removing this error reduces the scan time if the user issues SCAN_WILD_CARD scan operation through sysfs parameter on a host with a lot of empty H:C:T:L entries. Avoiding the printk on -ENXIO matches the behavior of the other callback functions during scanning. Link: https://lore.kernel.org/r/20210726115402.1936-1-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-26	scsi: scsi_dh_rdac: Avoid crash during rdac_bus_attach()	Ye Bin	1	-2/+2
	[ Upstream commit bc546c0c9abb3bb2fb46866b3d1e6ade9695a5f6 ] The following BUG_ON() was observed during RDAC scan: [595952.944297] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! [595952.951143] Internal error: Oops - BUG: 0 [#1] SMP ...... [595953.251065] Call trace: [595953.259054] check_ownership+0xb0/0x118 [595953.269794] rdac_bus_attach+0x1f0/0x4b0 [595953.273787] scsi_dh_handler_attach+0x3c/0xe8 [595953.278211] scsi_dh_add_device+0xc4/0xe8 [595953.282291] scsi_sysfs_add_sdev+0x8c/0x2a8 [595953.286544] scsi_probe_and_add_lun+0x9fc/0xd00 [595953.291142] __scsi_scan_target+0x598/0x630 [595953.295395] scsi_scan_target+0x120/0x130 [595953.299481] fc_user_scan+0x1a0/0x1c0 [scsi_transport_fc] [595953.304944] store_scan+0xb0/0x108 [595953.308420] dev_attr_store+0x44/0x60 [595953.312160] sysfs_kf_write+0x58/0x80 [595953.315893] kernfs_fop_write+0xe8/0x1f0 [595953.319888] __vfs_write+0x60/0x190 [595953.323448] vfs_write+0xac/0x1c0 [595953.326836] ksys_write+0x74/0xf0 [595953.330221] __arm64_sys_write+0x24/0x30 Code is in check_ownership: list_for_each_entry_rcu(tmp, &h->ctlr->dh_list, node) { /* h->sdev should always be valid */ BUG_ON(!tmp->sdev); tmp->sdev->access_state = access_state; } rdac_bus_attach initialize_controller list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev; rdac_bus_detach list_del_rcu(&h->node); h->sdev = NULL; Fix the race between rdac_bus_attach() and rdac_bus_detach() where h->sdev is NULL when processing the RDAC attach. Link: https://lore.kernel.org/r/20210113063103.2698953-1-yebin10@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-26	scsi: megaraid_mm: Fix end of loop tests for list_for_each_entry()	Harshvardhan Jha	1	-6/+15
	[ Upstream commit 77541f78eadfe9fdb018a7b8b69f0f2af2cf4b82 ] The list_for_each_entry() iterator, "adapter" in this code, can never be NULL. If we exit the loop without finding the correct adapter then "adapter" points invalid memory that is an offset from the list head. This will eventually lead to memory corruption and presumably a kernel crash. Link: https://lore.kernel.org/r/20210708074642.23599-1-harshvardhan.jha@oracle.com Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Harshvardhan Jha <harshvardhan.jha@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-26	scsi: pm80xx: Fix TMF task completion race condition	Igor Pylypiv	1	-17/+15
	[ Upstream commit d712d3fb484b7fa8d1d57e9ca6f134bb9d8c18b1 ] The TMF timeout timer may trigger at the same time when the response from a controller is be