summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)AuthorFilesLines
2020-12-11scsi: mpt3sas: Fix ioctl timeoutSuganath Prabu S1-1/+1
commit 42f687038bcc34aa919e0e4c29b04e4cda3f6a79 upstream. Commit c1a6c5ac4278 ("scsi: mpt3sas: For NVME device, issue a protocol level reset") modified the ioctl path 'timeout' variable type to u8 from unsigned long, limiting the maximum timeout value that the driver can support to 255 seconds. If the management application is requesting a higher value the resulting timeout will be zero. The operation times out immediately and the ioctl request fails. Change datatype back to unsigned long. Link: https://lore.kernel.org/r/20201125094838.4340-1-suganath-prabu.subramani@broadcom.com Fixes: c1a6c5ac4278 ("scsi: mpt3sas: For NVME device, issue a protocol level reset") Cc: <stable@vger.kernel.org> #v4.18+ Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-02scsi: ufs: Fix race between shutdown and runtime resume flowStanley Chu1-5/+1
[ Upstream commit e92643db514803c2c87d72caf5950b4c0a8faf4a ] If UFS host device is in runtime-suspended state while UFS shutdown callback is invoked, UFS device shall be resumed for register accesses. Currently only UFS local runtime resume function will be invoked to wake up the host. This is not enough because if someone triggers runtime resume from block layer, then race may happen between shutdown and runtime resume flow, and finally lead to unlocked register access. To fix this, in ufshcd_shutdown(), use pm_runtime_get_sync() instead of resuming UFS device by ufshcd_runtime_resume() "internally" to let runtime PM framework manage the whole resume flow. Link: https://lore.kernel.org/r/20201119062916.12931-1-stanley.chu@mediatek.com Fixes: 57d104c153d3 ("ufs: add UFS power management support") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-02scsi: libiscsi: Fix NOP race conditionLee Duncan1-8/+15
[ Upstream commit fe0a8a95e7134d0b44cd407bc0085b9ba8d8fe31 ] iSCSI NOPs are sometimes "lost", mistakenly sent to the user-land iscsid daemon instead of handled in the kernel, as they should be, resulting in a message from the daemon like: iscsid: Got nop in, but kernel supports nop handling. This can occur because of the new forward- and back-locks, and the fact that an iSCSI NOP response can occur before processing of the NOP send is complete. This can result in "conn->ping_task" being NULL in iscsi_nop_out_rsp(), when the pointer is actually in the process of being set. To work around this, we add a new state to the "ping_task" pointer. In addition to NULL (not assigned) and a pointer (assigned), we add the state "being set", which is signaled with an INVALID pointer (using "-1"). Link: https://lore.kernel.org/r/20201106193317.16993-1-leeman.duncan@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-24scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()Can Guo1-3/+3
[ Upstream commit da3fecb0040324c08f1587e5bff1f15f36be1872 ] The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be decreased back in ufshcd_ungate_work() in a paired way. However, if specific ufshcd_hold/release sequences are met, it is possible that scsi_block_reqs_cnt is increased twice but only one ungate work is queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and ufshcd_ungate_work() in a paired way, increase it only if queue_work() returns true. Link: https://lore.kernel.org/r/1604384682-15837-2-git-send-email-cang@codeaurora.org Reviewed-by: Hongwu Su <hongwus@codeaurora.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-18scsi: scsi_dh_alua: Avoid crash during alua_bus_detach()Hannes Reinecke1-4/+5
[ Upstream commit 5faf50e9e9fdc2117c61ff7e20da49cd6a29e0ca ] alua_bus_detach() might be running concurrently with alua_rtpg_work(), so we might trip over h->sdev == NULL and call BUG_ON(). The correct way of handling it is to not set h->sdev to NULL in alua_bus_detach(), and call rcu_synchronize() before the final delete to ensure that all concurrent threads have left the critical section. Then we can get rid of the BUG_ON() and replace it with a simple if condition. Link: https://lore.kernel.org/r/1600167537-12509-1-git-send-email-jitendra.khasdev@oracle.com Link: https://lore.kernel.org/r/20200924104559.26753-1-hare@suse.de Cc: Brian Bunker <brian@purestorage.com> Acked-by: Brian Bunker <brian@purestorage.com> Tested-by: Jitendra Khasdev <jitendra.khasdev@oracle.com> Reviewed-by: Jitendra Khasdev <jitendra.khasdev@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-18scsi: hpsa: Fix memory leak in hpsa_init_one()Keita Suzuki1-1/+3
[ Upstream commit af61bc1e33d2c0ec22612b46050f5b58ac56a962 ] When hpsa_scsi_add_host() fails, h->lastlogicals is leaked since it is missing a free() in the error handler. Fix this by adding free() when hpsa_scsi_add_host() fails. Link: https://lore.kernel.org/r/20201027073125.14229-1-keitasuzuki.park@sslab.ics.keio.ac.jp Tested-by: Don Brace <don.brace@microchip.com> Acked-by: Don Brace <don.brace@microchip.com> Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-10scsi: core: Don't start concurrent async scan on same hostMing Lei1-3/+4
[ Upstream commit 831e3405c2a344018a18fcc2665acc5a38c3a707 ] The current scanning mechanism is supposed to fall back to a synchronous host scan if an asynchronous scan is in progress. However, this rule isn't strictly respected, scsi_prep_async_scan() doesn't hold scan_mutex when checking shost->async_scan. When scsi_scan_host() is called concurrently, two async scans on same host can be started and a hang in do_scan_async() is observed. Fixes this issue by checking & setting shost->async_scan atomically with shost->scan_mutex. Link: https://lore.kernel.org/r/20201010032539.426615-1-ming.lei@redhat.com Cc: Christoph Hellwig <hch@lst.de> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-05scsi: qla2xxx: Fix crash on session cleanup with unloadQuinn Tran1-6/+7
commit 50457dab670f396557e60c07f086358460876353 upstream. On unload, session cleanup prematurely gave the signal for driver unload path to advance. Link: https://lore.kernel.org/r/20200929102152.32278-6-njavali@marvell.com Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-30scsi: ufs: ufs-qcom: Fix race conditions caused by ufs_qcom_testbus_config()Can Guo1-5/+0
[ Upstream commit 89dd87acd40a44de8ff3358138aedf8f73f4efc6 ] If ufs_qcom_dump_dbg_regs() calls ufs_qcom_testbus_config() from ufshcd_suspend/resume and/or clk gate/ungate context, pm_runtime_get_sync() and ufshcd_hold() will cause a race condition. Fix this by removing the unnecessary calls of pm_runtime_get_sync() and ufshcd_hold(). Link: https://lore.kernel.org/r/1596975355-39813-3-git-send-email-cang@codeaurora.org Reviewed-by: Hongwu Su <hongwus@codeaurora.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-30scsi: qedi: Fix list_del corruption while removing active I/ONilesh Javali1-4/+11
[ Upstream commit 28b35d17f9f8573d4646dd8df08917a4076a6b63 ] While aborting the I/O, the firmware cleanup task timed out and driver deleted the I/O from active command list. Some time later the firmware sent the cleanup task response and driver again deleted the I/O from active command list causing firmware to send completion for non-existent I/O and list_del corruption of active command list. Add fix to check if I/O is present before deleting it from the active command list to ensure firmware sends valid I/O completion and protect against list_del corruption. Link: https://lore.kernel.org/r/20200908095657.26821-4-mrangankar@marvell.com Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-30scsi: qedi: Protect active command list to avoid list corruptionNilesh Javali2-0/+10
[ Upstream commit c0650e28448d606c84f76c34333dba30f61de993 ] Protect active command list for non-I/O commands like login response, logout response, text response, and recovery cleanup of active list to avoid list corruption. Link: https://lore.kernel.org/r/20200908095657.26821-5-mrangankar@marvell.com Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-30scsi: ibmvfc: Fix error return in ibmvfc_probe()Jing Xiangfeng1-0/+1
[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-30scsi: mvumi: Fix error return in mvumi_io_attach()Jing Xiangfeng1-0/+1
[ Upstream commit 055f15ab2cb4a5cbc4c0a775ef3d0066e0fa9b34 ] Return PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200910123848.93649-1-jingxiangfeng@huawei.com Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()Dan Carpenter1-2/+2
[ Upstream commit 38b2db564d9ab7797192ef15d7aade30633ceeae ] The be_fill_queue() function can only fail when "eq_vaddress" is NULL and since it's non-NULL here that means the function call can't fail. But imagine if it could, then in that situation we would want to store the "paddr" so that dma memory can be released. Link: https://lore.kernel.org/r/20200928091300.GD377727@mwanda Fixes: bfead3b2cb46 ("[SCSI] be2iscsi: Adding msix and mcc_rings V3") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29scsi: csiostor: Fix wrong return value in csio_hw_prep_fw()Tianjia Zhang1-1/+1
[ Upstream commit 44f4daf8678ae5f08c93bbe70792f90cd88e4649 ] On an error exit path, a negative error code should be returned instead of a positive return value. Link: https://lore.kernel.org/r/20200802111531.5065-1-tianjia.zhang@linux.alibaba.com Fixes: f40e74ffa3de ("csiostor:firmware upgrade fix") Cc: Praveen Madhavan <praveenm@chelsio.com> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29scsi: qla2xxx: Fix wrong return value in qla_nvme_register_hba()Tianjia Zhang1-1/+1
[ Upstream commit ca4fb89a3d714a770e9c73c649da830f3f4a5326 ] On an error exit path, a negative error code should be returned instead of a positive return value. Link: https://lore.kernel.org/r/20200802111530.5020-1-tianjia.zhang@linux.alibaba.com Fixes: 8777e4314d39 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine") Cc: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29scsi: qla4xxx: Fix an error handling path in 'qla4xxx_get_host_stats()'Christophe JAILLET1-1/+1
[ Upstream commit 574918e69720fe62ab3eb42ec3750230c8d16b06 ] Update the size used in 'dma_free_coherent()' in order to match the one used in the corresponding 'dma_alloc_coherent()'. Link: https://lore.kernel.org/r/20200802101527.676054-1-christophe.jaillet@wanadoo.fr Fixes: 4161cee52df8 ("[SCSI] qla4xxx: Add host statistics support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: libfc: Skip additional kref updating work eventJaved Hasan1-4/+5
[ Upstream commit 823a65409c8990f64c5693af98ce0e7819975cba ] When an rport event (RPORT_EV_READY) is updated without work being queued, avoid taking an additional reference. This issue was leading to memory leak. Trace from KMEMLEAK tool: unreferenced object 0xffff8888259e8780 (size 512): comm "kworker/2:1", jiffies 4433237386 (age 113021.971s) hex dump (first 32 bytes): 58 0a ec cf 83 88 ff ff 00 00 00 00 00 00 00 00 01 00 00 00 08 00 00 00 13 7d f0 1e 0e 00 00 10 backtrace: [<000000006b25760f>] fc_rport_recv_req+0x3c6/0x18f0 [libfc] [<00000000f208d994>] fc_lport_recv_els_req+0x120/0x8a0 [libfc] [<00000000a9c437b8>] fc_lport_recv+0xb9/0x130 [libfc] [<00000000a9c437b8>] fc_lport_recv+0xb9/0x130 [libfc] [<00000000ad5be37b>] qedf_ll2_process_skb+0x73d/0xad0 [qedf] [<00000000e0eb6893>] process_one_work+0x382/0x6c0 [<000000002dfd9e21>] worker_thread+0x57/0x5c0 [<00000000b648204f>] kthread+0x1a0/0x1c0 [<0000000072f5ab20>] ret_from_fork+0x35/0x40 [<000000001d5c05d8>] 0xffffffffffffffff Below is the log sequence which leads to memory leak. Here we get the RPORT_EV_READY and RPORT_EV_STOP back to back, which lead to overwrite the event RPORT_EV_READY by event RPORT_EV_STOP. Because of this, kref_count gets incremented by 1. kernel: host0: rport fffce5: Received PLOGI request kernel: host0: rport fffce5: Received PLOGI in INIT state kernel: host0: rport fffce5: Port is Ready kernel: host0: rport fffce5: Received PRLI request while in state Ready kernel: host0: rport fffce5: PRLI rspp type 8 active 1 passive 0 kernel: host0: rport fffce5: Received LOGO request while in state Ready kernel: host0: rport fffce5: Delete port kernel: host0: rport fffce5: Received PLOGI request kernel: host0: rport fffce5: Received PLOGI in state Delete - send busy kernel: host0: rport fffce5: work event 3 kernel: host0: rport fffce5: lld callback ev 3 kernel: host0: rport fffce5: work delete Link: https://lore.kernel.org/r/20200626094959.32151-1-jhasan@marvell.com Reviewed-by: Girish Basrur <gbasrur@marvell.com> Reviewed-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: libfc: Handling of extra krefJaved Hasan1-1/+3
[ Upstream commit 71f2bf85e90d938d4a9ef9dd9bfa8d9b0b6a03f7 ] Handling of extra kref which is done by lookup table in case rdata is already present in list. This issue was leading to memory leak. Trace from KMEMLEAK tool: unreferenced object 0xffff8888259e8780 (size 512): comm "kworker/2:1", pid 182614, jiffies 4433237386 (age 113021.971s) hex dump (first 32 bytes): 58 0a ec cf 83 88 ff ff 00 00 00 00 00 00 00 00 01 00 00 00 08 00 00 00 13 7d f0 1e 0e 00 00 10 backtrace: [<000000006b25760f>] fc_rport_recv_req+0x3c6/0x18f0 [libfc] [<00000000f208d994>] fc_lport_recv_els_req+0x120/0x8a0 [libfc] [<00000000a9c437b8>] fc_lport_recv+0xb9/0x130 [libfc] [<00000000ad5be37b>] qedf_ll2_process_skb+0x73d/0xad0 [qedf] [<00000000e0eb6893>] process_one_work+0x382/0x6c0 [<000000002dfd9e21>] worker_thread+0x57/0x5c0 [<00000000b648204f>] kthread+0x1a0/0x1c0 [<0000000072f5ab20>] ret_from_fork+0x35/0x40 [<000000001d5c05d8>] 0xffffffffffffffff Below is the log sequence which leads to memory leak. Here we get the nested "Received PLOGI request" for same port and this request leads to call the fc_rport_create() twice for the same rport. kernel: host1: rport fffce5: Received PLOGI request kernel: host1: rport fffce5: Received PLOGI in INIT state kernel: host1: rport fffce5: Port is Ready kernel: host1: rport fffce5: Received PRLI request while in state Ready kernel: host1: rport fffce5: PRLI rspp type 8 active 1 passive 0 kernel: host1: rport fffce5: Received LOGO request while in state Ready kernel: host1: rport fffce5: Delete port kernel: host1: rport fffce5: Received PLOGI request kernel: host1: rport fffce5: Received PLOGI in state Delete - send busy Link: https://lore.kernel.org/r/20200622101212.3922-2-jhasan@marvell.com Reviewed-by: Girish Basrur <gbasrur@marvell.com> Reviewed-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: cxlflash: Fix error return code in cxlflash_probe()Wei Yongjun1-0/+1
[ Upstream commit d0b1e4a638d670a09f42017a3e567dc846931ba8 ] Fix to return negative error code -ENOMEM from create_afu error handling case instead of 0, as done elsewhere in this function. Link: https://lore.kernel.org/r/20200428141855.88704-1-weiyongjun1@huawei.com Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: aacraid: Fix error handling paths in aac_probe_one()Christophe JAILLET1-4/+8
[ Upstream commit f7854c382240c1686900b2f098b36430c6f5047e ] If 'scsi_host_alloc()' or 'kcalloc()' fail, 'error' is known to be 0. Set it explicitly to -ENOMEM before branching to the error handling path. While at it, remove 2 useless assignments to 'error'. These values are overwridden a few lines later. Link: https://lore.kernel.org/r/20200412094039.8822-1-christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: qedi: Fix termination timeouts in session logoutNilesh Javali1-0/+3
[ Upstream commit b9b97e6903032ec56e6dcbe137a9819b74a17fea ] The destroy connection ramrod timed out during session logout. Fix the wait delay for graceful vs abortive termination as per the FW requirements. Link: https://lore.kernel.org/r/20200408064332.19377-7-mrangankar@marvell.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: hpsa: correct race condition in offload enabledDon Brace1-23/+57
[ Upstream commit 3e16e83a62edac7617bfd8dbb4e55d04ff6adbe1 ] Correct race condition where ioaccel is re-enabled before the raid_map is updated. For RAID_1, RAID_1ADM, and RAID 5/6 there is a BUG_ON called which is bad. - Change event thread to disable ioaccel only. Send all requests down the RAID path instead. - Have rescan thread handle offload_enable. - Since there is only one rescan allowed at a time, turning offload_enabled on/off should not be racy. Each handler queues up a rescan if one is already in progress. - For timing diagram, offload_enabled is initially off due to a change (transformation: splitmirror/remirror), ... otbe = offload_to_be_enabled oe = offload_enabled Time Event Rescan Completion Request Worker Worker Thread Thread ---- ------ ------ ---------- ------- T0 | | + UA | T1 | + rescan started | 0x3f | T2 + Event | | 0x0e | T3 + Ack msg | | | T4 | + if (!dev[i]->oe && | | T5 | | dev[i]->otbe) | | T6 | | get_raid_map | | T7 + otbe = 1 | | | T8 | | | | T9 | + oe = otbe | | T10 | | | + ioaccel request T11 * BUG_ON T0 - I/O completion with UA 0x3f 0x0e sets rescan flag. T1 - rescan worker thread starts a rescan. T2 - event comes in T3 - event thread starts and issues "Acknowledge" message ... T6 - rescan thread has bypassed code to reload new raid map. ... T7 - event thread runs and sets offload_to_be_enabled ... T9 - rescan thread turns on offload_enabled. T10- request comes in and goes down ioaccel path. T11- BUG_ON. - After the patch is applied, ioaccel_enabled can only be re-enabled in the re-scan thread. Link: https://lore.kernel.org/r/158472877894.14200.7077843399036368335.stgit@brunhilda Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Matt Perricone <matt.perricone@microsemi.com> Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: aacraid: Disabling TM path and only processing IOP resetSagar Biradar2-10/+26
[ Upstream commit bef18d308a2215eff8c3411a23d7f34604ce56c3 ] Fixes the occasional adapter panic when sg_reset is issued with -d, -t, -b and -H flags. Removal of command type HBA_IU_TYPE_SCSI_TM_REQ in aac_hba_send since iu_type, request_id and fib_flags are not populated. Device and target reset handlers are made to send TMF commands only when reset_state is 0. Link: https://lore.kernel.org/r/1581553771-25796-1-git-send-email-Sagar.Biradar@microchip.com Reviewed-by: Sagar Biradar <Sagar.Biradar@microchip.com> Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com> Signed-off-by: Balsundar P <balsundar.p@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: lpfc: Fix coverity errors in fmdi attribute handlingJames Smart2-88/+85
[ Upstream commit 4cb9e1ddaa145be9ed67b6a7de98ca705a43f998 ] Coverity reported a memory corruption error for the fdmi attributes routines: CID 15768 [Memory Corruption] Out-of-bounds access on FDMI Sloppy coding of the fmdi structures. In both the lpfc_fdmi_attr_def and lpfc_fdmi_reg_port_list structures, a field was placed at the start of payload that may have variable content. The field was given an arbitrary type (uint32_t). The code then uses the field name to derive an address, which it used in things such as memset and memcpy. The memset sizes or memcpy lengths were larger than the arbitrary type, thus coverity reported an error. Fix by replacing the arbitrary fields with the real field structures describing the payload. Link: https://lore.kernel.org/r/20200128002312.16346-8-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: lpfc: Fix RQ buffer leakage when no IOCBs availableJames Smart1-0/+4
[ Upstream commit 39c4f1a965a9244c3ba60695e8ff8da065ec6ac4 ] The driver is occasionally seeing the following SLI Port error, requiring reset and reinit: Port Status Event: ... error 1=0x52004a01, error 2=0x218 The failure means an RQ timeout. That is, the adapter had received asynchronous receive frames, ran out of buffer slots to place the frames, and the driver did not replenish the buffer slots before a timeout occurred. The driver should not be so slow in replenishing buffers that a timeout can occur. When the driver received all the frames of a sequence, it allocates an IOCB to put the frames in. In a situation where there was no IOCB available for the frame of a sequence, the RQ buffer corresponding to the first frame of the sequence was not returned to the FW. Eventually, with enough traffic encountering the situation, the timeout occurred. Fix by releasing the buffer back to firmware whenever there is no IOCB for the first frame. [mkp: typo] Link: https://lore.kernel.org/r/20200128002312.16346-2-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: ufs: Fix a race condition in the tracing codeBart Van Assche1-1/+1
[ Upstream commit eacf36f5bebde5089dddb3d5bfcbeab530b01f8a ] Starting execution of a command before tracing a command may cause the completion handler to free data while it is being traced. Fix this race by tracing a command before it is submitted. Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20191224220248.30138-5-bvanassche@acm.org Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: ufs: Make ufshcd_add_command_trace() easier to readBart Van Assche1-6/+6
[ Upstream commit e4d2add7fd5bc64ee3e388eabe6b9e081cb42e11 ] Since the lrbp->cmd expression occurs multiple times, introduce a new local variable to hold that pointer. This patch does not change any functionality. Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20191224220248.30138-3-bvanassche@acm.org Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Can Guo <cang@codeaurora.org> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: pm80xx: Cleanup command when a reset times outpeter chang1-13/+37
[ Upstream commit 51c1c5f6ed64c2b65a8cf89dac136273d25ca540 ] Added the fix so the if driver properly sent the abort it tries to remove it from the firmware's list of outstanding commands regardless of the abort status. This means that the task gets freed 'now' rather than possibly getting freed later when the scsi layer thinks it's leaked but still valid. Link: https://lore.kernel.org/r/20191114100910.6153-10-deepak.ukey@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: peter chang <dpf@google.com> Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: lpfc: Fix kernel crash at lpfc_nvme_info_show during remote port bounceJames Smart1-17/+18
[ Upstream commit 6c1e803eac846f886cd35131e6516fc51a8414b9 ] When reading sysfs nvme_info file while a remote port leaves and comes back, a NULL pointer is encountered. The issue is due to ndlp list corruption as the the nvme_info_show does not use the same lock as the rest of the code. Correct by removing the rcu_xxx_lock calls and replace by the host_lock and phba->hbaLock spinlocks that are used by the rest of the driver. Given we're called from sysfs, we are safe to use _irq rather than _irqsave. Link: https://lore.kernel.org/r/20191105005708.7399-4-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: fnic: fix use after freePan Bian1-1/+2
[ Upstream commit ec990306f77fd4c58c3b27cc3b3c53032d6e6670 ] The memory chunk io_req is released by mempool_free. Accessing io_req->start_time will result in a use after free bug. The variable start_time is a backup of the timestamp. So, use start_time here to avoid use after free. Link: https://lore.kernel.org/r/1572881182-37664-1-git-send-email-bianpan2016@163.com Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01scsi: aacraid: fix illegal IO beyond last LBABalsundar P1-4/+4
[ Upstream commit c86fbe484c10b2cd1e770770db2d6b2c88801c1d ] The driver fails to handle data when read or written beyond device reported LBA, which triggers kernel panic Link: https://lore.kernel.org/r/1571120524-6037-2-git-send-email-balsundar.p@microsemi.com Signed-off-by: Balsundar P <balsundar.p@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-23scsi: lpfc: Fix FLOGI/PLOGI receive race condition in pt2pt discoveryJames Smart1-1/+3
[ Upstream commit 7b08e89f98cee9907895fabb64cf437bc505ce9a ] The driver is unable to successfully login with remote device. During pt2pt login, the driver completes its FLOGI request with the remote device having WWN precedence. The remote device issues its own (delayed) FLOGI after accepting the driver's and, upon transmitting the FLOGI, immediately recognizes it has already processed the driver's FLOGI thus it transitions to sending a PLOGI before waiting for an ACC to its FLOGI. In the driver, the FLOGI is received and an ACC sent, followed by the PLOGI being received and an ACC sent. The issue is that the PLOGI reception occurs before the response from the adapter from the FLOGI ACC is received. Processing of the PLOGI sets state flags to perform the REG_RPI mailbox command and proceed with the rest of discovery on the port. The same completion routine used by both FLOGI and PLOGI is generic in nature. One of the things it does is clear flags, and those flags happen to drive the rest of discovery. So what happened was the PLOGI processing set the flags, the FLOGI ACC completion cleared them, thus when the PLOGI ACC completes it doesn't see the flags and stops. Fix by modifying the generic completion routine to not clear the rest of discovery flag (NLP_ACC_REGLOGIN) unless the completion is also associated with performing a mailbox command as part of its handling. For things such as FLOGI ACC, there isn't a subsequent action to perform with the adapter, thus there is no mailbox cmd ptr. PLOGI ACC though will perform REG_RPI upon completion, thus there is a mailbox cmd ptr. Link: https://lore.kernel.org/r/20200828175332.130300-3-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-23scsi: libfc: Fix for double free()Javed Hasan1-2/+0
[ Upstream commit 5a5b80f98534416b3b253859897e2ba1dc241e70 ] Fix for '&fp->skb' double free. Link: https://lore.kernel.org/r/20200825093940.19612-1-jhasan@marvell.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-23scsi: pm8001: Fix memleak in pm8001_exec_internal_task_abortDinghao Liu1-1/+1
[ Upstream commit ea403fde7552bd61bad6ea45e3feb99db77cb31e ] When pm8001_tag_alloc() fails, task should be freed just like it is done in the subsequent error paths. Link: https://lore.kernel.org/r/20200823091453.4782-1-dinghao.liu@zju.edu.cn Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-23scsi: qla2xxx: Reduce holding sess_lock to prevent CPU lock-upQuinn Tran5-25/+33
commit 0aca77843e2803bf4fab1598b7891c56c16be979 upstream. - Reduce sess_lock holding to prevent CPU Lock up. sess_lock was held across fc_port registration and deletion. These calls can be blocked by upper layer. Sess_lock is also being accessed by interrupt thread. - Reduce number of loops in processing work_list to prevent kernel complaint of CPU lockup or holding sess_lock. Reported-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Tested-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Fixes: 9ba1cb25c151 ("scsi: qla2xxx: Remove all rports if fabric scan retry fails") Link: https://lore.kernel.org/linux-scsi/D01377DD-2E86-427B-BA0C-8D7649E37870@oracle.com/T/#t Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23scsi: qla2xxx: Move rport registration out of internal work_listQuinn Tran5-39/+147
commit cd4ed6b470f1569692b5d0d295b207f870570829 upstream. Currently, the rport registration is being called from a single work element that is used to process QLA internal "work_list". This work_list is meant for quick and simple task (ie no sleep). The Rport registration process sometime can be delayed by upper layer. This causes back pressure with the internal queue where other jobs are unable to move forward. This patch will schedule the registration process with a new work element (fc_port.reg_work). While the RPort is being registered, the current state of the fcport will not move forward until the registration is done. If the state of the fabric has changed, a new field/next_disc_state will record the next action on whether to 'DELETE' or 'Reverify the session/ADISC'. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23scsi: qla2xxx: Update rscn_rcvd field to more meaningful scan_neededQuinn Tran3-8/+8
commit cb873ba4002095d1e2fc60521bc4d860c7b72b92 upstream. Rename rscn_rcvd field to scan_needed to be more meaningful. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-17scsi: libsas: Set data_dir as DMA_NONE if libata marks qc as NODATALuo Jiaxing1-1/+4
[ Upstream commit 53de092f47ff40e8d4d78d590d95819d391bf2e0 ] It was discovered that sdparm will fail when attempting to disable write cache on a SATA disk connected via libsas. In the ATA command set the write cache state is controlled through the SET FEATURES operation. This is roughly corresponds to MODE SELECT in SCSI and the latter command is what is used in the SCSI-ATA translation layer. A subtle difference is that a MODE SELECT carries data whereas SET FEATURES is defined as a non-data command in ATA. Set the DMA data direction to DMA_NONE if the requested ATA command is identified as non-data. [mkp: commit desc] Fixes: fa1c1e8f1ece ("[SCSI] Add SATA support to libsas") Link: https://lore.kernel.org/r/1598426666-54544-1-git-send-email-luojiaxing@huawei.com Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command"Saurav Kashyap1-8/+0
[ Upstream commit de7e6194301ad31c4ce95395eb678e51a1b907e5 ] FCoE adapter initialization failed for ISP8021 with the following patch applied. In addition, reproduction of the issue the patch originally tried to address has been unsuccessful. This reverts commit 3cb182b3fa8b7a61f05c671525494697cba39c6a. Link: https://lore.kernel.org/r/20200806111014.28434-11-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03scsi: qla2xxx: Fix null pointer access during disconnect from subsystemQuinn Tran1-0/+5
[ Upstream commit 83949613fac61e8e37eadf8275bf072342302f4e ] NVMEAsync command is being submitted to QLA while the same NVMe controller is in the middle of reset. The reset path has deleted the association and freed aen_op->fcp_req.private. Add a check for this private pointer before issuing the command. ... 6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e [exception RIP: qla_nvme_post_cmd+394] RIP: ffffffffc0d012ba RSP: ffffb656ca11fd98 RFLAGS: 00010206 RAX: ffff8fb039eda228 RBX: ffff8fb039eda200 RCX: 00000000000da161 RDX: ffffffffc0d4d0f0 RSI: ffffffffc0d26c9b RDI: ffff8fb039eda220 RBP: 0000000000000013 R8: ffff8fb47ff6aa80 R9: 0000000000000002 R10: 0000000000000000 R11: ffffb656ca11fdc8 R12: ffff8fb27d04a3b0 R13: ffff8fc46dd98a58 R14: 0000000000000000 R15: ffff8fc4540f0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc] 8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc] 9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core] 10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437 11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef 12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402 13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255 -- PID: 37824 TASK: ffff8fb033063d80 CPU: 20 COMMAND: "kworker/u97:451" 0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3 1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8 2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed 3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf 4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5 5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core] 6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc] 7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437 8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50 9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402 10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255 Link: https://lore.kernel.org/r/20200806111014.28434-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03scsi: qla2xxx: Check if FW supports MQ before enablingSaurav Kashyap1-0/+5
[ Upstream commit dffa11453313a115157b19021cc2e27ea98e624c ] OS boot during Boot from SAN was stuck at dracut emergency shell after enabling NVMe driver parameter. For non-MQ support the driver was enabling MQ. Add a check to confirm if FW supports MQ. Link: https://lore.kernel.org/r/20200806111014.28434-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03scsi: ufs: Clean up completed request without interrupt notificationStanley Chu1-1/+2
[ Upstream commit b10178ee7fa88b68a9e8adc06534d2605cb0ec23 ] If somehow no interrupt notification is raised for a completed request and its doorbell bit is cleared by host, UFS driver needs to cleanup its outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally in the following scenario: After ufshcd_abort() returns, this request will be requeued by SCSI layer with its outstanding bit set. Any future completed request will trigger ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At this time the "abnormal outstanding bit" will be detected and the "requeued request" will be chosen to execute request post-processing flow. This is wrong because this request is still "alive". Link: https://lore.kernel.org/r/20200811141859.27399-2-huobean@gmail.com Reviewed-by: Can Guo <cang@codeaurora.org> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03scsi: ufs: Improve interrupt handling for shared interruptsAdrian Hunter1-3/+3
[ Upstream commit 127d5f7c4b653b8be5eb3b2c7bbe13728f9003ff ] For shared interrupts, the interrupt status might be zero, so check that first. Link: https://lore.kernel.org/r/20200811133936.19171-2-adrian.hunter@intel.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunt