Age | Commit message (Collapse) | Author | Files | Lines |
|
When a remote port is disconnected and disappears, its node structure
(ndlp) stays allocated and on a vport node list. While on the list it can
be matched, thus requires validation checks on state to be added in
numerous code paths. If the node comes back, its possible for there to be
multiple node structures for the same device on the vport node list. There
is no reason to keep the node structure around after it is no longer in
existence, and the current implementation creates problems for itself
(multiple nodes) and lots of unnecessary code for state validation.
Additionally, the reference taking on the node structure didn't follow the
normal model used by the kernel kref api. It included lots of odd logic to
match state with reference count. The combination of this odd logic plus
the way it was implicitly used in the discovery engine made its reference
taking implementation suspect and extremely hard to follow.
Change the driver such that the reference taking routines are now normal
ref increments/decrements and callout on refcount=0.
With this in place, the rework can be done such that the node structure is
fully removed and deallocated when the remote port no longer exists and all
references are removed. This removal logic, and the basic ref counting are
intrically tied, thus in a single patch.
Link: https://lore.kernel.org/r/20201115192646.12977-2-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fixes the following W=1 kernel build warning(s):
drivers/scsi/lpfc/lpfc_nvme.c: In function ‘lpfc_nvme_ls_abort’:
drivers/scsi/lpfc/lpfc_nvme.c:943:19: warning: variable ‘phba’ set but not used [-Wunused-but-set-variable]
drivers/scsi/lpfc/lpfc_nvme.c:256: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_create_queue'
drivers/scsi/lpfc/lpfc_nvme.c:804: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:804: warning: Excess function parameter 'nvme_rport' description in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:1312: warning: Function parameter or member 'lpfc_ncmd' not described in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1312: warning: Excess function parameter 'lpfcn_cmd' description in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1416: warning: Function parameter or member 'lpfc_ncmd' not described in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1416: warning: Excess function parameter 'lpfcn_cmd' description in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1594: warning: bad line: indicated in @lpfc_nvme_rport.
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Function parameter or member 'pnvme_fcreq' not described in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1605: warning: Excess function parameter 'lpfc_nvme_fcreq' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1852: warning: Function parameter or member 'abts_cmpl' not described in 'lpfc_nvme_abort_fcreq_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1852: warning: Excess function parameter 'rspiocb' description in 'lpfc_nvme_abort_fcreq_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Function parameter or member 'pnvme_fcreq' not described in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1888: warning: Excess function parameter 'lpfc_nvme_fcreq' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:2089: warning: Function parameter or member 'ndlp' not described in 'lpfc_get_nvme_buf'
drivers/scsi/lpfc/lpfc_nvme.c:2089: warning: Function parameter or member 'idx' not described in 'lpfc_get_nvme_buf'
drivers/scsi/lpfc/lpfc_nvme.c:2089: warning: Function parameter or member 'expedite' not described in 'lpfc_get_nvme_buf'
drivers/scsi/lpfc/lpfc_nvme.c:2193: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_create_localport'
drivers/scsi/lpfc/lpfc_nvme.c:2326: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_destroy_localport'
drivers/scsi/lpfc/lpfc_nvme.c:2326: warning: Excess function parameter 'pnvme' description in 'lpfc_nvme_destroy_localport'
drivers/scsi/lpfc/lpfc_nvme.c:2544: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_rescan_port'
drivers/scsi/lpfc/lpfc_nvme.c:2544: warning: Function parameter or member 'ndlp' not described in 'lpfc_nvme_rescan_port'
Link: https://lore.kernel.org/r/20201102142359.561122-13-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fixes the following W=1 kernel build warning(s):
drivers/scsi/lpfc/lpfc_nvme.c: In function ‘lpfc_nvme_ls_abort’:
drivers/scsi/lpfc/lpfc_nvme.c:943:19: warning: variable ‘phba’ set but not used [-Wunused-but-set-variable]
Link: https://lore.kernel.org/r/20201102142359.561122-11-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The following call trace was seen during HBA reset testing:
BUG: scheduling while atomic: swapper/2/0/0x10000100
...
Call Trace:
dump_stack+0x19/0x1b
__schedule_bug+0x64/0x72
__schedule+0x782/0x840
__cond_resched+0x26/0x30
_cond_resched+0x3a/0x50
mempool_alloc+0xa0/0x170
lpfc_unreg_rpi+0x151/0x630 [lpfc]
lpfc_sli_abts_recover_port+0x171/0x190 [lpfc]
lpfc_sli4_abts_err_handler+0xb2/0x1f0 [lpfc]
lpfc_sli4_io_xri_aborted+0x256/0x300 [lpfc]
lpfc_sli4_sp_handle_abort_xri_wcqe.isra.51+0xa3/0x190 [lpfc]
lpfc_sli4_fp_handle_cqe+0x89/0x4d0 [lpfc]
__lpfc_sli4_process_cq+0xdb/0x2e0 [lpfc]
__lpfc_sli4_hba_process_cq+0x41/0x100 [lpfc]
lpfc_cq_poll_hdler+0x1a/0x30 [lpfc]
irq_poll_softirq+0xc7/0x100
__do_softirq+0xf5/0x280
call_softirq+0x1c/0x30
do_softirq+0x65/0xa0
irq_exit+0x105/0x110
do_IRQ+0x56/0xf0
common_interrupt+0x16a/0x16a
With the conversion to blk_io_poll for better interrupt latency in normal
cases, it introduced this code path, executed when I/O aborts or logouts
are seen, which attempts to allocate memory for a mailbox command to be
issued. The allocation is GFP_KERNEL, thus it could attempt to sleep.
Fix by creating a work element that performs the event handling for the
remote port. This will have the mailbox commands and other items performed
in the work element, not the irq. A much better method as the "irq" routine
does not stall while performing all this deep handling code.
Ensure that allocation failures are handled and send LOGO on failure.
Additionally, enlarge the mailbox memory pool to reduce the possibility of
additional allocation in this path.
Link: https://lore.kernel.org/r/20201020202719.54726-3-james.smart@broadcom.com
Fixes: 317aeb83c92b ("scsi: lpfc: Add blk_io_poll support for latency improvment")
Cc: <stable@vger.kernel.org> # v5.9+
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
|
Either due to API slippage before the driver was mainlined or copy/paste
errors.
Fixes the following W=1 kernel build warning(s):
drivers/scsi/lpfc/lpfc_nvme.c:254: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_create_queue'
drivers/scsi/lpfc/lpfc_nvme.c:254: warning: Function parameter or member 'qsize' not described in 'lpfc_nvme_create_queue'
drivers/scsi/lpfc/lpfc_nvme.c:254: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_create_queue'
drivers/scsi/lpfc/lpfc_nvme.c:311: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_delete_queue'
drivers/scsi/lpfc/lpfc_nvme.c:311: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_delete_queue'
drivers/scsi/lpfc/lpfc_nvme.c:689: warning: Function parameter or member 'gen_req_cmp' not described in '__lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:801: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:801: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:801: warning: Function parameter or member 'pnvme_lsreq' not described in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:801: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:801: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_ls_req'
drivers/scsi/lpfc/lpfc_nvme.c:937: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_ls_abort'
drivers/scsi/lpfc/lpfc_nvme.c:937: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_ls_abort'
drivers/scsi/lpfc/lpfc_nvme.c:937: warning: Function parameter or member 'pnvme_lsreq' not described in 'lpfc_nvme_ls_abort'
drivers/scsi/lpfc/lpfc_nvme.c:937: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_ls_abort'
drivers/scsi/lpfc/lpfc_nvme.c:937: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_ls_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1075: warning: Function parameter or member 'phba' not described in 'lpfc_nvme_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1075: warning: Function parameter or member 'pwqeIn' not described in 'lpfc_nvme_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1075: warning: Function parameter or member 'wcqe' not described in 'lpfc_nvme_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1075: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1075: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1075: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Function parameter or member 'lpfc_ncmd' not described in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Function parameter or member 'pnode' not described in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Function parameter or member 'cstat' not described in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Excess function parameter 'lpfc_nvme_fcreq' description in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1313: warning: Excess function parameter 'hw_queue_handle' description in 'lpfc_nvme_prep_io_cmd'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Function parameter or member 'lpfc_ncmd' not described in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Excess function parameter 'lpfc_nvme_fcreq' description in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1420: warning: Excess function parameter 'hw_queue_handle' description in 'lpfc_nvme_prep_io_dma'
drivers/scsi/lpfc/lpfc_nvme.c:1598: warning: bad line: indicated in @lpfc_nvme_rport.
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Function parameter or member 'pnvme_fcreq' not described in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1609: warning: Excess function parameter 'lpfc_nvme_fcreq' description in 'lpfc_nvme_fcp_io_submit'
drivers/scsi/lpfc/lpfc_nvme.c:1856: warning: Function parameter or member 'abts_cmpl' not described in 'lpfc_nvme_abort_fcreq_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1856: warning: Excess function parameter 'rspiocb' description in 'lpfc_nvme_abort_fcreq_cmpl'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Function parameter or member 'pnvme_lport' not described in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Function parameter or member 'pnvme_rport' not described in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Function parameter or member 'pnvme_fcreq' not described in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Excess function parameter 'lpfc_pnvme' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Excess function parameter 'lpfc_nvme_lport' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Excess function parameter 'lpfc_nvme_rport' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:1892: warning: Excess function parameter 'lpfc_nvme_fcreq' description in 'lpfc_nvme_fcp_abort'
drivers/scsi/lpfc/lpfc_nvme.c:2093: warning: Function parameter or member 'ndlp' not described in 'lpfc_get_nvme_buf'
drivers/scsi/lpfc/lpfc_nvme.c:2093: warning: Function parameter or member 'idx' not described in 'lpfc_get_nvme_buf'
drivers/scsi/lpfc/lpfc_nvme.c:2093: warning: Function parameter or member 'expedite' not described in 'lpfc_get_nvme_buf'
drivers/scsi/lpfc/lpfc_nvme.c:2197: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_create_localport'
drivers/scsi/lpfc/lpfc_nvme.c:2330: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_destroy_localport'
drivers/scsi/lpfc/lpfc_nvme.c:2330: warning: Excess function parameter 'pnvme' description in 'lpfc_nvme_destroy_localport'
drivers/scsi/lpfc/lpfc_nvme.c:2543: warning: Function parameter or member 'vport' not described in 'lpfc_nvme_rescan_port'
drivers/scsi/lpfc/lpfc_nvme.c:2543: warning: Function parameter or member 'ndlp' not described in 'lpfc_nvme_rescan_port'
Link: https://lore.kernel.org/r/20200713080001.128044-21-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
nvme-fc allows the driver to specify a default devloss_tmo value when
registering the remote port. The lpfc driver is currently not doing so,
although it is implementing a driver internal node loss value of 30s which
also is used on the SCSI FC remote port. As no devloss_tmo is set, the
nvme-fc transport defaults to 60s. So there are competing timers.
Additionally, due to the competing timers, it is possible the NVMe rport is
removed but the SCSI rport remains. It is possible that the SCSI FC rport,
which was registered for the NVMe port even if it doesn't utilize the SCSI
protocol, had been tuned to not match either the 60s (nvme-fc default) or
30s (lldd default), it gets out of whack. The lldd will defer to the SCSI
FC rport as long as the SCSI FC rport has not had its devloss_tmo expire.
Correct the situation by specifying a default devloss_tmo to the nvme-fc
transport when registering the rport. Take the value from the SCSI FC
rport if it exists, otherwise use the driver default.
Link: https://lore.kernel.org/r/20200714211412.11773-1-jsmart2021@gmail.com
Tested-by: Martin George <Martin.George@netapp.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The current logging methods typically end up requesting a reproduction with
a different logging level set to figure out what happened. This was mainly
by design to not clutter the kernel log messages with things that were
typically not interesting and the messages themselves could cause other
issues.
When looking to make a better system, it was seen that in many cases when
more data was wanted was when another message, usually at KERN_ERR level,
was logged. And in most cases, what the additional logging that was then
enabled was typically. Most of these areas fell into the discovery machine.
Based on this summary, the following design has been put in place: The
driver will maintain an internal log (256 elements of 256 bytes). The
"additional logging" messages that are usually enabled in a reproduction
will be changed to now log all the time to the internal log. A new logging
level is defined - LOG_TRACE_EVENT. When this level is set (it is not by
default) and a message marked as KERN_ERR is logged, all the messages in
the internal log will be dumped to the kernel log before the KERN_ERR
message is logged.
There is a timestamp on each message added to the internal log. However,
this timestamp is not converted to wall time when logged. The value of the
timestamp is solely to give a crude time reference for the messages.
Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull SCSI updates from James Bottomley:
:This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
of other minor updates.
There are no major core changes in this series apart from a
refactoring in scsi_lib.c"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes
scsi: cxgb3i: Fix some leaks in init_act_open()
scsi: ibmvscsi: Make some functions static
scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim
scsi: ufs: Fix WriteBooster flush during runtime suspend
scsi: ufs: Fix index of attributes query for WriteBooster feature
scsi: ufs: Allow WriteBooster on UFS 2.2 devices
scsi: ufs: Remove unnecessary memset for dev_info
scsi: ufs-qcom: Fix scheduling while atomic issue
scsi: mpt3sas: Fix reply queue count in non RDPQ mode
scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd()
scsi: vhost: Notify TCM about the maximum sg entries supported per command
scsi: qla2xxx: Remove return value from qla_nvme_ls()
scsi: qla2xxx: Remove an unused function
scsi: iscsi: Register sysfs for iscsi workqueue
scsi: scsi_debug: Parser tables and code interaction
scsi: core: Refactor scsi_mq_setup_tags function
scsi: core: Fix incorrect usage of shost_for_each_device
scsi: qla2xxx: Fix endianness annotations in source files
...
|
|
A static checker reported the following issue:
drivers/scsi/lpfc/lpfc_nvmet.c:1366 lpfc_nvmet_ls_abort()
warn: 'ret' can be either negative or positive
The comment indicates a non-zero value indicates error in the
form of -Exxx, but the code is returning "1".
Fix the code to return -EINVAL to be compliant to comment.
Fixes: e96a22b0b7c2 ("lpfc: Refactor Send LS Abort support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Now that common helpers exist, add the ability to receive NVME LS requests
to the driver. New requests will be delivered to the transport by
nvme_fc_rcv_ls_req().
In order to complete the LS, add support for Send LS Response and send
LS response completion handling to the driver.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Send LS Abort support is needed when Send LS Request is supported.
Currently, the ability to abort an NVME LS request is limited to the nvme
(host) side of the driver. In preparation of both the nvme and nvmet sides
supporting Send LS Abort, rework the existing ls_req abort routines such
that there is common code that can be used by both sides.
While refactoring it was seen the logic in the abort routine was incorrect.
It attempted to abort all NVME LS's on the indicated port. As such, the
routine was reworked to abort only the NVME LS request that was specified.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, the ability to send an NVME LS request is limited to the nvme
(host) side of the driver. In preparation of both the nvme and nvmet sides
support Send LS Request, rework the existing send ls_req and ls_req
completion routines such that there is common code that can be used by
both sides.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In preparation for supporting both intiator mode and target mode
receiving NVME LS's, commonize the existing NVME LS request receive
handling found in the base driver and in the nvmet side.
Using the original lpfc_nvmet_unsol_ls_event() and
lpfc_nvme_unsol_ls_buffer() routines as a templates, commonize the
reception of an NVME LS request. The common routine will validate the LS
request, that it was received from a logged-in node, and allocate a
lpfc_async_xchg_ctx that is used to manage the LS request. The role of
the port is then inspected to determine which handler is to receive the
LS - nvme or nvmet. As such, the nvmet handler is tied back in. A handler
is created in nvme and is stubbed out.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A lot of files in lpfc include nvme headers, building up relationships that
require a file to change for its headers when there is no other change
necessary. It would be better to localize the nvme headers.
There is also no need for separate nvme (initiator) and nvmet (tgt)
header files.
Refactor the inclusion of nvme headers so that all nvme items are
included by lpfc_nvme.h
Merge lpfc_nvmet.h into lpfc_nvme.h so that there is a single header used
by both the nvme and nvmet sides. This prepares for structure sharing
between the two roles. Prep to add shared function prototypes for upcoming
shared routines.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The lldd rebinds the ndlp with rport during a nvme rport registration (via
nvme_fc_register_remoteport). If rport & ndlp pointers are same as the
previous one, the lldd will re-use the ndlp and rport association without
re-initialization. This assumption is incorrect. The lldd should be
ignorant of whether the returned rport pointer is new or not, and should
always assume it is new.
Remove the re-binding code, always assumes that rport pointer received from
transport is a new pointer.
Link: https://lore.kernel.org/r/20200501214310.91713-4-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During code reviews several instances of duplicate module unloading checks
were found.
Remove the duplicate checks.
Link: https://lore.kernel.org/r/20200421203354.49420-1-jsmart2021@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull more SCSI updates from James Bottomley:
"This is a batch of changes that didn't make it in the initial pull
request because the lpfc series had to be rebased to redo an incorrect
split.
It's basically driver updates to lpfc, target, bnx2fc and ufs with the
rest being minor updates except the sr_block_release one which fixes a
use after free introduced by the removal of the global mutex in the
first patch set"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (35 commits)
scsi: core: Add DID_ALLOC_FAILURE and DID_MEDIUM_ERROR to hostbyte_table
scsi: ufs: Use ufshcd_config_pwr_mode() when scaling gear
scsi: bnx2fc: fix boolreturn.cocci warnings
scsi: zfcp: use fallthrough;
scsi: aacraid: do not overwrite retval in aac_reset_adapter()
scsi: sr: Fix sr_block_release()
scsi: aic7xxx: Remove more FreeBSD-specific code
scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
scsi: ufs: set device as active power mode after resetting device
scsi: iscsi: Report unbind session event when the target has been removed
scsi: lpfc: Change default SCSI LUN QD to 64
scsi: libfc: rport state move to PLOGI if all PRLI retry exhausted
scsi: libfc: If PRLI rejected, move rport to PLOGI state
scsi: bnx2fc: Update the driver version to 2.12.13
scsi: bnx2fc: Fix SCSI command completion after cleanup is posted
scsi: bnx2fc: Process the RQE with CQE in interrupt context
scsi: target: use the stack for XCOPY passthrough cmds
scsi: target: increase XCOPY I/O size
scsi: target: avoid per-loop XCOPY buffer allocations
scsi: target: drop xcopy DISK BLOCK LENGTH debug
...
|
|
The original patch was to resolve the lldd being able to be unloaded
while being used to talk to the boot device of the system. However, the
end result of the original patch is that any driver unload while a nvme
controller is live via the lldd is now being prohibited. Given the module
reference, the module teardown routine can't be called, thus there's no
way, other than manual actions to terminate the controllers.
Fixes: 863fbae929c7 ("nvme_fc: add module to ops template to allow module references")
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Currently driver ktime stats, measuring code paths, is NVME-specific.
Convert the stats routines such that the code paths are generic, providing
status for NVME and SCSI. Added ktime stat calls in SCSI queuecommand and
cmpl routines.
Link: https://lore.kernel.org/r/20200322181304.37655-11-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The cpu io statistics were capped by a hard define limit of 128. This
effectively was a max number of CPUs, not an actual CPU count, nor actual
CPU numbers which can be even larger than both of those values. This made
stats off/misleading and on large CPU count systems, wrong.
Fix the stats so that all CPUs can have a stats struct. Fix the looping
such that it loops by hdwq, finds CPUs that used the hdwq, and sum the
stats, then display.
Link: https://lore.kernel.org/r/20200322181304.37655-9-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Kernel is crashing with the following stacktrace:
BUG: unable to handle kernel NULL pointer dereference at
00000000000005bc
IP: lpfc_nvme_register_port+0x1a8/0x3a0 [lpfc]
...
Call Trace:
lpfc_nlp_state_cleanup+0x2b2/0x500 [lpfc]
lpfc_nlp_set_state+0xd7/0x1a0 [lpfc]
lpfc_cmpl_prli_prli_issue+0x1f7/0x450 [lpfc]
lpfc_disc_state_machine+0x7a/0x1e0 [lpfc]
lpfc_cmpl_els_prli+0x16f/0x1e0 [lpfc]
lpfc_sli_sp_handle_rspiocb+0x5b2/0x690 [lpfc]
lpfc_sli_handle_slow_ring_event_s4+0x182/0x230 [lpfc]
lpfc_do_work+0x87f/0x1570 [lpfc]
kthread+0x10d/0x130
ret_from_fork+0x35/0x40
During target side fault injections, it is possible to hit the
NLP_WAIT_FOR_UNREG case in lpfc_nvme_remoteport_delete. A prior commit
fixed a rebind and delete race condition, but called lpfc_nlp_put
unconditionally. This triggered a deletion and the crash.
Fix by movng nlp_put to inside the NLP_WAIT_FOR_UNREG case, where the nlp
will be being unregistered/removed. Leave the reference if the flag isn't
set.
Link: https://lore.kernel.org/r/20200322181304.37655-8-jsmart2021@gmail.com
Fixes: b15bd3e6212e ("scsi: lpfc: Fix nvme remoteport registration race conditions")
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull block fixes from Jens Axboe:
- stable fix for the bi_size overflow. Not a corruption issue, but a
case wher we could merge but disallowed (Andreas)
- NVMe pull request via Keith, with various fixes.
- MD pull request from Song.
- Merge window regression fix for the rq passthrough stats (Logan)
- Remove unused blkcg_drain_queue() function (Guoqing)
* tag 'for-linus-20191212' of git://git.kernel.dk/linux-block:
blk-cgroup: remove blkcg_drain_queue
block: fix NULL pointer dereference in account statistics with IDE
md: make sure desc_nr less than MD_SB_DISKS
md: raid1: check rdev before reference in raid1_sync_request func
raid5: need to set STRIPE_HANDLE for batch head
block: fix "check bi_size overflow before merge"
nvme/pci: Fix read queue count
nvme/pci Limit write queue sizes to possible cpus
nvme/pci: Fix write and poll queue types
nvme/pci: Remove last_cq_head
nvme: Namepace identification descriptor list is optional
nvme-fc: fix double-free scenarios on hw queues
nvme: else following return is not needed
nvme: add error message on mismatching controller ids
nvme_fc: add module to ops template to allow module references
nvmet-loop: Avoid preallocating big SGL for data
nvme-fc: Avoid preallocating big SGL for data
nvme-rdma: Avoid preallocating big SGL for data
|
|
In nvme-fc: it's possible to have connected active controllers
and as no references are taken on the LLDD, the LLDD can be
unloaded. The controller would enter a reconnect state and as
long as the LLDD resumed within the reconnect timeout, the
controller would resume. But if a namespace on the controller
is the root device, allowing the driver to unload can be problematic.
To reload the driver, it may require new io to the boot device,
and as it's no longer connected we get into a catch-22 that
eventually fails, and the system locks up.
Fix this issue by taking a module reference for every connected
controller (which is what the core layer did to the transport
module). Reference is cleared when the controller is removed.
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Driver is setting the initiator nvme template with a max hw queues value of
the present cpu count which is odd. It should be registering the number of
hdwq queues (queues created on the adapter).
Change to set nvme tempate, in all cases, to the number of hardware queues.
Link: https://lore.kernel.org/r/20191111230401.12958-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Slightly rework some error check code paths for better streamlining.
Added compiler unlikely hints to allow slightly better optimization of the
fast-path.
Removed a few pointer checks that were obviously already valid.
Link: https://lore.kernel.org/r/20191018211832.7917-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When the port, running as a nvme target, receives an ABTS, it submits
commands to the adapter to Abort i/o outstanding in the adapter. The Abort
command formatting routine left a command field set to zero, which
instructs the adapter to generate an ABTS on the wire as part of cleaning
up the I/O. This is common operation for an initiator, but not for a
target.
Fix the driver to check whether an ABTS had been received for the I/O, and
if so, change the Abort command formatting so that the ABTS generation is
disabled (IA=1). No need to ABTS it when the other side already has.
Also refactored the code such that there is a single routine being used for
nvme or nvmet ABORT requests, and IA is an argument.
Link: https://lore.kernel.org/r/20190922035906.10977-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently, each hardware queue, typically allocated per-cpu, consists of a
WQ/CQ pair per protocol. Meaning if both SCSI and NVMe are supported 2
WQ/CQ pairs will exist for the hardware queue. Separate queues are
unnecessary. The current implementation wastes memory backing the 2nd set
of queues, and the use of double the SLI-4 WQ/CQ's means less hardware
queues can be supported which means there may not always be enough to have
a pair per cpu. If there is only 1 pair per cpu, more cpu's may get their
own WQ/CQ.
Rework the implementation to use a single WQ/CQ pair by both protocols.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
FC-NVMe-2 added support for sequence level error recovery in the FC-NVME
protocol. This allows for the detection of errors and lost frames and
immediate retransmission of data to avoid exchange termination, which
escalates into NVMeoFC connection and association failures. A significant
RAS improvement.
The driver is modified to indicate support for SLER in the NVMe PRLI is
issues and to check for support in the PRLI response. When both sides
support it, the driver will set a bit in the WQE to enable the recovery
behavior on the exchange. The adapter will take care of all detection and
retransmission.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Typical SLI-4 hardware supports up to 2 4KB pages to be registered per XRI
to contain the exchanges Scatter/Gather List. This caps the number of SGL
elements that can be in the SGL. There are not extensions to extend the
list out of the 2 pages.
The G7 hardware adds a SGE type that allows the SGL to be vectored to a
different scatter/gather list segment. And that segment can contain a SGE
to go to another segment and so on. The initial segment must still be
pre-registered for the XRI, but it can be a much smaller amount (256Bytes)
as it can now be dynamically grown. This much smaller allocation can
handle the SG list for most normal I/O, and the dynamic aspect allows it to
support many MB's if needed.
The implementation creates a pool which contains "segments" and which is
initially sized to hold the initial small segment per xri. If an I/O
requires additional segments, they are allocated from the pool. If the
pool has no more segments, the pool is grown based on what is now
needed. After the I/O completes, the additional segments are returned to
the pool for use by other I/Os. Once allocated, the additional segments are
not released under the assumption of "if needed once, it will be needed
again". Pools are kept on a per-hardware queue basis, which is typically
1:1 per cpu, but may be shared by multiple cpus.
The switch to the smaller initial allocation significantly reduces the
memory footprint of the driver (which only grows if large ios are
issued). Based on the several K of XRIs for the adapter, the 8KB->256B
reduction can conserve 32MBs or more.
It has been observed with per-cpu resource pools that allocating a resource
on CPU A, may be put back on CPU B. While the get routines are distributed
evenly, only a limited subset of CPUs may be handling the put routines.
This can put a strain on the lpfc_put_cmd_rsp_buf_per_cpu routine because
all the resources are being put on a limited subset of CPUs.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In order to see real addresses, convert %p with %px for kernel addresses
and replace %p with %pf for functions.
While converting, standardize on "x%px" throughout (not %px or 0x%px).
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
While performing code review, several relatively simple optimizations can
be done in the fast path.
Add these optimizations (unlikely designators).
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Running on Coverity produced the following errors:
- coding style (indentation)
- memset size mismatch errors
note: comment cases where it is purposely a mismatch
Fix the errors.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
As part of firmware download, the adapter is reset. On the adapter the
reset causes the function to stop and all outstanding io is terminated
(without responses). The reset path then starts teardown of the adapter,
starting with deregistration of the remote ports with the nvme-fc
transport. The local port is then deregistered and the driver waits for
local port deregistration. This never finishes.
The remote port deregistrations terminated the nvme controllers, causing
them to send aborts for all the outstanding io. The aborts were serviced in
the driver, but stalled due to its state. The nvme layer then stops to
reclaim it's outstanding io before continuing. The io must be returned
before the reset on the controller is deemed complete and the controller
delete performed. The remote port deregistration won't complete until all
the controllers are terminated. And the local port deregistration won't
complete until all controllers and remote ports are terminated. Thus things
hang.
The issue is the reset which stopped the adapter also stopped all the
responses that would drive i/o completions, and the aborts were also
stopped that stopped i/o completions. The driver, when resetting the
adapter like this, needs to be generating the completions as part of the
adapter reset so that I/O complete (in error), and any aborts are not
queued.
Fix by adding flush routines whenever the adapter port has been reset or
discovered in error. The flush routines will generate the completions for
the scsi and nvme outstanding io. The abort ios, if waiting, will be caught
and flushed as well.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In a test with high nvme remote port counts connected via a multi-hop FC
switch config where switches were systematically reset (e.g. fabric
partitioning and re-establishment), the nvme remote ports would switch
addresses based on the switch reconfiguration events. The driver would get
into a situation where the nvme port changed address, PLOGI and PRLI would
succeed nvme transport registration occurred, but subsequent LS requests by
the nvme subsystem failed due to a bad ndlp state and connectivity to the
device failed.
The driver hit a race condition on multiple devices that address swapped
simultaneously. In cases where the |