Age | Commit message (Collapse) | Author | Files | Lines |
|
Requests to delete an NPIV port may fail repeatedly if the initial request
is received during discovery.
If the FC_UNLOADING load_flag is set, then skip CT response processing for
the physical port. This allows discovery processing for other lpfc_vport
objects to reach their cmpl routines before deleting the vport.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-8-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Upon first RSCN receipt of a target server's remote port that is initially
acting as an initiator function, the driver marks the ndlp->nlp_type as an
initiator role. Then later, when processing an RSCN for a target function
role switch, that ndlp remote port is permanently stuck as an initiator
role and can never transition to be discovered as an updated target role
function.
Remove the NLP_RCV_PLOGI early return if statement clause so that the
NLP_NPR_2B_DISC flag gets set. This allows for role change detections.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-7-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Remove the early return NLP_FABRIC check in lpfc_plogi_confirm_nport()
because it is possible for switch domain controllers to change WWPN.
As a result, allow lpfc_plogi_confirm_nport() to detect that a new ndlp
should be initialized in such cases. The old ndlp object will be cleaned
up when dev_loss_tmo callbk executes.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-6-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
D_ID swaps are common during cable swaps in a SAN. Thus, there's no reason
to log the event at a KERN_ERR level with the trace event logger.
Change the log level to KERN_INFO and the normal LOG_ELS flag.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-5-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The sg_dma_len() API should be used to retrieve a scatterlist's length
instead of directly accessing scatterlist->length.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-4-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The call to lpfc_sli4_resume_rpi() in lpfc_rcv_padisc() may return an
unsuccessful status. In such cases, the elsiocb is not issued, the
completion is not called, and thus the elsiocb resource is leaked.
Check return value after calling lpfc_sli4_resume_rpi() and conditionally
release the elsiocb resource.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-3-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A static code analyzer tool indicates that the local variable called status
in the lpfc_sli4_repost_sgl_list() routine could be used to print garbage
uninitialized values in the routine's log message.
Fix by initializing to zero.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-2-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
LUNs going into "failed ready running" state observed on >1T and on even
numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when DIF is
enabled at the host.
The kernel logs:
Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0
The host lpfc driver is failing to setup scatter/gather list (protection
data) for the I/Os.
The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes the
compiler to remove the most significant bit. Use an unsigned type instead.
Signed-off-by: Hannes Reinecke <hare@suse.de>
[dwagner: added commit message]
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Link: https://lore.kernel.org/r/20231220162658.12392-1-dwagner@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock
for waking up EH handler") intended to fix a hard lockup issue triggered by
EH. The core idea was to move scsi_host_busy() out of the host lock when
processing individual commands for EH. However, a suggested style change
inadvertently caused scsi_host_busy() to remain under the host lock. Fix
this by calling scsi_host_busy() outside the lock.
Fixes: 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler")
Cc: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20240203024521.2006455-1-ming.lei@redhat.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Mike Christie <michael.christie@oracle.com> says:
The following patches were made over Linus's tree which contains a fix
for sd which was not in Martin's branches.
The patches allow scsi_execute_cmd users to have scsi-ml retry the cmd
for it instead of the caller having to parse the error and loop
itself.
Link: https://lore.kernel.org/r/20240123002220.129141-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add some kunit tests for scsi_check_passthrough() so we can easily make
sure we are hitting the cases for which it's difficult to replicate in
hardware or even scsi_debug.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-20-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has get_sectorsize() have the SCSI midlayer retry errors instead of
driving them itself.
There is one behavior change where we no longer retry when
scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry
for failures like the queue being removed, and for the case where there are
no tags/reqs the block layer waits/retries for us. For possible memory
allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying
will probably not help.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-18-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has ses have the SCSI midlayer retry scsi_execute_cmd() errors instead
of driving them itself.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-17-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has read_capacity_10() have the SCSI midlayer retry errors instead of
driving them itself.
There are 2 behavior changes with this patch:
1. There is one behavior change where we no longer retry when
scsi_execute_cmd() returns < 0, but we should be ok. We don't need to
retry for failures like the queue being removed, and for the case where
there are no tags/reqs since the block layer waits/retries for us. For
possible memory allocation failures from blk_rq_map_kern() we use
GFP_NOIO, so retrying will probably not help.
2. For the specific UAs we checked for and retried, we would get
READ_CAPACITY_RETRIES_ON_RESET retries plus whatever retries were left
from the main loop's retries. Each UA now gets
READ_CAPACITY_RETRIES_ON_RESET retries, and the other errors get up to
3 retries. This is most likely ok, because
READ_CAPACITY_RETRIES_ON_RESET is already 10 and is not based on
anything specific like a spec or device, so the extra 3 we got from the
main loop was probably just an accident and is not going to help.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-16-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
It's common to get a UA when doing PR commands. It could be due to a target
restarting, transport level relogin or other PR commands like a release
causing it. The upper layers don't get the sense and in some cases have no
idea if it's a SCSI device, so this has the sd layer retry.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-15-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has scsi_report_lun_scan() have the SCSI midlayer retry errors instead
of driving them itself.
There is one behavior change where we no longer retry when
scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry
for failures like the queue being removed, and for the case where there are
no tags/reqs the block layer waits/retries for us. For possible memory
allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying
will probably not help.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-14-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has scsi_mode_sense() have the SCSI midlayer retry UAs instead of
driving them itself.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-13-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has ch_do_scsi() have the SCSI midlayer retry UAs instead of driving
them itself.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-12-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
unit_attention is not used so remove it.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-11-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has sd_sync_cache() have the SCSI midlayer retry errors instead of
driving them itself.
There is one behavior change where we no longer retry when
scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry
for failures like the queue being removed, and for the case where there are
no tags/reqs the block layer waits/retries for us. For possible memory
allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying
will probably not help.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-10-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has spi_execute() have the SCSI midlayer retry UAs instead of driving
them.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-9-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has rdac have the SCSI midlayer retry errors instead of driving them
itself.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-8-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has hp_sw have the SCSI midlayer retry scsi_execute_cmd() errors
instead of driving them itself.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-7-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This simplifies sd_spinup_disk() so the SCSI midlayer retries errors for
it. Note that we retried every UA except Medium Not Present and also if
scsi_status_is_good() returned failed which would happen for all check
conditions. In this patch we use SCMD_FAILURE_STAT_ANY which will trigger
for the same conditions as when scsi_status_is_good() returns false and
there is status. This will cover all CCs including UAs so there is no
explicit failures array entry for UAs except for Medium Not Present which
we don't want to retry.
There is one behavior change where we no longer retry when
scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry
for failures like the queue being removed, and for the case where there are
no tags/reqs the block layer waits/retries for us. For possible memory
allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying
will probably not help.
We do not handle the outside loop's retries because we want to sleep
between tries and we don't support that yet.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-6-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
We currently reuse the cmd buffer for the TUR and START_STOP commands
which requires us to reset the buffer when retrying. This has us use
separate buffers for the 2 commands so we can make them const and I think
it makes it easier to handle for retries but does not add too much extra to
the stack use.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-5-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description from: Martin Wilck <mwilck@suse.com>:
The SCSI mid layer doesn't retry commands after DID_TIME_OUT (see
scsi_noretry_cmd()). Packet loss in the fabric can cause spurious timeouts
during SCSI device probing, causing device probing to fail. This has been
observed in FCoE uplink failover tests, for example.
This patch fixes the issue by retrying the INQUIRY.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-4-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This has scsi_probe_lun() ask the SCSI midlayer to retry UAs instead of
driving them itself.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-3-michael.christie@oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
For passthrough we don't retry any error which we get a check condition
for. This results in a lot of callers driving their own retries for all
UAs, specific UAs, NOT_READY, specific sense values or any type of failure.
This adds the core code to allow passthrough users to specify what errors
they want the SCSI midlayer to retry for them. We can then convert users to
drop a lot of their sense parsing and retry handling.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240123002220.129141-2-michael.christie@oracle.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.
coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().
> ./drivers/scsi/pm8001/pm8001_ctl.c:883:8-16: WARNING: please use sysfs_emit
No functional change intended
CC: Jack Wang <jinpu.wang@cloud.ionos.com>
CC: James E.J. Bottomley <jejb@linux.ibm.com>
CC: Martin K. Petersen <martin.petersen@oracle.com>
CC: linux-scsi@vger.kernel.org
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20240116045151.3940401-32-lizhijian@fujitsu.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.
coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().
> ./drivers/scsi/isci/init.c:140:8-16: WARNING: please use sysfs_emit
No functional change intended
CC: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
CC: James E.J. Bottomley <jejb@linux.ibm.com>
CC: Martin K. Petersen <martin.petersen@oracle.com>
CC: linux-scsi@vger.kernel.org
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20240116045151.3940401-25-lizhijian@fujitsu.com
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.
coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().
> ./drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3619:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3625:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3633:8-16: WARNING: please use sysfs_emit
No functional change intended
CC: Michael Cyr <mikecyr@linux.ibm.com>
CC: James E.J. Bottomley <jejb@linux.ibm.com>
CC: Martin K. Petersen <martin.petersen@oracle.com>
CC: linux-scsi@vger.kernel.org
CC: target-devel@vger.kernel.org
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20240116045151.3940401-24-lizhijian@fujitsu.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.
coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3483:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3493:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3503:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3513:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3522:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3530:8-16: WARNING: please use sysfs_emit
No functional change intended
CC: Tyrel Datwyler <tyreld@linux.ibm.com>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Nicholas Piggin <npiggin@gmail.com>
CC: Christophe Leroy <christophe.leroy@csgroup.eu>
CC: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
CC: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
CC: James E.J. Bottomley <jejb@linux.ibm.com>
CC: Martin K. Petersen <martin.petersen@oracle.com>
CC: linux-scsi@vger.kernel.org
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20240116045151.3940401-23-lizhijian@fujitsu.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.
coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().
> ./drivers/scsi/fnic/fnic_attrs.c:17:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/fnic/fnic_attrs.c:23:8-16: WARNING: please use sysfs_emit
> ./drivers/scsi/fnic/fnic_attrs.c:31:8-16: WARNING: please use sysfs_emit
No functional change intended
CC: Satish Kharat <satishkh@cisco.com>
CC: Sesidhar Baddela <sebaddel@cisco.com>
CC: Karan Tilak Kumar <kartilak@cisco.com>
CC: James E.J. Bottomley <jejb@linux.ibm.com>
CC: Martin K. Petersen <martin.petersen@oracle.com>
CC: linux-scsi@vger.kernel.org
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20240116045151.3940401-20-lizhijian@fujitsu.com
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is a general misunderstanding amongst engineers that {v}snprintf()
returns the length of the data *actually* encoded into the destination
array. However, as per the C99 standard {v}snprintf() really returns
the length of the data that *would have been* written if there were
enough space for it. This misunderstanding has led to buffer-overruns
in the past. It's generally considered safer to use the {v}scnprintf()
variants in their place (or even sprintf() in simple cases). So let's
do that.
Link: https://lwn.net/Articles/69419/
Link: https://github.com/KSPP/linux/issues/105
Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Cc: PMC-Sierra, Inc <aacraid@pmc-sierra.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/20240111131732.1815560-6-lee@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
sysfs_emit()
Since snprintf() has the documented, but still rather strange trait of
returning the length of the data that *would have been* written to the
array if space were available, rather than the arguably more useful
length of data *actually* written, it is usually considered wise to use
something else instead in order to avoid confusion.
In the case of sysfs call-backs, new wrappers exist that do just that.
[mkp: removed unrelated whitespace cleanups]
Link: https://lwn.net/Articles/69419/
Link: https://github.com/KSPP/linux/issues/105
Cc: Richard Hirst <rhirst@linuxcare.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/20240111131732.1815560-5-lee@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
sysfs_emit()
Since snprintf() has the documented, but still rather strange trait of
returning the length of the data that *would have been* written to the
array if space were available, rather than the arguably more useful
length of data *actually* written, it is usually considered wise to use
something else instead in order to avoid confusion.
In the case of sysfs call-backs, new wrappers exist that do just that.
Link: https://lwn.net/Articles/69419/
Link: https://github.com/KSPP/linux/issues/105
Cc: Adam Radford <aradford@gmail.com>
Cc: Joel Jacobson <linux@3ware.com>
Cc: de Melo <acme@conectiva.com.br>
Cc: Andre Hedrick <andre@suse.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/20240111131732.1815560-4-lee@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
sysfs_emit()
Since snprintf() has the documented, but still rather strange trait of
returning the length of the data that *would have been* written to the
array if space were available, rather than the arguably more useful
length of data *actually* written, it is usually considered wise to use
something else instead in order to avoid confusion.
In the case of sysfs call-backs, new wrappers exist that do just that.
Link: https://lwn.net/Articles/69419/
Link: https://github.com/KSPP/linux/issues/105
Cc: Adam Radford <aradford@gmail.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/20240111131732.1815560-3-lee@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
sysfs_emit()
Since snprintf() has the documented, but still rather strange trait of
returning the length of the data that *would have been* written to the
array if space were available, rather than the arguably more useful
length of data *actually* written, it is usually considered wise to use
something else instead in order to avoid confusion.
In the case of sysfs call-backs, new wrappers exist that do just that.
Link: https://lwn.net/Articles/69419/
Link: https://github.com/KSPP/linux/issues/105
Cc: Adam Radford <aradford@gmail.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/20240111131732.1815560-2-lee@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update driver version to 48.100.00.00.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20231228114810.11923-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add a new IOCTL command MPT3ENABLEDIAGSBRRELOAD. As a part of firmware
update operation, applications use this IOCTL command to set the SBR reload
bit in the Host Diagnostic register. This permits HBA firmware to be
updated without powercycling the system.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312280909.MZyhxwBL-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202312281141.jDyPezRn-lkp@intel.com/
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20231228114810.11923-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
chenxiang <chenxiang66@hisilicon.com> says:
This series contains some fixes and cleanups including:
- Fix a deadlock issue related to automatic debugfs;
- Remove redundant checks for automatic debugfs;
- Check whether debugfs is enabled before removing or releasing it;
- Remove hisi_hba->timer for v3 hw;
Link: https://lore.kernel.org/r/1705904747-62186-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
hisi_hba->timer is not used for v3 hw but there are two places that some
operations related to hisi_hba->timer are called by v3 hw:
- Deleting the timer in function hisi_sas_v3_hw() which is only for v3 hw;
- Deleting the timer in function hisi_sas_controller_reset_prepare() which
is common for v1/v2/v3 hw.
We can remove the timer in the first case, but for the second scenario we
need to remove it only for v3 hw, so check hw->sht which is NULL only for
v3 hw before deleting hisi_hba->timer.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1705904747-62186-5-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
hisi_sas debugfs remove should be executed only when debugfs is enabled.
Check whether debugfs is enabled and then remove it only if enabled.
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1705904747-62186-4-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In commit 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump
trigger"), the memory allocation time of the DFX is changed from device
initialization to dump occurs, so .debugfs_itct is not a valid address and
do not need to check.
The parameter hisi_sas_debugfs_enable is enough to check whether automatic
debugfs dump is triggered, so remove redunant checks.
Fixes: 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger")
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If we issue a disabling PHY command, the device attached with it will go
offline, if a 2 bit ECC error occurs at the same time, a hung task may be
found:
[ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds.
[ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208
[ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas]
[ 4613.691518] Call trace:
[ 4613.694678] __switch_to+0xf8/0x17c
[ 4613.698872] __schedule+0x660/0xee0
[ 4613.703063] schedule+0xac/0x240
[ 4613.706994] schedule_timeout+0x500/0x610
[ 4613.711705] __down+0x128/0x36c
[ 4613.715548] down+0x240/0x2d0
[ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main]
[ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas]
[ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas]
[ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main]
[ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main]
[ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas]
[ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas]
[ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas]
[ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas]
[ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas]
[ 4613.784716] process_one_work+0x248/0x950
[ 4613.789424] worker_thread+0x318/0x934
[ 4613.793878] kthread+0x190/0x200
[ 4613.797810] ret_from_fork+0x10/0x18
[ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds.
[ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208
[ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main]
[ 4613.841491] Call trace:
[ 4613.844647] __switch_to+0xf8/0x17c
[ 4613.848852] __schedule+0x660/0xee0
[ 4613.853052] schedule+0xac/0x240
[ 4613.856984] schedule_timeout+0x500/0x610
[ 4613.861695] __down+0x128/0x36c
[ 4613.865542] down+0x240/0x2d0
[ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main]
[ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main]
[ 4613.883019] process_one_work+0x248/0x950
[ 4613.887732] worker_thread+0x318/0x934
[ 4613.892204] kthread+0x190/0x200
[ 4613.896118] ret_from_fork+0x10/0x18
[ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds.
[ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208
[ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas]
[ 4613.939549] Call trace:
[ 4613.942702] __switch_to+0xf8/0x17c
[ 4613.946892] __schedule+0x660/0xee0
[ 4613.951083] schedule+0xac/0x240
[ 4613.955015] schedule_timeout+0x500/0x610
[ 4613.959725] wait_for_common+0x200/0x610
[ 4613.964349] wait_for_completion+0x3c/0x5c
[ 4613.969146] flush_workqueue+0x198/0x790
[ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas]
[ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas]
[ 4613.985708] process_one_work+0x248/0x950
[ 4613.990420] worker_thread+0x318/0x934
[ 4613.994868] kthread+0x190/0x200
[ 4613.998800] ret_from_fork+0x10/0x18
This is because when the device goes offline, we obtain the hisi_hba
semaphore and send the ABORT_DEV command to the device. However, the
internal abort timed out due to the 2 bit ECC error and triggers automatic
dump. In addition, since the hisi_hba semaphore has been obtained, the dump
cannot be executed and the controller cannot be reset.
Therefore, the deadlocks occur on the following circular dependencies:
hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ...
-> hisi_sas_internal_abort_timeout() -> down().
The deadlock is triggered only when the timeout occurs during device goes
offline. To fix this issue, use .rst_ha_timeout to distinguish the scenario
where a device goes offline from other scenarios.
Fixes: 2ff07b5c6fe9 ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue")
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
No functional modification involved.
drivers/scsi/fnic/fnic_scsi.c:1964 fnic_abort_cmd() warn: inconsistent indenting.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7930
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240118020128.24432-1-jiapeng.chong@linux.alibaba.com
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
To ensure that the same ID is not obtained during concurrent execution of
the probe, an ida is used to manage the mrioc's ID.
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231229040331.52518-1-kanie@linux.alibaba.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.
We don't need the NUL-padding behavior that strncpy() provides as vscsi is
NUL-allocated in ibmvscsis_probe() which proceeds to call
ibmvscsis_adapter_info():
| vscsi = kzalloc(sizeof(*vscsi), GFP_KERNEL);
ibmvscsis_probe() -> ibmvscsis_handle_crq() -> ibmvscsis_parse_command()
-> ibmvscsis_mad() -> ibmvscsis_process_mad() -> ibmvscsis_adapter_info()
Following the same idea, `partition_name` is defiend as:
| static char partition_name[PARTITION_NAMELEN] = "UNKNOWN";
... which is NUL-padded already, meaning strscpy() is the best option.
Considering the above, a suitable replacement is strscpy() [2] due to the
fact that it guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.
However, for cap->name and info let's use strscpy_pad() as they are
allocated via dma_alloc_coherent():
| cap = dma_alloc_coherent(&vscsi->dma_dev->dev, olen, &token,
| GFP_ATOMIC);
&
| info = dma_alloc_coherent(&vscsi->dma_dev->dev, sizeof(*info), &token,
| GFP_ATOMIC);
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231212-strncpy-drivers-scsi-ibmvscsi_tgt-ibmvscsi_tgt-c-v2-1-bdb9a7cd96c8@google.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The variable 'retval' is being assigned a value that is not being read
afterwards. The assignment is redundant and can be removed.
Cleans up clang scan warning:
Although the value stored to 'retval' is used in the enclosing
expression, the value is never actually read from 'retval'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240118121441.2533620-1-colin.i.king@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Current code uses the specified ring buffer size (either the default of 128
Kbytes or a module parameter specified value |