Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit d539a871ae47a1f27a609a62e06093fa69d7ce99 ]
The only input fc_rport_set_marginal_state() currently accepts is
"Marginal" when port_state is "Online", and "Online" when the port_state
is "Marginal". It should also allow setting port_state to its current
state, either "Marginal or "Online".
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Link: https://lore.kernel.org/r/20240917230643.966768-1-bmarzins@redhat.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d28d17a845600dd9f7de241de9b1528a1b138716 ]
If the sg_copy_buffer() call returns less than sdebug_sector_size, then
we drop out of the copy loop. However, we still report that we copied
the full expected amount, which is not proper.
Fix by keeping a running total and return that value.
Fixes: 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support")
Reported-by: Colin Ian King <colin.i.king@gmail.com>
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20241018101655.4207-1-john.g.garry@oracle.com
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit b9e63d6c7c0e94a99e1af7c9c0c7fad13a2f2453 upstream.
A sanity check on phy_mask was added in commit 3668651def2c ("scsi:
mpi3mr: Sanitise num_phys"). This causes warning messages when more than
64 phys are detected and devices connected to phys greater than 64 are
dropped.
The phy_mask bitmap is only needed for controller phys and not required
for expander phys. Controller phys can go up to a maximum of 64 and
therefore u64 is good enough to contain phy_mask bitmap.
To suppress those warnings and allow devices to be discovered as before
the offending commit, restrict the phy_mask setting and lowest phy
setting only to the controller phys.
Fixes: 3668651def2c ("scsi: mpi3mr: Sanitise num_phys")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410051943.Mp9o5DlF-lkp@intel.com/
Reported-by: Alexander Motin <mav@ixsystems.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20241008074353.200379-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f30e5f77d2f205ac14d09dec40fd4bb76712f13d upstream.
After commit 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a
work queue"), it can happen that a work item is sent to an uninitialized
work queue. This may has the effect that the item being queued is never
actually queued, and any further actions depending on it will not
proceed.
The following warning is observed while the fnic driver is loaded:
kernel: WARNING: CPU: 11 PID: 0 at ../kernel/workqueue.c:1524 __queue_work+0x373/0x410
kernel: <IRQ>
kernel: queue_work_on+0x3a/0x50
kernel: fnic_wq_copy_cmpl_handler+0x54a/0x730 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
kernel: fnic_isr_msix_wq_copy+0x2d/0x60 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
kernel: __handle_irq_event_percpu+0x36/0x1a0
kernel: handle_irq_event_percpu+0x30/0x70
kernel: handle_irq_event+0x34/0x60
kernel: handle_edge_irq+0x7e/0x1a0
kernel: __common_interrupt+0x3b/0xb0
kernel: common_interrupt+0x58/0xa0
kernel: </IRQ>
It has been observed that this may break the rediscovery of Fibre
Channel devices after a temporary fabric failure.
This patch fixes it by moving the work queue initialization out of
an if block in fnic_probe().
Signed-off-by: Martin Wilck <mwilck@suse.com>
Fixes: 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work queue")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240930133014.71615-1-mwilck@suse.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9023ed8d91eb1fcc93e64dc4962f7412b1c4cbec upstream.
A regression was introduced with commit dbb2da557a6a ("scsi: wd33c93:
Move the SCSI pointer to private command data") which results in an oops
in wd33c93_intr(). That commit added the scsi_pointer variable and
initialized it from hostdata->connected. However, during selection,
hostdata->connected is not yet valid. Fix this by getting the current
scsi_pointer from hostdata->selecting.
Cc: Daniel Palmer <daniel@0x0f.com>
Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: stable@kernel.org
Fixes: dbb2da557a6a ("scsi: wd33c93: Move the SCSI pointer to private command data")
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Co-developed-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/09e11a0a54e6aa2a88bd214526d305aaf018f523.1727926187.git.fthain@linux-m68k.org
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1af9af1f8ab38f1285b27581a5e6920ec58296ba ]
Revise certain log messages marked as KERN_ERR LOG_TRACE_EVENT to
KERN_WARNING and use the lpfc_vlog_msg() macro to still log the event. The
benefit is that events of interest are still logged and the entire trace
buffer is not dumped with extraneous logging information when using default
lpfc_log_verbose driver parameter settings.
Also, delete the keyword "fail" from such log messages as they aren't
really causes for concern. The log messages are more for warnings to a SAN
admin about SAN activity.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0a3c84f71680684c1d41abb92db05f95c09111e8 ]
Deleting an NPIV instance requires all fabric ndlps to be released before
an NPIV's resources can be torn down. Failure to release fabric ndlps
beforehand opens kref imbalance race conditions. Fix by forcing the DA_ID
to complete synchronously with usage of wait_queue.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 93bcc5f3984bf4f51da1529700aec351872dbfff ]
During HBA stress testing, a spam of received PLOGIs exposes a resource
recovery bug causing leakage of lpfc_sqlq entries from the global
phba->sli4_hba.lpfc_els_sgl_list.
The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover
outstanding ELS sgls when walking the txcmplq. Only CMD_ELS_REQUEST64_CRs
and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists. A check
for CMD_XMIT_ELS_RSP64_WQE is missing in order to recover LS_ACC usages of
the phba->sli4_hba.lpfc_els_sgl_list too.
Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when
adding WQEs to the abort and cancel list in lpfc_els_flush_cmd(). Also,
update naming convention from CRs to WQEs.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1c71065df2df693d208dd32758171c1dece66341 ]
Following an incomplete transfer in MSG IN phase, the driver would not
notice the problem and would make use of invalid data. Initialize 'tmp'
appropriately and bail out if no message was received. For STATUS phase,
preserve the existing status code unless a new value was transferred.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/52e02a8812ae1a2d810d7f9f7fd800c3ccc320c4.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1f0f7679ad8942f810b0f19ee9cf098c3502d66a ]
A kref imbalance occurs when handling an unsolicited PRLO in direct
attached topology.
Rework PRLO rcv handling when in MAPPED state. Save the state that we were
handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO. Then in the
lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery.
By issuing the PLOGI, which nlp_gets, before nlp_put at the end of the
lpfc_cmpl_els_logo_acc() routine, we are saving us from a final nlp_put.
And, we are still allowing the unreg_rpi to happen.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
topology
[ Upstream commit b5c18c9dd138733c16893613345af44deadcf05e ]
In direct attached topology, certain target vendors that are quick to issue
FLOGI followed by a cable pull for more than dev_loss_tmo may result in a
kref imbalance for the remote port ndlp object.
Add an nlp_get when the defer_flogi_acc flag is set. This is expected to
balance the nlp_put in the defer_flogi_acc clause in the
lpfc_issue_els_flogi() routine. Because we need to retain the ndlp ptr,
reorganize all of the defer_flogi_acc information into one
lpfc_defer_flogi_acc struct.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2be1d4f11944cd6283cb97268b3e17c4424945ca ]
When the HBA is undergoing a reset or is handling an errata event, NULL ptr
dereference crashes may occur in routines such as
lpfc_sli_flush_io_rings(), lpfc_dev_loss_tmo_callbk(), or
lpfc_abort_handler().
Add NULL ptr checks before dereferencing hdwq pointers that may have been
freed due to operations colliding with a reset or errata event handler.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6e5860b0ad4934baee8c7a202c02033b2631bb44 ]
struct aac_srb_unit contains struct aac_srb, which contains struct sgmap,
which ends in a (currently) "fake" (1-element) flexible array. Converting
this to a flexible array is needed so that runtime bounds checking won't
think the array is fixed size (i.e. under CONFIG_FORTIFY_SOURCE=y and/or
CONFIG_UBSAN_BOUNDS=y), as other parts of aacraid use struct sgmap as a
flexible array.
It is not legal to have a flexible array in the middle of a structure, so
it either needs to be split up or rearranged so that it is at the end of
the structure. Luckily, struct aac_srb_unit, which is exclusively
consumed/updated by aac_send_safw_bmic_cmd(), does not depend on member
ordering.
The values set in the on-stack struct aac_srb_unit instance "srbu" by the
only two callers, aac_issue_safw_bmic_identify() and
aac_get_safw_ciss_luns(), do not contain anything in srbu.srb.sgmap.sg, and
they both implicitly initialize srbu.srb.sgmap.count to 0 during
memset(). For example:
memset(&srbu, 0, sizeof(struct aac_srb_unit));
srbcmd = &srbu.srb;
srbcmd->flags = cpu_to_le32(SRB_DataIn);
srbcmd->cdb[0] = CISS_REPORT_PHYSICAL_LUNS;
srbcmd->cdb[1] = 2; /* extended reporting */
srbcmd->cdb[8] = (u8)(datasize >> 8);
srbcmd->cdb[9] = (u8)(datasize);
rcode = aac_send_safw_bmic_cmd(dev, &srbu, phys_luns, datasize);
During aac_send_safw_bmic_cmd(), a separate srb is mapped into DMA, and has
srbu.srb copied into it:
srb = fib_data(fibptr);
memcpy(srb, &srbu->srb, sizeof(struct aac_srb));
Only then is srb.sgmap.count written and srb->sg populated:
srb->count = cpu_to_le32(xfer_len);
sg64 = (struct sgmap64 *)&srb->sg;
sg64->count = cpu_to_le32(1);
sg64->sg[0].addr[1] = cpu_to_le32(upper_32_bits(addr));
sg64->sg[0].addr[0] = cpu_to_le32(lower_32_bits(addr));
sg64->sg[0].count = cpu_to_le32(xfer_len);
But this is happening in the DMA memory, not in srbu.srb. An attempt to
copy the changes back to srbu does happen:
/*
* Copy the updated data for other dumping or other usage if
* needed
*/
memcpy(&srbu->srb, srb, sizeof(struct aac_srb));
But this was never correct: the sg64 (3 u32s) overlap of srb.sg (2 u32s)
always meant that srbu.srb would have held truncated information and any
attempt to walk srbu.srb.sg.sg based on the value of srbu.srb.sg.count
would result in attempting to parse past the end of srbu.srb.sg.sg[0] into
srbu.srb_reply.
After getting a reply from hardware, the reply is copied into
srbu.srb_reply:
srb_reply = (struct aac_srb_reply *)fib_data(fibptr);
memcpy(&srbu->srb_reply, srb_reply, sizeof(struct aac_srb_reply));
This has always been fixed-size, so there's no issue here. It is worth
noting that the two callers _never check_ srbu contents -- neither
srbu.srb nor srbu.srb_reply is examined. (They depend on the mapped
xfer_buf instead.)
Therefore, the ordering of members in struct aac_srb_unit does not matter,
and the flexible array member can moved to the end.
(Additionally, the two memcpy()s that update srbu could be entirely
removed as they are never consumed, but I left that as-is.)
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711215739.208776-1-kees@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dbc39b84540f746cc814e69b21e53e6d3e12329a ]
All PCI ID entries in Hex.
Add new cisco pci ids:
VID / DID / SVID / SDID
---- ---- ---- ----
9005 028f 1137 02fe
9005 028f 1137 02ff
9005 028f 1137 0300
Add new h3c pci ids:
VID / DID / SVID / SDID
---- ---- ---- ----
9005 028f 193d 0462
9005 028f 193d 8462
Add new ieit pci ids:
VID / DID / SVID / SDID
---- ---- ---- ----
9005 028f 1ff9 00a3
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: David Strahan <David.Strahan@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-5-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4c76114932d1d6fad2e72823e7898a3c960cf2a7 ]
Correct stream detection by initializing the structure
pqi_scsi_dev_raid_map_data to 0s.
When the OS issues SCSI READ commands, the driver erroneously considers
them as SCSI WRITES. If they are identified as sequential IOs, the driver
then submits those requests via the RAID path instead of the AIO path.
The 'is_write' flag might be set for SCSI READ commands also. The driver
may interpret SCSI READ commands as SCSI WRITE commands, resulting in IOs
being submitted through the RAID path.
Note: This does not cause data corruption.
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-3-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0e21e73384d324f75ea16f3d622cfc433fa6209b ]
All PCI ID entries in hex.
Add new inagile PCI IDs:
VID / DID / SVID / SDID
---- ---- ---- ----
SMART-HBA 8242-24i 9005 / 028f / 1ff9 / 0045
RAID 8236-16i 9005 / 028f / 1ff9 / 0046
RAID 8240-24i 9005 / 028f / 1ff9 / 0047
SMART-HBA 8238-16i 9005 / 028f / 1ff9 / 0048
PM8222-SHBA 9005 / 028f / 1ff9 / 004a
RAID PM8204-2GB 9005 / 028f / 1ff9 / 004b
RAID PM8204-4GB 9005 / 028f / 1ff9 / 004c
PM8222-HBA 9005 / 028f / 1ff9 / 004f
MT0804M6R 9005 / 028f / 1ff9 / 0051
MT0801M6E 9005 / 028f / 1ff9 / 0052
MT0808M6R 9005 / 028f / 1ff9 / 0053
MT0800M6H 9005 / 028f / 1ff9 / 0054
RS0800M5H24i 9005 / 028f / 1ff9 / 006b
RS0800M5E8i 9005 / 028f / 1ff9 / 006c
RS0800M5H8i 9005 / 028f / 1ff9 / 006d
RS0804M5R16i 9005 / 028f / 1ff9 / 006f
RS0800M5E24i 9005 / 028f / 1ff9 / 0070
RS0800M5H16i 9005 / 028f / 1ff9 / 0071
RS0800M5E16i 9005 / 028f / 1ff9 / 0072
RT0800M7E 9005 / 028f / 1ff9 / 0086
RT0800M7H 9005 / 028f / 1ff9 / 0087
RT0804M7R 9005 / 028f / 1ff9 / 0088
RT0808M7R 9005 / 028f / 1ff9 / 0089
RT1608M6R16i 9005 / 028f / 1ff9 / 00a1
Add new h3c pci_id:
VID / DID / SVID / SDID
---- ---- ---- ----
UN RAID P4408-Mr-2 9005 / 028f / 193d / 1110
Add new powerleader pci ids:
VID / DID / SVID / SDID
---- ---- ---- ----
PL SmartROC PM8204 9005 / 028f / 1f3a / 0104
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: David Strahan <David.Strahan@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240711194704.982400-2-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a141c17a543332fc1238eb5cba562bfc66879126 ]
blk_mq_pci_map_queues() maps all queues but right after this, we overwrite
these mappings by calling blk_mq_map_queues(). Just use one helper but not
both.
Fixes: 42f22fe36d51 ("scsi: pm8001: Expose hardware queues for pm80xx")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Link: https://lore.kernel.org/r/20240912-do-not-overwrite-pci-mapping-v1-1-85724b6cec49@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3d882cca73be830549833517ddccb3ac4668c04e ]
A previous change was introduced to prevent data loss during a power-on
reset when a tape is present inside the drive. This commit set the
"pos_unknown" flag to true to avoid operations that could compromise data
by performing actions from an untracked position. The relevant change is
commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
As a consequence of this change, a new issue has surfaced: the driver now
returns an "Input/output error" even for empty drives when the drive, host,
or bus is reset. This issue stems from the "flush_buffer" function, which
first checks whether the "pos_unknown" flag is set. If the flag is set, the
user will encounter an "Input/output error" until the tape position is
known again. This behavior differs from the previous implementation, where
empty drives were not affected at system start up time, allowing tape
software to send commands to the driver to retrieve the drive's status and
other information.
The current behavior prioritizes the "pos_unknown" flag over the
"ST_NO_TAPE" status, leading to issues for software that detects drives
during system startup. This software will receive an "Input/output error"
until a tape is loaded and its position is known.
To resolve this, the "ST_NO_TAPE" status should take priority when the
drive is empty, allowing communication with the drive following a power-on
reset. At the same time, the change should continue to protect data by
maintaining the "pos_unknown" flag when the drive contains a tape and its
position is unknown.
Signed-off-by: Rafael Rocha <rrochavi@fnal.gov>
Link: https://lore.kernel.org/r/20240905173921.10944-1-rrochavi@fnal.gov
Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 5551bc30e4a69ad86d0d008e2f56cd59b6583476 upstream.
SD cards can produce write latency spikes on the order of a hundred
milliseconds. If the target firmware does not hide that latency during DATA
IN and OUT phases it can cause the PDMA circuitry to raise a processor bus
fault which in turn leads to an unreliable byte count and a DMA overrun.
The Last Byte Sent flag is used to detect the overrun but this mechanism is
unreliable on some systems. Instead, set a DID_ERROR result whenever there
is a bus fault during a PDMA send, unless the cause was a phase mismatch.
Cc: stable@vger.kernel.org # 5.15+
Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 7c1f3e3447a1 ("scsi: mac_scsi: Treat Last Byte Sent time-out as failure")
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/cc38df687ace2c4ffc375a683b2502fc476b600d.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5545c3165cbc98615fe65a44f41167cbb557e410 upstream.
Before the error handling can be revised, some preparation is needed.
Refactor the polling loop with a new function, macscsi_wait_for_drq().
This function will gain more call sites in the next patch.
Cc: stable@vger.kernel.org # 5.15+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/6a5ffabb4290c0d138c6d285fda8fa3902e926f0.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5ec4f820cb9766e4583df947150a6febce8da794 upstream.
After a bus fault, capture and log the chip registers immediately, if the
NDEBUG_PSEUDO_DMA macro is defined. Remove some printk(KERN_DEBUG ...)
messages that aren't needed any more. Don't skip the debug message when
bytes == 0. Show all of the byte counters in the debug messages.
Cc: stable@vger.kernel.org # 5.15+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/7573c79f4e488fc00af2b8a191e257ca945e0409.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 05ab4e7846f1103377133c00295a9a910cc6dfc2 upstream.
An older generation of HBAs are failing FCP discovery due to usage of an
outdated field in FCP command WQEs.
Fix by checking the SLI Interface Type register for applicable support of
32 Byte CDB commands, and restore a setting for a WQE path using normal 16
byte CDBs.
Fixes: af20bb73ac25 ("scsi: lpfc: Add support for 32 byte CDBs")
Cc: stable@vger.kernel.org # v6.10+
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f81eaf08385ddd474a2f41595a7757502870c0eb upstream.
Ff the device returns page 0xb1 with length 8 (happens with qemu v2.x, for
example), sd_read_block_characteristics() may attempt an out-of-bounds
memory access when accessing the zoned field at offset 8.
Fixes: 7fb019c46eee ("scsi: sd: Switch to using scsi_device VPD pages")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Wilck <mwilck@suse.com>
Link: https://lore.kernel.org/r/20240912134308.282824-1-mwilck@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2e4b02fad094976763af08fec2c620f4f8edd9ae ]
The kref_put() function will call nport->release if the refcount drops to
zero. The nport->release release function is _efc_nport_free() which frees
"nport". But then we dereference "nport" on the next line which is a use
after free. Re-order these lines to avoid the use after free.
Fixes: fcd427303eb9 ("scsi: elx: libefc: SLI and FC PORT state machine interfaces")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/b666ab26-6581-4213-9a3d-32a9147f0399@stanley.mountain
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5768718da9417331803fc4bc090544c2a93b88dc ]
It's not an error for a target to change the bus phase during a transfer.
Unfortunately, the FLAG_DMA_FIXUP workaround does not allow for that -- a
phase change produces a DRQ timeout error and the device borken flag will
be set.
Check the phase match bit during FLAG_DMA_FIXUP processing. Don't forget to
decrement the command residual. While we are here, change shost_printk()
into scmd_printk() for better consistency with other DMA error messages.
Tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 55181be8ced1 ("ncr5380: Replace redundant flags with FLAG_NO_DMA_FIXUP")
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/99dc7d1f4c825621b5b120963a69f6cd3e9ca659.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0c150b30d3d51a8c2e09fadd004a640fa12985c6 ]
Flag REQ_ATOMIC can only be set for writes, so don't check if the operation
is also a write in sd_setup_read_write_cmnd().
Fixes: bf4ae8f2e640 ("scsi: sd: Atomic write support")
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240805113315.1048591-2-john.g.garry@oracle.com
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f1393d52e6cda9c20f12643cbecf1e1dc357e0e2 ]
Correct a rare multipath failure issue by reverting commit 94a68c814328
("scsi: smartpqi: Quickly propagate path failures to SCSI midlayer") [1].
Reason for revert: The patch propagated the path failure to SML quickly
when one of the path fails during IO and AIO path gets disabled for a
multipath device.
But it created a new issue: when creating a volume on an encryption-enabled
controller, the firmware reports the AIO path is disabled, which cause the
driver to report a path failure to SML for a multipath device.
There will be a new fix to handle "Illegal request" and "Invalid field in
parameter list" on RAID path when the AIO path is disabled on a multipath
device.
[1] https://lore.kernel.org/all/164375209313.440833.9992416628621839233.stgit@brunhilda.pdev.net/
Fixes: 94a68c814328 ("scsi: smartpqi: Quickly propagate path failures to SCSI midlayer")
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240711194704.982400-4-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Minor fixes only.
The sd.c one ignores a sync cache request if format is in progress
which can happen if formatting a drive across suspend/resume"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Ignore command SYNCHRONIZE CACHE error if format in progress
scsi: aacraid: Fix double-free on probe failure
scsi: lpfc: Fix overflow build issue
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The important core fix is another tweak to our discard discovery
issues. The off by 512 in logical block count seems bad, but in fact
the inline was only ever used in debug prints, which is why no-one
noticed"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Do not attempt to configure discard unless LBPME is set
scsi: MAINTAINERS: Add header files to SCSI SUBSYSTEM
scsi: ufs: qcom: Add UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 SoC
scsi: ufs: core: Add a quirk for handling broken LSDBS field in controller capabilities register
scsi: core: Fix the return value of scsi_logical_block_count()
scsi: MAINTAINERS: Update HiSilicon SAS controller driver maintainer
|
|
If formatting a suspended disk (such as formatting with different DIF
type), the disk will be resuming first, and then the format command will
submit to the disk through SG_IO ioctl.
When the disk is processing the format command, the system does not
submit other commands to the disk. Therefore, the system attempts to
suspend the disk again and sends the SYNCHRONIZE CACHE command. However,
the SYNCHRONIZE CACHE command will fail because the disk is in the
formatting process. This will cause the runtime_status of the disk to
error and it is difficult for user to recover it. Error info like:
[ 669.925325] sd 6:0:6:0: [sdg] Synchronizing SCSI cache
[ 670.202371] sd 6:0:6:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
[ 670.216300] sd 6:0:6:0: [sdg] Sense Key : 0x2 [current]
[ 670.221860] sd 6:0:6:0: [sdg] ASC=0x4 ASCQ=0x4
To solve the issue, ignore the error and return success/0 when format is
in progress.
Cc: stable@vger.kernel.org
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Link: https://lore.kernel.org/r/20240819090934.2130592-1-liyihang9@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
aac_probe_one() calls hardware-specific init functions through the
aac_driver_ident::init pointer, all of which eventually call down to
aac_init_adapter().
If aac_init_adapter() fails after allocating memory for aac_dev::queues,
it frees the memory but does not clear that member.
After the hardware-specific init function returns an error,
aac_probe_one() goes down an error path that frees the memory pointed to
by aac_dev::queues, resulting.in a double-free.
Reported-by: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
Link: https://bugs.debian.org/1075855
Fixes: 8e0c5ebde82b ("[SCSI] aacraid: Newer adapter communication iterface support")
Signed-off-by: Ben Hutchings <benh@debian.org>
Link: https://lore.kernel.org/r/ZsZvfqlQMveoL5KQ@decadent.org.uk
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Build failed while enabling "CONFIG_GCOV_KERNEL=y" and
"CONFIG_GCOV_PROFILE_ALL=y" with following error:
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c: In function 'lpfc_get_cgnbuf_info':
BUILDSTDERR: ./include/linux/fortify-string.h:114:33: error: '__builtin_memcpy' accessing 18446744073709551615 bytes at offsets 0 and 0 overlaps 9223372036854775807 bytes at offset -9223372036854775808 [-Werror=restrict]
BUILDSTDERR: 114 | #define __underlying_memcpy __builtin_memcpy
BUILDSTDERR: | ^
BUILDSTDERR: ./include/linux/fortify-string.h:637:9: note: in expansion of macro '__underlying_memcpy'
BUILDSTDERR: 637 | __underlying_##op(p, q, __fortify_size); \
BUILDSTDERR: | ^~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/fortify-string.h:682:26: note: in expansion of macro '__fortify_memcpy_chk'
BUILDSTDERR: 682 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
BUILDSTDERR: | ^~~~~~~~~~~~~~~~~~~~
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c:5468:9: note: in expansion of macro 'memcpy'
BUILDSTDERR: 5468 | memcpy(cgn_buff, cp, cinfosz);
BUILDSTDERR: | ^~~~~~
This happens from the commit 06bb7fc0feee ("kbuild: turn on -Wrestrict by
default"). Address this issue by using size_t type.
Signed-off-by: Sherry Yang <sherry.yang@oracle.com>
Link: https://lore.kernel.org/r/20240821065131.1180791-1-sherry.yang@oracle.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small fixes to the mpi3mr driver. One to avoid oversize
allocations in tracing and the other to fix an uninitialized spinlock
in the user to driver feature request code (used to trigger dumps and
the like)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpi3mr: Avoid MAX_PAGE_ORDER WARNING for buffer allocations
scsi: mpi3mr: Add missing spin_lock_init() for mrioc->trigger_lock
|
|
Commit f874d7210d88 ("scsi: sd: Keep the discard mode stable") attempted
to address an issue where one mode of discard operation got configured
prior to the device completing full discovery. Unfortunately this
change assumed discard was always enabled on the device.
Do not attempt to configure discard unless LBPME is enabled.
Link: https://lore.kernel.org/r/20240817005325.3319384-1-martin.petersen@oracle.com
Fixes: f874d7210d88 ("scsi: sd: Keep the discard mode stable")
Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware
and firmware buffers") added mpi3mr_alloc_diag_bufs() which calls
dma_alloc_coherent() to allocate the trace buffer and the firmware
buffer. mpi3mr_alloc_diag_bufs() decides the buffer sizes from the driver
configuration. In my environment, the sizes are 8MB. With the sizes,
dma_alloc_coherent() fails and report this WARNING:
WARNING: CPU: 4 PID: 438 at mm/page_alloc.c:4676 __alloc_pages_noprof+0x52f/0x640
The WARNING indicates that the order of the allocation size is larger than
MAX_PAGE_ORDER. After this failure, mpi3mr_alloc_diag_bufs() reduces the
buffer sizes and retries dma_alloc_coherent(). In the end, the buffer
allocations succeed with 4MB size in my environment, which corresponds to
MAX_PAGE_ORDER=10. Though the allocations succeed, the WARNING message is
misleading and should be avoided.
To avoid the WARNING, check the orders of the buffer allocation sizes
before calling dma_alloc_coherent(). If the orders are larger than
MAX_PAGE_ORDER, fall back to the retry path.
Fixes: fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20240810042701.661841-3-shinichiro.kawasaki@wdc.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware
and firmware buffers") added the spinlock trigger_lock to the struct
mpi3mr_ioc. However, spin_lock_init() call was not added for it, then the
lock does not work as expected. Also, the kernel reports the message below
when lockdep is enabled.
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
To fix the issue and to avoid the INFO message, add the missing
spin_lock_init() call.
Fixes: fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20240810042701.661841-2-shinichiro.kawasaki@wdc.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two core fixes: one to prevent discard type changes (seen on iSCSI)
during intermittent errors and the other is fixing a lockdep problem
caused by the queue limits change.
And one driver fix in ufs"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Keep the discard mode stable
scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
scsi: ufs: core: Fix hba->last_dme_cmd_tstamp timestamp updating logic
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"One core change that reverts the double message print patch in sd.c
(it was causing regressions on embedded systems).
The rest are driver fixes in ufs, mpt3sas and mpi3mr"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: exynos: Don't resume FMP when crypto support is disabled
scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES
scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONES
scsi: ufs: core: Do not set link to OFF state while waking up from hibernation
scsi: Revert "scsi: sd: Do not repeat the starting disk message"
scsi: ufs: core: Fix deadlock during RTC update
scsi: ufs: core: Bypass quick recovery if force reset is needed
scsi: ufs: core: Check LSDBS cap when !mcq
|
|
There is a scenario where a large number of discard commands are issued
when the iscsi initiator connects to the target and then performs a session
rescan operation. There is a time window, most of the commands are in UNMAP
mode, and some discard commands become WRITE SAME with UNMAP.
The discard mode has been negotiated during the SCSI probe. If the mode is
temporarily changed from UNMAP to WRITE SAME with UNMAP, an I/O ERROR may
occur because the target may not implement WRITE SAME with UNMAP. Keep the
discard mode stable to fix this issue.
Signed-off-by: Li Feng <fengli@smartx.com>
Link: https://lore.kernel.org/r/20240718080751.313102-2-fengli@smartx.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit 804e498e0496 ("sd: convert to the atomic queue limits API")
introduced pairs of function calls to queue_limits_start_update() and
queue_limits_commit_update(). These two functions lock and unlock
q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called after
queue_limits_start_update() call and before queue_limits_commit_update()
call. sd_read_cpr() locks q->sysfs_dir_lock and &q->sysfs_lock. Then new
lock dependencies were created between q->limits_lock, q->sysfs_dir_lock
and q->sysfs_lock, as follows:
sd_revalidate_disk
queue_limits_start_update
mutex_lock(&q->limits_lock)
sd_read_cpr
disk_set_independent_access_ranges
mutex_lock(&q->sysfs_dir_lock)
mutex_lock(&q->sysfs_lock)
mutex_unlock(&q->sysfs_lock)
mutex_unlock(&q->sysfs_dir_lock)
queue_limits_commit_update
mutex_unlock(&q->limits_lock)
However, the three locks already had reversed dependencies in other
places. Then the new dependencies triggered the lockdep WARN "possible
circular locking dependency detected" [1]. This WARN was observed by
running the blktests test case srp/002.
To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk()
after the queue_limits_commit_update() call. In other words, move the
sd_read_cpr() call out of the q->limits_lock region.
[1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/
Fixes: 804e498e0496 ("sd: convert to the atomic queue limi |