summaryrefslogtreecommitdiff
path: root/drivers/usb/host
AgeCommit message (Collapse)AuthorFilesLines
2025-04-10usb: xhci: correct debug message page size calculationNiklas Neronin1-3/+3
[ Upstream commit 55741c723318905e6d5161bf1e12749020b161e3 ] The ffs() function returns the index of the first set bit, starting from 1. If no bits are set, it returns zero. This behavior causes an off-by-one page size in the debug message, as the page size calculation [1] is zero-based, while ffs() is one-based. Fix this by subtracting one from the result of ffs(). Note that since variable 'val' is unsigned, subtracting one from zero will result in the maximum unsigned integer value. Consequently, the condition 'if (val < 16)' will still function correctly. [1], Page size: (2^(n+12)), where 'n' is the set page size bit. Fixes: 81720ec5320c ("usb: host: xhci: use ffs() in xhci_mem_init()") Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-9-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-07usb: xhci: Apply the link chain quirk on NEC isoc endpointsMichal Pecio1-2/+11
commit bb0ba4cb1065e87f9cc75db1fa454e56d0894d01 upstream. Two clearly different specimens of NEC uPD720200 (one with start/stop bug, one without) were seen to cause IOMMU faults after some Missed Service Errors. Faulting address is immediately after a transfer ring segment and patched dynamic debug messages revealed that the MSE was received when waiting for a TD near the end of that segment: [ 1.041954] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ffa08fe0 [ 1.042120] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09000 flags=0x0000] [ 1.042146] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09040 flags=0x0000] It gets even funnier if the next page is a ring segment accessible to the HC. Below, it reports MSE in segment at ff1e8000, plows through a zero-filled page at ff1e9000 and starts reporting events for TRBs in page at ff1ea000 every microframe, instead of jumping to seg ff1e6000. [ 7.041671] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0 [ 7.041999] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0 [ 7.042011] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint [ 7.042028] xhci_hcd: All TDs skipped for slot 1 ep 2. Clear skip flag. [ 7.042134] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint [ 7.042138] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31 [ 7.042144] xhci_hcd: Looking for event-dma 00000000ff1ea040 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 [ 7.042259] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint [ 7.042262] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31 [ 7.042266] xhci_hcd: Looking for event-dma 00000000ff1ea050 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 At some point completion events change from Isoch Buffer Overrun to Short Packet and the HC finally finds cycle bit mismatch in ff1ec000. [ 7.098130] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13 [ 7.098132] xhci_hcd: Looking for event-dma 00000000ff1ecc50 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 [ 7.098254] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13 [ 7.098256] xhci_hcd: Looking for event-dma 00000000ff1ecc60 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 [ 7.098379] xhci_hcd: Overrun event on slot 1 ep 2 It's possible that data from the isochronous device were written to random buffers of pending TDs on other endpoints (either IN or OUT), other devices or even other HCs in the same IOMMU domain. Lastly, an error from a different USB device on another HC. Was it caused by the above? I don't know, but it may have been. The disk was working without any other issues and generated PCIe traffic to starve the NEC of upstream BW and trigger those MSEs. The two HCs shared one x1 slot by means of a commercial "PCIe splitter" board. [ 7.162604] usb 10-2: reset SuperSpeed USB device number 3 using xhci_hcd [ 7.178990] sd 9:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s [ 7.179001] sd 9:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 04 02 ae 00 00 02 00 00 [ 7.179004] I/O error, dev sdb, sector 67284480 op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 0 Fortunately, it appears that this ridiculous bug is avoided by setting the chain bit of Link TRBs on isochronous rings. Other ancient HCs are known which also expect the bit to be set and they ignore Link TRBs if it's not. Reportedly, 0.95 spec guaranteed that the bit is set. The bandwidth-starved NEC HC running a 32KB/uframe UVC endpoint reports tens of MSEs per second and runs into the bug within seconds. Chaining Link TRBs allows the same workload to run for many minutes, many times. No negative side effects seen in UVC recording and UAC playback with a few devices at full speed, high speed and SuperSpeed. The problem doesn't reproduce on the newer Renesas uPD720201/uPD720202 and on old Etron EJ168 and VIA VL805 (but the VL805 has other bug). [shorten line length of log snippets in commit messge -Mathias] Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-14-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-07usb: xhci: Don't skip on Stopped - Length InvalidMichal Pecio1-0/+4
commit 58d0a3fab5f4fdc112c16a4c6d382f62097afd1c upstream. Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped") in v6.11, the driver didn't skip missed isochronous TDs when handling Stoppend and Stopped - Length Invalid events. Instead, it erroneously cleared the skip flag, which would cause the ring to get stuck, as future events won't match the missed TD which is never removed from the queue until it's cancelled. This buggy logic seems to have been in place substantially unchanged since the 3.x series over 10 years ago, which probably speaks first and foremost about relative rarity of this case in normal usage, but by the spec I see no reason why it shouldn't be possible. After d56b0b2ab142, TDs are immediately skipped when handling those Stopped events. This poses a potential problem in case of Stopped - Length Invalid, which occurs either on completed TDs (likely already given back) or Link and No-Op TRBs. Such event won't be recognized as matching any TD (unless it's the rare Link TRB inside a TD) and will result in skipping all pending TDs, giving them back possibly before they are done, risking isoc data loss and maybe UAF by HW. As a compromise, don't skip and don't clear the skip flag on this kind of event. Then the next event will skip missed TDs. A downside of not handling Stopped - Length Invalid on a Link inside a TD is that if the TD is cancelled, its actual length will not be updated to account for TRBs (silently) completed before the TD was stopped. I had no luck producing this sequence of completion events so there is no compelling demonstration of any resulting disaster. It may be a very rare, obscure condition. The sole motivation for this patch is that if such unlikely event does occur, I'd rather risk reporting a cancelled partially done isoc frame as empty than gamble with UAF. This will be fixed more properly by looking at Stopped event's TRB pointer when making skipping decisions, but such rework is unlikely to be backported to v6.12, which will stay around for a few years. Fixes: d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-13usb: xhci: Enable the TRB overfetch quirk on VIA VL805Michal Pecio3-5/+10
commit c133ec0e5717868c9967fa3df92a55e537b1aead upstream. Raspberry Pi is a major user of those chips and they discovered a bug - when the end of a transfer ring segment is reached, up to four TRBs can be prefetched from the next page even if the segment ends with link TRB and on page boundary (the chip claims to support standard 4KB pages). It also appears that if the prefetched TRBs belong to a different ring whose doorbell is later rung, they may be used without refreshing from system RAM and the endpoint will stay idle if their cycle bit is stale. Other users complain about IOMMU faults on x86 systems, unsurprisingly. Deal with it by using existing quirk which allocates a dummy page after each transfer ring segment. This was seen to resolve both problems. RPi came up with a more efficient solution, shortening each segment by four TRBs, but it complicated the driver and they ditched it for this quirk. Also rename the quirk and add VL805 device ID macro. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Link: https://github.com/raspberrypi/linux/issues/4685 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906 CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-13usb: xhci: Fix host controllers "dying" after suspend and resumeMichal Pecio1-1/+5
commit c7c1f3b05c67173f462d73d301d572b3f9e57e3b upstream. A recent cleanup went a bit too far and dropped clearing the cycle bit of link TRBs, so it stays different from the rest of the ring half of the time. Then a race occurs: if the xHC reaches such link TRB before more commands are queued, the link's cycle bit unintentionally matches the xHC's cycle so it follows the link and waits for further commands. If more commands are queued before the xHC gets there, inc_enq() flips the bit so the xHC later sees a mismatch and stops executing commands. This function is called before suspend and 50% of times after resuming the xHC is doomed to get stuck sooner or later. Then some Stop Endpoint command fails to complete in 5 seconds and this shows up xhci_hcd 0000:00:10.0: xHCI host not responding to stop endpoint command xhci_hcd 0000:00:10.0: xHCI host controller not responding, assume dead xhci_hcd 0000:00:10.0: HC died; cleaning up followed by loss of all USB decives on the affected bus. That's if you are lucky, because if Set Deq gets stuck instead, the failure is silent. Likely responsible for kernel bug 219824. I found this while searching for possible causes of that regression and reproduced it locally before hearing back from the reporter. To repro, simply wait for link cycle to become set (debugfs), then suspend, resume and wait. To accelerate the failure I used a script which repeatedly starts and stops a UVC camera. Some HCs get fully reinitialized on resume and they are not affected. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219824 Fixes: 36b972d4b7ce ("usb: xhci: improve xhci_clear_command_ring()") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250304113147.3322584-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-13xhci: Restrict USB4 tunnel detection for USB3 devices to Intel hostsMarc Zyngier1-0/+8
commit 487cfd4a8e3dc42d34a759017978a4edaf85fce0 upstream. When adding support for USB3-over-USB4 tunnelling detection, a check for an Intel-specific capability was added. This capability, which goes by ID 206, is used without any check that we are actually dealing with an Intel host. As it turns out, the Cadence XHCI controller *also* exposes an extended capability numbered 206 (for unknown purposes), but of course doesn't have the Intel-specific registers that the tunnelling code is trying to access. Fun follows. The core of the problems is that the tunnelling code blindly uses vendor-specific capabilities without any check (the Intel-provided documentation I have at hand indicates that 192-255 are indeed vendor-specific). Restrict the detection code to Intel HW for real, preventing any further explosion on my (non-Intel) HW. Cc: stable <stable@kernel.org> Fixes: 948ce83fbb7df ("xhci: Add USB4 tunnel detection for USB3 devices on Intel hosts") Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250227194529.2288718-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21USB: pci-quirks: Fix HCCPARAMS register error for LS7A EHCIHuacai Chen1-0/+9
commit e71f7f42e3c874ac3314b8f250e8416a706165af upstream. LS7A EHCI controller doesn't have extended capabilities, so the EECP (EHCI Extended Capabilities Pointer) field of HCCPARAMS register should be 0x0, but it reads as 0xa0 now. This is a hardware flaw and will be fixed in future, now just clear the EECP field to avoid error messages on boot: ...... [ 0.581675] pci 0000:00:04.1: EHCI: unrecognized capability ff [ 0.581699] pci 0000:00:04.1: EHCI: unrecognized capability ff [ 0.581716] pci 0000:00:04.1: EHCI: unrecognized capability ff [ 0.581851] pci 0000:00:04.1: EHCI: unrecognized capability ff ...... [ 0.581916] pci 0000:00:05.1: EHCI: unrecognized capability ff [ 0.581951] pci 0000:00:05.1: EHCI: unrecognized capability ff [ 0.582704] pci 0000:00:05.1: EHCI: unrecognized capability ff [ 0.582799] pci 0000:00:05.1: EHCI: unrecognized capability ff ...... Cc: stable <stable@kernel.org> Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20250202124935.480500-1-chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21usb: xhci: Restore xhci_pci support for Renesas HCsMichal Pecio1-3/+4
commit c81d9fcd5b9402166048f377d4e5e0ee6f9ef26d upstream. Some Renesas HCs require firmware upload to work, this is handled by the xhci_pci_renesas driver. Other variants of those chips load firmware from a SPI flash and are ready to work with xhci_pci alone. A refactor merged in v6.12 broke the latter configuration so that users are finding their hardware ignored by the normal driver and are forced to enable the firmware loader which isn't really necessary on their systems. Let xhci_pci work with those chips as before when the firmware loader is disabled by kernel configuration. Fixes: 25f51b76f90f ("xhci-pci: Make xhci-pci-renesas a proper modular driver") Cc: stable <stable@kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219616 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219726 Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Tested-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://lore.kernel.org/r/20250128104529.58a79bfc@foxbook Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08usb: xhci: Fix NULL pointer dereference on certain command abortsMichal Pecio1-1/+2
commit 1e0a19912adb68a4b2b74fd77001c96cd83eb073 upstream. If a command is queued to the final usable TRB of a ring segment, the enqueue pointer is advanced to the subsequent link TRB and no further. If the command is later aborted, when the abort completion is handled the dequeue pointer is advanced to the first TRB of the next segment. If no further commands are queued, xhci_handle_stopped_cmd_ring() sees the ring pointers unequal and assumes that there is a pending command, so it calls xhci_mod_cmd_timer() which crashes if cur_cmd was NULL. Don't attempt timer setup if cur_cmd is NULL. The subsequent doorbell ring likely is unnecessary too, but it's harmless. Leave it alone. This is probably Bug 219532, but no confirmation has been received. The issue has been independently reproduced and confirmed fixed using a USB MCU programmed to NAK the Status stage of SET_ADDRESS forever. Everything continued working normally after several prevented crashes. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219532 Fixes: c311e391a7ef ("xhci: rework command timeout and cancellation,") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-23usb: host: xhci-plat: set skip_phy_initialization if software node has ↵Xu Yang1-1/+2
XHCI_SKIP_PHY_INIT property The source of quirk XHCI_SKIP_PHY_INIT comes from xhci_plat_priv.quirks or software node property. This will set skip_phy_initialization if software node also has XHCI_SKIP_PHY_INIT property. Fixes: a6cd2b3fa894 ("usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk") Cc: stable <stable@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20241209111423.4085548-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-17usb: xhci: fix ring expansion regression in 6.13-rc1Niklas Neronin1-1/+1
The source and destination rings were incorrectly assigned during the ring linking process. The "source" ring, which contains the new segments, was not spliced into the "destination" ring, leading to incorrect ring expansion. Fixes: fe688e500613 ("usb: xhci: refactor xhci_link_rings() to use source and destination rings") Reported-by: Jeff Chua <jeff.chua.linux@gmail.com> Closes: https://lore.kernel.org/lkml/CAAJw_ZtppNqC9XA=-WVQDr+vaAS=di7jo15CzSqONeX48H75MA@mail.gmail.com/ Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-17xhci: Turn NEC specific quirk for handling Stop Endpoint errors genericMathias Nyman1-2/+0
xHC hosts from several vendors have the same issue where endpoints start so slowly that a later queued 'Stop Endpoint' command may complete before endpoint is up and running. The 'Stop Endpoint' command fails with context state error as the endpoint still appears as stopped. See commit 42b758137601 ("usb: xhci: Limit Stop Endpoint retries") for details CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-04usb: host: max3421-hcd: Correctly abort a USB request.Mark Tomlinson1-5/+11
If the current USB request was aborted, the spi thread would not respond to any further requests. This is because the "curr_urb" pointer would not become NULL, so no further requests would be taken off the queue. The solution here is to set the "urb_done" flag, as this will cause the correct handling of the URB. Also clear interrupts that should only be expected if an URB is in progress. Fixes: 2d53139f3162 ("Add support for using a MAX3421E chip as a host driver.") Cc: stable <stable@kernel.org> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20241124221430.1106080-1-mark.tomlinson@alliedtelesis.co.nz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-04usb: ehci-hcd: fix call balance of clocks handling routinesVitalii Mordan1-2/+7
If the clocks priv->iclk and priv->fclk were not enabled in ehci_hcd_sh_probe, they should not be disabled in any path. Conversely, if they was enabled in ehci_hcd_sh_probe, they must be disabled in all error paths to ensure proper cleanup. Found by Linux Verification Center (linuxtesting.org) with Klever. Fixes: 63c845522263 ("usb: ehci-hcd: Add support for SuperH EHCI.") Cc: stable@vger.kernel.org # ff30bd6a6618: sh: clk: Fix clk_enable() to return 0 on NULL clk Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20241121114700.2100520-1-mordan@ispras.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-02module: Convert symbol namespace to string literalPeter Zijlstra2-3/+3
Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ && $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]*\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-16usb: ehci-spear: fix call balance of sehci clk handling routinesVitalii Mordan1-3/+4
If the clock sehci->clk was not enabled in spear_ehci_hcd_drv_probe, it should not be disabled in any path. Conversely, if it was enabled in spear_ehci_hcd_drv_probe, it must be disabled in all error paths to ensure proper cleanup. Found by Linux Verification Center (linuxtesting.org) with Klever. Fixes: 7675d6ba436f ("USB: EHCI: make ehci-spear a separate driver") Cc: stable@vger.kernel.org Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20241114230310.432213-1-mordan@ispras.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-13drivers/usb/host: refactor min/max with min_t/max_tSabyrzhan Tasbolatov3-4/+4
Ensure type safety by using min_t/max_t instead of casted min/max. Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Link: https://lore.kernel.org/r/20241112155817.3512577-4-snovitoll@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-13usb: cdns3: Synchronise PCI IDs via common data baseAndy Shevchenko1-3/+2
There are a few places in the kernel where PCI IDs for different Cadence USB controllers are being used. Besides different naming, they duplicate each other. Make this all in order by providing common definitions via PCI IDs database and use in all users. While doing that, rename definitions as Roger suggested. Suggested-by: Roger Quadros <rogerq@kernel.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20241112160125.2340972-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: Avoid queuing redundant Stop Endpoint commandsMichal Pecio3-4/+29
Stop Endpoint command on an already stopped endpoint fails and may be misinterpreted as a known hardware bug by the completion handler. This results in an unnecessary delay with repeated retries of the command. Avoid queuing this command when endpoint state flags indicate that it's stopped or halted and the command will fail. If commands are pending on the endpoint, their completion handlers will process cancelled TDs so it's done. In case of waiting for external operations like clearing TT buffer, the endpoint is stopped and cancelled TDs can be processed now. This eliminates practically all unnecessary retries because an endpoint with pending URBs is maintained in Running state by the driver, unless aforementioned commands or other operations are pending on it. This is guaranteed by xhci_ring_ep_doorbell() and by the fact that it is called every time any of those operations completes. The only known exceptions are hardware bugs (the endpoint never starts at all) and Stream Protocol errors not associated with any TRB, which cause an endpoint reset not followed by restart. Sounds like a bug. Generally, these retries are only expected to happen when the endpoint fails to start for unknown/no reason, which is a worse problem itself, and fixing the bug eliminates the retries too. All cases were tested and found to work as expected. SET_DEQ_PENDING was produced by patching uvcvideo to unlink URBs in 100us intervals, which then runs into this case very often. EP_HALTED was produced by restarting 'cat /dev/ttyUSB0' on a serial dongle with broken cable. EP_CLEARING_TT by the same, with the dongle on an external hub. Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-34-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: Fix TD invalidation under pending Set TR DequeueMichal Pecio1-5/+13
xhci_invalidate_cancelled_tds() may not work correctly if the hardware is modifying endpoint or stream contexts at the same time by executing a Set TR Dequeue command. And even if it worked, it would be unable to queue Set TR Dequeue for the next stream, failing to clear xHC cache. On stream endpoints, a chain of Set TR Dequeue commands may take some time to execute and we may want to cancel more TDs during this time. Currently this leads to Stop Endpoint completion handler calling this function without testing for SET_DEQ_PENDING, which will trigger the aforementioned problems when it happens. On all endpoints, a halt condition causes Reset Endpoint to be queued and an error status given to the class driver, which may unlink more URBs in response. Stop Endpoint is queued and its handler may execute concurrently with Set TR Dequeue queued by Reset Endpoint handler. (Reset Endpoint handler calls this function too, but there seems to be no possibility of it running concurrently with Set TR Dequeue). Fix xhci_invalidate_cancelled_tds() to work correctly under a pending Set TR Dequeue. Bail out of the function when SET_DEQ_PENDING is set, then make the completion handler call the function again and also call xhci_giveback_invalidated_tds(), which needs to be called next. This seems to fix another potential bug, where the handler would call xhci_invalidate_cancelled_tds(), which may clear some deferred TDs if a sanity check fails, and the TDs wouldn't be given back promptly. Said sanity check seems to be wrong and prone to false positives when the endpoint halts, but fixing it is beyond the scope of this change, besides ensuring that cleared TDs are given back properly. Fixes: 5ceac4402f5d ("xhci: Handle TD clearing for multiple streams case") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-33-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: Limit Stop Endpoint retriesMichal Pecio3-4/+27
Some host controllers fail to atomically transition an endpoint to the Running state on a doorbell ring and enter a hidden "Restarting" state, which looks very much like Stopped, with the important difference that it will spontaneously transition to Running anytime soon. A Stop Endpoint command queued in the Restarting state typically fails with Context State Error and the completion handler sees the Endpoint Context State as either still Stopped or already Running. Even a case of Halted was observed, when an error occurred right after the restart. The Halted state is already recovered from by resetting the endpoint. The Running state is handled by retrying Stop Endpoint. The Stopped state was recognized as a problem on NEC controllers and worked around also by retrying, because the endpoint soon restarts and then stops for good. But there is a risk: the command may fail if the endpoint is "stopped for good" already, and retries will fail forever. The possibility of this was not realized at the time, but a number of cases were discovered later and reproduced. Some proved difficult to deal with, and it is outright impossible to predict if an endpoint may fail to ever start at all due to a hardware bug. One such bug (albeit on ASM3142, not on NEC) was found to be reliably triggered simply by toggling an AX88179 NIC up/down in a tight loop for a few seconds. An endless retries storm is quite nasty. Besides putting needless load on the xHC and CPU, it causes URBs never to be given back, paralyzing the device and connection/disconnection logic for the whole bus if the device is unplugged. User processes waiting for URBs become unkillable, drivers and kworker threads lock up and xhci_hcd cannot be reloaded. For peace of mind, impose a timeout on Stop Endpoint retries in this case. If they don't succeed in 100ms, consider the endpoint stopped permanently for some reason and just give back the unlinked URBs. This failure case is rare already and work is under way to make it rarer. Start this work today by also handling one simple case of race with Reset Endpoint, because it costs just two lines to implement. Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-32-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: remove irrelevant commentNiklas Neronin1-5/+0
The code which it is referencing does not exist in the same function, or the file for that matter. Since it was added [1], the Interrupter Moderation Interval can be changed within xhci addon, e.g. PCI xhci_pci_setup(). [1], commit 0ebbab374223 ("USB: xhci: Ring allocation and initialization.") Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-31-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: add help function xhci_dequeue_td()Niklas Neronin1-16/+13
Add xhci_dequeue_td() helper function to reduce code duplication. Function xhci_dequeue_td() advances the dequeue pointer past the specified Transfer Descriptor (TD) and releases the TD. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-30-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: refactor xhci_td_cleanup() to return voidNiklas Neronin1-31/+28
The function is modified to return 'void' instead of an integer since it invariably returns '0'. Additionally, multiple functions which only return xhci_td_cleanup() are also refactored to return void. This change eliminates the need for callers to handle a return value that does not convey meaningful information and improve code readability, as it becomes immediately clear that the function does not produce a significant output. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-29-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: remove unused arguments from td_to_noop()Niklas Neronin1-7/+6
Function td_to_noop() does not utilize arguments 'xhci' and 'ep_ring'. These unused arguments are removed to clean up the code. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-28-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: improve xhci_clear_command_ring()Niklas Neronin1-3/+1
Remove redundant TRB cycle reset, the TRB cycle is already set to zero by the preceding memset(), making the explicit reset unnecessary. Clarify ring loop start point. Change the loop start from the dequeue segment to the start segment. Both approaches achieve the same result, but starting from the start segment makes it clearer that the entire ring is being zeroed out. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-27-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: request MSI/-X according to requested amountNiklas Neronin1-7/+4
Variable 'max_interrupters' contains the maximum supported interrupters or the maximum interrupters the user has requested. Thus, it should be used instead of HCS_MAX_INTRS(). User set 'max_interrupters' value is validated in xhci_gen_setup(), otherwise 'max_interrupters' value is 'HCS_MAX_INTRS(xhci->hcs_params1)'. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-26-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: move link TRB quirk to xhci_gen_setup()Niklas Neronin1-8/+6
This quirk is old and seldom seen, as a result the trace is changed to debug message and only printed when the quirk is set. Move it into xhci_gen_setup() where the majority of quirks are set. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-25-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: simplify TDs start and end naming scheme in struct 'xhci_td'Niklas Neronin3-38/+38
Old names: * start_seg - last_trb_seg * first_trb - last_trb New names: * start_seg - end_seg * start_trb - end_trb A Transfer Descriptor (TD) in the xhci driver is a data structure that represents a single transaction to be performed by the USB host controller. This transaction is defined by TRBs from 'start_trb' in 'start_seg' to 'end_trb' in 'end_seg'. The terms "start" and "end" were chosen over "first" and "last" for ease of searching within the codebase. The ring structure uses 'first_seg' and 'last_seg', while the TD structure uses 'start_seg' and 'end_seg'. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-24-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06xhci: pci: Fix indentation in the PCI device ID definitionsAndy Shevchenko1-4/+4
Some of the definitions are missing the one TAB, add it to them. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-23-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06xhci: pci: Use standard pattern for device IDsAndy Shevchenko1-5/+5
The definitions of vendor IDs follow the pattern PCI_VENDOR_ID_#vendor, while device IDs — PCI_DEVICE_ID_#vendor_#device. Update the ETRON device IDs to follow the above mentioned pattern. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-22-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06xhci: Don't perform Soft Retry for Etron xHCI hostKuangyi Chiang1-0/+1
Since commit f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors"), unplugging USB device while enumeration results in errors like this: [ 364.855321] xhci_hcd 0000:0b:00.0: ERROR Transfer event for disabled endpoint slot 5 ep 2 [ 364.864622] xhci_hcd 0000:0b:00.0: @0000002167656d70 67f03000 00000021 0c000000 05038001 [ 374.934793] xhci_hcd 0000:0b:00.0: Abort failed to stop command ring: -110 [ 374.958793] xhci_hcd 0000:0b:00.0: xHCI host controller not responding, assume dead [ 374.967590] xhci_hcd 0000:0b:00.0: HC died; cleaning up [ 374.973984] xhci_hcd 0000:0b:00.0: Timeout while waiting for configure endpoint command Seems that Etorn xHCI host can not perform Soft Retry correctly, apply XHCI_NO_SOFT_RETRY quirk to disable Soft Retry and then issue is gone. This patch depends on commit a4a251f8c235 ("usb: xhci: do not perform Soft Retry for some xHCI hosts"). Fixes: f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors") Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-21-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06xhci: Fix control transfer error on Etron xHCI hostKuangyi Chiang1-0/+14
Performing a stability stress test on a USB3.0 2.5G ethernet adapter results in errors like this: [ 91.441469] r8152 2-3:1.0 eth3: get_registers -71 [ 91.458659] r8152 2-3:1.0 eth3: get_registers -71 [ 91.475911] r8152 2-3:1.0 eth3: get_registers -71 [ 91.493203] r8152 2-3:1.0 eth3: get_registers -71 [ 91.510421] r8152 2-3:1.0 eth3: get_registers -71 The r8152 driver will periodically issue lots of control-IN requests to access the status of ethernet adapter hardware registers during the test. This happens when the xHCI driver enqueue a control TD (which cross over the Link TRB between two ring segments, as shown) in the endpoint zero's transfer ring. Seems the Etron xHCI host can not perform this TD correctly, causing the USB transfer error occurred, maybe the upper driver retry that control-IN request can solve problem, but not all drivers do this. | | ------- | TRB | Setup Stage ------- | TRB | Link ------- ------- | TRB | Data Stage ------- | TRB | Status Stage ------- | | To work around this, the xHCI driver should enqueue a No Op TRB if next available TRB is the Link TRB in the ring segment, this can prevent the Setup and Data Stage TRB to be breaked by the Link TRB. Check if the XHCI_ETRON_HOST quirk flag is set before invoking the workaround in xhci_queue_ctrl_tx(). Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.") Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06xhci: Don't issue Reset Device command to Etron xHCI hostKuangyi Chiang3-0/+21
Sometimes the hub driver does not recognize the USB device connected to the external USB2.0 hub when the system resumes from S4. After the SetPortFeature(PORT_RESET) request is completed, the hub driver calls the HCD reset_device callback, which will issue a Reset Device command and free all structures associated with endpoints that were disabled. This happens when the xHCI driver issue a Reset Device command to inform the Etron xHCI host that the USB device associated with a device slot has been reset. Seems that the Etron xHCI host can not perform this command correctly, affecting the USB device. To work around this, the xHCI driver should obtain a new device slot with reference to commit 651aaf36a7d7 ("usb: xhci: Handle USB transaction error on address command"), which is another way to inform the Etron xHCI host that the USB device has been reset. Add a new XHCI_ETRON_HOST quirk flag to invoke the workaround in xhci_discover_or_reset_device(). Fixes: 2a8f82c4ceaf ("USB: xhci: Notify the xHC when a device is reset.") Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-19-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06xhci: Combine two if statements for Etron xHCI hostKuangyi Chiang1-6/+2
Combine two if statements, because these hosts have the same quirk flags applied. [Mathias: has stable tag because other fixes in series depend on this] Fixes: 91f7a1524a92 ("xhci: Apply broken streams quirk to Etron EJ188 xHCI host") Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-18-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: add xhci_initialize_ring_segments()Niklas Neronin1-18/+23
A ring consists of a list of segments, each containing a specific number of TRBs. The xhci driver allocates and initializes ring segments and TRBs in the same functions. This combined allocation and initialization process leads to an issue where, after hibernation (S4 state), the xhci driver frees all its memory and re-creates the rings, segments, and TRBs from scratch. Move all default ring segment initialization into function xhci_initialize_ring_segments(). This function can be called to reinitialize a ring without freeing and reallocating it. Since xhci_alloc_segments_for_ring() no longer initializes segment TRBs, xhci_initialize_ring_segments() is added to xhci_ring_expansion(). This results in the last segment of the source ring having the 'LINK_TOGGLE' bit set. Therefore, if the last source ring segment is not the last in the destination ring, the 'LINK_TOGGLE' bit must be cleared. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-17-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: rework xhci_link_segments()Niklas Neronin1-24/+30
Prepare for splitting ring segments allocation and initialization by reworking the xhci_link_segments() function. Segment linking and ring type checks are moved out of xhci_link_segments(), and the function is renamed to "xhci_set_link_trb()". The goal is to keep ring linking within xhci_alloc_segments_for_ring() and move initialization into a separate function. Additionally, reorder and simplify xhci_set_link_trb() for better readability. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-16-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: refactor xhci_link_rings() to use source and destination ringsNiklas Neronin1-20/+20
Refactor the xhci_link_rings() function to accept two rings: a source ring and a destination ring. Previously, the function accepted a ring and a segment list as arguments, now the function splices the source ring segment list into the destination ring. This new approach reduces the number of arguments and simplifies the code, making it easier to follow. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-15-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06usb: xhci: rework xhci_free_segments_for_ring()Niklas Neronin1-10/+11
The segment list is only relevant within the context of the ring. Thus, it is more logical to pass the ring itself when freeing all segments from a ring,