summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
2025-12-12drm/amdkfd: Fix GPU mappings for APU after prefetchHarish Kasiviswanathan1-0/+2
[ Upstream commit eac32ff42393efa6657efc821231b8d802c1d485 ] Fix the following corner case:- Consider a 2M huge page SVM allocation, followed by prefetch call for the first 4K page. The whole range is initially mapped with single PTE. After the prefetch, this range gets split to first page + rest of the pages. Currently, the first page mapping is not updated on MI300A (APU) since page hasn't migrated. However, after range split PTE mapping it not valid. Fix this by forcing page table update for the whole range when prefetch is called. Calling prefetch on APU doesn't improve performance. If all it deteriotes. However, functionality has to be supported. v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout VRAM v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU v4: Remove v2 and v3 changes. Force update_mapping when range is split at a size that is not aligned to prange granularity Suggested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 076470b9f6f8d9c7c8ca73a9f054942a686f9ba7) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-12drm/vmwgfx: Use kref in vmw_bo_dirtyIan Forbes1-7/+5
[ Upstream commit c1962742ffff7e245f935903a4658eb6f94f6058 ] Rather than using an ad hoc reference count use kref which is atomic and has underflow warnings. Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patch.msgid.link/20251030193640.153697-1-ian.forbes@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07drm/amd/display: Increase EDID read retriesMario Limonciello (AMD)1-4/+4
commit 8ea902361734c87b82122f9c17830f168ebfc65a upstream. [WHY] When monitor is still booting EDID read can fail while DPCD read is successful. In this case no EDID data will be returned, and this could happen for a while. [HOW] Increase number of attempts to read EDID in dm_helpers_read_local_edid() to 25. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a76d6f2c76c3abac519ba753e2723e6ffe8e461c) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/amd/display: Don't change brightness for disabled connectorsMario Limonciello (AMD)1-0/+15
commit 81f4d4ba509522596143fd5d7dc2fc3495296b0a upstream. [WHY] When a laptop lid is closed the connector is disabled but userspace can still try to change brightness. This doesn't work because the panel is turned off. It will eventually time out, but there is a lot of stutter along the way. [How] Iterate all connectors to check whether the matching one for the backlight index is enabled. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4675 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f6eeab30323d1174a4cc022e769d248fe8241304) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/amd/display: Check NULL before accessingAlex Hung1-3/+8
commit 3ce62c189693e8ed7b3abe551802bbc67f3ace54 upstream. [WHAT] IGT kms_cursor_legacy's long-nonblocking-modeset-vs-cursor-atomic fails with NULL pointer dereference. This can be reproduced with both an eDP panel and a DP monitors connected. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 13 UID: 0 PID: 2960 Comm: kms_cursor_lega Not tainted 6.16.0-99-custom #8 PREEMPT(voluntary) Hardware name: AMD ........ RIP: 0010:dc_stream_get_scanoutpos+0x34/0x130 [amdgpu] Code: 57 4d 89 c7 41 56 49 89 ce 41 55 49 89 d5 41 54 49 89 fc 53 48 83 ec 18 48 8b 87 a0 64 00 00 48 89 75 d0 48 c7 c6 e0 41 30 c2 <48> 8b 38 48 8b 9f 68 06 00 00 e8 8d d7 fd ff 31 c0 48 81 c3 e0 02 RSP: 0018:ffffd0f3c2bd7608 EFLAGS: 00010292 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffd0f3c2bd7668 RDX: ffffd0f3c2bd7664 RSI: ffffffffc23041e0 RDI: ffff8b32494b8000 RBP: ffffd0f3c2bd7648 R08: ffffd0f3c2bd766c R09: ffffd0f3c2bd7760 R10: ffffd0f3c2bd7820 R11: 0000000000000000 R12: ffff8b32494b8000 R13: ffffd0f3c2bd7664 R14: ffffd0f3c2bd7668 R15: ffffd0f3c2bd766c FS: 000071f631b68700(0000) GS:ffff8b399f114000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001b8105000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> dm_crtc_get_scanoutpos+0xd7/0x180 [amdgpu] amdgpu_display_get_crtc_scanoutpos+0x86/0x1c0 [amdgpu] ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10[amdgpu] amdgpu_crtc_get_scanout_position+0x27/0x50 [amdgpu] drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xf7/0x400 drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x30 drm_crtc_get_last_vbltimestamp+0x55/0x90 drm_crtc_next_vblank_start+0x45/0xa0 drm_atomic_helper_wait_for_fences+0x81/0x1f0 ... Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 621e55f1919640acab25383362b96e65f2baea3c) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/amd/amdgpu: reserve vm invalidation engine for uni_mesMichael Chen1-0/+3
commit 971fb57429df5aa4e6efc796f7841e0d10b1e83c upstream. Reserve vm invalidation engine 6 when uni_mes enabled. It is used in processing tlb flush request from host. Signed-off-by: Michael Chen <michael.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shaoyun liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 873373739b9b150720ea2c5390b4e904a4d21505) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/amdgpu: attach tlb fence to the PTs updatePrike Liang1-1/+1
commit b4a7f4e7ad2b120a94f3111f92a11520052c762d upstream. Ensure the userq TLB flush is emitted only after the VM update finishes and the PT BOs have been annotated with bookkeeping fences. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f3854e04b708d73276c4488231a8bd66d30b4671) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/xe/guc: Fix stack_depot usageLucas De Marchi1-0/+3
commit 0e234632e39bd21dd28ffc9ba3ae8eec4deb949c upstream. Add missing stack_depot_init() call when CONFIG_DRM_XE_DEBUG_GUC is enabled to fix the following call stack: [] BUG: kernel NULL pointer dereference, address: 0000000000000000 [] Workqueue: drm_sched_run_job_work [gpu_sched] [] RIP: 0010:stack_depot_save_flags+0x172/0x870 [] Call Trace: [] <TASK> [] fast_req_track+0x58/0xb0 [xe] Fixes: 16b7e65d299d ("drm/xe/guc: Track FAST_REQ H2Gs to report where errors came from") Tested-by: Sagar Ghuge <sagar.ghuge@intel.com> Cc: stable@vger.kernel.org # v6.17+ Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251118-fix-debug-guc-v1-1-9f780c6bedf8@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 64fdf496a6929a0a194387d2bb5efaf5da2b542f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/i915/psr: Reject async flips when selective fetch is enabledVille Syrjälä2-6/+8
commit 7c373b3bd03c77fe8f6ea206ed49375eb4d43d13 upstream. The selective fetch code doesn't handle asycn flips correctly. There is a nonsense check for async flips in intel_psr2_sel_fetch_config_valid() but that only gets called for modesets/fastsets and thus does nothing for async flips. Currently intel_async_flip_check_hw() is very unhappy as the selective fetch code pulls in planes that are not even async flips capable. Reject async flips when selective fetch is enabled, until someone fixes this properly (ie. disable selective fetch while async flips are being issued). Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20251105171015.22234-1-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander <jouni.hogander@intel.com> (cherry picked from commit a5f0cc8e0cd4007370af6985cb152001310cf20c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm: sti: fix device leaks at component probeJohan Hovold1-1/+6
commit 620a8f131154250f6a64a07d049a4f235d6451a5 upstream. Make sure to drop the references taken to the vtg devices by of_find_device_by_node() when looking up their driver data during component probe. Note that holding a reference to a platform device does not prevent its driver data from going away so there is no point in keeping the reference after the lookup helper returns. Fixes: cc6b741c6f63 ("drm: sti: remove useless fields from vtg structure") Cc: stable@vger.kernel.org # 4.16 Cc: Benjamin Gaignard <benjamin.gaignard@collabora.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20250922122012.27407-1-johan@kernel.org Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm, fbcon, vga_switcheroo: Avoid race condition in fbcon setupThomas Zimmermann1-14/+0
commit eb76d0f5553575599561010f24c277cc5b31d003 upstream. Protect vga_switcheroo_client_fb_set() with console lock. Avoids OOB access in fbcon_remap_all(). Without holding the console lock the call races with switching outputs. VGA switcheroo calls fbcon_remap_all() when switching clients. The fbcon function uses struct fb_info.node, which is set by register_framebuffer(). As the fb-helper code currently sets up VGA switcheroo before registering the framebuffer, the value of node is -1 and therefore not a legal value. For example, fbcon uses the value within set_con2fb_map() [1] as an index into an array. Moving vga_switcheroo_client_fb_set() after register_framebuffer() can result in VGA switching that does not switch fbcon correctly. Therefore move vga_switcheroo_client_fb_set() under fbcon_fb_registered(), which already holds the console lock. Fbdev calls fbcon_fb_registered() from within register_framebuffer(). Serializes the helper with VGA switcheroo's call to fbcon_remap_all(). Although vga_switcheroo_client_fb_set() takes an instance of struct fb_info as parameter, it really only needs the contained fbcon state. Moving the call to fbcon initialization is therefore cleaner than before. Only amdgpu, i915, nouveau and radeon support vga_switcheroo. For all other drivers, this change does nothing. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://elixir.bootlin.com/linux/v6.17/source/drivers/video/fbdev/core/fbcon.c#L2942 # [1] Fixes: 6a9ee8af344e ("vga_switcheroo: initial implementation (v15)") Acked-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: linux-fbdev@vger.kernel.org Cc: <stable@vger.kernel.org> # v2.6.34+ Link: https://patch.msgid.link/20251105161549.98836-1-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07Revert "drm/amd/display: Move setup_stream_attribute"Alex Deucher5-12/+3
commit 3126c9ccb4373d8758733c6699ba5ab93dbe5c9d upstream. This reverts commit 2681bf4ae8d24df950138b8c9ea9c271cd62e414. This results in a blank screen on the HDMI port on some systems. Revert for now so as not to regress 6.18, can be addressed in 6.19 once the issue is root caused. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4652 Cc: Sunpeng.Li@amd.com Cc: ivan.lipski@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d0e9de7a81503cdde37fb2d37f1d102f9e0f38fb) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07drm/amdgpu: fix cyan_skillfish2 gpu info fw handlingAlex Deucher1-0/+2
[ Upstream commit 7fa666ab07ba9e08f52f357cb8e1aad753e83ac6 ] If the board supports IP discovery, we don't need to parse the gpu info firmware. Backport to 6.18. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4721 Fixes: fa819e3a7c1e ("drm/amdgpu: add support for cyan skillfish gpu_info") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5427e32fa3a0ba9a016db83877851ed277b065fb) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07drm/xe: Fix conversion from clock ticks to millisecondsHarish Chegondi1-6/+1
[ Upstream commit 7276878b069c57d9a9cca5db01d2f7a427b73456 ] When tick counts are large and multiplication by MSEC_PER_SEC is larger than 64 bits, the conversion from clock ticks to milliseconds can go bad. Use mul_u64_u32_div() instead. Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Suggested-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 49cc215aad7f ("drm/xe: Add xe_gt_clock_interval_to_ms helper") Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/1562f1b62d5be3fbaee100f09107f3cc49e40dd1.1763408584.git.harish.chegondi@intel.com (cherry picked from commit 96b93ac214f9dd66294d975d86c5dee256faef91) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07drm/bridge: sii902x: Fix HDMI detection with DRM_BRIDGE_ATTACH_NO_CONNECTORDevarsh Thakkar1-12/+8
[ Upstream commit d6732ef4ab252e5753be7acad87b0a91cfd06953 ] The sii902x driver was caching HDMI detection state in a sink_is_hdmi field and checking it in mode_set() to determine whether to set HDMI or DVI output mode. This approach had two problems: 1. With DRM_BRIDGE_ATTACH_NO_CONNECTOR (used by modern display drivers like TIDSS), the bridge's get_modes() is never called. Instead, the drm_bridge_connector helper calls the bridge's edid_read() and updates the connector itself. This meant sink_is_hdmi was never populated, causing the driver to default to DVI mode and breaking HDMI audio. 2. The mode_set() callback doesn't receive atomic state or connector pointer, making it impossible to check connector->display_info.is_hdmi directly at that point. Fix this by moving the HDMI vs DVI decision from mode_set() to atomic_enable(), where we can access the connector via drm_atomic_get_new_connector_for_encoder(). This works for both connector models: - With DRM_BRIDGE_ATTACH_NO_CONNECTOR: Returns the drm_bridge_connector created by the display driver, which has already been updated by the helper's call to drm_edid_connector_update() - Without DRM_BRIDGE_ATTACH_NO_CONNECTOR (legacy): Returns the connector embedded in sii902x struct, which gets updated by the bridge's own get_modes() Fixes: 3de47e1309c2 ("drm/bridge: sii902x: use display info is_hdmi") Signed-off-by: Devarsh Thakkar <devarsht@ti.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20251030151635.3019864-1-devarsht@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/i915/dp: Add device specific quirk to limit eDP rate to HBR2Ankit Nautiyal3-7/+44
commit 21c586d9233a1f258e8d437466c441d50885d30f upstream. Some ICL/TGL platforms with combo PHY ports suffer from signal integrity issues at HBR3. While certain systems include a Parade PS8461 mux to mitigate this, its presence cannot be reliably detected. Furthermore, broken or missing VBT entries make it unsafe to rely on VBT for enforcing link rate limits. To address this introduce a device specific quirk to cap the eDP link rate to HBR2 (540000 kHz). This will override any higher advertised rates from the sink or DPCD for specific devices. Currently, the quirk is added for Dell XPS 13 7390 2-in-1 which is reported in gitlab issue #5969 [1]. [1] https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5969 v2: Align the quirk with the intended quirk name and refactor the condition to use min(). (Jani) v3: Use condition `rate > 540000`. Drop extra parentheses. (Ville) Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5969 Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250710052041.1238567-3-ankit.k.nautiyal@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01Revert "drm/i915/dp: Reject HBR3 when sink doesn't support TPS4"Ankit Nautiyal1-42/+7
commit 8c9006283e4b767003b2d11182d6e90f8b184c3d upstream. This reverts commit 584cf613c24a4250d9be4819efc841aa2624d5b6. Commit 584cf613c24a ("drm/i915/dp: Reject HBR3 when sink doesn't support TPS4") introduced a blanket rejection of HBR3 link rate when the sink does not support TPS4. While this was intended to address instability observed on certain eDP panels [1], there seem to be edp panels that do not follow the specification. These eDP panels do not advertise TPS4 support, but require HBR3 to operate at their fixed native resolution [2]. As a result, the change causes blank screens on such panels. Apparently, Windows driver does not enforce this restriction, and the issue is not seen there. Therefore, revert the commit to restore functionality for such panels, and align behaviour with Windows driver. [1] https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5969 [2] https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14517 v2: Update the commit message with better justification. (Ville) Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14517 Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250710052041.1238567-2-ankit.k.nautiyal@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amd/display: Prevent Gating DTBCLK before It Is Properly LatchedFangzhi Zuo2-2/+4
[ Upstream commit cfa0904a35fd0231f4d05da0190f0a22ed881cce ] [why] 1. With allow_0_dtb_clk enabled, the time required to latch DTBCLK to 600 MHz depends on the SMU. If DTBCLK is not latched to 600 MHz before set_mode completes, gating DTBCLK causes the DP2 sink to lose its clock source. 2. The existing DTBCLK gating sequence ungates DTBCLK based on both pix_clk and ref_dtbclk, but gates DTBCLK when either pix_clk or ref_dtbclk is zero. pix_clk can be zero outside the set_mode sequence before DTBCLK is properly latched, which can lead to DTBCLK being gated by mistake. [how] Consider both pixel_clk and ref_dtbclk when determining when it is safe to gate DTBCLK; this is more accurate. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4701 Fixes: 5949e7c4890c ("drm/amd/display: Enable Dynamic DTBCLK Switch") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d04eb0c402780ca037b62a6aecf23b863545ebca) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amd/display: Insert dccg log for easy debugCharlene Liu1-3/+21
[ Upstream commit 35bcc9168f3ce6416cbf3f776758be0937f84cb3 ] [why] Log for sequence tracking Reviewed-by: Ovidiu (Ovi) Bunea <ovidiu.bunea@amd.com> Reviewed-by: Yihan Zhu <yihan.zhu@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: cfa0904a35fd ("drm/amd/display: Prevent Gating DTBCLK before It Is Properly Latched") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1Sathishkumar S1-0/+1
[ Upstream commit bbe3c115030da431c9ec843c18d5583e59482dd2 ] enable parse_cs callback for JPEG5_0_1. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 547985579932c1de13f57f8bcf62cd9361b9d3d3) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amdgpu/jpeg: Move parse_cs to amdgpu_jpeg.cSathishkumar S10-70/+83
[ Upstream commit 28f75f9bcc7da7da12e5dae2ae8d8629a2b2e42e ] Rename jpeg_v2_dec_ring_parse_cs to amdgpu_jpeg_dec_parse_cs and move it to amdgpu_jpeg.c as it is shared among jpeg versions. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: bbe3c115030d ("drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/i915/dp_mst: Disable Panel ReplayImre Deak1-0/+4
[ Upstream commit f2687d3cc9f905505d7b510c50970176115066a2 ] Disable Panel Replay on MST links until it's properly implemented. For instance the required VSC SDP is not programmed on MST and FEC is not enabled if Panel Replay is enabled. Fixes: 3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15174 Cc: Jouni Högander <jouni.hogander@intel.com> Cc: Animesh Manna <animesh.manna@intel.com> Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patch.msgid.link/20251107124141.911895-1-imre.deak@intel.com (cherry picked from commit e109f644b871df8440c886a69cdce971ed533088) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/i915/psr: Check drm_dp_dpcd_read return value on PSR dpcd initJouni Högander1-11/+21
[ Upstream commit 9cc10041e9fe7f32c4817e3cdd806ff1986d266c ] Currently we are ignoriong drm_dp_dpcd_read return values when reading PSR and Panel Replay capability DPCD register. Rework intel_psr_dpcd a bit to take care of checking the return value. v2: use drm_dp_dpcd_read_data Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250821045918.17757-1-jouni.hogander@intel.com Stable-dep-of: f2687d3cc9f9 ("drm/i915/dp_mst: Disable Panel Replay") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amdgpu: fix gpu page fault after hibernation on PF passthroughSamuel Zhang2-2/+5
[ Upstream commit eb6e7f520d6efa4d4ebf1671455abe4a681f7a05 ] On PF passthrough environment, after hibernate and then resume, coralgemm will cause gpu page fault. Mode1 reset happens during hibernate, but partition mode is not restored on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right after resume. When CP access the MQD BO, wrong stride size is used, this will cause out of bound access on the MQD BO, resulting page fault. The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called when resume from a hibernation. KFD resume is called separately during a reset recovery or resume from suspend sequence. Hence it's not required to be called as part of partition switch. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5d1b32cfe4a676fe552416cb5ae847b215463a1a) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/xe: Prevent BIT() overflow when handling invalid prefetch regionShuicheng Lin1-2/+2
[ Upstream commit d52dea485cd3c98cfeeb474cf66cf95df2ab142f ] If user provides a large value (such as 0x80) for parameter prefetch_mem_region_instance in vm_bind ioctl, it will cause BIT(prefetch_region) overflow as below: " ------------[ cut here ]------------ UBSAN: shift-out-of-bounds in drivers/gpu/drm/xe/xe_vm.c:3414:7 shift exponent 128 is too large for 64-bit type 'long unsigned int' CPU: 8 UID: 0 PID: 53120 Comm: xe_exec_system_ Tainted: G W 6.18.0-rc1-lgci-xe-kernel+ #200 PREEMPT(voluntary) Tainted: [W]=WARN Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023 Call Trace: <TASK> dump_stack_lvl+0xa0/0xc0 dump_stack+0x10/0x20 ubsan_epilogue+0x9/0x40 __ubsan_handle_shift_out_of_bounds+0x10e/0x170 ? mutex_unlock+0x12/0x20 xe_vm_bind_ioctl.cold+0x20/0x3c [xe] ... " Fix it by validating prefetch_region before the BIT() usage. v2: Add Closes and Cc stable kernels. (Matt) Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6478 Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251112181005.2120521-2-shuicheng.lin@intel.com (cherry picked from commit 8f565bdd14eec5611cc041dba4650e42ccdf71d9) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit d52dea485cd3c98cfeeb474cf66cf95df2ab142f) Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/msm: Fix pgtable prealloc error pathRob Clark1-0/+5
[ Upstream commit 830d68f2cb8ab6fb798bb9555016709a9e012af0 ] The following splat was reported: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=00000008d0fd8000 [0000000000000010] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP CPU: 5 UID: 1000 PID: 149076 Comm: Xwayland Tainted: G S 6.16.0-rc2-00809-g0b6974bb4134-dirty #367 PREEMPT Tainted: [S]=CPU_OUT_OF_SPEC Hardware name: Qualcomm Technologies, Inc. SM8650 HDK (DT) pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : build_detached_freelist+0x28/0x224 lr : kmem_cache_free_bulk.part.0+0x38/0x244 sp : ffff000a508c7a20 x29: ffff000a508c7a20 x28: ffff000a508c7d50 x27: ffffc4e49d16f350 x26: 0000000000000058 x25: 00000000fffffffc x24: 0000000000000000 x23: ffff00098c4e1450 x22: 00000000fffffffc x21: 0000000000000000 x20: ffff000a508c7af8 x19: 0000000000000002 x18: 00000000000003e8 x17: ffff000809523850 x16: ffff000809523820 x15: 0000000000401640 x14: ffff000809371140 x13: 0000000000000130 x12: ffff0008b5711e30 x11: 00000000001058fa x10: 0000000000000a80 x9 : ffff000a508c7940 x8 : ffff000809371ba0 x7 : 781fffe033087fff x6 : 0000000000000000 x5 : ffff0008003cd000 x4 : 781fffe033083fff x3 : ffff000a508c7af8 x2 : fffffdffc0000000 x1 : 0001000000000000 x0 : ffff0008001a6a00 Call trace: build_detached_freelist+0x28/0x224 (P) kmem_cache_free_bulk.part.0+0x38/0x244 kmem_cache_free_bulk+0x10/0x1c msm_iommu_pagetable_prealloc_cleanup+0x3c/0xd0 msm_vma_job_free+0x30/0x240 msm_ioctl_vm_bind+0x1d0/0x9a0 drm_ioctl_kernel+0x84/0x104 drm_ioctl+0x358/0x4d4 __arm64_sys_ioctl+0x8c/0xe0 invoke_syscall+0x44/0x100 el0_svc_common.constprop.0+0x3c/0xe0 do_el0_svc+0x18/0x20 el0_svc+0x30/0x100 el0t_64_sync_handler+0x104/0x130 el0t_64_sync+0x170/0x174 Code: aa0203f5 b26287e2 f2dfbfe2 aa0303f4 (f8737ab6) ---[ end trace 0000000000000000 ]--- Since msm_vma_job_free() is called directly from the ioctl, this looks like an error path cleanup issue. Which I think results from prealloc_cleanup() called without a preceding successful prealloc_allocate() call. So handle that case better. Reported-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/678677/ Message-ID: <20251006153542.419998-1-robin.clark@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/xe/irq: Handle msix vector0 interruptVenkata Ramana Nayana1-17/+1
[ Upstream commit 5b38c22687d9287d85dd3bef2fa708bf62cf3895 ] Current gu2host handler registered as MSI-X vector 0 and as per bspec for a msix vector 0 interrupt, the driver must check the legacy registers 190008(TILE_INT_REG), 190060h (GT INTR Identity Reg 0) and other registers mentioned in "Interrupt Service Routine Pseudocode" otherwise it will block the next interrupts. To overcome this issue replacing guc2host handler with legacy xe_irq_handler. Fixes: da889070be7b2 ("drm/xe/irq: Separate MSI and MSI-X flows") Bspec: 62357 Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patch.msgid.link/20251107083141.2080189-1-venkata.ramana.nayana@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit c34a14bce7090862ebe5a64abe8d85df75e62737) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/xe/kunit: Fix forcewake assertion in mocs testMatt Roper1-1/+1
[ Upstream commit 905a3468ec679293949438393de7e61310432662 ] The MOCS kunit test calls KUNIT_ASSERT_TRUE_MSG() with a condition of 'true;' this prevents the assertion from ever failing. Replace KUNIT_ASSERT_TRUE_MSG with KUNIT_FAIL_AND_ABORT to get the intended failure behavior in cases where forcewake was not acquired successfully. Fixes: 51c0ee84e4dc ("drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs") Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patch.msgid.link/20251113234038.2256106-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 9be4f0f687048ba77428ceca11994676736507b7) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/i915/xe3: Restrict PTL intel_encoder_is_c10phy() to only PHY ADnyaneshwar Bhadane1-8/+6
[ Upstream commit 5474560381775bc70cc90ed2acefad48ffd6ee07 ] On PTL, no combo PHY is connected to PORT B. However, PORT B can still be used for Type-C and will utilize the C20 PHY for eDP over Type-C. In such configurations, VBTs also enumerate PORT B. This leads to issues where PORT B is incorrectly identified as using the C10 PHY, due to the assumption that returning true for PORT B in intel_encoder_is_c10phy() would not cause problems. From PTL's perspective, only PORT A/PHY A uses the C10 PHY. Update the helper intel_encoder_is_c10phy() to return true only for PORT A/PHY on PTL. v2: Change the condition code style for ptl/wcl Bspec: 72571,73944 Fixes: 9d10de78a37f ("drm/i915/wcl: C10 phy connected to port A and B") Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250922150317.2334680-4-dnyaneshwar.bhadane@intel.com (cherry picked from commit 8147f7a1c083fd565fb958824f7c552de3b2dc46) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/i915/display: Add definition for wcl as subplatformDnyaneshwar Bhadane2-1/+15
[ Upstream commit 913253ed47b9925454cbb17faa3e350015b3d67a ] We will need to differentiate between WCL and PTL in intel_encoder_is_c10phy(). Since WCL and PTL use the same display architecture, let's define WCL as a subplatform of PTL to allow the differentiation. v2: Update commit message and reorder wcl define (Gustavo) Fixes: 3c0f211bc8fc ("drm/xe: Add Wildcat Lake device IDs to PTL list") Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250922150317.2334680-3-dnyaneshwar.bhadane@intel.com (cherry picked from commit 4dfaae643e59cf3ab71b88689dce1b874f036f00) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo added Fixes tag when porting it to fixes] Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/pcids: Split PTL pciids group to make wcl subplatformDnyaneshwar Bhadane2-0/+2
[ Upstream commit 6eb2e056b0e418718fc5a3cfe79bdb41d9a2851d ] To form the WCL platform as a subplatform of PTL in definition, WCL pci ids are splited into saparate group from PTL. So update the pciidlist struct to cover all the pci ids. v2: - Squash wcl description in single patch for display and xe.(jani,gustavo) Fixes: 3c0f211bc8fc ("drm/xe: Add Wildcat Lake device IDs to PTL list") Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250922150317.2334680-2-dnyaneshwar.bhadane@intel.com (cherry picked from commit 32620e176443bf23ec81bfe8f177c6721a904864) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo added the Fixes tag when porting it to fixes] Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/i915/xe3lpd: Load DMC for Xe3_LPD version 30.02Dnyaneshwar Bhadane1-3/+7
[ Upstream commit fa766e759ff7b128ab77323d9d9c232434621bb6 ] Load the DMC for Xe3_LPD version 30.02. Fixes: 3c0f211bc8fc ("drm/xe: Add Wildcat Lake device IDs to PTL list") Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Link: https://lore.kernel.org/r/20251016131517.2032684-1-dnyaneshwar.bhadane@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> (cherry picked from commit a63db39a578b543f5e5719b9f14dd82d3b8648d1) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo added the Fixes tag while cherry-picking to fixes] Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/tegra: Add call to put_pid()Prateek Agarwal1-2/+5
[ Upstream commit 6cbab9f0da72b4dc3c3f9161197aa3b9daa1fa3a ] Add a call to put_pid() corresponding to get_task_pid(). host1x_memory_context_alloc() does not take ownership of the PID so we need to free it here to avoid leaking. Signed-off-by: Prateek Agarwal <praagarwal@nvidia.com> Fixes: e09db97889ec ("drm/tegra: Support context isolation") [mperttunen@nvidia.com: reword commit message] Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20250919-host1x-put-pid-v1-1-19c2163dfa87@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01drm/amd/display: Clear the CUR_ENABLE register on DCN20 on DPP5Ivan Lipski1-0/+8
commit 5bab4c89390f32b2f491f49a151948cd226dd909 upstream. [Why] On DCN20 & DCN30, the 6th DPP's & HUBP's are powered on permanently and cannot be power gated. Thus, when dpp_reset() is invoked for the DPP5, while it's still powered on, the cached cursor_state (dpp_base->pos.cur0_ctl.bits.cur0_enable) and the actual state (CUR0_ENABLE) bit are unsycned. This can cause a double cursor in full screen with non-native scaling. [How] Force disable cursor on DPP5 on plane powerdown for ASICs w/ 6 DPPs/HUBPs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4673 Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 79b3c037f972dcb13e325a8eabfb8da835764e15) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amd/display: Fix pbn to kbps ConversionFangzhi Zuo1-36/+23
commit 1788ef30725da53face7e311cdf62ad65fababcd upstream. [Why] Existing routine has two conversion sequence, pbn_to_kbps and kbps_to_pbn with margin. Non of those has without-margin calculation. kbps_to_pbn with margin conversion includes fec overhead which has already been included in pbn_div calculation with 0.994 factor considered. It is a double counted fec overhead factor that causes potential bw loss. [How] Add without-margin calculation. Fix fec overhead double counted issue. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3735 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e0dec00f3d05e8c0eceaaebfdca217f8d10d380c) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amd/display: Move sleep into each retry for retrieve_link_cap()Mario Limonciello (AMD)1-4/+5
commit 71ad9054c1f241be63f9d11df8cbd0aa0352fe16 upstream. [Why] When a monitor is booting it's possible that it isn't ready to retrieve link caps and this can lead to an EDID read failure: ``` [drm:retrieve_link_cap [amdgpu]] *ERROR* retrieve_link_cap: Read receiver caps dpcd data failed. amdgpu 0000:c5:00.0: [drm] *ERROR* No EDID read. ``` [How] Rather than msleep once and try a few times, msleep each time. Should be no changes for existing working monitors, but should correct reading caps on a monitor that is slow to boot. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 669dca37b3348a447db04bbdcbb3def94d5997cc) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amd/display: Increase DPCD read retriesMario Limonciello (AMD)1-1/+1
commit 8612badc331bcab2068baefa69e1458085ed89e3 upstream. [Why] Empirical measurement of some monitors that fail to read EDID while booting shows that the number of retries with a 30ms delay between tries is as high as 16. [How] Increase number of retries to 20. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ad1c59ad7cf74ec06e32fe2c330ac1e957222288) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amdgpu: Skip emit de meta data on gfx11 with rs64 enabledYifan Zha1-2/+2
commit 80d8a9ad1587b64c545d515ab6cb7ecb9908e1b3 upstream. [Why] Accoreding to CP updated to RS64 on gfx11, WRITE_DATA with PREEMPTION_META_MEMORY(dst_sel=8) is illegal for CP FW. That packet is used for MCBP on F32 based system. So it would lead to incorrect GRBM write and FW is not handling that extra case correctly. [How] With gfx11 rs64 enabled, skip emit de meta data. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8366cd442d226463e673bed5d199df916f4ecbcf) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/amd: Skip power ungate during suspend for VPEMario Limonciello1-1/+2
commit 31ab31433c9bd2f255c48dc6cb9a99845c58b1e4 upstream. During the suspend sequence VPE is already going to be power gated as part of vpe_suspend(). It's unnecessary to call during calls to amdgpu_device_set_pg_state(). It actually can expose a race condition with the firmware if s0i3 sequence starts as well. Drop these calls. Cc: Peyton.Lee@amd.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2a6c826cfeedd7714611ac115371a959ead55bda) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/plane: Fix create_in_format_blob() return valueVille Syrjälä1-2/+2
commit cead55e24cf9e092890cf51c0548eccd7569defa upstream. create_in_format_blob() is either supposed to return a valid pointer or an error, but never NULL. The caller will dereference the blob when it is not an error, and thus will oops if NULL returned. Return proper error values in the failure cases. Cc: stable@vger.kernel.org Cc: Arun R Murthy <arun.r.murthy@intel.com> Fixes: 0d6dcd741c26 ("drm/plane: modify create_in_formats to acommodate async") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20251112233030.24117-2-ville.syrjala@linux.intel.com Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/radeon: delete radeon_fence_process in is_signaled, no deadlockRobert McClinton1-7/+0
commit 9eb00b5f5697bd56baa3222c7a1426fa15bacfb5 upstream. Delete the attempt to progress the queue when checking if fence is signaled. This avoids deadlock. dma-fence_ops::signaled can be called with the fence lock in unknown state. For radeon, the fence lock is also the wait queue lock. This can cause a self deadlock when signaled() tries to make forward progress on the wait queue. But advancing the queue is unneeded because incorrectly returning false from signaled() is perfectly acceptable. Link: https://github.com/brave/brave-browser/issues/49182 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4641 Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Robert McClinton <rbmccav@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 527ba26e50ec2ca2be9c7c82f3ad42998a75d0db) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01drm/tegra: dc: Fix reference leak in tegra_dc_couple()Ma Ke1-0/+1
commit 4c5376b4b143c4834ebd392aef2215847752b16a upstream. driver_find_device() calls get_device() to increment the reference count once a matching device is found, but there is no put_device() to balance the reference count. To avoid reference count leakage, add put_device() to decrease the reference count. Found by code review. Cc: stable@vger.kernel.org Fixes: a31500fe7055 ("drm/tegra: dc: Restore coupling of display controllers") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Acked-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20251022114720.24937-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01nouveau/firmware: Add missing kfree() of nvkm_falcon_fw::bootNam Cao1-0/+2
commit 949f1fd2225baefbea2995afa807dba5cbdb6bd3 upstream. nvkm_falcon_fw::boot is allocated, but no one frees it. This causes a kmemleak warning. Make sure this data is deallocated. Fixes: 2541626cfb79 ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patch.msgid.link/20251117084231.2910561-1-namcao@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-01Revert "drm/tegra: dsi: Clear enable register if powered by bootloader"Diogo Ivo1-9/+0
commit 660b299bed2a2a55a1f9102d029549d0235f881c upstream. Commit b6bcbce33596 ("soc/tegra: pmc: Ensure power-domains are in a known state") was introduced so that all power domains get initialized to a known working state when booting and it does this by shutting them down (including asserting resets and disabling clocks) before registering each power domain with the genpd framework, leaving it to each driver to later on power its needed domains. This caused the Google Pixel C to hang when booting due to a workaround in the DSI driver introduced in commit b22fd0b9639e ("drm/tegra: dsi: Clear enable register if powered by bootloader") meant to handle the case where the bootloader enabled the DSI hardware module. The workaround relies on reading a hardware register to determine the current status and after b6bcbce33596 that now happens in a powered down state thus leading to the boot hang. Fix this by reverting b22fd0b9639e since currently we are guaranteed that the hardware will be fully reset by the time we start enabling the DSI module. Fixes: b6bcbce33596 ("soc/tegra: pmc: Ensure power-domains are in a known state") Cc: stable@vger.kernel.org Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20251103-diogo-smaug_ec_typec-v1-1-be656ccda391@tecnico.ulisboa.pt Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24drm/xe/xe3: Add WA_14024681466 for Xe3_LPGNitin Gote2-0/+5
commit 0b2f7be548006b0651e1e8320790f49723265cbc upstream. Apply WA_14024681466 to Xe3_LPG graphics IP versions from 30.00 to 30.05. v2: (Matthew Roper) - Remove stepping filter as workaround applies to all steppings. - Add an engine class filter so it only applies to the RENDER engine. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20251027092643.335904-1-nitin.r.gote@intel.com Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 071089a69e199bd810ff31c4c933bd528e502743) Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24drm/xe/xe3: Extend wa_14023061436Tangudu Tilak Tirumalesh1-0/+2
commit fa3376319b83ba8b7fd55f2c1a268dcbf9d6eedc upstream. Extend wa_14023061436 to Graphics Versions 30.03, 30.04 and 30.05. Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251030154626.3124565-1-tilak.tirumalesh.tangudu@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 0dd656d06f50ae4cedf160634cf13fd9e0944cf7) Cc: stable@vger.kernel.org # v6.17+ Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpgNitin Gote1-0/+5
commit 240372edaf854c9136f5ead45f2d8cd9496a9cb3 upstream. Wa_15016589081 applies to Xe3_LPG renderCS Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20251106100516.318863-2-nitin.r.gote@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 715974499a2199bd199fb4630501f55545342ea4) Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24drm/i915/psr: fix pipe to vblank conversionJani Nikula1-1/+2
commit 994dec10991b53beac3e16109d876ae363e8a329 upstream. First, we can't assume pipe == crtc index. If a pipe is fused off in between, it no longer holds. intel_crtc_for_pipe() is the only proper way to get from a pipe to the corresponding crtc. Second, drivers aren't supposed to access or index drm->vblank[] directly. There's drm_crtc_vblank_crtc() for this. Use both functions to fix the pipe to vblank conversion. Fixes: f02658c46cf7 ("drm/i915/psr: Add mechanism to notify PSR of pipe enable/disable") Cc: Jouni Högander <jouni.hogander@intel.com> Cc: stable@vger.kernel.org # v6.16+ Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patch.msgid.link/20251106200000.1455164-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 2750f6765d6974f7e163c5d540a96c8703f6d8dd) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfacesVitaly Prosyak1-0/+12
commit 22a36e660d014925114feb09a2680bb3c2d1e279 upstream. Certain multi-GPU configurations (especially GFX12) may hit data corruption when a DCC-compressed VRAM surface is shared across GPUs using peer-to-peer (P2P) DMA transfers. Such surfaces rely on device-local metadata and cannot be safely accessed through a remote GPU’s page tables. Attempting to import a DCC-enabled surface through P2P leads to incorrect rendering or GPU faults. This change disables P2P for DCC-enabled VRAM buffers that are contiguous and allocated on GFX12+ hardware. In these cases, the importer falls back to the standard system-memory path, avoiding invalid access to compressed surfaces. Future work could consider optional migration (VRAM→System→VRAM) if a performance regression is observed when `attach->peer2peer = false`. Tested on: - Dual RX 9700 XT (Navi4x) setup - GNOME and Wayland compositor scenarios - Confirmed no corruption after disabling P2P under these conditions v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS. v3: simplify for upsteam and fix ip version check (Alex) Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9dff2bb709e6fbd97e263fd12bf12802d2b5a0cf) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24drm/amdgpu: fix lock warning in amdgpu_userq_fence_driver_processJesse.Zhang1-2/+3
commit 6623c5f9fd877868fba133b4ae4dab0052e82dad upstream. Fix a potential deadlock caused by inconsistent spinlock usage between interrupt and process contexts in the userq fence driver. The issue occurs when amdgpu_userq_fence_driver_process() is called from both: - Interrupt context: gfx_v11_0_eop_irq() -> amdgpu_userq_fence_driver_process() - Process context: amdgpu_eviction_fence_suspend_worker() -> amdgpu_userq_fence_driver_force_completion() -> amdgpu_userq_fence_driver_process() In interrupt context, the spinlock was acquired without disabling interrupts, leaving it in {IN-HARDIRQ-W} state. When the same lock is acquired in process context, the kernel detects inconsistent locking since the process context acquisition would enable interrupts while holding a lock previously acquired in interrupt context. Kernel log shows: [ 4039.310790] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 4039.310804] kworker/7:2/409 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 4039.310818] ffff9284e1bed000 (&fence_drv->fence_list_lock){?...}-{3:3}, [ 4039.310993] {IN-HARDIRQ-W} state was registered at: [ 4039.311004] lock_acquire+0xc6/0x300 [ 4039.311018] _raw_spin_lock+0x39/0x80 [ 4039.311031] amdgpu_userq_fence_driver_process.part.0+0x30/0x180 [amdgpu] [ 4039.311146] amdgpu_userq_fence_driver_process+0x17/0x30 [amdgpu] [ 4039.311257] gfx_v11_0_eop_irq+0x132/0x170 [amdgpu] Fix by using spin_lock_irqsave()/spin_unlock_irqrestore() to properly manage interrupt state regardless of calling context. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ded3ad780cf97a04927773c4600823b84f7f3cc2) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>