| Age | Commit message (Collapse) | Author | Files | Lines |
|
Link: https://lore.kernel.org/r/20240401152549.131030308@linuxfoundation.org
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: SeongJae Park <sj@kernel.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f34e8bb7d6c6626933fe993e03ed59ae85e16abb upstream.
The bug can be triggered by sending an amdgpu_cs_wait_ioctl
to the AMDGPU DRM driver on any ASICs with valid context.
The bug was reported by Joonkyo Jung <joonkyoj@yonsei.ac.kr>.
For example the following code:
static void Syzkaller2(int fd)
{
union drm_amdgpu_ctx arg1;
union drm_amdgpu_wait_cs arg2;
arg1.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
ret = drmIoctl(fd, 0x140106442 /* amdgpu_ctx_ioctl */, &arg1);
arg2.in.handle = 0x0;
arg2.in.timeout = 0x2000000000000;
arg2.in.ip_type = AMD_IP_VPE /* 0x9 */;
arg2->in.ip_instance = 0x0;
arg2.in.ring = 0x0;
arg2.in.ctx_id = arg1.out.alloc.ctx_id;
drmIoctl(fd, 0xc0206449 /* AMDGPU_WAIT_CS * /, &arg2);
}
The ioctl AMDGPU_WAIT_CS without previously submitted job could be assumed that
the error should be returned, but the following commit 1decbf6bb0b4dc56c9da6c5e57b994ebfc2be3aa
modified the logic and allowed to have sched_rq equal to NULL.
As a result when there is no job the ioctl AMDGPU_WAIT_CS returns success.
The change fixes null-ptr-deref in init entity and the stack below demonstrates
the error condition:
[ +0.000007] BUG: kernel NULL pointer dereference, address: 0000000000000028
[ +0.007086] #PF: supervisor read access in kernel mode
[ +0.005234] #PF: error_code(0x0000) - not-present page
[ +0.005232] PGD 0 P4D 0
[ +0.002501] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ +0.005034] CPU: 10 PID: 9229 Comm: amd_basic Tainted: G B W L 6.7.0+ #4
[ +0.007797] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[ +0.009798] RIP: 0010:drm_sched_entity_init+0x2d3/0x420 [gpu_sched]
[ +0.006426] Code: 80 00 00 00 00 00 00 00 e8 1a 81 82 e0 49 89 9c 24 c0 00 00 00 4c 89 ef e8 4a 80 82 e0 49 8b 5d 00 48 8d 7b 28 e8 3d 80 82 e0 <48> 83 7b 28 00 0f 84 28 01 00 00 4d 8d ac 24 98 00 00 00 49 8d 5c
[ +0.019094] RSP: 0018:ffffc90014c1fa40 EFLAGS: 00010282
[ +0.005237] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8113f3fa
[ +0.007326] RDX: fffffbfff0a7889d RSI: 0000000000000008 RDI: ffffffff853c44e0
[ +0.007264] RBP: ffffc90014c1fa80 R08: 0000000000000001 R09: fffffbfff0a7889c
[ +0.007266] R10: ffffffff853c44e7 R11: 0000000000000001 R12: ffff8881a719b010
[ +0.007263] R13: ffff88810d412748 R14: 0000000000000002 R15: 0000000000000000
[ +0.007264] FS: 00007ffff7045540(0000) GS:ffff8883cc900000(0000) knlGS:0000000000000000
[ +0.008236] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.005851] CR2: 0000000000000028 CR3: 000000011912e000 CR4: 0000000000350ef0
[ +0.007175] Call Trace:
[ +0.002561] <TASK>
[ +0.002141] ? show_regs+0x6a/0x80
[ +0.003473] ? __die+0x25/0x70
[ +0.003124] ? page_fault_oops+0x214/0x720
[ +0.004179] ? preempt_count_sub+0x18/0xc0
[ +0.004093] ? __pfx_page_fault_oops+0x10/0x10
[ +0.004590] ? srso_return_thunk+0x5/0x5f
[ +0.004000] ? vprintk_default+0x1d/0x30
[ +0.004063] ? srso_return_thunk+0x5/0x5f
[ +0.004087] ? vprintk+0x5c/0x90
[ +0.003296] ? drm_sched_entity_init+0x2d3/0x420 [gpu_sched]
[ +0.005807] ? srso_return_thunk+0x5/0x5f
[ +0.004090] ? _printk+0xb3/0xe0
[ +0.003293] ? __pfx__printk+0x10/0x10
[ +0.003735] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ +0.005482] ? do_user_addr_fault+0x345/0x770
[ +0.004361] ? exc_page_fault+0x64/0xf0
[ +0.003972] ? asm_exc_page_fault+0x27/0x30
[ +0.004271] ? add_taint+0x2a/0xa0
[ +0.003476] ? drm_sched_entity_init+0x2d3/0x420 [gpu_sched]
[ +0.005812] amdgpu_ctx_get_entity+0x3f9/0x770 [amdgpu]
[ +0.009530] ? finish_task_switch.isra.0+0x129/0x470
[ +0.005068] ? __pfx_amdgpu_ctx_get_entity+0x10/0x10 [amdgpu]
[ +0.010063] ? __kasan_check_write+0x14/0x20
[ +0.004356] ? srso_return_thunk+0x5/0x5f
[ +0.004001] ? mutex_unlock+0x81/0xd0
[ +0.003802] ? srso_return_thunk+0x5/0x5f
[ +0.004096] amdgpu_cs_wait_ioctl+0xf6/0x270 [amdgpu]
[ +0.009355] ? __pfx_amdgpu_cs_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.009981] ? srso_return_thunk+0x5/0x5f
[ +0.004089] ? srso_return_thunk+0x5/0x5f
[ +0.004090] ? __srcu_read_lock+0x20/0x50
[ +0.004096] drm_ioctl_kernel+0x140/0x1f0 [drm]
[ +0.005080] ? __pfx_amdgpu_cs_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.009974] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ +0.005618] ? srso_return_thunk+0x5/0x5f
[ +0.004088] ? __kasan_check_write+0x14/0x20
[ +0.004357] drm_ioctl+0x3da/0x730 [drm]
[ +0.004461] ? __pfx_amdgpu_cs_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.009979] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ +0.004993] ? srso_return_thunk+0x5/0x5f
[ +0.004090] ? __kasan_check_write+0x14/0x20
[ +0.004356] ? srso_return_thunk+0x5/0x5f
[ +0.004090] ? _raw_spin_lock_irqsave+0x99/0x100
[ +0.004712] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.005063] ? __pfx_arch_do_signal_or_restart+0x10/0x10
[ +0.005477] ? srso_return_thunk+0x5/0x5f
[ +0.004000] ? preempt_count_sub+0x18/0xc0
[ +0.004237] ? srso_return_thunk+0x5/0x5f
[ +0.004090] ? _raw_spin_unlock_irqrestore+0x27/0x50
[ +0.005069] amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu]
[ +0.008912] __x64_sys_ioctl+0xcd/0x110
[ +0.003918] do_syscall_64+0x5f/0xe0
[ +0.003649] ? noist_exc_debug+0xe6/0x120
[ +0.004095] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ +0.005150] RIP: 0033:0x7ffff7b1a94f
[ +0.003647] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ +0.019097] RSP: 002b:00007fffffffe0a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ +0.007708] RAX: ffffffffffffffda RBX: 000055555558b360 RCX: 00007ffff7b1a94f
[ +0.007176] RDX: 000055555558b360 RSI: 00000000c0206449 RDI: 0000000000000003
[ +0.007326] RBP: 00000000c0206449 R08: 000055555556ded0 R09: 000000007fffffff
[ +0.007176] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5d8
[ +0.007238] R13: 0000000000000003 R14: 000055555555cba8 R15: 00007ffff7ffd040
[ +0.007250] </TASK>
v2: Reworked check to guard against null ptr deref and added helpful comments
(Christian)
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Joonkyo Jung <joonkyoj@yonsei.ac.kr>
Cc: Dokyung Song <dokyungs@yonsei.ac.kr>
Cc: <jisoo.jang@yonsei.ac.kr>
Cc: <yw9865@yonsei.ac.kr>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Link: https://patchwork.freedesktop.org/patch/msgid/20240315023926.343164-1-vitaly.prosyak@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 22207fd5c80177b860279653d017474b2812af5e upstream.
The bug can be triggered by sending a single amdgpu_gem_userptr_ioctl
to the AMDGPU DRM driver on any ASICs with an invalid address and size.
The bug was reported by Joonkyo Jung <joonkyoj@yonsei.ac.kr>.
For example the following code:
static void Syzkaller1(int fd)
{
struct drm_amdgpu_gem_userptr arg;
int ret;
arg.addr = 0xffffffffffff0000;
arg.size = 0x80000000; /*2 Gb*/
arg.flags = 0x7;
ret = drmIoctl(fd, 0xc1186451/*amdgpu_gem_userptr_ioctl*/, &arg);
}
Due to the address and size are not valid there is a failure in
amdgpu_hmm_register->mmu_interval_notifier_insert->__mmu_interval_notifier_insert->
check_shl_overflow, but we even the amdgpu_hmm_register failure we still call
amdgpu_hmm_unregister into amdgpu_gem_object_free which causes access to a bad address.
The following stack is below when the issue is reproduced when Kazan is enabled:
[ +0.000014] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[ +0.000009] RIP: 0010:mmu_interval_notifier_remove+0x327/0x340
[ +0.000017] Code: ff ff 49 89 44 24 08 48 b8 00 01 00 00 00 00 ad de 4c 89 f7 49 89 47 40 48 83 c0 22 49 89 47 48 e8 ce d1 2d 01 e9 32 ff ff ff <0f> 0b e9 16 ff ff ff 4c 89 ef e8 fa 14 b3 ff e9 36 ff ff ff e8 80
[ +0.000014] RSP: 0018:ffffc90002657988 EFLAGS: 00010246
[ +0.000013] RAX: 0000000000000000 RBX: 1ffff920004caf35 RCX: ffffffff8160565b
[ +0.000011] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8881a9f78260
[ +0.000010] RBP: ffffc90002657a70 R08: 0000000000000001 R09: fffff520004caf25
[ +0.000010] R10: 0000000000000003 R11: ffffffff8161d1d6 R12: ffff88810e988c00
[ +0.000010] R13: ffff888126fb5a00 R14: ffff88810e988c0c R15: ffff8881a9f78260
[ +0.000011] FS: 00007ff9ec848540(0000) GS:ffff8883cc880000(0000) knlGS:0000000000000000
[ +0.000012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000010] CR2: 000055b3f7e14328 CR3: 00000001b5770000 CR4: 0000000000350ef0
[ +0.000010] Call Trace:
[ +0.000006] <TASK>
[ +0.000007] ? show_regs+0x6a/0x80
[ +0.000018] ? __warn+0xa5/0x1b0
[ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340
[ +0.000018] ? report_bug+0x24a/0x290
[ +0.000022] ? handle_bug+0x46/0x90
[ +0.000015] ? exc_invalid_op+0x19/0x50
[ +0.000016] ? asm_exc_invalid_op+0x1b/0x20
[ +0.000017] ? kasan_save_stack+0x26/0x50
[ +0.000017] ? mmu_interval_notifier_remove+0x23b/0x340
[ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340
[ +0.000019] ? mmu_interval_notifier_remove+0x23b/0x340
[ +0.000020] ? __pfx_mmu_interval_notifier_remove+0x10/0x10
[ +0.000017] ? kasan_save_alloc_info+0x1e/0x30
[ +0.000018] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? __kasan_kmalloc+0xb1/0xc0
[ +0.000018] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_read+0x11/0x20
[ +0.000020] amdgpu_hmm_unregister+0x34/0x50 [amdgpu]
[ +0.004695] amdgpu_gem_object_free+0x66/0xa0 [amdgpu]
[ +0.004534] ? __pfx_amdgpu_gem_object_free+0x10/0x10 [amdgpu]
[ +0.004291] ? do_syscall_64+0x5f/0xe0
[ +0.000023] ? srso_return_thunk+0x5/0x5f
[ +0.000017] drm_gem_object_free+0x3b/0x50 [drm]
[ +0.000489] amdgpu_gem_userptr_ioctl+0x306/0x500 [amdgpu]
[ +0.004295] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu]
[ +0.004270] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? __this_cpu_preempt_check+0x13/0x20
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? sysvec_apic_timer_interrupt+0x57/0xc0
[ +0.000020] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ +0.000022] ? drm_ioctl_kernel+0x17b/0x1f0 [drm]
[ +0.000496] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu]
[ +0.004272] ? drm_ioctl_kernel+0x190/0x1f0 [drm]
[ +0.000492] drm_ioctl_kernel+0x140/0x1f0 [drm]
[ +0.000497] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu]
[ +0.004297] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ +0.000489] ? srso_return_thunk+0x5/0x5f
[ +0.000011] ? __kasan_check_write+0x14/0x20
[ +0.000016] drm_ioctl+0x3da/0x730 [drm]
[ +0.000475] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu]
[ +0.004293] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ +0.000506] ? __pfx_rpm_resume+0x10/0x10
[ +0.000016] ? srso_return_thunk+0x5/0x5f
[ +0.000011] ? __kasan_check_write+0x14/0x20
[ +0.000010] ? srso_return_thunk+0x5/0x5f
[ +0.000011] ? _raw_spin_lock_irqsave+0x99/0x100
[ +0.000015] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000011] ? srso_return_thunk+0x5/0x5f
[ +0.000011] ? preempt_count_sub+0x18/0xc0
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000010] ? _raw_spin_unlock_irqrestore+0x27/0x50
[ +0.000019] amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu]
[ +0.004272] __x64_sys_ioctl+0xcd/0x110
[ +0.000020] do_syscall_64+0x5f/0xe0
[ +0.000021] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ +0.000015] RIP: 0033:0x7ff9ed31a94f
[ +0.000012] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ +0.000013] RSP: 002b:00007fff25f66790 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ +0.000016] RAX: ffffffffffffffda RBX: 000055b3f7e133e0 RCX: 00007ff9ed31a94f
[ +0.000012] RDX: 000055b3f7e133e0 RSI: 00000000c1186451 RDI: 0000000000000003
[ +0.000010] RBP: 00000000c1186451 R08: 0000000000000000 R09: 0000000000000000
[ +0.000009] R10: 0000000000000008 R11: 0000000000000246 R12: 00007fff25f66ca8
[ +0.000009] R13: 0000000000000003 R14: 000055b3f7021ba8 R15: 00007ff9ed7af040
[ +0.000024] </TASK>
[ +0.000007] ---[ end trace 0000000000000000 ]---
v2: Consolidate any error handling into amdgpu_hmm_register
which applied to kfd_bo also. (Christian)
v3: Improve syntax and comment (Christian)
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Joonkyo Jung <joonkyoj@yonsei.ac.kr>
Cc: Dokyung Song <dokyungs@yonsei.ac.kr>
Cc: <jisoo.jang@yonsei.ac.kr>
Cc: <yw9865@yonsei.ac.kr>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 62248b22d01e96a4d669cde0d7005bd51ebf9e76 upstream.
Include the header that defines u32.
This fixes build of 6.6.23 and 6.1.83 kernels for Alpine Linux, which
uses musl libc. I assume that GNU libc indirecly pulls in linux/types.h.
Fixes: 9707ac4fe2f5 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218647
Cc: stable@vger.kernel.org
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Tested-by: Greg Thelen <gthelen@google.com>
Link: https://lore.kernel.org/r/20240328110103.28734-1-ncopa@alpinelinux.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0f4a1e80989aca185d955fcd791d7750082044a2 upstream.
SEV-SNP requires encrypted memory to be validated before access.
Because the ROM memory range is not part of the e820 table, it is not
pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
to access this range, the guest must first validate the range.
The current SEV-SNP code does indeed scan the ROM range during early
boot and thus attempts to validate the ROM range in probe_roms().
However, this behavior is neither sufficient nor necessary for the
following reasons:
* With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
falls in the ROM range) prior to validation.
For example, Project Oak Stage 0 provides a minimal guest firmware
that currently meets these configuration conditions, meaning guests
booting atop Oak Stage 0 firmware encounter a problematic call chain
during dmi_setup() -> dmi_scan_machine() that results in a crash
during boot if SEV-SNP is enabled.
* With regards to necessity, SEV-SNP guests generally read garbage
(which changes across boots) from the ROM range, meaning these scans
are unnecessary. The guest reads garbage because the legacy ROM range
is unencrypted data but is accessed via an encrypted PMD during early
boot (where the PMD is marked as encrypted due to potentially mapping
actually-encrypted data in other PMD-contained ranges).
In one exceptional case, EISA probing treats the ROM range as
unencrypted data, which is inconsistent with other probing.
Continuing to allow SEV-SNP guests to use garbage and to inconsistently
classify ROM range encryption status can trigger undesirable behavior.
For instance, if garbage bytes appear to be a valid signature, memory
may be unnecessarily reserved for the ROM range. Future code or other
use cases may result in more problematic (arbitrary) behavior that
should be avoided.
While one solution would be to overhaul the early PMD mapping to always
treat the ROM region of the PMD as unencrypted, SEV-SNP guests do not
currently rely on data from the ROM region during early boot (and even
if they did, they would be mostly relying on garbage data anyways).
As a simpler solution, skip the ROM range scans (and the otherwise-
necessary range validation) during SEV-SNP guest early boot. The
potential SEV-SNP guest crash due to lack of ROM range validation is
thus avoided by simply not accessing the ROM range.
In most cases, skip the scans by overriding problematic x86_init
functions during sme_early_init() to SNP-safe variants, which can be
likened to x86_init overrides done for other platforms (ex: Xen); such
overrides also avoid the spread of cc_platform_has() checks throughout
the tree.
In the exceptional EISA case, still use cc_platform_has() for the
simplest change, given (1) checks for guest type (ex: Xen domain status)
are already performed here, and (2) these checks occur in a subsys
initcall instead of an x86_init function.
[ bp: Massage commit message, remove "we"s. ]
Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20240313121546.2964854-1-kevinloughlin@google.com
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c7b2edd8377be983442c1344cb940cd2ac21b601 upstream.
AMD processors based on Zen 2 and later microarchitectures do not
support PMCx087 (instruction pipe stalls) which is used as the backing
event for "stalled-cycles-frontend" and "stalled-cycles-backend".
Use PMCx0A9 (cycles where micro-op queue is empty) instead to count
frontend stalls and remove the entry for backend stalls since there
is no direct replacement.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h")
Link: https://lore.kernel.org/r/03d7fc8fa2a28f9be732116009025bdec1b3ec97.1711352180.git.sandipan.das@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8e68a458bcf5b5cb9c3624598bae28f08251601f upstream.
As of commit d8649fc1c5e4 ("scsi: libsas: Do discovery on empty PHY to
update PHY info"), do discovery will send a new SMP_DISCOVER and update
phy->phy_change_count. We found that if the disk is reconnected and phy
change_count changes at this time, the disk scanning process will not be
triggered.
Therefore, call sas_set_ex_phy() to update the PHY info with the results of
the last query. And because the previous phy info will be used when calling
sas_unregister_devs_sas_addr(), sas_unregister_devs_sas_addr() should be
called before sas_set_ex_phy().
Fixes: d8649fc1c5e4 ("scsi: libsas: Do discovery on empty PHY to update PHY info")
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Link: https://lore.kernel.org/r/20240307141413.48049-3-yangxingui@huawei.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a57345279fd311ba679b8083feb0eec5272c7729 upstream.
Add a helper to get attached_sas_addr and device type from disc_resp.
Suggested-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Link: https://lore.kernel.org/r/20240307141413.48049-2-yangxingui@huawei.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 28d41991182c210ec1654f8af2e140ef4cc73f20 upstream.
The wqe is of type lpfc_wqe128. It should be memset with the same type.
Fixes: 6c621a2229b0 ("scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240304090649.833953-1-usama.anjum@collabora.com
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Justin Tee <justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 16cc2ba71b9f6440805aef7f92ba0f031f79b765 upstream.
The cmdwqe and rspwqe are of type lpfc_wqe128. They should be memset() with
the same type.
Fixes: 61910d6a5243 ("scsi: lpfc: SLI path split: Refactor CT paths")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240304091119.847060-1-usama.anjum@collabora.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f121531703ae442edc1dde4b56803680628bc5b7 upstream.
Intel Arrow Lake CPU uses the Meteor Lake ID with this
controller (the controller that's part of the Intel Arrow
Lake chipset (PCH) does still have unique PCI ID).
Fixes: de4b5b28c87c ("usb: dwc3: pci: add support for the Intel Arrow Lake-H")
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240312115008.1748637-1-heikki.krogerus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 532a0c57d7ff75e8f07d4e25cba4184989e2a241 upstream.
This was reverts commit 8009479ee919b9a91674f48050ccbff64eafedaa.
It was originally in x86/urgent, but was deemed wrong so got zapped.
But in the meantime, x86/urgent had been merged into x86/apic to
resolve a conflict. I didn't notice the merge so didn't zap it
from x86/apic and it managed to make it up with the x86/apic
material.
The reverted commit is known to cause some KASAN problems.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8009479ee919b9a91674f48050ccbff64eafedaa upstream.
The macro used for MDS mitigation executes VERW with relative
addressing for the operand. This was necessary in earlier versions of
the series. Now it is unnecessary and creates a problem for backports
on older kernels that don't support relocations in alternatives.
Relocation support was added by commit 270a69c4485d ("x86/alternative:
Support relocations in alternatives"). Also asm for fixed addressing
is much cleaner than relative RIP addressing.
Simplify the asm by using fixed addressing for VERW operand.
[ dhansen: tweak changelog ]
Closes: https://lore.kernel.org/lkml/20558f89-299b-472e-9a96-171403a83bd6@suse.com/
Fixes: baf8361e5455 ("x86/bugs: Add asm helpers for executing VERW")
Reported-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20240226-verw-arg-fix-v1-1-7b37ee6fd57d%40linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1202f794cdaa4f0ba6a456bc034f2db6cfcf5579 upstream.
We need to re-enable idle power optimizations after entering PSR. Since,
we get kicked out of idle power optimizations before entering PSR
(entering PSR requires us to write to DCN registers, which isn't allowed
while we are in IPS).
Fixes: a9b1a4f684b3 ("drm/amd/display: Add more checks for exiting idle in DC")
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 32fbe5246582af4f611ccccee33fd6e559087252 upstream.
There are regression reports[1][2] that crashkernel region on x86_64 can't
be added into iomem tree sometime. This causes the later failure of kdump
loading.
This happened after commit 4a693ce65b18 ("kdump: defer the insertion of
crashkernel resources") was merged.
Even though, these reported issues are proved to be related to other
component, they are just exposed after above commmit applied, I still
would like to keep crashk_res and crashk_low_res being added into iomem
early as before because the early adding has been always there on x86_64
and working very well. For safety of kdump, Let's change it back.
Here, add a macro HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY to limit that
only ARCH defining the macro can have the early adding
crashk_res/_low_res into iomem. Then define
HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY on x86 to enable it.
Note: In reserve_crashkernel_low(), there's a remnant of crashk_low_res
handling which was mistakenly added back in commit 85fcde402db1 ("kexec:
split crashkernel reservation code out from crash_core.c").
[1]
[PATCH V2] x86/kexec: do not update E820 kexec table for setup_data
https://lore.kernel.org/all/Zfv8iCL6CT2JqLIC@darkstar.users.ipa.redhat.com/T/#u
[2]
Question about Address Range Validation in Crash Kernel Allocation
https://lore.kernel.org/all/4eeac1f733584855965a2ea62fa4da58@huawei.com/T/#u
Link: https://lkml.kernel.org/r/ZgDYemRQ2jxjLkq+@MiWiFi-R3L-srv
Fixes: 4a693ce65b18 ("kdump: defer the insertion of crashkernel resources")
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4535e1a4174c4111d92c5a9a21e542d232e0fcaa upstream.
The original version of the mitigation would patch in the calls to the
untraining routines directly. That is, the alternative() in UNTRAIN_RET
will patch in the CALL to srso_alias_untrain_ret() directly.
However, even if commit e7c25c441e9e ("x86/cpu: Cleanup the untrain
mess") meant well in trying to clean up the situation, due to micro-
architectural reasons, the untraining routine srso_alias_untrain_ret()
must be the target of a CALL instruction and not of a JMP instruction as
it is done now.
Reshuffle the alternative macros to accomplish that.
Fixes: e7c25c441e9e ("x86/cpu: Cleanup the untrain mess")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 591c1fdf2016d118b8fbde427b796fac13f3f070 upstream.
Currently when PCI error is detected, I/O is aborted manually through the
ABORT IOCB mechanism which is not guaranteed to succeed.
Instead, wait for the OS or system to notify driver to wind down I/O
through the pci_error_handlers api. Set eeh_busy flag to pause all traffic
and wait for I/O to drain.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-11-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b5a30840727a3e41d12a336d19f6c0716b299161 upstream.
Upon driver unload, purge_mbox flag is set and the heartbeat monitor thread
detects this flag and does not send the mailbox command down to FW with a
debug message "Error detected: purge[1] eeh[0] cmd=0x0, Exiting". This
being not a real error, change the debug message.
Cc: stable@vger.kernel.org
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 82f522ae0d97119a43da53e0f729275691b9c525 upstream.
The server was crashing after LOGO because fcport was getting freed twice.
-----------[ cut here ]-----------
kernel BUG at mm/slub.c:371!
invalid opcode: 0000 1 SMP PTI
CPU: 35 PID: 4610 Comm: bash Kdump: loaded Tainted: G OE --------- - - 4.18.0-425.3.1.el8.x86_64 #1
Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021
RIP: 0010:set_freepointer.part.57+0x0/0x10
RSP: 0018:ffffb07107027d90 EFLAGS: 00010246
RAX: ffff9cb7e3150000 RBX: ffff9cb7e332b9c0 RCX: ffff9cb7e3150400
RDX: 0000000000001f37 RSI: 0000000000000000 RDI: ffff9cb7c0005500
RBP: fffff693448c5400 R08: 0000000080000000 R09: 0000000000000009
R10: 0000000000000000 R11: 0000000000132af0 R12: ffff9cb7c0005500
R13: ffff9cb7e3150000 R14: ffffffffc06990e0 R15: ffff9cb7ea85ea58
FS: 00007ff6b79c2740(0000) GS:ffff9cb8f7ec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b426b7d700 CR3: 0000000169c18002 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
kfree+0x238/0x250
qla2x00_els_dcmd_sp_free+0x20/0x230 [qla2xxx]
? qla24xx_els_dcmd_iocb+0x607/0x690 [qla2xxx]
qla2x00_issue_logo+0x28c/0x2a0 [qla2xxx]
? qla2x00_issue_logo+0x28c/0x2a0 [qla2xxx]
? kernfs_fop_write+0x11e/0x1a0
Remove one of the free calls and add check for valid fcport. Also use
function qla2x00_free_fcport() instead of kfree().
Cc: stable@vger.kernel.org
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e288285d47784fdcf7c81be56df7d65c6f10c58b upstream.
Coverity scan reported potential risk of double free of the pointer
ha->vp_map. ha->vp_map was freed in qla2x00_mem_alloc(), and again freed
in function qla2x00_mem_free(ha).
Assign NULL to vp_map and kfree take care of NULL.
Cc: stable@vger.kernel.org
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a27d4d0e7de305def8a5098a614053be208d1aa1 upstream.
System crash due to command failed to flush back to SCSI layer.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 27 PID: 793455 Comm: kworker/u130:6 Kdump: loaded Tainted: G OE --------- - - 4.18.0-372.9.1.el8.x86_64 #1
Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021
Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc]
RIP: 0010:__wake_up_common+0x4c/0x190
Code: 24 10 4d 85 c9 74 0a 41 f6 01 04 0f 85 9d 00 00 00 48 8b 43 08 48 83 c3 08 4c 8d 48 e8 49 8d 41 18 48 39 c3 0f 84 f0 00 00 00 <49> 8b 41 18 89 54 24 08 31 ed 4c 8d 70 e8 45 8b 29 41 f6 c5 04 75
RSP: 0018:ffff95f3e0cb7cd0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff8b08d3b26328 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8b08d3b26320
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffffffffffe8
R10: 0000000000000000 R11: ffff95f3e0cb7a60 R12: ffff95f3e0cb7d20
R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8b2fdf6c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000002f1e410002 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
__wake_up_common_lock+0x7c/0xc0
qla_nvme_ls_req+0x355/0x4c0 [qla2xxx]
qla2xxx [0000:12:00.1]-f084:3: qlt_free_session_done: se_sess 0000000000000000 / sess ffff8ae1407ca000 from port 21:32:00:02:ac:07:ee:b8 loop_id 0x02 s_id 01:02:00 logout 1 keep 0 els_logo 0
? __nvme_fc_send_ls_req+0x260/0x380 [nvme_fc]
qla2xxx [0000:12:00.1]-207d:3: FCPort 21:32:00:02:ac:07:ee:b8 state transitioned from ONLINE to LOST - portid=010200.
? nvme_fc_send_ls_req.constprop.42+0x1a/0x45 [nvme_fc]
qla2xxx [0000:12:00.1]-2109:3: qla2x00_schedule_rport_del 21320002ac07eeb8. rport ffff8ae598122000 roles 1
? nvme_fc_connect_ctrl_work.cold.63+0x1e3/0xa7d [nvme_fc]
qla2xxx [0000:12:00.1]-f084:3: qlt_free_session_done: se_sess 0000000000000000 / sess ffff8ae14801e000 from port 21:32:01:02:ad:f7:ee:b8 loop_id 0x04 s_id 01:02:01 logout 1 keep 0 els_logo 0
? __switch_to+0x10c/0x450
? process_one_work+0x1a7/0x360
qla2xxx [0000:12:00.1]-207d:3: FCPort 21:32:01:02:ad:f7:ee:b8 state transitioned from ONLINE to LOST - portid=010201.
? worker_thread+0x1ce/0x390
? create_worker+0x1a0/0x1a0
qla2xxx [0000:12:00.1]-2109:3: qla2x00_schedule_rport_del 21320102adf7eeb8. rport ffff8ae3b2312800 roles 70
? kthread+0x10a/0x120
qla2xxx [0000:12:00.1]-2112:3: qla_nvme_unregister_remote_port: unregister remoteport on ffff8ae14801e000 21320102adf7eeb8
? set_kthread_struct+0x40/0x40
qla2xxx [0000:12:00.1]-2110:3: remoteport_delete of ffff8ae14801e000 21320102adf7eeb8 completed.
? ret_from_fork+0x1f/0x40
qla2xxx [0000:12:00.1]-f086:3: qlt_free_session_done: waiting for sess ffff8ae14801e000 logout
The system was under memory stress where driver was not able to allocate an
SRB to carry out error recovery of cable pull. The failure to flush causes
upper layer to start modifying scsi_cmnd. When the system frees up some
memory, the subsequent cable pull trigger another command flush. At this
point the driver access a null pointer when attempting to DMA unmap the
SGL.
Add a check to make sure commands are flush back on session tear down to
prevent the null pointer access.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 69aecdd410106dc3a8f543a4f7ec6379b995b8d0 upstream.
Changing of [FCP|NVME] prefer flag in flash has no effect on driver. For
device that supports both FCP + NVMe over the same connection, driver
continues to connect to this device using the previous successful login
mode.
On completion of flash update, adapter will be reset. Driver will
reset the prefer flag based on setting from flash.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 688fa069fda6fce24d243cddfe0c7024428acb74 upstream.
Update manufacturer detail from "Marvell Semiconductor, Inc." to
"Marvell".
Cc: stable@vger.kernel.org
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 76a192e1a566e15365704b9f8fb3b70825f85064 upstream.
Current code combines the allocation of FCE|EFT trace buffers and enables
the features all in 1 step.
Split this step into separate steps in preparation for follow-on patch to
allow user to have a choice to enable / disable FCE trace feature.
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 881eb861ca3877300570db10abbf11494e48548d upstream.
Disk failed to rediscover after chip reset error injection. The chip reset
happens at the time when a PLOGI is being sent. This causes a flag to be
left on which blocks the retry. Clear the blocking flag.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4895009c4bb72f71f2e682f1e7d2c2d96e482087 upstream.
Currently IOCBs are allowed to push through while chip reset could be in
progress. During chip reset the outstanding_cmds array is cleared
twice. Once when any command on this array is returned as failed and
secondly when the array is initialize to zero. If a command is inserted on
to the array between these intervals, then the command will be lost. Check
for chip reset before sending IOCB.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240227164127.36465-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3de4f996a0b5412aa451729008130a488f71563e upstream.
Check the UCSI_CCI_RESET_COMPLETE complete flag before starting
another reset. Use a UCSI_SET_NOTIFICATION_ENABLE command to clear
the flag if it is set.
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Cc: stable <stable@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Link: https://lore.kernel.org/r/20240320073927.1641788-6-lk@c--e.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6aaceb7d9cd00f3e065dc4b054ecfe52c5253b03 upstream.
Some DELL systems don't like UCSI_ACK_CC_CI commands with the
UCSI_ACK_CONNECTOR_CHANGE but not the UCSI_ACK_COMMAND_COMPLETE
bit set. The current quirk still leaves room for races because
it requires two consecutive ACK commands to be sent.
Refactor and significantly simplify the quirk to fix this:
Send a dummy command and bundle the connector change ack with the
command completion ack in a single UCSI_ACK_CC_CI command.
This removes the need to probe for the quirk.
While there define flag bits for struct ucsi_acpi->flags in ucsi_acpi.c
and don't re-use definitions from ucsi.h for struct ucsi->flags.
Fixes: f3be347ea42d ("usb: ucsi_acpi: Quirk to ack a connector change ack cmd")
Cc: stable@vger.kernel.org
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Link: https://lore.kernel.org/r/20240320073927.1641788-5-lk@c--e.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6b5c85ddeea77d18c4b69e3bda60e9374a20c304 upstream.
If a command completes the OPM must send an ack. This applies
to unsupported commands, too.
Send the required ACK for unsupported commands.
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Cc: stable <stable@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Link: https://lore.kernel.org/r/20240320073927.1641788-4-lk@c--e.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 15b2e71b4653b3e13df34695a29ebeee237c5af2 upstream.
Suppose we sleep on the PPM lock after clearing the EVENT_PENDING
bit because the thread for another connector is executing a command.
In this case the command completion of the other command will still
report the connector change for our connector.
Clear the EVENT_PENDING bit under the PPM lock to avoid another
useless call to ucsi_handle_connector_change() in this case.
Fixes: c9aed03a0a68 ("usb: ucsi: Add missing ppm_lock")
Cc: stable <stable@kernel.org>
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Link: https://lore.kernel.org/r/20240320073927.1641788-2-lk@c--e.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 53f5094fdf5deacd99b8655df692e9278506724d upstream.
The attribute writing should return the number of bytes used from the
buffer on success.
Fixes: a7cff92f0635 ("usb: typec: USB Power Delivery helpers for ports and partners")
Cc: stable@vger.kernel.org
Signed-off-by: Kyle Tso <kyletso@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20240319074309.3306579-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 17af5050dead6cbcca12c1fcd17e0bb8bb284eae upstream.
The PD of Type-C port needs to be updated in pd_set. Unlink the Type-C
port device to the old PD before linking it to a new one.
Fixes: cd099cde4ed2 ("usb: typec: tcpm: Support multiple capabilities")
Cc: stable@vger.kernel.org
Signed-off-by: Kyle Tso <kyletso@google.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20240311172306.3911309-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 893cd9469c68a89a34956121685617dbb37497b1 upstream.
In tcpm_pd_set, the array of port source capabilities is port->src_pdo,
not port->snk_pdo.
Fixes: cd099cde4ed2 ("usb: typec: tcpm: Support multiple capabilities")
Cc: stable@vger.kernel.org
Signed-off-by: Kyle Tso <kyletso@google.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20240311144500.3694849-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b63f90487bdf93a4223ce7853d14717e9d452856 upstream.
When unregister pd capabilitie in tcpm, KASAN will capture below double
-free issue. The root cause is the same capabilitiy will be kfreed twice,
the first time is kfreed by pd_capabilities_release() and the second time
is explicitly kfreed by tcpm_port_unregister_pd().
[ 3.988059] BUG: KASAN: double-free in tcpm_port_unregister_pd+0x1a4/0x3dc
[ 3.995001] Free of addr ffff0008164d3000 by task kworker/u16:0/10
[ 4.001206]
[ 4.002712] CPU: 2 PID: 10 Comm: kworker/u16:0 Not tainted 6.8.0-rc5-next-20240220-05616-g52728c567a55 #53
[ 4.012402] Hardware name: Freescale i.MX8QXP MEK (DT)
[ 4.017569] Workqueue: events_unbound deferred_probe_work_func
[ 4.023456] Call trace:
[ 4.025920] dump_backtrace+0x94/0xec
[ 4.029629] show_stack+0x18/0x24
[ 4.032974] dump_stack_lvl+0x78/0x90
[ 4.036675] print_report+0xfc/0x5c0
[ 4.040289] kasan_report_invalid_free+0xa0/0xc0
[ 4.044937] __kasan_slab_free+0x124/0x154
[ 4.049072] kfree+0xb4/0x1e8
[ 4.052069] tcpm_port_unregister_pd+0x1a4/0x3dc
[ 4.0 |