<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/ras, branch v6.18.21</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>APEI/GHES: ARM processor Error: don't go past allocated memory</title>
<updated>2026-03-04T12:19:35+00:00</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab+huawei@kernel.org</email>
</author>
<published>2026-01-08T11:35:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=136093ba4161e0080088abff48273f6830a47766'/>
<id>136093ba4161e0080088abff48273f6830a47766</id>
<content type='text'>
[ Upstream commit 87880af2d24e62a84ed19943dbdd524f097172f2 ]

If the BIOS generates a very small ARM Processor Error, or
an incomplete one, the current logic will fail to deferrence

	err-&gt;section_length
and
	ctx_info-&gt;size

Add checks to avoid that. With such changes, such GHESv2
records won't cause OOPSes like this:

[    1.492129] Internal error: Oops: 0000000096000005 [#1]  SMP
[    1.495449] Modules linked in:
[    1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT
[    1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
[    1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred
[    1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[    1.497199] pc : log_arm_hw_error+0x5c/0x200
[    1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220

0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75).
70		err_info = (struct cper_arm_err_info *)(err + 1);
71		ctx_info = (struct cper_arm_ctx_info *)(err_info + err-&gt;err_info_num);
72		ctx_err = (u8 *)ctx_info;
73
74		for (n = 0; n &lt; err-&gt;context_info_num; n++) {
75			sz = sizeof(struct cper_arm_ctx_info) + ctx_info-&gt;size;
76			ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz);
77			ctx_len += sz;
78		}
79

and similar ones while trying to access section_length on an
error dump with too small size.

Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Acked-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Reviewed-by: Hanjun Guo &lt;guohanjun@huawei.com&gt;
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/7fd9f38413be05ee2d7cfdb0dc31ea2274cf1a54.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 87880af2d24e62a84ed19943dbdd524f097172f2 ]

If the BIOS generates a very small ARM Processor Error, or
an incomplete one, the current logic will fail to deferrence

	err-&gt;section_length
and
	ctx_info-&gt;size

Add checks to avoid that. With such changes, such GHESv2
records won't cause OOPSes like this:

[    1.492129] Internal error: Oops: 0000000096000005 [#1]  SMP
[    1.495449] Modules linked in:
[    1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT
[    1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
[    1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred
[    1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[    1.497199] pc : log_arm_hw_error+0x5c/0x200
[    1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220

0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75).
70		err_info = (struct cper_arm_err_info *)(err + 1);
71		ctx_info = (struct cper_arm_ctx_info *)(err_info + err-&gt;err_info_num);
72		ctx_err = (u8 *)ctx_info;
73
74		for (n = 0; n &lt; err-&gt;context_info_num; n++) {
75			sz = sizeof(struct cper_arm_ctx_info) + ctx_info-&gt;size;
76			ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz);
77			ctx_len += sz;
78		}
79

and similar ones while trying to access section_length on an
error dump with too small size.

Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Acked-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Reviewed-by: Hanjun Guo &lt;guohanjun@huawei.com&gt;
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/7fd9f38413be05ee2d7cfdb0dc31ea2274cf1a54.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RAS: Report all ARM processor CPER information to userspace</title>
<updated>2025-12-18T13:03:09+00:00</updated>
<author>
<name>Jason Tian</name>
<email>jason@os.amperecomputing.com</email>
</author>
<published>2025-08-14T16:52:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=22b5096abc9824fb84f0bfe084f5be9f7ea5f2d9'/>
<id>22b5096abc9824fb84f0bfe084f5be9f7ea5f2d9</id>
<content type='text'>
[ Upstream commit 05954511b73e748d0370549ad9dd9cd95297d97a ]

The ARM processor CPER record was added in UEFI v2.6 and remained
unchanged up to v2.10.

Yet, the original arm_event trace code added by

  e9279e83ad1f ("trace, ras: add ARM processor error trace event")

is incomplete, as it only traces some fields of UAPI 2.6 table N.16, not
exporting any information from tables N.17 to N.29 of the record.

This is not enough for the user to be able to figure out what has
exactly happened or to take appropriate action.

According to the UEFI v2.9 specification chapter N2.4.4, the ARM
processor error section includes:

- several (ERR_INFO_NUM) ARM processor error information structures
  (Tables N.17 to N.20);
- several (CONTEXT_INFO_NUM) ARM processor context information
  structures (Tables N.21 to N.29);
- several vendor specific error information structures. The
  size is given by Section Length minus the size of the other
  fields.

In addition, it also exports two fields that are parsed by the GHES
driver when firmware reports it, e.g.:

- error severity
- CPU logical index

Report all of these information to userspace via a the ARM tracepoint so
that userspace can properly record the error and take decisions related
to CPU core isolation according to error severity and other info.

The updated ARM trace event now contains the following fields:

======================================  =============================
UEFI field on table N.16                ARM Processor trace fields
======================================  =============================
Validation                              handled when filling data for
                                        affinity MPIDR and running
                                        state.
ERR_INFO_NUM                            pei_len
CONTEXT_INFO_NUM                        ctx_len
Section Length                          indirectly reported by
                                        pei_len, ctx_len and oem_len
Error affinity level                    affinity
MPIDR_EL1                               mpidr
MIDR_EL1                                midr
Running State                           running_state
PSCI State                              psci_state
Processor Error Information Structure   pei_err - count at pei_len
Processor Context                       ctx_err- count at ctx_len
Vendor Specific Error Info              oem - count at oem_len
======================================  =============================

It should be noted that decoding of tables N.17 to N.29, if needed, will
be handled in userspace. That gives more flexibility, as there won't be
any need to flood the kernel with micro-architecture specific error
decoding.

Also, decoding the other fields require a complex logic, and should be
done for each of the several values inside the record field.  So, let
userspace daemons like rasdaemon decode them, parsing such tables and
having vendor-specific micro-architecture-specific decoders.

 [mchehab: modified description, solved merge conflicts and fixed coding style]

Signed-off-by: Jason Tian &lt;jason@os.amperecomputing.com&gt;
Co-developed-by: Shengwei Luo &lt;luoshengwei@huawei.com&gt;
Signed-off-by: Shengwei Luo &lt;luoshengwei@huawei.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Signed-off-by: Daniel Ferguson &lt;danielf@os.amperecomputing.com&gt; # rebased
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Tested-by: Shiju Jose &lt;shiju.jose@huawei.com&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
Signed-off-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 05954511b73e748d0370549ad9dd9cd95297d97a ]

The ARM processor CPER record was added in UEFI v2.6 and remained
unchanged up to v2.10.

Yet, the original arm_event trace code added by

  e9279e83ad1f ("trace, ras: add ARM processor error trace event")

is incomplete, as it only traces some fields of UAPI 2.6 table N.16, not
exporting any information from tables N.17 to N.29 of the record.

This is not enough for the user to be able to figure out what has
exactly happened or to take appropriate action.

According to the UEFI v2.9 specification chapter N2.4.4, the ARM
processor error section includes:

- several (ERR_INFO_NUM) ARM processor error information structures
  (Tables N.17 to N.20);
- several (CONTEXT_INFO_NUM) ARM processor context information
  structures (Tables N.21 to N.29);
- several vendor specific error information structures. The
  size is given by Section Length minus the size of the other
  fields.

In addition, it also exports two fields that are parsed by the GHES
driver when firmware reports it, e.g.:

- error severity
- CPU logical index

Report all of these information to userspace via a the ARM tracepoint so
that userspace can properly record the error and take decisions related
to CPU core isolation according to error severity and other info.

The updated ARM trace event now contains the following fields:

======================================  =============================
UEFI field on table N.16                ARM Processor trace fields
======================================  =============================
Validation                              handled when filling data for
                                        affinity MPIDR and running
                                        state.
ERR_INFO_NUM                            pei_len
CONTEXT_INFO_NUM                        ctx_len
Section Length                          indirectly reported by
                                        pei_len, ctx_len and oem_len
Error affinity level                    affinity
MPIDR_EL1                               mpidr
MIDR_EL1                                midr
Running State                           running_state
PSCI State                              psci_state
Processor Error Information Structure   pei_err - count at pei_len
Processor Context                       ctx_err- count at ctx_len
Vendor Specific Error Info              oem - count at oem_len
======================================  =============================

It should be noted that decoding of tables N.17 to N.29, if needed, will
be handled in userspace. That gives more flexibility, as there won't be
any need to flood the kernel with micro-architecture specific error
decoding.

Also, decoding the other fields require a complex logic, and should be
done for each of the several values inside the record field.  So, let
userspace daemons like rasdaemon decode them, parsing such tables and
having vendor-specific micro-architecture-specific decoders.

 [mchehab: modified description, solved merge conflicts and fixed coding style]

Signed-off-by: Jason Tian &lt;jason@os.amperecomputing.com&gt;
Co-developed-by: Shengwei Luo &lt;luoshengwei@huawei.com&gt;
Signed-off-by: Shengwei Luo &lt;luoshengwei@huawei.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Signed-off-by: Daniel Ferguson &lt;danielf@os.amperecomputing.com&gt; # rebased
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Tested-by: Shiju Jose &lt;shiju.jose@huawei.com&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
Signed-off-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RAS: Export log_non_standard_event() to drivers</title>
<updated>2025-09-15T14:20:29+00:00</updated>
<author>
<name>Shubhrajyoti Datta</name>
<email>shubhrajyoti.datta@amd.com</email>
</author>
<published>2025-09-08T11:56:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=36e74c95638296b1f272d9d011d435ce62d8f11c'/>
<id>36e74c95638296b1f272d9d011d435ce62d8f11c</id>
<content type='text'>
The function log_non_standard_event() is responsible for logging
platform-specific or vendor-defined RAS (Reliability, Availability, and
Serviceability) events. Currently, this function is only available within the
RAS subsystem, preventing external modules from leveraging its capabilities.

Export it to drivers to log non-standard RAS events via EDAC.

Signed-off-by: Shubhrajyoti Datta &lt;shubhrajyoti.datta@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/20250908115649.22903-1-shubhrajyoti.datta@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The function log_non_standard_event() is responsible for logging
platform-specific or vendor-defined RAS (Reliability, Availability, and
Serviceability) events. Currently, this function is only available within the
RAS subsystem, preventing external modules from leveraging its capabilities.

Export it to drivers to log non-standard RAS events via EDAC.

Signed-off-by: Shubhrajyoti Datta &lt;shubhrajyoti.datta@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/20250908115649.22903-1-shubhrajyoti.datta@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'v6.15-rc5' into x86/cpu, to resolve conflicts</title>
<updated>2025-05-06T08:00:58+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-05-06T07:59:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=24035886d735e4ce1c4605638adafe1fa2988e7a'/>
<id>24035886d735e4ce1c4605638adafe1fa2988e7a</id>
<content type='text'>
 Conflicts:
	tools/arch/x86/include/asm/cpufeatures.h

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
 Conflicts:
	tools/arch/x86/include/asm/cpufeatures.h

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/platform/amd: Move the &lt;asm/amd_node.h&gt; header to &lt;asm/amd/node.h&gt;</title>
<updated>2025-04-14T07:34:17+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-04-13T08:41:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=0a35c9280a9105e601cfe23b7c15522a195fa412'/>
<id>0a35c9280a9105e601cfe23b7c15522a195fa412</id>
<content type='text'>
Collect AMD specific platform header files in &lt;asm/amd/*.h&gt;.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mario Limonciello &lt;superm1@kernel.org&gt;
Link: https://lore.kernel.org/r/20250413084144.3746608-7-mingo@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Collect AMD specific platform header files in &lt;asm/amd/*.h&gt;.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mario Limonciello &lt;superm1@kernel.org&gt;
Link: https://lore.kernel.org/r/20250413084144.3746608-7-mingo@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/platform/amd: Move the &lt;asm/amd_nb.h&gt; header to &lt;asm/amd/nb.h&gt;</title>
<updated>2025-04-14T07:34:14+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-04-14T07:32:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=bcbb65559532891148d990527e9df6b8fc98e98d'/>
<id>bcbb65559532891148d990527e9df6b8fc98e98d</id>
<content type='text'>
Collect AMD specific platform header files in &lt;asm/amd/*.h&gt;.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mario Limonciello &lt;superm1@kernel.org&gt;
Link: https://lore.kernel.org/r/20250413084144.3746608-4-mingo@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Collect AMD specific platform header files in &lt;asm/amd/*.h&gt;.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mario Limonciello &lt;superm1@kernel.org&gt;
Link: https://lore.kernel.org/r/20250413084144.3746608-4-mingo@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>RAS/AMD/FMPM: Get masked address</title>
<updated>2025-04-08T17:30:58+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2025-02-27T19:31:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=58029c39cdc54ac4f4dc40b4a9c05eed9f9b808a'/>
<id>58029c39cdc54ac4f4dc40b4a9c05eed9f9b808a</id>
<content type='text'>
Some operations require checking, or ignoring, specific bits in an address
value. For example, this can be comparing address values to identify unique
structures.

Currently, the full address value is compared when filtering for duplicates.
This results in over counting and creation of extra records.  This gives the
impression that more unique events occurred than did in reality.

Mask the address for physical rows on MI300.

  [ bp: Simplify. ]

Fixes: 6f15e617cc99 ("RAS: Introduce a FRU memory poison manager")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some operations require checking, or ignoring, specific bits in an address
value. For example, this can be comparing address values to identify unique
structures.

Currently, the full address value is compared when filtering for duplicates.
This results in over counting and creation of extra records.  This gives the
impression that more unique events occurred than did in reality.

Mask the address for physical rows on MI300.

  [ bp: Simplify. ]

Fixes: 6f15e617cc99 ("RAS: Introduce a FRU memory poison manager")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Include row[13] bit in row retirement</title>
<updated>2025-04-07T13:06:06+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2025-04-01T20:49:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6c44e5354d4d16d9d891a419ca3f57abfe18ce7a'/>
<id>6c44e5354d4d16d9d891a419ca3f57abfe18ce7a</id>
<content type='text'>
Based on feedback from hardware folks, row[13] is part of the variable
bits within a physical row (along with all column bits).

Only half the physical addresses affected by a row are calculated if
this bit is not included.

Add the row[13] bit to the row retirement flow.

Fixes: 3b566b30b414 ("RAS/AMD/ATL: Add MI300 row retirement support")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250401-fix-fmpm-extra-records-v1-1-840bcf7a8ac5@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Based on feedback from hardware folks, row[13] is part of the variable
bits within a physical row (along with all column bits).

Only half the physical addresses affected by a row are calculated if
this bit is not included.

Add the row[13] bit to the row retirement flow.

Fixes: 3b566b30b414 ("RAS/AMD/ATL: Add MI300 row retirement support")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250401-fix-fmpm-extra-records-v1-1-840bcf7a8ac5@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/amd_nb: Move SMN access code to a new amd_node driver</title>
<updated>2025-01-08T09:59:44+00:00</updated>
<author>
<name>Mario Limonciello</name>
<email>mario.limonciello@amd.com</email>
</author>
<published>2024-12-06T16:12:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d6caeafaa324e6aba5ed2ca1a416340c2fd061a2'/>
<id>d6caeafaa324e6aba5ed2ca1a416340c2fd061a2</id>
<content type='text'>
SMN access was bolted into amd_nb mostly as convenience.  This has
limitations though that require incurring tech debt to keep it working.

Move SMN access to the newly introduced AMD Node driver.

Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Acked-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt; # pdx86
Acked-by: Shyam Sundar S K &lt;Shyam-sundar.S-k@amd.com&gt; # PMF, PMC
Link: https://lore.kernel.org/r/20241206161210.163701-11-yazen.ghannam@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SMN access was bolted into amd_nb mostly as convenience.  This has
limitations though that require incurring tech debt to keep it working.

Move SMN access to the newly introduced AMD Node driver.

Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Acked-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt; # pdx86
Acked-by: Shyam Sundar S K &lt;Shyam-sundar.S-k@amd.com&gt; # PMF, PMC
Link: https://lore.kernel.org/r/20241206161210.163701-11-yazen.ghannam@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>RAS/AMD/ATL: Add debug prints for DF register reads</title>
<updated>2024-10-22T16:55:57+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2024-10-21T15:21:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=233679b58c0bfe7bafeb6e048b90af08bb9e7fc1'/>
<id>233679b58c0bfe7bafeb6e048b90af08bb9e7fc1</id>
<content type='text'>
The ATL will fail early if the DF register access fails due to missing
PCI IDs in the amd_nb code. There aren't any clear indicators on why the
ATL will fail to load in this case.

Add a couple of debug print statements to highlight reasons for failure.

A common scenario is missing support for new hardware. If the ATL fails
to load on a system, and there is interest to support it, then dynamic
debugging can be enabled to help find the cause for failure. If there is
no interest in supporting ATL on a new system, then these failures will
be silent.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/20241021152158.2525669-1-yazen.ghannam@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The ATL will fail early if the DF register access fails due to missing
PCI IDs in the amd_nb code. There aren't any clear indicators on why the
ATL will fail to load in this case.

Add a couple of debug print statements to highlight reasons for failure.

A common scenario is missing support for new hardware. If the ATL fails
to load on a system, and there is interest to support it, then dynamic
debugging can be enabled to help find the cause for failure. If there is
no interest in supporting ATL on a new system, then these failures will
be silent.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/20241021152158.2525669-1-yazen.ghannam@amd.com
</pre>
</div>
</content>
</entry>
</feed>
