<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c, branch v5.10.258</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>drm/amdgpu: Fix amdgpu_ras_eeprom_init()</title>
<updated>2021-09-18T11:40:17+00:00</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2021-03-26T20:40:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=10a135969fd7419695c003ddb67ef8a7820a808b'/>
<id>10a135969fd7419695c003ddb67ef8a7820a808b</id>
<content type='text'>
[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ]

No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes the the lower layers.

Cc: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Cc: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Acked-by: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ]

No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes the the lower layers.

Cc: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Cc: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Acked-by: Alexander Deucher &lt;Alexander.Deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: added RAS EEPROM device support check</title>
<updated>2020-08-04T21:29:18+00:00</updated>
<author>
<name>John Clements</name>
<email>john.clements@amd.com</email>
</author>
<published>2020-08-03T07:52:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=4bfb74282f064f738abb22734a675326f41c230c'/>
<id>4bfb74282f064f738abb22734a675326f41c230c</id>
<content type='text'>
updated RAS EEPROM init/threshold sequences to check for device support

Reviewed-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
updated RAS EEPROM init/threshold sequences to check for device support

Reviewed-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: update eeprom once specifying one bigger threshold(v3)</title>
<updated>2020-08-04T21:27:25+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-27T07:51:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=9b856defbe355f886f99777e667506841bd267a8'/>
<id>9b856defbe355f886f99777e667506841bd267a8</id>
<content type='text'>
During driver's probe, when it hits bad gpu tag in eeprom i2c
init calling(the tag was set when reported bad page reaches
bad page threshold in last driver's working loop), there are
some strategys to deal with the cases:

1. when the module parameter amdgpu_bad_page_threshold = 0,
that means page retirement feature is disabled, so just resetting
the eeprom is fine.
2. When amdgpu_bad_page_threshold is not 0, and moreover, user
sets one bigger valid data in order to make current boot up
succeeds, correct eeprom header tag and do not break booting.
3. For other cases, driver's probe will be broken.

v2: Just update eeprom header tag instead of resetting the whole
    table header when user sets one bigger threshold data.

v3: Use dev_info/dev_err to print PCI device information, which
    helps in mGPU case.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During driver's probe, when it hits bad gpu tag in eeprom i2c
init calling(the tag was set when reported bad page reaches
bad page threshold in last driver's working loop), there are
some strategys to deal with the cases:

1. when the module parameter amdgpu_bad_page_threshold = 0,
that means page retirement feature is disabled, so just resetting
the eeprom is fine.
2. When amdgpu_bad_page_threshold is not 0, and moreover, user
sets one bigger valid data in order to make current boot up
succeeds, correct eeprom header tag and do not break booting.
3. For other cases, driver's probe will be broken.

v2: Just update eeprom header tag instead of resetting the whole
    table header when user sets one bigger threshold data.

v3: Use dev_info/dev_err to print PCI device information, which
    helps in mGPU case.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: break GPU recovery once it's in bad state(v4)</title>
<updated>2020-08-04T21:26:54+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-23T08:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=e8fbaf03429d085b05e66b153a8363b746c94b43'/>
<id>e8fbaf03429d085b05e66b153a8363b746c94b43</id>
<content type='text'>
When GPU executes recovery and retriving bad GPU tag
from external eerpom device, the recovery will be broken
and error message is printed as well for user's awareness.

v2: Refine warning message in threshold reaching case, and
    fix spelling typo.

v3: Fix explicit calling of bad gpu.

v4: Rename function names.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When GPU executes recovery and retriving bad GPU tag
from external eerpom device, the recovery will be broken
and error message is printed as well for user's awareness.

v2: Refine warning message in threshold reaching case, and
    fix spelling typo.

v3: Fix explicit calling of bad gpu.

v4: Rename function names.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2)</title>
<updated>2020-08-04T21:26:46+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-23T08:05:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=9c06f91ff2349da67651cddf88507aa967b58b08'/>
<id>9c06f91ff2349da67651cddf88507aa967b58b08</id>
<content type='text'>
Once the bad page saved to eeprom reaches the configured
threshold, ras recovery will be issued to notify user.

v2: Fix spelling typo.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Once the bad page saved to eeprom reaches the configured
threshold, ras recovery will be issued to notify user.

v2: Fix spelling typo.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: break driver init process when it's bad GPU(v5)</title>
<updated>2020-08-04T21:26:31+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-23T07:42:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=b82e65a93510465cb4c203c938245f137a4e95dc'/>
<id>b82e65a93510465cb4c203c938245f137a4e95dc</id>
<content type='text'>
When retrieving bad gpu tag from eeprom, GPU init should
fail as the GPU needs to be retired for further check.

v2: Fix spelling typo, correct the condition to detect
    bad gpu tag and refine error message.

v3: Refine function argument name.

v4: Fix missing check of returning value of i2c
    initialization error case.

v5: Use dev_err to print PCI information in dmesg instead
    of DRM_ERROR.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When retrieving bad gpu tag from eeprom, GPU init should
fail as the GPU needs to be retired for further check.

v2: Fix spelling typo, correct the condition to detect
    bad gpu tag and refine error message.

v3: Refine function argument name.

v4: Fix missing check of returning value of i2c
    initialization error case.

v5: Use dev_err to print PCI information in dmesg instead
    of DRM_ERROR.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: add bad gpu tag definition</title>
<updated>2020-08-04T21:26:16+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-23T07:35:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1d6a9d122d2c7817ad1bae0e59c8a29450c2b14d'/>
<id>1d6a9d122d2c7817ad1bae0e59c8a29450c2b14d</id>
<content type='text'>
This tag will be hired for bad gpu detection in eeprom's access.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This tag will be hired for bad gpu detection in eeprom's access.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: validate bad page threshold in ras(v3)</title>
<updated>2020-08-04T21:25:58+00:00</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-22T02:00:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c84d46707ebbfc303268e27d5aeb8bb4e6d5b1ff'/>
<id>c84d46707ebbfc303268e27d5aeb8bb4e6d5b1ff</id>
<content type='text'>
Bad page threshold value should be valid in the range between
-1 and max records length of eeprom. It could determine when
saved bad pages exceed threshold value, and proceed corresponding
actions.

v2: When using the default typical value, it should be min
value between typical value and eeprom max records length.

v3: drop the case of setting bad_page_cnt_threshold to be
    0xFFFFFFFF, as it confuses user.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Bad page threshold value should be valid in the range between
-1 and max records length of eeprom. It could determine when
saved bad pages exceed threshold value, and proceed corresponding
actions.

v2: When using the default typical value, it should be min
value between typical value and eeprom max records length.

v3: drop the case of setting bad_page_cnt_threshold to be
    0xFFFFFFFF, as it confuses user.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Move EEPROM I2C adapter to amdgpu_device</title>
<updated>2020-03-16T20:21:32+00:00</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2020-03-12T21:11:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=9015d60c9ee106cef0c8b969f998ca157f401605'/>
<id>9015d60c9ee106cef0c8b969f998ca157f401605</id>
<content type='text'>
Puts the i2c adapter in common place for sharing by RAS
and upcoming data read from FRU EEPROM feature.

v2:
Move i2c adapter to amdgpu_pm and rename it.

v3: Move i2c adapter init to ASIC specific code and get rid
of the switch case in amdgpu_device

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Puts the i2c adapter in common place for sharing by RAS
and upcoming data read from FRU EEPROM feature.

v2:
Move i2c adapter to amdgpu_pm and rename it.

v3: Move i2c adapter init to ASIC specific code and get rid
of the switch case in amdgpu_device

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: Add Arcturus D342 page retire support</title>
<updated>2020-02-26T19:17:32+00:00</updated>
<author>
<name>John Clements</name>
<email>john.clements@amd.com</email>
</author>
<published>2020-02-25T12:25:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=ef1caf48bd4d7cb568f22bf9d7cc9d725edc1555'/>
<id>ef1caf48bd4d7cb568f22bf9d7cc9d725edc1555</id>
<content type='text'>
Check Arcturus SKU type to select I2C address of page retirement EEPROM

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Check Arcturus SKU type to select I2C address of page retirement EEPROM

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
