Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit eac21f8ebeb4f84d703cf41dc3f81d16fa9dc00a ]
Currently the TTM shrinker aborts shrinking as soon as it frees pages from
any of the page order pools and by doing so it can fail to respect the
freeing target which was configured by the shrinker core.
We use the wording "can fail" because the number of freed pages will
depend on the presence of pages in the pools and the order of the pools on
the LRU list. For example if there are no free pages in the high order
pools the shrinker core may require multiple passes over the TTM shrinker
before it will free the default target of 128 pages (assuming there are
free pages in the low order pools). This inefficiency can be compounded by
the pool LRU where multiple further calls into the TTM shrinker are
required to end up looking at the pool with pages.
Improve this by never freeing less than the shrinker core has requested.
At the same time we start reporting the number of scanned pages (freed in
this case), which prevents the core shrinker from giving up on the TTM
shrinker too soon and moving on.
v2:
* Simplify loop logic. (Christian)
* Improve commit message.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://lore.kernel.org/r/20250603112750.34997-2-tvrtko.ursulin@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4e16a9a00239db5d819197b9a00f70665951bf50 ]
For the evict fail case, the evict error should be returned.
v2: Consider ENOENT case.
v3: Abort directly when the eviction failed for some reason (except for -ENOENT)
and not wait for the move to finish
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250603091154.3472646-1-Emily.Deng@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 97e000acf2e20a86a50a0ec8c2739f0846f37509 upstream.
Unlocking the resv object was missing in the error path, additionally to
that we should move over the resource only after the fence slot was
reserved.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Fixes: c8d4c18bfbc4a ("dma-buf/drivers: make reserving a shared slot mandatory v4")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20250616130726.22863-3-christian.koenig@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 878516a9e62cd220379e511d43dcf58df3a6ca9f ]
In the ttm_bo_unreserve_bulk() test function, resv is allocated using
kunit_kzalloc(), but the subsequent assertion mistakenly verifies the
ttm_dev pointer instead of the resv pointer.
Fix the assertion to properly verify the resv pointer.
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250313161424.10688-1-qasdev00@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
The abstraction was previously added to support separate
ttm_backup implementations.
However with the current implementation casting from a
struct file to a struct ttm_backup, we run into trouble since
struct file may have randomized the layout and gcc complains.
Remove the struct ttm_backup abstraction
Cc: dri-devel@lists.freedesktop.org
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Kees Cook <kees@kernel.org>
Closes: https://lore.kernel.org/dri-devel/9c8dbbafdaf9f3f089da2cde5a772d69579b3795.camel@linux.intel.com/T/#mb153ab9216cb813b92bdeb36f391ad4808c2ba29
Suggested-by: Christian König <christian.koenig@amd.com>
Fixes: 70d645deac98 ("drm/ttm: Add helpers for shrinking")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250502130014.3156-1-thomas.hellstrom@linux.intel.com
|
|
The docs were not properly updated from an earlier version of the code.
Fixes: e7b5d23e5d47 ("drm/ttm: Provide a shmem backup implementation")
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502130101.3185-1-thomas.hellstrom@linux.intel.com
|
|
fix the below warning messages:
ttm/ttm_bo.c:1098: warning: Function parameter or struct member 'hit_low' not described in 'ttm_bo_swapout_walk'
ttm/ttm_bo.c:1098: warning: Function parameter or struct member 'evict_low' not described in 'ttm_bo_swapout_walk'
Cc: Maarten Lankhorst <dev@lankhorst.se>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250423042442.762108-1-sunil.khatri@amd.com
|
|
Add a number of helpers for shrinking that access core TTM and
core MM functionality in a way that make them unsuitable for
driver open-coding.
v11:
- New patch (split off from previous) and additional helpers.
v13:
- Adapt to ttm_backup interface change.
- Take resource off LRU when backed up.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/intel-xe/20250305092220.123405-6-thomas.hellstrom@linux.intel.com
|
|
Following the design direction communicated here:
https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/#ma918844aa8a6efe8768fdcda0c6590d5c93850c9
Export a LRU walker for driver shrinker use. The walker
initially supports only trylocking, since that's the
method used by shrinkes. The walker makes use of
scoped_guard() to allow exiting from the LRU walk loop
without performing any explicit unlocking or
cleanup.
v8:
- Split out from another patch.
- Use a struct for bool arguments to increase readability (Matt Brost).
- Unmap user-space cpu-mappings before shrinking pages.
- Explain non-fatal error codes (Matt Brost)
v10:
- Instead of using the existing helper, Wrap the interface inside out and
provide a loop to de-midlayer things the LRU iteration (Christian König).
- Removing the R-B by Matt Brost since the patch was significantly changed.
v11:
- Split the patch up to include just the LRU walk helper.
v12:
- Indent after scoped_guard() (Matt Brost)
v15:
- Adapt to new definition of scoped_guard()
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/intel-xe/20250305092220.123405-5-thomas.hellstrom@linux.intel.com
|
|
Use fault-injection to test partial TTM swapout and interrupted swapin.
Return -EINTR for swapin to test the callers ability to handle and
restart the swapin, and on swapout perform a partial swapout to test that
the swapin and release_shrunken functionality.
v8:
- Use the core fault-injection system.
v9:
- Fix compliation failure for !CONFIG_FAULT_INJECTION
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/intel-xe/20250305092220.123405-4-thomas.hellstrom@linux.intel.com
|
|
Provide a helper to shrink ttm_tt page-vectors on a per-page
basis. A ttm_backup backend could then in theory get away with
allocating a single temporary page for each struct ttm_tt.
This is accomplished by splitting larger pages before trying to
back them up.
In the future we could allow ttm_backup to handle backing up
large pages as well, but currently there's no benefit in
doing that, since the shmem backup backend would have to
split those anyway to avoid allocating too much temporary
memory, and if the backend instead inserts pages into the
swap-cache, those are split on reclaim by the core.
Due to potential backup- and recover errors, allow partially swapped
out struct ttm_tt's, although mark them as swapped out stopping them
from being swapped out a second time. More details in the ttm_pool.c
DOC section.
v2:
- A couple of cleanups and error fixes in ttm_pool_back_up_tt.
- s/back_up/backup/
- Add a writeback parameter to the exported interface.
v8:
- Use a struct for flags for readability (Matt Brost)
- Address misc other review comments (Matt Brost)
v9:
- Update the kerneldoc for the ttm_tt::backup field.
v10:
- Rebase.
v13:
- Rebase on ttm_backup interface change. Update kerneldoc.
- Rebase and adjust ttm_tt_is_swapped().
v15:
- Rebase on ttm_backup return value change.
- Rebase on previous restructuring of ttm_pool_alloc()
- Rework the ttm_pool backup interface (Christian König)
- Remove cond_resched() (Christian König)
- Get rid of the need to allocate an intermediate page array
when restoring a multi-order page (Christian König)
- Update documentation.
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Link: https://lore.kernel.org/intel-xe/20250305092220.123405-3-thomas.hellstrom@linux.intel.com
|
|
Provide a standalone shmem backup implementation.
Given the ttm_backup interface, this could
later on be extended to providing other backup
implementation than shmem, with one use-case being
GPU swapout to a user-provided fd.
v5:
- Fix a UAF. (kernel test robot, Dan Carptenter)
v6:
- Rename ttm_backup_shmem_copy_page() function argument
(Matthew Brost)
- Add some missing documentation
v8:
- Use folio_file_page to get to the page we want to writeback
instead of using the first page of the folio.
v13:
- Remove the base class abstraction (Christian König)
- Include ttm_backup_bytes_avail().
v14:
- Fix kerneldoc for ttm_backup_bytes_avail() (0-day)
- Work around casting of __randomize_layout struct pointer (0-day)
v15:
- Return negative error code from ttm_backup_backup_page()
(Christian König)
- Doc fixes. (Christian König).
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/intel-xe/20250305092220.123405-2-thomas.hellstrom@linux.intel.com
|
|
TTM always uses pin_count and ttm_resource_is_swapped() together to
determine whether a BO is unevictable.
Now use ttm_resource_unevictable() to replace them.
Signed-off-by: Zhaoyu Liu <liuzhaoyu.zackary@bytedance.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250126093256.GA688734@bytedance
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Bring rc1 to start the new release dev.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
|
Simplify the pool allocation code somewhat by merging loop arguments
used by multiple functions together in a struct and simplifying the
loop. Also add documentation.
This hopefully makes the behaviour of the allocation loop
simplier to understand, but above all paves the way for upcoming
restore-while-allocating functionality.
There are no functional changes, but the "allow_pools" bool
introduced to keep current functionality could be removed as a
follow up, which would enable using write-back cached pools when
allocating memory for other caching modes, rather than to resort
to allocating from the system directly.
v15:
- Introduce this patch to simplify the upcoming patch that introduces
restore while allocating.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241217145852.37342-4-thomas.hellstrom@linux.intel.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Make the interface more symmetric by providing and using a
ttm_resource_cursor_init().
v10:
- Fix a stray newline (Matthew Brost)
- Update kerneldoc (Matthew Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241217145852.37342-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-next
DMEM cgroup pull request
This introduces a new cgroup controller to limit the device memory.
Notable users would be DRM, dma-buf heaps, or v4l2.
This pull request is based on the series developped by Maarten
Lankhorst, Friedrich Vock, and I:
https://lore.kernel.org/all/20241204134410.1161769-1-dev@lankhorst.se/
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250110-cryptic-warm-mandrill-b71f5d@houat
|
|
cgroup resource allocation has to be handled in TTM, so -EAGAIN from
cgroups can be converted into -ENOSPC, and the limitcg can be properly
evicted in ttm code.
When hitting a resource limit through -EAGAIN, the cgroup for which the
limit is hit is also returned. This allows eviction to delete only from
cgroups which are a subgroup of the current cgroup.
The returned CSS is used to determine if eviction is valuable for a
given resource, and allows TTM to only target specific resources to
lower memory usage.
Co-developed-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
Co-developed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20241204134410.1161769-4-dev@lankhorst.se
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- Make OA buffer size configurable (Sai)
Display Changes (including i915):
- Fix ttm_bo_access() usage (Auld)
- Power request asserting/deasserting for Xe3lpd (Mika)
- One Type-C conversion towards struct intel_display (Mika)
Driver Changes:
- GuC capture related fixes (Everest, Zhanjun)
- Move old workaround to OOB infra (Lucas)
- Compute mode change refactoring (Bala)
- Add ufence and g2h flushes for LNL Hybrid timeouts (Nirmoy)
- Avoid unnecessary OOM kills (Thomas)
- Restore system memory GGTT mappings (Brost)
- Fix build error for XE_IOCTL_DBG macro (Gyeyoung)
- Documentation updates and fixes (Lucas, Randy)
- A few exec IOCTL fixes (Brost)
- Fix potential GGTT allocation leak (Michal)
- Fix races on fdinfo (Lucas)
- SRIOV VF: Post-migration recovery worker basis (Tomasz)
- GuC Communication fixes and improvements (Michal, John, Tomasz, Auld, Jonathan)
- SRIOV PF: Add support for VF scheduling priority
- Trace improvements (Lucas, Auld, Oak)
- Hibernation on igpu fixes and improvements (Auld)
- GT oriented logs/asserts improvements (Michal)
- Take job list lock in xe_sched_first_pending_job (Nirmoy)
- GSC: Improve SW proxy error checking and logging (Daniele)
- GuC crash notifications & drop default log verbosity (John)
- Fix races on fdinfo (Lucas)
- Fix runtime_pm handling in OA (Ashutosh)
- Allow fault injection in vm create and vm bind IOCTLs (Francois)
- TLB invalidation fixes (Nirmoy, Daniele)
- Devcoredump Improvements, doc and fixes (Brost, Lucas, Zhanjun, John)
- Wake up waiters after setting ufence->signalled (Nirmoy)
- Mark preempt fence workqueue as reclaim (Brost)
- Trivial header/flags cleanups (Lucas)
- VRAM drop 2G block restriction (Auld)
- Drop useless d3cold allowed message (Brost)
- SRIOV PF: Drop 2GiB limit of fair LMEM allocation (Michal)
- Add another PTL PCI ID (Atwood)
- Allow bo mapping on multiple ggtts (Niranjana)
- Add support for GuC-to-GuC communication (John)
- Update xe2_graphics name string (Roper)
- VRAM: fix lpfn check (Auld)
- Ad Xe3 workaround (Apoorva)
- Migrate fixes (Auld)
- Fix non-contiguous VRAM BO access (Brost)
- Log throttle reasons (Raag)
- Enable PMT support for BMG (Michael)
- IRQ related fixes and improvements (Ilia)
- Avoid evicting object of the same vm in none fault mode (Oak)
- Fix in tests (Nirmoy)
- Fix ERR_PTR handling (Mirsad)
- Some reg_sr/whitelist fixes and refactors (Lucas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmdaHkMACgkQ+mJfZA7r
# E8o+twf/XYZTk4O3qQ+yNL3PDQT0NIKjH8mEnmu4udyIw/sYhQe6ji+uh1YutK8Y
# 41IQc06qQogTj36bqSwbjThw5asMfRh2sNR/p1uOy7RGUnN25FuYSXEgOeDWi/Ec
# xrZE1TKPotFGeGI09KJmzjzMq94cgv97Pxma+5m8BjVsvzXQSzEJ2r9cC6ruSfNT
# O5Jq5nqxHSkWUbKCxPnixSlGnH4jbsuiqS1E1pnH+u6ijxsfhOJj686wLn2FRkiw
# 6FhXmJBrd8AZ0Q2E7h3UswE5O88I0ALDc58OINAzD1GMyzvZj2vB1pXgj5uNr0/x
# Ku4cxu1jprsi+FLUdKAdYpxRBRanow==
# =3Ou7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 12 Dec 2024 09:20:35 AEST
# gpg: using RSA key 6D207068EEDD65091C2CE2A3FA625F640EEB13CA
# gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>" [unknown]
# gpg: aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C E2A3 FA62 5F64 0EEB 13CA
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z1ofx-fExLQKV_e4@intel.com
|
|
Instead of a TTM reference grab a GEM reference whenever necessary for a
VM mapping.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240723121750.2086-7-christian.koenig@amd.com
|
|
Non-contiguous VRAM cannot easily be mapped in TTM nor can non-visible
VRAM easily be accessed. Add ttm_bo_access, which is similar to
ttm_bo_vm_access, to access such memory.
v4:
- Fix checkpatch warnings (CI)
v5:
- Fix checkpatch warnings (CI)
v6:
- Fix kernel doc (Auld)
v7:
- Move ttm_bo_access to ttm_bo_vm.c (Christian)
Cc: Christian König <christian.koenig@amd.com>
Reported-by: Christoph Manszewski <christoph.manszewski@intel.com>
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241126174615.2665852-3-matthew.brost@intel.com
|
|
The ttm_bo_pin() and ttm_bo_unpin() functions weren't moving their
resources off the LRU list to the unevictable list.
Make sure that happens so that pinned objects don't accidently linger
on the LRU lists, and also make sure to move them back once they
are unpinned.
v2:
- Removing from a bulk move must be done with the pin-count still zero.
v3:
- ttm_resource_move_to_lru_tail must be done after pinning with a non-
NULL resource (Intel CI).
v6:
- Use a TAB instead of space (checkpatch.pl error).
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911121859.85387-3-thomas.hellstrom@linux.intel.com
|
|
Resources of swapped objects remains on the TTM_PL_SYSTEM manager's
LRU list, which is bad for the LRU walk efficiency.
Rename the device-wide "pinned" list to "unevictable" and move
also resources of swapped-out objects to that list.
An alternative would be to create an "UNEVICTABLE" priority to
be able to keep the pinned- and swapped objects on their
respective manager's LRU without affecting the LRU walk efficiency.
v2:
- Remove a bogus WARN_ON (Christian König)
- Update ttm_resource_[add|del] bulk move (Christian König)
- Fix TTM KUNIT tests (Intel CI)
v3:
- Check for non-NULL bo->resource in ttm_bo_populate().
v4:
- Don't move to LRU tail during swapout until the resource
is properly swapped or there was a swapout failure.
(Intel Ci)
- Add a newline after checkpatch check.
v5:
- Introduce ttm_resource_is_swapped() to avoid a corner-case where
a newly created resource was considered swapped. (Intel CI)
v6:
- Move an assert.
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911121859.85387-2-thomas.hellstrom@linux.intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.12:
UAPI Changes:
virtio:
- Define DRM capset
Cross-subsystem Changes:
dma-buf:
- heaps: Clean up documentation
printk:
- Pass description to kmsg_dump()
Core Changes:
CI:
- Update IGT tests
- Point upstream repo to GitLab instance
modesetting:
- Introduce Power Saving Policy property for connectors
- Add might_fault() to drm_modeset_lock priming
- Add dynamic per-crtc vblank configuration support
panic:
- Avoid build-time interference with framebuffer console
docs:
- Document Colorspace property
scheduler:
- Remove full_recover from drm_sched_start
TTM:
- Make LRU walk restartable after dropping locks
- Allow direct reclaim to allocate local memory
Driver Changes:
amdgpu:
- Support Power Saving Policy connector property
ast:
- astdp: Support AST2600 with VGA; Clean up HPD
bridge:
- Silence error message on -EPROBE_DEFER
- analogix: Clean aup
- bridge-connector: Fix double free
- lt6505: Disable interrupt when powered off
- tc358767: Make default DP port preemphasis configurable
gma500:
- Update i2c terminology
ivpu:
- Add MODULE_FIRMWARE()
lcdif:
- Fix pixel clock
loongson:
- Use GEM refcount over TTM's
mgag200:
- Improve BMC handling
- Support VBLANK intterupts
nouveau:
- Refactor and clean up internals
- Use GEM refcount over TTM's
panel:
- Shutdown fixes plus documentation
- Refactor several drivers for better code sharing
- boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
DT; Fix porch parameter
- edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
CMN N116BCP-EA2, CSW MNB601LS1-4
- himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
- ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
- jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
for code sharing
sti:
- Fix module owner
stm:
- Avoid UAF wih managed plane and CRTC helpers
- Fix module owner
- Fix error handling in probe
- Depend on COMMON_CLK
- ltdc: Fix transparency after disabling plane; Remove unused interrupt
tegra:
- Call drm_atomic_helper_shutdown()
v3d:
- Clean up perfmon
vkms:
- Clean up
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box
|
|
Limiting the allocation of higher order pages to the closest NUMA node
and enabling direct memory reclaim provides not only failsafe against
situations when memory becomes too much fragmented and the allocator is
not able to satisfy the request from the local node but falls back to
remote pages (HUGEPAGE) but also offers performance improvement.
Accessing remote pages suffers due to bandwidth limitations and could be
avoided if memory becomes defragmented and in most cases without using
manual compaction. (/proc/sys/vm/compact_memory)
Note: On certain distros such as RHEL, the proactive compaction is
disabled. (https://tinyurl.com/4f32f7rs)
v2 (chk): drop __GFP_RECLAIM since that is already set by GFP_USER
Cc: Dave Airlie <airlied@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708160636.1147308-1-rajneesh.bhardwaj@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Pull drm updates from Dave Airlie:
"There's a lot of stuff in here, amd, i915 and xe have new platform
work, lots of core rework around EDID handling, some new COMPILE_TEST
options, maintainer changes and a lots of other stuff. Summary:
core:
- deprecate DRM data and return 0 date
- connector: Create a set of helpers to help with HDMI support
- Remove driver owner assignments
- Allow more drivers to compile with COMPILE_TEST
- Conversions to drm_edid
- Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
- Remove drm_mm_replace_node
- print: Add a drm prefix to warn level messages too, remove
___drm_dbg, consolidate prefix handling
- New monochrome TV mode variant
ttm:
- improve number of page faults on some platforms
- fix test builds under PREEMPT_RT
- more test coverage
ci:
- Require a more recent version of mesa
- improve farm setup and test generation
dma-buf:
- warn if reserving 0 fence slots
- internal API heap enhancements
fbdev:
- Create memory manager optimized fbdev emulation
panic:
- Allow to select fonts
- improve drm_fb_dma_get_scanout_buffer
- Allow to dump kmsg to the screen
bridge:
- Remove redundant checks on bridge->encoder
- Remove drm_bridge_chain_mode_fixup
- bridge-connector: Plumb in the new HDMI helper
- analogix_dp: Various improvements, handle AUX transfers timeout
- samsung-dsim: Fix timings calculation
- tc358767: Plenty of small fixes, fix no connector attach, fix
clocks
- sii902x: state validation improvements
panels:
- Switch panels from register table initialization to proper code
- Now that the panel code tracks the panel state, remove every ad-hoc
implementation in the panel drivers
- More cleanup of prepare / enable state tracking in drivers
- edp: Drop legacy panel compatibles
- simple-bridge: Switch to devm_drm_bridge_add
- New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
PM070WL4, Lincoln Technologies LCD197, Ortustech
COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti
amdgpu:
- DCN 4.0.x support
- GC 12.0 support
- GMC 12.0 support
- SDMA 7.0 support
- MES12 support
- MMHUB 4.1 support
- GFX12 modifier and DCC support
- lots of IP fixes/updates
amdkfd:
- Contiguous VRAM allocations
- GC 12.0 support
- SDMA 7.0 support
- SR-IOV fixes
- KFD GFX ALU exceptions
i915:
- Battlemage Xe2 HPD display enablement
- Panel Replay enabling
- DP AUX-less ALPM/LOBF
- Enable link training failure fallback for DP MST links
- CMRR (Content Match Refresh Rate) enabling
- Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
- Enable eDP AUX based HDR backlight
- Support replaying GPU hangs with captured context image
- Automate CCS Mode setting during engine resets
- lots of refactoring
- Support replaying GPU hangs with captured context image
- Increase FLR timeout from 3s to 9s
- Enable w/a 16021333562 for DG2, MTL and ARL [guc]
xe:
- update MAINATINERS
- New uapi adding OA functionality to Xe
- expose l3 bank mask
- fix display detect on ADL-N
- runtime PM Fixes
- Fix silent backmerge issues
- More prep for SR-IOV
- HWmon additions
- per client usage info
- Rework GPU page fault handling
- Drop EXEC_QUEUE_FLAG_BANNED
- Add BMG PCI IDs
- Scheduler fixes and improvements
- Rename xe_exec_queue::compute to xe_exec_queue::lr
- Use ttm_uncached for BO with NEEDS_UC flag
- Rename xe perf layer as xe observation layer
- lots of refactoring
radeon:
- Backlight workaround for iMac
- Silence UBSAN flex array warnings
msm:
- Validate registers XML description against schema in CI
- core/dpu: SM7150 support
- mdp5: Add support for MSM8937
- gpu: Add param for userspace to know if raytracing is supported
- gpu: X185 support (aka gpu in X1 laptop chips)
- gpu: a505 support
ivpu:
- hardware scheduler support
- profiling support
- improvements to the platform support layer
- firmware handling improvements
- clocks/power mgmt improvements
- scheduler/logging improvements
habanalabs:
- Gradual sleep in polling memory macro
- Reduce Gaudi2 MSI-X interrupt count to 128
- Add Gaudi2-D revision support
- Add timestamp to CPLD info
- Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
- Align Gaudi2 interrupt names
- Check for errors after preboot is ready
- Change habanalabs maintainer and git repo path
mgag200:
- refactoring and improvements
- Add BMC output
- enable polling
nouveau:
- add registry command line
v3d:
- perf counters improvements
zynqmp:
- irq and debugfs improvements
atmel-hlcdc:
- Support XLCDC in sam9x7
mipi-dbi:
- Remove mipi_dbi_machine_little_endian
- make SPI bits per word configurable
- support RGB888
- allow pixel formats to be specified in the DT
sun4i:
- Rework the blender setup for DE2
panfrost:
- Enable MT8188 support
vc4:
- Monochrome TV support
exynos:
- fix fallback mode regression
- fix memory leak
- Use drm_edid_duplicate() instead of kmemdup()
etnaviv:
- fix i.MX8MP NPU clock gating
- workaround FE register cdc issues on some cores
- fix DMA sync handling for cached buffers
- fix job timeout handling
- keep TS enabled on MMUv2 cores for improved performance
mediatek:
- Convert to platform remove callback returning void-
- Drop chain_mode_fixup call in mode_valid()
- Fixes the errors of MediaTek display driver found by IGT
- Add display support for the MT8365-EVK board
- Fix bit depth overwritten for mtk_ovl_set bit_depth()
- Fix possible_crtcs calculation
- Fix spurious kfree()
ast:
- refactor mode setting code
stm:
- Add LVDS support
- DSI PHY updates"
* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
drm/amdgpu/mes12: add missing opcode string
drm/amdgpu/mes11: update opcode strings
Revert "drm/amd/display: Reset freesync config before update new state"
drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
drm/xe: Drop trace_xe_hw_fence_free
drm/xe/uapi: Rename xe perf layer as xe observation layer
drm/amdgpu: remove exp hw support check for gfx12
drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
drm/amdgpu: flush all cached ras bad pages to eeprom
drm/amdgpu: select compute ME engines dynamically
drm/amd/display: Allow display DCC for DCN401
drm/amdgpu: select compute ME engines dynamically
drm/amdgpu/job: Replace DRM_INFO/ERROR logging
drm/amdgpu: select compute ME engines dynamically
drm/amd/pm: Ignore initial value in smu response register
drm/amdgpu: Initialize VF partition mode
drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
MAINTAINERS: fix Xinhui's name
MAINTAINERS: update powerplay and swsmu
drm/qxl: Pin buffer objects for internal mappings
...
|
|
Use the LRU walker for eviction. This helps
removing a lot of code with weird locking
semantics.
The functionality is slightly changed so that
when trylocked buffer objects are exhausted, we
continue to interleave walks with ticket-locks while
there is still progress made. The list walks are
not restarted in-between evictions.
Also provide a separate ttm_bo_evict_first()
function for its single user. The context of that
user allows sleeping dma_resv locks.
v6:
- Various cleanups suggested by Matthew Brost.
- Fix error return code of ttm_bo_evict_first(). (Matthew Brost)
- Fix an error check that was inverted. (Matthew Brost)
v7:
- Use s64 rather than long (Christian König)
- Early ttm_resource_cursor_fini() in ttm_bo_evict_first().
- Simplify check for bo_moved in ttm_bo_evict_first().
(Christian König)
- Don't evict pinned bos.
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v6
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-8-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Rework the TTM swapping to use the LRU walker helper.
This helps fixing up the ttm_bo_swapout() interface
to be consistent about not requiring any locking.
For now mimic the current behaviour of using trylock
only. We could be using ticket-locks here but defer
that until it's deemed necessary. The TTM swapout
functionality is a bit weird anyway since it
alternates between memory types without exhausting
TTM_PL_SYSTEM first.
Intentionally keep pages as the unit of progress since
changing that to bytes is an unrelated change that can
be done later.
v6:
- Improve on error code translation in the swapout callback
(Matthew Brost).
v7:
- Use s64 rather than long.
- Remove ttm_resource_cursor_fini() since it's no longer used.
- Rename ttm_resource_cursor_fini_locked() to
ttm_resource_cursor_fini().
- Don't swap out pinned bos.
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v6
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-7-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Provide a generic LRU walker in TTM, in the spirit of drm_gem_lru_scan()
but building on the restartable TTM LRU functionality.
The LRU walker optionally supports locking objects as part of
a ww mutex locking transaction, to mimic to some extent the
current functionality in ttm. However any -EDEADLK return
is converted to -ENOSPC and then to -ENOMEM before reaching
the driver, so that the driver will need to backoff and possibly retry
without being able to keep the ticket.
v3:
- Move the helper to core ttm.
- Remove the drm_exec usage from it for now, it will be
reintroduced later in the series.
v4:
- Handle the -EALREADY case if ticketlocking.
v6:
- Some cleanup and added code comments (Matthew Brost)
- Clarified the ticketlock in the commit message (Matthew Brost)
v7:
- Use s64 rather than long for the target and progress
(Christian König)
- Update documentation to not encourage using pages as a
progress measure. (Christian König)
- Remove cond_resched(). (Christian König)
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v6
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-6-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
To address the problem with hitches moving when bulk move
sublists are lru-bumped, register the list cursors with the
ttm_lru_bulk_move structure when traversing its list, and
when lru-bumping the list, move the cursor hitch to the tail.
This also means it's mandatory for drivers to call
ttm_lru_bulk_move_init() and ttm_lru_bulk_move_fini() when
initializing and finalizing the bulk move structure, so add
those calls to the amdgpu- and xe driver.
Compared to v1 this is slightly more code but less fragile
and hopefully easier to understand.
Changes in previous series:
- Completely rework the functionality
- Avoid a NULL pointer dereference assigning manager->mem_type
- Remove some leftover code causing build problems
v2:
- For hitch bulk tail moves, store the mem_type in the cursor
instead of with the manager.
v3:
- Remove leftover mem_type member from change in v2.
v6:
- Add some lockdep asserts (Matthew Brost)
- Avoid NULL pointer dereference (Matthew Brost)
- No need to check bo->resource before dereferencing
bo->bulk_move (Matthew Brost)
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Have iterators insert themselves into the list they are iterating
over using hitch list nodes. Since only the iterator owner
can remove these list nodes from the list, it's safe to unlock
the list and when continuing, use them as a starting point. Due to
the way LRU bumping works in TTM, newly added items will not be
missed, and bumped items will be iterated over a second time before
reaching the end of the list.
The exception is list with bulk move sublists. When bumping a
sublist, a hitch that is part of that sublist will also be moved
and we might miss items if restarting from it. This will be
addressed in a later patch.
Changes in previous series:
- Updated ttm_resource_cursor_fini() documentation.
v2:
- Don't reorder ttm_resource_manager_first() and _next().
(Christian König).
- Use list_add instead of list_move
(Christian König)
v3:
- Split into two patches, one cleanup, one new functionality
(Christian König)
- use ttm_resource_cursor_fini_locked() instead of open-coding
(Matthew Brost)
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-4-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
To make the transition to using lru hitches easier,
simplify the ttm_resource_manager_next() interface to only take
the cursor and reuse ttm_resource_manager_next() functionality
from ttm_resource_manager_first().
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
To be able to handle list unlocking while traversing the LRU
list, we want the iterators not only to point to the next
position of the list traversal, but to insert themselves as
list nodes at that point to work around the fact that the
next node might otherwise disappear from the list while
the iterator is pointing to it.
These list nodes need to be easily distinguishable from other
list nodes so that others traversing the list can skip
over them.
So declare a struct ttm_lru_item, with a struct list_head member
and a type enum. This will slightly increase the size of a
struct ttm_resource.
Changes in previous series:
- Update enum ttm_lru_item_type documentation.
v3:
- Introduce ttm_lru_first_res_or_null()
(Christian König, Thomas Hellström)
v5:
- Update also the TTM test code (Xe CI).
Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240705153206.68526-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Bos can be put with multiple unrelated dma-resv locks held. But
imported bos attempt to grab the bo dma-resv during dma-buf detach
that typically happens during cleanup. That leads to lockde splats
similar to the below and a potential ABBA deadlock.
Fix this by always taking the delayed workqueue cleanup path for
imported bos.
Requesting stable fixes from when the Xe driver was introduced,
since its usage of drm_exec and wide vm dma_resvs appear to be
the first reliable trigger of this.
[22982.116427] ============================================
[22982.116428] WARNING: possible recursive locking detected
[22982.116429] 6.10.0-rc2+ #10 Tainted: G U W
[22982.116430] --------------------------------------------
[22982.116430] glxgears:sh0/5785 is trying to acquire lock:
[22982.116431] ffff8c2bafa539a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_detach+0x3b/0xf0
[22982.116438]
but task is already holding lock:
[22982.116438] ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec]
[22982.116442]
other info that might help us debug this:
[22982.116442] Possible unsafe locking scenario:
[22982.116443] CPU0
[22982.116444] ----
[22982.116444] lock(reservation_ww_class_mutex);
[22982.116445] lock(reservation_ww_class_mutex);
[22982.116447]
*** DEADLOCK ***
[22982.116447] May be due to missing lock nesting notation
[22982.116448] 5 locks held by glxgears:sh0/5785:
[22982.116449] #0: ffff8c2d9aba58c8 (&xef->vm.lock){+.+.}-{3:3}, at: xe_file_close+0xde/0x1c0 [xe]
[22982.116507] #1: ffff8c2e28cc8480 (&vm->lock){++++}-{3:3}, at: xe_vm_close_and_put+0x161 |