| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-07-28 16:09:03 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-07-28 16:09:03 -0700 |
| commit | b5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b (patch) | |
| tree | 6991c8a7a5bb9e73aa209460ad5bd51feba2b50e /Documentation/filesystems | |
| parent | 0965549d6f5f23e9250cd9c642f4ea5fd682eddb (diff) | |
| parent | d5212d819e02313f27c867e6d365e71f1fdaaca4 (diff) | |
Merge tag 'vfs-6.17-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs iomap updates from Christian Brauner:
- Refactor the iomap writeback code and split the generic and ioend/bio
based writeback code.
There are two methods that define the split between the generic
writeback code, and the implementation of it, and all knowledge of
ioends and bios now sits below that layer.
- Add fuse iomap support for buffered writes and dirty folio writeback.
This is needed so that granular uptodate and dirty tracking can be
used in fuse when large folios are enabled. This has two big
advantages. For writes, instead of the entire folio needing to be
read into the page cache, only the relevant portions need to be. For
writeback, only the dirty portions need to be written back instead of
the entire folio.
* tag 'vfs-6.17-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fuse: refactor writeback to use iomap_writepage_ctx inode
fuse: hook into iomap for invalidating and checking partial uptodateness
fuse: use iomap for folio laundering
fuse: use iomap for writeback
fuse: use iomap for buffered writes
iomap: build the writeback code without CONFIG_BLOCK
iomap: add read_folio_range() handler for buffered writes
iomap: improve argument passing to iomap_read_folio_sync
iomap: replace iomap_folio_ops with iomap_write_ops
iomap: export iomap_writeback_folio
iomap: move folio_unlock out of iomap_writeback_folio
iomap: rename iomap_writepage_map to iomap_writeback_folio
iomap: move all ioend handling to ioend.c
iomap: add public helpers for uptodate state manipulation
iomap: hide ioends from the generic writeback code
iomap: refactor the writeback interface
iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
iomap: pass more arguments using the iomap writeback context
iomap: header diet
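The fuse item in the pull message says that, with granular dirty tracking, writeback only needs to write the dirty portions of a large folio instead of the whole folio. The following is a minimal userspace sketch of that idea, not the kernel's implementation: `struct folio_state`, `BLOCKS_PER_FOLIO`, `mark_dirty`, `next_dirty_range`, and `dirty_bytes` are all hypothetical names invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-folio state; not a kernel structure. */
#define BLOCKS_PER_FOLIO 16

struct folio_state {
	size_t block_size;            /* bytes per block */
	bool dirty[BLOCKS_PER_FOLIO]; /* one dirty flag per block */
};

/* Mark every block touched by the byte range [pos, pos + len) as dirty. */
static void mark_dirty(struct folio_state *fs, size_t pos, size_t len)
{
	if (len == 0)
		return;

	size_t first = pos / fs->block_size;
	size_t last = (pos + len - 1) / fs->block_size;

	for (size_t i = first; i <= last && i < BLOCKS_PER_FOLIO; i++)
		fs->dirty[i] = true;
}

/*
 * Find the next run of contiguous dirty blocks at or after block *start.
 * On success, stores the run's byte offset in *pos, advances *start past
 * the run, and returns the run's length in bytes. Returns 0 when no dirty
 * blocks remain.
 */
static size_t next_dirty_range(const struct folio_state *fs, size_t *start,
			       size_t *pos)
{
	size_t i = *start;

	while (i < BLOCKS_PER_FOLIO && !fs->dirty[i])
		i++;
	if (i == BLOCKS_PER_FOLIO) {
		*start = i;
		return 0;
	}

	size_t begin = i;

	while (i < BLOCKS_PER_FOLIO && fs->dirty[i])
		i++;
	*start = i;
	*pos = begin * fs->block_size;
	return (i - begin) * fs->block_size;
}

/* Total number of bytes a writeback pass would have to write. */
static size_t dirty_bytes(const struct folio_state *fs)
{
	size_t start = 0, pos = 0, len, total = 0;

	while ((len = next_dirty_range(fs, &start, &pos)) != 0)
		total += len;
	return total;
}
```

With a 4096-byte block size, dirtying a few bytes in block 1, all of block 2, and one byte in block 10 yields two ranges to write (8192 bytes at offset 4096 and 4096 bytes at offset 40960) rather than the full 65536-byte folio.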
Diffstat (limited to 'Documentation/filesystems')
| -rw-r--r-- | Documentation/filesystems/iomap/design.rst | 3 |
| -rw-r--r-- | Documentation/filesystems/iomap/operations.rst | 57 |
2 files changed, 28 insertions, 32 deletions
diff --git a/Documentation/filesystems/iomap/design.rst b/Documentation/filesystems/iomap/design.rst
index f2df9b6df988..0f7672676c0b 100644
--- a/Documentation/filesystems/iomap/design.rst
+++ b/Documentation/filesystems/iomap/design.rst
@@ -167,7 +167,6 @@ structure below:
     struct dax_device *dax_dev;
     void *inline_data;
     void *private;
-    const struct iomap_folio_ops *folio_ops;
     u64 validity_cookie;
  };
 
@@ -292,8 +291,6 @@ The fields are as follows:
    <https://lore.kernel.org/all/20180619164137.13720-7-hch@lst.de/>`_.
    This value will be passed unchanged to ``->iomap_end``.
 
- * ``folio_ops`` will be covered in the section on pagecache operations.
-
  * ``validity_cookie`` is a magic freshness value set by the filesystem
    that should be used to detect stale mappings.
    For pagecache operations this is critical for correct operation
diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index 3b628e370d88..067ed8e14ef3 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -57,21 +57,19 @@ The following address space operations can be wrapped easily:
  * ``bmap``
  * ``swap_activate``
 
-``struct iomap_folio_ops``
+``struct iomap_write_ops``
 --------------------------
 
-The ``->iomap_begin`` function for pagecache operations may set the
-``struct iomap::folio_ops`` field to an ops structure to override
-default behaviors of iomap:
-
 .. code-block:: c
 
- struct iomap_folio_ops {
+ struct iomap_write_ops {
      struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
                                 unsigned len);
      void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
                        struct folio *folio);
      bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
+     int (*read_folio_range)(const struct iomap_iter *iter,
+                        struct folio *folio, loff_t pos, size_t len);
  };
 
 iomap calls these functions:
@@ -127,6 +125,10 @@ iomap calls these functions:
     ``->iomap_valid``, then the iomap should considered stale and the
     validation failed.
 
+  - ``read_folio_range``: Called to synchronously read in the range that will
+    be written to. If this function is not provided, iomap will default to
+    submitting a bio read request.
+
 These ``struct kiocb`` flags are significant for buffered I/O with iomap:
 
  * ``IOCB_NOWAIT``: Turns on ``IOMAP_NOWAIT``.
@@ -271,7 +273,7 @@ writeback.
 It does not lock ``i_rwsem`` or ``invalidate_lock``.
 
 The dirty bit will be cleared for all folios run through the
-``->map_blocks`` machinery described below even if the writeback fails.
+``->writeback_range`` machinery described below even if the writeback fails.
 This is to prevent dirty folio clots when storage devices fail; an
 ``-EIO`` is recorded for userspace to collect via ``fsync``.
 
@@ -283,15 +285,14 @@ The ``ops`` structure must be specified and is as follows:
 .. code-block:: c
 
  struct iomap_writeback_ops {
-     int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
-                       loff_t offset, unsigned len);
-     int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
-     void (*discard_folio)(struct folio *folio, loff_t pos);
+     int (*writeback_range)(struct iomap_writepage_ctx *wpc,
+                struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
+     int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
  };
 
 The fields are as follows:
 
-  - ``map_blocks``: Sets ``wpc->iomap`` to the space mapping of the file
+  - ``writeback_range``: Sets ``wpc->iomap`` to the space mapping of the file
     range (in bytes) given by ``offset`` and ``len``.
     iomap calls this function for each dirty fs block in each dirty folio,
     though it will `reuse mappings
@@ -306,27 +307,26 @@ The fields are as follows:
     This revalidation must be open-coded by the filesystem; it is
     unclear if ``iomap::validity_cookie`` can be reused for this
     purpose.
-    This function must be supplied by the filesystem.
-
-  - ``submit_ioend``: Allows the file systems to hook into writeback bio
-    submission.
-    This might include pre-write space accounting updates, or installing
-    a custom ``->bi_end_io`` function for internal purposes, such as
-    deferring the ioend completion to a workqueue to run metadata update
-    transactions from process context before submitting the bio.
-    This function is optional.
 
-  - ``discard_folio``: iomap calls this function after ``->map_blocks``
-    fails to schedule I/O for any part of a dirty folio.
-    The function should throw away any reservations that may have been
-    made for the write.
+    If this methods fails to schedule I/O for any part of a dirty folio, it
+    should throw away any reservations that may have been made for the write.
     The folio will be marked clean and an ``-EIO`` recorded in the
     pagecache.
     Filesystems can use this callback to `remove
     <https://lore.kernel.org/all/20201029163313.1766967-1-bfoster@redhat.com/>`_
     delalloc reservations to avoid having delalloc reservations for
     clean pagecache.
-    This function is optional.
+    This function must be supplied by the filesystem.
+
+  - ``writeback_submit``: Submit the previous built writeback context.
+    Block based file systems should use the iomap_ioend_writeback_submit
+    helper, other file system can implement their own.
+    File systems can optionall to hook into writeback bio submission.
+    This might include pre-write space accounting updates, or installing
+    a custom ``->bi_end_io`` function for internal purposes, such as
+    deferring the ioend completion to a workqueue to run metadata update
+    transactions from process context before submitting the bio.
+    This function must be supplied by the filesystem.
 
 Pagecache Writeback Completion
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -340,10 +340,9 @@ If the write failed, it will also set the error bits on the folios and
 the address space.
 This can happen in interrupt or process context, depending on the
 storage device.
-
 Filesystems that need to update internal bookkeeping (e.g. unwritten
-extent conversions) should provide a ``->submit_ioend`` function to
-set ``struct iomap_end::bio::bi_end_io`` to its own function.
+extent conversions) should set their own bi_end_io on the bios
+submitted by ``->submit_writeback``
 This function should call ``iomap_finish_ioends`` after finishing its
 own work (e.g. unwritten extent conversion).
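The documentation change above describes ``read_folio_range`` as an optional hook: when a filesystem does not provide it, iomap falls back to submitting a bio read. The sketch below illustrates only that optional-callback-with-default pattern in plain userspace C. Every name in it (`struct mock_folio`, `struct mock_write_ops`, `default_bio_read`, `fs_read_range`, `read_range`) is a hypothetical stand-in, and the signatures are simplified; it is not the kernel's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct mock_folio {
	char data[64];
};

struct mock_write_ops {
	/* Optional hook: fill [pos, pos + len); return 0 or a negative error. */
	int (*read_folio_range)(struct mock_folio *folio, size_t pos,
				size_t len);
};

/* Stand-in for the default path (a bio read in the real code). */
static int default_bio_read(struct mock_folio *folio, size_t pos, size_t len)
{
	memset(folio->data + pos, 'B', len);
	return 0;
}

/* Example of a handler that a filesystem might supply instead. */
static int fs_read_range(struct mock_folio *folio, size_t pos, size_t len)
{
	memset(folio->data + pos, 'F', len);
	return 0;
}

/* Use the filesystem's handler when present, else the default path. */
static int read_range(const struct mock_write_ops *ops,
		      struct mock_folio *folio, size_t pos, size_t len)
{
	/* Reject ranges that fall outside the mock folio. */
	if (pos > sizeof(folio->data) || len > sizeof(folio->data) - pos)
		return -1;

	if (ops && ops->read_folio_range)
		return ops->read_folio_range(folio, pos, len);
	return default_bio_read(folio, pos, len);
}
```

Calling `read_range` with an ops structure whose hook is `NULL` takes the default path, while one that sets `.read_folio_range = fs_read_range` routes the read through the filesystem's handler, which is what lets a filesystem without a block device supply its own read.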
