summaryrefslogtreecommitdiff
path: root/Documentation/filesystems
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-07-28 16:09:03 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-07-28 16:09:03 -0700
commitb5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b (patch)
tree6991c8a7a5bb9e73aa209460ad5bd51feba2b50e /Documentation/filesystems
parent0965549d6f5f23e9250cd9c642f4ea5fd682eddb (diff)
parentd5212d819e02313f27c867e6d365e71f1fdaaca4 (diff)
downloadlinux-b5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b.tar.gz
linux-b5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b.tar.bz2
linux-b5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b.zip
Merge tag 'vfs-6.17-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs iomap updates from Christian Brauner: - Refactor the iomap writeback code and split the generic and ioend/bio based writeback code. There are two methods that define the split between the generic writeback code, and the implemementation of it, and all knowledge of ioends and bios now sits below that layer. - Add fuse iomap support for buffered writes and dirty folio writeback. This is needed so that granular uptodate and dirty tracking can be used in fuse when large folios are enabled. This has two big advantages. For writes, instead of the entire folio needing to be read into the page cache, only the relevant portions need to be. For writeback, only the dirty portions need to be written back instead of the entire folio. * tag 'vfs-6.17-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fuse: refactor writeback to use iomap_writepage_ctx inode fuse: hook into iomap for invalidating and checking partial uptodateness fuse: use iomap for folio laundering fuse: use iomap for writeback fuse: use iomap for buffered writes iomap: build the writeback code without CONFIG_BLOCK iomap: add read_folio_range() handler for buffered writes iomap: improve argument passing to iomap_read_folio_sync iomap: replace iomap_folio_ops with iomap_write_ops iomap: export iomap_writeback_folio iomap: move folio_unlock out of iomap_writeback_folio iomap: rename iomap_writepage_map to iomap_writeback_folio iomap: move all ioend handling to ioend.c iomap: add public helpers for uptodate state manipulation iomap: hide ioends from the generic writeback code iomap: refactor the writeback interface iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks iomap: pass more arguments using the iomap writeback context iomap: header diet
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/iomap/design.rst3
-rw-r--r--Documentation/filesystems/iomap/operations.rst57
2 files changed, 28 insertions, 32 deletions
diff --git a/Documentation/filesystems/iomap/design.rst b/Documentation/filesystems/iomap/design.rst
index f2df9b6df988..0f7672676c0b 100644
--- a/Documentation/filesystems/iomap/design.rst
+++ b/Documentation/filesystems/iomap/design.rst
@@ -167,7 +167,6 @@ structure below:
struct dax_device *dax_dev;
void *inline_data;
void *private;
- const struct iomap_folio_ops *folio_ops;
u64 validity_cookie;
};
@@ -292,8 +291,6 @@ The fields are as follows:
<https://lore.kernel.org/all/20180619164137.13720-7-hch@lst.de/>`_.
This value will be passed unchanged to ``->iomap_end``.
- * ``folio_ops`` will be covered in the section on pagecache operations.
-
* ``validity_cookie`` is a magic freshness value set by the filesystem
that should be used to detect stale mappings.
For pagecache operations this is critical for correct operation
diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index 3b628e370d88..067ed8e14ef3 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -57,21 +57,19 @@ The following address space operations can be wrapped easily:
* ``bmap``
* ``swap_activate``
-``struct iomap_folio_ops``
+``struct iomap_write_ops``
--------------------------
-The ``->iomap_begin`` function for pagecache operations may set the
-``struct iomap::folio_ops`` field to an ops structure to override
-default behaviors of iomap:
-
.. code-block:: c
- struct iomap_folio_ops {
+ struct iomap_write_ops {
struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
unsigned len);
void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
struct folio *folio);
bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
+ int (*read_folio_range)(const struct iomap_iter *iter,
+ struct folio *folio, loff_t pos, size_t len);
};
iomap calls these functions:
@@ -127,6 +125,10 @@ iomap calls these functions:
``->iomap_valid``, then the iomap should considered stale and the
validation failed.
+ - ``read_folio_range``: Called to synchronously read in the range that will
+ be written to. If this function is not provided, iomap will default to
+ submitting a bio read request.
+
These ``struct kiocb`` flags are significant for buffered I/O with iomap:
* ``IOCB_NOWAIT``: Turns on ``IOMAP_NOWAIT``.
@@ -271,7 +273,7 @@ writeback.
It does not lock ``i_rwsem`` or ``invalidate_lock``.
The dirty bit will be cleared for all folios run through the
-``->map_blocks`` machinery described below even if the writeback fails.
+``->writeback_range`` machinery described below even if the writeback fails.
This is to prevent dirty folio clots when storage devices fail; an
``-EIO`` is recorded for userspace to collect via ``fsync``.
@@ -283,15 +285,14 @@ The ``ops`` structure must be specified and is as follows:
.. code-block:: c
struct iomap_writeback_ops {
- int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
- loff_t offset, unsigned len);
- int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
- void (*discard_folio)(struct folio *folio, loff_t pos);
+ int (*writeback_range)(struct iomap_writepage_ctx *wpc,
+ struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
+ int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
};
The fields are as follows:
- - ``map_blocks``: Sets ``wpc->iomap`` to the space mapping of the file
+ - ``writeback_range``: Sets ``wpc->iomap`` to the space mapping of the file
range (in bytes) given by ``offset`` and ``len``.
iomap calls this function for each dirty fs block in each dirty folio,
though it will `reuse mappings
@@ -306,27 +307,26 @@ The fields are as follows:
This revalidation must be open-coded by the filesystem; it is
unclear if ``iomap::validity_cookie`` can be reused for this
purpose.
- This function must be supplied by the filesystem.
-
- - ``submit_ioend``: Allows the file systems to hook into writeback bio
- submission.
- This might include pre-write space accounting updates, or installing
- a custom ``->bi_end_io`` function for internal purposes, such as
- deferring the ioend completion to a workqueue to run metadata update
- transactions from process context before submitting the bio.
- This function is optional.
- - ``discard_folio``: iomap calls this function after ``->map_blocks``
- fails to schedule I/O for any part of a dirty folio.
- The function should throw away any reservations that may have been
- made for the write.
+ If this methods fails to schedule I/O for any part of a dirty folio, it
+ should throw away any reservations that may have been made for the write.
The folio will be marked clean and an ``-EIO`` recorded in the
pagecache.
Filesystems can use this callback to `remove
<https://lore.kernel.org/all/20201029163313.1766967-1-bfoster@redhat.com/>`_
delalloc reservations to avoid having delalloc reservations for
clean pagecache.
- This function is optional.
+ This function must be supplied by the filesystem.
+
+ - ``writeback_submit``: Submit the previous built writeback context.
+ Block based file systems should use the iomap_ioend_writeback_submit
+ helper, other file system can implement their own.
+ File systems can optionall to hook into writeback bio submission.
+ This might include pre-write space accounting updates, or installing
+ a custom ``->bi_end_io`` function for internal purposes, such as
+ deferring the ioend completion to a workqueue to run metadata update
+ transactions from process context before submitting the bio.
+ This function must be supplied by the filesystem.
Pagecache Writeback Completion
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -340,10 +340,9 @@ If the write failed, it will also set the error bits on the folios and
the address space.
This can happen in interrupt or process context, depending on the
storage device.
-
Filesystems that need to update internal bookkeeping (e.g. unwritten
-extent conversions) should provide a ``->submit_ioend`` function to
-set ``struct iomap_end::bio::bi_end_io`` to its own function.
+extent conversions) should set their own bi_end_io on the bios
+submitted by ``->submit_writeback``
This function should call ``iomap_finish_ioends`` after finishing its
own work (e.g. unwritten extent conversion).