<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/block/blk-sysfs.c, branch v5.12.5</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>block: fix potential IO hang when turning off io_poll</title>
<updated>2021-02-22T13:40:02+00:00</updated>
<author>
<name>Jeffle Xu</name>
<email>jefflexu@linux.alibaba.com</email>
</author>
<published>2021-02-22T06:54:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6b09b4d33bd964f49d07d3cabfb4204d58cf9811'/>
<id>6b09b4d33bd964f49d07d3cabfb4204d58cf9811</id>
<content type='text'>
QUEUE_FLAG_POLL flag will be cleared when turning off 'io_poll', while
at that moment there may be IOs stuck in hw queue uncompleted. The
following polling routine won't help reap these IOs, since blk_poll()
will return immediately because of cleared QUEUE_FLAG_POLL flag. Thus
these IOs will hang until they finnaly time out. The hang out can be
observed by 'fio --engine=io_uring iodepth=1', while turning off
'io_poll' at the same time.

To fix this, freeze and flush the request queue first when turning off
'io_poll'.

Signed-off-by: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
QUEUE_FLAG_POLL flag will be cleared when turning off 'io_poll', while
at that moment there may be IOs stuck in hw queue uncompleted. The
following polling routine won't help reap these IOs, since blk_poll()
will return immediately because of cleared QUEUE_FLAG_POLL flag. Thus
these IOs will hang until they finnaly time out. The hang out can be
observed by 'fio --engine=io_uring iodepth=1', while turning off
'io_poll' at the same time.

To fix this, freeze and flush the request queue first when turning off
'io_poll'.

Signed-off-by: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: introduce zone_write_granularity limit</title>
<updated>2021-02-10T14:44:40+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2021-01-28T04:47:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=a805a4fa4fa376bbc145762bb8b09caa2fa8af48'/>
<id>a805a4fa4fa376bbc145762bb8b09caa2fa8af48</id>
<content type='text'>
Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that
all writes into sequential write required zones be aligned to the device
physical block size. However, NVMe ZNS does not have this constraint and
allows write operations into sequential zones to be aligned to the
device logical block size. This inconsistency does not help with
software portability across device types.

To solve this, introduce the zone_write_granularity queue limit to
indicate the alignment constraint, in bytes, of write operations into
zones of a zoned block device. This new limit is exported as a
read-only sysfs queue attribute and the helper
blk_queue_zone_write_granularity() introduced for drivers to set this
limit.

The function blk_queue_set_zoned() is modified to set this new limit to
the device logical block size by default. NVMe ZNS devices as well as
zoned nullb devices use this default value as is. The scsi disk driver
is modified to execute the blk_queue_zone_write_granularity() helper to
set the zone write granularity of host-managed SMR disks to the disk
physical block size.

The accessor functions queue_zone_write_granularity() and
bdev_zone_write_granularity() are also introduced.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@edc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that
all writes into sequential write required zones be aligned to the device
physical block size. However, NVMe ZNS does not have this constraint and
allows write operations into sequential zones to be aligned to the
device logical block size. This inconsistency does not help with
software portability across device types.

To solve this, introduce the zone_write_granularity queue limit to
indicate the alignment constraint, in bytes, of write operations into
zones of a zoned block device. This new limit is exported as a
read-only sysfs queue attribute and the helper
blk_queue_zone_write_granularity() introduced for drivers to set this
limit.

The function blk_queue_set_zoned() is modified to set this new limit to
the device logical block size by default. NVMe ZNS devices as well as
zoned nullb devices use this default value as is. The scsi disk driver
is modified to execute the blk_queue_zone_write_granularity() helper to
set the zone write granularity of host-managed SMR disks to the disk
physical block size.

The accessor functions queue_zone_write_granularity() and
bdev_zone_write_granularity() are also introduced.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@edc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: move cancel of hctx-&gt;run_work to the front of blk_exit_queue</title>
<updated>2020-10-09T18:46:28+00:00</updated>
<author>
<name>Yang Yang</name>
<email>yang.yang@vivo.com</email>
</author>
<published>2020-10-09T08:00:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=47ce030b7ac5a5259a9a5919f230b52497afc31a'/>
<id>47ce030b7ac5a5259a9a5919f230b52497afc31a</id>
<content type='text'>
blk_exit_queue will free elevator_data, while blk_mq_run_work_fn
will access it. Move cancel of hctx-&gt;run_work to the front of
blk_exit_queue to avoid use-after-free.

Fixes: 1b97871b501f ("blk-mq: move cancel of hctx-&gt;run_work into blk_mq_hw_sysfs_release")
Signed-off-by: Yang Yang &lt;yang.yang@vivo.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
blk_exit_queue will free elevator_data, while blk_mq_run_work_fn
will access it. Move cancel of hctx-&gt;run_work to the front of
blk_exit_queue to avoid use-after-free.

Fixes: 1b97871b501f ("blk-mq: move cancel of hctx-&gt;run_work into blk_mq_hw_sysfs_release")
Signed-off-by: Yang Yang &lt;yang.yang@vivo.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: get rid of unnecessary local variable</title>
<updated>2020-10-09T18:34:06+00:00</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2020-10-09T03:26:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=0546858c59fb90122ca8c820e6aa221925a13bd8'/>
<id>0546858c59fb90122ca8c820e6aa221925a13bd8</id>
<content type='text'>
Since whole elevator register is protectd by sysfs_lock, we
don't need extras 'has_elevator'. Just use q-&gt;elevator directly.

Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since whole elevator register is protectd by sysfs_lock, we
don't need extras 'has_elevator'. Just use q-&gt;elevator directly.

Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: invoke blk_mq_exit_sched no matter whether have .exit_sched</title>
<updated>2020-10-09T18:34:06+00:00</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2020-10-09T03:26:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=dd1c372d656600234a1704f456f9977ca684865c'/>
<id>dd1c372d656600234a1704f456f9977ca684865c</id>
<content type='text'>
We will register debugfs for scheduler no matter whether it have
defined callback funciton .exit_sched. So, blk_mq_exit_sched()
is always needed to unregister debugfs. Also, q-&gt;elevator should
be set as NULL after exiting scheduler.

For now, since all register scheduler have defined .exit_sched,
it will not cause any actual problem. But It will be more reasonable
to do this change.

Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We will register debugfs for scheduler no matter whether it have
defined callback funciton .exit_sched. So, blk_mq_exit_sched()
is always needed to unregister debugfs. Also, q-&gt;elevator should
be set as NULL after exiting scheduler.

For now, since all register scheduler have defined .exit_sched,
it will not cause any actual problem. But It will be more reasonable
to do this change.

Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag</title>
<updated>2020-09-24T19:43:39+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-24T06:51:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1cb039f3dc1619eb795c54aad0a98fdb379b4237'/>
<id>1cb039f3dc1619eb795c54aad0a98fdb379b4237</id>
<content type='text'>
The BDI_CAP_STABLE_WRITES is one of the few bits of information in the
backing_dev_info shared between the block drivers and the writeback code.
To help untangling the dependency replace it with a queue flag and a
superblock flag derived from it.  This also helps with the case of e.g.
a file system requiring stable writes due to its own checksumming, but
not forcing it on other users of the block device like the swap code.

One downside is that we an't support the stable_pages_required bdi
attribute in sysfs anymore.  It is replaced with a queue attribute which
also is writable for easier testing.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The BDI_CAP_STABLE_WRITES is one of the few bits of information in the
backing_dev_info shared between the block drivers and the writeback code.
To help untangling the dependency replace it with a queue flag and a
superblock flag derived from it.  This also helps with the case of e.g.
a file system requiring stable writes due to its own checksumming, but
not forcing it on other users of the block device like the swap code.

One downside is that we an't support the stable_pages_required bdi
attribute in sysfs anymore.  It is replaced with a queue attribute which
also is writable for easier testing.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: lift setting the readahead size into the block layer</title>
<updated>2020-09-24T19:43:39+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-24T06:51:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c2e4cd57cfa1f627b786c764d185fff85fd12be9'/>
<id>c2e4cd57cfa1f627b786c764d185fff85fd12be9</id>
<content type='text'>
Drivers shouldn't really mess with the readahead size, as that is a VM
concept.  Instead set it based on the optimal I/O size by lifting the
algorithm from the md driver when registering the disk.  Also set
bdi-&gt;io_pages there as well by applying the same scheme based on
max_sectors.  To ensure the limits work well for stacking drivers a
new helper is added to update the readahead limits from the block
limits, which is also called from disk_stack_limits.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Drivers shouldn't really mess with the readahead size, as that is a VM
concept.  Instead set it based on the optimal I/O size by lifting the
algorithm from the md driver when registering the disk.  Also set
bdi-&gt;io_pages there as well by applying the same scheme based on
max_sectors.  To ensure the limits work well for stacking drivers a
new helper is added to update the readahead limits from the block
limits, which is also called from disk_stack_limits.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: make QUEUE_SYSFS_BIT_FNS more useful</title>
<updated>2020-09-08T15:01:10+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-03T06:07:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=fc93fe145316dad6186d8eb4b4f49f68b1cba0c3'/>
<id>fc93fe145316dad6186d8eb4b4f49f68b1cba0c3</id>
<content type='text'>
Switch to the naming used by the other entries so that we can use the
QUEUE_RW_ENTRY helper.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Switch to the naming used by the other entries so that we can use the
QUEUE_RW_ENTRY helper.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: add helper macros for queue sysfs entries</title>
<updated>2020-09-08T15:01:10+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-03T06:07:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=3562614705c6f2324f79a959a2cf1ae24d142578'/>
<id>3562614705c6f2324f79a959a2cf1ae24d142578</id>
<content type='text'>
Add two helpers macros to avoid boilerplate code for the queue sysfs
entries.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add two helpers macros to avoid boilerplate code for the queue sysfs
entries.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: add max_active_zones to blk-sysfs</title>
<updated>2020-07-15T20:26:11+00:00</updated>
<author>
<name>Niklas Cassel</name>
<email>niklas.cassel@wdc.com</email>
</author>
<published>2020-07-14T21:18:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=659bf827ba8f1183b714341d8a1d4b1e446178d9'/>
<id>659bf827ba8f1183b714341d8a1d4b1e446178d9</id>
<content type='text'>
Add a new max_active zones definition in the sysfs documentation.
This definition will be common for all devices utilizing the zoned block
device support in the kernel.

Export max_active_zones according to this new definition for NVMe Zoned
Namespace devices, ZAC ATA devices (which are treated as SCSI devices by
the kernel), and ZBC SCSI devices.

Add the new max_active_zones member to struct request_queue, rather
than as a queue limit, since this property cannot be split across stacking
drivers.

For SCSI devices, even though max active zones is not part of the ZBC/ZAC
spec, export max_active_zones as 0, signifying "no limit".

Signed-off-by: Niklas Cassel &lt;niklas.cassel@wdc.com&gt;
Reviewed-by: Javier González &lt;javier@javigon.com&gt;
Reviewed-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new max_active zones definition in the sysfs documentation.
This definition will be common for all devices utilizing the zoned block
device support in the kernel.

Export max_active_zones according to this new definition for NVMe Zoned
Namespace devices, ZAC ATA devices (which are treated as SCSI devices by
the kernel), and ZBC SCSI devices.

Add the new max_active_zones member to struct request_queue, rather
than as a queue limit, since this property cannot be split across stacking
drivers.

For SCSI devices, even though max active zones is not part of the ZBC/ZAC
spec, export max_active_zones as 0, signifying "no limit".

Signed-off-by: Niklas Cassel &lt;niklas.cassel@wdc.com&gt;
Reviewed-by: Javier González &lt;javier@javigon.com&gt;
Reviewed-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
