| Age | Commit message (Collapse) | Author | Files | Lines |
|
This reverts commit f92a3b0d003b9f7eb1f452598966a08802183f47, which
was commit 083a6a50783ef54256eec3499e6575237e0e3d53 upstream. In 4.19
there is still an option to use 32-bit sector_t on 32-bit
architectures, so we need to keep checking for truncation.
Since loop_set_status() was refactored by subsequent patches, this
reintroduces its truncation check in loop_set_status_from_info()
instead.
I tested that the loop ioctl operations have the expected behaviour on
x86_64, x86_32 with CONFIG_LBDAF=y, and (the special case) x86_32 with
CONFIG_LBDAF=n.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9f6ad5d533d1c71e51bdd06a5712c4fbc8768dfa ]
In loop_set_status_from_info(), lo->lo_offset and lo->lo_sizelimit should
be checked before reassignment, because if an overflow error occurs, the
original correct value will be changed to the wrong value, and it will not
be changed back.
More, the original patch did not solve the problem, the value was set and
ioctl returned an error, but the subsequent io used the value in the loop
driver, which still caused an alarm:
loop_handle_cmd
do_req_filebacked
loff_t pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
lo_rw_aio
cmd->iocb.ki_pos = pos
Fixes: c490a0b5a4f3 ("loop: Check for overflow while configuring loop")
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20230221095027.3656193-1-zhongjinghua@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c490a0b5a4f36da3918181a8acdc6991d967c5f3 ]
The userspace can configure a loop using an ioctl call, wherein
a configuration of type loop_config is passed (see lo_ioctl()'s
case on line 1550 of drivers/block/loop.c). This proceeds to call
loop_configure() which in turn calls loop_set_status_from_info()
(see line 1050 of loop.c), passing &config->info which is of type
loop_info64*. This function then sets the appropriate values, like
the offset.
loop_device has lo_offset of type loff_t (see line 52 of loop.c),
which is typdef-chained to long long, whereas loop_info64 has
lo_offset of type __u64 (see line 56 of include/uapi/linux/loop.h).
The function directly copies offset from info to the device as
follows (See line 980 of loop.c):
lo->lo_offset = info->lo_offset;
This results in an overflow, which triggers a warning in iomap_iter()
due to a call to iomap_iter_done() which has:
WARN_ON_ONCE(iter->iomap.offset > iter->pos);
Thus, check for negative value during loop_set_status_from_info().
Bug report: https://syzkaller.appspot.com/bug?id=c620fe14aac810396d3c3edc9ad73848bf69a29e
Reported-and-tested-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220823160810.181275-1-code@siddh.me
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0c3796c244598122a5d59d56f30d19390096817f ]
Factor out this code into a separate function, so it can be reused by
other code more easily.
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 56a85fd8376ef32458efb6ea97a820754e12f6bb ]
The loop driver always declares the rotational flag of its device as
rotational, even when the device of the mapped file is nonrotational,
as is the case with SSDs or on tmpfs. This can confuse filesystem tools
which are SSD-aware; in my case I frequently forget to tell mkfs.btrfs
that my loop device on tmpfs is nonrotational, and that I really don't
need any automatic metadata redundancy.
The attached patch fixes this by introspecting the rotational flag of the
mapped file's underlying block device, if it exists. If the mapped file's
filesystem has no associated block device - as is the case on e.g. tmpfs -
we assume nonrotational storage. If there is a better way to identify such
non-devices I'd love to hear them.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: holger@applied-asynchrony.com
Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Benjamin Gordon <bmgordon@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b0bd158dd630bd47640e0e418c062cda1e0da5ad ]
figure_loop_size() calculates the loop size based on the passed in
parameters, but at the same time it updates the offset and sizelimit
parameters in the loop device configuration. That is a somewhat
unexpected side effect of a function with this name, and it is only only
needed by one of the two callers of this function - loop_set_status().
Move the lo_offset and lo_sizelimit assignment back into loop_set_status(),
and use the newly factored out functions to validate and apply the newly
calculated size. This allows us to get rid of figure_loop_size() in a
follow-up commit.
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5795b6f5607f7e4db62ddea144727780cb351a9b ]
This code is used repeatedly.
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 083a6a50783ef54256eec3499e6575237e0e3d53 ]
sector_t is now always u64, so we don't need to check for truncation.
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7c5014b0987a30e4989c90633c198aced454c0ec ]
loop_set_status() calls loop_config_discard() to configure discard for
the loop device; however, the discard configuration depends on whether
the loop device uses encryption, and when we call it the encryption
configuration has not been updated yet. Move the call down so we apply
the correct discard configuration based on the new configuration.
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 2035c770bfdbcc82bd52e05871a7c82db9529e0f.
This patch lost a unlock loop_ctl_mutex in loop_get_status(...),
which caused syzbot to report a UAF issue.The upstream patch does not
have this issue. Therefore, we revert this patch and directly apply
the upstream patch later on.
Risk use-after-free as reported by syzbot:
[ 174.437352] BUG: KASAN: use-after-free in __mutex_lock.isra.10+0xbc4/0xc30
[ 174.437772] Read of size 4 at addr ffff8880bac49ab8 by task syz-executor.0/13897
[ 174.438205]
[ 174.438306] CPU: 1 PID: 13897 Comm: syz-executor.0 Not tainted 4.19.306 #1
[ 174.438712] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1kylin1 04/01/2014
[ 174.439236] Call Trace:
[ 174.439392] dump_stack+0x94/0xc7
[ 174.439596] ? __mutex_lock.isra.10+0xbc4/0xc30
[ 174.439881] print_address_description+0x60/0x229
[ 174.440165] ? __mutex_lock.isra.10+0xbc4/0xc30
[ 174.440436] kasan_report.cold.6+0x241/0x2fd
[ 174.440696] __mutex_lock.isra.10+0xbc4/0xc30
[ 174.440959] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[ 174.441272] ? mutex_trylock+0xa0/0xa0
[ 174.441500] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[ 174.441816] ? kobject_get_unless_zero+0x129/0x1c0
[ 174.442106] ? kset_unregister+0x30/0x30
[ 174.442351] ? find_symbol_in_section+0x310/0x310
[ 174.442634] ? __mutex_lock_slowpath+0x10/0x10
[ 174.442901] mutex_lock_killable+0xb0/0xf0
[ 174.443149] ? __mutex_lock_killable_slowpath+0x10/0x10
[ 174.443465] ? __mutex_lock_slowpath+0x10/0x10
[ 174.443732] ? _cond_resched+0x10/0x20
[ 174.443966] ? kobject_get+0x54/0xa0
[ 174.444190] lo_open+0x16/0xc0
[ 174.444382] __blkdev_get+0x273/0x10f0
[ 174.444612] ? lo_fallocate.isra.20+0x150/0x150
[ 174.444886] ? bdev_disk_changed+0x190/0x190
[ 174.445146] ? path_init+0x1030/0x1030
[ 174.445371] ? do_syscall_64+0x9a/0x2d0
[ 174.445608] ? deref_stack_reg+0xab/0xe0
[ 174.445852] blkdev_get+0x97/0x880
[ 174.446061] ? walk_component+0x297/0xdc0
[ 174.446303] ? __blkdev_get+0x10f0/0x10f0
[ 174.446547] ? __fsnotify_inode_delete+0x20/0x20
[ 174.446822] blkdev_open+0x1bd/0x240
[ 174.447040] do_dentry_open+0x448/0xf80
[ 174.447274] ? blkdev_get_by_dev+0x60/0x60
[ 174.447522] ? __x64_sys_fchdir+0x1a0/0x1a0
[ 174.447775] ? inode_permission+0x86/0x320
[ 174.448022] path_openat+0xa83/0x3ed0
[ 174.448248] ? path_mountpoint+0xb50/0xb50
[ 174.448495] ? kasan_kmalloc+0xbf/0xe0
[ 174.448723] ? kmem_cache_alloc+0xbc/0x1b0
[ 174.448971] ? getname_flags+0xc4/0x560
[ 174.449203] ? do_sys_open+0x1ce/0x3f0
[ 174.449432] ? do_syscall_64+0x9a/0x2d0
[ 174.449706] ? entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[ 174.450022] ? __d_alloc+0x2a/0xa50
[ 174.450232] ? kasan_unpoison_shadow+0x30/0x40
[ 174.450510] ? should_fail+0x117/0x6c0
[ 174.450737] ? timespec64_trunc+0xc1/0x150
[ 174.450986] ? inode_init_owner+0x2e0/0x2e0
[ 174.451237] ? timespec64_trunc+0xc1/0x150
[ 174.451484] ? inode_init_owner+0x2e0/0x2e0
[ 174.451736] do_filp_open+0x197/0x270
[ 174.451959] ? may_open_dev+0xd0/0xd0
[ 174.452182] ? kasan_unpoison_shadow+0x30/0x40
[ 174.452448] ? kasan_kmalloc+0xbf/0xe0
[ 174.452672] ? __alloc_fd+0x1a3/0x4b0
[ 174.452895] do_sys_open+0x2c7/0x3f0
[ 174.453114] ? filp_open+0x60/0x60
[ 174.453320] do_syscall_64+0x9a/0x2d0
[ 174.453541] ? prepare_exit_to_usermode+0xf3/0x170
[ 174.453832] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[ 174.454136] RIP: 0033:0x41edee
[ 174.454321] Code: 25 00 00 41 00 3d 00 00 41 00 74 48 48 c7 c0 a4 af 0b 01 8b 00 85 c0 75 69 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a6 00 00 00 48 8b 4c 24 28 64 48 33 0c5
[ 174.455404] RSP: 002b:00007ffd2501fbd0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[ 174.455854] RAX: ffffffffffffffda RBX: 00007ffd2501fc90 RCX: 000000000041edee
[ 174.456273] RDX: 0000000000000002 RSI: 00007ffd2501fcd0 RDI: 00000000ffffff9c
[ 174.456698] RBP: 0000000000000003 R08: 0000000000000001 R09: 00007ffd2501f9a7
[ 174.457116] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[ 174.457535] R13: 0000000000565e48 R14: 00007ffd2501fcd0 R15: 0000000000400510
[ 174.457955]
[ 174.458052] Allocated by task 945:
[ 174.458261] kasan_kmalloc+0xbf/0xe0
[ 174.458478] kmem_cache_alloc_node+0xb4/0x1d0
[ 174.458743] copy_process.part.57+0x14b0/0x7010
[ 174.459017] _do_fork+0x197/0x980
[ 174.459218] kernel_thread+0x2f/0x40
[ 174.459438] call_usermodehelper_exec_work+0xa8/0x240
[ 174.459742] process_one_work+0x933/0x13b0
[ 174.459986] worker_thread+0x8c/0x1000
[ 174.460212] kthread+0x343/0x410
[ 174.460408] ret_from_fork+0x35/0x40
[ 174.460621]
[ 174.460716] Freed by task 22902:
[ 174.460913] __kasan_slab_free+0x125/0x170
[ 174.461159] kmem_cache_free+0x6e/0x1b0
[ 174.461391] __put_task_struct+0x1c4/0x440
[ 174.461636] delayed_put_task_struct+0x135/0x170
[ 174.461915] rcu_process_callbacks+0x578/0x15c0
[ 174.462184] __do_softirq+0x175/0x60e
[ 174.462403]
[ 174.462501] The buggy address belongs to the object at ffff8880bac49a80
[ 174.462501] which belongs to the cache task_struct of size 3264
[ 174.463235] The buggy address is located 56 bytes inside of
[ 174.463235] 3264-byte region [ffff8880bac49a80, ffff8880bac4a740)
[ 174.463923] The buggy address belongs to the page:
[ 174.464210] page:ffffea0002eb1200 count:1 mapcount:0 mapping:ffff888188ca0a00 index:0x0 compound_mapcount: 0
[ 174.464784] flags: 0x100000000008100(slab|head)
[ 174.465079] raw: 0100000000008100 ffffea0002eaa400 0000000400000004 ffff888188ca0a00
[ 174.465533] raw: 0000000000000000 0000000000090009 00000001ffffffff 0000000000000000
[ 174.465988] page dumped because: kasan: bad access detected
[ 174.466321]
[ 174.466322] Memory state around the buggy address:
[ 174.466325] ffff8880bac49980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 174.466327] ffff8880bac49a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 174.466329] >ffff8880bac49a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 174.466329] ^
[ 174.466331] ffff8880bac49b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 174.466333] ffff8880bac49b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 174.466333] ==================================================================
[ 174.466338] Disabling lock debugging due to kernel taint
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2112f5c1330a671fa852051d85cb9eadc05d7eb7 upstream.
We noticed that the user interface of Android devices becomes very slow
under memory pressure. This is because Android uses the zram driver on top
of the loop driver for swapping, because under memory pressure the swap
code alternates reads and writes quickly, because mq-deadline is the
default scheduler for loop devices and because mq-deadline delays writes by
five seconds for such a workload with default settings. Fix this by making
the kernel select I/O scheduler 'none' from inside add_disk() for loop
devices. This default can be overridden at any time from user space,
e.g. via a udev rule. This approach has an advantage compared to changing
the I/O scheduler from userspace from 'mq-deadline' into 'none', namely
that synchronize_rcu() does not get called.
This patch changes the default I/O scheduler for loop devices from
'mq-deadline' into 'none'.
Additionally, this patch reduces the Android boot time on my test setup
with 0.5 seconds compared to configuring the loop I/O scheduler from user
space.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Martijn Coenen <maco@android.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210805174200.3250718-3-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c490a0b5a4f36da3918181a8acdc6991d967c5f3 upstream.
The userspace can configure a loop using an ioctl call, wherein
a configuration of type loop_config is passed (see lo_ioctl()'s
case on line 1550 of drivers/block/loop.c). This proceeds to call
loop_configure() which in turn calls loop_set_status_from_info()
(see line 1050 of loop.c), passing &config->info which is of type
loop_info64*. This function then sets the appropriate values, like
the offset.
loop_device has lo_offset of type loff_t (see line 52 of loop.c),
which is typdef-chained to long long, whereas loop_info64 has
lo_offset of type __u64 (see line 56 of include/uapi/linux/loop.h).
The function directly copies offset from info to the device as
follows (See line 980 of loop.c):
lo->lo_offset = info->lo_offset;
This results in an overflow, which triggers a warning in iomap_iter()
due to a call to iomap_iter_done() which has:
WARN_ON_ONCE(iter->iomap.offset > iter->pos);
Thus, check for negative value during loop_set_status_from_info().
Bug report: https://syzkaller.appspot.com/bug?id=c620fe14aac810396d3c3edc9ad73848bf69a29e
Reported-and-tested-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220823160810.181275-1-code@siddh.me
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b27824d31f09ea7b4a6ba2c1b18bd328df3e8bed ]
sprintf does not know the PAGE_SIZE maximum of the temporary buffer
used for outputting sysfs content and it's possible to overrun the
PAGE_SIZE buffer length.
Use a generic sysfs_emit function that knows the size of the
temporary buffer and ensures that no overrun is done for offset
attribute in
loop_attr_[offset|sizelimit|autoclear|partscan|dio]_show() callbacks.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Link: https://lore.kernel.org/r/20220215213310.7264-2-kch@nvidia.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit bcb21c8cc9947286211327d663ace69f07d37a76 upstream.
In case of block device backend, if the backend supports write zeros, the
loop device will set queue flag of QUEUE_FLAG_DISCARD. However,
limits.discard_granularity isn't setup, and this way is wrong,
see the following description in Documentation/ABI/testing/sysfs-block:
A discard_granularity of 0 means that the device does not support
discard functionality.
Especially 9b15d109a6b2 ("block: improve discard bio alignment in
__blkdev_issue_discard()") starts to take q->limits.discard_granularity
for computing max discard sectors. And zero discard granularity may cause
kernel oops, or fail discard request even though the loop queue claims
discard support via QUEUE_FLAG_DISCARD.
Fix the issue by setup discard granularity and alignment.
Fixes: c52abf563049 ("loop: Better discard support for block devices")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Coly Li <colyli@suse.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Xiao Ni <xni@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Evan Green <evgreen@chromium.org>
Cc: Gwendal Grignou <gwendal@chromium.org>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 200f93377220504c5e56754823e7adfea6037f1a ]
Be pedantic on removal as well and hold the mutex.
This should prevent uses of addition while we exit.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit f4bd34b139a3fa2808c4205f12714c65e1548c6c upstream.
When a filesystem is mounted on a loop device and on a loop ioctl
LOOP_SET_STATUS64, because of kill_bdev, buffer_head mappings are getting
destroyed.
kill_bdev
truncate_inode_pages
truncate_inode_pages_range
do_invalidatepage
block_invalidatepage
discard_buffer -->clear BH_Mapped flag
sb_bread
__bread_gfp
bh = __getblk_gfp
-->discard_buffer clear BH_Mapped flag
__bread_slow
submit_bh
submit_bh_wbc
BUG_ON(!buffer_mapped(bh)) --> hit this BUG_ON
Fixes: 5db470e229e2 ("loop: drop caches if offset or block_size are changed")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c52abf563049e787c1341cdf15c7dbe1bfbc951b ]
If the backing device for a loop device is itself a block device,
then mirror the "write zeroes" capabilities of the underlying
block device into the loop device. Copy this capability into both
max_write_zeroes_sectors and max_discard_sectors of the loop device.
The reason for this is that REQ_OP_DISCARD on a loop device translates
into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
presents a consistent interface for loop devices (that discarded data
is zeroed), regardless of the backing device type of the loop device.
There should be no behavior change for loop devices backed by regular
files.
This change fixes blktest block/003, and removes an extraneous
error print in block/013 when testing on a loop device backed
by a block device that does not support discard.
Signed-off-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
[used updated version of Evan's comment in loop_config_discard()]
[moved backingq to local scope, removed redundant braces]
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit efcfec579f6139528c9e6925eca2bc4a36da65c6 ]
Currently, if the loop device receives a WRITE_ZEROES request, it asks
the underlying filesystem to punch out the range. This behavior is
correct if unmapping is allowed. However, a NOUNMAP request means that
the caller doesn't want us to free the storage backing the range, so
punching out the range is incorrect behavior.
To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
the fallocate documentation) required to ensure that the entire range is
backed by real storage, which suffices for our purposes.
Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fdbe4eeeb1aac219b14f10c0ed31ae5d1123e9b8 ]
Enabling Direct I/O with loop devices helps reducing memory usage by
avoiding double caching. 32 bit applications running on 64 bits systems
are currently not able to request direct I/O because is missing from the
lo_compat_ioctl.
This patch fixes the compatibility issue mentioned above by exporting
LOOP_SET_DIRECT_IO as additional lo_compat_ioctl() entry.
The input argument for this ioctl is a single long converted to a 1-bit
boolean, so compatibility is preserved.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit d0a255e795ab976481565f6ac178314b34fbf891 upstream.
A deadlock with this stacktrace was observed.
The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.
In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.
PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0"
#0 [ffff8813dedfb938] __schedule at ffffffff8173f405
#1 [ffff8813dedfb990] schedule at ffffffff8173fa27
#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
#13 [ffff8813dedfbec0] kthread at ffffffff810a8428
#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
PID: 14127 TASK: ffff881455749c00 CPU: 11 COMMAND: "loop1"
#0 [ffff88272f5af228] __schedule at ffffffff8173f405
#1 [ffff88272f5af280] schedule at ffffffff8173fa27
#2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
#3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
#4 [ffff88272f5af330] mutex_lock at ffffffff81742133
#5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
#6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
#7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
#8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
#9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
#10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
#11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
#12 [ffff88272f5af760] new_slab at ffffffff811f4523
#13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
#14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
#15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
#16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
#17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
#18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
#19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
#20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
#21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
#22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
#23 [ffff88272f5afec0] kthread at ffffffff810a8428
#24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 40853d6fc619a6fd3d3177c3973a2eac9b598a80 ]
Do not print warn message when the partition scan returns 0.
Fixes: d57f3374ba48 ("loop: Move special partition reread handling in loop_clr_fd()")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 758a58d0bc67457f1215321a536226654a830eeb ]
Commit 0da03cab87e6
("loop: Fix deadlock when calling blkdev_reread_part()") moves
blkdev_reread_part() out of the loop_ctl_mutex. However,
GENHD_FL_NO_PART_SCAN is set before __blkdev_reread_part(). As a result,
__blkdev_reread_part() will fail the check of GENHD_FL_NO_PART_SCAN and
will not rescan the loop device to delete all partitions.
Below are steps to reproduce the issue:
step1 # dd if=/dev/zero of=tmp.raw bs=1M count=100
step2 # losetup -P /dev/loop0 tmp.raw
step3 # parted /dev/loop0 mklabel gpt
step4 # parted -a none -s /dev/loop0 mkpart primary 64s 1
step5 # losetup -d /dev/loop0
Step5 will not be able to delete /dev/loop0p1 (introduced by step4) and
there is below kernel warning message:
[ 464.414043] __loop_clr_fd: partition scan of loop0 failed (rc=-22)
This patch sets GENHD_FL_NO_PART_SCAN after blkdev_reread_part().
Fixes: 0da03cab87e6 ("loop: Fix deadlock when calling blkdev_reread_part()")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit f7c8a4120eedf24c36090b7542b179ff7a649219 upstream.
Commit 758a58d0bc67 ("loop: set GENHD_FL_NO_PART_SCAN after
blkdev_reread_part()") separates "lo->lo_backing_file = NULL" and
"lo->lo_state = Lo_unbound" into different critical regions protected by
loop_ctl_mutex.
However, there is below race that the NULL lo->lo_backing_file would be
accessed when the backend of a loop is another loop device, e.g., loop0's
backend is a file, while loop1's backend is loop0.
loop0's backend is file loop1's backend is loop0
__loop_clr_fd()
mutex_lock(&loop_ctl_mutex);
lo->lo_backing_file = NULL; --> set to NULL
mutex_unlock(&loop_ctl_mutex);
loop_set_fd()
mutex_lock_killable(&loop_ctl_mutex);
loop_validate_file()
f = l->lo_backing_file; --> NULL
access if loop0 is not Lo_unbound
mutex_lock(&loop_ctl_mutex);
lo->lo_state = Lo_unbound;
mutex_unlock(&loop_ctl_mutex);
lo->lo_backing_file should be accessed only when the loop device is
Lo_bound.
In fact, the problem has been introduced already in commit 7ccd0791d985
("loop: Push loop_ctl_mutex down into loop_clr_fd()") after which
loop_validate_file() could see devices in Lo_rundown state with which it
did not count. It was harmless at that point but still.
Fixes: 7ccd0791d985 ("loop: Push loop_ctl_mutex down into loop_clr_fd()")
Reported-by: syzbot+9bdc1adc1c55e7fe765b@syzkaller.appspotmail.com
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5db470e229e22b7eda6e23b5566e532c96fb5bc3 upstream.
If we don't drop caches used in old offset or block_size, we can get old data
from new offset/block_size, which gives unexpected data to user.
For example, Martijn found a loopback bug in the below scenario.
1) LOOP_SET_FD loads first two pages on loop file
2) LOOP_SET_STATUS64 changes the offset on the loop file
3) mount is failed due to the cached pages having wrong superblock
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Reported-by: Martijn Coenen <maco@google.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 628bd85947091830a8c4872adfd5ed1d515a9cf2 upstream.
Commit 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") forgot to
remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when
replacing loop_index_mutex with loop_ctl_mutex.
Fixes: 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex")
Reported-by: syzbot <syzbot+c0138741c2290fc5e63f@syzkaller.appspotmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c28445fa06a3a54e06938559b9514c5a7f01c90f upstream.
The nested acquisition of loop_ctl_mutex (->lo_ctl_mutex back then) has
been introduced by commit f028f3b2f987e "loop: fix circular locking in
loop_clr_fd()" to fix lockdep complains about bd_mutex being acquired
after lo_ctl_mutex during partition rereading. Now that these are
properly fixed, let's stop fooling lockdep.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1dded9acf6dc9a34cd27fcf8815507e4e65b3c4f upstream.
Code in loop_change_fd() drops reference to the old file (and also the
new file in a failure case) under loop_ctl_mutex. Similarly to a
situation in loop_set_fd() this can create a circular locking dependency
if this was the last reference holding the file open. Delay dropping of
the file reference until we have released loop_ctl_mutex.
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0da03cab87e6323ff2e05b14bc7d5c6fcc531efd upstream.
Calling blkdev_reread_part() under loop_ctl_mutex causes lockdep to
complain about circular lock dependency between bdev->bd_mutex and
lo->lo_ctl_mutex. The problem is that on loop device open or close
lo_open() and lo_release() get called with bdev->bd_mutex held and they
need to acquire loop_ctl_mutex. OTOH when loop_reread_partitions() is
called with loop_ctl_mutex held, it will call blkdev_reread_part() which
acquires bdev->bd_mutex. See syzbot report for details [1].
Move call to blkdev_reread_part() in __loop_clr_fd() from under
loop_ctl_mutex to finish fixing of the lockdep warning and the possible
deadlock.
[1] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588
Reported-by: syzbot <syzbot+4684a000d5abdade83fac55b1e7d1f935ef1936e@syzkaller.appspotmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 85b0a54a82e4fbceeb1aebb7cb6909edd1a24668 upstream.
Calling loop_reread_partitions() under loop_ctl_mutex causes lockdep to
complain about circular lock dependency between bdev->bd_mutex and
lo->lo_ctl_mutex. The problem is that on loop device open or close
lo_open() and lo_release() get called with bdev->bd_mutex held and they
need to acquire loop_ctl_mutex. OTOH when loop_reread_partitions() is
called with loop_ctl_mutex held, it will call blkdev_reread_part() which
acquires bdev->bd_mutex. See syzbot report for details [1].
Move all calls of loop_rescan_partitions() out of loop_ctl_mutex to
avoid lockdep warning and fix deadlock possibility.
[1] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588
Reported-by: syzbot <syzbot+4684a000d5abdade83fac55b1e7d1f935ef1936e@syzkaller.appspotmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d57f3374ba4817f7c8d26fae8a13d20ac8d31b92 upstream.
The call of __blkdev_reread_part() from loop_reread_partition() happens
only when we need to invalidate partitions from loop_release(). Thus
move a detection for this into loop_clr_fd() and simplify
loop_reread_partition().
This makes loop_reread_partition() safe to use without loop_ctl_mutex
because we use only lo->lo_number and lo->lo_file_name in case of error
for reporting purposes (thus possibly reporting outdate information is
not a big deal) and we are safe from 'lo' going away under us by
elevated lo->lo_refcnt.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c371077000f4138ee3c15fbed50101ff24bdc91d upstream.
Push loop_ctl_mutex down to loop_change_fd(). We will need this to be
able to call loop_reread_partitions() without loop_ctl_mutex.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 757ecf40b7e029529768eb5f9562d5eeb3002106 upstream.
Push lo_ctl_mutex down to loop_set_fd(). We will need this to be able to
call loop_reread_partitions() without lo_ctl_mutex.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 550df5fdacff94229cde0ed9b8085155654c1696 upstream.
Push loop_ctl_mutex down to loop_set_status(). We will need this to be
able to call loop_reread_partitions() without loop_ctl_mutex.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4a5ce9ba5877e4640200d84a735361306ad1a1b8 upstream.
Push loop_ctl_mutex down to loop_get_status() to avoid the unusual
convention that the function gets called with loop_ctl_mutex held and
releases it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7ccd0791d98531df7cd59e92d55e4f063d48a070 upstream.
loop_clr_fd() has a weird locking convention that is expects
loop_ctl_mutex held, releases it on success and keeps it on failure.
Untangle the mess by moving locking of loop_ctl_mutex into
loop_clr_fd().
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a2505b799a496b7b84d9a4a14ec870ff9e42e11b upstream.
Move setting of lo_state to Lo_rundown out into the callers. That will
allow us to unlock loop_ctl_mutex while the loop device is protected
from other changes by its special state.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a13165441d58b216adbd50252a9cc829d78a6bce upstream.
Push acquisition of lo_ctl_mutex down into individual ioctl handling
branches. This is a preparatory step for pushing the lock down into
individual ioctl handling functions so that they can release the lock as
they need it. We also factor out some simple ioctl handlers that will
not need any special handling to reduce unnecessary code duplication.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0a42e99b58a208839626465af194cfe640ef9493 upstream.
Now that loop_ctl_mutex is global, just get rid of loop_index_mutex as
there is no good reason to keep these two separate and it just
complicates the locking.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 967d1dc144b50ad005e5eecdfadfbcfb399ffff6 upstream.
__loop_release() has a single call site. Fold it there. This is
currently not a huge win but it will make following replacement of
loop_index_mutex more obvious.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 310ca162d779efee8a2dc3731439680f3e9c1e86 upstream.
syzbot is reporting NULL pointer dereference [1] which is caused by
race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) versus
ioctl(other_loop_fd, LOOP_SET_FD, loop_fd) due to traversing other
loop devices at loop_validate_file() without holding corresponding
lo->lo_ctl_mutex locks.
Since ioctl() request on loop devices is not frequent operation, we don't
need fine grained locking. Let's use global lock in order to allow safe
traversal at loop_validate_file().
Note that syzbot is also reporting circular locking dependency between
bdev->bd_mutex and lo->lo_ctl_mutex [2] which is caused by calling
blkdev_reread_part() with lock held. This patch does not address it.
[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+bf89c128e05dd6c62523@syzkaller.appspotmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|