summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-04-07ASoC: rt5640: Fix dac- and adc- vol-tlv values being off by a factor of 10Hans de Goede1-2/+2
[ Upstream commit cfa26ed1f9f885c2fd8f53ca492989d1e16d0199 ] The adc_vol_tlv volume-control has a range from -17.625 dB to +30 dB, not -176.25 dB to + 300 dB. This wrong scale is esp. a problem in userspace apps which translate the dB scale to a linear scale. With the logarithmic dB scale being of by a factor of 10 we loose all precision in the lower area of the range when apps translate things to a linear scale. E.g. the 0 dB default, which corresponds with a value of 47 of the 0 - 127 range for the control, would be shown as 0/100 in alsa-mixer. Since the centi-dB values used in the TLV struct cannot represent the 0.375 dB step size used by these controls, change the TLV definition for them to specify a min and max value instead of min + stepsize. Note this mirrors commit 3f31f7d9b540 ("ASoC: rt5670: Fix dac- and adc- vol-tlv values being off by a factor of 10") which made the exact same change to the rt5670 codec driver. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210226143817.84287-2-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07ASoC: rt1015: fix i2c communication errorJack Yu1-0/+1
[ Upstream commit 9e0bdaa9fcb8c64efc1487a7fba07722e7bc515e ] Remove 0x100 cache re-sync to solve i2c communication error. Signed-off-by: Jack Yu <jack.yu@realtek.com> Link: https://lore.kernel.org/r/20210222090057.29532-1-jack.yu@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activateRitesh Harjani1-0/+10
[ Upstream commit 5808fecc572391867fcd929662b29c12e6d08d81 ] In case if isi.nr_pages is 0, we are making sis->pages (which is unsigned int) a huge value in iomap_swapfile_activate() by assigning -1. This could cause a kernel crash in kernel v4.18 (with below signature). Or could lead to unknown issues on latest kernel if the fake big swap gets used. Fix this issue by returning -EINVAL in case of nr_pages is 0, since it is anyway a invalid swapfile. Looks like this issue will be hit when we have pagesize < blocksize type of configuration. I was able to hit the issue in case of a tiny swap file with below test script. https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/scripts/swap-issue.sh kernel crash analysis on v4.18 ============================== On v4.18 kernel, it causes a kernel panic, since sis->pages becomes a huge value and isi.nr_extents is 0. When 0 is returned it is considered as a swapfile over NFS and SWP_FILE is set (sis->flags |= SWP_FILE). Then when swapoff was getting called it was calling a_ops->swap_deactivate() if (sis->flags & SWP_FILE) is true. Since a_ops->swap_deactivate() is NULL in case of XFS, it causes below panic. Panic signature on v4.18 kernel: ======================================= root@qemu:/home/qemu# [ 8291.723351] XFS (loop2): Unmounting Filesystem [ 8292.123104] XFS (loop2): Mounting V5 Filesystem [ 8292.132451] XFS (loop2): Ending clean mount [ 8292.263362] Adding 4294967232k swap on /mnt1/test/swapfile. Priority:-2 extents:1 across:274877906880k [ 8292.277834] Unable to handle kernel paging request for instruction fetch [ 8292.278677] Faulting instruction address: 0x00000000 cpu 0x19: Vector: 400 (Instruction Access) at [c0000009dd5b7ad0] pc: 0000000000000000 lr: c0000000003eb9dc: destroy_swap_extents+0xfc/0x120 sp: c0000009dd5b7d50 msr: 8000000040009033 current = 0xc0000009b6710080 paca = 0xc00000003ffcb280 irqmask: 0x03 irq_happened: 0x01 pid = 5604, comm = swapoff Linux version 4.18.0 (riteshh@xxxxxxx) (gcc version 8.4.0 (Ubuntu 8.4.0-1ubuntu1~18.04)) #57 SMP Wed Mar 3 01:33:04 CST 2021 enter ? for help [link register ] c0000000003eb9dc destroy_swap_extents+0xfc/0x120 [c0000009dd5b7d50] c0000000025a7058 proc_poll_event+0x0/0x4 (unreliable) [c0000009dd5b7da0] c0000000003f0498 sys_swapoff+0x3f8/0x910 [c0000009dd5b7e30] c00000000000bbe4 system_call+0x5c/0x70 Exception: c01 (System Call) at 00007ffff7d208d8 Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> [djwong: rework the comment to provide more details] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07rpc: fix NULL dereference on kmalloc failureJ. Bruce Fields1-4/+7
[ Upstream commit 0ddc942394013f08992fc379ca04cffacbbe3dae ] I think this is unlikely but possible: svc_authenticate sets rq_authop and calls svcauth_gss_accept. The kmalloc(sizeof(*svcdata), GFP_KERNEL) fails, leaving rq_auth_data NULL, and returning SVC_DENIED. This causes svc_process_common to go to err_bad_auth, and eventually call svc_authorise. That calls ->release == svcauth_gss_release, which tries to dereference rq_auth_data. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Link: https://lore.kernel.org/linux-nfs/3F1B347F-B809-478F-A1E9-0BE98E22B0F0@oracle.com/T/#t Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07fs: nfsd: fix kconfig dependency warning for NFSD_V4Julian Braha1-0/+1
[ Upstream commit 7005227369079963d25fb2d5d736d0feb2c44cf6 ] When NFSD_V4 is enabled and CRYPTO is disabled, Kbuild gives the following warning: WARNING: unmet direct dependencies detected for CRYPTO_SHA256 Depends on [n]: CRYPTO [=n] Selected by [y]: - NFSD_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFSD [=y] && PROC_FS [=y] WARNING: unmet direct dependencies detected for CRYPTO_MD5 Depends on [n]: CRYPTO [=n] Selected by [y]: - NFSD_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFSD [=y] && PROC_FS [=y] This is because NFSD_V4 selects CRYPTO_MD5 and CRYPTO_SHA256, without depending on or selecting CRYPTO, despite those config options being subordinate to CRYPTO. Signed-off-by: Julian Braha <julianbraha@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07ext4: fix bh ref count on error pathsZhaolong Zhang1-3/+3
[ Upstream commit c915fb80eaa6194fa9bd0a4487705cd5b0dda2f1 ] __ext4_journalled_writepage should drop bhs' ref count on error paths Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com> Link: https://lore.kernel.org/r/1614678151-70481-1-git-send-email-zhangzl2013@126.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07ext4: shrink race window in ext4_should_retry_alloc()Eric Whitney4-12/+39
[ Upstream commit efc61345274d6c7a46a0570efbc916fcbe3e927b ] When generic/371 is run on kvm-xfstests using 5.10 and 5.11 kernels, it fails at significant rates on the two test scenarios that disable delayed allocation (ext3conv and data_journal) and force actual block allocation for the fallocate and pwrite functions in the test. The failure rate on 5.10 for both ext3conv and data_journal on one test system typically runs about 85%. On 5.11, the failure rate on ext3conv sometimes drops to as low as 1% while the rate on data_journal increases to nearly 100%. The observed failures are largely due to ext4_should_retry_alloc() cutting off block allocation retries when s_mb_free_pending (used to indicate that a transaction in progress will free blocks) is 0. However, free space is usually available when this occurs during runs of generic/371. It appears that a thread attempting to allocate blocks is just missing transaction commits in other threads that increase the free cluster count and reset s_mb_free_pending while the allocating thread isn't running. Explicitly testing for free space availability avoids this race. The current code uses a post-increment operator in the conditional expression that determines whether the retry limit has been exceeded. This means that the conditional expression uses the value of the retry counter before it's increased, resulting in an extra retry cycle. The current code actually retries twice before hitting its retry limit rather than once. Increasing the retry limit to 3 from the current actual maximum retry count of 2 in combination with the change described above reduces the observed failure rate to less that 0.1% on both ext3conv and data_journal with what should be limited impact on users sensitive to the overhead caused by retries. A per filesystem percpu counter exported via sysfs is added to allow users or developers to track the number of times the retry limit is exceeded without resorting to debugging methods. This should provide some insight into worst case retry behavior. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20210218151132.19678-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07virtiofs: Fail dax mount if device does not support itVivek Goyal1-1/+8
[ Upstream commit 3f9b9efd82a84f27e95d0414f852caf1fa839e83 ] Right now "mount -t virtiofs -o dax myfs /mnt/virtiofs" succeeds even if filesystem deivce does not have a cache window and hence DAX can't be supported. This gives a false sense to user that they are using DAX with virtiofs but fact of the matter is that they are not. Fix this by returning error if dax can't be supported and user has asked for it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07bpf: Fix fexit trampoline.Alexei Starovoitov5-61/+213
[ Upstream commit e21aa341785c679dd409c8cb71f864c00fe6c463 ] The fexit/fmod_ret programs can be attached to kernel functions that can sleep. The synchronize_rcu_tasks() will not wait for such tasks to complete. In such case the trampoline image will be freed and when the task wakes up the return IP will point to freed memory causing the crash. Solve this by adding percpu_ref_get/put for the duration of trampoline and separate trampoline vs its image life times. The "half page" optimization has to be removed, since first_half->second_half->first_half transition cannot be guaranteed to complete in deterministic time. Every trampoline update becomes a new image. The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and call_rcu_tasks. Together they will wait for the original function and trampoline asm to complete. The trampoline is patched from nop to jmp to skip fexit progs. They are freed independently from the trampoline. The image with fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which will wait for both sleepable and non-sleepable progs to complete. Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paul E. McKenney <paulmck@kernel.org> # for RCU Link: https://lore.kernel.org/bpf/20210316210007.38949-1-alexei.starovoitov@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07arm64: mm: correct the inside linear map range during hotplug checkPavel Tatashin1-2/+18
[ Upstream commit ee7febce051945be28ad86d16a15886f878204de ] Memory hotplug may fail on systems with CONFIG_RANDOMIZE_BASE because the linear map range is not checked correctly. The start physical address that linear map covers can be actually at the end of the range because of randomization. Check that and if so reduce it to 0. This can be verified on QEMU with setting kaslr-seed to ~0ul: memstart_offset_seed = 0xffff START: __pa(_PAGE_OFFSET(vabits_actual)) = ffff9000c0000000 END: __pa(PAGE_END - 1) = 1000bfffffff Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Fixes: 58284a901b42 ("arm64/mm: Validate hotplug range before creating linear mapping") Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20210216150351.129018-2-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30Linux 5.10.27v5.10.27Greg Kroah-Hartman1-1/+1
Tested-by: Andrei Rabusov <a.rabusov@tum.de> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Jason Self <jason@bluehome.net> Tested-by: Hulk Robot <hulkrobot@huawei.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20210329101340.196712908@linuxfoundation.org Link: https://lore.kernel.org/r/20210329075629.172032742@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30xen-blkback: don't leak persistent grants from xen_blkbk_map()Jan Beulich1-1/+1
commit a846738f8c3788d846ed1f587270d2f2e3d32432 upstream. The fix for XSA-365 zapped too many of the ->persistent_gnt[] entries. Ones successfully obtained should not be overwritten, but instead left for xen_blkbk_unmap_prepare() to pick up and put. This is XSA-371. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wl@xen.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30can: peak_usb: Revert "can: peak_usb: add forgotten supported devices"Marc Kleine-Budde1-2/+0
commit 5d7047ed6b7214fbabc16d8712a822e256b1aa44 upstream. In commit 6417f03132a6 ("module: remove never implemented MODULE_SUPPORTED_DEVICE") the MODULE_SUPPORTED_DEVICE macro was removed from the kerne entirely. Shortly before this patch was applied mainline the commit 59ec7b89ed3e ("can: peak_usb: add forgotten supported devices") was added to net/master. As this would result in a merge conflict, let's revert this patch. Fixes: 59ec7b89ed3e ("can: peak_usb: add forgotten supported devices") Link: https://lore.kernel.org/r/20210320192649.341832-1-mkl@pengutronix.de Suggested-by: Leon Romanovsky <leon@kernel.org> Cc: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30nvme: fix the nsid value to print in nvme_validate_or_alloc_nsChristoph Hellwig1-1/+1
commit f4f9fc29e56b6fa9d7fa65ec51d3c82aff99c99b upstream. ns can be NULL at this point, and my move of the check from the original patch by Chaitanya broke this. Fixes: 0ec84df4953b ("nvme-core: check ctrl css before setting up zns") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30Revert "net: bonding: fix error return code of bond_neigh_init()"David S. Miller1-6/+2
commit 080bfa1e6d928a5d1f185cc44e5f3c251df06df5 upstream. This reverts commit 2055a99da8a253a357bdfd359b3338ef3375a26c. This change rejects legitimate configurations. A slave doesn't need to exist nor implement ndo_slave_setup. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30Revert "xen: fix p2m size in dom0 for disabled memory hotplug case"Roger Pau Monne3-17/+14
commit af44a387e743ab7aa39d3fb5e29c0a973cf91bdc upstream. This partially reverts commit 882213990d32 ("xen: fix p2m size in dom0 for disabled memory hotplug case") There's no need to special case XEN_UNPOPULATED_ALLOC anymore in order to correctly size the p2m. The generic memory hotplug option has already been tied together with the Xen hotplug limit, so enabling memory hotplug should already trigger a properly sized p2m on Xen PV. Note that XEN_UNPOPULATED_ALLOC depends on ZONE_DEVICE which pulls in MEMORY_HOTPLUG. Leave the check added to __set_phys_to_machine and the adjusted comment about EXTRA_MEM_RATIO. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210324122424.58685-3-roger.pau@citrix.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [boris: fixed formatting issues] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-30fs/ext4: fix integer overflow in s_log_groups_per_flexSabyrzhan Tasbolatov1-2/+9
commit f91436d55a279f045987e8b8c1385585dca54be9 upstream. syzbot found UBSAN: shift-out-of-bounds in ext4_mb_init [1], when 1 << sbi->s_es->s_log_groups_per_flex is bigger than UINT_MAX, where sbi->s_mb_prefetch is unsigned integer type. 32 is the maximum allowed power of s_log_groups_per_flex. Following if check will also trigger UBSAN shift-out-of-bound: if (1 << sbi->s_es->s_log_groups_per_flex >= UINT_MAX) { So I'm checking it against the raw number, perhaps there is another way to calculate UINT_MAX max power. Also use min_t as to make sure it's uint type. [1] UBSAN: shift-out-of-bounds in fs/ext4/mballoc.c:2713:24 shift exponent 60 is too large for 32-bit type 'int' Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x137/0x1be lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:148 [inline] __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395 ext4_mb_init_backend fs/ext4/mballoc.c:2713 [inline] ext4_mb_init+0x19bc/0x19f0 fs/ext4/mballoc.c:2898 ext4_fill_super+0xc2ec/0xfbe0 fs/ext4/super.c:4983 Reported-by: syzbot+a8b4b0c60155e87e9484@syzkaller.appspotmail.com Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210224095800.3350002-1-snovitoll@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30ext4: add reclaim checks to xattr codeJan Kara1-0/+4
commit 163f0ec1df33cf468509ff38cbcbb5eb0d7fac60 upstream. Syzbot is reporting that ext4 can enter fs reclaim from kvmalloc() while the transaction is started like: fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340 might_alloc include/linux/sched/mm.h:193 [inline] slab_pre_alloc_hook mm/slab.h:493 [inline] slab_alloc_node mm/slub.c:2817 [inline] __kmalloc_node+0x5f/0x430 mm/slub.c:4015 kmalloc_node include/linux/slab.h:575 [inline] kvmalloc_node+0x61/0xf0 mm/util.c:587 kvmalloc include/linux/mm.h:781 [inline] ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline] ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline] ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649 ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224 ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380 ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493 This should be impossible since transaction start sets PF_MEMALLOC_NOFS. Add some assertions to the code to catch if something isn't working as expected early. Link: https://lore.kernel.org/linux-ext4/000000000000563a0205bafb7970@google.com/ Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210222171626.21884-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30mac80211: fix double free in ibss_leaveMarkus Theil1-0/+2
commit 3bd801b14e0c5d29eeddc7336558beb3344efaa3 upstream. Clear beacon ie pointer and ie length after free in order to prevent double free. ================================================================== BUG: KASAN: double-free or invalid-free \ in ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876 CPU: 0 PID: 8472 Comm: syz-executor100 Not tainted 5.11.0-rc6-syzkaller #0 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2c6 mm/kasan/report.c:230 kasan_report_invalid_free+0x51/0x80 mm/kasan/report.c:355 ____kasan_slab_free+0xcc/0xe0 mm/kasan/common.c:341 kasan_slab_free include/linux/kasan.h:192 [inline] __cache_free mm/slab.c:3424 [inline] kfree+0xed/0x270 mm/slab.c:3760 ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876 rdev_leave_ibss net/wireless/rdev-ops.h:545 [inline] __cfg80211_leave_ibss+0x19a/0x4c0 net/wireless/ibss.c:212 __cfg80211_leave+0x327/0x430 net/wireless/core.c:1172 cfg80211_leave net/wireless/core.c:1221 [inline] cfg80211_netdev_notifier_call+0x9e8/0x12c0 net/wireless/core.c:1335 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2040 call_netdevice_notifiers_extack net/core/dev.c:2052 [inline] call_netdevice_notifiers net/core/dev.c:2066 [inline] __dev_close_many+0xee/0x2e0 net/core/dev.c:1586 __dev_close net/core/dev.c:1624 [inline] __dev_change_flags+0x2cb/0x730 net/core/dev.c:8476 dev_change_flags+0x8a/0x160 net/core/dev.c:8549 dev_ifsioc+0x210/0xa70 net/core/dev_ioctl.c:265 dev_ioctl+0x1b1/0xc40 net/core/dev_ioctl.c:511 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: syzbot+93976391bf299d425f44@syzkaller.appspotmail.com Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Link: https://lore.kernel.org/r/20210213133653.367130-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30net: dsa: b53: VLAN filtering is global to all usersFlorian Fainelli1-7/+7
commit d45c36bafb94e72fdb6dee437279b61b6d97e706 upstream. The bcm_sf2 driver uses the b53 driver as a library but does not make usre of the b53_setup() function, this made it fail to inherit the vlan_filtering_is_global attribute. Fix this by moving the assignment to b53_switch_alloc() which is used by bcm_sf2. Fixes: 7228b23e68f7 ("net: dsa: b53: Let DSA handle mismatched VLAN filtering settings") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30r8169: fix DMA being used after buffer free if WoL is enabledHeiner Kallweit1-2/+4
commit f658b90977d2e79822a558e48116e059a7e75dec upstream. IOMMU errors have been reported if WoL is enabled and interface is brought down. It turned out that the network chip triggers DMA transfers after the DMA buffers have been freed. For WoL to work we need to leave rx enabled, therefore simply stop the chip from being a DMA busmaster. Fixes: 567ca57faa62 ("r8169: add rtl8169_up") Tested-by: Paul Blazejowski <paulb@blazebox.homeip.net> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30can: dev: Move device back to init netns on owning netns deleteMartin Willi3-1/+4
commit 3a5ca857079ea022e0b1b17fc154f7ad7dbc150f upstream. When a non-initial netns is destroyed, the usual policy is to delete all virtual network interfaces contained, but move physical interfaces back to the initial netns. This keeps the physical interface visible on the system. CAN devices are somewhat special, as they define rtnl_link_ops even if they are physical devices. If a CAN interface is moved into a non-initial netns, destroying that netns lets the interface vanish instead of moving it back to the initial netns. default_device_exit() skips CAN interfaces due to having rtnl_link_ops set. Reproducer: ip netns add foo ip link set can0 netns foo ip netns delete foo WARNING: CPU: 1 PID: 84 at net/core/dev.c:11030 ops_exit_list+0x38/0x60 CPU: 1 PID: 84 Comm: kworker/u4:2 Not tainted 5.10.19 #1 Workqueue: netns cleanup_net [<c010e700>] (unwind_backtrace) from [<c010a1d8>] (show_stack+0x10/0x14) [<c010a1d8>] (show_stack) from [<c086dc10>] (dump_stack+0x94/0xa8) [<c086dc10>] (dump_stack) from [<c086b938>] (__warn+0xb8/0x114) [<c086b938>] (__warn) from [<c086ba10>] (warn_slowpath_fmt+0x7c/0xac) [<c086ba10>] (warn_slowpath_fmt) from [<c0629f20>] (ops_exit_list+0x38/0x60) [<c0629f20>] (ops_exit_list) from [<c062a5c4>] (cleanup_net+0x230/0x380) [<c062a5c4>] (cleanup_net) from [<c0142c20>] (process_one_work+0x1d8/0x438) [<c0142c20>] (process_one_work) from [<c0142ee4>] (worker_thread+0x64/0x5a8) [<c0142ee4>] (worker_thread) from [<c0148a98>] (kthread+0x148/0x14c) [<c0148a98>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c) To properly restore physical CAN devices to the initial netns on owning netns exit, introduce a flag on rtnl_link_ops that can be set by drivers. For CAN devices setting this flag, default_device_exit() considers them non-virtual, applying the usual namespace move. The issue was introduced in the commit mentioned below, as at that time CAN devices did not have a dellink() operation. Fixes: e008b5fc8dc7 ("net: Simplfy default_device_exit and improve batching.") Link: https://lore.kernel.org/r/20210302122423.872326-1-martin@strongswan.org Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30ch_ktls: fix enum-conversion warningArnd Bergmann1-1/+1
commit 6f235a69e59484e382dc31952025b0308efedc17 upstream. gcc points out an incorrect enum assignment: drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c: In function 'chcr_ktls_cpl_set_tcb_rpl': drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:684:22: warning: implicit conversion from 'enum <anonymous>' to 'enum ch_ktls_open_state' [-Wenum-conversion] This appears harmless, and should apparently use 'CH_KTLS_OPEN_SUCCESS' instead of 'false', with the same value '0'. Fixes: efca3878a5fb ("ch_ktls: Issue if connection offload fails") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30fs/cachefiles: Remove wait_bit_key layout dependencyMatthew Wilcox (Oracle)2-5/+3
commit 39f985c8f667c80a3d1eb19d31138032fa36b09e upstream. Cachefiles was relying on wait_page_key and wait_bit_key being the same layout, which is fragile. Now that wait_page_key is exposed in the pagemap.h header, we can remove that fragility A comment on the need to maintain structure layout equivalence was added by Linus[1] and that is no longer applicable. Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: linux-cachefs@redhat.com cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3510ca20ece0150af6b10c77a74ff1b5c198e3e2 [1] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30mm/memcg: fix 5.10 backport of splitting page memcgHugh Dickins1-1/+5
The straight backport of 5.12's e1baddf8475b ("mm/memcg: set memcg when splitting page") works fine in 5.11, but turned out to be wrong for 5.10: because that relies on a separate flag, which must also be set for the memcg to be recognized and uncharged and cleared when freeing. Fix that. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30x86/mem_encrypt: Correct physical address calculation in __set_clr_pte_enc()Isaku Yamahata1-1/+1
commit 8249d17d3194eac064a8ca5bc5ca0abc86feecde upstream. The pfn variable contains the page frame number as returned by the pXX_pfn() functions, shifted to the right by PAGE_SHIFT to remove the page bits. After page protection computations are done to it, it gets shifted back to the physical address using page_level_shift(). That is wrong, of course, because that function determines the shift length based on the level of the page in the page table but in all the cases, it was shifted by PAGE_SHIFT before. Therefore, shift it back using PAGE_SHIFT to get the correct physical address. [ bp: Rewrite commit message. ] Fixes: dfaaec9033b8 ("x86: Add support for changing memory encryption attribute in early boot") Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/81abbae1657053eccc535c16151f63cd049dcb97.1616098294.git.isaku.yamahata@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30locking/mutex: Fix non debug version of mutex_lock_io_nested()Thomas Gleixner1-1/+1
commit 291da9d4a9eb3a1cb0610b7f4480f5b52b1825e7 upstream. If CONFIG_DEBUG_LOCK_ALLOC=n then mutex_lock_io_nested() maps to mutex_lock() which is clearly wrong because mutex_lock() lacks the io_schedule_prepare()/finish() invocations. Map it to mutex_lock_io(). Fixes: f21860bac05b ("locking/mutex, sched/wait: Fix the mutex_lock_io_nested() define") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/878s6fshii.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30cifs: Adjust key sizes and key generation routines for AES256 encryptionShyam Prasad N5-15/+41
commit 45a4546c6167a2da348a31ca439d8a8ff773b6ea upstream. For AES256 encryption (GCM and CCM), we need to adjust the size of a few fields to 32 bytes instead of 16 to accommodate the larger keys. Also, the L value supplied to the key generator needs to be changed from to 256 when these algorithms are used. Keeping the ioctl struct for dumping keys of the same size for now. Will send out a different patch for that one. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: <stable@vger.kernel.org> # v5.10+ Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30smb3: fix cached file size problems in duplicate extents (reflink)Steve French1-3/+15
commit cfc63fc8126a93cbf95379bc4cad79a7b15b6ece upstream. There were two problems (one of which could cause data corruption) that were noticed with duplicate extents (ie reflink) when debugging why various xfstests were being incorrectly skipped (e.g. generic/138, generic/140, generic/142). First, we were not updating the file size locally in the cache when extending a file due to reflink (it would refresh after actimeo expires) but xfstest was checking the size immediately which was still 0 so caused the test to be skipped. Second, we were setting the target file size (which could shrink the file) in all cases to the end of the reflinked range rather than only setting the target file size when reflink would extend the file. CC: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30scsi: mpt3sas: Fix error return code of mpt3sas_base_attach()Jia-Ju Bai1-2/+6
[ Upstream commit 3401ecf7fc1b9458a19d42c0e26a228f18ac7dda ] When kzalloc() returns NULL, no error return code of mpt3sas_base_attach() is assigned. To fix this bug, r is assigned with -ENOMEM in this case. Link: https://lore.kernel.org/r/20210308035241.3288-1-baijiaju1990@gmail.com Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30scsi: qedi: Fix error return code of qedi_alloc_global_queues()Jia-Ju Bai1-0/+1
[ Upstream commit f69953837ca5d98aa983a138dc0b90a411e9c763 ] When kzalloc() returns NULL to qedi->global_queues[i], no error return code of qedi_alloc_global_queues() is assigned. To fix this bug, status is assigned with -ENOMEM in this case. Link: https://lore.kernel.org/r/20210308033024.27147-1-baijiaju1990@gmail.com Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30scsi: Revert "qla2xxx: Make sure that aborted commands are freed"Bart Van Assche2-12/+5
[ Upstream commit 39c0c8553bfb5a3d108aa47f1256076d507605e3 ] Calling vha->hw->tgt.tgt_ops->free_cmd() from qlt_xmit_response() is wrong since the command for which a response is sent must remain valid until the SCSI target core calls .release_cmd(). It has been observed that the following scenario triggers a kernel crash: - qlt_xmit_response() calls qlt_check_reserve_free_req() - qlt_check_reserve_free_req() returns -EAGAIN - qlt_xmit_response() calls vha->hw->tgt.tgt_ops->free_cmd(cmd) - transport_handle_queue_full() tries to retransmit the response Fix this crash by reverting the patch that introduced it. Link: https://lore.kernel.org/r/20210320232359.941-2-bvanassche@acm.org Fixes: 0dcec41acb85 ("scsi: qla2xxx: Make sure that aborted commands are freed") Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30block: recalculate segment count for multi-segment discards correctlyDavid Jeffery1-0/+8
[ Upstream commit a958937ff166fc60d1c3a721036f6ff41bfa2821 ] When a stacked block device inserts a request into another block device using blk_insert_cloned_request, the request's nr_phys_segments field gets recalculated by a call to blk_recalc_rq_segments in blk_cloned_rq_check_limits. But blk_recalc_rq_segments does not know how to handle multi-segment discards. For disk types which can handle multi-segment discards like nvme, this results in discard requests which claim a single segment when it should report several, triggering a warning in nvme and causing nvme to fail the discard from the invalid state. WARNING: CPU: 5 PID: 191 at drivers/nvme/host/core.c:700 nvme_setup_discard+0x170/0x1e0 [nvme_core] ... nvme_setup_cmd+0x217/0x270 [nvme_core] nvme_loop_queue_rq+0x51/0x1b0 [nvme_loop] __blk_mq_try_issue_directly+0xe7/0x1b0 blk_mq_request_issue_directly+0x41/0x70 ? blk_account_io_start+0x40/0x50 dm_mq_queue_rq+0x200/0x3e0 blk_mq_dispatch_rq_list+0x10a/0x7d0 ? __sbitmap_queue_get+0x25/0x90 ? elv_rb_del+0x1f/0x30 ? deadline_remove_request+0x55/0xb0 ? dd_dispatch_request+0x181/0x210 __blk_mq_do_dispatch_sched+0x144/0x290 ? bio_attempt_discard_merge+0x134/0x1f0 __blk_mq_sched_dispatch_requests+0x129/0x180 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x47/0xe0 __blk_mq_delay_run_hw_queue+0x15b/0x170 blk_mq_sched_insert_requests+0x68/0xe0 blk_mq_flush_plug_list+0xf0/0x170 blk_finish_plug+0x36/0x50 xlog_cil_committed+0x19f/0x290 [xfs] xlog_cil_process_committed+0x57/0x80 [xfs] xlog_state_do_callback+0x1e0/0x2a0 [xfs] xlog_ioend_work+0x2f/0x80 [xfs] process_one_work+0x1b6/0x350 worker_thread+0x53/0x3e0 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 This patch fixes blk_recalc_rq_segments to be aware of devices which can have multi-segment discards. It calculates the correct discard segment count by counting the number of bio as each discard bio is considered its own segment. Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a single request") Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Link: https://lore.kernel.org/r/20210211143807.GA115624@redhat Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30io_uring: fix provide_buffers sign extensionPavel Begunkov1-1/+3
[ Upstream commit d81269fecb8ce16eb07efafc9ff5520b2a31c486 ] io_provide_buffers_prep()'s "p->len * p->nbufs" to sign extension problems. Not a huge problem as it's only used for access_ok() and increases the checked length, but better to keep typing right. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: efe68c1ca8f49 ("io_uring: validate the full range of provided buffers for access") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/562376a39509e260d8532186a06226e56eb1f594.1616149233.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30perf synthetic events: Avoid write of uninitialized memory when generating ↵Ian Rogers1-4/+5
PERF_RECORD_MMAP* records [ Upstream commit 2a76f6de07906f0bb5f2a13fb02845db1695cc29 ] Account for alignment bytes in the zero-ing memset. Fixes: 1a853e36871b533c ("perf record: Allow specifying a pid to record") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210309234945.419254-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30perf auxtrace: Fix auxtrace queue conflictAdrian Hunter1-4/+0
[ Upstream commit b410ed2a8572d41c68bd9208555610e4b07d0703 ] The only requirement of an auxtrace queue is that the buffers are in time order. That is achieved by making separate queues for separate perf buffer or AUX area buffer mmaps. That generally means a separate queue per cpu for per-cpu contexts, and a separate queue per thread for per-task contexts. When buffers are added to a queue, perf checks that the buffer cpu and thread id (tid) match the queue cpu and thread id. However, generally, that need not be true, and perf will queue buffers correctly anyway, so the check is not needed. In addition, the check gets erroneously hit when using sample mode to trace multiple threads. Consequently, fix that case by removing the check. Fixes: e502789302a6 ("perf auxtrace: Add helpers for queuing AUX area tracing data") Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210308151143.18338-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30ACPI: scan: Use unique number for instance_noAndy Shevchenko3-6/+34
[ Upstream commit eb50aaf960e3bedfef79063411ffd670da94b84b ] The decrementation of acpi_device_bus_id->instance_no in acpi_device_del() is incorrect, because it may cause a duplicate instance number to be allocated next time a device with the same acpi_device_bus_id is added. Replace above mentioned approach by using IDA framework. While at it, define the instance range to be [0, 4096). Fixes: e49bd2dd5a50 ("ACPI: use PNPID:instance_no as bus_id of ACPI device") Fixes: ca9dc8d42b30 ("ACPI / scan: Fix acpi_bus_id_list bookkeeping") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30ACPI: scan: Rearrange memory allocation in acpi_device_add()Rafael J. Wysocki1-31/+26
[ Upstream commit c1013ff7a5472db637c56bb6237f8343398c03a7 ] The upfront allocation of new_bus_id is done to avoid allocating memory under acpi_device_lock, but it doesn't really help, because (1) it leads to many unnecessary memory allocations for _ADR devices, (2) kstrdup_const() is run under that lock anyway and (3) it complicates the code. Rearrange acpi_device_add() to allocate memory for a new struct acpi_device_bus_id instance only when necessary, eliminate a redundant local variable from it and reduce the number of labels in there. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30Revert "netfilter: x_tables: Update remaining dereference to RCU"Mark Tomlinson3-3/+3
[ Upstream commit abe7034b9a8d57737e80cc16d60ed3666990bdbf ] This reverts commit 443d6e86f821a165fae3fc3fc13086d27ac140b1. This (and the following) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30mm/mmu_notifiers: ensure range_end() is paired with range_start()Sean Christopherson2-5/+28
[ Upstream commit c2655835fd8cabdfe7dab737253de3ffb88da126 ] If one or more notifiers fails .invalidate_range_start(), invoke .invalidate_range_end() for "all" notifiers. If there are multiple notifiers, those that did not fail are expecting _start() and _end() to be paired, e.g. KVM's mmu_notifier_count would become imbalanced. Disallow notifiers that can fail _start() from implementing _end() so that it's unnecessary to either track which notifiers rejected _start(), or had already succeeded prior to a failed _start(). Note, the existing behavior of calling _start() on all notifiers even after a previous notifier failed _start() was an unintented "feature". Make it canon now that the behavior is depended on for correctness. As of today, the bug is likely benign: 1. The only caller of the non-blocking notifier is OOM kill. 2. The only notifiers that can fail _start() are the i915 and Nouveau drivers. 3. The only notifiers that utilize _end() are the SGI UV GRU driver and KVM. 4. The GRU driver will never coincide with the i195/Nouveau drivers. 5. An imbalanced kvm->mmu_notifier_count only causes soft lockup in the _guest_, and the guest is already doomed due to being an OOM victim. Fix the bug now to play nice with future usage, e.g. KVM has a potential use case for blocking memslot updates in KVM while an invalidation is in-progress, and failure to unblock would result in said updates being blocked indefinitely and hanging. Found by inspection. Verified by adding a second notifier in KVM that periodically returns -EAGAIN on non-blockable ranges, triggering OOM, and observing that KVM exits with an elevated notifier count. Link: https://lkml.kernel.org/r/20210311180057.1582638-1-seanjc@google.com Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") Signed-off-by: Sean Christopherson <seanjc@google.com> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Cc: David Rientjes <rientjes@google.com> Cc: Ben Gardon <bgardon@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30dm table: Fix zoned model check and zone sectors checkShin'ichiro Kawasaki3-10/+40
[ Upstream commit 2d669ceb69c276f7637cf760287ca4187add082e ] Commit 24f6b6036c9e ("dm table: fix zoned iterate_devices based device capability checks") triggered dm table load failure when dm-zoned device is set up for zoned block devices and a regular device for cach