path: root/drivers/base
Age | Commit message | Author | Files | Lines
2021-11-02 | regmap: Fix possible double-free in regcache_rbtree_exit() | Yang Yingliang | 1 | -4/+3
commit 55e6d8037805b3400096d621091dfbf713f97e83 upstream.

In regcache_rbtree_insert_to_block(), when 'present' realloc failed, the 'blk' which is supposed to assign to 'rbnode->block' will be freed, so 'rbnode->block' points a freed memory, in the error handling path of regcache_rbtree_init(), 'rbnode->block' will be freed again in regcache_rbtree_exit(), KASAN will report double-free as follows:

BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
Call Trace:
 slab_free_freelist_hook+0x10d/0x240
 kfree+0xce/0x390
 regcache_rbtree_exit+0x15d/0x1a0
 regcache_rbtree_init+0x224/0x2c0
 regcache_init+0x88d/0x1310
 __regmap_init+0x3151/0x4a80
 __devm_regmap_init+0x7d/0x100
 madera_spi_probe+0x10f/0x333 [madera_spi]
 spi_probe+0x183/0x210
 really_probe+0x285/0xc30

To fix this, moving up the assignment of rbnode->block to immediately after the reallocation has succeeded so that the data structure stays valid even if the second reallocation fails.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3f4ff561bc88b ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012023735.1632786-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
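The ordering described in the commit above can be illustrated with a small user-space sketch. This is not the kernel code: `struct node`, `node_grow()`, `node_exit()` and the `fail_present_alloc` switch are hypothetical names, and plain `realloc()`/`free()` stand in for the kernel allocators. The point it tries to show is only that the owning pointer is updated as soon as the first reallocation succeeds, so a failure of the second one leaves nothing to free twice.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a cache node; not the kernel's structure. */
struct node {
    unsigned char *block;    /* register values */
    unsigned long *present;  /* presence bitmap */
};

/* Hypothetical switch that lets a caller force the second realloc to fail. */
static int fail_present_alloc;

static void *present_realloc(void *p, size_t n)
{
    return fail_present_alloc ? NULL : realloc(p, n);
}

/*
 * Grow both buffers. n->block is assigned immediately after its realloc
 * succeeds, because the old pointer may already have been released.
 */
int node_grow(struct node *n, size_t blk_size, size_t present_size)
{
    unsigned char *blk = realloc(n->block, blk_size);
    if (!blk)
        return -1;
    n->block = blk;

    unsigned long *present = present_realloc(n->present, present_size);
    if (!present)
        return -1;  /* n->block is still valid and still owned by n */
    n->present = present;
    return 0;
}

/* Single cleanup path: each buffer is freed exactly once. */
void node_exit(struct node *n)
{
    free(n->present);
    free(n->block);
    n->present = NULL;
    n->block = NULL;
}
```

With this ordering, an error path that calls `node_exit()` after a failed `node_grow()` frees each buffer once, which is the property the fix is after.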
2021-09-26 | PM / wakeirq: Fix unbalanced IRQ enable for wakeirq | Tony Lindgren | 1 | -2/+4
commit 69728051f5bf15efaf6edfbcfe1b5a49a2437918 upstream.

If a device is runtime PM suspended when we enter suspend and has a dedicated wake IRQ, we can get the following warning:

WARNING: CPU: 0 PID: 108 at kernel/irq/manage.c:526 enable_irq+0x40/0x94
[ 102.087860] Unbalanced enable for IRQ 147
...
(enable_irq) from [<c06117a8>] (dev_pm_arm_wake_irq+0x4c/0x60)
(dev_pm_arm_wake_irq) from [<c0618360>] (device_wakeup_arm_wake_irqs+0x58/0x9c)
(device_wakeup_arm_wake_irqs) from [<c0615948>] (dpm_suspend_noirq+0x10/0x48)
(dpm_suspend_noirq) from [<c01ac7ac>] (suspend_devices_and_enter+0x30c/0xf14)
(suspend_devices_and_enter) from [<c01adf20>] (enter_state+0xad4/0xbd8)
(enter_state) from [<c01ad3ec>] (pm_suspend+0x38/0x98)
(pm_suspend) from [<c01ab3e8>] (state_store+0x68/0xc8)

This is because the dedicated wake IRQ for the device may have been already enabled earlier by dev_pm_enable_wake_irq_check(). Fix the issue by checking for runtime PM suspended status.

This issue can be easily reproduced by setting serial console log level to zero, letting the serial console idle, and suspend the system from an ssh terminal. On resume, dmesg will have the warning above.

The reason why I have not run into this issue earlier has been that I typically run my PM test cases from on a serial console instead over ssh.

Fixes: c84345597558 (PM / wakeirq: Enable dedicated wakeirq for suspend)
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22 | regmap: fix the offset of register error log | Jeongtae Park | 1 | -1/+1
[ Upstream commit 1852f5ed358147095297a09cc3c6f160208a676d ] This patch fixes the offset of register error log by using regmap_get_offset(). Signed-off-by: Jeongtae Park <jeongtae.park@gmail.com> Link: https://lore.kernel.org/r/20210701142630.44936-1-jeongtae.park@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-22 | PM / wakeirq: Enable dedicated wakeirq for suspend | Grygorii Strashko | 1 | -2/+10
commit c84345597558349474f55be2b7d4093256e42884 upstream. We currently rely on runtime PM to enable dedicated wakeirq for suspend. This assumption fails in the following two cases: 1. If the consumer driver does not have runtime PM implemented, the dedicated wakeirq never gets enabled for suspend 2. If the consumer driver has runtime PM implemented, but does not idle in suspend Let's fix the issue by always enabling the dedicated wakeirq during suspend. Depends-on: bed570307ed7 (PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend) Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling) Reported-by: Keerthy <j-keerthy@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> [ tony@atomide.com: updated based on bed570307ed7, added description ] Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-26 | PCI/MSI: Protect msi_desc::masked for multi-MSI | Thomas Gleixner | 1 | -0/+1
commit 77e89afc25f30abd56e76a809ee2884d7c1b63ce upstream. Multi-MSI uses a single MSI descriptor and there is a single mask register when the device supports per vector masking. To avoid reading back the mask register the value is cached in the MSI descriptor and updates are done by clearing and setting bits in the cache and writing it to the device. But nothing protects msi_desc::masked and the mask register from being modified concurrently on two different CPUs for two different Linux interrupts which belong to the same multi-MSI descriptor. Add a lock to struct device and protect any operation on the mask and the mask register with it. This makes the update of msi_desc::masked unconditional, but there is no place which requires a modification of the hardware register without updating the masked cache. msi_mask_irq() is now an empty wrapper which will be cleaned up in follow up changes. The problem goes way back to the initial support of multi-MSI, but picking the commit which introduced the mask cache is a valid cut off point (2.6.30). Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-12 | Revert "device property: Keep secondary firmware node secondary by type" | Bard Liao | 1 | -1/+1
commit 47f4469970d8861bc06d2d4d45ac8200ff07c693 upstream. While commit d5dcce0c414f ("device property: Keep secondary firmware node secondary by type") describes everything correct in its commit message, the change it made does the opposite and original commit c15e1bdda436 ("device property: Fix the secondary firmware node handling in set_primary_fwnode()") was fully correct. Revert the former one here and improve documentation in the next patch. Fixes: d5dcce0c414f ("device property: Keep secondary firmware node secondary by type") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: 5.10+ <stable@vger.kernel.org> # 5.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-10 | device property: Don't clear secondary pointer for shared primary firmware node | Andy Shevchenko | 1 | -1/+3
commit 99aed9227073fb34ce2880cbc7063e04185a65e1 upstream. It appears that firmware nodes can be shared between devices. In such case when a (child) device is about to be deleted, its firmware node may be shared and ACPI_COMPANION_SET(..., NULL) call for it breaks the secondary link of the shared primary firmware node. In order to prevent that, check, if the device has a parent and parent's firmware node is shared with its child, and avoid crashing the link. Fixes: c15e1bdda436 ("device property: Fix the secondary firmware node handling in set_primary_fwnode()") Reported-by: Ferry Toth <fntoth@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Tested-by: Ferry Toth <fntoth@gmail.com> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-10 | device property: Keep secondary firmware node secondary by type | Andy Shevchenko | 1 | -1/+1
commit d5dcce0c414fcbfe4c2037b66ac69ea5f9b3f75c upstream.

Behind primary and secondary we understand the type of the nodes which might define their ordering. However, if primary node gone, we can't maintain the ordering by definition of the linked list. Thus, by ordering secondary node becomes first in the list. But in this case the meaning of it is still secondary (or auxiliary). The type of the node is maintained by the secondary pointer in it:

  secondary pointer    Meaning
  NULL or valid        primary node
  ERR_PTR(-ENODEV)     secondary node

So, if by some reason we do the following sequence of calls

  set_primary_fwnode(dev, NULL);
  set_primary_fwnode(dev, primary);

we should preserve secondary node.

This concept is supported by the description of set_primary_fwnode() along with implementation of set_secondary_fwnode(). Hence, fix the commit c15e1bdda436 to follow this as well.

Fixes: c15e1bdda436 ("device property: Fix the secondary firmware node handling in set_primary_fwnode()")
Cc: Ferry Toth <fntoth@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-14 | driver core: Fix probe_count imbalance in really_probe() | Tetsuo Handa | 1 | -2/+3
commit b292b50b0efcc7095d8bf15505fba6909bb35dce upstream. syzbot is reporting hung task in wait_for_device_probe() [1]. At least, we always need to decrement probe_count if we incremented probe_count in really_probe(). However, since I can't find "Resources present before probing" message in the console log, both "this message simply flowed off" and "syzbot is not hitting this path" will be possible. Therefore, while we are at it, let's also prepare for concurrent wait_for_device_probe() calls by replacing wake_up() with wake_up_all(). [1] https://syzkaller.appspot.com/bug?id=25c833f1983c9c1d512f4ff860dd0d7f5a2e2c0f Reported-by: syzbot <syzbot+805f5f6ae37411f15b64@syzkaller.appspotmail.com> Fixes: 7c35e699c88bd607 ("driver core: Print device when resources present in really_probe()") Cc: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20200713021254.3444-1-penguin-kernel@I-love.SAKURA.ne.jp [iwamatsu: Drop patch for deferred_probe_timeout_work_func()] Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-03 | device property: Fix the secondary firmware node handling in set_primary_fwnode() | Heikki Krogerus | 1 | -4/+8
commit c15e1bdda4365a5f17cdadf22bf1c1df13884a9e upstream. When the primary firmware node pointer is removed from a device (set to NULL) the secondary firmware node pointer, when it exists, is made the primary node for the device. However, the secondary firmware node pointer of the original primary firmware node is never cleared (set to NULL). To avoid situation where the secondary firmware node pointer is pointing to a non-existing object, clearing it properly when the primary node is removed from a device in set_primary_fwnode(). Fixes: 97badf873ab6 ("device property: Make it possible to use secondary firmware nodes") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-03 | PM: sleep: core: Fix the handling of pending runtime resume requests | Rafael J. Wysocki | 1 | -6/+10
commit e3eb6e8fba65094328b8dca635d00de74ba75b45 upstream. It has been reported that system-wide suspend may be aborted in the absence of any wakeup events due to unforeseen interactions of it with the runtime PM framework. One failing scenario is when there are multiple devices sharing an ACPI power resource and runtime-resume needs to be carried out for one of them during system-wide suspend (for example, because it needs to be reconfigured before the whole system goes to sleep). In that case, the runtime-resume of that device involves turning the ACPI power resource "on" which in turn causes runtime-resume requests to be queued up for all of the other devices sharing it. Those requests go to the runtime PM workqueue which is frozen during system-wide suspend, so they are not actually taken care of until the resume of the whole system, but the pm_runtime_barrier() call in __device_suspend() sees them and triggers system wakeup events for them which then cause the system-wide suspend to be aborted if wakeup source objects are in active use. Of course, the logic that leads to triggering those wakeup events is questionable in the first place, because clearly there are cases in which a pending runtime resume request for a device is not connected to any real wakeup events in any way (like the one above). Moreover, it is racy, because the device may be resuming already by the time the pm_runtime_barrier() runs and so if the driver doesn't take care of signaling the wakeup event as appropriate, it will be lost. However, if the driver does take care of that, the extra pm_wakeup_event() call in the core is redundant. Accordingly, drop the conditional pm_wakeup_event() call from __device_suspend() and make the latter call pm_runtime_barrier() alone. Also modify the comment next to that call to reflect the new code and extend it to mention the need to avoid unwanted interactions between runtime PM and system-wide device suspend callbacks.
Fixes: 1e2ef05bb8cf8 ("PM: Limit race conditions between runtime PM and system sleep (v2)") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com> Tested-by: Utkarsh H Patel <utkarsh.h.patel@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31 | regmap: debugfs: check count when read regmap file | Peng Fan | 1 | -0/+6
commit 74edd08a4fbf51d65fd8f4c7d8289cd0f392bd91 upstream. When executing the following command, we met kernel dump. dmesg -c > /dev/null; cd /sys; for i in `ls /sys/kernel/debug/regmap/* -d`; do echo "Checking regmap in $i"; cat $i/registers; done && grep -ri "0x02d0" *; It is because the count value is too big, and kmalloc fails. So add an upper bound check to allow max size `PAGE_SIZE << (MAX_ORDER - 1)`. Signed-off-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/1584064687-12964-1-git-send-email-peng.fan@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
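The upper-bound check mentioned in the commit above might look roughly like the following user-space sketch. The constants are assumptions chosen for illustration (a 4 KiB page and an order limit of 11), the `SKETCH_` names and `clamp_read_count()` are hypothetical, and clamping (rather than rejecting) an oversized request is also an assumption of this sketch, since the message only says a maximum size is enforced.

```c
#include <stddef.h>

/* Assumed values for illustration only; real values depend on the kernel config. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_MAX_ORDER 11
#define SKETCH_MAX_ALLOC (SKETCH_PAGE_SIZE << (SKETCH_MAX_ORDER - 1))

/* Limit a user-supplied read size before it is used as an allocation size. */
size_t clamp_read_count(size_t count)
{
    return count > SKETCH_MAX_ALLOC ? SKETCH_MAX_ALLOC : count;
}
```

Bounding the size before allocating keeps an arbitrarily large read request from turning into an allocation that is bound to fail.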
2020-07-31 | regmap: dev_get_regmap_match(): fix string comparison | Marc Kleine-Budde | 1 | -1/+1
[ Upstream commit e84861fec32dee8a2e62bbaa52cded6b05a2a456 ] This function is used by dev_get_regmap() to retrieve a regmap for the specified device. If the device has more than one regmap, the name parameter can be used to specify one. The code here uses a pointer comparison to check for equal strings. This however will probably always fail, as the regmap->name is allocated via kstrdup_const() from the regmap's config->name. Fix this by using strcmp() instead. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20200703103315.267996-1-mkl@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
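The difference between the two comparisons in the commit above can be seen in a short user-space sketch. `struct map`, `map_init()` and the two `match_*` helpers are hypothetical names; a `malloc()` plus `memcpy()` copy stands in for the duplicated name string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object that stores its own copy of a name. */
struct map {
    char *name;
};

/* Duplicate the caller's string, as a stand-in for a kernel string dup. */
int map_init(struct map *m, const char *name)
{
    size_t len = strlen(name) + 1;

    m->name = malloc(len);
    if (!m->name)
        return -1;
    memcpy(m->name, name, len);
    return 0;
}

/* Buggy form: compares addresses, so a separate copy never matches. */
int match_by_pointer(const struct map *m, const char *name)
{
    return m->name == name;
}

/* Fixed form: compares the characters. */
int match_by_content(const struct map *m, const char *name)
{
    return strcmp(m->name, name) == 0;
}
```

Because the stored name lives in its own allocation, only the content comparison can find a map by the name the caller passes in.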
2020-06-30 | drivers: base: Fix NULL pointer exception in __platform_driver_probe() if a driver developer is foolish | Kuppuswamy Sathyanarayanan | 1 | -0/+2
[ Upstream commit 388bcc6ecc609fca1b4920de7dc3806c98ec535e ]

If platform bus driver registration is failed then, accessing platform bus spin lock (&drv->driver.bus->p->klist_drivers.k_lock) in __platform_driver_probe() without verifying the return value __platform_driver_register() can lead to NULL pointer exception. So check the return value before attempting the spin lock.

One such example is below: For a custom usecase, I have intentionally failed the platform bus registration and I expected all the platform device/driver registrations to fail gracefully. But I came across this panic issue.

[ 1.331067] BUG: kernel NULL pointer dereference, address: 00000000000000c8
[ 1.331118] #PF: supervisor write access in kernel mode
[ 1.331163] #PF: error_code(0x0002) - not-present page
[ 1.331208] PGD 0 P4D 0
[ 1.331233] Oops: 0002 [#1] PREEMPT SMP
[ 1.331268] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G W 5.6.0-00049-g670d35fb0144 #165
[ 1.331341] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 1.331406] RIP: 0010:_raw_spin_lock+0x15/0x30
[ 1.331588] RSP: 0000:ffffc9000001be70 EFLAGS: 00010246
[ 1.331632] RAX: 0000000000000000 RBX: 00000000000000c8 RCX: 0000000000000001
[ 1.331696] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000000
[ 1.331754] RBP: 00000000ffffffed R08: 0000000000000501 R09: 0000000000000001
[ 1.331817] R10: ffff88817abcc520 R11: 0000000000000670 R12: 00000000ffffffed
[ 1.331881] R13: ffffffff82dbc268 R14: ffffffff832f070a R15: 0000000000000000
[ 1.331945] FS: 0000000000000000(0000) GS:ffff88817bd80000(0000) knlGS:0000000000000000
[ 1.332008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.332062] CR2: 00000000000000c8 CR3: 000000000681e001 CR4: 00000000003606e0
[ 1.332126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.332189] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.332252] Call Trace:
[ 1.332281] __platform_driver_probe+0x92/0xee
[ 1.332323] ? rtc_dev_init+0x2b/0x2b
[ 1.332358] cmos_init+0x37/0x67
[ 1.332396] do_one_initcall+0x7d/0x168
[ 1.332428] kernel_init_freeable+0x16c/0x1c9
[ 1.332473] ? rest_init+0xc0/0xc0
[ 1.332508] kernel_init+0x5/0x100
[ 1.332543] ret_from_fork+0x1f/0x30
[ 1.332579] CR2: 00000000000000c8
[ 1.332616] ---[ end trace 3bd87f12e9010b87 ]---
[ 1.333549] note: swapper/0[1] exited with preempt_count 1
[ 1.333592] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 1.333736] Kernel Offset: disabled

Note, this can only be triggered if a driver errors out from this call, which should never happen. If it does, the driver needs to be fixed.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20200408214003.3356-1-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-11 | x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation | Mark Gross | 1 | -0/+8
commit 7e5b3c267d256822407a22fdce6afdf9cd13f9fb upstream SRBDS is an MDS-like speculative side channel that can leak bits from the random number generator (RNG) across cores and threads. New microcode serializes the processor access during the execution of RDRAND and RDSEED. This ensures that the shared buffer is overwritten before it is released for reuse. While it is present on all affected CPU models, the microcode mitigation is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the cases where TSX is not supported or has been disabled with TSX_CTRL. The mitigation is activated by default on affected processors and it increases latency for RDRAND and RDSEED instructions. Among other effects this will reduce throughput from /dev/urandom. * Enable administrator to configure the mitigation off when desired using either mitigations=off or srbds=off. * Export vulnerability status via sysfs * Rename file-scoped macros to apply for non-whitelist table initializations. [ bp: Massage, - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g, - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in, - flip check in cpu_set_bug_bits() to save an indentation level, - reflow comments. jpoimboe: s/Mitigated/Mitigation/ in user-visible strings tglx: Dropped the fused off magic for now ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27 | component: Silence bind error on -EPROBE_DEFER | James Hilliard | 1 | -3/+5
[ Upstream commit 7706b0a76a9697021e2bf395f3f065c18f51043d ] If a component fails to bind due to -EPROBE_DEFER we should not log an error as this is not a real failure. Fixes messages like: vc4-drm soc:gpu: failed to bind 3f902000.hdmi (ops vc4_hdmi_ops): -517 vc4-drm soc:gpu: master bind failed: -517 Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Link: https://lore.kernel.org/r/20200411190241.89404-1-james.hilliard1@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 | driver core: platform: fix u32 greater or equal to zero comparison | Colin Ian King | 1 | -1/+1
[ Upstream commit 0707cfa5c3ef58effb143db9db6d6e20503f9dec ] Currently the check that a u32 variable i is >= 0 is always true because the unsigned variable will never be negative, causing the loop to run forever. Fix this by changing the pre-decrement check to a zero check on i followed by a decrement of i. Addresses-Coverity: ("Unsigned compared against 0") Fixes: 39cc539f90d0 ("driver core: platform: Prevent resouce overflow from causing infinite loops") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20200116175758.88396-1-colin.king@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
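The loop shape described in the commit above can be sketched in plain C as follows. `unwind()` and its output array are hypothetical and exist only to make the visiting order observable; the sketch assumes the loop walks back over entries `i-1` down to `0`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk back over entries i-1 .. 0 with an unsigned counter.
 * A condition such as (--i >= 0) can never be false for a uint32_t,
 * so the test is a zero check on i followed by the decrement.
 * Visited indices are recorded in out[]; the count is returned.
 */
size_t unwind(uint32_t i, uint32_t *out)
{
    size_t n = 0;

    while (i--)
        out[n++] = i;
    return n;
}
```

Calling `unwind(3, out)` visits indices 2, 1 and 0 and then stops, and `unwind(0, out)` visits nothing.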
2020-02-28 | driver core: Print device when resources present in really_probe() | Geert Uytterhoeven | 1 | -1/+4
[ Upstream commit 7c35e699c88bd60734277b26962783c60e04b494 ] If a device already has devres items attached before probing, a warning backtrace is printed. However, this backtrace does not reveal the offending device, leaving the user uninformed. Furthermore, using WARN_ON() causes systems with panic-on-warn to reboot. Fix this by replacing the WARN_ON() by a dev_crit() message. Abort probing the device, to prevent doing more damage to the device's resources. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20191206132219.28908-1-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28 | driver core: platform: Prevent resouce overflow from causing infinite loops | Simon Schwartz | 1 | -4/+6
[ Upstream commit 39cc539f90d035a293240c9443af50be55ee81b8 ] num_resources in the platform_device struct is declared as a u32. The for loops that iterate over num_resources use an int as the counter, which can cause infinite loops on architectures with smaller ints. Change the loop counters to u32. Signed-off-by: Simon Schwartz <kern.simon@theschwartz.xyz> Link: https://lore.kernel.org/r/2201ce63a2a171ffd2ed14e867875316efcf71db.camel@theschwartz.xyz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 | drivers/base/platform.c: kmemleak ignore a known leak | Qian Cai | 1 | -0/+3
[ Upstream commit 967d3010df8b6f6f9aa95c198edc5fe3646ebf36 ]

unreferenced object 0xffff808ec6dc5a80 (size 128):
  comm "swapper/0", pid 1, jiffies 4294938063 (age 2560.530s)
  hex dump (first 32 bytes):
    ff ff ff ff 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b  ........kkkkkkkk
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:
    [<00000000476dcf8c>] kmem_cache_alloc_trace+0x430/0x500
    [<000000004f708d37>] platform_device_register_full+0xbc/0x1e8
    [<000000006c2a7ec7>] acpi_create_platform_device+0x370/0x450
    [<00000000ef135642>] acpi_default_enumeration+0x34/0x78
    [<000000003bd9a052>] acpi_bus_attach+0x2dc/0x3e0
    [<000000003cf4f7f2>] acpi_bus_attach+0x108/0x3e0
    [<000000003cf4f7f2>] acpi_bus_attach+0x108/0x3e0
    [<000000002968643e>] acpi_bus_scan+0xb0/0x110
    [<0000000010dd0bd7>] acpi_scan_init+0x1a8/0x410
    [<00000000965b3c5a>] acpi_init+0x408/0x49c
    [<00000000ed4b9fe2>] do_one_initcall+0x178/0x7f4
    [<00000000a5ac5a74>] kernel_init_freeable+0x9d4/0xa9c
    [<0000000070ea6c15>] kernel_init+0x18/0x138
    [<00000000fb8fff06>] ret_from_fork+0x10/0x1c
    [<0000000041273a0d>] 0xffffffffffffffff

Then, faddr2line pointed out this line,

/*
 * This memory isn't freed when the device is put,
 * I don't have a nice idea for that though. Conceptually
 * dma_mask in struct device should not be a pointer.
 * See http://thread.gmane.org/gmane.linux.kernel.pci/9081
 */
pdev->dev.dma_mask = kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL);

Since this leak has existed for more than 8 years and it does not reference other parts of the memory, let kmemleak ignore it, so users don't need to waste time reporting this in the future.

Link: http://lkml.kernel.org/r/20181206160751.36211-1-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28 | mm/memory_hotplug: Do not unlock when fails to take the device_hotplug_lock | zhong jiang | 1 | -1/+1
[ Upstream commit d2ab99403ee00d8014e651728a4702ea1ae5e52c ] When adding the memory by probing memory block in sysfs interface, there is an obvious issue that we will unlock the device_hotplug_lock when fails to takes it. That issue was introduced in Commit 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock") We should drop out in time when fails to take the device_hotplug_lock. Fixes: 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock") Reported-by: Yang yingliang <yangyingliang@huawei.com> Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28 | mm/memory_hotplug: make add_memory() take the device_hotplug_lock | David Hildenbrand | 1 | -2/+7
[ Upstream commit 8df1d0e4a265f25dc1e7e7624ccdbcb4a6630c89 ] add_memory() currently does not take the device_hotplug_lock, however is already called under the lock from arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c to synchronize against CPU hot-remove and similar. In general, we should hold the device_hotplug_lock when adding memory to synchronize against online/offline request (e.g. from user space) - which already resulted in lock inversions due to device_lock() and mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory hot-add deadlock"). add_memory()/add_memory_resource() will create memory block devices, so this really feels like the right thing to do. Holding the device_hotplug_lock makes sure that a memory block device can really only be accessed (e.g. via .online/.state) from user space, once the memory has been fully added to the system. The lock is not held yet in drivers/xen/balloon.c arch/powerpc/platforms/powernv/memtrace.c drivers/s390/char/sclp_cmd.c drivers/hv/hv_balloon.c So, let's either use the locked variants or take the lock. Don't export add_memory_resource(), as it once was exported to be used by XEN, which is never built as a module. If somebody requires it, we also have to export a locked variant (as device_hotplug_lock is never exported). Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mathieu Malaterre <malat@debian.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25 | component: fix loop condition to call unbind() if bind() fails | Banajit Goswami | 1 | -3/+3
[ Upstream commit bdae566d5d9733b6e32b378668b84eadf28a94d4 ] During component_bind_all(), if bind() fails for any particular component associated with a master, unbind() should be called for all previous components in that master's match array, whose bind() might have completed successfully. As per the current logic, if bind() fails for the component at position 'n' in the master's match array, it would start calling unbind() from component in 'n'th position itself and work backwards, and will always skip calling unbind() for component in 0th position in the master's match array. Fix this by updating the loop condition, and the logic to refer to the components in master's match array, so that unbind() is called for all components starting from 'n-1'st position in the array, until (and including) component in 0th position. Signed-off-by: Banajit Goswami <bgoswami@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
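The unwind order described in the commit above can be sketched with a simplified model. `struct sketch_component` and `bind_all()` are hypothetical and do not reflect the kernel's component structures; the sketch only shows that when bind fails at position n, the components at positions n-1 down to 0 (inclusive) are unbound and the failing one is left alone.

```c
#include <stddef.h>

/* Hypothetical component model for illustration. */
struct sketch_component {
    int fail_bind;  /* set to make bind fail for this entry */
    int bound;      /* 1 while bound */
    int unbinds;    /* number of times unbind ran */
};

/* Bind every component; on failure, unbind the ones already bound. */
int bind_all(struct sketch_component *c, size_t num)
{
    size_t i;

    for (i = 0; i < num; i++) {
        if (c[i].fail_bind)
            break;
        c[i].bound = 1;
    }
    if (i == num)
        return 0;

    /* i is the failing position: undo i-1 .. 0, including entry 0. */
    while (i--) {
        c[i].bound = 0;
        c[i].unbinds++;
    }
    return -1;
}
```

The failing component never completed its bind, so it is skipped, while the component at position 0 is no longer missed by the loop.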
2019-11-16 | x86/bugs: Add ITLB_MULTIHIT bug infrastructure | Vineela Tummalapalli | 1 | -0/+8
commit db4d30fbb71b47e4ecb11c4efa5d8aad4b03dfae upstream. Some processors may incur a machine check error possibly resulting in an unrecoverable CPU lockup when an instruction fetch encounters a TLB multi-hit in the instruction TLB. This can occur when the page size is changed along with either the physical address or cache type. The relevant erratum can be found here: https://bugzilla.kernel.org/show_bug.cgi?id=205195 There are other processors affected for which the erratum does not fully disclose the impact. This issue affects both bare-metal x86 page tables and EPT. It can be mitigated by either eliminating the use of large pages or by using careful TLB invalidations when changing the page size in the page tables. Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which are mitigated against this issue. Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com> Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 4.9: - No support for X86_VENDOR_HYGON, ATOM_AIRMONT_NP - Adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-16 | x86/speculation/taa: Add sysfs reporting for TSX Async Abort | Pawan Gupta | 1 | -0/+9
commit 6608b45ac5ecb56f9e171252229c39580cc85f0f upstream. Add the sysfs reporting file for TSX Async Abort. It exposes the vulnerability and the mitigation state similar to the existing files for the other hardware vulnerabilities. Sysfs file path is: /sys/devices/system/cpu/vulnerabilities/tsx_async_abort Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Reviewed-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
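From userspace, the reporting file can be read like any other sysfs attribute. The following is a minimal sketch; `read_vuln_state()` is a hypothetical helper, and the exact strings the file contains are not specified here, so the sketch only returns the first line as-is.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of a sysfs attribute into buf.
 * Returns 0 on success, -1 if the file is missing or unreadable (for
 * example on a kernel without this reporting file).
 */
static int read_vuln_state(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(path != NULL && buf != NULL);

    if (len == 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Strip the trailing newline, if any. */
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A caller would pass "/sys/devices/system/cpu/vulnerabilities/tsx_async_abort" as the path and treat a -1 return as "state not reported by this kernel".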
2019-10-29 | cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown | Rafael J. Wysocki | 1 | -0/+3
commit 65650b35133ff20f0c9ef0abd5c3c66dbce3ae57 upstream. It is incorrect to set the cpufreq syscore shutdown callback pointer to cpufreq_suspend(), because that function cannot be run in the syscore stage of system shutdown for two reasons: (a) it may attempt to carry out actions depending on devices that have already been shut down at that point and (b) the RCU synchronization carried out by it may not be able to make progress then. The latter issue has been present since commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"), but the former one has been there since commit 90de2a4aa9f3 ("cpufreq: suspend cpufreq governors on shutdown") regardless. Fix that by dropping cpufreq_syscore_ops altogether and making device_shutdown() call cpufreq_suspend() directly before shutting down devices, which is along the lines of what system-wide power management does. Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds") Fixes: 90de2a4aa9f3 ("cpufreq: suspend cpufreq governors on shutdown") Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.0+ <stable@vger.kernel.org> # 4.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05 | base: soc: Export soc_device_register/unregister APIs | Vinod Koul | 1 | -0/+2
[ Upstream commit f7ccc7a397cf2ef64aebb2f726970b93203858d2 ] Qcom Socinfo driver can be built as a module, so export these two APIs. Tested-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Vaishali Thakkar <vaishali.thakkar@linaro.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21 | driver core: Fix use-after-free and double free on glue directory | Muchun Song | 1 | -1/+52
commit ac43432cb1f5c2950408534987e57c2071e24d8f upstream.

There is a race condition between removing the glue directory and adding a new device under the glue dir. It can be reproduced with the following test:

CPU1:                                         CPU2:

device_add()
  get_device_parent()
    class_dir_create_and_add()
      kobject_add_internal()
        create_dir()    // create glue_dir

                                              device_add()
                                                get_device_parent()
                                                  kobject_get() // get glue_dir

device_del()
  cleanup_glue_dir()
    kobject_del(glue_dir)

                                              kobject_add()
                                                kobject_add_internal()
                                                  create_dir() // in glue_dir
                                                    sysfs_create_dir_ns()
                                                      kernfs_create_dir_ns(sd)

      sysfs_remove_dir() // glue_dir->sd=NULL
      sysfs_put()        // free glue_dir->sd

                                                        // sd is freed
                                                        kernfs_new_node(sd)
                                                          kernfs_get(glue_dir)
                                                          kernfs_add_one()
                                                          kernfs_put()

If CPU2 adds a new device under the glue dir before CPU1 removes the last child device under it, the glue_dir kobject reference count is increased to 2 via kobject_get() in get_device_parent(). At this point CPU2 has called kernfs_create_dir_ns() but has not yet called kernfs_new_node(). Meanwhile, CPU1 calls sysfs_remove_dir() and sysfs_put(). As a result, glue_dir->sd is freed and its reference count drops to 0. When CPU2 then calls kernfs_get(glue_dir), it triggers a warning in kernfs_get() and increases the reference count to 1. Because glue_dir->sd was freed by CPU1, the next call to kernfs_add_one() by CPU2 fails (this is also a use-after-free) and calls kernfs_put() to decrease the reference count. Because the reference count is decremented to 0, it also calls kmem_cache_free() to free glue_dir->sd again, which results in a double free. To fix this race, make sure that the kernfs_node for glue_dir is released on CPU1 only when the refcount of the glue_dir kobject is 1.
The following calltrace was captured on kernel 4.14 with the following patch applied:

commit 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")

--------------------------------------------------------------------------
[ 3.633703] WARNING: CPU: 4 PID: 513 at .../fs/kernfs/dir.c:494
            Here is WARN_ON(!atomic_read(&kn->count)) in kernfs_get().
....
[ 3.633986] Call trace:
[ 3.633991] kernfs_create_dir_ns+0xa8/0xb0
[ 3.633994] sysfs_create_dir_ns+0x54/0xe8
[ 3.634001] kobject_add_internal+0x22c/0x3f0
[ 3.634005] kobject_add+0xe4/0x118
[ 3.634011] device_add+0x200/0x870
[ 3.634017] _request_firmware+0x958/0xc38
[ 3.634020] request_firmware_into_buf+0x4c/0x70
....
[ 3.634064] kernel BUG at .../mm/slub.c:294!
            Here is BUG_ON(object == fp) in set_freepointer().
....
[ 3.634346] Call trace:
[ 3.634351] kmem_cache_free+0x504/0x6b8
[ 3.634355] kernfs_put+0x14c/0x1d8
[ 3.634359] kernfs_create_dir_ns+0x88/0xb0
[ 3.634362] sysfs_create_dir_ns+0x54/0xe8
[ 3.634366] kobject_add_internal+0x22c/0x3f0
[ 3.634370] kobject_add+0xe4/0x118
[ 3.634374] device_add+0x200/0x870
[ 3.634378] _request_firmware+0x958/0xc38
[ 3.634381] request_firmware_into_buf+0x4c/0x70
--------------------------------------------------------------------------

Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier") Signed-off-by: Muchun Song <smuchun@gmail.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Prateek Sood <prsood@codeaurora.org> Link: https://lore.kernel.org/r/20190727032122.24639-1-smuchun@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
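The rule this commit describes (remove the glue dir from sysfs only when the caller holds the last kobject reference) can be modelled in a few lines. This is a userspace sketch under stated assumptions, not the driver-core code: `fake_glue_dir` and `cleanup_glue_dir_sketch()` are hypothetical, and the locking that the real code relies on is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a glue directory kobject. */
struct fake_glue_dir {
    unsigned int refcount; /* kobject references currently held */
    unsigned int children; /* child devices under the glue dir */
    bool in_sysfs;         /* true while the sysfs entry exists */
};

/*
 * Drop the caller's reference. The sysfs entry is removed only if the
 * directory is empty AND the caller holds the last reference, so a
 * concurrent adder that already took a reference keeps a valid entry.
 * Returns true if the entry was removed from sysfs.
 */
static bool cleanup_glue_dir_sketch(struct fake_glue_dir *g)
{
    assert(g != NULL && g->refcount > 0);

    if (g->children == 0 && g->refcount == 1)
        g->in_sysfs = false;

    g->refcount--;
    return !g->in_sysfs;
}
```

In the racing scenario from the diagram, CPU2's kobject_get() has already raised the count to 2, so CPU1's cleanup leaves the sysfs entry in place.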
2019-08-04 | regmap: fix bulk writes on paged registers | Srinivas Kandagatla | 1 | -0/+2
[ Upstream commit db057679de3e9e6a03c1bcd5aee09b0d25fd9f5b ] On buses like SlimBus and SoundWire, which do not yet support gather_writes in regmap, a bulk write to a paged register would be silently ignored after the page was programmed. This is because the local variable 'ret' in regmap_raw_write_impl() is reset to 0 once the page register is written successfully, and the code below checks for 'ret' to be -ENOTSUPP before linearising the write buffer to send to bus->write(). Fix this by resetting 'ret' to -ENOTSUPP in cases where gather_writes() is not supported or a single register write is not possible. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
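The control-flow problem is easier to see in a stripped-down model. The sketch below is not the regmap source: `fake_bus`, `select_page_sketch()` and `raw_write_sketch()` are hypothetical, `SKETCH_ENOTSUPP` stands in for the kernel-internal -ENOTSUPP value, and the `apply_fix` flag exists only so both behaviours can be compared.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ENOTSUPP 524 /* local stand-in for the kernel's ENOTSUPP */

/* Hypothetical bus model that just counts which write path was taken. */
struct fake_bus {
    int has_gather_write;
    int direct_writes;
    int gather_writes;
    int linear_writes;
};

static int select_page_sketch(void)
{
    return 0; /* page register written successfully */
}

static int raw_write_sketch(struct fake_bus *bus, int paged,
                            int single_reg, int apply_fix)
{
    int ret = -SKETCH_ENOTSUPP;

    assert(bus != NULL);

    if (paged) {
        ret = select_page_sketch(); /* success overwrites ret with 0 */
        if (ret != 0)
            return ret;
    }

    if (single_reg) {
        bus->direct_writes++;
        ret = 0;
    } else if (bus->has_gather_write) {
        bus->gather_writes++;
        ret = 0;
    } else if (apply_fix) {
        ret = -SKETCH_ENOTSUPP; /* the fix: restore the sentinel */
    }

    /* Fall back to linearising the buffer only if nothing else worked. */
    if (ret == -SKETCH_ENOTSUPP) {
        bus->linear_writes++;
        ret = 0;
    }

    return ret;
}
```

Without the fix, a paged bulk write on a bus lacking gather_write returns 0 while no write path was taken at all, which matches the "silently ignored" symptom in the commit message.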
2019-05-31 | PM / core: Propagate dev->power.wakeup_path when no callbacks | Ulf Hansson | 1 | -0/+4
[ Upstream commit dc351d4c5f4fe4d0f274d6d660227be0c3a03317 ] The dev->power.direct_complete flag may become set in device_prepare() if the device doesn't have any PM callbacks (dev->power.no_pm_callbacks is set). This leads to broken behaviour when a child has wakeup enabled and relies on its parent being used in the wakeup path. More precisely, when the direct complete path is selected for the child in __device_suspend(), the propagation of dev->power.wakeup_path is skipped as well. Address this problem by checking whether the device is part of the wakeup path or has wakeup enabled, and if so, preventing the direct complete path from being used. Reported-by: Loic Pallardy <loic.pallardy@st.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [ rjw: Comment cleanup ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
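A small model makes the propagation issue concrete. This is a hedged sketch, not the PM core: `fake_dev`, `prepare_sketch()` and `suspend_sketch()` are hypothetical, and the `apply_fix` parameter exists only to contrast the old and new behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the per-device PM state discussed above. */
struct fake_dev {
    bool no_pm_callbacks;
    bool may_wakeup;
    bool wakeup_path;
    bool direct_complete;
};

/* Prepare phase: a device without PM callbacks opts into direct complete. */
static void prepare_sketch(struct fake_dev *dev)
{
    assert(dev != NULL);
    dev->direct_complete = dev->no_pm_callbacks;
}

/* Suspend phase for one device; parent may be NULL. */
static void suspend_sketch(struct fake_dev *dev, struct fake_dev *parent,
                           bool apply_fix)
{
    assert(dev != NULL);

    /* The fix: never take the direct complete path for wakeup devices. */
    if (apply_fix && (dev->may_wakeup || dev->wakeup_path))
        dev->direct_complete = false;

    if (dev->direct_complete)
        return; /* skips everything below, including propagation */

    if (parent && (dev->may_wakeup || dev->wakeup_path))
        parent->wakeup_path = true;
}
```

With the check in place, a wakeup-enabled child that has no PM callbacks still marks its parent as part of the wakeup path.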
2019-05-14 | x86/speculation/mds: Add sysfs reporting for MDS | Thomas Gleixner | 1 | -0/+8
commit 8a4b06d391b0a42a373808979b5028f5c84d9c6a upstream. Add the sysfs reporting file for MDS. It exposes the vulnerability and mitigation state similar to the existing files for the other speculative hardware vulnerabilities. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com> [bwh: Backported to 4.9: test x86_hyper instead of using hypervisor_is_type()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23 | PM / wakeup: Rework wakeup source timer cancellation | Viresh Kumar | 1 | -1/+7
commit 1fad17fb1bbcd73159c2b992668a6957ecc5af8a upstream. If wakeup_source_add() is called right after wakeup_source_remove() for the same wakeup source, timer_setup() may be called for a potentially scheduled timer which is incorrect. To avoid that, move the wakeup source timer cancellation from wakeup_source_drop() to wakeup_source_remove(). Moreover, make wakeup_source_remove() clear the timer function after canceling the timer to let wakeup_source_not_registered() treat unregistered wakeup sources in the same way as the ones that have never been registered. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.4+ <stable@vger.kernel.org> # 4.4+ [ rjw: Subject, changelog, merged two patches together ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-06 | drivers: core: Remove glue dirs from sysfs earlier | Benjamin Herrenschmidt | 1 | -0/+2
commit 726e41097920a73e4c7c33385dcc0debb1281e18 upstream. For devices with a class, we create a "glue" directory between the parent device and the new device with the class name. This directory is never "explicitly" removed when empty, however; this is left to the implicit sysfs removal done by kobject_release() when the object loses its last reference via kobject_put(). This is problematic because as long as it has not been removed from sysfs, it is still present in the class kset and in the sysfs directory structure. The presence in the class kset exposes a use-after-free bug fixed by the previous patch, but the presence in sysfs means that until the kobject is released, which can take a while (especially with kobject debugging), any attempt at re-creating it, such as binding a new device for that class/parent pair, will result in a sysfs duplicate file name error. This fixes it by instead doing an explicit kobject_del() when the glue dir is empty, by keeping track of the number of child devices of the glue dir. This is made easy by the fact that all glue dir operations are done with a global mutex, and there's already a function (cleanup_glue_dir) called in all the right places taking that mutex that can be enhanced for this. It appears that this was in fact the intent of the function, but the implementation was wrong. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zubin Mithra <zsm@chromium.org> Cc: Guenter Roeck <groeck@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-26 | sysfs: Disable lockdep for driver bind/unbind files | Daniel Vetter | 1 | -2/+5
[ Upstream commit 4f4b374332ec0ae9c738ff8ec9bed5cd97ff9adc ] This is the much more correct fix for my earlier attempt at: https://lkml.org/lkml/2018/12/10/118

Short recap:

- There's not actually a locking issue, it's just lockdep being a bit too eager to complain about a possible deadlock.
- Contrary to what I claimed the real problem is recursion on kn->count. Greg pointed me at sysfs_break_active_protection(), used by the scsi subsystem to allow a sysfs file to unbind itself. That would be a real deadlock, which isn't what's happening here. Also, breaking the active protection means we'd need to manually handle all the lifetime fun.
- With Rafael we discussed the task_work approach, which kinda works, but has two downsides: It's a functional change for a lockdep annotation issue, and it won't work for the bind file (which needs to get the errno from the driver load function back to userspace).
- Greg also asked why this never showed up: To hit this you need to unregister a 2nd driver from the unload code of