Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 55e6d8037805b3400096d621091dfbf713f97e83 upstream.
In regcache_rbtree_insert_to_block(), when 'present' realloc failed,
the 'blk' which is supposed to assign to 'rbnode->block' will be freed,
so 'rbnode->block' points a freed memory, in the error handling path of
regcache_rbtree_init(), 'rbnode->block' will be freed again in
regcache_rbtree_exit(), KASAN will report double-free as follows:
BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
Call Trace:
slab_free_freelist_hook+0x10d/0x240
kfree+0xce/0x390
regcache_rbtree_exit+0x15d/0x1a0
regcache_rbtree_init+0x224/0x2c0
regcache_init+0x88d/0x1310
__regmap_init+0x3151/0x4a80
__devm_regmap_init+0x7d/0x100
madera_spi_probe+0x10f/0x333 [madera_spi]
spi_probe+0x183/0x210
really_probe+0x285/0xc30
To fix this, moving up the assignment of rbnode->block to immediately after
the reallocation has succeeded so that the data structure stays valid even
if the second reallocation fails.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3f4ff561bc88b ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012023735.1632786-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 69728051f5bf15efaf6edfbcfe1b5a49a2437918 upstream.
If a device is runtime PM suspended when we enter suspend and has
a dedicated wake IRQ, we can get the following warning:
WARNING: CPU: 0 PID: 108 at kernel/irq/manage.c:526 enable_irq+0x40/0x94
[ 102.087860] Unbalanced enable for IRQ 147
...
(enable_irq) from [<c06117a8>] (dev_pm_arm_wake_irq+0x4c/0x60)
(dev_pm_arm_wake_irq) from [<c0618360>]
(device_wakeup_arm_wake_irqs+0x58/0x9c)
(device_wakeup_arm_wake_irqs) from [<c0615948>]
(dpm_suspend_noirq+0x10/0x48)
(dpm_suspend_noirq) from [<c01ac7ac>]
(suspend_devices_and_enter+0x30c/0xf14)
(suspend_devices_and_enter) from [<c01adf20>]
(enter_state+0xad4/0xbd8)
(enter_state) from [<c01ad3ec>] (pm_suspend+0x38/0x98)
(pm_suspend) from [<c01ab3e8>] (state_store+0x68/0xc8)
This is because the dedicated wake IRQ for the device may have been
already enabled earlier by dev_pm_enable_wake_irq_check(). Fix the
issue by checking for runtime PM suspended status.
This issue can be easily reproduced by setting serial console log level
to zero, letting the serial console idle, and suspend the system from
an ssh terminal. On resume, dmesg will have the warning above.
The reason why I have not run into this issue earlier has been that I
typically run my PM test cases from on a serial console instead over ssh.
Fixes: c84345597558 (PM / wakeirq: Enable dedicated wakeirq for suspend)
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1852f5ed358147095297a09cc3c6f160208a676d ]
This patch fixes the offset of register error log
by using regmap_get_offset().
Signed-off-by: Jeongtae Park <jeongtae.park@gmail.com>
Link: https://lore.kernel.org/r/20210701142630.44936-1-jeongtae.park@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit c84345597558349474f55be2b7d4093256e42884 upstream.
We currently rely on runtime PM to enable dedicated wakeirq for suspend.
This assumption fails in the following two cases:
1. If the consumer driver does not have runtime PM implemented, the
dedicated wakeirq never gets enabled for suspend
2. If the consumer driver has runtime PM implemented, but does not idle
in suspend
Let's fix the issue by always enabling the dedicated wakeirq during
suspend.
Depends-on: bed570307ed7 (PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend)
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[ tony@atomide.com: updated based on bed570307ed7, added description ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 77e89afc25f30abd56e76a809ee2884d7c1b63ce upstream.
Multi-MSI uses a single MSI descriptor and there is a single mask register
when the device supports per vector masking. To avoid reading back the mask
register the value is cached in the MSI descriptor and updates are done by
clearing and setting bits in the cache and writing it to the device.
But nothing protects msi_desc::masked and the mask register from being
modified concurrently on two different CPUs for two different Linux
interrupts which belong to the same multi-MSI descriptor.
Add a lock to struct device and protect any operation on the mask and the
mask register with it.
This makes the update of msi_desc::masked unconditional, but there is no
place which requires a modification of the hardware register without
updating the masked cache.
msi_mask_irq() is now an empty wrapper which will be cleaned up in follow
up changes.
The problem goes way back to the initial support of multi-MSI, but picking
the commit which introduced the mask cache is a valid cut off point
(2.6.30).
Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 47f4469970d8861bc06d2d4d45ac8200ff07c693 upstream.
While commit d5dcce0c414f ("device property: Keep secondary firmware
node secondary by type") describes everything correct in its commit
message, the change it made does the opposite and original commit
c15e1bdda436 ("device property: Fix the secondary firmware node handling
in set_primary_fwnode()") was fully correct.
Revert the former one here and improve documentation in the next patch.
Fixes: d5dcce0c414f ("device property: Keep secondary firmware node secondary by type")
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 99aed9227073fb34ce2880cbc7063e04185a65e1 upstream.
It appears that firmware nodes can be shared between devices. In such case
when a (child) device is about to be deleted, its firmware node may be shared
and ACPI_COMPANION_SET(..., NULL) call for it breaks the secondary link
of the shared primary firmware node.
In order to prevent that, check, if the device has a parent and parent's
firmware node is shared with its child, and avoid crashing the link.
Fixes: c15e1bdda436 ("device property: Fix the secondary firmware node handling in set_primary_fwnode()")
Reported-by: Ferry Toth <fntoth@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d5dcce0c414fcbfe4c2037b66ac69ea5f9b3f75c upstream.
Behind primary and secondary we understand the type of the nodes
which might define their ordering. However, if primary node gone,
we can't maintain the ordering by definition of the linked list.
Thus, by ordering secondary node becomes first in the list.
But in this case the meaning of it is still secondary (or auxiliary).
The type of the node is maintained by the secondary pointer in it:
secondary pointer Meaning
NULL or valid primary node
ERR_PTR(-ENODEV) secondary node
So, if by some reason we do the following sequence of calls
set_primary_fwnode(dev, NULL);
set_primary_fwnode(dev, primary);
we should preserve secondary node.
This concept is supported by the description of set_primary_fwnode()
along with implementation of set_secondary_fwnode(). Hence, fix
the commit c15e1bdda436 to follow this as well.
Fixes: c15e1bdda436 ("device property: Fix the secondary firmware node handling in set_primary_fwnode()")
Cc: Ferry Toth <fntoth@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b292b50b0efcc7095d8bf15505fba6909bb35dce upstream.
syzbot is reporting hung task in wait_for_device_probe() [1]. At least,
we always need to decrement probe_count if we incremented probe_count in
really_probe().
However, since I can't find "Resources present before probing" message in
the console log, both "this message simply flowed off" and "syzbot is not
hitting this path" will be possible. Therefore, while we are at it, let's
also prepare for concurrent wait_for_device_probe() calls by replacing
wake_up() with wake_up_all().
[1] https://syzkaller.appspot.com/bug?id=25c833f1983c9c1d512f4ff860dd0d7f5a2e2c0f
Reported-by: syzbot <syzbot+805f5f6ae37411f15b64@syzkaller.appspotmail.com>
Fixes: 7c35e699c88bd607 ("driver core: Print device when resources present in really_probe()")
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20200713021254.3444-1-penguin-kernel@I-love.SAKURA.ne.jp
[iwamatsu: Drop patch for deferred_probe_timeout_work_func()]
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
set_primary_fwnode()
commit c15e1bdda4365a5f17cdadf22bf1c1df13884a9e upstream.
When the primary firmware node pointer is removed from a
device (set to NULL) the secondary firmware node pointer,
when it exists, is made the primary node for the device.
However, the secondary firmware node pointer of the original
primary firmware node is never cleared (set to NULL).
To avoid situation where the secondary firmware node pointer
is pointing to a non-existing object, clearing it properly
when the primary node is removed from a device in
set_primary_fwnode().
Fixes: 97badf873ab6 ("device property: Make it possible to use secondary firmware nodes")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e3eb6e8fba65094328b8dca635d00de74ba75b45 upstream.
It has been reported that system-wide suspend may be aborted in the
absence of any wakeup events due to unforseen interactions of it with
the runtume PM framework.
One failing scenario is when there are multiple devices sharing an
ACPI power resource and runtime-resume needs to be carried out for
one of them during system-wide suspend (for example, because it needs
to be reconfigured before the whole system goes to sleep). In that
case, the runtime-resume of that device involves turning the ACPI
power resource "on" which in turn causes runtime-resume requests
to be queued up for all of the other devices sharing it. Those
requests go to the runtime PM workqueue which is frozen during
system-wide suspend, so they are not actually taken care of until
the resume of the whole system, but the pm_runtime_barrier()
call in __device_suspend() sees them and triggers system wakeup
events for them which then cause the system-wide suspend to be
aborted if wakeup source objects are in active use.
Of course, the logic that leads to triggering those wakeup events is
questionable in the first place, because clearly there are cases in
which a pending runtime resume request for a device is not connected
to any real wakeup events in any way (like the one above). Moreover,
it is racy, because the device may be resuming already by the time
the pm_runtime_barrier() runs and so if the driver doesn't take care
of signaling the wakeup event as appropriate, it will be lost.
However, if the driver does take care of that, the extra
pm_wakeup_event() call in the core is redundant.
Accordingly, drop the conditional pm_wakeup_event() call fron
__device_suspend() and make the latter call pm_runtime_barrier()
alone. Also modify the comment next to that call to reflect the new
code and extend it to mention the need to avoid unwanted interactions
between runtime PM and system-wide device suspend callbacks.
Fixes: 1e2ef05bb8cf8 ("PM: Limit race conditions between runtime PM and system sleep (v2)")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Tested-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 74edd08a4fbf51d65fd8f4c7d8289cd0f392bd91 upstream.
When executing the following command, we met kernel dump.
dmesg -c > /dev/null; cd /sys;
for i in `ls /sys/kernel/debug/regmap/* -d`; do
echo "Checking regmap in $i";
cat $i/registers;
done && grep -ri "0x02d0" *;
It is because the count value is too big, and kmalloc fails. So add an
upper bound check to allow max size `PAGE_SIZE << (MAX_ORDER - 1)`.
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/1584064687-12964-1-git-send-email-peng.fan@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e84861fec32dee8a2e62bbaa52cded6b05a2a456 ]
This function is used by dev_get_regmap() to retrieve a regmap for the
specified device. If the device has more than one regmap, the name parameter
can be used to specify one.
The code here uses a pointer comparison to check for equal strings. This
however will probably always fail, as the regmap->name is allocated via
kstrdup_const() from the regmap's config->name.
Fix this by using strcmp() instead.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200703103315.267996-1-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
driver developer is foolish
[ Upstream commit 388bcc6ecc609fca1b4920de7dc3806c98ec535e ]
If platform bus driver registration is failed then, accessing
platform bus spin lock (&drv->driver.bus->p->klist_drivers.k_lock)
in __platform_driver_probe() without verifying the return value
__platform_driver_register() can lead to NULL pointer exception.
So check the return value before attempting the spin lock.
One such example is below:
For a custom usecase, I have intentionally failed the platform bus
registration and I expected all the platform device/driver
registrations to fail gracefully. But I came across this panic
issue.
[ 1.331067] BUG: kernel NULL pointer dereference, address: 00000000000000c8
[ 1.331118] #PF: supervisor write access in kernel mode
[ 1.331163] #PF: error_code(0x0002) - not-present page
[ 1.331208] PGD 0 P4D 0
[ 1.331233] Oops: 0002 [#1] PREEMPT SMP
[ 1.331268] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G W 5.6.0-00049-g670d35fb0144 #165
[ 1.331341] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 1.331406] RIP: 0010:_raw_spin_lock+0x15/0x30
[ 1.331588] RSP: 0000:ffffc9000001be70 EFLAGS: 00010246
[ 1.331632] RAX: 0000000000000000 RBX: 00000000000000c8 RCX: 0000000000000001
[ 1.331696] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000000
[ 1.331754] RBP: 00000000ffffffed R08: 0000000000000501 R09: 0000000000000001
[ 1.331817] R10: ffff88817abcc520 R11: 0000000000000670 R12: 00000000ffffffed
[ 1.331881] R13: ffffffff82dbc268 R14: ffffffff832f070a R15: 0000000000000000
[ 1.331945] FS: 0000000000000000(0000) GS:ffff88817bd80000(0000) knlGS:0000000000000000
[ 1.332008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.332062] CR2: 00000000000000c8 CR3: 000000000681e001 CR4: 00000000003606e0
[ 1.332126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.332189] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.332252] Call Trace:
[ 1.332281] __platform_driver_probe+0x92/0xee
[ 1.332323] ? rtc_dev_init+0x2b/0x2b
[ 1.332358] cmos_init+0x37/0x67
[ 1.332396] do_one_initcall+0x7d/0x168
[ 1.332428] kernel_init_freeable+0x16c/0x1c9
[ 1.332473] ? rest_init+0xc0/0xc0
[ 1.332508] kernel_init+0x5/0x100
[ 1.332543] ret_from_fork+0x1f/0x30
[ 1.332579] CR2: 00000000000000c8
[ 1.332616] ---[ end trace 3bd87f12e9010b87 ]---
[ 1.333549] note: swapper/0[1] exited with preempt_count 1
[ 1.333592] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 1.333736] Kernel Offset: disabled
Note, this can only be triggered if a driver errors out from this call,
which should never happen. If it does, the driver needs to be fixed.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20200408214003.3356-1-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 7e5b3c267d256822407a22fdce6afdf9cd13f9fb upstream
SRBDS is an MDS-like speculative side channel that can leak bits from the
random number generator (RNG) across cores and threads. New microcode
serializes the processor access during the execution of RDRAND and
RDSEED. This ensures that the shared buffer is overwritten before it is
released for reuse.
While it is present on all affected CPU models, the microcode mitigation
is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
cases where TSX is not supported or has been disabled with TSX_CTRL.
The mitigation is activated by default on affected processors and it
increases latency for RDRAND and RDSEED instructions. Among other
effects this will reduce throughput from /dev/urandom.
* Enable administrator to configure the mitigation off when desired using
either mitigations=off or srbds=off.
* Export vulnerability status via sysfs
* Rename file-scoped macros to apply for non-whitelist table initializations.
[ bp: Massage,
- s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
- do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
- flip check in cpu_set_bug_bits() to save an indentation level,
- reflow comments.
jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
tglx: Dropped the fused off magic for now
]
Signed-off-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7706b0a76a9697021e2bf395f3f065c18f51043d ]
If a component fails to bind due to -EPROBE_DEFER we should not log an
error as this is not a real failure.
Fixes messages like:
vc4-drm soc:gpu: failed to bind 3f902000.hdmi (ops vc4_hdmi_ops): -517
vc4-drm soc:gpu: master bind failed: -517
Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
Link: https://lore.kernel.org/r/20200411190241.89404-1-james.hilliard1@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0707cfa5c3ef58effb143db9db6d6e20503f9dec ]
Currently the check that a u32 variable i is >= 0 is always true because
the unsigned variable will never be negative, causing the loop to run
forever. Fix this by changing the pre-decrement check to a zero check on
i followed by a decrement of i.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 39cc539f90d0 ("driver core: platform: Prevent resouce overflow from causing infinite loops")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20200116175758.88396-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7c35e699c88bd60734277b26962783c60e04b494 ]
If a device already has devres items attached before probing, a warning
backtrace is printed. However, this backtrace does not reveal the
offending device, leaving the user uninformed. Furthermore, using
WARN_ON() causes systems with panic-on-warn to reboot.
Fix this by replacing the WARN_ON() by a dev_crit() message.
Abort probing the device, to prevent doing more damage to the device's
resources.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20191206132219.28908-1-geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 39cc539f90d035a293240c9443af50be55ee81b8 ]
num_resources in the platform_device struct is declared as a u32. The
for loops that iterate over num_resources use an int as the counter,
which can cause infinite loops on architectures with smaller ints.
Change the loop counters to u32.
Signed-off-by: Simon Schwartz <kern.simon@theschwartz.xyz>
Link: https://lore.kernel.org/r/2201ce63a2a171ffd2ed14e867875316efcf71db.camel@theschwartz.xyz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 967d3010df8b6f6f9aa95c198edc5fe3646ebf36 ]
unreferenced object 0xffff808ec6dc5a80 (size 128):
comm "swapper/0", pid 1, jiffies 4294938063 (age 2560.530s)
hex dump (first 32 bytes):
ff ff ff ff 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b ........kkkkkkkk
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
backtrace:
[<00000000476dcf8c>] kmem_cache_alloc_trace+0x430/0x500
[<000000004f708d37>] platform_device_register_full+0xbc/0x1e8
[<000000006c2a7ec7>] acpi_create_platform_device+0x370/0x450
[<00000000ef135642>] acpi_default_enumeration+0x34/0x78
[<000000003bd9a052>] acpi_bus_attach+0x2dc/0x3e0
[<000000003cf4f7f2>] acpi_bus_attach+0x108/0x3e0
[<000000003cf4f7f2>] acpi_bus_attach+0x108/0x3e0
[<000000002968643e>] acpi_bus_scan+0xb0/0x110
[<0000000010dd0bd7>] acpi_scan_init+0x1a8/0x410
[<00000000965b3c5a>] acpi_init+0x408/0x49c
[<00000000ed4b9fe2>] do_one_initcall+0x178/0x7f4
[<00000000a5ac5a74>] kernel_init_freeable+0x9d4/0xa9c
[<0000000070ea6c15>] kernel_init+0x18/0x138
[<00000000fb8fff06>] ret_from_fork+0x10/0x1c
[<0000000041273a0d>] 0xffffffffffffffff
Then, faddr2line pointed out this line,
/*
* This memory isn't freed when the device is put,
* I don't have a nice idea for that though. Conceptually
* dma_mask in struct device should not be a pointer.
* See http://thread.gmane.org/gmane.linux.kernel.pci/9081
*/
pdev->dev.dma_mask =
kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL);
Since this leak has existed for more than 8 years and it does not
reference other parts of the memory, let kmemleak ignore it, so users
don't need to waste time reporting this in the future.
Link: http://lkml.kernel.org/r/20181206160751.36211-1-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d2ab99403ee00d8014e651728a4702ea1ae5e52c ]
When adding the memory by probing memory block in sysfs interface, there is an
obvious issue that we will unlock the device_hotplug_lock when fails to takes it.
That issue was introduced in Commit 8df1d0e4a265
("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")
We should drop out in time when fails to take the device_hotplug_lock.
Fixes: 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")
Reported-by: Yang yingliang <yangyingliang@huawei.com>
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8df1d0e4a265f25dc1e7e7624ccdbcb4a6630c89 ]
add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
arch/powerpc/platforms/pseries/hotplug-memory.c
drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.
In general, we should hold the device_hotplug_lock when adding memory to
synchronize against online/offline request (e.g. from user space) - which
already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
hot-add deadlock"). add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.
Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.
The lock is not held yet in
drivers/xen/balloon.c
arch/powerpc/platforms/powernv/memtrace.c
drivers/s390/char/sclp_cmd.c
drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.
Don't export add_memory_resource(), as it once was exported to be used by
XEN, which is never built as a module. If somebody requires it, we also
have to export a locked variant (as device_hotplug_lock is never
exported).
Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bdae566d5d9733b6e32b378668b84eadf28a94d4 ]
During component_bind_all(), if bind() fails for any
particular component associated with a master, unbind()
should be called for all previous components in that
master's match array, whose bind() might have completed
successfully. As per the current logic, if bind() fails
for the component at position 'n' in the master's match
array, it would start calling unbind() from component in
'n'th position itself and work backwards, and will always
skip calling unbind() for component in 0th position in the
master's match array.
Fix this by updating the loop condition, and the logic to
refer to the components in master's match array, so that
unbind() is called for all components starting from 'n-1'st
position in the array, until (and including) component in
0th position.
Signed-off-by: Banajit Goswami <bgoswami@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit db4d30fbb71b47e4ecb11c4efa5d8aad4b03dfae upstream.
Some processors may incur a machine check error possibly resulting in an
unrecoverable CPU lockup when an instruction fetch encounters a TLB
multi-hit in the instruction TLB. This can occur when the page size is
changed along with either the physical address or cache type. The relevant
erratum can be found here:
https://bugzilla.kernel.org/show_bug.cgi?id=205195
There are other processors affected for which the erratum does not fully
disclose the impact.
This issue affects both bare-metal x86 page tables and EPT.
It can be mitigated by either eliminating the use of large pages or by
using careful TLB invalidations when changing the page size in the page
tables.
Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in
MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which
are mitigated against this issue.
Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 4.9:
- No support for X86_VENDOR_HYGON, ATOM_AIRMONT_NP
- Adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6608b45ac5ecb56f9e171252229c39580cc85f0f upstream.
Add the sysfs reporting file for TSX Async Abort. It exposes the
vulnerability and the mitigation state similar to the existing files for
the other hardware vulnerabilities.
Sysfs file path is:
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 65650b35133ff20f0c9ef0abd5c3c66dbce3ae57 upstream.
It is incorrect to set the cpufreq syscore shutdown callback pointer
to cpufreq_suspend(), because that function cannot be run in the
syscore stage of system shutdown for two reasons: (a) it may attempt
to carry out actions depending on devices that have already been shut
down at that point and (b) the RCU synchronization carried out by it
may not be able to make progress then.
The latter issue has been present since commit 45975c7d21a1 ("rcu:
Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"),
but the former one has been there since commit 90de2a4aa9f3 ("cpufreq:
suspend cpufreq governors on shutdown") regardless.
Fix that by dropping cpufreq_syscore_ops altogether and making
device_shutdown() call cpufreq_suspend() directly before shutting
down devices, which is along the lines of what system-wide power
management does.
Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")
Fixes: 90de2a4aa9f3 ("cpufreq: suspend cpufreq governors on shutdown")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f7ccc7a397cf2ef64aebb2f726970b93203858d2 ]
Qcom Socinfo driver can be built as a module, so
export these two APIs.
Tested-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@linaro.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit ac43432cb1f5c2950408534987e57c2071e24d8f upstream.
There is a race condition between removing glue directory and adding a new
device under the glue dir. It can be reproduced in following test:
CPU1: CPU2:
device_add()
get_device_parent()
class_dir_create_and_add()
kobject_add_internal()
create_dir() // create glue_dir
device_add()
get_device_parent()
kobject_get() // get glue_dir
device_del()
cleanup_glue_dir()
kobject_del(glue_dir)
kobject_add()
kobject_add_internal()
create_dir() // in glue_dir
sysfs_create_dir_ns()
kernfs_create_dir_ns(sd)
sysfs_remove_dir() // glue_dir->sd=NULL
sysfs_put() // free glue_dir->sd
// sd is freed
kernfs_new_node(sd)
kernfs_get(glue_dir)
kernfs_add_one()
kernfs_put()
Before CPU1 remove last child device under glue dir, if CPU2 add a new
device under glue dir, the glue_dir kobject reference count will be
increase to 2 via kobject_get() in get_device_parent(). And CPU2 has
been called kernfs_create_dir_ns(), but not call kernfs_new_node().
Meanwhile, CPU1 call sysfs_remove_dir() and sysfs_put(). This result in
glue_dir->sd is freed and it's reference count will be 0. Then CPU2 call
kernfs_get(glue_dir) will trigger a warning in kernfs_get() and increase
it's reference count to 1. Because glue_dir->sd is freed by CPU1, the next
call kernfs_add_one() by CPU2 will fail(This is also use-after-free)
and call kernfs_put() to decrease reference count. Because the reference
count is decremented to 0, it will also call kmem_cache_free() to free
the glue_dir->sd again. This will result in double free.
In order to avoid this happening, we also should make sure that kernfs_node
for glue_dir is released in CPU1 only when refcount for glue_dir kobj is
1 to fix this race.
The following calltrace is captured in kernel 4.14 with the following patch
applied:
commit 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")
--------------------------------------------------------------------------
[ 3.633703] WARNING: CPU: 4 PID: 513 at .../fs/kernfs/dir.c:494
Here is WARN_ON(!atomic_read(&kn->count) in kernfs_get().
....
[ 3.633986] Call trace:
[ 3.633991] kernfs_create_dir_ns+0xa8/0xb0
[ 3.633994] sysfs_create_dir_ns+0x54/0xe8
[ 3.634001] kobject_add_internal+0x22c/0x3f0
[ 3.634005] kobject_add+0xe4/0x118
[ 3.634011] device_add+0x200/0x870
[ 3.634017] _request_firmware+0x958/0xc38
[ 3.634020] request_firmware_into_buf+0x4c/0x70
....
[ 3.634064] kernel BUG at .../mm/slub.c:294!
Here is BUG_ON(object == fp) in set_freepointer().
....
[ 3.634346] Call trace:
[ 3.634351] kmem_cache_free+0x504/0x6b8
[ 3.634355] kernfs_put+0x14c/0x1d8
[ 3.634359] kernfs_create_dir_ns+0x88/0xb0
[ 3.634362] sysfs_create_dir_ns+0x54/0xe8
[ 3.634366] kobject_add_internal+0x22c/0x3f0
[ 3.634370] kobject_add+0xe4/0x118
[ 3.634374] device_add+0x200/0x870
[ 3.634378] _request_firmware+0x958/0xc38
[ 3.634381] request_firmware_into_buf+0x4c/0x70
--------------------------------------------------------------------------
Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")
Signed-off-by: Muchun Song <smuchun@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Link: https://lore.kernel.org/r/20190727032122.24639-1-smuchun@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit db057679de3e9e6a03c1bcd5aee09b0d25fd9f5b ]
On buses like SlimBus and SoundWire which does not support
gather_writes yet in regmap, A bulk write on paged register
would be silently ignored after programming page.
This is because local variable 'ret' value in regmap_raw_write_impl()
gets reset to 0 once page register is written successfully and the
code below checks for 'ret' value to be -ENOTSUPP before linearising
the write buffer to send to bus->write().
Fix this by resetting the 'ret' value to -ENOTSUPP in cases where
gather_writes() is not supported or single register write is
not possible.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dc351d4c5f4fe4d0f274d6d660227be0c3a03317 ]
The dev->power.direct_complete flag may become set in device_prepare() in
case the device don't have any PM callbacks (dev->power.no_pm_callbacks is
set). This leads to a broken behaviour, when there is child having wakeup
enabled and relies on its parent to be used in the wakeup path.
More precisely, when the direct complete path becomes selected for the
child in __device_suspend(), the propagation of the dev->power.wakeup_path
becomes skipped as well.
Let's address this problem, by checking if the device is a part the wakeup
path or has wakeup enabled, then prevent the direct complete path from
being used.
Reported-by: Loic Pallardy <loic.pallardy@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Comment cleanup ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 8a4b06d391b0a42a373808979b5028f5c84d9c6a upstream.
Add the sysfs reporting file for MDS. It exposes the vulnerability and
mitigation state similar to the existing files for the other speculative
hardware vulnerabilities.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
[bwh: Backported to 4.9: test x86_hyper instead of using hypervisor_is_type()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1fad17fb1bbcd73159c2b992668a6957ecc5af8a upstream.
If wakeup_source_add() is called right after wakeup_source_remove()
for the same wakeup source, timer_setup() may be called for a
potentially scheduled timer which is incorrect.
To avoid that, move the wakeup source timer cancellation from
wakeup_source_drop() to wakeup_source_remove().
Moreover, make wakeup_source_remove() clear the timer function after
canceling the timer to let wakeup_source_not_registered() treat
unregistered wakeup sources in the same way as the ones that have
never been registered.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
[ rjw: Subject, changelog, merged two patches together ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 726e41097920a73e4c7c33385dcc0debb1281e18 upstream.
For devices with a class, we create a "glue" directory between
the parent device and the new device with the class name.
This directory is never "explicitely" removed when empty however,
this is left to the implicit sysfs removal done by kobject_release()
when the object loses its last reference via kobject_put().
This is problematic because as long as it's not been removed from
sysfs, it is still present in the class kset and in sysfs directory
structure.
The presence in the class kset exposes a use after free bug fixed
by the previous patch, but the presence in sysfs means that until
the kobject is released, which can take a while (especially with
kobject debugging), any attempt at re-creating such as binding a
new device for that class/parent pair, will result in a sysfs
duplicate file name error.
This fixes it by instead doing an explicit kobject_del() when
the glue dir is empty, by keeping track of the number of
child devices of the gluedir.
This is made easy by the fact that all glue dir operations are
done with a global mutex, and there's already a function
(cleanup_glue_dir) called in all the right places taking that
mutex that can be enhanced for this. It appears that this was
in fact the intent of the function, but the implementation was
wrong.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Cc: Guenter Roeck <groeck@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4f4b374332ec0ae9c738ff8ec9bed5cd97ff9adc ]
This is the much more correct fix for my earlier attempt at:
https://lkml.org/lkml/2018/12/10/118
Short recap:
- There's not actually a locking issue, it's just lockdep being a bit
too eager to complain about a possible deadlock.
- Contrary to what I claimed the real problem is recursion on
kn->count. Greg pointed me at sysfs_break_active_protection(), used
by the scsi subsystem to allow a sysfs file to unbind itself. That
would be a real deadlock, which isn't what's happening here. Also,
breaking the active protection means we'd need to manually handle
all the lifetime fun.
- With Rafael we discussed the task_work approach, which kinda works,
but has two downsides: It's a functional change for a lockdep
annotation issue, and it won't work for the bind file (which needs
to get the errno from the driver load function back to userspace).
- Greg also asked why this never showed up: To hit this you need to
unregister a 2nd driver from the unload code of |