summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2023-06-28gpio: Allow per-parent interrupt dataMarc Zyngier1-2/+17
[ Upstream commit cfe6807d82e97e81c3209dca9448f091e1448a57 ] The core gpiolib code is able to deal with multiple interrupt parents for a single gpio irqchip. It however only allows a single piece of data to be conveyed to all flow handlers (either the gpio_chip or some other, driver-specific data). This means that drivers have to go through some interesting dance to find the correct context, something that isn't great in interrupt context (see aebdc8abc9db86e2bd33070fc2f961012fff74b4 for a prime example). Instead, offer an optional way for a pinctrl/gpio driver to provide an array of pointers which gets used to provide the correct context to the flow handler. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20211026175815.52703-2-joey.gouly@arm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Stable-dep-of: 8c00914e5438 ("gpiolib: Fix GPIO chip IRQ initialization restriction") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28regulator: pca9450: Fix LDO3OUT and LDO4OUT MASKTeresa Remmet1-2/+2
[ Upstream commit 7257d930aadcd62d1c7971ab14f3b1126356abdc ] L3_OUT and L4_OUT Bit fields range from Bit 0:4 and thus the mask should be 0x1F instead of 0x0F. Fixes: 0935ff5f1f0a ("regulator: pca9450: add pca9450 pmic driver") Signed-off-by: Teresa Remmet <t.remmet@phytec.de> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Link: https://lore.kernel.org/r/20230614125240.3946519-1-t.remmet@phytec.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21of: overlay: add entry to of_overlay_action_name[]Frank Rowand1-0/+13
commit 1ac17586c950a2c129393f8a92901a2b357acf24 upstream. The values of enum of_overlay_notify_action are used to index into array of_overlay_action_name. Add an entry to of_overlay_action_name for the value recently added to of_overlay_notify_action. Array of_overlay_action_name[] is moved into include/linux/of.h adjacent to enum of_overlay_notify_action to make the connection between the two more obvious if either is modified in the future. The only use of of_overlay_action_name is for error reporting in overlay_notify(). All callers of overlay_notify() report the same error, but with fewer details. Remove the redundant error reports in the callers. Fixes: 067c098766c6 ("of: overlay: rework overlay apply and remove kfree()s") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220502181742.1402826-2-frowand.list@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-21Remove DECnet support from kernelStephen Hemminger3-17/+0
commit 1202cdd665315c525b5237e96e0bedc76d7e754f upstream. DECnet is an obsolete network protocol that receives more attention from kernel janitors than users. It belongs in computer protocol history museum not in Linux kernel. It has been "Orphaned" in kernel since 2010. The iproute2 support for DECnet was dropped in 5.0 release. The documentation link on Sourceforge says it is abandoned there as well. Leave the UAPI alone to keep userspace programs compiling. This means that there is still an empty neighbour table for AF_DECNET. The table of /proc/sys/net entries was updated to match current directories and reformatted to be alphabetical. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Ahern <dsahern@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-21of: overlay: rework overlay apply and remove kfree()sFrank Rowand1-1/+2
[ Upstream commit 067c098766c6af667a9002d4e33cf1f3c998abbe ] Fix various kfree() issues related to of_overlay_apply(). - Double kfree() of fdt and tree when init_overlay_changeset() returns an error. - free_overlay_changeset() free the root of the unflattened overlay (variable tree) instead of the memory that contains the unflattened overlay. - For the case of a failure during applying an overlay, move kfree() of new_fdt and overlay_mem into free_overlay_changeset(), which is called by the function that allocated them. - For the case of removing an overlay, the kfree() of new_fdt and overlay_mem remains in free_overlay_changeset(). - Check return value of of_fdt_unflatten_tree() for error instead of checking the returned value of overlay_root. - When storing pointers to allocated objects in ovcs, do so as near to the allocation as possible instead of in deeply layered function. More clearly document policy related to lifetime of pointers into overlay memory. Double kfree() Reported-by: Slawomir Stepien <slawomir.stepien@nokia.com> Signed-off-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220420222505.928492-3-frowand.list@gmail.com Stable-dep-of: 39affd1fdf65 ("of: overlay: Fix missing of_node_put() in error case of init_overlay_changeset()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-14usb: usbfs: Enforce page requirements for mmapRuihan Li1-0/+5
commit 0143d148d1e882fb1538dc9974c94d63961719b9 upstream. The current implementation of usbdev_mmap uses usb_alloc_coherent to allocate memory pages that will later be mapped into the user space. Meanwhile, usb_alloc_coherent employs three different methods to allocate memory, as outlined below: * If hcd->localmem_pool is non-null, it uses gen_pool_dma_alloc to allocate memory; * If DMA is not available, it uses kmalloc to allocate memory; * Otherwise, it uses dma_alloc_coherent. However, it should be noted that gen_pool_dma_alloc does not guarantee that the resulting memory will be page-aligned. Furthermore, trying to map slab pages (i.e., memory allocated by kmalloc) into the user space is not resonable and can lead to problems, such as a type confusion bug when PAGE_TABLE_CHECK=y [1]. To address these issues, this patch introduces hcd_alloc_coherent_pages, which addresses the above two problems. Specifically, hcd_alloc_coherent_pages uses gen_pool_dma_alloc_align instead of gen_pool_dma_alloc to ensure that the memory is page-aligned. To replace kmalloc, hcd_alloc_coherent_pages directly allocates pages by calling __get_free_pages. Reported-by: syzbot+fcf1a817ceb50935ce99@syzkaller.appspotmail.comm Closes: https://lore.kernel.org/lkml/000000000000258e5e05fae79fc1@google.com/ [1] Fixes: f7d34b445abc ("USB: Add support for usbfs zerocopy.") Fixes: ff2437befd8f ("usb: host: Fix excessive alignment restriction for local memory allocations") Cc: stable@vger.kernel.org Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20230515130958.32471-2-lrh2000@pku.edu.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-14rfs: annotate lockless accesses to RFS sock flow tableEric Dumazet1-2/+5
[ Upstream commit 5c3b74a92aa285a3df722bf6329ba7ccf70346d6 ] Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table. This also prevents a (smart ?) compiler to remove the condition in: if (table->ents[index] != newval) table->ents[index] = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05power: supply: core: Refactor ↵Hans de Goede1-2/+3
power_supply_set_input_current_limit_from_supplier() [ Upstream commit 2220af8ca61ae67de4ec3deec1c6395a2f65b9fd ] Some (USB) charger ICs have variants with USB D+ and D- pins to do their own builtin charger-type detection, like e.g. the bq24190 and bq25890 and also variants which lack this functionality, e.g. the bq24192 and bq25892. In case the charger-type; and thus the input-current-limit detection is done outside the charger IC then we need some way to communicate this to the charger IC. In the past extcon was used for this, but if the external detection does e.g. full USB PD negotiation then the extcon cable-types do not convey enough information. For these setups it was decided to model the external charging "brick" and the parameters negotiated with it as a power_supply class-device itself; and power_supply_set_input_current_limit_from_supplier() was introduced to allow drivers to get the input-current-limit this way. But in some cases psy drivers may want to know other properties, e.g. the bq25892 can do "quick-charge" negotiation by pulsing its current draw, but this should only be done if the usb_type psy-property of its supplier is set to DCP (and device-properties indicate the board allows higher voltages). Instead of adding extra helper functions for each property which a psy-driver wants to query from its supplier, refactor power_supply_set_input_current_limit_from_supplier() into a more generic power_supply_get_property_from_supplier() function. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Stable-dep-of: 77c2a3097d70 ("power: supply: bq24190: Call power_supply_changed() after updating input current") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05power: supply: bq27xxx: Ensure power_supply_changed() is called on current ↵Hans de Goede1-0/+3
sign changes [ Upstream commit 939a116142012926e25de0ea6b7e2f8d86a5f1b6 ] On gauges where the current register is signed, there is no charging flag in the flags register. So only checking flags will not result in power_supply_changed() getting called when e.g. a charger is plugged in and the current sign changes from negative (discharging) to positive (charging). This causes userspace's notion of the status to lag until userspace does a poll. And when a power_supply_leds.c LED trigger is used to indicate charging status with a LED, this LED will lag until the capacity percentage changes, which may take many minutes (because the LED trigger only is updated on power_supply_changed() calls). Fix this by calling bq27xxx_battery_current_and_status() on gauges with a signed current register and checking if the status has changed. Fixes: 297a533b3e62 ("bq27x00: Cache battery registers") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-30net/mlx5: DR, Check force-loopback RC QP capability independently from RoCEYevgeny Kliteynik1-1/+3
commit c7dd225bc224726c22db08e680bf787f60ebdee3 upstream. SW Steering uses RC QP for writing STEs to ICM. This writingis done in LB (loopback), and FL (force-loopback) QP is preferred for performance. FL is available when RoCE is enabled or disabled based on RoCE caps. This patch adds reading of FL capability from HCA caps in addition to the existing reading from RoCE caps, thus fixing the case where we didn't have loopback enabled when RoCE was disabled. Fixes: 7304d603a57a ("net/mlx5: DR, Add support for force-loopback QP") Signed-off-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30fs: fix undefined behavior in bit shift for SB_NOUSERHao Ge1-21/+21
commit f15afbd34d8fadbd375f1212e97837e32bc170cc upstream. Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. It was spotted by UBSAN. So let's just fix this by using the BIT() helper for all SB_* flags. Fixes: e462ec50cb5f ("VFS: Differentiate mount flags (MS_*) from internal superblock flags") Signed-off-by: Hao Ge <gehao@kylinos.cn> Message-Id: <20230424051835.374204-1-gehao@kylinos.cn> [brauner@kernel.org: use BIT() for all SB_* flags] Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30firmware: arm_ffa: Fix FFA device names for logical partitionsSudeep Holla1-0/+1
commit 19b8766459c41c6f318f8a548cc1c66dffd18363 upstream. Each physical partition can provide multiple services each with UUID. Each such service can be presented as logical partition with a unique combination of VM ID and UUID. The number of distinct UUID in a system will be less than or equal to the number of logical partitions. However, currently it fails to register more than one logical partition or service within a physical partition as the device name contains only VM ID while both VM ID and UUID are maintained in the partition information. The kernel complains with the below message: | sysfs: cannot create duplicate filename '/devices/arm-ffa-8001' | CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc7 #8 | Hardware name: FVP Base RevC (DT) | Call trace: | dump_backtrace+0xf8/0x118 | show_stack+0x18/0x24 | dump_stack_lvl+0x50/0x68 | dump_stack+0x18/0x24 | sysfs_create_dir_ns+0xe0/0x13c | kobject_add_internal+0x220/0x3d4 | kobject_add+0x94/0x100 | device_add+0x144/0x5d8 | device_register+0x20/0x30 | ffa_device_register+0x88/0xd8 | ffa_setup_partitions+0x108/0x1b8 | ffa_init+0x2ec/0x3a4 | do_one_initcall+0xcc/0x240 | do_initcall_level+0x8c/0xac | do_initcalls+0x54/0x94 | do_basic_setup+0x1c/0x28 | kernel_init_freeable+0x100/0x16c | kernel_init+0x20/0x1a0 | ret_from_fork+0x10/0x20 | kobject_add_internal failed for arm-ffa-8001 with -EEXIST, don't try to | register things with the same name in the same directory. | arm_ffa arm-ffa: unable to register device arm-ffa-8001 err=-17 | ARM FF-A: ffa_setup_partitions: failed to register partition ID 0x8001 By virtue of being random enough to avoid collisions when generated in a distributed system, there is no way to compress UUID keys to the number of bits required to identify each. We can eliminate '-' in the name but it is not worth eliminating 4 bytes and add unnecessary logic for doing that. Also v1.0 doesn't provide the UUID of the partitions which makes it hard to use the same for the device name. So to keep it simple, let us alloc an ID using ida_alloc() and append the same to "arm-ffa" to make up a unique device name. Also stash the id value in ffa_dev to help freeing the ID later when the device is destroyed. Fixes: e781858488b9 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration") Reported-by: Lucian Paul-Trifu <lucian.paul-trifu@arm.com> Link: https://lore.kernel.org/r/20230419-ffa_fixes_6-4-v2-3-d9108e43a176@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30power: supply: bq27xxx: Fix poll_interval handling and races on removeHans de Goede1-0/+1
commit c00bc80462afc7963f449d7f21d896d2f629cacc upstream. Before this patch bq27xxx_battery_teardown() was setting poll_interval = 0 to avoid bq27xxx_battery_update() requeuing the delayed_work item. There are 2 problems with this: 1. If the driver is unbound through sysfs, rather then the module being rmmod-ed, this changes poll_interval unexpectedly 2. This is racy, after it being set poll_interval could be changed before bq27xxx_battery_update() checks it through /sys/module/bq27xxx_battery/parameters/poll_interval Fix this by added a removed attribute to struct bq27xxx_device_info and using that instead of setting poll_interval to 0. There also is another poll_interval related race on remove(), writing /sys/module/bq27xxx_battery/parameters/poll_interval will requeue the delayed_work item for all devices on the bq27xxx_battery_devices list and the device being removed was only removed from that list after cancelling the delayed_work item. Fix this by moving the removal from the bq27xxx_battery_devices list to before cancelling the delayed_work item. Fixes: 8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30USB: core: Add routines for endpoint checks in old driversAlan Stern1-0/+5
commit 13890626501ffda22b18213ddaf7930473da5792 upstream. Many of the older USB drivers in the Linux USB stack were written based simply on a vendor's device specification. They use the endpoint information in the spec and assume these endpoints will always be present, with the properties listed, in any device matching the given vendor and product IDs. While that may have been true back then, with spoofing and fuzzing it is not true any more. More and more we are finding that those old drivers need to perform at least a minimum of checking before they try to use any endpoint other than ep0. To make this checking as simple as possible, we now add a couple of utility routines to the USB core. usb_check_bulk_endpoints() and usb_check_int_endpoints() take an interface pointer together with a list of endpoint addresses (numbers and directions). They check that the interface's current alternate setting includes endpoints with those addresses and that each of these endpoints has the right type: bulk or interrupt, respectively. Although we already have usb_find_common_endpoints() and related routines meant for a similar purpose, they are not well suited for this kind of checking. Those routines find endpoints of various kinds, but only one (either the first or the last) of each kind, and they don't verify that the endpoints' addresses agree with what the caller expects. In theory the new routines could be more general: They could take a particular altsetting as their argument instead of always using the interface's current altsetting. In practice I think this won't matter too much; multiple altsettings tend to be used for transferring media (audio or visual) over isochronous endpoints, not bulk or interrupt. Drivers for such devices will generally require more sophisticated checking than these simplistic routines provide. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/dd2c8e8c-2c87-44ea-ba17-c64b97e201c9@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30net: fix stack overflow when LRO is disabled for virtual interfacesTaehee Yoo1-0/+1
commit ae9b15fbe63447bc1d3bba3769f409d17ca6fdf6 upstream. When the virtual interface's feature is updated, it synchronizes the updated feature for its own lower interface. This propagation logic should be worked as the iteration, not recursively. But it works recursively due to the netdev notification unexpectedly. This problem occurs when it disables LRO only for the team and bonding interface type. team0 | +------+------+-----+-----+ | | | | | team1 team2 team3 ... team200 If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE event to its own lower interfaces(team1 ~ team200). It is worked by netdev_sync_lower_features(). So, the NETDEV_FEAT_CHANGE notification logic of each lower interface work iteratively. But generated NETDEV_FEAT_CHANGE event is also sent to the upper interface too. upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own lower interfaces again. lower and upper interfaces receive this event and generate this event again and again. So, the stack overflow occurs. But it is not the infinite loop issue. Because the netdev_sync_lower_features() updates features before generating the NETDEV_FEAT_CHANGE event. Already synchronized lower interfaces skip notification logic. So, it is just the problem that iteration logic is changed to the recursive unexpectedly due to the notification mechanism. Reproducer: ip link add team0 type team ethtool -K team0 lro on for i in {1..200} do ip link add team$i master team0 type team ethtool -K team$i lro on done ethtool -K team0 lro off In order to fix it, the notifier_ctx member of bonding/team is introduced. Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230517143010.3596250-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30usb: gadget: Properly configure the device for remote wakeupElson Roy Serrao2-0/+10
[ Upstream commit b93c2a68f3d9dc98ec30dcb342ae47c1c8d09d18 ] The wakeup bit in the bmAttributes field indicates whether the device is configured for remote wakeup. But this field should be allowed to set only if the UDC supports such wakeup mechanism. So configure this field based on UDC capability. Also inform the UDC whether the device is configured for remote wakeup by implementing a gadget op. Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Signed-off-by: Elson Roy Serrao <quic_eserrao@quicinc.com> Link: https://lore.kernel.org/r/1679694482-16430-2-git-send-email-quic_eserrao@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 4e8ef34e36f2 ("usb: dwc3: fix gadget mode suspend interrupt handler issue") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24SUNRPC: always free ctxt when freeing deferred requestNeilBrown2-2/+2
[ Upstream commit 948f072ada23e0a504c5e4d7d71d4c83bd0785ec ] Since the ->xprt_ctxt pointer was added to svc_deferred_req, it has not been sufficient to use kfree() to free a deferred request. We may need to free the ctxt as well. As freeing the ctxt is all that ->xpo_release_rqst() does, we repurpose it to explicit do that even when the ctxt is not stored in an rqst. So we now have ->xpo_release_ctxt() which is given an xprt and a ctxt, which may have been taken either from an rqst or from a dreq. The caller is now responsible for clearing that pointer after the call to ->xpo_release_ctxt. We also clear dr->xprt_ctxt when the ctxt is moved into a new rqst when revisiting a deferred request. This ensures there is only one pointer to the ctxt, so the risk of double freeing in future is reduced. The new code in svc_xprt_release which releases both the ctxt and any rq_deferred depends on this. Fixes: 773f91b2cf3f ("SUNRPC: Fix NFSD's request deferral on RDMA transports") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24SUNRPC: Remove svc_rqst::rq_xprt_hlenChuck Lever1-2/+0
[ Upstream commit 983084b2672c593959e3148d6a17c8b920797dde ] Clean up: This field is now always set to zero. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Stable-dep-of: 948f072ada23 ("SUNRPC: always free ctxt when freeing deferred request") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24sched: Fix KCSAN noinstr violationJosh Poimboeuf1-1/+1
[ Upstream commit e0b081d17a9f4e5c0cbb0e5fbeb1abe3de0f7e4e ] With KCSAN enabled, end_of_stack() can get out-of-lined. Force it inline. Fixes the following warnings: vmlinux.o: warning: objtool: check_stackleak_irqoff+0x2b: call to end_of_stack() leaves .noinstr.text section Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/cc1b4d73d3a428a00d206242a68fdf99a934ca7b.1681320026.git.jpoimboe@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24firmware: arm_sdei: Fix sleep from invalid context BUGPierre Gondois1-1/+0
[ Upstream commit d2c48b2387eb89e0bf2a2e06e30987cf410acad4 ] Running a preempt-rt (v6.2-rc3-rt1) based kernel on an Ampere Altra triggers: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 24, name: cpuhp/0 preempt_count: 0, expected: 0 RCU nest depth: 0, expected: 0 3 locks held by cpuhp/0/24: #0: ffffda30217c70d0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x5c/0x248 #1: ffffda30217c7120 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x5c/0x248 #2: ffffda3021c711f0 (sdei_list_lock){....}-{3:3}, at: sdei_cpuhp_up+0x3c/0x130 irq event stamp: 36 hardirqs last enabled at (35): [<ffffda301e85b7bc>] finish_task_switch+0xb4/0x2b0 hardirqs last disabled at (36): [<ffffda301e812fec>] cpuhp_thread_fun+0x21c/0x248 softirqs last enabled at (0): [<ffffda301e80b184>] copy_process+0x63c/0x1ac0 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 0 PID: 24 Comm: cpuhp/0 Not tainted 5.19.0-rc3-rt5-[...] Hardware name: WIWYNN Mt.Jade Server [...] Call trace: dump_backtrace+0x114/0x120 show_stack+0x20/0x70 dump_stack_lvl+0x9c/0xd8 dump_stack+0x18/0x34 __might_resched+0x188/0x228 rt_spin_lock+0x70/0x120 sdei_cpuhp_up+0x3c/0x130 cpuhp_invoke_callback+0x250/0xf08 cpuhp_thread_fun+0x120/0x248 smpboot_thread_fn+0x280/0x320 kthread+0x130/0x140 ret_from_fork+0x10/0x20 sdei_cpuhp_up() is called in the STARTING hotplug section, which runs with interrupts disabled. Use a CPUHP_AP_ONLINE_DYN entry instead to execute the cpuhp cb later, with preemption enabled. SDEI originally got its own cpuhp slot to allow interacting with perf. It got superseded by pNMI and this early slot is not relevant anymore. [1] Some SDEI calls (e.g. SDEI_1_0_FN_SDEI_PE_MASK) take actions on the calling CPU. It is checked that preemption is disabled for them. _ONLINE cpuhp cb are executed in the 'per CPU hotplug thread'. Preemption is enabled in those threads, but their cpumask is limited to 1 CPU. Move 'WARN_ON_ONCE(preemptible())' statements so that SDEI cpuhp cb don't trigger them. Also add a check for the SDEI_1_0_FN_SDEI_PRIVATE_RESET SDEI call which acts on the calling CPU. [1]: https://lore.kernel.org/all/5813b8c5-ae3e-87fd-fccc-94c9cd08816d@arm.com/ Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20230216084920.144064-1-pierre.gondois@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24net: add vlan_get_protocol_and_depth() helperEric Dumazet1-0/+17
[ Upstream commit 4063384ef762cc5946fc7a3f89879e76c6ec51e2 ] Before blamed commit, pskb_may_pull() was used instead of skb_header_pointer() in __vlan_get_protocol() and friends. Few callers depended on skb->head being populated with MAC header, syzbot caught one of them (skb_mac_gso_segment()) Add vlan_get_protocol_and_depth() to make the intent clearer and use it where sensible. This is a more generic fix than commit e9d3f80935b6 ("net/af_packet: make sure to pull mac header") which was dealing with a similar issue. kernel BUG at include/linux/skbuff.h:2655 ! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 1441 Comm: syz-executor199 Not tainted 6.1.24-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023 RIP: 0010:__skb_pull include/linux/skbuff.h:2655 [inline] RIP: 0010:skb_mac_gso_segment+0x68f/0x6a0 net/core/gro.c:136 Code: fd 48 8b 5c 24 10 44 89 6b 70 48 c7 c7 c0 ae 0d 86 44 89 e6 e8 a1 91 d0 00 48 c7 c7 00 af 0d 86 48 89 de 31 d2 e8 d1 4a e9 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 RSP: 0018:ffffc90001bd7520 EFLAGS: 00010286 RAX: ffffffff8469736a RBX: ffff88810f31dac0 RCX: ffff888115a18b00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90001bd75e8 R08: ffffffff84697183 R09: fffff5200037adf9 R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000012 R13: 000000000000fee5 R14: 0000000000005865 R15: 000000000000fed7 FS: 000055555633f300(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 0000000116fea000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> [<ffffffff847018dd>] __skb_gso_segment+0x32d/0x4c0 net/core/dev.c:3419 [<ffffffff8470398a>] skb_gso_segment include/linux/netdevice.h:4819 [inline] [<ffffffff8470398a>] validate_xmit_skb+0x3aa/0xee0 net/core/dev.c:3725 [<ffffffff84707042>] __dev_queue_xmit+0x1332/0x3300 net/core/dev.c:4313 [<ffffffff851a9ec7>] dev_queue_xmit+0x17/0x20 include/linux/netdevice.h:3029 [<ffffffff851b4a82>] packet_snd net/packet/af_packet.c:3111 [inline] [<ffffffff851b4a82>] packet_sendmsg+0x49d2/0x6470 net/packet/af_packet.c:3142 [<ffffffff84669a12>] sock_sendmsg_nosec net/socket.c:716 [inline] [<ffffffff84669a12>] sock_sendmsg net/socket.c:736 [inline] [<ffffffff84669a12>] __sys_sendto+0x472/0x5f0 net/socket.c:2139 [<ffffffff84669c75>] __do_sys_sendto net/socket.c:2151 [inline] [<ffffffff84669c75>] __se_sys_sendto net/socket.c:2147 [inline] [<ffffffff84669c75>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147 [<ffffffff8551d40f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff8551d40f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80 [<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 469aceddfa3e ("vlan: consolidate VLAN parsing code and limit max parsing depth") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24net: stmmac: switch to use interrupt for hw crosstimestampingWong Vee Khee1-0/+1
[ Upstream commit 76c16d3e19446deea98b7883f261758b96b8781a ] Using current implementation of polling mode, there is high chances we will hit into timeout error when running phc2sys. Hence, update the implementation of hardware crosstimestamping to use the MAC interrupt service routine instead of polling for TSIS bit in the MAC Timestamp Interrupt Status register to be set. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 8efbdbfa9938 ("net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24linux/dim: Do nothing if no time delta between samplesRoy Novich1-1/+2
[ Upstream commit 162bd18eb55adf464a0fa2b4144b8d61c75ff7c2 ] Add return value for dim_calc_stats. This is an indication for the caller if curr_stats was assigned by the function. Avoid using curr_stats uninitialized over {rdma/net}_dim, when no time delta between samples. Coverity reported this potential use of an uninitialized variable. Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17crypto: api - Add scaffolding to change completion function signatureHerbert Xu1-0/+6
[ Upstream commit c35e03eaece71101ff6cbf776b86403860ac8cc3 ] The crypto completion function currently takes a pointer to a struct crypto_async_request object. However, in reality the API does not allow the use of any part of the object apart from the data field. For example, ahash/shash will create a fake object on the stack to pass along a different data field. This leads to potential bugs where the user may try to dereference or otherwise use the crypto_async_request object. This patch adds some temporary scaffolding so that the completion function can take a void * instead. Once affected users have been converted this can be removed. The helper crypto_request_complete will remain even after the conversion is complete. It should be used instead of calling the completion function directly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Stable-dep-of: 4140aafcff16 ("crypto: engine - fix crypto_queue backlog handling") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11mailbox: zynqmp: Fix typo in IPI documentationTanmay Shah1-1/+1
commit 79963fbfc233759bd8a43462f120d15a1bd4f4fa upstream. Xilinx IPI message buffers allows 32-byte data transfer. Fix documentation that says 12 bytes Fixes: 4981b82ba2ff ("mailbox: ZynqMP IPI mailbox controller") Signed-off-by: Tanmay Shah <tanmay.shah@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230311012407.1292118-4-tanmay.shah@amd.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-11SUNRPC: remove the maximum number of retries in call_bind_statusDai Ngo1-2/+1
[ Upstream commit 691d0b782066a6eeeecbfceb7910a8f6184e6105 ] Currently call_bind_status places a hard limit of 3 to the number of retries on EACCES error. This limit was done to prevent NLM unlock requests from being hang forever when the server keeps returning garbage. However this change causes problem for cases when NLM service takes longer than 9 seconds to register with the port mapper after a restart. This patch removes this hard coded limit and let the RPC handles the retry based on the standard hard/soft task semantics. Fixes: 0b760113a3a1 ("NLM: Don't hang forever on NLM unlock requests") Reported-by: Helen Chao <helen.chao@oracle.com> Tested-by: Helen Chao <helen.chao@oracle.com> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11RDMA/mlx5: Fix flow counter query via DEVXMark Bloch1-1/+2
[ Upstream commit 3e358ea8614ddfbc59ca7a3f5dff5dde2b350b2c ] Commit cited in "fixes" tag added bulk support for flow counters but it didn't account that's also possible to query a counter using a non-base id if the counter was allocated as bulk. When a user performs a query, validate the flow counter id given in the mailbox is inside the valid range taking bulk value into account. Fixes: 208d70f562e5 ("IB/mlx5: Support flow counters offset for bulk counters") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/79d7fbe291690128e44672418934256254d93115.1681377114.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11workqueue: Introduce show_one_worker_pool and show_one_workqueue.Imran Khan1-1/+2
[ Upstream commit 55df0933be74bd2e52aba0b67eb743ae0feabe7e ] Currently show_workqueue_state shows the state of all workqueues and of all worker pools. In certain cases we may need to dump state of only a specific workqueue or worker pool. For example in destroy_workqueue we only need to show state of the workqueue which is getting destroyed. So rename show_workqueue_state to show_all_workqueues(to signify it dumps state of all busy workqueues) and divide it into more granular functions (show_one_workqueue and show_one_worker_pool), that would show states of individual workqueues and worker pools and can be used in cases such as the one mentioned above. Also, as mentioned earlier, make destroy_workqueue dump data pertaining to only the workqueue that is being destroyed and make user(s) of earlier interface(show_workqueue_state), use new interface (show_all_workqueues). Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Stable-dep-of: 335a42ebb0ca ("workqueue: Fix hung time report of worker pools") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11sched: Make struct sched_statistics independent of fair sched classYafang Shao1-3/+3
[ Upstream commit ceeadb83aea28372e54857bf88ab7e17af48ab7b ] If we want to use the schedstats facility to trace other sched classes, we should make it independent of fair sched class. The struct sched_statistics is the schedular statistics of a task_struct or a task_group. So we can move it into struct task_struct and struct task_group to achieve the goal. After the patch, schestats are orgnized as follows, struct task_struct { ... struct sched_entity se; struct sched_rt_entity rt; struct sched_dl_entity dl; ... struct sched_statistics stats; ... }; Regarding the task group, schedstats is only supported for fair group sched, and a new struct sched_entity_stats is introduced, suggested by Peter - struct sched_entity_stats { struct sched_entity se; struct sched_statistics stats; } __no_randomize_layout; Then with the se in a task_group, we can easily get the stats. The sched_statistics members may be frequently modified when schedstats is enabled, in order to avoid impacting on random data which may in the same cacheline with them, the struct sched_statistics is defined as cacheline aligned. As this patch changes the core struct of scheduler, so I verified the performance it may impact on the scheduler with 'perf bench sched pipe', suggested by Mel. Below is the result, in which all the values are in usecs/op. Before After kernel.sched_schedstats=0 5.2~5.4 5.2~5.4 kernel.sched_schedstats=1 5.3~5.5 5.3~5.5 [These data is a little difference with the earlier version, that is because my old test machine is destroyed so I have to use a new different test machine.] Almost no impact on the sched performance. No functional change. [lkp@intel.com: reported build failure in earlier version] Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@suse.de> Link: https://lore.kernel.org/r/20210905143547.4668-3-laoar.shao@gmail.com Stable-dep-of: 39afe5d6fc59 ("sched/fair: Fix inaccurate tally of ttwu_move_affine") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11linux/vt_buffer.h: allow either builtin or modular for macrosRandy Dunlap1-1/+1
[ Upstream commit 2b76ffe81e32afd6d318dc4547e2ba8c46207b77 ] Fix build errors on ARCH=alpha when CONFIG_MDA_CONSOLE=m. This allows the ARCH macros to be the only ones defined. In file included from ../drivers/video/console/mdacon.c:37: ../arch/alpha/include/asm/vga.h:17:40: error: expected identifier or '(' before 'volatile' 17 | static inline void scr_writew(u16 val, volatile u16 *addr) | ^~~~~~~~ ../include/linux/vt_buffer.h:24:34: note: in definition of macro 'scr_writew' 24 | #define scr_writew(val, addr) (*(addr) = (val)) | ^~~~ ../include/linux/vt_buffer.h:24:40: error: expected ')' before '=' token 24 | #define scr_writew(val, addr) (*(addr) = (val)) | ^ ../arch/alpha/include/asm/vga.h:17:20: note: in expansion of macro 'scr_writew' 17 | static inline void scr_writew(u16 val, volatile u16 *addr) | ^~~~~~~~~~ ../arch/alpha/include/asm/vga.h:25:29: error: expected identifier or '(' before 'volatile' 25 | static inline u16 scr_readw(volatile const u16 *addr) | ^~~~~~~~ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: dri-devel@lists.freedesktop.org Cc: linux-fbdev@vger.kernel.org Link: https://lore.kernel.org/r/20230329021529.16188-1-rdunlap@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11netfilter: nf_tables: don't write table validation state without mutexFlorian Westphal1-1/+0
[ Upstream commit 9a32e9850686599ed194ccdceb6cd3dd56b2d9b9 ] The ->cleanup callback needs to be removed, this doesn't work anymore as the transaction mutex is already released in the ->abort function. Just do it after a successful validation pass, this either happens from commit or abort phases where transaction mutex is held. Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvme: handle the persistent internal error AERMichael Kelley1-0/+4
[ Upstream commit 2c61c97fb12b806e1c8eb15f04c277ad097ec95e ] In the NVM Express Revision 1.4 spec, Figure 145 describes possible values for an AER with event type "Error" (value 000b). For a Persistent Internal Error (value 03h), the host should perform a controller reset. Add support for this error using code that already exists for doing a controller reset. As part of this support, introduce two utility functions for parsing the AER type and subtype. This new support was tested in a lab environment where we can generate the persistent internal error on demand, and observe both the Linux side and NVMe controller side to see that the controller reset has been done. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: 6622b76fe922 ("nvme: fix async event trace event") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11platform: Provide a remove callback that returns no valueUwe Kleine-König1-0/+11
[ Upstream commit 5c5a7680e67ba6fbbb5f4d79fa41485450c1985c ] struct platform_driver::remove returning an integer made driver authors expect that returning an error code was proper error handling. However the driver core ignores the error and continues to remove the device because there is nothing the core could do anyhow and reentering the remove callback again is only calling for trouble. So this is an source for errors typically yielding resource leaks in the error path. As there are too many platform drivers to neatly convert them all to return void in a single go, do it in several steps after this patch: a) Convert all drivers to implement .remove_new() returning void instead of .remove() returning int; b) Change struct platform_driver::remove() to return void and so make it identical to .remove_new(); c) Change all drivers back to .remove() now with the better prototype; d) drop struct platform_driver::remove_new(). While this touches all drivers eventually twice, steps a) and c) can be done one driver after another and so reduces coordination efforts immensely and simplifies review. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20221209150914.3557650-1-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: c766c90faf93 ("media: rcar_fdp1: Fix refcount leak in probe and remove function") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystemJoel Fernandes (Google)1-0/+2
commit 58d7668242647e661a20efe065519abd6454287e upstream. For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined. However, cpu_is_hotpluggable() still returns true for those CPUs. This causes torture tests that do offlining to end up trying to offline this CPU causing test failures. Such failure happens on all architectures. Fix the repeated error messages thrown by this (even if the hotplug errors are harmless) by asking the opinion of the nohz subsystem on whether the CPU can be hotplugged. [ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ] For drivers/base/ portion: Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Zhouyi Zhou <zhouzhouyi@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: rcu <rcu@vger.kernel.org> Cc: stable@vger.kernel.org Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-11blk-crypto: make blk_crypto_evict_key() return voidEric Biggers1-2/+2
commit 70493a63ba04f754f7a7dd53a4fcc82700181490 upstream. blk_crypto_evict_key() is only called in contexts such as inode eviction where failure is not an option. So there is nothing the caller can do with errors except log them. (dm-table.c does "use" the error code, but only to pass on to upper layers, so it doesn't really count.) Just make blk_crypto_evict_key() return void and log errors itself. Cc: s