summaryrefslogtreecommitdiff
path: root/drivers/ptp
AgeCommit message (Collapse)AuthorFilesLines
2025-02-21ptp: vmclock: Set driver data before its usageThomas Weißschuh1-2/+2
commit f7d07cd4f77d77f366c8ffbb8ba8b61f614e5fce upstream. If vmclock_ptp_register() fails during probing, vmclock_remove() is called to clean up the ptp clock and misc device. It uses dev_get_drvdata() to access the vmclock state. However the driver data is not yet set at this point. Assign the driver data earlier. Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21ptp: vmclock: Add .owner to vmclock_miscdev_fopsDavid Woodhouse1-0/+1
commit 7b07b040257c1b658ef3eca86e4b6ae02d65069c upstream. Without the .owner field, the module can be unloaded while /dev/vmclock0 is open, leading to an oops. Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: stable@vger.kernel.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21ptp: vmclock: Don't unregister misc device if it was not registeredThomas Weißschuh1-1/+2
commit 39e926c3a21b25af6cae479fbb752f193240ce03 upstream. vmclock_remove() tries to detect the successful registration of the misc device based on the value of its minor value. However that check is incorrect if the misc device registration was not attempted in the first place. Always initialize the minor number, so the check works properly. Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17ptp: Ensure info->enable callback is always setThomas Weißschuh1-0/+8
commit fd53aa40e65f518453115b6f56183b0c201db26b upstream. The ioctl and sysfs handlers unconditionally call the ->enable callback. Not all drivers implement that callback, leading to NULL dereferences. Example of affected drivers: ptp_s390.c, ptp_vclock.c and ptp_mock.c. Instead use a dummy callback if no better was specified by the driver. Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250123-ptp-enable-v1-1-b015834d3a47@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08pps: Fix a use-after-freeCalvin Owens1-1/+1
commit c79a39dc8d060b9e64e8b0fa9d245d44befeefbe upstream. On a board running ntpd and gpsd, I'm seeing a consistent use-after-free in sys_exit() from gpsd when rebooting: pps pps1: removed ------------[ cut here ]------------ kobject: '(null)' (00000000db4bec24): is not initialized, yet kobject_put() is being called. WARNING: CPU: 2 PID: 440 at lib/kobject.c:734 kobject_put+0x120/0x150 CPU: 2 UID: 299 PID: 440 Comm: gpsd Not tainted 6.11.0-rc6-00308-gb31c44928842 #1 Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kobject_put+0x120/0x150 lr : kobject_put+0x120/0x150 sp : ffffffc0803d3ae0 x29: ffffffc0803d3ae0 x28: ffffff8042dc9738 x27: 0000000000000001 x26: 0000000000000000 x25: ffffff8042dc9040 x24: ffffff8042dc9440 x23: ffffff80402a4620 x22: ffffff8042ef4bd0 x21: ffffff80405cb600 x20: 000000000008001b x19: ffffff8040b3b6e0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 696e6920746f6e20 x14: 7369203a29343263 x13: 205d303434542020 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: kobject_put+0x120/0x150 cdev_put+0x20/0x3c __fput+0x2c4/0x2d8 ____fput+0x1c/0x38 task_work_run+0x70/0xfc do_exit+0x2a0/0x924 do_group_exit+0x34/0x90 get_signal+0x7fc/0x8c0 do_signal+0x128/0x13b4 do_notify_resume+0xdc/0x160 el0_svc+0xd4/0xf8 el0t_64_sync_handler+0x140/0x14c el0t_64_sync+0x190/0x194 ---[ end trace 0000000000000000 ]--- ...followed by more symptoms of corruption, with similar stacks: refcount_t: underflow; use-after-free. kernel BUG at lib/list_debug.c:62! Kernel panic - not syncing: Oops - BUG: Fatal exception This happens because pps_device_destruct() frees the pps_device with the embedded cdev immediately after calling cdev_del(), but, as the comment above cdev_del() notes, fops for previously opened cdevs are still callable even after cdev_del() returns. I think this bug has always been there: I can't explain why it suddenly started happening every time I reboot this particular board. In commit d953e0e837e6 ("pps: Fix a use-after free bug when unregistering a source."), George Spelvin suggested removing the embedded cdev. That seems like the simplest way to fix this, so I've implemented his suggestion, using __register_chrdev() with pps_idr becoming the source of truth for which minor corresponds to which device. But now that pps_idr defines userspace visibility instead of cdev_add(), we need to be sure the pps->dev refcount can't reach zero while userspace can still find it again. So, the idr_remove() call moves to pps_unregister_cdev(), and pps_idr now holds a reference to pps->dev. pps_core: source serial1 got cdev (251:1) <...> pps pps1: removed pps_core: unregistering pps1 pps_core: deallocating pps1 Fixes: d953e0e837e6 ("pps: Fix a use-after free bug when unregistering a source.") Cc: stable@vger.kernel.org Signed-off-by: Calvin Owens <calvin@wbinvd.org> Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Link: https://lore.kernel.org/r/a17975fd5ae99385791929e563f72564edbcf28f.1731383727.git.calvin@wbinvd.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08ptp: Properly handle compat ioctlsThomas Weißschuh1-0/+4
[ Upstream commit 19ae40f572a9ce1ade9954990af709a03fd37010 ] Pointer arguments passed to ioctls need to pass through compat_ptr() to work correctly on s390; as explained in Documentation/driver-api/ioctl.rst. Detect compat mode at runtime and call compat_ptr() for those commands which do take pointer arguments. Suggested-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/lkml/1ba5d3a4-7931-455b-a3ce-85a968a7cb10@app.fastmail.com/ Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://patch.msgid.link/20250125-posix-clock-compat_ioctl-v2-1-11c865c500eb@weissschuh.net Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-06ptp: kvm: x86: Return EOPNOTSUPP instead of ENODEV from kvm_arch_ptp_init()Thomas Weißschuh1-3/+3
The caller, ptp_kvm_init(), emits a warning if kvm_arch_ptp_init() exits with any error which is not EOPNOTSUPP: "fail to initialize ptp_kvm" Replace ENODEV with EOPNOTSUPP to avoid this spurious warning, aligning with the ARM implementation. Fixes: a86ed2cfa13c ("ptp: Don't print an error if ptp_kvm is not supported") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20241203-kvm_ptp-eopnotsuppp-v2-1-d1d060f27aa6@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05Merge tag 'net-6.13-rc2' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can and netfilter. Current release - regressions: - rtnetlink: fix double call of rtnl_link_get_net_ifla() - tcp: populate XPS related fields of timewait sockets - ethtool: fix access to uninitialized fields in set RXNFC command - selinux: use sk_to_full_sk() in selinux_ip_output() Current release - new code bugs: - net: make napi_hash_lock irq safe - eth: - bnxt_en: support header page pool in queue API - ice: fix NULL pointer dereference in switchdev Previous releases - regressions: - core: fix icmp host relookup triggering ip_rt_bug - ipv6: - avoid possible NULL deref in modify_prefix_route() - release expired exception dst cached in socket - smc: fix LGR and link use-after-free issue - hsr: avoid potential out-of-bound access in fill_frame_info() - can: hi311x: fix potential use-after-free - eth: ice: fix VLAN pruning in switchdev mode Previous releases - always broken: - netfilter: - ipset: hold module reference while requesting a module - nft_inner: incorrect percpu area handling under softirq - can: j1939: fix skb reference counting - eth: - mlxsw: use correct key block on Spectrum-4 - mlx5: fix memory leak in mlx5hws_definer_calc_layout" * tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits) net :mana :Request a V2 response version for MANA_QUERY_GF_STAT net: avoid potential UAF in default_operstate() vsock/test: verify socket options after setting them vsock/test: fix parameter types in SO_VM_SOCKETS_* calls vsock/test: fix failures due to wrong SO_RCVLOWAT parameter net/mlx5e: Remove workaround to avoid syndrome for internal port net/mlx5e: SD, Use correct mdev to build channel param net/mlx5: E-Switch, Fix switching to switchdev mode in MPV net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled net/mlx5: HWS: Properly set bwc queue locks lock classes net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout bnxt_en: handle tpa_info in queue API implementation bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap() bnxt_en: refactor tpa_info alloc/free into helpers geneve: do not assume mac header is set in geneve_xmit_skb() mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4 ethtool: Fix wrong mod state in case of verbose and no_mask bitset ipmr: tune the ipmr_can_free_table() checks. netfilter: nft_set_hash: skip duplicated elements pending gc run netfilter: ipset: Hold module reference while requesting a module ...
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds7-7/+7
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-30ptp: Add error handling for adjfine callback in ptp_clock_adjtimeAjay Kaher1-1/+2
ptp_clock_adjtime sets ptp->dialed_frequency even when adjfine callback returns an error. This causes subsequent reads to return an incorrect value. Fix this by adding error check before ptp->dialed_frequency is set. Fixes: 39a8cbd9ca05 ("ptp: remember the adjusted frequency") Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://patch.msgid.link/20241125105954.1509971-1-ajay.kaher@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04ptp: Remove 'default y' for VMCLOCK PTP deviceDavid Woodhouse1-1/+5
The VMCLOCK device gives support for accurate timekeeping even across live migration, unlike the KVM PTP clock. To help ensure that users can always use ptp_vmclock where it's available in preference to ptp_kvm, set it to 'default PTP_1588_CLOCK_VMCLOCK' instead of 'default y'. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/89955b74d225129d6e3d79b53aa8d81d1b50560f.camel@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03ptp: fc3: remove redundant check on variable retColin Ian King1-5/+0
The check on ret has already been performed a few statements earlier and ret has not been re-assigned and so the re-checking is redundant. Clean up the code by removing the redundant check. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20241031135042.3250614-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31ptp_pch: Replace deprecated PCI functionsPhilipp Stanner1-3/+3
pcim_iomap_regions() and pcim_iomap_table() have been deprecated in commit e354bb84a4c1 ("PCI: Deprecate pcim_iomap_table(), pcim_iomap_regions_request_all()"). Replace these functions with pcim_iomap_region(). Additionally, pass KBUILD_MODNAME to that function, since the 'name' parameter should indicate who (i.e., which driver) has requested the resource. Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patch.msgid.link/20241028095943.20498-2-pstanner@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30s390/time: Add PtP driverSven Schnelle3-0/+141
Add a small PtP driver which allows user space to get the values of the physical and tod clock. This allows programs like chrony to use STP as clock source and steer the kernel clock. The physical clock can be used as a debugging aid to get the clock without any additional offsets like STP steering or LPAR offset. Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Link: https://patch.msgid.link/20241023065601.449586-3-svens@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ptp: Add support for the AMZNC10C 'vmclock' deviceDavid Woodhouse3-0/+629
The vmclock device addresses the problem of live migration with precision clocks. The tolerances of a hardware counter (e.g. TSC) are typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that counter against an external source of 'real' time, and track the precise frequency of the counter as it changes with environmental conditions. When a guest is live migrated, anything it knows about the frequency of the underlying counter becomes invalid. It may move from a host where the counter running at -50PPM of its nominal frequency, to a host where it runs at +50PPM. There will also be a step change in the value of the counter, as the correctness of its absolute value at migration is limited by the accuracy of the source and destination host's time synchronization. In its simplest form, the device merely advertises a 'disruption_marker' which indicates that the guest should throw away any NTP synchronization it thinks it has, and start again. Because the shared memory region can be exposed all the way to userspace through the /dev/vmclock0 node, applications can still use time from a fast vDSO 'system call', and check the disruption marker to be sure that their timestamp is indeed truthful. The structure also allows for the precise time, as known by the host, to be exposed directly to guests so that they don't have to wait for NTP to resync from scratch. The PTP driver consumes this information if present. Like the KVM PTP clock, this PTP driver can convert TSC-based cross timestamps into KVM clock values. Unlike the KVM PTP clock, it does so only when such is actually helpful. The values and fields are based on the nascent virtio-rtc specification, and the intent is that a version (hopefully precisely this version) of this structure will be included as an optional part of that spec. In the meantime, this driver supports the simple ACPI form of the device which is being shipped in certain commercial hypervisors (and submitted for inclusion in QEMU). Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-02move asm/unaligned.h to linux/unaligned.hAl Viro2-2/+2
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-09-09ptp: ptp_ines: Remove setting of RX software timestampGal Pressman1-4/+0
The responsibility for reporting of RX software timestamp has moved to the core layer (see __ethtool_get_ts_info()), remove usage from the device drivers. Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20240906144632.404651-17-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-08ptp/ioctl: support MONOTONIC{,_RAW} timestamps for PTP_SYS_OFFSET_EXTENDEDMahesh Bandewar1-2/+6
The ability to read the PHC (Physical Hardware Clock) alongside multiple system clocks is currently dependent on the specific hardware architecture. This limitation restricts the use of PTP_SYS_OFFSET_PRECISE to certain hardware configurations. The generic soultion which would work across all architectures is to read the PHC along with the latency to perform PHC-read as offered by PTP_SYS_OFFSET_EXTENDED which provides pre and post timestamps. However, these timestamps are currently limited to the CLOCK_REALTIME timebase. Since CLOCK_REALTIME is affected by NTP (or similar time synchronization services), it can experience significant jumps forward or backward. This hinders the precise latency measurements that PTP_SYS_OFFSET_EXTENDED is designed to provide. This problem could be addressed by supporting MONOTONIC_RAW timestamps within PTP_SYS_OFFSET_EXTENDED. Unlike CLOCK_REALTIME or CLOCK_MONOTONIC, the MONOTONIC_RAW timebase is unaffected by NTP adjustments. This enhancement can be implemented by utilizing one of the three reserved words within the PTP_SYS_OFFSET_EXTENDED struct to pass the clock-id for timestamps. The current behavior aligns with clock-id for CLOCK_REALTIME timebase (value of 0), ensuring backward compatibility of the UAPI. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-06ptp: ocp: Improve PCIe delay estimationVadim Fedorenko1-9/+11
The PCIe bus can be pretty busy during boot and probe function can see excessive delays. Let's find the minimal value out of several tests and use it as estimated value. Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20240905140028.560454-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-67/+101
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/phy/phy_device.c 2560db6ede1a ("net: phy: Fix missing of_node_put() for leds") 1dce520abd46 ("net: phy: Use for_each_available_child_of_node_scoped()") https://lore.kernel.org/20240904115823.74333648@canb.auug.org.au Adjacent changes: drivers/net/ethernet/xilinx/xilinx_axienet.h drivers/net/ethernet/xilinx/xilinx_axienet_main.c 858430db28a5 ("net: xilinx: axienet: Fix race in axienet_stop") 76abb5d675c4 ("net: xilinx: axienet: Add statistics support") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05ptp: ptp_idt82p33: Convert comma to semicolonChen Ni1-4/+4
Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240904015003.1065872-1-nichen@iscas.ac.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03ptp: ocp: adjust sysfs entries to expose tty informationVadim Fedorenko1-11/+51
Implement additional attribute group to expose serial port information. Fixes tag points to the commit which introduced the change in serial port subsystem and made it impossible to use symlinks. Fixes: b286f4e87e32 ("serial: core: Move tty and serdev to be children of serial core port device") Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03ptp: ocp: convert serial ports to arrayVadim Fedorenko1-63/+57
Simplify serial port management code by using array of ports and helpers to get the name of the port. This change is needed to make the next patch simplier. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-16Merge tag 'net-next-6.11' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Not much excitement - a handful of large patchsets (devmem among them) did not make it in time. Core & protocols: - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT - Use flex array for netdevice priv area, ensure its cache alignment - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful - Support scheduling transmission of packets based on CLOCK_TAI - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync - Improve TCP compliance with RFC 9293 for simultaneous connect() - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled - Introduce IPPROTO_SMC for selecting SMC when socket is created - Allow UDP GSO transmit from devices with no checksum offload - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding - nf_tables: shrink memory consumption for transaction objects Things we sprinkled into general kernel code: - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390) [ Already merged separately - Linus ] - Add IRQ information in sysfs for auxiliary bus - Introduce guard definition for local_lock - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures BPF: - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered - Add new kfuncs for a generic, open-coded bits iterator - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs Driver API: - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits - Track additional RSS contexts in the core, make sure configuration changes don't break them - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths - Support updating firmware on SFP modules Tests and tooling: - mptcp: use net/lib.sh to manage netns - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints - openvswitch: make test self-contained (don't depend on OvS CLI tools) Drivers: - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591" * tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits) eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering" tcp: Replace strncpy() with strscpy() wifi: ath12k: fix build vs old compiler tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child(). eth: fbnic: Write the TCAM tables used for RSS control and Rx to host eth: fbnic: Add L2 address programming eth: fbnic: Add basic Rx handling eth: fbnic: Add basic Tx handling eth: fbnic: Add link detection eth: fbnic: Add initial messaging to notify FW of our presence eth: fbnic: Implement Rx queue alloc/start/stop/free eth: fbnic: Implement Tx queue alloc/start/stop/free eth: fbnic: Allocate a netdevice and napi vectors with queues eth: fbnic: Add FW communication mechanism eth: fbnic: Add message parsing for FW messages eth: fbnic: Add register init to set PCIe/Ethernet device config eth: fbnic: Allocate core device specific structures and devlink interface eth: fbnic: Add scaffolding for Meta's NIC driver PCI: Add Meta Platforms vendor ID net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK ...
2024-07-15net: Add struct kernel_ethtool_ts_infoKory Maincent1-1/+1
In prevision to add new UAPI for hwtstamp we will be limited to the struct ethtool_ts_info that is currently passed in fixed binary format through the ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code already started operating on an extensible kernel variant of that structure, similar in concept to struct kernel_hwtstamp_config vs struct hwtstamp_config. Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here we introduce the kernel-only structure in include/linux/ethtool.h. The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO. Acked-by: Shannon Nelson <shannon.nelson@amd.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25ptp/vmware: Use VMware hypercall APIAlexey Makhalov1-9/+3
Switch from VMWARE_HYPERCALL macro to vmware_hypercall API. Eliminate arch specific code. No functional changes intended. Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240613191650.9913-3-alexey.makhalov@broadcom.com
2024-06-18ptp: fix integer overflow in max_vclocks_storeDan Carpenter1-2/+1
On 32bit systems, the "4 * max" multiply can overflow. Use kcalloc() to do the allocation to prevent this. Fixes: 44c494c8e30e ("ptp: track available ptp vclocks information") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Link: https://lore.kernel.org/r/ee8110ed-6619-4bd7-9024-28c1f2ac24f4@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ptp: Fix error message on failed pin verificationKarol Kolacinski1-1/+2
On failed verification of PTP clock pin, error message prints channel number instead of pin index after "pin", which is incorrect. Fix error message by adding channel number to the message and printing pin number instead of channel number. Fixes: 6092315dfdec ("ptp: introduce programmable pins.") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20240604120555.16643-1-karol.kolacinski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-14Merge tag 'net-next-6.10' of ↵Linus Torvalds6-21/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Complete rework of garbage collection of AF_UNIX sockets. AF_UNIX is prone to forming reference count cycles due to fd passing functionality. New method based on Tarjan's Strongly Connected Components algorithm should be both faster and remove a lot of workarounds we accumulated over the years. - Add TCP fraglist GRO support, allowing chaining multiple TCP packets and forwarding them together. Useful for small switches / routers which lack basic checksum offload in some scenarios (e.g. PPPoE). - Support using SMP threads for handling packet backlog i.e. packet processing from software interfaces and old drivers which don't use NAPI. This helps move the processing out of the softirq jumble. - Continue work of converting from rtnl lock to RCU protection. Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link information available via rtnetlink. - Small optimizations from Eric to UDP wake up handling, memory accounting, RPS/RFS implementation, TCP packet sizing etc. - Allow direct page recycling in the bulk API used by XDP, for +2% PPS. - Support peek with an offset on TCP sockets. - Add MPTCP APIs for querying last time packets were received/sent/acked and whether MPTCP "upgrade" succeeded on a TCP socket. - Add intra-node communication shortcut to improve SMC performance. - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver. - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver. - Add reset reasons for tracing what caused a TCP reset to be sent. - Introduce direction attribute for xfrm (IPSec) states. State can be used either for input or output packet processing. Things we sprinkled into general kernel code: - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS(). This required touch-ups and renaming of a few existing users. - Add Endian-dependent __counted_by_{le,be} annotations. - Make building selftests "quieter" by printing summaries like "CC object.o" rather than full commands with all the arguments. Netfilter: - Use GFP_KERNEL to clone elements, to deal better with OOM situations and avoid failures in the .commit step. BPF: - Add eBPF JIT for ARCv2 CPUs. - Support attaching kprobe BPF programs through kprobe_multi link in a session mode, meaning, a BPF program is attached to both function entry and return, the entry program can decide if the return program gets executed and the entry program can share u64 cookie value with return program. "Session mode" is a common use-case for tetragon and bpftrace. - Add the ability to specify and retrieve BPF cookie for raw tracepoint programs in order to ease migration from classic to raw tracepoints. - Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86, ARM64 and RISC-V JITs. This allows inlining functions which need to access per-CPU state. - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction. Support BPF arena on ARM64. - Add a new bpf_wq API for deferring events and refactor process-context bpf_timer code to keep common code where possible. - Harden the BPF verifier's and/or/xor value tracking. - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs. - Support bpf_tail_call_static() helper for BPF programs with GCC 13. - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled. Driver API: - Skip software TC processing completely if all installed rules are marked as HW-only, instead of checking the HW-only flag rule by rule. - Add support for configuring PoE (Power over Ethernet), similar to the already existing support for PoDL (Power over Data Line) config. - Initial bits of a queue control API, for now allowing a single queue to be reset without disturbing packet flow to other queues. - Common (ethtool) statistics for hardware timestamping. Tests and tooling: - Remove the need to create a config file to run the net forwarding tests so that a naive "make run_tests" can exercise them. - Define a method of writing tests which require an external endpoint to communicate with (to send/receive data towards the test machine). Add a few such tests. - Create a shared code library for writing Python tests. Expose the YAML Netlink library from tools/ to the tests for easy Netlink access. - Move netfilter tests under net/, extend them, separate performance tests from correctness tests, and iron out issues found by running them "on every commit". - Refactor BPF selftests to use common network helpers. - Further work filling in YAML definitions of Netlink messages for: nftables, team driver, bonding interfaces, vlan interfaces, VF info, TC u32 mark, TC police action. - Teach Python YAML Netlink to decode attribute policies. - Extend the definition of the "indexed array" construct in the specs to cover arrays of scalars rather than just nests. - Add hyperlinks between definitions in generated Netlink docs. Drivers: - Make sure unsupported flower control flags are rejected by drivers, and make more drivers report errors directly to the application rather than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen). - Ethernet high-speed NICs: - Broadcom (bnxt): - support multiple RSS contexts and steering traffic to them - support XDP metadata - make page pool allocations more NUMA aware - Intel (100G, ice, idpf): - extract datapath code common among Intel drivers into a library - use fewer resources in switchdev by sharing queues with the PF - add PFCP filter support - add Ethernet filter support - use a spinlock instead of HW lock in PTP clock ops - support 5 layer Tx scheduler topology - nVidia/Mellanox: - 800G link modes and 100G SerDes speeds - per-queue IRQ coalescing configuration - Marvell Octeon: - support offloading TC packet mark action - Ethernet NICs consumer, embedded and virtual: - stop lying about skb->truesize in USB Ethernet drivers, it messes up TCP memory calculations - Google cloud vNIC: - support changing ring size via ethtool - support ring reset using the queue control API - VirtIO net: - expose flow hash from RSS to XDP - per-queue statistics - add selftests - Synopsys (stmmac): - support controllers which require an RX clock signal from the MII bus to perform their hardware initialization - TI: - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices - icssg_prueth: add SW TX / RX Coalescing based on hrtimers - cpsw: minimal XDP support - Renesas (ravb): - support describing the MDIO bus - Realtek (r8169): - add support for RTL8168M - Microchip Sparx5: - matchall and flower actions mirred and redirect - Ethernet switches: - nVidia/Mellanox: - improve events processing performance - Marvell: - add support for MV88E6250 family internal PHYs - Microchip: - add DCB and DSCP mapping support for KSZ switches - vsc73xx: convert to PHYLINK - Realtek: - rtl8226b/rtl8221b: add C45 instances and SerDes switching - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup - Ethernet PHYs: - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY. - micrel: lan8814: add support for PPS out and external timestamp trigger - WiFi: - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers. Modern devices can only be configured using nl80211. - mac80211/cfg80211 - handle color change per link for WiFi 7 Multi-Link Operation - Intel (iwlwifi): - don't support puncturing in 5 GHz - support monitor mode on passive channels - BZ-W device support - P2P with HE/EHT support - re-add support for firmware API 90 - provide channel survey information for Automatic Channel Selection - MediaTek (mt76): - mt7921 LED control - mt7925 EHT radiotap support - mt7920e PCI support - Qualcomm (ath11k): - P2P support for QCA6390, WCN6855 and QCA2066 - support hibernation - ieee80211-freq-limit Device Tree property support - Qualcomm (ath12k): - refactoring in preparation of multi-link support - suspend and hibernation support - ACPI support - debugfs support, including dfs_simulate_radar support - RealTek: - rtw88: RTL8723CS SDIO device support - rtw89: RTL8922AE Wi-Fi 7 PCI device support - rtw89: complete features of new WiFi 7 chip 8922AE including BT-coexistence and Wake-on-WLAN - rtw89: use BIOS ACPI settings to set TX power and channels - rtl8xxxu: enable Management Frame Protection (MFP) support - Bluetooth: - support for Intel BlazarI and Filmore Peak2 (BE201) - support for MediaTek MT7921S SDIO - initial support for Intel PCIe BT driver - remove HCI_AMP support" * tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits) selftests: netfilter: fix packetdrill conntrack testcase net: gro: fix napi_gro_cb zeroed alignment Bluetooth: btintel_pcie: Refactor and code cleanup Bluetooth: btintel_pcie: Fix warning reported by sparse Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1 Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config Bluetooth: btintel_pcie: Fix compiler warnings Bluetooth: btintel_pcie: Add *setup* function to download firmware Bluetooth: btintel_pcie: Add support for PCIe transport Bluetooth: btintel: Export few static functions Bluetooth: HCI: Remove HCI_AMP support Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init() Bluetooth: qca: Fix error code in qca_read_fw_build_info() Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning Bluetooth: btintel: Add support for Filmore Peak2 (BE201) Bluetooth: btintel: Add support for BlazarI LE Create Connection command timeout increased to 20 secs dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth Bluetooth: compute LE flow credits based on recvbuf space Bluetooth: hci_sync: Use cmd->num_cis instead of magic number ...
2024-05-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-3/+3
Merge in late fixes to prepare for the 6.10 net-next PR. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-09ptp: ocp: fix DPLL functionsVadim Fedorenko1-3/+3
In ptp_ocp driver pin actions assume sma_nr starts with 1, but for DPLL subsystem callback 0-based index was used. Fix it providing proper index. Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops") Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20240508132111.11545-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-12ptp: ptp_qoriq: Convert to platform remove callback returning voidUwe Kleine-König1-3/+2
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Richard C