summaryrefslogtreecommitdiff
path: root/Documentation/networking
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-10-02 15:17:01 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-10-02 15:17:01 -0700
commit07fdad3a93756b872da7b53647715c48d0f4a2d0 (patch)
tree133af559ac91e6b24358b57a025abc060a782129 /Documentation/networking
parentf79e772258df311c2cb21594ca0996318e720d28 (diff)
parentf1455695d2d99894b65db233877acac9a0e120b9 (diff)
downloadlinux-07fdad3a93756b872da7b53647715c48d0f4a2d0.tar.gz
linux-07fdad3a93756b872da7b53647715c48d0f4a2d0.tar.bz2
linux-07fdad3a93756b872da7b53647715c48d0f4a2d0.zip
Merge tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core & protocols: - Improve drop account scalability on NUMA hosts for RAW and UDP sockets and the backlog, almost doubling the Pps capacity under DoS - Optimize the UDP RX performance under stress, reducing contention, revisiting the binary layout of the involved data structs and implementing NUMA-aware locking. This improves UDP RX performance by an additional 50%, even more under extreme conditions - Add support for PSP encryption of TCP connections; this mechanism has some similarities with IPsec and TLS, but offers superior HW offloads capabilities - Ongoing work to support Accurate ECN for TCP. AccECN allows more than one congestion notification signal per RTT and is a building block for Low Latency, Low Loss, and Scalable Throughput (L4S) - Reorganize the TCP socket binary layout for data locality, reducing the number of touched cachelines in the fastpath - Refactor skb deferral free to better scale on large multi-NUMA hosts, this improves TCP and UDP RX performances significantly on such HW - Increase the default socket memory buffer limits from 256K to 4M to better fit modern link speeds - Improve handling of setups with a large number of nexthop, making dump operating scaling linearly and avoiding unneeded synchronize_rcu() on delete - Improve bridge handling of VLAN FDB, storing a single entry per bridge instead of one entry per port; this makes the dump order of magnitude faster on large switches - Restore IP ID correctly for encapsulated packets at GSO segmentation time, allowing GRO to merge packets in more scenarios - Improve netfilter matching performance on large sets - Improve MPTCP receive path performance by leveraging recently introduced core infrastructure (skb deferral free) and adopting recent TCP autotuning changes - Allow bridges to redirect to a backup port when the bridge port is administratively down - Introduce MPTCP 'laminar' endpoint that con be used only once per connection and simplify common MPTCP setups - Add RCU safety to dst->dev, closing a lot of possible races - A significant crypto library API for SCTP, MPTCP and IPv6 SR, reducing code duplication - Supports pulling data from an skb frag into the linear area of an XDP buffer Things we sprinkled into general kernel code: - Generate netlink documentation from YAML using an integrated YAML parser Driver API: - Support using IPv6 Flow Label in Rx hash computation and RSS queue selection - Introduce API for fetching the DMA device for a given queue, allowing TCP zerocopy RX on more H/W setups - Make XDP helpers compatible with unreadable memory, allowing more easily building DevMem-enabled drivers with a unified XDP/skbs datapath - Add a new dedicated ethtool callback enabling drivers to provide the number of RX rings directly, improving efficiency and clarity in RX ring queries and RSS configuration - Introduce a burst period for the health reporter, allowing better handling of multiple errors due to the same root cause - Support for DPLL phase offset exponential moving average, controlling the average smoothing factor Device drivers: - Add a new Huawei driver for 3rd gen NIC (hinic3) - Add a new SpacemiT driver for K1 ethernet MAC - Add a generic abstraction for shared memory communication devices (dibps) - Ethernet high-speed NICs: - nVidia/Mellanox: - Use multiple per-queue doorbell, to avoid MMIO contention issues - support adjacent functions, allowing them to delegate their SR-IOV VFs to sibling PFs - support RSS for IPSec offload - support exposing raw cycle counters in PTP and mlx5 - support for disabling host PFs. - Intel (100G, ice, idpf): - ice: support for SRIOV VFs over an Active-Active link aggregate - ice: support for firmware logging via debugfs - ice: support for Earliest TxTime First (ETF) hardware offload - idpf: support basic XDP functionalities and XSk - Broadcom (bnxt): - support Hyper-V VF ID - dynamic SRIOV resource allocations for RoCE - Meta (fbnic): - support queue API, zero-copy Rx and Tx - support basic XDP functionalities - devlink health support for FW crashes and OTP mem corruptions - expand hardware stats coverage to FEC, PHY, and Pause - Wangxun: - support ethtool coalesce options - support for multiple RSS contexts - Ethernet virtual: - Macsec: - replace custom netlink attribute checks with policy-level checks - Bonding: - support aggregator selection based on port priority - Microsoft vNIC: - use page pool fragments for RX buffers instead of full pages to improve memory efficiency - Ethernet NICs consumer, and embedded: - Qualcomm: support Ethernet function for IPQ9574 SoC - Airoha: implement wlan offloading via NPU - Freescale - enetc: add NETC timer PTP driver and add PTP support - fec: enable the Jumbo frame support for i.MX8QM - Renesas (R-Car S4): - support HW offloading for layer 2 switching - support for RZ/{T2H, N2H} SoCs - Cadence (macb): support TAPRIO traffic scheduling - TI: - support for Gigabit ICSS ethernet SoC (icssm-prueth) - Synopsys (stmmac): a lot of cleanups - Ethernet PHYs: - Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS driver - Support bcm63268 GPHY power control - Support for Micrel lan8842 PHY and PTP - Support for Aquantia AQR412 and AQR115 - CAN: - a large CAN-XL preparation work - reorganize raw_sock and uniqframe struct to minimize memory usage - rcar_canfd: update the CAN-FD handling - WiFi: - extended Neighbor Awareness Networking (NAN) support - S1G channel representation cleanup - improve S1G support - WiFi drivers: - Intel (iwlwifi): - major refactor and cleanup - Broadcom (brcm80211): - support for AP isolation - RealTek (rtw88/89) rtw88/89: - preparation work for RTL8922DE support - MediaTek (mt76): - HW restart improvements - MLO support - Qualcomm/Atheros (ath10k): - GTK rekey fixes - Bluetooth drivers: - btusb: support for several new IDs for MT7925 - btintel: support for BlazarIW core - btintel_pcie: support for _suspend() / _resume() - btintel_pcie: support for Scorpious, Panther Lake-H484 IDs" * tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1536 commits) net: stmmac: Add support for Allwinner A523 GMAC200 dt-bindings: net: sun8i-emac: Add A523 GMAC200 compatible Revert "Documentation: net: add flow control guide and document ethtool API" octeontx2-pf: fix bitmap leak octeontx2-vf: fix bitmap leak net/mlx5e: Use extack in set rxfh callback net/mlx5e: Introduce mlx5e_rss_params for RSS configuration net/mlx5e: Introduce mlx5e_rss_init_params net/mlx5e: Remove unused mdev param from RSS indir init net/mlx5: Improve QoS error messages with actual depth values net/mlx5e: Prevent entering switchdev mode with inconsistent netns net/mlx5: HWS, Generalize complex matchers net/mlx5: Improve write-combining test reliability for ARM64 Grace CPUs selftests/net: add tcp_port_share to .gitignore Revert "net/mlx5e: Update and set Xon/Xoff upon MTU set" net: add NUMA awareness to skb_attempt_defer_free() net: use llist for sd->defer_list net: make softnet_data.defer_count an atomic selftests: drv-net: psp: add tests for destroying devices selftests: drv-net: psp: add test for auto-adjusting TCP MSS ...
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/bonding.rst104
-rw-r--r--Documentation/networking/device_drivers/ethernet/index.rst1
-rw-r--r--Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst7
-rw-r--r--Documentation/networking/device_drivers/ethernet/meta/fbnic.rst30
-rw-r--r--Documentation/networking/device_drivers/ethernet/qualcomm/ppe/ppe.rst194
-rw-r--r--Documentation/networking/devlink/devlink-health.rst2
-rw-r--r--Documentation/networking/devlink/devlink-params.rst8
-rw-r--r--Documentation/networking/devlink/index.rst20
-rw-r--r--Documentation/networking/devlink/mlx5.rst113
-rw-r--r--Documentation/networking/devlink/zl3073x.rst14
-rw-r--r--Documentation/networking/dns_resolver.rst52
-rw-r--r--Documentation/networking/ethtool-netlink.rst5
-rw-r--r--Documentation/networking/index.rst3
-rw-r--r--Documentation/networking/ip-sysctl.rst71
-rw-r--r--Documentation/networking/mptcp-sysctl.rst8
-rw-r--r--Documentation/networking/mptcp.rst2
-rw-r--r--Documentation/networking/net_cachelines/tcp_sock.rst18
-rw-r--r--Documentation/networking/netlink_spec/.gitignore1
-rw-r--r--Documentation/networking/netlink_spec/readme.txt4
-rw-r--r--Documentation/networking/phy.rst2
-rw-r--r--Documentation/networking/psp.rst183
-rw-r--r--Documentation/networking/rxrpc.rst9
-rw-r--r--Documentation/networking/segmentation-offloads.rst22
23 files changed, 710 insertions, 163 deletions
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index f8f5766703d4..e700bf1d095c 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -193,6 +193,15 @@ ad_actor_sys_prio
This parameter has effect only in 802.3ad mode and is available through
SysFs interface.
+actor_port_prio
+
+ In an AD system, this specifies the port priority. The allowed range
+ is 1 - 65535. If the value is not specified, it takes 255 as the
+ default value.
+
+ This parameter has effect only in 802.3ad mode and is available through
+ netlink interface.
+
ad_actor_system
In an AD system, this specifies the mac-address for the actor in
@@ -241,10 +250,18 @@ ad_select
ports (slaves). Reselection occurs as described under the
"bandwidth" setting, above.
- The bandwidth and count selection policies permit failover of
- 802.3ad aggregations when partial failure of the active aggregator
- occurs. This keeps the aggregator with the highest availability
- (either in bandwidth or in number of ports) active at all times.
+ actor_port_prio or 3
+
+ The active aggregator is chosen by the highest total sum of
+ actor port priorities across its active ports. Note this
+ priority is actor_port_prio, not per port prio, which is
+ used for primary reselect.
+
+ The bandwidth, count and actor_port_prio selection policies permit
+ failover of 802.3ad aggregations when partial failure of the active
+ aggregator occurs. This keeps the aggregator with the highest
+ availability (either in bandwidth, number of ports, or total value
+ of port priorities) active at all times.
This option was added in bonding version 3.4.0.
@@ -582,10 +599,8 @@ miimon
This determines how often the link state of each slave is
inspected for link failures. A value of zero disables MII
link monitoring. A value of 100 is a good starting point.
- The use_carrier option, below, affects how the link state is
- determined. See the High Availability section for additional
- information. The default value is 100 if arp_interval is not
- set.
+
+ The default value is 100 if arp_interval is not set.
min_links
@@ -896,25 +911,14 @@ updelay
use_carrier
- Specifies whether or not miimon should use MII or ETHTOOL
- ioctls vs. netif_carrier_ok() to determine the link
- status. The MII or ETHTOOL ioctls are less efficient and
- utilize a deprecated calling sequence within the kernel. The
- netif_carrier_ok() relies on the device driver to maintain its
- state with netif_carrier_on/off; at this writing, most, but
- not all, device drivers support this facility.
-
- If bonding insists that the link is up when it should not be,
- it may be that your network device driver does not support
- netif_carrier_on/off. The default state for netif_carrier is
- "carrier on," so if a driver does not support netif_carrier,
- it will appear as if the link is always up. In this case,
- setting use_carrier to 0 will cause bonding to revert to the
- MII / ETHTOOL ioctl method to determine the link state.
-
- A value of 1 enables the use of netif_carrier_ok(), a value of
- 0 will use the deprecated MII / ETHTOOL ioctls. The default
- value is 1.
+ Obsolete option that previously selected between MII /
+ ETHTOOL ioctls and netif_carrier_ok() to determine link
+ state.
+
+ All link state checks are now done with netif_carrier_ok().
+
+ For backwards compatibility, this option's value may be inspected
+ or set. The only valid setting is 1.
xmit_hash_policy
@@ -2036,22 +2040,8 @@ depending upon the device driver to maintain its carrier state, by
querying the device's MII registers, or by making an ethtool query to
the device.
-If the use_carrier module parameter is 1 (the default value),
-then the MII monitor will rely on the driver for carrier state
-information (via the netif_carrier subsystem). As explained in the
-use_carrier parameter information, above, if the MII monitor fails to
-detect carrier loss on the device (e.g., when the cable is physically
-disconnected), it may be that the driver does not support
-netif_carrier.
-
-If use_carrier is 0, then the MII monitor will first query the
-device's (via ioctl) MII registers and check the link state. If that
-request fails (not just that it returns carrier down), then the MII
-monitor will make an ethtool ETHTOOL_GLINK request to attempt to obtain
-the same information. If both methods fail (i.e., the driver either
-does not support or had some error in processing both the MII register
-and ethtool requests), then the MII monitor will assume the link is
-up.
+The MII monitor relies on the driver for carrier state information (via
+the netif_carrier subsystem).
8. Potential Sources of Trouble
===============================
@@ -2135,34 +2125,6 @@ This will load tg3 and e1000 modules before loading the bonding one.
Full documentation on this can be found in the modprobe.d and modprobe
manual pages.
-8.3. Painfully Slow Or No Failed Link Detection By Miimon
----------------------------------------------------------
-
-By default, bonding enables the use_carrier option, which
-instructs bonding to trust the driver to maintain carrier state.
-
-As discussed in the options section, above, some drivers do
-not support the netif_carrier_on/_off link state tracking system.
-With use_carrier enabled, bonding will always see these links as up,
-regardless of their actual state.
-
-Additionally, other drivers do support netif_carrier, but do
-not maintain it in real time, e.g., only polling the link state at
-some fixed interval. In this case, miimon will detect failures, but
-only after some long period of time has expired. If it appears that
-miimon is very slow in detecting link failures, try specifying
-use_carrier=0 to see if that improves the failure detection time. If
-it does, then it may be that the driver checks the carrier state at a
-fixed interval, but does not cache the MII register values (so the
-use_carrier=0 method of querying the registers directly works). If
-use_carrier=0 does not improve the failover, then the driver may cache
-the registers, or the problem may be elsewhere.
-
-Also, remember that miimon only checks for the device's
-carrier state. It has no way to determine the state of devices on or
-beyond other ports of a switch, or if a switch is refusing to pass
-traffic while still maintaining carrier on.
-
9. SNMP agents
===============
diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst
index 40ac552641a3..0b0a3eef6aae 100644
--- a/Documentation/networking/device_drivers/ethernet/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/index.rst
@@ -50,6 +50,7 @@ Contents:
neterion/s2io
netronome/nfp
pensando/ionic
+ qualcomm/ppe/ppe
smsc/smc9
stmicro/stmmac
ti/cpsw
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
index 754c81436408..cc498895f92e 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -1348,7 +1348,7 @@ Device Counters
is in a congested state.
If pci_bw_inbound_high == pci_bw_inbound_low then the device is not congested.
If pci_bw_inbound_high > pci_bw_inbound_low then the device is congested.
- - Tnformative
+ - Informative
* - `pci_bw_inbound_low`
- The number of times the device crossed the low inbound PCIe bandwidth
@@ -1373,3 +1373,8 @@ Device Counters
If pci_bw_outbound_high == pci_bw_outbound_low then the device is not congested.
If pci_bw_outbound_high > pci_bw_outbound_low then the device is congested.
- Informative
+
+ * - `pci_bw_stale_event`
+ - The number of times the device fired a PCIe congestion event but on query
+ there was no change in state.
+ - Informative
diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
index afb8353daefd..1e82f90d9ad2 100644
--- a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
+++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
@@ -69,6 +69,25 @@ On host boot the latest UEFI driver is always used, no explicit activation
is required. Firmware activation is required to run new control firmware. cmrt
firmware can only be activated by power cycling the NIC.
+Health reporters
+----------------
+
+fw reporter
+~~~~~~~~~~~
+
+The ``fw`` health reporter tracks FW crashes. Dumping the reporter will
+show the core dump of the most recent FW crash, and if no FW crash has
+happened since power cycle - a snapshot of the FW memory. Diagnose callback
+shows FW uptime based on the most recently received heartbeat message
+(the crashes are detected by checking if uptime goes down).
+
+otp reporter
+~~~~~~~~~~~~
+
+OTP memory ("fuses") are used for secure boot and anti-rollback
+protection. The OTP memory is ECC protected, ECC errors indicate
+either manufacturing defect or part deteriorating with age.
+
Statistics
----------
@@ -160,3 +179,14 @@ behavior and potential performance bottlenecks.
credit exhaustion
- ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
credit exhaustion
+
+XDP Length Error:
+~~~~~~~~~~~~~~~~~
+
+For XDP programs without frags support, fbnic tries to make sure that MTU fits
+into a single buffer. If an oversized frame is received and gets fragmented,
+it is dropped and the following netlink counters are updated
+
+ - ``rx-length``: number of frames dropped due to lack of fragmentation
+ support in the attached XDP program
+ - ``rx-errors``: total number of packets with errors received on the interface
diff --git a/Documentation/networking/device_drivers/ethernet/qualcomm/ppe/ppe.rst b/Documentation/networking/device_drivers/ethernet/qualcomm/ppe/ppe.rst
new file mode 100644
index 000000000000..4ab299a28969
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/qualcomm/ppe/ppe.rst
@@ -0,0 +1,194 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================================
+PPE Ethernet Driver for Qualcomm IPQ SoC Family
+===============================================
+
+Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+Author: Lei Wei <quic_leiwei@quicinc.com>
+
+
+Contents
+========
+
+- `PPE Overview`_
+- `PPE Driver Overview`_
+- `PPE Driver Supported SoCs`_
+- `Enabling the Driver`_
+- `Debugging`_
+
+
+PPE Overview
+============
+
+IPQ (Qualcomm Internet Processor) SoC (System-on-Chip) series is Qualcomm's series of
+networking SoC for Wi-Fi access points. The PPE (Packet Process Engine) is the Ethernet
+packet process engine in the IPQ SoC.
+
+Below is a simplified hardware diagram of IPQ9574 SoC which includes the PPE engine and
+other blocks which are in the SoC but outside the PPE engine. These blocks work together
+to enable the Ethernet for the IPQ SoC::
+
+ +------+ +------+ +------+ +------+ +------+ +------+ start +-------+
+ |netdev| |netdev| |netdev| |netdev| |netdev| |netdev|<------|PHYLINK|
+ +------+ +------+ +------+ +------+ +------+ +------+ stop +-+-+-+-+
+ | | | ^
+ +-------+ +-------------------------+--------+----------------------+ | | |
+ | GCC | | | EDMA | | | | |
+ +---+---+ | PPE +---+----+ | | | |
+ | clk | | | | | |
+ +-------->| +-----------------------+------+-----+---------------+ | | | |
+ | | Switch Core |Port0 | |Port7(EIP FIFO)| | | | |
+ | | +---+--+ +------+--------+ | | | |
+ | | | | | | | | |
+ +-------+ | | +------+---------------+----+ | | | | |
+ |CMN PLL| | | +---+ +---+ +----+ | +--------+ | | | | | |
+ +---+---+ | | |BM | |QM | |SCH | | | L2/L3 | ....... | | | | | |
+ | | | | +---+ +---+ +----+ | +--------+ | | | | | |
+ | | | | +------+--------------------+ | | | | |
+ | | | | | | | | | |
+ | v | | +-----+-+-----+-+-----+-+-+---+--+-----+-+-----+ | | | | |
+ | +------+ | | |Port1| |Port2| |Port3| |Port4| |Port5| |Port6| | | | | |
+ | |NSSCC | | | +-----+ +-----+ +-----+ +-----+ +-----+ +-----+ | | mac| | |
+ | +-+-+--+ | | |MAC0 | |MAC1 | |MAC2 | |MAC3 | |MAC4 | |MAC5 | | |<---+ | |
+ | ^ | |clk | | +-----+-+-----+-+-----+-+-----+--+-----+-+-----+ | | ops | |
+ | | | +------>| +----|------|-------|-------|---------|--------|-----+ | | |
+ | | | +---------------------------------------------------------+ | |
+ | | | | | | | | | | |
+ | | | MII clk | QSGMII USXGMII USXGMII | |
+ | | +--------------->| | | | | | | |
+ | | +-------------------------+ +---------+ +---------+ | |
+ | |125/312.5MHz clk| (PCS0) | | (PCS1) | | (PCS2) | pcs ops | |
+ | +----------------+ UNIPHY0 | | UNIPHY1 | | UNIPHY2 |<--------+ |
+ +----------------->| | | | | | |
+ | 31.25MHz ref clk +-------------------------+ +---------+ +---------+ |
+ | | | | | | | |
+ | +-----------------------------------------------------+ |
+ |25/50MHz ref clk| +-------------------------+ +------+ +------+ | link |
+ +--------------->| | QUAD PHY | | PHY4 | | PHY5 | |---------+
+ | +-------------------------+ +------+ +------+ | change
+ | |
+ | MDIO bus |
+ +-----------------------------------------------------+
+
+The CMN (Common) PLL, NSSCC (Networking Sub System Clock Controller) and GCC (Global
+Clock Controller) blocks are in the SoC and act as clock providers.
+
+The UNIPHY block is in the SoC and provides the PCS (Physical Coding Sublayer) and
+XPCS (10-Gigabit Physical Coding Sublayer) functions to support different interface
+modes between the PPE MAC and the external PHY.
+
+This documentation focuses on the descriptions of PPE engine and the PPE driver.
+
+The Ethernet functionality in the PPE (Packet Process Engine) is comprised of three
+components: the switch core, port wrapper and Ethernet DMA.
+
+The Switch core in the IPQ9574 PPE has maximum of 6 front panel ports and two FIFO
+interfaces. One of the two FIFO interfaces is used for Ethernet port to host CPU
+communication using Ethernet DMA. The other one is used to communicate to the EIP
+engine which is used for IPsec offload. On the IPQ9574, the PPE includes 6 GMAC/XGMACs
+that can be connected with external Ethernet PHY. Switch core also includes BM (Buffer
+Management), QM (Queue Management) and SCH (Scheduler) modules for supporting the
+packet processing.
+
+The port wrapper provides connections from the 6 GMAC/XGMACS to UNIPHY (PCS) supporting
+various modes such as SGMII/QSGMII/PSGMII/USXGMII/10G-BASER. There are 3 UNIPHY (PCS)
+instances supported on the IPQ9574.
+
+Ethernet DMA is used to transmit and receive packets between the Ethernet subsystem
+and ARM host CPU.
+
+The following lists the main blocks in the PPE engine which will be driven by this
+PPE driver:
+
+- BM
+ BM is the hardware buffer manager for the PPE switch ports.
+- QM
+ Queue Manager for managing the egress hardware queues of the PPE switch ports.
+- SCH
+ The scheduler which manages the hardware traffic scheduling for the PPE switch ports.
+- L2
+ The L2 block performs the packet bridging in the switch core. The bridge domain is
+ represented by the VSI (Virtual Switch Instance) domain in PPE. FDB learning can be
+ enabled based on the VSI domain and bridge forwarding occurs within the VSI domain.
+- MAC
+ The PPE in the IPQ9574 supports up to six MACs (MAC0 to MAC5) which are corresponding
+ to six switch ports (port1 to port6). The MAC block is connected with external PHY
+ through the UNIPHY PCS block. Each MAC block includes the GMAC and XGMAC blocks and
+ the switch port can select to use GMAC or XMAC through a MUX selection according to
+ the external PHY's capability.
+- EDMA (Ethernet DMA)
+ The Ethernet DMA is used to transmit and receive Ethernet packets between the PPE
+ ports and the ARM cores.
+
+The received packet on a PPE MAC port can be forwarded to another PPE MAC port. It can
+be also forwarded to internal switch port0 so that the packet can be delivered to the
+ARM cores using the Ethernet DMA (EDMA) engine. The Ethernet DMA driver will deliver the
+packet to the corresponding 'netdevice' interface.
+
+The software instantiations of the PPE MAC (netdevice), PCS and external PHYs interact
+with the Linux PHYLINK framework to manage the connectivity between the PPE ports and
+the connected PHYs, and the port link states. This is also illustrated in above diagram.
+
+
+PPE Driver Overview
+===================
+PPE driver is Ethernet driver for the Qualcomm IPQ SoC. It is a single platform driver
+which includes the PPE part and Ethernet DMA part. The PPE part initializes and drives the
+various blocks in PPE switch core such as BM/QM/L2 blocks and the PPE MACs. The EDMA part
+drives the Ethernet DMA for packet transfer between PPE ports and ARM cores, and enables
+the netdevice driver for the PPE ports.
+
+The PPE driver files in drivers/net/ethernet/qualcomm/ppe/ are listed as below:
+
+- Makefile
+- ppe.c
+- ppe.h
+- ppe_config.c
+- ppe_config.h
+- ppe_debugfs.c
+- ppe_debugfs.h
+- ppe_regs.h
+
+The ppe.c file contains the main PPE platform driver and undertakes the initialization of
+PPE switch core blocks such as QM, BM and L2. The configuration APIs for these hardware
+blocks are provided in the ppe_config.c file.
+
+The ppe.h defines the PPE device data structure which will be used by PPE driver functions.
+
+The ppe_debugfs.c enables the PPE statistics counters such as PPE port Rx and Tx counters,
+CPU code counters and queue counters.
+
+
+PPE Driver Supported SoCs
+=========================
+
+The PPE driver supports the following IPQ SoC:
+
+- IPQ9574
+
+
+Enabling the Driver
+===================
+
+The driver is located in the menu structure at::
+
+ -> Device Drivers
+ -> Network device support (NETDEVICES [=y])
+ -> Ethernet driver support
+ -> Qualcomm devices
+ -> Qualcomm Technologies, Inc. PPE Ethernet support
+
+If the driver is built as a module, the module will be called qcom-ppe.
+
+The PPE driver functionally depends on the CMN PLL and NSSCC clock controller drivers.
+Please make sure the dependent modules are installed before installing the PPE driver
+module.
+
+
+Debugging
+=========
+
+The PPE hardware counters can be accessed using debugfs interface from the
+``/sys/kernel/debug/ppe/`` directory.
diff --git a/Documentation/networking/devlink/devlink-health.rst b/Documentation/networking/devlink/devlink-health.rst
index e0b8cfed610a..4d10536377ab 100644
--- a/Documentation/networking/devlink/devlink-health.rst
+++ b/Documentation/networking/devlink/devlink-health.rst
@@ -50,7 +50,7 @@ Once an error is reported, devlink health will perform the following actions:
* Auto recovery attempt is being done. Depends on:
- Auto-recovery configuration
- - Grace period vs. time passed since last recover
+ - Grace period (and burst period) vs. time passed since last recover
Devlink formatted message
=========================
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index 211b58177e12..0a9c20d70122 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -143,3 +143,11 @@ own name.
* - ``clock_id``
- u64
- Clock ID used by the device for registering DPLL devices and pins.
+ * - ``total_vfs``
+ - u32
+ - The max number of Virtual Functions (VFs) exposed by the PF.
+ after reboot/pci reset, 'sriov_totalvfs' entry under the device's sysfs
+ directory will report this value.
+ * - ``num_doorbells``
+ - u32
+ - Controls the number of doorbells used by the device.
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 270a65a01411..0c58e5c729d9 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -56,18 +56,18 @@ general.
:maxdepth: 1
devlink-dpipe
+ devlink-eswitch-attr
+ devlink-flash
devlink-health
devlink-info
- devlink-flash
+ devlink-linecard
devlink-params
devlink-port
devlink-region
- devlink-resource
devlink-reload
+ devlink-resource
devlink-selftests
devlink-trap
- devlink-linecard
- devlink-eswitch-attr
Driver-specific documentation
-----------------------------
@@ -78,12 +78,14 @@ parameters, info versions, and other features it supports.
.. toctree::
:maxdepth: 1
+ am65-nuss-cpsw-switch
bnxt
etas_es58x
hns3
i40e
- ionic
ice
+ ionic
+ iosm
ixgbe
kvaser_pciefd
kvaser_usb
@@ -93,11 +95,9 @@ parameters, info versions, and other features it supports.
mv88e6xxx
netdevsim
nfp
- qed
- ti-cpsw-switch
- am65-nuss-cpsw-switch
- prestera
- iosm
octeontx2
+ prestera
+ qed
sfc
+ ti-cpsw-switch
zl3073x
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 7febe0aecd53..0e5f9c76e514 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -15,23 +15,62 @@ Parameters
* - Name
- Mode
- Validation
+ - Notes
* - ``enable_roce``
- driverinit
- - Type: Boolean
-
- If the device supports RoCE disablement, RoCE enablement state controls
+ - Boolean
+ - If the device supports RoCE disablement, RoCE enablement state controls
device support for RoCE capability. Otherwise, the control occurs in the
driver stack. When RoCE is disabled at the driver level, only raw
ethernet QPs are supported.
* - ``io_eq_size``
- driverinit
- The range is between 64 and 4096.
+ -
* - ``event_eq_size``
- driverinit
- The range is between 64 and 4096.
+ -
* - ``max_macs``
- driverinit
- The range is between 1 and 2^31. Only power of 2 values are supported.
+ -
+ * - ``enable_sriov``
+ - permanent
+ - Boolean
+ - Applies to each physical function (PF) independently, if the device
+ s