summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice
AgeCommit message (Collapse)AuthorFilesLines
2024-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-2/+26
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/dev.c 9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()") 723de3ebef03 ("net: free altname using an RCU callback") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.") drivers/net/ethernet/renesas/ravb_main.c ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path" ) c2da9408579d ("ravb: Add Rx checksum offload support for GbEth") net/mptcp/protocol.c bdd70eb68913 ("mptcp: drop the push_pending field") 28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-14ice: Add check for lport extraction to LAG initDave Ertman2-2/+26
To fully support initializing the LAG support code, a DDP package that extracts the logical port from the metadata is required. If such a package is not present, there could be difficulties in supporting some bond types. Add a check into the initialization flow that will bypass the new paths if any of the support pieces are missing. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240213183957.1483857-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-12ice: Fix debugfs with devlink reloadWojciech Drewek4-2/+14
During devlink reload it is needed to remove debugfs entries correlated with only one PF. ice_debugfs_exit() removes all entries created by ice driver so we can't use it. Introduce ice_debugfs_pf_deinit() in order to release PF's debugfs entries. Move ice_debugfs_exit() call to ice_module_exit(), it makes more sense since ice_debugfs_init() is called in ice_module_init() and not in ice_probe(). Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-12ice: Remove and readd netdev during devlink reloadWojciech Drewek3-131/+125
Recent changes to the devlink reload (commit 9b2348e2d6c9 ("devlink: warn about existing entities during reload-reinit")) force the drivers to destroy devlink ports during reinit. Adjust ice driver to this requirement, unregister netdvice, destroy devlink port. ice_init_eth() was removed and all the common code between probe and reload was moved to ice_load(). During devlink reload we can't take devl_lock (it's already taken) and in ice_probe() we have to lock it. Use devl_* variant of the API which does not acquire and release devl_lock. Guard ice_load() with devl_lock only in case of probe. Suggested-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-12ice: add support for 3k signing DDP sections for E825CGrzegorz Nitka2-0/+10
E825C devices shall support the new signing type of RSA 3K for new DDP section (SEGMENT_SIGN_TYPE_RSA3K_E825 (5) - already in the code). The driver is responsible to verify the presence of correct signing type. Add 3k signinig support for E825C devices based on mac_type: ICE_MAC_GENERIC_3K_E825; Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-12ice: Add helper function ice_is_generic_macGrzegorz Nitka5-3/+19
E800 series devices have a couple of quirks: 1. Sideband control queues are not supported 2. The registers that the driver needs to program for the "Precision Time Protocol (PTP)" feature are different for E800 series devices compared to other devices supported by this driver. Both these require conditional logic based on the underlying device we are dealing with. The function ice_is_sbq_supported added by commit 8f5ee3c477a8 ("ice: add support for sideband messages") addresses (1). The same function can be used to address (2) as well but this just looks weird readability wise in cases that have nothing to do with sideband control queues: if (ice_is_sbq_supported(hw)) /* program register A */ else /* program register B */ For these cases, the function ice_is_generic_mac introduced by this patch communicates the idea/intention better. Also rework ice_is_sbq_supported to use this new function. As side-band queue is supported for E825C devices, it's mac_type is considered as generic mac_type. Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-12ice: introduce new E825C devices familyGrzegorz Nitka4-0/+32
Introduce new Intel Ethernet E825C family devices. Add new PCI device IDs which are going to be supported by the driver: - 579C: Intel(R) Ethernet Connection E825-C for backplane - 579D: Intel(R) Ethernet Connection E825-C for QSFP - 579E: Intel(R) Ethernet Connection E825-C for SFP - 579F: Intel(R) Ethernet Connection E825-C for SGMII Add helper function ice_is_e825c() to verify if the running device belongs to E825C family. Co-developed-by: Jan Glaza <jan.glaza@intel.com> Signed-off-by: Jan Glaza <jan.glaza@intel.com> Co-developed-by: Michal Michalik <michal.michalik@intel.com> Signed-off-by: Michal Michalik <michal.michalik@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-2/+2
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/stmicro/stmmac/common.h 38cc3c6dcc09 ("net: stmmac: protect updates of 64-bit statistics counters") fd5a6a71313e ("net: stmmac: est: Per Tx-queue error count for HLBF") c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio") drivers/net/wireless/microchip/wilc1000/netdev.c c9013880284d ("wifi: fill in MODULE_DESCRIPTION()s for wilc1000") 328efda22af8 ("wifi: wilc1000: do not realloc workqueue everytime an interface is added") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-07net: intel: fix old compiler regressionsJesse Brandeburg2-2/+2
The kernel build regressions/improvements email contained a couple of issues with old compilers (in fact all the reports were on different architectures, but all gcc 5.5) and the FIELD_PREP() and FIELD_GET() conversions. They're all because an integer #define that should have been declared as unsigned, was shifted to the point that it could set the sign bit. The fix just involves making sure the defines use the "U" identifier on the constants to make sure they're unsigned. Should make the checkers happier. Confirmed with objdump before/after that there is no change to the binaries. Issues were reported as follows: ./drivers/net/ethernet/intel/ice/ice_base.c:238:7: note: in expansion of macro 'FIELD_GET' (FIELD_GET(GLINT_CTL_ITR_GRAN_25_M, regval) == ICE_ITR_GRAN_US)) ^ ./include/linux/compiler_types.h:435:38: error: call to '__compiletime_assert_1093' declared with attribute error: FIELD_GET: mask is not constant drivers/net/ethernet/intel/ice/ice_nvm.c:709:16: note: in expansion of macro ‘FIELD_GET’ orom->major = FIELD_GET(ICE_OROM_VER_MASK, combo_ver); ^ ./include/linux/compiler_types.h:435:38: error: call to ‘__compiletime_assert_796’ declared with attribute error: FIELD_GET: mask is not constant drivers/net/ethernet/intel/ice/ice_common.c:945:18: note: in expansion of macro ‘FIELD_GET’ u8 max_agg_bw = FIELD_GET(GL_PWR_MODE_CTL_CAR_MAX_BW_M, ^ ./include/linux/compiler_types.h:435:38: error: call to ‘__compiletime_assert_420’ declared with attribute error: FIELD_GET: mask is not constant drivers/net/ethernet/intel/i40e/i40e_dcb.c:458:8: note: in expansion of macro ‘FIELD_GET’ oui = FIELD_GET(I40E_LLDP_TLV_OUI_MASK, ouisubtype); ^ Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/lkml/d03e90ca-8485-4d1b-5ec1-c3398e0e8da@linux-m68k.org/ #i40e #ice Fixes: 62589808d73b ("i40e: field get conversion") Fixes: 5a259f8e0baf ("ice: field get conversion") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20240206022906.2194214-1-jesse.brandeburg@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-02ice: remove incorrect commentPaul M Stillwell Jr1-3/+0
Copy paste issue left a comment for this structure that has nothing to do with FW alignment; remove the comment. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-02ice: Add a new counter for Rx EIPE errorsAniruddha Paul3-2/+8
HW incorrectly reports EIPE errors on encapsulated packets with L2 padding inside inner packet. HW shows outer UDP/IPV4 packet checksum errors as part of the EIPE flags of the Rx descriptor. These are reported only if checksum offload is enabled and L3/L4 parsed flag is valid in Rx descriptor. When that error is reported by HW, we don't act on it instead of incrementing main Rx errors statistic as it would normally happen. Add a new statistic to count these errors since we still want to print them. Signed-off-by: Aniruddha Paul <aniruddha.paul@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jan Glaza <jan.glaza@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-02ice: make ice_vsi_cfg_txq() staticMaciej Fijalkowski5-100/+82
Currently, XSK control path in ice driver calls directly ice_vsi_cfg_txq() whereas we have ice_vsi_cfg_single_txq() for that purpose. Use the latter from XSK side and make ice_vsi_cfg_txq() static. ice_vsi_cfg_txq() resides in ice_base.c and is rather big, so to reduce the code churn let us move the callers of it from ice_lib.c to ice_base.c. This change puts ice_qp_ena() on nice diet due to the checks and operations that ice_vsi_cfg_single_{r,t}xq() do internally. add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-182 (-182) Function old new delta ice_xsk_pool_setup 2165 1983 -182 Total: Before=472597, After=472415, chg -0.04% Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-02ice: make ice_vsi_cfg_rxq() staticMaciej Fijalkowski5-63/+60
Currently, XSK control path in ice driver calls directly ice_vsi_cfg_rxq() whereas we have ice_vsi_cfg_single_rxq() for that purpose. Use the latter from XSK side and make ice_vsi_cfg_rxq() static. ice_vsi_cfg_rxq() resides in ice_base.c and is rather big, so to reduce the code churn let us move two callers of it from ice_lib.c to ice_base.c. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-01dpll: extend lock_status_get() op by status error and expose to userJiri Pirko1-0/+2
Pass additional argunent status_error over lock_status_get() so drivers can fill it up. In case they do, expose the value over previously introduced attribute to user. Do it only in case the current lock_status is either "unlocked" or "holdover". Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: stop destroying and reinitalizing Tx tracker during resetJacob Keller1-12/+21
The ice driver currently attempts to destroy and re-initialize the Tx timestamp tracker during the reset flow. The release of the Tx tracker only happened during CORE reset or GLOBAL reset. The ice_ptp_rebuild() function always calls the ice_ptp_init_tx function which will allocate a new tracker data structure, resulting in memory leaks during PF reset. Certainly the driver should not be allocating a new tracker without removing the old tracker data, as this results in a memory leak. Additionally, there's no reason to remove the tracker memory during a reset. Remove this logic from the reset and rebuild flow. Instead of releasing the Tx tracker, flush outstanding timestamps just before we reset the PHY timestamp block in ice_ptp_cfg_phy_interrupt(). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: factor out ice_ptp_rebuild_owner()Jacob Keller3-28/+42
The ice_ptp_reset() function uses a goto to skip past clock owner operations if performing a PF reset or if the device is not the clock owner. This is a bit confusing. Factor this out into ice_ptp_rebuild_owner() instead. The ice_ptp_reset() function is called by ice_rebuild() to restore PTP functionality after a device reset. Follow the convention set by the ice_main.c file and rename this function to ice_ptp_rebuild(), in the same way that we have ice_prepare_for_reset() and ice_ptp_prepare_for_reset(). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: rename ice_ptp_tx_cfg_intrJacob Keller1-6/+6
The ice_ptp_tx_cfg_intr() function sends a control queue message to configure the PHY timestamp interrupt block. This is a very similar name to a function which is used to configure the MAC Other Interrupt Cause Enable register. Rename this function to ice_ptp_cfg_phy_interrupt in order to make it more obvious to the reader what action it performs, and distinguish it from other similarly named functions. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: don't check has_ready_bitmap in E810 functionsJacob Keller1-12/+11
E810 hardware does not have a Tx timestamp ready bitmap. Don't check has_ready_bitmap in E810-specific functions. Add has_ready_bitmap check in ice_ptp_process_tx_tstamp() to stop relying on the fact that ice_get_phy_tx_tstamp_ready() returns all 1s. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: rename verify_cached to has_ready_bitmapJacob Keller2-9/+11
The tx->verify_cached flag is used to inform the Tx timestamp tracking code whether it needs to verify the cached Tx timestamp value against a previous captured value. This is necessary on E810 hardware which does not have a Tx timestamp ready bitmap. In addition, we currently rely on the fact that the ice_get_phy_tx_tstamp_ready() function returns all 1s for E810 hardware. Instead of introducing a brand new flag, rename and verify_cached to has_ready_bitmap, inverting the relevant checks. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: pass reset type to PTP reset functionsJacob Keller3-12/+21
The ice_ptp_prepare_for_reset() and ice_ptp_reset() functions currently check the pf->flags ICE_FLAG_PFR_REQ bit to determine if the current reset is a PF reset or not. This is problematic, because it is possible that a PF reset and a higher level reset (CORE reset, GLOBAL reset, EMP reset) are requested simultaneously. In that case, the driver performs the highest level reset requested. However, the ICE_FLAG_PFR_REQ flag will still be set. The main driver reset functions take an enum ice_reset_req indicating which reset is actually being performed. Pass this data into the PTP functions and rely on this instead of relying on the driver flags. This ensures that the PTP code performs the proper level of reset that the driver is actually undergoing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30ice: introduce PTP state machineJacob Keller4-49/+74
Add PTP state machine so that the driver can correctly identify PTP state around resets. When the driver got information about ungraceful reset, PTP was not prepared for reset and it returned error. When this situation occurs, prepare PTP before rebuilding its structures. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-24ice: update xdp_rxq_info::frag_size for ZC enabled Rx queueMaciej Fijalkowski1-14/+23
Now that ice driver correctly sets up frag_size in xdp_rxq_info, let us make it work for ZC multi-buffer as well. ice_rx_ring::rx_buf_len for ZC is being set via xsk_pool_get_rx_frame_size() and this needs to be propagated up to xdp_rxq_info. Use a bigger hammer and instead of unregistering only xdp_rxq_info's memory model, unregister it altogether and register it again and have xdp_rxq_info with correct frag_size value. Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20240124191602.566724-9-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24intel: xsk: initialize skb_frag_t::bv_offset in ZC driversMaciej Fijalkowski1-1/+2
Ice and i40e ZC drivers currently set offset of a frag within skb_shared_info to 0, which is incorrect. xdp_buffs that come from xsk_buff_pool always have 256 bytes of a headroom, so they need to be taken into account to retrieve xdp_buff::data via skb_frag_address(). Otherwise, bpf_xdp_frags_increase_tail() would be starting its job from xdp_buff::data_hard_start which would result in overwriting existing payload. Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support") Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support") Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20240124191602.566724-8-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24ice: remove redundant xdp_rxq_info registrationMaciej Fijalkowski1-5/+0
xdp_rxq_info struct can be registered by drivers via two functions - xdp_rxq_info_reg() and __xdp_rxq_info_reg(). The latter one allows drivers that support XDP multi-buffer to set up xdp_rxq_info::frag_size which in turn will make it possible to grow the packet via bpf_xdp_adjust_tail() BPF helper. Currently, ice registers xdp_rxq_info in two spots: 1) ice_setup_rx_ring() // via xdp_rxq_info_reg(), BUG 2) ice_vsi_cfg_rxq() // via __xdp_rxq_info_reg(), OK Cited commit under fixes tag took care of setting up frag_size and updated registration scheme in 2) but it did not help as 1) is called before 2) and as shown above it uses old registration function. This means that 2) sees that xdp_rxq_info is already registered and never calls __xdp_rxq_info_reg() which leaves us with xdp_rxq_info::frag_size being set to 0. To fix this misbehavior, simply remove xdp_rxq_info_reg() call from ice_setup_rx_ring(). Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20240124191602.566724-7-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24ice: work on pre-XDP prog frag countMaciej Fijalkowski3-14/+32
Fix an OOM panic in XDP_DRV mode when a XDP program shrinks a multi-buffer packet by 4k bytes and then redirects it to an AF_XDP socket. Since support for handling multi-buffer frames was added to XDP, usage of bpf_xdp_adjust_tail() helper within XDP program can free the page that given fragment occupies and in turn decrease the fragment count within skb_shared_info that is embedded in xdp_buff struct. In current ice driver codebase, it can become problematic when page recycling logic decides not to reuse the page. In such case, __page_frag_cache_drain() is used with ice_rx_buf::pagecnt_bias that was not adjusted after refcount of page was changed by XDP prog which in turn does not drain the refcount to 0 and page is never freed. To address this, let us store the count of frags before the XDP program was executed on Rx ring struct. This will be used to compare with current frag count from skb_shared_info embedded in xdp_buff. A smaller value in the latter indicates that XDP prog freed frag(s). Then, for given delta decrement pagecnt_bias for XDP_DROP verdict. While at it, let us also handle the EOP frag within ice_set_rx_bufs_act() to make our life easier, so all of the adjustments needed to be applied against freed frags are performed in the single place. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20240124191602.566724-5-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24xsk: make xsk_buff_pool responsible for clearing xdp_buff::flagsMaciej Fijalkowski1-1/+0
XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is used by drivers to notify data path whether xdp_buff contains fragments or not. Data path looks up mentioned flag on first buffer that occupies the linear part of xdp_buff, so drivers only modify it there. This is sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on stack or it resides within struct representing driver's queue and fragments are carried via skb_frag_t structs. IOW, we are dealing with only one xdp_buff. ZC mode though relies on list of xdp_buff structs that is carried via xsk_buff_pool::xskb_list, so ZC data path has to make sure that fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise, xsk_buff_free() could misbehave if it would be executed against xdp_buff that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can take place when within supplied XDP program bpf_xdp_adjust_tail() is used with negative offset that would in turn release the tail fragment from multi-buffer frame. Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would result in releasing all the nodes from xskb_list that were produced by driver before XDP program execution, which is not what is intended - only tail fragment should be deleted from xskb_list and then it should be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never make it up to user space, so from AF_XDP application POV there would be no traffic running, however due to free_list getting constantly new nodes, driver will be able to feed HW Rx queue with recycled buffers. Bottom line is that instead of traffic being redirected to user space, it would be continuously dropped. To fix this, let us clear the mentioned flag on xsk_buff_pool side during xdp_buff initialization, which is what should have been done right from the start of XSK multi-buffer support. Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support") Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support") Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20240124191602.566724-3-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-7/+12
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") 0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04Merge branch '40GbE' of ↵Jakub Kicinski1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-01-03 (i40e, ice, igc) This series contains updates to i40e, ice, and igc drivers. Ke Xiao fixes use after free for unicast filters on i40e. Andrii restores VF MSI-X flag after PCI reset on i40e. Paul corrects admin queue link status structure to fulfill firmware expectations for ice. Rodrigo Cataldo corrects value used for hicredit calculations on igc. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: Fix hicredit calculation ice: fix Get link status data length i40e: Restore VF MSI-X state during PCI reset i40e: fix use-after-free in i40e_aqc_add_filters() ==================== Link: https://lore.kernel.org/r/20240103193254.822968-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03ice: fix Get link status data lengthPaul Greenwalt1-1/+2
Get link status version 2 (opcode 0x0607) is returning an error because FW expects a data length of 56 bytes, and this is causing the driver to fail probe. Update the get link status version 2 data length to 56 bytes by adding 5 byte reserved5 field to the end of struct ice_aqc_get_link_status_data and passing it as parameter to offsetofend() to the fix error. Fixes: 2777d24ec6d1 ("ice: Add ice_get_link_status_datalen") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: Fix some null pointer dereference issues in ice_ptp.cKunwu Chan1-0/+4
devm_kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Fixes: d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") Cc: Kunwu Chan <kunwu.chan@hotmail.com> Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: ice_base.c: Add const modifier to params and varsJan Glaza2-5/+5
Add const modifier to function parameters and variables where appropriate in ice_base.c and corresponding declarations in ice_base.h. The reason for starting the change is that read-only pointers should be marked as const when possible to allow for smoother and more optimal code generation and optimization as well as allowing the compiler to warn the developer about potentially unwanted modifications, while not carrying noticeable negative impact. Reviewed-by: Andrii Staikov <andrii.staikov@intel.com> Reviewed-by: Sachin Bahadur <sachin.bahadur@intel.com> Signed-off-by: Jan Glaza <jan.glaza@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: remove rx_len_errors statisticJan Sokolowski3-7/+0
It was found that this statistic is incorrectly reported by HW and thus, useless. As RX length error statistics are shown to the end user when requested, the values reported are misleading. Thus, that value is no longer reported and doesn't count anymore when adding all rx errors. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()Jacob Keller4-33/+28
The ice_vf_create_vsi() function and its VF ops helper introduced by commit a4c785e8162e ("ice: convert vf_ops .vsi_rebuild to .create_vsi") are used during an individual VF reset to re-create the VSI. This was done in order to ensure that the VSI gets properly reconfigured within the hardware. This is somewhat heavy handed as we completely release the VSI memory and structure, and then create a new VSI. This can also potentially force a change of the VSI index as we will re-use the first open slot in the VSI array which may not be the same. As part of implementing devlink reload, commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") split VSI setup into smaller functions, introducing both ice_vsi_cfg() and ice_vsi_decfg() which can be used to configure or deconfigure an existing software VSI structure. Rather than completely removing the VSI and adding a new one using the .create_vsi() VF operation, simply use ice_vsi_decfg() to remove the current configuration. Save the VSI type and then call ice_vsi_cfg() to reconfigure the VSI as the same type that it was before. The existing reset logic assumes that all hardware filters will be removed, so also call ice_fltr_remove_all() before re-configuring the VSI. This new operation does not re-create the VSI, so rename it to ice_vf_reconfig_vsi(). The new approach can safely share the exact same flow for both SR-IOV VFs as well as the Scalable IOV VFs being worked on. This uses less code and is a better abstraction over fully deleting the VSI and adding a new one. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: Add support for packet mirroring using hardware in switchdev modeAndrii Staikov3-7/+60
Switchdev mode allows to add mirroring rules to mirror incoming and outgoing packets to the interface's port representor. Previously, this was available only using software functionality. Add possibility to offload this functionality to the NIC hardware. Introduce ICE_MIRROR_PACKET filter action to the ice_sw_fwd_act_type enum to identify the desired action and pass it to the hardware as well as the VSI to mirror. Example of tc mirror command using hardware: tc filter add dev ens1f0np0 ingress protocol ip prio 1 flower src_mac b4:96:91:a5:c7:a7 skip_sw action mirred egress mirror dev eth1 ens1f0np0 - PF b4:96:91:a5:c7:a7 - source MAC address eth1 - PR of a VF to mirror to Co-developed-by: Marcin Szycik <marcin.szycik@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Andrii Staikov <andrii.staikov@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: Enable SW interrupt from FW for LL TSKarol Kolacinski8-28/+274
Introduce new capability - Low Latency Timestamping with Interrupt. On supported devices, driver can request a single timestamp from FW without polling the register afterwards. Instead, FW can issue a dedicated interrupt when the timestamp was read from the PHY register and its value is available to read from the register. This eliminates the need of bottom half scheduling, which results in minimal delay for timestamping. For this mode, allocate TS indices sequentially, so that timestamps are always completed in FIFO manner. Co-developed-by: Yochai Hagvi <yochai.hagvi@intel.com> Signed-off-by: Yochai Hagvi <yochai.hagvi@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: Schedule service task in IRQ top halfKarol Kolacinski2-10/+11
Schedule service task and EXTTS in the top half to avoid bottom half scheduling if possible, which significantly reduces timestamping delay. Co-developed-by: Michal Michalik <michal.michalik@intel.com> Signed-off-by: Michal Michalik <michal.michalik@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-27ice: dpll: fix phase offset valueArkadiusz Kubalewski1-3/+1
Stop dividing the phase_offset value received from firmware. This fault is present since the initial implementation. The phase_offset value received from firmware is in 0.01ps resolution. Dpll subsystem is using the value in 0.001ps, raw value is adjusted before providing it to the user. The user can observe the value of phase offset with response to `pin-get` netlink message of dpll subsystem for an active pin: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-get --json '{"id":2}' Where example of correct response would be: {'board-label': 'C827_0-RCLKA', 'capabilities': 6, 'clock-id': 4658613174691613800, 'frequency': 1953125, 'id': 2, 'module-name': 'ice', 'parent-device': [{'direction': 'input', 'parent-id': 6, 'phase-offset': -216839550, 'prio': 9, 'state': 'connected'}, {'direction': 'input', 'parent-id': 7, 'phase-offset': -42930, 'prio': 8, 'state': 'connected'}], 'phase-adjust': 0, 'phase-adjust-max': 16723, 'phase-adjust-min': -16723, 'type': 'mux'} Provided phase-offset value (-42930) shall be divided by the user with DPLL_PHASE_OFFSET_DIVIDER to get actual value of -42.930 ps. Before the fix, the response was not correct: {'board-label': 'C827_0-RCLKA', 'capabilities': 6, 'clock-id': 4658613174691613800, 'frequency': 1953125, 'id': 2, 'module-name': 'ice', 'parent-device': [{'direction': 'input', 'parent-id': 6, 'phase-offset': -216839, 'prio': 9, 'state': 'connected'}, {'direction': 'input', 'parent-id': 7, 'phase-offset': -42, 'prio': 8, 'state': 'connected'}], 'phase-adjust': 0, 'phase-adjust-max': 16723, 'phase-adjust-min': -16723, 'type': 'mux'} Where phase-offset value (-42), after division (DPLL_PHASE_OFFSET_DIVIDER) would be: -0.042 ps. Fixes: 8a3a565ff210 ("ice: add admin commands to access cgu configuration") Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-27ice: Shut down VSI with "link-down-on-close" enabledNgai-Mint Kwan1-0/+2
Disabling netdev with ethtool private flag "link-down-on-close" enabled can cause NULL pointer dereference bug. Shut down VSI regardless of "link-down-on-close" state. Fixes: 8ac7132704f3 ("ice: Fix interface being down after reset with link-down-on-close flag on") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-27ice: Fix link_down_on_close messageKatarzyna Wieczerzycka1-3/+7
The driver should not report an error message when for a medialess port the link_down_on_close flag is enabled and the physical link cannot be set down. Fixes: 8ac7132704f3 ("ice: Fix interface being down after reset with link-down-on-close flag on") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Katarzyna Wieczerzycka <katarzyna.wieczerzycka@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-22Merge branch '1GbE' of ↵David S. Miller22-341/+240
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== intel: use bitfield operations Jesse Brandeburg says: After repeatedly getting review comments on new patches, and sporadic patches to fix parts of our drivers, we should just convert the Intel code to use FIELD_PREP() and FIELD_GET(). It's then "common" in the code and hopefully future change-sets will see the context and do-the-right-thing. This conversion was done with a coccinelle script which is mentioned in the commit messages. Generally there were only a couple conversions that were "undone" after the automatic changes because they tried to convert a non-contiguous mask. Patch 1 is required at the beginning of this series to fix a "forever" issue in the e1000e driver that fails the compilation test after conversion because the shift / mask was out of range. The second patch just adds all the new #includes in one go. The patch titled: "ice: fix pre-shifted bit usage" is needed to allow the use of the FIELD_* macros and fix up the unexpected "shifts included" defines found while creating this series. The rest are the conversion to use FIELD_PREP()/FIELD_GET(), and the occasional leXX_{get,set,encode}_bits() call, as suggested by Alex. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni3-6/+7
Cross-merge networking fixes after downstream PR. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c 23c93c3b6275 ("bnxt_en: do not map packet buffers twice") 6d1add95536b ("bnxt_en: Modify TX ring indexing logic.") tools/testing/selftests/net/Makefile 2258b666482d ("selftests: add vlan hw filter tests") a0bc96c0cd6e ("selftests: net: verify fq per-band packet limit") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-18Merge tag 'for-netdev' of ↵Jakub Kicinski11-273/+508
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2023-12-18 This PR is larger than usual and contains changes in various parts of the kernel. The main changes are: 1) Fix kCFI bugs in BPF, from Peter Zijlstra. End result: all forms of indirect calls from BPF into kernel and from kernel into BPF work with CFI enabled. This allows BPF to work with CONFIG_FINEIBT=y. 2) Introduce BPF token object, from Andrii Nakryiko. It ad