| author | Jakub Kicinski <kuba@kernel.org> | 2024-10-04 11:33:48 -0700 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2024-10-04 11:33:48 -0700 |
| commit | 34ea1df802f79d4498a12ca79eff6fffbf8fa7f3 (patch) | |
| tree | 917894d7a57ad66dde5fad82da213a047ce94192 /include/linux/mlx5/driver.h | |
| parent | c55ff46aeebed1704a9a6861777b799f15ce594d (diff) | |
| parent | d1c9cffe4b01f4d8bc52169139a5fedd48908abc (diff) | |
Merge branch 'net-mlx5-hw-counters-refactor'
Tariq Toukan says:
====================
net/mlx5: hw counters refactor
This is a patchset re-post, see:
https://lore.kernel.org/20240815054656.2210494-7-tariqt@nvidia.com
In this patchset, Cosmin refactors the hw counters code and solves a
perf scaling issue.
Series generated against:
commit c824deb1a897 ("cxgb4: clip_tbl: Fix spelling mistake "wont" -> "won't"")
HW counters are central to mlx5 driver operations. They are hardware
objects created and used alongside most steering operations, and queried
from a variety of places. Most counters are queried in bulk from a
periodic task in fs_counters.c.
Counter performance is important and as such, a variety of improvements
have been done over the years. Currently, counters are allocated from
pools, which are bulk allocated to amortize the cost of firmware
commands. Counters are managed through an IDR, a doubly linked list and
two atomic singly linked lists. Adding/removing counters is a complex
dance between user contexts requesting it and the mlx5_fc_stats_work
task, which does most of the work.
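To make that bookkeeping concrete, here is a minimal sketch of what each
counter has to carry under the old scheme. It is illustrative only: the
field names follow the mlx5_fc_stats members removed from driver.h in the
diff further down, but the exact per-counter struct in the driver differs.

```c
#include <linux/list.h>
#include <linux/llist.h>
#include <linux/types.h>

/* Illustrative per-counter bookkeeping under the old scheme (simplified). */
struct mlx5_fc {
	u32 id;				/* key used with fc_stats->counters_idr */
	struct list_head list;		/* entry on the fc_stats->counters list */
	struct llist_node addlist;	/* pending-add queue, drained by the wq task */
	struct llist_node dellist;	/* pending-delete queue, drained by the wq task */
};
```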
Under high load (e.g. from connection tracking flow insertion/deletion),
the counter code becomes a bottleneck, as seen on flame graphs. Whenever
a counter is deleted, it gets added to a list and the wq task is
scheduled to run immediately to actually delete it. This is done via
mod_delayed_work which uses an internal spinlock. In some tests, waiting
for this spinlock took up to 66% of all samples.
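A rough sketch of the release path just described, using the fields of the
old mlx5_fc_stats (removed from driver.h in the diff below); the function
name is made up for the example and the real code in fs_counters.c differs.
The zero-delay mod_delayed_work() call is what funnels every counter release
through the workqueue's internal spinlock.

```c
#include <linux/llist.h>
#include <linux/workqueue.h>

/* Sketch only: queue a counter for deletion and kick the work immediately. */
static void mlx5_fc_release_sketch(struct mlx5_fc_stats *fc_stats,
				   struct mlx5_fc *counter)
{
	llist_add(&counter->dellist, &fc_stats->dellist);
	/* Takes the workqueue's internal lock; heavily contended under load. */
	mod_delayed_work(fc_stats->wq, &fc_stats->work, 0);
}
```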
This series refactors the counter code to use a more straightforward
approach, avoiding the mod_delayed_work problem and making the code
easier to understand. For that:
- patch #1 moves counters data structs to a more appropriate place.
- patch #2 simplifies the bulk query allocation scheme by using vmalloc.
- patch #3 replaces the IDR+3 lists with an xarray. This is the main
patch of the series, solving the spinlock congestion issue; a short
sketch of the idea follows this list.
- patch #4 removes an unnecessary cacheline alignment causing a lot of
memory to be wasted.
- patches #5 and #6 are small cleanups enabled by the refactoring.
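As a rough illustration of the direction patch #3 takes, the sketch below
keeps counters in a single xarray instead of the IDR plus three lists. It is
a simplification, not the code in fs_counters.c, and the helper names are
invented for the example: the requesting context inserts and erases entries
directly, and the periodic task simply iterates.

```c
#include <linux/xarray.h>

static DEFINE_XARRAY(fc_counters);	/* keyed by hardware counter id */

/* Track a counter; no deferred add list, the caller inserts directly. */
static int fc_track(struct mlx5_fc *counter, u32 id)
{
	return xa_err(xa_store(&fc_counters, id, counter, GFP_KERNEL));
}

/* Untrack a counter; no deferred delete list either. */
static void fc_untrack(u32 id)
{
	xa_erase(&fc_counters, id);
}

/* The periodic task walks the xarray to bulk-query live counters. */
static void fc_query_all(void)
{
	struct mlx5_fc *counter;
	unsigned long id;

	xa_for_each(&fc_counters, id, counter) {
		/* read hardware stats for this counter here */
	}
}
```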
====================
Link: https://patch.msgid.link/20241001103709.58127-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'include/linux/mlx5/driver.h')
| -rw-r--r-- | include/linux/mlx5/driver.h | 33 |
1 file changed, 1 insertion, 32 deletions
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index e23c692a34c7..fc7e6153b73d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -45,7 +45,6 @@
 #include <linux/workqueue.h>
 #include <linux/mempool.h>
 #include <linux/interrupt.h>
-#include <linux/idr.h>
 #include <linux/notifier.h>
 #include <linux/refcount.h>
 #include <linux/auxiliary_bus.h>
@@ -474,36 +473,6 @@ struct mlx5_core_sriov {
 	u16 max_ec_vfs;
 };
 
-struct mlx5_fc_pool {
-	struct mlx5_core_dev *dev;
-	struct mutex pool_lock; /* protects pool lists */
-	struct list_head fully_used;
-	struct list_head partially_used;
-	struct list_head unused;
-	int available_fcs;
-	int used_fcs;
-	int threshold;
-};
-
-struct mlx5_fc_stats {
-	spinlock_t counters_idr_lock; /* protects counters_idr */
-	struct idr counters_idr;
-	struct list_head counters;
-	struct llist_head addlist;
-	struct llist_head dellist;
-
-	struct workqueue_struct *wq;
-	struct delayed_work work;
-	unsigned long next_query;
-	unsigned long sampling_interval; /* jiffies */
-	u32 *bulk_query_out;
-	int bulk_query_len;
-	size_t num_counters;
-	bool bulk_query_alloc_failed;
-	unsigned long next_bulk_query_alloc;
-	struct mlx5_fc_pool fc_pool;
-};
-
 struct mlx5_events;
 struct mlx5_mpfs;
 struct mlx5_eswitch;
@@ -630,7 +599,7 @@ struct mlx5_priv {
 	struct mlx5_devcom_comp_dev *hca_devcom_comp;
 	struct mlx5_fw_reset *fw_reset;
 	struct mlx5_core_roce roce;
-	struct mlx5_fc_stats fc_stats;
+	struct mlx5_fc_stats *fc_stats;
 	struct mlx5_rl_table rl_table;
 	struct mlx5_ft_pool *ft_pool;
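The header-side effect of the series is visible above: the counter structs
are gone from driver.h and mlx5_priv now carries only a pointer, so the
definition can stay private to the counters code and be allocated at init
time. A minimal sketch of what that implies; the stub struct, function name
and error handling are illustrative, not the driver's actual code.

```c
#include <linux/slab.h>
#include <linux/xarray.h>
#include <linux/mlx5/driver.h>

/* Stand-in for the now-private definition that lives in fs_counters.c. */
struct mlx5_fc_stats {
	struct xarray counters;		/* plus workqueue, bulk-query buffers, ... */
};

/* Sketch: allocate the private object and hang it off mlx5_priv at init. */
static int mlx5_fc_stats_alloc_sketch(struct mlx5_core_dev *dev)
{
	struct mlx5_fc_stats *fc_stats = kzalloc(sizeof(*fc_stats), GFP_KERNEL);

	if (!fc_stats)
		return -ENOMEM;

	dev->priv.fc_stats = fc_stats;
	return 0;
}
```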
