summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/hfi1
AgeCommit message (Collapse)AuthorFilesLines
2021-12-14Merge tag 'v5.16-rc5' into rdma.git for-nextJason Gunthorpe5-27/+24
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddrMike Marciniszyn1-19/+14
This buffer is currently allocated in hfi1_init(): if (reinit) ret = init_after_reset(dd); else ret = loadtime_init(dd); if (ret) goto done; /* allocate dummy tail memory for all receive contexts */ dd->rcvhdrtail_dummy_kvaddr = dma_alloc_coherent(&dd->pcidev->dev, sizeof(u64), &dd->rcvhdrtail_dummy_dma, GFP_KERNEL); if (!dd->rcvhdrtail_dummy_kvaddr) { dd_dev_err(dd, "cannot allocate dummy tail memory\n"); ret = -ENOMEM; goto done; } The reinit triggered path will overwrite the old allocation and leak it. Fix by moving the allocation to hfi1_alloc_devdata() and the deallocation to hfi1_free_devdata(). Link: https://lore.kernel.org/r/20211129192008.101968.91302.stgit@awfm-01.cornelisnetworks.com Cc: stable@vger.kernel.org Fixes: 46b010d3eeb8 ("staging/rdma/hfi1: Workaround to prevent corruption during packet delivery") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Fix early init panicMike Marciniszyn3-3/+6
The following trace can be observed with an init failure such as firmware load failures: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0010 [#1] SMP PTI CPU: 0 PID: 537 Comm: kworker/0:3 Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1 Workqueue: events work_for_cpu_fn RIP: 0010:0x0 Code: Bad RIP value. RSP: 0000:ffffae5f878a3c98 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff95e48e025c00 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff95e48e025c00 RBP: ffff95e4bf3660a4 R08: 0000000000000000 R09: ffffffff86d5e100 R10: ffff95e49e1de600 R11: 0000000000000001 R12: ffff95e4bf366180 R13: ffff95e48e025c00 R14: ffff95e4bf366028 R15: ffff95e4bf366000 FS: 0000000000000000(0000) GS:ffff95e4df200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000f86a0a003 CR4: 00000000001606f0 Call Trace: receive_context_interrupt+0x1f/0x40 [hfi1] __free_irq+0x201/0x300 free_irq+0x2e/0x60 pci_free_irq+0x18/0x30 msix_free_irq.part.2+0x46/0x80 [hfi1] msix_clean_up_interrupts+0x2b/0x70 [hfi1] hfi1_init_dd+0x640/0x1a90 [hfi1] do_init_one.isra.19+0x34d/0x680 [hfi1] local_pci_probe+0x41/0x90 work_for_cpu_fn+0x16/0x20 process_one_work+0x1a7/0x360 worker_thread+0x1cf/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x112/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 The free_irq() results in a callback to the registered interrupt handler, and rcd->do_interrupt is NULL because the receive context data structures are not fully initialized. Fix by ensuring that the do_interrupt is always assigned and adding a guards in the slow path handler to detect and handle a partially initialized receive context and noop the receive. Link: https://lore.kernel.org/r/20211129192003.101968.33612.stgit@awfm-01.cornelisnetworks.com Cc: stable@vger.kernel.org Fixes: b0ba3c18d6bf ("IB/hfi1: Move normal functions from hfi1_devdata to const array") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Insure use of smp_processor_id() is preempt disabledMike Marciniszyn1-1/+1
The following BUG has just surfaced with our 5.16 testing: BUG: using smp_processor_id() in preemptible [00000000] code: mpicheck/1581081 caller is sdma_select_user_engine+0x72/0x210 [hfi1] CPU: 0 PID: 1581081 Comm: mpicheck Tainted: G S 5.16.0-rc1+ #1 Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016 Call Trace: <TASK> dump_stack_lvl+0x33/0x42 check_preemption_disabled+0xbf/0xe0 sdma_select_user_engine+0x72/0x210 [hfi1] ? _raw_spin_unlock_irqrestore+0x1f/0x31 ? hfi1_mmu_rb_insert+0x6b/0x200 [hfi1] hfi1_user_sdma_process_request+0xa02/0x1120 [hfi1] ? hfi1_write_iter+0xb8/0x200 [hfi1] hfi1_write_iter+0xb8/0x200 [hfi1] do_iter_readv_writev+0x163/0x1c0 do_iter_write+0x80/0x1c0 vfs_writev+0x88/0x1a0 ? recalibrate_cpu_khz+0x10/0x10 ? ktime_get+0x3e/0xa0 ? __fget_files+0x66/0xa0 do_writev+0x65/0x100 do_syscall_64+0x3a/0x80 Fix this long standing bug by moving the smp_processor_id() to after the rcu_read_lock(). The rcu_read_lock() implicitly disables preemption. Link: https://lore.kernel.org/r/20211129191958.101968.87329.stgit@awfm-01.cornelisnetworks.com Cc: stable@vger.kernel.org Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Correct guard on eager buffer deallocationMike Marciniszyn1-1/+1
The code tests the dma address which legitimately can be 0. The code should test the kernel logical address to avoid leaking eager buffer allocations that happen to map to a dma address of 0. Fixes: 60368186fd85 ("IB/hfi1: Fix user-space buffers mapping with IOMMU enabled") Link: https://lore.kernel.org/r/20211129191952.101968.17137.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-29IB/hfi1: Use bitmap_zalloc() when applicableChristophe JAILLET1-5/+3
Use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Link: https://lore.kernel.org/r/d46c6bc1869b8869244fa71943d2cad4104b3668.1637869925.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-16IB/hfi1: Properly allocate rdma counter desc memoryDennis Dalessandro1-3/+2
When optional counter support was added the allocation of the memory holding the counter descriptors was not cleared properly. This caused WARN_ON()s in the IB/sysfs code to be hit. This is because the uninitialized memory made some of the counters wrongly look like optional counters. Use kzalloc. While here change the sizeof() calls to use the pointer rather than the name of the type. WARNING: CPU: 0 PID: 32644 at drivers/infiniband/core/sysfs.c:1064 ib_setup_port_attrs+0x7e1/0x890 [ib_core] CPU: 0 PID: 32644 Comm: kworker/0:2 Tainted: G S W 5.15.0+ #36 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016 Workqueue: events work_for_cpu_fn RIP: 0010:ib_setup_port_attrs+0x7e1/0x890 [ib_core] RSP: 0018:ffffc90006ea3c40 EFLAGS: 00010202 RAX: 0000000000000068 RBX: ffff888106ad8000 RCX: 0000000000000138 RDX: ffff888126c84c00 RSI: ffff888103c41000 RDI: 0000000000000124 RBP: ffff88810f63a801 R08: ffff888126c8a000 R09: 0000000000000001 R10: ffffffffa09acf20 R11: 0000000000000065 R12: ffff88810f63a800 R13: ffff88810f63a800 R14: ffff88810f63a8e0 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff888667a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005590102cb078 CR3: 000000000240a003 CR4: 00000000001706f0 Call Trace: ib_register_device.cold.44+0x23e/0x2d0 [ib_core] rvt_register_device+0xfa/0x230 [rdmavt] hfi1_register_ib_device+0x623/0x690 [hfi1] init_one.cold.36+0x2d1/0x49b [hfi1] local_pci_probe+0x45/0x80 work_for_cpu_fn+0x16/0x20 process_one_work+0x1b1/0x360 worker_thread+0x1d4/0x3a0 kthread+0x11a/0x140 ret_from_fork+0x22/0x30 Fixes: 5e2ddd1e5982 ("RDMA/counter: Add optional counter support") Link: https://lore.kernel.org/r/20211115200913.124104.47770.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01Merge tag 'v5.15' into rdma.git for-nextJason Gunthorpe1-3/+6
Pull in the accepted for-rc patches as the next merge needs a newer base. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-29IB/hfi1: Rebranding of hfi1 driver to Cornelis NetworksScott Breyer4-5/+8
Changes instances of Intel to Cornelis in identifying strings Link: https://lore.kernel.org/r/20211028124601.26694.35662.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Scott Breyer <scott.breyer@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25RDMA: Constify netdev->dev_addr accessesJakub Kicinski1-1/+1
netdev->dev_addr will become const soon, make sure drivers propagate the qualifier. Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-13IB/hfi1: Fix abba locking issue with sc_disable()Mike Marciniszyn1-3/+6
sc_disable() after having disabled the send context wakes up any waiters by calling hfi1_qp_wakeup() while holding the waitlock for the sc. This is contrary to the model for all other calls to hfi1_qp_wakeup() where the waitlock is dropped and a local is used to drive calls to hfi1_qp_wakeup(). Fix by moving the sc->piowait into a local list and driving the wakeup calls from the list. Fixes: 099a884ba4c0 ("IB/hfi1: Handle wakeup of orphaned QPs for pio") Link: https://lore.kernel.org/r/20211013141852.128104.2682.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12RDMA/counter: Add a descriptor in struct rdma_hw_statsAharon Landau1-26/+27
Add a counter statistic descriptor structure in rdma_hw_stats. In addition to the counter name, more meta-information will be added. This code extension is needed for optional-counter support in the following patches. Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-05IB/hf1: Use string_upper() instead of an open coded variantAndy Shevchenko1-6/+4
Use string_upper() from the string helper module instead of an open coded variant. Link: https://lore.kernel.org/r/20211001123153.67379-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-04Merge tag 'v5.15-rc4' into rdma.get for-nextJason Gunthorpe1-4/+4
Merged due to dependencies in following patches. Conflict in drivers/infiniband/hw/hfi1/ipoib_tx.c resolved by hand to take the %p change and txq stats rename together. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27RDMA/hfi1: Use struct_size() and flex_array_size() helpersGustavo A. R. Silva1-3/+2
Make use of the struct_size() and flex_array_size() helpers instead of open-coded versions, in order to avoid any potential type mistakes or integer overflows that, in the worse scenario, could lead to heap overflows. Link: https://lore.kernel.org/r/20210927225333.GA192634@embeddedor Link: https://github.com/KSPP/linux/issues/160 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27IB/hfi1: Add ring consumer and producers tracesMike Marciniszyn3-22/+85
These traces are used to debugging ring issues. The ipoib_txreq needed to be moved to a header file to allow access from the trace header file. The trace changes include: - new producer/consumer traces - new allocation deallocation traces - additional fidelity for SDMA engine prints Fixes: 4bd00b55c978 ("IB/hfi1: Add AIP tx traces") Link: https://lore.kernel.org/r/20210913132852.131370.9664.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27IB/hfi1: Remove atomic completion countMike Marciniszyn3-12/+10
The atomic is not needed. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20210913132847.131370.54250.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27IB/hfi1: Tune netdev xmit cachelinesMike Marciniszyn3-52/+66
This patch moves fields in the ring and creates a line for the producer and the consumer. The adds a consumer side variable that tracks the ring avail so that the code doesn't have the read the other cacheline to get a count for every packet. A read now only occurs when the avail is at 0. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20210913132842.131370.15636.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27IB/hfi1: Get rid of tx priv backpointerMike Marciniszyn1-5/+4
The txq has the backpointer, so this is a micro optimization for the tx path. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20210913132836.131370.89704.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27IB/hfi1: Get rid of hot path divideMike Marciniszyn2-27/+11
The pointer math in this statemet does a divide; struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis]; Elminate the divide by embedding the struct napi_strut in the txq and getting the txq with a container_of() using the newly embedded napi. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20210913132831.131370.3993.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27IB/hfi1: Remove cache and embed txreq in ringMike Marciniszyn2-128/+101
This patch removes kmem cache allocation and deallocation in favor of having the ipoib_txreq in the ring. The consumer is now the packet sending side allocating tx descriptors from ring and the producer is the napi interrupt handling freeing tx descriptors. The locks are now eliminated because the napi tx lock insures a single consumer and the napi handling insures a single producer. The napi poll is converted to memory poll looking for items that have been marked completed. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20210913132826.131370.4397.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27RDMA/hfi1: Fix kernel pointer leakGuo Zhi1-4/+4
Pointers should be printed with %p or %px rather than cast to 'unsigned long long' and printed with %llx. Change %llx to %p to print the secured pointer. Fixes: 042a00f93aad ("IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev") Link: https://lore.kernel.org/r/20210922134857.619602-1-qtxuning1999@sjtu.edu.cn Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn> Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-1/+1
Pull rdma fixes from Jason Gunthorpe: "I don't usually send a second PR in the merge window, but the fix to mlx5 is significant enough that it should start going through the process ASAP. Along with it comes some of the usual -rc stuff that would normally wait for a -rc2 or so. Summary: Important error case regression fixes in mlx5: - Wrong size used when computing the error path smaller allocation request leads to corruption - Confusing but ultimately harmless alignment mis-calculation Static checker warning fixes: - NULL pointer subtraction in qib - kcalloc in bnxt_re - Missing static on global variable in hfi1" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/hfi1: make hist static RDMA/bnxt_re: Prefer kcalloc over open coded arithmetic IB/qib: Fix null pointer subtraction compiler warning RDMA/mlx5: Fix xlt_chunk_align calculation RDMA/mlx5: Fix number of allocated XLT entries
2021-09-08IB/hfi1: make hist staticchongjiapeng1-1/+1
This symbol is not used outside of trace.c, so marks it static. Fix the following sparse warning: drivers/infiniband/hw/hfi1/trace.c:491:23: warning: symbol 'hist' was not declared. Should it be static? Link: https://lore.kernel.org/r/1630921723-21545-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: chongjiapeng <jiapeng.chong@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds75-3214/+163
Pull rdma updates from Jason Gunthorpe: "This is quite a small cycle, no major series stands out. The HNS and rxe drivers saw the most activity this cycle, with rxe being broken for a good chunk of time. The significant deleted line count is due to a SPDX cleanup series. Summary: - Various cleanup and small features for rtrs - kmap_local_page() conversions - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns - Cache the IB subnet prefix - Rework how CRC is calcuated in rxe - Clean reference counting in iwpm's netlink - Pull object allocation and lifecycle for user QPs to the uverbs core code - Several small hns features and continued general code cleanups - Fix the scatterlist confusion of orig_nents/nents introduced in an earlier patch creating the append operation" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits) RDMA/mlx5: Relax DCS QP creation checks RDMA/hns: Delete unnecessary blank lines. RDMA/hns: Encapsulate the qp db as a function RDMA/hns: Adjust the order in which irq are requested and enabled RDMA/hns: Remove RST2RST error prints for hw v1 RDMA/hns: Remove dqpn filling when modify qp from Init to Init RDMA/hns: Fix QP's resp incomplete assignment RDMA/hns: Fix query destination qpn RDMA/hfi1: Convert to SPDX identifier IB/rdmavt: Convert to SPDX identifier RDMA/hns: Bugfix for incorrect association between dip_idx and dgid RDMA/hns: Bugfix for the missing assignment for dip_idx RDMA/hns: Bugfix for data type of dip_idx RDMA/hns: Fix incorrect lsn field RDMA/irdma: Remove the repeated declaration RDMA/core/sa_query: Retry SA queries RDMA: Use the sg_table directly and remove the opencoded version from umem lib/scatterlist: Fix wrong update of orig_nents lib/scatterlist: Provide a dedicated function to support table append RDMA/hns: Delete unused hns bitmap interface ...
2021-08-25RDMA/hfi1: Convert to SPDX identifierCai Huoqing73-3170/+133
use SPDX-License-Identifier instead of a verbose license text Link: https://lore.kernel.org/r/20210823042622.109-1-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-23RDMA: switch from 'pci_' to 'dma_' APIChristophe JAILLET2-16/+8
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been hand modified to use 'dma_set_mask_and_coherent()' instead of 'pci_set_dma_mask()/pci_set_consistent_dma_mask()' when applicable. This is less verbose. It has been compile tested. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Link: https://lore.kernel.org/r/259e53b7a00f64bf081d41da8761b171b2ad8f5c.1629634798.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19RDMA/hfi1: Stop using seq_get_buf in _driver_stats_seq_showChristoph Hellwig1-9/+4
Just use seq_write to copy the stats into the seq_file buffer instead of poking holes into the seq_file abstraction. Link: https://lore.kernel.org/r/20210810151711.1795374-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()Tuo Li1-5/+4
kmalloc_array() is called to allocate memory for tx->descp. If it fails, the function __sdma_txclean() is called: __sdma_txclean(dd, tx); However, in the function __sdma_txclean(), tx-descp is dereferenced if tx->num_desc is not zero: sdma_unmap_desc(dd, &tx->descp[0]); To fix this possible null-pointer dereference, assign the return value of kmalloc_array() to a local variable descp, and then assign it to tx->descp if it is not NULL. Otherwise, go to enomem. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-30RDMA/hfi1: Fix typo in commentsCai Huoqing5-7/+7
Remove the repeated word 'the' from comments Link: https://lore.kernel.org/r/20210729082346.1882-1-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-30RDMA/hfi1: Convert from atomic_t to refcount_t on hfi1_devdata->user_refcountXiyu Yang4-6/+7
refcount_t type and corresponding API can protect refcounters from accidental underflow and overflow and further use-after-free situations. Link: https://lore.kernel.org/r/1626674454-56075-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Tested-by: Josh Fisher <josh.fisher@cornelisnetworks.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-30IB/hfi1: Adjust pkey entry in index 0Mike Marciniszyn1-6/+1
It is possible for the primary IPoIB network device associated with any RDMA device to fail to join certain multicast groups preventing IPv6 neighbor discovery and possibly other network ULPs from working correctly. The IPv4 broadcast group is not affected as the IPoIB network device handles joining that multicast group directly. This is because the primary IPoIB network device uses the pkey at ndex 0 in the associated RDMA device's pkey table. Anytime the pkey value of index 0 changes, the primary IPoIB network device automatically modifies it's broadcast address (i.e. /sys/class/net/[ib0]/broadcast), since the broadcast address includes the pkey value, and then bounces carrier. This includes initial pkey assignment, such as when the pkey at index 0 transitions from the opa default of invalid (0x0000) to some value such as the OPA default pkey for Virtual Fabric 0: 0x8001 or when the fabric manager is restarted with a configuration change causing the pkey at index 0 to change. Many network ULPs are not sensitive to the carrier bounce and are not expecting the broadcast address to change including the linux IPv6 stack. This problem does not affect IPoIB child network devices as their pkey value is constant for all time. To mitigate this issue, change the default pkey in at index 0 to 0x8001 to cover the predominant case and avoid issues as ipoib comes up and the FM sweeps. At some point, ipoib multicast support should automatically fix non-broadcast addresses as it does with the primary broadcast address. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20210715160445.142451.47651.stgit@awfm-01.cornelisnetworks.com Suggested-by: Josh Collier <josh.d.collier@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-30IB/hfi1: Indicate DMA wait when txq is queued for wakeupMike Marciniszyn1-0/+3
There is no counter for dmawait in AIP, which hampers debugging performance issues. Add the counter increment when the txq is queued. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20210715160440.142451.8278.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-03Merge tag 'trace-v5.14' of ↵Linus Torvalds4-11/+11
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Added option for per CPU threads to the hwlat tracer - Have hwlat tracer handle hotplug CPUs - New tracer: osnoise, that detects latency caused by interrupts, softirqs and scheduling of other tasks. - Added timerlat tracer that creates a thread and measures in detail what sources of latency it has for wake ups. - Removed the "success" field of the sched_wakeup trace event. This has been hardcoded as "1" since 2015, no tooling should be looking at it now. If one exists, we can revert this commit, fix that tool and try to remove it again in the future. - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids. - New boot command line option "tp_printk_stop", as tp_printk causes trace events to write to console. When user space starts, this can easily live lock the system. Having a boot option to stop just after boot up is useful to prevent that from happening. - Have ftrace_dump_on_oops boot command line option take numbers that match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops. - Bootconfig clean ups, fixes and enhancements. - New ktest script that tests bootconfig options. - Add tracepoint_probe_register_may_exist() to register a tracepoint without triggering a WARN*() if it already exists. BPF has a path from user space that can do this. All other paths are considered a bug. - Small clean ups and fixes * tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits) tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT tracing: Simplify & fix saved_tgids logic treewide: Add missing semicolons to __assign_str uses tracing: Change variable type as bool for clean-up trace/timerlat: Fix indentation on timerlat_main() trace/osnoise: Make 'noise' variable s64 in run_osnoise() tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing tracing: Fix spelling in osnoise tracer "interferences" -> "interference" Documentation: Fix a typo on trace/osnoise-tracer trace/osnoise: Fix return value on osnoise_init_hotplug_support trace/osnoise: Make interval u64 on osnoise_main trace/osnoise: Fix 'no previous prototype' warnings tracing: Have osnoise_main() add a quiescent state for task rcu seq_buf: Make trace_seq_putmem_hex() support data longer than 8 seq_buf: Fix overflow in seq_buf_putmem_hex() trace/osnoise: Support hotplug operations trace/hwlat: Support hotplug operations trace/hwlat: Protect kdata->kthread with get/put_online_cpus trace: Add timerlat tracer trace: Add osnoise tracer ...
2021-06-30treewide: Add missing semicolons to __assign_str usesJoe Perches4-11/+11
The __assign_str macro has an unusual ending semicolon but the vast majority of uses of the macro already have semicolon termination. $ git grep -P '\b__assign_str\b' | wc -l 551 $ git grep -P '\b__assign_str\b.*;' | wc -l 480 Add semicolons to the __assign_str() uses without semicolon termination and all the other uses without semicolon termination via additional defines that are equivalent to __assign_str() with the eventual goal of removing the semicolon from the __assign_str() macro definition. Link: https://lore.kernel.org/lkml/1e068d21106bb6db05b735b4916bb420e6c9842a.camel@perches.com/ Link: https://lkml.kernel.org/r/48a056adabd8f70444475352f617914cef504a45.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-24RDMA/hfi1: Remove use of kmap()Ira Weiny1-2/+2
kmap() is being deprecated and will break uses of device dax after PKS protection is introduced.[1] The kmap() used in sdma does not need to be global. Use the new kmap_local_page() which works with PKS and may provide better performance for this thread local mapping anyway. [1] https://lore.kernel.org/lkml/20201009195033.3208459-59-ira.weiny@intel.com/ Link: https://lore.kernel.org/r/20210622061422.2633501-2-ira.weiny@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21RDMA: Fix kernel-doc warnings about wrong commentLeon Romanovsky5-9/+9
Compilation with W=1 produces warnings similar to the below. drivers/infiniband/ulp/ipoib/ipoib_main.c:320: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst All such occurrences were found with the following one line git grep -A 1 "\/\*\*" drivers/infiniband/ Link: https://lore.kernel.org/r/e57d5f4ddd08b7a19934635b44d6d632841b9ba7.1623823612.git.leonro@nvidia.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> #rtrs Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16RDMA: Remove rdma_set_device_sysfs_group()Jason Gunthorpe1-3/+1
The driver's device group can be specified as part of the ops structure like the device's port group. No need for the complicated API. Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16RDMA: Change ops->init_port to ops->port_groupsJason Gunthorpe3-14/+3
init_port was only being used to register sysfs attributes against the port kobject. Now that all users are creating static attribute_group's we can simply set the attribute_group list in the ops and the core code can just handle it directly. This makes all the sysfs management quite straightforward and prevents any driver from abusing the naked port kobject in future because no driver code can access it. Link: https://lore.kernel.org/r/114f68f3d921460eafe14cea5a80ca65d81729c3.1623427137.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16RDMA/hfi1: Use attributes for the port sysfsJason Gunthorpe2-340/+194
hfi1 should not be creating a mess of kobjects to attach to the port kobject - this is all attributes. The proper API is to create an attribute_group list and create it against the port's kobject. Link: https://lore.kernel.org/r/cbe0ccb6175dd22274359b6ad803a37435a70e91.1623427137.git.leonro@nvidia.com Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16RDMA: Split the alloc_hw_stats() ops to port and device variantsJason Gunthorpe1-43/+43
This is being used to implement both the port and device global stats, which is causing some confusion in the drivers. For instance EFA and i40iw both seem to be misusing the device stats. Split it into two ops so drivers that don't support one or the other can leave the op NULL'd, making the calling code a little simpler to understand. Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com Tested-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28IB/hfi1: Move a function from a header file into a .c fileBart Van Assche1-0/+5
Since ib_get_len() only has one caller, move it from a header file into a .c file. Additionally, remove the superfluous u16 cast. That cast was introduced by commit 7dafbab3753f ("IB/hfi1: Add functions to parse BTH/IB headers"). Link: https://lore.kernel.org/r/20210524041211.9480-2-bvanassche@acm.org Cc: Don Hiatt <don.hiatt@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-20IB/hfi1: Remove the repeated declarationShaokun Zhang1-2/+0
Function 'init_credit_return' and 'sc_return_credits' are declared twice, remove the repeated declaration. Link: https://lore.kernel.org/r/1621417415-3772-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-11IB/hfi1: Delete an unneeded bool conversionZhen Lei1-1/+1
The result of an expression consisting of a single relational operator is already of the bool type and does not need to be evaluated explicitly. No functional change. Link: https://lore.kernel.org/r/20210510120635.3636-1-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds29-285/+465
Pull rdma updates from Jason Gunthorpe: "This is significantly bug fixes and general cleanups. The noteworthy new features are fairly small: - XRC support for HNS and improves RQ operations - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw and qib - Quite a few general cleanups on spelling, error handling, static checker detections, etc - Increase the number of device ports supported beyond 255. High port count software switches now exist - Several bug fixes for rtrs - mlx5 Device Memory support for host controlled atomics - Report SRQ tables through to rdma-tool" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits) IB/qib: Remove redundant assignment to ret RDMA/nldev: Add copy-on-fork attribute to get sys command RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res RDMA/siw: Fix a use after free in siw_alloc_mr IB/hfi1: Remove redundant variable rcd RDMA/nldev: Add QP numbers to SRQ information RDMA/nldev: Return SRQ information RDMA/restrack: Add support to get resource tracking for SRQ RDMA/nldev: Return context information RDMA/core: Add CM to restrack after successful attachment to a device RDMA/cma: Skip device which doesn't support CM RDMA/rxe: Fix a bug in rxe_fill_ip_info() RDMA/mlx5: Expose private query port RDMA/mlx4: Remove an unused variable RDMA/mlx5: Fix type assignment for ICM DM IB/mlx5: Set right RoCE l3 type and roce version while deleting GID RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails RDMA/cxgb4: add missing qpid increment IB/ipoib: Remove unnecessary struct declaration RDMA/bnxt_re: Get rid of custom module reference counting ...
2021-04-27IB/hfi1: Remove redundant variable rcdJiapeng Chong1-4/+4
The variable rcd is being assigned a value from a calculation however the variable is never read, so this redundant variable can be removed. Cleans up the following clang-analyzer warning: drivers/infiniband/hw/hfi1/affinity.c:986:3: warning: Value stored to 'rcd' is never read [clang-analyzer-deadcode.DeadStores]. Link: https://lore.kernel.org/r/1619346696-46300-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13IB/hfi1: Rework AIP and VNIC dummy netdev usageMike Marciniszyn5-116/+105
All other users of the dummy netdevice embed the netdev in other structures: init_dummy_netdev(&mal->dummy_dev); init_dummy_netdev(&eth->dummy_dev); init_dummy_netdev(&ar->napi_dev); init_dummy_netdev(&irq_grp->napi_ndev); init_dummy_netdev(&wil->napi_ndev); init_dummy_netdev(&trans_pcie->napi_dev); init_dummy_netdev(&dev->napi_dev); init_dummy_netdev(&bus->mux_dev); The AIP and VNIC implementation turns that model inside out and used a kfree() to free what appears to be a netdev struct when in reality, it is a struct that enbodies the rx state as well as the dummy netdev used to support napi_poll across disparate receive contexts. The relationship is infered by the odd allocation: const int netdev_size = sizeof(*dd->dummy_netdev) + sizeof(struct hfi1_netdev_priv); <snip> dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node); Correct the issue by: - Correctly naming the alloc and free functions - Renaming hfi1_netdev_priv to hfi1_netdev_rx - Replacing dd dummy_netdev with a netdev_rx pointer - Embedding the net_device in hfi1_netdev_rx - Moving the init_dummy_netdev to the alloc routine - Adjusting wrappers to fit the new model Fixes: 6991abcb993c ("IB/hfi1: Add functions to receive accelerated ipo