Age | Commit message (Collapse) | Author | Files | Lines |
|
These files all use functions declared in interrupt.h, but currently rely
on implicit inclusion of this file (via netns/xfrm.h).
That won't work anymore when the flow cache is removed so include that
header where needed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Multiple IES API resets can cause a race condition where the mailbox
interrupt request bits can be cleared before being handled. This can
leave certain mailbox messages from the PF to be untreated and the PF
will enter in some inactive state. If this situation occurs, the IES API
will initiate a mailbox version reset which, then, trigger a mailbox
state change. Once this mailbox transition occurs (from OPEN to CONNECT
state), a request for reset will be returned.
This ensures that PF will undergo a reset whenever IES API encounters an
unknown global mailbox interrupt event or whenever the IES API
terminates.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Ensure that other bits in the RXQCTL register do not get cleared. This
ensures that bits related to queue ownership are maintained.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Similar to how we handle VXLAN offload, enable support for a single
Geneve tunnel.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
In the event of an uncorrectable AER error occurring when the driver has
not loaded, the recovery routines are not done. This is done because
future loads of the driver may not be aware of the IO state and may not
be able to recover at all. In this case, when we next load the driver it
fails due to what appears to be a surprise remove event. Instead, add
a check to ensure that the device is in the normal IO state before
continuing to probe. This allows us to give a more descriptive message
of what is wrong.
Without this change, the driver will attempt to probe up to our first
call of .reset_hw() which will be unable to read registers and act as if
a surprise remove event occurred.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
While technically not needed, as all our uses of ACCESS_ONCE are scalar
types, we already use READ_ONCE in a few places, and for code
readability we can swap all the uses of the older ACCESS_ONCE into
READ_ONCE.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
A previous patch added support to check for hardware Tx pending in the
fm10k_down routine. This support was intended to ensure that we
accurately check what the hardware state is. However, checking for Tx
hangs in this manor during the hotpath results in a large performance
hit. Avoid this by making the hotpath check use the SW counters instead.
Fixes: a0f53cf49cb0 ("fm10k: use actual hardware registers when checking for pending Tx", 2016-06-08)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
A previous patch removed the pci_disable_device() call in
.io_error_detected. This call corresponded to a pci_enable_device_mem()
call within .io_slot_reset handler. Change the call here to
a pci_reenable_device() so that it does not increment and leak the
enable_cnt reference count for the device. Without this change, VF
devices may fail during an unbind/bind, and we'll never zero the
reference counter for the pci_dev structure.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Highlights:
- ARM64 support for ACPI host bridges
- new drivers for Axis ARTPEC-6 and Marvell Aardvark
- new pci_alloc_irq_vectors() interface for MSI-X, MSI, legacy INTx
- pci_resource_to_user() cleanup (more to come)
Detailed summary:
Enumeration:
- Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
- Add parent device field to ECAM struct pci_config_window (Jayachandran C)
- Add generic MCFG table handling (Tomasz Nowicki)
- Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
- Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)
Resource management:
- Add devm_request_pci_bus_resources() (Bjorn Helgaas)
- Unify pci_resource_to_user() declarations (Bjorn Helgaas)
- Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
- Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
- Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
- Ignore write combining when mapping I/O port space (Bjorn Helgaas)
- Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
- Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
- Support I/O resources when parsing host bridge resources (Jayachandran C)
- Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
- Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
- Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
- Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
- Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
- Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
- Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
- Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)
PCI device hotplug:
- Allow additional bus numbers for hotplug bridges (Keith Busch)
- Ignore interrupts during D3cold (Lukas Wunner)
Power management:
- Enforce type casting for pci_power_t (Andy Shevchenko)
- Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
- Put PCIe ports into D3 during suspend (Mika Westerberg)
- Power on bridges before scanning new devices (Mika Westerberg)
- Runtime resume bridge before rescan (Mika Westerberg)
- Add runtime PM support for PCIe ports (Mika Westerberg)
- Remove redundant check of pcie_set_clkpm (Shawn Lin)
Virtualization:
- Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
- Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
- Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
- Add ACS quirk for Solarflare SFC9220 (Edward Cree)
MSI:
- Fix PCI_MSI dependencies (Arnd Bergmann)
- Add pci_msix_desc_addr() helper (Christoph Hellwig)
- Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
- Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
- Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
- Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)
Error Handling:
- Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
- Remove DPC tristate module option (Keith Busch)
- Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)
Generic host bridge driver:
- Select IRQ_DOMAIN (Arnd Bergmann)
- Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
ACPI host bridge driver:
- Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
- Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
- Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
- Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)
Altera host bridge driver:
- Check link status before retrain link (Ley Foon Tan)
- Poll for link up status after retraining the link (Ley Foon Tan)
Axis ARTPEC-6 host bridge driver:
- Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
- Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
- Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)
Intel VMD host bridge driver:
- Use lock save/restore in interrupt enable path (Jon Derrick)
- Select device dma ops to override (Keith Busch)
- Initialize list item in IRQ disable (Keith Busch)
- Use x86_vector_domain as parent domain (Keith Busch)
- Separate MSI and MSI-X vector sharing (Keith Busch)
Marvell Aardvark host bridge driver:
- Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
- Add Aardvark PCI host controller driver (Thomas Petazzoni)
- Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)
Microsoft Hyper-V host bridge driver:
- Fix interrupt cleanup path (Cathy Avery)
- Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
- Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
NVIDIA Tegra host bridge driver:
- Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
- Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
- Use lower-case hex consistently for register definitions (Thierry Reding)
- Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
- Stop setting pcibios_min_mem (Thierry Reding)
Renesas R-Car host bridge driver:
- Drop gen2 dummy I/O port region (Bjorn Helgaas)
TI DRA7xx host bridge driver:
- Fix return value in case of error (Christophe JAILLET)
Xilinx AXI host bridge driver:
- Fix return value in case of error (Christophe JAILLET)
Miscellaneous:
- Make bus_attr_resource_alignment static (Ben Dooks)
- Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
- MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
- Make host bridge drivers explicitly non-modular (Paul Gortmaker)"
* tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (125 commits)
PCI: xgene: Make explicitly non-modular
PCI: thunder-pem: Make explicitly non-modular
PCI: thunder-ecam: Make explicitly non-modular
PCI: tegra: Make explicitly non-modular
PCI: rcar-gen2: Make explicitly non-modular
PCI: rcar: Make explicitly non-modular
PCI: mvebu: Make explicitly non-modular
PCI: layerscape: Make explicitly non-modular
PCI: keystone: Make explicitly non-modular
PCI: hisi: Make explicitly non-modular
PCI: generic: Make explicitly non-modular
PCI: designware-plat: Make it explicitly non-modular
PCI: artpec6: Make explicitly non-modular
PCI: armada8k: Make explicitly non-modular
PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency
PCI: Add ACS quirk for Solarflare SFC9220
arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700
PCI: aardvark: Add Aardvark PCI host controller driver
dt-bindings: add DT binding for the Aardvark PCIe controller
PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC values
...
|
|
When we resume from an AER recovery with many active VFs, the PF sees
many spurious link up and link down events. Prevent this by delaying
link down for at least one second after the resume event.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Sometimes, a VF driver will lose PCIe address access, such as due to
a PF FLR event. In fm10k_detach_subtask, poll and check whether the
PCIe register space is active again and restore the device when it has.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
If an FLR occurs, VF devices will be knocked out of bus master mode, and
the driver will be unable to recover from the reset properly, resulting
in malicious driver events and an infinite reset loop. In the normal
case, the bus master mode will already be enabled and this call will
essentially be a no-op. Since we're doing this every reset, it is
possible we could remove the other calls to pci_set_master() but it
seems not harmful to just leave them in place.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Continuing the effort to commonize the similar suspend/resume flows,
finish up by using the new fm10k_handle_suspand and fm10k_handle_resume
functions for the standard suspend/resume flow.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When a function level PCI reset is triggered using sysfs, it calls the
driver's .reset_notify error handler. Implement a handler based on the
now split fm10k_prepare_for_reset and fm10k_handle_reset functions, so
that we fully reset the driver when the PCI function level reset occurs.
This also ensures the reset is handled in a clean way by first disabling
all the driver bits first and then restoring them after the function
reset. Previously the stack simply performed a blind function reset and
our driver didn't take any part in the process.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Now that we have extracted the necessary steps for a split
suspend/resume flow, re-use these functions instead of using the current
open coded flow. This ensures that we don't miss any steps. It also
ensures that we have the correct driver states set.
Since we'll be handling all of the reset flow ourselves, we no longer
need to request a reset in the io_slot_reset() function.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Implement fm10k_prepare_suspend and fm10k_handle_resume functions which
abstract around the now existing fm10k_prepare_for_reset and
fm10k_handle_reset. The new functions also handle stopping the service
task, which is something that the original re-init flow does not need.
Every other location that does a suspend/resume type flow is expected to
use these functions, because otherwise they may have conflicts with the
running watchdog routines. This also has the effect of preventing
possible surprise remove events during handling of FLR events and PCIe
errors.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
There are several flows in the driver which perform the similar function
of tearing down software and restoring software to recover from certain
errors or PCIe events, including:
* fm10k_reinit
* fm10k_suspend/resume
* fm10k_io_error_detected/fm10k_io_resume
In addition, we want to implement a .reset_notify() handler as well
which will also perform similar function.
Rework how the driver codes reset and resume flows by separating out the
reinit logic into two functions "fm10k_prepare_for_reset" and
"fm10k_handle_reset". This first step will allow us to re-use this
functionality in the similar blocks of code instead of re-coding the
same sequence of events slightly different.
The end result should be more maintainable and correct, fixing several
inconsistencies with the work flow.
The new functions expect to take the rtnl_lock() themselves, and it does
have the unfortunate side effect of having the reinit flow take then
release then take the rtnl_lock. However, this minor downside is
out weighted by the benefits of code reduction and reducing needless
difference between these flows.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
It turns out that sometimes during a reset the Tx queues will be
temporarily stuck longer than .stop_hw() expects. Work around this issue
by attempting to .stop_hw() first. If it tails, wait a number of
attempts until the Tx queues appear to be drained. After this, attempt
stop_hw() again. This ensures that we avoid waiting if we don't need to,
such as during the first initialization of a VF, and give the proper
amount of time necessary to recover from most situations. It is possible
that the hardware is actually stuck. For PFs, this is usually fixed by
a datapath reset. Unfortunately the VF cannot request a similar reset
for itself.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When stop_hw() routine fails with FM10K_ERR_REQUESTS_PENDING, this
indicates that the Tx or Rx queues did not shutdown within the time
limit. Print a more suitable message at the dev_info level instead of
dev_err.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Also prevent updating stats while the interface is down. If we're
already updating stats, just return doing nothing. When we take the
device down, block stat updates until we come back up. This ensures that
we avoid tearing down rings when we're updating statistics, and prevents
updating statistics until we're up.
We can't re-use the __FM10K_DOWN for this because it wouldn't prevent
multiple threads from accessing statistics. Neither does it prevent the
case where we start updating stats and then start going down in another
thread.
The fm10k_get_stats64 is except from this, because it has a completely
different flow which does not suffer from the same issues as
fm10k_update_stats might.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
It's currently possible for fm10k_update_stats to be called during the
window when we go down and the rings are removed. This can result in
a null pointer dereference. In fm10k_get_stats64 we work around this by
using ACCESS_ONCE and a null pointer check inside the loop. Use this
same flow in the fm10k_update_stats to avoid the potential null pointer.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Return early from fm10k_down() when we are already down, since that
means another thread is either already finished or has started going
down, so shouldn't conflict with them.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Now that we do have pci_request_mem_regions() and pci_release_mem_regions()
at hand, use it in the Intel ethernet drivers.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: David S. Miller <davem@davemloft.net>
|
|
Replace all trans_start updates with netif_trans_update helper.
change was done via spatch:
struct net_device *d;
@@
- d->trans_start = jiffies
+ netif_trans_update(d)
Compile tested only.
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: linux-xtensa@linux-xtensa.org
Cc: linux1394-devel@lists.sourceforge.net
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: linux-scsi@vger.kernel.org
Cc: linux-can@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-hams@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: linux-bluetooth@vger.kernel.org
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fm10k_open requires rtnl_lock to be held.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update every header file and other locations to consistently use
Intel(R) instead of just Intel. Also update copyright year of files
which we modified.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
fm10k_io_error_detected() does not need to call pci_disable_device(). In
the cases where the reset needs to occur, the stack flow will result in
calling fm10k_remove() which already disables the PCI device. If we
leave the pci_disable_device(), we result in a warning about disabling
an already disabled device.
Many PCI drivers do call pci_disable_device() in their .error_detected()
routines, but it does not appear to be required. In addition, these
drivers have a check "is_pci_enabled()" call in their remove routines,
which is how they chose to handle the duplicate device disable.
This seems incorrect, since the PCI device structure is reference
counted. It is very possible that the reference count for the PCI device
could be greater than 1. In this case, you would remove the PCI device
within the error_detected routine, reducing count to 1, then remove it
again in the remove function, reducing it to zero. This would result in
yet another disable somewhere else failing. Thus, we shouldn't be using
is_pci_enabled() to check for this issue. Instead, just remove the
extraneous pci_device_disable() found within the error_detected routine.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Currently, any error responses from the switch manager after an
LPORT_MAP request are silently ignored. At most the mailbox message will
be reported as an error. This can result in unexpected behavior when the
switch manager has configured a port with zero bandwidth. Add support
for reading the fm10k_swapi_error structure from LPORT_MAP responses.
If the message contains the necessary TLV and has a non-zero error code,
report link down, clear the dglort_map, and delay the next
get_host_state call by a reasonable delay. Also log an error message
indicating that the LPORT_MAP request failed.
The delay ensures preventing an interrupt storm on the switch manager,
and reduces the number of mailbox messages we send in this scenario
drastically.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The 1588 support within fm10k does not work correctly with the current
version of the switch management software, and likely never worked
correctly to begin with. Remove support for PTP/1588. Update copyright
year for all these files while we're touching them.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
During an AER action response, we were calling fm10k_close without
holding the rtnl_lock() which could lead to possible RCU warnings being
produced due to 64bit stat updates among other causes. Similarly, we
need rtnl_lock() around fm10k_open during fm10k_io_resume. Follow the
same pattern elsewhere in the driver and protect the entire open/close
sequence.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
s/funciton/function to resolve a typo, and cleanup grammar on a few
comments regarding processing the VF mailboxes.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
During fm10k_io_error_detected we were clearing the interrupt scheme
before we freed the MBX IRQ. This causes a kernel panic because the MBX
IRQ are assigned after MSI-X initialization. Clearing the interrupt
scheme results in removing the MSI-X entry table. Fix this by freeing
the MBX IRQ before we clear the interrupt scheme, as we do elsewhere in
the driver.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
fm10k_stop_hw_generic calls fm10k_disable_queues_generic, which may
return an error code indicating that the queues were not stopped within
the time limit. Notify the user by displaying a message in the kernel
message ring, in a similar way to how we notify the user when reset_hw
fails. There isn't much we can do to recover from this error, so
currently nothing else is done.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Delay initialization of the service timer and service task until late
probe. If we don't wait, failures in probe do not properly cleanup the
service timer or service task items, which results in the kernel panic
below, potentially freezing the whole system. In addition, ensure that
the SERVICE_DISABLE bit is set before we request the MBX IRQ since the
MBX interrupt attempts to schedule the service task otherwise. This
prevents a similar trace from occurring after this change.
We didn't notice this issue before because probe almost always completes
successfully. I discovered it due to a mis-ordered mailbox handler
array, which resulted in the following failure when requesting mailbox
interrupt.
[ 555.325619] ------------[ cut here ]------------
[ 555.325628] WARNING: CPU: 0 PID: 4941 at lib/list_debug.c:33 __list_add+0xa0/0xd0()
[ 555.325631] list_add corruption. prev->next should be next (ffffffff81f46648), but was (null). (prev=ffff8807fad5d0e8).
<snip>
[ 555.325722] CPU: 0 PID: 4941 Comm: insmod Tainted: G OE 4.0.4-303.fc22.x86_64 #1
[ 555.325725] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014
[ 555.325727] 0000000000000000 00000000b4f161b3 ffff88081a21f8e8 ffffffff81783124
[ 555.325734] 0000000000000000 ffff88081a21f940 ffff88081a21f928 ffffffff8109c66a
[ 555.325740] 0000000064000000 ffff8807fad5d0e8 ffff8807fad5d0e8 ffffffff81f46648
[ 555.325746] Call Trace:
[ 555.325752] [<ffffffff81783124>] dump_stack+0x45/0x57
[ 555.325757] [<ffffffff8109c66a>] warn_slowpath_common+0x8a/0xc0
[ 555.325759] [<ffffffff8109c6f5>] warn_slowpath_fmt+0x55/0x70
[ 555.325763] [<ffffffff813ba270>] __list_add+0xa0/0xd0
[ 555.325768] [<ffffffff81102d1d>] __internal_add_timer+0x9d/0x110
[ 555.325771] [<ffffffff81102dbf>] internal_add_timer+0x2f/0xc0
[ 555.325774] [<ffffffff81104e5a>] mod_timer+0x12a/0x230
[ 555.325782] [<ffffffffa03d54ca>] fm10k_probe+0x69a/0xc80 [fm10k]
[ 555.325787] [<ffffffff813e8355>] local_pci_probe+0x45/0xa0
[ 555.325791] [<ffffffff8129cf42>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0
[ 555.325794] [<ffffffff813e96b9>] pci_device_probe+0xf9/0x150
[ 555.325799] [<ffffffff814d7e73>] driver_probe_device+0xa3/0x400
[ 555.325802] [<ffffffff814d82ab>] __driver_attach+0x9b/0xa0
[ 555.325805] [<ffffffff814d8210>] ? __device_attach+0x40/0x40
[ 555.325808] [<ffffffff814d5bd3>] bus_for_each_dev+0x73/0xc0
[ 555.325811] [<ffffffff814d78ce>] driver_attach+0x1e/0x20
[ 555.325815] [<ffffffff814d7480>] bus_add_driver+0x180/0x250
[ 555.325819] [<ffffffffa03b2000>] ? 0xffffffffa03b2000
[ 555.325823] [<ffffffff814d8aa4>] driver_register+0x64/0xf0
[ 555.325826] [<ffffffff813e7bec>] __pci_register_driver+0x4c/0x50
[ 555.325832] [<ffffffffa03d6ca3>] fm10k_register_pci_driver+0x23/0x30 [fm10k]
[ 555.325838] [<ffffffffa03b2080>] fm10k_init_module+0x80/0x1000 [fm10k]
[ 555.325843] [<ffffffff81002128>] do_one_initcall+0xb8/0x200
[ 555.325848] [<ffffffff811e10d2>] ? __vunmap+0xa2/0x100
[ 555.325852] [<ffffffff811fe239>] ? kmem_cache_alloc_trace+0x1b9/0x240
[ 555.325855] [<ffffffff8178230e>] ? do_init_module+0x28/0x1cb
[ 555.325858] [<ffffffff81782346>] do_init_module+0x60/0x1cb
[ 555.325862] [<ffffffff8112168e>] load_module+0x205e/0x26b0
[ 555.325866] [<ffffffff8111d110>] ? store_uevent+0x70/0x70
[ 555.325870] [<ffffffff812234b0>] ? kernel_read+0x50/0x80
[ 555.325873] [<ffffffff81121f3e>] SyS_finit_module+0xbe/0xf0
[ 555.325878] [<ffffffff81789749>] system_call_fastpath+0x12/0x17
[ 555.325880] ---[ end trace 9e0f58d071eafd2a ]---
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
According to the C standard dereferencing a variable before it is
checked invokes undefined behavior, and thus compilers are free to
assume the check for NULL isn't necessary. Prevent this by re-ordering
the NULL check of msix_entries in fm10k_free_mbx_irq.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Cleanup the remaining instances of using memcpy() instead of the preferred
ether_addr_copy().
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
We don't need to crash the kernel in this instance so just warn about the
condition and play on.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Use BIT() macro instead.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The semantic patch that makes this change is available
in scripts/coccinelle/misc/compare_const_fl.cocci.
More information about semantic patching is available at
http://coccinelle.lip6.fr/
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When comparing MAC addresses, use ether_addr_equal instead of memcmp to
ETH_ALEN length. Found and replaced using the following sed:
sed -e 's/memcmp\x28\(.*\), ETH_ALEN\x29/!ether_addr_equal\x28\1\x29/'
Reported-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch is meant to cleanup the exception handling for the paths where
we reset the interrupts and then reconfigure them. In all of these paths
we had very different levels of exception handling. I have updated the
driver so that all of the paths should result in a similar state if we
fail.
Specifically the driver will now unload the mailbox interrupt, free the
queue vectors and MSI-X, and then detach the interface.
In addition for any of the PCIe related resets I have added a check with
the hw_ready function to just make sure the registers are in a readable
state prior to reopening the interface.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Similar to ixgbe and i40e, initialize XPS on driver load so that we can
take advantage of this kernel feature.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch addresses two issues.
First is the fact that the fm10k_mbx_free_irq was assuming msix_entries was
valid and that will not always be the case. As such we need to add a check
for if it is NULL.
Second is the fact that we weren't freeing the IRQ if the mailbox API
returned an error on trying to connect.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Instead of using lowercase vlan, vid, or VID, always use VLAN or VLAN ID
in comments when referring to VLANs. The original driver code was
consistent, but recent patches have not been as consistent with this
naming scheme.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Avoid the use of CamelCase for some variable names that previously
slipped through review.
Reported-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jacob Keller & |