diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-12 09:50:05 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-12 09:50:05 -0800 |
| commit | 06cff4a58e7dfa018c5f8a6ebdc3ff12745e0bae (patch) | |
| tree | 9481c1d3c3ebcdeddfae8b786f24207834d3c433 /drivers | |
| parent | 164f59000c19fa1ee5d09327a8055ec9f9b9905a (diff) | |
| parent | 5f4c374760b031f06c69c2fdad1b0e981a1ad42f (diff) | |
| download | linux-06cff4a58e7dfa018c5f8a6ebdc3ff12745e0bae.tar.gz linux-06cff4a58e7dfa018c5f8a6ebdc3ff12745e0bae.tar.bz2 linux-06cff4a58e7dfa018c5f8a6ebdc3ff12745e0bae.zip | |
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The highlights this time are support for dynamically enabling and
disabling Clang's Shadow Call Stack at boot and a long-awaited
optimisation to the way in which we handle the SVE register state on
system call entry to avoid taking unnecessary traps from userspace.
Summary:
ACPI:
- Enable FPDT support for boot-time profiling
- Fix CPU PMU probing to work better with PREEMPT_RT
- Update SMMUv3 MSI DeviceID parsing to latest IORT spec
- APMT support for probing Arm CoreSight PMU devices
CPU features:
- Advertise new SVE instructions (v2.1)
- Advertise range prefetch instruction
- Advertise CSSC ("Common Short Sequence Compression") scalar
instructions, adding things like min, max, abs, popcount
- Enable DIT (Data Independent Timing) when running in the kernel
- More conversion of system register fields over to the generated
header
CPU misfeatures:
- Workaround for Cortex-A715 erratum #2645198
Dynamic SCS:
- Support for dynamic shadow call stacks to allow switching at
runtime between Clang's SCS implementation and the CPU's pointer
authentication feature when it is supported (complete with scary
DWARF parser!)
Tracing and debug:
- Remove static ftrace in favour of, err, dynamic ftrace!
- Seperate 'struct ftrace_regs' from 'struct pt_regs' in core ftrace
and existing arch code
- Introduce and implement FTRACE_WITH_ARGS on arm64 to replace the
old FTRACE_WITH_REGS
- Extend 'crashkernel=' parameter with default value and fallback to
placement above 4G physical if initial (low) allocation fails
SVE:
- Optimisation to avoid disabling SVE unconditionally on syscall
entry and just zeroing the non-shared state on return instead
Exceptions:
- Rework of undefined instruction handling to avoid serialisation on
global lock (this includes emulation of user accesses to the ID
registers)
Perf and PMU:
- Support for TLP filters in Hisilicon's PCIe PMU device
- Support for the DDR PMU present in Amlogic Meson G12 SoCs
- Support for the terribly-named "CoreSight PMU" architecture from
Arm (and Nvidia's implementation of said architecture)
Misc:
- Tighten up our boot protocol for systems with memory above 52 bits
physical
- Const-ify static keys to satisty jump label asm constraints
- Trivial FFA driver cleanups in preparation for v1.1 support
- Export the kernel_neon_* APIs as GPL symbols
- Harden our instruction generation routines against instrumentation
- A bunch of robustness improvements to our arch-specific selftests
- Minor cleanups and fixes all over (kbuild, kprobes, kfence, PMU, ...)"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (151 commits)
arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
arm64: Prohibit instrumentation on arch_stack_walk()
arm64:uprobe fix the uprobe SWBP_INSN in big-endian
arm64: alternatives: add __init/__initconst to some functions/variables
arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init()
kselftest/arm64: Allow epoll_wait() to return more than one result
kselftest/arm64: Don't drain output while spawning children
kselftest/arm64: Hold fp-stress children until they're all spawned
arm64/sysreg: Remove duplicate definitions from asm/sysreg.h
arm64/sysreg: Convert ID_DFR1_EL1 to automatic generation
arm64/sysreg: Convert ID_DFR0_EL1 to automatic generation
arm64/sysreg: Convert ID_AFR0_EL1 to automatic generation
arm64/sysreg: Convert ID_MMFR5_EL1 to automatic generation
arm64/sysreg: Convert MVFR2_EL1 to automatic generation
arm64/sysreg: Convert MVFR1_EL1 to automatic generation
arm64/sysreg: Convert MVFR0_EL1 to automatic generation
arm64/sysreg: Convert ID_PFR2_EL1 to automatic generation
arm64/sysreg: Convert ID_PFR1_EL1 to automatic generation
arm64/sysreg: Convert ID_PFR0_EL1 to automatic generation
...
Diffstat (limited to 'drivers')
27 files changed, 3185 insertions, 169 deletions
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 473241b5193f..1cc11d2a2a88 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -90,7 +90,7 @@ config ACPI_SPCR_TABLE config ACPI_FPDT bool "ACPI Firmware Performance Data Table (FPDT) support" - depends on X86_64 + depends on X86_64 || ARM64 help Enable support for the Firmware Performance Data Table (FPDT). This table provides information on the timing of the system diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig index d4a72835f328..b3ed6212244c 100644 --- a/drivers/acpi/arm64/Kconfig +++ b/drivers/acpi/arm64/Kconfig @@ -18,3 +18,6 @@ config ACPI_AGDI reset command. If set, the kernel parses AGDI table and listens for the command. + +config ACPI_APMT + bool diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile index 7b9e4045659d..e21a9e84e394 100644 --- a/drivers/acpi/arm64/Makefile +++ b/drivers/acpi/arm64/Makefile @@ -2,4 +2,5 @@ obj-$(CONFIG_ACPI_AGDI) += agdi.o obj-$(CONFIG_ACPI_IORT) += iort.o obj-$(CONFIG_ACPI_GTDT) += gtdt.o +obj-$(CONFIG_ACPI_APMT) += apmt.o obj-y += dma.o diff --git a/drivers/acpi/arm64/apmt.c b/drivers/acpi/arm64/apmt.c new file mode 100644 index 000000000000..8cab69fa5d59 --- /dev/null +++ b/drivers/acpi/arm64/apmt.c @@ -0,0 +1,178 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM APMT table support. + * Design document number: ARM DEN0117. + * + * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. + * + */ + +#define pr_fmt(fmt) "ACPI: APMT: " fmt + +#include <linux/acpi.h> +#include <linux/acpi_apmt.h> +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/platform_device.h> + +#define DEV_NAME "arm-cs-arch-pmu" + +/* There can be up to 3 resources: page 0 and 1 address, and interrupt. */ +#define DEV_MAX_RESOURCE_COUNT 3 + +/* Root pointer to the mapped APMT table */ +static struct acpi_table_header *apmt_table; + +static int __init apmt_init_resources(struct resource *res, + struct acpi_apmt_node *node) +{ + int irq, trigger; + int num_res = 0; + + res[num_res].start = node->base_address0; + res[num_res].end = node->base_address0 + SZ_4K - 1; + res[num_res].flags = IORESOURCE_MEM; + + num_res++; + + res[num_res].start = node->base_address1; + res[num_res].end = node->base_address1 + SZ_4K - 1; + res[num_res].flags = IORESOURCE_MEM; + + num_res++; + + if (node->ovflw_irq != 0) { + trigger = (node->ovflw_irq_flags & ACPI_APMT_OVFLW_IRQ_FLAGS_MODE); + trigger = (trigger == ACPI_APMT_OVFLW_IRQ_FLAGS_MODE_LEVEL) ? + ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE; + irq = acpi_register_gsi(NULL, node->ovflw_irq, trigger, + ACPI_ACTIVE_HIGH); + + if (irq <= 0) { + pr_warn("APMT could not register gsi hwirq %d\n", irq); + return num_res; + } + + res[num_res].start = irq; + res[num_res].end = irq; + res[num_res].flags = IORESOURCE_IRQ; + + num_res++; + } + + return num_res; +} + +/** + * apmt_add_platform_device() - Allocate a platform device for APMT node + * @node: Pointer to device ACPI APMT node + * @fwnode: fwnode associated with the APMT node + * + * Returns: 0 on success, <0 failure + */ +static int __init apmt_add_platform_device(struct acpi_apmt_node *node, + struct fwnode_handle *fwnode) +{ + struct platform_device *pdev; + int ret, count; + struct resource res[DEV_MAX_RESOURCE_COUNT]; + + pdev = platform_device_alloc(DEV_NAME, PLATFORM_DEVID_AUTO); + if (!pdev) + return -ENOMEM; + + memset(res, 0, sizeof(res)); + + count = apmt_init_resources(res, node); + + ret = platform_device_add_resources(pdev, res, count); + if (ret) + goto dev_put; + + /* + * Add a copy of APMT node pointer to platform_data to be used to + * retrieve APMT data information. + */ + ret = platform_device_add_data(pdev, &node, sizeof(node)); + if (ret) + goto dev_put; + + pdev->dev.fwnode = fwnode; + + ret = platform_device_add(pdev); + + if (ret) + goto dev_put; + + return 0; + +dev_put: + platform_device_put(pdev); + + return ret; +} + +static int __init apmt_init_platform_devices(void) +{ + struct acpi_apmt_node *apmt_node; + struct acpi_table_apmt *apmt; + struct fwnode_handle *fwnode; + u64 offset, end; + int ret; + + /* + * apmt_table and apmt both point to the start of APMT table, but + * have different struct types + */ + apmt = (struct acpi_table_apmt *)apmt_table; + offset = sizeof(*apmt); + end = apmt->header.length; + + while (offset < end) { + apmt_node = ACPI_ADD_PTR(struct acpi_apmt_node, apmt, + offset); + + fwnode = acpi_alloc_fwnode_static(); + if (!fwnode) + return -ENOMEM; + + ret = apmt_add_platform_device(apmt_node, fwnode); + if (ret) { + acpi_free_fwnode_static(fwnode); + return ret; + } + + offset += apmt_node->length; + } + + return 0; +} + +void __init acpi_apmt_init(void) +{ + acpi_status status; + int ret; + + /** + * APMT table nodes will be used at runtime after the apmt init, + * so we don't need to call acpi_put_table() to release + * the APMT table mapping. + */ + status = acpi_get_table(ACPI_SIG_APMT, 0, &apmt_table); + + if (ACPI_FAILURE(status)) { + if (status != AE_NOT_FOUND) { + const char *msg = acpi_format_exception(status); + + pr_err("Failed to get APMT table, %s\n", msg); + } + + return; + } + + ret = apmt_init_platform_devices(); + if (ret) { + pr_err("Failed to initialize APMT platform devices, ret: %d\n", ret); + acpi_put_table(apmt_table); + } +} diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index 8059baf4ef27..38fb84974f35 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -402,6 +402,10 @@ static struct acpi_iort_node *iort_node_get_id(struct acpi_iort_node *node, return NULL; } +#ifndef ACPI_IORT_SMMU_V3_DEVICEID_VALID +#define ACPI_IORT_SMMU_V3_DEVICEID_VALID (1 << 4) +#endif + static int iort_get_id_mapping_index(struct acpi_iort_node *node) { struct acpi_iort_smmu_v3 *smmu; @@ -418,12 +422,16 @@ static int iort_get_id_mapping_index(struct acpi_iort_node *node) smmu = (struct acpi_iort_smmu_v3 *)node->node_data; /* - * ID mapping index is only ignored if all interrupts are - * GSIV based + * Until IORT E.e (node rev. 5), the ID mapping index was + * defined to be valid unless all interrupts are GSIV-based. */ - if (smmu->event_gsiv && smmu->pri_gsiv && smmu->gerr_gsiv - && smmu->sync_gsiv) + if (node->revision < 5) { + if (smmu->event_gsiv && smmu->pri_gsiv && + smmu->gerr_gsiv && smmu->sync_gsiv) + return -EINVAL; + } else if (!(smmu->flags & ACPI_IORT_SMMU_V3_DEVICEID_VALID)) { return -EINVAL; + } if (smmu->id_mapping_index >= node->mapping_count) { pr_err(FW_BUG "[node %p type %d] ID mapping index overflows valid mappings\n", diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index d466c8195314..351208eda9be 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -27,6 +27,7 @@ #include <linux/dmi.h> #endif #include <linux/acpi_agdi.h> +#include <linux/acpi_apmt.h> #include <linux/acpi_iort.h> #include <linux/acpi_viot.h> #include <linux/pci.h> @@ -1423,6 +1424,7 @@ static int __init acpi_init(void) acpi_setup_sb_notify_handler(); acpi_viot_init(); acpi_agdi_init(); + acpi_apmt_init(); return 0; } diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c index d5e86ef40b89..fa85c64d3ded 100644 --- a/drivers/firmware/arm_ffa/driver.c +++ b/drivers/firmware/arm_ffa/driver.c @@ -36,81 +36,6 @@ #include "common.h" #define FFA_DRIVER_VERSION FFA_VERSION_1_0 - -#define FFA_SMC(calling_convention, func_num) \ - ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, (calling_convention), \ - ARM_SMCCC_OWNER_STANDARD, (func_num)) - -#define FFA_SMC_32(func_num) FFA_SMC(ARM_SMCCC_SMC_32, (func_num)) -#define FFA_SMC_64(func_num) FFA_SMC(ARM_SMCCC_SMC_64, (func_num)) - -#define FFA_ERROR FFA_SMC_32(0x60) -#define FFA_SUCCESS FFA_SMC_32(0x61) -#define FFA_INTERRUPT FFA_SMC_32(0x62) -#define FFA_VERSION FFA_SMC_32(0x63) -#define FFA_FEATURES FFA_SMC_32(0x64) -#define FFA_RX_RELEASE FFA_SMC_32(0x65) -#define FFA_RXTX_MAP FFA_SMC_32(0x66) -#define FFA_FN64_RXTX_MAP FFA_SMC_64(0x66) -#define FFA_RXTX_UNMAP FFA_SMC_32(0x67) -#define FFA_PARTITION_INFO_GET FFA_SMC_32(0x68) -#define FFA_ID_GET FFA_SMC_32(0x69) -#define FFA_MSG_POLL FFA_SMC_32(0x6A) -#define FFA_MSG_WAIT FFA_SMC_32(0x6B) -#define FFA_YIELD FFA_SMC_32(0x6C) -#define FFA_RUN FFA_SMC_32(0x6D) -#define FFA_MSG_SEND FFA_SMC_32(0x6E) -#define FFA_MSG_SEND_DIRECT_REQ FFA_SMC_32(0x6F) -#define FFA_FN64_MSG_SEND_DIRECT_REQ FFA_SMC_64(0x6F) -#define FFA_MSG_SEND_DIRECT_RESP FFA_SMC_32(0x70) -#define FFA_FN64_MSG_SEND_DIRECT_RESP FFA_SMC_64(0x70) -#define FFA_MEM_DONATE FFA_SMC_32(0x71) -#define FFA_FN64_MEM_DONATE FFA_SMC_64(0x71) -#define FFA_MEM_LEND FFA_SMC_32(0x72) -#define FFA_FN64_MEM_LEND FFA_SMC_64(0x72) -#define FFA_MEM_SHARE FFA_SMC_32(0x73) -#define FFA_FN64_MEM_SHARE FFA_SMC_64(0x73) -#define FFA_MEM_RETRIEVE_REQ FFA_SMC_32(0x74) -#define FFA_FN64_MEM_RETRIEVE_REQ FFA_SMC_64(0x74) -#define FFA_MEM_RETRIEVE_RESP FFA_SMC_32(0x75) -#define FFA_MEM_RELINQUISH FFA_SMC_32(0x76) -#define FFA_MEM_RECLAIM FFA_SMC_32(0x77) -#define FFA_MEM_OP_PAUSE FFA_SMC_32(0x78) -#define FFA_MEM_OP_RESUME FFA_SMC_32(0x79) -#define FFA_MEM_FRAG_RX FFA_SMC_32(0x7A) -#define FFA_MEM_FRAG_TX FFA_SMC_32(0x7B) -#define FFA_NORMAL_WORLD_RESUME FFA_SMC_32(0x7C) - -/* - * For some calls it is necessary to use SMC64 to pass or return 64-bit values. - * For such calls FFA_FN_NATIVE(name) will choose the appropriate - * (native-width) function ID. - */ -#ifdef CONFIG_64BIT -#define FFA_FN_NATIVE(name) FFA_FN64_##name -#else -#define FFA_FN_NATIVE(name) FFA_##name -#endif - -/* FFA error codes. */ -#define FFA_RET_SUCCESS (0) -#define FFA_RET_NOT_SUPPORTED (-1) -#define FFA_RET_INVALID_PARAMETERS (-2) -#define FFA_RET_NO_MEMORY (-3) -#define FFA_RET_BUSY (-4) -#define FFA_RET_INTERRUPTED (-5) -#define FFA_RET_DENIED (-6) -#define FFA_RET_RETRY (-7) -#define FFA_RET_ABORTED (-8) - -#define MAJOR_VERSION_MASK GENMASK(30, 16) -#define MINOR_VERSION_MASK GENMASK(15, 0) -#define MAJOR_VERSION(x) ((u16)(FIELD_GET(MAJOR_VERSION_MASK, (x)))) -#define MINOR_VERSION(x) ((u16)(FIELD_GET(MINOR_VERSION_MASK, (x)))) -#define PACK_VERSION_INFO(major, minor) \ - (FIELD_PREP(MAJOR_VERSION_MASK, (major)) | \ - FIELD_PREP(MINOR_VERSION_MASK, (minor))) -#define FFA_VERSION_1_0 PACK_VERSION_INFO(1, 0) #define FFA_MIN_VERSION FFA_VERSION_1_0 #define SENDER_ID_MASK GENMASK(31, 16) @@ -121,12 +46,6 @@ (FIELD_PREP(SENDER_ID_MASK, (s)) | FIELD_PREP(RECEIVER_ID_MASK, (r))) /* - * FF-A specification mentions explicitly about '4K pages'. This should - * not be confused with the kernel PAGE_SIZE, which is the translation - * granule kernel is configured and may be one among 4K, 16K and 64K. - */ -#define FFA_PAGE_SIZE SZ_4K -/* * Keeping RX TX buffer size as 4K for now * 64K may be preferred to keep it min a page in 64K PAGE_SIZE config */ @@ -178,9 +97,9 @@ static struct ffa_drv_info *drv_info; */ static u32 ffa_compatible_version_find(u32 version) { - u16 major = MAJOR_VERSION(version), minor = MINOR_VERSION(version); - u16 drv_major = MAJOR_VERSION(FFA_DRIVER_VERSION); - u16 drv_minor = MINOR_VERSION(FFA_DRIVER_VERSION); + u16 major = FFA_MAJOR_VERSION(version), minor = FFA_MINOR_VERSION(version); + u16 drv_major = FFA_MAJOR_VERSION(FFA_DRIVER_VERSION); + u16 drv_minor = FFA_MINOR_VERSION(FFA_DRIVER_VERSION); if ((major < drv_major) || (major == drv_major && minor <= drv_minor)) return version; @@ -204,16 +123,16 @@ static int ffa_version_check(u32 *version) if (ver.a0 < FFA_MIN_VERSION) { pr_err("Incompatible v%d.%d! Earliest supported v%d.%d\n", - MAJOR_VERSION(ver.a0), MINOR_VERSION(ver.a0), - MAJOR_VERSION(FFA_MIN_VERSION), - MINOR_VERSION(FFA_MIN_VERSION)); + FFA_MAJOR_VERSION(ver.a0), FFA_MINOR_VERSION(ver.a0), + FFA_MAJOR_VERSION(FFA_MIN_VERSION), + FFA_MINOR_VERSION(FFA_MIN_VERSION)); return -EINVAL; } - pr_info("Driver version %d.%d\n", MAJOR_VERSION(FFA_DRIVER_VERSION), - MINOR_VERSION(FFA_DRIVER_VERSION)); - pr_info("Firmware version %d.%d found\n", MAJOR_VERSION(ver.a0), - MINOR_VERSION(ver.a0)); + pr_info("Driver version %d.%d\n", FFA_MAJOR_VERSION(FFA_DRIVER_VERSION), + FFA_MINOR_VERSION(FFA_DRIVER_VERSION)); + pr_info("Firmware version %d.%d found\n", FFA_MAJOR_VERSION(ver.a0), + FFA_MINOR_VERSION(ver.a0)); *version = ffa_compatible_version_find(ver.a0); return 0; diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index ef5045a53ce0..453b67582fee 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -20,6 +20,7 @@ cflags-$(CONFIG_X86) += -m$(BITS) -D__KERNEL__ \ # disable the stackleak plugin cflags-$(CONFIG_ARM64) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ -fpie $(DISABLE_STACKLEAK_PLUGIN) \ + -fno-unwind-tables -fno-asynchronous-unwind-tables \ $(call cc-option,-mbranch-protection=none) cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ -fno-builtin -fpic \ diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 341010f20b77..77043bcdb33c 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -199,4 +199,8 @@ config MARVELL_CN10K_DDR_PMU Enable perf support for Marvell DDR Performance monitoring event on CN10K platform. +source "drivers/perf/arm_cspmu/Kconfig" + +source "drivers/perf/amlogic/Kconfig" + endmenu diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 050d04ee19dd..13e45da61100 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -21,3 +21,5 @@ obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o +obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/ +obj-$(CONFIG_MESON_DDR_PMU) += amlogic/ diff --git a/drivers/perf/amlogic/Kconfig b/drivers/perf/amlogic/Kconfig new file mode 100644 index 000000000000..f68db01a7f17 --- /dev/null +++ b/drivers/perf/amlogic/Kconfig @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0-only +config MESON_DDR_PMU + tristate "Amlogic DDR Bandwidth Performance Monitor" + depends on ARCH_MESON || COMPILE_TEST + help + Provides support for the DDR performance monitor + in Amlogic SoCs, which can give information about + memory throughput and other related events. It + supports multiple channels to monitor the memory + bandwidth simultaneously. diff --git a/drivers/perf/amlogic/Makefile b/drivers/perf/amlogic/Makefile new file mode 100644 index 000000000000..d3ab2ac5353b --- /dev/null +++ b/drivers/perf/amlogic/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0-only + +obj-$(CONFIG_MESON_DDR_PMU) += meson_ddr_pmu_g12.o + +meson_ddr_pmu_g12-y := meson_ddr_pmu_core.o meson_g12_ddr_pmu.o diff --git a/drivers/perf/amlogic/meson_ddr_pmu_core.c b/drivers/perf/amlogic/meson_ddr_pmu_core.c new file mode 100644 index 000000000000..b84346dbac2c --- /dev/null +++ b/drivers/perf/amlogic/meson_ddr_pmu_core.c @@ -0,0 +1,561 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2022 Amlogic, Inc. All rights reserved. + */ + +#include <linux/bitfield.h> +#include <linux/init.h> +#include <linux/irqreturn.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/of_device.h> +#include <linux/of_irq.h> +#include <linux/perf_event.h> +#include <linux/platform_device.h> +#include <linux/printk.h> +#include <linux/sysfs.h> +#include <linux/types.h> + +#include <soc/amlogic/meson_ddr_pmu.h> + +struct ddr_pmu { + struct pmu pmu; + struct dmc_info info; + struct dmc_counter counters; /* save counters from hw */ + bool pmu_enabled; + struct device *dev; + char *name; + struct hlist_node node; + enum cpuhp_state cpuhp_state; + int cpu; /* for cpu hotplug */ +}; + +#define DDR_PERF_DEV_NAME "meson_ddr_bw" +#define MAX_AXI_PORTS_OF_CHANNEL 4 /* A DMC channel can monitor max 4 axi ports */ + +#define to_ddr_pmu(p) container_of(p, struct ddr_pmu, pmu) +#define dmc_info_to_pmu(p) container_of(p, struct ddr_pmu, info) + +static void dmc_pmu_enable(struct ddr_pmu *pmu) +{ + if (!pmu->pmu_enabled) + pmu->info.hw_info->enable(&pmu->info); + + pmu->pmu_enabled = true; +} + +static void dmc_pmu_disable(struct ddr_pmu *pmu) +{ + if (pmu->pmu_enabled) + pmu->info.hw_info->disable(&pmu->info); + + pmu->pmu_enabled = false; +} + +static void meson_ddr_set_axi_filter(struct perf_event *event, u8 axi_id) +{ + struct ddr_pmu *pmu = to_ddr_pmu(event->pmu); + int chann; + + if (event->attr.config > ALL_CHAN_COUNTER_ID && + event->attr.config < COUNTER_MAX_ID) { + chann = event->attr.config - CHAN1_COUNTER_ID; + + pmu->info.hw_info->set_axi_filter(&pmu->info, axi_id, chann); + } +} + +static void ddr_cnt_addition(struct dmc_counter *sum, + struct dmc_counter *add1, + struct dmc_counter *add2, + int chann_nr) +{ + int i; + u64 cnt1, cnt2; + + sum->all_cnt = add1->all_cnt + add2->all_cnt; + sum->all_req = add1->all_req + add2->all_req; + for (i = 0; i < chann_nr; i++) { + cnt1 = add1->channel_cnt[i]; + cnt2 = add2->channel_cnt[i]; + + sum->channel_cnt[i] = cnt1 + cnt2; + } +} + +static void meson_ddr_perf_event_update(struct perf_event *event) +{ + struct ddr_pmu *pmu = to_ddr_pmu(event->pmu); + u64 new_raw_count = 0; + struct dmc_counter dc = {0}, sum_dc = {0}; + int idx; + int chann_nr = pmu->info.hw_info->chann_nr; + + /* get the remain counters in register. */ + pmu->info.hw_info->get_counters(&pmu->info, &dc); + + ddr_cnt_addition(&sum_dc, &pmu->counters, &dc, chann_nr); + + switch (event->attr.config) { + case ALL_CHAN_COUNTER_ID: + new_raw_count = sum_dc.all_cnt; + break; + case CHAN1_COUNTER_ID: + case CHAN2_COUNTER_ID: + case CHAN3_COUNTER_ID: + case CHAN4_COUNTER_ID: + case CHAN5_COUNTER_ID: + case CHAN6_COUNTER_ID: + case CHAN7_COUNTER_ID: + case CHAN8_COUNTER_ID: + idx = event->attr.config - CHAN1_COUNTER_ID; + new_raw_count = sum_dc.channel_cnt[idx]; + break; + } + + local64_set(&event->count, new_raw_count); +} + +static int meson_ddr_perf_event_init(struct perf_event *event) +{ + struct ddr_pmu *pmu = to_ddr_pmu(event->pmu); + u64 config1 = event->attr.config1; + u64 config2 = event->attr.config2; + + if (event->attr.type != event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK) + return -EOPNOTSUPP; + + if (event->cpu < 0) + return -EOPNOTSUPP; + + /* check if the number of parameters is too much */ + if (event->attr.config != ALL_CHAN_COUNTER_ID && + hweight64(config1) + hweight64(config2) > MAX_AXI_PORTS_OF_CHANNEL) + return -EOPNOTSUPP; + + event->cpu = pmu->cpu; + + return 0; +} + +static void meson_ddr_perf_event_start(struct perf_event *event, int flags) +{ + struct ddr_pmu *pmu = to_ddr_pmu(event->pmu); + + memset(&pmu->counters, 0, sizeof(pmu->counters)); + dmc_pmu_enable(pmu); +} + +static int meson_ddr_perf_event_add(struct perf_event *event, int flags) +{ + u64 config1 = event->attr.config1; + u64 config2 = event->attr.config2; + int i; + + for_each_set_bit(i, (const unsigned long *)&config1, sizeof(config1)) + meson_ddr_set_axi_filter(event, i); + + for_each_set_bit(i, (const unsigned long *)&config2, sizeof(config2)) + meson_ddr_set_axi_filter(event, i + 64); + + if (flags & PERF_EF_START) + meson_ddr_perf_event_start(event, flags); + + return 0; +} + +static void meson_ddr_perf_event_stop(struct perf_event *event, int flags) +{ + struct ddr_pmu *pmu = to_ddr_pmu(event->pmu); + + if (flags & PERF_EF_UPDATE) + meson_ddr_perf_event_update(event); + + dmc_pmu_disable(pmu); +} + +static void meson_ddr_perf_event_del(struct perf_event *event, int flags) +{ + meson_ddr_perf_event_stop(event, PERF_EF_UPDATE); +} + +static ssize_t meson_ddr_perf_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ddr_pmu *pmu = dev_get_drvdata(dev); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(pmu->cpu)); +} + +static struct device_attribute meson_ddr_perf_cpumask_attr = +__ATTR(cpumask, 0444, meson_ddr_perf_cpumask_show, NULL); + +static struct attribute *meson_ddr_perf_cpumask_attrs[] = { + &meson_ddr_perf_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ddr_perf_cpumask_attr_group = { + .attrs = meson_ddr_perf_cpumask_attrs, +}; + +static ssize_t +pmu_event_show(struct device *dev, struct device_attribute *attr, + char *page) +{ + struct perf_pmu_events_attr *pmu_attr; + + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); +} + +static ssize_t +event_show_unit(struct device *dev, struct device_attribute *attr, + char *page) +{ + return sysfs_emit(page, "MB\n"); +} + +static ssize_t +event_show_scale(struct device *dev, struct device_attribute *attr, + char *page) +{ + /* one count = 16byte = 1.52587890625e-05 MB */ + return sysfs_emit(page, "1.52587890625e-05\n"); +} + +#define AML_DDR_PMU_EVENT_ATTR(_name, _id) \ +{ \ + .attr = __ATTR(_name, 0444, pmu_event_show, NULL), \ + .id = _id, \ +} + +#define AML_DDR_PMU_EVENT_UNIT_ATTR(_name) \ + __ATTR(_name.unit, 0444, event_show_unit, NULL) + +#define AML_DDR_PMU_EVENT_SCALE_ATTR(_name) \ + __ATTR(_name.scale, 0444, event_show_scale, NULL) + +static struct device_attribute event_unit_attrs[] = { + AML_DDR_PMU_EVENT_UNIT_ATTR(total_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_1_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_2_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_3_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_4_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_5_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_6_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_7_rw_bytes), + AML_DDR_PMU_EVENT_UNIT_ATTR(chan_8_rw_bytes), +}; + +static struct device_attribute event_scale_attrs[] = { + AML_DDR_PMU_EVENT_SCALE_ATTR(total_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_1_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_2_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_3_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_4_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_5_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_6_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_7_rw_bytes), + AML_DDR_PMU_EVENT_SCALE_ATTR(chan_8_rw_bytes), +}; + +static struct perf_pmu_events_attr event_attrs[] = { + AML_DDR_PMU_EVENT_ATTR(total_rw_bytes, ALL_CHAN_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_1_rw_bytes, CHAN1_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_2_rw_bytes, CHAN2_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_3_rw_bytes, CHAN3_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_4_rw_bytes, CHAN4_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_5_rw_bytes, CHAN5_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_6_rw_bytes, CHAN6_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_7_rw_bytes, CHAN7_COUNTER_ID), + AML_DDR_PMU_EVENT_ATTR(chan_8_rw_bytes, CHAN8_COUNTER_ID), +}; + +/* three |
