author     Linus Torvalds <torvalds@linux-foundation.org>  2023-06-30 11:35:41 -0700
committer  Linus Torvalds <torvalds@linux-foundation.org>  2023-06-30 11:35:41 -0700
commit     b30d7a77c53ec04a6d94683d7680ec406b7f3ac8
tree       5c8d99d15eb1a9b28810a5358b098ac18daefa71 /tools
parent     d2a6fd45c5c4a5c5fdfe6c57f74f630e61d8d9a0
parent     4d60e83dfcee794213878155463d8f7353a80864
Merge tag 'perf-tools-for-v6.5-1-2023-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next
Pull perf tools updates from Namhyung Kim:
 "Internal cleanup:

   - Refactor PMU data management to handle hybrid systems in a generic
     way.

     Do more work in the lexer so that legacy event types parse more
     easily. A side-effect of this is that if a PMU is specified,
     scanning sysfs is avoided, improving start-up time.

   - Fix hybrid metrics. For example, TopdownL1 now works for both
     performance and efficiency cores on Intel machines. To support
     this, sort and regroup events after parsing.

   - Add reference count checking for the 'thread' data structure.

   - Lots of fixes for memory leaks in various places thanks to the ASAN
     and Ian's refcount checker.

   - Reduce the binary size by replacing static variables with local or
     dynamically allocated memory.

   - Introduce sharded_mutex for annotate data to reduce memory
     footprint.

   - Make filesystem access library functions more thread safe.

  Test:

   - Organize cpu_map tests into a single suite.

   - Add metric value validation test to check if the values are within
     correct value ranges.

   - Add perf stat stdio output test to check if event and metric names
     match.

   - Add perf data converter JSON output test.

   - Fix a lot of issues reported by shellcheck(1). This is a
     preparation to enable shellcheck by default.

   - Make the large x86 new instructions test optional at build time
     using EXTRA_TESTS=1.

   - Add a test for libpfm4 events.

  perf script:

   - Add 'dsoff' output field to display offset from the DSO.

      $ perf script -F comm,pid,event,ip,dsoff
         ls 2695501 cycles:      152cc73ef4b5 (/usr/lib/x86_64-linux-gnu/ld-2.31.so+0x1c4b5)
         ls 2695501 cycles:  ffffffff99045b3e ([kernel.kallsyms])
         ls 2695501 cycles:  ffffffff9968e107 ([kernel.kallsyms])
         ls 2695501 cycles:  ffffffffc1f54afb ([kernel.kallsyms])
         ls 2695501 cycles:  ffffffff9968382f ([kernel.kallsyms])
         ls 2695501 cycles:  ffffffff99e00094 ([kernel.kallsyms])
         ls 2695501 cycles:      152cc718a8d0 (/usr/lib/x86_64-linux-gnu/libselinux.so.1+0x68d0)
         ls 2695501 cycles:  ffffffff992a6db0 ([kernel.kallsyms])

   - Adjust width for large PID/TID values.

  perf report:

   - Robustify reading addr2line output for srcline by checking sentinel
     output before the actual data and by using timeout of 1 second.

   - Allow config terms (like 'name=ABC') with breakpoint events.

      $ perf record -e mem:0x55feb98dd169:x/name=breakpoint/ -p 19646 -- sleep 1

  perf annotate:

   - Handle x86 instruction suffix like 'l' in 'movl' generally.

   - Parse instruction operands properly even with a whitespace. This is
     needed for llvm-objdump output.

   - Support RISC-V binutils lookup using the triplet prefixes.

   - Add '<' and '>' key to navigate to prev/next symbols in TUI.

   - Fix instruction association and parsing for LoongArch.

  perf stat:

   - Add --per-cache aggregation option, optionally specify a cache
     level like `--per-cache=L2`.

      $ sudo perf stat --per-cache -a -e ls_dmnd_fills_from_sys.ext_cache_remote --\
        taskset -c 0-15,64-79,128-143,192-207\
        perf bench sched messaging -p -t -l 100000 -g 8

      # Running 'sched/messaging' benchmark:
      # 20 sender and receiver threads per group
      # 8 groups == 320 threads run

           Total time: 7.648 [sec]

       Performance counter stats for 'system wide':

      S0-D0-L3-ID0      16  17,145,912  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID8      16  14,977,628  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID16     16     262,539  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID24     16       3,140  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID32     16      27,403  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID40     16      17,026  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID48     16       7,292  ls_dmnd_fills_from_sys.ext_cache_remote
      S0-D0-L3-ID56     16       2,464  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID64     16  22,489,306  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID72     16  21,455,257  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID80     16      11,619  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID88     16      30,978  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID96     16      37,628  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID104    16      13,594  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID112    16      10,164  ls_dmnd_fills_from_sys.ext_cache_remote
      S1-D1-L3-ID120    16      11,259  ls_dmnd_fills_from_sys.ext_cache_remote

            7.779171484 seconds time elapsed

   - Change default (no event/metric) formatting for default metrics so
     that events are hidden and the metric and group appear.

       Performance counter stats for 'ls /':

                    1.85 msec task-clock         #    0.594 CPUs utilized
                       0      context-switches   #    0.000 /sec
                       0      cpu-migrations     #    0.000 /sec
                      97      page-faults        #   52.517 K/sec
               2,187,173      cycles             #    1.184 GHz
               2,474,459      instructions       #    1.13  insn per cycle
                 531,584      branches           #  287.805 M/sec
                  13,626      branch-misses      #    2.56% of all branches
                              TopdownL1          #     23.5 %  tma_backend_bound
                                                 #     11.5 %  tma_bad_speculation
                                                 #     39.1 %  tma_frontend_bound
                                                 #     25.9 %  tma_retiring

   - Allow --cputype option to have any PMU name (not just hybrid).

   - Fix the output value so that it is not accumulated across runs when
     the -r option is used.

  perf list:

   - Show metricgroup description from JSON file called
     metricgroups.json.

   - Allow 'pfm' argument to list only libpfm4 events and check each
     event is supported before showing it.

  JSON vendor events:

   - Avoid event grouping using "NO_GROUP_EVENTS" constraints. The
     topdown events are correctly grouped even if no group exists.

   - Add "Default" metric group to print it in the default output, and
     use "DefaultMetricgroupName" to indicate the real metric group
     name.

   - Add AmpereOne core PMU events.

  Misc:

   - Define man page date correctly.

   - Track exception level properly on ARM CoreSight ETM.

   - Allow anonymous struct, union or enum when retrieving type names
     from DWARF.

   - Fix incorrect filename when calling `perf inject --jit`.

   - Handle PLT size correctly on LoongArch"

* tag 'perf-tools-for-v6.5-1-2023-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (269 commits)
  perf test: Skip metrics w/o event name in stat STD output linter
  perf test: Reorder event name checks in stat STD output linter
  perf pmu: Remove a hard coded cpu PMU assumption
  perf pmus: Add notion of default PMU for JSON events
  perf unwind: Fix map reference counts
  perf test: Set PERF_EXEC_PATH for script execution
  perf script: Initialize buffer for regs_map()
  perf tests: Fix test_arm_callgraph_fp variable expansion
  perf symbol: Add LoongArch case in get_plt_sizes()
  perf test: Remove x permission from lib/stat_output.sh
  perf test: Rerun failed metrics with longer workload
  perf test: Add skip list for metrics known would fail
  perf test: Add metric value validation test
  perf jit: Fix incorrect file name in DWARF line table
  perf annotate: Fix instruction association and parsing for LoongArch
  perf annotation: Switch lock from a mutex to a sharded_mutex
  perf sharded_mutex: Introduce sharded_mutex
  tools: Fix incorrect calculation of object size by sizeof
  perf subcmd: Fix missing check for return value of malloc() in add_cmdname()
  perf parse-events: Remove unneeded semicolon
  ...
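
The `--per-cache` sample output in the message above can be post-processed with a short script. The following is only an illustrative sketch and is not part of perf. It assumes that each row has the form `<aggregation ID> <CPUs in group> <count> <event>`, and that an ID such as `S0-D0-L3-ID0` encodes socket, die, cache level and cache ID; both are inferred from the sample rather than from a specification. The names `parse_row` and `totals_per_socket` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed row layout, inferred from the sample output in the commit message:
#   <aggregation id> <cpus in group> <count> <event name>
ROW = re.compile(
    r"^S(?P<socket>\d+)-D(?P<die>\d+)-L(?P<level>\d+)-ID(?P<cache_id>\d+)\s+"
    r"(?P<cpus>\d+)\s+(?P<count>[\d,]+)\s+(?P<event>\S+)$"
)


def parse_row(line):
    """Return a dict for one --per-cache row, or None if the line is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    row = {k: int(m[k]) for k in ("socket", "die", "level", "cache_id", "cpus")}
    row["count"] = int(m["count"].replace(",", ""))  # drop thousands separators
    row["event"] = m["event"]
    return row


def totals_per_socket(lines):
    """Sum the event counts of all parsable rows, grouped by socket number."""
    totals = defaultdict(int)
    for line in lines:
        row = parse_row(line)
        if row is not None:
            totals[row["socket"]] += row["count"]
    return dict(totals)


# Three rows copied from the sample output above.
SAMPLE = """\
S0-D0-L3-ID0 16 17,145,912 ls_dmnd_fills_from_sys.ext_cache_remote
S0-D0-L3-ID8 16 14,977,628 ls_dmnd_fills_from_sys.ext_cache_remote
S1-D1-L3-ID64 16 22,489,306 ls_dmnd_fills_from_sys.ext_cache_remote
""".splitlines()

if __name__ == "__main__":
    for socket, total in sorted(totals_per_socket(SAMPLE).items()):
        print(f"socket {socket}: {total:,}")
```

Lines that do not match the assumed layout, such as the benchmark banner or the elapsed-time footer, are skipped rather than treated as errors, so the whole `perf stat` output can be passed in unchanged.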
Diffstat (limited to 'tools')
-rw-r--r--  tools/lib/api/fs/cgroup.c  17
-rw-r--r--  tools/lib/api/fs/fs.c  226
-rw-r--r--  tools/lib/api/fs/tracing_path.c  17
-rw-r--r--  tools/lib/api/io.h  28
-rw-r--r--  tools/lib/perf/cpumap.c  125
-rw-r--r--  tools/lib/perf/evlist.c  25
-rw-r--r--  tools/lib/perf/include/internal/evsel.h  15
-rw-r--r--  tools/lib/perf/include/perf/cpumap.h  19
-rw-r--r--  tools/lib/perf/include/perf/event.h  3
-rw-r--r--  tools/lib/subcmd/exec-cmd.c  35
-rw-r--r--  tools/lib/subcmd/help.c  10
-rw-r--r--  tools/perf/Documentation/Makefile  15
-rw-r--r--  tools/perf/Documentation/perf-script.txt  2
-rw-r--r--  tools/perf/Documentation/perf-stat.txt  31
-rw-r--r--  tools/perf/Makefile.config  5
-rw-r--r--  tools/perf/Makefile.perf  4
-rw-r--r--  tools/perf/arch/arm/tests/dwarf-unwind.c  2
-rw-r--r--  tools/perf/arch/arm/util/auxtrace.c  7
-rw-r--r--  tools/perf/arch/arm/util/cs-etm.c  4
-rwxr-xr-x  tools/perf/arch/arm64/entry/syscalls/mksyscalltbl  17
-rw-r--r--  tools/perf/arch/arm64/tests/dwarf-unwind.c  2
-rw-r--r--  tools/perf/arch/arm64/util/pmu.c  6
-rw-r--r--  tools/perf/arch/common.c  18
-rw-r--r--  tools/perf/arch/loongarch/annotate/instructions.c  116
-rwxr-xr-x  tools/perf/arch/loongarch/entry/syscalls/mksyscalltbl  40
-rw-r--r--  tools/perf/arch/mips/entry/syscalls/mksyscalltbl  2
-rwxr-xr-x  tools/perf/arch/powerpc/entry/syscalls/mksyscalltbl  2
-rw-r--r--  tools/perf/arch/powerpc/tests/dwarf-unwind.c  2
-rw-r--r--  tools/perf/arch/powerpc/util/kvm-stat.c  4
-rw-r--r--  tools/perf/arch/s390/annotate/instructions.c  3
-rwxr-xr-x  tools/perf/arch/s390/entry/syscalls/mksyscalltbl  2
-rw-r--r--  tools/perf/arch/x86/annotate/instructions.c  50
-rwxr-xr-x  tools/perf/arch/x86/entry/syscalls/syscalltbl.sh  2
-rw-r--r--  tools/perf/arch/x86/include/arch-tests.h  3
-rw-r--r--  tools/perf/arch/x86/tests/Build  6
-rw-r--r--  tools/perf/arch/x86/tests/amd-ibs-via-core-pmu.c  5
-rw-r--r--  tools/perf/arch/x86/tests/arch-tests.c  14
-rw-r--r--  tools/perf/arch/x86/tests/dwarf-unwind.c  2
-rw-r--r--  tools/perf/arch/x86/tests/hybrid.c  288
-rw-r--r--  tools/perf/arch/x86/tests/insn-x86.c  10
-rw-r--r--  tools/perf/arch/x86/tests/intel-pt-test.c  14
-rw-r--r--  tools/perf/arch/x86/util/Build  1
-rw-r--r--  tools/perf/arch/x86/util/auxtrace.c  5
-rw-r--r--  tools/perf/arch/x86/util/env.c  19
-rw-r--r--  tools/perf/arch/x86/util/env.h  7
-rw-r--r--  tools/perf/arch/x86/util/evlist.c  29
-rw-r--r--  tools/perf/arch/x86/util/evsel.c  43
-rw-r--r--  tools/perf/arch/x86/util/intel-bts.c  4
-rw-r--r--  tools/perf/arch/x86/util/intel-pt.c  4
-rw-r--r--  tools/perf/arch/x86/util/mem-events.c  36
-rw-r--r--  tools/perf/arch/x86/util/perf_regs.c  15
-rw-r--r--  tools/perf/arch/x86/util/pmu.c  12
-rw-r--r--  tools/perf/arch/x86/util/topdown.c  5
-rw-r--r--  tools/perf/bench/epoll-ctl.c  5
-rw-r--r--  tools/perf/bench/epoll-wait.c  5
-rw-r--r--  tools/perf/bench/futex-lock-pi.c  12
-rw-r--r--  tools/perf/bench/futex-requeue.c  12
-rw-r--r--  tools/perf/bench/futex-wake-parallel.c  19
-rw-r--r--  tools/perf/bench/futex-wake.c  12
-rw-r--r--  tools/perf/bench/pmu-scan.c  60
-rw-r--r--  tools/perf/bench/sched-messaging.c  18
-rw-r--r--  tools/perf/builtin-annotate.c  32
-rw-r--r--  tools/perf/builtin-bench.c  2
-rw-r--r--  tools/perf/builtin-c2c.c  31
-rw-r--r--  tools/perf/builtin-config.c  4
-rw-r--r--  tools/perf/builtin-daemon.c  44
-rw-r--r--  tools/perf/builtin-diff.c  24
-rw-r--r--  tools/perf/builtin-ftrace.c  2
-rw-r--r--  tools/perf/builtin-help.c  4
-rw-r--r--  tools/perf/builtin-inject.c  35
-rw-r--r--  tools/perf/builtin-kmem.c  26
-rw-r--r--  tools/perf/builtin-kwork.c  27
-rw-r--r--  tools/perf/builtin-list.c  48
-rw-r--r--  tools/perf/builtin-lock.c  27
-rw-r--r--  tools/perf/builtin-mem.c  13
-rw-r--r--  tools/perf/builtin-probe.c  133
-rw-r--r--  tools/perf/builtin-record.c  42
-rw-r--r--  tools/perf/builtin-report.c  21
-rw-r--r--  tools/perf/builtin-sched.c  120
-rw-r--r--  tools/perf/builtin-script.c  218
-rw-r--r--  tools/perf/builtin-stat.c  343
-rw-r--r--  tools/perf/builtin-timechart.c  59
-rw-r--r--  tools/perf/builtin-top.c  48
-rw-r--r--  tools/perf/builtin-trace.c  96
-rwxr-xr-x  tools/perf/check-headers.sh  232
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/branch.json  17
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/bus.json  32
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/cache.json  104
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/core-imp-def.json  698
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/exception.json  44
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/instruction.json  89
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/intrinsic.json  14
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/memory.json  44
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/pipeline.json  23
-rw-r--r--  tools/perf/pmu-events/arch/arm64/ampere/ampereone/spe.json  14
-rw-r--r--  tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json  12
-rw-r--r--  tools/perf/pmu-events/arch/arm64/mapfile.csv  1
-rw-r--r--  tools/perf/pmu-events/arch/arm64/sbsa.json  12
-rw-r--r--  tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json  1410
-rw-r--r--  tools/perf/pmu-events/arch/x86/alderlake/cache.json  9
-rw-r--r--  tools/perf/pmu-events/arch/x86/alderlake/memory.json  6
-rw-r--r--  tools/perf/pmu-events/arch/x86/alderlake/metricgroups.json  122
-rw-r--r--  tools/perf/pmu-events/arch/x86/alderlaken/adln-metrics.json  301
-rw-r--r--  tools/perf/pmu-events/arch/x86/alderlaken/metricgroups.json  26
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json  580
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwell/floating-point.json  15
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwell/metricgroups.json  107
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwellde/bdwde-metrics.json  556
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwellde/floating-point.json  15
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwellde/metricgroups.json  107
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json  796
-rw-r--r--  tools/perf/pmu-events/arch/x86/broadwellx/floating-point.json  15