summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2024-10-28perf cap: Add __NR_capget to arch/x86 unistdIan Rogers1-7/+3
As there are duplicated kernel headers in tools/include libc can pick up the wrong definitions. This was causing the wrong system call for capget in perf. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: e25ebda78e230283 ("perf cap: Tidy up and improve capability testing") Closes: https://lore.kernel.org/lkml/cc7d6bdf-1aeb-4179-9029-4baf50b59342@intel.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241026055448.312247-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-23perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORTArnaldo Carvalho de Melo1-0/+3
Noticed while building on a raspbian arm 32-bit system. There was also this other case, fixed by adding a missing util/stat.h with the prototypes: /tmp/tmp.MbiSHoF3dj/perf-6.12.0-rc3/tools/perf/util/python.c:1396:6: error: no previous prototype for ‘perf_stat__set_no_csv_summary’ [-Werror=missing-prototypes] 1396 | void perf_stat__set_no_csv_summary(int set __maybe_unused) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /tmp/tmp.MbiSHoF3dj/perf-6.12.0-rc3/tools/perf/util/python.c:1400:6: error: no previous prototype for ‘perf_stat__set_big_num’ [-Werror=missing-prototypes] 1400 | void perf_stat__set_big_num(int set __maybe_unused) | ^~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors In other architectures this must be building due to some lucky indirect inclusion of that header. Fixes: 9dabf4003423c8d3 ("perf python: Switch module to linking libraries from building source") Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZxllAtpmEw5fg9oy@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-23perf trace: Fix non-listed archs in the syscalltbl routinesJiri Slaby1-0/+10
This fixes a build breakage on 32-bit arm, where the syscalltbl__id_at_idx() function was missing. Committer notes: Generating a proper syscall table from a copy of arch/arm/tools/syscall.tbl ends up being too big a patch for this rc stage, I started doing it but while testing noticed some other problems with using BPF to collect pointer args on arm7 (32-bit) will maybe continue trying to make it work on the next cycle... Fixes: 7a2fb5619cc1fb53 ("perf trace: Fix iteration of syscall ids in syscalltbl->entries") Suggested-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: <jslaby@suse.cz> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/lkml/3a592835-a14f-40be-8961-c0cee7720a94@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-23perf trace augmented_raw_syscalls: Add more checks to pass the verifierHoward Chu1-2/+18
Add some more checks to pass the verifier in more kernels. Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241011021403.4089793-3-howardchu95@gmail.com [ Reduced the patch removing things that can be done later ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-23perf trace augmented_raw_syscalls: Add extra array index bounds checking to ↵Arnaldo Carvalho de Melo1-0/+2
satisfy some BPF verifiers In a RHEL8 kernel (4.18.0-513.11.1.el8_9.x86_64), that, as enterprise kernels go, have backports from modern kernels, the verifier complains about lack of bounds check for the index into the array of syscall arguments, on a BPF bytecode generated by clang 17, with: ; } else if (size < 0 && size >= -6) { /* buffer */ 116: (b7) r1 = -6 117: (2d) if r1 > r6 goto pc-30 R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=inv-6 R2=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3=inv(id=0) R5=inv40 R6=inv(id=0,umin_value=18446744073709551610,var_off=(0xffffffff00000000; 0xffffffff)) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9=map_value(id=0,off=20,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32=inv40 fp-40=ctx fp-48=map_value fp-56=inv1 fp-64=map_value fp-72=map_value fp-80=map_value ; index = -(size + 1); 118: (a7) r6 ^= -1 119: (67) r6 <<= 32 120: (77) r6 >>= 32 ; aug_size = args->args[index]; 121: (67) r6 <<= 3 122: (79) r1 = *(u64 *)(r10 -24) 123: (0f) r1 += r6 last_idx 123 first_idx 116 regs=40 stack=0 before 122: (79) r1 = *(u64 *)(r10 -24) regs=40 stack=0 before 121: (67) r6 <<= 3 regs=40 stack=0 before 120: (77) r6 >>= 32 regs=40 stack=0 before 119: (67) r6 <<= 32 regs=40 stack=0 before 118: (a7) r6 ^= -1 regs=40 stack=0 before 117: (2d) if r1 > r6 goto pc-30 regs=42 stack=0 before 116: (b7) r1 = -6 R0_w=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=inv1 R2_w=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3_w=inv(id=0) R5_w=inv40 R6_rw=invP(id=0,smin_value=-2147483648,smax_value=0) R7_w=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8_w=invP6 R9_w=map_value(id=0,off=20,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16_w=map_value fp-24_r=map_value fp-32_w=inv40 fp-40=ctx fp-48=map_value fp-56_w=inv1 fp-64_w=map_value fp-72=map_value fp-80=map_value parent didn't have regs=40 stack=0 marks last_idx 110 first_idx 98 regs=40 stack=0 before 110: (6d) if r1 s> r6 goto pc+5 regs=42 stack=0 before 109: (b7) r1 = 1 regs=40 stack=0 before 108: (65) if r6 s> 0x1000 goto pc+7 regs=40 stack=0 before 98: (55) if r6 != 0x1 goto pc+9 R0_w=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=invP12 R2_w=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3_rw=inv(id=0) R5_w=inv24 R6_rw=invP(id=0,smin_value=-2147483648,smax_value=2147483647) R7_w=map_value(id=0,off=40,ks=4,vs=8272,imm=0) R8_rw=invP4 R9_w=map_value(id=0,off=12,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16_rw=map_value fp-24_r=map_value fp-32_rw=invP24 fp-40_r=ctx fp-48_r=map_value fp-56_w=invP1 fp-64_rw=map_value fp-72_r=map_value fp-80_r=map_value parent already had regs=40 stack=0 marks 124: (79) r6 = *(u64 *)(r1 +16) R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=map_value(id=0,off=0,ks=4,vs=8272,umax_value=34359738360,var_off=(0x0; 0x7fffffff8),s32_max_value=2147483640,u32_max_value=-8) R2=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3=inv(id=0) R5=inv40 R6_w=invP(id=0,umax_value=34359738360,var_off=(0x0; 0x7fffffff8),s32_max_value=2147483640,u32_max_value=-8) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9=map_value(id=0,off=20,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32=inv40 fp-40=ctx fp-48=map_value fp-56=inv1 fp-64=map_value fp-72=map_value fp-80=map_value R1 unbounded memory access, make sure to bounds check any such access processed 466 insns (limit 1000000) max_states_per_insn 2 total_states 20 peak_states 20 mark_read 3 If we add this line, as used in other BPF programs, to cap that index: index &= 7; The generated BPF program is considered safe by that version of the BPF verifier, allowing perf to collect the syscall args in one more kernel using the BPF based pointer contents collector. With the above one-liner it works with that kernel: [root@dell-per740-01 ~]# uname -a Linux dell-per740-01.khw.eng.rdu2.dc.redhat.com 4.18.0-513.11.1.el8_9.x86_64 #1 SMP Thu Dec 7 03:06:13 EST 2023 x86_64 x86_64 x86_64 GNU/Linux [root@dell-per740-01 ~]# ~acme/bin/perf trace -e *sleep* sleep 1.234567890 0.000 (1234.704 ms): sleep/3863610 nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567890 }) = 0 [root@dell-per740-01 ~]# As well as with the one in Fedora 40: root@number:~# uname -a Linux number 6.11.3-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Oct 10 22:31:19 UTC 2024 x86_64 GNU/Linux root@number:~# perf trace -e *sleep* sleep 1.234567890 0.000 (1234.722 ms): sleep/14873 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567890 }, rmtp: 0x7ffe87311a40) = 0 root@number:~# Song Liu reported that this one-liner was being optimized out by clang 18, so I suggested and he tested that adding a compiler barrier before it made clang v18 to keep it and the verifier in the kernel in Song's case (Meta's 5.12 based kernel) also was happy with the resulting bytecode. I'll investigate using virtme-ng[1] to have all the perf BPF based functionality thoroughly tested over multiple kernels and clang versions. [1] https://kernel-recipes.org/en/2024/virtme-ng/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <andrea.righi@linux.dev> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/lkml/Zw7JgJc0LOwSpuvx@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-08Merge tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of ↵Linus Torvalds4-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix an assert() to handle captured and unprocessed ARM CoreSight CPU traces - Fix static build compilation error when libdw isn't installed or is too old - Add missing include when building with !HAVE_DWARF_GETLOCATIONS_SUPPORT - Add missing refcount put on 32-bit DSOs - Fix disassembly of user space binaries by setting the binary_type of DSO when loading - Update headers with the kernel sources, including asound.h, sched.h, fcntl, msr-index.h, irq_vectors.h, socket.h, list_sort.c and arm64's cputype.h * tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cs-etm: Fix the assert() to handle captured and unprocessed cpu trace perf build: Fix build feature-dwarf_getlocations fail for old libdw perf build: Fix static compilation error when libdw is not installed perf dwarf-aux: Fix build with !HAVE_DWARF_GETLOCATIONS_SUPPORT tools headers arm64: Sync arm64's cputype.h with the kernel sources perf tools: Cope with differences for lib/list_sort.c copy from the kernel tools check_headers.sh: Add check variant that excludes some hunks perf beauty: Update copy of linux/socket.h with the kernel sources tools headers UAPI: Sync the linux/in.h with the kernel sources perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools include UAPI: Sync linux/fcntl.h copy with the kernel sources tools include UAPI: Sync linux/sched.h copy with the kernel sources tools include UAPI: Sync sound/asound.h copy with the kernel sources perf vdso: Missed put on 32-bit dsos perf symbol: Set binary_type of dso when loading
2024-10-02move asm/unaligned.h to linux/unaligned.hAl Viro2-2/+2
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02perf cs-etm: Fix the assert() to handle captured and unprocessed cpu traceIlkka Koskinen1-1/+1
If one builds perf with DEBUG=1, captures data on multiple CPUs and finally runs 'perf report -C <cpu>' for only one of the cpus, assert() aborts the program. This happens because there are empty queues with format set. This patch changes the condition to abort only if a queue is not empty and if the format is unset. $ make -C tools/perf DEBUG=1 CORESIGHT=1 CSLIBS=/usr/lib CSINCLUDES=/usr/include install $ perf record -o kcore --kcore -e cs_etm/timestamp/k -s -C 0-1 dd if=/dev/zero of=/dev/null bs=1M count=1 $ perf report --input kcore/data --vmlinux=/home/ikoskine/projects/linux/vmlinux -C 1 Aborted (core dumped) Fixes: 57880a7966be510c ("perf: cs-etm: Allocate queues for all CPUs") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240924233930.5193-1-ilkka@os.amperecomputing.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-02perf dwarf-aux: Fix build with !HAVE_DWARF_GETLOCATIONS_SUPPORTJames Clark1-0/+1
The linked fixes commit added an #include "dwarf-aux.h" to disasm.h which gets picked up in a lot of places. Without HAVE_DWARF_GETLOCATIONS_SUPPORT the stubs return an errno, so include errno.h to fix the following build error: In file included from util/disasm.h:8, from util/annotate.h:16, from builtin-top.c:23: util/dwarf-aux.h: In function 'die_get_var_range': util/dwarf-aux.h:183:10: error: 'ENOTSUP' undeclared (first use in this function) 183 | return -ENOTSUP; | ^~~~~~~ Fixes: 782959ac248ac3cb ("perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking") Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241001123625.1063153-1-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-27perf vdso: Missed put on 32-bit dsosIan Rogers1-1/+3
If the dso type doesn't match then NULL is returned but the dso should be put first. Fixes: f649ed80f3cabbf1 ("perf dsos: Tidy reference counting and locking") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240912182757.762369-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-22perf symbol: Set binary_type of dso when loadingNamhyung Kim1-0/+3
For the kernel dso, it sets the binary type of dso when loading the symbol table. But it seems not to do that for user DSOs. Actually it sets the symtab type only. It's not clear why we want to maintain the two separately but it uses the binary type info before getting the disassembly. Let's use the symtab type as binary type too if it's not set. I think it's ok to set the binary type when it founds a symsrc whether or not it has actual symbols. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Alexander Monakov <amonakov@ispras.ru> Link: https://lore.kernel.org/r/20240426215139.1271039-1-namhyung@kernel.org Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: LKML <linux-kernel@vger.kernel.org> Cc: <linux-perf-users@vger.kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-11perf env: Find correct branch counter info on hybridKan Liang4-4/+25
No event is printed in the "Branch Counter" column on hybrid machines. For example, $ perf record -e "{cpu_core/branch-instructions/pp,cpu_core/branches/}:S" -j any,counter $ perf report --total-cycles # Branch counter abbr list: # cpu_core/branch-instructions/pp = A # cpu_core/branches/ = B # '-' No event occurs # '+' Event occurrences may be lost due to branch counter saturated # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles Branch Counter # ............... .............. ........... .......... .............. 44.54% 727.1K 0.00% 1 |+ |+ | 36.31% 592.7K 0.00% 2 |+ |+ | 17.83% 291.1K 0.00% 1 |+ |+ | The branch counter information (br_cntr_width and br_cntr_nr) in the perf_env is retrieved from the CPU_PMU_CAPS. However, the CPU_PMU_CAPS is not available on hybrid machines. Without the width information, the number of occurrences of an event cannot be calculated. For a hybrid machine, the caps information should be retrieved from the PMU_CAPS, and stored in the perf_env->pmu_caps. Add a perf_env__find_br_cntr_info() to return the correct branch counter information from the corresponding fields. Committer notes: While testing I couldn't s ee those "Branch counter" columns enabled by pressing 'B' on the TUI, after reporting it to the list Kan explained the situation: <quote Kan Liang> For a hybrid client, the "Branch Counter" feature is only supported starting from the just released Lunar Lake. Perf falls back to only "ANY" on your Raptor Lake. The "The branch counter is not available" message is expected. Here is the 'perf evlist' result from my Lunar Lake machine, # perf evlist -v cpu_core/branch-instructions/pp: type: 4 (cpu_core), size: 136, config: 0xc4 (branch-instructions), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|READ|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID|GROUP|LOST, disabled: 1, freq: 1, enable_on_exec: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, branch_sample_type: ANY|COUNTERS # </quote> Fixes: 6f9d8d1de2c61288 ("perf script: Add branch counters") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240909184201.553519-1-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-11perf pmu: To info add event_type_descIan Rogers3-1/+6
All PMU events are assumed to be "Kernel PMU event", however, this isn't true for fake PMUs and won't be true with the addition of more software PMUs. Make the PMU's type description name configurable - largely for printing callbacks. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20240907050830.6752-5-irogers@google.com Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-11perf evsel: Add accessor for tool_eventIan Rogers3-13/+18
Currently tool events use a dedicated variable within the evsel. Later changes will move this to the unused struct perf_event_attr config for these events. Add an accessor to allow the later change to be well typed and avoid changing all uses. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20240907050830.6752-4-irogers@google.com Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-11perf pmus: Fake PMU clean upIan Rogers7-34/+57
Rather than passing a fake PMU around, just pass that the fake PMU should be used - true when doing testing. Move the fake PMU into pmus.[ch] and try to abstract the PMU's properties in pmu.c, ie so there is less "if fake_pmu" in non-PMU code. Give the fake PMU a made up type number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20240907050830.6752-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf callchain: Allow symbols to be optional when resolving a callchainIan Rogers4-51/+84
In uses like 'perf inject' it is not necessary to gather the symbol for each call chain location, the map for the sample IP is wanted so that build IDs and the like can be injected. Make gathering the symbol in the callchain_cursor optional. For a 'perf inject -B' command this lowers the peak RSS from 54.1MB to 29.6MB by avoiding loading symbols. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Link: https://lore.kernel.org/r/20240909203740.143492-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf inject: Lazy build-id mmap2 event insertionIan Rogers2-0/+12
Add -B option that lazily inserts mmap2 events thereby dropping all mmap events without samples. This is similar to the behavior of -b where only build_id events are inserted when a dso is accessed in a sample. File size savings can be significant in system-wide mode, consider: $ perf record -g -a -o perf.data sleep 1 $ perf inject -B -i perf.data -o perf.new.data $ ls -al perf.data perf.new.data 5147049 perf.data 2248493 perf.new.data Give test coverage of the new option in pipe test. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Link: https://lore.kernel.org/r/20240909203740.143492-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf inject: Add new mmap2-buildid-all optionIan Rogers2-0/+68
Add an option that allows all mmap or mmap2 events to be rewritten as mmap2 events with build IDs. This is similar to the existing -b/--build-ids and --buildid-all options except instead of adding a build_id event an existing mmap/mmap2 event is used as a template and a new mmap2 event synthesized from it. As mmap2 events are typical this avoids the insertion of build_id events. Add test coverage to the pipe test. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Link: https://lore.kernel.org/r/20240909203740.143492-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf inject: Fix build ID injectionIan Rogers3-16/+44
Build ID injection wasn't inserting a sample ID and aligning events to 64 bytes rather than 8. No sample ID means events are unordered and two different build_id events for the same path, as happens when a file is replaced, can't be differentiated. Add in sample ID insertion for the build_id events alongside some refactoring. The refactoring better aligns the function arguments for different use cases, such as synthesizing build_id events without needing to have a dso. The misc bits are explicitly passed as with callchains the maps/dsos may span user and kernel land, so using sample->cpumode isn't good enough. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Link: https://lore.kernel.org/r/20240909203740.143492-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf annotate-data: Add pr_debug_scope()Namhyung Kim1-2/+23
The pr_debug_scope() is to print more information about the scope DIE during the instruction tracking so that it can help finding relevant debug info and the source code like inlined functions more easily. $ perf --debug type-profile annotate --data-type ... ----------------------------------------------------------- find data type for 0(reg0, reg12) at set_task_cpu+0xdd CU for kernel/sched/core.c (die:0x1268dae) frame base: cfa=1 fbreg=7 scope: [3/3] (die:12b6d28) [inlined] set_task_rq <<<--- (here) bb: [9f - dd] var [9f] reg3 type='struct task_struct*' size=0x8 (die:0x126aff0) var [9f] reg6 type='unsigned int' size=0x4 (die:0x1268e0d) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240909214251.3033827-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf annotate: Treat 'call' instruction as stack operationNamhyung Kim1-0/+1
I found some portion of mem-store events sampled on CALL instruction which has no memory access. But it actually saves a return address into stack. It should be considered as a stack operation like RET instruction. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240909214251.3033827-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf trace: Collect augmented data using BPFHoward Chu1-1/+106
Include trace_augment.h for TRACE_AUG_MAX_BUF, so that BPF reads TRACE_AUG_MAX_BUF bytes of buffer maximum. Determine what type of argument and how many bytes to read from user space, us ing the value in the beauty_map. This is the relation of parameter type and its corres ponding value in the beauty map, and how many bytes we read eventually: string: 1 -> size of string (till null) struct: size of struct -> size of struct buffer: -1 * (index of paired len) -> value of paired len (maximum: TRACE_AUG_ MAX_BUF) After reading from user space, we output the augmented data using bpf_perf_event_output(). If the struct augmenter, augment_sys_enter() failed, we fall back to using bpf_tail_call(). I have to make the payload 6 times the size of augmented_arg, to pass the BPF verifier. Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240815013626.935097-10-howardchu95@gmail.com Link: https://lore.kernel.org/r/20240824163322.60796-7-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf trace: Pretty print buffer dataHoward Chu1-0/+6
Define TRACE_AUG_MAX_BUF in trace_augment.h data, which is the maximum buffer size we can augment. BPF will include this header too. Print buffer in a way that's different than just printing a string, we print all the control characters in \digits (such as \0 for null, and \10 for newline, LF). For character that has a bigger value than 127, we print the digits instead of the character itself as well. Committer notes: Simplified the buffer scnprintf to avoid using multiple buffers as discussed in the patch review thread. We can't really all 'buf' args to SCA_BUF as we're collecting so far just on the sys_enter path, so we would be printing the previous 'read' arg buffer contents, not what the kernel puts there. So instead of: static int syscall_fmt__cmp(const void *name, const void *fmtp) @@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field - else if (strstr(field->type, "char *") && strstr(field->name, "buf")) - arg->scnprintf = SCA_BUF; Do: static const struct syscall_fmt syscall_fmts[] = { + { .name = "write", .errpid = true, + .arg = { [1] = { .scnprintf = SCA_BUF /* buf */, from_user = true, }, }, }, Signed-off-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240815013626.935097-8-howardchu95@gmail.com Link: https://lore.kernel.org/r/20240824163322.60796-6-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10perf trace: Add trace__bpf_sys_enter_beauty_map() to prepare for fetching ↵Howard Chu1-0/+7
data in BPF Set up beauty_map, load it to BPF, in such format: if argument No.3 is a struct of size 32 bytes (of syscall number 114) beauty_map[114][2] = 32; if argument No.3 is a string (of syscall number 114) beauty_map[114][2] = 1; if argument No.3 is a buffer, its size is indicated by argument No.4 (of syscall number 114) beauty_map[114][2] = -4; /* -1 ~ -6, we'll read this buffer size in BPF */ Committer notes: Moved syscall_arg_fmt__cache_btf_struct() from a ifdef HAVE_LIBBPF_SUPPORT to closer to where it is used, that is ifdef'ed on HAVE_BPF_SKEL and thus breaks the build when building with BUILD_BPF_SKEL=0, as detected using 'make -C tools/perf build-test'. Also add 'struct beauty_map_enter' to tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c as we're using it in this patch, otherwise we get this while trying to build at this point in the original patch series: builtin-trace.c: In function ‘trace__init_syscalls_bpf_prog_array_maps’: builtin-trace.c:3725:58: error: ‘struct <anonymous>’ has no member named ‘beauty_map_enter’ 3725 | int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter); | We also have to take into account syscall_arg_fmt.from_user when telling the kernel what to copy in the sys_enter generic collector, we don't want to collect bogus data in buffers that will only be available to us at sys_exit time, i.e. after the kernel has filled it, so leave this for when we have such a sys_exit based collector. Committer testing: Not wired up yet, so all continues to work, using the existing BPF collector and userspace beautifiers that are augmentation aware: root@number:~# rm -f 987654 ; touch 123456 ; perf trace -e rename* mv 123456 987654 0.000 ( 0.031 ms): mv/20888 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0 root@number:~# perf trace -e connect,sendto ping -c 1 www.google.com 0.000 ( 0.014 ms): ping/20892 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0 0.040 ( 0.003 ms): ping/20892 sendto(fd: 5, buff: 0x560b4ff17980, len: 97, flags: DONTWAIT|NOSIGNAL) = 97 0.480 ( 0.017 ms): ping/20892 sendto(fd: 5, buff: 0x7ffd82d07150, len: 20, addr: { .family: NETLINK }, addr_len: 0xc) = 20 0.526 ( 0.014 ms): ping/20892 connect(fd: 5, uservaddr: { .family: INET6, port: 0, addr: 2800:3f0:4004:810::2004 }, addrlen: 28) = 0 0.542 ( 0.002 ms): ping/20892 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 0.544 ( 0.004 ms): ping/20892 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 142.251.135.100 }, addrlen: 16) = 0 0.559 ( 0.002 ms): ping/20892 connect(fd: 5, uservaddr: { .family: INET, port: 1025, addr: 142.251.135.100 }, addrlen: 16PING www.google.com (142.251.135.100) 56(84) bytes of data. ) = 0 0.589 ( 0.058 ms): ping/20892 sendto(fd: 3, buff: 0x560b4ff11ac0, len: 64, addr: { .family: INET, port: 0, addr: 142.251.135.100 }, addr_len: 0x10) = 64 45.250 ( 0.029 ms): ping/20892 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0 45.344 ( 0.012 ms): ping/20892 sendto(fd: 5, buff: 0x560b4ff19340, len: 111, flags: DONTWAIT|NOSIGNAL) = 111 64 bytes from rio09s08-in-f4.1e100.net (142.251.135.100): icmp_seq=1 ttl=49 time=44.4 ms --- www.google.com ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 44.361/44.361/44.361/0.000 ms root@number:~# Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240815013626.935097-4-howardchu95@gmail.com Link: https://lore.kernel.org/r/20240824163322.60796-3-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-09perf trace: Use a common encoding for augmented arguments, with size + error ↵Arnaldo Carvalho de Melo1-41/+63
+ payload We were using a more compact format, without explicitely encoding the size and possible error in the payload for an argument. To do it generically, at least as Howard Chu did in his GSoC activities, it is more convenient to use the same model that was being used for string arguments, passing { size, error, payload }. So use that for the non string syscall args we have so far: struct timespec struct perf_event_attr struct sockaddr (this one has even a variable size) With this in place we have the userspace pretty printers: perf_event_attr___scnprintf() syscall_arg__scnprintf_augmented_sockaddr() syscall_arg__scnprintf_augmented_timespec() Ready to have the generic BPF collector in tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c sending its generic payload and thus we'll use them instead of a generic libbpf btf_dump interface that doesn't know about about the sockaddr mux, perf_event_attr non-trivial fields (sample_type, etc), leaving it as a (useful) fallback that prints just basic types until we put in place a more sophisticated pretty printer infrastructure that associates synthesized enums to struct fields using the header scrapers we have in tools/perf/trace/beauty/, some of them in this list: $ ls tools/perf/trace/beauty/*.sh tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/kcmp_type.sh tools/perf/trace/beauty/perf_ioctl.sh tools/perf/trace/beauty/statx_mask.sh tools/perf/trace/beauty/clone.sh tools/perf/trace/beauty/kvm_ioctl.sh tools/perf/trace/beauty/pkey_alloc_access_rights.sh tools/perf/trace/beauty/sync_file_range.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/madvise_behavior.sh tools/perf/trace/beauty/prctl_option.sh tools/perf/trace/beauty/usbdevfs_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/mmap_flags.sh tools/perf/trace/beauty/rename_flags.sh tools/perf/trace/beauty/vhost_virtio_ioctl.sh tools/perf/trace/beauty/fs_at_flags.sh tools/perf/trace/beauty/mmap_prot.sh tools/perf/trace/beauty/sndrv_ctl_ioctl.sh tools/perf/trace/beauty/x86_arch_prctl.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/mount_flags.sh tools/perf/trace/beauty/sndrv_pcm_ioctl.sh tools/perf/trace/beauty/fsmount.sh tools/perf/trace/beauty/move_mount_flags.sh tools/perf/trace/beauty/sockaddr.sh tools/perf/trace/beauty/fspick.sh tools/perf/trace/beauty/mremap_flags.sh tools/perf/trace/beauty/socket.sh $ Testing it: root@number:~# rm -f 987654 ; touch 123456 ; perf trace -e rename* mv 123456 987654 0.000 ( 0.031 ms): mv/1193096 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0 root@number:~# perf trace -e *nanosleep sleep 1.2345678901 0.000 (1234.654 ms): sleep/1192697 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567891 }, rmtp: 0x7ffe1ea80460) = 0 root@number:~# perf trace -e perf_event_open* perf stat -e cpu-clock sleep 1 0.000 ( 0.011 ms): perf/1192701 perf_event_open(attr_uptr: { type: 1 (software), size: 136, config: 0 (PERF_COUNT_SW_CPU_CLOCK), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 1192702 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 Performance counter stats for 'sleep 1': 0.51 msec cpu-clock # 0.001 CPUs utilized 1.001242090 seconds time elapsed 0.000000000 seconds user 0.001010000 seconds sys root@number:~# perf trace -e connect* ping -c 1 bsky.app 0.000 ( 0.130 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0 23.907 ( 0.006 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.20.108.158 }, addrlen: 16) = 0 23.915 PING bsky.app (3.20.108.158) 56(84) bytes of data. ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.917 ( 0.002 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.12.170.30 }, addrlen: 16) = 0 23.921 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.923 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 18.217.70.179 }, addrlen: 16) = 0 23.925 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.927 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.132.20.46 }, addrlen: 16) = 0 23.930 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.931 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.142.89.165 }, addrlen: 16) = 0 23.934 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.935 ( 0.002 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 18.119.147.159 }, addrlen: 16) = 0 23.938 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.940 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.22.38.164 }, addrlen: 16) = 0 23.942 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0 23.944 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.13.14.133 }, addrlen: 16) = 0 23.956 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 1025, addr: 3.20.108.158 }, addrlen: 16) = 0 ^C --- bsky.app ping statistics --- 1 packets transmitted, 0 received, 100% packet loss, time 0ms root@number:~# Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/CAP-5=fW4=2GoP6foAN6qbrCiUzy0a_TzHbd8rvDsakTPfdzvfg@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-09perf trace augmented_syscalls.bpf: Move the renameat aumenter to renameat2, ↵Arnaldo Carvalho de Melo1-2/+2
temporarily While trying to shape Howard Chu's generic BPF augmenter transition into the codebase I got stuck with the renameat2 syscall. Until I noticed that the attempt at reusing augmenters were making it use the 'openat' syscall augmenter, that collect just one string syscall arg, for the 'renameat2' syscall, that takes two strings. So, for the moment, just to help in this transition period, since 'renameat2' is what is used these days in the 'mv' utility, just make the BPF collector be associated with the more widely used syscall, hopefully the transition to Howard's generic BPF augmenter will cure this, so get this out of the way for now! So now we still have that odd "reuse", but for something we're not testing so won't get in the way anymore: root@number:~# rm -f 987654 ; touch 123456 ; perf trace -vv -e rename* mv 123456 987654 |& grep renameat Reusing "openat" BPF sys_enter augmenter for "renameat" 0.000 ( 0.079 ms): mv/1158612 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0 root@number:~# Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/CAP-5=fXjGYs=tpBgETK-P9U-CuXssytk9pSnTXpfphrmmOydWA@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-06perf mem: Fix missed p-core mem events on ADL and RPLKan Liang2-3/+5
The p-core mem events are missed when launching 'perf mem record' on ADL and RPL. root@number:~# perf mem record sleep 1 Memory events are enabled on a subset of CPUs: 16-27 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.032 MB perf.data ] root@number:~# perf evlist cpu_atom/mem-loads,ldlat=30/P cpu_atom/mem-stores/P dummy:u A variable 'record' in the 'struct perf_mem_event' is to indicate whether a mem event