summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2021-05-28perf unwind: Set userdata for all __report_module() pathsDave Rigby1-3/+8
commit 4e1481445407b86a483616c4542ffdc810efb680 upstream. When locating the DWARF module for a given address, __find_debuginfo() requires a 'struct dso' passed via the userdata argument. However, this field is only set in __report_module() if the module is found in via dwfl_addrmodule(), not if it is found later via dwfl_report_elf(). Set userdata irrespective of how the DWARF module was found, as long as we found a module. Fixes: bf53fc6b5f41 ("perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder") Signed-off-by: Dave Rigby <d.rigby@me.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211801 Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/linux-perf-users/20210218165654.36604-1-d.rigby@me.com/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Tommi Rantala" <tommi.t.rantala@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-28perf unwind: Fix separate debug info files when using elfutils' libdw's unwinderJan Kratochvil1-5/+27
commit bf53fc6b5f415cddc7118091cb8fd6a211b2320d upstream. elfutils needs to be provided main binary and separate debug info file respectively. Providing separate debug info file instead of the main binary is not sufficient. One needs to try both supplied filename and its possible cache by its build-id depending on the use case. Signed-off-by: Jan Kratochvil <jan.kratochvil@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "Tommi Rantala" <tommi.t.rantala@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22tweewide: Fix most Shebang linesFinn Behrens2-2/+2
commit c25ce589dca10d64dde139ae093abc258a32869c upstream. Change every shebang which does not need an argument to use /usr/bin/env. This is needed as not every distro has everything under /usr/bin, sometimes not even bash. Signed-off-by: Finn Behrens <me@kloenk.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19perf tools: Fix dynamic libbpf linkJiri Olsa2-0/+8
[ Upstream commit ad1237c30d975535a669746496cbed136aa5a045 ] Justin reported broken build with LIBBPF_DYNAMIC=1. When linking libbpf dynamically we need to use perf's hashmap object, because it's not exported in libbpf.so (only in libbpf.a). Following build is now passing: $ make LIBBPF_DYNAMIC=1 BUILD: Doing 'make -j8' parallel build ... $ ldd perf | grep libbpf libbpf.so.0 => /lib64/libbpf.so.0 (0x00007fa7630db000) Fixes: eee19501926d ("perf tools: Grab a copy of libbpf's hashmap") Reported-by: Justin M. Forbes <jforbes@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210508205020.617984-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf session: Add swap operation for event TIME_CONVLeo Yan1-1/+14
[ Upstream commit 050ffc449008eeeafc187dec337d9cf1518f89bc ] Since commit d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data structure for clock parameters. To be backwards-compatible, this patch adds a dedicated swap operation for the event PERF_RECORD_TIME_CONV, based on checking if the event contains field "time_cycles", it can support both for the old and new event formats. Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-4-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf jit: Let convert_timestamp() to be backwards-compatibleLeo Yan1-10/+20
[ Upstream commit aa616f5a8a2d22a179d5502ebd85045af66fa656 ] Commit d110162cafc80dad ("perf tsc: Support cap_user_time_short for event TIME_CONV") supports the extended parameters for event TIME_CONV, but it broke the backwards compatibility, so any perf data file with old event format fails to convert timestamp. This patch introduces a helper event_contains() to check if an event contains a specific member or not. For the backwards-compatibility, if the event size confirms the extended parameters are supported in the event TIME_CONV, then copies these parameters. Committer notes: To make this compiler backwards compatible add this patch: - struct perf_tsc_conversion tc = { 0 }; + struct perf_tsc_conversion tc = { .time_shift = 0, }; Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf beauty: Fix fsconfig generatorVitaly Chikunov1-4/+3
[ Upstream commit 2e1daee14e67fbf9b27280b974e2c680a22cabea ] After gnulib update sed stopped matching `[[:space:]]*+' as before, causing the following compilation error: In file included from builtin-trace.c:719: trace/beauty/generated/fsconfig_arrays.c:2:3: error: expected expression before ']' token 2 | [] = "", | ^ trace/beauty/generated/fsconfig_arrays.c:2:3: error: array index in initializer not of integer type trace/beauty/generated/fsconfig_arrays.c:2:3: note: (near initialization for 'fsconfig_cmds') Fix this by correcting the regular expression used in the generator. Also, clean up the script by removing redundant egrep, xargs, and printf invocations. Committer testing: Continues to work: $ cat tools/perf/trace/beauty/fsconfig.sh #!/bin/sh # SPDX-License-Identifier: LGPL-2.1 if [ $# -ne 1 ] ; then linux_header_dir=tools/include/uapi/linux else linux_header_dir=$1 fi linux_mount=${linux_header_dir}/mount.h printf "static const char *fsconfig_cmds[] = {\n" ms='[[:space:]]*' sed -nr "s/^${ms}FSCONFIG_([[:alnum:]_]+)${ms}=${ms}([[:digit:]]+)${ms},.*/\t[\2] = \"\1\",/p" \ ${linux_mount} printf "};\n" $ tools/perf/trace/beauty/fsconfig.sh static const char *fsconfig_cmds[] = { [0] = "SET_FLAG", [1] = "SET_STRING", [2] = "SET_BINARY", [3] = "SET_PATH", [4] = "SET_PATH_EMPTY", [5] = "SET_FD", [6] = "CMD_CREATE", [7] = "CMD_RECONFIGURE", }; $ Fixes: d35293004a5e4 ("perf beauty: Add generator for fsconfig's 'cmd' arg values") Signed-off-by: Vitaly Chikunov <vt@altlinux.org> Co-authored-by: Dmitry V. Levin <ldv@altlinux.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20210414182723.1670663-1-vt@altlinux.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metricSmita Koralahalli4-8/+8
[ Upstream commit 86c2bc3da769124e3e856b6e9457be3667c30919 ] Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended events") added the hits event "L2 Cache Hits from L2 HWPF" with the same metric expression as the accesses event "L2 Cache Accesses from L2 HWPF": $ perf list --details ... l2_cache_accesses_from_l2_hwpf [L2 Cache Accesses from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] l2_cache_hits_from_l2_hwpf [L2 Cache Hits from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] ... This was wrong and led to counting hits the same as accesses. Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with EventCode 0x70 which is the same as l2_pf_hit_l2. Fix this, and massage the description for l2_pf_hit_l2 as the hits event is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended event over other events if the duplicate exists and maintain both for consistency. Hence, l2_cache_hits_from_l2_hwpf should override l2_pf_hit_l2. Before: # perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 1,436 l2_pf_miss_l2_l3 # 11114.00 l2_cache_accesses_from_l2_hwpf # 11114.00 l2_cache_hits_from_l2_hwpf 4,482 l2_pf_hit_l2 5,196 l2_pf_miss_l2_hit_l3 1.001765339 seconds time elapsed After: # perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 1,477 l2_pf_miss_l2_l3 # 10442.00 l2_cache_accesses_from_l2_hwpf 3,978 l2_pf_hit_l2 4,987 l2_pf_miss_l2_hit_l3 1.001491186 seconds time elapsed # perf stat -e l2_cache_hits_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 3,983 l2_cache_hits_from_l2_hwpf 1.001329970 seconds time elapsed Note the difference in performance counter values for the accesses versus the hits after the fix, and the hits event now counting the same as l2_pf_hit_l2. Fixes: 08ed77e414ab ("perf vendor events amd: Add recommended events") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of ↵Arnaldo Carvalho de Melo1-1/+1
printed chars [ Upstream commit 210e4c89ef61432040c6cd828fefa441f4887186 ] The 'ret' variable was initialized to zero but then it was not updated from the fprintf() return, fix it. Reported-by: Yang Li <yang.lee@linux.alibaba.com> cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> cc: Ingo Molnar <mingo@redhat.com> cc: Jiri Olsa <jolsa@redhat.com> cc: Mark Rutland <mark.rutland@arm.com> cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Fixes: 90f18e63fbd00513 ("perf symbols: List symbols in a dso in ascending name order") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-07perf ftrace: Fix access to pid in array when setting a pid filterThomas Richter1-1/+1
[ Upstream commit 671b60cb6a897a5b3832fe57657152f2c3995e25 ] Command 'perf ftrace -v -- ls' fails in s390 (at least 5.12.0rc6). The root cause is a missing pointer dereference which causes an array element address to be used as PID. Fix this by extracting the PID. Output before: # ./perf ftrace -v -- ls function_graph tracer is used write '-263732416' to tracing/set_ftrace_pid failed: Invalid argument failed to set ftrace pid # Output after: ./perf ftrace -v -- ls function_graph tracer is used # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 4) | rcu_read_lock_sched_held() { 4) 0.552 us | rcu_lockdep_current_cpu_online(); 4) 6.124 us | } Reported-by: Alexander Schmidt <alexschm@de.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/20210421120400.2126433-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-07perf data: Fix error return code in perf_data__create_dir()Zhen Lei1-2/+3
[ Upstream commit f2211881e737cade55e0ee07cf6a26d91a35a6fe ] Although 'ret' has been initialized to -1, but it will be reassigned by the "ret = open(...)" statement in the for loop. So that, the value of 'ret' is unknown when asprintf() failed. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210415083417.3740-1-thunder.leizhen@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-28perf map: Fix error return code in maps__clone()Zhen Lei1-2/+5
[ Upstream commit c6f87141254d16e281e4b4431af7316895207b8f ] Although 'err' has been initialized to -ENOMEM, but it will be reassigned by the "err = unwind__prepare_access(...)" statement in the for loop. So that, the value of 'err' is unknown when map__clone() failed. Fixes: 6c502584438bda63 ("perf unwind: Call unwind__prepare_access for forked thread") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: zhen lei <thunder.leizhen@huawei.com> Link: http://lore.kernel.org/lkml/20210415092744.3793-1-thunder.leizhen@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-28perf auxtrace: Fix potential NULL pointer dereferenceLeo Yan1-1/+1
[ Upstream commit b14585d9f18dc617e975815570fe836be656b1da ] In the function auxtrace_parse_snapshot_options(), the callback pointer "itr->parse_snapshot_options" can be NULL if it has not been set during the AUX record initialization. This can cause tool crashing if the callback pointer "itr->parse_snapshot_options" is dereferenced without performing NULL check. Add a NULL check for the pointer "itr->parse_snapshot_options" before invoke the callback. Fixes: d20031bb63dd6dde ("perf tools: Add AUX area tracing Snapshot Mode") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: http://lore.kernel.org/lkml/20210420151554.2031768-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-16perf map: Tighten snprintf() string precision to pass gcc check on some ↵Arnaldo Carvalho de Melo1-4/+3
32-bit arches commit 77d02bd00cea9f1a87afe58113fa75b983d6c23a upstream. Noticed on a debian:experimental mips and mipsel cross build build environment: perfbuilder@ec265a086e9b:~$ mips-linux-gnu-gcc --version | head -1 mips-linux-gnu-gcc (Debian 10.2.1-3) 10.2.1 20201224 perfbuilder@ec265a086e9b:~$ CC /tmp/build/perf/util/map.o util/map.c: In function 'map__new': util/map.c:109:5: error: '%s' directive output may be truncated writing between 1 and 2147483645 bytes into a region of size 4096 [-Werror=format-truncation=] 109 | "%s/platforms/%s/arch-%s/usr/lib/%s", | ^~ In file included from /usr/mips-linux-gnu/include/stdio.h:867, from util/symbol.h:11, from util/map.c:2: /usr/mips-linux-gnu/include/bits/stdio2.h:67:10: note: '__builtin___snprintf_chk' output 32 or more bytes (assuming 4294967321) into a destination of size 4096 67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 68 | __bos (__s), __fmt, __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Since we have the lenghts for what lands in that place, use it to give the compiler more info and make it happy. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-14perf report: Fix wrong LBR block sortingJin Yao1-3/+3
[ Upstream commit f2013278ae40b89cc27916366c407ce5261815ef ] When '--total-cycles' is specified, it supports sorting for all blocks by 'Sampled Cycles%'. This is useful to concentrate on the globally hottest blocks. 'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles But in current code, it doesn't use the cycles aggregation. Part of 'cycles' counting is possibly dropped for some overlap jumps. But for identifying the hot block, we always need the full cycles. # perf record -b ./triad_loop # perf report --total-cycles --stdio Before: # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object # ............... .............. ........... .......... ............................................................. ................. # 0.81% 793 4.32% 793 [setup-vdso.h:34 -> setup-vdso.h:40] ld-2.27.so 0.49% 480 0.87% 160 [native_write_msr+0 -> native_write_msr+16] [kernel.kallsyms] 0.48% 476 0.52% 95 [native_read_msr+0 -> native_read_msr+29] [kernel.kallsyms] 0.31% 303 1.65% 303 [nmi_restore+0 -> nmi_restore+37] [kernel.kallsyms] 0.26% 255 1.39% 255 [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162] [kernel.kallsyms] 0.24% 234 1.28% 234 [end_repeat_nmi+67 -> end_repeat_nmi+83] [kernel.kallsyms] 0.23% 227 1.24% 227 [__irqentry_text_end+96 -> __irqentry_text_end+126] [kernel.kallsyms] 0.20% 194 1.06% 194 [native_set_debugreg+52 -> native_set_debugreg+56] [kernel.kallsyms] 0.11% 106 0.14% 26 [native_sched_clock+0 -> native_sched_clock+98] [kernel.kallsyms] 0.10% 97 0.53% 97 [trigger_load_balance+0 -> trigger_load_balance+67] [kernel.kallsyms] 0.09% 85 0.46% 85 [get-dynamic-info.h:102 -> get-dynamic-info.h:111] ld-2.27.so ... 0.00% 92.7K 0.02% 4 [triad_loop.c:64 -> triad_loop.c:65] triad_loop The hottest block '[triad_loop.c:64 -> triad_loop.c:65]' is not at the top of output. After: # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object # ............... .............. ........... .......... .............................................................. ................. # 94.35% 92.7K 0.02% 4 [triad_loop.c:64 -> triad_loop.c:65] triad_loop 0.81% 793 4.32% 793 [setup-vdso.h:34 -> setup-vdso.h:40] ld-2.27.so 0.49% 480 0.87% 160 [native_write_msr+0 -> native_write_msr+16] [kernel.kallsyms] 0.48% 476 0.52% 95 [native_read_msr+0 -> native_read_msr+29] [kernel.kallsyms] 0.31% 303 1.65% 303 [nmi_restore+0 -> nmi_restore+37] [kernel.kallsyms] 0.26% 255 1.39% 255 [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162] [kernel.kallsyms] 0.24% 234 1.28% 234 [end_repeat_nmi+67 -> end_repeat_nmi+83] [kernel.kallsyms] 0.23% 227 1.24% 227 [__irqentry_text_end+96 -> __irqentry_text_end+126] [kernel.kallsyms] 0.20% 194 1.06% 194 [native_set_debugreg+52 -> native_set_debugreg+56] [kernel.kallsyms] 0.11% 106 0.14% 26 [native_sched_clock+0 -> native_sched_clock+98] [kernel.kallsyms] 0.10% 97 0.53% 97 [trigger_load_balance+0 -> trigger_load_balance+67] [kernel.kallsyms] 0.09% 85 0.46% 85 [get-dynamic-info.h:102 -> get-dynamic-info.h:111] ld-2.27.so 0.08% 82 0.06% 11 [intel_pmu_drain_pebs_nhm+580 -> intel_pmu_drain_pebs_nhm+627] [kernel.kallsyms] 0.08% 77 0.42% 77 [lru_add_drain_cpu+0 -> lru_add_drain_cpu+133] [kernel.kallsyms] 0.08% 74 0.10% 18 [handle_pmi_common+271 -> handle_pmi_common+310] [kernel.kallsyms] 0.08% 74 0.40% 74 [get-dynamic-info.h:131 -> get-dynamic-info.h:157] ld-2.27.so 0.07% 69 0.09% 17 [intel_pmu_drain_pebs_nhm+432 -> intel_pmu_drain_pebs_nhm+468] [kernel.kallsyms] Now the hottest block is reported at the top of output. Fixes: b65a7d372b1a55db ("perf hist: Support block formats with compare/sort/display") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210407024452.29988-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14perf inject: Fix repipe usageAdrian Hunter1-1/+1
[ Upstream commit 026334a3bb6a3919b42aba9fc11843db2b77fd41 ] Since commit 14d3d54052539a1e ("perf session: Try to read pipe data from file") 'perf inject' has started printing "PERFILE2h" when not processing pipes. The commit exposed perf to the possiblity that the input is not a pipe but the 'repipe' parameter gets used. That causes the printing because perf inject sets 'repipe' to true always. The 'repipe' parameter of perf_session__new() is used by 2 functions: - perf_file_header__read_pipe() - trace_report() In both cases, the functions copy data to STDOUT_FILENO when 'repipe' is true. Fix by setting 'repipe' to true only if the output is a pipe. Fixes: e558a5bd8b74aff4 ("perf inject: Work with files") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andrew Vagin <avagin@openvz.org> Link: http://lore.kernel.org/lkml/20210401103605.9000-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30perf synthetic events: Avoid write of uninitialized memory when generating ↵Ian Rogers1-4/+5
PERF_RECORD_MMAP* records [ Upstream commit 2a76f6de07906f0bb5f2a13fb02845db1695cc29 ] Account for alignment bytes in the zero-ing memset. Fixes: 1a853e36871b533c ("perf record: Allow specifying a pid to record") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210309234945.419254-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30perf auxtrace: Fix auxtrace queue conflictAdrian Hunter1-4/+0
[ Upstream commit b410ed2a8572d41c68bd9208555610e4b07d0703 ] The only requirement of an auxtrace queue is that the buffers are in time order. That is achieved by making separate queues for separate perf buffer or AUX area buffer mmaps. That generally means a separate queue per cpu for per-cpu contexts, and a separate queue per thread for per-task contexts. When buffers are added to a queue, perf checks that the buffer cpu and thread id (tid) match the queue cpu and thread id. However, generally, that need not be true, and perf will queue buffers correctly anyway, so the check is not needed. In addition, the check gets erroneously hit when using sample mode to trace multiple threads. Consequently, fix that case by removing the check. Fixes: e502789302a6 ("perf auxtrace: Add helpers for queuing AUX area tracing data") Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210308151143.18338-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-17perf report: Fix -F for branch & mem modesRavi Bangoria1-2/+2
commit 6740a4e70e5d1b9d8e7fe41fd46dd5656d65dadf upstream. perf report fails to add valid additional fields with -F when used with branch or mem modes. Fix it. Before patch: $ perf record -b $ perf report -b -F +srcline_from --stdio Error: Invalid --fields key: `srcline_from' After patch: $ perf report -b -F +srcline_from --stdio # Samples: 8K of event 'cycles' # Event count (approx.): 8784 ... Committer notes: There was an inversion: when looking at branch stack dimensions (keys) it was checking if the sort mode was 'mem', not 'branch'. Fixes: aa6b3c99236b ("perf report: Make -F more strict like -s") Reported-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20210304062958.85465-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-17perf traceevent: Ensure read cmdlines are null terminated.Ian Rogers1-0/+1
commit 137a5258939aca56558f3a23eb229b9c4b293917 upstream. Issue detected by address sanitizer. Fixes: cd4ceb63438e9e28 ("perf util: Save pid-cmdline mapping into tracing header") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210226221431.1985458-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-17perf build: Fix ccache usage in $(CC) when generating arch errno tableAntonio Terceiro1-1/+1
commit dacfc08dcafa7d443ab339592999e37bbb8a3ef0 upstream. This was introduced by commit e4ffd066ff440a57 ("perf: Normalize gcc parameter when generating arch errno table"). Assuming the first word of $(CC) is the actual compiler breaks usage like CC="ccache gcc": the script ends up calling ccache directly with gcc arguments, what fails. Instead of getting the first word, just remove from $(CC) any word that starts with a "-". This maintains the spirit of the original patch, while not breaking ccache users. Fixes: e4ffd066ff440a57 ("perf: Normalize gcc parameter when generating arch errno table") Signed-off-by: Antonio Terceiro <antonio.terceiro@linaro.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Zhe <zhe.he@windriver.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210224130046.346977-1-antonio.terceiro@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04perf test: Fix unaligned access in sample parsing testNamhyung Kim1-1/+1
[ Upstream commit c5c97cadd7ed13381cb6b4bef5c841a66938d350 ] The ubsan reported the following error. It was because sample's raw data missed u32 padding at the end. So it broke the alignment of the array after it. The raw data contains an u32 size prefix so the data size should have an u32 padding after 8-byte aligned data. 27: Sample parsing :util/synthetic-events.c:1539:4: runtime error: store to misaligned address 0x62100006b9bc for type '__u64' (aka 'unsigned long long'), which requires 8 byte alignment 0x62100006b9bc: note: pointer points here 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13 #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8 #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9 #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9 #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9 #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4 #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9 #7 0x56153264e796 in run_builtin perf.c:312:11 #8 0x56153264cf03 in handle_internal_command perf.c:364:8 #9 0x56153264e47d in run_argv perf.c:408:2 #10 0x56153264c9a9 in main perf.c:538:3 #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc) #12 0x561532596828 in _start ... SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use util/synthetic-events.c:1539:4 in Fixes: 045f8cd8542d ("perf tests: Add a sample parsing test") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf intel-pt: Fix IPC with CYC thresholdAdrian Hunter3-0/+41
[ Upstream commit 6af4b60033e0ce0332fcdf256c965ad41942821a ] The code assumed every CYC-eligible packet has a CYC packet, which is not the case when CYC thresholds are used. Fix by checking if a CYC packet is actually present in that case. Fixes: 5b1dc0fd1da06 ("perf intel-pt: Add support for samples to contain IPC ratio") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210205175350.23817-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf intel-pt: Fix premature IPCAdrian Hunter3-11/+17
[ Upstream commit 20aa39708a5999b7921b27482a756766272286ac ] The code assumed a change in cycle count means accurate IPC. That is not correct, for example when sampling both branches and instructions, or at a FUP packet (which is not CYC-eligible) address. Fix by using an explicit flag to indicate when IPC can be sampled. Fixes: 5b1dc0fd1da06 ("perf intel-pt: Add support for samples to contain IPC ratio") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/20210205175350.23817-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf intel-pt: Fix missing CYC processing in PSBAdrian Hunter1-0/+3
[ Upstream commit 03fb0f859b45d1eb05c984ab4bd3bef67e45ede2 ] Add missing CYC packet processing when walking through PSB+. This improves the accuracy of timestamps that follow PSB+, until the next MTC. Fixes: 3d49807870f08 ("perf tools: Add new Intel PT packet definitions") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210205175350.23817-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf record: Fix continue profiling after draining the bufferYang Jihong3-1/+13
[ Upstream commit e16c2ce7c5ed5de881066c1fd10ba5c09af69559 ] Commit da231338ec9c0987 ("perf record: Use an eventfd to wakeup when done") uses eventfd() to solve a rare race where the setting and checking of 'done' which add done_fd to pollfd. When draining buffer, revents of done_fd is 0 and evlist__filter_pollfd function returns a non-zero value. As a result, perf record does not stop profiling. The following simple scenarios can trigger this condition: # sleep 10 & # perf record -p $! After the sleep process exits, perf record should stop profiling and exit. However, perf record keeps running. If pollfd revents contains only POLLERR or POLLHUP, perf record indicates that buffer is draining and need to stop profiling. Use fdarray_flag__nonfilterable() to set done eventfd to nonfilterable objects, so that evlist__filter_pollfd() does not filter and check done eventfd. Fixes: da231338ec9c0987 ("perf record: Use an eventfd to wakeup when done") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: zhangjinhao2@huawei.com Link: http://lore.kernel.org/lkml/20210205065001.23252-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf symbols: Fix return value when loading PE DSONicholas Fraser1-1/+3
[ Upstream commit 77771a97011fa9146ccfaf2983a3a2885dc57b6f ] The first time dso__load() was called on a PE file it always returned -1 error. This caused the first call to map__find_symbol() to always fail on a PE file so the first sample from each PE file always had symbol <unknown>. Subsequent samples succeed however because the DSO is already loaded. This fixes dso__load() to return 0 when successfully loading a DSO with libbfd. Fixes: eac9a4342e5447ca ("perf symbols: Try reading the symbol table with libbfd") Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Huw Davies <huw@codeweavers.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Remi Bernon <rbernon@codeweavers.com> Cc: Song Liu <songliubraving@fb.com> Cc: Tommi Rantala <tommi.t.rantala@nokia.com> Cc: Ulrich Czekalla <uczekalla@codeweavers.com> Link: http://lore.kernel.org/lkml/1671b43b-09c3-1911-dbf8-7f030242fbf7@codeweavers.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf symbols: Use (long) for iterator for bfd symbolsDmitry Safonov1-2/+1
[ Upstream commit 96de68fff5ded8833bf5832658cb43c54f86ff6c ] GCC (GCC) 8.4.0 20200304 fails to build perf with: : util/symbol.c: In function 'dso__load_bfd_symbols': : util/symbol.c:1626:16: error: comparison of integer expressions of different signednes : for (i = 0; i < symbols_count; ++i) { : ^ : util/symbol.c:1632:16: error: comparison of integer expressions of different signednes : while (i + 1 < symbols_count && : ^ : util/symbol.c:1637:13: error: comparison of integer expressions of different signednes : if (i + 1 < symbols_count && : ^ : cc1: all warnings being treated as errors It's unlikely that the symtable will be that big, but the fix is an oneliner and as perf has CORE_CFLAGS += -Wextra, which makes build to fail together with CORE_CFLAGS += -Werror Fixes: eac9a4342e54 ("perf symbols: Try reading the symbol table with libbfd") Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Jacek Caban <jacek@codeweavers.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Remi Bernon <rbernon@codeweavers.com> Link: http://lore.kernel.org/lkml/20210209145148.178702-1-dima@arista.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf vendor events arm64: Fix Ampere eMag event typoJohn Garry1-1/+1
[ Upstream commit 2bf797be81fa808f05f1a7a65916619132256a27 ] The "briefdescription" for event 0x35 has a typo - fix it. Fixes: d35c595bf005 ("perf vendor events arm64: Revise core JSON events for eMAG") Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@openeuler.org Link: https://lore.kernel.org/r/1611835236-34696-2-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04perf tools: Fix DSO filtering when not finding a map for a sampled addressArnaldo Carvalho de Melo1-0/+2
[ Upstream commit c69bf11ad3d30b6bf01cfa538ddff1a59467c734 ] When we lookup an address and don't find a map we should filter that sample if the user specified a list of --dso entries to filter on, fix it. Before: $ perf script sleep 274800 2843.556162: 1 cycles:u: ffffffffbb26bff4 [unknown] ([unknown]) sleep 274800 2843.556168: 1 cycles:u: ffffffffbb2b047d [unknown] ([unknown]) sleep 274800 2843.556171: 1 cycles:u: ffffffffbb2706b2 [unknown] ([unknown]) sleep 274800 2843.556174: 6 cycles:u: ffffffffbb2b0267 [unknown] ([unknown]) sleep 274800 2843.556176: 59 cycles:u: ffffffffbb2b03b1 [unknown] ([unknown]) sleep 274800 2843.556180: 691 cycles:u: ffffffffbb26bff4 [unknown] ([unknown]) sleep 274800 2843.556189: 9160 cycles:u: 7fa9550eeaa3 __GI___tunables_init+0xf3 (/usr/lib64/ld-2.32.so) sleep 274800 2843.556312: 86937 cycles:u: 7fa9550e157b _dl_lookup_symbol_x+0x4b (/usr/lib64/ld-2.32.so) $ So we have some samples we somehow didn't find in a map for, if we now do: $ perf report --stdio --dso /usr/lib64/ld-2.32.so # dso: /usr/lib64/ld-2.32.so # # Total Lost Samples: 0 # # Samples: 8 of event 'cycles:u' # Event count (approx.): 96856 # # Overhead Command Symbol # ........ ....... ........................ # 89.76% sleep [.] _dl_lookup_symbol_x 9.46% sleep [.] __GI___tunables_init 0.71% sleep [k] 0xffffffffbb26bff4 0.06% sleep [k] 0xffffffffbb2b03b1 0.01% sleep [k] 0xffffffffbb2b0267 0.00% sleep [k] 0xffffffffbb2706b2 0.00% sleep [k] 0xffffffffbb2b047d $ After this patch we get the right output with just entries for the DSOs specified in --dso: $ perf report --stdio --dso /usr/lib64/ld-2.32.so # dso: /usr/lib64/ld-2.32.so # # Total Lost Samples: 0 # # Samples: 8 of event 'cycles:u' # Event count (approx.): 96856 # # Overhead Command Symbol # ........ ....... ........................ # 89.76% sleep [.] _dl_lookup_symbol_x 9.46% sleep [.] __GI___tunables_init $ # Fixes: 96415e4d3f5fdf9c ("perf symbols: Avoid unnecessary symbol loading when dso list is specified") Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210128131209.GD775562@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-30tools: Factor HOSTCC, HOSTLD, HOSTAR definitionsJean-Philippe Brucker1-4/+0
commit c8a950d0d3b926a02c7b2e713850d38217cec3d1 upstream. Several Makefiles in tools/ need to define the host toolchain variables. Move their definition to tools/scripts/Makefile.include Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/bpf/20201110164310.2600671-2-jean-philippe@linaro.org Cc: Alistair Delva <adelva@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19perf intel-pt: Fix 'CPU too large' errorAdrian Hunter2-3/+3
commit 5501e9229a80d95a1ea68609f44c447a75d23ed5 upstream. In some cases, the number of cpus (nr_cpus_online) is confused with the maximum cpu number (nr_cpus_avail), which results in the error in the example below: Example on system with 8 cpus: Before: # echo 0 > /sys/devices/system/cpu/cpu2/online # ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.147 MB perf.data ] # ./perf script --itrace=e Requested CPU 7 too large. Consider raising MAX_NR_CPUS 0x25908 [0x8]: failed to process type: 68 [Invalid argument] After: # ./perf script --itrace=e # Fixes: 8c7274691f0d ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online") Fixes: 7df4e36a4785 ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30perf probe: Fix memory leak when synthesizing SDT probesArnaldo Carvalho de Melo1-3/+10
[ Upstream commit 5149303fdfe5c67ddb51c911e23262f781cd75eb ] The argv_split() function must be paired with argv_free(), else we must keep a reference to the argv array received or do the freeing ourselves, in synthesize_sdt_probe_command() we were simply leaking that argv[] array. Fixes: 3b1f8311f6963cd1 ("perf probe: Add sdt probes arguments into the uprobe cmd string") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Alexis Ber