| Age | Commit message (Collapse) | Author | Files | Lines |
|
When generating callstack information from branch_stack(Intel LBR), the
actual number of callstack entry should be bigger than the number of
branch_stack, for example:
branch_stack records:
B() -> C()
A() -> B()
converted callstack records should be:
C()
B()
A()
though, the number of callstack equals
to the number of branch stack plus 1.
This patch fixes above issue in branch_stack__printf(). For example,
# echo 'scale=2000; 4*a(1)' > cmd
# perf record --call-graph lbr bc -l < cmd
Before applying this patch, `perf script -D` output:
1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
... LBR call chain: nr:8
..... 0: fffffffffffffe00
..... 1: 000000000040a410
..... 2: 000000000040573c
..... 3: 0000000000408650
..... 4: 00000000004022f2
..... 5: 00000000004015f5
..... 6: 00007f5ed6dcb553
..... 7: 0000000000401698
... FP chain: nr:2
..... 0: fffffffffffffe00
..... 1: 000000000040a6d8
... branch callstack: nr:6 # which is not consistent with LBR records.
..... 0: 000000000040a410
..... 1: 0000000000408650 # ditto
..... 2: 00000000004022f2
..... 3: 00000000004015f5
..... 4: 00007f5ed6dcb553
..... 5: 0000000000401698
... thread: bc:17990
...... dso: /usr/bin/bc
bc 17990 1220022.677386: 894172 cycles:
40a410 [unknown] (/usr/bin/bc)
40573c [unknown] (/usr/bin/bc)
408650 [unknown] (/usr/bin/bc)
4022f2 [unknown] (/usr/bin/bc)
4015f5 [unknown] (/usr/bin/bc)
7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
401698 [unknown] (/usr/bin/bc)
After applied:
1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
... LBR call chain: nr:8
..... 0: fffffffffffffe00
..... 1: 000000000040a410
..... 2: 000000000040573c
..... 3: 0000000000408650
..... 4: 00000000004022f2
..... 5: 00000000004015f5
..... 6: 00007f5ed6dcb553
..... 7: 0000000000401698
... FP chain: nr:2
..... 0: fffffffffffffe00
..... 1: 000000000040a6d8
... branch callstack: nr:7
..... 0: 000000000040a410
..... 1: 000000000040573c
..... 2: 0000000000408650
..... 3: 00000000004022f2
..... 4: 00000000004015f5
..... 5: 00007f5ed6dcb553
..... 6: 0000000000401698
... thread: bc:17990
...... dso: /usr/bin/bc
bc 17990 1220022.677386: 894172 cycles:
40a410 [unknown] (/usr/bin/bc)
40573c [unknown] (/usr/bin/bc)
408650 [unknown] (/usr/bin/bc)
4022f2 [unknown] (/usr/bin/bc)
4015f5 [unknown] (/usr/bin/bc)
7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
401698 [unknown] (/usr/bin/bc)
Change from v1:
- refined code style according to Jiri's review comments.
Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220517015726.96131-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Perf BPF filter test fails in environment where "clang" is not
installed.
Test failure logs:
<<>>
42: BPF filter :
42.1: Basic BPF filtering : Skip
42.2: BPF pinning : FAILED!
42.3: BPF prologue generation : FAILED!
<<>>
Enabling verbose option provided debug logs which says clang/llvm needs
to be installed. Snippet of verbose logs:
<<>>
42.2: BPF pinning :
--- start ---
test child forked, pid 61423
ERROR: unable to find clang.
Hint: Try to install latest clang/llvm to support BPF.
Check your $PATH
<<logs_here>>
Failed to compile test case: 'Basic BPF llvm compile'
Unable to get BPF object, fix kbuild first
test child finished with -1
---- end ----
BPF filter subtest 2: FAILED!
<<>>
Here subtests, "BPF pinning" and "BPF prologue generation" failed and
logs shows clang/llvm is needed. After installing clang, testcase
passes.
Reason on why subtest failure happens though logs has proper debug
information:
Main function __test__bpf calls test_llvm__fetch_bpf_obj by
passing 4th argument as true ( 4th arguments maps to parameter
"force" in test_llvm__fetch_bpf_obj ). But this will cause
test_llvm__fetch_bpf_obj to skip the check for clang/llvm.
Snippet of code part which checks for clang based on
parameter "force" in test_llvm__fetch_bpf_obj:
<<>>
if (!force && (!llvm_param.user_set_param &&
<<>>
Since force is set to "false", test won't get skipped and fails to
compile test case. The BPF code compilation needs clang, So pass the
fourth argument as "false" and also skip the test if reason for return
is "TEST_SKIP"
After the patch:
<<>>
42: BPF filter :
42.1: Basic BPF filtering : Skip
42.2: BPF pinning : Skip
42.3: BPF prologue generation : Skip
<<>>
Fixes: ba1fae431e74bb42 ("perf test: Add 'perf test BPF'")
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lore.kernel.org/r/20220511115438.84032-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The session topology test fails in powerpc pSeries platform.
Test logs:
<<>>
Session topology : FAILED!
<<>>
This testcases tests cpu topology by checking the core_id and socket_id
stored in perf_env from perf session. The data from perf session is
compared with the cpu topology information from
"/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id.
In case of virtual environment, detail like physical_package_id is
restricted to be exposed. Hence physical_package_id is set to -1. The
testcase fails on such platforms since socket_id can't be fetched from
topology info.
Skip the testcase in powerpc if physical_package_id returns -1.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>---
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220511114959.84002-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The compilation on s390 results in this error:
# make DEBUG=y bench/numa.o
...
bench/numa.c: In function ‘__bench_numa’:
bench/numa.c:1749:81: error: ‘%d’ directive output may be truncated
writing between 1 and 11 bytes into a region of size between
10 and 20 [-Werror=format-truncation=]
1749 | snprintf(tname, sizeof(tname), "process%d:thread%d", p, t);
^~
...
bench/numa.c:1749:64: note: directive argument in the range
[-2147483647, 2147483646]
...
#
The maximum length of the %d replacement is 11 characters because of the
negative sign. Therefore extend the array by two more characters.
Output after:
# make DEBUG=y bench/numa.o > /dev/null 2>&1; ll bench/numa.o
-rw-r--r-- 1 root root 418320 May 19 09:11 bench/numa.o
#
Fixes: 3aff8ba0a4c9c919 ("perf bench numa: Avoid possible truncation when using snprintf()")
Suggested-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520081158.2990006-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
for_each_shell_test() is already strict in expecting tests to be files
and executable. It is sometimes possible when it iterates over all files
that it finds one that is executable and lacks a newline character. When
this happens the loop never terminates as it doesn't check for EOF.
Add the EOF check to make this loop at least bounded by the file size.
If the description is returned as NULL then also skip the test.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220517204144.645913-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The X86 specific arch__intr_reg_mask() is to check whether the kernel
and hardware can collect XMM registers. But it doesn't work on some
hybrid platform.
Without the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15
The config of the test event doesn't contain the PMU information. The
kernel may fail to initialize it on the correct hybrid PMU and return
the wrong non-supported information.
Add the PMU information into the config for the hybrid platform. The
same register set is supported among different hybrid PMUs. Checking
the first available one is good enough.
With the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Fixes: 6466ec14aaf44ff1 ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
"perf all PMU test" picks the input events from "perf list --raw-dump
pmu" list and runs "perf stat -e" for each of the event in the list. In
case of powerpc, the PowerVM environment supports events from hv_24x7
and hv_gpci PMU which is of example format like below:
- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/
The value for "?" needs to be filled in depending on system and
respective event. CPM_ADJUNCT_INST needs have core value and domain
value. hv_gpci event needs partition_id. Similarly, there are other
events for hv_24x7 and hv_gpci having "?" in event format. Hence skip
these events on powerpc platform since values like partition_id, domain
is specific to system and event.
Fixes: 3d5ac9effcc640d5 ("perf test: Workload test of all PMUs")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520101236.17249-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Stat events can come from disk and so need a degree of validation. They
contain a CPU which needs looking up via CPU map to access a counter.
Add the CPU to index translation, alongside validity checking.
Discussion thread:
https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/
Fixes: 7ac0089d138f80dc ("perf evsel: Pass cpu not cpu map index to synthesize")
Reported-by: Michael Petlan <mpetlan@redhat.com>
Suggested-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20220519032005.1273691-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Avi Kivity reported a problem where the __weak
btf__load_from_kernel_by_id() in tools/perf/util/bpf-event.c was being
used and it called btf__get_from_id() in tools/lib/bpf/btf.c that in
turn called back to btf__load_from_kernel_by_id(), resulting in an
endless loop.
Fix this by adding a feature test to check if
btf__load_from_kernel_by_id() is available when building perf with
LIBBPF_DYNAMIC=1, and if not then provide the fallback to the old
btf__get_from_id(), that doesn't call back to btf__load_from_kernel_by_id()
since at that time it didn't exist at all.
Tested on Fedora 35 where we have libbpf-devel 0.4.0 with LIBBPF_DYNAMIC
where we don't have btf__load_from_kernel_by_id() and thus its feature
test fail, not defining HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID:
$ cat /tmp/build/perf-urgent/feature/test-libbpf-btf__load_from_kernel_by_id.make.output
test-libbpf-btf__load_from_kernel_by_id.c: In function ‘main’:
test-libbpf-btf__load_from_kernel_by_id.c:6:16: error: implicit declaration of function ‘btf__load_from_kernel_by_id’ [-Werror=implicit-function-declaration]
6 | return btf__load_from_kernel_by_id(20151128, NULL);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
$
$ nm /tmp/build/perf-urgent/perf | grep btf__load_from_kernel_by_id
00000000005ba180 T btf__load_from_kernel_by_id
$
$ objdump --disassemble=btf__load_from_kernel_by_id -S /tmp/build/perf-urgent/perf
/tmp/build/perf-urgent/perf: file format elf64-x86-64
<SNIP>
00000000005ba180 <btf__load_from_kernel_by_id>:
#include "record.h"
#include "util/synthetic-events.h"
#ifndef HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID
struct btf *btf__load_from_kernel_by_id(__u32 id)
{
5ba180: 55 push %rbp
5ba181: 48 89 e5 mov %rsp,%rbp
5ba184: 48 83 ec 10 sub $0x10,%rsp
5ba188: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
5ba18f: 00 00
5ba191: 48 89 45 f8 mov %rax,-0x8(%rbp)
5ba195: 31 c0 xor %eax,%eax
struct btf *btf;
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
int err = btf__get_from_id(id, &btf);
5ba197: 48 8d 75 f0 lea -0x10(%rbp),%rsi
5ba19b: e8 a0 57 e5 ff call 40f940 <btf__get_from_id@plt>
5ba1a0: 89 c2 mov %eax,%edx
#pragma GCC diagnostic pop
return err ? ERR_PTR(err) : btf;
5ba1a2: 48 98 cltq
5ba1a4: 85 d2 test %edx,%edx
5ba1a6: 48 0f 44 45 f0 cmove -0x10(%rbp),%rax
}
<SNIP>
Fixes: 218e7b775d368f38 ("perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions")
Reported-by: Avi Kivity <avi@scylladb.com>
Link: https://lore.kernel.org/linux-perf-users/f0add43b-3de5-20c5-22c4-70aff4af959f@scylladb.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/linux-perf-users/YobjjFOblY4Xvwo7@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently the `perf test` always fails the coresight test like:
89: Check Arm CoreSight trace data recording and synthesized samples: FAILED!
That is because the test_arm_coresight.sh is attempting to SIGINT the
parent but is using $$ rather than $PPID and it sigint's itself when
run under the perf test framework.
Since this is done in a trap clause it ends up returning a non zero
return.
Since $PPID is a bash ism and not all distros are linking /bin/sh to
bash, the alternative parent pid lookups are uglier than just dropping
the kill, and its not strictly needed, lets pick the simple solution and
drop the sigint.
Fixes: 133fe2e617e48ca0 ("perf tests: Improve temp file cleanup in test_arm_coresight.sh")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/r/20220428151947.290146-1-jeremy.linton@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
BUG_ON is a no-op if NDEBUG is defined, otherwise it is an assert.
Compiling with NDEBUG yields:
bench/numa.c: In function ‘bind_to_cpu’:
bench/numa.c:314:1: error: control reaches end of non-void function [-Werror=return-type]
314 | }
| ^
bench/numa.c: In function ‘bind_to_node’:
bench/numa.c:367:1: error: control reaches end of non-void function [-Werror=return-type]
367 | }
| ^
Add return statements to cover this case.
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428202912.1056444-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now the generic code can handle kallsyms fixup properly so no need to
keep the arch-functions anymore.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now arch-specific functions all do the same thing. When it fixes the
symbol address it needs to check the boundary between the kernel image
and modules. For the last symbol in the previous region, it cannot
know the exact size as it's discarded already. Thus it just uses a
small page size (4096) and rounds it up like the last symbol.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The symbol fixup is necessary for symbols in kallsyms since they don't
have size info. So we use the next symbol's address to calculate the
size. Now it's also used for user binaries because sometimes they miss
size for hand-written asm functions.
There's a arch-specific function to handle kallsyms differently but
currently it cannot distinguish kallsyms from others. Pass this
information explicitly to handle it properly. Note that those arch
functions will be moved to the generic function so I didn't added it to
the arch-functions.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adds a perf_event_attr test for Arm SPE in which the presence of
physical addresses are checked when SPE unit is run with pa_enable=1.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-4-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch corrects a bug whereby SPE collection is invoked with
pa_enable=1 but synthesized events fail to show physical addresses.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch corrects a bug whereby synthesized events from SPE
samples are missing virtual addresses.
Fixes: 54f7815efef7fad9 ("perf arm-spe: Fill address info for samples")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: bpf@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Intel PT does not capture data in separate directories, so do not
use separate directory processing because it doesn't work for
timeless decoding. It also looks like it doesn't support one_mmap
handling.
Example:
Before:
# perf record --kcore -a -e intel_pt/tsc=0/k sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.799 MB perf.data ]
# perf script --itrace=bep | head
#
After:
# perf script --itrace=bep | head
perf 21073 [000] psb: psb offs: 0 ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms])
perf 21073 [000] cbr: cbr: 45 freq: 4505 MHz (161%) ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: 0 [unknown] ([unknown]) => ffffffffaa68faf6 native_write_msr+0x6 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa68faf8 native_write_msr+0x8 ([kernel.kallsyms]) => ffffffffaa61aab0 pt_config_start+0x60 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa61aabd pt_config_start+0x6d ([kernel.kallsyms]) => ffffffffaa61b8ad pt_event_start+0x27d ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa61b8bb pt_event_start+0x28b ([kernel.kallsyms]) => ffffffffaa61ba60 pt_event_add+0x40 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa61ba76 pt_event_add+0x56 ([kernel.kallsyms]) => ffffffffaa880e86 event_sched_in+0xc6 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa880e9b event_sched_in+0xdb ([kernel.kallsyms]) => ffffffffaa880ea5 event_sched_in+0xe5 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa880eba event_sched_in+0xfa ([kernel.kallsyms]) => ffffffffaa880f96 event_sched_in+0x1d6 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa880fc8 event_sched_in+0x208 ([kernel.kallsyms]) => ffffffffaa880ec0 event_sched_in+0x100 ([kernel.kallsyms])
Fixes: bb6be405c4a2a5 ("perf session: Load data directory files for analysis")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220428093109.274641-1-adrian.hunter@intel.com
Cc: Ian Rogers <irogers@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org
|
|
Test case 71 'Convert perf time to TSC' is not supported on s390.
Subtest 71.1 is skipped with the correct message, but subtest 71.2 is
not skipped and fails.
The root cause is function evlist__open() called from
test__perf_time_to_tsc(). evlist__open() returns -ENOENT because the
event cycles:u is not supported by the selected PMU, for example
platform s390 on z/VM or an x86_64 virtual machine.
The PMU driver returns -ENOENT in this case. This error is leads to the
failure.
Fix this by returning TEST_SKIP on -ENOENT.
Output before:
71: Convert perf time to TSC:
71.1: TSC support: Skip (This architecture does not support)
71.2: Perf time to TSC: FAILED!
Output after:
71: Convert perf time to TSC:
71.1: TSC support: Skip (This architecture does not support)
71.2: Perf time to TSC: Skip (perf_read_tsc_conversion is not supported)
This also happens on an x86_64 virtual machine:
# uname -m
x86_64
$ ./perf test -F 71
71: Convert perf time to TSC :
71.1: TSC support : Ok
71.2: Perf time to TSC : FAILED!
$
Committer testing:
Continues to work on x86_64:
$ perf test 71
71: Convert perf time to TSC :
71.1: TSC support : Ok
71.2: Perf time to TSC : Ok
$
Fixes: 290fa68bdc458863 ("perf test tsc: Fix error message when not supported")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chengdong Li <chengdongli@tencent.com>
Cc: chengdongli@tencent.com
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220420062921.1211825-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since commit bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't report result if the PERF_SAMPLE_DATA_SRC bit is missed in sample
type.
The commit ffab487052054162 ("perf: arm-spe: Fix perf report
--mem-mode") partially fixes the issue. It adds PERF_SAMPLE_DATA_SRC
bit for Arm SPE event, this allows the perf data file generated by
kernel v5.18-rc1 or later version can be reported properly.
On the other hand, perf tool still fails to be backward compatibility
for a data file recorded by an older version's perf which contains Arm
SPE trace data. This patch is a workaround in reporting phase, when
detects ARM SPE PMU event and without PERF_SAMPLE_DATA_SRC bit, it will
force to set the bit in the sample type and give a warning info.
Fixes: bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If use command 'perf script -F,+data_src' to dump memory samples with
Arm SPE trace data, it reports error:
# perf script -F,+data_src
Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field.
This is because the 'dummy:u' event is absent DATA_SRC bit in its sample
type, so if a file contains AUX area tracing data then always allow
field 'data_src' to be selected as an option for perf script.
Fixes: e55ed3423c1bb29f ("perf arm-spe: Synthesize memory event")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The header TargetRegistry.h has moved in LLVM/clang 14.
Committer notes:
The problem as noticed when building in ubuntu:22.04:
90 98.61 ubuntu:22.04 : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1)
util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory
23 | #include "llvm/Support/TargetRegistry.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
Fixed after applying this patch.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088
Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'perf bench numa' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench numa mem -p 1 -t 3 -P 512 -s 100 -zZ0qcm --thp 1
Snippet of code:
<<>>
perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
Aborted (core dumped)
<<>>
bind_to_node() uses "sched_getaffinity" to save the original cpumask and
this call is returning EINVAL ((invalid argument).
This happens because the default mask size in glibc is 1024. To
overcome this 1024 CPUs mask size limitation of cpu_set_t, change the
mask size using the CPU_*_S macros ie, use CPU_ALLOC to allocate
cpumask, CPU_ALLOC_SIZE for size.
Apart from fixing this for "orig_mask", apply same logic to "mask" as
well which is used to setaffinity so that mask size is large enough to
represent number of possible CPU's in the system.
sched_getaffinity is used in one more place in perf numa bench. It is in
"bind_to_cpu" function. Apply the same logic there also. Though
currently no failure is reported from there, it is ideal to change
getaffinity to work with such system configurations having CPU's more
than default mask size supported by glibc.
Also fix "sched_setaffinity" to use mask size which is large enough to
represent number of possible CPU's in the system.
Fixed all places where "bind_cpumask" which is part of "struct
thread_data" is used such that bind_cpumask works in all configuration.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Perf numa bench test fails with error:
Testcase:
./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk
Failure snippet:
<<>>
Running 'numa/mem' benchmark:
# Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk"
perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed.
<<>>
The Testcases uses CPU's 0 and 8. In function "parse_setup_cpu_list",
There is check to see if cpu number is greater than max cpu's possible
in the system ie via "if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >=
g->p.nr_cpus) {".
But it could happen that system has say 48 CPU's, but only number of
online CPU's is 0-7. Other CPU's are offlined. Since "g->p.nr_cpus" is
48, so function will go ahead and set bit for CPU 8 also in cpumask (
td->bind_cpumask).
bind_to_cpumask function is called to set affinity using
sched_setaffinity and the cpumask. Since the CPU8 is not present, set
affinity will fail here with EINVAL.
Fix this issue by adding a check to make sure that, CPU's provided in
the input argument values are online before proceeding further and skip
the test. For this, include new helper function "is_cpu_online" in
"tools/perf/util/header.c".
Since "BIT(x)" definition will get included from header.h, remove
that from bench/numa.c
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Per-thread mode doesn't have specific CPUs for events, add checks for
this case.
Minor fix to a pr_debug by Ian Rogers <irogers@google.com> to avoid an
out of bound array access.
Fixes: 7954f71689f90cb2 ("perf record: Introduce thread affinity and mmap masks")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexey Bayduraev <alexey.bayduraev@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220414014642.3308206-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
hashmap__new() returns ERR_PTR(-ENOMEM) when it fails, so we should use
IS_ERR() to check it in error handling path.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220413093302.2538128-1-lv.ruyi@zte.com.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix incorrect debug message:
Attempting to add event pmu 'intel_pt' with '' that may result in
non-fatal errors
which always appears with perf record -vv and intel_pt e.g.
perf record -vv -e intel_pt//u uname
The message is incorrect because there will never be non-fatal errors.
Suppress the message if the PMU is 'selectable' i.e. meant to be
selected directly as an event.
Fixes: 4ac22b484d4c79e8 ("perf parse-events: Make add PMU verbose output clearer")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
output
If objdump writes to stderr it can block waiting for it to be read. As
perf doesn't read stderr then progress stops with perf waiting for
stdout output.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Lexi Shao <shaolexi@huawei.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220407230503.1265036-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The `perf --list-cmds` output prints only internal commands, although
there is no reason for that from users' perspective.
Adding the external commands to commands array with NULL function
pointer allows printing all perf commands while not changing the logic
of command handler selection.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If a perf event doesn't fit into remaining buffer space return NULL to
remap buf and fetch the event again.
Keep the logic to error out on inadequate input from fuzzing.
This fixes perf failing on ChromeOS (with 32b userspace):
$ perf report -v -i perf.data
...
prefetch_event: head=0x1fffff8 event->header_size=0x30, mmap_size=0x2000000: fuzzed or compressed perf.data?
Error:
failed to process sample
Fixes: 57fc032ad643ffd0 ("perf session: Avoid infinite loop when seeing invalid header.size")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Denis Nikitin <denik@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220330031130.2152327-1-denik@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|