| Age | Commit message (Collapse) | Author | Files | Lines |
|
The exit function fixes a memory leak with the src field as detected by
leak sanitizer. An example of which is:
Indirect leak of 25133184 byte(s) in 207 object(s) allocated from:
#0 0x7f199ecfe987 in __interceptor_calloc libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x55defe638224 in annotated_source__alloc_histograms util/annotate.c:803
#2 0x55defe6397e4 in symbol__hists util/annotate.c:952
#3 0x55defe639908 in symbol__inc_addr_samples util/annotate.c:968
#4 0x55defe63aa29 in hist_entry__inc_addr_samples util/annotate.c:1119
#5 0x55defe499a79 in hist_iter__report_callback tools/perf/builtin-report.c:182
#6 0x55defe7a859d in hist_entry_iter__add util/hist.c:1236
#7 0x55defe49aa63 in process_sample_event tools/perf/builtin-report.c:315
#8 0x55defe731bc8 in evlist__deliver_sample util/session.c:1473
#9 0x55defe731e38 in machines__deliver_event util/session.c:1510
#10 0x55defe732a23 in perf_session__deliver_event util/session.c:1590
#11 0x55defe72951e in ordered_events__deliver_event util/session.c:183
#12 0x55defe740082 in do_flush util/ordered-events.c:244
#13 0x55defe7407cb in __ordered_events__flush util/ordered-events.c:323
#14 0x55defe740a61 in ordered_events__flush util/ordered-events.c:341
#15 0x55defe73837f in __perf_session__process_events util/session.c:2390
#16 0x55defe7385ff in perf_session__process_events util/session.c:2420
...
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211112035124.94327-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Right now, when Line numbers are displayed, one can't easily find a
source file that the line corresponds to.
When a source line is selected and 'l' is pressed, full source file
location is displayed in perf UI footer line. The hotkey works only for
source code lines.
Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/25a6384f-d862-5dda-4fec-8f0555599c75@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Considering the following testcase:
int
foo(int a, int b)
{
for (unsigned i = 0; i < 1000000000; i++)
a += b;
return a;
}
int main()
{
foo (3, 4);
return 0;
}
'perf annotate' displays:
86.52 │40055e: → ja 40056c <foo(int, int)+0x26>
13.37 │400560: mov -0x18(%rbp),%eax
│400563: add %eax,-0x14(%rbp)
│400566: addl $0x1,-0x4(%rbp)
0.11 │40056a: → jmp 400557 <foo(int, int)+0x11>
│40056c: mov -0x14(%rbp),%eax
│40056f: pop %rbp
and the 'ja 40056c' does not link to the location in the function. It's
caused by fact that comma is wrongly parsed, it's part of function
signature.
With my patch I see:
86.52 │ ┌──ja 26
13.37 │ │ mov -0x18(%rbp),%eax
│ │ add %eax,-0x14(%rbp)
│ │ addl $0x1,-0x4(%rbp)
0.11 │ │↑ jmp 11
│26:└─→mov -0x14(%rbp),%eax
and 'o' output prints:
86.52 │4005┌── ↓ ja 40056c <foo(int, int)+0x26>
13.37 │4005│0: mov -0x18(%rbp),%eax
│4005│3: add %eax,-0x14(%rbp)
│4005│6: addl $0x1,-0x4(%rbp)
0.11 │4005│a: ↑ jmp 400557 <foo(int, int)+0x11>
│4005└─→ mov -0x14(%rbp),%eax
On the contrary, compiling the very same file with gcc -x c, the parsing
is fine because function arguments are not displayed:
jmp 400543 <foo+0x1d>
Committer testing:
Before:
$ cat cpp_args_annotate.c
int
foo(int a, int b)
{
for (unsigned i = 0; i < 1000000000; i++)
a += b;
return a;
}
int main()
{
foo (3, 4);
return 0;
}
$ gcc --version |& head -1
gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
$ gcc -g cpp_args_annotate.c -o cpp_args_annotate
$ perf record ./cpp_args_annotate
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.275 MB perf.data (7188 samples) ]
$ perf annotate --stdio2 foo
Samples: 7K of event 'cycles:u', 4000 Hz, Event count (approx.): 7468429289, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo>:
foo():
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
↓ jmp 1d
a += b;
13.45 13: mov -0x18(%rbp),%eax
add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.09 1d: cmpl $0x3b9ac9ff,-0x4(%rbp)
86.46 ↑ jbe 13
return a;
mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
I.e. works for C, now lets switch to C++:
$ g++ -g cpp_args_annotate.c -o cpp_args_annotate
$ perf record ./cpp_args_annotate
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.268 MB perf.data (6976 samples) ]
$ perf annotate --stdio2 foo
Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo(int, int)>:
foo(int, int):
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
cmpl $0x3b9ac9ff,-0x4(%rbp)
86.53 → ja 40112c <foo(int, int)+0x26>
a += b;
13.32 mov -0x18(%rbp),%eax
0.00 add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.15 → jmp 401117 <foo(int, int)+0x11>
return a;
mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
Reproduced.
Now with this patch:
Reusing the C++ built binary, as we can see here:
$ readelf -wi cpp_args_annotate | grep producer
<c> DW_AT_producer : (indirect string, offset: 0x2e): GNU C++14 10.2.1 20201125 (Red Hat 10.2.1-9) -mtune=generic -march=x86-64 -g
$
And furthermore:
$ file cpp_args_annotate
cpp_args_annotate: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4fe3cab260204765605ec630d0dc7a7e93c361a9, for GNU/Linux 3.2.0, with debug_info, not stripped
$ perf buildid-list -i cpp_args_annotate
4fe3cab260204765605ec630d0dc7a7e93c361a9
$ perf buildid-list | grep cpp_args_annotate
4fe3cab260204765605ec630d0dc7a7e93c361a9 /home/acme/c/cpp_args_annotate
$
It now works:
$ perf annotate --stdio2 foo
Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo(int, int)>:
foo(int, int):
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
11: cmpl $0x3b9ac9ff,-0x4(%rbp)
86.53 ↓ ja 26
a += b;
13.32 mov -0x18(%rbp),%eax
0.00 add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.15 ↑ jmp 11
return a;
26: mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Link: http://lore.kernel.org/lkml/13e1a405-edf9-e4c2-4327-a9b454353730@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array
member[1][2], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this
change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200515172926.GA31976@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For perf report on stripped binaries it is currently impossible to do
annotation. The annotation state is all tied to symbols, but there are
either no symbols, or symbols are not covering all the code.
We should support the annotation functionality even without symbols.
This patch fakes a symbol and the symbol name is the string of address.
After that, we just follow current annotation working flow.
For example,
1. perf report
Overhead Command Shared Object Symbol
20.67% div libc-2.27.so [.] __random_r
17.29% div libc-2.27.so [.] __random
10.59% div div [.] 0x0000000000000628
9.25% div div [.] 0x0000000000000612
6.11% div div [.] 0x0000000000000645
2. Select the line of "10.59% div div [.] 0x0000000000000628" and ENTER.
Annotate 0x0000000000000628
Zoom into div thread
Zoom into div DSO (use the 'k' hotkey to zoom directly into the kernel)
Browse map details
Run scripts for samples of symbol [0x0000000000000628]
Run scripts for all samples
Switch to another data file in PWD
Exit
3. Select the "Annotate 0x0000000000000628" and ENTER.
Percent│
│
│
│ Disassembly of section .text:
│
│ 0000000000000628 <.text+0x68>:
│ divsd %xmm4,%xmm0
│ divsd %xmm3,%xmm1
│ movsd (%rsp),%xmm2
│ addsd %xmm1,%xmm0
│ addsd %xmm2,%xmm0
│ movsd %xmm0,(%rsp)
Now we can see the dump of object starting from 0x628.
v5:
---
Remove the hotkey 'a' implementation from this patch. It
will be moved to a separate patch.
v4:
---
1. Support the hotkey 'a'. When we press 'a' on address,
now it supports the annotation.
2. Change the patch title from
"Support interactive annotation of code without symbols" to
"perf report: Support interactive annotation of code without symbols"
v3:
---
Keep just the ANNOTATION_DUMMY_LEN, and remove the
opts->annotate_dummy_len since it's the "maybe in future
we will provide" feature.
v2:
---
Fix a crash issue when annotating an address in "unknown" object.
The steps to reproduce this issue:
perf record -e cycles:u ls
perf report
75.29% ls ld-2.27.so [.] do_lookup_x
23.64% ls ld-2.27.so [.] __GI___tunables_init
1.04% ls [unknown] [k] 0xffffffff85c01210
0.03% ls ld-2.27.so [.] _start
When annotating 0xffffffff85c01210, the crash happens.
v2 adds checking for ms->map in add_annotate_opt(). If the object is
"unknown", ms->map is NULL.
Committer notes:
Renamed new_annotate_sym() to symbol__new_unresolved().
Use PRIx64 to fix this issue in some 32-bit arches:
ui/browsers/hists.c: In function 'symbol__new_unresolved':
ui/browsers/hists.c:2474:38: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
snprintf(name, sizeof(name), "%-#.*lx", BITS_PER_LONG / 4, addr);
~~~~~~^ ~~~~
%-#.*llx
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200227043939.4403-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'nr_jumps' field in 'struct annotation' is not used since it's
inception in commit 2402e4a936a0 ("perf annotate browser: Show 'jumpy'
functions"). Get rid of it.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20200204045233.474937-7-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We are allocating disasm_line object in annotation_line__new() instead
of disasm_line__new(). Similarly annotation_line__delete() is actually
freeing disasm_line object as well. This complexity is because of
privsize. But we don't need privsize anymore so get rid of privsize and
simplify disasm_line allocation and freeing code.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20200204045233.474937-3-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
privsize is passed as 0 from all the symbol__annotate() callers.
Remove it from argument list.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20200204045233.474937-2-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf default config set by user in [annotate] section is totally ignored
by annotate code. Fix it.
Before:
$ ./perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true
$ ./perf annotate shash
│ unsigned h = 0;
│ movl $0x0,-0xc(%rbp)
│ while (*s)
│ ↓ jmp 44
│ h = 65599 * h + *s++;
11.33 │24: mov -0xc(%rbp),%eax
43.50 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax
After:
│ movl $0x0,-0xc(%rbp)
│ ↓ jmp 44
1 │1 24: mov -0xc(%rbp),%eax
4 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax
Note that we have removed show_nr_samples and show_total_period from
annotation_options because they are not used. Instead of them we use
symbol_conf.show_nr_samples and symbol_conf.show_total_period.
Committer testing:
Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio:
# perf config
#
# perf config annotate.hide_src_code=true
# perf config
annotate.hide_src_code=true
#
# perf config annotate.show_nr_jumps=true
# perf config annotate.show_nr_samples=true
# perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true
#
#
Before:
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Percent
00000000000609f0 <ObjectInstance::weak_pointer_was_finalized()@@Base>:
endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
100.00 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
After:
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1 1b: xor %eax,%eax
pop %rbp
← retq
nop
1 20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
# perf config annotate.show_nr_jumps
annotate.show_nr_jumps=true
# perf config annotate.show_nr_jumps=false
# perf config annotate.show_nr_jumps
annotate.show_nr_jumps=false
#
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-6-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf annotate --show-total-period does not really show total period.
The reason is we have two separate variables for the same purpose.
One is in symbol_conf.show_total_period and another is
annotation_options.show_total_period.
We save command line option in symbol_conf.show_total_period but uses
annotation_option.show_total_period while rendering tui/stdio2 browser.
Though, we copy symbol_conf.show_total_period to
annotation__default_options.show_total_period but that is not really
effective as we don't use annotation__default_options once we copy
default options to dynamic variable annotate.opts in cmd_annotate().
Instead of all these complication, keep only one variable and use it all
over. symbol_conf.show_total_period is used by perf report/top as well.
So let's kill annotation_options.show_total_period.
On a side note, I've kept annotation_options.show_total_period
definition because it's still used by perf-config code. Follow up patch
to fix perf-config for annotate will remove
annotation_options.show_total_period.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-3-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The objdump utility has useful --prefix / --prefix-strip options to
allow changing source code file names hardcoded into executables' debug
info. Add options to 'perf report', 'perf top' and 'perf annotate',
which are then passed to objdump.
$ mkdir foo
$ echo 'main() { for (;;); }' > foo/foo.c
$ gcc -g foo/foo.c
foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | main() { for (;;); }
| ^~~~
$ perf record ./a.out
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ]
$ mv foo bar
$ perf annotate
<does not show source code>
$ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5
<does show source code>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'symbol' pointers
We are already passing things like:
symbol__annotate(ms->sym, ms->map, ...)
So shorten the signature of such functions to receive the 'map_symbol'
pointer.
This also paves the way to having the 'struct map_groups' pointer in the
'struct map_symbol' so that we can get rid of 'struct map'->groups.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-23yx8v1t41nzpkpi7rdrozww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch prints the stddev and hist for the cycles diff of program
block. It can help us to understand if the cycles is noisy or not.
This patch is inspired by Andi Kleen's patch:
https://lwn.net/Articles/600471/
We create new option '--cycles-hist'.
Example:
perf record -b ./div
perf record -b ./div
perf diff -c cycles
# Baseline [Program Block Range] Cycles Diff Shared Object Symbol
# ........ .......................................................... .... ................. ............................
#
46.72% [div.c:40 -> div.c:40] 0 div [.] main
46.72% [div.c:42 -> div.c:44] 0 div [.] main
46.72% [div.c:42 -> div.c:39] 0 div [.] main
20.54% [random_r.c:357 -> random_r.c:394] 1 libc-2.27.so [.] __random_r
20.54% [random_r.c:357 -> random_r.c:380] 0 libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:391] 0 libc-2.27.so [.] __random_r
17.04% [random.c:288 -> random.c:291] 0 libc-2.27.so [.] __random
17.04% [random.c:291 -> random.c:291] 0 libc-2.27.so [.] __random
17.04% [random.c:293 -> random.c:293] 0 libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random
17.04% [random.c:298 -> random.c:298] 0 libc-2.27.so [.] __random
8.40% [div.c:22 -> div.c:25] 0 div [.] compute_flag
8.40% [div.c:27 -> div.c:28] 0 div [.] compute_flag
5.14% [rand.c:26 -> rand.c:27] 0 libc-2.27.so [.] rand
5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand
2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt
0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax
0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap
0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15
0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 [kernel.kallsyms] [k] native_sched_clock
0.00% [native_write_msr+0 -> native_write_msr+16] -13 [kernel.kallsyms] [k] native_write_msr
When we enable the option '--cycles-hist', the output is
perf diff -c cycles --cycles-hist
# Baseline [Program Block Range] Cycles Diff stddev/Hist Shared Object Symbol
# ........ .......................................................... .... ................. ................. ............................
#
46.72% [div.c:40 -> div.c:40] 0 ± 37.8% ▁█▁▁██▁█ div [.] main
46.72% [div.c:42 -> div.c:44] 0 ± 49.4% ▁▁▂█▂▂▂▂ div [.] main
46.72% [div.c:42 -> div.c:39] 0 ± 24.1% ▃█▂▄▁▃▂▁ div [.] main
20.54% [random_r.c:357 -> random_r.c:394] 1 ± 33.5% ▅▂▁█▃▁▂▁ libc-2.27.so [.] __random_r
20.54% [random_r.c:357 -> random_r.c:380] 0 ± 39.4% ▁▁█▁██▅▁ libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:391] 0 ± 41.2% ▁▃▁▂█▄▃▁ libc-2.27.so [.] __random_r
17.04% [random.c:288 -> random.c:291] 0 ± 48.8% ▁▁▁▁███▁ libc-2.27.so [.] __random
17.04% [random.c:291 -> random.c:291] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random
17.04% [random.c:293 -> random.c:293] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random
17.04% [random.c:298 -> random.c:298] 0 ± 75.6% ▃█▁▁▁▁▁▁ libc-2.27.so [.] __random
8.40% [div.c:22 -> div.c:25] 0 ± 42.1% ▁▃▁▁███▁ div [.] compute_flag
8.40% [div.c:27 -> div.c:28] 0 ± 41.8% ██▁▁▄▁▁▄ div [.] compute_flag
5.14% [rand.c:26 -> rand.c:27] 0 ± 37.8% ▁▁▁████▁ libc-2.27.so [.] rand
5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand
2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt
0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax
0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap
0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15
0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 ± 38.5% ▄█▁ [kernel.kallsyms] [k] native_sched_clock
0.00% [native_write_msr+0 -> native_write_msr+16] -13 ± 47.1% ▁█▇▃▁▁ [kernel.kallsyms] [k] native_write_msr
v8:
---
Rebase to perf/core branch
v7:
---
1. v6 got Jiri's ACK.
2. Rebase to latest perf/core branch.
v6:
---
1. Jiri provides better code for using data__hpp_register() in ui_init().
Use this code in v6.
v5:
---
1. Refine the use of data__hpp_register() in ui_init() according to
Jiri's suggestion.
v4:
---
1. Rename the new option from '--noisy' to '--cycles-hist'
2. Remove the option '-n'.
3. Only update the spark value and stats when '--cycles-hist' is enabled.
4. Remove the code of printing '..'.
v3:
---
1. Move the histogram to a separate column
2. Move the svals[] out of struct stats
v2:
---
Jiri got a compile error,
CC builtin-diff.o
builtin-diff.c: In function ‘compute_cycles_diff’:
builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
712 | labs(pair->block_info->cycles_spark[i] -
| ^~~~
Because the result of u64 - u64 is still u64. Now we change the type of
cycles_spark[] to s64.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Return errno when open_memstream() fails and add two new speciall error
codes for when an invalid, non BPF file or one without BTF is passed to
symbol__disassemble_bpf(), so that its callers can rely on
symbol__strerror_disassemble() to convert that to a human readable error
message that can help figure out what is wrong, with hints even.
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-usevw9r2gcipfcrbpaueurw0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
They are called from symbol__annotate() and to propagate errors that can
help understand the problem make them return what
symbol__strerror_disassemble() known, i.e. errno codes and other
annotation specific errors in a special, out of errnos, range.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-pqx7srcv7tixgid251aeboj6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move nr_entries count from 'struct perf' to into perf_evlist struct.
Committer notes:
Fix tools/perf/arch/s390/util/auxtrace.c case. And also the comment in
tools/perf/util/annotate.h.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-42-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.
symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.
Committer testing:
After fixing this:
- u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
+ u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);
Detected when crossbuilding to a 32-bit arch.
And making all this dependent on HAVE_LIBBFD_SUPPORT and
HAVE_LIBBPF_SUPPORT:
1) Have a BPF program running, one that has BTF info, etc, I used
the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
by 'perf trace'.
# grep -B1 augmented_raw ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
#
# perf trace -e *mmsg
dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2
2) Then do a 'perf record' system wide for a while:
# perf record -a
^C[ perf record: Woken up 68 times to write data ]
[ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
#
3) Check that we captured BPF and BTF info in the perf.data file:
# perf report --header-only | grep 'b[pt]f'
# event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 41
# bpf_prog_info of id 42
# btf info of id 2
#
4) Check which programs got recorded:
# perf report | grep bpf_prog | head
0.16% exe bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.14% exe bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.08% fuse-overlayfs bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.07% fuse-overlayfs bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.00% runc bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% sh bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
#
This was with the default --sort order for 'perf report', which is:
--sort comm,dso,symbol
If we just look for the symbol, for instance:
# perf report --sort symbol | grep bpf_prog | head
0.26% [k] bpf_prog_819967866022f1e1_sys_enter - -
0.24% [k] bpf_prog_c1bd85c092d6e4aa_sys_exit - -
#
or the DSO:
# perf report --sort dso | grep bpf_prog | head
0.26% bpf_prog_819967866022f1e1_sys_enter
0.24% bpf_prog_c1bd85c092d6e4aa_sys_exit
#
We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place, one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.
Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source coude intermixed
with the disassembled JITed code:
# perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter
Samples: 950 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Percent int sys_enter(struct syscall_enter_args *args)
53.41 push %rbp
0.63 mov %rsp,%rbp
0.31 sub $0x170,%rsp
1.93 sub $0x28,%rbp
7.02 mov %rbx,0x0(%rbp)
3.20 mov %r13,0x8(%rbp)
1.07 mov %r14,0x10(%rbp)
0.61 mov %r15,0x18(%rbp)
0.11 xor %eax,%eax
1.29 mov %rax,0x20(%rbp)
0.11 mov %rdi,%rbx
return bpf_get_current_pid_tgid();
2.02 → callq *ffffffffda6776d9
2.76 mov %eax,-0x148(%rbp)
mov %rbp,%rsi
int sys_enter(struct syscall_enter_args *args)
add $0xfffffffffffffeb8,%rsi
return bpf_map_lookup_elem(pids, &pid) != NULL;
movabs $0xffff975ac2607800,%rdi
1.26 → callq *ffffffffda6789e9
cmp $0x0,%rax
2.43 → je 0
add $0x38,%rax
0.21 xor %r13d,%r13d
if (pid_filter__has(&pids_filtered, getpid()))
0.81 cmp $0x0,%rax
→ jne 0
mov %rbp,%rdi
probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
2.22 add $0xfffffffffffffeb8,%rdi
0.11 mov $0x40,%esi
0.32 mov %rbx,%rdx
2.74 → callq *ffffffffda658409
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
0.22 mov %rbp,%rsi
1.69 add $0xfffffffffffffec0,%rsi
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
movabs $0xffff975bfcd36000,%rdi
add $0xd0,%rdi
0.21 mov 0x0(%rsi),%eax
0.93 cmp $0x200,%rax
→ jae 0
0.10 shl $0x3,%rax
0.11 add %rdi,%rax
0.11 → jmp 0
xor %eax,%eax
if (syscall == NULL || !syscall->enabled)
1.07 cmp $0x0,%rax
→ je 0
if (syscall == NULL || !syscall->enabled)
6.57 movzbq 0x0(%rax),%rdi
if (syscall == NULL || !syscall->enabled)
cmp $0x0,%rdi
0.95 → je 0
mov $0x40,%r8d
switch (augmented_args.args.syscall_nr) {
mov -0x140(%rbp),%rdi
switch (augmented_args.args.syscall_nr) {
cmp $0x2,%rdi
→ je 0
cmp $0x101,%rdi
→ je 0
cmp $0x15,%rdi
→ jne 0
case SYS_OPEN: filename_arg = (const void *)args->args[0];
mov 0x10(%rbx),%rdx
→ jmp 0
case SYS_OPENAT: filename_arg = (const void *)args->args[1];
mov 0x18(%rbx),%rdx
if (filename_arg != NULL) {
cmp $0x0,%rdx
→ je 0
xor %edi,%edi
augmented_args.filename.reserved = 0;
mov %edi,-0x104(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rbp,%rdi
add $0xffffffffffffff00,%rdi
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov $0x100,%esi
→ callq *ffffffffda658499
mov $0x148,%r8d
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %eax,-0x108(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rax,%rdi
shl $0x20,%rdi
shr $0x20,%rdi
if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
cmp $0xff,%rdi
→ ja 0
len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
add $0x48,%rax
len &= sizeof(augmented_args.filename.value) - 1;
and $0xff,%rax
mov %rax,%r8
mov %rbp,%rcx
return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
add $0xfffffffffffffeb8,%rcx
mov %rbx,%rdi
movabs $0xffff975fbd72d800,%rsi
mov $0xffffffff,%edx
→ callq *ffffffffda658ad9
mov %rax,%r13
}
mov %r13,%rax
0.72 mov 0x0(%rbp),%rbx
mov 0x8(%rbp),%r13
1.16 mov 0x10(%rbp),%r14
0.10 mov 0x18(%rbp),%r15
0.42 add $0x28,%rbp
0.54 leaveq
0.54 ← retq
#
Please see 'man perf-config' to see how to control what should be seen,
via ~/.perfconfig [annotate] section, for instance, one can suppress the
source code and see just the disassembly, etc.
Alternatively, use the TUI bu just using 'perf annotate', press
'/bpf_prog' to see the bpf symbols, press enter and do the interactive
annotation, which allows for dumping to a file after selecting the
the various output tunables, for instance, the above without source code
intermixed, plus showing all the instruction offsets:
# perf annotate bpf_prog_819967866022f1e1_sys_enter
Then press: 's' to hide the source code + 'O' twice to show all
instruction offsets, then 'P' to print to the
bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
# cat bpf_prog_819967866022f1e1_sys_enter.annotation
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Event: cycles:ppp
53.41 0: push %rbp
0.63 1: mov %rsp,%rbp
0.31 4: sub $0x170,%rsp
1.93 b: sub $0x28,%rbp
7.02 f: mov %rbx,0x0(%rbp)
3.20 13: mov %r13,0x8(%rbp)
1.07 17: mov %r14,0x10(%rbp)
0.61 1b: mov %r15,0x18(%rbp)
0.11 1f: xor %eax,%eax
1.29 21: mov %rax,0x20(%rbp)
0.11 25: mov %rdi,%rbx
2.02 28: → callq *ffffffffda6776d9
2.76 2d: mov %eax,-0x148(%rbp)
33: mov %rbp,%rsi
36: add $0xfffffffffffffeb8,%rsi
3d: movabs $0xffff975ac2607800,%rdi
1.26 47: → callq *ffffffffda6789e9
4c: cmp $0x0,%rax
2.43 50: → je 0
52: add $0x38,%rax
0.21 56: xor %r13d,%r13d
0.81 59: cmp $0x0,%rax
5d: → jne 0
63: mov %rbp,%rdi
2.22 66: add $0xfffffffffffffeb8,%rdi
0.11 6d: mov $0x40,%esi
0.32 72: mov %rbx,%rdx
2.74 75: → callq *ffffffffda658409
0.22 7a: mov %rbp,%rsi
1.69 7d: add $0xfffffffffffffec0,%rsi
84: movabs $0xffff975bfcd36000,%rdi
8e: add $0xd0,%rdi
0.21 95: mov 0x0(%rsi),%eax
0.93 98: cmp $0x200,%rax
9f: → jae 0
0.10 a1: shl $0x3,%rax
0.11 a5: add %rdi,%rax
0.11 a8: → jmp 0
aa: xor %eax,%eax
1.07 ac: cmp $0x0,%rax
b0: → je 0
6.57 b6: movzbq 0x0(%rax),%rdi
bb: cmp $0x0,%rdi
0.95 bf: → je 0
c5: mov $0x40,%r8d
cb: mov -0x140(%rbp),%rdi
d2: cmp $0x2,%rdi
d6: → je 0
d8: cmp $0x101,%rdi
df: → je 0
e1: cmp $0x15,%rdi
e5: → jne 0
e7: mov 0x10(%rbx),%rdx
eb: → jmp 0
ed: mov 0x18(%rbx),%rdx
f1: cmp $0x0,%rdx
f5: → je 0
f7: xor %edi,%edi
f9: mov %edi,-0x104(%rbp)
ff: mov %rbp,%rdi
102: add $0xffffffffffffff00,%rdi
109: mov $0x100,%esi
10e: → callq *ffffffffda658499
113: mov $0x148,%r8d
119: mov %eax,-0x108(%rbp)
11f: mov %rax,%rdi
122: shl $0x20,%rdi
126: shr $0x20,%rdi
12a: cmp $0xff,%rdi
131: → ja 0
133: add $0x48,%rax
137: and $0xff,%rax
13d: mov %rax,%r8
140: mov %rbp,%rcx
143: add $0xfffffffffffffeb8,%rcx
14a: mov %rbx,%rdi
14d: movabs $0xffff975fbd72d800,%rsi
157: mov $0xffffffff,%edx
15c: → callq *ffffffffda658ad9
161: mov %rax,%r13
164: mov %r13,%rax
0.72 167: mov 0x0(%rbp),%rbx
16b: mov 0x8(%rbp),%r13
1.16 16f: mov 0x10(%rbp),%r14
0.10 173: mov 0x18(%rbp),%r15
0.42 177: add $0x28,%rbp
0.54 17b: leaveq
0.54 17c: ← retq
Another cool way to test all this is to symple use 'perf top' look for
those symbols, go there and press enter, annotate it live :-)
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel |