Age | Commit message (Collapse) | Author | Files | Lines |
|
It'd be better to have them in hex to check cacheline alignment.
Percent offset size field
100.00 0 0x1c0 struct cfs_rq {
0.00 0 0x10 struct load_weight load {
0.00 0 0x8 long unsigned int weight;
0.00 0x8 0x4 u32 inv_weight;
};
0.00 0x10 0x4 unsigned int nr_running;
14.56 0x14 0x4 unsigned int h_nr_running;
0.00 0x18 0x4 unsigned int idle_nr_running;
0.00 0x1c 0x4 unsigned int idle_h_nr_running;
...
Committer notes:
Justification from Namhyung when asked about why it would be "better":
Cache line sizes are power of 2 so it'd be natural to use hex and
check whether an offset is in the same boundary. Also 'perf annotate'
shows instruction offsets in hex.
>
> Maybe this should be selectable?
I can add an option and/or a config if you want.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819233603.54941-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Display the branch counter histogram in the annotation view.
Press 'B' to display the branch counter's abbreviation list as well.
Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }',
4000 Hz, Event count (approx.):
f3 /home/sdp/test/tchain_edit [Percent: local period]
Percent │ IPC Cycle Branch Counter (Average IPC: 1.39, IPC Coverage: 29.4%)
│ 0000000000401755 <f3>:
0.00 0.00 │ endbr64
│ push %rbp
│ mov %rsp,%rbp
│ movl $0x0,-0x4(%rbp)
0.00 0.00 │1.33 3 |A |- | ↓ jmp 25
11.03 11.03 │ 11: mov -0x4(%rbp),%eax
│ and $0x1,%eax
│ test %eax,%eax
17.13 17.13 │2.41 1 |A |- | ↓ je 21
│ addl $0x1,-0x4(%rbp)
21.84 21.84 │2.22 2 |AA |- | ↓ jmp 25
17.13 17.13 │ 21: addl $0x1,-0x4(%rbp)
21.84 21.84 │ 25: cmpl $0x270f,-0x4(%rbp)
11.03 11.03 │0.61 3 |A |- | ↑ jle 11
│ nop
│ pop %rbp
0.00 0.00 │0.24 20 |AA |B | ← ret
Originally-by: Tinghao Zhang <tinghao.zhang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reusing the existing --total-cycles option to display the branch
counters. Add a new PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER to display
the logged branch counter events. They are shown right after all the
cycle-related annotations.
Extend the 'struct block_info' to store and pass the branch counter
related information.
The annotation_br_cntr_entry() is to print the histogram of each branch
counter event. If the number of logged events is less than 4, the exact
number of the abbr name is printed. Otherwise, using '+' to stands for
more than 3 events.
Assume the number of logged events is less than 4.
The annotation_br_cntr_abbr_list() prints the branch counter's
abbreviation list. Press 'B' to display the list in the TUI mode.
$ perf record -e "{branch-instructions:ppp,branch-misses}:S" -j any,counter
$ perf report --total-cycles --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }'
# Event count (approx.): 1610046
#
# Branch counter abbr list:
# branch-instructions:ppp = A
# branch-misses = B
# '-' No event occurs
# '+' Event occurrences may be lost due to branch counter saturated
#
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles Branch Counter [Program Block Range]
# ............... .............. ........... .......... .............. ..................
#
57.55% 2.5M 0.00% 3 |A |- | ...
25.27% 1.1M 0.00% 2 |AA |- | ...
15.61% 667.2K 0.00% 1 |A |- | ...
0.16% 6.9K 0.81% 575 |A |- | ...
0.16% 6.8K 1.38% 977 |AA |- | ...
0.16% 6.8K 0.04% 28 |AA |B | ...
0.15% 6.6K 1.33% 946 |A |- | ...
0.11% 4.5K 0.06% 46 |AAA+|- | ...
0.10% 4.4K 0.88% 624 |A |- | ...
0.09% 3.7K 0.74% 524 |AAA+|B | ...
With -v applied,
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles Branch Counter [Program Block Range]
# ............... .............. ........... .......... .............. ..................
#
57.55% 2.5M 0.00% 3 A=1 ,B=- ...
25.27% 1.1M 0.00% 2 A=2 ,B=- ...
15.61% 667.2K 0.00% 1 A=1 ,B=- ...
0.16% 6.9K 0.81% 575 A=1 ,B=- ...
0.16% 6.8K 1.38% 977 A=2 ,B=- ...
0.16% 6.8K 0.04% 28 A=2 ,B=1 ...
0.15% 6.6K 1.33% 946 A=1 ,B=- ...
0.11% 4.5K 0.06% 46 A=3+,B=- ...
0.10% 4.4K 0.88% 624 A=1 ,B=- ...
0.09% 3.7K 0.74% 524 A=3+,B=1 ...
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now default is to fold everything but it only shows the name of the
top-level data type which is not very useful. Instead just expand the
top level entry so that it can show the layout at a higher level.
Annotate type: 'struct task_struct' (4 samples)
Percent Offset Size Field
- 100.00 0 9792 struct task_struct { ◆
+ 0.50 0 24 struct thread_info thread_info; ▒
0.00 24 4 unsigned int __state; ▒
0.00 32 8 void* stack; ▒
+ 0.00 40 4 refcount_t usage; ▒
0.00 44 4 unsigned int flags; ▒
0.00 48 4 unsigned int ptrace; ▒
0.00 52 4 int on_cpu; ▒
+ 0.00 56 16 struct __call_single_node wake_entry; ▒
0.00 72 4 unsigned int wakee_flips; ▒
0.00 80 8 long unsigned int wakee_flip_decay_ts;▒
0.00 88 8 struct task_struct* last_wakee; ▒
0.00 96 4 int recent_used_cpu; ▒
0.00 100 4 int wake_cpu; ▒
0.00 104 4 int on_rq; ▒
0.00 108 4 int prio; ▒
0.00 112 4 int static_prio; ▒
0.00 116 4 int normal_prio; ▒
0.00 120 4 unsigned int rt_priority; ▒
+ 0.00 128 256 struct sched_entity se; ▒
+ 0.00 384 48 struct sched_rt_entity rt; ▒
+ 0.00 432 224 struct sched_dl_entity dl; ▒
0.00 656 8 struct sched_class* sched_class; ▒
...
Committer testing:
# perf mem record -a sleep 5s
# perf annotate --group --data-type=pthread_mutex_t
Annotate type: 'pthread_mutex_t' (13 samples)
Percent Offset Size Field
- 100.00 0 40 pthread_mutex_t { ▒
- 100.00 0 40 struct __pthread_mutex_s __data { ▒
39.45 0 4 int __lock; ▒
0.00 4 4 unsigned int __count; ▒
7.80 8 4 int __owner; ▒
6.88 12 4 unsigned int __nusers; ▒
45.87 16 4 int __kind; ▒
0.00 20 2 short int __spins; ▒
0.00 22 2 short int __elision; ▒
+ 0.00 24 16 __pthread_list_t __list; ▒
}; ▒
0.00 0 0 char[] __size; ▒
39.45 0 8 long int __align;
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812194447.2049187-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Like 'perf report', use 'e' or 'E' key to toggle folding the current
entry so that it can control displaying child entries.
Note I didn't add the 'c' and 'C' key to collapse the entry because it's
also handled with the 'e'/'E' since it toggles the state.
Committer testing:
Do some 'perf mem record' for some workload of the whole system, using
the target options, as usual (--pid/-p, -C/--cpu, -a for the system wide
profiling, etc) and then:
# perf annotate --skip-empty --data-type=pthread_mutex_t
That, by default, will start as --tui, then press 'E' to see the whole
struct unfolded, etc.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812194447.2049187-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Like in the hists browser, it should support folding current entry so
that it can hide unwanted details in some data structures.
The folded entries will be displayed with the '+' sign, while unfolded
entries will have the '-' sign.
Entries that have no children will not show any signs.
Annotate type: 'struct socket' (1 samples)
Percent Offset Size Field
- 100.00 0 128 struct socket { ◆
0.00 0 4 socket_state state; ▒
0.00 4 2 short int type; ▒
0.00 8 8 long unsigned int flags; ▒
0.00 16 8 struct file* file; ▒
100.00 24 8 struct sock* sk; ▒
0.00 32 8 struct proto_ops* ops; ▒
- 0.00 64 64 struct socket_wq wq { ▒
- 0.00 64 24 wait_queue_head_t wait { ▒
+ 0.00 64 4 spinlock_t lock; ▒
- 0.00 72 16 struct list_head head { ▒
0.00 72 8 struct list_head* next; ▒
0.00 80 8 struct list_head* prev; ▒
}; ▒
}; ▒
0.00 88 8 struct fasync_struct* fasync_list; ▒
0.00 96 8 long unsigned int flags; ▒
+ 0.00 104 16 struct callback_head rcu; ▒
}; ▒
}; ▒
This just adds the display logic for folding, actually folding action
will be implemented in the next patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812194447.2049187-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In get_member_overhead(), k is updated when it has a entry in the
histogram. But the entry->hists array is allocated with the number of
evsel in the group. So the k should be reset when it iterates the event
using for_each_group_evsel(), otherwise it'd crash due to a buffer
overflow.
Fixes: cb1898f58e0f175d ("perf annotate-data: Support --skip-empty option")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240810191502.1947959-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The --skip-empty option is to hide dummy events in a group. Like other
output mode in 'perf report' and 'perf annotate', the data-type
profiling output should support the option.
Committer testing:
With dummy:
root@number:~# perf annotate --stdio --group --data-type --skip-empty | head -24
Annotate type: 'pthread_mutex_t' in /usr/lib64/libc.so.6 (50 samples):
event[0] = cpu_atom/mem-loads,ldlat=30/P
event[1] = cpu_atom/mem-stores/P
event[2] = dummy:u
============================================================================
Percent offset size field
100.00 100.00 0.00 0 40 pthread_mutex_t {
100.00 100.00 0.00 0 40 struct __pthread_mutex_s __data {
45.21 84.54 0.00 0 4 int __lock;
0.00 0.00 0.00 4 4 unsigned int __count;
0.00 1.83 0.00 8 4 int __owner;
5.19 10.65 0.00 12 4 unsigned int __nusers;
49.61 2.97 0.00 16 4 int __kind;
0.00 0.00 0.00 20 2 short int __spins;
0.00 0.00 0.00 22 2 short int __elision;
0.00 0.00 0.00 24 16 __pthread_list_t __list {
0.00 0.00 0.00 24 8 struct __pthread_internal_list* __prev;
0.00 0.00 0.00 32 8 struct __pthread_internal_list* __next;
};
};
0.00 0.00 0.00 0 0 char[] __size;
45.21 84.54 0.00 0 8 long int __align;
};
Skipping it:
root@number:~# perf annotate --stdio --group --data-type --skip-empty | head -24
Annotate type: 'pthread_mutex_t' in /usr/lib64/libc.so.6 (50 samples):
event[0] = cpu_atom/mem-loads,ldlat=30/P
event[1] = cpu_atom/mem-stores/P
============================================================================
Percent offset size field
100.00 100.00 0 40 pthread_mutex_t {
100.00 100.00 0 40 struct __pthread_mutex_s __data {
45.21 84.54 0 4 int __lock;
0.00 0.00 4 4 unsigned int __count;
0.00 1.83 8 4 int __owner;
5.19 10.65 12 4 unsigned int __nusers;
49.61 2.97 16 4 int __kind;
0.00 0.00 20 2 short int __spins;
0.00 0.00 22 2 short int __elision;
0.00 0.00 24 16 __pthread_list_t __list {
0.00 0.00 24 8 struct __pthread_internal_list* __prev;
0.00 0.00 32 8 struct __pthread_internal_list* __next;
};
};
0.00 0.00 0 0 char[] __size;
45.21 84.54 0 8 long int __align;
};
Annotate type: 'pthread_mutexattr_t' in /usr/lib64/libc.so.6 (1 samples):
root@number:~#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240807061713.1642924-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Replace a comma between expression statements by a semicolon.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240716073405.968801-1-nichen@iscas.ac.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make the ui code its own library. This is done to avoid compiling code
twice, once for the perf tool and once for the perf python module.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-3-irogers@google.com
|
|
When freeing a->b it is good practice to set a->b to NULL using
zfree(&a->b) so that when we have a bug where a reference to a freed 'a'
pointer is kept somewhere, we can more quickly cause a segfault if some
code tries to use a->b.
This is mostly done but some new cases were introduced recently, convert
them to zfree().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZjmbHHrjIm5YRIBv@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.
The majority of the change is mechanical in nature and not easy to
split up.
Committer testing:
'perf test' up to this patch shows no regressions.
But:
util/symbol.c: In function ‘dso__load_bfd_symbols’:
util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
1683 | dso__set_adjust_symbols(dso);
| ^~~~~~~~~~~~~~~~~~~~~~~
In file included from util/symbol.c:21:
util/dso.h:268:20: note: declared here
268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
| ^~~~~~~~~~~~~~~~~~~~~~~
make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/
make[6]: *** Waiting for unfinished jobs....
This was updated:
- symbols__fixup_end(&dso->symbols, false);
- symbols__fixup_duplicate(&dso->symbols);
- dso->adjust_symbols = 1;
+ symbols__fixup_end(dso__symbols(dso), false);
+ symbols__fixup_duplicate(dso__symbols(dso));
+ dso__set_adjust_symbols(dso);
But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).
Add the missing argument:
symbols__fixup_end(dso__symbols(dso), false);
symbols__fixup_duplicate(dso__symbols(dso));
- dso__set_adjust_symbols(dso);
+ dso__set_adjust_symbols(dso, true);
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up fixes sent via perf-tools, by Namhyung Kim.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the hist entry has the type info, it should be able to display the
annotation browser for the type like in `perf annotate --data-type`.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240411033256.2099646-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Like in stdio, it should print all events in a group together.
Committer notes:
Collect it:
root@number:~# perf record -a -e '{cpu_core/mem-loads,ldlat=30/P,cpu_core/mem-stores/P}'
^C[ perf record: Woken up 8 times to write data ]
[ perf record: Captured and wrote 4.980 MB perf.data (55825 samples) ]
root@number:~#
Then do it in stdio:
root@number:~# perf annotate --stdio --data-type
Annotate type: 'union ' in /usr/lib64/libc.so.6 (1131 samples):
event[0] = cpu_core/mem-loads,ldlat=30/P
event[1] = cpu_core/mem-stores/P
============================================================================
Percent offset size field
100.00 100.00 0 40 union {
100.00 100.00 0 40 struct __pthread_mutex_s __data {
48.61 23.46 0 4 int __lock;
0.00 0.48 4 4 unsigned int __count;
6.38 41.32 8 4 int __owner;
8.74 34.02 12 4 unsigned int __nusers;
35.66 0.26 16 4 int __kind;
0.61 0.45 20 2 short int __spins;
0.00 0.00 22 2 short int __elision;
0.00 0.00 24 16 __pthread_list_t __list {
0.00 0.00 24 8 struct __pthread_internal_list* __prev;
0.00 0.00 32 8 struct __pthread_internal_list* __next;
};
};
0.00 0.00 0 0 char* __size;
48.61 23.94 0 8 long int __align;
};
Now with TUI before this patch:
root@number:~# perf annotate --tui --data-type
Annotate type: 'union ' (790 samples)
Percent Offset Size Field
100.00 0 40 union {
100.00 0 40 struct __pthread_mutex_s __data {
48.61 0 4 int __lock;
0.00 4 4 unsigned int __count;
6.38 8 4 int __owner;
8.74 12 4 unsigned int __nusers;
35.66 16 4 int __kind;
0.61 20 2 short int __spins;
0.00 22 2 short int __elision;
0.00 24 16 __pthread_list_t __list {
0.00 24 8 struct __pthread_internal_list* __prev;
0.00 32 8 struct __pthread_internal_list* __next;
0.00 0 0 char* __size;
48.61 0 8 long int __align;
};
And now after this patch:
Annotate type: 'union ' (790 samples)
Percent Offset Size Field
100.00 100.00 0 40 union {
100.00 100.00 0 40 struct __pthread_mutex_s __data {
48.61 23.46 0 4 int __lock;
0.00 0.48 4 4 unsigned int __count;
6.38 41.32 8 4 int __owner;
8.74 34.02 12 4 unsigned int __nusers;
35.66 0.26 16 4 int __kind;
0.61 0.45 20 2 short int __spins;
0.00 0.00 22 2 short int __elision;
0.00 0.00 24 16 __pthread_list_t __list {
0.00 0.00 24 8 struct __pthread_internal_list* __prev;
0.00 0.00 32 8 struct __pthread_internal_list* __next;
};
};
0.00 0.00 0 0 char* __size;
48.61 23.94 0 8 long int __align;
};
On a followup patch the --tui output should have this that is present in
--stdio:
And the --stdio has all the missing info in TUI:
Annotate type: 'union ' in /usr/lib64/libc.so.6 (1131 samples):
event[0] = cpu_core/mem-loads,ldlat=30/P
event[1] = cpu_core/mem-stores/P
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240411033256.2099646-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support data type profiling output on TUI.
Testing from Arnaldo:
First make sure that the debug information for your workload binaries
in embedded in them by building it with '-g' or install the debuginfo
packages, since our workload is 'find':
root@number:~# type find
find is hashed (/usr/bin/find)
root@number:~# rpm -qf /usr/bin/find
findutils-4.9.0-5.fc39.x86_64
root@number:~# dnf debuginfo-install findutils
<SNIP>
root@number:~#
Then collect some data:
root@number:~# echo 1 > /proc/sys/vm/drop_caches
root@number:~# perf mem record find / > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ]
root@number:~#
Finally do data-type annotation with the following command, that will
default, as 'perf report' to the --tui mode, with lines colored to
highlight the hotspots, etc.
root@number:~# perf annotate --data-type
Annotate type: 'struct predicate' (58 samples)
Percent Offset Size Field
100.00 0 312 struct predicate {
0.00 0 8 PRED_FUNC pred_func;
0.00 8 8 char* p_name;
0.00 16 4 enum predicate_type p_type;
0.00 20 4 enum predicate_precedence p_prec;
0.00 24 1 _Bool side_effects;
0.00 25 1 _Bool no_default_print;
0.00 26 1 _Bool need_stat;
0.00 27 1 _Bool need_type;
0.00 28 1 _Bool need_inum;
0.00 32 4 enum EvaluationCost p_cost;
0.00 36 4 float est_success_rate;
0.00 40 1 _Bool literal_control_chars;
0.00 41 1 _Bool artificial;
0.00 48 8 char* arg_text;
<SNIP>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The symbol__annotate2() initializes some data structures needed by TUI.
It has a logic to prevent calling it multiple times by checking if it
has the annotated source. But data type profiling uses a different
code (symbol__annotate) to allocate the annotated lines in advance.
So TUI missed to call symbol__annotate2() when it shows the annotation
browser.
Make symbol__annotate() reentrant and handle that situation properly.
This fixes a crash in the annotation browser started by perf report in
TUI like below.
$ perf report -s type,sym --tui
# and press 'a' key and then move down
Fixes: 81e57deec325 ("perf report: Support data type profiling")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240405211800.1412920-2-namhyung@kernel.org
|
|
It's only used in 'perf annotate' output which means functions with actual
samples. No need to consume memory for every symbol ('struct annotation').
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240404175716.1225482-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It's only used in 'perf annotate' output which means functions with
actual samples. No need to consume memory for every symbol
('struct annotation').
Also move the 'max_line_len' field into it as it's related.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240404175716.1225482-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The struct annotated_source.offsets[] is to save pointers to
annotation_line at each offset. We can use annotated_source__get_line()
helper instead so let's get rid of the array.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240404175716.1225482-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It's a helper function to get annotation_line at the given offset
without using the offsets array. The goal is to get rid of the
offsets array altogether. It just does the linear search but I
think it's better to save memory as it won't be called in a hot
path.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240404175716.1225482-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now perf can show assembly instructions with libcapstone for x86, and the
capstone is better in general.
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: changbin.du@gmail.com
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240217074046.4100789-6-changbin.du@huawei.com
|
|
In its infinite wisdom, by default, SLang sets susp undef, and this can
only be un-done by calling SLtty_set_suspend_state(true). After every
SLang_init_tty().
Additionally, no provisions are made for maintaining the teletype
attributes across suspend/continue (outside of curses emulation
mode(?!), which provides full support, naturally), so we need to save
and restore the flags ourselves, as well as reset the text colours when
going under. We need to also re-draw the screen, and raising SIGWINCH,
shockingly, Just Works.
The correct solution would be to Not Use SLang, but as a stop-gap,
this makes TUI 'perf report' usable.
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yaowenbin <yaowenbin1@huawei.com>
Link: https://lore.kernel.org/r/0354dcae23a8713f75f4fed609e0caec3c6e3cd5.1672174189.git.nabijaczleweli@nabijaczleweli.xyz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that it can get rid of the unused data.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now it can use the global options and no need save local browser
options separately.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now it can directly use the global options and no need to pass it as an
argument.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-5-namhyung@kernel.org
[ Fixup build with GTK2=1 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
annotated_source'
The offsets array keeps pointers to 'struct annotation_line' entries
which are available in the 'struct annotated_source'. Let's move it to
there.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
to 'struct annotated_source'
Some fields in the 'struct annotation' are only used with 'struct
annotated_source' so better to be moved there in order to reduce memory
consumption for other symbols.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
annotation_line'
The cycles info is used only when branch stack is provided. Separate
them from 'struct annotation_line' into a separate struct and lazy
allocate them to save some memory.
Committer notes:
Make annotation__compute_ipc() check if the lazy allocation works,
bailing out if so, its callers already do error checking and
propagation.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On other code paths browser->he_selection is NULL checked, add a
missing case reported by clang-tidy.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: llvm@lists.linux.dev
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20231009183920.200859-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Address clang-tidy warning:
```
tools/perf/ui/browsers/hists.c:2416:8: warning: Excessive padding in 'struct popup_action' (8 padding bytes, where 0 is optimal).
Optimal fields order:
time,
thread,
evsel,
fn,
ms,
socket,
rstype,
```
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: llvm@lists.linux.dev
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20231009183920.200859-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Casts were necessary for older versions of libslang, however, these
are now 15 years old and so we no longer need to care about supporting
them. Tidy the casts and remove unnecessary logic.
Move the ENABLE_SLFUTURE_CONST |