Age | Commit message (Collapse) | Author | Files | Lines |
|
The __die_find_member_offset_cb() missed to handle bitfield members
which don't have DW_AT_data_member_location. Like in adding member
types in __add_member_cb() it should fallback to check the bit offset
when it resolves the member type for an offset.
Fixes: 437683a9941891c1 ("perf dwarf-aux: Handle type transfer for memory access")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sometimes it's useful to organize member fields in cache-line boundary.
The 'typecln' sort key is short for type-cacheline and to show samples
in each cacheline. The cacheline size is fixed to 64 for now, but it
can read the actual size once it saves the value from sysfs.
For example, you maybe want to which cacheline in a target is hot or
cold. The following shows members in the cfs_rq's first cache line.
$ perf report -s type,typecln,typeoff -H
...
- 2.67% struct cfs_rq
+ 1.23% struct cfs_rq: cache-line 2
+ 0.57% struct cfs_rq: cache-line 4
+ 0.46% struct cfs_rq: cache-line 6
- 0.41% struct cfs_rq: cache-line 0
0.39% struct cfs_rq +0x14 (h_nr_running)
0.02% struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
...
Committer testing:
# root@number:~# perf report -s type,typecln,typeoff -H --stdio
# Total Lost Samples: 0
#
# Samples: 5K of event 'cpu_atom/mem-loads,ldlat=5/P'
# Event count (approx.): 312251
#
# Overhead Data Type / Data Type Cacheline / Data Type Offset
# .............. ..................................................
#
<SNIP>
0.07% struct sigaction
0.05% struct sigaction: cache-line 1
0.02% struct sigaction +0x58 (sa_mask)
0.02% struct sigaction +0x78 (sa_mask)
0.03% struct sigaction: cache-line 0
0.02% struct sigaction +0x38 (sa_mask)
0.01% struct sigaction +0x8 (sa_mask)
<SNIP>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819233603.54941-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It'd be better to have them in hex to check cacheline alignment.
Percent offset size field
100.00 0 0x1c0 struct cfs_rq {
0.00 0 0x10 struct load_weight load {
0.00 0 0x8 long unsigned int weight;
0.00 0x8 0x4 u32 inv_weight;
};
0.00 0x10 0x4 unsigned int nr_running;
14.56 0x14 0x4 unsigned int h_nr_running;
0.00 0x18 0x4 unsigned int idle_nr_running;
0.00 0x1c 0x4 unsigned int idle_h_nr_running;
...
Committer notes:
Justification from Namhyung when asked about why it would be "better":
Cache line sizes are power of 2 so it'd be natural to use hex and
check whether an offset is in the same boundary. Also 'perf annotate'
shows instruction offsets in hex.
>
> Maybe this should be selectable?
I can add an option and/or a config if you want.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819233603.54941-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The check that map is NULL is already done in the bpf_map__fd(map) and
returns an errno, which does not run further checks.
In addition, even if the check for map is run, the return is a pointer,
which is not consistent with the err_number returned by bpf_map__fd(map).
Signed-off-by: Yang Ruibin <11162571@vivo.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: opensource.kernel@vivo.com
Link: https://lore.kernel.org/r/20240821101500.4568-1-11162571@vivo.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In check_matching_type(), it checks the type state of the register in a
wrong order. When it's the percpu pointer, it should check the type for
the pointer, but it checks the CFA bit first and thought it has no type
in the stack slot. This resulted in no type info.
-----------------------------------------------------------
find data type for 0x28(reg1) at hrtimer_reprogram+0x88
CU for kernel/time/hrtimer.c (die:0x18f219f)
frame base: cfa=1 fbreg=7
...
add [72] percpu 0x24500 -> reg1 pointer type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
bb: [7a - 7e]
bb: [80 - 86] (here)
bb: [88 - 88] vvv
chk [88] reg1 offset=0x28 ok=1 kind=4 cfa : no type information
no type information
Here, instruction at 0x72 found reg1 has a (percpu) pointer and got the
correct type. But when it checks the final result, it wrongly thought
it was stack variable because it checks the cfa bit first.
After changing the order of state check:
-----------------------------------------------------------
find data type for 0x28(reg1) at hrtimer_reprogram+0x88
CU for kernel/time/hrtimer.c (die:0x18f219f)
frame base: cfa=1 fbreg=7
... (here)
vvvvvvvvvv
chk [88] reg1 offset=0x28 ok=1 kind=4 percpu ptr : Good!
found by insn track: 0x28(reg1) type-offset=0x28
final type: type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sometimes a compound type can have a single field and the size is the
same as the base type. But it's still preferred as struct or union
could carry more information than the base type.
Also put a slight priority on the typedef for the same reason.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I found it missed to copy the immediate constant when it moves the
register value. This could result in a wrong type inference since the
address for the per-cpu variable would be 0 always.
Fixes: eb9190afaed6afd5 ("perf annotate-data: Handle ADD instructions")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove dependence on libcap. libcap is only used to query whether a
capability is supported, which is just 1 capget system call.
If the capget system call fails, fall back on root permission
checking. Previously if libcap fails then the permission is assumed
not present which may be pessimistic/wrong.
Add a used_root out argument to perf_cap__capable to say whether the
fall back root check was used. This allows the correct error message,
"root" vs "users with the CAP_PERFMON or CAP_SYS_ADMIN capability", to
be selected.
Tidy uses of perf_cap__capable so that tests aren't repeated if capget
isn't supported.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240806220614.831914-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The bitfield members might not have DW_AT_data_member_location. Let's
use DW_AT_data_bit_offset to set the member offset correct. Also use
DW_AT_bit_size for the name like in a C program.
Before:
Annotate type: 'struct sk_buff' (1 samples)
Percent Offset Size Field
- 100.00 0 232 struct sk_buff {
+ 0.00 0 24 union ;
+ 0.00 24 8 union ;
+ 0.00 32 8 union ;
0.00 40 48 char[] cb;
+ 0.00 88 16 union ;
0.00 104 8 long unsigned int _nfct;
100.00 112 4 unsigned int len;
0.00 116 4 unsigned int data_len;
0.00 120 2 __u16 mac_len;
0.00 122 2 __u16 hdr_len;
0.00 124 2 __u16 queue_mapping;
0.00 126 0 __u8[] __cloned_offset;
0.00 0 1 __u8 cloned;
0.00 0 1 __u8 nohdr;
0.00 0 1 __u8 fclone;
0.00 0 1 __u8 peeked;
0.00 0 1 __u8 head_frag;
0.00 0 1 __u8 pfmemalloc;
0.00 0 1 __u8 pp_recycle;
0.00 127 1 __u8 active_extensions;
+ 0.00 128 60 union ;
0.00 188 4 sk_buff_data_t tail;
0.00 192 4 sk_buff_data_t end;
0.00 200 8 unsigned char* head;
After:
Annotate type: 'struct sk_buff' (1 samples)
Percent Offset Size Field
- 100.00 0 232 struct sk_buff {
+ 0.00 0 24 union ;
+ 0.00 24 8 union ;
+ 0.00 32 8 union ;
0.00 40 48 char[] cb
+ 0.00 88 16 union ;
0.00 104 8 long unsigned int _nfct;
100.00 112 4 unsigned int len;
0.00 116 4 unsigned int data_len;
0.00 120 2 __u16 mac_len;
0.00 122 2 __u16 hdr_len;
0.00 124 2 __u16 queue_mapping;
0.00 126 0 __u8[] __cloned_offset;
0.00 126 1 __u8 cloned:1;
0.00 126 1 __u8 nohdr:1;
0.00 126 1 __u8 fclone:2;
0.00 126 1 __u8 peeked:1;
0.00 126 1 __u8 head_frag:1;
0.00 126 1 __u8 pfmemalloc:1;
0.00 126 1 __u8 pp_recycle:1;
0.00 127 1 __u8 active_extensions;
+ 0.00 128 60 union ;
0.00 188 4 sk_buff_data_t tail;
0.00 192 4 sk_buff_data_t end;
0.00 200 8 unsigned char* head;
Commiter notes:
Collect some data:
root@number:~# perf mem record -a --ldlat 5 -- ping -s 8193 -f 192.168.86.1
Memory events are enabled on a subset of CPUs: 16-27
PING 192.168.86.1 (192.168.86.1) 8193(8221) bytes of data.
.^C
--- 192.168.86.1 ping statistics ---
13881 packets transmitted, 13880 received, 0.00720409% packet loss, time 8664ms
rtt min/avg/max/mdev = 0.510/0.599/7.768/0.115 ms, ipg/ewma 0.624/0.593 ms
[ perf record: Woken up 8 times to write data ]
[ perf record: Captured and wrote 14.877 MB perf.data (46785 samples) ]
root@number:~#
root@number:~# perf evlist
cpu_atom/mem-loads,ldlat=5/P
cpu_atom/mem-stores/P
dummy:u
root@number:~# perf evlist -v
cpu_atom/mem-loads,ldlat=5/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x7
cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
root@number:~#
Ok, now lets see what changes from before this patch to after it:
root@number:~# perf annotate --data-type > /tmp/before
Apply the patch, build:
root@number:~# perf annotate --data-type > /tmp/after
The first hunk of the diff, for a glib data structure, in userspace,
look at those bitfields:
root@number:~# diff -u10 /tmp/before /tmp/after | head -20
--- /tmp/before 2024-08-20 17:29:58.306765780 -0300
+++ /tmp/after 2024-08-20 17:33:13.210582596 -0300
@@ -163,22 +163,22 @@
Annotate type: 'GHashTable' in /usr/lib64/libglib-2.0.so.0.8000.3 (1 samples):
============================================================================
Percent offset size field
100.00 0 96 GHashTable {
0.00 0 8 gsize size;
0.00 8 4 gint mod;
100.00 12 4 guint mask;
0.00 16 4 guint nnodes;
0.00 20 4 guint noccupied;
- 0.00 0 4 guint have_big_keys;
- 0.00 0 4 guint have_big_values;
+ 0.00 24 1 guint have_big_keys:1;
+ 0.00 24 1 guint have_big_values:1;
0.00 32 8 gpointer keys;
0.00 40 8 guint* hashes;
0.00 48 8 gpointer values;
root@number:~#
As advertised :-)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240815223823.2402285-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The previous attempt fixed the build on debian:experimental-x-mipsel,
but when building on a larger set of containers I noticed it broke the
build on some other 32-bit architectures such as:
42 7.87 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
builtin-daemon.c: In function 'cmd_session_list':
builtin-daemon.c:692:16: error: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type 'long int' [-Werror=format=]
fprintf(out, "%c%" PRIu64,
^~~~~
builtin-daemon.c:694:13:
csv_sep, (curr - daemon->start) / 60);
~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from builtin-daemon.c:3:0:
/usr/arm-linux-gnueabihf/include/inttypes.h:105:34: note: format string is defined here
# define PRIu64 __PRI64_PREFIX "u"
So lets cast that time_t (32-bit/64-bit) to uint64_t to make sure it
builds everywhere.
Fixes: 4bbe6002931954bb ("perf daemon: Fix the build on 32-bit architectures")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZsPmldtJ0D9Cua9_@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add it to the record.sh shell test to verify if it tracks cgroup
information correctly. It records with --all-cgroups option can check
if it has PERF_RECORD_CGROUP and the names are not "unknown".
$ sudo ./perf test -vv 95
95: perf record tests:
--- start ---
test child forked, pid 2871922
169c90-169cd0 g test_loop
perf does have symbol 'test_loop'
Basic --per-thread mode test
Basic --per-thread mode test [Success]
Register capture test
Register capture test [Success]
Basic --system-wide mode test
Basic --system-wide mode test [Success]
Basic target workload test
Basic target workload test [Success]
Branch counter test
branch counter feature not supported on all core PMUs (/sys/bus/event_source/devices/cpu) [Skipped]
Cgroup sampling test
Cgroup sampling test [Success]
---- end(0) ----
95: perf record tests : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240818212948.2873156-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The recent change in 'struct perf_tool' constification broke the cgroup
and/or namespace tracking by resetting tool fields. It should set the
values after perf_tool__init().
Fixes: cecb1cf154b301c6 ("perf record: Use perf_tool__init()")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240818212948.2873156-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The handling of mmap and mmap2 events is near identical. Add a common
helper function and call that by the two event handling functions.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There are repipe, build ID and JIT dump variants of the mmap and mmap2
repipe functions. The organization doesn't allow JIT dump to work with
build ID injection and the structure is less than clear. Combine the
function and enable the different behaviors based on ifs.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It is clearer to have a single enum that determines how build ids are
injected, it also allows for future extension.
Set the header build ID feature whether lazy or all are generated,
previously only the lazy case would set it.
Allow parsing of known build IDs for either the lazy or all cases.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Test recording of call-graphs and injecting --build-all. Add/expand
trap handler.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allows evsel__id_hdr_size() to be used when the evsel is const.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The passed dso_id is copied and so is never an out argument. Remove
its mutability.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make it clearer the argument is just being used as a string.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
map__init() is only used internally so make it static. Assume memory is
zero initialized, which will better support adding fields to struct
map in the future and was already the case for map__new2.
To reduce complexity, change set_priv and set_erange_warned to not take
a value to assign as they always assign true.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make sure the memset of a synthesized event only zeros the necessary
tracing data part of the event, as a full event can be over 4kb in
size.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 32-bit arm build system will complain:
tools/perf/util/python.c:75:28: error: field ‘sample’ has incomplete type
75 | struct perf_sample sample;
However, arm64 build system doesn't complain this.
The root cause is arm64 define "HAVE_KVM_STAT_SUPPORT := 1" in
tools/perf/arch/arm64/Makefile, but arm arch doesn't define this. This
will lead to kvm-stat.h include other header files on arm64 build
system, especially "util/sample.h" for util/python.c.
This will try to directly include "util/sample.h" for "util/python.c" to
avoid such build issue on arm platform.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: imx@lists.linux.dev
Link: https://lore.kernel.org/r/20240819023403.201324-1-xu.yang_2@nxp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
After trying all possibilities with DWARF and instruction tracking.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sometimes it matches a variable in the inner scope but it fails because
the actual access can be on a different type. Let's try variables in
every scope and choose the best one using is_better_type().
I have an example with update_blocked_averages(), at first it found a
variable (__mptr) but it's a void pointer. So it moved on to the upper
scope and found another variable (cfs_rq).
$ perf --debug type-profile annotate --data-type --stdio
...
-----------------------------------------------------------
find data type for 0x140(reg14) at update_blocked_averages+0x2db
CU for kernel/sched/fair.c (die:0x12dd892)
frame base: cfa=1 fbreg=7
found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer
variable location: base=reg14, offset=0x140
type='void*' size=0x8 (die:0x12dd8f9)
found "cfs_rq" (die: 0x1301721) in scope=3/4 (die: 0x130171c) type_offset=0x140
variable location: reg14
type='struct cfs_rq' size=0x1c0 (die:0x12e37e5)
final type: type='struct cfs_rq' size=0x1c0 (die:0x12e37e5)
IIUC the scope is like below:
1: update_blocked_averages
2: __update_blocked_fair
3: for_each_leaf_cfs_rq_safe
4: list_entry -> (container_of)
The container_of is implemented like:
#define container_of(ptr, type, member) ({ \
void *__mptr = (void *)(ptr); \
static_assert(__same_type(*(ptr), ((type *)0)->member) || \
__same_type(*(ptr), void), \
"pointer type mismatch in container_of()"); \
((type *)(__mptr - offsetof(type, member))); })
That's why we see the __mptr variable first but it failed since it has
no type information.
Then for_each_leaf_cfs_rq_safe() is defined as
#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \
list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list, \
leaf_cfs_rq_list)
Note that the access was 0x140(r14). And the cfs_rq has
leaf_cfs_rq_list at the 0x140. So it converts the list_head pointer to
a pointer to struct cfs_rq here.
$ pahole --hex -C cfs_rq vmlinux | grep 140
struct cfs_rq struct list_head leaf_cfs_rq_list; /* 0x140 0x10 */
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sometimes more than one variables are located in the same register or a
stack slot. Or it can overwrite existing information with others. I
found this is not helpful in some cases so it needs to update the type
information from the variable only if it's better.
But it's hard to know which one is better, so we needs heuristics. :)
As it deals with memory accesses, the location should have a pointer or
something similar (like array or reference). So if it had an integer
type and a variable is a pointer, we can take the variable's type to
resolve the target of the access.
If it has a pointer type and a variable with the same location has a
different pointer type, it'll take one with bigger target type. This
can be useful when the target type embeds a smaller type (like list
header or RB-tree node) at the beginning so their location is same.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It treats pointers and arrays in the same way. Let's add the helper and
use it when it checks if it needs a pointer.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that it can return enum variable_match_type to be propagated to the
find_data_type_die(). Also update the debug message to show the result
of the check_matching_type().
chk [dd] reg0 offset=0 ok=1 kind=1 : Good!
or
chk [177] reg4 offset=0x138 ok=0 kind=0 cfa : no type information
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that it can show a proper debug message in the right place. The
check_variable() is used in other places which don't want to print the
message.
$ perf --debug type-profile annotate --data-type
Before:
-----------------------------------------------------------
find data type for 0x140(reg14) at update_blocked_averages+0x2db
CU for kernel/sched/fair.c (die:0x12dd892)
frame base: cfa=1 fbreg=7
no pointer or no type <<<--- removed
check variable "__mptr" failed (die: 0x13022f1)
variable location: base=reg14, offset=0x140
type='void*' size=0x8 (die:0x12dd8f9)
After:
-----------------------------------------------------------
find data type for 0x140(reg14) at update_blocked_averages+0x2db
CU for kernel/sched/fair.c (die:0x12dd892)
frame base: cfa=1 fbreg=7
found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer <<<--- here
variable location: base=reg14, offset=0x140
type='void*' size=0x8 (die:0x12dd8f9)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
And let check_variable() return the enum value so that callers can know
what was the problem. This will be used by the later patch to update
the statistics correctly and print the error message in a right place.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The location list will have entries with half-open addressing like
[start, end) which means it doesn't include the end address. So it
should skip entries at the end address and match to the next entry.
An example location list looks like this (from readelf -wo):
00237876 ffffffff8110d32b (base address)
0023787f v000000000000000 v000000000000002 views at 00237868 for:
ffffffff8110d32b ffffffff8110d4eb (DW_OP_reg3 (rbx)) <<<--- 1
00237885 v000000000000002 v000000000000000 views at 0023786a for:
ffffffff8110d4eb ffffffff8110d50b (DW_OP_reg14 (r14)) <<<--- 2
0023788c v000000000000000 v000000000000001 views at 0023786c for:
ffffffff8110d50b ffffffff8110d7c4 (DW_OP_reg3 (rbx))
00237893 v000000000000000 v000000000000000 views at 0023786e for:
ffffffff8110d806 ffffffff8110d854 (DW_OP_reg3 (rbx))
0023789a v000000000000000 v000000000000000 views at 00237870 for:
ffffffff8110d876 ffffffff8110d88e (DW_OP_reg3 (rbx))
The first entry at 0023787f has [8110d32b, 8110d4eb) (omitting the
ffffffff at the beginning), and the second one has [8110d4eb, 8110d50b).
Fixes: 2bc3cf575a162a2c ("perf annotate-data: Improve debug message with location info")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It missed to call check_allowed_ops() in __die_collect_vars_cb() so it
can take variables with complex location expression incorrectly.
For example, I found some variable has this expression.
015d8df8 ffffffff81aacfb3 (base address)
015d8e01 v000000000000004 v000000000000000 views at 015d8df2 for:
ffffffff81aacfb3 ffffffff81aacfd2 (DW_OP_fbreg: -176; DW_OP_deref;
DW_OP_plus_uconst: 332; DW_OP_deref_size: 4;
DW_OP_lit1; DW_OP_shra; DW_OP_const1u: 64;
DW_OP_minus; DW_OP_stack_value)
015d8e14 v000000000000000 v000000000000000 views at 015d8df4 for:
ffffffff81aacfd2 ffffffff81aacfd7 (DW_OP_reg3 (rbx))
015d8e19 v000000000000000 v000000000000000 views at 015d8df6 for:
ffffffff81aacfd7 ffffffff81aad020 (DW_OP_fbreg: -176; DW_OP_deref;
DW_OP_plus_uconst: 332; DW_OP_deref_size: 4;
DW_OP_lit1; DW_OP_shra; DW_OP_const1u: 64;
DW_OP_minus; DW_OP_stack_value)
015d8e2c <End of list>
It looks like '((int *)(-176(%rbp) + 332) >> 1) - 64' but the current
code thought it's just -176(%rbp) and processed the variable incorrectly.
It should reject such a complex expression if check_allowed_ops()
doesn't like it. :)
Fixes: 932dcc2c39aedf54 ("perf dwarf-aux: Add die_collect_vars()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-2-namhyung@ke |