Age | Commit message (Collapse) | Author | Files | Lines |
|
Define TRACE_AUG_MAX_BUF in trace_augment.h data, which is the maximum
buffer size we can augment. BPF will include this header too.
Print buffer in a way that's different than just printing a string, we
print all the control characters in \digits (such as \0 for null, and
\10 for newline, LF).
For character that has a bigger value than 127, we print the digits
instead of the character itself as well.
Committer notes:
Simplified the buffer scnprintf to avoid using multiple buffers as
discussed in the patch review thread.
We can't really all 'buf' args to SCA_BUF as we're collecting so far
just on the sys_enter path, so we would be printing the previous 'read'
arg buffer contents, not what the kernel puts there.
So instead of:
static int syscall_fmt__cmp(const void *name, const void *fmtp)
@@ -1987,8 +1989,6 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
- else if (strstr(field->type, "char *") && strstr(field->name, "buf"))
- arg->scnprintf = SCA_BUF;
Do:
static const struct syscall_fmt syscall_fmts[] = {
+ { .name = "write", .errpid = true,
+ .arg = { [1] = { .scnprintf = SCA_BUF /* buf */, from_user = true, }, }, },
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-8-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20240824163322.60796-6-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Change the arg->augmented.args to arg->augmented.args->value to skip the
header for customized pretty printers, since we collect data in BPF
using the general augment_sys_enter(), which always adds the header.
Use btf_dump API to pretty print augmented struct pointer.
Prefer existed pretty-printer than btf general pretty-printer.
set compact = true and skip_names = true, so that no newline character
and argument name are printed.
Committer notes:
Simplified the btf_dump_snprintf callback to avoid using multiple
buffers, as discussed in the thread accessible via the Link tag below.
Also made it do:
dump_data_opts.skip_names = !arg->trace->show_arg_names;
I.e. show the type and struct field names according to that tunable, we
probably need another tunable just for this, but for now if the user
wants to see syscall names in addition to its value, it makes sense to
see the struct field names according to that tunable.
Committer testing:
The following have explicitely set beautifiers (SCA_FILENAME,
SCA_SOCKADDR and SCA_PERF_ATTR), SCA_FILENAME is here just because we
have been wiring up the "renameat2" ("renameat" until recently), so it
doesn't use the introduced generic fallback (btf_struct_scnprintf(), see
the definition of SCA_PERF_ATTR, SCA_SOCKADDR to see the more feature
rich beautifiers, that are not using BTF):
root@number:~# rm -f 987654 ; touch 123456 ; perf trace -e rename* mv 123456 987654
0.000 ( 0.039 ms): mv/258478 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0
root@number:~# perf trace -e connect,sendto ping -c 1 www.google.com
0.000 ( 0.014 ms): ping/258481 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0
0.040 ( 0.003 ms): ping/258481 sendto(fd: 5, buff: 0x55bc317a6980, len: 97, flags: DONTWAIT|NOSIGNAL) = 97
18.742 ( 0.020 ms): ping/258481 sendto(fd: 5, buff: 0x7ffc04768df0, len: 20, addr: { .family: NETLINK }, addr_len: 0xc) = 20
PING www.google.com (142.251.129.68) 56(84) bytes of data.
18.783 ( 0.012 ms): ping/258481 connect(fd: 5, uservaddr: { .family: INET6, port: 0, addr: 2800:3f0:4004:810::2004 }, addrlen: 28) = 0
18.797 ( 0.001 ms): ping/258481 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
18.800 ( 0.004 ms): ping/258481 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 142.251.129.68 }, addrlen: 16) = 0
18.815 ( 0.002 ms): ping/258481 connect(fd: 5, uservaddr: { .family: INET, port: 1025, addr: 142.251.129.68 }, addrlen: 16) = 0
18.862 ( 0.023 ms): ping/258481 sendto(fd: 3, buff: 0x55bc317a0ac0, len: 64, addr: { .family: INET, port: 0, addr: 142.251.129.68 }, addr_len: 0x10) = 64
63.330 ( 0.038 ms): ping/258481 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0
63.435 ( 0.010 ms): ping/258481 sendto(fd: 5, buff: 0x55bc317a8340, len: 110, flags: DONTWAIT|NOSIGNAL) = 110
64 bytes from rio07s07-in-f4.1e100.net (142.251.129.68): icmp_seq=1 ttl=49 time=44.2 ms
--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 44.158/44.158/44.158/0.000 ms
root@number:~# perf trace -e perf_event_open perf stat -e instructions,cache-misses,syscalls:sys_enter*sleep* sleep 1.23456789
0.000 ( 0.010 ms): :258487/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), config: 0xa00000000, disabled: 1, { bp_len, config2 }: 0x900000000, branch_sample_type: USER|COUNTERS, sample_regs_user: 0x3f1b7ffffffff, sample_stack_user: 258487, clockid: -599052088, sample_regs_intr: 0x60a000003eb, sample_max_stack: 14, sig_data: 120259084288 }, cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
0.016 ( 0.002 ms): :258487/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), config: 0x400000000, disabled: 1, { bp_len, config2 }: 0x900000000, branch_sample_type: USER|COUNTERS, sample_regs_user: 0x3f1b7ffffffff, sample_stack_user: 258487, clockid: -599044082, sample_regs_intr: 0x60a000003eb, sample_max_stack: 14, sig_data: 120259084288 }, cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
1.838 ( 0.006 ms): perf/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0xa00000001, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
1.846 ( 0.002 ms): perf/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000001, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 6
1.849 ( 0.002 ms): perf/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0xa00000003, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 7
1.851 ( 0.002 ms): perf/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000003, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 9
1.853 ( 0.600 ms): perf/258487 perf_event_open(attr_uptr: { type: 2 (tracepoint), size: 136, config: 0x190 (syscalls:sys_enter_nanosleep), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 10
2.456 ( 0.016 ms): perf/258487 perf_event_open(attr_uptr: { type: 2 (tracepoint), size: 136, config: 0x196 (syscalls:sys_enter_clock_nanosleep), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 11
Performance counter stats for 'sleep 1.23456789':
1,402,839 cpu_atom/instructions/
<not counted> cpu_core/instructions/ (0.00%)
11,066 cpu_atom/cache-misses/
<not counted> cpu_core/cache-misses/ (0.00%)
0 syscalls:sys_enter_nanosleep
1 syscalls:sys_enter_clock_nanosleep
1.236246714 seconds time elapsed
0.000000000 seconds user
0.001308000 seconds sys
root@number:~#
Now if we use it even for the ones we have a specific beautifier in
tools/perf/trace/beauty, i.e. use btf_struct_scnprintf() for all
structs, by adding the following patch:
@@ -2316,7 +2316,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
- if (default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
+ if (1 || (default_scnprintf == NULL || default_scnprintf == SCA_PTR)) {
btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
size - printed, val, field->type);
if (btf_printed) {
We get:
root@number:~# perf trace -e connect,sendto ping -c 1 www.google.com
PING www.google.com (142.251.129.68) 56(84) bytes of data.
0.000 ( 0.015 ms): ping/283259 connect(fd: 5, uservaddr: (struct sockaddr){.sa_family = (sa_family_t)1,(union){.sa_data_min = (char[14])['/','r','u','n','/','s','y','s','t','e','m','d','/','r',],},}, addrlen: 42) = 0
0.046 ( 0.004 ms): ping/283259 sendto(fd: 5, buff: 0x559b008ae980, len: 97, flags: DONTWAIT|NOSIGNAL) = 97
0.353 ( 0.012 ms): ping/283259 sendto(fd: 5, buff: 0x7ffc01294960, len: 20, addr: (struct sockaddr){.sa_family = (sa_family_t)16,}, addr_len: 0xc) = 20
0.377 ( 0.006 ms): ping/283259 connect(fd: 5, uservaddr: (struct sockaddr){.sa_family = (sa_family_t)2,}, addrlen: 16) = 0
0.388 ( 0.010 ms): ping/283259 connect(fd: 5, uservaddr: (struct sockaddr){.sa_family = (sa_family_t)10,}, addrlen: 28) = 0
0.402 ( 0.001 ms): ping/283259 connect(fd: 5, uservaddr: (struct sockaddr){.sa_family = (sa_family_t)2,(union){.sa_data_min = (char[14])[4,1,142,251,129,'D',],},}, addrlen: 16) = 0
0.425 ( 0.045 ms): ping/283259 sendto(fd: 3, buff: 0x559b008a8ac0, len: 64, addr: (struct sockaddr){.sa_family = (sa_family_t)2,}, addr_len: 0x10) = 64
64 bytes from rio07s07-in-f4.1e100.net (142.251.129.68): icmp_seq=1 ttl=49 time=44.1 ms
--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 44.113/44.113/44.113/0.000 ms
44.849 ( 0.038 ms): ping/283259 connect(fd: 5, uservaddr: (struct sockaddr){.sa_family = (sa_family_t)1,(union){.sa_data_min = (char[14])['/','r','u','n','/','s','y','s','t','e','m','d','/','r',],},}, addrlen: 42) = 0
44.927 ( 0.006 ms): ping/283259 sendto(fd: 5, buff: 0x559b008b03d0, len: 110, flags: DONTWAIT|NOSIGNAL) = 110
root@number:~#
Which looks sane, i.e.:
18.800 ( 0.004 ms): ping/258481 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 142.251.129.68 }, addrlen: 16) = 0
Becomes:
0.402 ( 0.001 ms): ping/283259 connect(fd: 5, uservaddr: (struct sockaddr){.sa_family = (sa_family_t)2,(union){.sa_data_min = (char[14])[4,1,142,251,129,'D',],},}, addrlen: 16) = 0
And.
#define AF_UNIX 1 /* Unix domain sockets */
#define AF_LOCAL 1 /* POSIX name for AF_UNIX */
#define AF_INET 2 /* Internet IP Protocol */
<SNIP>
#define AF_INET6 10 /* IP version 6 */
And 'D' == 68, so the preexisting sockaddr BPF collector is working with
the new generic BTF pretty printer (btf_struct_scnprintf()), its just
that it doesn't know about 'struct sockaddr' besides what is in BTF,
i.e. its an array of bytes, not an IPv4 address that needs extra
massaging.
Ditto for the 'struct perf_event_attr' case:
1.851 ( 0.002 ms): perf/258487 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000003, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258488 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 9
Becomes:
2.081 ( 0.002 ms): :283304/283304 perf_event_open(attr_uptr: (struct perf_event_attr){.size = (__u32)136,.config = (__u64)17179869187,.sample_type = (__u64)65536,.read_format = (__u64)3,.disabled = (__u64)0x1,.inherit = (__u64)0x1,.enable_on_exec = (__u64)0x1,.exclude_guest = (__u64)0x1,}, pid: 283305 (sleep), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 9
hex(17179869187) = 0x400000003, etc.
read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING is
enum perf_event_read_format {
PERF_FORMAT_TOTAL_TIME_ENABLED = 1U << 0,
PERF_FORMAT_TOTAL_TIME_RUNNING = 1U << 1,
and so on.
We need to work with the libbpf btf dump api to get one output that
matches the 'perf trace'/strace expectations/format, but having this in
this current form is already an improvement to 'perf trace', so lets
improve from what we have.
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-7-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20240824163322.60796-5-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
data in BPF
Set up beauty_map, load it to BPF, in such format: if argument No.3 is a
struct of size 32 bytes (of syscall number 114) beauty_map[114][2] = 32;
if argument No.3 is a string (of syscall number 114) beauty_map[114][2] =
1;
if argument No.3 is a buffer, its size is indicated by argument No.4 (of
syscall number 114) beauty_map[114][2] = -4; /* -1 ~ -6, we'll read this
buffer size in BPF */
Committer notes:
Moved syscall_arg_fmt__cache_btf_struct() from a ifdef
HAVE_LIBBPF_SUPPORT to closer to where it is used, that is ifdef'ed on
HAVE_BPF_SKEL and thus breaks the build when building with
BUILD_BPF_SKEL=0, as detected using 'make -C tools/perf build-test'.
Also add 'struct beauty_map_enter' to tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
as we're using it in this patch, otherwise we get this while trying to
build at this point in the original patch series:
builtin-trace.c: In function ‘trace__init_syscalls_bpf_prog_array_maps’:
builtin-trace.c:3725:58: error: ‘struct <anonymous>’ has no member named ‘beauty_map_enter’
3725 | int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
|
We also have to take into account syscall_arg_fmt.from_user when telling
the kernel what to copy in the sys_enter generic collector, we don't
want to collect bogus data in buffers that will only be available to us
at sys_exit time, i.e. after the kernel has filled it, so leave this for
when we have such a sys_exit based collector.
Committer testing:
Not wired up yet, so all continues to work, using the existing BPF
collector and userspace beautifiers that are augmentation aware:
root@number:~# rm -f 987654 ; touch 123456 ; perf trace -e rename* mv 123456 987654
0.000 ( 0.031 ms): mv/20888 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0
root@number:~# perf trace -e connect,sendto ping -c 1 www.google.com
0.000 ( 0.014 ms): ping/20892 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0
0.040 ( 0.003 ms): ping/20892 sendto(fd: 5, buff: 0x560b4ff17980, len: 97, flags: DONTWAIT|NOSIGNAL) = 97
0.480 ( 0.017 ms): ping/20892 sendto(fd: 5, buff: 0x7ffd82d07150, len: 20, addr: { .family: NETLINK }, addr_len: 0xc) = 20
0.526 ( 0.014 ms): ping/20892 connect(fd: 5, uservaddr: { .family: INET6, port: 0, addr: 2800:3f0:4004:810::2004 }, addrlen: 28) = 0
0.542 ( 0.002 ms): ping/20892 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
0.544 ( 0.004 ms): ping/20892 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 142.251.135.100 }, addrlen: 16) = 0
0.559 ( 0.002 ms): ping/20892 connect(fd: 5, uservaddr: { .family: INET, port: 1025, addr: 142.251.135.100 }, addrlen: 16PING www.google.com (142.251.135.100) 56(84) bytes of data.
) = 0
0.589 ( 0.058 ms): ping/20892 sendto(fd: 3, buff: 0x560b4ff11ac0, len: 64, addr: { .family: INET, port: 0, addr: 142.251.135.100 }, addr_len: 0x10) = 64
45.250 ( 0.029 ms): ping/20892 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0
45.344 ( 0.012 ms): ping/20892 sendto(fd: 5, buff: 0x560b4ff19340, len: 111, flags: DONTWAIT|NOSIGNAL) = 111
64 bytes from rio09s08-in-f4.1e100.net (142.251.135.100): icmp_seq=1 ttl=49 time=44.4 ms
--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 44.361/44.361/44.361/0.000 ms
root@number:~#
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-4-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20240824163322.60796-3-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This one has no specific pretty printer right now, so will be handled by
the generic BTF based one later in this patch series.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Paving the way for the generic BPF BTF based syscall arg augmenter.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Paving the way for the generic BPF BTF based syscall arg augmenter.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Paving the way for the generic BPF BTF based syscall arg augmenter.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need to know where to collect it in the BPF augmenters, if in the
sys_enter hook or in the sys_exit hook.
Start with the SCA_FILENAME one, that is just from user to kernel space.
The alternative, better, but takes a bit more time than I have now, is
to use the __user information that is already in the syscall args and
encoded in BTF via a tag, do it later.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
+ payload
We were using a more compact format, without explicitely encoding the
size and possible error in the payload for an argument.
To do it generically, at least as Howard Chu did in his GSoC activities,
it is more convenient to use the same model that was being used for
string arguments, passing { size, error, payload }.
So use that for the non string syscall args we have so far:
struct timespec
struct perf_event_attr
struct sockaddr (this one has even a variable size)
With this in place we have the userspace pretty printers:
perf_event_attr___scnprintf()
syscall_arg__scnprintf_augmented_sockaddr()
syscall_arg__scnprintf_augmented_timespec()
Ready to have the generic BPF collector in tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
sending its generic payload and thus we'll use them instead of a generic
libbpf btf_dump interface that doesn't know about about the sockaddr
mux, perf_event_attr non-trivial fields (sample_type, etc), leaving it
as a (useful) fallback that prints just basic types until we put in
place a more sophisticated pretty printer infrastructure that associates
synthesized enums to struct fields using the header scrapers we have in
tools/perf/trace/beauty/, some of them in this list:
$ ls tools/perf/trace/beauty/*.sh
tools/perf/trace/beauty/arch_errno_names.sh
tools/perf/trace/beauty/kcmp_type.sh
tools/perf/trace/beauty/perf_ioctl.sh
tools/perf/trace/beauty/statx_mask.sh
tools/perf/trace/beauty/clone.sh
tools/perf/trace/beauty/kvm_ioctl.sh
tools/perf/trace/beauty/pkey_alloc_access_rights.sh
tools/perf/trace/beauty/sync_file_range.sh
tools/perf/trace/beauty/drm_ioctl.sh
tools/perf/trace/beauty/madvise_behavior.sh
tools/perf/trace/beauty/prctl_option.sh
tools/perf/trace/beauty/usbdevfs_ioctl.sh
tools/perf/trace/beauty/fadvise.sh
tools/perf/trace/beauty/mmap_flags.sh
tools/perf/trace/beauty/rename_flags.sh
tools/perf/trace/beauty/vhost_virtio_ioctl.sh
tools/perf/trace/beauty/fs_at_flags.sh
tools/perf/trace/beauty/mmap_prot.sh
tools/perf/trace/beauty/sndrv_ctl_ioctl.sh
tools/perf/trace/beauty/x86_arch_prctl.sh
tools/perf/trace/beauty/fsconfig.sh
tools/perf/trace/beauty/mount_flags.sh
tools/perf/trace/beauty/sndrv_pcm_ioctl.sh
tools/perf/trace/beauty/fsmount.sh
tools/perf/trace/beauty/move_mount_flags.sh
tools/perf/trace/beauty/sockaddr.sh
tools/perf/trace/beauty/fspick.sh
tools/perf/trace/beauty/mremap_flags.sh
tools/perf/trace/beauty/socket.sh
$
Testing it:
root@number:~# rm -f 987654 ; touch 123456 ; perf trace -e rename* mv 123456 987654
0.000 ( 0.031 ms): mv/1193096 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0
root@number:~# perf trace -e *nanosleep sleep 1.2345678901
0.000 (1234.654 ms): sleep/1192697 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567891 }, rmtp: 0x7ffe1ea80460) = 0
root@number:~# perf trace -e perf_event_open* perf stat -e cpu-clock sleep 1
0.000 ( 0.011 ms): perf/1192701 perf_event_open(attr_uptr: { type: 1 (software), size: 136, config: 0 (PERF_COUNT_SW_CPU_CLOCK), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 1192702 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
Performance counter stats for 'sleep 1':
0.51 msec cpu-clock # 0.001 CPUs utilized
1.001242090 seconds time elapsed
0.000000000 seconds user
0.001010000 seconds sys
root@number:~# perf trace -e connect* ping -c 1 bsky.app
0.000 ( 0.130 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: LOCAL, path: /run/systemd/resolve/io.systemd.Resolve }, addrlen: 42) = 0
23.907 ( 0.006 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.20.108.158 }, addrlen: 16) = 0
23.915 PING bsky.app (3.20.108.158) 56(84) bytes of data.
( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.917 ( 0.002 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.12.170.30 }, addrlen: 16) = 0
23.921 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.923 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 18.217.70.179 }, addrlen: 16) = 0
23.925 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.927 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.132.20.46 }, addrlen: 16) = 0
23.930 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.931 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.142.89.165 }, addrlen: 16) = 0
23.934 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.935 ( 0.002 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 18.119.147.159 }, addrlen: 16) = 0
23.938 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.940 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.22.38.164 }, addrlen: 16) = 0
23.942 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: UNSPEC }, addrlen: 16) = 0
23.944 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 0, addr: 3.13.14.133 }, addrlen: 16) = 0
23.956 ( 0.001 ms): ping/1192740 connect(fd: 5, uservaddr: { .family: INET, port: 1025, addr: 3.20.108.158 }, addrlen: 16) = 0
^C
--- bsky.app ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
root@number:~#
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fW4=2GoP6foAN6qbrCiUzy0a_TzHbd8rvDsakTPfdzvfg@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
temporarily
While trying to shape Howard Chu's generic BPF augmenter transition into
the codebase I got stuck with the renameat2 syscall.
Until I noticed that the attempt at reusing augmenters were making it
use the 'openat' syscall augmenter, that collect just one string syscall
arg, for the 'renameat2' syscall, that takes two strings.
So, for the moment, just to help in this transition period, since
'renameat2' is what is used these days in the 'mv' utility, just make
the BPF collector be associated with the more widely used syscall,
hopefully the transition to Howard's generic BPF augmenter will cure
this, so get this out of the way for now!
So now we still have that odd "reuse", but for something we're not
testing so won't get in the way anymore:
root@number:~# rm -f 987654 ; touch 123456 ; perf trace -vv -e rename* mv 123456 987654 |& grep renameat
Reusing "openat" BPF sys_enter augmenter for "renameat"
0.000 ( 0.079 ms): mv/1158612 renameat2(olddfd: CWD, oldname: "123456", newdfd: CWD, newname: "987654", flags: NOREPLACE) = 0
root@number:~#
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fXjGYs=tpBgETK-P9U-CuXssytk9pSnTXpfphrmmOydWA@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A segmentation fault can be triggered when running
'perf mem record -e ldlat-loads'
The commit 35b38a71c92fa033 ("perf mem: Rework command option handling")
moves the OPT_CALLBACK of event from __cmd_record() to cmd_mem().
When invoking the __cmd_record(), the 'mem' has been referenced (&).
So the &mem passed into the parse_record_events() is a double reference
(&&) of the original struct perf_mem mem.
But in the cmd_mem(), the &mem is the single reference (&) of the
original struct perf_mem mem.
Fixes: 35b38a71c92fa033 ("perf mem: Rework command option handling")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240905170737.4070743-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The p-core mem events are missed when launching 'perf mem record' on ADL
and RPL.
root@number:~# perf mem record sleep 1
Memory events are enabled on a subset of CPUs: 16-27
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.032 MB perf.data ]
root@number:~# perf evlist
cpu_atom/mem-loads,ldlat=30/P
cpu_atom/mem-stores/P
dummy:u
A variable 'record' in the 'struct perf_mem_event' is to indicate
whether a mem event in a mem_events[] should be recorded. The current
code only configure the variable for the first eligible PMU.
It's good enough for a non-hybrid machine or a hybrid machine which has
the same mem_events[].
However, if a different mem_events[] is used for different PMUs on a
hybrid machine, e.g., ADL or RPL, the 'record' for the second PMU never
get a chance to be set.
The mem_events[] of the second PMU are always ignored.
'perf mem' doesn't support the per-PMU configuration now. A per-PMU
mem_events[] 'record' variable doesn't make sense. Make it global.
That could also avoid searching for the per-PMU mem_events[] via
perf_pmu__mem_events_ptr every time.
Committer testing:
root@number:~# perf evlist -g
cpu_atom/mem-loads,ldlat=30/P
cpu_atom/mem-stores/P
{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
cpu_core/mem-stores/P
dummy:u
root@number:~#
The :S for '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}' is
not being added by 'perf evlist -g', to be checked.
Fixes: abbdd79b786e036e ("perf mem: Clean up perf_mem_events__name()")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Closes: https://lore.kernel.org/lkml/Zthu81fA3kLC2CS2@x1/
Link: https://lore.kernel.org/r/20240905170737.4070743-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The current perf_pmu__mem_events_init() only checks the availability of
the mem_events for the first eligible PMU. It works for non-hybrid
machines and hybrid machines that have the same mem_events.
However, it may bring issues if a hybrid machine has a different
mem_events on different PMU, e.g., Alder Lake and Raptor Lake. A
mem-loads-aux event is only required for the p-core. The mem_events on
both e-core and p-core should be checked and marked.
The issue was not found, because it's hidden by another bug, which only
records the mem-events for the e-core. The wrong check for the p-core
events didn't yell.
Fixes: abbdd79b786e036e ("perf mem: Clean up perf_mem_events__name()")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240905170737.4070743-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Running a script that processes PEBS records gives buffer overflows
in valgrind.
The problem is that the allocation of the register string doesn't
include the terminating 0 byte. Fix this.
I also replaced the very magic "28" with a more reasonable larger buffer
that should fit all registers. There's no need to conserve memory here.
==2106591== Memcheck, a memory error detector
==2106591== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==2106591== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==2106591== Command: ../perf script -i tcall.data gcov.py tcall.gcov
==2106591==
==2106591== Invalid write of size 1
==2106591== at 0x713354: regs_map (trace-event-python.c:748)
==2106591== by 0x7134EB: set_regs_in_dict (trace-event-python.c:784)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
==2106591== Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
==2106591== at 0x484280F: malloc (vg_replace_malloc.c:442)
==2106591== by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
==2106591==
==2106591== Invalid read of size 1
==2106591== at 0x484B6C6: strlen (vg_replace_strmem.c:502)
==2106591== by 0x555D494: PyUnicode_FromString (unicodeobject.c:1899)
==2106591== by 0x7134F7: set_regs_in_dict (trace-event-python.c:786)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
==2106591== at 0x484280F: malloc (vg_replace_malloc.c:442)
==2106591== by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
==2106591==
==2106591== Invalid write of size 1
==2106591== at 0x713354: regs_map (trace-event-python.c:748)
==2106591== by 0x713539: set_regs_in_dict (trace-event-python.c:789)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
==2106591== Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
==2106591== at 0x484280F: malloc (vg_replace_malloc.c:442)
==2106591== by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
==2106591==
==2106591== Invalid read of size 1
==2106591== at 0x484B6C6: strlen (vg_replace_strmem.c:502)
==2106591== by 0x555D494: PyUnicode_FromString (unicodeobject.c:1899)
==2106591== by 0x713545: set_regs_in_dict (trace-event-python.c:791)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== Address 0x7186fe0 is 0 bytes after a block of size 0 alloc'd
==2106591== at 0x484280F: malloc (vg_replace_malloc.c:442)
==2106591== by 0x7134AD: set_regs_in_dict (trace-event-python.c:780)
==2106591== by 0x713E58: get_perf_sample_dict (trace-event-python.c:940)
==2106591== by 0x716327: python_process_general_event (trace-event-python.c:1499)
==2106591== by 0x7164E1: python_process_event (trace-event-python.c:1531)
==2106591== by 0x44F9AF: process_sample_event (builtin-script.c:2549)
==2106591== by 0x6294DC: evlist__deliver_sample (session.c:1534)
==2106591== by 0x6296D0: machines__deliver_event (session.c:1573)
==2106591== by 0x629C39: perf_session__deliver_event (session.c:1655)
==2106591== by 0x625830: ordered_events__deliver_event (session.c:193)
==2106591== by 0x630B23: do_flush (ordered-events.c:245)
==2106591== by 0x630E7A: __ordered_events__flush (ordered-events.c:324)
==2106591==
73056 total, 29 ignored
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240905151058.2127122-2-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Existing sys directories aren't placed under a model directory like
skylake.
Placing a sys directory there causes the `is_leaf_dir` test to fail and
consequently no events or metrics are generated for the model.
Ignore sys directories in this case and update the comments to
reflect why.
This change has no affect, but when testing with a sys directory for a
model people have reported running into the no event/metric issue.
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240904211705.915101-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix two inconsistencies in feature names as discussed in [1]:
1. Rename "dwarf-unwind-support" to "dwarf-unwind"
2. 'get_cpuid' feature and 'HAVE_AUXTRACE_SUPPORT' names don't
look related, change the feature name to 'auxtrace' to match the
macro name, as 'get_cpuid' string is not used anywhere to check the
feature presence
[1]: https://lore.kernel.org/linux-perf-users/ZoRw5we4HLSTZND6@x1/
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-7-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In probe_vfs_getname.sh, current we use "perf record --dry-run"
to check for libtraceevent and skip the test if perf is not
build with libtraceevent. Change the check to use "perf check feature"
option
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-6-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we use output of 'perf version --build-options', to check
whether perf was built with libtraceevent support.
Instead, use 'perf check feature libtraceevent' to check for
libtraceevent support.
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-5-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that the feature list has been duplicated in a global
'supported_features' array, use that array instead of manually checking
status of built-in features.
This helps in being consistent with commands such as 'perf check feature',
so commands can use the same array, and any new feature can be added at
one place, in the 'supported_features' array
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-4-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When restricting jevents generated json lookup code with JEVENTS_MODEL
a list of models must be provided. Some builds don't know model names
but know cpuids. Add a command that can convert a cpuid to a model
using mapfile.csv files. This can be used with JEVENTS_MODEL like:
$ make JEVENTS_MODEL=`./pmu-events/models.py x86 'GenuineIntel-6-8D-1,AuthenticAMD-26-1' pmu-events/arch/`
Committer testing:
$ tools/perf/pmu-events/models.py x86 'GenuineIntel-6-8D-1,AuthenticAMD-26-1' tools/perf/pmu-events/arch/
tigerlake,amdzen5
$ perf stat -v sleep 1 |& head -1
Using CPUID GenuineIntel-6-B7-1
$ tools/perf/pmu-events/models.py x86 'GenuineIntel-6-B7-1' tools/perf/pmu-events/arch/
alderlake
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240904044351.712080-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently the presence of a feature is checked with a combination of
perf version --build-options and greps, such as:
perf version --build-options | grep " on .* HAVE_FEATURE"
Instead of this, introduce a subcommand "perf check feature", with which
scripts can test for presence of a feature, such as:
perf check feature HAVE_FEATURE
'perf check feature' command is expected to have exit status of 0 if
feature is built-in, and 1 if it's not built-in or if feature is not known.
Multiple features can also be passed as a comma-separated list, in which
case the exit status will be 1 only if all of the passed features are
built-in. For example, with below command, it will have exit status of 0
only if both libtraceevent and bpf are enabled, else 1 in all other cases
perf check feature libtraceevent,bpf
The arguments are case-insensitive.
An array 'supported_features' has also been introduced that can be used by
other commands like 'perf version --build-options', so that new features
can be added in one place, with the array
Committer testing:
$ perf check feature libtraceevent,bpf
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check feature libtraceevent
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
$ perf check feature bpf
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check -q feature bpf && echo "BPF support is present"
BPF support is present
$ perf check -q feature Bogus && echo "Bogus support is present"
$
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904061836.55873-3-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, commands which depend on 'parse_options_subcommand()' don't
show the usage string, and instead show '(null)'
$ ./perf sched
Usage: (null)
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
'parse_options_subcommand()' is generally expected to initialise the usage
string, with information in the passed 'subcommands[]' array
This behaviour was changed in:
230a7a71f92212e7 ("libsubcmd: Fix parse-options memory leak")
Where the generated usage string is deallocated, and usage[0] string is
reassigned as NULL.
As discussed in [1], free the allocated usage string in the main
function itself, and don't reset usage string to NULL in
parse_options_subcommand
With this change, the behaviour is restored.
$ ./perf sched
Usage: perf sched [<options>] {record|latency|map|replay|script|timehist}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
[1]: https://lore.kernel.org/linux-perf-users/htq5vhx6piet4nuq2mmhk7fs2bhfykv52dbppwxmo3s7du2odf@styd27tioc6e/
Fixes: 230a7a71f92212e7 ("libsubcmd: Fix parse-options memory leak")
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904061836.55873-2-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On arm64 the breakpoint length should be 4-bytes but 8-bytes is
tolerat |