diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-06-30 10:44:53 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-06-30 10:44:53 -0700 |
| commit | d2a6fd45c5c4a5c5fdfe6c57f74f630e61d8d9a0 (patch) | |
| tree | 608d2c19afc652e740de9deec315f6ab20029ca4 | |
| parent | cccf0c2ee52d3bd710be3a3f865df1b869a68f11 (diff) | |
| parent | 53431798f4bb60d214ae1ec4a79eefdd414f577b (diff) | |
| download | linux-d2a6fd45c5c4a5c5fdfe6c57f74f630e61d8d9a0.tar.gz linux-d2a6fd45c5c4a5c5fdfe6c57f74f630e61d8d9a0.tar.bz2 linux-d2a6fd45c5c4a5c5fdfe6c57f74f630e61d8d9a0.zip | |
Merge tag 'probes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
- fprobe: Pass return address to the fprobe entry/exit callbacks so
that the callbacks don't need to analyze pt_regs/stack to find the
function return address.
- kprobe events: cleanup usage of TPARG_FL_FENTRY and TPARG_FL_RETURN
flags so that those are not set at once.
- fprobe events:
- Add a new fprobe events for tracing arbitrary function entry and
exit as a trace event.
- Add a new tracepoint events for tracing raw tracepoint as a
trace event. This allows user to trace non user-exposed
tracepoints.
- Move eprobe's event parser code into probe event common file.
- Introduce BTF (BPF type format) support to kernel probe (kprobe,
fprobe and tracepoint probe) events so that user can specify
traced function arguments by name. This also applies the type of
argument when fetching the argument.
- Introduce '$arg*' wildcard support if BTF is available. This
expands the '$arg*' meta argument to all function argument
automatically.
- Check the return value types by BTF. If the function returns
'void', '$retval' is rejected.
- Add some selftest script for fprobe events, tracepoint events
and BTF support.
- Update documentation about the fprobe events.
- Some fixes for above features, document and selftests.
- selftests for ftrace (in addition to the new fprobe events):
- Add a test case for multiple consecutive probes in a function
which checks if ftrace based kprobe, optimized kprobe and normal
kprobe can be defined in the same target function.
- Add a test case for optimized probe, which checks whether kprobe
can be optimized or not.
* tag 'probes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Fix tracepoint event with $arg* to fetch correct argument
Documentation: Fix typo of reference file name
tracing/probes: Fix to return NULL and keep using current argc
selftests/ftrace: Add new test case which checks for optimized probes
selftests/ftrace: Add new test case which adds multiple consecutive probes in a function
Documentation: tracing/probes: Add fprobe event tracing document
selftests/ftrace: Add BTF arguments test cases
selftests/ftrace: Add tracepoint probe test case
tracing/probes: Add BTF retval type support
tracing/probes: Add $arg* meta argument for all function args
tracing/probes: Support function parameters if BTF is available
tracing/probes: Move event parameter fetching code to common parser
tracing/probes: Add tracepoint support on fprobe_events
selftests/ftrace: Add fprobe related testcases
tracing/probes: Add fprobe events for tracing function entry and exit.
tracing/probes: Avoid setting TPARG_FL_FENTRY and TPARG_FL_RETURN
fprobe: Pass return address to the handlers
31 files changed, 2476 insertions, 164 deletions
diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst new file mode 100644 index 000000000000..7297f9478459 --- /dev/null +++ b/Documentation/trace/fprobetrace.rst @@ -0,0 +1,188 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================== +Fprobe-based Event Tracing +========================== + +.. Author: Masami Hiramatsu <mhiramat@kernel.org> + +Overview +-------- + +Fprobe event is similar to the kprobe event, but limited to probe on +the function entry and exit only. It is good enough for many use cases +which only traces some specific functions. + +This document also covers tracepoint probe events (tprobe) since this +is also works only on the tracepoint entry. User can trace a part of +tracepoint argument, or the tracepoint without trace-event, which is +not exposed on tracefs. + +As same as other dynamic events, fprobe events and tracepoint probe +events are defined via `dynamic_events` interface file on tracefs. + +Synopsis of fprobe-events +------------------------- +:: + + f[:[GRP1/][EVENT1]] SYM [FETCHARGS] : Probe on function entry + f[MAXACTIVE][:[GRP1/][EVENT1]] SYM%return [FETCHARGS] : Probe on function exit + t[:[GRP2/][EVENT2]] TRACEPOINT [FETCHARGS] : Probe on tracepoint + + GRP1 : Group name for fprobe. If omitted, use "fprobes" for it. + GRP2 : Group name for tprobe. If omitted, use "tracepoints" for it. + EVENT1 : Event name for fprobe. If omitted, the event name is + "SYM__entry" or "SYM__exit". + EVENT2 : Event name for tprobe. If omitted, the event name is + the same as "TRACEPOINT", but if the "TRACEPOINT" starts + with a digit character, "_TRACEPOINT" is used. + MAXACTIVE : Maximum number of instances of the specified function that + can be probed simultaneously, or 0 for the default value + as defined in Documentation/trace/fprobe.rst + + FETCHARGS : Arguments. Each probe can have up to 128 args. + ARG : Fetch "ARG" function argument using BTF (only for function + entry or tracepoint.) (\*1) + @ADDR : Fetch memory at ADDR (ADDR should be in kernel) + @SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol) + $stackN : Fetch Nth entry of stack (N >= 0) + $stack : Fetch stack address. + $argN : Fetch the Nth function argument. (N >= 1) (\*2) + $retval : Fetch return value.(\*3) + $comm : Fetch current task comm. + +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*4)(\*5) + \IMM : Store an immediate value to the argument. + NAME=FETCHARG : Set NAME as the argument name of FETCHARG. + FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types + (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types + (x8/x16/x32/x64), "char", "string", "ustring", "symbol", "symstr" + and bitfield are supported. + + (\*1) This is available only when BTF is enabled. + (\*2) only for the probe on function entry (offs == 0). + (\*3) only for return probe. + (\*4) this is useful for fetching a field of data structures. + (\*5) "u" means user-space dereference. + +For the details of TYPE, see :ref:`kprobetrace documentation <kprobetrace_types>`. + +BTF arguments +------------- +BTF (BPF Type Format) argument allows user to trace function and tracepoint +parameters by its name instead of ``$argN``. This feature is available if the +kernel is configured with CONFIG_BPF_SYSCALL and CONFIG_DEBUG_INFO_BTF. +If user only specify the BTF argument, the event's argument name is also +automatically set by the given name. :: + + # echo 'f:myprobe vfs_read count pos' >> dynamic_events + # cat dynamic_events + f:fprobes/myprobe vfs_read count=count pos=pos + +It also chooses the fetch type from BTF information. For example, in the above +example, the ``count`` is unsigned long, and the ``pos`` is a pointer. Thus, both +are converted to 64bit unsigned long, but only ``pos`` has "%Lx" print-format as +below :: + + # cat events/fprobes/myprobe/format + name: myprobe + ID: 1313 + format: + field:unsigned short common_type; offset:0; size:2; signed:0; + field:unsigned char common_flags; offset:2; size:1; signed:0; + field:unsigned char common_preempt_count; offset:3; size:1; signed:0; + field:int common_pid; offset:4; size:4; signed:1; + + field:unsigned long __probe_ip; offset:8; size:8; signed:0; + field:u64 count; offset:16; size:8; signed:0; + field:u64 pos; offset:24; size:8; signed:0; + + print fmt: "(%lx) count=%Lu pos=0x%Lx", REC->__probe_ip, REC->count, REC->pos + +If user unsures the name of arguments, ``$arg*`` will be helpful. The ``$arg*`` +is expanded to all function arguments of the function or the tracepoint. :: + + # echo 'f:myprobe vfs_read $arg*' >> dynamic_events + # cat dynamic_events + f:fprobes/myprobe vfs_read file=file buf=buf count=count pos=pos + +BTF also affects the ``$retval``. If user doesn't set any type, the retval type is +automatically picked from the BTF. If the function returns ``void``, ``$retval`` +is rejected. + +Usage examples +-------------- +Here is an example to add fprobe events on ``vfs_read()`` function entry +and exit, with BTF arguments. +:: + + # echo 'f vfs_read $arg*' >> dynamic_events + # echo 'f vfs_read%return $retval' >> dynamic_events + # cat dynamic_events + f:fprobes/vfs_read__entry vfs_read file=file buf=buf count=count pos=pos + f:fprobes/vfs_read__exit vfs_read%return arg1=$retval + # echo 1 > events/fprobes/enable + # head -n 20 trace | tail + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION + # | | | ||||| | | + sh-70 [000] ...1. 335.883195: vfs_read__entry: (vfs_read+0x4/0x340) file=0xffff888005cf9a80 buf=0x7ffef36c6879 count=1 pos=0xffffc900005aff08 + sh-70 [000] ..... 335.883208: vfs_read__exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=1 + sh-70 [000] ...1. 335.883220: vfs_read__entry: (vfs_read+0x4/0x340) file=0xffff888005cf9a80 buf=0x7ffef36c6879 count=1 pos=0xffffc900005aff08 + sh-70 [000] ..... 335.883224: vfs_read__exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=1 + sh-70 [000] ...1. 335.883232: vfs_read__entry: (vfs_read+0x4/0x340) file=0xffff888005cf9a80 buf=0x7ffef36c687a count=1 pos=0xffffc900005aff08 + sh-70 [000] ..... 335.883237: vfs_read__exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=1 + sh-70 [000] ...1. 336.050329: vfs_read__entry: (vfs_read+0x4/0x340) file=0xffff888005cf9a80 buf=0x7ffef36c6879 count=1 pos=0xffffc900005aff08 + sh-70 [000] ..... 336.050343: vfs_read__exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=1 + +You can see all function arguments and return values are recorded as signed int. + +Also, here is an example of tracepoint events on ``sched_switch`` tracepoint. +To compare the result, this also enables the ``sched_switch`` traceevent too. +:: + + # echo 't sched_switch $arg*' >> dynamic_events + # echo 1 > events/sched/sched_switch/enable + # echo 1 > events/tracepoints/sched_switch/enable + # echo > trace + # head -n 20 trace | tail + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION + # | | | ||||| | | + sh-70 [000] d..2. 3912.083993: sched_switch: prev_comm=sh prev_pid=70 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 + sh-70 [000] d..3. 3912.083995: sched_switch: (__probestub_sched_switch+0x4/0x10) preempt=0 prev=0xffff88800664e100 next=0xffffffff828229c0 prev_state=1 + <idle>-0 [000] d..2. 3912.084183: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=16 next_prio=120 + <idle>-0 [000] d..3. 3912.084184: sched_switch: (__probestub_sched_switch+0x4/0x10) preempt=0 prev=0xffffffff828229c0 next=0xffff888004208000 prev_state=0 + rcu_preempt-16 [000] d..2. 3912.084196: sched_switch: prev_comm=rcu_preempt prev_pid=16 prev_prio=120 prev_state=I ==> next_comm=swapper/0 next_pid=0 next_prio=120 + rcu_preempt-16 [000] d..3. 3912.084196: sched_switch: (__probestub_sched_switch+0x4/0x10) preempt=0 prev=0xffff888004208000 next=0xffffffff828229c0 prev_state=1026 + <idle>-0 [000] d..2. 3912.085191: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=16 next_prio=120 + <idle>-0 [000] d..3. 3912.085191: sched_switch: (__probestub_sched_switch+0x4/0x10) preempt=0 prev=0xffffffff828229c0 next=0xffff888004208000 prev_state=0 + +As you can see, the ``sched_switch`` trace-event shows *cooked* parameters, on +the other hand, the ``sched_switch`` tracepoint probe event shows *raw* +parameters. This means you can access any field values in the task +structure pointed by the ``prev`` and ``next`` arguments. + +For example, usually ``task_struct::start_time`` is not traced, but with this +traceprobe event, you can trace it as below. +:: + + # echo 't sched_switch comm=+1896(next):string start_time=+1728(next):u64' > dynamic_events + # head -n 20 trace | tail + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION + # | | | ||||| | | + sh-70 [000] d..3. 5606.686577: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="rcu_preempt" usage=1 start_time=245000000 + rcu_preempt-16 [000] d..3. 5606.686602: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="sh" usage=1 start_time=1596095526 + sh-70 [000] d..3. 5606.686637: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0 + <idle>-0 [000] d..3. 5606.687190: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="rcu_preempt" usage=1 start_time=245000000 + rcu_preempt-16 [000] d..3. 5606.687202: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0 + <idle>-0 [000] d..3. 5606.690317: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000 + kworker/0:1-14 [000] d..3. 5606.690339: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0 + <idle>-0 [000] d..3. 5606.692368: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000 + +Currently, to find the offset of a specific field in the data structure, +you need to build kernel with debuginfo and run `perf probe` command with +`-D` option. e.g. +:: + + # perf probe -D "__probestub_sched_switch next->comm:string next->start_time" + p:probe/__probestub_sched_switch __probestub_sched_switch+0 comm=+1896(%cx):string start_time=+1728(%cx):u64 + +And replace the ``%cx`` with the ``next``. diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst index ea25a9220f92..5092d6c13af5 100644 --- a/Documentation/trace/index.rst +++ b/Documentation/trace/index.rst @@ -13,6 +13,7 @@ Linux Tracing Technologies kprobes kprobetrace uprobetracer + fprobetrace tracepoints events events-kmem diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst index 651f9ab53f3e..8a2dfee38145 100644 --- a/Documentation/trace/kprobetrace.rst +++ b/Documentation/trace/kprobetrace.rst @@ -66,6 +66,8 @@ Synopsis of kprobe_events (\*3) this is useful for fetching a field of data structures. (\*4) "u" means user-space dereference. See :ref:`user_mem_access`. +.. _kprobetrace_types: + Types ----- Several types are supported for fetchargs. Kprobe tracer will access memory diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h index 47fefc7f363b..3e03758151f4 100644 --- a/include/linux/fprobe.h +++ b/include/linux/fprobe.h @@ -35,9 +35,11 @@ struct fprobe { int nr_maxactive; int (*entry_handler)(struct fprobe *fp, unsigned long entry_ip, - struct pt_regs *regs, void *entry_data); + unsigned long ret_ip, struct pt_regs *regs, + void *entry_data); void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip, - struct pt_regs *regs, void *entry_data); + unsigned long ret_ip, struct pt_regs *regs, + void *entry_data); }; /* This fprobe is soft-disabled. */ @@ -64,6 +66,7 @@ int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num); int register_fprobe_syms(struct fprobe *fp, const char **syms, int num); int unregister_fprobe(struct fprobe *fp); +bool fprobe_is_registered(struct fprobe *fp); #else static inline int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter) { @@ -81,6 +84,10 @@ static inline int unregister_fprobe(struct fprobe *fp) { return -EOPNOTSUPP; } +static inline bool fprobe_is_registered(struct fprobe *fp) +{ + return false; +} #endif /** diff --git a/include/linux/rethook.h b/include/linux/rethook.h index c8ac1e5afcd1..fdf26cd0e742 100644 --- a/include/linux/rethook.h +++ b/include/linux/rethook.h @@ -14,7 +14,7 @@ struct rethook_node; -typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs *); +typedef void (*rethook_handler_t) (struct rethook_node *, void *, unsigned long, struct pt_regs *); /** * struct rethook - The rethook management data structure. diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index 7c4a0b72334e..3930e676436c 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -318,6 +318,7 @@ enum { TRACE_EVENT_FL_KPROBE_BIT, TRACE_EVENT_FL_UPROBE_BIT, TRACE_EVENT_FL_EPROBE_BIT, + TRACE_EVENT_FL_FPROBE_BIT, TRACE_EVENT_FL_CUSTOM_BIT, }; @@ -332,6 +333,7 @@ enum { * KPROBE - Event is a kprobe * UPROBE - Event is a uprobe * EPROBE - Event is an event probe + * FPROBE - Event is an function probe * CUSTOM - Event is a custom event (to be attached to an exsiting tracepoint) * This is set when the custom event has not been attached * to a tracepoint yet, then it is cleared when it is. @@ -346,6 +348,7 @@ enum { TRACE_EVENT_FL_KPROBE = (1 << TRACE_EVENT_FL_KPROBE_BIT), TRACE_EVENT_FL_UPROBE = (1 << TRACE_EVENT_FL_UPROBE_BIT), TRACE_EVENT_FL_EPROBE = (1 << TRACE_EVENT_FL_EPROBE_BIT), + TRACE_EVENT_FL_FPROBE = (1 << TRACE_EVENT_FL_FPROBE_BIT), TRACE_EVENT_FL_CUSTOM = (1 << TRACE_EVENT_FL_CUSTOM_BIT), }; diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h index e7c2276be33e..4dc4955f0fbf 100644 --- a/include/linux/tracepoint-defs.h +++ b/include/linux/tracepoint-defs.h @@ -35,6 +35,7 @@ struct tracepoint { struct static_call_key *static_call_key; void *static_call_tramp; void *iterator; + void *probestub; int (*regfunc)(void); void (*unregfunc)(void); struct tracepoint_func __rcu *funcs; diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 6811e43c1b5c..88c0ba623ee6 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -303,6 +303,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) __section("__tracepoints_strings") = #_name; \ extern struct static_call_key STATIC_CALL_KEY(tp_func_##_name); \ int __traceiter_##_name(void *__data, proto); \ + void __probestub_##_name(void *__data, proto); \ struct tracepoint __tracepoint_##_name __used \ __section("__tracepoints") = { \ .name = __tpstrtab_##_name, \ @@ -310,6 +311,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) .static_call_key = &STATIC_CALL_KEY(tp_func_##_name), \ .static_call_tramp = STATIC_CALL_TRAMP_ADDR(tp_func_##_name), \ .iterator = &__traceiter_##_name, \ + .probestub = &__probestub_##_name, \ .regfunc = _reg, \ .unregfunc = _unreg, \ .funcs = NULL }; \ @@ -330,6 +332,9 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) } \ return 0; \ } \ + void __probestub_##_name(void *__data, proto) \ + { \ + } \ DEFINE_STATIC_CALL(tp_func_##_name, __traceiter_##_name); #define DEFINE_TRACE(name, proto, args) \ diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 00e177de91cc..ce13f1a35251 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2127,6 +2127,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs) NOKPROBE_SYMBOL(pre_handler_kretprobe); static void kretprobe_rethook_handler(struct rethook_node *rh, void *data, + unsigned long ret_addr, struct pt_regs *regs) { struct kretprobe *rp = (struct kretprobe *)data; diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index abe5c583bd59..61c541c36596 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -665,6 +665,32 @@ config BLK_DEV_IO_TRACE If unsure, say N. +config FPROBE_EVENTS + depends on FPROBE + depends on HAVE_REGS_AND_STACK_ACCESS_API + bool "Enable fprobe-based dynamic events" + select TRACING + select PROBE_EVENTS + select DYNAMIC_EVENTS + default y + help + This allows user to add tracing events on the function entry and + exit via ftrace interface. The syntax is same as the kprobe events + and the kprobe events on function entry and exit will be + transparently converted to this fprobe events. + +config PROBE_EVENTS_BTF_ARGS + depends on HAVE_FUNCTION_ARG_ACCESS_API + depends on FPROBE_EVENTS || KPROBE_EVENTS + depends on DEBUG_INFO_BTF && BPF_SYSCALL + bool "Support BTF function arguments for probe events" + default y + help + The user can specify the arguments of the probe event using the names + of the arguments of the probed function, when the probe location is a + kernel function entry or a tracepoint. + This is available only if BTF (BPF Type Format) support is enabled. + config KPROBE_EVENTS depends on KPROBES depends on HAVE_REGS_AND_STACK_ACCESS_API diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index c6651e16b557..64b61f67a403 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -104,6 +104,7 @@ obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o obj-$(CONFIG_FPROBE) += fprobe.o obj-$(CONFIG_RETHOOK) += rethook.o +obj-$(CONFIG_FPROBE_EVENTS) += trace_fprobe.o obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o obj-$(CONFIG_RV) += rv/ diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 03b7f6b8e4f0..5f2dcabad202 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2652,7 +2652,8 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link, static int kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip, - struct pt_regs *regs, void *data) + unsigned long ret_ip, struct pt_regs *regs, + void *data) { struct bpf_kprobe_multi_link *link; @@ -2663,7 +2664,8 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip, static void kprobe_multi_link_exit_handler(struct fprobe *fp, unsigned long fentry_ip, - struct pt_regs *regs, void *data) + unsigned long ret_ip, struct pt_regs *regs, + void *data) { struct bpf_kprobe_multi_link *link; diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c index 18d36842faf5..e4704ec26df7 100644 --- a/kernel/trace/fprobe.c +++ b/kernel/trace/fprobe.c @@ -46,7 +46,7 @@ static inline void __fprobe_handler(unsigned long ip, unsigned long parent_ip, } if (fp->entry_handler) - ret = fp->entry_handler(fp, ip, ftrace_get_regs(fregs), entry_data); + ret = fp->entry_handler(fp, ip, parent_ip, ftrace_get_regs(fregs), entry_data); /* If entry_handler returns !0, nmissed is not counted. */ if (rh) { @@ -112,7 +112,7 @@ static void fprobe_kprobe_handler(unsigned long ip, unsigned long parent_ip, } static void fprobe_exit_handler(struct rethook_node *rh, void *data, - struct pt_regs *regs) + unsigned long ret_ip, struct pt_regs *regs) { struct fprobe *fp = (struct fprobe *)data; struct fprobe_rethook_node *fpr; @@ -133,7 +133,7 @@ static void fprobe_exit_handler(struct rethook_node *rh, void *data, return; } - fp->exit_handler(fp, fpr->entry_ip, regs, + fp->exit_handler(fp, fpr->entry_ip, ret_ip, regs, fp->entry_data_size ? (void *)fpr->data : NULL); ftrace_test_recursion_unlock(bit); } @@ -348,6 +348,14 @@ int register_fprobe_syms(struct fprobe *fp, const char **syms, int num) } EXPORT_SYMBOL_GPL(register_fprobe_syms); +bool fprobe_is_registered(struct fprobe *fp) +{ + if (!fp || (fp->ops.saved_func != fprobe_handler && + fp->ops.saved_func != fprobe_kprobe_handler)) + return false; + return true; +} + /** * unregister_fprobe() - Unregister fprobe from ftrace * @fp: A fprobe data structure to be unregistered. @@ -360,8 +368,7 @@ int unregister_fprobe(struct fprobe *fp) { int ret; - if (!fp || (fp->ops.saved_func != fprobe_handler && - fp->ops.saved_func != fprobe_kprobe_handler)) + if (!fprobe_is_registered(fp)) return -EINVAL; /* diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c index 60f6cb2b486b..f32ee484391a 100644 --- a/kernel/trace/rethook.c +++ b/kernel/trace/rethook.c @@ -301,7 +301,8 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs, break; handler = READ_ONCE(rhn->rethook->handler); if (handler) - handler(rhn, rhn->rethook->data, regs); + handler(rhn, rhn->rethook->data, + correct_ret_addr, regs); if (first == node) break; diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 074d0b2e19ed..b04f52e7cd28 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5672,10 +5672,17 @@ static const char readme_msg[] = " uprobe_events\t\t- Create/append/remove/show the userspace dynamic events\n" "\t\t\t Write into this file to define/undefine new trace events.\n" #endif -#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS) +#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS) || \ + defined(CONFIG_FPROBE_EVENTS) "\t accepts: event-definitions (one definition per line)\n" +#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS) "\t Format: p[:[<group>/][<event>]] <place> [<args>]\n" "\t r[maxactive][:[<group>/][<event>]] <place> [<args>]\n" +#endif +#ifdef CONFIG_FPROBE_EVENTS + "\t f[:[<group>/][<event>]] <func-name>[%return] [<args>]\n" + "\t t[:[<group>/][<event>]] <tracepoint> [<args>]\n" +#endif #ifdef CONFIG_HIST_TRIGGERS "\t s:[synthetic/]<event> <field> [<field>]\n" #endif @@ -5691,7 +5698,11 @@ static const char readme_msg[] = "\t args: <name>=fetcharg[:type]\n" "\t fetcharg: (%<register>|$<efield>), @<address>, @<symbol>[+|-<offset>],\n" #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API +#ifdef CONFIG_PROBE_EVENTS_BTF_ARGS + "\t $stack<index>, $stack, $retval, $comm, $arg<N>, <argname>\n" +#else "\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n" +#endif #else "\t $stack<index>, $stack, $retval, $comm,\n" #endif diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index e6407a27d644..ed7906b13f09 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -148,6 +148,17 @@ struct kretprobe_trace_entry_head { unsigned long ret_ip; }; +struct fentry_trace_entry_head { + struct trace_entry ent; + unsigned long ip; +}; + +struct fexit_trace_entry_head { + struct trace_entry ent; + unsigned long func; + unsigned long ret_ip; +}; + #define TRACE_BUF_SIZE 1024 struct trace_array; diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c index 67e854979d53..cb0077ba2b49 100644 --- a/kernel/trace/trace_eprobe.c +++ b/kernel/trace/trace_eprobe.c @@ -227,37 +227,6 @@ error: return ERR_PTR(ret); } -static int trace_eprobe_tp_arg_update(struct trace_eprobe *ep, int i) -{ - struct probe_arg *parg = &ep->tp.args[i]; - struct ftrace_event_field *field; - struct list_head *head; - int ret = -ENOENT; - - head = trace_get_fields(ep->event); - list_for_each_entry(field, head, link) { - if (!strcmp(parg->code->data, field->name)) { - kfree(parg->code->data); - parg->code->data = field; - return 0; - } - } - - /* - * Argument not found on event. But allow for comm and COMM - * to be used to get the current->comm. - */ - if (strcmp(parg->code->data, "COMM") == 0 || - strcmp(parg->code->data, "comm") == 0) { - parg->code->op = FETCH_OP_COMM; - ret = 0; - } - - kfree(parg->code->data); - parg->code->data = NULL; - return ret; -} - static int eprobe_event_define_fields(struct trace_event_call *event_call) { struct eprobe_trace_entry_head field; @@ -817,19 +786,16 @@ find_and_get_event(const char *system, const char *event_name) static int trace_eprobe_tp_update_arg(struct trace_eprobe *ep, const char *argv[], int i) { - unsigned int flags = TPARG_FL_KERNEL | TPARG_FL_TPOINT; + struct traceprobe_parse_context ctx = { + .event = ep->event, + .flags = TPARG_FL_KERNEL | TPARG_FL_TEVENT, + }; int ret; - ret = traceprobe_par |
