// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libelf.h>
#include <gelf.h>
#include <unistd.h>
#include <linux/ptrace.h>
#include <linux/kernel.h>
/* s8 will be marked as poison, but it is also the name of a RISC-V register */
#if defined(__riscv)
#define rv_s8 s8
#endif
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_common.h"
#include "libbpf_internal.h"
#include "hashmap.h"
/* libbpf's USDT support consists of BPF-side state/code and user-space
* state/code working in concert. BPF-side parts are defined in
* usdt.bpf.h header library. User-space state is encapsulated by struct
* usdt_manager and all the supporting code centered around usdt_manager.
*
* usdt.bpf.h defines two BPF maps that usdt_manager expects: USDT spec map
* and IP-to-spec-ID map, which is an auxiliary map necessary for kernels that
* don't support BPF cookie (see below). These two maps are implicitly
* embedded into user's end BPF object file when user's code includes
* usdt.bpf.h. This means that libbpf doesn't do anything special to create
* these USDT support maps. They are created by normal libbpf logic of
* instantiating BPF maps when opening and loading BPF object.
*
* As such, libbpf is basically unaware of the need to do anything
* USDT-related until the very first call to bpf_program__attach_usdt(), which
* can be called by user explicitly or happen automatically during skeleton
* attach (or, equivalently, through generic bpf_program__attach() call). At
* this point, libbpf will instantiate and initialize struct usdt_manager and
* store it in bpf_object. USDT manager is a per-BPF object construct, as each
* independent BPF object might or might not have USDT programs, and thus all
* the expected USDT-related state. There is no coordination between two
* bpf_objects when it comes to USDT attachment: they are oblivious of each
* other's existence, and libbpf only ever deals with bpf_object-specific
* USDT state.
*
* Quick crash course on USDTs.
*
* From user-space application's point of view, USDT is essentially just
* a slightly special function call that normally has zero overhead, unless it
* is being traced by some external entity (e.g., a BPF-based tool). Here's how
* a typical application can trigger USDT probe:
*
* #include <sys/sdt.h> // provided by systemtap-sdt-devel package
* // folly also provides similar functionality in folly/tracing/StaticTracepoint.h
*
* STAP_PROBE3(my_usdt_provider, my_usdt_probe_name, 123, x, &y);
*
* USDT is identified by its <provider-name>:<probe-name> pair of names. Each
* individual USDT has a fixed number of arguments (3 in the above example)
* and specifies values of each argument as if it was a function call.
*
* USDT call is actually not a function call, but is instead replaced by
* a single NOP instruction (thus zero overhead, effectively). But in addition
* to that, those USDT macros generate special SHT_NOTE ELF records in
* .note.stapsdt ELF section. Here's an example USDT definition as emitted by
* `readelf -n <binary>`:
*
* stapsdt 0x00000089 NT_STAPSDT (SystemTap probe descriptors)
* Provider: test
* Name: usdt12
* Location: 0x0000000000549df3, Base: 0x00000000008effa4, Semaphore: 0x0000000000a4606e
* Arguments: -4@-1204(%rbp) -4@%edi -8@-1216(%rbp) -8@%r8 -4@$5 -8@%r9 8@%rdx 8@%r10 -4@$-9 -2@%cx -2@%ax -1@%sil
*
* In this case we have USDT test:usdt12 with 12 arguments.
*
* Location and base are offsets used to calculate absolute IP address of that
* NOP instruction that kernel can replace with an interrupt instruction to
* trigger instrumentation code (BPF program for all that we care about).
*
* Semaphore above is an optional feature. It records an address of a 2-byte
* refcount variable (normally in '.probes' ELF section) used for signaling if
* there is anything that is attached to USDT. This is useful for user
* applications if, for example, they need to prepare some arguments that are
* passed only to USDTs and preparation is expensive. By checking if USDT is
* "activated", an application can avoid paying those costs unnecessarily.
* Recent enough kernels have built-in support for automatically managing this
* refcount, which libbpf expects and relies on. If USDT is defined without
* associated semaphore, this value will be zero. See selftests for semaphore
* examples.
*
* Arguments is the most interesting part. This USDT specification string is
* providing information about all the USDT arguments and their locations. The
* part before @ sign defines the byte size of the argument (1, 2, 4, or 8) and
* whether the argument is signed or unsigned (negative size means signed).
* The part after @ sign is assembly-like definition of argument location
* (see [0] for more details). Technically, assembler can provide some pretty
* advanced definitions, but libbpf currently supports the three most common
* cases:
* 1) immediate constant, see 5th and 9th args above (-4@$5 and -4@$-9);
* 2) register value, e.g., 8@%rdx, which means "unsigned 8-byte integer
* whose value is in register %rdx";
* 3) memory dereference addressed by register, e.g., -4@-1204(%rbp), which
* specifies signed 32-bit integer stored at offset -1204 bytes from
* memory address stored in %rbp.
*
* [0] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
*
* During attachment, libbpf parses all the relevant USDT specifications and
* prepares `struct usdt_spec` (USDT spec), which is then provided to BPF-side
* code through spec map. This allows BPF applications to quickly fetch the
* actual value at runtime using simple BPF-side code.
*
* With basics out of the way, let's go over less immediately obvious aspects
* of supporting USDTs.
*
* First, there is no special USDT BPF program type. It is actually just
* a uprobe BPF program (which for kernel, at least currently, is just a kprobe
* program, so BPF_PROG_TYPE_KPROBE program type). The only difference is
* that uprobe is usually attached at the function entry, while USDT will
* normally be somewhere inside the function. But it should always be
* pointing to NOP instruction, which makes such uprobes the fastest uprobe
* kind.
*
* Second, it's important to realize that such STAP_PROBEn(provider, name, ...)
* macro invocations can end up being inlined many-many times, depending on
* specifics of each individual user application. So a single conceptual USDT
* (identified by provider:name pair of identifiers) is, generally speaking,
* multiple uprobe locations (USDT call sites) in different places in user
* application. Further, again due to inlining, each USDT call site might end
* up having the same argument #N be located in a different place. In one call
* site it could be a constant, in another it could end up in a register, and
* in yet another in some other register or even somewhere on the stack.
*
* As such, "attaching to USDT" means (in the general case) attaching the same
* uprobe BPF program to multiple target locations in user application, each
* potentially having a completely different USDT spec associated with it.
* To wire all this up together libbpf allocates a unique integer spec ID for
* each unique USDT spec. Spec IDs are allocated as sequential small integers
* so that they can be used as keys in an array BPF map (for performance
* reasons). Spec ID allocation and accounting is a big part of what
* usdt_manager is about. This state has to be maintained per-BPF object and
* coordinated between different USDT attachments within the same
|