summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--Documentation/bpf/btf.rst6
-rw-r--r--Documentation/bpf/index.rst1
-rw-r--r--Documentation/bpf/kfuncs.rst170
-rw-r--r--Documentation/bpf/map_hash.rst185
-rw-r--r--arch/arm64/include/asm/insn.h3
-rw-r--r--arch/arm64/lib/insn.c30
-rw-r--r--arch/arm64/net/bpf_jit.h7
-rw-r--r--arch/arm64/net/bpf_jit_comp.c715
-rw-r--r--arch/x86/net/bpf_jit_comp.c58
-rw-r--r--include/linux/bpf.h29
-rw-r--r--include/linux/bpf_verifier.h8
-rw-r--r--include/linux/btf.h65
-rw-r--r--include/linux/btf_ids.h68
-rw-r--r--include/linux/filter.h8
-rw-r--r--include/linux/ftrace.h43
-rw-r--r--include/linux/skbuff.h8
-rw-r--r--include/net/netfilter/nf_conntrack_core.h19
-rw-r--r--include/net/xdp_sock_drv.h14
-rw-r--r--include/uapi/linux/bpf.h3
-rw-r--r--kernel/bpf/arraymap.c40
-rw-r--r--kernel/bpf/bpf_lsm.c8
-rw-r--r--kernel/bpf/bpf_struct_ops.c3
-rw-r--r--kernel/bpf/btf.c126
-rw-r--r--kernel/bpf/core.c100
-rw-r--r--kernel/bpf/devmap.c2
-rw-r--r--kernel/bpf/hashtab.c6
-rw-r--r--kernel/bpf/local_storage.c2
-rw-r--r--kernel/bpf/lpm_trie.c2
-rw-r--r--kernel/bpf/preload/iterators/Makefile10
-rw-r--r--kernel/bpf/syscall.c36
-rw-r--r--kernel/bpf/trampoline.c163
-rw-r--r--kernel/bpf/verifier.c89
-rw-r--r--kernel/kallsyms.c91
-rw-r--r--kernel/trace/ftrace.c328
-rw-r--r--net/bpf/test_run.c78
-rw-r--r--net/core/dev.c1
-rw-r--r--net/core/filter.c4
-rw-r--r--net/core/skmsg.c4
-rw-r--r--net/ipv4/bpf_tcp_ca.c18
-rw-r--r--net/ipv4/tcp_bbr.c24
-rw-r--r--net/ipv4/tcp_cubic.c20
-rw-r--r--net/ipv4/tcp_dctcp.c20
-rw-r--r--net/netfilter/nf_conntrack_bpf.c365
-rw-r--r--net/netfilter/nf_conntrack_core.c62
-rw-r--r--net/netfilter/nf_conntrack_netlink.c54
-rw-r--r--net/xdp/xsk.c5
-rw-r--r--samples/bpf/Makefile10
-rw-r--r--samples/bpf/fds_example.c3
-rw-r--r--samples/bpf/sock_example.c3
-rw-r--r--samples/bpf/test_cgrp2_attach.c3
-rw-r--r--samples/bpf/test_lru_dist.c2
-rw-r--r--samples/bpf/test_map_in_map_user.c4
-rw-r--r--samples/bpf/tracex5_user.c3
-rw-r--r--samples/bpf/xdp_redirect_map.bpf.c6
-rw-r--r--samples/bpf/xdp_redirect_map_user.c9
-rwxr-xr-xscripts/bpf_doc.py22
-rw-r--r--tools/bpf/resolve_btfids/main.c40
-rw-r--r--tools/bpf/runqslower/Makefile7
-rw-r--r--tools/include/uapi/linux/bpf.h3
-rw-r--r--tools/lib/bpf/bpf_tracing.h51
-rw-r--r--tools/lib/bpf/btf_dump.c2
-rw-r--r--tools/lib/bpf/gen_loader.c2
-rw-r--r--tools/lib/bpf/libbpf.c390
-rw-r--r--tools/lib/bpf/libbpf.h62
-rw-r--r--tools/lib/bpf/libbpf.map2
-rw-r--r--tools/lib/bpf/libbpf_internal.h8
-rw-r--r--tools/lib/bpf/usdt.bpf.h16
-rw-r--r--tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c10
-rw-r--r--tools/testing/selftests/bpf/prog_tests/bpf_iter.c16
-rw-r--r--tools/testing/selftests/bpf/prog_tests/bpf_nf.c64
-rw-r--r--tools/testing/selftests/bpf/prog_tests/btf.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/core_extern.c17
-rw-r--r--tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c2
-rw-r--r--tools/testing/selftests/bpf/prog_tests/ringbuf_multi.c11
-rw-r--r--tools/testing/selftests/bpf/prog_tests/skeleton.c2
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_iter.h7
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_iter_ksym.c74
-rw-r--r--tools/testing/selftests/bpf/progs/bpf_syscall_macro.c6
-rw-r--r--tools/testing/selftests/bpf/progs/test_attach_probe.c15
-rw-r--r--tools/testing/selftests/bpf/progs/test_bpf_nf.c85
-rw-r--r--tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c134
-rw-r--r--tools/testing/selftests/bpf/progs/test_core_extern.c3
-rw-r--r--tools/testing/selftests/bpf/progs/test_probe_user.c27
-rw-r--r--tools/testing/selftests/bpf/progs/test_skeleton.c4
-rw-r--r--tools/testing/selftests/bpf/progs/test_xdp_noinline.c30
-rwxr-xr-xtools/testing/selftests/bpf/test_xdp_veth.sh6
-rw-r--r--tools/testing/selftests/bpf/verifier/bpf_loop_inline.c1
-rw-r--r--tools/testing/selftests/bpf/verifier/calls.c53
88 files changed, 3458 insertions, 860 deletions
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst
index f49aeef62d0c..cf8722f96090 100644
--- a/Documentation/bpf/btf.rst
+++ b/Documentation/bpf/btf.rst
@@ -369,7 +369,8 @@ No additional type data follow ``btf_type``.
* ``name_off``: offset to a valid C identifier
* ``info.kind_flag``: 0
* ``info.kind``: BTF_KIND_FUNC
- * ``info.vlen``: 0
+ * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
+ or BTF_FUNC_EXTERN)
* ``type``: a BTF_KIND_FUNC_PROTO type
No additional type data follow ``btf_type``.
@@ -380,6 +381,9 @@ type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).
+Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
+supported in the kernel.
+
2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index 96056a7447c7..1bc2c5c58bdb 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -19,6 +19,7 @@ that goes into great technical depth about the BPF Architecture.
faq
syscall_api
helpers
+ kfuncs
programs
maps
bpf_prog_run
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
new file mode 100644
index 000000000000..c0b7dae6dbf5
--- /dev/null
+++ b/Documentation/bpf/kfuncs.rst
@@ -0,0 +1,170 @@
+=============================
+BPF Kernel Functions (kfuncs)
+=============================
+
+1. Introduction
+===============
+
+BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux
+kernel which are exposed for use by BPF programs. Unlike normal BPF helpers,
+kfuncs do not have a stable interface and can change from one kernel release to
+another. Hence, BPF programs need to be updated in response to changes in the
+kernel.
+
+2. Defining a kfunc
+===================
+
+There are two ways to expose a kernel function to BPF programs, either make an
+existing function in the kernel visible, or add a new wrapper for BPF. In both
+cases, care must be taken that BPF program can only call such function in a
+valid context. To enforce this, visibility of a kfunc can be per program type.
+
+If you are not creating a BPF wrapper for existing kernel function, skip ahead
+to :ref:`BPF_kfunc_nodef`.
+
+2.1 Creating a wrapper kfunc
+----------------------------
+
+When defining a wrapper kfunc, the wrapper function should have extern linkage.
+This prevents the compiler from optimizing away dead code, as this wrapper kfunc
+is not invoked anywhere in the kernel itself. It is not necessary to provide a
+prototype in a header for the wrapper kfunc.
+
+An example is given below::
+
+ /* Disables missing prototype warnings */
+ __diag_push();
+ __diag_ignore_all("-Wmissing-prototypes",
+ "Global kfuncs as their definitions will be in BTF");
+
+ struct task_struct *bpf_find_get_task_by_vpid(pid_t nr)
+ {
+ return find_get_task_by_vpid(nr);
+ }
+
+ __diag_pop();
+
+A wrapper kfunc is often needed when we need to annotate parameters of the
+kfunc. Otherwise one may directly make the kfunc visible to the BPF program by
+registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.
+
+2.2 Annotating kfunc parameters
+-------------------------------
+
+Similar to BPF helpers, there is sometime need for additional context required
+by the verifier to make the usage of kernel functions safer and more useful.
+Hence, we can annotate a parameter by suffixing the name of the argument of the
+kfunc with a __tag, where tag may be one of the supported annotations.
+
+2.2.1 __sz Annotation
+---------------------
+
+This annotation is used to indicate a memory and size pair in the argument list.
+An example is given below::
+
+ void bpf_memzero(void *mem, int mem__sz)
+ {
+ ...
+ }
+
+Here, the verifier will treat first argument as a PTR_TO_MEM, and second
+argument as its size. By default, without __sz annotation, the size of the type
+of the pointer is used. Without __sz annotation, a kfunc cannot accept a void
+pointer.
+
+.. _BPF_kfunc_nodef:
+
+2.3 Using an existing kernel function
+-------------------------------------
+
+When an existing function in the kernel is fit for consumption by BPF programs,
+it can be directly registered with the BPF subsystem. However, care must still
+be taken to review the context in which it will be invoked by the BPF program
+and whether it is safe to do so.
+
+2.4 Annotating kfuncs
+---------------------
+
+In addition to kfuncs' arguments, verifier may need more information about the
+type of kfunc(s) being registered with the BPF subsystem. To do so, we define
+flags on a set of kfuncs as follows::
+
+ BTF_SET8_START(bpf_task_set)
+ BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
+ BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
+ BTF_SET8_END(bpf_task_set)
+
+This set encodes the BTF ID of each kfunc listed above, and encodes the flags
+along with it. Ofcourse, it is also allowed to specify no flags.
+
+2.4.1 KF_ACQUIRE flag
+---------------------
+
+The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
+refcounted object. The verifier will then ensure that the pointer to the object
+is eventually released using a release kfunc, or transferred to a map using a
+referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier fails the
+loading of the BPF program until no lingering references remain in all possible
+explored states of the program.
+
+2.4.2 KF_RET_NULL flag
+----------------------
+
+The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
+may be NULL. Hence, it forces the user to do a NULL check on the pointer
+returned from the kfunc before making use of it (dereferencing or passing to
+another helper). This flag is often used in pairing with KF_ACQUIRE flag, but
+both are orthogonal to each other.
+
+2.4.3 KF_RELEASE flag
+---------------------
+
+The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
+passed in to it. There can be only one referenced pointer that can be passed in.
+All copies of the pointer being released are invalidated as a result of invoking
+kfunc with this flag.
+
+2.4.4 KF_KPTR_GET flag
+----------------------
+
+The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument
+as a pointer to kptr, safely increments the refcount of the object it points to,
+and returns a reference to the user. The rest of the arguments may be normal
+arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with
+KF_ACQUIRE and KF_RET_NULL flags.
+
+2.4.5 KF_TRUSTED_ARGS flag
+--------------------------
+
+The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
+indicates that the all pointer arguments will always be refcounted, and have
+their offset set to 0. It can be used to enforce that a pointer to a refcounted
+object acquired from a kfunc or BPF helper is passed as an argument to this
+kfunc without any modifications (e.g. pointer arithmetic) such that it is
+trusted and points to the original object. This flag is often used for kfuncs
+that operate (change some property, perform some operation) on an object that
+was obtained using an acquire kfunc. Such kfuncs need an unchanged pointer to
+ensure the integrity of the operation being performed on the expected object.
+
+2.5 Registering the kfuncs
+--------------------------
+
+Once the kfunc is prepared for use, the final step to making it visible is
+registering it with the BPF subsystem. Registration is done per BPF program
+type. An example is shown below::
+
+ BTF_SET8_START(bpf_task_set)
+ BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
+ BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
+ BTF_SET8_END(bpf_task_set)
+
+ static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
+ .owner = THIS_MODULE,
+ .set = &bpf_task_set,
+ };
+
+ static int init_subsystem(void)
+ {
+ return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
+ }
+ late_initcall(init_subsystem);
diff --git a/Documentation/bpf/map_hash.rst b/Documentation/bpf/map_hash.rst
new file mode 100644
index 000000000000..e85120878b27
--- /dev/null
+++ b/Documentation/bpf/map_hash.rst
@@ -0,0 +1,185 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+.. Copyright (C) 2022 Red Hat, Inc.
+
+===============================================
+BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants
+===============================================
+
+.. note::
+ - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19
+ - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6
+ - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
+ were introduced in version 4.10
+
+``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general
+purpose hash map storage. Both the key and the value can be structs,
+allowing for composite keys and values.
+
+The kernel is responsible for allocating and freeing key/value pairs, up
+to the max_entries limit that you specify. Hash maps use pre-allocation
+of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be
+used to disable pre-allocation when it is too memory expensive.
+
+``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per
+CPU. The per-cpu values are stored internally in an array.
+
+The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
+variants add LRU semantics to their respective hash tables. An LRU hash
+will automatically evict the least recently used entries when the hash
+table reaches capacity. An LRU hash maintains an internal LRU list that
+is used to select elements for eviction. This internal LRU list is
+shared across CPUs but it is possible to request a per CPU LRU list with
+the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``.
+
+Usage
+=====
+
+.. c:function::
+ long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)
+
+Hash entries can be added or updated using the ``bpf_map_update_elem()``
+helper. This helper replaces existing elements atomically. The ``flags``
+parameter can be used to control the update behaviour:
+
+- ``BPF_ANY`` will create a new element or update an existing element
+- ``BPF_NOEXIST`` will create a new element only if one did not already
+ exist
+- ``BPF_EXIST`` will update an existing element
+
+``bpf_map_update_elem()`` returns 0 on success, or negative error in
+case of failure.
+
+.. c:function::
+ void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
+
+Hash entries can be retrieved using the ``bpf_map_lookup_elem()``
+helper. This helper returns a pointer to the value associated with
+``key``, or ``NULL`` if no entry was found.
+
+.. c:function::
+ long bpf_map_delete_elem(struct bpf_map *map, const void *key)
+
+Hash entries can be deleted using the ``bpf_map_delete_elem()``
+helper. This helper will return 0 on success, or negative error in case
+of failure.
+
+Per CPU Hashes
+--------------
+
+For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
+the