summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--Documentation/userspace-api/check_exec.rst144
-rw-r--r--Documentation/userspace-api/index.rst1
-rw-r--r--fs/exec.c20
-rw-r--r--include/linux/binfmts.h7
-rw-r--r--include/uapi/linux/audit.h1
-rw-r--r--include/uapi/linux/fcntl.h4
-rw-r--r--include/uapi/linux/securebits.h24
-rw-r--r--samples/Kconfig9
-rw-r--r--samples/Makefile1
-rw-r--r--samples/check-exec/.gitignore2
-rw-r--r--samples/check-exec/Makefile15
-rw-r--r--samples/check-exec/inc.c205
-rwxr-xr-xsamples/check-exec/run-script-ask.inc9
-rwxr-xr-xsamples/check-exec/script-ask.inc5
-rwxr-xr-xsamples/check-exec/script-exec.inc4
-rw-r--r--samples/check-exec/script-noexec.inc4
-rw-r--r--samples/check-exec/set-exec.c85
-rw-r--r--security/commoncap.c29
-rw-r--r--security/integrity/ima/ima_appraise.c27
-rw-r--r--security/integrity/ima/ima_main.c29
-rw-r--r--security/security.c10
-rw-r--r--tools/testing/selftests/exec/.gitignore4
-rw-r--r--tools/testing/selftests/exec/Makefile19
-rwxr-xr-xtools/testing/selftests/exec/check-exec-tests.sh205
-rw-r--r--tools/testing/selftests/exec/check-exec.c456
-rw-r--r--tools/testing/selftests/exec/config2
-rw-r--r--tools/testing/selftests/exec/false.c5
-rw-r--r--tools/testing/selftests/kselftest/ktap_helpers.sh2
-rw-r--r--tools/testing/selftests/landlock/fs_test.c27
29 files changed, 1341 insertions, 14 deletions
diff --git a/Documentation/userspace-api/check_exec.rst b/Documentation/userspace-api/check_exec.rst
new file mode 100644
index 000000000000..05dfe3b56f71
--- /dev/null
+++ b/Documentation/userspace-api/check_exec.rst
@@ -0,0 +1,144 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright © 2024 Microsoft Corporation
+
+===================
+Executability check
+===================
+
+The ``AT_EXECVE_CHECK`` :manpage:`execveat(2)` flag, and the
+``SECBIT_EXEC_RESTRICT_FILE`` and ``SECBIT_EXEC_DENY_INTERACTIVE`` securebits
+are intended for script interpreters and dynamic linkers to enforce a
+consistent execution security policy handled by the kernel. See the
+`samples/check-exec/inc.c`_ example.
+
+Whether an interpreter should check these securebits or not depends on the
+security risk of running malicious scripts with respect to the execution
+environment, and whether the kernel can check if a script is trustworthy or
+not. For instance, Python scripts running on a server can use arbitrary
+syscalls and access arbitrary files. Such interpreters should then be
+enlighten to use these securebits and let users define their security policy.
+However, a JavaScript engine running in a web browser should already be
+sandboxed and then should not be able to harm the user's environment.
+
+Script interpreters or dynamic linkers built for tailored execution environments
+(e.g. hardened Linux distributions or hermetic container images) could use
+``AT_EXECVE_CHECK`` without checking the related securebits if backward
+compatibility is handled by something else (e.g. atomic update ensuring that
+all legitimate libraries are allowed to be executed). It is then recommended
+for script interpreters and dynamic linkers to check the securebits at run time
+by default, but also to provide the ability for custom builds to behave like if
+``SECBIT_EXEC_RESTRICT_FILE`` or ``SECBIT_EXEC_DENY_INTERACTIVE`` were always
+set to 1 (i.e. always enforce restrictions).
+
+AT_EXECVE_CHECK
+===============
+
+Passing the ``AT_EXECVE_CHECK`` flag to :manpage:`execveat(2)` only performs a
+check on a regular file and returns 0 if execution of this file would be
+allowed, ignoring the file format and then the related interpreter dependencies
+(e.g. ELF libraries, script's shebang).
+
+Programs should always perform this check to apply kernel-level checks against
+files that are not directly executed by the kernel but passed to a user space
+interpreter instead. All files that contain executable code, from the point of
+view of the interpreter, should be checked. However the result of this check
+should only be enforced according to ``SECBIT_EXEC_RESTRICT_FILE`` or
+``SECBIT_EXEC_DENY_INTERACTIVE.``.
+
+The main purpose of this flag is to improve the security and consistency of an
+execution environment to ensure that direct file execution (e.g.
+``./script.sh``) and indirect file execution (e.g. ``sh script.sh``) lead to
+the same result. For instance, this can be used to check if a file is
+trustworthy according to the caller's environment.
+
+In a secure environment, libraries and any executable dependencies should also
+be checked. For instance, dynamic linking should make sure that all libraries
+are allowed for execution to avoid trivial bypass (e.g. using ``LD_PRELOAD``).
+For such secure execution environment to make sense, only trusted code should
+be executable, which also requires integrity guarantees.
+
+To avoid race conditions leading to time-of-check to time-of-use issues,
+``AT_EXECVE_CHECK`` should be used with ``AT_EMPTY_PATH`` to check against a
+file descriptor instead of a path.
+
+SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE
+==========================================================
+
+When ``SECBIT_EXEC_RESTRICT_FILE`` is set, a process should only interpret or
+execute a file if a call to :manpage:`execveat(2)` with the related file
+descriptor and the ``AT_EXECVE_CHECK`` flag succeed.
+
+This secure bit may be set by user session managers, service managers,
+container runtimes, sandboxer tools... Except for test environments, the
+related ``SECBIT_EXEC_RESTRICT_FILE_LOCKED`` bit should also be set.
+
+Programs should only enforce consistent restrictions according to the
+securebits but without relying on any other user-controlled configuration.
+Indeed, the use case for these securebits is to only trust executable code
+vetted by the system configuration (through the kernel), so we should be
+careful to not let untrusted users control this configuration.
+
+However, script interpreters may still use user configuration such as
+environment variables as long as it is not a way to disable the securebits
+checks. For instance, the ``PATH`` and ``LD_PRELOAD`` variables can be set by
+a script's caller. Changing these variables may lead to unintended code
+executions, but only from vetted executable programs, which is OK. For this to
+make sense, the system should provide a consistent security policy to avoid
+arbitrary code execution e.g., by enforcing a write xor execute policy.
+
+When ``SECBIT_EXEC_DENY_INTERACTIVE`` is set, a process should never interpret
+interactive user commands (e.g. scripts). However, if such commands are passed
+through a file descriptor (e.g. stdin), its content should be interpreted if a
+call to :manpage:`execveat(2)` with the related file descriptor and the
+``AT_EXECVE_CHECK`` flag succeed.
+
+For instance, script interpreters called with a script snippet as argument
+should always deny such execution if ``SECBIT_EXEC_DENY_INTERACTIVE`` is set.
+
+This secure bit may be set by user session managers, service managers,
+container runtimes, sandboxer tools... Except for test environments, the
+related ``SECBIT_EXEC_DENY_INTERACTIVE_LOCKED`` bit should also be set.
+
+Here is the expected behavior for a script interpreter according to combination
+of any exec securebits:
+
+1. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0``
+
+ Always interpret scripts, and allow arbitrary user commands (default).
+
+ No threat, everyone and everything is trusted, but we can get ahead of
+ potential issues thanks to the call to :manpage:`execveat(2)` with
+ ``AT_EXECVE_CHECK`` which should always be performed but ignored by the
+ script interpreter. Indeed, this check is still important to enable systems
+ administrators to verify requests (e.g. with audit) and prepare for
+ migration to a secure mode.
+
+2. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0``
+
+ Deny script interpretation if they are not executable, but allow
+ arbitrary user commands.
+
+ The threat is (potential) malicious scripts run by trusted (and not fooled)
+ users. That can protect against unintended script executions (e.g. ``sh
+ /tmp/*.sh``). This makes sense for (semi-restricted) user sessions.
+
+3. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1``
+
+ Always interpret scripts, but deny arbitrary user commands.
+
+ This use case may be useful for secure services (i.e. without interactive
+ user session) where scripts' integrity is verified (e.g. with IMA/EVM or
+ dm-verity/IPE) but where access rights might not be ready yet. Indeed,
+ arbitrary interactive commands would be much more difficult to check.
+
+4. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1``
+
+ Deny script interpretation if they are not executable, and also deny
+ any arbitrary user commands.
+
+ The threat is malicious scripts run by untrusted users (but trusted code).
+ This makes sense for system services that may only execute trusted scripts.
+
+.. Links
+.. _samples/check-exec/inc.c:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/check-exec/inc.c
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 274cc7546efc..6272bcf11296 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -35,6 +35,7 @@ Security-related interfaces
mfd_noexec
spec_ctrl
tee
+ check_exec
Devices and I/O
===============
diff --git a/fs/exec.c b/fs/exec.c
index 2f0acef8908e..d58b061c5e42 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -892,7 +892,8 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
.lookup_flags = LOOKUP_FOLLOW,
};
- if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+ if ((flags &
+ ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH | AT_EXECVE_CHECK)) != 0)
return ERR_PTR(-EINVAL);
if (flags & AT_SYMLINK_NOFOLLOW)
open_exec_flags.lookup_flags &= ~LOOKUP_FOLLOW;
@@ -1564,6 +1565,21 @@ static struct linux_binprm *alloc_bprm(int fd, struct filename *filename, int fl
}
bprm->interp = bprm->filename;
+ /*
+ * At this point, security_file_open() has already been called (with
+ * __FMODE_EXEC) and access control checks for AT_EXECVE_CHECK will
+ * stop just after the security_bprm_creds_for_exec() call in
+ * bprm_execve(). Indeed, the kernel should not try to parse the
+ * content of the file with exec_binprm() nor change the calling
+ * thread, which means that the following security functions will not
+ * be called:
+ * - security_bprm_check()
+ * - security_bprm_creds_from_file()
+ * - security_bprm_committing_creds()
+ * - security_bprm_committed_creds()
+ */
+ bprm->is_check = !!(flags & AT_EXECVE_CHECK);
+
retval = bprm_mm_init(bprm);
if (!retval)
return bprm;
@@ -1845,7 +1861,7 @@ static int bprm_execve(struct linux_binprm *bprm)
/* Set the unchanging part of bprm->cred */
retval = security_bprm_creds_for_exec(bprm);
- if (retval)
+ if (retval || bprm->is_check)
goto out;
retval = exec_binprm(bprm);
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 3305c849abd6..60d674af3080 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -44,7 +44,12 @@ struct linux_binprm {
*/
point_of_no_return:1,
/* Set when "comm" must come from the dentry. */
- comm_from_dentry:1;
+ comm_from_dentry:1,
+ /*
+ * Set by user space to check executability according to the
+ * caller's environment.
+ */
+ is_check:1;
struct file *executable; /* Executable to pass to the interpreter */
struct file *interpreter;
struct file *file;
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 75e21a135483..d9a069b4a775 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -161,6 +161,7 @@
#define AUDIT_INTEGRITY_RULE 1805 /* policy rule */
#define AUDIT_INTEGRITY_EVM_XATTR 1806 /* New EVM-covered xattr */
#define AUDIT_INTEGRITY_POLICY_RULE 1807 /* IMA policy rules */
+#define AUDIT_INTEGRITY_USERSPACE 1808 /* Userspace enforced data integrity */
#define AUDIT_KERNEL 2000 /* Asynchronous audit record. NOT A REQUEST. */
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 6e6907e63bfc..a15ac2fa4b20 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -155,4 +155,8 @@
#define AT_HANDLE_MNT_ID_UNIQUE 0x001 /* Return the u64 unique mount ID. */
#define AT_HANDLE_CONNECTABLE 0x002 /* Request a connectable file handle */
+/* Flags for execveat2(2). */
+#define AT_EXECVE_CHECK 0x10000 /* Only perform a check if execution
+ would be allowed. */
+
#endif /* _UAPI_LINUX_FCNTL_H */
diff --git a/include/uapi/linux/securebits.h b/include/uapi/linux/securebits.h
index d6d98877ff1a..3fba30dbd68b 100644
--- a/include/uapi/linux/securebits.h
+++ b/include/uapi/linux/securebits.h
@@ -52,10 +52,32 @@
#define SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED \
(issecure_mask(SECURE_NO_CAP_AMBIENT_RAISE_LOCKED))
+/* See Documentation/userspace-api/check_exec.rst */
+#define SECURE_EXEC_RESTRICT_FILE 8
+#define SECURE_EXEC_RESTRICT_FILE_LOCKED 9 /* make bit-8 immutable */
+
+#define SECBIT_EXEC_RESTRICT_FILE (issecure_mask(SECURE_EXEC_RESTRICT_FILE))
+#define SECBIT_EXEC_RESTRICT_FILE_LOCKED \
+ (issecure_mask(SECURE_EXEC_RESTRICT_FILE_LOCKED))
+
+/* See Documentation/userspace-api/check_exec.rst */
+#define SECURE_EXEC_DENY_INTERACTIVE 10
+#define SECURE_EXEC_DENY_INTERACTIVE_LOCKED 11 /* make bit-10 immutable */
+
+#define SECBIT_EXEC_DENY_INTERACTIVE \
+ (issecure_mask(SECURE_EXEC_DENY_INTERACTIVE))
+#define SECBIT_EXEC_DENY_INTERACTIVE_LOCKED \
+ (issecure_mask(SECURE_EXEC_DENY_INTERACTIVE_LOCKED))
+
#define SECURE_ALL_BITS (issecure_mask(SECURE_NOROOT) | \
issecure_mask(SECURE_NO_SETUID_FIXUP) | \
issecure_mask(SECURE_KEEP_CAPS) | \
- issecure_mask(SECURE_NO_CAP_AMBIENT_RAISE))
+ issecure_mask(SECURE_NO_CAP_AMBIENT_RAISE) | \
+ issecure_mask(SECURE_EXEC_RESTRICT_FILE) | \
+ issecure_mask(SECURE_EXEC_DENY_INTERACTIVE))
#define SECURE_ALL_LOCKS (SECURE_ALL_BITS << 1)
+#define SECURE_ALL_UNPRIVILEGED (issecure_mask(SECURE_EXEC_RESTRICT_FILE) | \
+ issecure_mask(SECURE_EXEC_DENY_INTERACTIVE))
+
#endif /* _UAPI_LINUX_SECUREBITS_H */
diff --git a/samples/Kconfig b/samples/Kconfig
index b288d9991d27..84a9d4e8d947 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -291,6 +291,15 @@ config SAMPLE_CGROUP
help
Build samples that demonstrate the usage of the cgroup API.
+config SAMPLE_CHECK_EXEC
+ bool "Exec secure bits examples"
+ depends on CC_CAN_LINK && HEADERS_INSTALL
+ help
+ Build a tool to easily configure SECBIT_EXEC_RESTRICT_FILE and
+ SECBIT_EXEC_DENY_INTERACTIVE, and a simple script interpreter to
+ demonstrate how they should be used with execveat(2) +
+ AT_EXECVE_CHECK.
+
source "samples/rust/Kconfig"
endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index b85fa64390c5..f988202f3a30 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -3,6 +3,7 @@
subdir-$(CONFIG_SAMPLE_AUXDISPLAY) += auxdisplay
subdir-$(CONFIG_SAMPLE_ANDROID_BINDERFS) += binderfs
+subdir-$(CONFIG_SAMPLE_CHECK_EXEC) += check-exec
subdir-$(CONFIG_SAMPLE_CGROUP) += cgroup
obj-$(CONFIG_SAMPLE_CONFIGFS) += configfs/
obj-$(CONFIG_SAMPLE_CONNECTOR) += connector/
diff --git a/samples/check-exec/.gitignore b/samples/check-exec/.gitignore
new file mode 100644
index 000000000000..cd759a19dacd
--- /dev/null
+++ b/samples/check-exec/.gitignore
@@ -0,0 +1,2 @@
+/inc
+/set-exec
diff --git a/samples/check-exec/Makefile b/samples/check-exec/Makefile
new file mode 100644
index 000000000000..c4f08ad0f8e3
--- /dev/null
+++ b/samples/check-exec/Makefile
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+
+userprogs-always-y := \
+ inc \
+ set-exec
+
+userccflags += -I usr/include
+
+.PHONY: all clean
+
+all:
+ $(MAKE) -C ../.. samples/check-exec/
+
+clean:
+ $(MAKE) -C ../.. M=samples/check-exec/ clean
diff --git a/samples/check-exec/inc.c b/samples/check-exec/inc.c
new file mode 100644
index 000000000000..94b87569d2a2
--- /dev/null
+++ b/samples/check-exec/inc.c
@@ -0,0 +1,205 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Very simple script interpreter that can evaluate two different commands (one
+ * per line):
+ * - "?" to initialize a counter from user's input;
+ * - "+" to increment the counter (which is set to 0 by default).
+ *
+ * See tools/testing/selftests/exec/check-exec-tests.sh and
+ * Documentation/userspace-api/check_exec.rst
+ *
+ * Copyright © 2024 Microsoft Corporation
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <linux/fcntl.h>
+#include <linux/prctl.h>
+#include <linux/securebits.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+
+/* Returns 1 on error, 0 otherwise. */
+static int interpret_buffer(char *buffer, size_t buffer_size)
+{
+ char *line, *saveptr = NULL;
+ long long number = 0;
+
+ /* Each command is the first character of a line. */
+ saveptr = NULL;
+ line = strtok_r(buffer, "\n", &saveptr);
+ while (line) {
+ if (*line != '#' && strlen(line) != 1) {
+ fprintf(stderr, "# ERROR: Unknown string\n");
+ return 1;
+ }
+ switch (*line) {
+ case '#':
+ /* Skips shebang and comments. */
+ break;
+ case '+':
+ /* Increments and prints the number. */
+ number++;
+ printf("%lld\n", number);
+ break;
+ case '?':
+ /* Reads integer from stdin. */
+ fprintf(stderr, "> Enter new number: \n");
+ if (scanf("%lld", &number) != 1) {
+ fprintf(stderr,
+ "# WARNING: Failed to read number from stdin\n");
+ }
+ break;
+ default:
+ fprintf(stderr, "# ERROR: Unknown character '%c'\n",
+ *line);
+ return 1;
+ }
+ line = strtok_r(NULL, "\n", &saveptr);
+ }
+ return 0;
+}
+
+/* Returns 1 on error, 0 otherwise. */
+static int interpret_stream(FILE *script, char *const script_name,
+ char *const *const envp, const bool restrict_stream)
+{
+ int err;
+ char *const script_argv[] = { script_name, NULL };
+ char buf[128] = {};
+ size_t buf_size = sizeof(buf);
+
+ /*
+ * We pass a valid argv and envp to the kernel to emulate a native
+ * script execution. We must use the script file descriptor instead of
+ * the script path name to avoid race conditions.
+ */
+ err = execveat(fileno(script), "", script_argv, envp,
+ AT_EMPTY_PATH | AT_EXECVE_CHECK);
+ if (err && restrict_stream) {
+ perror("ERROR: Script execution check");
+ return 1;
+ }
+
+ /* Reads script. */
+ buf_size = fread(buf, 1, buf_size - 1, script);
+ return interpret_buffer(buf, buf_size);
+}
+
+static void print_usage(const char *argv0)
+{
+ fprintf(stderr, "usage: %s <script.inc> | -i | -c <command>\n\n",
+ argv0);
+ fprintf(stderr, "Example:\n");
+ fprintf(stderr, " ./set-exec -fi -- ./inc -i < script-exec.inc\n");
+}
+
+int main(const int argc, char *const argv[], char *const *const envp)
+{
+ int opt;
+ char *cmd = NULL;
+ char *script_name = NULL;
+ bool interpret_stdin = false;
+ FILE *script_file = NULL;
+ int secbits;
+ bool deny_interactive, restrict_file;
+ size_t arg_nb;
+
+ secbits = prctl(PR_GET_SECUREBITS);
+ if (secbits == -1) {
+ /*
+ * This should never happen, except with a buggy seccomp
+ * filter.
+ */
+ perror("ERROR: Failed to get securebits");
+ return 1;
+ }
+
+ deny_interactive = !!(secbits & SECBIT_EXEC_DENY_INTERACTIVE);
+ restrict_file = !!(secbits & SECBIT_EXEC_RESTRICT_FILE);
+
+ while ((opt = getopt(argc, argv, "c:i")) != -1) {
+ switch (opt) {
+ case 'c':
+ if (cmd) {
+ fprintf(stderr, "ERROR: Command already set");
+ return 1;
+ }
+ cmd = optarg;
+ break;
+ case 'i':
+ interpret_stdin = true;
+ break;
+ default:
+ print_usage(argv[0]);
+ return 1;
+ }
+ }
+
+ /* Checks that only one argument is used, or read stdin. */
+ arg_nb = !!cmd + !!interpret_stdin;
+ if (arg_nb == 0 && argc == 2) {
+ script_name = argv[1];
+ } else if (arg_nb != 1) {
+ print_usage(argv[0]);
+ return 1;
+ }
+
+ if (cmd) {
+ /*
+ * Other kind of interactive interpretations should be denied
+ * as well (e.g. CLI arguments passing script snippets,
+ * environment variables interpreted as script). However, any
+ * way to pass script files should only be restricted according
+ * to restrict_file.
+ */
+ if (deny_interactive) {
+ fprintf(stderr,
+ "ERROR: Interactive interpretation denied.\n");
+ return 1;
+ }
+
+ return interpret_buffer(cmd, strlen(cmd));
+ }
+
+ if (interpret_stdin && !script_name) {
+ script_file = stdin;
+ /*
+ * As for any execve(2) call, this path may be logged by the
+ * kernel.
+ */
+ script_name = "/proc/self/fd/0";
+ /*
+ * When stdin is used, it can point to a regular file or a
+ * pipe. Restrict stdin execution according to
+ * SECBIT_EXEC_DENY_INTERACTIVE but always allow executable
+ * files (which are not considered as interactive inputs).
+ */
+ return interpret_stream(script_file, script_name, envp,
+ deny_interactive);
+ } else if (script_name && !interpret_stdin) {
+ /*
+ * In this sample, we don't pass any argument to scripts, but
+ * otherwise we would have to forge an argv with such
+ * arguments.
+ */
+ script_file = fopen(script_name, "r");
+ if (!script_file) {
+ perror("ERROR: Failed to open script");
+ return 1;
+ }
+ /*
+ * Restricts file execution according to
+ * SECBIT_EXEC_RESTRICT_FILE.
+ */
+ return interpret_stream(script_file, script_name, envp,
+ restrict_file);
+ }
+
+ print_usage(argv[0]);
+ return 1;
+}
diff --git a/samples/check-exec/run-script-ask.inc b/samples/check-exec/run-script-ask.inc
new file mode 100755
index 000000000000..8ef0fdc37266
--- /dev/null
+++ b/samples/check-exec/run-script-ask.inc
@@ -0,0 +1,9 @@
+#!/usr/bin/env sh
+# SPDX-License-Identifier: BSD-3-Clause
+
+DIR="$(dirname -- "$0")"
+
+PATH="${PATH}:${DIR}"
+
+set -x
+"${DIR}/script-ask.inc"
diff --git a/samples/check-exec/script-ask.inc b/samples/check-exec/script-ask.inc
new file mode 100755
index 000000000000..720a8e649225
--- /dev/null
+++ b/samples/check-exec/script-ask.inc
@@ -0,0 +1,5 @@
+#!/usr/bin/env inc
+# SPDX-License-Identifier: BSD-3-Clause
+
+?
++
diff --git a/samples/check-exec/script-exec.inc b/samples/check-exec/script-exec.inc
new file mode 100755
index 000000000000..3245cb9d8dd1
--- /dev/null
+++ b/samples/check-exec/script-exec.inc
@@ -0,0 +1,4 @@
+#!/usr/bin/env inc
+# SPDX-License-Identifier: BSD-3-Clause
+
++
diff --git a/samples/check-exec/script-noexec.inc b/samples/check-exec/script-noexec.inc
new file mode 100644
index 000000000000..3245cb9d8dd1
--- /dev/null
+++ b/samples/check-exec/script-noexec.inc
@@ -0,0 +1,4 @@
+#!/usr/bin/env inc
+# SPDX-License-Identifier: BSD-3-Clause
+
++
diff --git a/samples/check-exec/set-exec.c b/samples/check-exec/set-exec.c
new file mode 100644
index 000000000000..ba86a60a20dd
--- /dev/null
+++ b/samples/check-exec/set-exec.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Simple tool to set SECBIT_EXEC_RESTRICT_FILE, SECBIT_EXEC_DENY_INTERACTIVE,
+ * before executing a command.
+ *
+ * Copyright © 2024 Microsoft Corporation
+ */
+
+#define _GNU_SOURCE
+#define __SANE_USERSPACE_TYPES__
+#include <errno.h>
+#include <linux/prctl.h>
+#include <linux/securebits.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+
+static void print_usage(const char *argv0)
+{
+ fprintf(stderr, "usage: %s -f|-i -- <cmd> [args]...\n\n", argv0);
+ fprintf(stderr, "Execute a command with\n");
+ fprintf(stderr, "- SECBIT_EXEC_RESTRICT_FILE set: -f\n");
+ fprintf(stderr, "- SECBIT_EXEC_DENY_INTERACTIVE set: -i\n");
+}
+
+int main(const int argc, char *const argv[], char *const *const envp)
+{
+ const char *cmd_path;
+ char *const *cmd_argv;
+ int opt, secbits_cur, secbits_new;
+ bool has_policy = false;
+
+ secbits_cur = prctl(PR_GET_SECUREBITS);
+ if (secbits_cur == -1) {
+ /*
+ * This should never happen, except with a buggy seccomp
+ * filter.
+ */
+ perror("ERROR: Failed to get securebits");
+ return 1;
+ }
+
+ secbits_new = secbits_cur;
+ while ((opt = getopt(argc, argv, "fi")) != -1) {
+ switch (opt) {
+ case 'f':
+ secbits_new |= SECBIT_EXEC_RESTRICT_FILE |
+ SECBIT_EXEC_RESTRICT_FILE_LOCKED;
+ has_policy = true;
+ break;
+ case 'i':
+ secbits_new |= SECBIT_EXEC_DENY_INTERACTIVE |
+ SECBIT_EXEC_DENY_INTERACTIVE_LOCKED;
+ has_policy = true;
+ break;
+ default:
+ print_usage(argv[0]);
+ return 1;
+ }
+ }
+
+ if (!argv[optind] || !has_policy) {
+ print_usage(argv[0]);
+ return 1;
+ }
+
+ if (secbits_cur != secbits_new &&
+ prctl(PR_SET_SECUREBITS, secbits_new)) {
+ perror("Failed to set secure bit(s).");
+ fprintf(stderr,
+ "Hint: The running kernel may not support this feature.\n");
+ return 1;
+ }
+
+ cmd_path = argv[optind];
+ cmd_argv = argv + optind;
+ fprintf(stderr, "Executing command...\n");
+ execvpe(cmd_path, cmd_argv, envp);
+ fprintf(stderr, "Failed to execute \"%s\": %s\n", cmd_path,
+ strerror(errno));
+ return 1;
+}
diff --git a/security/commoncap.c b/security/commoncap.c
index cefad323a0b1..52ea01acb453 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1302,21 +1302,38 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
& (old->securebits ^ arg2)) /*[1]*/
|| ((old->securebits & SECURE_ALL_LOCKS & ~arg2)) /*[2]*/
|| (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS)) /*[3]*/
- || (cap_capable(current_cred(),
- current_cred()->user_ns,
- CAP_SETPCAP,
- CAP_OPT_NONE) != 0) /*[4]*/
/*
* [1] no changing of bits that are locked
* [2] no unlocking of locks
* [3] no setting of unsupported bits
- * [4] doing anything requires privilege (go read about
- * the "sendmail capabilities bug")
*/
)
/* cannot change a locked bit */
return -EPERM;
+ /*
+ * Doing anything requires privilege (go read about the
+ * "sendmail capabilities bug"), except for unprivileged bits.
+ * Indeed, the SECURE_ALL_UNPRIVILEGED bits are not
+ * restrictions enforced by the kernel but by user space on
+ * itself.
+ */
+ if (cap_capable(current_cred(), current_cred()->user_ns,
+ CAP_SETPCAP, CAP_OPT_NONE) != 0) {
+ const unsigned long unpriv_and_locks =
+ SECURE_ALL_UNPRIVILEGED |
+ SECURE_ALL_UNPRIVILEGED << 1;
+ const unsigned long changed = old->securebits ^ arg2;
+
+ /* For legacy reason, denies non-change. */
+ if (!changed)
+ return -EPERM;
+
+ /* Denies privileged changes. */
+ if (changed & ~unpriv_and_locks)
+ return -EPERM;
+ }
+
new = prepare_creds();
if (!new)
return -ENOMEM;
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index 884a3533f7af..f435eff4667f 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -8,6 +8,7 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/file.h>
+#include <linux/binfmts.h>
#include <linux/fs.h>
#include <linux/xattr.h>
#include <linux/magic.h>
@@ -469,6 +470,17 @@ int ima_check_blacklist(struct ima_iint_cache *iint,
return rc;
}
+static bool is_bprm_creds_for_exec(enum ima_hooks func, struct file *file)
+{
+ struct linux_binprm *bprm;
+
+ if (func == BPRM_CHECK) {
+ bprm = container_of(&file, struct linux_binprm, file);
+ return bprm->is_check;
+ }
+ return false;
+}
+
/*
* ima_appraise_measurement - appraise file measurement
*
@@ -483,6 +495,7 @@ int ima_appraise_measurement(enum ima_hooks func, struct ima_iint_cache *iint,
int xattr_len, const struct modsig *modsig)
{
static const char op[] = "appraise_data";
+ int audit_msgno = AUDIT_INTEGRITY_DATA;
const char *cause = "unknown";
struct dentry *dentry = file_dentry(file);
struct inode *inode = d_backing_inode(dentry);
@@ -494,6 +507,16 @@ int ima_appraise_measurement(enum ima_hooks func, struct ima_iint_cache *iint,
if (!(inode->i_opflags & IOP_XATTR) && !try_modsig)
return INTEGRITY_UNKNOWN;
+ /*
+ * Unlike any of the other LSM hooks where the kernel enforces file
+ * integrity, enforcing file integrity for the bprm_creds_for_exec()
+ * LSM hook with the AT_EXECVE_CHECK flag is left up to the discretion
+ * of the script int