summaryrefslogtreecommitdiff
path: root/security
AgeCommit message (Collapse)AuthorFilesLines
2025-01-17landlock: Align partial refer access checks with final onesMickaël Salaün1-1/+13
Fix a logical issue that could have been visible if the source or the destination of a rename/link action was allowed for either the source or the destination but not both. However, this logical bug is unreachable because either: - the rename/link action is allowed by the access rights tied to the same mount point (without relying on access rights in a parent mount point) and the access request is allowed (i.e. allow_parent1 and allow_parent2 are true in current_check_refer_path), - or a common rule in a parent mount point updates the access check for the source and the destination (cf. is_access_to_paths_allowed). See the following layout1.refer_part_mount_tree_is_allowed test that work with and without this fix. This fix does not impact current code but it is required for the audit support. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-12-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17landlock: Simplify initially denied access rightsMickaël Salaün3-11/+19
Upgrade domain's handled access masks when creating a domain from a ruleset, instead of converting them at runtime. This is more consistent and helps with audit support. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-7-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17landlock: Move access typesMickaël Salaün5-46/+68
Move LANDLOCK_ACCESS_FS_INITIALLY_DENIED, access_mask_t, struct access_mask, and struct access_masks_all to a dedicated access.h file. Rename LANDLOCK_ACCESS_FS_INITIALLY_DENIED to _LANDLOCK_ACCESS_FS_INITIALLY_DENIED to make it clear that it's not part of UAPI. Add some newlines when appropriate. This file will be extended with following commits, and it will help to avoid dependency loops. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-6-mic@digikod.net [mic: Fix rebase conflict because of the new cleanup headers] Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17landlock: Factor out check_access_path()Mickaël Salaün1-21/+11
Merge check_access_path() into current_check_access_path() and make hook_path_mknod() use it. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250108154338.1129069-4-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14landlock: Use scoped guards for ruleset in landlock_add_rule()Mickaël Salaün1-10/+4
Simplify error handling by replacing goto statements with automatic calls to landlock_put_ruleset() when going out of scope. This change depends on the TCP support. Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com> Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Reviewed-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250113161112.452505-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14landlock: Use scoped guards for rulesetMickaël Salaün3-29/+23
Simplify error handling by replacing goto statements with automatic calls to landlock_put_ruleset() when going out of scope. This change will be easy to backport to v6.6 if needed, only the kernel.h include line conflicts. As for any other similar changes, we should be careful when backporting without goto statements. Add missing include file. Reviewed-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250113161112.452505-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14landlock: Constify get_mode_access()Mickaël Salaün1-1/+1
Use __attribute_const__ for get_mode_access(). Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20250110153918.241810-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14landlock: Handle weird filesMickaël Salaün1-6/+5
A corrupted filesystem (e.g. bcachefs) might return weird files. Instead of throwing a warning and allowing access to such file, treat them as regular files. Cc: Dave Chinner <david@fromorbit.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Paul Moore <paul@paul-moore.com> Reported-by: syzbot+34b68f850391452207df@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/000000000000a65b35061cffca61@google.com Reported-by: syzbot+360866a59e3c80510a62@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/67379b3f.050a0220.85a0.0001.GAE@google.com Reported-by: Ubisectech Sirius <bugreport@ubisectech.com> Closes: https://lore.kernel.org/r/c426821d-8380-46c4-a494-7008bbd7dd13.bugreport@ubisectech.com Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20250110153918.241810-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-12security: remove get_task_comm() and print task comm directlyYafang Shao1-3/+1
Since task->comm is guaranteed to be NUL-terminated, we can print it directly without the need to copy it into a separate buffer. This simplifies the code and avoids unnecessary operations. Link: https://lkml.kernel.org/r/20241219023452.69907-5-laoar.shao@gmail.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Acked-by: Kees Cook <kees@kernel.org> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: "André Almeida" <andrealmeid@igalia.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Kalle Valo <kvalo@kernel.org> Cc: Karol Herbst <kherbst@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lyude Paul <lyude@redhat.com> Cc: Oded Gabbay <ogabbay@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-08hardening: Document INIT_STACK_ALL_PATTERN behavior with GCCGeert Uytterhoeven1-0/+1
The help text for INIT_STACK_ALL_PATTERN documents the patterns used by Clang, but lacks documentation for GCC. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/293d29d6a0d1823165be97285c1bc73e90ee9db8.1736239070.git.geert+renesas@glider.be Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-07selinux: make more use of str_read() when loading the policyChristian Göttsche3-22/+12
Simplify the call sites, and enable future string validation in a single place. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: subject tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: avoid unnecessary indirection in struct level_datumChristian Göttsche3-17/+10
Store the owned member of type struct mls_level directly in the parent struct instead of an extra heap allocation. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: use known type instead of void pointerChristian Göttsche8-74/+77
Improve type safety and readability by using the known type. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: rename comparison functions for clarityChristian Göttsche7-16/+16
The functions context_cmp(), mls_context_cmp() and ebitmap_cmp() are not traditional C style compare functions returning -1, 0, and 1 for less than, equal, and greater than; they only return whether their arguments are equal. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: rework match_ipv6_addrmask()Christian Göttsche1-7/+5
Constify parameters, add size hints, and simplify control flow. According to godbolt the same assembly is generated. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: constify and reconcile function parameter namesChristian Göttsche4-6/+6
Align the parameter names between declarations and definitions, and constify read-only parameters. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: tweak the subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: avoid using types indicating user space interactionChristian Göttsche2-2/+2
Integer types starting with a double underscore, like __u32, are intended for usage of variables interacting with user-space. Just use the plain variant. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07selinux: supply missing field initializersChristian Göttsche2-2/+2
Please clang by supplying the missing field initializers in the secclass_map variable and sel_fill_super() function. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: tweak subj and commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07Merge tag 'selinux-pr-20250107' of ↵Linus Torvalds5-38/+65
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "A single SELinux patch to address a problem with a single domain using multiple xperm classes" * tag 'selinux-pr-20250107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: match extended permissions to their base permissions
2025-01-06tomoyo: automatically use patterns for several situations in learning modeTetsuo Handa1-0/+30
The "file_pattern" keyword was used for automatically recording patternized pathnames when using the learning mode. This keyword was removed in TOMOYO 2.4 because it is impossible to predefine all possible pathname patterns. However, since the numeric part of proc:/$PID/ , pipe:[$INO] and socket:[$INO] has no meaning except $PID == 1, automatically replacing the numeric part with \$ pattern helps reducing frequency of restarting the learning mode due to hitting the quota. Since replacing one digit with \$ pattern requires enlarging string buffer, and several programs access only $PID == 1, replace only two or more digits with \$ pattern. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2025-01-05lockdown: initialize local array before use to quiet static analysisTanya Agarwal1-1/+1
The static code analysis tool "Coverity Scan" pointed the following details out for further development considerations: CID 1486102: Uninitialized scalar variable (UNINIT) uninit_use_in_call: Using uninitialized value *temp when calling strlen. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> [PM: edit/reformat the description, subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04safesetid: check size of policy writesLeo Stone1-0/+3
syzbot attempts to write a buffer with a large size to a sysfs entry with writes handled by handle_policy_update(), triggering a warning in kmalloc. Check the size specified for write buffers before allocating. Reported-by: syzbot+4eb7a741b3216020043a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4eb7a741b3216020043a Signed-off-by: Leo Stone <leocstone@gmail.com> [PM: subject tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04lsm: rename variable to avoid shadowingChristian Göttsche1-2/+2
The function dump_common_audit_data() contains two variables with the name comm: one declared at the top and one nested one. Rename the nested variable to improve readability and make future refactorings of the function less error prone. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: description long line removal, line wrap cleanup, merge fuzz] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04lsm: constify function parametersChristian Göttsche1-2/+2
The functions print_ipv4_addr() and print_ipv6_addr() are called with string literals and do not modify these parameters internally. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: cleaned up the description to remove long lines] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04security: remove redundant assignment to return variableColin Ian King1-3/+1
In the case where rc is equal to EOPNOTSUPP it is being reassigned a new value of zero that is never read. The following continue statement loops back to the next iteration of the lsm_for_each_hook loop and rc is being re-assigned a new value from the call to getselfattr. The assignment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> [PM: subj tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04selinux: match extended permissions to their base permissionsThiébaud Weksteen5-38/+65
In commit d1d991efaf34 ("selinux: Add netlink xperm support") a new extended permission was added ("nlmsg"). This was the second extended permission implemented in selinux ("ioctl" being the first one). Extended permissions are associated with a base permission. It was found that, in the access vector cache (avc), the extended permission did not keep track of its base permission. This is an issue for a domain that is using both extended permissions (i.e., a domain calling ioctl() on a netlink socket). In this case, the extended permissions were overlapping. Keep track of the base permission in the cache. A new field "base_perm" is added to struct extended_perms_decision to make sure that the extended permission refers to the correct policy permission. A new field "base_perms" is added to struct extended_perms to quickly decide if extended permissions apply. While it is in theory possible to retrieve the base permission from the access vector, the same base permission may not be mapped to the same bit for each class (e.g., "nlmsg" is mapped to a different bit for "netlink_route_socket" and "netlink_audit_socket"). Instead, use a constant (AVC_EXT_IOCTL or AVC_EXT_NLMSG) provided by the caller. Fixes: d1d991efaf34 ("selinux: Add netlink xperm support") Signed-off-by: Thiébaud Weksteen <tweek@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are setMickaël Salaün2-1/+6
When CONFIG_AUDIT is set, its CONFIG_NET dependency is also set, and the dev_get_by_index and init_net symbols (used by dump_common_audit_data) are found by the linker. dump_common_audit_data() should then failed to build when CONFIG_NET is not set. However, because the compiler is smart, it knows that audit_log_start() always return NULL when !CONFIG_AUDIT, and it doesn't build the body of common_lsm_audit(). As a side effect, dump_common_audit_data() is not built and the linker doesn't error out because of missing symbols. Let's only build lsm_audit.o when CONFIG_SECURITY and CONFIG_AUDIT are both set, which is checked with the new CONFIG_HAS_SECURITY_AUDIT. ipv4_skb_to_auditdata() and ipv6_skb_to_auditdata() are only used by Smack if CONFIG_AUDIT is set, so they don't need fake implementations. Because common_lsm_audit() is used in multiple places without CONFIG_AUDIT checks, add a fake implementation. Link: https://lore.kernel.org/r/20241122143353.59367-2-mic@digikod.net Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: James Morris <jmorris@namei.org> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-03ima: ignore suffixed policy rule commentsMimi Zohar1-1/+1
Lines beginning with '#' in the IMA policy are comments and are ignored. Instead of placing the rule and comment on separate lines, allow the comment to be suffixed to the IMA policy rule. Reviewed-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-01-03ima: limit the builtin 'tcb' dont_measure tmpfs policy ruleMimi Zohar1-1/+2
With a custom policy similar to the builtin IMA 'tcb' policy [1], arch specific policy, and a kexec boot command line measurement policy rule, the kexec boot command line is not measured due to the dont_measure tmpfs rule. Limit the builtin 'tcb' dont_measure tmpfs policy rule to just the "func=FILE_CHECK" hook. Depending on the end users security threat model, a custom policy might not even include this dont_measure tmpfs rule. Note: as a result of this policy rule change, other measurements might also be included in the IMA-measurement list that previously weren't included. [1] https://ima-doc.readthedocs.io/en/latest/ima-policy.html#ima-tcb Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-12-24ima: kexec: silence RCU list traversal warningBreno Leitao1-1/+2
The ima_measurements list is append-only and doesn't require rcu_read_lock() protection. However, lockdep issues a warning when traversing RCU lists without the read lock: security/integrity/ima/ima_kexec.c:40 RCU-list traversed in non-reader section!! Fix this by using the variant of list_for_each_entry_rcu() with the last argument set to true. This tells the RCU subsystem that traversing this append-only list without the read lock is intentional and safe. This change silences the lockdep warning while maintaining the correct semantics for the append-only list traversal. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-12-22vfs: support caching symlink lengths in inodesMateusz Guzik1-1/+1
When utilized it dodges strlen() in vfs_readlink(), giving about 1.5% speed up when issuing readlink on /initrd.img on ext4. Filesystems opt in by calling inode_set_cached_link() when creating an inode. The size is stored in a new union utilizing the same space as i_devices, thus avoiding growing the struct or taking up any more space. Churn-wise the current readlink_copy() helper is patched to accept the size instead of calculating it. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20241120112037.822078-2-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-18ima: instantiate the bprm_creds_for_exec() hookMimi Zohar2-2/+54
Like direct file execution (e.g. ./script.sh), indirect file execution (e.g. sh script.sh) needs to be measured and appraised. Instantiate the new security_bprm_creds_for_exec() hook to measure and verify the indirect file's integrity. Unlike direct file execution, indirect file execution is optionally enforced by the interpreter. Differentiate kernel and userspace enforced integrity audit messages. Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20241212174223.389435-9-mic@digikod.net Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-18security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebitsMickaël Salaün1-6/+23
The new SECBIT_EXEC_RESTRICT_FILE, SECBIT_EXEC_DENY_INTERACTIVE, and their *_LOCKED counterparts are designed to be set by processes setting up an execution environment, such as a user session, a container, or a security sandbox. Unlike other securebits, these ones can be set by unprivileged processes. Like seccomp filters or Landlock domains, the securebits are inherited across processes. When SECBIT_EXEC_RESTRICT_FILE is set, programs interpreting code should control executable resources according to execveat(2) + AT_EXECVE_CHECK (see previous commit). When SECBIT_EXEC_DENY_INTERACTIVE is set, a process should deny execution of user interactive commands (which excludes executable regular files). Being able to configure each of these securebits enables system administrators or owner of image containers to gradually validate the related changes and to identify potential issues (e.g. with interpreter or audit logs). It should be noted that unlike other security bits, the SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE bits are dedicated to user space willing to restrict itself. Because of that, they only make sense in the context of a trusted environment (e.g. sandbox, container, user session, full system) where the process changing its behavior (according to these bits) and all its parent processes are trusted. Otherwise, any parent process could just execute its own malicious code (interpreting a script or not), or even enforce a seccomp filter to mask these bits. Such a secure environment can be achieved with an appropriate access control (e.g. mount's noexec option, file access rights, LSM policy) and an enlighten ld.so checking that libraries are allowed for execution e.g., to protect against illegitimate use of LD_PRELOAD. Ptrace restrictions according to these securebits would not make sense because of the processes' trust assumption. Scripts may need some changes to deal with untrusted data (e.g. stdin, environment variables), but that is outside the scope of the kernel. See chromeOS's documentation about script execution control and the related threat model: https://www.chromium.org/chromium-os/developer-library/guides/security/noexec-shell-scripts/ Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Christian Brauner <brauner@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Paul Moore <paul@paul-moore.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Jeff Xu <jeffxu@chromium.org> Tested-by: Jeff Xu <jeffxu@chromium.org> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20241212174223.389435-3-mic@digikod.net Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-18exec: Add a new AT_EXECVE_CHECK flag to execveat(2)Mickaël Salaün1-0/+10
Add a new AT_EXECVE_CHECK flag to execveat(2) to check if a file would be allowed for execution. The main use case is for script interpreters and dynamic linkers to check execution permission according to the kernel's security policy. Another use case is to add context to access logs e.g., which script (instead of interpreter) accessed a file. As any executable code, scripts could also use this check [1]. This is different from faccessat(2) + X_OK which only checks a subset of access rights (i.e. inode permission and mount options for regular files), but not the full context (e.g. all LSM access checks). The main use case for access(2) is for SUID processes to (partially) check access on behalf of their caller. The main use case for execveat(2) + AT_EXECVE_CHECK is to check if a script execution would be allowed, according to all the different restrictions in place. Because the use of AT_EXECVE_CHECK follows the exact kernel semantic as for a real execution, user space gets the same error codes. An interesting point of using execveat(2) instead of openat2(2) is that it decouples the check from the enforcement. Indeed, the security check can be logged (e.g. with audit) without blocking an execution environment not yet ready to enforce a strict security policy. LSMs can control or log execution requests with security_bprm_creds_for_exec(). However, to enforce a consistent and complete access control (e.g. on binary's dependencies) LSMs should restrict file executability, or measure executed files, with security_file_open() by checking file->f_flags & __FMODE_EXEC. Because AT_EXECVE_CHECK is dedicated to user space interpreters, it doesn't make sense for the kernel to parse the checked files, look for interpreters known to the kernel (e.g. ELF, shebang), and return ENOEXEC if the format is unknown. Because of that, security_bprm_check() is never called when AT_EXECVE_CHECK is used. It should be noted that script interpreters cannot directly use execveat(2) (without this new AT_EXECVE_CHECK flag) because this could lead to unexpected behaviors e.g., `python script.sh` could lead to Bash being executed to interpret the script. Unlike the kernel, script interpreters may just interpret the shebang as a simple comment, which should not change for backward compatibility reasons. Because scripts or libraries files might not currently have the executable permission set, or because we might want specific users to be allowed to run arbitrary scripts, the following patch provides a dynamic configuration mechanism with the SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE securebits. This is a redesign of the CLIP OS 4's O_MAYEXEC: https://github.com/clipos-archive/src_platform_clip-patches/blob/f5cb330d6b684752e403b4e41b39f7004d88e561/1901_open_mayexec.patch This patch has been used for more than a decade with customized script interpreters. Some examples can be found here: https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Kees Cook <keescook@chromium.org> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Jeff Xu <jeffxu@chromium.org> Tested-by: Jeff Xu <jeffxu@chromium.org> Link: https://docs.python.org/3/library/io.html#io.open_code [1] Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20241212174223.389435-2-mic@digikod.net Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-18Merge tag 'selinux-pr-20241217' of ↵Linus Torvalds1-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One small SELinux patch to get rid improve our handling of unknown extended permissions by safely ignoring them. Not only does this make it easier to support newer SELinux policy on older kernels in the future, it removes to BUG() calls from the SELinux code." * tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: ignore unknown extended permissions
2024-12-17tomoyo: use realpath if symlink's pathname refers to procfsTetsuo Handa1-2/+9
Fedora 41 has reached Linux 6.12 kernel with TOMOYO enabled. I observed that /usr/lib/systemd/systemd executes /usr/lib/systemd/systemd-executor by passing dirfd == 9 or dirfd == 16 upon execveat(). Commit ada1986d0797 ("tomoyo: fallback to realpath if symlink's pathname does not exist") used realpath only if symlink's pathname does not exist. But an out of tree patch suggested that it will be reasonable to always use realpath if symlink's pathname refers to proc filesystem. Therefore, this patch changes the pathname used for checking "file execute" and the domainname used after a successful execve() request. Before: <kernel> /usr/lib/systemd/systemd file execute proc:/self/fd/16 exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor" file execute proc:/self/fd/9 exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor" <kernel> /usr/lib/systemd/systemd proc:/self/fd/16 file execute /usr/sbin/auditd exec.realpath="/usr/sbin/auditd" exec.argv[0]="/usr/sbin/auditd" <kernel> /usr/lib/systemd/systemd proc:/self/fd/16 /usr/sbin/auditd <kernel> /usr/lib/systemd/systemd proc:/self/fd/9 file execute /usr/bin/systemctl exec.realpath="/usr/bin/systemctl" exec.argv[0]="/usr/bin/systemctl" <kernel> /usr/lib/systemd/systemd proc:/self/fd/9 /usr/bin/systemctl After: <kernel> /usr/lib/systemd/systemd file execute /usr/lib/systemd/systemd-executor exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor" <kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor file execute /usr/bin/systemctl exec.realpath="/usr/bin/systemctl" exec.argv[0]="/usr/bin/systemctl" file execute /usr/sbin/auditd exec.realpath="/usr/sbin/auditd" exec.argv[0]="/usr/sbin/auditd" <kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor /usr/bin/systemctl <kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor /usr/sbin/auditd Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2024-12-16bpf: lsm: Remove hook to bpf_task_storage_freeSong Liu1-1/+0
free_task() already calls bpf_task_storage_free(). It is not necessary to call it again on security_task_free(). Remove the hook. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://patch.msgid.link/20241212075956.2614894-1-song@kernel.org
2024-12-16tomoyo: don't emit warning in tomoyo_write_control()Tetsuo Handa1-1/+1
syzbot is reporting too large allocation warning at tomoyo_write_control(), for one can write a very very long line without new line character. To fix this warning, I use __GFP_NOWARN rather than checking for KMALLOC_MAX_SIZE, for practically a valid line should be always shorter than 32KB where the "too small to fail" memory-allocation rule applies. One might try to write a valid line that is longer than 32KB, but such request will likely fail with -ENOMEM. Therefore, I feel that separately returning -EINVAL when a line is longer than KMALLOC_MAX_SIZE is redundant. There is no need to distinguish over-32KB and over-KMALLOC_MAX_SIZE. Reported-by: syzbot+7536f77535e5210a5c76@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7536f77535e5210a5c76 Reported-by: Leo Stone <leocstone@gmail.com> Closes: https://lkml.kernel.org/r/20241216021459.178759-2-leocstone@gmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2024-12-15selinux: ignore unknown extended permissionsThiébaud Weksteen1-2/+6
When evaluating extended permissions, ignore unknown permissions instead of calling BUG(). This commit ensures that future permissions can be added without interfering with older kernels. Cc: stable@vger.kernel.org Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") Signed-off-by: Thiébaud Weksteen <tweek@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-15selinux: add netlink nlmsg_type audit messageThiébaud Weksteen2-2/+5
Add a new audit message type to capture nlmsg-related information. This is similar to LSM_AUDIT_DATA_IOCTL_OP which was added for the other SELinux extended permission (ioctl). Adding a new type is preferred to adding to the existing lsm_network_audit structure which contains irrelevant information for the netlink sockets (i.e., dport, sport). Signed-off-by: Thiébaud Weksteen <tweek@google.com> [PM: change "nlnk-msgtype" to "nl-msgtype" as discussed] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-13selinux: add support for xperms in conditional policiesChristian Göttsche6-9/+26
Add support for extended permission rules in conditional policies. Currently the kernel accepts such rules already, but evaluating a security decision will hit a BUG() in services_compute_xperms_decision(). Thus reject extended permission rules in conditional policies for current policy versions. Add a new policy version for this feature. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11selinux: Fix SCTP error inconsistency in selinux_socket_bind()Mikhail Ivanov1-1/+1
Check sk->sk_protocol instead of security class to recognize SCTP socket. SCTP socket is initialized with SECCLASS_SOCKET class if policy does not support EXTSOCKCLASS capability. In this case bind(2) hook wrongfully return EAFNOSUPPORT instead of EINVAL. The inconsistency was detected with help of Landlock tests: https://lore.kernel.org/all/b58680ca-81b2-7222-7287-0ac7f4227c3c@huawei-partners.com/ Fixes: 0f8db8cc73df ("selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()") Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11selinux: use native iterator typesChristian Göttsche3-4/+4
Use types for iterators equal to the type of the to be compared values. Reported by clang: ../ss/sidtab.c:126:2: warning: comparison of integers of different signs: 'int' and 'unsigned long' 126 | hash_for_each_rcu(sidtab->context_to_sid, i, entry, list) { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../hashtable.h:139:51: note: expanded from macro 'hash_for_each_rcu' 139 | for (... ; obj == NULL && (bkt) < HASH_SIZE(name);\ | ~~~ ^ ~~~~~~~~~~~~~~~ ../selinuxfs.c:1520:23: warning: comparison of integers of different signs: 'int' and 'unsigned int' 1520 | for (cpu = *idx; cpu < nr_cpu_ids; ++cpu) { | ~~~ ^ ~~~~~~~~~~ ../hooks.c:412:16: warning: comparison of integers of different signs: 'int' and 'unsigned long' 412 | for (i = 0; i < ARRAY_SIZE(tokens); i++) { | ~ ^ ~~~~~~~~~~~~~~~~~~ Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: munged the clang output due to line length concerns] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11selinux: add generated av_permissions.h to targetsThomas Weißschuh1-4/+3
av_permissions.h was not declared as a target and therefore not cleaned up automatically by kbuild. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/lkml/CAK7LNATUnCPt03BRFSKh1EH=+Sy0Q48wE4ER0BZdJqOb_44L8w@mail.gmail.com/ Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11ima: Suspend PCR extends and log appends when rebootingStefan Berger3-0/+47
To avoid the following types of error messages due to a failure by the TPM driver to use the TPM, suspend TPM PCR extensions and the appending of entries to the IMA log once IMA's reboot notifier has been called. This avoids trying to use the TPM after the TPM subsystem has been shut down. [111707.685315][ T1] ima: Error Communicating to TPM chip, result: -19 [111707.685960][ T1] ima: Error Communicating to TPM chip, result: -19 Synchronization with the ima_extend_list_mutex to set ima_measurements_suspended ensures that the TPM subsystem is not shut down when IMA holds the mutex while appending to the log and extending the PCR. The alternative of reading the system_state variable would not provide this guarantee. This error could be observed on a ppc64 machine running SuSE Linux where processes are still accessing files after devices have been shut down. Suspending the IMA log and PCR extensions shortly before reboot does not seem to open a significant measurement gap since neither TPM quoting would work for attestation nor that new log entries could be written to anywhere after devices have been shut down. However, there's a time window between the invocation of the reboot notifier and the shutdown of devices. This includes all subsequently invoked reboot notifiers as well as kernel_restart_prepare() where __usermodehelper_disable() waits for all running_helpers to exit. During this time window IMA could now miss log entries even though attestation would still work. The reboot of the system shortly after may make this small gap insignificant. Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-12-10fsnotify: introduce pre-content permission eventsAmir Goldstein1-1/+2
The new FS_PRE_ACCESS permission event is similar to FS_ACCESS_PERM, but it meant for a different use case of filling file content before access to a file range, so it has slightly different semantics. Generate FS_PRE_ACCESS/FS_ACCESS_PERM as two seperate events, so content scanners could inspect the content filled by pre-content event handler. Unlike FS_ACCESS_PERM, FS_PRE_ACCESS is also called before a file is modified by syscalls as write() and fallocate(). FS_ACCESS_PERM is reported also on blockdev and pipes, but the new pre-content events are only reported for regular files and dirs. The pre-content events are meant to be used by hierarchical storage managers that want to fill the content of files on first access. There are some specific requirements from filesystems that could be used with pre-content events, so add a flag for fs to opt-in for pre-content events explicitly before they can be used. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/b934c5e3af205abc4e0e4709f6486815937ddfdf.1731684329.git.josef@toxicpanda.com
2024-12-06smack: deduplicate access to string conversionKonstantin Andreev4-40/+15
Signed-off-by: Konstantin Andreev <andreev@swemel.ru> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2024-12-05Merge tag 'net-6.13-rc2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can and netfilter. Current release - regressions: - rtnetlink: fix double call of rtnl_link_get_net_ifla() - tcp: populate XPS related fields of timewait sockets - ethtool: fix access to uninitialized fields in set RXNFC command - selinux: use sk_to_full_sk() in selinux_ip_output() Current release - new code bugs: - net: make napi_hash_lock irq safe - eth: - bnxt_en: support header page pool in queue API - ice: fix NULL pointer dereference in switchdev Previous releases - regressions: - core: fix icmp host relookup triggering ip_rt_bug - ipv6: - avoid possible NULL deref in modify_prefix_route() - release expired exception dst cached in socket - smc: fix LGR and link use-after-free issue - hsr: avoid potential out-of-bound access in fill_frame_info() - can: hi311x: fix potential use-after-free - eth: ice: fix VLAN pruning in switchdev mode Previous releases - always broken: - netfilter: - ipset: hold module reference while requesting a module - nft_inner: incorrect percpu area handling under softirq - can: j1939: fix skb reference counting - eth: - mlxsw: use correct key block on Spectrum-4 - mlx5: fix memory leak in mlx5hws_definer_calc_layout" * tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits) net :mana :Request a V2 response version for MANA_QUERY_GF_STAT net: avoid potential UAF in default_operstate() vsock/test: verify socket options after setting them vsock/test: fix parameter types in SO_VM_SOCKETS_* calls vsock/test: fix failures due to wrong SO_RCVLOWAT parameter net/mlx5e: Remove workaround to avoid syndrome for internal port net/mlx5e: SD, Use correct mdev to build channel param net/mlx5: E-Switch, Fix switching to switchdev mode in MPV net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled net/mlx5: HWS: Properly set bwc queue locks lock classes net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout bnxt_en: handle tpa_info in queue API implementation bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap() bnxt_en: refactor tpa_info alloc/free into helpers geneve: do not assume mac header is set in geneve_xmit_skb() mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4 ethtool: Fix wrong mod state in case of verbose and no_mask bitset ipmr: tune the ipmr_can_free_table() checks. netfilter: nft_set_hash: skip duplicated elements pending gc run netfilter: ipset: Hold module reference while requesting a module ...
2024-12-04security: add trace event for cap_capableJordan Rome1-13/+41
In cases where we want a stable way to observe/trace cap_capable (e.g. protection from inlining and API updates) add a tracepoint that passes: - The credentials used - The user namespace of the resource being accessed - The user namespace in which the credential provides the capability to access the targeted resource - The capability to check for - The return value of the check Signed-off-by: Jordan Rome <linux@jordanrome.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20241204155911.1817092-1-linux@jordanrome.com Signed-off-by: Serge Hallyn <sergeh@kernel.org>
2024-12-04capabilities: remove cap_mmap_file()Paul Moore1-7/+0
The cap_mmap_file() LSM callback returns the default value for the security_mmap_file() LSM hook and can be safely removed. Signed-off-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Serge Hallyn <sergeh@kernel.org>