summaryrefslogtreecommitdiff
path: root/kernel/auditsc.c
AgeCommit message (Collapse)AuthorFilesLines
2024-08-28audit: use task_tgid_nr() instead of task_pid_nr()Ricardo Robaina1-1/+1
In a few audit records, PIDs were being recorded with task_pid_nr() instead of task_tgid_nr(). $ grep "task_pid_nr" kernel/audit*.c audit.c: task_pid_nr(current), auditfilter.c: pid = task_pid_nr(current); auditsc.c: audit_log_format(ab, " pid=%u", task_pid_nr(current)); For single-thread applications, the process id (pid) and the thread group id (tgid) are the same. However, on multi-thread applications, task_pid_nr() returns the current thread id (user-space's TID), while task_tgid_nr() returns the main thread id (user-space's PID). Since the users are more interested in the process id (pid), rather than the thread id (tid), this patch converts these callers to the correct method. Link: https://github.com/linux-audit/audit-kernel/issues/126 Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Ricardo Robaina <rrobaina@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-10-13audit,io_uring: io_uring openat triggers audit reference count underflowDan Clash1-4/+4
An io_uring openat operation can update an audit reference count from multiple threads resulting in the call trace below. A call to io_uring_submit() with a single openat op with a flag of IOSQE_ASYNC results in the following reference count updates. These first part of the system call performs two increments that do not race. do_syscall_64() __do_sys_io_uring_enter() io_submit_sqes() io_openat_prep() __io_openat_prep() getname() getname_flags() /* update 1 (increment) */ __audit_getname() /* update 2 (increment) */ The openat op is queued to an io_uring worker thread which starts the opportunity for a race. The system call exit performs one decrement. do_syscall_64() syscall_exit_to_user_mode() syscall_exit_to_user_mode_prepare() __audit_syscall_exit() audit_reset_context() putname() /* update 3 (decrement) */ The io_uring worker thread performs one increment and two decrements. These updates can race with the system call decrement. io_wqe_worker() io_worker_handle_work() io_wq_submit_work() io_issue_sqe() io_openat() io_openat2() do_filp_open() path_openat() __audit_inode() /* update 4 (increment) */ putname() /* update 5 (decrement) */ __audit_uring_exit() audit_reset_context() putname() /* update 6 (decrement) */ The fix is to change the refcnt member of struct audit_names from int to atomic_t. kernel BUG at fs/namei.c:262! Call Trace: ... ? putname+0x68/0x70 audit_reset_context.part.0.constprop.0+0xe1/0x300 __audit_uring_exit+0xda/0x1c0 io_issue_sqe+0x1f3/0x450 ? lock_timer_base+0x3b/0xd0 io_wq_submit_work+0x8d/0x2b0 ? __try_to_del_timer_sync+0x67/0xa0 io_worker_handle_work+0x17c/0x2b0 io_wqe_worker+0x10a/0x350 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/MW2PR2101MB1033FFF044A258F84AEAA584F1C9A@MW2PR2101MB1033.namprd21.prod.outlook.com/ Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring") Signed-off-by: Dan Clash <daclash@linux.microsoft.com> Link: https://lore.kernel.org/r/20231012215518.GA4048@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-07Merge tag 'net-6.6-rc1' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking updates from Jakub Kicinski: "Including fixes from netfilter and bpf. Current release - regressions: - eth: stmmac: fix failure to probe without MAC interface specified Current release - new code bugs: - docs: netlink: fix missing classic_netlink doc reference Previous releases - regressions: - deal with integer overflows in kmalloc_reserve() - use sk_forward_alloc_get() in sk_get_meminfo() - bpf_sk_storage: fix the missing uncharge in sk_omem_alloc - fib: avoid warn splat in flow dissector after packet mangling - skb_segment: call zero copy functions before using skbuff frags - eth: sfc: check for zero length in EF10 RX prefix Previous releases - always broken: - af_unix: fix msg_controllen test in scm_pidfd_recv() for MSG_CMSG_COMPAT - xsk: fix xsk_build_skb() dereferencing possible ERR_PTR() - netfilter: - nft_exthdr: fix non-linear header modification - xt_u32, xt_sctp: validate user space input - nftables: exthdr: fix 4-byte stack OOB write - nfnetlink_osf: avoid OOB read - one more fix for the garbage collection work from last release - igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU - bpf, sockmap: fix preempt_rt splat when using raw_spin_lock_t - handshake: fix null-deref in handshake_nl_done_doit() - ip: ignore dst hint for multipath routes to ensure packets are hashed across the nexthops - phy: micrel: - correct bit assignments for cable test errata - disable EEE according to the KSZ9477 errata Misc: - docs/bpf: document compile-once-run-everywhere (CO-RE) relocations - Revert "net: macsec: preserve ingress frame ordering", it appears to have been developed against an older kernel, problem doesn't exist upstream" * tag 'net-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits) net: enetc: distinguish error from valid pointers in enetc_fixup_clear_rss_rfs() Revert "net: team: do not use dynamic lockdep key" net: hns3: remove GSO partial feature bit net: hns3: fix the port information display when sfp is absent net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue net: hns3: fix debugfs concurrency issue between kfree buffer and read net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() net: hns3: Support query tx timeout threshold by debugfs net: hns3: fix tx timeout issue net: phy: Provide Module 4 KSZ9477 errata (DS80000754C) netfilter: nf_tables: Unbreak audit log reset netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction netfilter: nf_tables: uapi: Describe NFTA_RULE_CHAIN_ID netfilter: nfnetlink_osf: avoid OOB read netfilter: nftables: exthdr: fix 4-byte stack OOB write selftests/bpf: Check bpf_sk_storage has uncharged sk_omem_alloc bpf: bpf_sk_storage: Fix the missing uncharge in sk_omem_alloc bpf: bpf_sk_storage: Fix invalid wait context lockdep report s390/bpf: Pass through tail call counter in trampolines ...
2023-08-31netfilter: nf_tables: Audit log rule resetPhil Sutter1-0/+1
Resetting rules' stateful data happens outside of the transaction logic, so 'get' and 'dump' handlers have to emit audit log entries themselves. Fixes: 8daa8fde3fc3f ("netfilter: nf_tables: Introduce NFT_MSG_GETRULE_RESET") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-08-31netfilter: nf_tables: Audit log setelem resetPhil Sutter1-0/+1
Since set element reset is not integrated into nf_tables' transaction logic, an explicit log call is needed, similar to NFT_MSG_GETOBJ_RESET handling. For the sake of simplicity, catchall element reset will always generate a dedicated log entry. This relieves nf_tables_dump_set() from having to adjust the logged element count depending on whether a catchall element was found or not. Fixes: 079cd633219d7 ("netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-08-15audit: cleanup function braces and assignment-in-if-conditionAtul Kumar Pant1-2/+4
The patch fixes following checkpatch.pl issue: ERROR: open brace '{' following function definitions go on the next line ERROR: do not use assignment in if condition Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-15audit: add space before parenthesis and around '=', "==", and '<'Atul Kumar Pant1-1/+1
Fixes following checkpatch.pl issue: ERROR: space required before the open parenthesis '(' ERROR: spaces required around that '=' ERROR: spaces required around that '<' ERROR: spaces required around that '==' Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-08audit: fix possible soft lockup in __audit_inode_child()Gaosheng Cui1-0/+2
Tracefs or debugfs maybe cause hundreds to thousands of PATH records, too many PATH records maybe cause soft lockup. For example: 1. CONFIG_KASAN=y && CONFIG_PREEMPTION=n 2. auditctl -a exit,always -S open -k key 3. sysctl -w kernel.watchdog_thresh=5 4. mkdir /sys/kernel/debug/tracing/instances/test There may be a soft lockup as follows: watchdog: BUG: soft lockup - CPU#45 stuck for 7s! [mkdir:15498] Kernel panic - not syncing: softlockup: hung tasks Call trace: dump_backtrace+0x0/0x30c show_stack+0x20/0x30 dump_stack+0x11c/0x174 panic+0x27c/0x494 watchdog_timer_fn+0x2bc/0x390 __run_hrtimer+0x148/0x4fc __hrtimer_run_queues+0x154/0x210 hrtimer_interrupt+0x2c4/0x760 arch_timer_handler_phys+0x48/0x60 handle_percpu_devid_irq+0xe0/0x340 __handle_domain_irq+0xbc/0x130 gic_handle_irq+0x78/0x460 el1_irq+0xb8/0x140 __audit_inode_child+0x240/0x7bc tracefs_create_file+0x1b8/0x2a0 trace_create_file+0x18/0x50 event_create_dir+0x204/0x30c __trace_add_new_event+0xac/0x100 event_trace_add_tracer+0xa0/0x130 trace_array_create_dir+0x60/0x140 trace_array_create+0x1e0/0x370 instance_mkdir+0x90/0xd0 tracefs_syscall_mkdir+0x68/0xa0 vfs_mkdir+0x21c/0x34c do_mkdirat+0x1b4/0x1d4 __arm64_sys_mkdirat+0x4c/0x60 el0_svc_common.constprop.0+0xa8/0x240 do_el0_svc+0x8c/0xc0 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb4 el0_sync+0x160/0x180 Therefore, we add cond_resched() to __audit_inode_child() to fix it. Fixes: 5195d8e217a7 ("audit: dynamically allocate audit_names when not enough space is in the names array") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-01capability: just use a 'u64' instead of a 'u32[2]' arrayLinus Torvalds1-5/+1
Back in 2008 we extended the capability bits from 32 to 64, and we did it by extending the single 32-bit capability word from one word to an array of two words. It was then obfuscated by hiding the "2" behind two macro expansions, with the reasoning being that maybe it gets extended further some day. That reasoning may have been valid at the time, but the last thing we want to do is to extend the capability set any more. And the array of values not only causes source code oddities (with loops to deal with it), but also results in worse code generation. It's a lose-lose situation. So just change the 'u32[2]' into a 'u64' and be done with it. We still have to deal with the fact that the user space interface is designed around an array of these 32-bit values, but that was the case before too, since the array layouts were different (ie user space doesn't use an array of 32-bit values for individual capability masks, but an array of 32-bit slices of multiple masks). So that marshalling of data is actually simplified too, even if it does remain somewhat obscure and odd. This was all triggered by my reaction to the new "cap_isidentical()" introduced recently. By just using a saner data structure, it went from unsigned __capi; CAP_FOR_EACH_U32(__capi) { if (a.cap[__capi] != b.cap[__capi]) return false; } return true; to just being return a.val == b.val; instead. Which is rather more obvious both to humans and to compilers. Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-02-20Merge tag 'fsnotify_for_v6.3-rc1' of ↵Linus Torvalds1-3/+15
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "Support for auditing decisions regarding fanotify permission events" * tag 'fsnotify_for_v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fanotify,audit: Allow audit to use the full permission event response fanotify: define struct members to hold response decision context fanotify: Ensure consistent variable type for response
2023-02-07fanotify,audit: Allow audit to use the full permission event responseRichard Guy Briggs1-3/+15
This patch passes the full response so that the audit function can use all of it. The audit function was updated to log the additional information in the AUDIT_FANOTIFY record. Currently the only type of fanotify info that is defined is an audit rule number, but convert it to hex encoding to future-proof the field. Hex encoding suggested by Paul Moore <paul@paul-moore.com>. The {subj,obj}_trust values are {0,1,2}, corresponding to no, yes, unknown. Sample records: type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_info=3137 subj_trust=3 obj_trust=5 type=FANOTIFY msg=audit(1659730979.839:284): resp=1 fan_type=0 fan_info=0 subj_trust=2 obj_trust=2 Suggested-by: Steve Grubb <sgrubb@redhat.com> Link: https://lore.kernel.org/r/3075502.aeNJFYEL58@x2 Tested-by: Steve Grubb <sgrubb@redhat.com> Acked-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <bcb6d552e517b8751ece153e516d8b073459069c.1675373475.git.rgb@redhat.com>
2023-02-07fanotify: Ensure consistent variable type for responseRichard Guy Briggs1-1/+1
The user space API for the response variable is __u32. This patch makes sure that the whole path through the kernel uses u32 so that there is no sign extension or truncation of the user space response. Suggested-by: Steve Grubb <sgrubb@redhat.com> Link: https://lore.kernel.org/r/12617626.uLZWGnKmhe@x2 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Paul Moore <paul@paul-moore.com> Tested-by: Steve Grubb <sgrubb@redhat.com> Acked-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <3778cb0b3501bc4e686ba7770b20eb9ab0506cf4.1675373475.git.rgb@redhat.com>
2023-01-19fs: port xattr to mnt_idmapChristian Brauner1-2/+2
Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-10-17audit: unify audit_filter_{uring(), inode_name(), syscall()}Ankur Arora1-37/+39
audit_filter_uring(), audit_filter_inode_name() are substantially similar to audit_filter_syscall(). Move the core logic to __audit_filter_op() which can be parametrized for all three. On a Skylakex system, getpid() latency (all results aggregated across 12 boot cycles): Min Mean Median Max pstdev (ns) (ns) (ns) (ns) - 196.63 207.86 206.60 230.98 (+- 3.92%) + 183.73 196.95 192.31 232.49 (+- 6.04%) Performance counter stats for 'bin/getpid' (3 runs) go from: cycles 805.58 ( +- 4.11% ) instructions 1654.11 ( +- .05% ) IPC 2.06 ( +- 3.39% ) branches 430.02 ( +- .05% ) branch-misses 1.55 ( +- 7.09% ) L1-dcache-loads 440.01 ( +- .09% ) L1-dcache-load-misses 9.05 ( +- 74.03% ) to: cycles 765.37 ( +- 6.66% ) instructions 1677.07 ( +- 0.04% ) IPC 2.20 ( +- 5.90% ) branches 431.10 ( +- 0.04% ) branch-misses 1.60 ( +- 11.25% ) L1-dcache-loads 521.04 ( +- 0.05% ) L1-dcache-load-misses 6.92 ( +- 77.60% ) (Both aggregated over 12 boot cycles.) The increased L1-dcache-loads are due to some intermediate values now coming from the stack. The improvement in cycles is due to a slightly denser loop (the list parameter in the list_for_each_entry_rcu() exit check now comes from a register rather than a constant as before.) Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-10-17audit: cache ctx->major in audit_filter_syscall()Ankur Arora1-1/+2
ctx->major contains the current syscall number. This is, of course, a constant for the duration of the syscall. Unfortunately, GCC's alias analysis cannot prove that it is not modified via a pointer in the audit_filter_syscall() loop, and so always loads it from memory. In and of itself the load isn't very expensive (ops dependent on the ctx->major load are only used to determine the direction of control flow and have short dependence chains and, in any case the related branches get predicted perfectly in the fastpath) but still cache ctx->major in a local for two reasons: * ctx->major is in the first cacheline of struct audit_context and has similar alignment as audit_entry::list audit_entry. For cases with a lot of audit rules, doing this reduces one source of contention from a potentially busy cache-set. * audit_in_mask() (called in the hot loop in audit_filter_syscall()) does cast manipulation and error checking on ctx->major: audit_in_mask(const struct audit_krule *rule, unsigned long val): if (val > 0xffffffff) return false; word = AUDIT_WORD(val); if (word >= AUDIT_BITMASK_SIZE) return false; bit = AUDIT_BIT(val); return rule->mask[word] & bit; The clauses related to the rule need to be evaluated in the loop, but the rest is unnecessarily re-evaluated for every loop iteration. (Note, however, that most of these are cheap ALU ops and the branches are perfectly predicted. However, see discussion on cycles improvement below for more on why it is still worth hoisting.) On a Skylakex system change in getpid() latency (aggregated over 12 boot cycles): Min Mean Median Max pstdev (ns) (ns) (ns) (ns) - 201.30 216.14 216.22 228.46 (+- 1.45%) + 196.63 207.86 206.60 230.98 (+- 3.92%) Performance counter stats for 'bin/getpid' (3 runs) go from: cycles 836.89 ( +- .80% ) instructions 2000.19 ( +- .03% ) IPC 2.39 ( +- .83% ) branches 430.14 ( +- .03% ) branch-misses 1.48 ( +- 3.37% ) L1-dcache-loads 471.11 ( +- .05% ) L1-dcache-load-misses 7.62 ( +- 46.98% ) to: cycles 805.58 ( +- 4.11% ) instructions 1654.11 ( +- .05% ) IPC 2.06 ( +- 3.39% ) branches 430.02 ( +- .05% ) branch-misses 1.55 ( +- 7.09% ) L1-dcache-loads 440.01 ( +- .09% ) L1-dcache-load-misses 9.05 ( +- 74.03% ) (Both aggregated over 12 boot cycles.) instructions: we reduce around 8 instructions/iteration because some of the computation is now hoisted out of the loop (branch count does not change because GCC, for reasons unclear, only hoists the computations while keeping the basic-blocks.) cycles: improve by about 5% (in aggregate and looking at individual run numbers.) This is likely because we now waste fewer pipeline resources on unnecessary instructions which allows the control flow to speculatively execute further ahead shortening the execution of the loop a little. The final gating factor on the performance of this loop remains the long dependence chain due to the linked-list load. Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-10-04Merge tag 'audit-pr-20221003' of ↵Linus Torvalds1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Six audit patches for v6.1, most are pretty trivial, but a quick list of the highlights are below: - Only free the audit proctitle information on task exit. This allows us to cache the information and improve performance slightly. - Use the time_after() macro to do time comparisons instead of doing it directly and potentially causing ourselves problems when the timer wraps. - Convert an audit_context state comparison from a relative enum comparison, e.g. (x < y), to a not-equal comparison to ensure that we are not caught out at some unknown point in the future by an enum shuffle. - A handful of small cleanups such as tidying up comments and removing unused declarations" * tag 'audit-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: remove selinux_audit_rule_update() declaration audit: use time_after to compare time audit: free audit_proctitle only on task exit audit: explicitly check audit_context->context enum value audit: audit_context pid unused, context enum comment fix audit: fix repeated words in comments
2022-08-26audit: free audit_proctitle only on task exitRichard Guy Briggs1-1/+1
Since audit_proctitle is generated at syscall exit time, its value is used immediately and cached for the next syscall. Since this is the case, then only clear it at task exit time. Otherwise, there is no point in caching the value OR bearing the overhead of regenerating it. Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-26audit: explicitly check audit_context->context enum valueRichard Guy Briggs1-1/+1
Be explicit in checking the struct audit_context "context" member enum value rather than assuming the order of context enum values. Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-26audit: audit_context pid unused, context enum comment fixRichard Guy Briggs1-2/+2
The pid member of struct audit_context is never used. Remove it. The audit_reset_context() comment about unconditionally resetting "ctx->state" should read "ctx->context". Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-25audit: move audit_return_fixup before the filtersRichard Guy Briggs1-2/+2
The success and return_code are needed by the filters. Move audit_return_fixup() before the filters. This was causing syscall auditing events to be missed. Link: https://github.com/linux-audit/audit-kernel/issues/138 Cc: stable@vger.kernel.org Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: manual merge required] Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-15audit: fix repeated words in commentsJilin Yuan1-1/+1
Delete the redundant word 'doesn't'. Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-04audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker()Peilin Ye1-25/+0
Currently @audit_context is allocated twice for io_uring workers: 1. copy_process() calls audit_alloc(); 2. io_sq_thread() or io_wqe_worker() calls audit_alloc_kernel() (which is effectively audit_alloc()) and overwrites @audit_context, causing: BUG: memory leak unreferenced object 0xffff888144547400 (size 1024): <...> hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8135cfc3>] audit_alloc+0x133/0x210 [<ffffffff81239e63>] copy_process+0xcd3/0x2340 [<ffffffff8123b5f3>] create_io_thread+0x63/0x90 [<ffffffff81686604>] create_io_worker+0xb4/0x230 [<ffffffff81686f68>] io_wqe_enqueue+0x248/0x3b0 [<ffffffff8167663a>] io_queue_iowq+0xba/0x200 [<ffffffff816768b3>] io_queue_async+0x113/0x180 [<ffffffff816840df>] io_req_task_submit+0x18f/0x1a0 [<ffffffff816841cd>] io_apoll_task_func+0xdd/0x120 [<ffffffff8167d49f>] tctx_task_work+0x11f/0x570 [<ffffffff81272c4e>] task_work_run+0x7e/0xc0 [<ffffffff8125a688>] get_signal+0xc18/0xf10 [<ffffffff8111645b>] arch_do_signal_or_restart+0x2b/0x730 [<ffffffff812ea44e>] exit_to_user_mode_prepare+0x5e/0x180 [<ffffffff844ae1b2>] syscall_exit_to_user_mode+0x12/0x20 [<ffffffff844a7e80>] do_syscall_64+0x40/0x80 Then, 3. io_sq_thread() or io_wqe_worker() frees @audit_context using audit_free(); 4. do_exit() eventually calls audit_free() again, which is okay because audit_free() does a NULL check. As suggested by Paul Moore, fix it by deleting audit_alloc_kernel() and redundant audit_free() calls. Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring") Suggested-by: Paul Moore <paul@paul-moore.com> Cc: stable@vger.kernel.org Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20220803222343.31673-1-yepeilin.cs@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15audit: free module nameChristian Göttsche1-1/+1
Reset the type of the record last as the helper `audit_free_module()` depends on it. unreferenced object 0xffff888153b707f0 (size 16): comm "modprobe", pid 1319, jiffies 4295110033 (age 1083.016s) hex dump (first 16 bytes): 62 69 6e 66 6d 74 5f 6d 69 73 63 00 6b 6b 6b a5 binfmt_misc.kkk. backtrace: [<ffffffffa07dbf9b>] kstrdup+0x2b/0x50 [<ffffffffa04b0a9d>] __audit_log_kern_module+0x4d/0xf0 [<ffffffffa03b6664>] load_module+0x9d4/0x2e10 [<ffffffffa03b8f44>] __do_sys_finit_module+0x114/0x1b0 [<ffffffffa1f47124>] do_syscall_64+0x34/0x80 [<ffffffffa200007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Cc: stable@vger.kernel.org Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-05-17audit,io_uring,io-wq: call __audit_uring_exit for dummy contextsJulian Orth1-0/+6
Not calling the function for dummy contexts will cause the context to not be reset. During the next syscall, this will cause an error in __audit_syscall_entry: WARN_ON(context->context != AUDIT_CTX_UNUSED); WARN_ON(context->name_count); if (context->context != AUDIT_CTX_UNUSED || context->name_count) { audit_panic("unrecoverable error in audit_syscall_entry()"); return; } These problematic dummy contexts are created via the following call chain: exit_to_user_mode_prepare -> arch_do_signal_or_restart -> get_signal -> task_work_run -> tctx_task_work -> io_req_task_submit -> io_issue_sqe -> audit_uring_entry Cc: stable@vger.kernel.org Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring") Signed-off-by: Julian Orth <ju.orth@gmail.com> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-03-21Merge tag 'audit-pr-20220321' of ↵Linus Torvalds1-20/+67
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit update from Paul Moore: "Just one audit patch queued for v5.18: - Change the AUDIT_TIME_* record generation so that they are generated at syscall exit time and subject to all of the normal syscall exit filtering. This should help reduce noise and ensure those records which are most relevant to the admin's audit configuration are recorded in the audit log" * tag 'audit-pr-20220321' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: log AUDIT_TIME_* records only from rules
2022-02-22audit: log AUDIT_TIME_* records only from rulesRichard Guy Briggs1-20/+67
AUDIT_TIME_* events are generated when there are syscall rules present that are not related to time keeping. This will produce noisy log entries that could flood the logs and hide events we really care about. Rather than immediately produce the AUDIT_TIME_* records, store the data in the context and log it at syscall exit time respecting the filter rules. Note: This eats the audit_buffer, unlike any others in show_special(). Please see https://bugzilla.redhat.com/show_bug.cgi?id=1991919 Fixes: 7e8eda734d30 ("ntp: Audit NTP parameters adjustment") Fixes: 2d87a0674bd6 ("timekeeping: Audit clock adjustments") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: fixed style/whitespace issues] Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-02-09audit: don't deref the syscall args when checking the openat2 open_how::flagsPaul Moore1-1/+1
As reported by Jeff, dereferencing the openat2 syscall argument in audit_match_perm() to obtain the open_how::flags can result in an oops/page-fault. This patch fixes this by using the open_how struct that we store in the audit_context with audit_openat2_how(). Independent of this patch, Richard Guy Briggs posted a similar patch to the audit mailing list roughly 40 minutes after this patch was posted. Cc: stable@vger.kernel.org Fixes: 1c30e3af8a79 ("audit: add support for the openat2 syscall") Reported-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-11-22lsm: security_task_getsecid_subj() -> security_current_getsecid_subj()Paul Moore1-1/+10
The security_task_getsecid_subj() LSM hook invites misuse by allowing callers to specify a task even though the hook is only safe when the current task is referenced. Fix this by removing the task_struct argument to the hook, requiring LSM implementations to use the current task. While we are changing the hook declaration we also rename the function to security_current_getsecid_subj() in an effort to reinforce that the hook captures the subjective credentials of the current task and not an arbitrary task on the system. Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-11-01Merge tag 'audit-pr-20211101' of ↵Linus Torvalds1-22/+29
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Add some additional audit logging to capture the openat2() syscall open_how struct info. Previous variations of the open()/openat() syscalls allowed audit admins to inspect the syscall args to get the information contained in the new open_how struct used in openat2()" * tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: return early if the filter rule has a lower priority audit: add OPENAT2 record to list "how" info audit: add support for the openat2 syscall audit: replace magic audit syscall class numbers with macros lsm_audit: avoid overloading the "key" audit field audit: Convert to SPDX identifier audit: rename struct node to struct audit_node to prevent future name collisions
2021-11-01Merge tag 'selinux-pr-20211101' of ↵Linus Torvalds1-100/+368
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add LSM/SELinux/Smack controls and auditing for io-uring. As usual, the individual commit descriptions have more detail, but we were basically missing two things which we're adding here: + establishment of a proper audit context so that auditing of io-uring ops works similarly to how it does for syscalls (with some io-uring additions because io-uring ops are *not* syscalls) + additional LSM hooks to enable access control points for some of the more unusual io-uring features, e.g. credential overrides. The additional audit callouts and LSM hooks were done in conjunction with the io-uring folks, based on conversations and RFC patches earlier in the year. - Fixup the binder credential handling so that the proper credentials are used in the LSM hooks; the commit description and the code comment which is removed in these patches are helpful to understand the background and why this is the proper fix. - Enable SELinux genfscon policy support for securityfs, allowing improved SELinux filesystem labeling for other subsystems which make use of securityfs, e.g. IMA. * tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: security: Return xattr name from security_dentry_init_security() selinux: fix a sock regression in selinux_ip_postroute_compat() binder: use cred instead of task for getsecid binder: use cred instead of task for selinux checks binder: use euid from cred instead of using task LSM: Avoid warnings about potentially unused hook variables selinux: fix all of the W=1 build warnings selinux: make better use of the nf_hook_state passed to the NF hooks selinux: fix race condition when computing ocontext SIDs selinux: remove unneeded ipv6 hook wrappers selinux: remove the SELinux lockdown implementation selinux: enable genfscon labeling for securityfs Smack: Brutalist io_uring support selinux: add support for the io_uring access controls lsm,io_uring: add LSM hooks to io_uring io_uring: convert io_uring to the secure anon inode interface fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure() audit: add filtering for io_uring records audit,io_uring,io-wq: add some basic audit support to io_uring audit: prepare audit_context for use in calling contexts beyond syscalls
2021-10-18audit: return early if the filter rule has a lower priorityGaosheng Cui1-2/+3
It is not necessary for audit_filter_rules() functions to check audit fileds of the rule with a lower priority, and if we did, there might be some unintended effects, such as the ctx->ppid may be changed unexpectedly, so return early if the rule has a lower priority. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> [PM: slight tweak to the subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-18audit: fix possible null-pointer dereference in audit_filter_rulesGaosheng Cui1-1/+1
Fix possible null-pointer dereference in audit_filter_rules. audit_filter_rules() error: we previously assumed 'ctx' could be null Cc: stable@vger.kernel.org Fixes: bf361231c295 ("audit: add saddr_fam filter field") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-04audit: add OPENAT2 record to list "how" infoRichard Guy Briggs1-1/+17
Since the openat2(2) syscall uses a struct open_how pointer to communicate its parameters they are not usefully recorded by the audit SYSCALL record's four existing arguments. Add a new audit record type OPENAT2 that reports the parameters in its third argument, struct open_how with fields oflag, mode and resolve. The new record in the context of an event would look like: time->Wed Mar 17 16:28:53 2021 type=PROCTITLE msg=audit(1616012933.531:184): proctitle= 73797363616C6C735F66696C652F6F70656E617432002F746D702F61756469742D 7465737473756974652D737641440066696C652D6F70656E617432 type=PATH msg=audit(1616012933.531:184): item=1 name="file-openat2" inode=29 dev=00:1f mode=0100600 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0 type=PATH msg=audit(1616012933.531:184): item=0 name="/root/rgb/git/audit-testsuite/tests" inode=25 dev=00:1f mode=040700 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=PARENT cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0 type=CWD msg=audit(1616012933.531:184): cwd="/root/rgb/git/audit-testsuite/tests" type=OPENAT2 msg=audit(1616012933.531:184): oflag=0100302 mode=0600 resolve=0xa type=SYSCALL msg=audit(1616012933.531:184): arch=c000003e syscall=437 success=yes exit=4 a0=3 a1=7ffe315f1c53 a2=7ffe315f1550 a3=18 items=2 ppid=528 pid=540 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 ses=1 comm="openat2" exe="/root/rgb/git/audit-testsuite/tests/syscalls_file/openat2" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="testsuite-1616012933-bjAUcEPO" Link: https://lore.kernel.org/r/d23fbb89186754487850367224b060e26f9b7181.1621363275.git.rgb@redhat.com Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> [PM: tweak subject, wrap example, move AUDIT_OPENAT2 to 1337] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-01audit: add support for the openat2 syscallRichard Guy Briggs1-0/+3
The openat2(2) syscall was added in kernel v5.6 with commit fddb5d430ad9 ("open: introduce openat2(2) syscall"). Add the openat2(2) syscall to the audit syscall classifier. Link: https://github.com/linux-audit/audit-kernel/issues/67 Link: https://lore.kernel.org/r/f5f1a4d8699613f8c02ce762807228c841c2e26f.1621363275.git.rgb@redhat.com Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> [PM: merge fuzz due to previous header rename, commit line wraps] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-01audit: replace magic audit syscall class numbers with macrosRichard Guy Briggs1-6/+6
Replace audit syscall class magic numbers with macros. This required putting the macros into new header file include/linux/audit_arch.h since the syscall macros were included for both 64 bit and 32 bit in any compat code, causing redefinition warnings. Link: https://lore.kernel.org/r/2300b1083a32aade7ae7efb95826e8f3f260b1df.1621363275.git.rgb@redhat.com Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> [PM: renamed header to audit_arch.h after consulting with Richard] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-19audit: add filtering for io_uring recordsPaul Moore1-14/+46
This patch adds basic audit io_uring filtering, using as much of the existing audit filtering infrastructure as possible. In order to do this we reuse the audit filter rule's syscall mask for the io_uring operation and we create a new filter for io_uring operations as AUDIT_FILTER_URING_EXIT/audit_filter_list[7]. Thanks to Richard Guy Briggs for his review, feedback, and work on the corresponding audit userspace changes. Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-19audit,io_uring,io-wq: add some basic audit support to io_uringPaul Moore1-0/+166
This patch adds basic auditing to io_uring operations, regardless of their context. This is accomplished by allocating audit_context structures for the io-wq worker and io_uring SQPOLL kernel threads as well as explicitly auditing the io_uring operations in io_issue_sqe(). Individual io_uring operations can bypass auditing through the "audit_skip" field in the struct io_op_def definition for the operation; although great care must be taken so that security relevant io_uring operations do not bypass auditing; please contact the audit mailing list (see the MAINTAINERS file) with any questions. The io_uring operations are audited using a new AUDIT_URINGOP record, an example is shown below: type=UNKNOWN[1336] msg=audit(1631800225.981:37289): uring_op=19 success=yes exit=0 items=0 ppid=15454 pid=15681 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) Thanks to Richard Guy Briggs for review and feedback. Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-19audit: prepare audit_context for use in calling contexts beyond syscallsPaul Moore1-93/+163
This patch cleans up some of our audit_context handling by abstracting out the reset and return code fixup handling to dedicated functions. Not only does this help make things easier to read and inspect, it allows for easier reuse by future patches. We also convert the simple audit_context->in_syscall flag into an enum which can be used to by future patches to indicate a calling context other than the syscall context. Thanks to Richard Guy Briggs for review and feedback. Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-14audit: Convert to SPDX identifierCai Huoqing1-14/+1
Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-06-10audit: remove trailing spaces and tabsZhen Lei1-4/+4
Run the following command to find and remove the trailing spaces and tabs: sed -r -i 's/[ \t]+$//' <audit_files> The files to be checked are as follows: kernel/audit* include/linux/audit.h include/uapi/linux/audit.h Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-06-08audit: Rename enum audit_state constants to avoid AUDIT_DISABLED redefinitionSergey Nazarov1-17/+17
AUDIT_DISABLED defined in kernel/audit.h as element of enum audit_state and redefined in kernel/audit.c. This produces a warning when kernel builds with syscalls audit disabled and brokes kernel build if -Werror used. enum audit_state used in syscall audit code only. This patch changes enum audit_state constants prefix AUDIT to AUDIT_STATE to avoid AUDIT_DISABLED redefinition. Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-05-10audit: add blank line after variable declarationsRoni Nevalainen1-0/+21
Fix the following checkpatch warning in auditsc.c: WARNING: Missing a blank line after declarations Signed-off-by: Roni Nevalainen <kitten@kittenz.dev> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-04-27Merge tag 'audit-pr-20210426' of ↵Linus Torvalds1-7/+4
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Another small pull request for audit, most of the patches are documentation updates with only two real code changes: one to fix a compiler warning for a dummy function/macro, and one to cleanup some code since we removed the AUDIT_FILTER_ENTRY ages ago (v4.17)" * tag 'audit-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: drop /proc/PID/loginuid documentation Format field audit: avoid -Wempty-body warning audit: document /proc/PID/sessionid audit: document /proc/PID/loginuid MAINTAINERS: update audit files audit: further cleanup of AUDIT_FILTER_ENTRY deprecation
2021-03-22lsm: separate security_task_getsecid() into subjective and objective variantsPaul Moore1-4/+4
Of the three LSMs that implement the security_task_getsecid() LSM hook, all three LSMs provide the task's objective security credentials. This turns out to be unfortunate as most of the hook's