path: root/kernel/module
Age | Commit message | Author | Files | Lines
2024-01-10 | Merge tag 'modules-6.8-rc1' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "Just one cleanup and one documentation improvement change. No
  functional changes"

* tag 'modules-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  kernel/module: improve documentation for try_module_get()
  module: Remove redundant TASK_UNINTERRUPTIBLE
2023-12-19 | module: Remove redundant TASK_UNINTERRUPTIBLE | Kevin Hao | 1 | -1/+1
TASK_KILLABLE already includes TASK_UNINTERRUPTIBLE, so there is no need to add a separate TASK_UNINTERRUPTIBLE. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
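The redundancy is a plain bitmask identity. A minimal user-space sketch follows, assuming flag values that mirror include/linux/sched.h (the numeric values and the helper names `demo_flag_is_redundant` and `demo_wait_state` are illustrative assumptions, not kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; assumed to mirror include/linux/sched.h. */
#define TASK_UNINTERRUPTIBLE 0x0002u
#define TASK_WAKEKILL        0x0100u
#define TASK_KILLABLE        (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* True when OR-ing `extra` into `state` would leave `state` unchanged. */
static bool demo_flag_is_redundant(unsigned int state, unsigned int extra)
{
	return (state | extra) == state;
}

/* Hypothetical caller: the shorter form describes exactly the same state. */
static unsigned int demo_wait_state(void)
{
	assert((TASK_UNINTERRUPTIBLE | TASK_KILLABLE) == TASK_KILLABLE);
	return TASK_KILLABLE;
}
```

Because TASK_KILLABLE is defined as a superset of TASK_UNINTERRUPTIBLE, dropping the extra flag cannot change behaviour, which matches the "no functional changes" note in the pull request.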
2023-12-18 | kunit: add KUNIT_INIT_TABLE to init linker section | Rae Moar | 1 | -0/+3
Add KUNIT_INIT_TABLE to the INIT_DATA linker section. Alter the KUnit macros to create init tests: kunit_test_init_section_suites Update lib/kunit/executor.c to run both the suites in KUNIT_TABLE and KUNIT_INIT_TABLE. Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Rae Moar <rmoar@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-02 | Merge tag 'v6.7-p1' of ↵ | Linus Torvalds | 1 | -10/+15
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Add virtual-address based lskcipher interface - Optimise ahash/shash performance in light of costly indirect calls - Remove ahash alignmask attribute Algorithms: - Improve AES/XTS performance of 6-way unrolling for ppc - Remove some uses of obsolete algorithms (md4, md5, sha1) - Add FIPS 202 SHA-3 support in pkcs1pad - Add fast path for single-page messages in adiantum - Remove zlib-deflate Drivers: - Add support for S4 in meson RNG driver - Add STM32MP13x support in stm32 - Add hwrng interface support in qcom-rng - Add support for deflate algorithm in hisilicon/zip" * tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits) crypto: adiantum - flush destination page before unmapping crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place Documentation/module-signing.txt: bring up to date module: enable automatic module signing with FIPS 202 SHA-3 crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support crypto: FIPS 202 SHA-3 register in hash info for IMA x509: Add OIDs for FIPS 202 SHA-3 hash and signatures crypto: ahash - optimize performance when wrapping shash crypto: ahash - check for shash type instead of not ahash type crypto: hash - move "ahash wrapping shash" functions to ahash.c crypto: talitos - stop using crypto_ahash::init crypto: chelsio - stop using crypto_ahash::init crypto: ahash - improve file comment crypto: ahash - remove struct ahash_request_priv crypto: ahash - remove crypto_ahash_alignmask crypto: gcm - stop using alignmask of ahash crypto: chacha20poly1305 - stop using alignmask of ahash crypto: ccm - stop using alignmask of ahash net: ipv6: stop checking crypto_ahash_alignmask ...
2023-11-02 | module/decompress: use kvmalloc() consistently | Andrea Righi | 1 | -4/+4
We consistently switched from kmalloc() to vmalloc() in module decompression to prevent potential memory allocation failures with large modules, however vmalloc() is not as memory-efficient and fast as kmalloc(). Since we don't know in general the size of the workspace required by the decompression algorithm, it is more reasonable to use kvmalloc() consistently, also considering that we don't have special memory requirements here. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-11-01 | module: Annotate struct module_notes_attrs with __counted_by | Kees Cook | 1 | -1/+1
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).

As found with Coccinelle[1], add __counted_by for struct module_notes_attrs.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
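To illustrate the annotation outside the kernel, here is a small user-space sketch. The `COUNTED_BY` macro, `struct notes_demo`, and both helpers are hypothetical stand-ins; the macro expands to nothing on compilers without the `counted_by` attribute, so the struct layout is identical either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use the attribute when the compiler offers it, otherwise a no-op. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical struct: `notes` records how many `attrs` entries are valid. */
struct notes_demo {
	unsigned int notes;
	int attrs[] COUNTED_BY(notes);
};

static struct notes_demo *notes_demo_alloc(unsigned int n)
{
	struct notes_demo *p = calloc(1, sizeof(*p) + n * sizeof(p->attrs[0]));

	if (p)
		p->notes = n;	/* set the count before touching attrs[] */
	return p;
}

/* Manual check of the same bound the attribute lets a sanitizer enforce. */
static void notes_demo_set(struct notes_demo *p, unsigned int idx, int value)
{
	assert(idx < p->notes);
	p->attrs[idx] = value;
}
```

The design point is that the count lives next to the array and is assigned before any element access, so a bounds checker always has a valid limit to compare against.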
2023-11-01 | module: Fix comment typo | Zhu Mao | 1 | -1/+1
Delete duplicated word in comment. Signed-off-by: Zhu Mao <zhumao001@208suo.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-11-01 | module/decompress: use vmalloc() for gzip decompression workspace | Andrea Righi | 1 | -2/+2
Use a similar approach as commit a419beac4a07 ("module/decompress: use vmalloc() for zstd decompression workspace") and replace kmalloc() with vmalloc() also for the gzip module decompression workspace.

In this case the workspace is represented by struct inflate_workspace that can be fairly large for kmalloc() and it can potentially lead to allocation errors on certain systems:

  $ pahole inflate_workspace
  struct inflate_workspace {
          struct inflate_state inflate_state;          /*     0  9544 */
          /* --- cacheline 149 boundary (9536 bytes) was 8 bytes ago --- */
          unsigned char working_window[32768];          /*  9544 32768 */

          /* size: 42312, cachelines: 662, members: 2 */
          /* last cacheline: 8 bytes */
  };

Considering that there is no need to use contiguous physical memory, simply switch to vmalloc() to provide a more reliable in-kernel module decompression.

Fixes: b1ae6dc41eaa ("module: add in-kernel support for decompressing")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
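The size reported by pahole can be turned into a rough page-order estimate. The sketch below assumes 4 KiB pages and that a physically contiguous allocation is rounded up to a power-of-two number of pages; `DEMO_PAGE_SIZE`, `demo_get_order`, and `inflate_workspace_size` are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Smallest order such that (DEMO_PAGE_SIZE << order) >= size. */
static unsigned int demo_get_order(size_t size)
{
	unsigned int order = 0;
	size_t span = DEMO_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Member sizes taken from the pahole output in the commit message. */
static size_t inflate_workspace_size(void)
{
	return 9544 + 32768;	/* inflate_state + working_window */
}
```

Under these assumptions the 42312-byte workspace needs an order-4 block, that is 16 physically contiguous pages, which is the kind of request that can fail on a fragmented system while vmalloc() only needs individual pages.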
2023-10-27 | module: enable automatic module signing with FIPS 202 SHA-3 | Dimitri John Ledkov | 1 | -0/+15
Add Kconfig options to use SHA-3 for kernel module signing. 256 size for RSA only, and higher sizes for RSA and NIST P-384. Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-20 | module: Do not offer sha224 for built-in module signing | Dimitri John Ledkov | 1 | -5/+0
sha224 does not provide enough security against collision attacks relative to the default keys used for signing (RSA 4k & P-384). Also sha224 never became popular, as sha256 got widely adopted ahead of sha224 being introduced. Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-20 | crypto: pkcs7 - remove sha1 support | Dimitri John Ledkov | 1 | -5/+0
Removes support for sha1 signed kernel modules, importing sha1 signed x.509 certificates. rsa-pkcs1pad keeps sha1 padding support, which seems to be used by virtio driver. sha1 remains available as there are many drivers and subsystems using it. Note only hmac(sha1) with secret keys remains cryptographically secure. In the kernel there are filesystems, IMA, tpm/pcr that appear to be using sha1. Maybe they can all start to be slowly upgraded to something else i.e. blake3, ParallelHash, SHAKE256 as needed. Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-29 | module/decompress: use vmalloc() for zstd decompression workspace | Andrea Righi | 1 | -2/+2
Using kmalloc() to allocate the decompression workspace for zstd may trigger the following warning when large modules are loaded (e.g., xfs):

  [ 2.961884] WARNING: CPU: 1 PID: 254 at mm/page_alloc.c:4453 __alloc_pages+0x2c3/0x350
  ...
  [ 2.989033] Call Trace:
  [ 2.989841] <TASK>
  [ 2.990614] ? show_regs+0x6d/0x80
  [ 2.991573] ? __warn+0x89/0x160
  [ 2.992485] ? __alloc_pages+0x2c3/0x350
  [ 2.993520] ? report_bug+0x17e/0x1b0
  [ 2.994506] ? handle_bug+0x51/0xa0
  [ 2.995474] ? exc_invalid_op+0x18/0x80
  [ 2.996469] ? asm_exc_invalid_op+0x1b/0x20
  [ 2.997530] ? module_zstd_decompress+0xdc/0x2a0
  [ 2.998665] ? __alloc_pages+0x2c3/0x350
  [ 2.999695] ? module_zstd_decompress+0xdc/0x2a0
  [ 3.000821] __kmalloc_large_node+0x7a/0x150
  [ 3.001920] __kmalloc+0xdb/0x170
  [ 3.002824] module_zstd_decompress+0xdc/0x2a0
  [ 3.003857] module_decompress+0x37/0xc0
  [ 3.004688] init_module_from_file+0xd0/0x100
  [ 3.005668] idempotent_init_module+0x11c/0x2b0
  [ 3.006632] __x64_sys_finit_module+0x64/0xd0
  [ 3.007568] do_syscall_64+0x59/0x90
  [ 3.008373] ? ksys_read+0x73/0x100
  [ 3.009395] ? exit_to_user_mode_prepare+0x30/0xb0
  [ 3.010531] ? syscall_exit_to_user_mode+0x37/0x60
  [ 3.011662] ? do_syscall_64+0x68/0x90
  [ 3.012511] ? do_syscall_64+0x68/0x90
  [ 3.013364] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

However, contiguous physical memory does not seem to be required in module_zstd_decompress(), so use vmalloc() instead, to prevent the warning and avoid potential failures at loading compressed modules.

Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-03 | module: Expose module_init_layout_section() | James Morse | 1 | -1/+1
module_init_layout_section() chooses whether the core module loader considers a section as init or not. This affects the placement of the exit section when module unloading is disabled. This code will never run, so it can be free()d once the module has been initialised.

arm and arm64 need to count the number of PLTs they need before applying relocations based on the section name. The init PLTs are stored separately so they can be free()d. arm and arm64 both use within_module_init() to decide which list of PLTs to use when applying the relocation.

Because within_module_init()'s behaviour changes when module unloading is disabled, both architectures would need to take this into account when counting the PLTs. Today neither architecture does this, meaning when module unloading is disabled there are insufficient PLTs in the init section to load some modules, resulting in warnings:

  | WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc
  | Modules linked in: crct10dif_common
  | CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208
  | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  | pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  | pc : module_emit_plt_entry+0x184/0x1cc
  | lr : module_emit_plt_entry+0x94/0x1cc
  | sp : ffffffc0803bba60
  [...]
  | Call trace:
  | module_emit_plt_entry+0x184/0x1cc
  | apply_relocate_add+0x2bc/0x8e4
  | load_module+0xe34/0x1bd4
  | init_module_from_file+0x84/0xc0
  | __arm64_sys_finit_module+0x1b8/0x27c
  | invoke_syscall.constprop.0+0x5c/0x104
  | do_el0_svc+0x58/0x160
  | el0_svc+0x38/0x110
  | el0t_64_sync_handler+0xc0/0xc4
  | el0t_64_sync+0x190/0x194

Instead of duplicating module_init_layout_section()'s logic, expose it.
Reported-by: Adam Johnston <adam.johnston@arm.com> Fixes: 055f23b74b20 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: stable@vger.kernel.org Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-02 | modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules | Christoph Hellwig | 1 | -3/+11
It has recently come to my attention that nvidia is circumventing the protection added in 262e6ae7081d ("modules: inherit TAINT_PROPRIETARY_MODULE") by importing exports from their proprietary modules into an allegedly GPL licensed module and then re-exporting them.

Given that symbol_get was only ever intended for tightly cooperating modules using very internal symbols it is logical to restrict it to being used on EXPORT_SYMBOL_GPL and prevent nvidia from costly DMCA Circumvention of Access Controls lawsuits.

All symbols except for four used through symbol_get were already exported as EXPORT_SYMBOL_GPL, and the remaining four were switched over in the preparation patches.

Fixes: 262e6ae7081d ("modules: inherit TAINT_PROPRIETARY_MODULE")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-07-24 | modpost, kallsyms: Treat add '$'-prefixed symbols as mapping symbols | Palmer Dabbelt | 1 | -1/+1
Trying to restrict the '$'-prefix change to RISC-V caused some fallout, so let's just treat all those symbols as special. Fixes: c05780ef3c190 ("module: Ignore RISC-V mapping symbols too") Link: https://lore.kernel.org/all/20230712015747.77263-1-wangkefeng.wang@huawei.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
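The rule this commit describes reduces to a prefix test. The following is only a sketch of that one rule: `demo_is_mapping_symbol` is a hypothetical helper, and any other prefixes the kernel's real check may handle are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: every '$'-prefixed name counts as a mapping symbol. */
static bool demo_is_mapping_symbol(const char *name)
{
	assert(name != NULL);
	return name[0] == '$';
}
```

Keying on the first character alone avoids per-architecture special cases, which is the simplification the commit message argues for.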
2023-07-10 | module: Ignore RISC-V mapping symbols too | Palmer Dabbelt | 1 | -1/+1
RISC-V has an extended form of mapping symbols that we use to encode the ISA when it changes in the middle of an ELF. This trips up modpost as a build failure, I haven't verified it yet but I believe the kallsyms difference should result in stacks looking sane again. Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/all/9d9e2902-5489-4bf0-d9cb-556c8e5d71c2@infradead.org/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-07-04 | module: fix init_module_from_file() error handling | Linus Torvalds | 1 | -16/+23
Vegard Nossum pointed out two different problems with the error handling in init_module_from_file():

 (a) the idempotent loading code didn't clean up properly in some error cases, leaving the on-stack 'struct idempotent' element still in the hash table

 (b) failure to read the module file would nonsensically update the 'invalid_kread_bytes' stat counter with the error value

The first error is quite nasty, in that it can then cause subsequent idempotent loads of that same file to access stale stack contents of the previous failure. The case may not happen in any normal situation (explaining all the "Tested-by"s on the original change), and requires admin privileges, but syzkaller triggers random bad behavior as a result:

    BUG: soft lockup in sys_finit_module
    BUG: unable to handle kernel paging request in init_module_from_file
    general protection fault in init_module_from_file
    INFO: task hung in init_module_from_file
    KASAN: out-of-bounds Read in init_module_from_file
    KASAN: slab-out-of-bounds Read in init_module_from_file
    ...

The second error is fairly benign and just leads to nonsensical stats (and has been around since the debug stats were added).

Vegard also provided a patch for the idempotent loading issue, but I'd rather re-organize the code and make it more legible using another level of helper functions than add the usual "goto out" error handling.

Link: https://lore.kernel.org/lkml/20230704100852.23452-1-vegard.nossum@oracle.com/
Fixes: 9b9879fc0327 ("modules: catch concurrent module loads, treat them as idempotent")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reported-by: syzbot+9c2bdc9d24e4a7abe741@syzkaller.appspotmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-01 | Merge tag 'kbuild-v6.5' of ↵ | Linus Torvalds | 1 | -0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the deprecated rule to build *.dtbo from *.dts - Refactor section mismatch detection in modpost - Fix bogus ARM section mismatch detections - Fix error of 'make gtags' with O= option - Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error with the latest LLVM version - Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed - Ignore more compiler-generated symbols for kallsyms - Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles - Enable more kernel-doc warnings with W=2 - Refactor <linux/export.h> by generating KSYMTAB data by modpost - Deprecate <asm/export.h> and <asm-generic/export.h> - Remove the EXPORT_DATA_SYMBOL macro - Move the check for static EXPORT_SYMBOL back to modpost, which makes the build faster - Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm - Warn missing MODULE_DESCRIPTION when building modules with W=1 - Make 'make clean' robust against too long argument error - Exclude more objects from GCOV to fix CFI failures with GCOV - Allow 'make modules_install' to install modules.builtin and modules.builtin.modinfo even when CONFIG_MODULES is disabled - Include modules.builtin and modules.builtin.modinfo in the linux-image Debian package even when CONFIG_MODULES is disabled - Revive "Entering directory" logging for the latest Make version * tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits) modpost: define more R_ARM_* for old distributions kbuild: revive "Entering directory" for Make >= 4.4.1 kbuild: set correct abs_srctree and abs_objtree for package builds scripts/mksysmap: Ignore prefixed KCFI symbols kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb kbuild: builddeb: always make modules_install, to install modules.builtin* modpost: continue even with unknown relocation type modpost: factor out Elf_Sym pointer 
calculation to section_rel() modpost: factor out inst location calculation to section_rel() kbuild: Disable GCOV for *.mod.o kbuild: Fix CFI failures with GCOV kbuild: make clean rule robust against too long argument error script: modpost: emit a warning when the description is missing kbuild: make modules_install copy modules.builtin(.modinfo) linux/export.h: rename 'sec' argument to 'license' modpost: show offset from symbol for section mismatch warnings modpost: merge two similar section mismatch warnings kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion modpost: use null string instead of NULL pointer for default namespace modpost: squash sym_update_namespace() into sym_add_exported() ...
2023-06-28 | Merge tag 'v6.5-rc1-modules-next' of ↵ | Linus Torvalds | 2 | -35/+0
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The changes queued up for modules are pretty tame, mostly code removal or moving of code. Only two minor functional changes are made, the only one which stands out is Sebastian Andrzej Siewior's simplification of module reference counting by removing preempt_disable() and that has been tested on linux-next for well over a month with no regressions.

  I'm now, I guess, also a kitchen sink for some kallsyms changes"

[ There was a mis-communication about the concurrent module load changes that I had expected to come through Luis despite me authoring the patch. So some of the module updates were left hanging in the email ether, and I just committed them separately.

  It's my bad - I should have made it more clear that I expected my own patches to come through the module tree too. Now they missed linux-next, but hopefully that won't cause any issues - Linus ]

* tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  kallsyms: make kallsyms_show_value() as generic function
  kallsyms: move kallsyms_show_value() out of kallsyms.c
  kallsyms: remove unsed API lookup_symbol_attrs
  kallsyms: remove unused arch_get_kallsym() helper
  module: Remove preempt_disable() from module reference counting.
2023-06-28 | modules: catch concurrent module loads, treat them as idempotent | Linus Torvalds | 1 | -2/+71
This is the new-and-improved attempt at avoiding huge memory load spikes when the user space boot sequence tries to load hundreds (or even thousands) of redundant duplicate modules in parallel. See commit 9828ed3f695a ("module: error out early on concurrent load of the same module file") for background and an earlier failed attempt that was reverted. That earlier attempt just said "concurrently loading the same module is silly, just open the module file exclusively and return -ETXTBSY if somebody else is already loading it". While it is true that concurrent module loads of the same module is silly, the reason that earlier attempt then failed was that the concurrently loaded module would often be a prerequisite for another module. Thus failing to load the prerequisite would then cause cascading failures of the other modules, rather than just short-circuiting that one unnecessary module load. At the same time, we still really don't want to load the contents of the same module file hundreds of times, only to then wait for an eventually successful load, and have everybody else return -EEXIST. As a result, this takes another approach, and treats concurrent module loads from the same file as "idempotent" in the inode. So if one module load is ongoing, we don't start a new one, but instead just wait for the first one to complete and return the same return value as it did. So unlike the first attempt, this does not return early: the intent is not to speed up the boot, but to avoid a thundering herd problem in allocating memory (both physical and virtual) for a module more than once. Also note that this does change behavior: it used to be that when you had concurrent loads, you'd have one "winner" that would return success, and everybody else would return -EEXIST. In contrast, this idempotent logic goes all Oprah on the problem, and says "You are a winner! And you are a winner! We are ALL winners". 
But since there's no possible actual real semantic difference between "you loaded the module" and "somebody else already loaded the module", this is more of a feel-good change than an actual honest-to-goodness semantic change. Of course, any true Johnny-come-latelies that don't get caught in the concurrency filter will still return -EEXIST. It's no different from not even getting a seat at an Oprah taping. That's life. See the long thread on the kernel mailing list about this all, which includes some numbers for memory use before and after the patch. Link: https://lore.kernel.org/lkml/20230524213620.3509138-1-mcgrof@kernel.org/ Reviewed-by: Johan Hovold <johan@kernel.org> Tested-by: Johan Hovold <johan@kernel.org> Tested-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Rudi Heitbaum <rudi@heitbaum..com> Tested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-28 | module: split up 'finit_module()' into init_module_from_file() helper | Linus Torvalds | 1 | -15/+27
This will simplify the next step, where we can then key off the inode to do one idempotent module load. Let's do the obvious re-organization in one step, and then the new code in another. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-22 | kbuild: generate KSYMTAB entries by modpost | Masahiro Yamada | 1 | -0/+12
Commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS") made modpost output CRCs in the same way whether the EXPORT_SYMBOL() is placed in *.c or *.S. For further cleanups, this commit applies a similar approach to the entire data structure of EXPORT_SYMBOL(). The EXPORT_SYMBOL() compilation is split into two stages. When a source file is compiled, EXPORT_SYMBOL() will be converted into a dummy symbol in the .export_symbol section. For example, EXPORT_SYMBOL(foo); EXPORT_SYMBOL_NS_GPL(bar, BAR_NAMESPACE); will be encoded into the following assembly code: .section ".export_symbol","a" __export_symbol_foo: .asciz "" /* license */ .asciz "" /* name space */ .balign 8 .quad foo /* symbol reference */ .previous .section ".export_symbol","a" __export_symbol_bar: .asciz "GPL" /* license */ .asciz "BAR_NAMESPACE" /* name space */ .balign 8 .quad bar /* symbol reference */ .previous They are mere markers to tell modpost the name, license, and namespace of the symbols. They will be dropped from the final vmlinux and modules because the *(.export_symbol) will go into /DISCARD/ in the linker script. Then, modpost extracts all the information about EXPORT_SYMBOL() from the .export_symbol section, and generates the final C code: KSYMTAB_FUNC(foo, "", ""); KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE"); KSYMTAB_FUNC() (or KSYMTAB_DATA() if it is data) is expanded to struct kernel_symbol that will be linked to the vmlinux or a module. With this change, EXPORT_SYMBOL() works in the same way for *.c and *.S files, providing the following benefits. [1] Deprecate EXPORT_DATA_SYMBOL() In the old days, EXPORT_SYMBOL() was only available in C files. To export a symbol in *.S, EXPORT_SYMBOL() was placed in a separate *.c file. arch/arm/kernel/armksyms.c is one example written in the classic manner. Commit 22823ab419d8 ("EXPORT_SYMBOL() for asm") removed this limitation. 
Since then, EXPORT_SYMBOL() can be placed close to the symbol definition in *.S files. It was a nice improvement. However, as that commit mentioned, you need to use EXPORT_DATA_SYMBOL() for data objects on some architectures. In the new approach, modpost checks symbol's type (STT_FUNC or not), and outputs KSYMTAB_FUNC() or KSYMTAB_DATA() accordingly. There are only two users of EXPORT_DATA_SYMBOL: EXPORT_DATA_SYMBOL_GPL(empty_zero_page) (arch/ia64/kernel/head.S) EXPORT_DATA_SYMBOL(ia64_ivt) (arch/ia64/kernel/ivt.S) They are transformed as follows and output into .vmlinux.export.c KSYMTAB_DATA(empty_zero_page, "_gpl", ""); KSYMTAB_DATA(ia64_ivt, "", ""); The other EXPORT_SYMBOL users in ia64 assembly are output as KSYMTAB_FUNC(). EXPORT_DATA_SYMBOL() is now deprecated. [2] merge <linux/export.h> and <asm-generic/export.h> There are two similar header implementations: include/linux/export.h for .c files include/asm-generic/export.h for .S files Ideally, the functionality should be consistent between them, but they tend to diverge. Commit 8651ec01daed ("module: add support for symbol namespaces.") did not support the namespace for *.S files. This commit shifts the essential implementation part to C, which supports EXPORT_SYMBOL_NS() for *.S files. <asm/export.h> and <asm-generic/export.h> will remain as a wrapper of <linux/export.h> for a while. They will be removed after #include <asm/export.h> directives are all replaced with #include <linux/export.h>. [3] Implement CONFIG_TRIM_UNUSED_KSYMS in one-pass algorithm (by a later commit) When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses the directory tree to determine which EXPORT_SYMBOL to trim. If an EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the second traverse, where some source files are recompiled with their EXPORT_SYMBOL() tuned into a no-op. We can do this better now; modpost can selectively emit KSYMTAB entries that are really used by modules. 
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
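The second stage described above (marker fields in, KSYMTAB line out) can be mimicked in a few lines. This is a sketch only: `demo_format_ksymtab` is a hypothetical function, and the mapping of the "GPL" license string to the "_gpl" suffix is inferred from the examples quoted in the commit message rather than taken from modpost's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: builds the KSYMTAB line shown in the commit
 * message from the marker fields. `license` is "" or "GPL"; `is_func`
 * stands for "the ELF symbol type is STT_FUNC".
 */
static int demo_format_ksymtab(char *out, size_t len, const char *name,
			       bool is_func, const char *license,
			       const char *ns)
{
	const char *suffix;

	assert(out && name && license && ns);
	suffix = strcmp(license, "GPL") == 0 ? "_gpl" : "";

	return snprintf(out, len, "KSYMTAB_%s(%s, \"%s\", \"%s\");",
			is_func ? "FUNC" : "DATA", name, suffix, ns);
}
```

Called with ("bar", true, "GPL", "BAR_NAMESPACE") it produces the `KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE");` line from the message, and with ("ia64_ivt", false, "", "") it produces the KSYMTAB_DATA form, which shows why a separate EXPORT_DATA_SYMBOL() macro is no longer needed once the symbol type is known.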
2023-06-01 | module/decompress: Fix error checking on zstd decompression | Lucas De Marchi | 1 | -1/+1
While implementing support for in-kernel decompression in kmod, finit_module() was returning a very suspicious value: finit_module(3, "", MODULE_INIT_COMPRESSED_FILE) = 18446744072717407296 It turns out the check for module_get_next_page() failing is wrong, and hence the decompression was not really taking place. Invert the condition to fix it. Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression") Cc: stable@kernel.org Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
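An inverted error check on a pointer-returning helper can be reproduced in user space. The sketch below imitates the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention with hypothetical `demo_*` helpers; the exact shape of the original check is an assumption, since the commit message only says the condition was inverted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* User-space imitations of ERR_PTR(), IS_ERR() and PTR_ERR(). */
static void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static bool demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Assumed buggy shape: bails out on a VALID page, returning its address. */
static long demo_step_buggy(void *page)
{
	if (!demo_is_err(page))
		return demo_ptr_err(page);
	return 0;
}

/* Fixed shape: bails out only when the page is an encoded error. */
static long demo_step_fixed(void *page)
{
	if (demo_is_err(page))
		return demo_ptr_err(page);
	return 0;
}
```

With the inverted test, a successful page lookup is reported as a failure whose "error code" is really a pointer value, which would be consistent with the huge number finit_module() returned in the report.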
2023-05-30 | module: fix module load for ia64 | Song Liu | 1 | -2/+2
Frank reported boot regression in ia64 as: ELILO v3.16 for EFI/IA-64 .. Uncompressing Linux... done Loading file AC100221.initrd.img...done [ 0.000000] Linux version 6.4.0-rc3 (root@x4270) (ia64-linux-gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP Thu May 25 15:52:20 CEST 2023 [ 0.000000] efi: EFI v1.1 by HP [ 0.000000] efi: SALsystab=0x3ee7a000 ACPI 2.0=0x3fe2a000 ESI=0x3ee7b000 SMBIOS=0x3ee7c000 HCDP=0x3fe28000 [ 0.000000] PCDP: v3 at 0x3fe28000 [ 0.000000] earlycon: uart8250 at MMIO 0x00000000f4050000 (options '9600n8') [ 0.000000] printk: bootconsole [uart8250] enabled [ 0.000000] ACPI: Early table checksum verification disabled [ 0.000000] ACPI: RSDP 0x000000003FE2A000 000028 (v02 HP ) [ 0.000000] ACPI: XSDT 0x000000003FE2A02C 0000CC (v01 HP rx2620 00000000 HP 00000000) [...] [ 3.793350] Run /init as init process Loading, please wait... Starting systemd-udevd version 252.6-1 [ 3.951100] ------------[ cut here ]------------ [ 3.951100] WARNING: CPU: 6 PID: 140 at kernel/module/main.c:1547 __layout_sections+0x370/0x3c0 [ 3.949512] Unable to handle kernel paging request at virtual address 1000000000000000 [ 3.951100] Modules linked in: [ 3.951100] CPU: 6 PID: 140 Comm: (udev-worker) Not tainted 6.4.0-rc3 #1 [ 3.956161] (udev-worker)[142]: Oops 11003706212352 [1] [ 3.951774] Hardware name: hp server rx2620 , BIOS 04.29 11/30/2007 [ 3.951774] [ 3.951774] Call Trace: [ 3.958339] Unable to handle kernel paging request at virtual address 1000000000000000 [ 3.956161] Modules linked in: [ 3.951774] [<a0000001000156d0>] show_stack.part.0+0x30/0x60 [ 3.951774] sp=e000000183a67b20 bsp=e000000183a61628 [ 3.956161] [ 3.956161] which bisect to module_memory change [1]. 
Debug showed that ia64 uses some special sections: __layout_sections: section .got (sh_flags 10000002) matched to MOD_INVALID __layout_sections: section .sdata (sh_flags 10000003) matched to MOD_INVALID __layout_sections: section .sbss (sh_flags 10000003) matched to MOD_INVALID All these sections are loaded to module core memory before [1]. Fix ia64 boot by loading these sections to MOD_DATA (core rw data). [1] commit ac3b43283923 ("module: replace module_layout with module_memory") Fixes: ac3b43283923 ("module: replace module_layout with module_memory") Reported-by: Frank Scheiner <frank.scheiner@web.de> Closes: https://lists.debian.org/debian-ia64/2023/05/msg00010.html Closes: https://marc.info/?l=linux-ia64&m=168509859125505 Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Song Liu <song@kernel.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-05-26 | kallsyms: remove unsed API lookup_symbol_attrs | Maninder Singh | 1 | -28/+0
with commit '7878c231dae0 ("slab: remove /proc/slab_allocators")' lookup_symbol_attrs usage is removed. Thus removing redundant API. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-05-23 | module: Remove preempt_disable() from module reference counting. | Sebastian Andrzej Siewior | 1 | -7/+0
The preempt_disable() section in module_put() was added in commit e1783a240f491 ("module: Use this_cpu_xx to dynamically allocate counters") while the per-CPU counters were switched to another API. The API requires that during the RMW operation the CPU remained the same.

This counting API was later replaced with atomic_t in commit 2f35c41f58a97 ("module: Replace module_ref with atomic_t refcnt")

Since this atomic_t replacement there is no need to keep preemption disabled while the reference counter is modified.

Remove preempt_disable() from module_put(), __module_get() and try_module_get().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
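Why an atomic counter needs no preemption guard can be seen in a user-space analogue built on C11 atomics. `struct demo_module`, `demo_try_get`, and `demo_put` are hypothetical names, and the "zero means going away" rule is a simplification of the kernel's reference counting rather than a copy of it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module's reference counter. */
struct demo_module {
	atomic_int refcnt;	/* 0 means the module is going away */
};

/* Take a reference unless the count has already dropped to zero. */
static bool demo_try_get(struct demo_module *mod)
{
	int old = atomic_load(&mod->refcnt);

	/* On CAS failure `old` is refreshed, so zero is re-checked each time. */
	while (old != 0) {
		if (atomic_compare_exchange_weak(&mod->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference and return the new count. */
static int demo_put(struct demo_module *mod)
{
	int prev = atomic_fetch_sub(&mod->refcnt, 1);

	assert(prev > 0);	/* dropping below zero would be a bug */
	return prev - 1;
}
```

Each read-modify-write here is a single atomic operation on shared memory, so it stays correct even if the thread migrates to another CPU in between calls; a per-CPU counter, by contrast, must be updated on the CPU that owns it.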
2023-05-22 | module: Fix use-after-free bug in read_file_mod_stats() | Harshit Mogalapalli | 1 | -1/+3
Smatch warns: kernel/module/stats.c:394 read_file_mod_stats() warn: passing freed memory 'buf' We are passing 'buf' to simple_read_from_buffer() after freeing it. Fix this by changing the order of 'simple_read_from_buffer' and 'kfree'. Fixes: df3e764d8e5c ("module: add debug stats to help identify memory pressure") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
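The fix is purely one of ordering: consume the buffer, then free it. A user-space sketch of the corrected order follows, where `demo_copy_out` is a hypothetical stand-in for the copy-to-caller step and the buffer contents are a placeholder.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for copying a buffer out to the caller. */
static size_t demo_copy_out(char *dst, size_t dst_len,
			    const char *src, size_t src_len)
{
	size_t n = src_len < dst_len ? src_len : dst_len;

	assert(dst != NULL && src != NULL);
	memcpy(dst, src, n);
	return n;
}

/* Corrected ordering: read from the buffer first, free it afterwards. */
static size_t demo_read_stats(char *dst, size_t dst_len)
{
	static const char text[] = "stats: ok\n";	/* placeholder content */
	char *buf = malloc(sizeof(text));
	size_t n;

	if (!buf)
		return 0;
	memcpy(buf, text, sizeof(text));

	n = demo_copy_out(dst, dst_len, buf, strlen(buf));	/* use ... */
	free(buf);						/* ... then free */
	return n;
}
```

Saving the copy's return value in a local before freeing is what lets the function still return the byte count after the buffer is gone.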
2023-05-02module: include internal.h in module/dups.cArnd Bergmann1-0/+2
Two newly introduced functions are declared in a header that is not included before the definition, causing a warning with sparse or 'make W=1':

  kernel/module/dups.c:118:6: error: no previous prototype for 'kmod_dup_request_exists_wait' [-Werror=missing-prototypes]
    118 | bool kmod_dup_request_exists_wait(char *module_name, bool wait, int *dup_ret)
        |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  kernel/module/dups.c:220:6: error: no previous prototype for 'kmod_dup_request_announce' [-Werror=missing-prototypes]
    220 | void kmod_dup_request_announce(char *module_name, int ret)
        |      ^~~~~~~~~~~~~~~~~~~~~~~~~

Add an explicit include to ensure the prototypes match.

Fixes: 8660484ed1cf ("module: add debugging auto-load duplicate module support")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304141440.DYO4NAzp-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
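A single-file sketch of the pattern behind this warning follows. The function names and the "internal.h" split are hypothetical and only mimic the situation described above: a non-static function should have a prototype in scope before its definition, so that the compiler can check the definition against what callers see.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* ---- would normally live in a shared header (hypothetical "internal.h") ---- */
bool dup_request_exists(const char *module_name);
void dup_request_announce(const char *module_name, int ret);
int dup_request_last_ret(void);

/* ---- the .c file: the prototypes above are in scope for the definitions ---- */
static char last_name[64];
static int last_ret;

/* True if 'module_name' matches the most recently announced request. */
bool dup_request_exists(const char *module_name)
{
	return last_name[0] != '\0' && strcmp(last_name, module_name) == 0;
}

/* Record the name and result of the most recent request. */
void dup_request_announce(const char *module_name, int ret)
{
	assert(strlen(module_name) < sizeof(last_name));
	strcpy(last_name, module_name);
	last_ret = ret;
}

int dup_request_last_ret(void)
{
	return last_ret;
}
```

If the three declarations are deleted, compiling with `-Wmissing-prototypes` reports each definition. With them in place, a definition whose signature drifts from the declaration becomes a compile error instead of a silent mismatch.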
2023-04-27Merge tag 'modules-6.4-rc1' of ↵Linus Torvalds14-669/+1786
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:

"The summary of the changes for this pull request is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc module memory for duplicates, and the respective debug code which proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except the minor fixes I made to check if a module was already loaded prior to allocating the final module memory with vmalloc and the respective debug code it introduces to help clarify the issue. Although the functional change is small it is rather safe as it can only *help* reduce vmalloc space for duplicates and is confirmed to fix a bootup issue with over 400 CPUs with KASAN enabled. I don't expect stable kernels to pick up that fix as the cleanups would have also had to have been picked up. Folks on larger CPU systems with modules will want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change in this pull request is the very first patch from Song Liu which replaces the 'struct module_layout' with a new 'struct module_memory'. The old data structure tried to put together all types of supported module memory types in one data structure, the new one abstracts the differences in memory types in a module to allow each one to provide their own set of details. This paves the way in the future so we can deal with them in a cleaner way. If you look at changes they also provide a nice cleanup of how we handle these different memory areas in a module. This change has been in linux-next since before the merge window opened for v6.3 so to provide more than a full kernel cycle of testing. It's a good thing as quite a bit of fixes have been found for it.
  Jason Baron then made dynamic debug a first class citizen module user by using module notifier callbacks to allocate / remove module specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module license tags from things which cannot possibly be module at my request so to:

   a) help him with his longer term tooling goals which require a deterministic evaluation if a piece a symbol code could ever be part of a module or not. But quite recently it has been made clear that tooling is not the only one that would benefit. Disambiguating symbols also helps efforts such as live patching, kprobes and BPF, but for other reasons and R&D on this area is active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op for non-modules, tools which would want to create a mapping of possible modules can only rely on the module license tag after the commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only one to suggest two alternatives to this approach for tooling. The complexity in one of my suggested approaches lies in that we'd need a possible-obj-m and a could-be-module which would check if the object being built is part of any kconfig build which could ever lead to it being part of a module, and if so define a new define -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names mapping to modules always, and I don't think that's the case today. I am not aware of Nick or anyone exploring either of these options.
  Quite recently Josh Poimboeuf has pointed out that live patching, kprobes and BPF would benefit from resolving some part of the disambiguation as well but for other reasons. The function granularity KASLR (fgkaslr) patches were mentioned but Joe Lawrence has clarified this effort has been dropped with no clear solution in sight [1].

  In the meantime removing module license tags from code which could never be modules is welcomed for both objectives mentioned above. Some developers have also welcomed these changes as it has helped clarify when a module was never possible and they forgot to clean this up, and so you'll see quite a bit of Nick's patches in other pull requests for this merge window. I just picked up the stragglers after rc3. LWN has good coverage on the motivation behind this work [2] and the typical cross-tree issues he ran into along the way. The only concrete blocker issue he ran into was that we should not remove the MODULE_LICENSE() tags from files which have no SPDX tags yet, even if they can never be modules. Nick ended up giving up on his efforts due to having to do this vetting and backlash he ran into from folks who really did *not understand* the core of the issue nor were providing any alternative / guidance. I've gone through his changes and dropped the patches which dropped the module license tags where an SPDX license tag was missing, it only consisted of 11 drivers. To see if a pull request deals with a file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
      $(git diff --name-only commit-id | xargs echo)

  You'll see a core module file in this pull request for the above, but that's not related to his changes. We just need to add the SPDX license tag for the kernel/module/kmod.c file in the future but it demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I just picked up the slack after rc3 for the last kernel was out.
  Those changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules were motivated by David Hildenbrand's report of boot failing on systems with over 400 CPUs when KASAN was enabled due to running out of virtual memory space. Although the functional change only consists of 3 lines in the patch "module: avoid allocation if module is already present and ready", proving that this was the best we can do on the modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things have been in linux-next since around rc3 of the last kernel, the actual final fix and debug code however have only been in linux-next for about a week or so but I think it is worth getting that code in for this merge window as it does help fix / prove / evaluate the issues reported with larger number of CPUs. Userspace is not yet fixed as it is taking a bit of time for folks to understand the crux of the issue and find a proper resolution.
  Worst come to worst, I have a kludge-of-concept [3] of how to make kernel_read*() calls for modules unique / converge them, but I'm currently inclined to just see if userspace can fix this instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-26Merge tag 'net-next-6.4' of ↵Linus Torvalds2-4/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:

"Core:

   - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the default value allows for better BIG TCP performances

   - Reduce compound page head access for zero-copy data transfers

   - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible

   - Threaded NAPI improvements, adding defer skb free support and unneeded softirq avoidance

   - Address dst_entry reference count scalability issues, via false sharing avoidance and optimize refcount tracking

   - Add lockless accesses annotation to sk_err[_soft]

   - Optimize again the skb struct layout

   - Extends the skb drop reasons to make it usable by multiple subsystems

   - Better const qualifier awareness for socket casts

  BPF:

   - Add skb and XDP typed dynptrs which allow BPF programs for more ergonomic and less brittle iteration through data and variable-sized accesses

   - Add a new BPF netfilter program type and minimal support to hook BPF programs to netfilter hooks such as prerouting or forward

   - Add more precise memory usage reporting for all BPF map types

   - Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params

   - Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton

   - Bigger batch of BPF verifier improvements to prepare for upcoming BPF open-coded iterators allowing for less restrictive looping capabilities

   - Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF programs to NULL-check before passing such pointers into kfunc

   - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in local storage maps

   - Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps

   - Add support for refcounted local kptrs to the verifier for allowing shared ownership, useful for adding a node to both
     the BPF list and rbtree

   - Add BPF verifier support for ST instructions in convert_ctx_access() which will help new -mcpu=v4 clang flag to start emitting them

   - Add ARM32 USDT support to libbpf

   - Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations

  Protocols:

   - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value indicates the provenance of the IP address

   - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition

   - Add the handshake upcall mechanism, allowing the user-space to implement generic TLS handshake on kernel's behalf

   - Bridge: support per-{Port, VLAN} neighbor suppression, increasing resilience to node failures

   - SCTP: add support for Fair Capacity and Weighted Fair Queueing schedulers

   - MPTCP: delay first subflow allocation up to its first usage. This will allow for later better LSM interaction

   - xfrm: Remove inner/outer modes from input/output path. These are not needed anymore

   - WiFi:
      - reduced neighbor report (RNR) handling for AP mode
      - HW timestamping support
      - support for randomized auth/deauth TA for PASN privacy
      - per-link debugfs for multi-link
      - TC offload support for mac80211 drivers
      - mac80211 mesh fast-xmit and fast-rx support
      - enable Wi-Fi 7 (EHT) mesh support

  Netfilter:

   - Add nf_tables 'brouting' support, to force a packet to be routed instead of being bridged

   - Update bridge netfilter and ovs conntrack helpers to handle IPv6 Jumbo packets properly, i.e. fetch the packet length from hop-by-hop extension header. This is needed for BIG TCP support

   - The iptables 32bit compat interface isn't compiled in by default anymore

   - Move ip(6)tables builtin icmp matches to the udptcp one. This has the advantage that icmp/icmpv6 match doesn't load the iptables/ip6tables modules anymore when iptables-nft is used

   - Extended netlink error report for netdevice in flowtables and netdev/chains. Allow for incrementally add/delete devices to netdev basechain.
     Allow to create netdev chain without device

  Driver API:

   - Remove redundant Device Control Error Reporting Enable, as PCI core has already error reporting enabled at enumeration time

   - Move Multicast DB netlink handlers to core, allowing devices other than bridge to use them

   - Allow the page_pool to directly recycle the pages from safely localized NAPI

   - Implement lockless TX queue stop/wake combo macros, allowing for further code de-duplication and sanitization

   - Add YNL support for user headers and struct attrs

   - Add partial YNL specification for devlink

   - Add partial YNL specification for ethtool

   - Add tc-mqprio and tc-taprio support for preemptible traffic classes

   - Add tx push buf len param to ethtool, specifies the maximum number of bytes of a transmitted packet a driver can push directly to the underlying device

   - Add basic LED support for switch/phy

   - Add NAPI documentation, stop relying on external links

   - Convert dsa_master_ioctl() to netdev notifier. This is a preparatory work to make the hardware timestamping layer selectable by user space

   - Add transceiver support and improve the error messages for CAN-FD controllers

  New hardware / drivers:

   - Ethernet:
      - AMD/Pensando core device support
      - MediaTek MT7981 SoC
      - MediaTek MT7988 SoC
      - Broadcom BCM53134 embedded switch
      - Texas Instruments CPSW9G ethernet switch
      - Qualcomm EMAC3 DWMAC ethernet
      - StarFive JH7110 SoC
      - NXP CBTX ethernet PHY

   - WiFi:
      - Apple M1 Pro/Max devices
      - RealTek rtl8710bu/rtl8188gu
      - RealTe