| Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"Just one cleanup and one documentation improvement change. No
functional changes"
* tag 'modules-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
kernel/module: improve documentation for try_module_get()
module: Remove redundant TASK_UNINTERRUPTIBLE
|
|
TASK_KILLABLE already includes TASK_UNINTERRUPTIBLE, so there is no
need to add a separate TASK_UNINTERRUPTIBLE.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Add KUNIT_INIT_TABLE to the INIT_DATA linker section.
Alter the KUnit macros to create init tests:
kunit_test_init_section_suites
Update lib/kunit/executor.c to run both the suites in KUNIT_TABLE and
KUNIT_INIT_TABLE.
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add virtual-address based lskcipher interface
- Optimise ahash/shash performance in light of costly indirect calls
- Remove ahash alignmask attribute
Algorithms:
- Improve AES/XTS performance of 6-way unrolling for ppc
- Remove some uses of obsolete algorithms (md4, md5, sha1)
- Add FIPS 202 SHA-3 support in pkcs1pad
- Add fast path for single-page messages in adiantum
- Remove zlib-deflate
Drivers:
- Add support for S4 in meson RNG driver
- Add STM32MP13x support in stm32
- Add hwrng interface support in qcom-rng
- Add support for deflate algorithm in hisilicon/zip"
* tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits)
crypto: adiantum - flush destination page before unmapping
crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place
Documentation/module-signing.txt: bring up to date
module: enable automatic module signing with FIPS 202 SHA-3
crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures
crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support
crypto: FIPS 202 SHA-3 register in hash info for IMA
x509: Add OIDs for FIPS 202 SHA-3 hash and signatures
crypto: ahash - optimize performance when wrapping shash
crypto: ahash - check for shash type instead of not ahash type
crypto: hash - move "ahash wrapping shash" functions to ahash.c
crypto: talitos - stop using crypto_ahash::init
crypto: chelsio - stop using crypto_ahash::init
crypto: ahash - improve file comment
crypto: ahash - remove struct ahash_request_priv
crypto: ahash - remove crypto_ahash_alignmask
crypto: gcm - stop using alignmask of ahash
crypto: chacha20poly1305 - stop using alignmask of ahash
crypto: ccm - stop using alignmask of ahash
net: ipv6: stop checking crypto_ahash_alignmask
...
|
|
We consistently switched from kmalloc() to vmalloc() in module
decompression to prevent potential memory allocation failures with large
modules, however vmalloc() is not as memory-efficient and fast as
kmalloc().
Since we don't know in general the size of the workspace required by the
decompression algorithm, it is more reasonable to use kvmalloc()
consistently, also considering that we don't have special memory
requirements here.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct module_notes_attrs.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Delete duplicated word in comment.
Signed-off-by: Zhu Mao <zhumao001@208suo.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Use a similar approach as commit a419beac4a07 ("module/decompress: use
vmalloc() for zstd decompression workspace") and replace kmalloc() with
vmalloc() also for the gzip module decompression workspace.
In this case the workspace is represented by struct inflate_workspace
that can be fairly large for kmalloc() and it can potentially lead to
allocation errors on certain systems:
$ pahole inflate_workspace
struct inflate_workspace {
struct inflate_state inflate_state; /* 0 9544 */
/* --- cacheline 149 boundary (9536 bytes) was 8 bytes ago --- */
unsigned char working_window[32768]; /* 9544 32768 */
/* size: 42312, cachelines: 662, members: 2 */
/* last cacheline: 8 bytes */
};
Considering that there is no need to use continuous physical memory,
simply switch to vmalloc() to provide a more reliable in-kernel module
decompression.
Fixes: b1ae6dc41eaa ("module: add in-kernel support for decompressing")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Add Kconfig options to use SHA-3 for kernel module signing. 256 size
for RSA only, and higher sizes for RSA and NIST P-384.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
sha224 does not provide enough security against collision attacks
relative to the default keys used for signing (RSA 4k & P-384). Also
sha224 never became popular, as sha256 got widely adopter ahead of
sha224 being introduced.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Removes support for sha1 signed kernel modules, importing sha1 signed
x.509 certificates.
rsa-pkcs1pad keeps sha1 padding support, which seems to be used by
virtio driver.
sha1 remains available as there are many drivers and subsystems using
it. Note only hmac(sha1) with secret keys remains cryptographically
secure.
In the kernel there are filesystems, IMA, tpm/pcr that appear to be
using sha1. Maybe they can all start to be slowly upgraded to
something else i.e. blake3, ParallelHash, SHAKE256 as needed.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Using kmalloc() to allocate the decompression workspace for zstd may
trigger the following warning when large modules are loaded (i.e., xfs):
[ 2.961884] WARNING: CPU: 1 PID: 254 at mm/page_alloc.c:4453 __alloc_pages+0x2c3/0x350
...
[ 2.989033] Call Trace:
[ 2.989841] <TASK>
[ 2.990614] ? show_regs+0x6d/0x80
[ 2.991573] ? __warn+0x89/0x160
[ 2.992485] ? __alloc_pages+0x2c3/0x350
[ 2.993520] ? report_bug+0x17e/0x1b0
[ 2.994506] ? handle_bug+0x51/0xa0
[ 2.995474] ? exc_invalid_op+0x18/0x80
[ 2.996469] ? asm_exc_invalid_op+0x1b/0x20
[ 2.997530] ? module_zstd_decompress+0xdc/0x2a0
[ 2.998665] ? __alloc_pages+0x2c3/0x350
[ 2.999695] ? module_zstd_decompress+0xdc/0x2a0
[ 3.000821] __kmalloc_large_node+0x7a/0x150
[ 3.001920] __kmalloc+0xdb/0x170
[ 3.002824] module_zstd_decompress+0xdc/0x2a0
[ 3.003857] module_decompress+0x37/0xc0
[ 3.004688] init_module_from_file+0xd0/0x100
[ 3.005668] idempotent_init_module+0x11c/0x2b0
[ 3.006632] __x64_sys_finit_module+0x64/0xd0
[ 3.007568] do_syscall_64+0x59/0x90
[ 3.008373] ? ksys_read+0x73/0x100
[ 3.009395] ? exit_to_user_mode_prepare+0x30/0xb0
[ 3.010531] ? syscall_exit_to_user_mode+0x37/0x60
[ 3.011662] ? do_syscall_64+0x68/0x90
[ 3.012511] ? do_syscall_64+0x68/0x90
[ 3.013364] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
However, continuous physical memory does not seem to be required in
module_zstd_decompress(), so use vmalloc() instead, to prevent the
warning and avoid potential failures at loading compressed modules.
Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
module_init_layout_section() choses whether the core module loader
considers a section as init or not. This affects the placement of the
exit section when module unloading is disabled. This code will never run,
so it can be free()d once the module has been initialised.
arm and arm64 need to count the number of PLTs they need before applying
relocations based on the section name. The init PLTs are stored separately
so they can be free()d. arm and arm64 both use within_module_init() to
decide which list of PLTs to use when applying the relocation.
Because within_module_init()'s behaviour changes when module unloading
is disabled, both architecture would need to take this into account when
counting the PLTs.
Today neither architecture does this, meaning when module unloading is
disabled there are insufficient PLTs in the init section to load some
modules, resulting in warnings:
| WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc
| Modules linked in: crct10dif_common
| CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : module_emit_plt_entry+0x184/0x1cc
| lr : module_emit_plt_entry+0x94/0x1cc
| sp : ffffffc0803bba60
[...]
| Call trace:
| module_emit_plt_entry+0x184/0x1cc
| apply_relocate_add+0x2bc/0x8e4
| load_module+0xe34/0x1bd4
| init_module_from_file+0x84/0xc0
| __arm64_sys_finit_module+0x1b8/0x27c
| invoke_syscall.constprop.0+0x5c/0x104
| do_el0_svc+0x58/0x160
| el0_svc+0x38/0x110
| el0t_64_sync_handler+0xc0/0xc4
| el0t_64_sync+0x190/0x194
Instead of duplicating module_init_layout_section()s logic, expose it.
Reported-by: Adam Johnston <adam.johnston@arm.com>
Fixes: 055f23b74b20 ("module: check for exit sections in layout_sections() instead of module_init_section()")
Cc: stable@vger.kernel.org
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
It has recently come to my attention that nvidia is circumventing the
protection added in 262e6ae7081d ("modules: inherit
TAINT_PROPRIETARY_MODULE") by importing exports from their proprietary
modules into an allegedly GPL licensed module and then rexporting them.
Given that symbol_get was only ever intended for tightly cooperating
modules using very internal symbols it is logical to restrict it to
being used on EXPORT_SYMBOL_GPL and prevent nvidia from costly DMCA
Circumvention of Access Controls law suites.
All symbols except for four used through symbol_get were already exported
as EXPORT_SYMBOL_GPL, and the remaining four ones were switched over in
the preparation patches.
Fixes: 262e6ae7081d ("modules: inherit TAINT_PROPRIETARY_MODULE")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Trying to restrict the '$'-prefix change to RISC-V caused some fallout,
so let's just treat all those symbols as special.
Fixes: c05780ef3c190 ("module: Ignore RISC-V mapping symbols too")
Link: https://lore.kernel.org/all/20230712015747.77263-1-wangkefeng.wang@huawei.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
RISC-V has an extended form of mapping symbols that we use to encode
the ISA when it changes in the middle of an ELF. This trips up modpost
as a build failure, I haven't yet verified it yet but I believe the
kallsyms difference should result in stacks looking sane again.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/all/9d9e2902-5489-4bf0-d9cb-556c8e5d71c2@infradead.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Vegard Nossum pointed out two different problems with the error handling
in init_module_from_file():
(a) the idempotent loading code didn't clean up properly in some error
cases, leaving the on-stack 'struct idempotent' element still in
the hash table
(b) failure to read the module file would nonsensically update the
'invalid_kread_bytes' stat counter with the error value
The first error is quite nasty, in that it can then cause subsequent
idempotent loads of that same file to access stale stack contents of the
previous failure. The case may not happen in any normal situation
(explaining all the "Tested-by's on the original change), and requires
admin privileges, but syzkaller triggers random bad behavior as a
result:
BUG: soft lockup in sys_finit_module
BUG: unable to handle kernel paging request in init_module_from_file
general protection fault in init_module_from_file
INFO: task hung in init_module_from_file
KASAN: out-of-bounds Read in init_module_from_file
KASAN: slab-out-of-bounds Read in init_module_from_file
...
The second error is fairly benign and just leads to nonsensical stats
(and has been around since the debug stats were added).
Vegard also provided a patch for the idempotent loading issue, but I'd
rather re-organize the code and make it more legible using another level
of helper functions than add the usual "goto out" error handling.
Link: https://lore.kernel.org/lkml/20230704100852.23452-1-vegard.nossum@oracle.com/
Fixes: 9b9879fc0327 ("modules: catch concurrent module loads, treat them as idempotent")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reported-by: syzbot+9c2bdc9d24e4a7abe741@syzkaller.appspotmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove the deprecated rule to build *.dtbo from *.dts
- Refactor section mismatch detection in modpost
- Fix bogus ARM section mismatch detections
- Fix error of 'make gtags' with O= option
- Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error
with the latest LLVM version
- Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed
- Ignore more compiler-generated symbols for kallsyms
- Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles
- Enable more kernel-doc warnings with W=2
- Refactor <linux/export.h> by generating KSYMTAB data by modpost
- Deprecate <asm/export.h> and <asm-generic/export.h>
- Remove the EXPORT_DATA_SYMBOL macro
- Move the check for static EXPORT_SYMBOL back to modpost, which makes
the build faster
- Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm
- Warn missing MODULE_DESCRIPTION when building modules with W=1
- Make 'make clean' robust against too long argument error
- Exclude more objects from GCOV to fix CFI failures with GCOV
- Allow 'make modules_install' to install modules.builtin and
modules.builtin.modinfo even when CONFIG_MODULES is disabled
- Include modules.builtin and modules.builtin.modinfo in the
linux-image Debian package even when CONFIG_MODULES is disabled
- Revive "Entering directory" logging for the latest Make version
* tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits)
modpost: define more R_ARM_* for old distributions
kbuild: revive "Entering directory" for Make >= 4.4.1
kbuild: set correct abs_srctree and abs_objtree for package builds
scripts/mksysmap: Ignore prefixed KCFI symbols
kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb
kbuild: builddeb: always make modules_install, to install modules.builtin*
modpost: continue even with unknown relocation type
modpost: factor out Elf_Sym pointer calculation to section_rel()
modpost: factor out inst location calculation to section_rel()
kbuild: Disable GCOV for *.mod.o
kbuild: Fix CFI failures with GCOV
kbuild: make clean rule robust against too long argument error
script: modpost: emit a warning when the description is missing
kbuild: make modules_install copy modules.builtin(.modinfo)
linux/export.h: rename 'sec' argument to 'license'
modpost: show offset from symbol for section mismatch warnings
modpost: merge two similar section mismatch warnings
kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion
modpost: use null string instead of NULL pointer for default namespace
modpost: squash sym_update_namespace() into sym_add_exported()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"The changes queued up for modules are pretty tame, mostly code removal
of moving of code.
Only two minor functional changes are made, the only one which stands
out is Sebastian Andrzej Siewior's simplification of module reference
counting by removing preempt_disable() and that has been tested on
linux-next for well over a month without no regressions.
I'm now, I guess, also a kitchen sink for some kallsyms changes"
[ There was a mis-communication about the concurrent module load changes
that I had expected to come through Luis despite me authoring the
patch. So some of the module updates were left hanging in the email
ether, and I just committed them separately.
It's my bad - I should have made it more clear that I expected my
own patches to come through the module tree too. Now they missed
linux-next, but hopefully that won't cause any issues - Linus ]
* tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
kallsyms: make kallsyms_show_value() as generic function
kallsyms: move kallsyms_show_value() out of kallsyms.c
kallsyms: remove unsed API lookup_symbol_attrs
kallsyms: remove unused arch_get_kallsym() helper
module: Remove preempt_disable() from module reference counting.
|
|
This is the new-and-improved attempt at avoiding huge memory load spikes
when the user space boot sequence tries to load hundreds (or even
thousands) of redundant duplicate modules in parallel.
See commit 9828ed3f695a ("module: error out early on concurrent load of
the same module file") for background and an earlier failed attempt that
was reverted.
That earlier attempt just said "concurrently loading the same module is
silly, just open the module file exclusively and return -ETXTBSY if
somebody else is already loading it".
While it is true that concurrent module loads of the same module is
silly, the reason that earlier attempt then failed was that the
concurrently loaded module would often be a prerequisite for another
module.
Thus failing to load the prerequisite would then cause cascading
failures of the other modules, rather than just short-circuiting that
one unnecessary module load.
At the same time, we still really don't want to load the contents of the
same module file hundreds of times, only to then wait for an eventually
successful load, and have everybody else return -EEXIST.
As a result, this takes another approach, and treats concurrent module
loads from the same file as "idempotent" in the inode. So if one module
load is ongoing, we don't start a new one, but instead just wait for the
first one to complete and return the same return value as it did.
So unlike the first attempt, this does not return early: the intent is
not to speed up the boot, but to avoid a thundering herd problem in
allocating memory (both physical and virtual) for a module more than
once.
Also note that this does change behavior: it used to be that when you
had concurrent loads, you'd have one "winner" that would return success,
and everybody else would return -EEXIST.
In contrast, this idempotent logic goes all Oprah on the problem, and
says "You are a winner! And you are a winner! We are ALL winners". But
since there's no possible actual real semantic difference between "you
loaded the module" and "somebody else already loaded the module", this
is more of a feel-good change than an actual honest-to-goodness semantic
change.
Of course, any true Johnny-come-latelies that don't get caught in the
concurrency filter will still return -EEXIST. It's no different from
not even getting a seat at an Oprah taping. That's life.
See the long thread on the kernel mailing list about this all, which
includes some numbers for memory use before and after the patch.
Link: https://lore.kernel.org/lkml/20230524213620.3509138-1-mcgrof@kernel.org/
Reviewed-by: Johan Hovold <johan@kernel.org>
Tested-by: Johan Hovold <johan@kernel.org>
Tested-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Rudi Heitbaum <rudi@heitbaum..com>
Tested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This will simplify the next step, where we can then key off the inode to
do one idempotent module load.
Let's do the obvious re-organization in one step, and then the new code
in another.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing
CONFIG_MODULE_REL_CRCS") made modpost output CRCs in the same way
whether the EXPORT_SYMBOL() is placed in *.c or *.S.
For further cleanups, this commit applies a similar approach to the
entire data structure of EXPORT_SYMBOL().
The EXPORT_SYMBOL() compilation is split into two stages.
When a source file is compiled, EXPORT_SYMBOL() will be converted into
a dummy symbol in the .export_symbol section.
For example,
EXPORT_SYMBOL(foo);
EXPORT_SYMBOL_NS_GPL(bar, BAR_NAMESPACE);
will be encoded into the following assembly code:
.section ".export_symbol","a"
__export_symbol_foo:
.asciz "" /* license */
.asciz "" /* name space */
.balign 8
.quad foo /* symbol reference */
.previous
.section ".export_symbol","a"
__export_symbol_bar:
.asciz "GPL" /* license */
.asciz "BAR_NAMESPACE" /* name space */
.balign 8
.quad bar /* symbol reference */
.previous
They are mere markers to tell modpost the name, license, and namespace
of the symbols. They will be dropped from the final vmlinux and modules
because the *(.export_symbol) will go into /DISCARD/ in the linker script.
Then, modpost extracts all the information about EXPORT_SYMBOL() from the
.export_symbol section, and generates the final C code:
KSYMTAB_FUNC(foo, "", "");
KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE");
KSYMTAB_FUNC() (or KSYMTAB_DATA() if it is data) is expanded to struct
kernel_symbol that will be linked to the vmlinux or a module.
With this change, EXPORT_SYMBOL() works in the same way for *.c and *.S
files, providing the following benefits.
[1] Deprecate EXPORT_DATA_SYMBOL()
In the old days, EXPORT_SYMBOL() was only available in C files. To export
a symbol in *.S, EXPORT_SYMBOL() was placed in a separate *.c file.
arch/arm/kernel/armksyms.c is one example written in the classic manner.
Commit 22823ab419d8 ("EXPORT_SYMBOL() for asm") removed this limitation.
Since then, EXPORT_SYMBOL() can be placed close to the symbol definition
in *.S files. It was a nice improvement.
However, as that commit mentioned, you need to use EXPORT_DATA_SYMBOL()
for data objects on some architectures.
In the new approach, modpost checks symbol's type (STT_FUNC or not),
and outputs KSYMTAB_FUNC() or KSYMTAB_DATA() accordingly.
There are only two users of EXPORT_DATA_SYMBOL:
EXPORT_DATA_SYMBOL_GPL(empty_zero_page) (arch/ia64/kernel/head.S)
EXPORT_DATA_SYMBOL(ia64_ivt) (arch/ia64/kernel/ivt.S)
They are transformed as follows and output into .vmlinux.export.c
KSYMTAB_DATA(empty_zero_page, "_gpl", "");
KSYMTAB_DATA(ia64_ivt, "", "");
The other EXPORT_SYMBOL users in ia64 assembly are output as
KSYMTAB_FUNC().
EXPORT_DATA_SYMBOL() is now deprecated.
[2] merge <linux/export.h> and <asm-generic/export.h>
There are two similar header implementations:
include/linux/export.h for .c files
include/asm-generic/export.h for .S files
Ideally, the functionality should be consistent between them, but they
tend to diverge.
Commit 8651ec01daed ("module: add support for symbol namespaces.") did
not support the namespace for *.S files.
This commit shifts the essential implementation part to C, which supports
EXPORT_SYMBOL_NS() for *.S files.
<asm/export.h> and <asm-generic/export.h> will remain as a wrapper of
<linux/export.h> for a while.
They will be removed after #include <asm/export.h> directives are all
replaced with #include <linux/export.h>.
[3] Implement CONFIG_TRIM_UNUSED_KSYMS in one-pass algorithm (by a later commit)
When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses
the directory tree to determine which EXPORT_SYMBOL to trim. If an
EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the
second traverse, where some source files are recompiled with their
EXPORT_SYMBOL() tuned into a no-op.
We can do this better now; modpost can selectively emit KSYMTAB entries
that are really used by modules.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
While implementing support for in-kernel decompression in kmod,
finit_module() was returning a very suspicious value:
finit_module(3, "", MODULE_INIT_COMPRESSED_FILE) = 18446744072717407296
It turns out the check for module_get_next_page() failing is wrong,
and hence the decompression was not really taking place. Invert
the condition to fix it.
Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression")
Cc: stable@kernel.org
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Frank reported boot regression in ia64 as:
ELILO v3.16 for EFI/IA-64
..
Uncompressing Linux... done
Loading file AC100221.initrd.img...done
[ 0.000000] Linux version 6.4.0-rc3 (root@x4270) (ia64-linux-gcc
(GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP Thu May 25 15:52:20
CEST 2023
[ 0.000000] efi: EFI v1.1 by HP
[ 0.000000] efi: SALsystab=0x3ee7a000 ACPI 2.0=0x3fe2a000
ESI=0x3ee7b000 SMBIOS=0x3ee7c000 HCDP=0x3fe28000
[ 0.000000] PCDP: v3 at 0x3fe28000
[ 0.000000] earlycon: uart8250 at MMIO 0x00000000f4050000 (options
'9600n8')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x000000003FE2A000 000028 (v02 HP )
[ 0.000000] ACPI: XSDT 0x000000003FE2A02C 0000CC (v01 HP rx2620
00000000 HP 00000000)
[...]
[ 3.793350] Run /init as init process
Loading, please wait...
Starting systemd-udevd version 252.6-1
[ 3.951100] ------------[ cut here ]------------
[ 3.951100] WARNING: CPU: 6 PID: 140 at kernel/module/main.c:1547
__layout_sections+0x370/0x3c0
[ 3.949512] Unable to handle kernel paging request at virtual address
1000000000000000
[ 3.951100] Modules linked in:
[ 3.951100] CPU: 6 PID: 140 Comm: (udev-worker) Not tainted 6.4.0-rc3 #1
[ 3.956161] (udev-worker)[142]: Oops 11003706212352 [1]
[ 3.951774] Hardware name: hp server rx2620 , BIOS
04.29
11/30/2007
[ 3.951774]
[ 3.951774] Call Trace:
[ 3.958339] Unable to handle kernel paging request at virtual address
1000000000000000
[ 3.956161] Modules linked in:
[ 3.951774] [<a0000001000156d0>] show_stack.part.0+0x30/0x60
[ 3.951774] sp=e000000183a67b20
bsp=e000000183a61628
[ 3.956161]
[ 3.956161]
which bisect to module_memory change [1].
Debug showed that ia64 uses some special sections:
__layout_sections: section .got (sh_flags 10000002) matched to MOD_INVALID
__layout_sections: section .sdata (sh_flags 10000003) matched to MOD_INVALID
__layout_sections: section .sbss (sh_flags 10000003) matched to MOD_INVALID
All these sections are loaded to module core memory before [1].
Fix ia64 boot by loading these sections to MOD_DATA (core rw data).
[1] commit ac3b43283923 ("module: replace module_layout with module_memory")
Fixes: ac3b43283923 ("module: replace module_layout with module_memory")
Reported-by: Frank Scheiner <frank.scheiner@web.de>
Closes: https://lists.debian.org/debian-ia64/2023/05/msg00010.html
Closes: https://marc.info/?l=linux-ia64&m=168509859125505
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
with commit '7878c231dae0 ("slab: remove /proc/slab_allocators")'
lookup_symbol_attrs usage is removed.
Thus removing redundant API.
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
The preempt_disable() section in module_put() was added in commit
e1783a240f491 ("module: Use this_cpu_xx to dynamically allocate counters")
while the per-CPU counter were switched to another API. The API requires
that during the RMW operation the CPU remained the same.
This counting API was later replaced with atomic_t in commit
2f35c41f58a97 ("module: Replace module_ref with atomic_t refcnt")
Since this atomic_t replacement there is no need to keep preemption
disabled while the reference counter is modified.
Remove preempt_disable() from module_put(), __module_get() and
try_module_get().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Smatch warns:
kernel/module/stats.c:394 read_file_mod_stats()
warn: passing freed memory 'buf'
We are passing 'buf' to simple_read_from_buffer() after freeing it.
Fix this by changing the order of 'simple_read_from_buffer' and 'kfree'.
Fixes: df3e764d8e5c ("module: add debug stats to help identify memory pressure")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Two newly introduced functions are declared in a header that is not
included before the definition, causing a warning with sparse or
'make W=1':
kernel/module/dups.c:118:6: error: no previous prototype for 'kmod_dup_request_exists_wait' [-Werror=missing-prototypes]
118 | bool kmod_dup_request_exists_wait(char *module_name, bool wait, int *dup_ret)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/module/dups.c:220:6: error: no previous prototype for 'kmod_dup_request_announce' [-Werror=missing-prototypes]
220 | void kmod_dup_request_announce(char *module_name, int ret)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Add an explicit include to ensure the prototypes match.
Fixes: 8660484ed1cf ("module: add debugging auto-load duplicate module support")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304141440.DYO4NAzp-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"The summary of the changes for this pull requests is:
- Song Liu's new struct module_memory replacement
- Nick Alcock's MODULE_LICENSE() removal for non-modules
- My cleanups and enhancements to reduce the areas where we vmalloc
module memory for duplicates, and the respective debug code which
proves the remaining vmalloc pressure comes from userspace.
Most of the changes have been in linux-next for quite some time except
the minor fixes I made to check if a module was already loaded prior
to allocating the final module memory with vmalloc and the respective
debug code it introduces to help clarify the issue. Although the
functional change is small it is rather safe as it can only *help*
reduce vmalloc space for duplicates and is confirmed to fix a bootup
issue with over 400 CPUs with KASAN enabled. I don't expect stable
kernels to pick up that fix as the cleanups would have also had to
have been picked up. Folks on larger CPU systems with modules will
want to just upgrade if vmalloc space has been an issue on bootup.
Given the size of this request, here's some more elaborate details:
The functional change change in this pull request is the very first
patch from Song Liu which replaces the 'struct module_layout' with a
new 'struct module_memory'. The old data structure tried to put
together all types of supported module memory types in one data
structure, the new one abstracts the differences in memory types in a
module to allow each one to provide their own set of details. This
paves the way in the future so we can deal with them in a cleaner way.
If you look at changes they also provide a nice cleanup of how we
handle these different memory areas in a module. This change has been
in linux-next since before the merge window opened for v6.3 so to
provide more than a full kernel cycle of testing. It's a good thing as
quite a bit of fixes have been found for it.
Jason Baron then made dynamic debug a first class citizen module user
by using module notifier callbacks to allocate / remove module
specific dynamic debug information.
Nick Alcock has done quite a bit of work cross-tree to remove module
license tags from things which cannot possibly be module at my request
so to:
a) help him with his longer term tooling goals which require a
deterministic evaluation if a piece a symbol code could ever be
part of a module or not. But quite recently it is has been made
clear that tooling is not the only one that would benefit.
Disambiguating symbols also helps efforts such as live patching,
kprobes and BPF, but for other reasons and R&D on this area is
active with no clear solution in sight.
b) help us inch closer to the now generally accepted long term goal
of automating all the MODULE_LICENSE() tags from SPDX license tags
In so far as a) is concerned, although module license tags are a no-op
for non-modules, tools which would want create a mapping of possible
modules can only rely on the module license tag after the commit
8b41fc4454e ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf").
Nick has been working on this *for years* and AFAICT I was the only
one to suggest two alternatives to this approach for tooling. The
complexity in one of my suggested approaches lies in that we'd need a
possible-obj-m and a could-be-module which would check if the object
being built is part of any kconfig build which could ever lead to it
being part of a module, and if so define a new define
-DPOSSIBLE_MODULE [0].
A more obvious yet theoretical approach I've suggested would be to
have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
well but that means getting kconfig symbol names mapping to modules
always, and I don't think that's the case today. I am not aware of
Nick or anyone exploring either of these options. Quite recently Josh
Poimboeuf has pointed out that live patching, kprobes and BPF would
benefit from resolving some part of the disambiguation as well but for
other reasons. The function granularity KASLR (fgkaslr) patches were
mentioned but Joe Lawrence has clarified this effort has been dropped
with no clear solution in sight [1].
In the meantime removing module license tags from code which could
never be modules is welcomed for both objectives mentioned above. Some
developers have also welcomed these changes as it has helped clarify
when a module was never possible and they forgot to clean this up, and
so you'll see quite a bit of Nick's patches in other pull requests for
this merge window. I just picked up the stragglers after rc3. LWN has
good coverage on the motivation behind this work [2] and the typical
cross-tree issues he ran into along the way. The only concrete blocker
issue he ran into was that we should not remove the MODULE_LICENSE()
tags from files which have no SPDX tags yet, even if they can never be
modules. Nick ended up giving up on his efforts due to having to do
this vetting and backlash he ran into from folks who really did *not
understand* the core of the issue nor were providing any alternative /
guidance. I've gone through his changes and dropped the patches which
dropped the module license tags where an SPDX license tag was missing,
it only consisted of 11 drivers. To see if a pull request deals with a
file which lacks SPDX tags you can just use:
./scripts/spdxcheck.py -f \
$(git diff --name-only commid-id | xargs echo)
You'll see a core module file in this pull request for the above, but
that's not related to his changes. WE just need to add the SPDX
license tag for the kernel/module/kmod.c file in the future but it
demonstrates the effectiveness of the script.
Most of Nick's changes were spread out through different trees, and I
just picked up the slack after rc3 for the last kernel was out. Those
changes have been in linux-next for over two weeks.
The cleanups, debug code I added and final fix I added for modules
were motivated by David Hildenbrand's report of boot failing on a
systems with over 400 CPUs when KASAN was enabled due to running out
of virtual memory space. Although the functional change only consists
of 3 lines in the patch "module: avoid allocation if module is already
present and ready", proving that this was the best we can do on the
modules side took quite a bit of effort and new debug code.
The initial cleanups I did on the modules side of things has been in
linux-next since around rc3 of the last kernel, the actual final fix
for and debug code however have only been in linux-next for about a
week or so but I think it is worth getting that code in for this merge
window as it does help fix / prove / evaluate the issues reported with
larger number of CPUs. Userspace is not yet fixed as it is taking a
bit of time for folks to understand the crux of the issue and find a
proper resolution. Worst come to worst, I have a kludge-of-concept [3]
of how to make kernel_read*() calls for modules unique / converge
them, but I'm currently inclined to just see if userspace can fix this
instead"
Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]
* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
module: add debugging auto-load duplicate module support
module: stats: fix invalid_mod_bytes typo
module: remove use of uninitialized variable len
module: fix building stats for 32-bit targets
module: stats: include uapi/linux/module.h
module: avoid allocation if module is already present and ready
module: add debug stats to help identify memory pressure
module: extract patient module check into helper
modules/kmod: replace implementation with a semaphore
Change DEFINE_SEMAPHORE() to take a number argument
module: fix kmemleak annotations for non init ELF sections
module: Ignore L0 and rename is_arm_mapping_symbol()
module: Move is_arm_mapping_symbol() to module_symbol.h
module: Sync code of is_arm_mapping_symbol()
scripts/gdb: use mem instead of core_layout to get the module address
interconnect: remove module-related code
interconnect: remove MODULE_LICENSE in non-modules
zswap: remove MODULE_LICENSE in non-modules
zpool: remove MODULE_LICENSE in non-modules
x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
default value allows for better BIG TCP performances
- Reduce compound page head access for zero-copy data transfers
- RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
possible
- Threaded NAPI improvements, adding defer skb free support and
unneeded softirq avoidance
- Address dst_entry reference count scalability issues, via false
sharing avoidance and optimize refcount tracking
- Add lockless accesses annotation to sk_err[_soft]
- Optimize again the skb struct layout
- Extends the skb drop reasons to make it usable by multiple
subsystems
- Better const qualifier awareness for socket casts
BPF:
- Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and
variable-sized accesses
- Add a new BPF netfilter program type and minimal support to hook
BPF programs to netfilter hooks such as prerouting or forward
- Add more precise memory usage reporting for all BPF map types
- Adds support for using {FOU,GUE} encap with an ipip device
operating in collect_md mode and add a set of BPF kfuncs for
controlling encap params
- Allow BPF programs to detect at load time whether a particular
kfunc exists or not, and also add support for this in light
skeleton
- Bigger batch of BPF verifier improvements to prepare for upcoming
BPF open-coded iterators allowing for less restrictive looping
capabilities
- Rework RCU enforcement in the verifier, add kptr_rcu and enforce
BPF programs to NULL-check before passing such pointers into kfunc
- Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
in local storage maps
- Enable RCU semantics for task BPF kptrs and allow referenced kptr
tasks to be stored in BPF maps
- Add support for refcounted local kptrs to the verifier for allowing
shared ownership, useful for adding a node to both the BPF list and
rbtree
- Add BPF verifier support for ST instructions in
convert_ctx_access() which will help new -mcpu=v4 clang flag to
start emitting them
- Add ARM32 USDT support to libbpf
- Improve bpftool's visual program dump which produces the control
flow graph in a DOT format by adding C source inline annotations
Protocols:
- IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
indicates the provenance of the IP address
- IPv6: optimize route lookup, dropping unneeded R/W lock acquisition
- Add the handshake upcall mechanism, allowing the user-space to
implement generic TLS handshake on kernel's behalf
- Bridge: support per-{Port, VLAN} neighbor suppression, increasing
resilience to nodes failures
- SCTP: add support for Fair Capacity and Weighted Fair Queueing
schedulers
- MPTCP: delay first subflow allocation up to its first usage. This
will allow for later better LSM interaction
- xfrm: Remove inner/outer modes from input/output path. These are
not needed anymore
- WiFi:
- reduced neighbor report (RNR) handling for AP mode
- HW timestamping support
- support for randomized auth/deauth TA for PASN privacy
- per-link debugfs for multi-link
- TC offload support for mac80211 drivers
- mac80211 mesh fast-xmit and fast-rx support
- enable Wi-Fi 7 (EHT) mesh support
Netfilter:
- Add nf_tables 'brouting' support, to force a packet to be routed
instead of being bridged
- Update bridge netfilter and ovs conntrack helpers to handle IPv6
Jumbo packets properly, i.e. fetch the packet length from
hop-by-hop extension header. This is needed for BIT TCP support
- The iptables 32bit compat interface isn't compiled in by default
anymore
- Move ip(6)tables builtin icmp matches to the udptcp one. This has
the advantage that icmp/icmpv6 match doesn't load the
iptables/ip6tables modules anymore when iptables-nft is used
- Extended netlink error report for netdevice in flowtables and
netdev/chains. Allow for incrementally add/delete devices to netdev
basechain. Allow to create netdev chain without device
Driver API:
- Remove redundant Device Control Error Reporting Enable, as PCI core
has already error reporting enabled at enumeration time
- Move Multicast DB netlink handlers to core, allowing devices other
then bridge to use them
- Allow the page_pool to directly recycle the pages from safely
localized NAPI
- Implement lockless TX queue stop/wake combo macros, allowing for
further code de-duplication and sanitization
- Add YNL support for user headers and struct attrs
- Add partial YNL specification for devlink
- Add partial YNL specification for ethtool
- Add tc-mqprio and tc-taprio support for preemptible traffic classes
- Add tx push buf len param to ethtool, specifies the maximum number
of bytes of a transmitted packet a driver can push directly to the
underlying device
- Add basic LED support for switch/phy
- Add NAPI documentation, stop relaying on external links
- Convert dsa_master_ioctl() to netdev notifier. This is a
preparatory work to make the hardware timestamping layer selectable
by user space
- Add transceiver support and improve the error messages for CAN-FD
controllers
New hardware / drivers:
- Ethernet:
- AMD/Pensando core device support
- MediaTek MT7981 SoC
- MediaTek MT7988 SoC
- Broadcom BCM53134 embedded switch
- Texas Instruments CPSW9G ethernet switch
- Qualcomm EMAC3 DWMAC ethernet
- StarFive JH7110 SoC
- NXP CBTX ethernet PHY
- WiFi:
- Apple M1 Pro/Max devices
- RealTek rtl8710bu/rtl8188gu
- RealTe |