2022-02-22 15:19:13

by Aaron Tomlin

Subject: [PATCH v8 00/13] module: core code clean up

Hi Luis, Christophe,

As per your suggestion [1], this is an attempt to refactor and split
optional code out of core module support code into separate components.
This version is based on Linus' commit 7993e65fdd0f ("Merge tag
'mtd/fixes-for-5.17-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux").

Hopefully this iteration is all right. Please let me know your thoughts.

Changes since v7 [2]:

- Removed redundant ifdef CONFIG_MODULES and endif pairing from
kernel/module/Makefile

Changes since v6 [3]:

- Moved KCOV_INSTRUMENT_module.o out of kernel/Makefile into
kernel/module/Makefile (Christophe Leroy)
- Moved kernel/module/signature.c back into kernel/
(Christophe Leroy)
- Fixed Oops in add_kallsyms() due to an invalid pointer assignment
(Christophe Leroy)

Changes since v5 [4]:

- Updated MAINTAINERS to include the entire kernel/module/ directory
(Christophe Leroy)
- Reintroduced commit a97ac8cb24a3 ("module: fix signature check failures
when using in-kernel decompression") (Michal Suchánek)
- Refactored code to address some style violations reported by
scripts/checkpatch.pl --strict, e.g. "Alignment should match open
parenthesis" (--ignore=MULTIPLE_ASSIGNMENTS,ASSIGN_IN_IF was used)
(Christophe Leroy)
- Used PAGE_ALIGN() and PAGE_ALIGNED() instead (Christophe Leroy)
- Removed sig_enforce from include/linux/module.h as it is only
used in kernel/module/signing.c (Christophe Leroy)
- Added static keyword for anything not used outside a source file
(Christophe Leroy)
- Moved mod_sysfs_teardown() to kernel/module/sysfs.c (Christophe Leroy)
- Removed kdb_modules from kernel/debug/kdb/kdb_private.h
(Christophe Leroy)

Changes since v4 [5]:

- Moved is_livepatch_module() and set_livepatch_module() to
kernel/module/livepatch.c
- Addressed minor compiler warning concerning
kernel/module/internal.h (0-day)
- Resolved style violations reported by scripts/checkpatch.pl
- Dropped patch 5 [6] so external patch [7] can be applied at
a later date post merge into module-next (Christophe Leroy)

Changes since v3 [8]:

- Refactored both is_livepatch_module() and set_livepatch_module(),
respectively, to use IS_ENABLED(CONFIG_LIVEPATCH) (Joe Perches)
- Addressed various compiler warnings e.g., no previous prototype (0-day)

Changes since v2 [9]:

- Moved module decompress support to a separate file
- Made check_modinfo_livepatch() generic (Petr Mladek)
- Removed filename from each newly created file (Luis Chamberlain)
- Addressed some minor scripts/checkpatch.pl concerns, e.g. use strscpy
over strlcpy and missing a blank line after declarations
(--ignore=ASSIGN_IN_IF,AVOID_BUG was used) (Allen)

Changes since v1 [10]:

- Moved module version support code into a new file

[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: https://lore.kernel.org/lkml/[email protected]/
[3]: https://lore.kernel.org/lkml/[email protected]/
[4]: https://lore.kernel.org/lkml/[email protected]/
[5]: https://lore.kernel.org/lkml/[email protected]/
[6]: https://lore.kernel.org/lkml/[email protected]/
[7]: https://lore.kernel.org/lkml/203348805c9ac9851d8939d15cb9802ef047b5e2.1643919758.git.christophe.leroy@csgroup.eu/
[8]: https://lore.kernel.org/lkml/[email protected]/
[9]: https://lore.kernel.org/lkml/[email protected]/
[10]: https://lore.kernel.org/lkml/[email protected]/

Aaron Tomlin (13):
module: Move all into module/
module: Simple refactor in preparation for split
module: Make internal.h and decompress.c more compliant
module: Move livepatch support to a separate file
module: Move latched RB-tree support to a separate file
module: Move strict rwx support to a separate file
module: Move extra signature support out of core code
module: Move kmemleak support to a separate file
module: Move kallsyms support into a separate file
module: Move procfs support into a separate file
module: Move sysfs support into a separate file
module: Move kdb_modules list out of core code
module: Move version support into a separate file

MAINTAINERS | 2 +-
include/linux/module.h | 9 +-
kernel/Makefile | 5 +-
kernel/debug/kdb/kdb_main.c | 5 +
kernel/debug/kdb/kdb_private.h | 4 -
kernel/module-internal.h | 50 -
kernel/module/Makefile | 20 +
kernel/module/debug_kmemleak.c | 30 +
.../decompress.c} | 5 +-
kernel/module/internal.h | 275 +++
kernel/module/kallsyms.c | 506 +++++
kernel/module/livepatch.c | 74 +
kernel/{module.c => module/main.c} | 1856 +----------------
kernel/module/procfs.c | 142 ++
kernel/module/signing.c | 122 ++
kernel/module/strict_rwx.c | 85 +
kernel/module/sysfs.c | 436 ++++
kernel/module/tree_lookup.c | 109 +
kernel/module/version.c | 109 +
kernel/module_signing.c | 45 -
20 files changed, 2007 insertions(+), 1882 deletions(-)
delete mode 100644 kernel/module-internal.h
create mode 100644 kernel/module/Makefile
create mode 100644 kernel/module/debug_kmemleak.c
rename kernel/{module_decompress.c => module/decompress.c} (99%)
create mode 100644 kernel/module/internal.h
create mode 100644 kernel/module/kallsyms.c
create mode 100644 kernel/module/livepatch.c
rename kernel/{module.c => module/main.c} (64%)
create mode 100644 kernel/module/procfs.c
create mode 100644 kernel/module/signing.c
create mode 100644 kernel/module/strict_rwx.c
create mode 100644 kernel/module/sysfs.c
create mode 100644 kernel/module/tree_lookup.c
create mode 100644 kernel/module/version.c
delete mode 100644 kernel/module_signing.c


base-commit: 7993e65fdd0fe07beb9f36f998f9bbef2c0ee391
--
2.34.1


2022-02-22 15:37:15

by Aaron Tomlin

Subject: [PATCH v8 07/13] module: Move extra signature support out of core code

No functional change.

This patch migrates additional module signature check
code from core module code into kernel/module/signing.c.

Signed-off-by: Aaron Tomlin <[email protected]>
---
kernel/module/internal.h | 9 +++++
kernel/module/main.c | 87 ----------------------------------------
kernel/module/signing.c | 77 +++++++++++++++++++++++++++++++++++
3 files changed, 86 insertions(+), 87 deletions(-)

diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index a6895bb5598a..d6f646a5da41 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -158,3 +158,12 @@ static inline int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
return 0;
}
#endif /* CONFIG_STRICT_MODULE_RWX */
+
+#ifdef CONFIG_MODULE_SIG
+int module_sig_check(struct load_info *info, int flags);
+#else /* !CONFIG_MODULE_SIG */
+static inline int module_sig_check(struct load_info *info, int flags)
+{
+ return 0;
+}
+#endif /* !CONFIG_MODULE_SIG */
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 5cd63f14b1ef..c63e10c61694 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -23,7 +23,6 @@
#include <linux/vmalloc.h>
#include <linux/elf.h>
#include <linux/proc_fs.h>
-#include <linux/security.h>
#include <linux/seq_file.h>
#include <linux/syscalls.h>
#include <linux/fcntl.h>
@@ -127,28 +126,6 @@ static void module_assert_mutex_or_preempt(void)
#endif
}

-#ifdef CONFIG_MODULE_SIG
-static bool sig_enforce = IS_ENABLED(CONFIG_MODULE_SIG_FORCE);
-module_param(sig_enforce, bool_enable_only, 0644);
-
-void set_module_sig_enforced(void)
-{
- sig_enforce = true;
-}
-#else
-#define sig_enforce false
-#endif
-
-/*
- * Export sig_enforce kernel cmdline parameter to allow other subsystems rely
- * on that instead of directly to CONFIG_MODULE_SIG_FORCE config.
- */
-bool is_module_sig_enforced(void)
-{
- return sig_enforce;
-}
-EXPORT_SYMBOL(is_module_sig_enforced);
-
/* Block module loading/unloading? */
int modules_disabled = 0;
core_param(nomodule, modules_disabled, bint, 0);
@@ -2569,70 +2546,6 @@ static inline void kmemleak_load_module(const struct module *mod,
}
#endif

-#ifdef CONFIG_MODULE_SIG
-static int module_sig_check(struct load_info *info, int flags)
-{
- int err = -ENODATA;
- const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
- const char *reason;
- const void *mod = info->hdr;
- bool mangled_module = flags & (MODULE_INIT_IGNORE_MODVERSIONS |
- MODULE_INIT_IGNORE_VERMAGIC);
- /*
- * Do not allow mangled modules as a module with version information
- * removed is no longer the module that was signed.
- */
- if (!mangled_module &&
- info->len > markerlen &&
- memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) {
- /* We truncate the module to discard the signature */
- info->len -= markerlen;
- err = mod_verify_sig(mod, info);
- if (!err) {
- info->sig_ok = true;
- return 0;
- }
- }
-
- /*
- * We don't permit modules to be loaded into the trusted kernels
- * without a valid signature on them, but if we're not enforcing,
- * certain errors are non-fatal.
- */
- switch (err) {
- case -ENODATA:
- reason = "unsigned module";
- break;
- case -ENOPKG:
- reason = "module with unsupported crypto";
- break;
- case -ENOKEY:
- reason = "module with unavailable key";
- break;
-
- default:
- /*
- * All other errors are fatal, including lack of memory,
- * unparseable signatures, and signature check failures --
- * even if signatures aren't required.
- */
- return err;
- }
-
- if (is_module_sig_enforced()) {
- pr_notice("Loading of %s is rejected\n", reason);
- return -EKEYREJECTED;
- }
-
- return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
-}
-#else /* !CONFIG_MODULE_SIG */
-static int module_sig_check(struct load_info *info, int flags)
-{
- return 0;
-}
-#endif /* !CONFIG_MODULE_SIG */
-
static int validate_section_offset(struct load_info *info, Elf_Shdr *shdr)
{
#if defined(CONFIG_64BIT)
diff --git a/kernel/module/signing.c b/kernel/module/signing.c
index 8aeb6d2ee94b..85c8999dfecf 100644
--- a/kernel/module/signing.c
+++ b/kernel/module/signing.c
@@ -11,9 +11,29 @@
#include <linux/module_signature.h>
#include <linux/string.h>
#include <linux/verification.h>
+#include <linux/security.h>
#include <crypto/public_key.h>
+#include <uapi/linux/module.h>
#include "internal.h"

+static bool sig_enforce = IS_ENABLED(CONFIG_MODULE_SIG_FORCE);
+module_param(sig_enforce, bool_enable_only, 0644);
+
+/*
+ * Export sig_enforce kernel cmdline parameter to allow other subsystems rely
+ * on that instead of directly to CONFIG_MODULE_SIG_FORCE config.
+ */
+bool is_module_sig_enforced(void)
+{
+ return sig_enforce;
+}
+EXPORT_SYMBOL(is_module_sig_enforced);
+
+void set_module_sig_enforced(void)
+{
+ sig_enforce = true;
+}
+
/*
* Verify the signature on a module.
*/
@@ -43,3 +63,60 @@ int mod_verify_sig(const void *mod, struct load_info *info)
VERIFYING_MODULE_SIGNATURE,
NULL, NULL);
}
+
+int module_sig_check(struct load_info *info, int flags)
+{
+ int err = -ENODATA;
+ const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
+ const char *reason;
+ const void *mod = info->hdr;
+ bool mangled_module = flags & (MODULE_INIT_IGNORE_MODVERSIONS |
+ MODULE_INIT_IGNORE_VERMAGIC);
+ /*
+ * Do not allow mangled modules as a module with version information
+ * removed is no longer the module that was signed.
+ */
+ if (!mangled_module &&
+ info->len > markerlen &&
+ memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) {
+ /* We truncate the module to discard the signature */
+ info->len -= markerlen;
+ err = mod_verify_sig(mod, info);
+ if (!err) {
+ info->sig_ok = true;
+ return 0;
+ }
+ }
+
+ /*
+ * We don't permit modules to be loaded into the trusted kernels
+ * without a valid signature on them, but if we're not enforcing,
+ * certain errors are non-fatal.
+ */
+ switch (err) {
+ case -ENODATA:
+ reason = "unsigned module";
+ break;
+ case -ENOPKG:
+ reason = "module with unsupported crypto";
+ break;
+ case -ENOKEY:
+ reason = "module with unavailable key";
+ break;
+
+ default:
+ /*
+ * All other errors are fatal, including lack of memory,
+ * unparseable signatures, and signature check failures --
+ * even if signatures aren't required.
+ */
+ return err;
+ }
+
+ if (is_module_sig_enforced()) {
+ pr_notice("Loading of %s is rejected\n", reason);
+ return -EKEYREJECTED;
+ }
+
+ return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
+}
--
2.34.1
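
The error handling that module_sig_check() carries over unchanged can be summarised as a small decision function. The following is a hedged userspace model of that policy only, not the kernel code. sig_policy() and its parameters are hypothetical: verify_err stands in for the result of mod_verify_sig() (or -ENODATA when no signature marker is found), enforced stands in for is_module_sig_enforced(), and lockdown_err stands in for the return value of security_locked_down(). It assumes Linux errno values.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical model of the signature policy:
 *  - a good signature (0) always loads;
 *  - "missing signature", "unsupported crypto" and "unavailable key"
 *    are non-fatal unless enforcement is on;
 *  - every other error is fatal regardless of enforcement.
 */
static int sig_policy(int verify_err, bool enforced, int lockdown_err)
{
	const char *reason;

	/* Follows the kernel convention: 0 or a negative errno. */
	assert(verify_err <= 0);

	if (verify_err == 0)
		return 0;

	switch (verify_err) {
	case -ENODATA:
		reason = "unsigned module";
		break;
	case -ENOPKG:
		reason = "module with unsupported crypto";
		break;
	case -ENOKEY:
		reason = "module with unavailable key";
		break;
	default:
		/* E.g. lack of memory or a failed signature check. */
		return verify_err;
	}

	if (enforced) {
		fprintf(stderr, "Loading of %s is rejected\n", reason);
		return -EKEYREJECTED;
	}

	/* Not enforcing: the lockdown state has the final say. */
	return lockdown_err;
}
```

For example, sig_policy(-ENODATA, false, 0) models an unsigned module on a non-enforcing, non-locked-down system and returns 0, while the same call with enforced set to true returns -EKEYREJECTED.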

2022-02-22 15:44:27

by Aaron Tomlin

Subject: [PATCH v8 11/13] module: Move sysfs support into a separate file

No functional change.

This patch migrates module sysfs support out of core code into
kernel/module/sysfs.c. In addition, some simple code refactoring was
needed to make this possible.

Signed-off-by: Aaron Tomlin <[email protected]>
---
kernel/module/Makefile | 1 +
kernel/module/internal.h | 21 ++
kernel/module/main.c | 469 +--------------------------------------
kernel/module/sysfs.c | 436 ++++++++++++++++++++++++++++++++++++
4 files changed, 461 insertions(+), 466 deletions(-)
create mode 100644 kernel/module/sysfs.c

diff --git a/kernel/module/Makefile b/kernel/module/Makefile
index 94296c98a67f..cf8dcdc6b55f 100644
--- a/kernel/module/Makefile
+++ b/kernel/module/Makefile
@@ -16,3 +16,4 @@ obj-$(CONFIG_STRICT_MODULE_RWX) += strict_rwx.o
obj-$(CONFIG_DEBUG_KMEMLEAK) += debug_kmemleak.o
obj-$(CONFIG_KALLSYMS) += kallsyms.o
obj-$(CONFIG_PROC_FS) += procfs.o
+obj-$(CONFIG_SYSFS) += sysfs.o
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index 6af40c2d145f..62d749ef695e 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -34,6 +34,9 @@
extern struct mutex module_mutex;
extern struct list_head modules;

+extern struct module_attribute *modinfo_attrs[];
+extern size_t modinfo_attrs_count;
+
/* Provided by the linker */
extern const struct kernel_symbol __start___ksymtab[];
extern const struct kernel_symbol __stop___ksymtab[];
@@ -204,3 +207,21 @@ static inline void init_build_id(struct module *mod, const struct load_info *inf
static inline void layout_symtab(struct module *mod, struct load_info *info) { }
static inline void add_kallsyms(struct module *mod, const struct load_info *info) { }
#endif /* CONFIG_KALLSYMS */
+
+#ifdef CONFIG_SYSFS
+int mod_sysfs_setup(struct module *mod, const struct load_info *info,
+ struct kernel_param *kparam, unsigned int num_params);
+void mod_sysfs_teardown(struct module *mod);
+void init_param_lock(struct module *mod);
+#else /* !CONFIG_SYSFS */
+static inline int mod_sysfs_setup(struct module *mod,
+ const struct load_info *info,
+ struct kernel_param *kparam,
+ unsigned int num_params)
+{
+ return 0;
+}
+
+static inline void mod_sysfs_teardown(struct module *mod) { }
+static inline void init_param_lock(struct module *mod) { }
+#endif /* CONFIG_SYSFS */
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 44b6fd1acc44..b8a59b5c3e3a 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -14,9 +14,7 @@
#include <linux/init.h>
#include <linux/kallsyms.h>
#include <linux/buildid.h>
-#include <linux/file.h>
#include <linux/fs.h>
-#include <linux/sysfs.h>
#include <linux/kernel.h>
#include <linux/kernel_read_file.h>
#include <linux/slab.h>
@@ -989,7 +987,7 @@ static ssize_t show_taint(struct module_attribute *mattr,
static struct module_attribute modinfo_taint =
__ATTR(taint, 0444, show_taint, NULL);

-static struct module_attribute *modinfo_attrs[] = {
+struct module_attribute *modinfo_attrs[] = {
&module_uevent,
&modinfo_version,
&modinfo_srcversion,
@@ -1003,6 +1001,8 @@ static struct module_attribute *modinfo_attrs[] = {
NULL,
};

+size_t modinfo_attrs_count = ARRAY_SIZE(modinfo_attrs);
+
static const char vermagic[] = VERMAGIC_STRING;

static int try_to_force_load(struct module *mod, const char *reason)
@@ -1253,469 +1253,6 @@ resolve_symbol_wait(struct module *mod,
return ksym;
}

-/*
- * /sys/module/foo/sections stuff
- * J. Corbet <[email protected]>
- */
-#ifdef CONFIG_SYSFS
-
-#ifdef CONFIG_KALLSYMS
-struct module_sect_attr {
- struct bin_attribute battr;
- unsigned long address;
-};
-
-struct module_sect_attrs {
- struct attribute_group grp;
- unsigned int nsections;
- struct module_sect_attr attrs[];
-};
-
-#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4))
-static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
- struct bin_attribute *battr,
- char *buf, loff_t pos, size_t count)
-{
- struct module_sect_attr *sattr =
- container_of(battr, struct module_sect_attr, battr);
- char bounce[MODULE_SECT_READ_SIZE + 1];
- size_t wrote;
-
- if (pos != 0)
- return -EINVAL;
-
- /*
- * Since we're a binary read handler, we must account for the
- * trailing NUL byte that sprintf will write: if "buf" is
- * too small to hold the NUL, or the NUL is exactly the last
- * byte, the read will look like it got truncated by one byte.
- * Since there is no way to ask sprintf nicely to not write
- * the NUL, we have to use a bounce buffer.
- */
- wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n",
- kallsyms_show_value(file->f_cred)
- ? (void *)sattr->address : NULL);
- count = min(count, wrote);
- memcpy(buf, bounce, count);
-
- return count;
-}
-
-static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
-{
- unsigned int section;
-
- for (section = 0; section < sect_attrs->nsections; section++)
- kfree(sect_attrs->attrs[section].battr.attr.name);
- kfree(sect_attrs);
-}
-
-static void add_sect_attrs(struct module *mod, const struct load_info *info)
-{
- unsigned int nloaded = 0, i, size[2];
- struct module_sect_attrs *sect_attrs;
- struct module_sect_attr *sattr;
- struct bin_attribute **gattr;
-
- /* Count loaded sections and allocate structures */
- for (i = 0; i < info->hdr->e_shnum; i++)
- if (!sect_empty(&info->sechdrs[i]))
- nloaded++;
- size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
- sizeof(sect_attrs->grp.bin_attrs[0]));
- size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
- sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
- if (sect_attrs == NULL)
- return;
-
- /* Setup section attributes. */
- sect_attrs->grp.name = "sections";
- sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
-
- sect_attrs->nsections = 0;
- sattr = &sect_attrs->attrs[0];
- gattr = &sect_attrs->grp.bin_attrs[0];
- for (i = 0; i < info->hdr->e_shnum; i++) {
- Elf_Shdr *sec = &info->sechdrs[i];
- if (sect_empty(sec))
- continue;
- sysfs_bin_attr_init(&sattr->battr);
- sattr->address = sec->sh_addr;
- sattr->battr.attr.name =
- kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
- if (sattr->battr.attr.name == NULL)
- goto out;
- sect_attrs->nsections++;
- sattr->battr.read = module_sect_read;
- sattr->battr.size = MODULE_SECT_READ_SIZE;
- sattr->battr.attr.mode = 0400;
- *(gattr++) = &(sattr++)->battr;
- }
- *gattr = NULL;
-
- if (sysfs_create_group(&mod->mkobj.kobj, &sect_attrs->grp))
- goto out;
-
- mod->sect_attrs = sect_attrs;
- return;
- out:
- free_sect_attrs(sect_attrs);
-}
-
-static void remove_sect_attrs(struct module *mod)
-{
- if (mod->sect_attrs) {
- sysfs_remove_group(&mod->mkobj.kobj,
- &mod->sect_attrs->grp);
- /*
- * We are positive that no one is using any sect attrs
- * at this point. Deallocate immediately.
- */
- free_sect_attrs(mod->sect_attrs);
- mod->sect_attrs = NULL;
- }
-}
-
-/*
- * /sys/module/foo/notes/.section.name gives contents of SHT_NOTE sections.
- */
-
-struct module_notes_attrs {
- struct kobject *dir;
- unsigned int notes;
- struct bin_attribute attrs[];
-};
-
-static ssize_t module_notes_read(struct file *filp, struct kobject *kobj,
- struct bin_attribute *bin_attr,
- char *buf, loff_t pos, size_t count)
-{
- /*
- * The caller checked the pos and count against our size.
- */
- memcpy(buf, bin_attr->private + pos, count);
- return count;
-}
-
-static void free_notes_attrs(struct module_notes_attrs *notes_attrs,
- unsigned int i)
-{
- if (notes_attrs->dir) {
- while (i-- > 0)
- sysfs_remove_bin_file(notes_attrs->dir,
- &notes_attrs->attrs[i]);
- kobject_put(notes_attrs->dir);
- }
- kfree(notes_attrs);
-}
-
-static void add_notes_attrs(struct module *mod, const struct load_info *info)
-{
- unsigned int notes, loaded, i;
- struct module_notes_attrs *notes_attrs;
- struct bin_attribute *nattr;
-
- /* failed to create section attributes, so can't create notes */
- if (!mod->sect_attrs)
- return;
-
- /* Count notes sections and allocate structures. */
- notes = 0;
- for (i = 0; i < info->hdr->e_shnum; i++)
- if (!sect_empty(&info->sechdrs[i]) &&
- (info->sechdrs[i].sh_type == SHT_NOTE))
- ++notes;
-
- if (notes == 0)
- return;
-
- notes_attrs = kzalloc(struct_size(notes_attrs, attrs, notes),
- GFP_KERNEL);
- if (notes_attrs == NULL)
- return;
-
- notes_attrs->notes = notes;
- nattr = &notes_attrs->attrs[0];
- for (loaded = i = 0; i < info->hdr->e_shnum; ++i) {
- if (sect_empty(&info->sechdrs[i]))
- continue;
- if (info->sechdrs[i].sh_type == SHT_NOTE) {
- sysfs_bin_attr_init(nattr);
- nattr->attr.name = mod->sect_attrs->attrs[loaded].battr.attr.name;
- nattr->attr.mode = S_IRUGO;
- nattr->size = info->sechdrs[i].sh_size;
- nattr->private = (void *) info->sechdrs[i].sh_addr;
- nattr->read = module_notes_read;
- ++nattr;
- }
- ++loaded;
- }
-
- notes_attrs->dir = kobject_create_and_add("notes", &mod->mkobj.kobj);
- if (!notes_attrs->dir)
- goto out;
-
- for (i = 0; i < notes; ++i)
- if (sysfs_create_bin_file(notes_attrs->dir,
- &notes_attrs->attrs[i]))
- goto out;
-
- mod->notes_attrs = notes_attrs;
- return;
-
- out:
- free_notes_attrs(notes_attrs, i);
-}
-
-static void remove_notes_attrs(struct module *mod)
-{
- if (mod->notes_attrs)
- free_notes_attrs(mod->notes_attrs, mod->notes_attrs->notes);
-}
-
-#else
-
-static inline void add_sect_attrs(struct module *mod,
- const struct load_info *info)
-{
-}
-
-static inline void remove_sect_attrs(struct module *mod)
-{
-}
-
-static inline void add_notes_attrs(struct module *mod,
- const struct load_info *info)
-{
-}
-
-static inline void remove_notes_attrs(struct module *mod)
-{
-}
-#endif /* CONFIG_KALLSYMS */
-
-static void del_usage_links(struct module *mod)
-{
-#ifdef CONFIG_MODULE_UNLOAD
- struct module_use *use;
-
- mutex_lock(&module_mutex);
- list_for_each_entry(use, &mod->target_list, target_list)
- sysfs_remove_link(use->target->holders_dir, mod->name);
- mutex_unlock(&module_mutex);
-#endif
-}
-
-static int add_usage_links(struct module *mod)
-{
- int ret = 0;
-#ifdef CONFIG_MODULE_UNLOAD
- struct module_use *use;
-
- mutex_lock(&module_mutex);
- list_for_each_entry(use, &mod->target_list, target_list) {
- ret = sysfs_create_link(use->target->holders_dir,
- &mod->mkobj.kobj, mod->name);
- if (ret)
- break;
- }
- mutex_unlock(&module_mutex);
- if (ret)
- del_usage_links(mod);
-#endif
- return ret;
-}
-
-static void module_remove_modinfo_attrs(struct module *mod, int end);
-
-static int module_add_modinfo_attrs(struct module *mod)
-{
- struct module_attribute *attr;
- struct module_attribute *temp_attr;
- int error = 0;
- int i;
-
- mod->modinfo_attrs = kzalloc((sizeof(struct module_attribute) *
- (ARRAY_SIZE(modinfo_attrs) + 1)),
- GFP_KERNEL);
- if (!mod->modinfo_attrs)
- return -ENOMEM;
-
- temp_attr = mod->modinfo_attrs;
- for (i = 0; (attr = modinfo_attrs[i]); i++) {
- if (!attr->test || attr->test(mod)) {
- memcpy(temp_attr, attr, sizeof(*temp_attr));
- sysfs_attr_init(&temp_attr->attr);
- error = sysfs_create_file(&mod->mkobj.kobj,
- &temp_attr->attr);
- if (error)
- goto error_out;
- ++temp_attr;
- }
- }
-
- return 0;
-
-error_out:
- if (i > 0)
- module_remove_modinfo_attrs(mod, --i);
- else
- kfree(mod->modinfo_attrs);
- return error;
-}
-
-static void module_remove_modinfo_attrs(struct module *mod, int end)
-{
- struct module_attribute *attr;
- int i;
-
- for (i = 0; (attr = &mod->modinfo_attrs[i]); i++) {
- if (end >= 0 && i > end)
- break;
- /* pick a field to test for end of list */
- if (!attr->attr.name)
- break;
- sysfs_remove_file(&mod->mkobj.kobj, &attr->attr);
- if (attr->free)
- attr->free(mod);
- }
- kfree(mod->modinfo_attrs);
-}
-
-static void mod_kobject_put(struct module *mod)
-{
- DECLARE_COMPLETION_ONSTACK(c);
- mod->mkobj.kobj_completion = &c;
- kobject_put(&mod->mkobj.kobj);
- wait_for_completion(&c);
-}
-
-static int mod_sysfs_init(struct module *mod)
-{
- int err;
- struct kobject *kobj;
-
- if (!module_sysfs_initialized) {
- pr_err("%s: module sysfs not initialized\n", mod->name);
- err = -EINVAL;
- goto out;
- }
-
- kobj = kset_find_obj(module_kset, mod->name);
- if (kobj) {
- pr_err("%s: module is already loaded\n", mod->name);
- kobject_put(kobj);
- err = -EINVAL;
- goto out;
- }
-
- mod->mkobj.mod = mod;
-
- memset(&mod->mkobj.kobj, 0, sizeof(mod->mkobj.kobj));
- mod->mkobj.kobj.kset = module_kset;
- err = kobject_init_and_add(&mod->mkobj.kobj, &module_ktype, NULL,
- "%s", mod->name);
- if (err)
- mod_kobject_put(mod);
-
-out:
- return err;
-}
-
-static int mod_sysfs_setup(struct module *mod,
- const struct load_info *info,
- struct kernel_param *kparam,
- unsigned int num_params)
-{
- int err;
-
- err = mod_sysfs_init(mod);
- if (err)
- goto out;
-
- mod->holders_dir = kobject_create_and_add("holders", &mod->mkobj.kobj);
- if (!mod->holders_dir) {
- err = -ENOMEM;
- goto out_unreg;
- }
-
- err = module_param_sysfs_setup(mod, kparam, num_params);
- if (err)
- goto out_unreg_holders;
-
- err = module_add_modinfo_attrs(mod);
- if (err)
- goto out_unreg_param;
-
- err = add_usage_links(mod);
- if (err)
- goto out_unreg_modinfo_attrs;
-
- add_sect_attrs(mod, info);
- add_notes_attrs(mod, info);
-
- return 0;
-
-out_unreg_modinfo_attrs:
- module_remove_modinfo_attrs(mod, -1);
-out_unreg_param:
- module_param_sysfs_remove(mod);
-out_unreg_holders:
- kobject_put(mod->holders_dir);
-out_unreg:
- mod_kobject_put(mod);
-out:
- return err;
-}
-
-static void mod_sysfs_fini(struct module *mod)
-{
- remove_notes_attrs(mod);
- remove_sect_attrs(mod);
- mod_kobject_put(mod);
-}
-
-static void init_param_lock(struct module *mod)
-{
- mutex_init(&mod->param_lock);
-}
-#else /* !CONFIG_SYSFS */
-
-static int mod_sysfs_setup(struct module *mod,
- const struct load_info *info,
- struct kernel_param *kparam,
- unsigned int num_params)
-{
- return 0;
-}
-
-static void mod_sysfs_fini(struct module *mod)
-{
-}
-
-static void module_remove_modinfo_attrs(struct module *mod, int end)
-{
-}
-
-static void del_usage_links(struct module *mod)
-{
-}
-
-static void init_param_lock(struct module *mod)
-{
-}
-#endif /* CONFIG_SYSFS */
-
-static void mod_sysfs_teardown(struct module *mod)
-{
- del_usage_links(mod);
- module_remove_modinfo_attrs(mod, -1);
- module_param_sysfs_remove(mod);
- kobject_put(mod->mkobj.drivers_dir);
- kobject_put(mod->holders_dir);
- mod_sysfs_fini(mod);
-}
-
/*
* LKM RO/NX protection: protect module's text/ro-data
* from modification and any data from execution.
diff --git a/kernel/module/sysfs.c b/kernel/module/sysfs.c
new file mode 100644
index 000000000000..ce68f821dcd1
--- /dev/null
+++ b/kernel/module/sysfs.c
@@ -0,0 +1,436 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Module sysfs support
+ *
+ * Copyright (C) 2008 Rusty Russell
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/sysfs.h>
+#include <linux/slab.h>
+#include <linux/kallsyms.h>
+#include <linux/mutex.h>
+#include "internal.h"
+
+/*
+ * /sys/module/foo/sections stuff
+ * J. Corbet <[email protected]>
+ */
+#ifdef CONFIG_KALLSYMS
+struct module_sect_attr {
+ struct bin_attribute battr;
+ unsigned long address;
+};
+
+struct module_sect_attrs {
+ struct attribute_group grp;
+ unsigned int nsections;
+ struct module_sect_attr attrs[];
+};
+
+#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4))
+static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *battr,
+ char *buf, loff_t pos, size_t count)
+{
+ struct module_sect_attr *sattr =
+ container_of(battr, struct module_sect_attr, battr);
+ char bounce[MODULE_SECT_READ_SIZE + 1];
+ size_t wrote;
+
+ if (pos != 0)
+ return -EINVAL;
+
+ /*
+ * Since we're a binary read handler, we must account for the
+ * trailing NUL byte that sprintf will write: if "buf" is
+ * too small to hold the NUL, or the NUL is exactly the last
+ * byte, the read will look like it got truncated by one byte.
+ * Since there is no way to ask sprintf nicely to not write
+ * the NUL, we have to use a bounce buffer.
+ */
+ wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n",
+ kallsyms_show_value(file->f_cred)
+ ? (void *)sattr->address : NULL);
+ count = min(count, wrote);
+ memcpy(buf, bounce, count);
+
+ return count;
+}
+
+static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
+{
+ unsigned int section;
+
+ for (section = 0; section < sect_attrs->nsections; section++)
+ kfree(sect_attrs->attrs[section].battr.attr.name);
+ kfree(sect_attrs);
+}
+
+static void add_sect_attrs(struct module *mod, const struct load_info *info)
+{
+ unsigned int nloaded = 0, i, size[2];
+ struct module_sect_attrs *sect_attrs;
+ struct module_sect_attr *sattr;
+ struct bin_attribute **gattr;
+
+ /* Count loaded sections and allocate structures */
+ for (i = 0; i < info->hdr->e_shnum; i++)
+ if (!sect_empty(&info->sechdrs[i]))
+ nloaded++;
+ size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
+ sizeof(sect_attrs->grp.bin_attrs[0]));
+ size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
+ sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
+ if (!sect_attrs)
+ return;
+
+ /* Setup section attributes. */
+ sect_attrs->grp.name = "sections";
+ sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
+
+ sect_attrs->nsections = 0;
+ sattr = &sect_attrs->attrs[0];
+ gattr = &sect_attrs->grp.bin_attrs[0];
+ for (i = 0; i < info->hdr->e_shnum; i++) {
+ Elf_Shdr *sec = &info->sechdrs[i];
+
+ if (sect_empty(sec))
+ continue;
+ sysfs_bin_attr_init(&sattr->battr);
+ sattr->address = sec->sh_addr;
+ sattr->battr.attr.name =
+ kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
+ if (!sattr->battr.attr.name)
+ goto out;
+ sect_attrs->nsections++;
+ sattr->battr.read = module_sect_read;
+ sattr->battr.size = MODULE_SECT_READ_SIZE;
+ sattr->battr.attr.mode = 0400;
+ *(gattr++) = &(sattr++)->battr;
+ }
+ *gattr = NULL;
+
+ if (sysfs_create_group(&mod->mkobj.kobj, &sect_attrs->grp))
+ goto out;
+
+ mod->sect_attrs = sect_attrs;
+ return;
+out:
+ free_sect_attrs(sect_attrs);
+}
+
+static void remove_sect_attrs(struct module *mod)
+{
+ if (mod->sect_attrs) {
+ sysfs_remove_group(&mod->mkobj.kobj,
+ &mod->sect_attrs->grp);
+ /*
+ * We are positive that no one is using any sect attrs
+ * at this point. Deallocate immediately.
+ */
+ free_sect_attrs(mod->sect_attrs);
+ mod->sect_attrs = NULL;
+ }
+}
+
+/*
+ * /sys/module/foo/notes/.section.name gives contents of SHT_NOTE sections.
+ */
+
+struct module_notes_attrs {
+ struct kobject *dir;
+ unsigned int notes;
+ struct bin_attribute attrs[];
+};
+
+static ssize_t module_notes_read(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *bin_attr,
+ char *buf, loff_t pos, size_t count)
+{
+ /*
+ * The caller checked the pos and count against our size.
+ */
+ memcpy(buf, bin_attr->private + pos, count);
+ return count;
+}
+
+static void free_notes_attrs(struct module_notes_attrs *notes_attrs,
+ unsigned int i)
+{
+ if (notes_attrs->dir) {
+ while (i-- > 0)
+ sysfs_remove_bin_file(notes_attrs->dir,
+ &notes_attrs->attrs[i]);
+ kobject_put(notes_attrs->dir);
+ }
+ kfree(notes_attrs);
+}
+
+static void add_notes_attrs(struct module *mod, const struct load_info *info)
+{
+ unsigned int notes, loaded, i;
+ struct module_notes_attrs *notes_attrs;
+ struct bin_attribute *nattr;
+
+ /* failed to create section attributes, so can't create notes */
+ if (!mod->sect_attrs)
+ return;
+
+ /* Count notes sections and allocate structures. */
+ notes = 0;
+ for (i = 0; i < info->hdr->e_shnum; i++)
+ if (!sect_empty(&info->sechdrs[i]) &&
+ info->sechdrs[i].sh_type == SHT_NOTE)
+ ++notes;
+
+ if (notes == 0)
+ return;
+
+ notes_attrs = kzalloc(struct_size(notes_attrs, attrs, notes),
+ GFP_KERNEL);
+ if (!notes_attrs)
+ return;
+
+ notes_attrs->notes = notes;
+ nattr = &notes_attrs->attrs[0];
+ for (loaded = i = 0; i < info->hdr->e_shnum; ++i) {
+ if (sect_empty(&info->sechdrs[i]))
+ continue;
+ if (info->sechdrs[i].sh_type == SHT_NOTE) {
+ sysfs_bin_attr_init(nattr);
+ nattr->attr.name = mod->sect_attrs->attrs[loaded].battr.attr.name;
+ nattr->attr.mode = 0444;
+ nattr->size = info->sechdrs[i].sh_size;
+ nattr->private = (void *)info->sechdrs[i].sh_addr;
+ nattr->read = module_notes_read;
+ ++nattr;
+ }
+ ++loaded;
+ }
+
+ notes_attrs->dir = kobject_create_and_add("notes", &mod->mkobj.kobj);
+ if (!notes_attrs->dir)
+ goto out;
+
+ for (i = 0; i < notes; ++i)
+ if (sysfs_create_bin_file(notes_attrs->dir,
+ &notes_attrs->attrs[i]))
+ goto out;
+
+ mod->notes_attrs = notes_attrs;
+ return;
+
+out:
+ free_notes_attrs(notes_attrs, i);
+}
+
+static void remove_notes_attrs(struct module *mod)
+{
+ if (mod->notes_attrs)
+ free_notes_attrs(mod->notes_attrs, mod->notes_attrs->notes);
+}
+
+#else /* !CONFIG_KALLSYMS */
+static inline void add_sect_attrs(struct module *mod, const struct load_info *info) { }
+static inline void remove_sect_attrs(struct module *mod) { }
+static inline void add_notes_attrs(struct module *mod, const struct load_info *info) { }
+static inline void remove_notes_attrs(struct module *mod) { }
+#endif /* CONFIG_KALLSYMS */
+
+static void del_usage_links(struct module *mod)
+{
+#ifdef CONFIG_MODULE_UNLOAD
+ struct module_use *use;
+
+ mutex_lock(&module_mutex);
+ list_for_each_entry(use, &mod->target_list, target_list)
+ sysfs_remove_link(use->target->holders_dir, mod->name);
+ mutex_unlock(&module_mutex);
+#endif
+}
+
+static int add_usage_links(struct module *mod)
+{
+ int ret = 0;
+#ifdef CONFIG_MODULE_UNLOAD
+ struct module_use *use;
+
+ mutex_lock(&module_mutex);
+ list_for_each_entry(use, &mod->target_list, target_list) {
+ ret = sysfs_create_link(use->target->holders_dir,
+ &mod->mkobj.kobj, mod->name);
+ if (ret)
+ break;
+ }
+ mutex_unlock(&module_mutex);
+ if (ret)
+ del_usage_links(mod);
+#endif
+ return ret;
+}
+
+static void module_remove_modinfo_attrs(struct module *mod, int end)
+{
+ struct module_attribute *attr;
+ int i;
+
+ for (i = 0; (attr = &mod->modinfo_attrs[i]); i++) {
+ if (end >= 0 && i > end)
+ break;
+ /* pick a field to test for end of list */
+ if (!attr->attr.name)
+ break;
+ sysfs_remove_file(&mod->mkobj.kobj, &attr->attr);
+ if (attr->free)
+ attr->free(mod);
+ }
+ kfree(mod->modinfo_attrs);
+}
+
+static int module_add_modinfo_attrs(struct module *mod)
+{
+ struct module_attribute *attr;
+ struct module_attribute *temp_attr;
+ int error = 0;
+ int i;
+
+ mod->modinfo_attrs = kzalloc((sizeof(struct module_attribute) *
+ (modinfo_attrs_count + 1)),
+ GFP_KERNEL);
+ if (!mod->modinfo_attrs)
+ return -ENOMEM;
+
+ temp_attr = mod->modinfo_attrs;
+ for (i = 0; (attr = modinfo_attrs[i]); i++) {
+ if (!attr->test || attr->test(mod)) {
+ memcpy(temp_attr, attr, sizeof(*temp_attr));
+ sysfs_attr_init(&temp_attr->attr);
+ error = sysfs_create_file(&mod->mkobj.kobj,
+ &temp_attr->attr);
+ if (error)
+ goto error_out;
+ ++temp_attr;
+ }
+ }
+
+ return 0;
+
+error_out:
+ if (i > 0)
+ module_remove_modinfo_attrs(mod, --i);
+ else
+ kfree(mod->modinfo_attrs);
+ return error;
+}
+
+static void mod_kobject_put(struct module *mod)
+{
+ DECLARE_COMPLETION_ONSTACK(c);
+
+ mod->mkobj.kobj_completion = &c;
+ kobject_put(&mod->mkobj.kobj);
+ wait_for_completion(&c);
+}
+
+static int mod_sysfs_init(struct module *mod)
+{
+ int err;
+ struct kobject *kobj;
+
+ if (!module_sysfs_initialized) {
+ pr_err("%s: module sysfs not initialized\n", mod->name);
+ err = -EINVAL;
+ goto out;
+ }
+
+ kobj = kset_find_obj(module_kset, mod->name);
+ if (kobj) {
+ pr_err("%s: module is already loaded\n", mod->name);
+ kobject_put(kobj);
+ err = -EINVAL;
+ goto out;
+ }
+
+ mod->mkobj.mod = mod;
+
+ memset(&mod->mkobj.kobj, 0, sizeof(mod->mkobj.kobj));
+ mod->mkobj.kobj.kset = module_kset;
+ err = kobject_init_and_add(&mod->mkobj.kobj, &module_ktype, NULL,
+ "%s", mod->name);
+ if (err)
+ mod_kobject_put(mod);
+
+out:
+ return err;
+}
+
+int mod_sysfs_setup(struct module *mod,
+ const struct load_info *info,
+ struct kernel_param *kparam,
+ unsigned int num_params)
+{
+ int err;
+
+ err = mod_sysfs_init(mod);
+ if (err)
+ goto out;
+
+ mod->holders_dir = kobject_create_and_add("holders", &mod->mkobj.kobj);
+ if (!mod->holders_dir) {
+ err = -ENOMEM;
+ goto out_unreg;
+ }
+
+ err = module_param_sysfs_setup(mod, kparam, num_params);
+ if (err)
+ goto out_unreg_holders;
+
+ err = module_add_modinfo_attrs(mod);
+ if (err)
+ goto out_unreg_param;
+
+ err = add_usage_links(mod);
+ if (err)
+ goto out_unreg_modinfo_attrs;
+
+ add_sect_attrs(mod, info);
+ add_notes_attrs(mod, info);
+
+ return 0;
+
+out_unreg_modinfo_attrs:
+ module_remove_modinfo_attrs(mod, -1);
+out_unreg_param:
+ module_param_sysfs_remove(mod);
+out_unreg_holders:
+ kobject_put(mod->holders_dir);
+out_unreg:
+ mod_kobject_put(mod);
+out:
+ return err;
+}
+
+static void mod_sysfs_fini(struct module *mod)
+{
+ remove_notes_attrs(mod);
+ remove_sect_attrs(mod);
+ mod_kobject_put(mod);
+}
+
+void mod_sysfs_teardown(struct module *mod)
+{
+ del_usage_links(mod);
+ module_remove_modinfo_attrs(mod, -1);
+ module_param_sysfs_remove(mod);
+ kobject_put(mod->mkobj.drivers_dir);
+ kobject_put(mod->holders_dir);
+ mod_sysfs_fini(mod);
+}
+
+void init_param_lock(struct module *mod)
+{
+ mutex_init(&mod->param_lock);
+}
--
2.34.1
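
Editorial note: mod_sysfs_setup() above unwinds with a ladder of goto labels, where each label undoes one earlier step and falls through to the labels below it. The following is a minimal user-space sketch of that pattern, not the kernel code itself. The names setup_state, do_step(), undo_step() and setup_all() are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kobject, holders dir and param steps. */
enum { STEP_KOBJ, STEP_HOLDERS, STEP_PARAMS, STEP_COUNT };

struct setup_state {
	bool done[STEP_COUNT];
};

/* Perform one step; fail_at names the step that should fail (or -1). */
static int do_step(struct setup_state *s, int step, int fail_at)
{
	if (step == fail_at)
		return -1;
	s->done[step] = true;
	return 0;
}

static void undo_step(struct setup_state *s, int step)
{
	s->done[step] = false;
}

/*
 * Each failure jumps to the label that undoes everything completed
 * so far, in reverse order, mirroring the out_unreg_* labels above.
 */
int setup_all(struct setup_state *s, int fail_at)
{
	int err;

	err = do_step(s, STEP_KOBJ, fail_at);
	if (err)
		goto out;

	err = do_step(s, STEP_HOLDERS, fail_at);
	if (err)
		goto out_unreg;

	err = do_step(s, STEP_PARAMS, fail_at);
	if (err)
		goto out_unreg_holders;

	return 0;

out_unreg_holders:
	undo_step(s, STEP_HOLDERS);
out_unreg:
	undo_step(s, STEP_KOBJ);
out:
	return err;
}
```

Calling setup_all(&s, STEP_PARAMS) on a zeroed state returns -1 and leaves every done[] flag false, which is the property the label ordering is meant to guarantee.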

2022-02-22 15:52:07

by Aaron Tomlin

Subject: [PATCH v8 05/13] module: Move latched RB-tree support to a separate file

No functional change.

This patch migrates module latched RB-tree support
(e.g. see __module_address()) from core module code
into kernel/module/tree_lookup.c.

Signed-off-by: Aaron Tomlin <[email protected]>
---
kernel/module/Makefile | 1 +
kernel/module/internal.h | 33 +++++++++
kernel/module/main.c | 130 ++----------------------------------
kernel/module/tree_lookup.c | 109 ++++++++++++++++++++++++++++++
4 files changed, 147 insertions(+), 126 deletions(-)
create mode 100644 kernel/module/tree_lookup.c

diff --git a/kernel/module/Makefile b/kernel/module/Makefile
index ed3aacb04f17..88774e386276 100644
--- a/kernel/module/Makefile
+++ b/kernel/module/Makefile
@@ -11,3 +11,4 @@ obj-y += main.o
obj-$(CONFIG_MODULE_DECOMPRESS) += decompress.o
obj-$(CONFIG_MODULE_SIG) += signing.o
obj-$(CONFIG_LIVEPATCH) += livepatch.o
+obj-$(CONFIG_MODULES_TREE_LOOKUP) += tree_lookup.o
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index ad7a444253ed..f1682e3677be 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -9,6 +9,7 @@
#include <linux/compiler.h>
#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/rculist.h>

#ifndef ARCH_SHF_SMALL
#define ARCH_SHF_SMALL 0
@@ -93,3 +94,35 @@ static inline void module_decompress_cleanup(struct load_info *info)
{
}
#endif
+
+#ifdef CONFIG_MODULES_TREE_LOOKUP
+struct mod_tree_root {
+ struct latch_tree_root root;
+ unsigned long addr_min;
+ unsigned long addr_max;
+};
+
+extern struct mod_tree_root mod_tree;
+
+void mod_tree_insert(struct module *mod);
+void mod_tree_remove_init(struct module *mod);
+void mod_tree_remove(struct module *mod);
+struct module *mod_find(unsigned long addr);
+#else /* !CONFIG_MODULES_TREE_LOOKUP */
+
+static inline void mod_tree_insert(struct module *mod) { }
+static inline void mod_tree_remove_init(struct module *mod) { }
+static inline void mod_tree_remove(struct module *mod) { }
+static inline struct module *mod_find(unsigned long addr)
+{
+ struct module *mod;
+
+ list_for_each_entry_rcu(mod, &modules, list,
+ lockdep_is_held(&module_mutex)) {
+ if (within_module(addr, mod))
+ return mod;
+ }
+
+ return NULL;
+}
+#endif /* CONFIG_MODULES_TREE_LOOKUP */
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 3596ebf3a6c3..76b53880ad91 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -90,138 +90,16 @@ static DECLARE_WORK(init_free_wq, do_free_init);
static LLIST_HEAD(init_free_list);

#ifdef CONFIG_MODULES_TREE_LOOKUP
-
-/*
- * Use a latched RB-tree for __module_address(); this allows us to use
- * RCU-sched lookups of the address from any context.
- *
- * This is conditional on PERF_EVENTS || TRACING because those can really hit
- * __module_address() hard by doing a lot of stack unwinding; potentially from
- * NMI context.
- */
-
-static __always_inline unsigned long __mod_tree_val(struct latch_tree_node *n)
-{
- struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
-
- return (unsigned long)layout->base;
-}
-
-static __always_inline unsigned long __mod_tree_size(struct latch_tree_node *n)
-{
- struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
-
- return (unsigned long)layout->size;
-}
-
-static __always_inline bool
-mod_tree_less(struct latch_tree_node *a, struct latch_tree_node *b)
-{
- return __mod_tree_val(a) < __mod_tree_val(b);
-}
-
-static __always_inline int
-mod_tree_comp(void *key, struct latch_tree_node *n)
-{
- unsigned long val = (unsigned long)key;
- unsigned long start, end;
-
- start = __mod_tree_val(n);
- if (val < start)
- return -1;
-
- end = start + __mod_tree_size(n);
- if (val >= end)
- return 1;
-
- return 0;
-}
-
-static const struct latch_tree_ops mod_tree_ops = {
- .less = mod_tree_less,
- .comp = mod_tree_comp,
-};
-
-static struct mod_tree_root {
- struct latch_tree_root root;
- unsigned long addr_min;
- unsigned long addr_max;
-} mod_tree __cacheline_aligned = {
+struct mod_tree_root mod_tree __cacheline_aligned = {
.addr_min = -1UL,
};

#define module_addr_min mod_tree.addr_min
#define module_addr_max mod_tree.addr_max

-static noinline void __mod_tree_insert(struct mod_tree_node *node)
-{
- latch_tree_insert(&node->node, &mod_tree.root, &mod_tree_ops);
-}
-
-static void __mod_tree_remove(struct mod_tree_node *node)
-{
- latch_tree_erase(&node->node, &mod_tree.root, &mod_tree_ops);
-}
-
-/*
- * These modifications: insert, remove_init and remove; are serialized by the
- * module_mutex.
- */
-static void mod_tree_insert(struct module *mod)
-{
- mod->core_layout.mtn.mod = mod;
- mod->init_layout.mtn.mod = mod;
-
- __mod_tree_insert(&mod->core_layout.mtn);
- if (mod->init_layout.size)
- __mod_tree_insert(&mod->init_layout.mtn);
-}
-
-static void mod_tree_remove_init(struct module *mod)
-{
- if (mod->init_layout.size)
- __mod_tree_remove(&mod->init_layout.mtn);
-}
-
-static void mod_tree_remove(struct module *mod)
-{
- __mod_tree_remove(&mod->core_layout.mtn);
- mod_tree_remove_init(mod);
-}
-
-static struct module *mod_find(unsigned long addr)
-{
- struct latch_tree_node *ltn;
-
- ltn = latch_tree_find((void *)addr, &mod_tree.root, &mod_tree_ops);
- if (!ltn)
- return NULL;
-
- return container_of(ltn, struct mod_tree_node, node)->mod;
-}
-
-#else /* MODULES_TREE_LOOKUP */
-
-static unsigned long module_addr_min = -1UL, module_addr_max = 0;
-
-static void mod_tree_insert(struct module *mod) { }
-static void mod_tree_remove_init(struct module *mod) { }
-static void mod_tree_remove(struct module *mod) { }
-
-static struct module *mod_find(unsigned long addr)
-{
- struct module *mod;
-
- list_for_each_entry_rcu(mod, &modules, list,
- lockdep_is_held(&module_mutex)) {
- if (within_module(addr, mod))
- return mod;
- }
-
- return NULL;
-}
-
-#endif /* MODULES_TREE_LOOKUP */
+#else /* !CONFIG_MODULES_TREE_LOOKUP */
+static unsigned long module_addr_min = -1UL, module_addr_max;
+#endif /* CONFIG_MODULES_TREE_LOOKUP */

/*
* Bounds of module text, for speeding up __module_address.
diff --git a/kernel/module/tree_lookup.c b/kernel/module/tree_lookup.c
new file mode 100644
index 000000000000..0bc4ec3b22ce
--- /dev/null
+++ b/kernel/module/tree_lookup.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Modules tree lookup
+ *
+ * Copyright (C) 2015 Peter Zijlstra
+ * Copyright (C) 2015 Rusty Russell
+ */
+
+#include <linux/module.h>
+#include <linux/rbtree_latch.h>
+#include "internal.h"
+
+/*
+ * Use a latched RB-tree for __module_address(); this allows us to use
+ * RCU-sched lookups of the address from any context.
+ *
+ * This is conditional on PERF_EVENTS || TRACING because those can really hit
+ * __module_address() hard by doing a lot of stack unwinding; potentially from
+ * NMI context.
+ */
+
+static __always_inline unsigned long __mod_tree_val(struct latch_tree_node *n)
+{
+ struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
+
+ return (unsigned long)layout->base;
+}
+
+static __always_inline unsigned long __mod_tree_size(struct latch_tree_node *n)
+{
+ struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
+
+ return (unsigned long)layout->size;
+}
+
+static __always_inline bool
+mod_tree_less(struct latch_tree_node *a, struct latch_tree_node *b)
+{
+ return __mod_tree_val(a) < __mod_tree_val(b);
+}
+
+static __always_inline int
+mod_tree_comp(void *key, struct latch_tree_node *n)
+{
+ unsigned long val = (unsigned long)key;
+ unsigned long start, end;
+
+ start = __mod_tree_val(n);
+ if (val < start)
+ return -1;
+
+ end = start + __mod_tree_size(n);
+ if (val >= end)
+ return 1;
+
+ return 0;
+}
+
+static const struct latch_tree_ops mod_tree_ops = {
+ .less = mod_tree_less,
+ .comp = mod_tree_comp,
+};
+
+static noinline void __mod_tree_insert(struct mod_tree_node *node)
+{
+ latch_tree_insert(&node->node, &mod_tree.root, &mod_tree_ops);
+}
+
+static void __mod_tree_remove(struct mod_tree_node *node)
+{
+ latch_tree_erase(&node->node, &mod_tree.root, &mod_tree_ops);
+}
+
+/*
+ * These modifications: insert, remove_init and remove; are serialized by the
+ * module_mutex.
+ */
+void mod_tree_insert(struct module *mod)
+{
+ mod->core_layout.mtn.mod = mod;
+ mod->init_layout.mtn.mod = mod;
+
+ __mod_tree_insert(&mod->core_layout.mtn);
+ if (mod->init_layout.size)
+ __mod_tree_insert(&mod->init_layout.mtn);
+}
+
+void mod_tree_remove_init(struct module *mod)
+{
+ if (mod->init_layout.size)
+ __mod_tree_remove(&mod->init_layout.mtn);
+}
+
+void mod_tree_remove(struct module *mod)
+{
+ __mod_tree_remove(&mod->core_layout.mtn);
+ mod_tree_remove_init(mod);
+}
+
+struct module *mod_find(unsigned long addr)
+{
+ struct latch_tree_node *ltn;
+
+ ltn = latch_tree_find((void *)addr, &mod_tree.root, &mod_tree_ops);
+ if (!ltn)
+ return NULL;
+
+ return container_of(ltn, struct mod_tree_node, node)->mod;
+}
--
2.34.1
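
Editorial note: mod_tree_comp() above returns -1, 0 or 1 depending on whether the address lies below, inside or above a layout's [base, base + size) range. The sketch below applies the same three-way comparison to a sorted array with a binary search. This is an assumption made for illustration only: the kernel uses a latched RB-tree so that lookups are safe from any context, which a plain array does not provide. The names range, range_comp() and range_find() are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a module layout: a half-open address range. */
struct range {
	unsigned long base;
	unsigned long size;
	const char *name;
};

/* Same contract as mod_tree_comp(): -1 below, 1 at or past the end, 0 inside. */
static int range_comp(unsigned long val, const struct range *r)
{
	if (val < r->base)
		return -1;
	if (val >= r->base + r->size)
		return 1;
	return 0;
}

/*
 * Binary search over ranges sorted by base and assumed not to overlap.
 * Returns the containing range, or NULL if addr falls in a gap.
 */
const struct range *range_find(const struct range *tab, size_t n,
			       unsigned long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int c = range_comp(addr, &tab[mid]);

		if (c == 0)
			return &tab[mid];
		if (c < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;
}
```

Because the end of each range is exclusive, an address equal to base + size is reported as outside the range, matching the val >= end test in the patch.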

2022-02-22 15:56:12

by Aaron Tomlin

Subject: [PATCH v8 10/13] module: Move procfs support into a separate file

No functional change.

This patch migrates the code that generates the list of
loaded or linked modules via /proc, when procfs support
is enabled, into kernel/module/procfs.c.

Signed-off-by: Aaron Tomlin <[email protected]>
---
kernel/module/Makefile | 1 +
kernel/module/internal.h | 1 +
kernel/module/main.c | 131 +-----------------------------------
kernel/module/procfs.c | 142 +++++++++++++++++++++++++++++++++++++++
4 files changed, 145 insertions(+), 130 deletions(-)
create mode 100644 kernel/module/procfs.c

diff --git a/kernel/module/Makefile b/kernel/module/Makefile
index 9901bed3ab5b..94296c98a67f 100644
--- a/kernel/module/Makefile
+++ b/kernel/module/Makefile
@@ -15,3 +15,4 @@ obj-$(CONFIG_MODULES_TREE_LOOKUP) += tree_lookup.o
obj-$(CONFIG_STRICT_MODULE_RWX) += strict_rwx.o
obj-$(CONFIG_DEBUG_KMEMLEAK) += debug_kmemleak.o
obj-$(CONFIG_KALLSYMS) += kallsyms.o
+obj-$(CONFIG_PROC_FS) += procfs.o
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index 44ca05b9eb8f..6af40c2d145f 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -72,6 +72,7 @@ struct module *find_module_all(const char *name, size_t len, bool even_unformed)
int cmp_name(const void *name, const void *sym);
long module_get_offset(struct module *mod, unsigned int *size, Elf_Shdr *sechdr,
unsigned int section);
+char *module_flags(struct module *mod, char *buf);

static inline unsigned long kernel_symbol_value(const struct kernel_symbol *sym)
{
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 952079987ea4..44b6fd1acc44 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -22,7 +22,6 @@
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/elf.h>
-#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/syscalls.h>
#include <linux/fcntl.h>
@@ -805,31 +804,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
return ret;
}

-static inline void print_unload_info(struct seq_file *m, struct module *mod)
-{
- struct module_use *use;
- int printed_something = 0;
-
- seq_printf(m, " %i ", module_refcount(mod));
-
- /*
- * Always include a trailing , so userspace can differentiate
- * between this and the old multi-field proc format.
- */
- list_for_each_entry(use, &mod->source_list, source_list) {
- printed_something = 1;
- seq_printf(m, "%s,", use->source->name);
- }
-
- if (mod->init != NULL && mod->exit == NULL) {
- printed_something = 1;
- seq_puts(m, "[permanent],");
- }
-
- if (!printed_something)
- seq_puts(m, "-");
-}
-
void __symbol_put(const char *symbol)
{
struct find_symbol_arg fsa = {
@@ -919,12 +893,6 @@ void module_put(struct module *module)
EXPORT_SYMBOL(module_put);

#else /* !CONFIG_MODULE_UNLOAD */
-static inline void print_unload_info(struct seq_file *m, struct module *mod)
-{
- /* We don't know the usage count, or what modules are using. */
- seq_puts(m, " - -");
-}
-
static inline void module_unload_free(struct module *mod)
{
}
@@ -3596,7 +3564,7 @@ static void cfi_cleanup(struct module *mod)
}

/* Keep in sync with MODULE_FLAGS_BUF_SIZE !!! */
-static char *module_flags(struct module *mod, char *buf)
+char *module_flags(struct module *mod, char *buf)
{
int bx = 0;

@@ -3619,103 +3587,6 @@ static char *module_flags(struct module *mod, char *buf)
return buf;
}

-#ifdef CONFIG_PROC_FS
-/* Called by the /proc file system to return a list of modules. */
-static void *m_start(struct seq_file *m, loff_t *pos)
-{
- mutex_lock(&module_mutex);
- return seq_list_start(&modules, *pos);
-}
-
-static void *m_next(struct seq_file *m, void *p, loff_t *pos)
-{
- return seq_list_next(p, &modules, pos);
-}
-
-static void m_stop(struct seq_file *m, void *p)
-{
- mutex_unlock(&module_mutex);
-}
-
-static int m_show(struct seq_file *m, void *p)
-{
- struct module *mod = list_entry(p, struct module, list);
- char buf[MODULE_FLAGS_BUF_SIZE];
- void *value;
-
- /* We always ignore unformed modules. */
- if (mod->state == MODULE_STATE_UNFORMED)
- return 0;
-
- seq_printf(m, "%s %u",
- mod->name, mod->init_layout.size + mod->core_layout.size);
- print_unload_info(m, mod);
-
- /* Informative for users. */
- seq_printf(m, " %s",
- mod->state == MODULE_STATE_GOING ? "Unloading" :
- mod->state == MODULE_STATE_COMING ? "Loading" :
- "Live");
- /* Used by oprofile and other similar tools. */
- value = m->private ? NULL : mod->core_layout.base;
- seq_printf(m, " 0x%px", value);
-
- /* Taints info */
- if (mod->taints)
- seq_printf(m, " %s", module_flags(mod, buf));
-
- seq_puts(m, "\n");
- return 0;
-}
-
-/*
- * Format: modulename size refcount deps address
- *
- * Where refcount is a number or -, and deps is a comma-separated list
- * of depends or -.
- */
-static const struct seq_operations modules_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = m_show
-};
-
-/*
- * This also sets the "private" pointer to non-NULL if the
- * kernel pointers should be hidden (so you can just test
- * "m->private" to see if you should keep the values private).
- *
- * We use the same logic as for /proc/kallsyms.
- */
-static int modules_open(struct inode *inode, struct file *file)
-{
- int err = seq_open(file, &modules_op);
-
- if (!err) {
- struct seq_file *m = file->private_data;
- m->private = kallsyms_show_value(file->f_cred) ? NULL : (void *)8ul;
- }
-
- return err;
-}
-
-static const struct proc_ops modules_proc_ops = {
- .proc_flags = PROC_ENTRY_PERMANENT,
- .proc_open = modules_open,
- .proc_read = seq_read,
- .proc_lseek = seq_lseek,
- .proc_release = seq_release,
-};
-
-static int __init proc_modules_init(void)
-{
- proc_create("modules", 0, NULL, &modules_proc_ops);
- return 0;
-}
-module_init(proc_modules_init);
-#endif
-
/* Given an address, look for it in the module exception tables. */
const struct exception_table_entry *search_module_extables(unsigned long addr)
{
diff --git a/kernel/module/procfs.c b/kernel/module/procfs.c
new file mode 100644
index 000000000000..2717e130788e
--- /dev/null
+++ b/kernel/module/procfs.c
@@ -0,0 +1,142 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Module proc support
+ *
+ * Copyright (C) 2008 Alexey Dobriyan
+ */
+
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/mutex.h>
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>
+#include "internal.h"
+
+#ifdef CONFIG_MODULE_UNLOAD
+static inline void print_unload_info(struct seq_file *m, struct module *mod)
+{
+ struct module_use *use;
+ int printed_something = 0;
+
+ seq_printf(m, " %i ", module_refcount(mod));
+
+ /*
+ * Always include a trailing , so userspace can differentiate
+ * between this and the old multi-field proc format.
+ */
+ list_for_each_entry(use, &mod->source_list, source_list) {
+ printed_something = 1;
+ seq_printf(m, "%s,", use->source->name);
+ }
+
+ if (mod->init && !mod->exit) {
+ printed_something = 1;
+ seq_puts(m, "[permanent],");
+ }
+
+ if (!printed_something)
+ seq_puts(m, "-");
+}
+#else /* !CONFIG_MODULE_UNLOAD */
+static inline void print_unload_info(struct seq_file *m, struct module *mod)
+{
+ /* We don't know the usage count, or what modules are using. */
+ seq_puts(m, " - -");
+}
+#endif /* CONFIG_MODULE_UNLOAD */
+
+/* Called by the /proc file system to return a list of modules. */
+static void *m_start(struct seq_file *m, loff_t *pos)
+{
+ mutex_lock(&module_mutex);
+ return seq_list_start(&modules, *pos);
+}
+
+static void *m_next(struct seq_file *m, void *p, loff_t *pos)
+{
+ return seq_list_next(p, &modules, pos);
+}
+
+static void m_stop(struct seq_file *m, void *p)
+{
+ mutex_unlock(&module_mutex);
+}
+
+static int m_show(struct seq_file *m, void *p)
+{
+ struct module *mod = list_entry(p, struct module, list);
+ char buf[MODULE_FLAGS_BUF_SIZE];
+ void *value;
+
+ /* We always ignore unformed modules. */
+ if (mod->state == MODULE_STATE_UNFORMED)
+ return 0;
+
+ seq_printf(m, "%s %u",
+ mod->name, mod->init_layout.size + mod->core_layout.size);
+ print_unload_info(m, mod);
+
+ /* Informative for users. */
+ seq_printf(m, " %s",
+ mod->state == MODULE_STATE_GOING ? "Unloading" :
+ mod->state == MODULE_STATE_COMING ? "Loading" :
+ "Live");
+ /* Used by oprofile and other similar tools. */
+ value = m->private ? NULL : mod->core_layout.base;
+ seq_printf(m, " 0x%px", value);
+
+ /* Taints info */
+ if (mod->taints)
+ seq_printf(m, " %s", module_flags(mod, buf));
+
+ seq_puts(m, "\n");
+ return 0;
+}
+
+/*
+ * Format: modulename size refcount deps address
+ *
+ * Where refcount is a number or -, and deps is a comma-separated list
+ * of depends or -.
+ */
+static const struct seq_operations modules_op = {
+ .start = m_start,
+ .next = m_next,
+ .stop = m_stop,
+ .show = m_show
+};
+
+/*
+ * This also sets the "private" pointer to non-NULL if the
+ * kernel pointers should be hidden (so you can just test
+ * "m->private" to see if you should keep the values private).
+ *
+ * We use the same logic as for /proc/kallsyms.
+ */
+static int modules_open(struct inode *inode, struct file *file)
+{
+ int err = seq_open(file, &modules_op);
+
+ if (!err) {
+ struct seq_file *m = file->private_data;
+
+ m->private = kallsyms_show_value(file->f_cred) ? NULL : (void *)8ul;
+ }
+
+ return err;
+}
+
+static const struct proc_ops modules_proc_ops = {
+ .proc_flags = PROC_ENTRY_PERMANENT,
+ .proc_open = modules_open,
+ .proc_read = seq_read,
+ .proc_lseek = seq_lseek,
+ .proc_release = seq_release,
+};
+
+static int __init proc_modules_init(void)
+{
+ proc_create("modules", 0, NULL, &modules_proc_ops);
+ return 0;
+}
+module_init(proc_modules_init);
--
2.34.1
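
Editorial note: m_show() and print_unload_info() above together emit one /proc/modules line of the form "name size refcount deps state address". The following user-space sketch reproduces that layout for the CONFIG_MODULE_UNLOAD case, under several assumptions: the struct mod_info type and the append() and format_module_line() helpers are hypothetical, taint flags are left out, and the address is printed as 16 zero-padded hex digits to imitate what %px produces on a 64-bit kernel.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical user-space view of the fields m_show() prints. */
struct mod_info {
	const char *name;
	unsigned int size;
	int refcount;
	const char *const *users;	/* NULL-terminated list, may be NULL */
	int permanent;			/* has init but no exit */
	const char *state;		/* "Live", "Loading" or "Unloading" */
	unsigned long long base;
};

/* Append formatted text at offset off; never writes past len bytes. */
static size_t append(char *buf, size_t len, size_t off, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (off >= len)
		return off;
	va_start(ap, fmt);
	n = vsnprintf(buf + off, len - off, fmt, ap);
	va_end(ap);
	return n < 0 ? off : off + (size_t)n;
}

/* Returns the length the full line needs, excluding the trailing NUL. */
size_t format_module_line(char *buf, size_t len, const struct mod_info *m,
			  int hide_addr)
{
	size_t off = 0, i;
	int printed = 0;

	off = append(buf, len, off, "%s %u %i ", m->name, m->size, m->refcount);

	/* Every entry keeps its trailing comma, as in print_unload_info(). */
	for (i = 0; m->users && m->users[i]; i++) {
		off = append(buf, len, off, "%s,", m->users[i]);
		printed = 1;
	}
	if (m->permanent) {
		off = append(buf, len, off, "[permanent],");
		printed = 1;
	}
	if (!printed)
		off = append(buf, len, off, "-");

	/* A hidden address is printed as zero rather than omitted. */
	off = append(buf, len, off, " %s 0x%016llx\n", m->state,
		     hide_addr ? 0ULL : m->base);
	return off;
}
```

The hide_addr flag plays the role of the m->private test in m_show(): the column is always present, so parsers see the same number of fields whether or not the reader may see kernel addresses.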

2022-02-22 18:55:13

by Christophe Leroy

Subject: Re: [PATCH v8 07/13] module: Move extra signature support out of core code



Le 22/02/2022 à 15:12, Aaron Tomlin a écrit :
> No functional change.
>
> This patch migrates additional module signature check
> code from core module code into kernel/module/signing.c.
>
> Signed-off-by: Aaron Tomlin <[email protected]>

Reviewed-by: Christophe Leroy <[email protected]>

> ---
> kernel/module/internal.h | 9 +++++
> kernel/module/main.c | 87 ----------------------------------------
> kernel/module/signing.c | 77 +++++++++++++++++++++++++++++++++++
> 3 files changed, 86 insertions(+), 87 deletions(-)
>
> diff --git a/kernel/module/internal.h b/kernel/module/internal.h
> index a6895bb5598a..d6f646a5da41 100644
> --- a/kernel/module/internal.h
> +++ b/kernel/module/internal.h
> @@ -158,3 +158,12 @@ static inline int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
> return 0;
> }
> #endif /* CONFIG_STRICT_MODULE_RWX */
> +
> +#ifdef CONFIG_MODULE_SIG
> +int module_sig_check(struct load_info *info, int flags);
> +#else /* !CONFIG_MODULE_SIG */
> +static inline int module_sig_check(struct load_info *info, int flags)
> +{
> + return 0;
> +}
> +#endif /* !CONFIG_MODULE_SIG */
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 5cd63f14b1ef..c63e10c61694 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -23,7 +23,6 @@
> #include <linux/vmalloc.h>
> #include <linux/elf.h>
> #include <linux/proc_fs.h>
> -#include <linux/security.h>
> #include <linux/seq_file.h>
> #include <linux/syscalls.h>
> #include <linux/fcntl.h>
> @@ -127,28 +126,6 @@ static void module_assert_mutex_or_preempt(void)
> #endif
> }
>
> -#ifdef CONFIG_MODULE_SIG
> -static bool sig_enforce = IS_ENABLED(CONFIG_MODULE_SIG_FORCE);
> -module_param(sig_enforce, bool_enable_only, 0644);
> -
> -void set_module_sig_enforced(void)
> -{
> - sig_enforce = true;
> -}
> -#else
> -#define sig_enforce false
> -#endif
> -
> -/*
> - * Export sig_enforce kernel cmdline parameter to allow other subsystems rely
> - * on that instead of directly to CONFIG_MODULE_SIG_FORCE config.
> - */
> -bool is_module_sig_enforced(void)
> -{
> - return sig_enforce;
> -}
> -EXPORT_SYMBOL(is_module_sig_enforced);
> -
> /* Block module loading/unloading? */
> int modules_disabled = 0;
> core_param(nomodule, modules_disabled, bint, 0);
> @@ -2569,70 +2546,6 @@ static inline void kmemleak_load_module(const struct module *mod,
> }
> #endif
>
> -#ifdef CONFIG_MODULE_SIG
> -static int module_sig_check(struct load_info *info, int flags)
> -{
> - int err = -ENODATA;
> - const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
> - const char *reason;
> - const void *mod = info->hdr;
> - bool mangled_module = flags & (MODULE_INIT_IGNORE_MODVERSIONS |
> - MODULE_INIT_IGNORE_VERMAGIC);
> - /*
> - * Do not allow mangled modules as a module with version information
> - * removed is no longer the module that was signed.
> - */
> - if (!mangled_module &&
> - info->len > markerlen &&
> - memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) {
> - /* We truncate the module to discard the signature */
> - info->len -= markerlen;
> - err = mod_verify_sig(mod, info);
> - if (!err) {
> - info->sig_ok = true;
> - return 0;
> - }
> - }
> -
> - /*
> - * We don't permit modules to be loaded into the trusted kernels
> - * without a valid signature on them, but if we're not enforcing,
> - * certain errors are non-fatal.
> - */
> - switch (err) {
> - case -ENODATA:
> - reason = "unsigned module";
> - break;
> - case -ENOPKG:
> - reason = "module with unsupported crypto";
> - break;
> - case -ENOKEY:
> - reason = "module with unavailable key";
> - break;
> -
> - default:
> - /*
> - * All other errors are fatal, including lack of memory,
> - * unparseable signatures, and signature check failures --
> - * even if signatures aren't required.
> - */
> - return err;
> - }
> -
> - if (is_module_sig_enforced()) {
> - pr_notice("Loading of %s is rejected\n", reason);
> - return -EKEYREJECTED;
> - }
> -
> - return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
> -}
> -#else /* !CONFIG_MODULE_SIG */
> -static int module_sig_check(struct load_info *info, int flags)
> -{
> - return 0;
> -}
> -#endif /* !CONFIG_MODULE_SIG */
> -
> static int validate_section_offset(struct load_info *info, Elf_Shdr *shdr)
> {
> #if defined(CONFIG_64BIT)
> diff --git a/kernel/module/signing.c b/kernel/module/signing.c
> index 8aeb6d2ee94b..85c8999dfecf 100644
> --- a/kernel/module/signing.c
> +++ b/kernel/module/signing.c
> @@ -11,9 +11,29 @@
> #include <linux/module_signature.h>
> #include <linux/string.h>
> #include <linux/verification.h>
> +#include <linux/security.h>
> #include <crypto/public_key.h>
> +#include <uapi/linux/module.h>
> #include "internal.h"
>
> +static bool sig_enforce = IS_ENABLED(CONFIG_MODULE_SIG_FORCE);
> +module_param(sig_enforce, bool_enable_only, 0644);
> +
> +/*
> + * Export sig_enforce kernel cmdline parameter to allow other subsystems rely
> + * on that instead of directly to CONFIG_MODULE_SIG_FORCE config.
> + */
> +bool is_module_sig_enforced(void)
> +{
> + return sig_enforce;
> +}
> +EXPORT_SYMBOL(is_module_sig_enforced);
> +
> +void set_module_sig_enforced(void)
> +{
> + sig_enforce = true;
> +}
> +
> /*
> * Verify the signature on a module.
> */
> @@ -43,3 +63,60 @@ int mod_verify_sig(const void *mod, struct load_info *info)
> VERIFYING_MODULE_SIGNATURE,
> NULL, NULL);
> }
> +
> +int module_sig_check(struct load_info *info, int flags)
> +{
> + int err = -ENODATA;
> + const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
> + const char *reason;
> + const void *mod = info->hdr;
> + bool mangled_module = flags & (MODULE_INIT_IGNORE_MODVERSIONS |
> + MODULE_INIT_IGNORE_VERMAGIC);
> + /*
> + * Do not allow mangled modules as a module with version information
> + * removed is no longer the module that was signed.
> + */
> + if (!mangled_module &&
> + info->len > markerlen &&
> + memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) {
> + /* We truncate the module to discard the signature */
> + info->len -= markerlen;
> + err = mod_verify_sig(mod, info);
> + if (!err) {
> + info->sig_ok = true;
> + return 0;
> + }
> + }
> +
> + /*
> + * We don't permit modules to be loaded into the trusted kernels
> + * without a valid signature on them, but if we're not enforcing,
> + * certain errors are non-fatal.
> + */
> + switch (err) {
> + case -ENODATA:
> + reason = "unsigned module";
> + break;
> + case -ENOPKG:
> + reason = "module with unsupported crypto";
> + break;
> + case -ENOKEY:
> + reason = "module with unavailable key";
> + break;
> +
> + default:
> + /*
> + * All other errors are fatal, including lack of memory,
> + * unparseable signatures, and signature check failures --
> + * even if signatures aren't required.
> + */
> + return err;
> + }
> +
> + if (is_module_sig_enforced()) {
> + pr_notice("Loading of %s is rejected\n", reason);
> + return -EKEYREJECTED;
> + }
> +
> + return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
> +}

2022-02-22 18:55:40

by Christophe Leroy

Subject: Re: [PATCH v8 10/13] module: Move procfs support into a separate file



Le 22/02/2022 à 15:13, Aaron Tomlin a écrit :
> No functional change.
>
> This patch migrates the code that generates the list of
> loaded or linked modules via /proc, when procfs support
> is enabled, into kernel/module/procfs.c.
>
> Signed-off-by: Aaron Tomlin <[email protected]>

Reviewed-by: Christophe Leroy <[email protected]>

> ---
> kernel/module/Makefile | 1 +
> kernel/module/internal.h | 1 +
> kernel/module/main.c | 131 +-----------------------------------
> kernel/module/procfs.c | 142 +++++++++++++++++++++++++++++++++++++++
> 4 files changed, 145 insertions(+), 130 deletions(-)
> create mode 100644 kernel/module/procfs.c
>
> diff --git a/kernel/module/Makefile b/kernel/module/Makefile
> index 9901bed3ab5b..94296c98a67f 100644
> --- a/kernel/module/Makefile
> +++ b/kernel/module/Makefile
> @@ -15,3 +15,4 @@ obj-$(CONFIG_MODULES_TREE_LOOKUP) += tree_lookup.o
> obj-$(CONFIG_STRICT_MODULE_RWX) += strict_rwx.o
> obj-$(CONFIG_DEBUG_KMEMLEAK) += debug_kmemleak.o
> obj-$(CONFIG_KALLSYMS) += kallsyms.o
> +obj-$(CONFIG_PROC_FS) += procfs.o
> diff --git a/kernel/module/internal.h b/kernel/module/internal.h
> index 44ca05b9eb8f..6af40c2d145f 100644
> --- a/kernel/module/internal.h
> +++ b/kernel/module/internal.h
> @@ -72,6 +72,7 @@ struct module *find_module_all(const char *name, size_t len, bool even_unformed)
> int cmp_name(const void *name, const void *sym);
> long module_get_offset(struct module *mod, unsigned int *size, Elf_Shdr *sechdr,
> unsigned int section);
> +char *module_flags(struct module *mod, char *buf);
>
> static inline unsigned long kernel_symbol_value(const struct kernel_symbol *sym)
> {
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 952079987ea4..44b6fd1acc44 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -22,7 +22,6 @@
> #include <linux/slab.h>
> #include <linux/vmalloc.h>
> #include <linux/elf.h>
> -#include <linux/proc_fs.h>
> #include <linux/seq_file.h>
> #include <linux/syscalls.h>
> #include <linux/fcntl.h>
> @@ -805,31 +804,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> return ret;
> }
>
> -static inline void print_unload_info(struct seq_file *m, struct module *mod)
> -{
> - struct module_use *use;
> - int printed_something = 0;
> -
> - seq_printf(m, " %i ", module_refcount(mod));
> -
> - /*
> - * Always include a trailing , so userspace can differentiate
> - * between this and the old multi-field proc format.
> - */
> - list_for_each_entry(use, &mod->source_list, source_list) {
> - printed_something = 1;
> - seq_printf(m, "%s,", use->source->name);
> - }
> -
> - if (mod->init != NULL && mod->exit == NULL) {
> - printed_something = 1;
> - seq_puts(m, "[permanent],");
> - }
> -
> - if (!printed_something)
> - seq_puts(m, "-");
> -}
> -
> void __symbol_put(const char *symbol)
> {
> struct find_symbol_arg fsa = {
> @@ -919,12 +893,6 @@ void module_put(struct module *module)
> EXPORT_SYMBOL(module_put);
>
> #else /* !CONFIG_MODULE_UNLOAD */
> -static inline void print_unload_info(struct seq_file *m, struct module *mod)
> -{
> - /* We don't know the usage count, or what modules are using. */
> - seq_puts(m, " - -");
> -}
> -
> static inline void module_unload_free(struct module *mod)
> {
> }
> @@ -3596,7 +3564,7 @@ static void cfi_cleanup(struct module *mod)
> }
>
> /* Keep in sync with MODULE_FLAGS_BUF_SIZE !!! */
> -static char *module_flags(struct module *mod, char *buf)
> +char *module_flags(struct module *mod, char *buf)
> {
> int bx = 0;
>
> @@ -3619,103 +3587,6 @@ static char *module_flags(struct module *mod, char *buf)
> return buf;
> }
>
> -#ifdef CONFIG_PROC_FS
> -/* Called by the /proc file system to return a list of modules. */
> -static void *m_start(struct seq_file *m, loff_t *pos)
> -{
> - mutex_lock(&module_mutex);
> - return seq_list_start(&modules, *pos);
> -}
> -
> -static void *m_next(struct seq_file *m, void *p, loff_t *pos)
> -{
> - return seq_list_next(p, &modules, pos);
> -}
> -
> -static void m_stop(struct seq_file *m, void *p)
> -{
> - mutex_unlock(&module_mutex);
> -}
> -
> -static int m_show(struct seq_file *m, void *p)
> -{
> - struct module *mod = list_entry(p, struct module, list);
> - char buf[MODULE_FLAGS_BUF_SIZE];
> - void *value;
> -
> - /* We always ignore unformed modules. */
> - if (mod->state == MODULE_STATE_UNFORMED)
> - return 0;
> -
> - seq_printf(m, "%s %u",
> - mod->name, mod->init_layout.size + mod->core_layout.size);
> - print_unload_info(m, mod);
> -
> - /* Informative for users. */
> - seq_printf(m, " %s",
> - mod->state == MODULE_STATE_GOING ? "Unloading" :
> - mod->state == MODULE_STATE_COMING ? "Loading" :
> - "Live");
> - /* Used by oprofile and other similar tools. */
> - value = m->private ? NULL : mod->core_layout.base;
> - seq_printf(m, " 0x%px", value);
> -
> - /* Taints info */
> - if (mod->taints)
> - seq_printf(m, " %s", module_flags(mod, buf));
> -
> - seq_puts(m, "\n");
> - return 0;
> -}
> -
> -/*
> - * Format: modulename size refcount deps address
> - *
> - * Where refcount is a number or -, and deps is a comma-separated list
> - * of depends or -.
> - */
> -static const struct seq_operations modules_op = {
> - .start = m_start,
> - .next = m_next,
> - .stop = m_stop,
> - .show = m_show
> -};
> -
> -/*
> - * This also sets the "private" pointer to non-NULL if the
> - * kernel pointers should be hidden (so you can just test
> - * "m->private" to see if you should keep the values private).
> - *
> - * We use the same logic as for /proc/kallsyms.
> - */
> -static int modules_open(struct inode *inode, struct file *file)
> -{
> - int err = seq_open(file, &modules_op);
> -
> - if (!err) {
> - struct seq_file *m = file->private_data;
> - m->private = kallsyms_show_value(file->f_cred) ? NULL : (void *)8ul;
> - }
> -
> - return err;
> -}
> -
> -static const struct proc_ops modules_proc_ops = {
> - .proc_flags = PROC_ENTRY_PERMANENT,
> - .proc_open = modules_open,
> - .proc_read = seq_read,
> - .proc_lseek = seq_lseek,
> - .proc_release = seq_release,
> -};
> -
> -static int __init proc_modules_init(void)
> -{
> - proc_create("modules", 0, NULL, &modules_proc_ops);
> - return 0;
> -}
> -module_init(proc_modules_init);
> -#endif
> -
> /* Given an address, look for it in the module exception tables. */
> const struct exception_table_entry *search_module_extables(unsigned long addr)
> {
> diff --git a/kernel/module/procfs.c b/kernel/module/procfs.c
> new file mode 100644
> index 000000000000..2717e130788e
> --- /dev/null
> +++ b/kernel/module/procfs.c
> @@ -0,0 +1,142 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Module proc support
> + *
> + * Copyright (C) 2008 Alexey Dobriyan
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kallsyms.h>
> +#include <linux/mutex.h>
> +#include <linux/seq_file.h>
> +#include <linux/proc_fs.h>
> +#include "internal.h"
> +
> +#ifdef CONFIG_MODULE_UNLOAD
> +static inline void print_unload_info(struct seq_file *m, struct module *mod)
> +{
> + struct module_use *use;
> + int printed_something = 0;
> +
> + seq_printf(m, " %i ", module_refcount(mod));
> +
> + /*
> + * Always include a trailing , so userspace can differentiate
> + * between this and the old multi-field proc format.
> + */
> + list_for_each_entry(use, &mod->source_list, source_list) {
> + printed_something = 1;
> + seq_printf(m, "%s,", use->source->name);
> + }
> +
> + if (mod->init && !mod->exit) {
> + printed_something = 1;
> + seq_puts(m, "[permanent],");
> + }
> +
> + if (!printed_something)
> + seq_puts(m, "-");
> +}
> +#else /* !CONFIG_MODULE_UNLOAD */
> +static inline void print_unload_info(struct seq_file *m, struct module *mod)
> +{
> + /* We don't know the usage count, or what modules are using. */
> + seq_puts(m, " - -");
> +}
> +#endif /* CONFIG_MODULE_UNLOAD */
> +
> +/* Called by the /proc file system to return a list of modules. */
> +static void *m_start(struct seq_file *m, loff_t *pos)
> +{
> + mutex_lock(&module_mutex);
> + return seq_list_start(&modules, *pos);
> +}
> +
> +static void *m_next(struct seq_file *m, void *p, loff_t *pos)
> +{
> + return seq_list_next(p, &modules, pos);
> +}
> +
> +static void m_stop(struct seq_file *m, void *p)
> +{
> + mutex_unlock(&module_mutex);
> +}
> +
> +static int m_show(struct seq_file *m, void *p)
> +{
> + struct module *mod = list_entry(p, struct module, list);
> + char buf[MODULE_FLAGS_BUF_SIZE];
> + void *value;
> +
> + /* We always ignore unformed modules. */
> + if (mod->state == MODULE_STATE_UNFORMED)
> + return 0;
> +
> + seq_printf(m, "%s %u",
> + mod->name, mod->init_layout.size + mod->core_layout.size);
> + print_unload_info(m, mod);
> +
> + /* Informative for users. */
> + seq_printf(m, " %s",
> + mod->state == MODULE_STATE_GOING ? "Unloading" :
> + mod->state == MODULE_STATE_COMING ? "Loading" :
> + "Live");
> + /* Used by oprofile and other similar tools. */
> + value = m->private ? NULL : mod->core_layout.base;
> + seq_printf(m, " 0x%px", value);
> +
> + /* Taints info */
> + if (mod->taints)
> + seq_printf(m, " %s", module_flags(mod, buf));
> +
> + seq_puts(m, "\n");
> + return 0;
> +}
> +
> +/*
> + * Format: modulename size refcount deps address
> + *
> + * Where refcount is a number or -, and deps is a comma-separated list
> + * of depends or -.
> + */
> +static const struct seq_operations modules_op = {
> + .start = m_start,
> + .next = m_next,
> + .stop = m_stop,
> + .show = m_show
> +};
> +
> +/*
> + * This also sets the "private" pointer to non-NULL if the
> + * kernel pointers should be hidden (so you can just test
> + * "m->private" to see if you should keep the values private).
> + *
> + * We use the same logic as for /proc/kallsyms.
> + */
> +static int modules_open(struct inode *inode, struct file *file)
> +{
> + int err = seq_open(file, &modules_op);
> +
> + if (!err) {
> + struct seq_file *m = file->private_data;
> +
> + m->private = kallsyms_show_value(file->f_cred) ? NULL : (void *)8ul;
> + }
> +
> + return err;
> +}
> +
> +static const struct proc_ops modules_proc_ops = {
> + .proc_flags = PROC_ENTRY_PERMANENT,
> + .proc_open = modules_open,
> + .proc_read = seq_read,
> + .proc_lseek = seq_lseek,
> + .proc_release = seq_release,
> +};
> +
> +static int __init proc_modules_init(void)
> +{
> + proc_create("modules", 0, NULL, &modules_proc_ops);
> + return 0;
> +}
> +module_init(proc_modules_init);
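
For readers following the m_show() logic quoted above, a minimal userspace sketch of parsing one /proc/modules line may help. It assumes the field order emitted by the seq_printf() calls (name, size, refcount, deps, state, address, optional taints). The struct modules_line type and parse_modules_line() are hypothetical names, and the buffer sizes are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace view of one /proc/modules line. */
struct modules_line {
	char name[64];
	unsigned int size;	/* init_layout.size + core_layout.size */
	char refcount[16];	/* a number, or "-" without CONFIG_MODULE_UNLOAD */
	char deps[256];		/* "a,b,", "[permanent]," or "-" */
	char state[16];		/* "Live", "Loading" or "Unloading" */
	char address[32];	/* "0x..." as printed via %px */
	char taints[32];	/* left empty when no taints field is present */
};

/* Returns 0 on success, -1 if the six mandatory fields are not all present. */
static int parse_modules_line(const char *line, struct modules_line *out)
{
	int n;

	memset(out, 0, sizeof(*out));
	n = sscanf(line, "%63s %u %15s %255s %15s %31s %31s",
		   out->name, &out->size, out->refcount, out->deps,
		   out->state, out->address, out->taints);

	return n >= 6 ? 0 : -1;
}
```

The refcount is kept as a string because the !CONFIG_MODULE_UNLOAD variant of print_unload_info() prints " - -" rather than a number. A caller would typically read lines with fgets() from /proc/modules and pass each one to parse_modules_line().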

2022-02-23 00:50:38

by Christophe Leroy

Subject: Re: [PATCH v8 11/13] module: Move sysfs support into a separate file



On 22/02/2022 at 15:13, Aaron Tomlin wrote:
> No functional change.
>
> This patch migrates module sysfs support out of core code into
> kernel/module/sysfs.c. In addition, some simple code refactoring
> is done to make this possible.
>
> Signed-off-by: Aaron Tomlin <[email protected]>

Reviewed-by: Christophe Leroy <[email protected]>

> ---
> kernel/module/Makefile | 1 +
> kernel/module/internal.h | 21 ++
> kernel/module/main.c | 469 +--------------------------------------
> kernel/module/sysfs.c | 436 ++++++++++++++++++++++++++++++++++++
> 4 files changed, 461 insertions(+), 466 deletions(-)
> create mode 100644 kernel/module/sysfs.c
>
> diff --git a/kernel/module/Makefile b/kernel/module/Makefile
> index 94296c98a67f..cf8dcdc6b55f 100644
> --- a/kernel/module/Makefile
> +++ b/kernel/module/Makefile
> @@ -16,3 +16,4 @@ obj-$(CONFIG_STRICT_MODULE_RWX) += strict_rwx.o
> obj-$(CONFIG_DEBUG_KMEMLEAK) += debug_kmemleak.o
> obj-$(CONFIG_KALLSYMS) += kallsyms.o
> obj-$(CONFIG_PROC_FS) += procfs.o
> +obj-$(CONFIG_SYSFS) += sysfs.o
> diff --git a/kernel/module/internal.h b/kernel/module/internal.h
> index 6af40c2d145f..62d749ef695e 100644
> --- a/kernel/module/internal.h
> +++ b/kernel/module/internal.h
> @@ -34,6 +34,9 @@
> extern struct mutex module_mutex;
> extern struct list_head modules;
>
> +extern struct module_attribute *modinfo_attrs[];
> +extern size_t modinfo_attrs_count;
> +
> /* Provided by the linker */
> extern const struct kernel_symbol __start___ksymtab[];
> extern const struct kernel_symbol __stop___ksymtab[];
> @@ -204,3 +207,21 @@ static inline void init_build_id(struct module *mod, const struct load_info *inf
> static inline void layout_symtab(struct module *mod, struct load_info *info) { }
> static inline void add_kallsyms(struct module *mod, const struct load_info *info) { }
> #endif /* CONFIG_KALLSYMS */
> +
> +#ifdef CONFIG_SYSFS
> +int mod_sysfs_setup(struct module *mod, const struct load_info *info,
> + struct kernel_param *kparam, unsigned int num_params);
> +void mod_sysfs_teardown(struct module *mod);
> +void init_param_lock(struct module *mod);
> +#else /* !CONFIG_SYSFS */
> +static inline int mod_sysfs_setup(struct module *mod,
> + const struct load_info *info,
> + struct kernel_param *kparam,
> + unsigned int num_params)
> +{
> + return 0;
> +}
> +
> +static inline void mod_sysfs_teardown(struct module *mod) { }
> +static inline void init_param_lock(struct module *mod) { }
> +#endif /* CONFIG_SYSFS */
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 44b6fd1acc44..b8a59b5c3e3a 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -14,9 +14,7 @@
> #include <linux/init.h>
> #include <linux/kallsyms.h>
> #include <linux/buildid.h>
> -#include <linux/file.h>
> #include <linux/fs.h>
> -#include <linux/sysfs.h>
> #include <linux/kernel.h>
> #include <linux/kernel_read_file.h>
> #include <linux/slab.h>
> @@ -989,7 +987,7 @@ static ssize_t show_taint(struct module_attribute *mattr,
> static struct module_attribute modinfo_taint =
> __ATTR(taint, 0444, show_taint, NULL);
>
> -static struct module_attribute *modinfo_attrs[] = {
> +struct module_attribute *modinfo_attrs[] = {
> &module_uevent,
> &modinfo_version,
> &modinfo_srcversion,
> @@ -1003,6 +1001,8 @@ static struct module_attribute *modinfo_attrs[] = {
> NULL,
> };
>
> +size_t modinfo_attrs_count = ARRAY_SIZE(modinfo_attrs);
> +
> static const char vermagic[] = VERMAGIC_STRING;
>
> static int try_to_force_load(struct module *mod, const char *reason)
> @@ -1253,469 +1253,6 @@ resolve_symbol_wait(struct module *mod,
> return ksym;
> }
>
> -/*
> - * /sys/module/foo/sections stuff
> - * J. Corbet <[email protected]>
> - */
> -#ifdef CONFIG_SYSFS
> -
> -#ifdef CONFIG_KALLSYMS
> -struct module_sect_attr {
> - struct bin_attribute battr;
> - unsigned long address;
> -};
> -
> -struct module_sect_attrs {
> - struct attribute_group grp;
> - unsigned int nsections;
> - struct module_sect_attr attrs[];
> -};
> -
> -#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4))
> -static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
> - struct bin_attribute *battr,
> - char *buf, loff_t pos, size_t count)
> -{
> - struct module_sect_attr *sattr =
> - container_of(battr, struct module_sect_attr, battr);
> - char bounce[MODULE_SECT_READ_SIZE + 1];
> - size_t wrote;
> -
> - if (pos != 0)
> - return -EINVAL;
> -
> - /*
> - * Since we're a binary read handler, we must account for the
> - * trailing NUL byte that sprintf will write: if "buf" is
> - * too small to hold the NUL, or the NUL is exactly the last
> - * byte, the read will look like it got truncated by one byte.
> - * Since there is no way to ask sprintf nicely to not write
> - * the NUL, we have to use a bounce buffer.
> - */
> - wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n",
> - kallsyms_show_value(file->f_cred)
> - ? (void *)sattr->address : NULL);
> - count = min(count, wrote);
> - memcpy(buf, bounce, count);
> -
> - return count;
> -}
> -
> -static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
> -{
> - unsigned int section;
> -
> - for (section = 0; section < sect_attrs->nsections; section++)
> - kfree(sect_attrs->attrs[section].battr.attr.name);
> - kfree(sect_attrs);
> -}
> -
> -static void add_sect_attrs(struct module *mod, const struct load_info *info)
> -{
> - unsigned int nloaded = 0, i, size[2];
> - struct module_sect_attrs *sect_attrs;
> - struct module_sect_attr *sattr;
> - struct bin_attribute **gattr;
> -
> - /* Count loaded sections and allocate structures */
> - for (i = 0; i < info->hdr->e_shnum; i++)
> - if (!sect_empty(&info->sechdrs[i]))
> - nloaded++;
> - size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
> - sizeof(sect_attrs->grp.bin_attrs[0]));
> - size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
> - sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
> - if (sect_attrs == NULL)
> - return;
> -
> - /* Setup section attributes. */
> - sect_attrs->grp.name = "sections";
> - sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
> -
> - sect_attrs->nsections = 0;
> - sattr = &sect_attrs->attrs[0];
> - gattr = &sect_attrs->grp.bin_attrs[0];
> - for (i = 0; i < info->hdr->e_shnum; i++) {
> - Elf_Shdr *sec = &info->sechdrs[i];
> - if (sect_empty(sec))
> - continue;
> - sysfs_bin_attr_init(&sattr->battr);
> - sattr->address = sec->sh_addr;
> - sattr->battr.attr.name =
> - kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
> - if (sattr->battr.attr.name == NULL)
> - goto out;
> - sect_attrs->nsections++;
> - sattr->battr.read = module_sect_read;
> - sattr->battr.size = MODULE_SECT_READ_SIZE;
> - sattr->battr.attr.mode = 0400;
> - *(gattr++) = &(sattr++)->battr;
> - }
> - *gattr = NULL;
> -
> - if (sysfs_create_group(&mod->mkobj.kobj, &sect_attrs->grp))
> - goto out;
> -
> - mod->sect_attrs = sect_attrs;
> - return;
> - out:
> - free_sect_attrs(sect_attrs);
> -}
> -
> -static void remove_sect_attrs(struct module *mod)
> -{
> - if (mod->sect_attrs) {
> - sysfs_remove_group(&mod->mkobj.kobj,
> - &mod->sect_attrs->grp);
> - /*
> - * We are positive that no one is using any sect attrs
> - * at this point. Deallocate immediately.
> - */
> - free_sect_attrs(mod->sect_attrs);
> - mod->sect_attrs = NULL;
> - }
> -}
> -
> -/*
> - * /sys/module/foo/notes/.section.name gives contents of SHT_NOTE sections.
> - */
> -
> -struct module_notes_attrs {
> - struct kobject *dir;
> - unsigned int notes;
> - struct bin_attribute attrs[];
> -};
> -
> -static ssize_t module_notes_read(struct file *filp, struct kobject *kobj,
> - struct bin_attribute *bin_attr,
> - char *buf, loff_t pos, size_t count)
> -{
> - /*
> - * The caller checked the pos and count against our size.
> - */
> - memcpy(buf, bin_attr->private + pos, count);
> - return count;
> -}
> -
> -static void free_notes_attrs(struct module_notes_attrs *notes_attrs,
> - unsigned int i)
> -{
> - if (notes_attrs->dir) {
> - while (i-- > 0)
> - sysfs_remove_bin_file(notes_attrs->dir,
> - &notes_attrs->attrs[i]);
> - kobject_put(notes_attrs->dir);
> - }
> - kfree(notes_attrs);
> -}
> -
> -static void add_notes_attrs(struct module *mod, const struct load_info *info)
> -{
> - unsigned int notes, loaded, i;
> - struct module_notes_attrs *notes_attrs;
> - struct bin_attribute *nattr;
> -
> - /* failed to create section attributes, so can't create notes */
> - if (!mod->sect_attrs)
> - return;
> -
> - /* Count notes sections and allocate structures. */
> - notes = 0;
> - for (i = 0; i < info->hdr->e_shnum; i++)
> - if (!sect_empty(&info->sechdrs[i]) &&
> - (info->sechdrs[i].sh_type == SHT_NOTE))
> - ++notes;
> -
> - if (notes == 0)
> - return;
> -
> - notes_attrs = kzalloc(struct_size(notes_attrs, attrs, notes),
> - GFP_KERNEL);
> - if (notes_attrs == NULL)
> - return;
> -
> - notes_attrs->notes = notes;
> - nattr = &notes_attrs->attrs[0];
> - for (loaded = i = 0; i < info->hdr->e_shnum; ++i) {
> - if (sect_empty(&info->sechdrs[i]))
> - continue;
> - if (info->sechdrs[i].sh_type == SHT_NOTE) {
> - sysfs_bin_attr_init(nattr);
> - nattr->attr.name = mod->sect_attrs->attrs[loaded].battr.attr.name;
> - nattr->attr.mode = S_IRUGO;
> - nattr->size = info->sechdrs[i].sh_size;
> - nattr->private = (void *) info->sechdrs[i].sh_addr;
> - nattr->read = module_notes_read;
> - ++nattr;
> - }
> - ++loaded;
> - }
> -
> - notes_attrs->dir = kobject_create_and_add("notes", &mod->mkobj.kobj);
> - if (!notes_attrs->dir)
> - goto out;
> -
> - for (i = 0; i < notes; ++i)
> - if (sysfs_create_bin_file(notes_attrs->dir,
> - &notes_attrs->attrs[i]))
> - goto out;
> -
> - mod->notes_attrs = notes_attrs;
> - return;
> -
> - out:
> - free_notes_attrs(notes_attrs, i);
> -}
> -
> -static void remove_notes_attrs(struct module *mod)
> -{
> - if (mod->notes_attrs)
> - free_notes_attrs(mod->notes_attrs, mod->notes_attrs->notes);
> -}
> -
> -#else
> -
> -static inline void add_sect_attrs(struct module *mod,
> - const struct load_info *info)
> -{
> -}
> -
> -static inline void remove_sect_attrs(struct module *mod)
> -{
> -}
> -
> -static inline void add_notes_attrs(struct module *mod,
> - const struct load_info *info)
> -{
> -}
> -
> -static inline void remove_notes_attrs(struct module *mod)
> -{
> -}
> -#endif /* CONFIG_KALLSYMS */
> -
> -static void del_usage_links(struct module *mod)
> -{
> -#ifdef CONFIG_MODULE_UNLOAD
> - struct module_use *use;
> -
> - mutex_lock(&module_mutex);
> - list_for_each_entry(use, &mod->target_list, target_list)
> - sysfs_remove_link(use->target->holders_dir, mod->name);
> - mutex_unlock(&module_mutex);
> -#endif
> -}
> -
> -static int add_usage_links(struct module *mod)
> -{
> - int ret = 0;
> -#ifdef CONFIG_MODULE_UNLOAD
> - struct module_use *use;
> -
> - mutex_lock(&module_mutex);
> - list_for_each_entry(use, &mod->target_list, target_list) {
> - ret = sysfs_create_link(use->target->holders_dir,
> - &mod->mkobj.kobj, mod->name);
> - if (ret)
> - break;
> - }
> - mutex_unlock(&module_mutex);
> - if (ret)
> - del_usage_links(mod);
> -#endif
> - return ret;
> -}
> -
> -static void module_remove_modinfo_attrs(struct module *mod, int end);
> -
> -static int module_add_modinfo_attrs(struct module *mod)
> -{
> - struct module_attribute *attr;
> - struct module_attribute *temp_attr;
> - int error = 0;
> - int i;
> -
> - mod->modinfo_attrs = kzalloc((sizeof(struct module_attribute) *
> - (ARRAY_SIZE(modinfo_attrs) + 1)),
> - GFP_KERNEL);
> - if (!mod->modinfo_attrs)
> - return -ENOMEM;
> -
> - temp_attr = mod->modinfo_attrs;
> - for (i = 0; (attr = modinfo_attrs[i]); i++) {
> - if (!attr->test || attr->test(mod)) {
> - memcpy(temp_attr, attr, sizeof(*temp_attr));
> - sysfs_attr_init(&temp_attr->attr);
> - error = sysfs_create_file(&mod->mkobj.kobj,
> - &temp_attr->attr);
> - if (error)
> - goto error_out;
> - ++temp_attr;
> - }
> - }
> -
> - return 0;
> -
> -error_out:
> - if (i > 0)
> - module_remove_modinfo_attrs(mod, --i);
> - else
> - kfree(mod->modinfo_attrs);
> - return error;
> -}
> -
> -static void module_remove_modinfo_attrs(struct module *mod, int end)
> -{
> - struct module_attribute *attr;
> - int i;
> -
> - for (i = 0; (attr = &mod->modinfo_attrs[i]); i++) {
> - if (end >= 0 && i > end)
> - break;
> - /* pick a field to test for end of list */
> - if (!attr->attr.name)
> - break;
> - sysfs_remove_file(&mod->mkobj.kobj, &attr->attr);
> - if (attr->free)
> - attr->free(mod);
> - }
> - kfree(mod->modinfo_attrs);
> -}
> -
> -static void mod_kobject_put(struct module *mod)
> -{
> - DECLARE_COMPLETION_ONSTACK(c);
> - mod->mkobj.kobj_completion = &c;
> - kobject_put(&mod->mkobj.kobj);
> - wait_for_completion(&c);
> -}
> -
> -static int mod_sysfs_init(struct module *mod)
> -{
> - int err;
> - struct kobject *kobj;
> -
> - if (!module_sysfs_initialized) {
> - pr_err("%s: module sysfs not initialized\n", mod->name);
> - err = -EINVAL;
> - goto out;
> - }
> -
> - kobj = kset_find_obj(module_kset, mod->name);
> - if (kobj) {
> - pr_err("%s: module is already loaded\n", mod->name);
> - kobject_put(kobj);
> - err = -EINVAL;
> - goto out;
> - }
> -
> - mod->mkobj.mod = mod;
> -
> - memset(&mod->mkobj.kobj, 0, sizeof(mod->mkobj.kobj));
> - mod->mkobj.kobj.kset = module_kset;
> - err = kobject_init_and_add(&mod->mkobj.kobj, &module_ktype, NULL,
> - "%s", mod->name);
> - if (err)
> - mod_kobject_put(mod);
> -
> -out:
> - return err;
> -}
> -
> -static int mod_sysfs_setup(struct module *mod,
> - const struct load_info *info,
> - struct kernel_param *kparam,
> - unsigned int num_params)
> -{
> - int err;
> -
> - err = mod_sysfs_init(mod);
> - if (err)
> - goto out;
> -
> - mod->holders_dir = kobject_create_and_add("holders", &mod->mkobj.kobj);
> - if (!mod->holders_dir) {
> - err = -ENOMEM;
> - goto out_unreg;
> - }
> -
> - err = module_param_sysfs_setup(mod, kparam, num_params);
> - if (err)
> - goto out_unreg_holders;
> -
> - err = module_add_modinfo_attrs(mod);
> - if (err)
> - goto out_unreg_param;
> -
> - err = add_usage_links(mod);
> - if (err)
> - goto out_unreg_modinfo_attrs;
> -
> - add_sect_attrs(mod, info);
> - add_notes_attrs(mod, info);
> -
> - return 0;
> -
> -out_unreg_modinfo_attrs:
> - module_remove_modinfo_attrs(mod, -1);
> -out_unreg_param:
> - module_param_sysfs_remove(mod);
> -out_unreg_holders:
> - kobject_put(mod->holders_dir);
> -out_unreg:
> - mod_kobject_put(mod);
> -out:
> - return err;
> -}
> -
> -static void mod_sysfs_fini(struct module *mod)
> -{
> - remove_notes_attrs(mod);
> - remove_sect_attrs(mod);
> - mod_kobject_put(mod);
> -}
> -
> -static void init_param_lock(struct module *mod)
> -{
> - mutex_init(&mod->param_lock);
> -}
> -#else /* !CONFIG_SYSFS */
> -
> -static int mod_sysfs_setup(struct module *mod,
> - const struct load_info *info,
> - struct kernel_param *kparam,
> - unsigned int num_params)
> -{
> - return 0;
> -}
> -
> -static void mod_sysfs_fini(struct module *mod)
> -{
> -}
> -
> -static void module_remove_modinfo_attrs(struct module *mod, int end)
> -{
> -}
> -
> -static void del_usage_links(struct module *mod)
> -{
> -}
> -
> -static void init_param_lock(struct module *mod)
> -{
> -}
> -#endif /* CONFIG_SYSFS */
> -
> -static void mod_sysfs_teardown(struct module *mod)
> -{
> - del_usage_links(mod);
> - module_remove_modinfo_attrs(mod, -1);
> - module_param_sysfs_remove(mod);
> - kobject_put(mod->mkobj.drivers_dir);
> - kobject_put(mod->holders_dir);
> - mod_sysfs_fini(mod);
> -}
> -
> /*
> * LKM RO/NX protection: protect module's text/ro-data
> * from modification and any data from execution.
> diff --git a/kernel/module/sysfs.c b/kernel/module/sysfs.c
> new file mode 100644
> index 000000000000..ce68f821dcd1
> --- /dev/null
> +++ b/kernel/module/sysfs.c
> @@ -0,0 +1,436 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Module sysfs support
> + *
> + * Copyright (C) 2008 Rusty Russell
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/fs.h>
> +#include <linux/sysfs.h>
> +#include <linux/slab.h>
> +#include <linux/kallsyms.h>
> +#include <linux/mutex.h>
> +#include "internal.h"
> +
> +/*
> + * /sys/module/foo/sections stuff
> + * J. Corbet <[email protected]>
> + */
> +#ifdef CONFIG_KALLSYMS
> +struct module_sect_attr {
> + struct bin_attribute battr;
> + unsigned long address;
> +};
> +
> +struct module_sect_attrs {
> + struct attribute_group grp;
> + unsigned int nsections;
> + struct module_sect_attr attrs[];
> +};
> +
> +#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4))
> +static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
> + struct bin_attribute *battr,
> + char *buf, loff_t pos, size_t count)
> +{
> + struct module_sect_attr *sattr =
> + container_of(battr, struct module_sect_attr, battr);
> + char bounce[MODULE_SECT_READ_SIZE + 1];
> + size_t wrote;
> +
> + if (pos != 0)
> + return -EINVAL;
> +
> + /*
> + * Since we're a binary read handler, we must account for the
> + * trailing NUL byte that sprintf will write: if "buf" is
> + * too small to hold the NUL, or the NUL is exactly the last
> + * byte, the read will look like it got truncated by one byte.
> + * Since there is no way to ask sprintf nicely to not write
> + * the NUL, we have to use a bounce buffer.
> + */
> + wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n",
> + kallsyms_show_value(file->f_cred)
> + ? (void *)sattr->address : NULL);
> + count = min(count, wrote);
> + memcpy(buf, bounce, count);
> +
> + return count;
> +}
> +
> +static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
> +{
> + unsigned int section;
> +
> + for (section = 0; section < sect_attrs->nsections; section++)
> + kfree(sect_attrs->attrs[section].battr.attr.name);
> + kfree(sect_attrs);
> +}
> +
> +static void add_sect_attrs(struct module *mod, const struct load_info *info)
> +{
> + unsigned int nloaded = 0, i, size[2];
> + struct module_sect_attrs *sect_attrs;
> + struct module_sect_attr *sattr;
> + struct bin_attribute **gattr;
> +
> + /* Count loaded sections and allocate structures */
> + for (i = 0; i < info->hdr->e_shnum; i++)
> + if (!sect_empty(&info->sechdrs[i]))
> + nloaded++;
> + size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
> + sizeof(sect_attrs->grp.bin_attrs[0]));
> + size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
> + sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
> + if (!sect_attrs)
> + return;
> +
> + /* Setup section attributes. */
> + sect_attrs->grp.name = "sections";
> + sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
> +
> + sect_attrs->nsections = 0;
> + sattr = &sect_attrs->attrs[0];
> + gattr = &sect_attrs->grp.bin_attrs[0];
> + for (i = 0; i < info->hdr->e_shnum; i++) {
> + Elf_Shdr *sec = &info->sechdrs[i];
> +
> + if (sect_empty(sec))
> + continue;
> + sysfs_bin_attr_init(&sattr->battr);
> + sattr->address = sec->sh_addr;
> + sattr->battr.attr.name =
> + kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
> + if (!sattr->battr.attr.name)
> + goto out;
> + sect_attrs->nsections++;
> + sattr->battr.read = module_sect_read;
> + sattr->battr.size = MODULE_SECT_READ_SIZE;
> + sattr->battr.attr.mode = 0400;
> + *(gattr++) = &(sattr++)->battr;
> + }
> + *gattr = NULL;
> +
> + if (sysfs_create_group(&mod->mkobj.kobj, &sect_attrs->grp))
> + goto out;
> +
> + mod->sect_attrs = sect_attrs;
> + return;
> +out:
> + free_sect_attrs(sect_attrs);
> +}
> +
> +static void remove_sect_attrs(struct module *mod)
> +{
> + if (mod->sect_attrs) {
> + sysfs_remove_group(&mod->mkobj.kobj,
> + &mod->sect_attrs->grp);
> + /*
> + * We are positive that no one is using any sect attrs
> + * at this point. Deallocate immediately.
> + */
> + free_sect_attrs(mod->sect_attrs);
> + mod->sect_attrs = NULL;
> + }
> +}
> +
> +/*
> + * /sys/module/foo/notes/.section.name gives contents of SHT_NOTE sections.
> + */
> +
> +struct module_notes_attrs {
> + struct kobject *dir;
> + unsigned int notes;
> + struct bin_attribute attrs[];
> +};
> +
> +static ssize_t module_notes_read(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr,
> + char *buf, loff_t pos, size_t count)
> +{
> + /*
> + * The caller checked the pos and count against our size.
> + */
> + memcpy(buf, bin_attr->private + pos, count);
> + return count;
> +}
> +
> +static void free_notes_attrs(struct module_notes_attrs *notes_attrs,
> + unsigned int i)
> +{
> + if (notes_attrs->dir) {
> + while (i-- > 0)
> + sysfs_remove_bin_file(notes_attrs->dir,
> + &notes_attrs->attrs[i]);
> + kobject_put(notes_attrs->dir);
> + }
> + kfree(notes_attrs);
> +}
> +
> +static void add_notes_attrs(struct module *mod, const struct load_info *info)
> +{
> + unsigned int notes, loaded, i;
> + struct module_notes_attrs *notes_attrs;
> + struct bin_attribute *nattr;
> +
> + /* failed to create section attributes, so can't create notes */
> + if (!mod->sect_attrs)
> + return;
> +
> + /* Count notes sections and allocate structures. */
> + notes = 0;
> + for (i = 0; i < info->hdr->e_shnum; i++)
> + if (!sect_empty(&info->sechdrs[i]) &&
> + info->sechdrs[i].sh_type == SHT_NOTE)
> + ++notes;
> +
> + if (notes == 0)
> + return;
> +
> + notes_attrs = kzalloc(struct_size(notes_attrs, attrs, notes),
> + GFP_KERNEL);
> + if (!notes_attrs)
> + return;
> +
> + notes_attrs->notes = notes;
> + nattr = &notes_attrs->attrs[0];
> + for (loaded = i = 0; i < info->hdr->e_shnum; ++i) {
> + if (sect_empty(&info->sechdrs[i]))
> + continue;
> + if (info->sechdrs[i].sh_type == SHT_NOTE) {
> + sysfs_bin_attr_init(nattr);
> + nattr->attr.name = mod->sect_attrs->attrs[loaded].battr.attr.name;
> + nattr->attr.mode = 0444;
> + nattr->size = info->sechdrs[i].sh_size;
> + nattr->private = (void *)info->sechdrs[i].sh_addr;
> + nattr->read = module_notes_read;
> + ++nattr;
> + }
> + ++loaded;
> + }
> +
> + notes_attrs->dir = kobject_create_and_add("notes", &mod->mkobj.kobj);
> + if (!notes_attrs->dir)
> + goto out;
> +
> + for (i = 0; i < notes; ++i)
> + if (sysfs_create_bin_file(notes_attrs->dir,
> + &notes_attrs->attrs[i]))
> + goto out;
> +
> + mod->notes_attrs = notes_attrs;
> + return;
> +
> +out:
> + free_notes_attrs(notes_attrs, i);
> +}
> +
> +static void remove_notes_attrs(struct module *mod)
> +{
> + if (mod->notes_attrs)
> + free_notes_attrs(mod->notes_attrs, mod->notes_attrs->notes);
> +}
> +
> +#else /* !CONFIG_KALLSYMS */
> +static inline void add_sect_attrs(struct module *mod, const struct load_info *info) { }
> +static inline void remove_sect_attrs(struct module *mod) { }
> +static inline void add_notes_attrs(struct module *mod, const struct load_info *info) { }
> +static inline void remove_notes_attrs(struct module *mod) { }
> +#endif /* CONFIG_KALLSYMS */
> +
> +static void del_usage_links(struct module *mod)
> +{
> +#ifdef CONFIG_MODULE_UNLOAD
> + struct module_use *use;
> +
> + mutex_lock(&module_mutex);
> + list_for_each_entry(use, &mod->target_list, target_list)
> + sysfs_remove_link(use->target->holders_dir, mod->name);
> + mutex_unlock(&module_mutex);
> +#endif
> +}
> +
> +static int add_usage_links(struct module *mod)
> +{
> + int ret = 0;
> +#ifdef CONFIG_MODULE_UNLOAD
> + struct module_use *use;
> +
> + mutex_lock(&module_mutex);
> + list_for_each_entry(use, &mod->target_list, target_list) {
> + ret = sysfs_create_link(use->target->holders_dir,
> + &mod->mkobj.kobj, mod->name);
> + if (ret)
> + break;
> + }
> + mutex_unlock(&module_mutex);
> + if (ret)
> + del_usage_links(mod);
> +#endif
> + return ret;
> +}
> +
> +static void module_remove_modinfo_attrs(struct module *mod, int end)
> +{
> + struct module_attribute *attr;
> + int i;
> +
> + for (i = 0; (attr = &mod->modinfo_attrs[i]); i++) {
> + if (end >= 0 && i > end)
> + break;
> + /* pick a field to test for end of list */
> + if (!attr->attr.name)
> + break;
> + sysfs_remove_file(&mod->mkobj.kobj, &attr->attr);
> + if (attr->free)
> + attr->free(mod);
> + }
> + kfree(mod->modinfo_attrs);
> +}
> +
> +static int module_add_modinfo_attrs(struct module *mod)
> +{
> + struct module_attribute *attr;
> + struct module_attribute *temp_attr;
> + int error = 0;
> + int i;
> +
> + mod->modinfo_attrs = kzalloc((sizeof(struct module_attribute) *
> + (modinfo_attrs_count + 1)),
> + GFP_KERNEL);
> + if (!mod->modinfo_attrs)
> + return -ENOMEM;
> +
> + temp_attr = mod->modinfo_attrs;
> + for (i = 0; (attr = modinfo_attrs[i]); i++) {
> + if (!attr->test || attr->test(mod)) {
> + memcpy(temp_attr, attr, sizeof(*temp_attr));
> + sysfs_attr_init(&temp_attr->attr);
> + error = sysfs_create_file(&mod->mkobj.kobj,
> + &temp_attr->attr);
> + if (error)
> + goto error_out;
> + ++temp_attr;
> + }
> + }
> +
> + return 0;
> +
> +error_out:
> + if (i > 0)
> + module_remove_modinfo_attrs(mod, --i);
> + else
> + kfree(mod->modinfo_attrs);
> + return error;
> +}
> +
> +static void mod_kobject_put(struct module *mod)
> +{
> + DECLARE_COMPLETION_ONSTACK(c);
> +
> + mod->mkobj.kobj_completion = &c;
> + kobject_put(&mod->mkobj.kobj);
> + wait_for_completion(&c);
> +}
> +
> +static int mod_sysfs_init(struct module *mod)
> +{
> + int err;
> + struct kobject *kobj;
> +
> + if (!module_sysfs_initialized) {
> + pr_err("%s: module sysfs not initialized\n", mod->name);
> + err = -EINVAL;
> + goto out;
> + }
> +
> + kobj = kset_find_obj(module_kset, mod->name);
> + if (kobj) {
> + pr_err("%s: module is already loaded\n", mod->name);
> + kobject_put(kobj);
> + err = -EINVAL;
> + goto out;
> + }
> +
> + mod->mkobj.mod = mod;
> +
> + memset(&mod->mkobj.kobj, 0, sizeof(mod->mkobj.kobj));
> + mod->mkobj.kobj.kset = module_kset;
> + err = kobject_init_and_add(&mod->mkobj.kobj, &module_ktype, NULL,
> + "%s", mod->name);
> + if (err)
> + mod_kobject_put(mod);
> +
> +out:
> + return err;
> +}
> +
> +int mod_sysfs_setup(struct module *mod,
> + const struct load_info *info,
> + struct kernel_param *kparam,
> + unsigned int num_params)
> +{
> + int err;
> +
> + err = mod_sysfs_init(mod);
> + if (err)
> + goto out;
> +
> + mod->holders_dir = kobject_create_and_add("holders", &mod->mkobj.kobj);
> + if (!mod->holders_dir) {
> + err = -ENOMEM;
> + goto out_unreg;
> + }
> +
> + err = module_param_sysfs_setup(mod, kparam, num_params);
> + if (err)
> + goto out_unreg_holders;
> +
> + err = module_add_modinfo_attrs(mod);
> + if (err)
> + goto out_unreg_param;
> +
> + err = add_usage_links(mod);
> + if (err)
> + goto out_unreg_modinfo_attrs;
> +
> + add_sect_attrs(mod, info);
> + add_notes_attrs(mod, info);
> +
> + return 0;
> +
> +out_unreg_modinfo_attrs:
> + module_remove_modinfo_attrs(mod, -1);
> +out_unreg_param:
> + module_param_sysfs_remove(mod);
> +out_unreg_holders:
> + kobject_put(mod->holders_dir);
> +out_unreg:
> + mod_kobject_put(mod);
> +out:
> + return err;
> +}
> +
> +static void mod_sysfs_fini(struct module *mod)
> +{
> + remove_notes_attrs(mod);
> + remove_sect_attrs(mod);
> + mod_kobject_put(mod);
> +}
> +
> +void mod_sysfs_teardown(struct module *mod)
> +{
> + del_usage_links(mod);
> + module_remove_modinfo_attrs(mod, -1);
> + module_param_sysfs_remove(mod);
> + kobject_put(mod->mkobj.drivers_dir);
> + kobject_put(mod->holders_dir);
> + mod_sysfs_fini(mod);
> +}
> +
> +void init_param_lock(struct module *mod)
> +{
> + mutex_init(&mod->param_lock);
> +}

2022-02-23 02:00:22

by Aaron Tomlin

Subject: [PATCH v8 09/13] module: Move kallsyms support into a separate file

No functional change.

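This patch migrates kallsyms code out of core module code
into kernel/module/kallsyms.c. To support the move, cmp_name()
and find_module_all() are made non-static, and get_offset() is
renamed to module_get_offset().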

Signed-off-by: Aaron Tomlin <[email protected]>
---
kernel/module/Makefile | 1 +
kernel/module/internal.h | 29 +++
kernel/module/kallsyms.c | 506 +++++++++++++++++++++++++++++++++++++
kernel/module/main.c | 531 +--------------------------------------
4 files changed, 542 insertions(+), 525 deletions(-)
create mode 100644 kernel/module/kallsyms.c
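
For anyone reviewing the moved lookup code, the scan performed by
find_kallsyms_symbol() can be approximated in user space as below. This
is only a rough sketch under stated assumptions: struct sketch_sym and
sketch_find_symbol() are hypothetical names that do not exist in the
kernel, the "end" parameter stands in for the end of the module text,
and the SHN_UNDEF and ARM mapping symbol checks are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol table entry; not a kernel type. */
struct sketch_sym {
	const char *name;
	unsigned long value;
};

/*
 * Find the named symbol with the highest value that is <= addr.
 * Entry 0 is treated like the ELF null symbol and is never returned.
 * "end" is the fallback upper bound when no later symbol exists.
 * Returns NULL if no named symbol precedes addr.
 */
static const char *sketch_find_symbol(const struct sketch_sym *tab, size_t n,
				      unsigned long addr, unsigned long end,
				      unsigned long *size,
				      unsigned long *offset)
{
	size_t i, best = 0;
	unsigned long bestval, nextval = end;

	assert(n > 0);
	bestval = tab[0].value;

	/* Real symbols start at index 1. */
	for (i = 1; i < n; i++) {
		unsigned long thisval = tab[i].value;

		/* Unnamed symbols are uninformative, so skip them. */
		if (tab[i].name[0] == '\0')
			continue;

		/* Closest preceding symbol so far. */
		if (thisval <= addr && thisval > bestval) {
			best = i;
			bestval = thisval;
		}
		/* Closest following symbol so far. */
		if (thisval > addr && thisval < nextval)
			nextval = thisval;
	}

	if (!best)
		return NULL;

	if (size)
		*size = nextval - bestval;
	if (offset)
		*offset = addr - bestval;

	return tab[best].name;
}
```

The size reported is the distance to the next named symbol (or to
"end"), which is why unnamed entries must be skipped before either
comparison rather than after.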

diff --git a/kernel/module/Makefile b/kernel/module/Makefile
index 12388627725c..9901bed3ab5b 100644
--- a/kernel/module/Makefile
+++ b/kernel/module/Makefile
@@ -14,3 +14,4 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
obj-$(CONFIG_MODULES_TREE_LOOKUP) += tree_lookup.o
obj-$(CONFIG_STRICT_MODULE_RWX) += strict_rwx.o
obj-$(CONFIG_DEBUG_KMEMLEAK) += debug_kmemleak.o
+obj-$(CONFIG_KALLSYMS) += kallsyms.o
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index b0c360839f63..44ca05b9eb8f 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -68,6 +68,19 @@ struct load_info {
};

int mod_verify_sig(const void *mod, struct load_info *info);
+struct module *find_module_all(const char *name, size_t len, bool even_unformed);
+int cmp_name(const void *name, const void *sym);
+long module_get_offset(struct module *mod, unsigned int *size, Elf_Shdr *sechdr,
+ unsigned int section);
+
+static inline unsigned long kernel_symbol_value(const struct kernel_symbol *sym)
+{
+#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
+ return (unsigned long)offset_to_ptr(&sym->value_offset);
+#else
+ return sym->value;
+#endif
+}

#ifdef CONFIG_LIVEPATCH
int copy_module_elf(struct module *mod, struct load_info *info);
@@ -174,3 +187,19 @@ void kmemleak_load_module(const struct module *mod, const struct load_info *info
static inline void kmemleak_load_module(const struct module *mod,
const struct load_info *info) { }
#endif /* CONFIG_DEBUG_KMEMLEAK */
+
+#ifdef CONFIG_KALLSYMS
+void init_build_id(struct module *mod, const struct load_info *info);
+void layout_symtab(struct module *mod, struct load_info *info);
+void add_kallsyms(struct module *mod, const struct load_info *info);
+unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name);
+
+static inline bool sect_empty(const Elf_Shdr *sect)
+{
+ return !(sect->sh_flags & SHF_ALLOC) || sect->sh_size == 0;
+}
+#else /* !CONFIG_KALLSYMS */
+static inline void init_build_id(struct module *mod, const struct load_info *info) { }
+static inline void layout_symtab(struct module *mod, struct load_info *info) { }
+static inline void add_kallsyms(struct module *mod, const struct load_info *info) { }
+#endif /* CONFIG_KALLSYMS */
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
new file mode 100644
index 000000000000..b6d49bb5afed
--- /dev/null
+++ b/kernel/module/kallsyms.c
@@ -0,0 +1,506 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Module kallsyms support
+ *
+ * Copyright (C) 2010 Rusty Russell
+ */
+
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/buildid.h>
+#include <linux/bsearch.h>
+#include "internal.h"
+
+/* Lookup exported symbol in given range of kernel_symbols */
+static const struct kernel_symbol *lookup_exported_symbol(const char *name,
+ const struct kernel_symbol *start,
+ const struct kernel_symbol *stop)
+{
+ return bsearch(name, start, stop - start,
+ sizeof(struct kernel_symbol), cmp_name);
+}
+
+static int is_exported(const char *name, unsigned long value,
+ const struct module *mod)
+{
+ const struct kernel_symbol *ks;
+
+ if (!mod)
+ ks = lookup_exported_symbol(name, __start___ksymtab, __stop___ksymtab);
+ else
+ ks = lookup_exported_symbol(name, mod->syms, mod->syms + mod->num_syms);
+
+ return ks && kernel_symbol_value(ks) == value;
+}
+
+/* As per nm */
+static char elf_type(const Elf_Sym *sym, const struct load_info *info)
+{
+ const Elf_Shdr *sechdrs = info->sechdrs;
+
+ if (ELF_ST_BIND(sym->st_info) == STB_WEAK) {
+ if (ELF_ST_TYPE(sym->st_info) == STT_OBJECT)
+ return 'v';
+ else
+ return 'w';
+ }
+ if (sym->st_shndx == SHN_UNDEF)
+ return 'U';
+ if (sym->st_shndx == SHN_ABS || sym->st_shndx == info->index.pcpu)
+ return 'a';
+ if (sym->st_shndx >= SHN_LORESERVE)
+ return '?';
+ if (sechdrs[sym->st_shndx].sh_flags & SHF_EXECINSTR)
+ return 't';
+ if (sechdrs[sym->st_shndx].sh_flags & SHF_ALLOC &&
+ sechdrs[sym->st_shndx].sh_type != SHT_NOBITS) {
+ if (!(sechdrs[sym->st_shndx].sh_flags & SHF_WRITE))
+ return 'r';
+ else if (sechdrs[sym->st_shndx].sh_flags & ARCH_SHF_SMALL)
+ return 'g';
+ else
+ return 'd';
+ }
+ if (sechdrs[sym->st_shndx].sh_type == SHT_NOBITS) {
+ if (sechdrs[sym->st_shndx].sh_flags & ARCH_SHF_SMALL)
+ return 's';
+ else
+ return 'b';
+ }
+ if (strstarts(info->secstrings + sechdrs[sym->st_shndx].sh_name,
+ ".debug")) {
+ return 'n';
+ }
+ return '?';
+}
+
+static bool is_core_symbol(const Elf_Sym *src, const Elf_Shdr *sechdrs,
+ unsigned int shnum, unsigned int pcpundx)
+{
+ const Elf_Shdr *sec;
+
+ if (src->st_shndx == SHN_UNDEF ||
+ src->st_shndx >= shnum ||
+ !src->st_name)
+ return false;
+
+#ifdef CONFIG_KALLSYMS_ALL
+ if (src->st_shndx == pcpundx)
+ return true;
+#endif
+
+ sec = sechdrs + src->st_shndx;
+ if (!(sec->sh_flags & SHF_ALLOC)
+#ifndef CONFIG_KALLSYMS_ALL
+ || !(sec->sh_flags & SHF_EXECINSTR)
+#endif
+ || (sec->sh_entsize & INIT_OFFSET_MASK))
+ return false;
+
+ return true;
+}
+
+/*
+ * We only allocate and copy the strings needed by the parts of symtab
+ * we keep. This is simple, but has the effect of making multiple
+ * copies of duplicates. We could be more sophisticated, see
+ * linux-kernel thread starting with
+ * <73defb5e4bca04a6431392cc341112b1@localhost>.
+ */
+void layout_symtab(struct module *mod, struct load_info *info)
+{
+ Elf_Shdr *symsect = info->sechdrs + info->index.sym;
+ Elf_Shdr *strsect = info->sechdrs + info->index.str;
+ const Elf_Sym *src;
+ unsigned int i, nsrc, ndst, strtab_size = 0;
+
+ /* Put symbol section at end of init part of module. */
+ symsect->sh_flags |= SHF_ALLOC;
+ symsect->sh_entsize = module_get_offset(mod, &mod->init_layout.size, symsect,
+ info->index.sym) | INIT_OFFSET_MASK;
+ pr_debug("\t%s\n", info->secstrings + symsect->sh_name);
+
+ src = (void *)info->hdr + symsect->sh_offset;
+ nsrc = symsect->sh_size / sizeof(*src);
+
+ /* Compute total space required for the core symbols' strtab. */
+ for (ndst = i = 0; i < nsrc; i++) {
+ if (i == 0 || is_livepatch_module(mod) ||
+ is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum,
+ info->index.pcpu)) {
+ strtab_size += strlen(&info->strtab[src[i].st_name]) + 1;
+ ndst++;
+ }
+ }
+
+ /* Append room for core symbols at end of core part. */
+ info->symoffs = ALIGN(mod->core_layout.size, symsect->sh_addralign ?: 1);
+ info->stroffs = mod->core_layout.size = info->symoffs + ndst * sizeof(Elf_Sym);
+ mod->core_layout.size += strtab_size;
+ info->core_typeoffs = mod->core_layout.size;
+ mod->core_layout.size += ndst * sizeof(char);
+ mod->core_layout.size = debug_align(mod->core_layout.size);
+
+ /* Put string table section at end of init part of module. */
+ strsect->sh_flags |= SHF_ALLOC;
+ strsect->sh_entsize = module_get_offset(mod, &mod->init_layout.size, strsect,
+ info->index.str) | INIT_OFFSET_MASK;
+ pr_debug("\t%s\n", info->secstrings + strsect->sh_name);
+
+ /* We'll tack temporary mod_kallsyms on the end. */
+ mod->init_layout.size = ALIGN(mod->init_layout.size,
+ __alignof__(struct mod_kallsyms));
+ info->mod_kallsyms_init_off = mod->init_layout.size;
+ mod->init_layout.size += sizeof(struct mod_kallsyms);
+ info->init_typeoffs = mod->init_layout.size;
+ mod->init_layout.size += nsrc * sizeof(char);
+ mod->init_layout.size = debug_align(mod->init_layout.size);
+}
+
+/*
+ * We use the full symtab and strtab which layout_symtab arranged to
+ * be appended to the init section. Later we switch to the cut-down
+ * core-only ones.
+ */
+void add_kallsyms(struct module *mod, const struct load_info *info)
+{
+ unsigned int i, ndst;
+ const Elf_Sym *src;
+ Elf_Sym *dst;
+ char *s;
+ Elf_Shdr *symsec = &info->sechdrs[info->index.sym];
+
+ /* Set up to point into init section. */
+ mod->kallsyms = (void __rcu *)mod->init_layout.base +
+ info->mod_kallsyms_init_off;
+
+ /* The following is safe since this pointer cannot change */
+ rcu_dereference_sched(mod->kallsyms)->symtab = (void *)symsec->sh_addr;
+ rcu_dereference_sched(mod->kallsyms)->num_symtab = symsec->sh_size / sizeof(Elf_Sym);
+ /* Make sure we get permanent strtab: don't use info->strtab. */
+ rcu_dereference_sched(mod->kallsyms)->strtab =
+ (void *)info->sechdrs[info->index.str].sh_addr;
+ rcu_dereference_sched(mod->kallsyms)->typetab =
+ mod->init_layout.base + info->init_typeoffs;
+
+ /*
+ * Now populate the cut down core kallsyms for after init
+ * and set types up while we still have access to sections.
+ */
+ mod->core_kallsyms.symtab = dst = mod->core_layout.base + info->symoffs;
+ mod->core_kallsyms.strtab = s = mod->core_layout.base + info->stroffs;
+ mod->core_kallsyms.typetab = mod->core_layout.base + info->core_typeoffs;
+ src = rcu_dereference_sched(mod->kallsyms)->symtab;
+ for (ndst = i = 0; i < rcu_dereference_sched(mod->kallsyms)->num_symtab; i++) {
+ rcu_dereference_sched(mod->kallsyms)->typetab[i] = elf_type(src + i, info);
+ if (i == 0 || is_livepatch_module(mod) ||
+ is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum,
+ info->index.pcpu)) {
+ mod->core_kallsyms.typetab[ndst] =
+ rcu_dereference_sched(mod->kallsyms)->typetab[i];
+ dst[ndst] = src[i];
+ dst[ndst++].st_name = s - mod->core_kallsyms.strtab;
+ s += strscpy(s,
+ &rcu_dereference_sched(mod->kallsyms)->strtab[src[i].st_name],
+ KSYM_NAME_LEN) + 1;
+ }
+ }
+ mod->core_kallsyms.num_symtab = ndst;
+}
+
+#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
+void init_build_id(struct module *mod, const struct load_info *info)
+{
+ const Elf_Shdr *sechdr;
+ unsigned int i;
+
+ for (i = 0; i < info->hdr->e_shnum; i++) {
+ sechdr = &info->sechdrs[i];
+ if (!sect_empty(sechdr) && sechdr->sh_type == SHT_NOTE &&
+ !build_id_parse_buf((void *)sechdr->sh_addr, mod->build_id,
+ sechdr->sh_size))
+ break;
+ }
+}
+#else
+void init_build_id(struct module *mod, const struct load_info *info)
+{
+}
+#endif
+
+/*
+ * This ignores the intensely annoying "mapping symbols" found
+ * in ARM ELF files: $a, $t and $d.
+ */
+static inline int is_arm_mapping_symbol(const char *str)
+{
+ if (str[0] == '.' && str[1] == 'L')
+ return true;
+ return str[0] == '$' && strchr("axtd", str[1]) &&
+ (str[2] == '\0' || str[2] == '.');
+}
+
+static const char *kallsyms_symbol_name(struct mod_kallsyms *kallsyms, unsigned int symnum)
+{
+ return kallsyms->strtab + kallsyms->symtab[symnum].st_name;
+}
+
+/*
+ * Given a module and address, find the corresponding symbol and return its name
+ * while providing its size and offset if needed.
+ */
+static const char *find_kallsyms_symbol(struct module *mod,
+ unsigned long addr,
+ unsigned long *size,
+ unsigned long *offset)
+{
+ unsigned int i, best = 0;
+ unsigned long nextval, bestval;
+ struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
+
+ /* At worst, next value is at end of module */
+ if (within_module_init(addr, mod))
+ nextval = (unsigned long)mod->init_layout.base + mod->init_layout.text_size;
+ else
+ nextval = (unsigned long)mod->core_layout.base + mod->core_layout.text_size;
+
+ bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
+
+ /*
+ * Scan for closest preceding symbol, and next symbol. (ELF
+ * starts real symbols at 1).
+ */
+ for (i = 1; i < kallsyms->num_symtab; i++) {
+ const Elf_Sym *sym = &kallsyms->symtab[i];
+ unsigned long thisval = kallsyms_symbol_value(sym);
+
+ if (sym->st_shndx == SHN_UNDEF)
+ continue;
+
+ /*
+ * We ignore unnamed symbols: they're uninformative
+ * and inserted at a whim.
+ */
+ if (*kallsyms_symbol_name(kallsyms, i) == '\0' ||
+ is_arm_mapping_symbol(kallsyms_symbol_name(kallsyms, i)))
+ continue;
+
+ if (thisval <= addr && thisval > bestval) {
+ best = i;
+ bestval = thisval;
+ }
+ if (thisval > addr && thisval < nextval)
+ nextval = thisval;
+ }
+
+ if (!best)
+ return NULL;
+
+ if (size)
+ *size = nextval - bestval;
+ if (offset)
+ *offset = addr - bestval;
+
+ return kallsyms_symbol_name(kallsyms, best);
+}
+
+void * __weak dereference_module_function_descriptor(struct module *mod,
+ void *ptr)
+{
+ return ptr;
+}
+
+/*
+ * For kallsyms to ask for address resolution. NULL means not found. Careful
+ * not to lock to avoid deadlock on oopses, simply disable preemption.
+ */
+const char *module_address_lookup(unsigned long addr,
+ unsigned long *size,
+ unsigned long *offset,
+ char **modname,
+ const unsigned char **modbuildid,
+ char *namebuf)
+{
+ const char *ret = NULL;
+ struct module *mod;
+
+ preempt_disable();
+ mod = __module_address(addr);
+ if (mod) {
+ if (modname)
+ *modname = mod->name;
+ if (modbuildid) {
+#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
+ *modbuildid = mod->build_id;
+#else
+ *modbuildid = NULL;
+#endif
+ }
+
+ ret = find_kallsyms_symbol(mod, addr, size, offset);
+ }
+ /* Make a copy in here where it's safe */
+ if (ret) {
+ strncpy(namebuf, ret, KSYM_NAME_LEN - 1);
+ ret = namebuf;
+ }
+ preempt_enable();
+
+ return ret;
+}
+
+int lookup_module_symbol_name(unsigned long addr, char *symname)
+{
+ struct module *mod;
+
+ preempt_disable();
+ list_for_each_entry_rcu(mod, &modules, list) {
+ if (mod->state == MODULE_STATE_UNFORMED)
+ continue;
+ if (within_module(addr, mod)) {
+ const char *sym;
+
+ sym = find_kallsyms_symbol(mod, addr, NULL, NULL);
+ if (!sym)
+ goto out;
+
+ strscpy(symname, sym, KSYM_NAME_LEN);
+ preempt_enable();
+ return 0;
+ }
+ }
+out:
+ preempt_enable();
+ return -ERANGE;
+}
+
+int lookup_module_symbol_attrs(unsigned long addr, unsigned long *size,
+ unsigned long *offset, char *modname, char *name)
+{
+ struct module *mod;
+
+ preempt_disable();
+ list_for_each_entry_rcu(mod, &modules, list) {
+ if (mod->state == MODULE_STATE_UNFORMED)
+ continue;
+ if (within_module(addr, mod)) {
+ const char *sym;
+
+ sym = find_kallsyms_symbol(mod, addr, size, offset);
+ if (!sym)
+ goto out;
+ if (modname)
+ strscpy(modname, mod->name, MODULE_NAME_LEN);
+ if (name)
+ strscpy(name, sym, KSYM_NAME_LEN);
+ preempt_enable();
+ return 0;
+ }
+ }
+out:
+ preempt_enable();
+ return -ERANGE;
+}
+
+int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
+ char *name, char *module_name, int *exported)
+{
+ struct module *mod;
+
+ preempt_disable();
+ list_for_each_entry_rcu(mod, &modules, list) {
+ struct mod_kallsyms *kallsyms;
+
+ if (mod->state == MODULE_STATE_UNFORMED)
+ continue;
+ kallsyms = rcu_dereference_sched(mod->kallsyms);
+ if (symnum < kallsyms->num_symtab) {
+ const Elf_Sym *sym = &kallsyms->symtab[symnum];
+
+ *value = kallsyms_symbol_value(sym);
+ *type = kallsyms->typetab[symnum];
+ strscpy(name, kallsyms_symbol_name(kallsyms, symnum), KSYM_NAME_LEN);
+ strscpy(module_name, mod->name, MODULE_NAME_LEN);
+ *exported = is_exported(name, *value, mod);
+ preempt_enable();
+ return 0;
+ }
+ symnum -= kallsyms->num_symtab;
+ }
+ preempt_enable();
+ return -ERANGE;
+}
+
+/* Given a module and name of symbol, find and return the symbol's value */
+unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name)
+{
+ unsigned int i;
+ struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
+
+ for (i = 0; i < kallsyms->num_symtab; i++) {
+ const Elf_Sym *sym = &kallsyms->symtab[i];
+
+ if (strcmp(name, kallsyms_symbol_name(kallsyms, i)) == 0 &&
+ sym->st_shndx != SHN_UNDEF)
+ return kallsyms_symbol_value(sym);
+ }
+ return 0;
+}
+
+/* Look for this name: can be of form module:name. */
+unsigned long module_kallsyms_lookup_name(const char *name)
+{
+ struct module *mod;
+ char *colon;
+ unsigned long ret = 0;
+
+ /* Don't lock: we're in enough trouble already. */
+ preempt_disable();
+ if ((colon = strnchr(name, MODULE_NAME_LEN, ':')) != NULL) {
+ if ((mod = find_module_all(name, colon - name, false)) != NULL)
+ ret = find_kallsyms_symbol_value(mod, colon + 1);
+ } else {
+ list_for_each_entry_rcu(mod, &modules, list) {
+ if (mod->state == MODULE_STATE_UNFORMED)
+ continue;
+ if ((ret = find_kallsyms_symbol_value(mod, name)) != 0)
+ break;
+ }
+ }
+ preempt_enable();
+ return ret;
+}
+
+#ifdef CONFIG_LIVEPATCH
+int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
+ struct module *, unsigned long),
+ void *data)
+{
+ struct module *mod;
+ unsigned int i;
+ int ret = 0;
+
+ mutex_lock(&module_mutex);
+ list_for_each_entry(mod, &modules, list) {
+ /* Still use rcu_dereference_sched to remain compliant with sparse */
+ struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
+
+ if (mod->state == MODULE_STATE_UNFORMED)
+ continue;
+ for (i = 0; i < kallsyms->num_symtab; i++) {
+ const Elf_Sym *sym = &kallsyms->symtab[i];
+
+ if (sym->st_shndx == SHN_UNDEF)
+ continue;
+
+ ret = fn(data, kallsyms_symbol_name(kallsyms, i),
+ mod, kallsyms_symbol_value(sym));
+ if (ret != 0)
+ goto out;
+ }
+ }
+out:
+ mutex_unlock(&module_mutex);
+ return ret;
+}
+#endif /* CONFIG_LIVEPATCH */
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 7dd283959c5c..952079987ea4 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -288,15 +288,6 @@ static bool check_exported_symbol(const struct symsearch *syms,
return true;
}

-static unsigned long kernel_symbol_value(const struct kernel_symbol *sym)
-{
-#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
- return (unsigned long)offset_to_ptr(&sym->value_offset);
-#else
- return sym->value;
-#endif
-}
-
static const char *kernel_symbol_name(const struct kernel_symbol *sym)
{
#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
@@ -317,7 +308,7 @@ static const char *kernel_symbol_namespace(const struct kernel_symbol *sym)
#endif
}

-static int cmp_name(const void *name, const void *sym)
+int cmp_name(const void *name, const void *sym)
{
return strcmp(name, kernel_symbol_name(sym));
}
@@ -387,8 +378,8 @@ static bool find_symbol(struct find_symbol_arg *fsa)
* Search for module by name: must hold module_mutex (or preempt disabled
* for read-only access).
*/
-static struct module *find_module_all(const char *name, size_t len,
- bool even_unformed)
+struct module *find_module_all(const char *name, size_t len,
+ bool even_unformed)
{
struct module *mod;

@@ -1294,13 +1285,6 @@ resolve_symbol_wait(struct module *mod,
return ksym;
}

-#ifdef CONFIG_KALLSYMS
-static inline bool sect_empty(const Elf_Shdr *sect)
-{
- return !(sect->sh_flags & SHF_ALLOC) || sect->sh_size == 0;
-}
-#endif
-
/*
* /sys/module/foo/sections stuff
* J. Corbet <[email protected]>
@@ -2065,7 +2049,7 @@ unsigned int __weak arch_mod_section_prepend(struct module *mod,
}

/* Update size with this section: return offset. */
-static long get_offset(struct module *mod, unsigned int *size,
+long module_get_offset(struct module *mod, unsigned int *size,
Elf_Shdr *sechdr, unsigned int section)
{
long ret;
@@ -2121,7 +2105,7 @@ static void layout_sections(struct module *mod, struct load_info *info)
|| s->sh_entsize != ~0UL
|| module_init_layout_section(sname))
continue;
- s->sh_entsize = get_offset(mod, &mod->core_layout.size, s, i);
+ s->sh_entsize = module_get_offset(mod, &mod->core_layout.size, s, i);
pr_debug("\t%s\n", sname);
}
switch (m) {
@@ -2154,7 +2138,7 @@ static void layout_sections(struct module *mod, struct load_info *info)
|| s->sh_entsize != ~0UL
|| !module_init_layout_section(sname))
continue;
- s->sh_entsize = (get_offset(mod, &mod->init_layout.size, s, i)
+ s->sh_entsize = (module_get_offset(mod, &mod->init_layout.size, s, i)
| INIT_OFFSET_MASK);
pr_debug("\t%s\n", sname);
}
@@ -2267,228 +2251,6 @@ static void free_modinfo(struct module *mod)
}
}

-#ifdef CONFIG_KALLSYMS
-
-/* Lookup exported symbol in given range of kernel_symbols */
-static const struct kernel_symbol *lookup_exported_symbol(const char *name,
- const struct kernel_symbol *start,
- const struct kernel_symbol *stop)
-{
- return bsearch(name, start, stop - start,
- sizeof(struct kernel_symbol), cmp_name);
-}
-
-static int is_exported(const char *name, unsigned long value,
- const struct module *mod)
-{
- const struct kernel_symbol *ks;
- if (!mod)
- ks = lookup_exported_symbol(name, __start___ksymtab, __stop___ksymtab);
- else
- ks = lookup_exported_symbol(name, mod->syms, mod->syms + mod->num_syms);
-
- return ks != NULL && kernel_symbol_value(ks) == value;
-}
-
-/* As per nm */
-static char elf_type(const Elf_Sym *sym, const struct load_info *info)
-{
- const Elf_Shdr *sechdrs = info->sechdrs;
-
- if (ELF_ST_BIND(sym->st_info) == STB_WEAK) {
- if (ELF_ST_TYPE(sym->st_info) == STT_OBJECT)
- return 'v';
- else
- return 'w';
- }
- if (sym->st_shndx == SHN_UNDEF)
- return 'U';
- if (sym->st_shndx == SHN_ABS || sym->st_shndx == info->index.pcpu)
- return 'a';
- if (sym->st_shndx >= SHN_LORESERVE)
- return '?';
- if (sechdrs[sym->st_shndx].sh_flags & SHF_EXECINSTR)
- return 't';
- if (sechdrs[sym->st_shndx].sh_flags & SHF_ALLOC
- && sechdrs[sym->st_shndx].sh_type != SHT_NOBITS) {
- if (!(sechdrs[sym->st_shndx].sh_flags & SHF_WRITE))
- return 'r';
- else if (sechdrs[sym->st_shndx].sh_flags & ARCH_SHF_SMALL)
- return 'g';
- else
- return 'd';
- }
- if (sechdrs[sym->st_shndx].sh_type == SHT_NOBITS) {
- if (sechdrs[sym->st_shndx].sh_flags & ARCH_SHF_SMALL)
- return 's';
- else
- return 'b';
- }
- if (strstarts(info->secstrings + sechdrs[sym->st_shndx].sh_name,
- ".debug")) {
- return 'n';
- }
- return '?';
-}
-
-static bool is_core_symbol(const Elf_Sym *src, const Elf_Shdr *sechdrs,
- unsigned int shnum, unsigned int pcpundx)
-{
- const Elf_Shdr *sec;
-
- if (src->st_shndx == SHN_UNDEF
- || src->st_shndx >= shnum
- || !src->st_name)
- return false;
-
-#ifdef CONFIG_KALLSYMS_ALL
- if (src->st_shndx == pcpundx)
- return true;
-#endif
-
- sec = sechdrs + src->st_shndx;
- if (!(sec->sh_flags & SHF_ALLOC)
-#ifndef CONFIG_KALLSYMS_ALL
- || !(sec->sh_flags & SHF_EXECINSTR)
-#endif
- || (sec->sh_entsize & INIT_OFFSET_MASK))
- return false;
-
- return true;
-}
-
-/*
- * We only allocate and copy the strings needed by the parts of symtab
- * we keep. This is simple, but has the effect of making multiple
- * copies of duplicates. We could be more sophisticated, see
- * linux-kernel thread starting with
- * <73defb5e4bca04a6431392cc341112b1@localhost>.
- */
-static void layout_symtab(struct module *mod, struct load_info *info)
-{
- Elf_Shdr *symsect = info->sechdrs + info->index.sym;
- Elf_Shdr *strsect = info->sechdrs + info->index.str;
- const Elf_Sym *src;
- unsigned int i, nsrc, ndst, strtab_size = 0;
-
- /* Put symbol section at end of init part of module. */
- symsect->sh_flags |= SHF_ALLOC;
- symsect->sh_entsize = get_offset(mod, &mod->init_layout.size, symsect,
- info->index.sym) | INIT_OFFSET_MASK;
- pr_debug("\t%s\n", info->secstrings + symsect->sh_name);
-
- src = (void *)info->hdr + symsect->sh_offset;
- nsrc = symsect->sh_size / sizeof(*src);
-
- /* Compute total space required for the core symbols' strtab. */
- for (ndst = i = 0; i < nsrc; i++) {
- if (i == 0 || is_livepatch_module(mod) ||
- is_core_symbol(src+i, info->sechdrs, info->hdr->e_shnum,
- info->index.pcpu)) {
- strtab_size += strlen(&info->strtab[src[i].st_name])+1;
- ndst++;
- }
- }
-
- /* Append room for core symbols at end of core part. */
- info->symoffs = ALIGN(mod->core_layout.size, symsect->sh_addralign ?: 1);
- info->stroffs = mod->core_layout.size = info->symoffs + ndst * sizeof(Elf_Sym);
- mod->core_layout.size += strtab_size;
- info->core_typeoffs = mod->core_layout.size;
- mod->core_layout.size += ndst * sizeof(char);
- mod->core_layout.size = debug_align(mod->core_layout.size);
-
- /* Put string table section at end of init part of module. */
- strsect->sh_flags |= SHF_ALLOC;
- strsect->sh_entsize = get_offset(mod, &mod->init_layout.size, strsect,
- info->index.str) | INIT_OFFSET_MASK;
- pr_debug("\t%s\n", info->secstrings + strsect->sh_name);
-
- /* We'll tack temporary mod_kallsyms on the end. */
- mod->init_layout.size = ALIGN(mod->init_layout.size,
- __alignof__(struct mod_kallsyms));
- info->mod_kallsyms_init_off = mod->init_layout.size;
- mod->init_layout.size += sizeof(struct mod_kallsyms);
- info->init_typeoffs = mod->init_layout.size;
- mod->init_layout.size += nsrc * sizeof(char);
- mod->init_layout.size = debug_align(mod->init_layout.size);
-}
-
-/*
- * We use the full symtab and strtab which layout_symtab arranged to
- * be appended to the init section. Later we switch to the cut-down
- * core-only ones.
- */
-static void add_kallsyms(struct module *mod, const struct load_info *info)
-{
- unsigned int i, ndst;
- const Elf_Sym *src;
- Elf_Sym *dst;
- char *s;
- Elf_Shdr *symsec = &info->sechdrs[info->index.sym];
-
- /* Set up to point into init section. */
- mod->kallsyms = mod->init_layout.base + info->mod_kallsyms_init_off;
-
- mod->kallsyms->symtab = (void *)symsec->sh_addr;
- mod->kallsyms->num_symtab = symsec->sh_size / sizeof(Elf_Sym);
- /* Make sure we get permanent strtab: don't use info->strtab. */
- mod->kallsyms->strtab = (void *)info->sechdrs[info->index.str].sh_addr;
- mod->kallsyms->typetab = mod->init_layout.base + info->init_typeoffs;
-
- /*
- * Now populate the cut down core kallsyms for after init
- * and set types up while we still have access to sections.
- */
- mod->core_kallsyms.symtab = dst = mod->core_layout.base + info->symoffs;
- mod->core_kallsyms.strtab = s = mod->core_layout.base + info->stroffs;
- mod->core_kallsyms.typetab = mod->core_layout.base + info->core_typeoffs;
- src = mod->kallsyms->symtab;
- for (ndst = i = 0; i < mod->kallsyms->num_symtab; i++) {
- mod->kallsyms->typetab[i] = elf_type(src + i, info);
- if (i == 0 || is_livepatch_module(mod) ||
- is_core_symbol(src+i, info->sechdrs, info->hdr->e_shnum,
- info->index.pcpu)) {
- mod->core_kallsyms.typetab[ndst] =
- mod->kallsyms->typetab[i];
- dst[ndst] = src[i];
- dst[ndst++].st_name = s - mod->core_kallsyms.strtab;
- s += strlcpy(s, &mod->kallsyms->strtab[src[i].st_name],
- KSYM_NAME_LEN) + 1;
- }
- }
- mod->core_kallsyms.num_symtab = ndst;
-}
-#else
-static inline void layout_symtab(struct module *mod, struct load_info *info)
-{
-}
-
-static void add_kallsyms(struct module *mod, const struct load_info *info)
-{
-}
-#endif /* CONFIG_KALLSYMS */
-
-#if IS_ENABLED(CONFIG_KALLSYMS) && IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
-static void init_build_id(struct module *mod, const struct load_info *info)
-{
- const Elf_Shdr *sechdr;
- unsigned int i;
-
- for (i = 0; i < info->hdr->e_shnum; i++) {
- sechdr = &info->sechdrs[i];
- if (!sect_empty(sechdr) && sechdr->sh_type == SHT_NOTE &&
- !build_id_parse_buf((void *)sechdr->sh_addr, mod->build_id,
- sechdr->sh_size))
- break;
- }
-}
-#else
-static void init_build_id(struct module *mod, const struct load_info *info)
-{
-}
-#endif
-
static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug, unsigned int num)
{
if (!debug)
@@ -3799,287 +3561,6 @@ static inline int within(unsigned long addr, void *start, unsigned long size)
return ((void *)addr >= start && (void *)addr < start + size);
}

-#ifdef CONFIG_KALLSYMS
-/*
- * This ignores the intensely annoying "mapping symbols" found
- * in ARM ELF files: $a, $t and $d.
- */
-static inline int is_arm_mapping_symbol(const char *str)
-{
- if (str[0] == '.' && str[1] == 'L')
- return true;
- return str[0] == '$' && strchr("axtd", str[1])
- && (str[2] == '\0' || str[2] == '.');
-}
-
-static const char *kallsyms_symbol_name(struct mod_kallsyms *kallsyms, unsigned int symnum)
-{
- return kallsyms->strtab + kallsyms->symtab[symnum].st_name;
-}
-
-/*
- * Given a module and address, find the corresponding symbol and return its name
- * while providing its size and offset if needed.
- */
-static const char *find_kallsyms_symbol(struct module *mod,
- unsigned long addr,
- unsigned long *size,
- unsigned long *offset)
-{
- unsigned int i, best = 0;
- unsigned long nextval, bestval;
- struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
-
- /* At worse, next value is at end of module */
- if (within_module_init(addr, mod))
- nextval = (unsigned long)mod->init_layout.base+mod->init_layout.text_size;
- else
- nextval = (unsigned long)mod->core_layout.base+mod->core_layout.text_size;
-
- bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
-
- /*
- * Scan for closest preceding symbol, and next symbol. (ELF
- * starts real symbols at 1).
- */
- for (i = 1; i < kallsyms->num_symtab; i++) {
- const Elf_Sym *sym = &kallsyms->symtab[i];
- unsigned long thisval = kallsyms_symbol_value(sym);
-
- if (sym->st_shndx == SHN_UNDEF)
- continue;
-
- /*
- * We ignore unnamed symbols: they're uninformative
- * and inserted at a whim.
- */
- if (*kallsyms_symbol_name(kallsyms, i) == '\0'
- || is_arm_mapping_symbol(kallsyms_symbol_name(kallsyms, i)))
- continue;
-
- if (thisval <= addr && thisval > bestval) {
- best = i;
- bestval = thisval;
- }
- if (thisval > addr && thisval < nextval)
- nextval = thisval;
- }
-
- if (!best)
- return NULL;
-
- if (size)
- *size = nextval - bestval;
- if (offset)
- *offset = addr - bestval;
-
- return kallsyms_symbol_name(kallsyms, best);
-}
-
-void * __weak dereference_module_function_descriptor(struct module *mod,
- void *ptr)
-{
- return ptr;
-}
-
-/*
- * For kallsyms to ask for address resolution. NULL means not found. Careful
- * not to lock to avoid deadlock on oopses, simply disable preemption.
- */
-const char *module_address_lookup(unsigned long addr,
- unsigned long *size,
- unsigned long *offset,
- char **modname,
- const unsigned char **modbuildid,
- char *namebuf)
-{
- const char *ret = NULL;
- struct module *mod;
-
- preempt_disable();
- mod = __module_address(addr);
- if (mod) {
- if (modname)
- *modname = mod->name;
- if (modbuildid) {
-#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
- *modbuildid = mod->build_id;
-#else
- *modbuildid = NULL;
-#endif
- }
-
- ret = find_kallsyms_symbol(mod, addr, size, offset);
- }
- /* Make a copy in here where it's safe */
- if (ret) {
- strncpy(namebuf, ret, KSYM_NAME_LEN - 1);
- ret = namebuf;
- }
- preempt_enable();
-
- return ret;
-}
-
-int lookup_module_symbol_name(unsigned long addr, char *symname)
-{
- struct module *mod;
-
- preempt_disable();
- list_for_each_entry_rcu(mod, &modules, list) {
- if (mod->state == MODULE_STATE_UNFORMED)
- continue;
- if (within_module(addr, mod)) {
- const char *sym;
-
- sym = find_kallsyms_symbol(mod, addr, NULL, NULL);
- if (!sym)
- goto out;
-
- strlcpy(symname, sym, KSYM_NAME_LEN);
- preempt_enable();
- return 0;
- }
- }
-out:
- preempt_enable();
- return -ERANGE;
-}
-
-int lookup_module_symbol_attrs(unsigned long addr, unsigned long *size,
- unsigned long *offset, char *modname, char *name)
-{
- struct module *mod;
-
- preempt_disable();
- list_for_each_entry_rcu(mod, &modules, list) {
- if (mod->state == MODULE_STATE_UNFORMED)
- continue;
- if (within_module(addr, mod)) {
- const char *sym;
-
- sym = find_kallsyms_symbol(mod, addr, size, offset);
- if (!sym)
- goto out;
- if (modname)
- strlcpy(modname, mod->name, MODULE_NAME_LEN);
- if (name)
- strlcpy(name, sym, KSYM_NAME_LEN);
- preempt_enable();
- return 0;
- }
- }
-out:
- preempt_enable();
- return -ERANGE;
-}
-
-int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
- char *name, char *module_name, int *exported)
-{
- struct module *mod;
-
- preempt_disable();
- list_for_each_entry_rcu(mod, &modules, list) {
- struct mod_kallsyms *kallsyms;
-
- if (mod->state == MODULE_STATE_UNFORMED)
- continue;
- kallsyms = rcu_dereference_sched(mod->kallsyms);
- if (symnum < kallsyms->num_symtab) {
- const Elf_Sym *sym = &kallsyms->symtab[symnum];
-
- *value = kallsyms_symbol_value(sym);
- *type = kallsyms->typetab[symnum];
- strlcpy(name, kallsyms_symbol_name(kallsyms, symnum), KSYM_NAME_LEN);
- strlcpy(module_name, mod->name, MODULE_NAME_LEN);
- *exported = is_exported(name, *value, mod);
- preempt_enable();
- return 0;
- }
- symnum -= kallsyms->num_symtab;
- }
- preempt_enable();
- return -ERANGE;
-}
-
-/* Given a module and name of symbol, find and return the symbol's value */
-static unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name)
-{
- unsigned int i;
- struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
-
- for (i = 0; i < kallsyms->num_symtab; i++) {
- const Elf_Sym *sym = &kallsyms->symtab[i];
-
- if (strcmp(name, kallsyms_symbol_name(kallsyms, i)) == 0 &&
- sym->st_shndx != SHN_UNDEF)
- return kallsyms_symbol_value(sym);
- }
- return 0;
-}
-
-/* Look for this name: can be of form module:name. */
-unsigned long module_kallsyms_lookup_name(const char *name)
-{
- struct module *mod;
- char *colon;
- unsigned long ret = 0;
-
- /* Don't lock: we're in enough trouble already. */
- preempt_disable();
- if ((colon = strnchr(name, MODULE_NAME_LEN, ':')) != NULL) {
- if ((mod = find_module_all(name, colon - name, false)) != NULL)
- ret = find_kallsyms_symbol_value(mod, colon+1);
- } else {
- list_for_each_entry_rcu(mod, &modules, list) {
- if (mod->state == MODULE_STATE_UNFORMED)
- continue;
- if ((ret = find_kallsyms_symbol_value(mod, name)) != 0)
- break;
- }
- }
- preempt_enable();
- return ret;
-}
-
-#ifdef CONFIG_LIVEPATCH
-int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
- struct module *, unsigned long),
- void *data)
-{
- struct module *mod;
- unsigned int i;
- int ret = 0;
-
- mutex_lock(&module_mutex);
- list_for_each_entry(mod, &modules, list) {
- /* We hold module_mutex: no need for rcu_dereference_sched */
- struct mod_kallsyms *kallsyms = mod->kallsyms;
-
- if (mod->state == MODULE_STATE_UNFORMED)
- continue;
- for (i = 0; i < kallsyms->num_symtab; i++) {
- const Elf_Sym *sym = &kallsyms->symtab[i];
-
- if (sym->st_shndx == SHN_UNDEF)
- continue;
-
- ret = fn(data, kallsyms_symbol_name(kallsyms, i),
- mod, kallsyms_symbol_value(sym));
- if (ret != 0)
- goto out;
-
- cond_resched();
- }
- }
-out:
- mutex_unlock(&module_mutex);
- return ret;
-}
-#endif /* CONFIG_LIVEPATCH */
-#endif /* CONFIG_KALLSYMS */
-
static void cfi_init(struct module *mod)
{
#ifdef CONFIG_CFI_CLANG
--
2.34.1
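
For readers following the kallsyms code removed from main.c above, the
closest-preceding-symbol scan in find_kallsyms_symbol() can be sketched in
userspace C. This is a simplified illustration, not the kernel code:
struct toy_sym, toy_find_symbol() and the flat array are hypothetical
stand-ins for Elf_Sym and struct mod_kallsyms, and the SHN_UNDEF and ARM
mapping-symbol checks are reduced to an empty-name check.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ELF symbol: just a name and an address. */
struct toy_sym {
	const char *name;
	unsigned long value;
};

/*
 * Return the name of the named symbol with the greatest address <= addr,
 * or NULL if there is none. Index 0 is reserved, as in an ELF symtab.
 * 'end' is the worst-case upper bound (end of the text region).
 */
const char *toy_find_symbol(const struct toy_sym *syms, size_t n,
			    unsigned long addr, unsigned long end,
			    unsigned long *size, unsigned long *offset)
{
	size_t i, best = 0;
	unsigned long nextval = end;
	unsigned long bestval;

	if (n == 0)
		return NULL;
	bestval = syms[0].value;

	/* Scan for the closest preceding symbol and the next symbol. */
	for (i = 1; i < n; i++) {
		unsigned long thisval = syms[i].value;

		/* Skip unnamed symbols: they carry no information. */
		if (syms[i].name[0] == '\0')
			continue;

		if (thisval <= addr && thisval > bestval) {
			best = i;
			bestval = thisval;
		}
		if (thisval > addr && thisval < nextval)
			nextval = thisval;
	}

	if (!best)
		return NULL;

	if (size)
		*size = nextval - bestval;
	if (offset)
		*offset = addr - bestval;

	return syms[best].name;
}
```

The reported size is the distance to the next named symbol (or to 'end'),
which is why the kernel code tracks nextval alongside bestval in one pass.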

2022-02-23 05:29:03

by Christophe Leroy

Subject: Re: [PATCH v8 05/13] module: Move latched RB-tree support to a separate file



On 22/02/2022 at 15:12, Aaron Tomlin wrote:
> No functional change.
>
> This patch migrates module latched RB-tree support
> (e.g. see __module_address()) from core module code
> into kernel/module/tree_lookup.c.
>
> Signed-off-by: Aaron Tomlin <[email protected]>

Reviewed-by: Christophe Leroy <[email protected]>

> ---
> kernel/module/Makefile | 1 +
> kernel/module/internal.h | 33 +++++++++
> kernel/module/main.c | 130 ++----------------------------------
> kernel/module/tree_lookup.c | 109 ++++++++++++++++++++++++++++++
> 4 files changed, 147 insertions(+), 126 deletions(-)
> create mode 100644 kernel/module/tree_lookup.c
>
> diff --git a/kernel/module/Makefile b/kernel/module/Makefile
> index ed3aacb04f17..88774e386276 100644
> --- a/kernel/module/Makefile
> +++ b/kernel/module/Makefile
> @@ -11,3 +11,4 @@ obj-y += main.o
> obj-$(CONFIG_MODULE_DECOMPRESS) += decompress.o
> obj-$(CONFIG_MODULE_SIG) += signing.o
> obj-$(CONFIG_LIVEPATCH) += livepatch.o
> +obj-$(CONFIG_MODULES_TREE_LOOKUP) += tree_lookup.o
> diff --git a/kernel/module/internal.h b/kernel/module/internal.h
> index ad7a444253ed..f1682e3677be 100644
> --- a/kernel/module/internal.h
> +++ b/kernel/module/internal.h
> @@ -9,6 +9,7 @@
> #include <linux/compiler.h>
> #include <linux/module.h>
> #include <linux/mutex.h>
> +#include <linux/rculist.h>
>
> #ifndef ARCH_SHF_SMALL
> #define ARCH_SHF_SMALL 0
> @@ -93,3 +94,35 @@ static inline void module_decompress_cleanup(struct load_info *info)
> {
> }
> #endif
> +
> +#ifdef CONFIG_MODULES_TREE_LOOKUP
> +struct mod_tree_root {
> + struct latch_tree_root root;
> + unsigned long addr_min;
> + unsigned long addr_max;
> +};
> +
> +extern struct mod_tree_root mod_tree;
> +
> +void mod_tree_insert(struct module *mod);
> +void mod_tree_remove_init(struct module *mod);
> +void mod_tree_remove(struct module *mod);
> +struct module *mod_find(unsigned long addr);
> +#else /* !CONFIG_MODULES_TREE_LOOKUP */
> +
> +static inline void mod_tree_insert(struct module *mod) { }
> +static inline void mod_tree_remove_init(struct module *mod) { }
> +static inline void mod_tree_remove(struct module *mod) { }
> +static inline struct module *mod_find(unsigned long addr)
> +{
> + struct module *mod;
> +
> + list_for_each_entry_rcu(mod, &modules, list,
> + lockdep_is_held(&module_mutex)) {
> + if (within_module(addr, mod))
> + return mod;
> + }
> +
> + return NULL;
> +}
> +#endif /* CONFIG_MODULES_TREE_LOOKUP */
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 3596ebf3a6c3..76b53880ad91 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -90,138 +90,16 @@ static DECLARE_WORK(init_free_wq, do_free_init);
> static LLIST_HEAD(init_free_list);
>
> #ifdef CONFIG_MODULES_TREE_LOOKUP
> -
> -/*
> - * Use a latched RB-tree for __module_address(); this allows us to use
> - * RCU-sched lookups of the address from any context.
> - *
> - * This is conditional on PERF_EVENTS || TRACING because those can really hit
> - * __module_address() hard by doing a lot of stack unwinding; potentially from
> - * NMI context.
> - */
> -
> -static __always_inline unsigned long __mod_tree_val(struct latch_tree_node *n)
> -{
> - struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
> -
> - return (unsigned long)layout->base;
> -}
> -
> -static __always_inline unsigned long __mod_tree_size(struct latch_tree_node *n)
> -{
> - struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
> -
> - return (unsigned long)layout->size;
> -}
> -
> -static __always_inline bool
> -mod_tree_less(struct latch_tree_node *a, struct latch_tree_node *b)
> -{
> - return __mod_tree_val(a) < __mod_tree_val(b);
> -}
> -
> -static __always_inline int
> -mod_tree_comp(void *key, struct latch_tree_node *n)
> -{
> - unsigned long val = (unsigned long)key;
> - unsigned long start, end;
> -
> - start = __mod_tree_val(n);
> - if (val < start)
> - return -1;
> -
> - end = start + __mod_tree_size(n);
> - if (val >= end)
> - return 1;
> -
> - return 0;
> -}
> -
> -static const struct latch_tree_ops mod_tree_ops = {
> - .less = mod_tree_less,
> - .comp = mod_tree_comp,
> -};
> -
> -static struct mod_tree_root {
> - struct latch_tree_root root;
> - unsigned long addr_min;
> - unsigned long addr_max;
> -} mod_tree __cacheline_aligned = {
> +struct mod_tree_root mod_tree __cacheline_aligned = {
> .addr_min = -1UL,
> };
>
> #define module_addr_min mod_tree.addr_min
> #define module_addr_max mod_tree.addr_max
>
> -static noinline void __mod_tree_insert(struct mod_tree_node *node)
> -{
> - latch_tree_insert(&node->node, &mod_tree.root, &mod_tree_ops);
> -}
> -
> -static void __mod_tree_remove(struct mod_tree_node *node)
> -{
> - latch_tree_erase(&node->node, &mod_tree.root, &mod_tree_ops);
> -}
> -
> -/*
> - * These modifications: insert, remove_init and remove; are serialized by the
> - * module_mutex.
> - */
> -static void mod_tree_insert(struct module *mod)
> -{
> - mod->core_layout.mtn.mod = mod;
> - mod->init_layout.mtn.mod = mod;
> -
> - __mod_tree_insert(&mod->core_layout.mtn);
> - if (mod->init_layout.size)
> - __mod_tree_insert(&mod->init_layout.mtn);
> -}
> -
> -static void mod_tree_remove_init(struct module *mod)
> -{
> - if (mod->init_layout.size)
> - __mod_tree_remove(&mod->init_layout.mtn);
> -}
> -
> -static void mod_tree_remove(struct module *mod)
> -{
> - __mod_tree_remove(&mod->core_layout.mtn);
> - mod_tree_remove_init(mod);
> -}
> -
> -static struct module *mod_find(unsigned long addr)
> -{
> - struct latch_tree_node *ltn;
> -
> - ltn = latch_tree_find((void *)addr, &mod_tree.root, &mod_tree_ops);
> - if (!ltn)
> - return NULL;
> -
> - return container_of(ltn, struct mod_tree_node, node)->mod;
> -}
> -
> -#else /* MODULES_TREE_LOOKUP */
> -
> -static unsigned long module_addr_min = -1UL, module_addr_max = 0;
> -
> -static void mod_tree_insert(struct module *mod) { }
> -static void mod_tree_remove_init(struct module *mod) { }
> -static void mod_tree_remove(struct module *mod) { }
> -
> -static struct module *mod_find(unsigned long addr)
> -{
> - struct module *mod;
> -
> - list_for_each_entry_rcu(mod, &modules, list,
> - lockdep_is_held(&module_mutex)) {
> - if (within_module(addr, mod))
> - return mod;
> - }
> -
> - return NULL;
> -}
> -
> -#endif /* MODULES_TREE_LOOKUP */
> +#else /* !CONFIG_MODULES_TREE_LOOKUP */
> +static unsigned long module_addr_min = -1UL, module_addr_max;
> +#endif /* CONFIG_MODULES_TREE_LOOKUP */
>
> /*
> * Bounds of module text, for speeding up __module_address.
> diff --git a/kernel/module/tree_lookup.c b/kernel/module/tree_lookup.c
> new file mode 100644
> index 000000000000..0bc4ec3b22ce
> --- /dev/null
> +++ b/kernel/module/tree_lookup.c
> @@ -0,0 +1,109 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Modules tree lookup
> + *
> + * Copyright (C) 2015 Peter Zijlstra
> + * Copyright (C) 2015 Rusty Russell
> + */
> +
> +#include <linux/module.h>
> +#include <linux/rbtree_latch.h>
> +#include "internal.h"
> +
> +/*
> + * Use a latched RB-tree for __module_address(); this allows us to use
> + * RCU-sched lookups of the address from any context.
> + *
> + * This is conditional on PERF_EVENTS || TRACING because those can really hit
> + * __module_address() hard by doing a lot of stack unwinding; potentially from
> + * NMI context.
> + */
> +
> +static __always_inline unsigned long __mod_tree_val(struct latch_tree_node *n)
> +{
> + struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
> +
> + return (unsigned long)layout->base;
> +}
> +
> +static __always_inline unsigned long __mod_tree_size(struct latch_tree_node *n)
> +{
> + struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
> +
> + return (unsigned long)layout->size;
> +}
> +
> +static __always_inline bool
> +mod_tree_less(struct latch_tree_node *a, struct latch_tree_node *b)
> +{
> + return __mod_tree_val(a) < __mod_tree_val(b);
> +}
> +
> +static __always_inline int
> +mod_tree_comp(void *key, struct latch_tree_node *n)
> +{
> + unsigned long val = (unsigned long)key;
> + unsigned long start, end;
> +
> + start = __mod_tree_val(n);
> + if (val < start)
> + return -1;
> +
> + end = start + __mod_tree_size(n);
> + if (val >= end)
> + return 1;
> +
> + return 0;
> +}
> +
> +static const struct latch_tree_ops mod_tree_ops = {
> + .less = mod_tree_less,
> + .comp = mod_tree_comp,
> +};
> +
> +static noinline void __mod_tree_insert(struct mod_tree_node *node)
> +{
> + latch_tree_insert(&node->node, &mod_tree.root, &mod_tree_ops);
> +}
> +
> +static void __mod_tree_remove(struct mod_tree_node *node)
> +{
> + latch_tree_erase(&node->node, &mod_tree.root, &mod_tree_ops);
> +}
> +
> +/*
> + * These modifications: insert, remove_init and remove; are serialized by the
> + * module_mutex.
> + */
> +void mod_tree_insert(struct module *mod)
> +{
> + mod->core_layout.mtn.mod = mod;
> + mod->init_layout.mtn.mod = mod;
> +
> + __mod_tree_insert(&mod->core_layout.mtn);
> + if (mod->init_layout.size)
> + __mod_tree_insert(&mod->init_layout.mtn);
> +}
> +
> +void mod_tree_remove_init(struct module *mod)
> +{
> + if (mod->init_layout.size)
> + __mod_tree_remove(&mod->init_layout.mtn);
> +}
> +
> +void mod_tree_remove(struct module *mod)
> +{
> + __mod_tree_remove(&mod->core_layout.mtn);
> + mod_tree_remove_init(mod);
> +}
> +
> +struct module *mod_find(unsigned long addr)
> +{
> + struct latch_tree_node *ltn;
> +
> + ltn = latch_tree_find((void *)addr, &mod_tree.root, &mod_tree_ops);
> + if (!ltn)
> + return NULL;
> +
> + return container_of(ltn, struct mod_tree_node, node)->mod;
> +}

2022-02-25 11:56:47

by Petr Mladek

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file

On Fri 2022-02-25 09:27:33, Christophe Leroy wrote:
>
>
> On 25/02/2022 at 10:15, Petr Mladek wrote:
> > On Tue 2022-02-22 14:12:59, Aaron Tomlin wrote:
> >> No functional change.
> >
> > The patch adds rcu_dereference_sched() into several locations.
> > It triggers lockdep warnings, see below.
> >
> > It is a good example of why to avoid any hidden changes when shuffling
> > code. Changes to the code should be done in a preparatory
> > patch or not at all.
> >
> > This patch is even worse because these changes were not
> > mentioned in the commit message. It should describe what
> > is done and why.
> >
> > I wonder how many other changes are hidden in this patchset
> > and if anyone really checked them.
>
> That's probably my fault: when I reviewed v5 of the series I
> mentioned all the checkpatch and sparse reports and asked Aaron to make
> his series free of such warnings. Most warnings were related to style
> (parenthesis alignment, blank lines, spaces, etc.) or erroneous
> casting.
>
> But for that particular patch we had:
>
> kernel/module/kallsyms.c:174:23: warning: incorrect type in assignment
> (different address spaces)
> kernel/module/kallsyms.c:174:23: expected struct mod_kallsyms
> [noderef] __rcu *kallsyms
> kernel/module/kallsyms.c:174:23: got void *
> kernel/module/kallsyms.c:176:12: warning: dereference of noderef expression
> kernel/module/kallsyms.c:177:12: warning: dereference of noderef expression
> kernel/module/kallsyms.c:179:12: warning: dereference of noderef expression
> kernel/module/kallsyms.c:180:12: warning: dereference of noderef expression
> kernel/module/kallsyms.c:189:18: warning: dereference of noderef expression
> kernel/module/kallsyms.c:190:35: warning: dereference of noderef expression
> kernel/module/kallsyms.c:191:20: warning: dereference of noderef expression
> kernel/module/kallsyms.c:196:32: warning: dereference of noderef expression
> kernel/module/kallsyms.c:199:45: warning: dereference of noderef expression
>
> Aaron used rcu_dereference_sched() in order to fix that.
>
> How should this be fixed if using rcu_dereference_sched() is not correct?

IMHO, sparse complains because the __rcu pointer is not accessed using
the RCU API.

rcu_dereference_sched() makes sparse happy. But lockdep complains
because the __rcu pointer is not accessed under:

rcu_read_lock_sched();
rcu_read_unlock_sched();

The accesses here are not inside such a section. Note that module_mutex
does not disable preemption.

Now, the code is safe. The RCU access makes sure that "mod"
can't be freed in the meantime:

+ add_kallsyms() is called by the module loader when the module
is being loaded. It could not get removed in parallel
by definition.

+ module_kallsyms_on_each_symbol() takes module_mutex.
It means that the module could not get removed.


IMHO, we have two possibilities here:

+ Make sparse and lockdep happy by using rcu_dereference_sched()
and calling the code under rcu_read_lock_sched().

+ Cast (struct mod_kallsyms *)mod->kallsyms when accessing
the value.
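
To make the difference between the two options concrete, here is a toy
userspace model of what lockdep checks. It is only a sketch under stated
assumptions: toy_rcu_read_lock_sched(), toy_rcu_dereference_sched() and
the toy_warnings counter are hypothetical stand-ins, not kernel APIs, and
a plain depth counter replaces the real preemption and lockdep state.

```c
#include <stddef.h>

struct toy_kallsyms {
	unsigned int num_symtab;
};

struct toy_module {
	struct toy_kallsyms *kallsyms;	/* would be __rcu in the kernel */
};

int toy_read_depth;	/* > 0 while inside a read-side section */
int toy_warnings;	/* counts "suspicious RCU usage" events */

void toy_rcu_read_lock_sched(void)
{
	toy_read_depth++;
}

void toy_rcu_read_unlock_sched(void)
{
	toy_read_depth--;
}

/*
 * Models the checked dereference: the pointer is always returned, but a
 * warning is recorded when no read-side section is active.
 */
struct toy_kallsyms *toy_rcu_dereference_sched(struct toy_kallsyms *p)
{
	if (toy_read_depth == 0)
		toy_warnings++;
	return p;
}

/* Checked dereference with no read-side section: warns. */
void toy_set_unlocked(struct toy_module *mod, unsigned int n)
{
	toy_rcu_dereference_sched(mod->kallsyms)->num_symtab = n;
}

/* First option: the same access inside a read-side section. */
void toy_set_locked(struct toy_module *mod, unsigned int n)
{
	toy_rcu_read_lock_sched();
	toy_rcu_dereference_sched(mod->kallsyms)->num_symtab = n;
	toy_rcu_read_unlock_sched();
}

/* Second option: plain access (a cast in the kernel), no check at all. */
void toy_set_cast(struct toy_module *mod, unsigned int n)
{
	mod->kallsyms->num_symtab = n;
}
```

All three variants store the same value; they differ only in whether the
check fires, which matches the point that the code is safe either way and
the choice is about keeping sparse and lockdep quiet.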

I do not have a strong preference. I am fine with both.

Anyway, such a fix should be done in a separate patch!

Best Regards,
Petr

2022-02-25 12:03:52

by Aaron Tomlin

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file

On Fri 2022-02-25 11:15 +0100, Petr Mladek wrote:
> rcu_dereference_sched() makes sparse happy. But lockdep complains
> because the __rcu pointer is not accessed under:
>
> rcu_read_lock_sched();
> rcu_read_unlock_sched();

Hi Petr,

>
> The accesses here are not inside such a section. Note that module_mutex
> does not disable preemption.
>
> Now, the code is safe. The RCU access makes sure that "mod"
> can't be freed in the meantime:
>
> + add_kallsyms() is called by the module loader when the module
> is being loaded. It could not get removed in parallel
> by definition.
>
> + module_kallsyms_on_each_symbol() takes module_mutex.
> It means that the module could not get removed.

Indeed, which is why I did not use rcu_read_lock_sched() and
rcu_read_unlock_sched() with rcu_dereference_sched(). That being said, I
should have mentioned this in the commit message.

> IMHO, we have two possibilities here:
>
> + Make sparse and lockdep happy by using rcu_dereference_sched()
> and calling the code under rcu_read_lock_sched().
>
> + Cast (struct mod_kallsyms *)mod->kallsyms when accessing
> the value.

I prefer the first option.

> I do not have a strong preference. I am fine with both.
>
> Anyway, such a fix should be done in a separate patch!

Agreed.


Kind regards,

--
Aaron Tomlin

2022-02-25 13:08:30

by Christophe Leroy

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file



On 25/02/2022 at 13:21, Aaron Tomlin wrote:
> On Fri 2022-02-25 10:27 +0000, Aaron Tomlin wrote:
>> On Fri 2022-02-25 11:15 +0100, Petr Mladek wrote:
>>> rcu_dereference_sched() makes sparse happy. But lockdep complains
>>> because the __rcu pointer is not accessed under:
>>>
>>> rcu_read_lock_sched();
>>> rcu_read_unlock_sched();
>>
>> Hi Petr,
>>
>>>
>>> The accesses here are not inside such a section. Note that module_mutex
>>> does not disable preemption.
>>>
>>> Now, the code is safe. The RCU access makes sure that "mod"
>>> can't be freed in the meantime:
>>>
>>> + add_kallsyms() is called by the module loader when the module
>>> is being loaded. It could not get removed in parallel
>>> by definition.
>>>
>>> + module_kallsyms_on_each_symbol() takes module_mutex.
>>> It means that the module could not get removed.
>>
>> Indeed, which is why I did not use rcu_read_lock_sched() and
>> rcu_read_unlock_sched() with rcu_dereference_sched(). That being said, I
>> should have mentioned this in the commit message.
>>
>>> IMHO, we have two possibilities here:
>>>
>>> + Make sparse and lockdep happy by using rcu_dereference_sched()
>>> and calling the code under rcu_read_lock_sched().
>>>
>>> + Cast (struct mod_kallsyms *)mod->kallsyms when accessing
>>> the value.
>>
>> I prefer the first option.
>>
>>> I do not have a strong preference. I am fine with both.
>>>
>>> Anyway, such a fix should be done in a separate patch!
>>
>> Agreed.
>
> Luis,
>
> If I understand correctly, it might be cleaner to resolve the above in two
> separate patches for a v9 i.e. a) address the sparse and lockdep feedback
> and b) refactor the code, before the latest version [1] is merged into
> module-next. I assume the previous iteration will be reverted first?
>
> Please let me know your thoughts
>
> [1]: https://lore.kernel.org/all/[email protected]/
>

I would do it the other way around: first move the code into a separate
file, and then handle the sparse __rcu feedback as a follow-up patch to
the series.

Regarding module-next, AFAICS at the moment we still have only the first
10 patches of v6 in the tree. I guess the way forward will be to
rebase module-next, drop those patches and commit v9 instead.

Christophe

2022-02-25 13:22:12

by Petr Mladek

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file

On Tue 2022-02-22 14:12:59, Aaron Tomlin wrote:
> No functional change.

The patch adds rcu_dereference_sched() into several locations.
It triggers lockdep warnings, see below.

It is a good example of why to avoid any hidden changes when shuffling
code. Changes to the code should be done in a preparatory
patch or not at all.

This patch is even worse because these changes were not
mentioned in the commit message. It should describe what
is done and why.

I wonder how many other changes are hidden in this patchset
and if anyone really checked them.

> This patch migrates kallsyms code out of core module
> code into kernel/module/kallsyms.c

> diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
> new file mode 100644
> index 000000000000..b6d49bb5afed
> --- /dev/null
> +++ b/kernel/module/kallsyms.c
[...]
> +/*
> + * We use the full symtab and strtab which layout_symtab arranged to
> + * be appended to the init section. Later we switch to the cut-down
> + * core-only ones.
> + */
> +void add_kallsyms(struct module *mod, const struct load_info *info)
> +{
> + unsigned int i, ndst;
> + const Elf_Sym *src;
> + Elf_Sym *dst;
> + char *s;
> + Elf_Shdr *symsec = &info->sechdrs[info->index.sym];
> +
> + /* Set up to point into init section. */
> + mod->kallsyms = (void __rcu *)mod->init_layout.base +
> + info->mod_kallsyms_init_off;
> +
> + /* The following is safe since this pointer cannot change */
> + rcu_dereference_sched(mod->kallsyms)->symtab = (void *)symsec->sh_addr;

I got the following warning in the livepatch self-test:

[ 372.740779] ===== TEST: basic function patching =====
[ 372.760921] % modprobe test_klp_livepatch
[ 372.766361] test_klp_livepatch: tainting kernel with TAINT_LIVEPATCH
[ 372.767319] test_klp_livepatch: module verification failed: signature and/or required key missing - tainting kernel

[ 372.769132] =============================
[ 372.769771] WARNING: suspicious RCU usage
[ 372.770392] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.770396] -----------------------------
[ 372.770397] kernel/module/kallsyms.c:178 suspicious rcu_dereference_check() usage!
[ 372.770400]
other info that might help us debug this:

[ 372.770401]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.770403] no locks held by modprobe/1760.
[ 372.770405]
stack backtrace:
[ 372.770409] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.770412] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.770413] Call Trace:
[ 372.770415] <TASK>
[ 372.770417] dump_stack_lvl+0x58/0x71
[ 372.770424] add_kallsyms+0x477/0x5c0
[ 372.770434] load_module+0x107c/0x19c0
[ 372.770446] ? kernel_read_file+0x2a3/0x2d0
[ 372.782403] ? __do_sys_finit_module+0xaf/0x120
[ 372.783019] __do_sys_finit_module+0xaf/0x120
[ 372.783038] do_syscall_64+0x37/0x80
[ 372.783042] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.783045] RIP: 0033:0x7f13f53992a9
[ 372.783048] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.783050] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.783052] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.783054] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.783055] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.783056] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.783057] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.783070] </TASK>


> + rcu_dereference_sched(mod->kallsyms)->num_symtab = symsec->sh_size / sizeof(Elf_Sym);

[ 372.793150] =============================
[ 372.793151] WARNING: suspicious RCU usage
[ 372.793153] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.793155] -----------------------------
[ 372.793156] kernel/module/kallsyms.c:179 suspicious rcu_dereference_check() usage!
[ 372.793158]
other info that might help us debug this:

[ 372.797266]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.797268] no locks held by modprobe/1760.
[ 372.797270]
stack backtrace:
[ 372.797271] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.797274] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.797275] Call Trace:
[ 372.797277] <TASK>
[ 372.797278] dump_stack_lvl+0x58/0x71
[ 372.802579] add_kallsyms+0x56f/0x5c0
[ 372.802605] load_module+0x107c/0x19c0
[ 372.803525] ? kernel_read_file+0x2a3/0x2d0
[ 372.803538] ? __do_sys_finit_module+0xaf/0x120
[ 372.803540] __do_sys_finit_module+0xaf/0x120
[ 372.803555] do_syscall_64+0x37/0x80
[ 372.803558] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.803561] RIP: 0033:0x7f13f53992a9
[ 372.803563] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.803565] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.803567] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.803568] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.811447] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.811465] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.811467] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.811479] </TASK>


> + /* Make sure we get permanent strtab: don't use info->strtab. */
> + rcu_dereference_sched(mod->kallsyms)->strtab =
> + (void *)info->sechdrs[info->index.str].sh_addr;

[ 372.814541] =============================
[ 372.815091] WARNING: suspicious RCU usage
[ 372.815093] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.815094] -----------------------------
[ 372.815095] kernel/module/kallsyms.c:181 suspicious rcu_dereference_check() usage!
[ 372.815096]
other info that might help us debug this:

[ 372.815097]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.815099] no locks held by modprobe/1760.
[ 372.815100]
stack backtrace:
[ 372.815101] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.815102] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.815103] Call Trace:
[ 372.815105] <TASK>
[ 372.815106] dump_stack_lvl+0x58/0x71
[ 372.815111] add_kallsyms+0x531/0x5c0
[ 372.815119] load_module+0x107c/0x19c0
[ 372.815129] ? kernel_read_file+0x2a3/0x2d0
[ 372.815140] ? __do_sys_finit_module+0xaf/0x120
[ 372.815143] __do_sys_finit_module+0xaf/0x120
[ 372.815157] do_syscall_64+0x37/0x80
[ 372.815160] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.828879] RIP: 0033:0x7f13f53992a9
[ 372.828885] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.828889] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.828892] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.828893] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.828894] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.828895] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.836097] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.836115] </TASK>


> + rcu_dereference_sched(mod->kallsyms)->typetab =
> + mod->init_layout.base + info->init_typeoffs;

[ 372.837224] =============================
[ 372.837224] WARNING: suspicious RCU usage
[ 372.837225] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.837227] -----------------------------
[ 372.837227] kernel/module/kallsyms.c:183 suspicious rcu_dereference_check() usage!
[ 372.837229]
other info that might help us debug this:

[ 372.837230]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.837231] no locks held by modprobe/1760.
[ 372.837232]
stack backtrace:
[ 372.837233] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.837235] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.837236] Call Trace:
[ 372.837237] <TASK>
[ 372.837239] dump_stack_lvl+0x58/0x71
[ 372.837243] add_kallsyms+0x4f3/0x5c0
[ 372.837251] load_module+0x107c/0x19c0
[ 372.849013] ? kernel_read_file+0x2a3/0x2d0
[ 372.849026] ? __do_sys_finit_module+0xaf/0x120
[ 372.849930] __do_sys_finit_module+0xaf/0x120
[ 372.849946] do_syscall_64+0x37/0x80
[ 372.850772] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.850775] RIP: 0033:0x7f13f53992a9
[ 372.850778] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.850780] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.854028] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.854030] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.854031] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.854033] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.854034] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.854048] </TASK>

> +
> + /*
> + * Now populate the cut down core kallsyms for after init
> + * and set types up while we still have access to sections.
> + */
> + mod->core_kallsyms.symtab = dst = mod->core_layout.base + info->symoffs;
> + mod->core_kallsyms.strtab = s = mod->core_layout.base + info->stroffs;
> + mod->core_kallsyms.typetab = mod->core_layout.base + info->core_typeoffs;
> + src = rcu_dereference_sched(mod->kallsyms)->symtab;

[ 372.854081] =============================
[ 372.854083] WARNING: suspicious RCU usage
[ 372.854084] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.854087] -----------------------------
[ 372.854089] kernel/module/kallsyms.c:193 suspicious rcu_dereference_check() usage!
[ 372.854091]
other info that might help us debug this:

[ 372.854093]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.854095] no locks held by modprobe/1760.
[ 372.854097]
stack backtrace:
[ 372.854099] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.854102] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.854104] Call Trace:
[ 372.854106] <TASK>
[ 372.854109] dump_stack_lvl+0x58/0x71
[ 372.854126] add_kallsyms+0x4b5/0x5c0
[ 372.854139] load_module+0x107c/0x19c0
[ 372.866967] ? kernel_read_file+0x2a3/0x2d0
[ 372.866980] ? __do_sys_finit_module+0xaf/0x120
[ 372.867921] __do_sys_finit_module+0xaf/0x120
[ 372.867937] do_syscall_64+0x37/0x80
[ 372.868823] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.868826] RIP: 0033:0x7f13f53992a9
[ 372.868828] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.868830] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.871419] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.871420] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.871422] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.871423] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.871424] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.871438] </TASK>

> + for (ndst = i = 0; i < rcu_dereference_sched(mod->kallsyms)->num_symtab; i++) {

[ 372.871464] =============================
[ 372.871466] WARNING: suspicious RCU usage
[ 372.871467] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.871470] -----------------------------
[ 372.871471] kernel/module/kallsyms.c:194 suspicious rcu_dereference_check() usage!
[ 372.878748]
other info that might help us debug this:

[ 372.878749]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.878751] no locks held by modprobe/1760.
[ 372.878752]
stack backtrace:
[ 372.878753] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.878756] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.878757] Call Trace:
[ 372.878758] <TASK>
[ 372.878760] dump_stack_lvl+0x58/0x71
[ 372.878765] add_kallsyms+0x296/0x5c0
[ 372.878774] load_module+0x107c/0x19c0
[ 372.878785] ? kernel_read_file+0x2a3/0x2d0
[ 372.878797] ? __do_sys_finit_module+0xaf/0x120
[ 372.878800] __do_sys_finit_module+0xaf/0x120
[ 372.878815] do_syscall_64+0x37/0x80
[ 372.886420] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.886423] RIP: 0033:0x7f13f53992a9
[ 372.886425] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.886427] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.886429] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.886431] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.886432] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.886433] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.886435] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.886448] </TASK>

> + rcu_dereference_sched(mod->kallsyms)->typetab[i] = elf_type(src + i, info);

[ 372.886474] =============================
[ 372.886476] WARNING: suspicious RCU usage
[ 372.886477] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.886480] -----------------------------
[ 372.886481] kernel/module/kallsyms.c:195 suspicious rcu_dereference_check() usage!
[ 372.886484]
other info that might help us debug this:

[ 372.886485]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.886487] no locks held by modprobe/1760.
[ 372.886489]
stack backtrace:
[ 372.886491] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.886494] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.900968] Call Trace:
[ 372.900970] <TASK>
[ 372.900972] dump_stack_lvl+0x58/0x71
[ 372.900977] add_kallsyms+0x3c1/0x5c0
[ 372.900986] load_module+0x107c/0x19c0
[ 372.900997] ? kernel_read_file+0x2a3/0x2d0
[ 372.901009] ? __do_sys_finit_module+0xaf/0x120
[ 372.901012] __do_sys_finit_module+0xaf/0x120
[ 372.901027] do_syscall_64+0x37/0x80
[ 372.904379] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.904382] RIP: 0033:0x7f13f53992a9
[ 372.904384] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.904386] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.904389] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.904390] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.904391] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.904392] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.904394] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.904407] </TASK>

> + if (i == 0 || is_livepatch_module(mod) ||
> + is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum,
> + info->index.pcpu)) {
> + mod->core_kallsyms.typetab[ndst] =
> + rcu_dereference_sched(mod->kallsyms)->typetab[i];

[ 372.904436] =============================
[ 372.904438] WARNING: suspicious RCU usage
[ 372.904440] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.904442] -----------------------------
[ 372.904444] kernel/module/kallsyms.c:200 suspicious rcu_dereference_check() usage!
[ 372.904446]
other info that might help us debug this:

[ 372.904448]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.904450] no locks held by modprobe/1760.
[ 372.904452]
stack backtrace:
[ 372.904454] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.904457] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.904459] Call Trace:
[ 372.904461] <TASK>
[ 372.904464] dump_stack_lvl+0x58/0x71
[ 372.904470] add_kallsyms+0x439/0x5c0
[ 372.904485] load_module+0x107c/0x19c0
[ 372.904504] ? kernel_read_file+0x2a3/0x2d0
[ 372.921165] ? __do_sys_finit_module+0xaf/0x120
[ 372.921171] __do_sys_finit_module+0xaf/0x120
[ 372.921187] do_syscall_64+0x37/0x80
[ 372.922455] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.922458] RIP: 0033:0x7f13f53992a9
[ 372.922461] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.922463] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.922466] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.922467] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.922469] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.922470] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.922472] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.922485] </TASK>

> + dst[ndst] = src[i];
> + dst[ndst++].st_name = s - mod->core_kallsyms.strtab;
> + s += strscpy(s,
> + &rcu_dereference_sched(mod->kallsyms)->strtab[src[i].st_name],

[ 372.929324] =============================
[ 372.929325] WARNING: suspicious RCU usage
[ 372.929327] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 372.929330] -----------------------------
[ 372.929331] kernel/module/kallsyms.c:204 suspicious rcu_dereference_check() usage!
[ 372.929334]
other info that might help us debug this:

[ 372.929335]
rcu_scheduler_active = 2, debug_locks = 1
[ 372.929338] no locks held by modprobe/1760.
[ 372.929340]
stack backtrace:
[ 372.929342] CPU: 3 PID: 1760 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 372.929345] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 372.929347] Call Trace:
[ 372.929349] <TASK>
[ 372.929352] dump_stack_lvl+0x58/0x71
[ 372.929360] add_kallsyms+0x3fb/0x5c0
[ 372.929374] load_module+0x107c/0x19c0
[ 372.929392] ? kernel_read_file+0x2a3/0x2d0
[ 372.939163] ? __do_sys_finit_module+0xaf/0x120
[ 372.939167] __do_sys_finit_module+0xaf/0x120
[ 372.939182] do_syscall_64+0x37/0x80
[ 372.939186] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 372.939188] RIP: 0033:0x7f13f53992a9
[ 372.939190] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 372.939192] RSP: 002b:00007ffca746bf08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 372.939195] RAX: ffffffffffffffda RBX: 000055bc9b8b8880 RCX: 00007f13f53992a9
[ 372.939196] RDX: 0000000000000000 RSI: 000055bc99c31688 RDI: 0000000000000005
[ 372.939197] RBP: 000055bc99c31688 R08: 0000000000000000 R09: 000055bc9b8b8410
[ 372.939199] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 372.939200] R13: 000055bc9b8b87a0 R14: 0000000000000000 R15: 000055bc9b8b8880
[ 372.939213] </TASK>

> + KSYM_NAME_LEN) + 1;
> + }
> + }
> + mod->core_kallsyms.num_symtab = ndst;
> +}

[...]

> +#ifdef CONFIG_LIVEPATCH
> +int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
> + struct module *, unsigned long),
> + void *data)
> +{
> + struct module *mod;
> + unsigned int i;
> + int ret = 0;
> +
> + mutex_lock(&module_mutex);
> + list_for_each_entry(mod, &modules, list) {
> + /* Still use rcu_dereference_sched to remain compliant with sparse */
> + struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);

I got the following warning when running livepatch selftest:

[ 403.430393] ===== TEST: multiple target modules =====
[ 403.452359] % modprobe test_klp_callbacks_busy block_transition=N
[ 403.458735] test_klp_callbacks_busy: test_klp_callbacks_busy_init
[ 403.459544] test_klp_callbacks_busy: busymod_work_func enter
[ 403.460274] test_klp_callbacks_busy: busymod_work_func exit
[ 403.476999] % modprobe test_klp_callbacks_demo

[ 403.483742] =============================
[ 403.484446] WARNING: suspicious RCU usage
[ 403.485158] 5.17.0-rc5-default+ #335 Tainted: G E K
[ 403.486490] -----------------------------
[ 403.486496] kernel/module/kallsyms.c:486 suspicious rcu_dereference_check() usage!
[ 403.486499]
other info that might help us debug this:

[ 403.486500]
rcu_scheduler_active = 2, debug_locks = 1
[ 403.486502] 2 locks held by modprobe/2479:
[ 403.486504] #0: ffffffff94c4f770 (klp_mutex){+.+.}-{3:3}, at: klp_enable_patch.part.12+0x24/0x910
[ 403.486517] #1: ffffffff94c50a50 (module_mutex){+.+.}-{3:3}, at: module_kallsyms_on_each_symbol+0x27/0x110
[ 403.486527]
stack backtrace:
[ 403.486529] CPU: 3 PID: 2479 Comm: modprobe Tainted: G E K 5.17.0-rc5-default+ #335
[ 403.486532] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 403.486535] Call Trace:
[ 403.486536] <TASK>
[ 403.486539] dump_stack_lvl+0x58/0x71
[ 403.486546] module_kallsyms_on_each_symbol+0x101/0x110
[ 403.486549] ? kobject_add_internal+0x1ca/0x2c0
[ 403.501245] klp_find_object_symbol+0x5f/0x110
[ 403.501255] klp_init_object_loaded+0xca/0x140
[ 403.501261] klp_enable_patch.part.12+0x5b6/0x910
[ 403.501266] ? pre_patch_callback+0x20/0x20 [test_klp_callbacks_demo]
[ 403.501271] ? pre_patch_callback+0x20/0x20 [test_klp_callbacks_demo]
[ 403.501276] do_one_initcall+0x58/0x300
[ 403.501286] do_init_module+0x4b/0x1f1
[ 403.501291] load_module+0x1862/0x19c0
[ 403.506243] ? __do_sys_finit_module+0xaf/0x120
[ 403.506247] __do_sys_finit_module+0xaf/0x120
[ 403.506261] do_syscall_64+0x37/0x80
[ 403.506264] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 403.506267] RIP: 0033:0x7f8e5f5f12a9
[ 403.506270] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bf 0b 2c 00 f7 d8 64 89 01 48
[ 403.510723] RSP: 002b:00007ffc725cfe48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 403.510727] RAX: ffffffffffffffda RBX: 000055ddd32938d0 RCX: 00007f8e5f5f12a9
[ 403.510729] RDX: 0000000000000000 RSI: 000055ddd2231688 RDI: 0000000000000005
[ 403.510731] RBP: 000055ddd2231688 R08: 0000000000000000 R09: 000055ddd3293410
[ 403.510733] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000040000
[ 403.510734] R13: 000055ddd32937a0 R14: 0000000000000000 R15: 000055ddd32938d0
[ 403.510750] </TASK>

> +
> + if (mod->state == MODULE_STATE_UNFORMED)
> + continue;
> + for (i = 0; i < kallsyms->num_symtab; i++) {
> + const Elf_Sym *sym = &kallsyms->symtab[i];
> +
> + if (sym->st_shndx == SHN_UNDEF)
> + continue;
> +
> + ret = fn(data, kallsyms_symbol_name(kallsyms, i),
> + mod, kallsyms_symbol_value(sym));
> + if (ret != 0)
> + goto out;
> + }
> + }
> +out:
> + mutex_unlock(&module_mutex);
> + return ret;
> +}
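
The last splat above is raised even though lockdep lists module_mutex as held. A minimal userspace sketch of why, under loud assumptions: every mock_* name below is hypothetical, and the real rcu_dereference_sched() check is reduced here to a single "preemption disabled" flag. It only illustrates that holding a mutex does not satisfy a check for a sched-RCU read-side section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; this is not the kernel's implementation. */
static bool mutex_held;        /* models module_mutex being held */
static bool preempt_disabled;  /* models being inside rcu_read_lock_sched() */
static int warnings;           /* models the number of lockdep splats */

struct mock_kallsyms { unsigned int num_symtab; };
struct mock_module { struct mock_kallsyms *kallsyms; };

static void mock_mutex_lock(void) { mutex_held = true; }
static void mock_mutex_unlock(void) { assert(mutex_held); mutex_held = false; }

/*
 * Models rcu_dereference_sched(): only the sched-RCU read-side state is
 * checked. A held mutex is reported but does not silence the warning.
 */
static struct mock_kallsyms *mock_rcu_dereference_sched(struct mock_module *mod)
{
	if (!preempt_disabled) {
		warnings++;
		fprintf(stderr, "WARNING: suspicious RCU usage (mutex held: %d)\n",
			mutex_held);
	}
	return mod->kallsyms;
}

/* Same shape as the v8 code path: mutex taken, no read-side section. */
static unsigned int count_symbols_mutex_only(struct mock_module *mod)
{
	unsigned int n;

	mock_mutex_lock();
	n = mock_rcu_dereference_sched(mod)->num_symtab;
	mock_mutex_unlock();
	return n;
}
```

Calling count_symbols_mutex_only() on a populated mock_module returns the right count but bumps the warnings counter, which mirrors the report: the access itself is fine, the checker is simply looking for a different kind of protection.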

2022-02-25 13:47:26

by Christophe Leroy

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file



Le 25/02/2022 à 10:15, Petr Mladek a écrit :
> On Tue 2022-02-22 14:12:59, Aaron Tomlin wrote:
>> No functional change.
>
> The patch adds rcu_dereference_sched() into several locations.
> It triggers lockdep warnings, see below.
>
> It is a good example of why to avoid any hidden changes when
> shuffling code. The changes in the code should be done in a
> preparatory patch or not at all.
>
> This patch is even worse because these changes were not
> mentioned in the commit message. It should describe what
> is done and why.
>
> I wonder how many other changes are hidden in this patchset
> and if anyone really checked them.

That's probably my fault. When I reviewed v5 of the series, I
mentioned all checkpatch and sparse reports and asked Aaron to make his
series free of such warnings. Most warnings were related to style
(parenthesis alignment, blank lines, spaces, etc.) or to erroneous
casting.

But for that particular patch we had:

kernel/module/kallsyms.c:174:23: warning: incorrect type in assignment
(different address spaces)
kernel/module/kallsyms.c:174:23: expected struct mod_kallsyms
[noderef] __rcu *kallsyms
kernel/module/kallsyms.c:174:23: got void *
kernel/module/kallsyms.c:176:12: warning: dereference of noderef expression
kernel/module/kallsyms.c:177:12: warning: dereference of noderef expression
kernel/module/kallsyms.c:179:12: warning: dereference of noderef expression
kernel/module/kallsyms.c:180:12: warning: dereference of noderef expression
kernel/module/kallsyms.c:189:18: warning: dereference of noderef expression
kernel/module/kallsyms.c:190:35: warning: dereference of noderef expression
kernel/module/kallsyms.c:191:20: warning: dereference of noderef expression
kernel/module/kallsyms.c:196:32: warning: dereference of noderef expression
kernel/module/kallsyms.c:199:45: warning: dereference of noderef expression

Aaron used rcu_dereference_sched() in order to fix that.

How should this be fixed if using rcu_dereference_sched() is not correct?

Thanks
Christophe

2022-02-25 13:58:26

by Aaron Tomlin

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file

On Fri 2022-02-25 10:27 +0000, Aaron Tomlin wrote:
> On Fri 2022-02-25 11:15 +0100, Petr Mladek wrote:
> > rcu_dereference_sched() makes sparse happy. But lockdep complains
> > because the _rcu pointer is not accessed under:
> >
> > rcu_read_lock_sched();
> > rcu_read_unlock_sched();
>
> Hi Petr,
>
> >
> > This is not the case here. Note that module_mutex does not
> > disable preemption.
> >
> > Now, the code is safe. The RCU access makes sure that "mod"
> > can't be freed in the meantime:
> >
> > + add_kallsyms() is called by the module loader when the module
> > is being loaded. It could not get removed in parallel
> > by definition.
> >
> > + module_kallsyms_on_each_symbol() takes module_mutex.
> > It means that the module could not get removed.
>
> Indeed, which is why I did not use rcu_read_lock_sched() and
> rcu_read_unlock_sched() with rcu_dereference_sched(). That being said, I
> should have mentioned this in the commit message.
>
> > IMHO, we have two possibilities here:
> >
> > + Make sparse and lockdep happy by using rcu_dereference_sched()
> > and calling the code under rcu_read_lock_sched().
> >
> > + Cast (struct mod_kallsyms *)mod->kallsyms when accessing
> > the value.
>
> I prefer the first option.
>
> > I do not have strong preference. I am fine with both.
> >
> > Anyway, such a fix should be done in a separate patch!
>
> Agreed.

Luis,

If I understand correctly, it might be cleaner to resolve the above in two
separate patches for a v9 i.e. a) address the sparse and lockdep feedback
and b) refactor the code, before the latest version [1] is merged into
module-next. I assume the previous iteration will be reverted first?

Please let me know your thoughts

[1]: https://lore.kernel.org/all/[email protected]/


Kind regards,

--
Aaron Tomlin
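
The two possibilities quoted above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the code that went into v9: every mock_* name is hypothetical, the read-side section is modelled as a nesting counter, and the __rcu-annotated pointer is modelled as a plain void pointer.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the kernel primitives named above. */
static int sched_rcu_depth;  /* models rcu_read_lock_sched() nesting */
static int warnings;         /* models the number of lockdep splats */

struct mock_kallsyms { unsigned int num_symtab; };
struct mock_module { void *kallsyms; /* stands in for the __rcu pointer */ };

static void mock_rcu_read_lock_sched(void) { sched_rcu_depth++; }

static void mock_rcu_read_unlock_sched(void)
{
	assert(sched_rcu_depth > 0);
	sched_rcu_depth--;
}

/* Models rcu_dereference_sched(): warn when used outside a read-side section. */
static struct mock_kallsyms *mock_rcu_dereference_sched(struct mock_module *mod)
{
	if (sched_rcu_depth == 0)
		warnings++;
	return mod->kallsyms;
}

/* Option 1: keep the accessor and wrap it in a read-side section. */
static unsigned int count_with_read_lock(struct mock_module *mod)
{
	unsigned int n;

	mock_rcu_read_lock_sched();
	n = mock_rcu_dereference_sched(mod)->num_symtab;
	mock_rcu_read_unlock_sched();
	return n;
}

/* Option 2: plain cast; no accessor is called, so no check runs. */
static unsigned int count_with_cast(struct mock_module *mod)
{
	return ((struct mock_kallsyms *)mod->kallsyms)->num_symtab;
}
```

Both helpers return the same count without raising a warning. The difference is in what they document: the first keeps the checker involved, while the second relies entirely on the caller's argument that the module cannot go away.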

2022-02-26 20:34:03

by Luis Chamberlain

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file

On Fri, Feb 25, 2022 at 12:57:34PM +0000, Christophe Leroy wrote:
>
>
> Le 25/02/2022 à 13:21, Aaron Tomlin a écrit :
> > On Fri 2022-02-25 10:27 +0000, Aaron Tomlin wrote:
> >> On Fri 2022-02-25 11:15 +0100, Petr Mladek wrote:
> >>> rcu_dereference_sched() makes sparse happy. But lockdep complains
> >>> because the _rcu pointer is not accessed under:
> >>>
> >>> rcu_read_lock_sched();
> >>> rcu_read_unlock_sched();
> >>
> >> Hi Petr,
> >>
> >>>
> >>> This is not the case here. Note that module_mutex does not
> >>> disable preemption.
> >>>
> >>> Now, the code is safe. The RCU access makes sure that "mod"
> >>> can't be freed in the meantime:
> >>>
> >>> + add_kallsyms() is called by the module loader when the module
> >>> is being loaded. It could not get removed in parallel
> >>> by definition.
> >>>
> >>> + module_kallsyms_on_each_symbol() takes module_mutex.
> >>> It means that the module could not get removed.
> >>
> >> Indeed, which is why I did not use rcu_read_lock_sched() and
> >> rcu_read_unlock_sched() with rcu_dereference_sched(). That being said, I
> >> should have mentioned this in the commit message.
> >>
> >>> IMHO, we have two possibilities here:
> >>>
> >>> + Make sparse and lockdep happy by using rcu_dereference_sched()
> >>> and calling the code under rcu_read_lock_sched().
> >>>
> >>> + Cast (struct mod_kallsyms *)mod->kallsyms when accessing
> >>> the value.
> >>
> >> I prefer the first option.
> >>
> >>> I do not have strong preference. I am fine with both.
> >>>
> >>> Anyway, such a fix should be done in a separate patch!
> >>
> >> Agreed.
> >
> > Luis,
> >
> > If I understand correctly, it might be cleaner to resolve the above in two
> > separate patches for a v9 i.e. a) address the sparse and lockdep feedback
> > and b) refactor the code, before the latest version [1] is merged into
> > module-next. I assume the previous iteration will be reverted first?
> >
> > Please let me know your thoughts
> >
> > [1]: https://lore.kernel.org/all/[email protected]/
> >
>
> I would do it the other way: first move the code into a separate file,
> and then handle the sparse __rcu feedback as a followup patch to the series.

I want to avoid any regressions and new complaints, fixes should be
submitted before so that if they are applicable to stable / etc they
can be sent there.

> Regarding module-next, AFAICS at the moment we still have only the 10
> first patches of v6 in the tree. I guess the way forward will be to
> rebase module-next and drop those patches and commit v9 instead.

Right, I'll just git fetch and reset to Linus' latest tree, so I'll drop
all of the stuff there now. Then the hope is to apply your fresh,
clean v9.

Thanks for chugging on with this series!

Luis

2022-02-28 10:51:27

by Christophe Leroy

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file



Le 26/02/2022 à 21:27, Luis Chamberlain a écrit :
> On Fri, Feb 25, 2022 at 12:57:34PM +0000, Christophe Leroy wrote:
>>
>>
>> Le 25/02/2022 à 13:21, Aaron Tomlin a écrit :
>>> On Fri 2022-02-25 10:27 +0000, Aaron Tomlin wrote:
>>>> On Fri 2022-02-25 11:15 +0100, Petr Mladek wrote:
>>>>> rcu_dereference_sched() makes sparse happy. But lockdep complains
>>>>> because the _rcu pointer is not accessed under:
>>>>>
>>>>> rcu_read_lock_sched();
>>>>> rcu_read_unlock_sched();
>>>>
>>>> Hi Petr,
>>>>
>>>>>
>>>>> This is not the case here. Note that module_mutex does not
>>>>> disable preemption.
>>>>>
>>>>> Now, the code is safe. The RCU access makes sure that "mod"
>>>>> can't be freed in the meantime:
>>>>>
>>>>> + add_kallsyms() is called by the module loader when the module
>>>>> is being loaded. It could not get removed in parallel
>>>>> by definition.
>>>>>
>>>>> + module_kallsyms_on_each_symbol() takes module_mutex.
>>>>> It means that the module could not get removed.
>>>>
>>>> Indeed, which is why I did not use rcu_read_lock_sched() and
>>>> rcu_read_unlock_sched() with rcu_dereference_sched(). That being said, I
>>>> should have mentioned this in the commit message.
>>>>
>>>>> IMHO, we have two possibilities here:
>>>>>
>>>>> + Make sparse and lockdep happy by using rcu_dereference_sched()
>>>>> and calling the code under rcu_read_lock_sched().
>>>>>
>>>>> + Cast (struct mod_kallsyms *)mod->kallsyms when accessing
>>>>> the value.
>>>>
>>>> I prefer the first option.
>>>>
>>>>> I do not have strong preference. I am fine with both.
>>>>>
>>>>> Anyway, such a fix should be done in a separate patch!
>>>>
>>>> Agreed.
>>>
>>> Luis,
>>>
>>> If I understand correctly, it might be cleaner to resolve the above in two
>>> separate patches for a v9 i.e. a) address the sparse and lockdep feedback
>>> and b) refactor the code, before the latest version [1] is merged into
>>> module-next. I assume the previous iteration will be reverted first?
>>>
>>> Please let me know your thoughts
>>>
>>> [1]: https://lore.kernel.org/all/[email protected]/
>>>
>>
>> I would do it the other way: first move the code into a separate file,
>> and then handle the sparse __rcu feedback as a followup patch to the series.
>
> I want to avoid any regressions and new complaints, fixes should be
> submitted before so that if they are applicable to stable / etc they
> can be sent there.

Fair enough. However, here we are talking about a sparse warning only, and
the discussion around it has shown that this is not a real bug, just a
warning that can be fixed either with a proper cast or by adding RCU
locks that might not be necessary.

So I'm not sure this is a good candidate for -stable.

In
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
it is said "It must fix a real bug that bothers people (not a, “This
could be a problem…” type thing)"

But up to you.

>
>> Regarding module-next, AFAICS at the moment we still have only the 10
>> first patches of v6 in the tree. I guess the way forward will be to
>> rebase module-next and drop those patches and commit v9 instead.
>
> Right, I'll just git fetch and reset to Linus' latest tree, so I'll drop
> all of the stuff there now. Then the hope is to apply your fresh,
> clean v9.
>

Aaron, do you plan to send v9 anytime soon ?

Thanks
Christophe

2022-02-28 10:51:50

by Aaron Tomlin

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file

On Mon 2022-02-28 09:02 +0000, Christophe Leroy wrote:
> Aaron, do you plan to send v9 anytime soon ?

Hi Christophe,

Yes, today.

As discussed previously, I will resolve the Sparse warnings, in the context
of Kconfig CONFIG_KALLSYMS, with an appropriate statement in the commit
message, as a preliminary patch to the series. That being said, I believe
it makes sense to include the aforementioned patch within the series.
Any objections?


Kind regards,

--
Aaron Tomlin

2022-02-28 14:09:35

by Christophe Leroy

Subject: Re: [PATCH v8 09/13] module: Move kallsyms support into a separate file



Le 28/02/2022 à 10:31, Aaron Tomlin a écrit :
> On Mon 2022-02-28 09:02 +0000, Christophe Leroy wrote:
>> Aaron, do you plan to send v9 anytime soon ?
>
> Hi Christophe,
>
> Yes, today.
>
> As discussed previously, I will resolve the Sparse warnings, in the context
> of Kconfig CONFIG_KALLSYMS, with an appropriate statement in the commit
> message, as a preliminary patch to the series. That being said, I believe
> it makes sense to include the aforementioned patch within the series.
> Any objections?
>

No objection.

Thank you
Christophe