2024-05-17 04:29:48

Subject: [PATCH v3 0/6] Generate address range data for built-in modules

Especially for tracing applications, it is convenient to be able to
refer to a symbol using a <module name, symbol name> pair and to be able
to translate an address into a <module name, symbol name> pair. But
that does not work if the module is built into the kernel because the
object files that comprise the built-in module implementation are simply
linked into the kernel image along with all other kernel object files.

This is especially visible when providing tracing scripts for support
purposes, where the developer of the script targets a particular kernel
version, but does not have control over whether the target system has
a particular module as a loadable module or as a built-in module. When
tracing symbols within a module, referring to them by <module name,
symbol name> pairs is both convenient and aids symbol lookup. But that
naming will not work when the module is built into the kernel on the
target system, because the module name information is lost.

Earlier work addressing this loss of information for built-in modules
involved adding module name information to the kallsyms data, but that
required more invasive code in the kernel proper. This work never did
get merged into the kernel tree.

All that is really needed is knowing whether a given address belongs to
a particular module (or multiple modules if they share an object file).
Or in other words, whether that address falls within an address range
that is associated with one or more modules.
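
A minimal sketch of that membership test, in Python, may help make the
idea concrete. It assumes a table of offset ranges that is sorted by start
offset, does not overlap, and treats the end offset as exclusive. The
table contents (borrowed from the sample output in patch 4/6) and the
name modules_at() are illustrative only and not part of the series:

```python
import bisect

# Hypothetical, sorted, non-overlapping [start, end) offset ranges.
RANGES = [
    (0x0000baf0, 0x0000cb10, ("amd_uncore",)),
    (0x0009bd10, 0x0009c8e0, ("iosf_mbi",)),
    (0x008e9630, 0x008ea610,
     ("snd_soc_wcd9335", "snd_soc_wcd934x", "snd_soc_wcd938x")),
]
STARTS = [r[0] for r in RANGES]


def modules_at(offset):
    """Return the modules whose range contains offset, or () if none."""
    # Find the last range that starts at or before the offset.
    i = bisect.bisect_right(STARTS, offset) - 1
    if i < 0:
        return ()
    start, end, mods = RANGES[i]
    return mods if start <= offset < end else ()
```

A range that lists several modules covers an object file shared by all of
them, so the lookup returns every name on that line.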

Objects can be identified as belonging to a particular module (or
modules) based on defines that are passed as flags to their respective
compilation commands. The data found in modules.builtin.modinfo is
used to determine what modules are built into the kernel proper. Then,
vmlinux.o.map and vmlinux.map can be parsed in a single pass to generate
a modules.builtin.ranges file with offset range information (relative to
the base address of the associated section) for built-in modules. This
file gets installed along with the other modules.builtin.* files.

The impact on the kernel build is minimal because everything is done
using a single-pass AWK script. The generated data is small as well:
depending on the exact kernel configuration, it is usually 500-700 lines,
with a file size of 20-40KB (if all modules are built in, the file
contains about 8000 lines, with a file size of about 285KB).

Changes since v2:
- Switched from using modules.builtin.objs to parsing .*.cmd files
- Add explicit dependency on FTRACE for CONFIG_BUILTIN_MODULE_RANGES
- 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo
- Parse data from .*.cmd in generate_builtin_ranges.awk
- Use $(real-prereqs) rather than $(filter-out ...)
- Include modules.builtin.ranges in modules install target

Changes since v1:
- Renamed CONFIG_BUILTIN_RANGES to CONFIG_BUILTIN_MODULE_RANGES
- Moved the config option to the tracers section
- 2nd arg to generate_builtin_ranges.awk should be vmlinux.map

Kris Van Hees (5):
trace: add CONFIG_BUILTIN_MODULE_RANGES option
kbuild: generate a linker map for vmlinux.o
module: script to generate offset ranges for builtin modules
kbuild: generate modules.builtin.ranges when linking the kernel
module: add install target for modules.builtin.ranges

Luis Chamberlain (1):
kbuild: add modules.builtin.objs

.gitignore | 2 +-
Documentation/dontdiff | 2 +-
Documentation/kbuild/kbuild.rst | 5 ++
Makefile | 8 +-
include/linux/module.h | 4 +-
kernel/trace/Kconfig | 17 ++++
scripts/Makefile.lib | 5 +-
scripts/Makefile.modinst | 11 ++-
scripts/Makefile.vmlinux | 17 ++++
scripts/Makefile.vmlinux_o | 18 ++++-
scripts/generate_builtin_ranges.awk | 149 ++++++++++++++++++++++++++++++++++++
11 files changed, 228 insertions(+), 10 deletions(-)
create mode 100755 scripts/generate_builtin_ranges.awk

base-commit: dd5a440a31fae6e459c0d6271dddd62825505361
--
2.42.0

2024-05-17 04:30:09

by Kris Van Hees

Subject: [PATCH v3 1/6] kbuild: add mod(name,file)_flags to assembler flags for module objects

Module objects compiled from C source can be identified by the presence
of -DKBUILD_MODFILE and -DKBUILD_MODNAME on their compile command lines.
However, module objects from assembler source do not have these defines.

Add $(modfile_flags) to modkern_aflags (similar to modkern_cflags), and
add $(modname_flags) to a_flags (similar to c_flags).
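
As an illustration of how a tool can recognize module objects from those
defines, here is a small Python sketch. The command line is made up for
the example, and the exact quoting of the values is an assumption. The
helper module_defines() is hypothetical and is not the AWK code from this
series:

```python
import re


def module_defines(cmdline):
    """Extract KBUILD_MODNAME and KBUILD_MODFILE values from a command line."""
    found = {}
    for key in ("KBUILD_MODNAME", "KBUILD_MODFILE"):
        m = re.search(r"-D" + key + r"=(\S+)", cmdline)
        if m:
            # Strip the shell and C string quoting around the value.
            found[key] = m.group(1).replace("'", "").replace('"', "")
    return found


# Hypothetical assembler command line for an object that is part of a module.
cmd = ("gcc -D__ASSEMBLY__ "
       "-DKBUILD_MODFILE='\"arch/x86/crypto/aesni-intel\"' "
       "-DKBUILD_MODNAME='\"aesni_intel\"' "
       "-c -o aesni-intel_asm.o aesni-intel_asm.S")

info = module_defines(cmd)
```

A command line without either define yields an empty dictionary, which is
the situation for assembler objects before this patch.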

Signed-off-by: Kris Van Hees <[email protected]>
---
scripts/Makefile.lib | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 3179747cbd2cc..a2524ffd046f4 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -234,7 +234,7 @@ modkern_rustflags = \

modkern_aflags = $(if $(part-of-module), \
$(KBUILD_AFLAGS_MODULE) $(AFLAGS_MODULE), \
- $(KBUILD_AFLAGS_KERNEL) $(AFLAGS_KERNEL))
+ $(KBUILD_AFLAGS_KERNEL) $(AFLAGS_KERNEL) $(modfile_flags))

c_flags = -Wp,-MMD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
-include $(srctree)/include/linux/compiler_types.h \
@@ -244,7 +244,7 @@ c_flags = -Wp,-MMD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
rust_flags = $(_rust_flags) $(modkern_rustflags) @$(objtree)/include/generated/rustc_cfg

a_flags = -Wp,-MMD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
- $(_a_flags) $(modkern_aflags)
+ $(_a_flags) $(modkern_aflags) $(modname_flags)

cpp_flags = -Wp,-MMD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
$(_cpp_flags)
--
2.43.0

2024-05-17 04:30:45

by Kris Van Hees

Subject: [PATCH v3 2/6] trace: add CONFIG_BUILTIN_MODULE_RANGES option

The CONFIG_BUILTIN_MODULE_RANGES option controls whether offset range data
is generated for kernel modules that are built into the kernel image.

Signed-off-by: Kris Van Hees <[email protected]>
Reviewed-by: Nick Alcock <[email protected]>
Reviewed-by: Alan Maguire <[email protected]>
---
Changes since v2:
- Add explicit dependency on FTRACE for CONFIG_BUILTIN_MODULE_RANGES
---
kernel/trace/Kconfig | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 47345bf1d4a9f..d0c82b4b3a61e 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -188,6 +188,24 @@ menuconfig FTRACE

if FTRACE

+config BUILTIN_MODULE_RANGES
+ bool "Generate address range information for builtin modules"
+ depends on FTRACE
+ select VMLINUX_MAP
+ help
+ When modules are built into the kernel, there will be no module name
+ associated with its symbols in /proc/kallsyms. Tracers may want to
+ identify symbols by module name and symbol name regardless of whether
+ the module is configured as loadable or not.
+
+ This option generates modules.builtin.ranges in the build tree with
+ offset ranges (per ELF section) for the module(s) they belong to.
+ It also records an anchor symbol to determine the load address of the
+ section.
+
+ It is fully compatible with CONFIG_RANDOMIZE_BASE and similar late-
+ address-modification options.
+
config BOOTTIME_TRACING
bool "Boot-time Tracing support"
depends on TRACING
--
2.43.0

2024-05-17 04:31:19

by Kris Van Hees

Subject: [PATCH v3 3/6] kbuild: generate a linker map for vmlinux.o

When CONFIG_BUILTIN_MODULE_RANGES is set, a linker map for vmlinux.o needs
to be generated. The generation of offset range data for builtin modules
depends on that linker map to know what offsets in an ELF section belong
to an object file for a particular builtin module.

Signed-off-by: Kris Van Hees <[email protected]>
Reviewed-by: Nick Alcock <[email protected]>
---
scripts/Makefile.vmlinux_o | 3 +++
1 file changed, 3 insertions(+)

diff --git a/scripts/Makefile.vmlinux_o b/scripts/Makefile.vmlinux_o
index 6de297916ce68..252505505e0e3 100644
--- a/scripts/Makefile.vmlinux_o
+++ b/scripts/Makefile.vmlinux_o
@@ -45,9 +45,12 @@ objtool-args = $(vmlinux-objtool-args-y) --link
# Link of vmlinux.o used for section mismatch analysis
# ---------------------------------------------------------------------------

+vmlinux-o-ld-args-$(CONFIG_BUILTIN_MODULE_RANGES) += [email protected]
+
quiet_cmd_ld_vmlinux.o = LD $@
cmd_ld_vmlinux.o = \
$(LD) ${KBUILD_LDFLAGS} -r -o $@ \
+ $(vmlinux-o-ld-args-y) \
$(addprefix -T , $(initcalls-lds)) \
--whole-archive vmlinux.a --no-whole-archive \
--start-group $(KBUILD_VMLINUX_LIBS) --end-group \
--
2.43.0

2024-05-17 04:31:52

by Kris Van Hees

Subject: [PATCH v3 4/6] module: script to generate offset ranges for builtin modules

The offset range data for builtin modules is generated using:
- modules.builtin.modinfo: associates object files with module names
- vmlinux.map: provides load order of sections and offset of first member
per section
- vmlinux.o.map: provides offset of object file content per section
- .*.cmd: build cmd file with KBUILD_MODFILE and KBUILD_MODNAME

The generated data will look like:

.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 0009bd10-0009c8e0 iosf_mbi
...
.text 008e6660-008e9630 snd_soc_wcd_mbhc
.text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
.text 008ea610-008ea780 snd_soc_wcd9335
...
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore

For each ELF section, it lists the offset of the first symbol. This can
be used to determine the base address of the section at runtime.

Next, it lists (in strict ascending order) offset ranges in that section
that cover the symbols of one or more builtin modules. Multiple ranges
can apply to a single module, and ranges can be shared between modules.
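
To show how a consumer might use this data, the following Python sketch
parses lines in the format above and maps a runtime address to module
names. It is an illustration under stated assumptions, not code from this
series: section names are written with their leading dot, range ends are
treated as exclusive, and the runtime address of the anchor symbol is
supplied by the caller (for example from /proc/kallsyms). The names
parse_ranges() and lookup() are hypothetical.

```python
SAMPLE = """\
.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore
"""


def parse_ranges(text):
    """Split range data into per-section anchors and module ranges."""
    anchors, ranges = {}, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        sect, span, rest = fields[0], fields[1], fields[2:]
        start, end = (int(x, 16) for x in span.split("-"))
        if rest[0] == "=":
            # Anchor line: symbol name and its offset from the section base.
            anchors[sect] = (rest[1], start)
        else:
            ranges.setdefault(sect, []).append((start, end, rest))
    return anchors, ranges


def lookup(addr, sect, anchor_addr, anchors, ranges):
    """Map a runtime address to module names via the section's anchor."""
    # The section base is the anchor's runtime address minus its offset.
    base = anchor_addr - anchors[sect][1]
    off = addr - base
    for start, end, mods in ranges.get(sect, []):
        if start <= off < end:
            return mods
    return []


anchors, ranges = parse_ranges(SAMPLE)
```

Because only offsets are stored, the same file remains usable when the
kernel is loaded at a different address than the one recorded at link
time.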

Signed-off-by: Kris Van Hees <[email protected]>
Reviewed-by: Nick Alcock <[email protected]>
---
Changes since v2:
- 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo
- Switched from using modules.builtin.objs to parsing .*.cmd files
- Parse data from .*.cmd in generate_builtin_ranges.awk
---
scripts/generate_builtin_ranges.awk | 232 ++++++++++++++++++++++++++++
1 file changed, 232 insertions(+)
create mode 100755 scripts/generate_builtin_ranges.awk

diff --git a/scripts/generate_builtin_ranges.awk b/scripts/generate_builtin_ranges.awk
new file mode 100755
index 0000000000000..6975a9c7266d9
--- /dev/null
+++ b/scripts/generate_builtin_ranges.awk
@@ -0,0 +1,232 @@
+#!/usr/bin/gawk -f
+# SPDX-License-Identifier: GPL-2.0
+# generate_builtin_ranges.awk: Generate address range data for builtin modules
+# Written by Kris Van Hees <[email protected]>
+#
+# Usage: generate_builtin_ranges.awk modules.builtin.modinfo vmlinux.map \
+# vmlinux.o.map > modules.builtin.ranges
+#
+
+BEGIN {
+ # modules.builtin.modinfo uses \0 as record separator
+ # All other files use \n.
+ RS = "[\n\0]";
+}
+
+# Return the module name(s) (if any) associated with the given object.
+#
+# If we have seen this object before, return information from the cache.
+# Otherwise, retrieve it from the corresponding .cmd file.
+#
+function get_module_info(fn, mod, obj, mfn, s) {
+ if (fn in omod)
+ return omod[fn];
+
+ if (match(fn, /\/[^/]+$/) == 0)
+ return "";
+
+ obj = fn;
+ mod = "";
+ mfn = "";
+ fn = substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
+ if (getline s <fn == 1) {
+ if (match(s, /DKBUILD_MODNAME=[^ ]+/) > 0) {
+ mod = substr(s, RSTART + 17, RLENGTH - 17);
+ gsub(/['"]/, "", mod);
+ gsub(/:/, " ", mod);
+ }
+
+ if (match(s, /DKBUILD_MODFILE=[^ ]+/) > 0) {
+ mfn = substr(s, RSTART + 17, RLENGTH - 17);
+ gsub(/['"]/, "", mfn);
+ gsub(/:/, " ", mfn);
+ }
+ }
+ close(fn);
+
+# tmp = $0;
+# $0 = s;
+# print mod " " mfn " " obj " " $NF;
+# $0 = tmp;
+
+ # A single module (common case) also reflects objects that are not part
+ # of a module. Some of those objects have names that are also a module
+ # name (e.g. core). We check the associated module file name, and if
+ # they do not match, the object is not part of a module.
+ if (mod !~ / /) {
+ if (!(mod in mods))
+ return "";
+ if (mods[mod] != mfn)
+ return "";
+ }
+
+ # At this point, mod is a single (valid) module name, or a list of
+ # module names (that do not need validation).
+ omod[obj] = mod;
+ close(fn);
+
+ return mod;
+}
+
+FNR == 1 {
+ FC++;
+}
+
+# (1-old) Build a mapping to associate object files with built-in module names.
+#
+# The first file argument is used as input (modules.builtin.objs).
+#
+FC == 1 && old_behaviour {
+ sub(/:/, "");
+ mod = $1;
+ sub(/([^/]*\/)+/, "", mod);
+ sub(/\.o$/, "", mod);
+ gsub(/-/, "_", mod);
+
+ if (NF > 1) {
+ for (i = 2; i <= NF; i++) {
+ if ($i in mods)
+ mods[$i] = mods[$i] " " mod;
+ else
+ mods[$i] = mod;
+ }
+ } else
+ mods[$1] = mod;
+
+ next;
+}
+
+# (1) Build a lookup map of built-in module names.
+#
+# The first file argument is used as input (modules.builtin.modinfo).
+#
+# We are interested in lines that follow the format
+# <modname>.file=<path>
+# and use them to record <modname>
+#
+FC == 1 && /^[^\.]+.file=/ {
+ gsub(/[\.=]/, " ");
+# print $1 " -> " $3;
+ mods[$1] = $3;
+ next;
+}
+
+# (2) Determine the load address for each section.
+#
+# The second file argument is used as input (vmlinux.map).
+#
+# Since some AWK implementations cannot handle large integers, we strip of the
+# first 4 hex digits from the address. This is safe because the kernel space
+# is not large enough for addresses to extend into those digits.
+#
+FC == 2 && /^\./ && NF > 2 {
+ if (type)
+ delete sect_addend[type];
+
+ if ($1 ~ /percpu/)
+ next;
+
+ raw_addr = $2;
+ addr_prefix = "^" substr($2, 1, 6);
+ sub(addr_prefix, "0x", $2);
+ base = strtonum($2);
+ type = $1;
+ anchor = 0;
+ sect_base[type] = base;
+
+ next;
+}
+
+!type {
+ next;
+}
+
+# (3) We need to determine the base address of the section so that ranges can
+# be expressed based on offsets from the base address. This accommodates the
+# kernel sections getting loaded at different addresses than what is recorded
+# in vmlinux.map.
+#
+# At runtime, we will need to determine the base address of each section we are
+# interested in. We do that by recording the offset of the first symbol in the
+# section. Once we know the address of this symbol in the running kernel, we
+# can calculate the base address of the section.
+#
+# If possible, we use an explicit anchor symbol (sym = .) listed at the base
+# address (offset 0).
+#
+# If there is no such symbol, we record the first symbol in the section along
+# with its offset.
+#
+# We also determine the offset of the first member in the section in case the
+# final linking inserts some content between the start of the section and the
+# first member. I.e. in that case, vmlinux.map will list the first member at
+# a non-zero offset whereas vmlinux.o.map will list it at offset 0. We record
+# the addend so we can apply it when processing vmlinux.o.map (next).
+#
+FC == 2 && !anchor && raw_addr == $1 && $3 == "=" && $4 == "." {
+ anchor = sprintf("%s %08x-%08x = %s", type, 0, 0, $2);
+ sect_anchor[type] = anchor;
+
+ next;
+}
+
+FC == 2 && !anchor && $1 ~ /^0x/ && $2 !~ /^0x/ && NF <= 4 {
+ sub(addr_prefix, "0x", $1);
+ addr = strtonum($1) - base;
+ anchor = sprintf("%s %08x-%08x = %s", type, addr, addr, $2);
+ sect_anchor[type] = anchor;
+
+ next;
+}
+
+FC == 2 && base && /^ \./ && $1 == type && NF == 4 {
+ sub(addr_prefix, "0x", $2);
+ addr = strtonum($2);
+ sect_addend[type] = addr - base;
+
+ if (anchor) {
+ base = 0;
+ type = 0;
+ }
+
+ next;
+}
+
+# (4) Collect offset ranges (relative to the section base address) for built-in
+# modules.
+#
+FC == 3 && /^ \./ && NF == 4 && $3 != "0x0" {
+ type = $1;
+ if (!(type in sect_addend))
+ next;
+
+ sub(addr_prefix, "0x", $2);
+ addr = strtonum($2) + sect_addend[type];
+
+ mod = get_module_info($4);
+# printf "[%s, %08x] %s [%s] %08x\n", mod_name, mod_start, $4, mod, addr;
+ if (mod == mod_name)
+ next;
+
+ if (mod_name) {
+ idx = mod_start + sect_base[type] + sect_addend[type];
+ entries[idx] = sprintf("%s %08x-%08x %s", type, mod_start, addr, mod_name);
+ count[type]++;
+ }
+# if (mod == "")
+# printf "ENTRY WITHOUT MOD - MODULE MAY END AT %08x\n", addr
+
+ mod_name = mod;
+ mod_start = addr;
+}
+
+END {
+ for (type in count) {
+ if (type in sect_anchor)
+ entries[sect_base[type]] = sect_anchor[type];
+ }
+
+ n = asorti(entries, indices);
+ for (i = 1; i <= n; i++)
+ print entries[indices[i]];
+}
--
2.43.0

2024-05-17 04:32:25

by Kris Van Hees

Subject: [PATCH v3 5/6] kbuild: generate modules.builtin.ranges when linking the kernel

Signed-off-by: Kris Van Hees <[email protected]>
Reviewed-by: Nick Alcock <[email protected]>
---
Changes since v2:
- 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo
- Use $(real-prereqs) rather than $(filter-out ...)
---
scripts/Makefile.vmlinux | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/scripts/Makefile.vmlinux b/scripts/Makefile.vmlinux
index c9f3e03124d7f..afe8287e8dda0 100644
--- a/scripts/Makefile.vmlinux
+++ b/scripts/Makefile.vmlinux
@@ -36,6 +36,22 @@ targets += vmlinux
vmlinux: scripts/link-vmlinux.sh vmlinux.o $(KBUILD_LDS) FORCE
+$(call if_changed_dep,link_vmlinux)

+# module.builtin.ranges
+# ---------------------------------------------------------------------------
+ifdef CONFIG_BUILTIN_MODULE_RANGES
+__default: modules.builtin.ranges
+
+quiet_cmd_modules_builtin_ranges = GEN $@
+ cmd_modules_builtin_ranges = \
+ $(srctree)/scripts/generate_builtin_ranges.awk $(real-prereqs) > $@
+
+vmlinux.map: vmlinux
+
+targets += modules.builtin.ranges
+modules.builtin.ranges: modules.builtin.modinfo vmlinux.map vmlinux.o.map FORCE
+ $(call if_changed,modules_builtin_ranges)
+endif
+
# Add FORCE to the prequisites of a target to force it to be always rebuilt.
# ---------------------------------------------------------------------------

--
2.43.0

2024-05-17 04:33:03

by Kris Van Hees

Subject: [PATCH v3 6/6] module: add install target for modules.builtin.ranges

When CONFIG_BUILTIN_MODULE_RANGES is enabled, the modules.builtin.ranges
file should be installed in the module install location.

Signed-off-by: Kris Van Hees <[email protected]>
Reviewed-by: Nick Alcock <[email protected]>
---
Changes since v2:
- Include modules.builtin.ranges in modules install target
---
scripts/Makefile.modinst | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
index 0afd75472679f..f5160ddd74239 100644
--- a/scripts/Makefile.modinst
+++ b/scripts/Makefile.modinst
@@ -30,10 +30,10 @@ $(MODLIB)/modules.order: modules.order FORCE
quiet_cmd_install_modorder = INSTALL $@
cmd_install_modorder = sed 's:^$.*$\.o$$:kernel/\1.ko:' $< > $@

-# Install modules.builtin(.modinfo) even when CONFIG_MODULES is disabled.
-install-y += $(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo)
+# Install modules.builtin(.modinfo,.ranges) even when CONFIG_MODULES is disabled.
+install-y += $(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo modules.builtin.ranges)

-$(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo): $(MODLIB)/%: % FORCE
+$(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo modules.builtin.ranges): $(MODLIB)/%: % FORCE
$(call cmd,install)

endif
--
2.43.0

2024-05-20 09:11:25

by Masahiro Yamada

Subject: Re: [PATCH v3 6/6] module: add install target for modules.builtin.ranges

On Fri, May 17, 2024 at 1:32 PM Kris Van Hees <kris.van.hees@oracle.com> wrote:
>
> When CONFIG_BUILTIN_MODULE_RANGES is enabled, the modules.builtin.ranges
> file should be installed in the module install location.
>
> Signed-off-by: Kris Van Hees <[email protected]>
> Reviewed-by: Nick Alcock <[email protected]>
> ---
> Changes since v2:
> - Include modules.builtin.ranges in modules install target
> ---
> scripts/Makefile.modinst | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
> index 0afd75472679f..f5160ddd74239 100644
> --- a/scripts/Makefile.modinst
> +++ b/scripts/Makefile.modinst
> @@ -30,10 +30,10 @@ $(MODLIB)/modules.order: modules.order FORCE
> quiet_cmd_install_modorder = INSTALL $@
> cmd_install_modorder = sed 's:^$.*$\.o$$:kernel/\1.ko:' $< > $@
>
> -# Install modules.builtin(.modinfo) even when CONFIG_MODULES is disabled.
> -install-y += $(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo)
> +# Install modules.builtin(.modinfo,.ranges) even when CONFIG_MODULES is disabled.
> +install-y += $(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo modules.builtin.ranges)

This will break modules_install when CONFIG_BUILTIN_MODULE_RANGES
is disabled.

modules.builtin.ranges should be added to install-y conditionally,
like this:

# Install modules.builtin(.modinfo) even when CONFIG_MODULES is disabled.
install-y += $(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo)

install-$(CONFIG_BUILTIN_MODULE_RANGES) += $(MODLIB)/modules.builtin.ranges

> -$(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo): $(MODLIB)/%: % FORCE
> +$(addprefix $(MODLIB)/, modules.builtin modules.builtin.modinfo modules.builtin.ranges): $(MODLIB)/%: % FORCE
> $(call cmd,install)
>
> endif
> --
> 2.43.0
>

--
Best Regards
Masahiro Yamada

2024-05-20 09:30:00

by Masahiro Yamada

Subject: Re: [PATCH v3 2/6] trace: add CONFIG_BUILTIN_MODULE_RANGES option

On Fri, May 17, 2024 at 1:30 PM Kris Van Hees <kris.van.hees@oracle.com> wrote:
>
> The CONFIG_BUILTIN_MODULE_RANGES option controls whether offset range data
> is generated for kernel modules that are built into the kernel image.
>
> Signed-off-by: Kris Van Hees <[email protected]>
> Reviewed-by: Nick Alcock <[email protected]>
> Reviewed-by: Alan Maguire <[email protected]>
> ---
> Changes since v2:
> - Add explicit dependency on FTRACE for CONFIG_BUILTIN_MODULE_RANGES
> ---
> kernel/trace/Kconfig | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 47345bf1d4a9f..d0c82b4b3a61e 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -188,6 +188,24 @@ menuconfig FTRACE
>
> if FTRACE
>
> +config BUILTIN_MODULE_RANGES
> + bool "Generate address range information for builtin modules"
> + depends on FTRACE

This 'depends on' is redundant because this config is
already located between 'if FTRACE' and 'endif'.

I believe 2/6 thru 5/6 should be squashed into one commit.
Adding only the config option does not make much sense.

> + select VMLINUX_MAP
> + help
> + When modules are built into the kernel, there will be no module name
> + associated with its symbols in /proc/kallsyms. Tracers may want to
> + identify symbols by module name and symbol name regardless of whether
> + the module is configured as loadable or not.
> +
> + This option generates modules.builtin.ranges in the build tree with
> + offset ranges (per ELF section) for the module(s) they belong to.
> + It also records an anchor symbol to determine the load address of the
> + section.
> +
> + It is fully compatible with CONFIG_RANDOMIZE_BASE and similar late-
> + address-modification options.
> +
> config BOOTTIME_TRACING
> bool "Boot-time Tracing support"
> depends on TRACING
> --
> 2.43.0
>

--
Best Regards
Masahiro Yamada

2024-05-20 09:38:59

by Masahiro Yamada

Subject: Re: [PATCH v3 2/6] trace: add CONFIG_BUILTIN_MODULE_RANGES option

2024-05-20 16:54:20

by Masahiro Yamada

Subject: Re: [PATCH v3 4/6] module: script to generate offset ranges for builtin modules

On Fri, May 17, 2024 at 1:31 PM Kris Van Hees <kris.van.hees@oracle.com> wrote:
>
> The offset range data for builtin modules is generated using:
> - modules.builtin.modinfo: associates object files with module names
> - vmlinux.map: provides load order of sections and offset of first member
> per section
> - vmlinux.o.map: provides offset of object file content per section
> - .*.cmd: build cmd file with KBUILD_MODFILE and KBUILD_MODNAME
>
> The generated data will look like:
>
> .text 00000000-00000000 = _text
> .text 0000baf0-0000cb10 amd_uncore
> .text 0009bd10-0009c8e0 iosf_mbi
> ...
> .text 008e6660-008e9630 snd_soc_wcd_mbhc
> .text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
> .text 008ea610-008ea780 snd_soc_wcd9335
> ...
> .data 00000000-00000000 = _sdata
> .data 0000f020-0000f680 amd_uncore
>
> For each ELF section, it lists the offset of the first symbol. This can
> be used to determine the base address of the section at runtime.
>
> Next, it lists (in strict ascending order) offset ranges in that section
> that cover the symbols of one or more builtin modules. Multiple ranges
> can apply to a single module, and ranges can be shared between modules.
>
> Signed-off-by: Kris Van Hees <[email protected]>
> Reviewed-by: Nick Alcock <[email protected]>
> ---
> Changes since v2:
> - 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo
> - Switched from using modules.builtin.objs to parsing .*.cmd files
> - Parse data from .*.cmd in generate_builtin_ranges.awk
> ---
> scripts/generate_builtin_ranges.awk | 232 ++++++++++++++++++++++++++++
> 1 file changed, 232 insertions(+)
> create mode 100755 scripts/generate_builtin_ranges.awk
>
> diff --git a/scripts/generate_builtin_ranges.awk b/scripts/generate_builtin_ranges.awk
> new file mode 100755
> index 0000000000000..6975a9c7266d9
> --- /dev/null
> +++ b/scripts/generate_builtin_ranges.awk
> @@ -0,0 +1,232 @@
> +#!/usr/bin/gawk -f
> +# SPDX-License-Identifier: GPL-2.0
> +# generate_builtin_ranges.awk: Generate address range data for builtin modules
> +# Written by Kris Van Hees <[email protected]>
> +#
> +# Usage: generate_builtin_ranges.awk modules.builtin.modinfo vmlinux.map \
> +# vmlinux.o.map > modules.builtin.ranges
> +#
> +
> +BEGIN {
> + # modules.builtin.modinfo uses \0 as record separator
> + # All other files use \n.
> + RS = "[\n\0]";
> +}

Why do you want to parse modules.builtin.modinfo
instead of modules.builtin?

modules.builtin uses \n separator.

> +
> +# Return the module name(s) (if any) associated with the given object.
> +#
> +# If we have seen this object before, return information from the cache.
> +# Otherwise, retrieve it from the corresponding .cmd file.
> +#
> +function get_module_info(fn, mod, obj, mfn, s) {

There are 5 arguments, while the caller passes only 1 argument
( get_module_info($4) )

> + if (fn in omod)
> + return omod[fn];
> +
> + if (match(fn, /\/[^/]+$/) == 0)
> + return "";
> +
> + obj = fn;
> + mod = "";
> + mfn = "";
> + fn = substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
> + if (getline s <fn == 1) {
> + if (match(s, /DKBUILD_MODNAME=[^ ]+/) > 0) {
> + mod = substr(s, RSTART + 17, RLENGTH - 17);
> + gsub(/['"]/, "", mod);
> + gsub(/:/, " ", mod);
> + }
> +
> + if (match(s, /DKBUILD_MODFILE=[^ ]+/) > 0) {
> + mfn = substr(s, RSTART + 17, RLENGTH - 17);
> + gsub(/['"]/, "", mfn);
> + gsub(/:/, " ", mfn);
> + }
> + }
> + close(fn);
> +
> +# tmp = $0;
> +# $0 = s;
> +# print mod " " mfn " " obj " " $NF;
> +# $0 = tmp;
> +
> + # A single module (common case) also reflects objects that are not part
> + # of a module. Some of those objects have names that are also a module
> + # name (e.g. core). We check the associated module file name, and if
> + # they do not match, the object is not part of a module.

You do not need to use KBUILD_MODNAME.

Just use KBUILD_MODFILE only.
If the same path is found in modules.builtin,
it is a built-in module.

Its basename is modname.
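
A rough Python sketch of this suggestion follows. It assumes that
modules.builtin entries have the form kernel/<dir>/<name>.ko and that
KBUILD_MODFILE is the same path without the kernel/ prefix and the .ko
suffix. The helper names are hypothetical. Dashes are turned into
underscores, as the script in this patch does for module names:

```python
import posixpath


def builtin_modfiles(modules_builtin_text):
    """Map KBUILD_MODFILE-style paths to module names."""
    table = {}
    for line in modules_builtin_text.splitlines():
        path = line.strip()
        if not path:
            continue
        # Assumed entry form: kernel/<dir>/<name>.ko
        if path.startswith("kernel/"):
            path = path[len("kernel/"):]
        if path.endswith(".ko"):
            path = path[:-len(".ko")]
        table[path] = posixpath.basename(path).replace("-", "_")
    return table


def module_for(modfile, table):
    """Return the module name if modfile names a built-in module, else None."""
    return table.get(modfile)


table = builtin_modfiles(
    "kernel/drivers/edac/skx_edac.ko\n"
    "kernel/arch/x86/crypto/aesni-intel.ko\n"
)
```

An object whose KBUILD_MODFILE is not in the table (kernel/sched/core, for
instance) is then treated as not belonging to any module.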

One more question about a corner case.

How does your code work when an object is shared
by multiple modules?

For example, set
CONFIG_EDAC_SKX=y
CONFIG_EDAC_I10NM=y

How is the address range of drivers/edac/skx_common.o handled?

There are 4 possibilities.

- included in skx_edac
- included in i10nm_edac
- included in both of them
- not included in any of them

The correct behavior should be "included in both of them".

How does your code work?

> + if (mod !~ / /) {
> + if (!(mod in mods))
> + return "";
> + if (mods[mod] != mfn)
> + return "";
> + }
> +
> + # At this point, mod is a single (valid) module name, or a list of
> + # module names (that do not need validation).
> + omod[obj] = mod;
> + close(fn);
> +
> + return mod;
> +}
> +
> +FNR == 1 {
> + FC++;
> +}
> +
> +# (1-old) Build a mapping to associate object files with built-in module names.
> +#
> +# The first file argument is used as input (modules.builtin.objs).
> +#
> +FC == 1 && old_behaviour {
> + sub(/:/, "");
> + mod = $1;
> + sub(/([^/]*\/)+/, "", mod);
> + sub(/\.o$/, "", mod);
> + gsub(/-/, "_", mod);
> +
> + if (NF > 1) {
> + for (i = 2; i <= NF; i++) {
> + if ($i in mods)
> + mods[$i] = mods[$i] " " mod;
> + else
> + mods[$i] = mod;
> + }
> + } else
> + mods[$1] = mod;
> +
> + next;
> +}

Please remove the old code.

> +# (1) Build a lookup map of built-in module names.
> +#
> +# The first file argument is used as input (modules.builtin.modinfo).
> +#
> +# We are interested in lines that follow the format
> +# <modname>.file=<path>
> +# and use them to record <modname>
> +#
> +FC == 1 && /^[^\.]+.file=/ {
> + gsub(/[\.=]/, " ");
> +# print $1 " -> " $3;
> + mods[$1] = $3;
> + next;
> +}

I guess parsing modules.builtin will be simpler.

> +
> +# (2) Determine the load address for each section.
> +#
> +# The second file argument is used as input (vmlinux.map).
> +#
> +# Since some AWK implementations cannot handle large integers, we strip of the
> +# first 4 hex digits from the address. This is safe because the kernel space
> +# is not large enough for addresses to extend into those digits.
> +#
> +FC == 2 && /^\./ && NF > 2 {
> + if (type)
> + delete sect_addend[type];
> +
> + if ($1 ~ /percpu/)
> + next;
> +
> + raw_addr = $2;
> + addr_prefix = "^" substr($2, 1, 6);
> + sub(addr_prefix, "0x", $2);
> + base = strtonum($2);
> + type = $1;
> + anchor = 0;
> + sect_base[type] = base;
> +
> + next;
> +}
> +
> +!type {
> + next;
> +}
> +
> +# (3) We need to determine the base address of the section so that ranges can
> +# be expressed based on offsets from the base address. This accommodates the
> +# kernel sections getting loaded at different addresses than what is recorded
> +# in vmlinux.map.
> +#
> +# At runtime, we will need to determine the base address of each section we are
> +# interested in. We do that by recording the offset of the first symbol in the
> +# section. Once we know the address of this symbol in the running kernel, we
> +# can calculate the base address of the section.
> +#
> +# If possible, we use an explicit anchor symbol (sym = .) listed at the base
> +# address (offset 0).
> +#
> +# If there is no such symbol, we record the first symbol in the section along
> +# with its offset.
> +#
> +# We also determine the offset of the first member in the section in case the
> +# final linking inserts some content between the start of the section and the
> +# first member. I.e. in that case, vmlinux.map will list the first member at
> +# a non-zero offset whereas vmlinux.o.map will list it at offset 0. We record
> +# the addend so we can apply it when processing vmlinux.o.map (next).
> +#
> +FC == 2 && !anchor && raw_addr == $1 && $3 == "=" && $4 == "." {
> + anchor = sprintf("%s %08x-%08x = %s", type, 0, 0, $2);
> + sect_anchor[type] = anchor;
> +
> + next;
> +}
> +
> +FC == 2 && !anchor && $1 ~ /^0x/ && $2 !~ /^0x/ && NF <= 4 {
> + sub(addr_prefix, "0x", $1);
> + addr = strtonum($1) - base;
> + anchor = sprintf("%s %08x-%08x = %s", type, addr, addr, $2);
> + sect_anchor[type] = anchor;
> +
> + next;
> +}
> +
> +FC == 2 && base && /^ \./ && $1 == type && NF == 4 {
> + sub(addr_prefix, "0x", $2);
> + addr = strtonum($2);
> + sect_addend[type] = addr - base;
> +
> + if (anchor) {
> + base = 0;
> + type = 0;
> + }
> +
> + next;
> +}
> +
> +# (4) Collect offset ranges (relative to the section base address) for built-in
> +# modules.
> +#
> +FC == 3 && /^ \./ && NF == 4 && $3 != "0x0" {
> + type = $1;
> + if (!(type in sect_addend))
> + next;

This assumes sections are 1:1 mapping
between vmlinux.o and vmlinux.

How far does this assumption work?

CONFIG_LD_DEAD_CODE_DATA_ELIMINATION will not work
at least.

As I said in the previous review,
gawk is not documented in Documentation/process/changes.rst

Please add it if you go with gawk.

> +
> + sub(addr_prefix, "0x", $2);
> + addr = strtonum($2) + sect_addend[type];
> +
> + mod = get_module_info($4);
> +# printf "[%s, %08x] %s [%s] %08x\n", mod_name, mod_start, $4, mod, addr;
> + if (mod == mod_name)
> + next;
> +
> + if (mod_name) {
> + idx = mod_start + sect_base[type] + sect_addend[type];
> + entries[idx] = sprintf("%s %08x-%08x %s", type, mod_start, addr, mod_name);
> + count[type]++;
> + }
> +# if (mod == "")
> +# printf "ENTRY WITHOUT MOD - MODULE MAY END AT %08x\n", addr
> +
> + mod_name = mod;
> + mod_start = addr;
> +}
> +
> +END {
> + for (type in count) {
> + if (type in sect_anchor)
> + entries[sect_base[type]] = sect_anchor[type];
> + }
> +
> + n = asorti(entries, indices);
> + for (i = 1; i <= n; i++)
> + print entries[indices[i]];
> +}
> --
> 2.43.0
>

--
Best Regards

Masahiro Yamada

2024-06-06 19:03:34

by Kris Van Hees

Subject: Re: [PATCH v3 4/6] module: script to generate offset ranges for builtin modules

On Tue, May 21, 2024 at 01:53:27AM +0900, Masahiro Yamada wrote:
> On Fri, May 17, 2024 at 1:31 PM Kris Van Hees <[email protected]> wrote:
> >
> > The offset range data for builtin modules is generated using:
> > - modules.builtin.modinfo: associates object files with module names
> > - vmlinux.map: provides load order of sections and offset of first member
> > per section
> > - vmlinux.o.map: provides offset of object file content per section
> > - .*.cmd: build cmd file with KBUILD_MODFILE and KBUILD_MODNAME
> >
> > The generated data will look like:
> >
> > .text 00000000-00000000 = _text
> > .text 0000baf0-0000cb10 amd_uncore
> > .text 0009bd10-0009c8e0 iosf_mbi
> > ...
> > .text 008e6660-008e9630 snd_soc_wcd_mbhc
> > .text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
> > .text 008ea610-008ea780 snd_soc_wcd9335
> > ...
> > .data 00000000-00000000 = _sdata
> > .data 0000f020-0000f680 amd_uncore
> >
> > For each ELF section, it lists the offset of the first symbol. This can
> > be used to determine the base address of the section at runtime.
> >
> > Next, it lists (in strict ascending order) offset ranges in that section
> > that cover the symbols of one or more builtin modules. Multiple ranges
> > can apply to a single module, and ranges can be shared between modules.
> >
> > Signed-off-by: Kris Van Hees <[email protected]>
> > Reviewed-by: Nick Alcock <[email protected]>
> > ---
> > Changes since v2:
> > - 1st arg to generate_builtin_ranges.awk is now modules.builtin.modinfo
> > - Switched from using modules.builtin.objs to parsing .*.cmd files
> > - Parse data from .*.cmd in generate_builtin_ranges.awk
> > ---
> > scripts/generate_builtin_ranges.awk | 232 ++++++++++++++++++++++++++++
> > 1 file changed, 232 insertions(+)
> > create mode 100755 scripts/generate_builtin_ranges.awk
> >
> > diff --git a/scripts/generate_builtin_ranges.awk b/scripts/generate_builtin_ranges.awk
> > new file mode 100755
> > index 0000000000000..6975a9c7266d9
> > --- /dev/null
> > +++ b/scripts/generate_builtin_ranges.awk
> > @@ -0,0 +1,232 @@
> > +#!/usr/bin/gawk -f
> > +# SPDX-License-Identifier: GPL-2.0
> > +# generate_builtin_ranges.awk: Generate address range data for builtin modules
> > +# Written by Kris Van Hees <[email protected]>
> > +#
> > +# Usage: generate_builtin_ranges.awk modules.builtin.modinfo vmlinux.map \
> > +# vmlinux.o.map > modules.builtin.ranges
> > +#
> > +
> > +BEGIN {
> > + # modules.builtin.modinfo uses \0 as record separator
> > + # All other files use \n.
> > + RS = "[\n\0]";
> > +}
>
>
> Why do you want to parse modules.builtin.modinfo
> instead of modules.builtin?
>
> modules.builtin uses \n separator.

Oh my, I completely overlooked modules.builtin. Thank you! That is indeed
much easier.

> > +
> > +# Return the module name(s) (if any) associated with the given object.
> > +#
> > +# If we have seen this object before, return information from the cache.
> > +# Otherwise, retrieve it from the corresponding .cmd file.
> > +#
> > +function get_module_info(fn, mod, obj, mfn, s) {
>
>
> There are 5 arguments, while the caller passes only 1 argument
> ( get_module_info($4) )

That is how AWK provides local variables in a function: every parameter named
in the function declaration is local to the function, so the extra parameters
(which the caller never passes) serve as local variables. It is an oddity of
AWK.

> > + if (fn in omod)
> > + return omod[fn];
> > +
> > + if (match(fn, /\/[^/]+$/) == 0)
> > + return "";
> > +
> > + obj = fn;
> > + mod = "";
> > + mfn = "";
> > + fn = substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
> > + if (getline s <fn == 1) {
> > + if (match(s, /DKBUILD_MODNAME=[^ ]+/) > 0) {
> > + mod = substr(s, RSTART + 17, RLENGTH - 17);
> > + gsub(/['"]/, "", mod);
> > + gsub(/:/, " ", mod);
> > + }
> > +
> > + if (match(s, /DKBUILD_MODFILE=[^ ]+/) > 0) {
> > + mfn = substr(s, RSTART + 17, RLENGTH - 17);
> > + gsub(/['"]/, "", mfn);
> > + gsub(/:/, " ", mfn);
> > + }
> > + }
> > + close(fn);
> > +
> > +# tmp = $0;
> > +# $0 = s;
> > +# print mod " " mfn " " obj " " $NF;
> > +# $0 = tmp;
> > +
> > + # A single module (common case) also reflects objects that are not part
> > + # of a module. Some of those objects have names that are also a module
> > + # name (e.g. core). We check the associated module file name, and if
> > + # they do not match, the object is not part of a module.
>
>
> You do not need to use KBUILD_MODNAME.
>
> Just use KBUILD_MODFILE only.
> If the same path is found in modules.builtin,
> it is a built-in module.
>
> Its basename is modname.

Yes, that is true. I can do all this based on KBUILD_MODFILE. Thank you.
Adjusting the patch that way.
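
To make the suggested approach concrete, here is a minimal sketch in Python
(purely for illustration; it is not part of the patch, which remains AWK). It
assumes that modules.builtin entries look like "kernel/<path>.ko" and that the
module name is the basename with '-' replaced by '_', as the older code in this
patch does. The function names and the sample data are hypothetical.

```python
import posixpath


def load_builtin_paths(lines):
    """Return the module paths named by modules.builtin-style lines.

    Assumption: each entry looks like 'kernel/drivers/edac/skx_edac.ko'.
    """
    paths = set()
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        if entry.startswith("kernel/"):
            entry = entry[len("kernel/"):]
        if entry.endswith(".ko"):
            entry = entry[:-len(".ko")]
        paths.add(entry)
    return paths


def builtin_modules_for(modfile, builtin_paths):
    """Map a KBUILD_MODFILE value to the built-in module name(s) it lists.

    Returns an empty string when none of the paths is a built-in module.
    """
    names = []
    # Drop the surrounding quotes, then treat the value as a list of paths.
    for path in modfile.strip("'\"").split():
        if path in builtin_paths:
            names.append(posixpath.basename(path).replace("-", "_"))
    return " ".join(names)


# Hypothetical sample data, for illustration only.
builtin = load_builtin_paths([
    "kernel/drivers/edac/i10nm_edac.ko\n",
    "kernel/drivers/edac/skx_edac.ko\n",
])
print(builtin_modules_for('"drivers/edac/skx_edac"', builtin))
print(builtin_modules_for('"kernel/sched/core"', builtin) == "")
```

An object whose KBUILD_MODFILE path does not appear in modules.builtin yields
an empty result, which corresponds to "not part of a module".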

> One more question in a corner case.
>
> How does your code work when an object is shared
> by multiple modules?
>
>
> For example, set
> CONFIG_EDAC_SKX=y
> CONFIG_EDAC_I10NM=y
>
> How is the address range of drivers/edac/skx_common.o handled?
>
> There are 4 possibilities.
>
> - included in skx_edac
> - included in i10nm_edac
> - included in both of them
> - not included in any of them
>
> The correct behavior should be "included in both of them".
>
> How does your code work?

In this case, you will find that KBUILD_MODFILE for drivers/edac/skx_common.o
is:

KBUILD_MODFILE='"drivers/edac/i10nm_edac drivers/edac/skx_edac"'

So, we can see that this object is present in more than one module. The code
(modified to just use KBUILD_MODFILE) sets mod to be "i10nm_edac skx_edac".
That means that the object will not be considered part of the i10nm_edac range,
nor part of the skx_edac range. Instead, a separate range entry will be
generated for "i10nm_edac skx_edac", like this:

.text 01eb2070-01eb71b0 edac_core
.text 01eb87d0-01eb98f0 i10nm_edac skx_edac
.text 01eb98f0-01eba710 skx_edac
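
As a rough illustration of how a consumer could use such entries, the Python
sketch below parses range lines and reports every module that covers a given
section offset. It is only a sketch under stated assumptions: the function
names are hypothetical, and the end of a range is treated as exclusive because
adjacent ranges in the output share a boundary value.

```python
def parse_ranges(lines):
    """Parse range lines into (section, start, end, modules) tuples.

    Anchor entries such as '.text 00000000-00000000 = _text' are skipped.
    """
    ranges = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or fields[2] == "=":
            continue  # blank line or anchor entry
        start, end = (int(value, 16) for value in fields[1].split("-"))
        ranges.append((fields[0], start, end, fields[2:]))
    return ranges


def modules_at(ranges, section, offset):
    """Return the module names whose range covers offset in section.

    Assumption: the range end is exclusive.
    """
    for sect, start, end, mods in ranges:
        if sect == section and start <= offset < end:
            return mods
    return []


ranges = parse_ranges([
    ".text 01eb2070-01eb71b0 edac_core",
    ".text 01eb87d0-01eb98f0 i10nm_edac skx_edac",
    ".text 01eb98f0-01eba710 skx_edac",
])
# An offset inside the shared range maps to both modules.
print(modules_at(ranges, ".text", 0x01EB8800))
```

With the three entries above, an offset inside the skx_common.o range resolves
to both i10nm_edac and skx_edac, which is the "included in both of them"
behavior.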

(As an aside, there is commented-out AWK code in this patch that I clearly have
to remove in the next version.)

> > + if (mod !~ / /) {
> > + if (!(mod in mods))
> > + return "";
> > + if (mods[mod] != mfn)
> > + return "";
> > + }
> > +
> > + # At this point, mod is a single (valid) module name, or a list of
> > + # module names (that do not need validation).
> > + omod[obj] = mod;
> > + close(fn);
> > +
> > + return mod;
> > +}
> > +
> > +FNR == 1 {
> > + FC++;
> > +}
> > +
> > +# (1-old) Build a mapping to associate object files with built-in module names.
> > +#
> > +# The first file argument is used as input (modules.builtin.objs).
> > +#
> > +FC == 1 && old_behaviour {
> > + sub(/:/, "");
> > + mod = $1;
> > + sub(/([^/]*\/)+/, "", mod);
> > + sub(/\.o$/, "", mod);
> > + gsub(/-/, "_", mod);
> > +
> > + if (NF > 1) {
> > + for (i = 2; i <= NF; i++) {
> > + if ($i in mods)
> > + mods[$i] = mods[$i] " " mod;
> > + else
> > + mods[$i] = mod;
> > + }
> > + } else
> > + mods[$1] = mod;
> > +
> > + next;
> > +}
>
>
> Please remove the old code.

Yes, thank you for mentioning that.

> > +# (1) Build a lookup map of built-in module names.
> > +#
> > +# The first file argument is used as input (modules.builtin.modinfo).
> > +#
> > +# We are interested in lines that follow the format
> > +# <modname>.file=<path>
> > +# and use them to record <modname>
> > +#
> > +FC == 1 && /^[^\.]+.file=/ {
> > + gsub(/[\.=]/, " ");
> > +# print $1 " -> " $3;
> > + mods[$1] = $3;
> > + next;
> > +}
>
>
> I guess parsing modules.builtin will be simpler.

It is probably comparable, but I like the notion of being able to just parse
regular text files. I will do that.

> > +
> > +# (2) Determine the load address for each section.
> > +#
> > +# The second file argument is used as input (vmlinux.map).
> > +#
> > +# Since some AWK implementations cannot handle large integers, we strip off the
> > +# first 4 hex digits from the address. This is safe because the kernel space
> > +# is not large enough for addresses to extend into those digits.
> > +#
> > +FC == 2 && /^\./ && NF > 2 {
> > + if (type)
> > + delete sect_addend[type];
> > +
> > + if ($1 ~ /percpu/)
> > + next;
> > +
> > + raw_addr = $2;
> > + addr_prefix = "^" substr($2, 1, 6);
> > + sub(addr_prefix, "0x", $2);
> > + base = strtonum($2);
> > + type = $1;
> > + anchor = 0;
> > + sect_base[type] = base;
> > +
> > + next;
> > +}
> > +
> > +!type {
> > + next;
> > +}
> > +
> > +# (3) We need to determine the base address of the section so that ranges can
> > +# be expressed based on offsets from the base address. This accommodates the
> > +# kernel sections getting loaded at different addresses than what is recorded
> > +# in vmlinux.map.
> > +#
> > +# At runtime, we will need to determine the base address of each section we are
> > +# interested in. We do that by recording the offset of the first symbol in the
> > +# section. Once we know the address of this symbol in the running kernel, we
> > +# can calculate the base address of the section.
> > +#
> > +# If possible, we use an explicit anchor symbol (sym = .) listed at the base
> > +# address (offset 0).
> > +#
> > +# If there is no such symbol, we record the first symbol in the section along
> > +# with its offset.
> > +#
> > +# We also determine the offset of the first member in the section in case the
> > +# final linking inserts some content between the start of the section and the
> > +# first member. I.e. in that case, vmlinux.map will list the first member at
> > +# a non-zero offset whereas vmlinux.o.map will list it at offset 0. We record
> > +# the addend so we can apply it when processing vmlinux.o.map (next).
> > +#
> > +FC == 2 && !anchor && raw_addr == $1 && $3 == "=" && $4 == "." {
> > + anchor = sprintf("%s %08x-%08x = %s", type, 0, 0, $2);
> > + sect_anchor[type] = anchor;
> > +
> > + next;
> > +}
> > +
> > +FC == 2 && !anchor && $1 ~ /^0x/ && $2 !~ /^0x/ && NF <= 4 {
> > + sub(addr_prefix, "0x", $1);
> > + addr = strtonum($1) - base;
> > + anchor = sprintf("%s %08x-%08x = %s", type, addr, addr, $2);
> > + sect_anchor[type] = anchor;
> > +
> > + next;
> > +}
> > +
> > +FC == 2 && base && /^ \./ && $1 == type && NF == 4 {
> > + sub(addr_prefix, "0x", $2);
> > + addr = strtonum($2);
> > + sect_addend[type] = addr - base;
> > +
> > + if (anchor) {
> > + base = 0;
> > + type = 0;
> > + }
> > +
> > + next;
> > +}
> > +
> > +# (4) Collect offset ranges (relative to the section base address) for built-in
> > +# modules.
> > +#
> > +FC == 3 && /^ \./ && NF == 4 && $3 != "0x0" {
> > + type = $1;
> > + if (!(type in sect_addend))
> > + next;
>
>
> This assumes sections are 1:1 mapping
> between vmlinux.o and vmlinux.
>
>
> How far does this assumption work?

The assumption has proven accurate so far, but I did find a discrepancy when
building without LTO or IBT, where vmlinux.a is used to link vmlinux. There
are a few occurrences where fillers change in size, which causes some entries
to be incorrect. Fortunately, dealing with that is quite easy; the next patch
version will have that resolved.

LLVM-based builds (with and without LTO) are not currently supported in this
version, but I found a fairly easy way to support the non-LTO case. The next
revision will have that added.

I will have to make the BUILTIN_MODULE_RANGES option conflict with LTO because
there is no way that I can see to make that work. With LTO builds, we do not
retain any information about compilation units (objects), so there is no way to
relate addresses to the objects that provided the content.

I will also document that in the option help text.

>
> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION will not work
> at least.

Hm, that is one I have not encountered before. I'll look into whether that
can be made to work, or whether it needs to be another case that is not
supported for now.

> As I said in the previous review,
> gawk is not documented in Documentation/process/changes.rst
>
> Please add it if you go with gawk.

Thank you for the reminder. I overlooked that. I will do so.

> > +
> > + sub(addr_prefix, "0x", $2);
> > + addr = strtonum($2) + sect_addend[type];
> > +
> > + mod = get_module_info($4);
> > +# printf "[%s, %08x] %s [%s] %08x\n", mod_name, mod_start, $4, mod, addr;
> > + if (mod == mod_name)
> > + next;
> > +
> > + if (mod_name) {
> > + idx = mod_start + sect_base[type] + sect_addend[type];
> > + entries[idx] = sprintf("%s %08x-%08x %s", type, mod_start, addr, mod_name);
> > + count[type]++;
> > + }
> > +# if (mod == "")
> > +# printf "ENTRY WITHOUT MOD - MODULE MAY END AT %08x\n", addr
> > +
> > + mod_name = mod;
> > + mod_start = addr;
> > +}
> > +
> > +END {
> > + for (type in count) {
> > + if (type in sect_anchor)
> > + entries[sect_base[type]] = sect_anchor[type];
> > + }
> > +
> > + n = asorti(entries, indices);
> > + for (i = 1; i <= n; i++)
> > + print entries[indices[i]];
> > +}
> > --
> > 2.43.0
> >
>
>
> --
> Best Regards
>
> Masahiro Yamada