2022-11-09 14:16:11

by Nick Alcock

[permalink] [raw]
Subject: [PATCH PING v9] kallsyms: reliable symbol->address lookup with /proc/kallmodsyms

The kallmodsyms patch series was originally posted in Nov 2019, and the thread
(https://lore.kernel.org/linux-kbuild/[email protected]/t/#u)
shows review comments, questions, and feedback from interested parties.

All review comments have been satisfied, as far as I know: in particular
Yamada's note about translation units that are shared between built-in modules
is satisfied with a better representation which is also much, much smaller.

A kernel tree containing this series alone:
https://github.com/oracle/dtrace-linux-kernel kallmodsyms/6.1-rc4


The whole point of symbols is that their names are unique: you can look up a
symbol and get back a unique address, and vice versa. Alas, because
/proc/kallsyms (rightly) reports all symbols, even hidden ones, it does not
really satisfy this requirement. Large numbers of symbols are duplicated
many times (just search for __list_del_entry!), and while usually these are
just out-of-lined things defined in header files and thus all have the same
implementation, it does make it needlessly hard to figure out which one is
which in stack dumps, when tracing, and such things. Right now the kernel
has no way at all to tell these apart, and nor has the user: their address
differs and that's all. Which module did they come from? Which object
file? We don't know. Figuring out which is which when tracing needs a
combination of guesswork and luck. In discussions at LPC it became clear
that this is not just annoying me but Steve Rostedt and others, so it's
probably desirable to fix this.

It turns out that the linker, and the kernel build system, can be made to
give us everything we need to resolve this once and for all. This series
provides a new /proc/kallmodsyms which is like /proc/kallsyms except that it
annotates every (textual) symbol which comes from a built-in kernel module
with the module's name, in square brackets: if a symbol is used by multiple
modules, it gets [multiple] [names]. (We also add corresponding new fields
in the kallsyms iterator.)

But that's not quite enough: some symbols are still ambiguous, particularly
those that appear in the non-modular parts of the core kernel but also some
things that appear in built-in modules. We annotate such symbols with
cut-down {object file} names: the combination of symbol, [module] [names]
and {object file name} is unique. (The object file names are cut down to
save space: we store only the shortest suffix needed to distinguish symbols
from each other. It's fairly rare even to see two/level names, let alone
three/level/ones. We also save even more space by annotating every symbol
in a given object file with the object file name if we annotate any of
them.)

In brief we do this by mapping from address ranges to object files (with
assistance from the linker map file), then mapping from those object files
to built-in kernel modules and object file names. Because the number of
object files is much smaller than the number of symbols, because we fuse
address range and object file entries together if possible, and becasue we
don't even store object file names unless we need to, this is a fairly
efficient representation, even with a bit of extra complexity to allow
object files to be in more than one module at once.

The size impact of all of this is minimal: in testing, vmlinux grew by 16632
bytes, and the compressed vmlinux only grew by 12544 bytes (about .1% of a
10MiB kernel): though this is very configuration-dependent, it seems likely
to scale roughly with the kernel as a whole.

This is all controlled by a new config parameter CONFIG_KALLMODSYMS, which when
set results in output in /proc/kallmodsyms that looks like this:

ffffffff97606e50 t not_visible
ffffffff97606e70 T perf_msr_probe
ffffffff97606f80 t test_msr [rapl]
ffffffffa6007350 t rapl_pmu_event_stop [rapl]
ffffffffa6007440 t rapl_pmu_event_del [rapl]
ffffffffa6007460 t rapl_hrtimer_handle [rapl]
ffffffffa6007500 t rapl_pmu_event_read [rapl]
ffffffffa6007520 t rapl_pmu_event_init [rapl]
ffffffffa6007630 t rapl_cpu_offline [rapl]
ffffffffa6007710 t amd_pmu_event_map {core.o}
ffffffffa6007750 t amd_pmu_add_event {core.o}
ffffffffa6007760 t amd_put_event_constraints_f17h {core.o}

The modular symbols are notated as [rapl] even if rapl is built into the
kernel. Further, at least one symbol nottated as {core.o} would have been
ambiguous without that notation. If we look a little further down, we see:

ffffffff97607a70 t cmask_show {core.o}
ffffffff97607ab0 t inv_show {core.o}
ffffffff97607ae0 t edge_show {core.o}
ffffffff97607b10 t umask_show {core.o}
ffffffff97607b40 t event_show {core.o}

where event_show in particular is highly ambiguous and appears in many
object files, all of which are now notated with different {object file
names}.

Further down, we see what happens when object files are reused by multiple
modules, all of which are built in to the kernel, and some of which contain
symbols that are ambiguously-named even within that set of modules:

ffffffff97d7aed0 t liquidio_pcie_mmio_enabled [liquidio]
ffffffff97d7aef0 t liquidio_pcie_resume [liquidio]
ffffffff97d7af00 t liquidio_ptp_adjtime [liquidio]
ffffffff97d7af50 t liquidio_ptp_enable [liquidio]
ffffffff97d7af70 t liquidio_get_stats64 [liquidio]
ffffffff97d7b0f0 t liquidio_fix_features [liquidio]
ffffffff97d7b1c0 t liquidio_get_port_parent_id [liquidio]
[...]
ffffffff97d824c0 t lio_vf_rep_modinit [liquidio]
ffffffff97d824f0 t lio_vf_rep_modexit [liquidio]
ffffffff97d82520 t lio_ethtool_get_channels [liquidio] [liquidio_vf]
ffffffff97d82600 t lio_ethtool_get_ringparam [liquidio] [liquidio_vf]
ffffffff97d826a0 t lio_get_msglevel [liquidio] [liquidio_vf]
ffffffff97d826c0 t lio_vf_set_msglevel [liquidio] [liquidio_vf]
ffffffff97d826e0 t lio_get_pauseparam [liquidio] [liquidio_vf]
ffffffff97d82710 t lio_get_ethtool_stats [liquidio] [liquidio_vf]
ffffffff97d82e70 t lio_vf_get_ethtool_stats [liquidio] [liquidio_vf]
[...]
ffffffff97d91a80 t cn23xx_vf_mbox_thread [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91aa0 t cpumask_weight.constprop.0 [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91ac0 t cn23xx_vf_msix_interrupt_handler [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91bd0 t cn23xx_vf_get_oq_ticks [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91c00 t cn23xx_vf_ask_pf_to_do_flr [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91c70 t cn23xx_octeon_pfvf_handshake [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91e20 t cn23xx_setup_octeon_vf_device [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d92060 t octeon_mbox_read [liquidio] [liquidio_vf]
ffffffff97d92230 t octeon_mbox_write [liquidio] [liquidio_vf]
[...]
ffffffff97d946b0 t octeon_alloc_soft_command_resp [liquidio] [liquidio_vf]
ffffffff97d947e0 t octnet_send_nic_data_pkt [liquidio] [liquidio_vf]
ffffffff97d94820 t octnet_send_nic_ctrl_pkt [liquidio] [liquidio_vf]
ffffffff97d94ab0 t liquidio_get_stats64 [liquidio_vf]
ffffffff97d94c10 t liquidio_fix_features [liquidio_vf]
ffffffff97d94cd0 t wait_for_pending_requests [liquidio_vf]

Like /proc/kallsyms, the output is sorted by address, so keeps the curious
property of /proc/kallsyms that symbols may appear repeatedly with different
addresses: but now, unlike in /proc/kallsyms, we can see that those symbols
appear repeatedly because they are *different symbols* that ultimately
belong to different modules or different object files, all of which are
built in to the kernel.

Note that kernel symbols for built-in modules will probably appear
interspersed with other symbols that are part of different modules and
non-modular always-built-in symbols, which, as usual, have no
square-bracketed module denotation (though they might have an {object file
name}.

As with /proc/kallsyms, non-root usage produces addresses that are all zero.

(Now that kallmodsyms data uses very little space, the new
CONFIG_KALLMODSYMS option might perhaps be something people don't want to
bother with: maybe we can just control it via CONFIG_KALLSYMS or something?)

Limitations:

- this approach only works for textual symbols (and weak ones). I don't
see any way to make it work for data symbols etc: except for initialized
data they don't really have corresponding object files at all and they
tend to get merged together anyway.

- Non-built-in modules can also have ambiguous symbols in them in different
input object files: they aren't handled yet because kallsyms never runs
over modules to create the necessary sections. This is fixable, but it's
probably best handled in another patch series. (kallsyms would need to
do much less work for modules: only the sections introduced by this patch
series would need emission at all, and no [module] notations would be
needed, only {objfile}.)

- Section start/end symbols necessarily lie on the boundary between object
files, so are sometimes misreported as being in the wrong object file or
module. This is unlikely to be too troublesome for these symbols in
particular, but if anyone can figure out a way to fix this I'd be happy
to do it.

- There is no BPF iterator support yet (it's just a matter of adding it
if needed).

The commits in this series all have reviewed-by tags: they're all from
internal reviews, so please ignore them.

Differences from v8, February 2022:

- Add object file name handling, emitting only those object names needed to
disambiguate symbols, shortening them as much as possible compatible with
that.
- Rename .kallsyms_module_names to .kallsyms_mod_objnames now that it
contains object file names too.
- Fix a bug in optimize_obj2mod that prevented proper reuse of module names
for object files appearing in both multimodule modules and single-module
modules: saves a few KiB more, often more than the space increase due to
object file name handling.
- Rebased atop v6.1-rc2: move modules_thick.builtin generation into
the top-level Kbuild accordingly, and adjust to getopt_long use in
scripts/kallsyms.
- Significant revisions to the cover letter.
- Add proof-of-concept kallmodsyms module support to perf.
- (This ping) confirmed that series applies atop v6.1-rc4 without
further changes.

Differences from v7, December 2021:

- Adjust for changes in the v5.17 merge window. Adjust a few commit
messages and shrink the cover letter.
- Drop the symbol-size patch, probably better done from userspace.

Differences from v6, November 2021:

- Adjust for rewrite of confdata machinery in v5.16 (tristate.conf
handling is now more of a rewrite than a reversion)

Differences from v5, October 2021:

- Fix generation of mapfiles under UML

Differences from v4, September 2021:

- Fix building of tristate.conf if missing (usually concealed by the
syncconfig being run for other reasons, but not always: the kernel
test robot spotted it).
- Forward-port atop v5.15-rc3.

Differences from v3, August 2021:

- Fix a kernel test robot warning in get_ksymbol_core (possible
use of uninitialized variable if kallmodsyms was wanted but
kallsyms_module_offsets was not present, which is most unlikely).

Differences from v2, June 2021:

- Split the series up. In particular, the size impact of the table
optimizer is now quantified, and the symbol-size patch is split out and
turned into an RFC patch, with the /proc/kallmodsyms format before that
patch lacking a size column. Some speculation on how to make the symbol
sizes less space-wasteful is added (but not yet implemented).

- Drop a couple of unnecessary #includes, one unnecessarily exported
symbol, and a needless de-staticing.

Differences from v1, in 2019:

- Move from a straight symbol->module name mapping to a mapping from
address-range to TU to module name list, bringing major space savings
over the previous approach and support for object files used by many
built-in modules at the same time, at the cost of a slightly more complex
approach (unavoidably so, I think, given that we have to merge three data
sources together: the link map in .tmp_vmlinux.ranges, the nm output on
stdin, and the mapping from TU name to module names in
modules_thick.builtin).

We do opportunistic merging of TUs if they cite the same modules and
reuse module names where doing so is simple: see optimize_obj2mod below.
I considered more extensive searches for mergeable entries and more
intricate encodings of the module name list allowing TUs that are used by
overlapping sets of modules to share their names, but such modules are
rare enough (and such overlapping sharings are vanishingly rare) that it
seemed likely to save only a few bytes at the cost of much more
hard-to-test code. This is doubly true now that the tables needed are
only a few kilobytes in length.

Signed-off-by: Nick Alcock <[email protected]>
Signed-off-by: Eugene Loh <[email protected]>
Reviewed-by: Kris Van Hees <[email protected]>

Nick Alcock (8):
kbuild: bring back tristate.conf
kbuild: add modules_thick.builtin
kbuild: generate an address ranges map at vmlinux link time
kallsyms: introduce sections needed to map symbols to built-in modules
kallsyms: optimize .kallsyms_modules*
kallsyms: distinguish text symbols fully using object file names
kallsyms: add /proc/kallmodsyms for text symbol disambiguation
perf: proof-of-concept kallmodsyms support

.gitignore | 1 +
Documentation/dontdiff | 1 +
Documentation/kbuild/kconfig.rst | 5 +
Kbuild | 22 +
Makefile | 9 +-
init/Kconfig | 9 +
kernel/kallsyms.c | 277 ++++++-
kernel/kallsyms_internal.h | 14 +
scripts/Kbuild.include | 6 +
scripts/Makefile | 6 +
scripts/Makefile.modbuiltin | 56 ++
scripts/kallsyms.c | 1187 +++++++++++++++++++++++++++++-
scripts/kconfig/confdata.c | 41 +-
scripts/link-vmlinux.sh | 15 +-
scripts/modules_thick.c | 200 +++++
scripts/modules_thick.h | 48 ++
tools/perf/builtin-kallsyms.c | 35 +-
tools/perf/util/event.c | 14 +-
tools/perf/util/machine.c | 6 +-
tools/perf/util/machine.h | 1 +
tools/perf/util/symbol.c | 207 ++++--
tools/perf/util/symbol.h | 12 +-
22 files changed, 2073 insertions(+), 99 deletions(-)
create mode 100644 scripts/Makefile.modbuiltin
create mode 100644 scripts/modules_thick.c
create mode 100644 scripts/modules_thick.h

--
2.38.0.266.g481848f278



2022-11-09 14:44:01

by Nick Alcock

[permalink] [raw]
Subject: [PATCH v9 6/8] kallsyms: distinguish text symbols fully using object file names

The commits before this one allow you to distinguish identically-named
text symbols located in different built-in object files from each other;
but identically-named symbols can appear in any object files at all,
including object files that cannot be built as modules.

We already have nearly all the machinery to disambiguate these symbols
as well. Since any given object file can contain at most one definition
of a given symbol, it suffices to name the object files containing any
symbol which is otherwise ambiguous. (No others need be named, saving
a bunch of space).

We associate address ranges with object file names using a new
.kallsyms_objfiles section just like the previously-added
.kallsyms_modules section.

But that's not quite enough. Even the object file name is ambiguous in
some cases: e.g. there are a lot of files named "core.o" in the kernel.
We could just store the full pathname for every object file, but this is
needlessly wasteful: doing this eats more than 50KiB in object file
names alone, and nearly all the content of every name is repeated for
many names. But if we store the object file names in the same section
as the built-in module names, drop the .o, and store minimal path
suffixes, we can save almost all that space. (For example, "core.o"
would be stored as "core" unless there are ambiguous symbols in two
different object files both named "core", in which case they'd be named
"sched/core" and "futex/core", etc, possibly re-extending to
"kernel/sched/core" if still ambiguous).

We do this by a repeated-rehashing process. First, we compute a hash
value for symbol\0modhash for every symbol (the modhash is ignored if
this is a built-in symbol). Any two symbols with the same such hash are
identically-named: add the maximally-shortened (one-component,
.o-stripped) object file name for all such symbols, and rehash, this
time hashing symbol\0objname\0modhash. Any two symbols which still have
the same hash are still ambiguous: lengthen the name given to one of the
symbols' object files and repeat. Eventually, all the ambiguity will go
away. (We do have to take care not to re-lengthen anything we already
lengthened in any given hashing round.)

This involves multiple sorting passes but the impact on compilation time
appears to be nearly zero, and the impact on space in the running kernel
is noticeable: only a few dozen names need lengthening, so we can
completely ignore the overhead from storing repeated path components
because there are hardly any of them.

But that's not all. We can also do similar optimization tricks to what
was done with .kallsyms_modules, reusing module names and names of
already-emitted object files: so any given object file name only appears
once in the strtab, and can be cited by many address ranges and even by
module entries.

Put all this together and the net overhead of this in my testing is
about 3KiB of new object file names in the .kallsyms_mod_objnames table
and 6KiB for the .kallsyms_objfiles table (mostly zeroes: in future
maybe we can find a way to elide some of those, but 6KiB is small enough
that it's not worth taking too much effort).

No ambiguous textual symbols remain outside actual modules (which can
still contain identically-named symbols in different object files
because kallsyms doesn't run over them so none of these tables can be
built for them. At least, it doesn't yet.)

Signed-off-by: Nick Alcock <[email protected]>
Reviewed-by: Kris Van Hees <[email protected]>
---

Notes:
v9: new.

scripts/kallsyms.c | 559 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 546 insertions(+), 13 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index f89f569eb3c9..ffb69a8f6ff8 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -113,6 +113,9 @@ static unsigned int memhash(char *s, size_t len)
return hash;
}

+/*
+ * Object file -> module and shortened object file name tracking machinery.
+ */
#define OBJ2MOD_BITS 10
#define OBJ2MOD_N (1 << OBJ2MOD_BITS)
#define OBJ2MOD_MASK (OBJ2MOD_N - 1)
@@ -143,15 +146,40 @@ struct obj2mod_elem {
struct obj2mod_elem *mod2obj_next;
};

+/*
+ * Shortened object file names. These are only ever consulted after checking
+ * the obj2mod hashes: names that already exist in there are used directly from
+ * there (pointed to via the mod_xref field) rather than being re-emitted.
+ * Entries that do not exist there are added to the end of the mod_objnames
+ * list.
+ */
+struct obj2short_elem {
+ const char *obj;
+ char *desuffixed; /* objname sans suffix */
+ const char *short_obj; /* shortened at / and suffix */
+ int short_offset; /* offset of short name in .kallsyms_mod_objnames */
+ int last_rehash; /* used during disambiguate_hash_syms */
+
+ struct obj2mod_elem *mod_xref;
+ struct obj2short_elem *short_xref;
+ struct obj2short_elem *short_next;
+};
+
/*
* Map from object files to obj2mod entries (a unique mapping), and vice versa
* (not unique, but entries for objfiles in more than one module in this hash
- * are ignored).
+ * are ignored); also map from object file names to shortened names for them
+ * (also unique: there is no point storing both longer and shorter forms of one
+ * name, so if a longer name is needed we consistently use it instead of the
+ * shorter form.)
+ *
+ * obj2short is populated very late, at disambiguate_syms time.
*/

static struct obj2mod_elem *obj2mod[OBJ2MOD_N];
static struct obj2mod_elem *mod2obj[OBJ2MOD_N];
-static size_t num_objfiles;
+static struct obj2short_elem *obj2short[OBJ2MOD_N];
+static size_t num_objfiles, num_shortnames;

/*
* An ordered list of address ranges and the objfile that occupies that range.
@@ -165,6 +193,9 @@ struct addrmap_entry {
static struct addrmap_entry *addrmap;
static int addrmap_num, addrmap_alloced;

+static void disambiguate_syms(void);
+static void optimize_objnames(void);
+
static void obj2mod_init(void)
{
memset(obj2mod, 0, sizeof(obj2mod));
@@ -182,6 +213,18 @@ static struct obj2mod_elem *obj2mod_get(const char *obj)
return NULL;
}

+static struct obj2short_elem *obj2short_get(const char *obj)
+{
+ int i = strhash(obj) & OBJ2MOD_MASK;
+ struct obj2short_elem *elem;
+
+ for (elem = obj2short[i]; elem; elem = elem->short_next) {
+ if (strcmp(elem->obj, obj) == 0)
+ return elem;
+ }
+ return NULL;
+}
+
/*
* Note that a given object file is found in some module, interning it in the
* obj2mod hash. Should not be called more than once for any given (module,
@@ -254,6 +297,12 @@ static int qmodhash(const void *a, const void *b)
return 0;
}

+static int qobj2short(const void *a, const void *b)
+{
+ return strcmp((*(struct obj2short_elem **)a)->short_obj,
+ (*(struct obj2short_elem **)b)->short_obj);
+}
+
/*
* Associate all object files in obj2mod which refer to the same module with a
* single obj2mod entry for emission, preferring to point into the module list
@@ -393,6 +442,336 @@ static void optimize_obj2mod(void)
fprintf(stderr, "kallsyms: out of memory optimizing module list\n");
exit(EXIT_FAILURE);
}
+
+/*
+ * Associate all short-name entries in obj2short that refer to the same short
+ * name with a single entry for emission, either (preferably) a module that
+ * shares that name or (alternatively) the first obj2short entry referencing
+ * that name.
+ */
+static void optimize_objnames(void)
+{
+ size_t i;
+ size_t num_objnames = 0;
+ struct obj2short_elem *elem;
+ struct obj2short_elem **uniq;
+ struct obj2short_elem *last;
+
+ uniq = malloc(sizeof(struct obj2short_elem *) * num_shortnames);
+ if (uniq == NULL) {
+ fprintf(stderr, "kallsyms: out of memory optimizing object file name list\n");
+ exit(EXIT_FAILURE);
+ }
+
+ /*
+ * Much like optimize_obj2mod, except there is no need to canonicalize
+ * anything or handle multimodule entries, and we need to chase down
+ * possible entries in mod2obj first (so as not to duplicate them in the
+ * final kallsyms_mod_objnames strtab).
+ */
+ for (i = 0; i < OBJ2MOD_N; i++)
+ for (elem = obj2short[i]; elem; elem = elem->short_next)
+ uniq[num_objnames++] = elem;
+
+ qsort(uniq, num_objnames, sizeof(struct obj2short_elem *), qobj2short);
+
+ for (i = 0, last = NULL; i < num_objnames; i++) {
+ int h = strhash(uniq[i]->short_obj) & OBJ2MOD_MASK;
+ struct obj2mod_elem *mod_elem;
+
+ for (mod_elem = mod2obj[h]; mod_elem;
+ mod_elem = mod_elem->mod2obj_next) {
+ /*
+ * mod_elem entries are only valid if they are for
+ * single-module objfiles: see obj2mod_add
+ */
+ if (mod_elem->nmods > 1)
+ continue;
+
+ if (strcmp(mod_elem->mods, uniq[i]->short_obj) != 0)
+ continue;
+ uniq[i]->mod_xref = mod_elem;
+ break;
+ }
+
+ /*
+ * Only look for a short_xref match if we don't already have one
+ * in mod_xref. (This means that multiple objfiles with the
+ * same short name that is also a module name all chain directly
+ * to the module name via mod_xref, rather than going through a
+ * chain of short_xrefs.)
+ */
+ if (uniq[i]->mod_xref)
+ continue;
+
+ if (last != NULL && strcmp(last->short_obj,
+ uniq[i]->short_obj) == 0) {
+ uniq[i]->short_xref = last;
+ continue;
+ }
+
+ last = uniq[i];
+ }
+
+ free(uniq);
+}
+
+/*
+ * Used inside disambiguate_syms to identify colliding symbols. We spot this by
+ * hashing symbol\0modhash (or just the symbol name if this is in the core
+ * kernel) and seeing if that collides. (This means we don't need to bother
+ * canonicalizing the module list, since optimize_obj2mod already did it for
+ * us.)
+ *
+ * If that collides, we try disambiguating by adding ever-longer pieces of the
+ * object file name before the modhash until we no longer collide. The result
+ * of this repeated addition becomes the obj2short hashtab.
+ */
+struct sym_maybe_collides {
+ struct sym_entry *sym;
+ struct addrmap_entry *addr;
+ struct obj2short_elem *short_objname;
+ unsigned int symhash;
+};
+
+static int qsymhash(const void *a, const void *b)
+{
+ const struct sym_maybe_collides *el_a = a;
+ const struct sym_maybe_collides *el_b = b;
+ if (el_a->symhash < el_b->symhash)
+ return -1;
+ else if (el_a->symhash > el_b->symhash)
+ return 1;
+ return 0;
+}
+
+static int find_addrmap(const void *a, const void *b)
+{
+ const struct sym_entry *sym = a;
+ const struct addrmap_entry *map = b;
+
+ if (sym->addr < map->addr)
+ return -1;
+ else if (sym->addr >= map->end_addr)
+ return 1;
+ return 0;
+}
+
+/*
+ * Allocate or lengthen an object file name for a symbol that needs it.
+ */
+static int lengthen_short_name(struct sym_maybe_collides *sym, int hash_cycle)
+{
+ struct obj2short_elem *short_objname = obj2short_get(sym->addr->obj);
+
+ if (!short_objname) {
+ int i = strhash(sym->addr->obj) & OBJ2MOD_MASK;
+ char *p;
+
+ short_objname = malloc(sizeof(struct obj2short_elem));
+ if (short_objname == NULL)
+ goto oom;
+
+ /*
+ * New symbol: try maximal shortening, which is just the object
+ * file name (no directory) with the suffix removed (the suffix
+ * is useless for disambiguation since it is almost always .o).
+ *
+ * Add a bit of paranoia to allow for names starting with /,
+ * ending with ., and names with no suffix. (At least two of
+ * these are most unlikely, but possible.)
+ */
+
+ memset(short_objname, 0, sizeof(struct obj2short_elem));
+ short_objname->obj = sym->addr->obj;
+
+ p = strrchr(sym->addr->obj, '.');
+ if (p)
+ short_objname->desuffixed = strndup(sym->addr->obj,
+ p - sym->addr->obj);
+ else
+ short_objname->desuffixed = strdup(sym->addr->obj);
+
+ if (short_objname->desuffixed == NULL)
+ goto oom;
+
+ p = strrchr(short_objname->desuffixed, '/');
+ if (p && p[1] != 0)
+ short_objname->short_obj = p + 1;
+ else
+ short_objname->short_obj = short_objname->desuffixed;
+
+ short_objname->short_next = obj2short[i];
+ short_objname->last_rehash = hash_cycle;
+ obj2short[i] = short_objname;
+
+ num_shortnames++;
+ return 1;
+ }
+
+ /*
+ * Objname already lengthened by a previous symbol clash: do nothing
+ * until we rehash again.
+ */
+ if (short_objname->last_rehash == hash_cycle)
+ return 0;
+ short_objname->last_rehash = hash_cycle;
+
+ /*
+ * Existing symbol: lengthen the objname we already have.
+ */
+
+ if (short_objname->desuffixed == short_objname->short_obj) {
+ fprintf(stderr, "Cannot disambiguate %s: objname %s is "
+ "max-length but still colliding\n",
+ sym->sym->sym, short_objname->short_obj);
+ return 0;
+ }
+
+ /*
+ * Allow for absolute paths, where the first byte is '/'.
+ */
+
+ if (short_objname->desuffixed >= short_objname->short_obj - 2)
+ short_objname->short_obj = short_objname->desuffixed;
+ else {
+ for (short_objname->short_obj -= 2;
+ short_objname->short_obj > short_objname->desuffixed &&
+ *short_objname->short_obj != '/';
+ short_objname->short_obj--);
+
+ if (*short_objname->short_obj == '/')
+ short_objname->short_obj++;
+ }
+ return 1;
+ oom:
+ fprintf(stderr, "Out of memory disambiguating syms\n");
+ exit(EXIT_FAILURE);
+}
+
+/*
+ * Do one round of disambiguation-check symbol hashing, factoring in the current
+ * set of applicable shortened object file names for those symbols that need
+ * them.
+ */
+static void disambiguate_hash_syms(struct sym_maybe_collides *syms)
+{
+ size_t i;
+ for (i = 0; i < table_cnt; i++) {
+ struct obj2short_elem *short_objname = NULL;
+ char *tmp, *p;
+ size_t tmp_size;
+
+ if (syms[i].sym == NULL) {
+ syms[i].symhash = 0;
+ continue;
+ }
+
+ short_objname = obj2short_get(syms[i].addr->obj);
+
+ tmp_size = strlen((char *) &(syms[i].sym->sym[1])) + 1;
+
+ if (short_objname)
+ tmp_size += strlen(short_objname->short_obj) + 1;
+
+ if (syms[i].addr->objfile)
+ tmp_size += sizeof(syms[i].addr->objfile->modhash);
+
+ tmp = malloc(tmp_size);
+ if (tmp == NULL) {
+ fprintf(stderr, "Out of memory disambiguating syms\n");
+ exit(EXIT_FAILURE);
+ }
+
+ p = stpcpy(tmp, (char *) &(syms[i].sym->sym[1]));
+ p++;
+ if (short_objname) {
+ p = stpcpy(p, short_objname->short_obj);
+ p++;
+ }
+ if (syms[i].addr->objfile)
+ memcpy(p, &(syms[i].addr->objfile->modhash),
+ sizeof(syms[i].addr->objfile->modhash));
+
+ syms[i].symhash = memhash(tmp, tmp_size);
+ free(tmp);
+ }
+
+ qsort(syms, table_cnt, sizeof (struct sym_maybe_collides), qsymhash);
+}
+
+/*
+ * Figure out which object file names are necessary to disambiguate all symbols
+ * in the linked kernel: transform them for minimum length while retaining
+ * disambiguity: point to them in obj2short.
+ */
+static void disambiguate_syms(void)
+{
+ size_t i;
+ int retry;
+ int hash_cycle = 0;
+ unsigned int lasthash;
+ struct sym_maybe_collides *syms;
+
+ syms = calloc(table_cnt, sizeof(struct sym_maybe_collides));
+
+ if (syms == NULL)
+ goto oom;
+
+ /*
+ * Initial table population: symbol-dependent things not affected by
+ * disambiguation rounds.
+ */
+ for (i = 0; i < table_cnt; i++) {
+ struct addrmap_entry *addr;
+
+ /*
+ * Only bother doing anything for function symbols.
+ */
+ if (table[i]->sym[0] != 't' && table[i]->sym[0] != 'T' &&
+ table[i]->sym[0] != 'w' && table[i]->sym[0] != 'W')
+ continue;
+
+ addr = bsearch(table[i], addrmap, addrmap_num,
+ sizeof(struct addrmap_entry), find_addrmap);
+
+ /*
+ * Some function symbols (section start symbols, discarded
+ * non-text-range symbols, etc) don't appear in the linker map
+ * at all.
+ */
+ if (addr == NULL)
+ continue;
+
+ syms[i].sym = table[i];
+ syms[i].addr = addr;
+ }
+
+ do {
+ hash_cycle++;
+ retry = 0;
+ lasthash = 0;
+ disambiguate_hash_syms(syms);
+
+ for (i = 0; i < table_cnt; i++) {
+ if (syms[i].sym == NULL)
+ continue;
+ if (syms[i].symhash == lasthash) {
+ if (lengthen_short_name(&syms[i], hash_cycle))
+ retry = 1;
+ }
+ lasthash = syms[i].symhash;
+ }
+ } while (retry);
+
+ free(syms);
+ return;
+ oom:
+ fprintf(stderr, "kallsyms: out of memory disambiguating syms\n");
+ exit(EXIT_FAILURE);
+
+}
+
#endif /* CONFIG_KALLMODSYMS */

static void usage(void)
@@ -424,6 +803,7 @@ static bool is_ignored_symbol(const char *name, char type)
"kallsyms_relative_base",
"kallsyms_num_syms",
"kallsyms_num_modules",
+ "kallsyms_num_objfiles",
"kallsyms_names",
"kallsyms_markers",
"kallsyms_token_table",
@@ -431,6 +811,7 @@ static bool is_ignored_symbol(const char *name, char type)
"kallsyms_module_offsets",
"kallsyms_module_addresses",
"kallsyms_modules",
+ "kallsyms_objfiles",
"kallsyms_mod_objnames",
"kallsyms_mod_objnames_len",
/* Exclude linker generated symbols which vary between passes */
@@ -700,6 +1081,7 @@ static void output_address(unsigned long long addr)
static void output_kallmodsyms_mod_objnames(void)
{
struct obj2mod_elem *elem;
+ struct obj2short_elem *short_elem;
size_t offset = 1;
size_t i;

@@ -755,15 +1137,75 @@ static void output_kallmodsyms_mod_objnames(void)
}
}
}
+
+ /*
+ * Module names are done; now emit objfile names that don't match
+ * objfile names. They go in the same section to enable deduplication
+ * between (maximally-shortened) objfile names and module names.
+ * (This is another reason why objfile names drop the suffix.)
+ */
+ for (i = 0; i < OBJ2MOD_N; i++) {
+ for (short_elem = obj2short[i]; short_elem;
+ short_elem = short_elem->short_next) {
+
+ /* Already emitted? */
+ if (short_elem->mod_xref)
+ continue;
+
+ if (short_elem->short_xref)
+ short_elem = short_elem->short_xref;
+
+ if (short_elem->short_offset != 0)
+ continue;
+
+ printf("/* 0x%lx: shortened from %s */\n", offset,
+ short_elem->obj);
+
+ short_elem->short_offset = offset;
+ printf("\t.asciz\t\"%s\"\n", short_elem->short_obj);
+ offset += strlen(short_elem->short_obj) + 1;
+ }
+ }
+
printf("\n");
output_label("kallsyms_mod_objnames_len");
printf("\t.long\t%zi\n", offset);
}

+/*
+ * Return 1 if this address range cites the same built-in module and objfile
+ * name as the previous one.
+ */
+static int same_kallmodsyms_range(int i)
+{
+ struct obj2short_elem *last_short;
+ struct obj2short_elem *this_short;
+ if (i == 0)
+ return 0;
+
+ last_short = obj2short_get(addrmap[i-1].obj);
+ this_short = obj2short_get(addrmap[i].obj);
+
+ if (addrmap[i-1].objfile == addrmap[i].objfile) {
+
+ if ((last_short == NULL && this_short != NULL) ||
+ (last_short != NULL && this_short == NULL))
+ return 0;
+
+ if (last_short == NULL && this_short == NULL)
+ return 1;
+
+ if (strcmp(last_short->short_obj, this_short->short_obj) == 0)
+ return 1;
+ }
+ return 0;
+}
+
static void output_kallmodsyms_objfiles(void)
{
size_t i = 0;
size_t emitted_offsets = 0;
+ size_t emitted_modules = 0;
size_t emitted_objfiles = 0;

if (base_relative)
@@ -775,12 +1217,15 @@ static void output_kallmodsyms_objfiles(void)
long long offset;
int overflow;

- /*
- * Fuse consecutive address ranges citing the same object file
- * into one.
- */
- if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile)
- continue;
+ printf("/* 0x%llx--0x%llx: %s */\n", addrmap[i].addr,
+ addrmap[i].end_addr, addrmap[i].obj);
+
+ /*
+ * Fuse consecutive address ranges citing the same built-in
+ * module and objfile name into one.
+ */
+ if (same_kallmodsyms_range(i))
+ continue;

if (base_relative) {
if (!absolute_percpu) {
@@ -807,11 +1252,12 @@ static void output_kallmodsyms_objfiles(void)

for (i = 0; i < addrmap_num; i++) {
struct obj2mod_elem *elem = addrmap[i].objfile;
+ struct obj2mod_elem *orig_elem = NULL;
int orig_nmods;
const char *orig_modname;
int mod_offset;

- if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile)
+ if (same_kallmodsyms_range(i))
continue;

/*
@@ -819,8 +1265,10 @@ static void output_kallmodsyms_objfiles(void)
* built-in module.
*/
if (addrmap[i].objfile == NULL) {
+ printf("/* 0x%llx--0x%llx: %s: built-in */\n",
+ addrmap[i].addr, addrmap[i].end_addr, addrmap[i].obj);
printf("\t.long\t0x0\n");
- emitted_objfiles++;
+ emitted_modules++;
continue;
}

@@ -835,8 +1283,10 @@ static void output_kallmodsyms_objfiles(void)
* always points at the start of the xref target, so its offset
* can be used as is.
*/
- if (elem->xref)
+ if (elem->xref) {
+ orig_elem = elem;
elem = elem->xref;
+ }

if (elem->nmods == 1 || orig_nmods > 1) {

@@ -872,6 +1322,19 @@ static void output_kallmodsyms_objfiles(void)
* the multimodule entry.
*/
mod_offset += onemod - elem->mods + 2;
+
+ /*
+ * If this was the result of an xref chase, store this
+ * mod_offset in the original entry so we can just reuse
+ * it if an objfile shares this name.
+ */
+
+ printf("/* 0x%llx--0x%llx: %s: single-module ref to %s in multimodule at %x */\n",
+ addrmap[i].addr, addrmap[i].end_addr,
+ orig_elem->mods, onemod, elem->mod_offset);
+
+ if (orig_elem)
+ orig_elem->mod_offset = mod_offset;
}

/*
@@ -881,12 +1344,68 @@ static void output_kallmodsyms_objfiles(void)
assert(elem->mod_offset != 0);

printf("\t.long\t0x%x\n", mod_offset);
- emitted_objfiles++;
+ emitted_modules++;
}

- assert(emitted_offsets == emitted_objfiles);
+ assert(emitted_offsets == emitted_modules);
output_label("kallsyms_num_modules");
+ printf("\t.long\t%zi\n", emitted_modules);
+
+ output_label("kallsyms_objfiles");
+
+ for (i = 0; i < addrmap_num; i++) {
+ struct obj2short_elem *elem;
+ int mod_offset;
+
+ if (same_kallmodsyms_range(i))
+ continue;
+
+ /*
+ * No corresponding objfile name: no disambiguation needed;
+ * point at 0.
+ */
+ elem = obj2short_get(addrmap[i].obj);
+
+ if (elem == NULL) {
+ printf("/* 0x%llx--0x%llx: %s: unambiguous */\n",
+ addrmap[i].addr, addrmap[i].end_addr,
+ addrmap[i].obj);
+ printf("\t.long\t0x0\n");
+ emitted_objfiles++;
+ continue;
+ }
+
+ /*
+ * Maybe the name is also used for a module: if it is, it cannot
+ * be a multimodule.
+ */
+
+ if (elem->mod_xref) {
+ assert(elem->mod_xref->nmods == 1);
+ mod_offset = elem->mod_xref->mod_offset;
+ printf("/* 0x%llx--0x%llx: %s: shortened as %s, references module */\n",
+ addrmap[i].addr, addrmap[i].end_addr,
+ addrmap[i].obj, elem->short_obj);
+ } else {
+ /*
+ * A name only used for objfiles. Chase down xrefs to
+ * reuse existing entries.
+ */
+ if (elem->short_xref)
+ elem = elem->short_xref;
+
+ mod_offset = elem->short_offset;
+ printf("/* 0x%llx--0x%llx: %s: shortened as %s */\n",
+ addrmap[i].addr, addrmap[i].end_addr,
+ addrmap[i].obj, elem->short_obj);
+ }
+ printf("\t.long\t0x%x\n", mod_offset);
+ emitted_objfiles++;
+ }
+ assert(emitted_offsets == emitted_objfiles);
+ output_label("kallsyms_num_objfiles");
printf("\t.long\t%zi\n", emitted_objfiles);
+
printf("\n");
}
#endif /* CONFIG_KALLMODSYMS */
@@ -1430,6 +1949,20 @@ static void read_modules(const char *modules_builtin)
* Read linker map.
*/
read_linker_map();
+
+ /*
+ * Now the modules are sorted out and we know their address ranges, use
+ * the modhashes computed in optimize_obj2mod to identify any symbols
+ * that are still ambiguous and set up the minimal representation of
+ * their objfile name to disambiguate them.
+ */
+ disambiguate_syms();
+
+ /*
+ * Now we have objfile names, optimize the objfile list.
+ */
+ optimize_objnames();
+
}
#else
static void read_modules(const char *unused) {}
--
2.38.0.266.g481848f278


2022-11-09 14:48:57

by Nick Alcock

[permalink] [raw]
Subject: [PATCH v9 7/8] kallsyms: add /proc/kallmodsyms for text symbol disambiguation

Use the tables added in the previous commits to introduce a new
/proc/kallmodsyms, in which [module names] are also given for things
that *could* have been modular had they not been built in to the kernel.
So symbols that are part of, say, ext4 are reported as [ext4] even if
ext4 happens to be buiilt in to the kernel in this configuration.

This helps disambiguate symbols with identical names when some are in
built-in modules are some are not, but if symbols are still ambiguous,
{object file names} are added as needed to disambiguate them. The
object file names are only shown if they would prevent ambiguity, and
are minimized by chopping off as many leading path components as
possible without (symbol, module, objfile) combinations becoming
ambiguous again. (Not every symbol with an {object file name} is
necessarily ambiguous, but at least one symbol in any such object file
would have been ambiguous if the object file was not mentioned.)

Symbols that are part of multiple modules at the same time are shown
with [multiple] [module names]: consumers will have to be ready to
handle such lines. Also, kernel symbols for built-in modules will be
sorted by address, as usual for the core kernel, so will probably appear
interspersed with other symbols that are part of different modules and
non-modular always-built-in symbols, which, as usual, have no
square-bracketed module denotation. This differs from /proc/kallsyms;
even though /proc/kallsyms shows the same symbols as /proc/kallmodsyms
in the same order, the only modules it names are loadable ones, which
are necessarily in single contiguous blocks and thus shown contiguously.

The result looks like this: ([...] to show where lines are omitted for
brevity):

ffffffff97606e50 t not_visible
ffffffff97606e70 T perf_msr_probe
ffffffff97606f80 t test_msr [rapl]
ffffffff97606fa0 t __rapl_pmu_event_start [rapl]
[...]
ffffffffa6007350 t rapl_pmu_event_stop [rapl]
ffffffffa6007440 t rapl_pmu_event_del [rapl]
ffffffffa6007460 t rapl_hrtimer_handle [rapl]
ffffffffa6007500 t rapl_pmu_event_read [rapl]
ffffffffa6007520 t rapl_pmu_event_init [rapl]
ffffffffa6007630 t rapl_cpu_offline [rapl]
ffffffffa6007710 t amd_pmu_event_map {core.o}
ffffffffa6007750 t amd_pmu_add_event {core.o}
ffffffffa6007760 t amd_put_event_constraints_f17h {core.o}

The [rapl] notation is emitted even if rapl is built into the kernel
(but, obviously, not if it's not in the .config at all, or is in a
loadable module that is not loaded). The {core.o} is an object file
name.

Further down, we see what happens when object files are reused by
multiple modules, all of which are built in to the kernel, and some of
which have symbols that would be ambiguous without an object file name
attached in addition to the module names:

ffffffff97d7aed0 t liquidio_pcie_mmio_enabled [liquidio]
ffffffff97d7aef0 t liquidio_pcie_resume [liquidio]
ffffffff97d7af00 t liquidio_ptp_adjtime [liquidio]
ffffffff97d7af50 t liquidio_ptp_enable [liquidio]
ffffffff97d7af70 t liquidio_get_stats64 [liquidio]
ffffffff97d7b0f0 t liquidio_fix_features [liquidio]
ffffffff97d7b1c0 t liquidio_get_port_parent_id [liquidio]
[...]
ffffffff97d824c0 t lio_vf_rep_modinit [liquidio]
ffffffff97d824f0 t lio_vf_rep_modexit [liquidio]
ffffffff97d82520 t lio_ethtool_get_channels [liquidio] [liquidio_vf]
ffffffff97d82600 t lio_ethtool_get_ringparam [liquidio] [liquidio_vf]
ffffffff97d826a0 t lio_get_msglevel [liquidio] [liquidio_vf]
ffffffff97d826c0 t lio_vf_set_msglevel [liquidio] [liquidio_vf]
ffffffff97d826e0 t lio_get_pauseparam [liquidio] [liquidio_vf]
ffffffff97d82710 t lio_get_ethtool_stats [liquidio] [liquidio_vf]
ffffffff97d82e70 t lio_vf_get_ethtool_stats [liquidio] [liquidio_vf]
[...]
ffffffff97d91a80 t cn23xx_vf_mbox_thread [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91aa0 t cpumask_weight.constprop.0 [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91ac0 t cn23xx_vf_msix_interrupt_handler [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91bd0 t cn23xx_vf_get_oq_ticks [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91c00 t cn23xx_vf_ask_pf_to_do_flr [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91c70 t cn23xx_octeon_pfvf_handshake [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91e20 t cn23xx_setup_octeon_vf_device [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d92060 t octeon_mbox_read [liquidio] [liquidio_vf]
ffffffff97d92230 t octeon_mbox_write [liquidio] [liquidio_vf]
[...]
ffffffff97d946b0 t octeon_alloc_soft_command_resp [liquidio] [liquidio_vf]
ffffffff97d947e0 t octnet_send_nic_data_pkt [liquidio] [liquidio_vf]
ffffffff97d94820 t octnet_send_nic_ctrl_pkt [liquidio] [liquidio_vf]
ffffffff97d94ab0 t liquidio_get_stats64 [liquidio_vf]
ffffffff97d94c10 t liquidio_fix_features [liquidio_vf]
ffffffff97d94cd0 t wait_for_pending_requests [liquidio_vf]

Like /proc/kallsyms, the output is driven by address, so keeps the
curious property of /proc/kallsyms that symbols may appear repeatedly
with different addresses: but now, unlike in /proc/kallsyms, we can see
that those symbols appear repeatedly because they are *different
symbols* that ultimately belong to different modules or different object
files within a module, all of which are built in to the kernel.

As with /proc/kallsyms, non-root usage produces addresses that are
all zero.

I am not wedded to the name or format of /proc/kallmodsyms, but felt it
best to split it out of /proc/kallsyms to avoid breaking existing
kallsyms parsers.

This is currently driven by a new config option, but now that
kallmodsyms data uses very little space, this option might be something
people don't want to bother with: maybe we can just control it via
CONFIG_KALLSYMS or something.

Internally, this uses a new kallsyms_builtin_module_address() almost
identical to kallsyms_sym_address() to get the address corresponding to
a given .kallsyms_modules index, and a new get_builtin_modobj_idx quite
similar to get_symbol_pos to determine the index in the
.kallsyms_modules and .kallsyms_objfiles arrays that relate to a given
address. We save a little time by exploiting the fact that all callers
will only ever traverse this list from start to end by allowing them to
pass in the previous index returned from this function as a hint: thus
very few bsearches are actually needed. (In theory this could change to
just walk straight down kallsyms_module_addresses/offsets and not bother
bsearching at all, but doing it this way is hardly any slower and much
more robust.)

We explicitly filter out displaying modules for non-text symbols
(perhaps this could be lifted for initialized data symbols in future).
There might be occasional incorrect module or objfile names for section
start/end symbols.

The display process is complicated a little by the weird format of the
.kallsyms_mod_objnames table: we have to look for multimodule entries
and print them as space-separated lists of module names.

Signed-off-by: Nick Alcock <[email protected]>
Reviewed-by: Kris Van Hees <[email protected]>
---

Notes:
v9: add objfile support. Commit message adjustments.

kernel/kallsyms.c | 277 ++++++++++++++++++++++++++++++++++---
kernel/kallsyms_internal.h | 14 ++
2 files changed, 274 insertions(+), 17 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 60c20f301a6b..9667962173f1 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -187,6 +187,25 @@ static bool cleanup_symbol_name(char *s)
return false;
}

+#ifdef CONFIG_KALLMODSYMS
+static unsigned long kallsyms_builtin_module_address(int idx)
+{
+ if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+ return kallsyms_module_addresses[idx];
+
+ /* values are unsigned offsets if --absolute-percpu is not in effect */
+ if (!IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU))
+ return kallsyms_relative_base + (u32)kallsyms_module_offsets[idx];
+
+ /* ...otherwise, positive offsets are absolute values */
+ if (kallsyms_module_offsets[idx] >= 0)
+ return kallsyms_module_offsets[idx];
+
+ /* ...and negative offsets are relative to kallsyms_relative_base - 1 */
+ return kallsyms_relative_base - 1 - kallsyms_module_offsets[idx];
+}
+#endif
+
/* Lookup the address for this symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name)
{
@@ -293,6 +312,54 @@ static unsigned long get_symbol_pos(unsigned long addr,
return low;
}

+/*
+ * The caller passes in an address, and we return an index to the corresponding
+ * builtin module index in .kallsyms_modules and .kallsyms_objfiles, or
+ * (unsigned long) -1 if none match.
+ *
+ * The hint_idx, if set, is a hint as to the possible return value, to handle
+ * the common case in which consecutive runs of addresses relate to the same
+ * index.
+ */
+#ifdef CONFIG_KALLMODSYMS
+static unsigned long get_builtin_modobj_idx(unsigned long addr, unsigned long hint_idx)
+{
+ unsigned long low, high, mid;
+
+ if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+ BUG_ON(!kallsyms_module_addresses);
+ else
+ BUG_ON(!kallsyms_module_offsets);
+
+ /*
+ * Do a binary search on the sorted kallsyms_modules array. The last
+ * entry in this array indicates the end of the text section, not an
+ * object file.
+ */
+ low = 0;
+ high = kallsyms_num_modules - 1;
+
+ if (hint_idx > low && hint_idx < (high - 1) &&
+ addr >= kallsyms_builtin_module_address(hint_idx) &&
+ addr < kallsyms_builtin_module_address(hint_idx + 1))
+ return hint_idx;
+
+ if (addr >= kallsyms_builtin_module_address(low)
+ && addr < kallsyms_builtin_module_address(high)) {
+ while (high - low > 1) {
+ mid = low + (high - low) / 2;
+ if (kallsyms_builtin_module_address(mid) <= addr)
+ low = mid;
+ else
+ high = mid;
+ }
+ return low;
+ }
+
+ return (unsigned long) -1;
+}
+#endif
+
/*
* Lookup an address but don't bother to find any names.
*/
@@ -564,6 +631,9 @@ struct kallsym_iter {
char type;
char name[KSYM_NAME_LEN];
char module_name[MODULE_NAME_LEN];
+ const char *builtin_module_names;
+ const char *builtin_objfile_name;
+ unsigned long hint_builtin_modobj_idx;
int exported;
int show_value;
};
@@ -594,6 +664,9 @@ static int get_ksymbol_mod(struct kallsym_iter *iter)
&iter->value, &iter->type,
iter->name, iter->module_name,
&iter->exported);
+ iter->builtin_module_names = NULL;
+ iter->builtin_objfile_name = NULL;
+
if (ret < 0) {
iter->pos_mod_end = iter->pos;
return 0;
@@ -613,6 +686,9 @@ static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter)
&iter->value, &iter->type,
iter->name, iter->module_name,
&iter->exported);
+ iter->builtin_module_names = NULL;
+ iter->builtin_objfile_name = NULL;
+
if (ret < 0) {
iter->pos_ftrace_mod_end = iter->pos;
return 0;
@@ -627,6 +703,8 @@ static int get_ksymbol_bpf(struct kallsym_iter *iter)

strlcpy(iter->module_name, "bpf", MODULE_NAME_LEN);
iter->exported = 0;
+ iter->builtin_module_names = NULL;
+ iter->builtin_objfile_name = NULL;
ret = bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end,
&iter->value, &iter->type,
iter->name);
@@ -647,23 +725,74 @@ static int get_ksymbol_kprobe(struct kallsym_iter *iter)
{
strlcpy(iter->module_name, "__builtin__kprobes", MODULE_NAME_LEN);
iter->exported = 0;
+ iter->builtin_module_names = NULL;
+ iter->builtin_objfile_name = NULL;
return kprobe_get_kallsym(iter->pos - iter->pos_bpf_end,
&iter->value, &iter->type,
iter->name) < 0 ? 0 : 1;
}

/* Returns space to next name. */
-static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
+static unsigned long get_ksymbol_core(struct kallsym_iter *iter, int kallmodsyms)
{
unsigned off = iter->nameoff;

- iter->module_name[0] = '\0';
+ iter->exported = 0;
iter->value = kallsyms_sym_address(iter->pos);

iter->type = kallsyms_get_symbol_type(off);

+ iter->module_name[0] = '\0';
+ iter->builtin_module_names = NULL;
+ iter->builtin_objfile_name = NULL;
+
off = kallsyms_expand_symbol(off, iter->name, ARRAY_SIZE(iter->name));
+#ifdef CONFIG_KALLMODSYMS
+ if (kallmodsyms) {
+ unsigned long modobj_idx = (unsigned long) -1;
+
+ if (kallsyms_module_offsets)
+ modobj_idx =
+ get_builtin_modobj_idx(iter->value,
+ iter->hint_builtin_modobj_idx);

+ /*
+ * This is a built-in module iff the tables of built-in modules
+ * (address->module name mappings), object files (ditto), and
+ * module/objfile names are known, and if the address was found
+ * there, and if the corresponding module index is nonzero, and
+ * iff this is a text (or weak) symbol. All other cases mean
+ * off the end of the binary or in a non-modular range in
+ * between one or more modules.
+ *
+ * The same rules are true for kallsyms_objfiles, except that
+ * zero entries are much more common because we only record
+ * object file names if we need them to disambiguate one or more
+ * symbols: see scripts/kallsyms.c:disambiguate_syms.
+ *
+ * (Also guard against corrupt kallsyms_modules or
+ * kallsyms_objfiles arrays pointing off the end of
+ * kallsyms_mod_objnames.)
+ */
+ if (kallsyms_modules != NULL && kallsyms_mod_objnames != NULL &&
+ kallsyms_objfiles != NULL &&
+ (iter->type == 't' || iter->type == 'T' ||
+ iter->type == 'w' || iter->type == 'W') &&
+ modobj_idx != (unsigned long) -1) {
+
+ if (kallsyms_modules[modobj_idx] != 0 &&
+ kallsyms_modules[modobj_idx] < kallsyms_mod_objnames_len)
+ iter->builtin_module_names =
+ &kallsyms_mod_objnames[kallsyms_modules[modobj_idx]];
+
+ if (kallsyms_objfiles[modobj_idx] != 0 &&
+ kallsyms_objfiles[modobj_idx] < kallsyms_mod_objnames_len)
+ iter->builtin_objfile_name =
+ &kallsyms_mod_objnames[kallsyms_objfiles[modobj_idx]];
+ }
+ iter->hint_builtin_modobj_idx = modobj_idx;
+ }
+#endif
return off - iter->nameoff;
}

@@ -709,7 +838,7 @@ static int update_iter_mod(struct kallsym_iter *iter, loff_t pos)
}

/* Returns false if pos at or past end of file. */
-static int update_iter(struct kallsym_iter *iter, loff_t pos)
+static int update_iter(struct kallsym_iter *iter, loff_t pos, int kallmodsyms)
{
/* Module symbols can be accessed randomly. */
if (pos >= kallsyms_num_syms)
@@ -719,7 +848,7 @@ static int update_iter(struct kallsym_iter *iter, loff_t pos)
if (pos != iter->pos)
reset_iter(iter, pos);

- iter->nameoff += get_ksymbol_core(iter);
+ iter->nameoff += get_ksymbol_core(iter, kallmodsyms);
iter->pos++;

return 1;
@@ -729,14 +858,14 @@ static void *s_next(struct seq_file *m, void *p, loff_t *pos)
{
(*pos)++;

- if (!update_iter(m->private, *pos))
+ if (!update_iter(m->private, *pos, 0))
return NULL;
return p;
}

static void *s_start(struct seq_file *m, loff_t *pos)
{
- if (!update_iter(m->private, *pos))
+ if (!update_iter(m->private, *pos, 0))
return NULL;
return m->private;
}
@@ -745,7 +874,7 @@ static void s_stop(struct seq_file *m, void *p)
{
}

-static int s_show(struct seq_file *m, void *p)
+static int s_show_internal(struct seq_file *m, void *p, int kallmodsyms)
{
void *value;
struct kallsym_iter *iter = m->private;
@@ -756,23 +885,82 @@ static int s_show(struct seq_file *m, void *p)

value = iter->show_value ? (void *)iter->value : NULL;

- if (iter->module_name[0]) {
+ /*
+ * Real module, or built-in module and /proc/kallsyms being shown.
+ */
+ if (iter->module_name[0] != '\0' ||
+ (iter->builtin_module_names != NULL && kallmodsyms != 0)) {
char type;

/*
- * Label it "global" if it is exported,
- * "local" if not exported.
+ * Label it "global" if it is exported, "local" if not exported.
*/
type = iter->exported ? toupper(iter->type) :
tolower(iter->type);
- seq_printf(m, "%px %c %s\t[%s]\n", value,
- type, iter->name, iter->module_name);
- } else
- seq_printf(m, "%px %c %s\n", value,
+#ifdef CONFIG_KALLMODSYMS
+ if (kallmodsyms) {
+ /*
+ * /proc/kallmodsyms, built as a module.
+ */
+ if (iter->builtin_module_names == NULL)
+ seq_printf(m, "%px %c %s\t[%s]", value,
+ type, iter->name,
+ iter->module_name);
+ /*
+ * /proc/kallmodsyms, single-module symbol.
+ */
+ else if (*iter->builtin_module_names != '\0')
+ seq_printf(m, "%px %c %s\t[%s]", value,
+ type, iter->name,
+ iter->builtin_module_names);
+ /*
+ * /proc/kallmodsyms, multimodule symbol. Formatted
+ * as \0MODULE_COUNTmodule-1\0module-2\0, where
+ * MODULE_COUNT is a single byte, 2 or higher.
+ */
+ else {
+ size_t i = *(char *)(iter->builtin_module_names + 1);
+ const char *walk = iter->builtin_module_names + 2;
+
+ seq_printf(m, "%px %c %s\t[%s]", value,
+ type, iter->name, walk);
+
+ while (--i > 0) {
+ walk += strlen(walk) + 1;
+ seq_printf(m, " [%s]", walk);
+ }
+ }
+ /*
+ * Possibly there is an objfile name too, if needed to
+ * disambiguate at least one symbol.
+ */
+ if (iter->builtin_objfile_name)
+ seq_printf(m, " {%s.o}", iter->builtin_objfile_name);
+
+ seq_printf(m, "\n");
+ } else /* !kallmodsyms */
+#endif /* CONFIG_KALLMODSYMS */
+ seq_printf(m, "%px %c %s\t[%s]\n", value,
+ type, iter->name, iter->module_name);
+ } else {
+ seq_printf(m, "%px %c %s", value,
iter->type, iter->name);
+#ifdef CONFIG_KALLMODSYMS
+ if (kallmodsyms) {
+ if (iter->builtin_objfile_name)
+ seq_printf(m, "\t{%s.o}", iter->builtin_objfile_name);
+ }
+#endif /* CONFIG_KALLMODSYMS */
+ seq_printf(m, "\n");
+ }
return 0;
}

+static int s_show(struct seq_file *m, void *p)
+{
+ return s_show_internal(m, p, 0);
+}
+
static const struct seq_operations kallsyms_op = {
.start = s_start,
.next = s_next,
@@ -780,6 +968,36 @@ static const struct seq_operations kallsyms_op = {
.show = s_show
};

+#ifdef CONFIG_KALLMODSYMS
+static int s_mod_show(struct seq_file *m, void *p)
+{
+ return s_show_internal(m, p, 1);
+}
+
+static void *s_mod_next(struct seq_file *m, void *p, loff_t *pos)
+{
+ (*pos)++;
+
+ if (!update_iter(m->private, *pos, 1))
+ return NULL;
+ return p;
+}
+
+static void *s_mod_start(struct seq_file *m, loff_t *pos)
+{
+ if (!update_iter(m->private, *pos, 1))
+ return NULL;
+ return m->private;
+}
+
+static const struct seq_operations kallmodsyms_op = {
+ .start = s_mod_start,
+ .next = s_mod_next,
+ .stop = s_stop,
+ .show = s_mod_show
+};
+#endif
+
#ifdef CONFIG_BPF_SYSCALL

struct bpf_iter__ksym {
@@ -905,7 +1123,8 @@ bool kallsyms_show_value(const struct cred *cred)
}
}

-static int kallsyms_open(struct inode *inode, struct file *file)
+static int kallsyms_open_internal(struct inode *inode, struct file *file,
+ const struct seq_operations *ops)
{
/*
* We keep iterator in m->private, since normal case is to
@@ -913,7 +1132,7 @@ static int kallsyms_open(struct inode *inode, struct file *file)
* using get_symbol_offset for every symbol.
*/
struct kallsym_iter *iter;
- iter = __seq_open_private(file, &kallsyms_op, sizeof(*iter));
+ iter = __seq_open_private(file, ops, sizeof(*iter));
if (!iter)
return -ENOMEM;
reset_iter(iter, 0);
@@ -926,6 +1145,18 @@ static int kallsyms_open(struct inode *inode, struct file *file)
return 0;
}

+static int kallsyms_open(struct inode *inode, struct file *file)
+{
+ return kallsyms_open_internal(inode, file, &kallsyms_op);
+}
+
+#ifdef CONFIG_KALLMODSYMS
+static int kallmodsyms_open(struct inode *inode, struct file *file)
+{
+ return kallsyms_open_internal(inode, file, &kallmodsyms_op);
+}
+#endif
+
#ifdef CONFIG_KGDB_KDB
const char *kdb_walk_kallsyms(loff_t *pos)
{
@@ -936,7 +1167,7 @@ const char *kdb_walk_kallsyms(loff_t *pos)
reset_iter(&kdb_walk_kallsyms_iter, 0);
}
while (1) {
- if (!update_iter(&kdb_walk_kallsyms_iter, *pos))
+ if (!update_iter(&kdb_walk_kallsyms_iter, *pos, 0))
return NULL;
++*pos;
/* Some debugging symbols have no name. Ignore them. */
@@ -953,9 +1184,21 @@ static const struct proc_ops kallsyms_proc_ops = {
.proc_release = seq_release_private,
};

+#ifdef CONFIG_KALLMODSYMS
+static const struct proc_ops kallmodsyms_proc_ops = {
+ .proc_open = kallmodsyms_open,
+ .proc_read = seq_read,
+ .proc_lseek = seq_lseek,
+ .proc_release = seq_release_private,
+};
+#endif
+
static int __init kallsyms_init(void)
{
proc_create("kallsyms", 0444, NULL, &kallsyms_proc_ops);
+#ifdef CONFIG_KALLMODSYMS
+ proc_create("kallmodsyms", 0444, NULL, &kallmodsyms_proc_ops);
+#endif
return 0;
}
device_initcall(kallsyms_init);
diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h
index 2d0c6f2f0243..0ee6d97b732e 100644
--- a/kernel/kallsyms_internal.h
+++ b/kernel/kallsyms_internal.h
@@ -22,8 +22,22 @@ __section(".rodata") __attribute__((weak));
extern const unsigned long kallsyms_relative_base
__section(".rodata") __attribute__((weak));

+extern const unsigned long kallsyms_num_modules
+__section(".rodata") __attribute__((weak));
+
+extern const unsigned long kallsyms_num_objfiles
+__section(".rodata") __attribute__((weak));
+
+extern const unsigned long kallsyms_mod_objnames_len
+__section(".rodata") __attribute__((weak));
+
extern const char kallsyms_token_table[] __weak;
extern const u16 kallsyms_token_index[] __weak;
+extern const unsigned long kallsyms_module_addresses[] __weak;
+extern const int kallsyms_module_offsets[] __weak;
+extern const u32 kallsyms_modules[] __weak;
+extern const u32 kallsyms_objfiles[] __weak;
+extern const char kallsyms_mod_objnames[] __weak;

extern const unsigned int kallsyms_markers[] __weak;

--
2.38.0.266.g481848f278


2022-11-09 14:51:00

by Nick Alcock

[permalink] [raw]
Subject: [PATCH v9 1/8] kbuild: bring back tristate.conf

tristate.conf was dropped because it is not needed to build a
modules.builtin (although dropping it introduces a few false positives
into modules.builtin support), and doing so avoids one round of
recursion through the build tree to build it. But kallmodsyms support
requires building a mapping from object file name to built-in module
name for all builtin modules: this seems to me impossible to accomplish
without parsing all makefiles under the influence of tristate.conf,
since the makefiles are the only place this mapping is recorded.

So bring it back for this purpose. (Thanks to the refactoring in
the 5.16 timeframe, this is basically a reimplementation of commit
8b41fc4454e36fbfdbb23f940d023d4dece2de29 rather than a simple
reversion.)

Signed-off-by: Nick Alcock <[email protected]>
Reviewed-by: Victor Erminpour <[email protected]>
Reviewed-by: Kris Van Hees <[email protected]>
---

Notes:
v7: rewrite in terms of the new confdata refactoring
v8: adjust for changes in 5.17 merge window

Documentation/kbuild/kconfig.rst | 5 ++++
Makefile | 2 +-
scripts/kconfig/confdata.c | 41 +++++++++++++++++++++++++++-----
3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/Documentation/kbuild/kconfig.rst b/Documentation/kbuild/kconfig.rst
index 5967c79c3baa..e2c78760d442 100644
--- a/Documentation/kbuild/kconfig.rst
+++ b/Documentation/kbuild/kconfig.rst
@@ -162,6 +162,11 @@ KCONFIG_AUTOCONFIG
This environment variable can be set to specify the path & name of the
"auto.conf" file. Its default value is "include/config/auto.conf".

+KCONFIG_TRISTATE
+----------------
+This environment variable can be set to specify the path & name of the
+"tristate.conf" file. Its default value is "include/config/tristate.conf".
+
KCONFIG_AUTOHEADER
------------------
This environment variable can be set to specify the path & name of the
diff --git a/Makefile b/Makefile
index d148a55bfd0f..5d26447fecb8 100644
--- a/Makefile
+++ b/Makefile
@@ -793,7 +793,7 @@ $(KCONFIG_CONFIG):
#
# Do not use $(call cmd,...) here. That would suppress prompts from syncconfig,
# so you cannot notice that Kconfig is waiting for the user input.
-%/config/auto.conf %/config/auto.conf.cmd %/generated/autoconf.h %/generated/rustc_cfg: $(KCONFIG_CONFIG)
+%/config/auto.conf %/config/auto.conf.cmd %/generated/autoconf.h %/generated/rustc_cfg %/tristate.conf: $(KCONFIG_CONFIG)
$(Q)$(kecho) " SYNC $@"
$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig
else # !may-sync-config
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index b7c9f1dd5e42..160d12b69957 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -223,6 +223,13 @@ static const char *conf_get_rustccfg_name(void)
return name ? name : "include/generated/rustc_cfg";
}

+static const char *conf_get_tristate_name(void)
+{
+ char *name = getenv("KCONFIG_TRISTATE");
+
+ return name ? name : "include/config/tristate.conf";
+}
+
static int conf_set_sym_val(struct symbol *sym, int def, int def_flags, char *p)
{
char *p2;
@@ -670,8 +677,12 @@ static char *escape_string_value(const char *in)

enum output_n { OUTPUT_N, OUTPUT_N_AS_UNSET, OUTPUT_N_NONE };

+#define PRINT_ESCAPE 0x01
+#define PRINT_UPCASE 0x02
+#define PRINT_TRISTATE_ONLY 0x04
+
static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
- bool escape_string)
+ int flags)
{
const char *val;
char *escaped = NULL;
@@ -679,6 +690,9 @@ static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
if (sym->type == S_UNKNOWN)
return;

+ if (flags & PRINT_TRISTATE_ONLY && sym->type != S_TRISTATE)
+ return;
+
val = sym_get_string_value(sym);

if ((sym->type == S_BOOLEAN || sym->type == S_TRISTATE) &&
@@ -688,29 +702,38 @@ static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
return;
}

- if (sym->type == S_STRING && escape_string) {
+ if (sym->type == S_STRING && flags & PRINT_ESCAPE) {
escaped = escape_string_value(val);
val = escaped;
}

- fprintf(fp, "%s%s=%s\n", CONFIG_, sym->name, val);
+ if (flags & PRINT_UPCASE)
+ fprintf(fp, "%s%s=%c\n", CONFIG_, sym->name, (char)toupper(*val));
+ else
+ fprintf(fp, "%s%s=%s\n", CONFIG_, sym->name, val);

free(escaped);
}

static void print_symbol_for_dotconfig(FILE *fp, struct symbol *sym)
{
- __print_symbol(fp, sym, OUTPUT_N_AS_UNSET, true);
+ __print_symbol(fp, sym, OUTPUT_N_AS_UNSET, PRINT_ESCAPE);
}

static void print_symbol_for_autoconf(FILE *fp, struct symbol *sym)
{
- __print_symbol(fp, sym, OUTPUT_N_NONE, false);
+ __print_symbol(fp, sym, OUTPUT_N_NONE, 0);
+}
+
+static void print_symbol_for_tristate(FILE *fp, struct symbol *sym)
+{
+ __print_symbol(fp, sym, OUTPUT_N_NONE, PRINT_ESCAPE | PRINT_UPCASE |
+ PRINT_TRISTATE_ONLY);
}

void print_symbol_for_listconfig(struct symbol *sym)
{
- __print_symbol(stdout, sym, OUTPUT_N, true);
+ __print_symbol(stdout, sym, OUTPUT_N, PRINT_ESCAPE);
}

static void print_symbol_for_c(FILE *fp, struct symbol *sym)
@@ -1207,6 +1230,12 @@ int conf_write_autoconf(int overwrite)
if (ret)
return ret;

+ ret = __conf_write_autoconf(conf_get_tristate_name(),
+ print_symbol_for_tristate,
+ &comment_style_pound);
+ if (ret)
+ return ret;
+
/*
* Create include/config/auto.conf. This must be the last step because
* Kbuild has a dependency on auto.conf and this marks the successful
--
2.38.0.266.g481848f278


2022-11-09 15:58:26

by Nick Alcock

[permalink] [raw]
Subject: [PATCH v9 3/8] kbuild: generate an address ranges map at vmlinux link time

This emits a new file, .tmp_vmlinux.ranges, which maps address
range/size pairs in vmlinux to the object files which make them up,
e.g., in part:

0x0000000000000000 0x30 arch/x86/kernel/cpu/common.o
0x0000000000001000 0x1000 arch/x86/events/intel/ds.o
0x0000000000002000 0x4000 arch/x86/kernel/irq_64.o
0x0000000000006000 0x5000 arch/x86/kernel/process.o
0x000000000000b000 0x1000 arch/x86/kernel/cpu/common.o
0x000000000000c000 0x5000 arch/x86/mm/cpu_entry_area.o
0x0000000000011000 0x10 arch/x86/kernel/espfix_64.o
0x0000000000011010 0x2 arch/x86/kernel/cpu/common.o
[...]

In my simple tests this seems to work with clang too, but if I'm not
sure how stable the format of clang's linker mapfiles is: if it turns
out not to work in some versions, the mapfile-massaging awk script added
here might need some adjustment.

Signed-off-by: Nick Alcock <[email protected]>
Reviewed-by: Kris Van Hees <[email protected]>
---

Notes:
v6: use ${wl} where appropriate to avoid failure on UML

scripts/link-vmlinux.sh | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 918470d768e9..287a2b2c4d46 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -101,7 +101,7 @@ vmlinux_link()
${ld} ${ldflags} -o ${output} \
${wl}--whole-archive ${objs} ${wl}--no-whole-archive \
${wl}--start-group ${libs} ${wl}--end-group \
- $@ ${ldlibs}
+ ${wl}-Map=.tmp_vmlinux.map $@ ${ldlibs}
}

# generate .BTF typeinfo from DWARF debuginfo
@@ -144,6 +144,19 @@ kallsyms()
{
local kallsymopt;

+ # read the linker map to identify ranges of addresses:
+ # - for each *.o file, report address, size, pathname
+ # - most such lines will have four fields
+ # - but sometimes there is a line break after the first field
+ # - start reading at "Linker script and memory map"
+ # - stop reading at ".brk"
+ ${AWK} '
+ /\.o$/ && start==1 { print $(NF-2), $(NF-1), $NF }
+ /^Linker script and memory map/ { start = 1 }
+ /^\.brk/ { exit(0) }
+ ' .tmp_vmlinux.map | sort > .tmp_vmlinux.ranges
+
+ # get kallsyms options
if is_enabled CONFIG_KALLSYMS_ALL; then
kallsymopt="${kallsymopt} --all-symbols"
fi
--
2.38.0.266.g481848f278


2022-11-09 16:17:58

by Nick Alcock

[permalink] [raw]
Subject: [PATCH v9 8/8] perf: proof-of-concept kallmodsyms support

This is only very partial: it adds support to 'perf kallsyms', allowing
you to say, e.g.

% ./perf kallsyms __scsi_device_lookup
__scsi_device_lookup: scsi_mod (built-in) 0xffffffff9b901de0-0xffffffff9b901e30 (0xffffffff9b901de0-0xffffffff9b901e30)

and get told that this is in a built-in module. We also handle symbols
that are in multiple modules at once:

% ./perf kallsyms lio_set_fecparam
lio_set_fecparam: liquidio, liquidio_vf (built-in) 0xffffffff9b934da0-0xffffffff9b934e10 (0xffffffff9b934da0-0xffffffff9b934e10)

We do this the simplistic way, by augmenting symbols with a module array
field, the members of which are atoms in a new machine.modules red-black
tree (structured the same, and managed with the same code, as the
existing transient tree that is used to compare /proc/modules against
each other).

kallsyms symbols no longer carry a [module name] around with them that
needs cutting off whenever it's used (all users that relied on this
adjusted, I hope) but instead have that name hived off into the
per-symbol module array, with a new 'built_in' field to tell users
whether this is a built-in module or not. Since we cannot use the
presence of '[' to detect modules any more, we do it at kallmodsyms read
time by spotting _end and considering it to denote the end of the core
kernel and the start of the modular range. (I *think* this works on all
arches.)

Signed-off-by: Nick Alcock <[email protected]>
Reviewed-by: Kris Van Hees <[email protected]>
---
tools/perf/builtin-kallsyms.c | 35 +++++-
tools/perf/util/event.c | 14 ++-
tools/perf/util/machine.c | 6 +-
tools/perf/util/machine.h | 1 +
tools/perf/util/symbol.c | 207 +++++++++++++++++++++++++---------
tools/perf/util/symbol.h | 12 +-
6 files changed, 211 insertions(+), 64 deletions(-)

diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
index c08ee81529e8..6bcec2522d2d 100644
--- a/tools/perf/builtin-kallsyms.c
+++ b/tools/perf/builtin-kallsyms.c
@@ -35,10 +35,37 @@ static int __cmd_kallsyms(int argc, const char **argv)
continue;
}

- printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
- symbol->name, map->dso->short_name, map->dso->long_name,
- map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
- symbol->start, symbol->end);
+ if (!symbol->modules) {
+ printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+ symbol->name, map->dso->short_name, map->dso->long_name,
+ map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+ symbol->start, symbol->end);
+ } else {
+ if (!symbol->built_in)
+ printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+ symbol->name, map->dso->short_name, map->dso->long_name,
+ map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+ symbol->start, symbol->end);
+ else if (symbol->modules[1] == 0)
+ printf("%s: %s (built-in) %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+ symbol->name, symbol->modules[0], map->unmap_ip(map, symbol->start),
+ map->unmap_ip(map, symbol->end), symbol->start, symbol->end);
+ else { /* Symbol in multiple modules at once */
+ char **mod;
+
+ printf("%s: ", symbol->name);
+
+ for (mod = symbol->modules; *mod; mod++) {
+ if (mod != symbol->modules)
+ printf(", ");
+ printf("%s", *mod);
+ }
+
+ printf (" (built-in) %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+ map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+ symbol->start, symbol->end);
+ }
+ }
}

machine__delete(machine);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 1fa14598b916..a344b35f7e38 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -97,16 +97,28 @@ static int find_symbol_cb(void *arg, const char *name, char type,
u64 start)
{
struct process_symbol_args *args = arg;
+ char *chop, *tmp_alloc = NULL;
+ const char *tmp = name;
+
+ if ((chop = strchr(name, '\t')) != NULL) {
+ tmp_alloc = strndup(name, name - chop);
+ if (tmp_alloc == NULL)
+ return -ENOMEM;
+ tmp = tmp_alloc;
+ }

/*
* Must be a function or at least an alias, as in PARISC64, where "_text" is
* an 'A' to the same address as "_stext".
*/
if (!(kallsyms__is_function(type) ||
- type == 'A') || strcmp(name, args->name))
+ type == 'A') || strcmp(tmp, args->name)) {
+ free(tmp_alloc);
return 0;
+ }

args->start = start;
+ free(tmp_alloc);
return 1;
}

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 76316e459c3d..2be5a3c1a267 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -173,7 +173,7 @@ struct machine *machine__new_kallsyms(void)
* ask for not using the kcore parsing code, once this one is fixed
* to create a map per module.
*/
- if (machine && machine__load_kallsyms(machine, "/proc/kallsyms") <= 0) {
+ if (machine && machine__load_kallsyms(machine, "/proc/kallmodsyms") <= 0) {
machine__delete(machine);
machine = NULL;
}
@@ -237,6 +237,7 @@ void machine__exit(struct machine *machine)
zfree(&machine->mmap_name);
zfree(&machine->current_tid);
zfree(&machine->kallsyms_filename);
+ modules__delete_modules(&machine->modules);

for (i = 0; i < THREADS__TABLE_SIZE; i++) {
struct threads *threads = &machine->threads[i];
@@ -1410,7 +1411,8 @@ int machines__create_kernel_maps(struct machines *machines, pid_t pid)
int machine__load_kallsyms(struct machine *machine, const char *filename)
{
struct map *map = machine__kernel_map(machine);
- int ret = __dso__load_kallsyms(map->dso, filename, map, true);
+ int ret = __dso__load_kallsyms(map->dso, filename, map, &machine->modules,
+ true);

if (ret > 0) {
dso__set_loaded(map->dso);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 74935dfaa937..393063840cd1 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -55,6 +55,7 @@ struct machine {
struct dsos dsos;
struct maps *kmaps;
struct map *vmlinux_map;
+ struct rb_root modules;
u64 kernel_start;
pid_t *current_tid;
size_t current_tid_sz;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a3a165ae933a..aab7ffdd0573 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -41,10 +41,16 @@
#include <symbol/kallsyms.h>
#include <sys/utsname.h>

-static int dso__load_kernel_sym(struct dso *dso, struct map *map);
+static int dso__load_kernel_sym(struct dso *dso, struct map *map,
+ struct rb_root *modules);
static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
static bool symbol__is_idle(const char *name);

+static int read_proc_modules(const char *filename, struct rb_root *modules);
+static struct module_info *find_module(const char *name,
+ struct rb_root *modules);
+static void add_module(struct module_info *mi, struct rb_root *modules);
+
int vmlinux_path__nr_entries;
char **vmlinux_path;

@@ -85,6 +91,12 @@ static enum dso_binary_type binary_type_symtab[] = {

#define DSO_BINARY_TYPE__SYMTAB_CNT ARRAY_SIZE(binary_type_symtab)

+struct module_info {
+ struct rb_node rb_node;
+ char *name;
+ u64 start;
+};
+
static bool symbol_type__filter(char symbol_type)
{
symbol_type = toupper(symbol_type);
@@ -234,15 +246,10 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
* kernel text segment and beginning of first module's text
* segment is very big. Therefore do not fill this gap and do
* not assign it to the kernel dso map (kallsyms).
- *
- * In kallsyms, it determines module symbols using '[' character
- * like in:
- * ffffffffc1937000 T hdmi_driver_init [snd_hda_codec_hdmi]
*/
if (prev->end == prev->start) {
/* Last kernel/module symbol mapped to end of page */
- if (is_kallsyms && (!strchr(prev->name, '[') !=
- !strchr(curr->name, '[')))
+ if (is_kallsyms && prev->built_in != curr->built_in)
prev->end = roundup(prev->end + 4096, 4096);
else
prev->end = curr->start;
@@ -301,6 +308,8 @@ struct symbol *symbol__new(u64 start, u64 len, u8 binding, u8 type, const char *
sym->type = type;
sym->binding = binding;
sym->namelen = namelen - 1;
+ sym->modules = NULL;
+ sym->built_in = 0;

pr_debug4("%s: %s %#" PRIx64 "-%#" PRIx64 "\n",
__func__, name, start, sym->end);
@@ -318,6 +327,7 @@ void symbol__delete(struct symbol *sym)
annotation__exit(notes);
}
}
+ free(sym->modules);
free(((void *)sym) - symbol_conf.priv_size);
}

@@ -716,12 +726,37 @@ static bool symbol__is_idle(const char *name)
return strlist__has_entry(idle_symbols_list, name);
}

-static int map__process_kallsym_symbol(void *arg, const char *name,
+struct process_kallsym_symbol_arg {
+ struct dso *dso;
+ struct rb_root *modules;
+ int seen_end;
+};
+
+static int map__process_kallsym_symbol(void *arg_, const char *name,
char type, u64 start)
{
struct symbol *sym;
- struct dso *dso = arg;
+ struct process_kallsym_symbol_arg *arg = arg_;
+ struct dso *dso = arg->dso;
struct rb_root_cached *root = &dso->symbols;
+ struct rb_root *modules = arg->modules;
+ char *module;
+ const char *modulep;
+ int counting = 1;
+ size_t nmods = 0;
+ char **mods = NULL;
+ char **modp = NULL;
+
+ /*
+ * Split off the modules part.
+ */
+ if ((module = strchr(name, '\t')) != NULL) {
+ *module = 0;
+ module++;
+ }
+
+ if (strcmp(name, "_end") == 0)
+ arg->seen_end = 1;

if (!symbol_type__filter(type))
return 0;
@@ -731,18 +766,88 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
return 0;

/*
- * module symbols are not sorted so we add all
- * symbols, setting length to 0, and rely on
- * symbols__fixup_end() to fix it up.
+ * non-builtin module symbols are not sorted so we add all symbols,
+ * setting length to 0, and rely on symbols__fixup_end() to fix it up.
*/
sym = symbol__new(start, 0, kallsyms2elf_binding(type), kallsyms2elf_type(type), name);
if (sym == NULL)
return -ENOMEM;
+
+ sym->built_in = !arg->seen_end;
+
+ /*
+ * Pass over the modules list twice: once to count the number of
+ * modules this symbol is part of and allocate an array to store their
+ * names, then again to fill it out.
+ *
+ * Arguably inefficient, due to one allocation per built-in symbol, even
+ * though many symbols will have the same mods array. In practice,
+ * it's just too small a waste to matter. The module names are pointers
+ * into the machine->modules rb-tree (lazily populated here).
+ */
+
+fill:
+ modulep = module;
+ while (modulep && (modulep = strchr(modulep, '[')) != NULL) {
+ struct module_info *mi;
+ const char *end_bra = strchr(modulep, ']');
+
+ modulep++;
+ if (end_bra == NULL || end_bra <= modulep)
+ continue;
+
+ if (counting) {
+ nmods++;
+ continue;
+ }
+
+ /*
+ * Fill-out phase.
+ */
+
+ *modp = strndup(modulep, end_bra - modulep);
+ if (*modp == NULL) {
+ free(mods);
+ return -ENOMEM;
+ }
+
+ mi = find_module(*modp, modules);
+ if (!mi) {
+ mi = zalloc(sizeof(struct module_info));
+
+ if (!mi) {
+ free (mods);
+ free (*modp);
+ return -ENOMEM;
+ }
+ mi->name = *modp;
+ }
+ else {
+ free(*modp);
+ *modp = mi->name;
+ }
+
+ modp++;
+ }
+
+ if (counting && nmods > 0) {
+ mods = calloc(nmods + 1, sizeof (char *));
+ if (mods == NULL)
+ return -ENOMEM;
+ modp = mods;
+
+ counting = 0;
+ goto fill;
+ }
+
+ sym->modules = mods;
+
/*
* We will pass the symbols to the filter later, in
- * map__split_kallsyms, when we have split the maps per module
+ * map__split_kallsyms, when we have split the maps per
+ * (non-built-in) module
*/
- __symbols__insert(root, sym, !strchr(name, '['));
+ __symbols__insert(root, sym, !arg->seen_end);

return 0;
}
@@ -752,9 +857,11 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
* so that we can in the next step set the symbol ->end address and then
* call kernel_maps__split_kallsyms.
*/
-static int dso__load_all_kallsyms(struct dso *dso, const char *filename)
+static int dso__load_all_kallsyms(struct dso *dso, const char *filename,
+ struct rb_root *modules)
{
- return kallsyms__parse(filename, dso, map__process_kallsym_symbol);
+ struct process_kallsym_symbol_arg arg = {dso, modules, 0};
+ return kallsyms__parse(filename, &arg, map__process_kallsym_symbol);
}

static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
@@ -766,22 +873,14 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
struct rb_root_cached *root = &dso->symbols;
struct rb_node *next = rb_first_cached(root);

- if (!kmaps)
- return -1;
-
*root = RB_ROOT_CACHED;

while (next) {
- char *module;
-
pos = rb_entry(next, struct symbol, rb_node);
next = rb_next(&pos->rb_node);

rb_erase_cached(&pos->rb_node, &old_root);
RB_CLEAR_NODE(&pos->rb_node);
- module = strchr(pos->name, '\t');
- if (module)
- *module = '\0';

curr_map = maps__find(kmaps, pos->start);

@@ -830,19 +929,19 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
x86_64 = machine__is(machine, "x86_64");

while (next) {
- char *module;
-
pos = rb_entry(next, struct symbol, rb_node);
next = rb_next(&pos->rb_node);

- module = strchr(pos->name, '\t');
- if (module) {
+ if (!pos->built_in && pos->modules) {
if (!symbol_conf.use_modules)
goto discard_symbol;

- *module++ = '\0';
-
- if (strcmp(curr_map->dso->short_name, module)) {
+ /*
+ * Non-built-in symbols can only be in one module at
+ * once.
+ */
+ assert(pos->modules[1] == NULL);
+ if (strcmp(curr_map->dso->short_name, pos->modules[0])) {
if (curr_map != initial_map &&
dso->kernel == DSO_SPACE__KERNEL_GUEST &&
machine__is_default_guest(machine)) {
@@ -856,12 +955,12 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
dso__set_loaded(curr_map->dso);
}

- curr_map = maps__find_by_name(kmaps, module);
+ curr_map = maps__find_by_name(kmaps, pos->modules[0]);
if (curr_map == NULL) {
pr_debug("%s/proc/{kallsyms,modules} "
"inconsistency while looking "
"for \"%s\" module!\n",
- machine->root_dir, module);
+ machine->root_dir, pos->modules[0]);
curr_map = initial_map;
goto discard_symbol;
}
@@ -971,12 +1070,6 @@ bool symbol__restricted_filename(const char *filename,
return restricted;
}

-struct module_info {
- struct rb_node rb_node;
- char *name;
- u64 start;
-};
-
static void add_module(struct module_info *mi, struct rb_root *modules)
{
struct rb_node **p = &modules->rb_node;
@@ -995,7 +1088,7 @@ static void add_module(struct module_info *mi, struct rb_root *modules)
rb_insert_color(&mi->rb_node, modules);
}

-static void delete_modules(struct rb_root *modules)
+void modules__delete_modules(struct rb_root *modules)
{
struct module_info *mi;
struct rb_node *next = rb_first(modules);
@@ -1060,7 +1153,7 @@ static int read_proc_modules(const char *filename, struct rb_root *modules)
return -1;

if (modules__parse(filename, modules, __read_proc_modules)) {
- delete_modules(modules);
+ modules__delete_modules(modules);
return -1;
}

@@ -1101,9 +1194,9 @@ int compare_proc_modules(const char *from, const char *to)
if (!from_node && !to_node)
ret = 0;

- delete_modules(&to_modules);
+ modules__delete_modules(&to_modules);
out_delete_from:
- delete_modules(&from_modules);
+ modules__delete_modules(&from_modules);

return ret;
}
@@ -1133,7 +1226,7 @@ static int do_validate_kcore_modules(const char *filename, struct maps *kmaps)
}
}
out:
- delete_modules(&modules);
+ modules__delete_modules(&modules);
return err;
}

@@ -1467,18 +1560,20 @@ static int kallsyms__delta(struct kmap *kmap, const char *filename, u64 *delta)
}

int __dso__load_kallsyms(struct dso *dso, const char *filename,
- struct map *map, bool no_kcore)
+ struct map *map, struct rb_root *modules,
+ bool no_kcore)
{
struct kmap *kmap = map__kmap(map);
u64 delta = 0;

- if (symbol__restricted_filename(filename, "/proc/kallsyms"))
+ if (symbol__restricted_filename(filename, "/proc/kallsyms") &&
+ symbol__restricted_filename(filename, "/proc/kallmodsyms"))
return -1;

if (!kmap || !kmap->kmaps)
return -1;

- if (dso__load_all_kallsyms(dso, filename) < 0)
+ if (dso__load_all_kallsyms(dso, filename, modules) < 0)
return -1;

if (kallsyms__delta(kmap, filename, &delta))
@@ -1499,9 +1594,9 @@ int __dso__load_kallsyms(struct dso *dso, const char *filename,
}

int dso__load_kallsyms(struct dso *dso, const char *filename,
- struct map *map)
+ struct map *map, struct rb_root *modules)
{
- return __dso__load_kallsyms(dso, filename, map, false);
+ return __dso__load_kallsyms(dso, filename, map, modules, false);
}

static int dso__load_perf_map(const char *map_path, struct dso *dso)
@@ -1814,12 +1909,13 @@ int dso__load(struct dso *dso, struct map *map)
dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;

if (dso->kernel && !kmod) {
+ machine = map__kmaps(map)->machine;
+
if (dso->kernel == DSO_SPACE__KERNEL)
- ret = dso__load_kernel_sym(dso, map);
+ ret = dso__load_kernel_sym(dso, map, &machine->modules);
else if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
ret = dso__load_guest_kernel_sym(dso, map);

- machine = map__kmaps(map)->machine;
if (machine__is(machine, "x86_64"))
machine__map_x86_64_entry_trampolines(machine, dso);
goto out;
@@ -2220,7 +2316,8 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map)
return strdup(path);
}

-static int dso__load_kernel_sym(struct dso *dso, struct map *map)
+static int dso__load_kernel_sym(struct dso *dso, struct map *map,
+ struct rb_root *modules)
{
int err;
const char *kallsyms_filename = NULL;
@@ -2282,7 +2379,7 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map)
kallsyms_filename = kallsyms_allocated_filename;

do_kallsyms:
- err = dso__load_kallsyms(dso, kallsyms_filename, map);
+ err = dso__load_kallsyms(dso, kallsyms_filename, map, modules);
if (err > 0)
pr_debug("Using %s for symbols\n", kallsyms_filename);
free(kallsyms_allocated_filename);
@@ -2323,11 +2420,11 @@ static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map)
if (!kallsyms_filename)
return -1;
} else {
- sprintf(path, "%s/proc/kallsyms", machine->root_dir);
+ sprintf(path, "%s/proc/kallmodsyms", machine->root_dir);
kallsyms_filename = path;
}

- err = dso__load_kallsyms(dso, kallsyms_filename, map);
+ err = dso__load_kallsyms(dso, kallsyms_filename, map, &machine->modules);
if (err > 0)
pr_debug("Using %s for symbols\n", kallsyms_filename);
if (err > 0 && !dso__is_kcore(dso)) {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 0b893dcc8ea6..9ca218e09acf 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -66,6 +66,11 @@ struct symbol {
u8 annotate2:1;
/** Architecture specific. Unused except on PPC where it holds st_other. */
u8 arch_sym;
+ /** Null-terminated array of pointers to names of containing modules in the
+ modules red-black tree. May be NULL for none. */
+ char **modules;
+ /** Set if this symbol is built in to the core kernel. */
+ int built_in;
/** The name of length namelen associated with the symbol. */
char name[];
};
@@ -137,8 +142,9 @@ int dso__load_vmlinux(struct dso *dso, struct map *map,
const char *vmlinux, bool vmlinux_allocated);
int dso__load_vmlinux_path(struct dso *dso, struct map *map);
int __dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
- bool no_kcore);
-int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map);
+ struct rb_root *modules, bool no_kcore);
+int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
+ struct rb_root *modules);

void dso__insert_symbol(struct dso *dso,
struct symbol *sym);
@@ -161,6 +167,8 @@ int sysfs__read_build_id(const char *filename, struct build_id *bid);
int modules__parse(const char *filename, void *arg,
int (*process_module)(void *arg, const char *name,
u64 start, u64 size));
+void modules__delete_modules(struct rb_root *modules);
+
int filename__read_debuglink(const char *filename, char *debuglink,
size_t size);

--
2.38.0.266.g481848f278


2022-11-10 04:11:17

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v9 1/8] kbuild: bring back tristate.conf

On Wed, Nov 09, 2022 at 01:41:25PM +0000, Nick Alcock wrote:
> tristate.conf was dropped because it is not needed to build a
> modules.builtin (although dropping it introduces a few false positives
> into modules.builtin support), and doing so avoids one round of
> recursion through the build tree to build it. But kallmodsyms support
> requires building a mapping from object file name to built-in module
> name for all builtin modules: this seems to me impossible to accomplish
> without parsing all makefiles under the influence of tristate.conf,
> since the makefiles are the only place this mapping is recorded.
>
> So bring it back for this purpose. (Thanks to the refactoring in
> the 5.16 timeframe, this is basically a reimplementation of commit
> 8b41fc4454e36fbfdbb23f940d023d4dece2de29 rather than a simple
> reversion.)
>
> Signed-off-by: Nick Alcock <[email protected]>
> Reviewed-by: Victor Erminpour <[email protected]>
> Reviewed-by: Kris Van Hees <[email protected]>

I don't see the need, see the review of next patch.

Luis


2022-11-13 04:04:18

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v9 3/8] kbuild: generate an address ranges map at vmlinux link time

On Wed, Nov 09, 2022 at 01:41:27PM +0000, Nick Alcock wrote:
> This emits a new file, .tmp_vmlinux.ranges, which maps address
> range/size pairs in vmlinux to the object files which make them up,
> e.g., in part:
>
> 0x0000000000000000 0x30 arch/x86/kernel/cpu/common.o
> 0x0000000000001000 0x1000 arch/x86/events/intel/ds.o
> 0x0000000000002000 0x4000 arch/x86/kernel/irq_64.o
> 0x0000000000006000 0x5000 arch/x86/kernel/process.o
> 0x000000000000b000 0x1000 arch/x86/kernel/cpu/common.o
> 0x000000000000c000 0x5000 arch/x86/mm/cpu_entry_area.o
> 0x0000000000011000 0x10 arch/x86/kernel/espfix_64.o
> 0x0000000000011010 0x2 arch/x86/kernel/cpu/common.o
> [...]

This does't say why we'd want this. So either you merge it with its
first user or you explain here why anyone might find this useful.

Luis

2022-11-13 04:20:16

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v9 7/8] kallsyms: add /proc/kallmodsyms for text symbol disambiguation

On Wed, Nov 09, 2022 at 01:41:31PM +0000, Nick Alcock wrote:
> This helps disambiguate symbols with identical names when some are in
> built-in modules are some are not, but if symbols are still ambiguous,
> {object file names} are added as needed to disambiguate them.

*Why* would we ever want to trouble ourselves with expanding all this
data into the kernel? The commit log does a poor effort to describe
any value-add doing this could ever have.

> I am not wedded to the name or format of /proc/kallmodsyms, but felt it
> best to split it out of /proc/kallsyms to avoid breaking existing
> kallsyms parsers.

I'd like much more review from other parties other than Oracle on this then.
If this is generally useful, I'd image interested parties would be raving
about all this but so far I don't see it. Please try to engage folks
outside of Oracle.

Luis

2022-11-14 17:21:01

by Nick Alcock

[permalink] [raw]
Subject: Re: [PATCH v9 3/8] kbuild: generate an address ranges map at vmlinux link time

[Added Steve Rostedt to Cc:]

On 13 Nov 2022, Luis Chamberlain stated:

> On Wed, Nov 09, 2022 at 01:41:27PM +0000, Nick Alcock wrote:
>> This emits a new file, .tmp_vmlinux.ranges, which maps address
>> range/size pairs in vmlinux to the object files which make them up,
>> e.g., in part:
>>
>> 0x0000000000000000 0x30 arch/x86/kernel/cpu/common.o
>> 0x0000000000001000 0x1000 arch/x86/events/intel/ds.o
>> 0x0000000000002000 0x4000 arch/x86/kernel/irq_64.o
>> 0x0000000000006000 0x5000 arch/x86/kernel/process.o
>> 0x000000000000b000 0x1000 arch/x86/kernel/cpu/common.o
>> 0x000000000000c000 0x5000 arch/x86/mm/cpu_entry_area.o
>> 0x0000000000011000 0x10 arch/x86/kernel/espfix_64.o
>> 0x0000000000011010 0x2 arch/x86/kernel/cpu/common.o
>> [...]
>
> This does't say why we'd want this. So either you merge it with its
> first user or you explain here why anyone might find this useful.

Uh... the first user is later in this patch series? If you want each
commit to have a self-contained explanation, I could certainly note why
it's useful for said first user in this commit message (and adjust other
messages similarly), but I had previous complaints that commit log
messages and the cover letter were repeating points, so I was trying to
reduce that kind of thing.

--
NULL && (void)

2022-11-14 17:37:25

by Nick Alcock

[permalink] [raw]
Subject: Re: [PATCH v9 7/8] kallsyms: add /proc/kallmodsyms for text symbol disambiguation

On 13 Nov 2022, Luis Chamberlain said:

> On Wed, Nov 09, 2022 at 01:41:31PM +0000, Nick Alcock wrote:
>> This helps disambiguate symbols with identical names when some are in
>> built-in modules are some are not, but if symbols are still ambiguous,
>> {object file names} are added as needed to disambiguate them.
>
> *Why* would we ever want to trouble ourselves with expanding all this
> data into the kernel? The commit log does a poor effort to describe
> any value-add doing this could ever have.

Er... the cover letter says:

> The whole point of symbols is that their names are unique: you can look up a
> symbol and get back a unique address, and vice versa. Alas, because
> /proc/kallsyms (rightly) reports all symbols, even hidden ones, it does not
> really satisfy this requirement. Large numbers of symbols are duplicated
> many times (just search for __list_del_entry!), and while usually these are
> just out-of-lined things defined in header files and thus all have the same
> implementation, it does make it needlessly hard to figure out which one is
> which in stack dumps, when tracing, and such things. Right now the kernel
> has no way at all to tell these apart, and nor has the user: their address
> differs and that's all. Which module did they come from? Which object
> file? We don't know. Figuring out which is which when tracing needs a
> combination of guesswork and luck. In discussions at LPC it became clear
> that this is not just annoying me but Steve Rostedt and others, so it's
> probably desirable to fix this.

This *is* the motivation. Previous iterations of this series only added
module names, but that doesn't disambiguate all symbols, and only
*partially* disambiguating symbols isn't really much use. If all symbols
can be completely unambiguously identified (via a triplet of (name,
module, translation unit), and mapped to a single address, you can be
sure that you can unambiguously cite a single such triple and get a
single address back, and vice versa: e.g. trace output could finally
give you names that you could be sure came from one specific place, and
thus often with one particular caller, even if that symbol appears in
fifty different places in the kernel with callers in fifty different
translation units that do quite different things.

(Plus, with notational additions in tracers, you could in future use
this to trace, say, only *one* instance of __list_del_entry, rather than
being forced to either trace all of them or none, or guess which entry
was which and do a tiresome binary search of repeated traces to get the
right one after lots of trials.)

(And also it's not actually that much data any more: 10KiB or so. :) )

I can add some of this to the commit log too if you like. (As noted in
earlier messages -- which you haven't yet had time to read -- I was
trying to keep that sort of duplication down, perhaps unwisely.)

>> I am not wedded to the name or format of /proc/kallmodsyms, but felt it
>> best to split it out of /proc/kallsyms to avoid breaking existing
>> kallsyms parsers.
>
> I'd like much more review from other parties other than Oracle on this then.

Well, yes. That's what these postings are all about. If I was supposed
to get review from someone else as well, I'm happy to add those people
to the Cc: of future iterations.

2022-11-15 21:26:44

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v9 3/8] kbuild: generate an address ranges map at vmlinux link time

On Mon, Nov 14, 2022 at 04:48:57PM +0000, Nick Alcock wrote:
> [Added Steve Rostedt to Cc:]
>
> On 13 Nov 2022, Luis Chamberlain stated:
>
> > On Wed, Nov 09, 2022 at 01:41:27PM +0000, Nick Alcock wrote:
> >> This emits a new file, .tmp_vmlinux.ranges, which maps address
> >> range/size pairs in vmlinux to the object files which make them up,
> >> e.g., in part:
> >>
> >> 0x0000000000000000 0x30 arch/x86/kernel/cpu/common.o
> >> 0x0000000000001000 0x1000 arch/x86/events/intel/ds.o
> >> 0x0000000000002000 0x4000 arch/x86/kernel/irq_64.o
> >> 0x0000000000006000 0x5000 arch/x86/kernel/process.o
> >> 0x000000000000b000 0x1000 arch/x86/kernel/cpu/common.o
> >> 0x000000000000c000 0x5000 arch/x86/mm/cpu_entry_area.o
> >> 0x0000000000011000 0x10 arch/x86/kernel/espfix_64.o
> >> 0x0000000000011010 0x2 arch/x86/kernel/cpu/common.o
> >> [...]
> >
> > This does't say why we'd want this. So either you merge it with its
> > first user or you explain here why anyone might find this useful.
>
> Uh... the first user is later in this patch series? If you want each
> commit to have a self-contained explanation, I could certainly note why
> it's useful for said first user in this commit message (and adjust other
> messages similarly), but I had previous complaints that commit log
> messages and the cover letter were repeating points, so I was trying to
> reduce that kind of thing.

Commit logs should be self contained. The reason for *why* we want to
add ranges should go here, not the cover letter. You can be terse in the
cover letter over the general solution.

Luis


2022-11-15 21:45:05

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v9 7/8] kallsyms: add /proc/kallmodsyms for text symbol disambiguation

On Mon, Nov 14, 2022 at 04:57:17PM +0000, Nick Alcock wrote:
> > I'd like much more review from other parties other than Oracle on this then.
>
> Well, yes. That's what these postings are all about. If I was supposed
> to get review from someone else as well, I'm happy to add those people
> to the Cc: of future iterations.

Yes, given Zhen is active on the same exact area I figured his input
would be of great value here too. I'd be wonderful if you can get
those eager to see this upstream on the toolside too.

Luis

2022-11-16 16:10:55

by Nick Alcock

[permalink] [raw]
Subject: Re: [PATCH v9 3/8] kbuild: generate an address ranges map at vmlinux link time

On 15 Nov 2022, Luis Chamberlain verbalised:

> On Mon, Nov 14, 2022 at 04:48:57PM +0000, Nick Alcock wrote:
>> [Added Steve Rostedt to Cc:]
>>
>> On 13 Nov 2022, Luis Chamberlain stated:
>> > This does't say why we'd want this. So either you merge it with its
>> > first user or you explain here why anyone might find this useful.
>>
>> Uh... the first user is later in this patch series? If you want each
>> commit to have a self-contained explanation, I could certainly note why
>> it's useful for said first user in this commit message (and adjust other
>> messages similarly), but I had previous complaints that commit log
>> messages and the cover letter were repeating points, so I was trying to
>> reduce that kind of thing.
>
> Commit logs should be self contained. The reason for *why* we want to
> add ranges should go here, not the cover letter. You can be terse in the
> cover letter over the general solution.

Ah, OK, my apologies: I was writing things precisely backwards then
(rationale in the cover letter, self-contained-but-no-rationale in the
individual patches then tying it together in the user at the end).
Whoops!

I'll rephrase as you suggest in the next round.

(And I'll definitely rephrase the cover letter again -- I tried to add a
decent rationale in there but from the sounds of things comprehensively
failed this time. Sorry!)

--
NULL && (void)