References: <20220208184309.148192-1-nick.alcock@oracle.com>
    <20220208184309.148192-5-nick.alcock@oracle.com>
In-Reply-To: <20220208184309.148192-5-nick.alcock@oracle.com>
From: Masahiro Yamada
Date: Thu, 10 Feb 2022 09:48:26 +0900
Subject: Re: [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules
To: Nick Alcock
Cc: "Luis R. Rodriguez", Jiri Olsa, Steven Rostedt, bas@baslab.org,
    tglozar@gmail.com, Ast-x64@protonmail.com, viktor.malik@gmail.com,
    Daniel Xu, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen,
    Ian Rogers, Linux Kbuild mailing list, linux-modules,
    Linux Kernel Mailing List, Arnd Bergmann, Andrew Morton, Eugene Loh,
    Kris Van Hees
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 9, 2022 at 3:44 AM Nick Alcock wrote:
>
> The mapping consists of three new symbols, computed by integrating the
> information in the (just-added) .tmp_vmlinux.ranges and
> modules_thick.builtin: taken together, they map address ranges
> (corresponding to object files on the input) to the names of zero or
> more modules containing those address ranges.
>
> - kallsyms_module_addresses/kallsyms_module_offsets encodes the
>   address/offset of each object file (derived from the linker map), in
>   exactly the same way as kallsyms_addresses/kallsyms_offsets does
>   for symbols. There is no size: instead, the object files are assumed
>   to tile the address space. (This is slightly more space-efficient
>   than using a size). Non-text-section addresses are skipped: for now,
>   all the users of this interface only need module/non-module
>   information for instruction pointer addresses, not absolute-addressed
>   symbols and the like. This restriction can easily be lifted in
>   future. (Regarding the name: right now the entries correspond pretty
>   closely to object files, so we could call the section
>   kallsyms_objfiles or something, but the optimizer added in the next
>   commit will change this.)
>
> - kallsyms_module_names encodes the name of each module in a modified
>   form of strtab: notably, if an object file appears in *multiple*
>   modules, all of which are built in, this is encoded via a zero byte,
>   a one-byte module count, then a series of that many null-terminated
>   strings. As a special case, the table starts with a single zero byte
>   which does *not* represent the start of a multi-module list.
>
> - kallsyms_modules connects the two, encoding a table associated 1:1
>   with kallsyms_module_addresses / kallsyms_module_offsets, pointing
>   at an offset in kallsyms_module_names describing which module (or
>   modules, for a multi-module list) the code occupying this address
>   range is part of. If an address range is part of no module (always
>   built-in) it points at 0 (the null byte at the start of the
>   kallsyms_module_names list).
>
> There is no optimization yet: kallsyms_modules and
> kallsyms_module_names will almost certainly contain many duplicate
> entries, and kallsyms_module_{addresses,offsets} may contain
> consecutive entries that point to the same place. The size hit is
> fairly substantial as a result, though still much less than a naive
> implementation mapping each symbol to a module name would be: 50KiB or
> so.
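
For anyone following along, the encoding described above can be sketched roughly as below. This is only an illustration, assuming plain in-memory arrays instead of the real kallsyms_* symbols; the struct, the function names (modmap, find_range, modules_for_addr) and the example tables are made up here and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the three tables (not the kernel's symbols). */
struct modmap {
	const unsigned long *addrs;  /* sorted range starts; ranges tile the space */
	const unsigned int *modules; /* 1:1 with addrs: offset into names */
	size_t nranges;
	const char *names;           /* modified strtab, starts with a zero byte */
};

/* Index of the last range whose start is <= addr, or -1 if there is none. */
static long find_range(const struct modmap *m, unsigned long addr)
{
	size_t lo = 0, hi = m->nranges;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (m->addrs[mid] <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (long)lo - 1;
}

/*
 * Store up to max module names covering addr in out[] and return how many
 * modules cover it. 0 means the code is always built in.
 */
static size_t modules_for_addr(const struct modmap *m, unsigned long addr,
			       const char **out, size_t max)
{
	long i = find_range(m, addr);
	const char *p;
	size_t count, n;

	if (i < 0 || m->modules[i] == 0) /* offset 0: leading zero byte */
		return 0;
	p = m->names + m->modules[i];
	if (*p != '\0') {                /* ordinary single-module entry */
		if (max > 0)
			out[0] = p;
		return 1;
	}
	count = (unsigned char)p[1];     /* zero byte, then one-byte count */
	p += 2;
	for (n = 0; n < count; n++) {
		if (n < max)
			out[n] = p;
		p += strlen(p) + 1;
	}
	return count;
}

/* Example tables: no module, "ext4", then a two-module list. */
static const unsigned long example_addrs[] = { 0x1000, 0x2000, 0x3000 };
static const unsigned int example_modules[] = { 0, 1, 6 };
static const char example_names[] = "\0" "ext4\0" "\0\002" "crc32\0" "libcrc";
static const struct modmap example_map = {
	example_addrs, example_modules, 3, example_names
};
```

With these example tables, an address in the first range yields no module, one in the second yields "ext4", and one in the third yields both "crc32" and "libcrc". Because there is no size field, every lookup is just "last start at or below the address".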
>
> Signed-off-by: Nick Alcock
> Reviewed-by: Kris Van Hees
> ---
>  Makefile           |   2 +-
>  init/Kconfig       |   8 +
>  scripts/Makefile   |   6 +
>  scripts/kallsyms.c | 366 +++++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 371 insertions(+), 11 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 5e823fe8390f..b719244cb571 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1151,7 +1151,7 @@ cmd_link-vmlinux = \
>         $(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)"; \
>         $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>
> -vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
> +vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) modules_thick.builtin FORCE
>         +$(call if_changed_dep,link-vmlinux)
>
>  targets := vmlinux
> diff --git a/init/Kconfig b/init/Kconfig
> index e9119bf54b1f..e1ca3d70cb1c 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1530,6 +1530,14 @@ config POSIX_TIMERS
>
>           If unsure say y.
>
> +config KALLMODSYMS
> +       default y
> +       bool "Enable support for /proc/kallmodsyms" if EXPERT
> +       depends on KALLSYMS
> +       help
> +         This option enables the /proc/kallmodsyms file, which maps symbols
> +         to addresses and their associated modules.
> +
>  config PRINTK
>         default y
>         bool "Enable support for printk" if EXPERT
> diff --git a/scripts/Makefile b/scripts/Makefile
> index ce5aa9030b74..c5cc4ac3d660 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -29,6 +29,12 @@ ifdef CONFIG_BUILDTIME_MCOUNT_SORT
>  HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
>  endif
>
> +kallsyms-objs := kallsyms.o
> +
> +ifdef CONFIG_KALLMODSYMS
> +kallsyms-objs += modules_thick.o
> +endif
> +
>  # The following programs are only built on demand
>  hostprogs += unifdef
>
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index 54ad86d13784..8f87b724d0fa 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -5,7 +5,10 @@
>   * This software may be used and distributed according to the terms
>   * of the GNU General Public License, incorporated herein by reference.
>   *
> - * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
> + * Usage: nm -n vmlinux
> + *        | scripts/kallsyms [--all-symbols] [--absolute-percpu]
> + *             [--base-relative] [--builtin=modules_thick.builtin]
> + *        > symbols.S
>   *
>   * Table compression uses all the unused char codes on the symbols and
>   * maps these to the most used substrings (tokens). For instance, it might
> @@ -24,6 +27,10 @@
>  #include
>  #include
>  #include
> +#include
> +#include "modules_thick.h"
> +
> +#include "../include/generated/autoconf.h"

I do not remember if I had pointed this out before,
but including autoconf.h from a host program is wrong.

Do not use ifdef CONFIG_... in the hostprog code.
Having --builtin=modules_thick.builtin is enough.

-- 
Best Regards
Masahiro Yamada
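
A rough sketch of the idea in the review comment: the host program chooses its behaviour from its command line at run time, so it needs neither autoconf.h nor an ifdef on CONFIG_KALLMODSYMS. The option names are taken from the usage comment in the patch. The struct and helper names (kallsyms_opts, parse_opts, want_module_tables) are hypothetical, and this is not how scripts/kallsyms.c actually parses its arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option record for the host tool. */
struct kallsyms_opts {
	int all_symbols;
	int absolute_percpu;
	int base_relative;
	const char *builtin; /* NULL when --builtin= was not given */
};

/* Returns 0 on success, -1 on an unrecognized argument. */
static int parse_opts(int argc, char **argv, struct kallsyms_opts *o)
{
	static const char builtin_pfx[] = "--builtin=";
	const size_t pfx_len = sizeof(builtin_pfx) - 1;
	int i;

	o->all_symbols = 0;
	o->absolute_percpu = 0;
	o->base_relative = 0;
	o->builtin = NULL;

	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--all-symbols") == 0)
			o->all_symbols = 1;
		else if (strcmp(argv[i], "--absolute-percpu") == 0)
			o->absolute_percpu = 1;
		else if (strcmp(argv[i], "--base-relative") == 0)
			o->base_relative = 1;
		else if (strncmp(argv[i], builtin_pfx, pfx_len) == 0)
			o->builtin = argv[i] + pfx_len;
		else
			return -1;
	}
	return 0;
}

/*
 * The module tables are emitted only when a builtin file was passed.
 * This run-time check stands in for the compile-time CONFIG test.
 */
static int want_module_tables(const struct kallsyms_opts *o)
{
	return o->builtin != NULL;
}
```

With this shape, the build system passes --builtin=modules_thick.builtin only when the feature is enabled, and the same host binary serves both configurations.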