Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp4094275rwl; Mon, 10 Apr 2023 06:10:47 -0700 (PDT) X-Google-Smtp-Source: AKy350bgXLxB3FLdfGZ0ZLQtlELQVyctHXDsxAtdOx0y9tRJhE6j1tn80GVjOOnxH/LeQYNXnHtN X-Received: by 2002:a17:906:2a48:b0:933:4d37:82b2 with SMTP id k8-20020a1709062a4800b009334d3782b2mr6665712eje.57.1681132247395; Mon, 10 Apr 2023 06:10:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681132247; cv=none; d=google.com; s=arc-20160816; b=MVn3Qch+YxQG22IMkZBoef7Vbn4wKL1P1ZQGGnqVDtGf5FhqQfqPvc8r802UvSLMXU WwM+hbDH8aplfYPBRP/REmZm2NsQjKEb1rQ2q5iDWexIOxH8pMLRcyyOXbzR1pbmfU+t ufyhVuCwBsCKPWt+vvmlAiS61ZB/mPTdhGBGuGwk1eD1rJTvZtpn7yY+9m7ObnVsgaz/ 5owAAClEzVZf5MCtmEY+rK+zVg05LBvsoeappELzgA8xe/VxwatmUGGbotsxbti2VXgq QGGTB2roX2s6SxsrBE9p89YGMPfs84oa6a/VwqGxtT2ynM95VTFLDY39yzXeYp6MTgCd Z6Hw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:subject :from:content-language:references:cc:to:user-agent:mime-version:date :message-id:dkim-signature; bh=motyMZYGjG8ZpzgKBhQCp3NlNr2re4z0Ma0gRtL0ojg=; b=aSA7BU3fmSbJ1kkDwnuRf0zx6z5HWN330PyE28zMaSpCK6gPfZTOEr3ynIUwTiqo5s s46A/xUjSQN6/ubfXs8I2Z9ifMz4tS+u4o8qubI954APBK274bAs7SoNplCMaBsDtQbb lwVzkK9PNL4di21h5eVI3Tb4LmG4e34h3lg0ChWOx7HZSV5PpcNVxl5A6kcNNOAjLMf3 ezrSAiloConR30jAKpOmpC7EzaC3MtT3bweENjaY+aBPk162CLrGdcCXctx9h3Y7YG0+ 82NjdGwo+Z7TmeMtkgFy7hzmJ6iKZ218i46R9WVoBwexF8hvotD3zr1T3/DPGwVZ03zL loKQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Lu7ahFI0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w18-20020a170906481200b00949c01dcfa7si7467057ejq.864.2023.04.10.06.10.23; Mon, 10 Apr 2023 06:10:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Lu7ahFI0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229810AbjDJNJz (ORCPT + 99 others); Mon, 10 Apr 2023 09:09:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44256 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229825AbjDJNJy (ORCPT ); Mon, 10 Apr 2023 09:09:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEAB14488 for ; Mon, 10 Apr 2023 06:09:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1681132144; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=motyMZYGjG8ZpzgKBhQCp3NlNr2re4z0Ma0gRtL0ojg=; b=Lu7ahFI0gWttfCuMZnt14th3Lstp+3qFad05XEe7CJsaMv7AZUeD2UbkbOtIeYHDGy4Ka4 p2l6gyg9yRjG6xIRP7HM+1JUoqfp0yV4TUcc8D2qNlBTyTV8PBqO4c2dzBnDw00/6rKbUA 7NbwOxkEybu79u6LKd2MjZ9Qtv0xEQc= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-245-MnB7E1_4OV-68HV_0v2j6w-1; Mon, 10 Apr 2023 09:09:02 -0400 X-MC-Unique: MnB7E1_4OV-68HV_0v2j6w-1 Received: by mail-qt1-f198.google.com with SMTP id s23-20020a05622a1a9700b003e6578904c3so15400012qtc.17 for ; Mon, 10 Apr 2023 06:09:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1681132141; h=content-transfer-encoding:in-reply-to:subject:from:content-language :references:cc:to:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=motyMZYGjG8ZpzgKBhQCp3NlNr2re4z0Ma0gRtL0ojg=; b=qyLEFzMVcP2tAH9BPI/pG/u6ZaOCHFHvXn4mCPxkdHeeJ6jU9U1W8R3SKQG4Iy3zUk wmgtmb2xPW5n0rWQcdrpnyBrpV4YZS6stxEecWLjirJpePY8C0Kpn7qz/09NzdsiyoJJ to5JweMYqOEWbp0hbSqIEMhzedM4GY6VUVegms3i6oaeNasLu617ExeRxb/b/1ZN/mcm 1ZLdngswTbC1sVQ/BeZpkc5s6nWkwrdj+6yd7CEd2d2f5R7z+VkE9jD5Hh3ln6pVGa8R R/PqDr4AhwUzJR5yKPMFX7SzTbf8G65RXubEK85JqYAHgIszJOQvJMitpNkMp3ICtq3i SQTQ== X-Gm-Message-State: AAQBX9eqqUT5ipWsHcYekn4AVt1NYqSt8/SQSs5AyIt6lLeAL0pTvT+x wWvDv0/RvpmH2+5Ihpf6DVxmciAr5Y+WQuF5nt3F4P+5n/r62TuiK6qRmeK/T50o+pFPJmRCCTb W8faddfKcPF9E0G5lyrkSh9iV X-Received: by 2002:a05:622a:51:b0:3e6:9e18:73ee with SMTP id y17-20020a05622a005100b003e69e1873eemr8542964qtw.15.1681132141311; Mon, 10 Apr 2023 06:09:01 -0700 (PDT) X-Received: by 2002:a05:622a:51:b0:3e6:9e18:73ee with SMTP id y17-20020a05622a005100b003e69e1873eemr8542922qtw.15.1681132140968; Mon, 10 Apr 2023 06:09:00 -0700 (PDT) Received: from [192.168.1.14] (pool-68-160-135-240.bstnma.fios.verizon.net. [68.160.135.240]) by smtp.gmail.com with ESMTPSA id z63-20020a376542000000b007417e60f621sm3239007qkb.126.2023.04.10.06.08.59 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 10 Apr 2023 06:09:00 -0700 (PDT) Message-ID: <025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com> Date: Mon, 10 Apr 2023 09:08:58 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.1 To: Josh Poimboeuf , Nick Alcock Cc: mcgrof@kernel.org, masahiroy@kernel.org, linux-modules@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com, live-patching@vger.kernel.org, Peter Zijlstra , Steven Rostedt , =?UTF-8?B?RsSBbmctcnXDrCBTw7JuZw==?= References: <20221205163157.269335-1-nick.alcock@oracle.com> <20230407232118.o2x5lakfgyzy56gz@treble> Content-Language: en-US From: Joe Lawrence Subject: Re: [PATCH modules-next v10 00/13] kallsyms: reliable symbol->address lookup with /proc/kallmodsyms In-Reply-To: <20230407232118.o2x5lakfgyzy56gz@treble> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.4 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/7/23 19:21, Josh Poimboeuf wrote: > On Mon, Dec 05, 2022 at 04:31:44PM +0000, Nick Alcock wrote: >> The whole point of symbols is that their names are unique: you can look up a >> symbol and get back a unique address, and vice versa. Alas, because >> /proc/kallsyms (rightly) reports all symbols, even hidden ones, it does not >> really satisfy this requirement. Large numbers of symbols are duplicated >> many times (just search for __list_del_entry!), and while usually these are >> just out-of-lined things defined in header files and thus all have the same >> implementation, it does make it needlessly hard to figure out which one is >> which in stack dumps, when tracing, and such things. Some configuration >> options make things much worse: my test make allyesconfig runs introduced >> thousands of text symbols named _sub_I_65535_1, one per compiler-generated >> object file, and it was fairly easy to make them appear in ftrace output. >> >> Right now the kernel has no way at all to tell such symbols apart, and nor >> has the user: their address differs and that's all. Which module did they >> come from? Which object file? We don't know. Figuring out which is which >> when tracing needs a combination of guesswork and luck, and if there are >> thousands of them that's not a pleasant prospect. In discussions at LPC it >> became clear that this is not just annoying me but Steve Rostedt and others, >> so it's probably desirable to fix this. >> >> It turns out that the linker, and the kernel build system, can be made to >> give us everything we need to resolve this once and for all. This series >> provides a new /proc/kallmodsyms which is like /proc/kallsyms except that it >> annotates every (textual) symbol which comes from a built-in kernel module >> with the module's name, in square brackets: if a symbol is used by multiple >> modules, it gets [multiple] [names]; if a symbol is still ambiguous it gets >> a cut-down {object file name}; the combination of symbol, [module] [names] >> and {object file name} is unique (with one minor exception: the arm64 nvhe >> module is pre-linked with ld -r, causing all symbols in it to appear to come >> from the same object file: if it was reworked to use thin archives this >> problem would go away). > > Hi Nick, > > Sorry for jumping in late on an old patch set. I just saw the LWN > article about the MODULE_LICENSE() patches and I have some comments > about duplicate symbols and a question about the motivation for this > patch set. > > For livepatch we have a solution for disambiguating duplicate local > symbols called "sympos". It works (for now) but there are some cases > (like LTO) where it falls apart and it may not be the best long term > solution. > > The function granularity KASLR (fgkaslr) patches proposed a potentially > better option: use the GNU linker -zunique_symbols flag which renames > all duplicates to have unique names across the entire linked object. > And IIRC, that idea was eventually dropped. Fāng-ruì Sòng posted a few reasons why -zunique-symbols wouldn't be a great solution [1] [1] https://lore.kernel.org/lkml/CAFP8O3K1mkiCGMTEeuSifZtr2piHsKTjP5TOA25nqpv2SrbzYQ@mail.gmail.com/ was suggested instead, but I'm not 100% if that ever became the preferred solution. > There are other components which also struggle with duplicate symbols: > ftrace, kprobes, BPF, etc. It would be good to come up with a kallsyms > solution that works for everybody. > > Anyway, I was nodding along with the above cover letter until I got to > the third paragraph. > > A "built-in kernel module" is not actually a module, as it's built in to > vmlinux. I suspect the point is that if you rebuild with a different > config, it might become a module. But many other changes could also > occur with a changed config, including changed inlining decisions and > GCC IPA optimization function renaming, in which case the symbol might > no longer exist with the new config. > > Also I'm confused what it means for a symbol to be "used by multiple > modules". If the same TU or inline symbol is linked into two modules, > it will be loaded twice at two different addresses, and the > implementations could even differ. > > It sounds like there are two problems being conflated: > > 1) how to uniquely identify symbols in the current kernel > > For this, all we really need is file+sym. > > Or, enable -zunique-symbols in the linker. > > 2) how to uniquely identify symbols across multiple kernels/configs > > This seems much trickier, as much can change across kernels and > configs, including compiler inlining and naming decisions, not to > mention actual code changes. > > The problems are related, but distinct. > > #2 seems significantly harder to implement properly. > > Would solving #1 give you most of what you need? > > Based on the difficulty of #2, it really needs a proper justification. > I didn't see that in either of the patch sets. > > Can you share more details about what specific problem needs solved and > why? And how this would be used? Examples would be helpful. > > The article linked to this brief explanation [1], but that doesn't > clarify why "distinct notation used by users for things in named > modules" would be important. > > Is there a reason the user can't just use whatever notation is > appropriate for their specific kernel? Or, once we have #1, couldn't > tooling do an intermediate translation? > > [1] https://lwn.net/ml/linux-kernel/87h6z5wqlk.fsf@esperi.org.uk/ > -- Joe