Date: Mon, 01 Mar 2021 11:11:22 +1000
From: Nicholas Piggin
Subject: Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
To: Arnd Bergmann, Fangrui Song
Cc: Ard Biesheuvel, Arnd Bergmann, Andrew Scull, Mark Brown,
    Catalin Marinas, clang-built-linux, David Brazdil,
    Geert Uytterhoeven, Ionela Voinescu, Kees Cook, Kristina Martsenko,
    Linux ARM, "linux-kernel@vger.kernel.org", Mark Rutland,
    Marc Zyngier, Nathan Chancellor, Nick Desaulniers,
    Vincenzo Frascino, Will Deacon, Nicolas Pitre
References: <20210225112122.2198845-1-arnd@kernel.org>
    <20210226211323.arkvjnr4hifxapqu@google.com>
Message-Id: <1614559739.p25z5x88wl.astroid@bobo.none>
X-Mailing-List: linux-kernel@vger.kernel.org

Excerpts from Arnd Bergmann's message of February 27, 2021 7:49 pm:
> On Fri, Feb 26, 2021 at 10:13 PM 'Fangrui Song' via Clang Built Linux
> wrote:
>>
>> For folks who are interested in --gc-sections on metadata sections,
>> I want to bring you awareness of the implication of __start_/__stop_ symbols and C identifier name sections.
>> You can see https://github.com/ClangBuiltLinux/linux/issues/1307 for a summary.
>> (Its linked blog article has some examples.)
>>
>> In the kernel linker scripts, most C identifier name sections begin with double-underscore __.
>> Some are surrounded by `KEEP(...)`, some are not.
>>
>> * A `KEEP` keyword has GC root semantics and makes ld --gc-sections ineffectful.
>> * Without `KEEP`, __start_/__stop_ references from a live input section
>>   can unnecessarily retain all the associated C identifier name input
>>   sections. The new ld.lld option `-z start-stop-gc` can defeat this rule.
>>
>> As an example, a __start___jump_table reference from a live section
>> causes all `__jump_table` input section to be retained, even if you
>> change `KEEP(__jump_table)` to `(__jump_table)`.
>> (If you change the symbol name from `__start_${section}` to something
>> else (e.g. `__start${section}`), the rule will not apply.)
>
> I suspect the __start_* symbols are cargo-culted by many developers
> copying stuff around between kernel linker scripts, that's certainly how I
> approach making changes to it normally without a deeper understanding
> of how the linker actually works or what the different bits of syntax mean
> there.
>
> I see the original vmlinux.lds linker script showed up in linux-2.1.23, and
> it contained
>
> +  . = ALIGN(16);               /* Exception table */
> +  __start___ex_table = .;
> +  __ex_table : { *(__ex_table) }
> +  __stop___ex_table = .;
> +
> +  __start___ksymtab = .;       /* Kernel symbol table */
> +  __ksymtab : { *(__ksymtab) }
> +  __stop___ksymtab = .;
>
> originally for arch/sparc, and shortly afterwards for i386. The magic
> __ex_table section was first used in linux-2.1.7 without a linker
> script.
> It's probably a good idea to try cleaning these up by using
> non-magic start/stop symbols for all sections, and relying on KEEP()
> instead where needed.
>
>> There are a lot of KEEP usage. Perhaps some can be dropped to facilitate
>> ld --gc-sections.
>
> I see a lot of these were added by Nick Piggin (added to Cc) in this commit:
>
> commit 266ff2a8f51f02b429a987d87634697eb0d01d6a
> Author: Nicholas Piggin
> Date:   Wed May 9 22:59:58 2018 +1000
>
>     kbuild: Fix asm-generic/vmlinux.lds.h for LD_DEAD_CODE_DATA_ELIMINATION
>
>     KEEP more tables, and add the function/data section wildcard to more
>     section selections.
>
>     This is a little ad-hoc at the moment, but kernel code should be moved
>     to consistently use .text..x (note: double dots) for explicit sections
>     and all references to it in the linker script can be made with
>     TEXT_MAIN, and similarly for other sections.
>
>     For now, let's see if major architectures move to enabling this option
>     then we can do some refactoring passes. Otherwise if it remains unused
>     or superseded by LTO, this may not be required.
>
>     Signed-off-by: Nicholas Piggin
>     Signed-off-by: Masahiro Yamada
>
> which apparently was intentionally cautious.
>
> Unlike what Nick expected in his submission, I now think the annotations
> will be needed for LTO just like they are for --gc-sections.

Yeah I wasn't sure exactly what LTO looks like or how it would work.
I thought perhaps LTO might be able to find dead code with circular /
back references, we could put references from the code back to these
tables or something so they would be kept without KEEP. I don't know, I
was handwaving!

I managed to get powerpc (and IIRC x86?) working with gc sections with
those KEEP annotations, but effectiveness of course is far worse than
what Nicolas was able to achieve with all his techniques and tricks.
But yes, unless there is some other mechanism to handle these tables,
KEEP probably has to stay. I suggest this wants a very explicit and
systematic way to handle it (maybe with some toolchain support) rather
than trying to just remove things case by case and see what breaks.

I don't know if Nicolas has been working on his shrinking patches
recently, but he probably knows more than anyone about this stuff.

Thanks,
Nick
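
For readers following the thread, the retention rules discussed above can be
sketched as a toy reachability pass. This is only an illustrative model written
for this note, not how ld or ld.lld is implemented: the `Section` class, the
`live_sections` function, and the section ids such as "used.jump" are all
hypothetical. It assumes three rules taken from the quoted text: KEEP acts as a
GC root, a __start_/__stop_ reference from a live section retains every input
section of that name, and `-z start-stop-gc` switches that second rule off.
The direct reference from "used.text" to "used.jump" stands in for the kind of
back reference from code to table mentioned in the reply.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str            # section name, e.g. "__jump_table" (hypothetical model)
    keep: bool = False   # True if the linker script wraps it in KEEP(...)
    refs: list = field(default_factory=list)             # ids of sections referenced directly
    start_stop_refs: list = field(default_factory=list)  # names referenced via __start_/__stop_


def live_sections(sections, entry, start_stop_gc=False):
    """Return the ids of input sections that survive this toy GC pass."""
    # Group section ids by section name so a name can retain all of them.
    by_name = {}
    for sid, sec in sections.items():
        by_name.setdefault(sec.name, []).append(sid)

    # GC roots: the entry section plus everything marked KEEP.
    worklist = [entry] + [sid for sid, sec in sections.items() if sec.keep]
    live = set()
    while worklist:
        sid = worklist.pop()
        if sid in live:
            continue
        live.add(sid)
        sec = sections[sid]
        worklist.extend(sec.refs)
        if not start_stop_gc:
            # Assumed default rule: a __start_/__stop_ reference from a live
            # section retains every input section with that name.
            for name in sec.start_stop_refs:
                worklist.extend(by_name.get(name, []))
    return live


# Hypothetical inputs: main uses __start___jump_table, one function is dead.
example = {
    "main.text": Section(".text", refs=["used.text"],
                         start_stop_refs=["__jump_table"]),
    "used.text": Section(".text", refs=["used.jump"]),
    "dead.text": Section(".text", refs=["dead.jump"]),
    "used.jump": Section("__jump_table"),
    "dead.jump": Section("__jump_table"),
}

if __name__ == "__main__":
    print(sorted(live_sections(example, "main.text")))
    print(sorted(live_sections(example, "main.text", start_stop_gc=True)))
```

In this model the default pass retains "dead.jump" even though "dead.text" is
discarded, while the start_stop_gc pass drops it. Without the back reference
from "used.text", the start_stop_gc pass would also drop "used.jump", which is
the case where KEEP or some other mechanism is still needed.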