From: "Jiri Slaby (SUSE)"
To: linux-kernel@vger.kernel.org
Cc: "Jiri Slaby (SUSE)", Alexander Potapenko, Alexander Shishkin,
    Alexei Starovoitov, Alexey Makhalov, Andrew Morton, Andrey Konovalov,
    Andrey Ryabinin, Andrii Nakryiko, Andy Lutomirski, Ard Biesheuvel,
    Arnaldo Carvalho de Melo, Ben Segall, Borislav Petkov, Daniel Borkmann,
    Daniel Bristot de Oliveira, Dave Hansen, Dietmar Eggemann,
    Dmitry Vyukov, Don Zickus, Hao Luo, "H.J. Lu", "H. Peter Anvin",
    Huang Rui, Ingo Molnar, Jan Hubicka, Jason Baron, Jiri Kosina,
    Jiri Olsa, Joe Lawrence, John Fastabend, Josh Poimboeuf, Juergen Gross,
    Juri Lelli, KP Singh, Mark Rutland, Martin KaFai Lau, Martin Liska,
    Masahiro Yamada, Mel Gorman, Miguel Ojeda, Michal Marek,
    Miroslav Benes, Namhyung Kim, Nick Desaulniers, Oleksandr Tyshchenko,
    Peter Zijlstra, Petr Mladek, "Rafael J. Wysocki", Richard Biener,
    Sedat Dilek, Song Liu, Stanislav Fomichev, Stefano Stabellini,
    Steven Rostedt, Thomas Gleixner, Valentin Schneider, Vincent Guittot,
    Vincenzo Frascino, Viresh Kumar, VMware PV-Drivers Reviewers,
    Yonghong Song
Subject: [PATCH 00/46] gcc-LTO support for the kernel
Date: Mon, 14 Nov 2022 12:42:58 +0100
Message-Id: <20221114114344.18650-1-jirislaby@kernel.org>
X-Mailer: git-send-email 2.38.1
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

this is the first call for comments (and kbuild complaints) on this
support for gcc (full) LTO in the kernel. Most of the patches come from
Andi. Martin and I rebased them onto new kernels and fixed the issues
known to us. I also updated most of the commit logs and reordered the
patches into groups with similar intent.

The very first patch comes from Alexander and is already pending in some
x86 queue (I believe). I am attaching it only for completeness. Without
it, the kernel does not boot (LTO reorders a lot).

In our measurements, the performance differences are negligible.

The kernel is bigger with gcc LTO due to more inlining. The next step
might be to play with non-static functions: as we export everything, the
compiler cannot actually drop anything (esp. inlined and no longer needed
functions).

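For readers unfamiliar with why the series keeps marking things as
__visible or __noreorder, here is a minimal user-space sketch of the two
underlying gcc attributes (externally_visible and no_reorder). It is not
taken from the patches. The demo_* macro and symbol names are hypothetical
stand-ins, and it is only an assumption that the __noreorder macro
introduced by this series wraps no_reorder. The attributes are guarded
with __has_attribute so the file still compiles on compilers that lack
them.

```c
/* Hypothetical stand-ins for the kernel macros; the real definitions
 * live in the kernel headers and may differ. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(externally_visible)
#define demo_visible __attribute__((externally_visible))
#else
#define demo_visible
#endif

#if __has_attribute(no_reorder)
#define demo_noreorder __attribute__((no_reorder))
#else
#define demo_noreorder
#endif

/* Imagine the only caller is in assembly. Under whole-program LTO the
 * optimizer sees no C reference and may localize or drop the symbol;
 * externally_visible asks gcc to keep it as a global definition. */
demo_visible int demo_called_from_asm(int x)
{
	return x + 1;
}

static int demo_first_step(void)
{
	return 1;
}

static int demo_second_step(void)
{
	return 2;
}

typedef int (*demo_initcall_t)(void);

/* Loosely modelled on initcall entries: objects whose relative order in
 * the output matters. no_reorder asks gcc not to reorder them against
 * each other (or against top-level asm statements). */
demo_noreorder demo_initcall_t demo_initcall_first = demo_first_step;
demo_noreorder demo_initcall_t demo_initcall_second = demo_second_step;

/* Runs the "initcalls" in source order and feeds the sum to the visible
 * function. */
int demo_run(void)
{
	int sum = demo_initcall_first() + demo_initcall_second();

	return demo_called_from_asm(sum);
}
```

Calling demo_run() from a small main() exercises all three symbols. The
behaviour is the same with or without the attributes; what changes under
-flto is only whether the symbols and their order survive in the final
object.
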
Cc: Alexander Potapenko
Cc: Alexander Shishkin
Cc: Alexei Starovoitov
Cc: Alexey Makhalov
Cc: Andrew Morton
Cc: Andrey Konovalov
Cc: Andrey Ryabinin
Cc: Andrii Nakryiko
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Arnaldo Carvalho de Melo
Cc: Ben Segall
Cc: Borislav Petkov
Cc: Daniel Borkmann
Cc: Daniel Bristot de Oliveira
Cc: Dave Hansen
Cc: Dietmar Eggemann
Cc: Dmitry Vyukov
Cc: Don Zickus
Cc: Hao Luo
Cc: H.J. Lu
Cc: "H. Peter Anvin"
Cc: Huang Rui
Cc: Ingo Molnar
Cc: Jan Hubicka
Cc: Jason Baron
Cc: Jiri Kosina
Cc: Jiri Olsa
Cc: Joe Lawrence
Cc: John Fastabend
Cc: Josh Poimboeuf
Cc: Juergen Gross
Cc: Juri Lelli
Cc: KP Singh
Cc: Mark Rutland
Cc: Martin KaFai Lau
Cc: Martin Liska
Cc: Masahiro Yamada
Cc: Mel Gorman
Cc: Miguel Ojeda
Cc: Michal Marek
Cc: Miroslav Benes
Cc: Namhyung Kim
Cc: Nick Desaulniers
Cc: Oleksandr Tyshchenko
Cc: Peter Zijlstra
Cc: Petr Mladek
Cc: "Rafael J. Wysocki"
Cc: Richard Biener
Cc: Sedat Dilek
Cc: Song Liu
Cc: Stanislav Fomichev
Cc: Stefano Stabellini
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: Valentin Schneider
Cc: Vincent Guittot
Cc: Vincenzo Frascino
Cc: Viresh Kumar
Cc: VMware PV-Drivers Reviewers
Cc: Yonghong Song

Alexander Lobakin (1):
  x86/boot: robustify calling startup_{32,64}() from the decompressor code

Andi Kleen (36):
  Compiler Attributes, lto: introduce __noreorder
  tracepoint, lto: Mark static call functions as __visible
  static_call, lto: Mark static keys as __visible
  static_call, lto: Mark static_call_return0() as __visible
  static_call, lto: Mark func_a() as __visible_on_lto
  x86/alternative, lto: Mark int3_*() as global and __visible
  x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto
  x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible
  x86/xen, lto: Mark xen_vcpu_stolen() as __visible
  x86, lto: Mark gdt_page and native_sched_clock() as __visible
  amd, lto: Mark amd pmu and pstate functions as __visible_on_lto
  entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
  export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and __visible
  softirq, lto: Mark irq_enter/exit_rcu() as __visible
  btf, lto: Make all BTF IDs global on LTO
  init.h, lto: mark initcalls as __noreorder
  bpf, lto: mark interpreter jump table as __noreorder
  sched, lto: mark sched classes as __noreorder
  linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall()
  scripts, lto: re-add gcc-ld
  scripts, lto: use CONFIG_LTO for many LTO specific actions
  Kbuild, lto: Add Link Time Optimization support
  x86/purgatory, lto: Disable gcc LTO for purgatory
  x86/realmode, lto: Disable gcc LTO for real mode code
  x86/vdso, lto: Disable gcc LTO for the vdso
  scripts, lto: disable gcc LTO for some mod sources
  Kbuild, lto: disable gcc LTO for bounds+asm-offsets
  lib/string, lto: disable gcc LTO for string.o
  Compiler attributes, lto: disable __flatten with LTO
  Kbuild, lto: don't include weak source file symbols in System.map
  x86, lto: Disable relative init pointers with gcc LTO
  x86/livepatch, lto: Disable live patching with gcc LTO
  x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used
  scripts, lto: check C symbols for modversions
  scripts/bloat-o-meter, lto: handle gcc LTO
  x86, lto: Finally enable gcc LTO for x86

Jiri Slaby (5):
  kbuild: pass jobserver to cmd_ld_vmlinux.o
  compiler.h: introduce __visible_on_lto
  compiler.h: introduce __global_on_lto
  btf, lto: pass scope as strings
  x86/apic, lto: Mark apic_driver*() as __noreorder

Martin Liska (4):
  kbuild: lto: preserve MAKEFLAGS for module linking
  x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto
  mm/kasan, lto: Mark kasan mem{cpy,move,set} as __used
  kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned

 Documentation/kbuild/index.rst      |  2 +
 Documentation/kbuild/lto-build.rst  | 76 +++++++++++++++++++++++++++++
 Kbuild                              |  3 ++
 Makefile                            |  6 ++-
 arch/Kconfig                        | 52 ++++++++++++++++++++
 arch/x86/Kconfig                    |  5 +-
 arch/x86/boot/compressed/head_32.S  |  2 +-
 arch/x86/boot/compressed/head_64.S  |  2 +-
 arch/x86/boot/compressed/misc.c     | 16 +++---
 arch/x86/entry/vdso/Makefile        |  2 +
 arch/x86/events/amd/core.c          |  2 +-
 arch/x86/include/asm/apic.h         |  4 +-
 arch/x86/include/asm/preempt.h      |  4 +-
 arch/x86/kernel/alternative.c       |  5 +-
 arch/x86/kernel/cpu/common.c        |  2 +-
 arch/x86/kernel/paravirt.c          |  2 +-
 arch/x86/kernel/sev-shared.c        |  2 +-
 arch/x86/kernel/tsc.c               |  2 +-
 arch/x86/lib/memcpy_32.c            |  6 +--
 arch/x86/purgatory/Makefile         |  2 +
 arch/x86/realmode/Makefile          |  1 +
 drivers/cpufreq/amd-pstate.c        | 15 +++---
 drivers/xen/time.c                  |  2 +-
 include/asm-generic/vmlinux.lds.h   |  2 +-
 include/linux/btf_ids.h             | 24 ++++-----
 include/linux/compiler.h            |  8 +++
 include/linux/compiler_attributes.h | 15 ++++++
 include/linux/export.h              |  6 ++-
 include/linux/init.h                |  2 +-
 include/linux/linkage.h             | 16 +++---
 include/linux/static_call.h         | 12 ++---
 include/linux/tracepoint.h          |  4 +-
 kernel/bpf/core.c                   |  2 +-
 kernel/entry/common.c               |  2 +-
 kernel/kallsyms.c                   |  2 +-
 kernel/livepatch/Kconfig            |  1 +
 kernel/sched/sched.h                |  1 +
 kernel/softirq.c                    |  4 +-
 kernel/static_call.c                |  2 +-
 kernel/static_call_inline.c         |  6 +--
 kernel/time/posix-stubs.c           | 19 +++++++-
 lib/Makefile                        |  2 +
 mm/kasan/generic.c                  |  2 +-
 mm/kasan/shadow.c                   |  6 +--
 scripts/Makefile.build              | 17 ++++---
 scripts/Makefile.lib                |  2 +-
 scripts/Makefile.lto                | 43 ++++++++++++++++
 scripts/Makefile.modfinal           |  2 +-
 scripts/Makefile.vmlinux            |  3 +-
 scripts/Makefile.vmlinux_o          |  6 +--
 scripts/bloat-o-meter               |  2 +-
 scripts/gcc-ld                      | 40 +++++++++++++++
 scripts/link-vmlinux.sh             |  9 ++--
 scripts/mksysmap                    |  2 +
 scripts/mod/Makefile                |  3 ++
 scripts/module.lds.S                |  2 +-
 56 files changed, 384 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/kbuild/lto-build.rst
 create mode 100644 scripts/Makefile.lto
 create mode 100755 scripts/gcc-ld

-- 
2.38.1