2021-02-25 16:06:04

by Masahiro Yamada

Subject: [PATCH 0/4] kbuild: build speed improvement for CONFIG_TRIM_UNUSED_KSYMS


CONFIG_TRIM_UNUSED_KSYMS has been revived, but Linus is still unhappy
about the build speed.

I re-implemented this feature, and the build time cost is now almost
unnoticeable.

I hope this makes Linus happy.



Masahiro Yamada (4):
kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO
export.h: make __ksymtab_strings per-symbol section
kbuild: separate out vmlinux.lds generation
kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in
one-pass

Makefile | 34 ++++++------
arch/alpha/kernel/Makefile | 3 +-
arch/arc/kernel/Makefile | 3 +-
arch/arm/kernel/Makefile | 3 +-
arch/arm64/kernel/Makefile | 3 +-
arch/csky/kernel/Makefile | 3 +-
arch/h8300/kernel/Makefile | 2 +-
arch/hexagon/kernel/Makefile | 3 +-
arch/ia64/kernel/Makefile | 3 +-
arch/m68k/kernel/Makefile | 2 +-
arch/microblaze/kernel/Makefile | 3 +-
arch/mips/kernel/Makefile | 3 +-
arch/nds32/kernel/Makefile | 3 +-
arch/nios2/kernel/Makefile | 2 +-
arch/openrisc/kernel/Makefile | 3 +-
arch/parisc/kernel/Makefile | 3 +-
arch/powerpc/kernel/Makefile | 2 +-
arch/riscv/kernel/Makefile | 2 +-
arch/s390/kernel/Makefile | 3 +-
arch/sh/kernel/Makefile | 3 +-
arch/sparc/kernel/Makefile | 2 +-
arch/um/kernel/Makefile | 2 +-
arch/x86/kernel/Makefile | 2 +-
arch/xtensa/kernel/Makefile | 3 +-
include/asm-generic/export.h | 25 +--------
include/asm-generic/vmlinux.lds.h | 29 +++++++++--
include/linux/export.h | 56 +++++---------------
init/Kconfig | 4 +-
scripts/Makefile.build | 7 +--
scripts/adjust_autoksyms.sh | 76 ---------------------------
scripts/gen-keep-ksyms.sh | 86 +++++++++++++++++++++++++++++++
scripts/gen_autoksyms.sh | 55 --------------------
scripts/gen_ksymdeps.sh | 25 ---------
scripts/lto-used-symbollist.txt | 5 --
scripts/module.lds.S | 38 ++++++++++----
35 files changed, 210 insertions(+), 291 deletions(-)
delete mode 100755 scripts/adjust_autoksyms.sh
create mode 100755 scripts/gen-keep-ksyms.sh
delete mode 100755 scripts/gen_autoksyms.sh
delete mode 100755 scripts/gen_ksymdeps.sh
delete mode 100644 scripts/lto-used-symbollist.txt

--
2.27.0


2021-02-25 16:06:15

by Masahiro Yamada

Subject: [PATCH 3/4] kbuild: separate out vmlinux.lds generation

This is a preparation for the CONFIG_TRIM_UNUSED_KSYMS improvement.

In the new implementation of CONFIG_TRIM_UNUSED_KSYMS (next commit),
unused export symbols are trimmed at the link stage. Kbuild needs to
build the entire tree to know which symbols are needed by modules for
symbol resolution.

The list of needed symbols is generated after the directory traversal
and included from vmlinux.lds.S and module.lds.S.

The build rule of module.lds.S is already separated as modules_prepare.

The build of vmlinux.lds must be delayed because such a list is not yet
available when Kbuild is visiting arch/$(SRCARCH)/kernel/Makefile.

Separate the build rule of vmlinux.lds and invoke it from the top
Makefile.

I guarded the $(warning ) in scripts/Makefile.build; otherwise, a
false-positive warning would be displayed, for example when building
ARCH=ia64 with CONFIG_IA64_PALINFO=m. Ideally, vmlinux.lds.S could be
moved to a different directory, but I am keeping the changes less
invasive for now.
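
A rough shell analogue of the guard, offered only as an illustration:
the helper name only_mod_goals is hypothetical and is not part of the
patch. It mirrors the condition
ifeq ($(filter-out %.mod, $(MAKECMDGOALS)),) from the diff below, under
the assumption that the obj-m warning should be evaluated only when no
goal is given or every goal is a .mod file.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the patch.
# Succeeds (exit status 0) when there are no goals or every goal ends in
# ".mod"; fails as soon as one goal is something else, such as a single
# vmlinux.lds target requested by the top Makefile.
only_mod_goals() {
	for goal in "$@"; do
		case "$goal" in
		*.mod) ;;          # a .mod goal keeps the check enabled
		*) return 1 ;;     # any other goal disables it
		esac
	done
	return 0
}
```

With a goal such as arch/ia64/kernel/vmlinux.lds the function fails,
which corresponds to the warning being skipped for that invocation.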

Signed-off-by: Masahiro Yamada <[email protected]>
---

Makefile | 8 ++++++--
arch/alpha/kernel/Makefile | 3 ++-
arch/arc/kernel/Makefile | 3 ++-
arch/arm/kernel/Makefile | 3 ++-
arch/arm64/kernel/Makefile | 3 ++-
arch/csky/kernel/Makefile | 3 ++-
arch/h8300/kernel/Makefile | 2 +-
arch/hexagon/kernel/Makefile | 3 ++-
arch/ia64/kernel/Makefile | 3 ++-
arch/m68k/kernel/Makefile | 2 +-
arch/microblaze/kernel/Makefile | 3 ++-
arch/mips/kernel/Makefile | 3 ++-
arch/nds32/kernel/Makefile | 3 ++-
arch/nios2/kernel/Makefile | 2 +-
arch/openrisc/kernel/Makefile | 3 ++-
arch/parisc/kernel/Makefile | 3 ++-
arch/powerpc/kernel/Makefile | 2 +-
arch/riscv/kernel/Makefile | 2 +-
arch/s390/kernel/Makefile | 3 ++-
arch/sh/kernel/Makefile | 3 ++-
arch/sparc/kernel/Makefile | 2 +-
arch/um/kernel/Makefile | 2 +-
arch/x86/kernel/Makefile | 2 +-
arch/xtensa/kernel/Makefile | 3 ++-
scripts/Makefile.build | 2 ++
25 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/Makefile b/Makefile
index b18dbc634690..34393fd72fe1 100644
--- a/Makefile
+++ b/Makefile
@@ -1184,6 +1184,9 @@ quiet_cmd_autoksyms_h = GEN $@
$(autoksyms_h):
$(call cmd,autoksyms_h)

+$(KBUILD_LDS): prepare FORCE
+ $(Q)$(MAKE) $(build)=$(patsubst %/,%,$(dir $@)) $@
+
ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)

# Final link of vmlinux with optional arch pass after final link
@@ -1191,14 +1194,15 @@ cmd_link-vmlinux = \
$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)"; \
$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
+vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(KBUILD_LDS) \
+ $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) FORCE
+$(call if_changed,link-vmlinux)

targets := vmlinux

# The actual objects are generated when descending,
# make sure no implicit rule kicks in
-$(sort $(vmlinux-deps) $(subdir-modorder)): descend ;
+$(sort $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) $(subdir-modorder)): descend ;

filechk_kernel.release = \
echo "$(KERNELVERSION)$$($(CONFIG_SHELL) $(srctree)/scripts/setlocalversion $(srctree))"
diff --git a/arch/alpha/kernel/Makefile b/arch/alpha/kernel/Makefile
index 5a74581bf0ee..6e2baaebdee3 100644
--- a/arch/alpha/kernel/Makefile
+++ b/arch/alpha/kernel/Makefile
@@ -3,7 +3,8 @@
# Makefile for the linux kernel.
#

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
asflags-y := $(KBUILD_CFLAGS)
ccflags-y := -Wno-sign-compare

diff --git a/arch/arc/kernel/Makefile b/arch/arc/kernel/Makefile
index 8c4fc4b54c14..0a06c018f0cd 100644
--- a/arch/arc/kernel/Makefile
+++ b/arch/arc/kernel/Makefile
@@ -31,4 +31,5 @@ else
obj-y += ctx_sw_asm.o
endif

-extra-y := vmlinux.lds head.o
+targets += vmlinux.lds
+extra-y := head.o
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index ae295a3bcfef..7483916c034d 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -106,4 +106,5 @@ endif

obj-$(CONFIG_HAVE_ARM_SMCCC) += smccc-call.o

-extra-y := $(head-y) vmlinux.lds
+extra-y := $(head-y)
+targets += vmlinux.lds
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ed65576ce710..32e530c22cba 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -64,7 +64,8 @@ obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o

obj-y += probes/
head-y := head.o
-extra-y += $(head-y) vmlinux.lds
+extra-y += $(head-y)
+targets += vmlinux.lds

ifeq ($(CONFIG_DEBUG_EFI),y)
AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\""
diff --git a/arch/csky/kernel/Makefile b/arch/csky/kernel/Makefile
index 37f37c0e934a..2ebc393b57f4 100644
--- a/arch/csky/kernel/Makefile
+++ b/arch/csky/kernel/Makefile
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0-only
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y += entry.o atomic.o signal.o traps.o irq.o time.o vdso.o
obj-y += power.o syscall.o syscall_table.o setup.o
diff --git a/arch/h8300/kernel/Makefile b/arch/h8300/kernel/Makefile
index 307aa51576dd..7ef912ee576f 100644
--- a/arch/h8300/kernel/Makefile
+++ b/arch/h8300/kernel/Makefile
@@ -3,7 +3,7 @@
# Makefile for the linux kernel.
#

-extra-y := vmlinux.lds
+targets += vmlinux.lds

obj-y := process.o traps.o ptrace.o \
signal.o setup.o syscalls.o \
diff --git a/arch/hexagon/kernel/Makefile b/arch/hexagon/kernel/Makefile
index fae3dce32fde..9765301d2672 100644
--- a/arch/hexagon/kernel/Makefile
+++ b/arch/hexagon/kernel/Makefile
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-$(CONFIG_SMP) += smp.o

diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index c89bd5f8cbf8..d430230b21af 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -7,7 +7,8 @@ ifdef CONFIG_DYNAMIC_FTRACE
CFLAGS_REMOVE_ftrace.o = -pg
endif

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y := entry.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o \
irq_lsapic.o ivt.o pal.o patch.o process.o ptrace.o sal.o \
diff --git a/arch/m68k/kernel/Makefile b/arch/m68k/kernel/Makefile
index dbac7f8743fc..b054f4198e63 100644
--- a/arch/m68k/kernel/Makefile
+++ b/arch/m68k/kernel/Makefile
@@ -12,7 +12,7 @@ extra-$(CONFIG_HP300) := head.o
extra-$(CONFIG_Q40) := head.o
extra-$(CONFIG_SUN3X) := head.o
extra-$(CONFIG_SUN3) := sun3-head.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds

obj-y := entry.o irq.o module.o process.o ptrace.o
obj-y += setup.o signal.o sys_m68k.o syscalltable.o time.o traps.o
diff --git a/arch/microblaze/kernel/Makefile b/arch/microblaze/kernel/Makefile
index 15a20eb814ce..cdf98cbfcce9 100644
--- a/arch/microblaze/kernel/Makefile
+++ b/arch/microblaze/kernel/Makefile
@@ -12,7 +12,8 @@ CFLAGS_REMOVE_ftrace.o = -pg
CFLAGS_REMOVE_process.o = -pg
endif

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y += dma.o exceptions.o \
hw_exception_handler.o irq.o \
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index b4a57f1de772..f2e82faa06c4 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -3,7 +3,8 @@
# Makefile for the Linux/MIPS kernel.
#

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y += branch.o cmpxchg.o elf.o entry.o genex.o idle.o irq.o \
process.o prom.o ptrace.o reset.o setup.o signal.o \
diff --git a/arch/nds32/kernel/Makefile b/arch/nds32/kernel/Makefile
index 394df3f6442c..ec061f18f00f 100644
--- a/arch/nds32/kernel/Makefile
+++ b/arch/nds32/kernel/Makefile
@@ -19,7 +19,8 @@ obj-$(CONFIG_OF) += devtree.o
obj-$(CONFIG_CACHE_L2) += atl2c.o
obj-$(CONFIG_PERF_EVENTS) += perf_event_cpu.o
obj-$(CONFIG_PM) += pm.o sleep.o
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

CFLAGS_fpu.o += -mext-fpu-sp -mext-fpu-dp

diff --git a/arch/nios2/kernel/Makefile b/arch/nios2/kernel/Makefile
index 0b645e1e3158..1ec4be68462e 100644
--- a/arch/nios2/kernel/Makefile
+++ b/arch/nios2/kernel/Makefile
@@ -4,7 +4,7 @@
#

extra-y += head.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds

obj-y += cpuinfo.o
obj-y += entry.o
diff --git a/arch/openrisc/kernel/Makefile b/arch/openrisc/kernel/Makefile
index 2d172e79f58d..6be5c65ea3e9 100644
--- a/arch/openrisc/kernel/Makefile
+++ b/arch/openrisc/kernel/Makefile
@@ -3,7 +3,8 @@
# Makefile for the linux kernel.
#

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y := setup.o or32_ksyms.o process.o dma.o \
traps.o time.o irq.o entry.o ptrace.o signal.o \
diff --git a/arch/parisc/kernel/Makefile b/arch/parisc/kernel/Makefile
index 068d90950d93..31e5109251aa 100644
--- a/arch/parisc/kernel/Makefile
+++ b/arch/parisc/kernel/Makefile
@@ -3,7 +3,8 @@
# Makefile for arch/parisc/kernel
#

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y := cache.o pacache.o setup.o pdt.o traps.o time.o irq.o \
pa7300lc.o syscall.o entry.o sys_parisc.o firmware.o \
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 6084fa499aa3..c7576957f05a 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -101,7 +101,7 @@ extra-$(CONFIG_40x) := head_40x.o
extra-$(CONFIG_44x) := head_44x.o
extra-$(CONFIG_FSL_BOOKE) := head_fsl_booke.o
extra-$(CONFIG_PPC_8xx) := head_8xx.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds

obj-$(CONFIG_RELOCATABLE) += reloc_$(BITS).o

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index f6caf4d9ca15..fcebdb13bcda 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -9,7 +9,7 @@ CFLAGS_REMOVE_patch.o = -pg
endif

extra-y += head.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds

obj-y += soc.o
obj-y += cpu.o
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
index c97818a382f3..15d3ee771f22 100644
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -42,7 +42,8 @@ obj-y += entry.o reipl.o relocate_kernel.o kdebugfs.o alternative.o
obj-y += nospec-branch.o ipl_vmparm.o machine_kexec_reloc.o unwind_bc.o
obj-y += smp.o

-extra-y += head64.o vmlinux.lds
+extra-y += head64.o
+targets += vmlinux.lds

obj-$(CONFIG_SYSFS) += nospec-sysfs.o
CFLAGS_REMOVE_nospec-branch.o += $(CC_FLAGS_EXPOLINE)
diff --git a/arch/sh/kernel/Makefile b/arch/sh/kernel/Makefile
index aa0fbc9202b1..e8384889f5f0 100644
--- a/arch/sh/kernel/Makefile
+++ b/arch/sh/kernel/Makefile
@@ -3,7 +3,8 @@
# Makefile for the Linux/SuperH kernel.
#

-extra-y := head_32.o vmlinux.lds
+extra-y := head_32.o
+targets += vmlinux.lds

ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index d3a0e072ebe8..685669edb9f8 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -12,7 +12,7 @@ extra-y := head_$(BITS).o
# Undefine sparc when processing vmlinux.lds - it is used
# And teach CPP we are doing $(BITS) builds (for this case)
CPPFLAGS_vmlinux.lds := -Usparc -m$(BITS)
-extra-y += vmlinux.lds
+targets += vmlinux.lds

ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
diff --git a/arch/um/kernel/Makefile b/arch/um/kernel/Makefile
index 5aa882011e04..76eea4cc00f0 100644
--- a/arch/um/kernel/Makefile
+++ b/arch/um/kernel/Makefile
@@ -12,7 +12,7 @@ CPPFLAGS_vmlinux.lds := -DSTART=$(LDS_START) \
-DELF_ARCH=$(LDS_ELF_ARCH) \
-DELF_FORMAT=$(LDS_ELF_FORMAT) \
$(LDS_EXTRA)
-extra-y := vmlinux.lds
+targets += vmlinux.lds

obj-y = config.o exec.o exitcode.o irq.o ksyms.o mem.o \
physmem.o process.o ptrace.o reboot.o sigio.o \
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 2ddf08351f0b..7d6fce044f97 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y := head_$(BITS).o
extra-y += head$(BITS).o
extra-y += ebda.o
extra-y += platform-quirks.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds

CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)

diff --git a/arch/xtensa/kernel/Makefile b/arch/xtensa/kernel/Makefile
index d4082c6a121b..79be7bfdf989 100644
--- a/arch/xtensa/kernel/Makefile
+++ b/arch/xtensa/kernel/Makefile
@@ -3,7 +3,8 @@
# Makefile for the Linux/Xtensa kernel.
#

-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds

obj-y := align.o coprocessor.o entry.o irq.o platform.o process.o \
ptrace.o setup.o signal.o stacktrace.o syscall.o time.o traps.o \
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 3f6bf0ea7c0e..fd573e5ca0b9 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -63,12 +63,14 @@ ifndef obj
$(warning kbuild: Makefile.build is included improperly)
endif

+ifeq ($(filter-out %.mod, $(MAKECMDGOALS)),)
ifeq ($(need-modorder),)
ifneq ($(obj-m),)
$(warning $(patsubst %.o,'%.ko',$(obj-m)) will not be built even though obj-m is specified.)
$(warning You cannot use subdir-y/m to visit a module Makefile. Use obj-y/m instead.)
endif
endif
+endif

# ===========================================================================

--
2.27.0

2021-02-25 16:07:21

by Masahiro Yamada

Subject: [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO

Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
does not work as expected if the .config file has already specified
CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling
CONFIG_LTO_CLANG.

So, the user-supplied whitelist and the LTO-specific whitelist must be
independent of each other.

I refactored the shell script so that CONFIG_MODVERSIONS and
CONFIG_LTO_CLANG handle whitelists in the same way.
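
A minimal sketch of that idea, not the script from the patch: symbols
implied by a config option are accumulated in one variable, and the
user-supplied whitelist is read separately, so neither source overrides
the other. The function name collect_symbols and its three positional
arguments are assumptions made for this illustration; the symbol names
are the ones listed in the diff below.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the patch.
# $1: non-empty if modversions is enabled
# $2: non-empty if Clang LTO is enabled
# $3: path to a user whitelist file, or empty if none was configured
# Prints one symbol per line, sorted and de-duplicated.
collect_symbols() {
	needed=
	if [ -n "$1" ]; then
		needed="$needed module_layout"
	fi
	if [ -n "$2" ]; then
		# intrinsic functions and stack protector symbols
		needed="$needed memcpy memmove memset"
		needed="$needed __stack_chk_fail __stack_chk_guard"
	fi
	{
		echo "$needed"
		if [ -n "$3" ]; then
			cat "$3"
		fi
	} | tr ' ' '\n' | sed -n -e '/^$/!p' | sort -u
}
```

Because the whitelist is only appended to the stream, a whitelist that
was configured before LTO was enabled no longer hides the LTO symbols.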

Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
Signed-off-by: Masahiro Yamada <[email protected]>
---

init/Kconfig | 1 -
scripts/gen_autoksyms.sh | 33 ++++++++++++++++++++++++---------
scripts/lto-used-symbollist.txt | 5 -----
3 files changed, 24 insertions(+), 15 deletions(-)
delete mode 100644 scripts/lto-used-symbollist.txt

diff --git a/init/Kconfig b/init/Kconfig
index 0bf5b340b80e..351161326e3c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2277,7 +2277,6 @@ config TRIM_UNUSED_KSYMS
config UNUSED_KSYMS_WHITELIST
string "Whitelist of symbols to keep in ksymtab"
depends on TRIM_UNUSED_KSYMS
- default "scripts/lto-used-symbollist.txt" if LTO_CLANG
help
By default, all unused exported symbols will be un-exported from the
build when TRIM_UNUSED_KSYMS is selected.
diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh
index d54dfba15bf2..b74d5949fea6 100755
--- a/scripts/gen_autoksyms.sh
+++ b/scripts/gen_autoksyms.sh
@@ -19,7 +19,24 @@ esac
# We need access to CONFIG_ symbols
. include/config/auto.conf

-ksym_wl=/dev/null
+needed_symbols=
+
+# Special case for modversions (see modpost.c)
+if [ -n "$CONFIG_MODVERSIONS" ]; then
+ needed_symbols="$needed_symbols module_layout"
+fi
+
+# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary
+# when the .mod files are generated, which means they don't yet contain
+# references to certain symbols that will be present in the final binaries.
+if [ -n "$CONFIG_LTO_CLANG" ]; then
+ # intrinsic functions
+ needed_symbols="$needed_symbols memcpy memmove memset"
+ # stack protector symbols
+ needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard"
+fi
+
+ksym_wl=
if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then
# Use 'eval' to expand the whitelist path and check if it is relative
eval ksym_wl="$CONFIG_UNUSED_KSYMS_WHITELIST"
@@ -40,16 +57,14 @@ cat > "$output_file" << EOT
EOT

[ -f modules.order ] && modlist=modules.order || modlist=/dev/null
-sed 's/ko$/mod/' $modlist |
-xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
-cat - "$ksym_wl" |
+
+{
+ sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
+ echo "$needed_symbols"
+ [ -n "$ksym_wl" ] && cat "$ksym_wl"
+} | sed -e 's/ /\n/g' | sed -n -e '/^$/!p' |
# Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
# point addresses.
sed -e 's/^\.//' |
sort -u |
sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
-
-# Special case for modversions (see modpost.c)
-if [ -n "$CONFIG_MODVERSIONS" ]; then
- echo "#define __KSYM_module_layout 1" >> "$output_file"
-fi
diff --git a/scripts/lto-used-symbollist.txt b/scripts/lto-used-symbollist.txt
deleted file mode 100644
index 38e7bb9ebaae..000000000000
--- a/scripts/lto-used-symbollist.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-memcpy
-memmove
-memset
-__stack_chk_fail
-__stack_chk_guard
--
2.27.0

2021-02-25 16:08:34

by Masahiro Yamada

Subject: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
guarding") re-enabled this feature, but Linus is still unhappy about
the build time.

The reason for the slowness is the recursion: after updating
<generated/autoksyms.h> (which contains all symbols needed by modules),
Kbuild begins a second traversal, rebuilding objects whose EXPORT_SYMBOL
needs flipping.

This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work
in one pass. After the tree traversal, a linker script snippet
<generated/keep-ksyms.h> is generated. It feeds the list of necessary
sections to vmlinux.lds.S and module.lds.S. The other sections fall
into DISCARDS.
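
The shape of the generated snippet can be pictured with a small shell
sketch. This is an illustration under assumptions, not the script from
the patch: the function name emit_keep and the two example symbols are
hypothetical, and only the KEEP(*(<prefix><symbol>)) form is taken from
the diff below.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the patch.
# $1: macro name, $2: section prefix, $3: space-separated symbol list
# Prints a preprocessor macro that keeps one input section per symbol;
# sections that are not listed are left for the discard rule.
emit_keep() {
	printf '#define %s \\\n' "$1"
	for s in $3; do
		printf '\tKEEP(*(%s%s)) \\\n' "$2" "$s"
	done
	printf '\n'
}

# Example with two made-up needed symbols.
emit_keep KSYMTAB "___ksymtab+" "printk kmalloc"
```

Each needed symbol contributes exactly one KEEP() line, so the cost of
trimming grows with the number of symbols modules use rather than with
the number of objects that would otherwise be rebuilt.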

I believe there are no more build issues. I dropped the 'if EXPERT' and
'depends on !COMPILE_TEST' guards.

Signed-off-by: Masahiro Yamada <[email protected]>
---

Makefile | 30 +++-----
include/asm-generic/export.h | 23 ------
include/asm-generic/vmlinux.lds.h | 29 +++++--
include/linux/export.h | 54 +++----------
init/Kconfig | 3 +-
scripts/Makefile.build | 5 --
scripts/adjust_autoksyms.sh | 76 -------------------
.../{gen_autoksyms.sh => gen-keep-ksyms.sh} | 34 ++++++---
scripts/gen_ksymdeps.sh | 25 ------
scripts/module.lds.S | 38 +++++++---
10 files changed, 103 insertions(+), 214 deletions(-)
delete mode 100755 scripts/adjust_autoksyms.sh
rename scripts/{gen_autoksyms.sh => gen-keep-ksyms.sh} (78%)
delete mode 100755 scripts/gen_ksymdeps.sh

diff --git a/Makefile b/Makefile
index 34393fd72fe1..cda800fa2f78 100644
--- a/Makefile
+++ b/Makefile
@@ -1160,29 +1160,23 @@ export KBUILD_LDS := arch/$(SRCARCH)/kernel/vmlinux.lds
# used by scripts/Makefile.package
export KBUILD_ALLDIRS := $(sort $(filter-out arch/%,$(vmlinux-alldirs)) LICENSES arch include scripts tools)

-vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS)
+targets :=

-# Recurse until adjust_autoksyms.sh is satisfied
-PHONY += autoksyms_recursive
ifdef CONFIG_TRIM_UNUSED_KSYMS
# For the kernel to actually contain only the needed exported symbols,
# we have to build modules as well to determine what those symbols are.
# (this can be evaluated only once include/config/auto.conf has been included)
KBUILD_MODULES := 1

-autoksyms_recursive: descend modules.order
- $(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh \
- "$(MAKE) -f $(srctree)/Makefile vmlinux"
-endif
-
-autoksyms_h := $(if $(CONFIG_TRIM_UNUSED_KSYMS), include/generated/autoksyms.h)
+quiet_cmd_gen_used_ksyms = GEN $@
+ cmd_gen_used_ksyms = $(CONFIG_SHELL) $(srctree)/scripts/gen-keep-ksyms.sh $< > $@

-quiet_cmd_autoksyms_h = GEN $@
- cmd_autoksyms_h = mkdir -p $(dir $@); \
- $(CONFIG_SHELL) $(srctree)/scripts/gen_autoksyms.sh $@
+include/generated/keep-ksyms.h: modules.order FORCE
+ $(call if_changed,gen_used_ksyms)
+targets += include/generated/keep-ksyms.h

-$(autoksyms_h):
- $(call cmd,autoksyms_h)
+$(KBUILD_LDS) modules_prepare: include/generated/keep-ksyms.h
+endif

$(KBUILD_LDS): prepare FORCE
$(Q)$(MAKE) $(build)=$(patsubst %/,%,$(dir $@)) $@
@@ -1194,11 +1188,11 @@ cmd_link-vmlinux = \
$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)"; \
$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(KBUILD_LDS) \
+vmlinux: scripts/link-vmlinux.sh $(KBUILD_LDS) \
$(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) FORCE
+$(call if_changed,link-vmlinux)

-targets := vmlinux
+targets += vmlinux

# The actual objects are generated when descending,
# make sure no implicit rule kicks in
@@ -1227,7 +1221,7 @@ scripts: scripts_basic scripts_dtc
PHONY += prepare archprepare

archprepare: outputmakefile archheaders archscripts scripts include/config/kernel.release \
- asm-generic $(version_h) $(autoksyms_h) include/generated/utsrelease.h \
+ asm-generic $(version_h) include/generated/utsrelease.h \
include/generated/autoconf.h

prepare0: archprepare
@@ -1503,7 +1497,7 @@ endif # CONFIG_MODULES
# make distclean Remove editor backup files, patch leftover files and the like

# Directories & files removed with 'make clean'
-CLEAN_FILES += include/ksym vmlinux.symvers \
+CLEAN_FILES += vmlinux.symvers \
modules.builtin modules.builtin.modinfo modules.nsdeps \
compile_commands.json

diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index e847f1fde367..b9be5b1dd7e6 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -57,30 +57,7 @@ __kstrtab_\name:
#endif
.endm

-#if defined(CONFIG_TRIM_UNUSED_KSYMS)
-
-#include <linux/kconfig.h>
-#include <generated/autoksyms.h>
-
-.macro __ksym_marker sym
- .section ".discard.ksym","a"
-__ksym_marker_\sym:
- .previous
-.endm
-
-#define __EXPORT_SYMBOL(sym, val, sec) \
- __ksym_marker sym; \
- __cond_export_sym(sym, val, sec, __is_defined(__KSYM_##sym))
-#define __cond_export_sym(sym, val, sec, conf) \
- ___cond_export_sym(sym, val, sec, conf)
-#define ___cond_export_sym(sym, val, sec, enabled) \
- __cond_export_sym_##enabled(sym, val, sec)
-#define __cond_export_sym_1(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
-#define __cond_export_sym_0(sym, val, sec) /* nothing */
-
-#else
#define __EXPORT_SYMBOL(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
-#endif

#define EXPORT_SYMBOL(name) \
__EXPORT_SYMBOL(name, KSYM_FUNC(name),)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5a2b31890bb8..f2b0990be159 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -50,6 +50,24 @@
* [__nosave_begin, __nosave_end] for the nosave data
*/

+#if CONFIG_TRIM_UNUSED_KSYMS
+#include <generated/keep-ksyms.h>
+
+#define KSYM_DISCARDS *(___ksymtab+*) \
+ *(___ksymtab_gpl+*) \
+ *(___kcrctab+*) \
+ *(___kcrctab_gpl+*) \
+ *(__ksymtab_strings+*)
+
+#else
+#define KSYMTAB KEEP(*(SORT(___ksymtab+*)))
+#define KSYMTAB_GPL KEEP(*(SORT(___ksymtab_gpl+*)))
+#define KCRCTAB KEEP(*(SORT(___kcrctab+*)))
+#define KCRCTAB_GPL KEEP(*(SORT(___kcrctab_gpl+*)))
+#define KSYMTAB_STRINGS *(__ksymtab_strings+*)
+#define KSYM_DISCARDS
+#endif
+
#ifndef LOAD_OFFSET
#define LOAD_OFFSET 0
#endif
@@ -486,34 +504,34 @@
/* Kernel symbol table: Normal symbols */ \
__ksymtab : AT(ADDR(__ksymtab) - LOAD_OFFSET) { \
__start___ksymtab = .; \
- KEEP(*(SORT(___ksymtab+*))) \
+ KSYMTAB \
__stop___ksymtab = .; \
} \
\
/* Kernel symbol table: GPL-only symbols */ \
__ksymtab_gpl : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) { \
__start___ksymtab_gpl = .; \
- KEEP(*(SORT(___ksymtab_gpl+*))) \
+ KSYMTAB_GPL \
__stop___ksymtab_gpl = .; \
} \
\
/* Kernel symbol table: Normal symbols */ \
__kcrctab : AT(ADDR(__kcrctab) - LOAD_OFFSET) { \
__start___kcrctab = .; \
- KEEP(*(SORT(___kcrctab+*))) \
+ KCRCTAB \
__stop___kcrctab = .; \
} \
\
/* Kernel symbol table: GPL-only symbols */ \
__kcrctab_gpl : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) { \
__start___kcrctab_gpl = .; \
- KEEP(*(SORT(___kcrctab_gpl+*))) \
+ KCRCTAB_GPL \
__stop___kcrctab_gpl = .; \
} \
\
/* Kernel symbol table: strings */ \
__ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) { \
- *(__ksymtab_strings+*) \
+ KSYMTAB_STRINGS \
} \
\
/* __*init sections */ \
@@ -993,6 +1011,7 @@
/DISCARD/ : { \
EXIT_DISCARDS \
EXIT_CALL \
+ KSYM_DISCARDS \
COMMON_DISCARDS \
}

diff --git a/include/linux/export.h b/include/linux/export.h
index 01e6ab19b226..f9cc13cd2c8c 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -76,9 +76,18 @@ struct kernel_symbol {
};
#endif

-#ifdef __GENKSYMS__
+#if !defined(CONFIG_MODULES) || defined(__DISABLE_EXPORTS)
+
+/*
+ * Allow symbol exports to be disabled completely so that C code may
+ * be reused in other execution contexts such as the UEFI stub or the
+ * decompressor.
+ */
+#define __EXPORT_SYMBOL(sym, sec, ns)
+
+#elif defined(__GENKSYMS__)

-#define ___EXPORT_SYMBOL(sym, sec, ns) __GENKSYMS_EXPORT_SYMBOL(sym)
+#define __EXPORT_SYMBOL(sym, sec, ns) __GENKSYMS_EXPORT_SYMBOL(sym)

#else

@@ -94,7 +103,7 @@ struct kernel_symbol {
* section flag requires it. Use '%progbits' instead of '@progbits' since the
* former apparently works on all arches according to the binutils source.
*/
-#define ___EXPORT_SYMBOL(sym, sec, ns) \
+#define __EXPORT_SYMBOL(sym, sec, ns) \
extern typeof(sym) sym; \
extern const char __kstrtab_##sym[]; \
extern const char __kstrtabns_##sym[]; \
@@ -107,45 +116,6 @@ struct kernel_symbol {
" .previous \n"); \
__KSYMTAB_ENTRY(sym, sec)

-#endif
-
-#if !defined(CONFIG_MODULES) || defined(__DISABLE_EXPORTS)
-
-/*
- * Allow symbol exports to be disabled completely so that C code may
- * be reused in other execution contexts such as the UEFI stub or the
- * decompressor.
- */
-#define __EXPORT_SYMBOL(sym, sec, ns)
-
-#elif defined(CONFIG_TRIM_UNUSED_KSYMS)
-
-#include <generated/autoksyms.h>
-
-/*
- * For fine grained build dependencies, we want to tell the build system
- * about each possible exported symbol even if they're not actually exported.
- * We use a symbol pattern __ksym_marker_<symbol> that the build system filters
- * from the $(NM) output (see scripts/gen_ksymdeps.sh). These symbols are
- * discarded in the final link stage.
- */
-#define __ksym_marker(sym) \
- static int __ksym_marker_##sym[0] __section(".discard.ksym") __used
-
-#define __EXPORT_SYMBOL(sym, sec, ns) \
- __ksym_marker(sym); \
- __cond_export_sym(sym, sec, ns, __is_defined(__KSYM_##sym))
-#define __cond_export_sym(sym, sec, ns, conf) \
- ___cond_export_sym(sym, sec, ns, conf)
-#define ___cond_export_sym(sym, sec, ns, enabled) \
- __cond_export_sym_##enabled(sym, sec, ns)
-#define __cond_export_sym_1(sym, sec, ns) ___EXPORT_SYMBOL(sym, sec, ns)
-#define __cond_export_sym_0(sym, sec, ns) /* nothing */
-
-#else
-
-#define __EXPORT_SYMBOL(sym, sec, ns) ___EXPORT_SYMBOL(sym, sec, ns)
-
#endif /* CONFIG_MODULES */

#ifdef DEFAULT_SYMBOL_NAMESPACE
diff --git a/init/Kconfig b/init/Kconfig
index 351161326e3c..e52034f66aeb 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2259,8 +2259,7 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
If unsure, say N.

config TRIM_UNUSED_KSYMS
- bool "Trim unused exported kernel symbols" if EXPERT
- depends on !COMPILE_TEST
+ bool "Trim unused exported kernel symbols"
help
The kernel and some modules make many symbols available for
other modules to use via EXPORT_SYMBOL() and variants. Depending
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index fd573e5ca0b9..fd2d7517a652 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -245,16 +245,12 @@ objtool_dep = $(objtool_obj) \
include/config/stack/validation.h)

ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_gen_ksymdeps = \
- $(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
-
# List module undefined symbols
undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
endif

define rule_cc_o_c
$(call cmd_and_fixdep,cc_o_c)
- $(call cmd,gen_ksymdeps)
$(call cmd,checksrc)
$(call cmd,checkdoc)
$(call cmd,objtool)
@@ -264,7 +260,6 @@ endef

define rule_as_o_S
$(call cmd_and_fixdep,as_o_S)
- $(call cmd,gen_ksymdeps)
$(call cmd,objtool)
$(call cmd,modversions_S)
endef
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
deleted file mode 100755
index 2b366d945ccb..000000000000
--- a/scripts/adjust_autoksyms.sh
+++ /dev/null
@@ -1,76 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0-only
-
-# Script to update include/generated/autoksyms.h and dependency files
-#
-# Copyright: (C) 2016 Linaro Limited
-# Created by: Nicolas Pitre, January 2016
-#
-
-# Update the include/generated/autoksyms.h file.
-#
-# For each symbol being added or removed, the corresponding dependency
-# file's timestamp is updated to force a rebuild of the affected source
-# file. All arguments passed to this script are assumed to be a command
-# to be exec'd to trigger a rebuild of those files.
-
-set -e
-
-cur_ksyms_file="include/generated/autoksyms.h"
-new_ksyms_file="include/generated/autoksyms.h.tmpnew"
-
-info() {
- if [ "$quiet" != "silent_" ]; then
- printf " %-7s %s\n" "$1" "$2"
- fi
-}
-
-info "CHK" "$cur_ksyms_file"
-
-# Use "make V=1" to debug this script.
-case "$KBUILD_VERBOSE" in
-*1*)
- set -x
- ;;
-esac
-
-# We need access to CONFIG_ symbols
-. include/config/auto.conf
-
-# Generate a new symbol list file
-$CONFIG_SHELL $srctree/scripts/gen_autoksyms.sh "$new_ksyms_file"
-
-# Extract changes between old and new list and touch corresponding
-# dependency files.
-changed=$(
-count=0
-sort "$cur_ksyms_file" "$new_ksyms_file" | uniq -u |
-sed -n 's/^#define __KSYM_\(.*\) 1/\1/p' | tr "A-Z_" "a-z/" |
-while read sympath; do
- if [ -z "$sympath" ]; then continue; fi
- depfile="include/ksym/${sympath}.h"
- mkdir -p "$(dirname "$depfile")"
- touch "$depfile"
- # Filesystems with coarse time precision may create timestamps
- # equal to the one from a file that was very recently built and that
- # needs to be rebuild. Let's guard against that by making sure our
- # dep files are always newer than the first file we created here.
- while [ ! "$depfile" -nt "$new_ksyms_file" ]; do
- touch "$depfile"
- done
- echo $((count += 1))
-done | tail -1 )
-changed=${changed:-0}
-
-if [ $changed -gt 0 ]; then
- # Replace the old list with tne new one
- old=$(grep -c "^#define __KSYM_" "$cur_ksyms_file" || true)
- new=$(grep -c "^#define __KSYM_" "$new_ksyms_file" || true)
- info "KSYMS" "symbols: before=$old, after=$new, changed=$changed"
- info "UPD" "$cur_ksyms_file"
- mv -f "$new_ksyms_file" "$cur_ksyms_file"
- # Then trigger a rebuild of affected source files
- exec $@
-else
- rm -f "$new_ksyms_file"
-fi
diff --git a/scripts/gen_autoksyms.sh b/scripts/gen-keep-ksyms.sh
similarity index 78%
rename from scripts/gen_autoksyms.sh
rename to scripts/gen-keep-ksyms.sh
index b74d5949fea6..cedb18fac46b 100755
--- a/scripts/gen_autoksyms.sh
+++ b/scripts/gen-keep-ksyms.sh
@@ -1,13 +1,23 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0-only

-# Create an autoksyms.h header file from the list of all module's needed symbols
-# as recorded on the second line of *.mod files and the user-provided symbol
-# whitelist.
-
set -e

-output_file="$1"
+modlist=$1
+
+emit ()
+{
+ local macro="$1"
+ local prefix="$2"
+ local syms="$3"
+
+ echo "#define $macro \\"
+ for s in $syms
+ do
+ echo " KEEP(*($prefix$s)) \\"
+ done
+ echo
+}

# Use "make V=1" to debug this script.
case "$KBUILD_VERBOSE" in
@@ -49,15 +59,14 @@ fi

# Generate a new ksym list file with symbols needed by the current
# set of modules.
-cat > "$output_file" << EOT
+cat << EOT
/*
* Automatically generated file; DO NOT EDIT.
*/

EOT

-[ -f modules.order ] && modlist=modules.order || modlist=/dev/null
-
+syms=$(
{
sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
echo "$needed_symbols"
@@ -67,4 +76,11 @@ EOT
# point addresses.
sed -e 's/^\.//' |
sort -u |
-sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
+sed -e 's/\(.*\)/\1/'
+)
+
+emit "KSYMTAB" "___ksymtab+" "$syms"
+emit "KSYMTAB_GPL" "___ksymtab_gpl+" "$syms"
+emit "KCRCTAB" "___kcrctab+" "$syms"
+emit "KCRCTAB_GPL" "___kcrctab_gpl+" "$syms"
+emit "KSYMTAB_STRINGS" "__ksymtab_strings+" "$syms"
diff --git a/scripts/gen_ksymdeps.sh b/scripts/gen_ksymdeps.sh
deleted file mode 100755
index 1324986e1362..000000000000
--- a/scripts/gen_ksymdeps.sh
+++ /dev/null
@@ -1,25 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-
-set -e
-
-# List of exported symbols
-ksyms=$($NM $1 | sed -n 's/.*__ksym_marker_\(.*\)/\1/p' | tr A-Z a-z)
-
-if [ -z "$ksyms" ]; then
- exit 0
-fi
-
-echo
-echo "ksymdeps_$1 := \\"
-
-for s in $ksyms
-do
- echo $s | sed -e 's:^_*: $(wildcard include/ksym/:' \
- -e 's:__*:/:g' -e 's/$/.h) \\/'
-done
-
-echo
-echo "$1: \$(ksymdeps_$1)"
-echo
-echo "\$(ksymdeps_$1):"
diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index 168cd27e6122..a6d2d96e29f0 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -3,16 +3,30 @@
* Archs are free to supply their own linker scripts. ld will
* combine them automatically.
*/
-SECTIONS {
- /DISCARD/ : {
- *(.discard)
- *(.discard.*)
- }

- __ksymtab 0 : { *(SORT(___ksymtab+*)) }
- __ksymtab_gpl 0 : { *(SORT(___ksymtab_gpl+*)) }
- __kcrctab 0 : { *(SORT(___kcrctab+*)) }
- __kcrctab_gpl 0 : { *(SORT(___kcrctab_gpl+*)) }
+#if CONFIG_TRIM_UNUSED_KSYMS
+#include <generated/keep-ksyms.h>
+
+#define KSYM_DISCARDS *(___ksymtab+*) \
+ *(___ksymtab_gpl+*) \
+ *(___kcrctab+*) \
+ *(___kcrctab_gpl+*) \
+ *(__ksymtab_strings+*)
+#else
+#define KSYMTAB KEEP(*(SORT(___ksymtab+*)))
+#define KSYMTAB_GPL KEEP(*(SORT(___ksymtab_gpl+*)))
+#define KCRCTAB KEEP(*(SORT(___kcrctab+*)))
+#define KCRCTAB_GPL KEEP(*(SORT(___kcrctab_gpl+*)))
+#define KSYMTAB_STRINGS *(__ksymtab_strings+*)
+#define KSYM_DISCARDS
+#endif
+
+SECTIONS {
+ __ksymtab 0 : { KSYMTAB }
+ __ksymtab_gpl 0 : { KSYMTAB_GPL }
+ __kcrctab 0 : { KCRCTAB }
+ __kcrctab_gpl 0 : { KCRCTAB_GPL }
+ __ksymtab_strings 0 : { KSYMTAB_STRINGS }

.init_array 0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) }

@@ -41,6 +55,12 @@ SECTIONS {
}

.text : { *(.text .text.[0-9a-zA-Z_]*) }
+
+ /DISCARD/ : {
+ *(.discard)
+ *(.discard.*)
+ KSYM_DISCARDS
+ }
}

/* bring in arch-specific sections */
--
2.27.0
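
As an illustration of what the new gen-keep-ksyms.sh prints (presumably redirected into generated/keep-ksyms.h, which module.lds.S includes), here is a minimal sketch of the emit() helper run on its own. The symbol names printk and kmalloc are placeholders, not taken from the patch, and the sketch uses printf with positional parameters instead of echo and local; the real script derives the list from the *.mod files and the whitelist.

```shell
#!/bin/sh
# Sketch of the emit() helper: print one linker-script macro that KEEPs
# the per-symbol input section of every listed symbol.
# $1 = macro name, $2 = section prefix, $3 = space-separated symbols
emit() {
	printf '#define %s \\\n' "$1"
	for s in $3; do
		printf '\tKEEP(*(%s%s)) \\\n' "$2" "$s"
	done
	# A blank line ends the backslash continuation.
	echo
}

# Hypothetical symbol list, for illustration only.
syms="printk kmalloc"

keep_ksyms=$(
	emit KSYMTAB "___ksymtab+" "$syms"
	emit KCRCTAB "___kcrctab+" "$syms"
)
printf '%s\n' "$keep_ksyms"
```

Each macro expands to one KEEP() entry per needed symbol, so any per-symbol section that is not listed falls through to the KSYM_DISCARDS wildcards in /DISCARD/.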

2021-02-25 16:08:56

by Masahiro Yamada

Subject: [PATCH 2/4] export.h: make __ksymtab_strings per-symbol section

The export symbol tables are placed in their own sections (__ksymtab*+<sym>)
and sorted by SORT (an alias of SORT_BY_NAME) because the module
subsystem uses binary search for symbol resolution.

We did not have a good reason to do the same for __ksymtab_strings, but
now we do.

To make CONFIG_TRIM_UNUSED_KSYMS work in one pass, the linker needs
to trim the unused symbol and namespace strings. To allow a per-symbol
keep/drop choice, each __ksymtab_strings entry must be placed in its own
section. SORT is unnecessary here, though.

This keeps the string unification introduced by commit ce2b617ce8cb
("export.h: reduce __ksymtab_strings string duplication by using "MS"
section flags").

For example, the empty namespaces share the same address.

$ nm -n
[ snip ]
ffffffff8233b53a r __kstrtabns_IO_APIC_get_PCI_irq_vector
ffffffff8233b53a r __kstrtabns_I_BDEV
ffffffff8233b53a r __kstrtabns_LZ4_decompress_fast
ffffffff8233b53a r __kstrtabns_LZ4_decompress_fast_continue
ffffffff8233b53a r __kstrtabns_LZ4_decompress_fast_usingDict
ffffffff8233b53a r __kstrtabns_LZ4_decompress_safe
ffffffff8233b53a r __kstrtabns_LZ4_decompress_safe_continue
...

I confirmed no size change in vmlinux.

Signed-off-by: Masahiro Yamada <[email protected]>
---

include/asm-generic/export.h | 2 +-
include/asm-generic/vmlinux.lds.h | 2 +-
include/linux/export.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 07a36a874dca..e847f1fde367 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -39,7 +39,7 @@
__ksymtab_\name:
__put \val, __kstrtab_\name
.previous
- .section __ksymtab_strings,"aMS",%progbits,1
+ .section __ksymtab_strings+\name,"aMS",%progbits,1
__kstrtab_\name:
.asciz "\name"
.previous
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index c54adce8f6f6..5a2b31890bb8 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -513,7 +513,7 @@
\
/* Kernel symbol table: strings */ \
__ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) { \
- *(__ksymtab_strings) \
+ *(__ksymtab_strings+*) \
} \
\
/* __*init sections */ \
diff --git a/include/linux/export.h b/include/linux/export.h
index 6271a5d9c988..01e6ab19b226 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -99,7 +99,7 @@ struct kernel_symbol {
extern const char __kstrtab_##sym[]; \
extern const char __kstrtabns_##sym[]; \
__CRC_SYMBOL(sym, sec); \
- asm(" .section \"__ksymtab_strings\",\"aMS\",%progbits,1 \n" \
+ asm(" .section \"__ksymtab_strings+" #sym "\",\"aMS\",%progbits,1\n" \
"__kstrtab_" #sym ": \n" \
" .asciz \"" #sym "\" \n" \
"__kstrtabns_" #sym ": \n" \
--
2.27.0
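
To see why per-symbol section names matter, the following toy model mimics how a linker script assigns each input section to the first statement that matches it. It is only a sketch: the shell function stands in for the linker, and the symbol names (printk, kfree, kmalloc) and the needed list are made up for illustration.

```shell
#!/bin/sh
# Toy model of first-match placement in a linker script; not a real linker.
# Hypothetical inputs: the symbols that modules need, and the per-symbol
# input sections that exist once every string has its own section.
needed="printk kmalloc"
sections="__ksymtab_strings+printk __ksymtab_strings+kfree __ksymtab_strings+kmalloc"

# place SECTION: report where the given input section would end up.
place() {
	for n in $needed; do
		# An explicit KEEP(*(__ksymtab_strings+<sym>)) entry matches first.
		[ "$1" = "__ksymtab_strings+$n" ] && { echo keep; return; }
	done
	case $1 in
	__ksymtab_strings+*) echo discard ;;	# caught by the /DISCARD/ wildcard
	*) echo other ;;
	esac
}

result=$(for s in $sections; do printf '%s %s\n' "$(place "$s")" "$s"; done)
printf '%s\n' "$result"
```

With a single shared __ksymtab_strings input section there would be only one name to match, so the choice could not be made per symbol.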

2021-02-25 17:50:18

by Sami Tolvanen

Subject: Re: [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO

Hi Masahiro,

On Thu, Feb 25, 2021 at 8:03 AM Masahiro Yamada <[email protected]> wrote:
>
> Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> does not work as expected if the .config file has already specified
> CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling
> CONFIG_LTO_CLANG.
>
> So, the user-supplied whitelist and LTO-specific white list must be
> independent of each other.
>
> I refactored the shell script so CONFIG_MODVERSIONS and CONFIG_CLANG_LTO
> handle whitelists in the same way.
>
> Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> Signed-off-by: Masahiro Yamada <[email protected]>
> ---
>
> init/Kconfig | 1 -
> scripts/gen_autoksyms.sh | 33 ++++++++++++++++++++++++---------
> scripts/lto-used-symbollist.txt | 5 -----
> 3 files changed, 24 insertions(+), 15 deletions(-)
> delete mode 100644 scripts/lto-used-symbollist.txt

> +
> +ksym_wl=
> if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then
> # Use 'eval' to expand the whitelist path and check if it is relative
> eval ksym_wl="$CONFIG_UNUSED_KSYMS_WHITELIST"
> @@ -40,16 +57,14 @@ cat > "$output_file" << EOT
> EOT
>
> [ -f modules.order ] && modlist=modules.order || modlist=/dev/null
> -sed 's/ko$/mod/' $modlist |
> -xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
> -cat - "$ksym_wl" |
> +
> +{
> + sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
> + echo "$needed_symbols"
> + [ -n "$ksym_wl" ] && cat "$ksym_wl"
> +} | sed -e 's/ /\n/g' | sed -n -e '/^$/!p' |
> # Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
> # point addresses.
> sed -e 's/^\.//' |
> sort -u |
> sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
> -
> -# Special case for modversions (see modpost.c)
> -if [ -n "$CONFIG_MODVERSIONS" ]; then
> - echo "#define __KSYM_module_layout 1" >> "$output_file"
> -fi
> diff --git a/scripts/lto-used-symbollist.txt b/scripts/lto-used-symbollist.txt
> deleted file mode 100644
> index 38e7bb9ebaae..000000000000
> --- a/scripts/lto-used-symbollist.txt
> +++ /dev/null
> @@ -1,5 +0,0 @@
> -memcpy
> -memmove
> -memset
> -__stack_chk_fail
> -__stack_chk_guard
> --
> 2.27.0
>
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 0bf5b340b80e..351161326e3c 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2277,7 +2277,6 @@ config TRIM_UNUSED_KSYMS
> config UNUSED_KSYMS_WHITELIST
> string "Whitelist of symbols to keep in ksymtab"
> depends on TRIM_UNUSED_KSYMS
> - default "scripts/lto-used-symbollist.txt" if LTO_CLANG
> help
> By default, all unused exported symbols will be un-exported from the
> build when TRIM_UNUSED_KSYMS is selected.
> diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh
> index d54dfba15bf2..b74d5949fea6 100755
> --- a/scripts/gen_autoksyms.sh
> +++ b/scripts/gen_autoksyms.sh
> @@ -19,7 +19,24 @@ esac
> # We need access to CONFIG_ symbols
> . include/config/auto.conf
>
> -ksym_wl=/dev/null
> +needed_symbols=
> +
> +# Special case for modversions (see modpost.c)
> +if [ -n "$CONFIG_MODVERSIONS" ]; then
> + needed_symbols="$needed_symbols module_layout"
> +fi
> +
> +# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary
> +# when the .mod files are generated, which means they don't yet contain
> +# references to certain symbols that will be present in the final binaries.
> +if [ -n "$CONFIG_LTO_CLANG" ]; then
> + # intrinsic functions
> + needed_symbols="$needed_symbols memcpy memmove memset"
> + # stack protector symbols
> + needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard"
> +fi

Thank you for the patch!

Arnd just reported that _mcount is also needed with some
configurations. Would you mind including that in the next version?

https://lore.kernel.org/r/[email protected]/

Sami
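
For readers following the quoted hunk, here is a small stand-alone sketch of how the three symbol sources (second line of each *.mod file, the built-in needed_symbols, and the optional whitelist) are merged into one sorted, de-duplicated list. All file contents and symbol names below are invented for the example, and tr is used in place of the sed newline substitution; it is not the script from the patch.

```shell
#!/bin/sh
# Sketch of the list-merging pipeline; all inputs below are made up.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Second line of a hypothetical *.mod file: symbols the module needs.
printf 'foo.o\nprintk .memcpy kmalloc\n' > "$tmp/foo.mod"
# Hypothetical user-supplied whitelist, one symbol per line.
printf 'my_symbol\nprintk\n' > "$tmp/whitelist"

needed_symbols=" module_layout memset"
ksym_wl="$tmp/whitelist"

syms=$(
	{
		sed -n -e '2p' "$tmp/foo.mod"
		echo "$needed_symbols"
		# Append the whitelist only when one is configured.
		if [ -n "$ksym_wl" ]; then cat "$ksym_wl"; fi
	} |
	tr ' ' '\n' |		# one symbol per line
	sed -e '/^$/d' |	# drop empty lines
	sed -e 's/^\.//' |	# drop the ppc64 dot prefix
	sort -u
)
printf '%s\n' "$syms"
```

Because the built-in list and the whitelist are separate inputs to the same pipeline, setting CONFIG_UNUSED_KSYMS_WHITELIST no longer displaces the LTO-specific symbols.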

2021-02-25 18:49:08

by kernel test robot

Subject: Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

Hi Masahiro,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20210225]
[cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
config: powerpc-mpc8313_rdb_defconfig (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
git checkout 014940331790a8cd9bee92c7201494ec3217201e
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
7 | #if CONFIG_TRIM_UNUSED_KSYMS
| ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/CONFIG_TRIM_UNUSED_KSYMS +7 scripts/module.lds.S

> 7 #if CONFIG_TRIM_UNUSED_KSYMS
8 #include <generated/keep-ksyms.h>
9

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.13 kB)
.config.gz (18.80 kB)

2021-02-25 19:00:16

by kernel test robot

Subject: Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

Hi Masahiro,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20210225]
[cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
config: arc-randconfig-r031-20210225 (attached as .config)
compiler: arc-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
git checkout 014940331790a8cd9bee92c7201494ec3217201e
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

In file included from drivers/of/unittest-data/testcases.dtb.S:1:
>> include/asm-generic/vmlinux.lds.h:53:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
53 | #if CONFIG_TRIM_UNUSED_KSYMS
| ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/CONFIG_TRIM_UNUSED_KSYMS +53 include/asm-generic/vmlinux.lds.h

> 53 #if CONFIG_TRIM_UNUSED_KSYMS
54 #include <generated/keep-ksyms.h>
55

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.21 kB)
.config.gz (25.26 kB)

2021-02-25 19:02:20

by Masahiro Yamada

Subject: Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre <[email protected]> wrote:
>
> On Fri, 26 Feb 2021, Masahiro Yamada wrote:
>
> >
> > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > about the build speed.
> >
> > I re-implemented this feature, and the build time cost is now
> > almost unnoticeable level.
> >
> > I hope this makes Linus happy.
>
> :-)
>
> I'm surprised to see that Linus is using this feature. When disabled
> (the default) this should have had no impact on the build time.

Linus is not using this feature, but does build tests.
After pulling the module subsystem pull request in this merge window,
CONFIG_TRIM_UNUSED_KSYMS was enabled by allmodconfig.


> This feature provides a nice security advantage by significantly
> reducing the kernel input surface. And people are using that also to
> better control what third party vendor can and cannot do with a distro kernel,
> etc. But that's not the reason why I implemented this feature in the
> first place.
>
> My primary goal was to efficiently reduce the kernel binary size using
> LTO even with kernel modules enabled.


Clang LTO landed in this MW.

Do you think it will reduce the kernel binary size?
No, opposite.

CONFIG_LTO_CLANG cannot trim any code even if it
is obviously unused.
Hence, it never reduces the kernel binary size.
Rather, it produces a bigger kernel.

The reason is Clang LTO was implemented against
relocatable ELF (vmlinux.o) .

I pointed out this flaw in the review process, but
it was dismissed.

This is the main reason why I did not give any Ack
(but it was merged via Kees Cook's tree).


So, the help text of this option should be revised:

This option allows for unused exported symbols to be dropped from
the build. In turn, this provides the compiler more opportunities
(especially when using LTO) for optimizing the code and reducing
binary size. This might have some security advantages as well.

Clang LTO is opposite to your expectation.



> Each EXPORT_SYMBOL() created a
> symbol dependency that prevented LTO from optimizing out the related
> code even though a tiny fraction of those exported symbols were needed.
>
> The idea behind the recursion was to catch those cases where disabling
> an exported symbol within a module would optimize out references to more
> exported symbols that, in turn, could be disabled and possibly trigger
> yet more code elimination. There is no way that can be achieved without
> extra compiler passes in a recursive manner.

I do not understand.

Modules are relocatable ELF.
Clang LTO cannot eliminate any code.
GCC LTO does not work with relocatable ELF
in the first place.


Are you talking about a story in a perfect world?
But, I do not know how LTO can eliminate dead code
from relocatable ELF.




- Current implementation

Clang LTO works against vmlinux.o,
so it is completely useless for the purpose of
eliminating dead code.

So, this case is a don't-care.
TRIM_UNUSED_KSYMS removes only the metadata of EXPORT_SYMBOL,
and no further optimization happens anyway.


- What if Clang LTO had been implemented in the final link?
(this means LTO runs 3 times if KALLSYMS_ALL is enabled)

With proper linker script input with /DISCARD/,
the meta-data of EXPORT_SYMBOL() will be dropped,
and LTO should be able to do further dead code elimination.
So, I guess we do not need to no-op EXPORT_SYMBOL by CPP
(unless I am missing something).






--
Best Regards
Masahiro Yamada

2021-02-25 19:17:14

by Masahiro Yamada

Subject: Re: [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO

On Fri, Feb 26, 2021 at 2:46 AM Sami Tolvanen <[email protected]> wrote:
>
> Hi Masahiro,
>
> On Thu, Feb 25, 2021 at 8:03 AM Masahiro Yamada <[email protected]> wrote:
> >
> > Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> > does not work as expected if the .config file has already specified
> > CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling
> > CONFIG_LTO_CLANG.
> >
> > So, the user-supplied whitelist and LTO-specific white list must be
> > independent of each other.
> >
> > I refactored the shell script so CONFIG_MODVERSIONS and CONFIG_CLANG_LTO
> > handle whitelists in the same way.
> >
> > Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> > Signed-off-by: Masahiro Yamada <[email protected]>
> > ---
> >
> > init/Kconfig | 1 -
> > scripts/gen_autoksyms.sh | 33 ++++++++++++++++++++++++---------
> > scripts/lto-used-symbollist.txt | 5 -----
> > 3 files changed, 24 insertions(+), 15 deletions(-)
> > delete mode 100644 scripts/lto-used-symbollist.txt
>
> > +
> > +ksym_wl=
> > if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then
> > # Use 'eval' to expand the whitelist path and check if it is relative
> > eval ksym_wl="$CONFIG_UNUSED_KSYMS_WHITELIST"
> > @@ -40,16 +57,14 @@ cat > "$output_file" << EOT
> > EOT
> >
> > [ -f modules.order ] && modlist=modules.order || modlist=/dev/null
> > -sed 's/ko$/mod/' $modlist |
> > -xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
> > -cat - "$ksym_wl" |
> > +
> > +{
> > + sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
> > + echo "$needed_symbols"
> > + [ -n "$ksym_wl" ] && cat "$ksym_wl"
> > +} | sed -e 's/ /\n/g' | sed -n -e '/^$/!p' |
> > # Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
> > # point addresses.
> > sed -e 's/^\.//' |
> > sort -u |
> > sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
> > -
> > -# Special case for modversions (see modpost.c)
> > -if [ -n "$CONFIG_MODVERSIONS" ]; then
> > - echo "#define __KSYM_module_layout 1" >> "$output_file"
> > -fi
> > diff --git a/scripts/lto-used-symbollist.txt b/scripts/lto-used-symbollist.txt
> > deleted file mode 100644
> > index 38e7bb9ebaae..000000000000
> > --- a/scripts/lto-used-symbollist.txt
> > +++ /dev/null
> > @@ -1,5 +0,0 @@
> > -memcpy
> > -memmove
> > -memset
> > -__stack_chk_fail
> > -__stack_chk_guard
> > --
> > 2.27.0
> >
> >
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 0bf5b340b80e..351161326e3c 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -2277,7 +2277,6 @@ config TRIM_UNUSED_KSYMS
> > config UNUSED_KSYMS_WHITELIST
> > string "Whitelist of symbols to keep in ksymtab"
> > depends on TRIM_UNUSED_KSYMS
> > - default "scripts/lto-used-symbollist.txt" if LTO_CLANG
> > help
> > By default, all unused exported symbols will be un-exported from the
> > build when TRIM_UNUSED_KSYMS is selected.
> > diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh
> > index d54dfba15bf2..b74d5949fea6 100755
> > --- a/scripts/gen_autoksyms.sh
> > +++ b/scripts/gen_autoksyms.sh
> > @@ -19,7 +19,24 @@ esac
> > # We need access to CONFIG_ symbols
> > . include/config/auto.conf
> >
> > -ksym_wl=/dev/null
> > +needed_symbols=
> > +
> > +# Special case for modversions (see modpost.c)
> > +if [ -n "$CONFIG_MODVERSIONS" ]; then
> > + needed_symbols="$needed_symbols module_layout"
> > +fi
> > +
> > +# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary
> > +# when the .mod files are generated, which means they don't yet contain
> > +# references to certain symbols that will be present in the final binaries.
> > +if [ -n "$CONFIG_LTO_CLANG" ]; then
> > + # intrinsic functions
> > + needed_symbols="$needed_symbols memcpy memmove memset"
> > + # stack protector symbols
> > + needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard"
> > +fi
>
> Thank you for the patch!
>
> Arnd just reported that _mcount is also needed with some
> configurations. Would you mind including that in the next version?
>
> https://lore.kernel.org/r/[email protected]/

Sure, I can even pick it up
although that patch was not addressed to me or kbuild ML.



--
Best Regards
Masahiro Yamada

2021-02-25 19:30:19

by Nicolas Pitre

Subject: Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

On Fri, 26 Feb 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre <[email protected]> wrote:
> >
> > On Fri, 26 Feb 2021, Masahiro Yamada wrote:
> >
> > >
> > > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > > about the build speed.
> > >
> > > I re-implemented this feature, and the build time cost is now
> > > almost unnoticeable level.
> > >
> > > I hope this makes Linus happy.
> >
> > :-)
> >
> > I'm surprised to see that Linus is using this feature. When disabled
> > (the default) this should have had no impact on the build time.
>
> Linus is not using this feature, but does build tests.
> After pulling the module subsystem pull request in this merge window,
> CONFIG_TRIM_UNUSED_KSYMS was enabled by allmodconfig.

If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
That comes with the feature.

> > This feature provides a nice security advantage by significantly
> > reducing the kernel input surface. And people are using that also to
> > better control what third party vendor can and cannot do with a distro kernel,
> > etc. But that's not the reason why I implemented this feature in the
> > first place.
> >
> > My primary goal was to efficiently reduce the kernel binary size using
> > LTO even with kernel modules enabled.
>
>
> Clang LTO landed in this MW.
>
> Do you think it will reduce the kernel binary size?
> No, opposite.

LTO ought to reduce binary size. It is rather broken otherwise.
Having a global view before optimizing allows for the compiler to do
project wide constant propagation and dead code elimination.

> CONFIG_LTO_CLANG cannot trim any code even if it
> is obviously unused.
> Hence, it never reduces the kernel binary size.
> Rather, it produces a bigger kernel.

Then what's the point?

> The reason is Clang LTO was implemented against
> relocatable ELF (vmlinux.o) .

That's not true LTO then.

> I pointed out this flaw in the review process, but
> it was dismissed.
>
> This is the main reason why I did not give any Ack
> (but it was merged via Kees Cook's tree).

> So, the help text of this option should be revised:
>
> This option allows for unused exported symbols to be dropped from
> the build. In turn, this provides the compiler more opportunities
> (especially when using LTO) for optimizing the code and reducing
> binary size. This might have some security advantages as well.
>
> Clang LTO is opposite to your expectation.

Then Clang LTO is a misnomer. That is the option to revise not this one.

> > Each EXPORT_SYMBOL() created a
> > symbol dependency that prevented LTO from optimizing out the related
> > code even though a tiny fraction of those exported symbols were needed.
> >
> > The idea behind the recursion was to catch those cases where disabling
> > an exported symbol within a module would optimize out references to more
> > exported symbols that, in turn, could be disabled and possibly trigger
> > yet more code elimination. There is no way that can be achieved without
> > extra compiler passes in a recursive manner.
>
> I do not understand.
>
> Modules are relocatable ELF.
> Clang LTO cannot eliminate any code.
> GCC LTO does not work with relocatable ELF
> in the first place.

I don't think I follow you here. What does relocatable ELF have to do with LTO?

I've successfully used gcc LTO on the kernel quite a while ago.

For a reference about binary size reduction with LTO and
CONFIG_TRIM_UNUSED_KSYMS please read this article:

https://lwn.net/Articles/746780/


Nicolas

2021-02-25 20:10:25

by Masahiro Yamada

Subject: Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

On Fri, Feb 26, 2021 at 3:47 AM kernel test robot <[email protected]> wrote:
>
> Hi Masahiro,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [also build test WARNING on next-20210225]
> [cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
> config: powerpc-mpc8313_rdb_defconfig (attached as .config)
> compiler: powerpc-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> git checkout 014940331790a8cd9bee92c7201494ec3217201e
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All warnings (new ones prefixed by >>):
>
> >> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]

Thanks. This should be #ifdef, of course.



> 7 | #if CONFIG_TRIM_UNUSED_KSYMS
> | ^~~~~~~~~~~~~~~~~~~~~~~~
>
>
> vim +/CONFIG_TRIM_UNUSED_KSYMS +7 scripts/module.lds.S
>
> > 7 #if CONFIG_TRIM_UNUSED_KSYMS
> 8 #include <generated/keep-ksyms.h>
> 9
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/[email protected]



--
Best Regards
Masahiro Yamada

2021-02-25 20:53:02

by Nicolas Pitre

Subject: Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

On Fri, 26 Feb 2021, Masahiro Yamada wrote:

>
> Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> about the build speed.
>
> I re-implemented this feature, and the build time cost is now
> almost unnoticeable level.
>
> I hope this makes Linus happy.

:-)

I'm surprised to see that Linus is using this feature. When disabled
(the default) this should have had no impact on the build time.

This feature provides a nice security advantage by significantly
reducing the kernel input surface. And people are using that also to
better control what third party vendor can and cannot do with a distro kernel,
etc. But that's not the reason why I implemented this feature in the
first place.

My primary goal was to efficiently reduce the kernel binary size using
LTO even with kernel modules enabled. Each EXPORT_SYMBOL() created a
symbol dependency that prevented LTO from optimizing out the related
code even though a tiny fraction of those exported symbols were needed.

The idea behind the recursion was to catch those cases where disabling
an exported symbol within a module would optimize out references to more
exported symbols that, in turn, could be disabled and possibly trigger
yet more code elimination. There is no way that can be achieved without
extra compiler passes in a recursive manner.


Nicolas
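
The cascade described above can be pictured with a toy fixed-point loop. This is only a sketch under invented assumptions: the exports a, b, c, d, the root reference to a, and the refs() table are hypothetical, and a real build would recompile between passes rather than consult a table.

```shell
#!/bin/sh
# Toy model of the multi-pass trimming; the symbol graph is invented.
exports="a b c d"
roots="a"	# exports referenced by code that is always live

# refs SYM: exported symbols referenced by the code behind export SYM.
refs() {
	case $1 in
	a) echo "b" ;;
	c) echo "d" ;;
	*) echo "" ;;
	esac
}

kept=$exports
passes=0
while :; do
	passes=$((passes + 1))
	# Everything still referenced if only the currently kept code survives.
	used=$roots
	for s in $kept; do used="$used $(refs "$s")"; done
	# Keep only the exports that are still referenced.
	next=
	for s in $kept; do
		case " $used " in *" $s "*) next="$next $s" ;; esac
	done
	next=${next# }
	# Stop once a pass removes nothing more.
	[ "$next" = "$kept" ] && break
	kept=$next
done
echo "kept: $kept (after $passes passes)"
```

In this model the first pass can only drop c, because d is still referenced by c's code; d becomes removable in the second pass, and a third pass confirms that nothing else changes.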

2021-02-25 21:25:05

by Sami Tolvanen

Subject: Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

Hi Masahiro,

On Thu, Feb 25, 2021 at 12:07 PM Masahiro Yamada <[email protected]> wrote:
>
> On Fri, Feb 26, 2021 at 3:47 AM kernel test robot <[email protected]> wrote:
> >
> > Hi Masahiro,
> >
> > I love your patch! Perhaps something to improve:
> >
> > [auto build test WARNING on linus/master]
> > [also build test WARNING on next-20210225]
> > [cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> >
> > url: https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
> > config: powerpc-mpc8313_rdb_defconfig (attached as .config)
> > compiler: powerpc-linux-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
> > git remote add linux-review https://github.com/0day-ci/linux
> > git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > git checkout 014940331790a8cd9bee92c7201494ec3217201e
> > # save the attached .config to linux build tree
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <[email protected]>
> >
> > All warnings (new ones prefixed by >>):
> >
> > >> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
>
> Thanks. This should be #ifdef, of course.
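
For context on the warning above, a minimal sketch of the difference
between the two directives. CONFIG_DEMO_TRIM is a hypothetical option
name, deliberately left undefined here, which is how Kconfig leaves a
disabled bool option.

```c
/*
 * CONFIG_DEMO_TRIM is a hypothetical option that is deliberately left
 * undefined, the way Kconfig leaves a disabled bool option.
 */

/* #if on an undefined macro is legal C and evaluates to 0, but -Wundef warns. */
int demo_trim_via_if(void)
{
#if CONFIG_DEMO_TRIM
	return 1;
#else
	return 0;
#endif
}

/* #ifdef only asks whether the macro exists, so -Wundef stays quiet. */
int demo_trim_via_ifdef(void)
{
#ifdef CONFIG_DEMO_TRIM
	return 1;
#else
	return 0;
#endif
}
```

Both functions return the same value; the only difference is whether a
W=1 build (which enables -Wundef, per the report above) complains.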

I applied this series and changed these from #if to #ifdef, but I
still see the following build error with TRIM_UNUSED_KSYMS +
OF_UNITTEST:

In file included from drivers/of/unittest-data/testcases.dtb.S:1:
../include/asm-generic/vmlinux.lds.h:54:10: fatal error: 'generated/keep-ksyms.h' file not found
#include <generated/keep-ksyms.h>
         ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

This is with x86_64_defconfig and scripts/config -e OF -e OF_UNITTEST
-e TRIM_UNUSED_KSYMS.

Sami

2021-02-26 07:09:52

by Masahiro Yamada

Subject: Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

On Fri, Feb 26, 2021 at 6:20 AM Sami Tolvanen <[email protected]> wrote:
>
> Hi Masahiro,
>
> On Thu, Feb 25, 2021 at 12:07 PM Masahiro Yamada <[email protected]> wrote:
> >
> > On Fri, Feb 26, 2021 at 3:47 AM kernel test robot <[email protected]> wrote:
> > >
> > > Hi Masahiro,
> > >
> > > I love your patch! Perhaps something to improve:
> > >
> > > [auto build test WARNING on linus/master]
> > > [also build test WARNING on next-20210225]
> > > [cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
> > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > And when submitting patch, we suggest to use '--base' as documented in
> > > https://git-scm.com/docs/git-format-patch]
> > >
> > > url: https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > > base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
> > > config: powerpc-mpc8313_rdb_defconfig (attached as .config)
> > > compiler: powerpc-linux-gcc (GCC) 9.3.0
> > > reproduce (this is a W=1 build):
> > > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > chmod +x ~/bin/make.cross
> > > # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
> > > git remote add linux-review https://github.com/0day-ci/linux
> > > git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > > git checkout 014940331790a8cd9bee92c7201494ec3217201e
> > > # save the attached .config to linux build tree
> > > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <[email protected]>
> > >
> > > All warnings (new ones prefixed by >>):
> > >
> > > >> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
> >
> > Thanks. This should be #ifdef, of course.
>
> I applied this series and changed these from #if to #ifdef, but I
> still see the following build error with TRIM_UNUSED_KSYMS +
> OF_UNITTEST:
>
> In file included from drivers/of/unittest-data/testcases.dtb.S:1:
> ../include/asm-generic/vmlinux.lds.h:54:10: fatal error:
> 'generated/keep-ksyms.h' file not found
> #include <generated/keep-ksyms.h>
> ^~~~~~~~~~~~~~~~~~~~~~~~
> 1 error generated.
>
> This is with x86_64_defconfig and scripts/config -e OF -e OF_UNITTEST
> -e TRIM_UNUSED_KSYMS.
>
> Sami

Thanks. I will fix it.
I will come back with v2
probably after v5.12-rc1 is tagged.





--
Best Regards
Masahiro Yamada

2021-03-09 07:33:02

by Masahiro Yamada

Subject: Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

On Fri, Feb 26, 2021 at 4:24 AM Nicolas Pitre <[email protected]> wrote:
>
> On Fri, 26 Feb 2021, Masahiro Yamada wrote:
>
> > On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre <[email protected]> wrote:
> > >
> > > On Fri, 26 Feb 2021, Masahiro Yamada wrote:
> > >
> > > >
> > > > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > > > about the build speed.
> > > >
> > > > I re-implemented this feature, and the build time cost is now
> > > > almost unnoticeable level.
> > > >
> > > > I hope this makes Linus happy.
> > >
> > > :-)
> > >
> > > I'm surprised to see that Linus is using this feature. When disabled
> > > (the default) this should have had no impact on the build time.
> >
> > Linus is not using this feature, but does build tests.
> > After pulling the module subsystem pull request in this merge window,
> > CONFIG_TRIM_UNUSED_KSYMS was enabled by allmodconfig.
>
> If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
> That comes with the feature.


This patch set intends to change this.
TRIM_UNUSED_KSYMS will build without additional cost,
like LD_DEAD_CODE_DATA_ELIMINATION.



>
> > > This feature provides a nice security advantage by significantly
> > > reducing the kernel input surface. And people are also using it to
> > > better control what third-party vendors can and cannot do with a
> > > distro kernel, etc. But that's not the reason why I implemented
> > > this feature in the first place.
> > >
> > > My primary goal was to efficiently reduce the kernel binary size using
> > > LTO even with kernel modules enabled.
> >
> >
> > Clang LTO landed in this MW.
> >
> > Do you think it will reduce the kernel binary size?
> > No, opposite.
>
> LTO ought to reduce binary size. It is rather broken otherwise.
> Having a global view before optimizing allows for the compiler to do
> project wide constant propagation and dead code elimination.
>
> > CONFIG_LTO_CLANG cannot trim any code even if it
> > is obviously unused.
> > Hence, it never reduces the kernel binary size.
> > Rather, it produces a bigger kernel.
>
> Then what's the point?


Presumably, reducing the size is not
the main interest for Googlers.


>
> > The reason is Clang LTO was implemented against
> > relocatable ELF (vmlinux.o) .
>
> That's not true LTO then.


This is the same as what I said in the review process.
:-)

https://lore.kernel.org/linux-kbuild/CAK7LNASQPOGohtUyzBM6n54pzpLN35kDXC7VbvWzX8QWUmqq9g@mail.gmail.com/




>
> > I pointed out this flaw in the review process, but
> > it was dismissed.
> >
> > This is the main reason why I did not give any Ack
> > (but it was merged via Kees Cook's tree).
>
> > So, the help text of this option should be revised:
> >
> > This option allows for unused exported symbols to be dropped from
> > the build. In turn, this provides the compiler more opportunities
> > (especially when using LTO) for optimizing the code and reducing
> > binary size. This might have some security advantages as well.
> >
> > Clang LTO is opposite to your expectation.
>
> Then Clang LTO is a misnomer. That is the option to revise not this one.
>
> > > Each EXPORT_SYMBOL() created a
> > > symbol dependency that prevented LTO from optimizing out the related
> > > code even though a tiny fraction of those exported symbols were needed.
> > >
> > > The idea behind the recursion was to catch those cases where disabling
> > > an exported symbol within a module would optimize out references to more
> > > exported symbols that, in turn, could be disabled and possibly trigger
> > > yet more code elimination. There is no way that can be achieved without
> > > extra compiler passes in a recursive manner.
> >
> > I do not understand.
> >
> > Modules are relocatable ELF.
> > Clang LTO cannot eliminate any code.
> > GCC LTO does not work with relocatable ELF
> > in the first place.
>
> I don't think I follow you here. What does relocatable ELF have to do with LTO?



What is important is that
GCC LTO is a feature of gcc, not binutils.
That is, LD_FINAL is $(CC).

GCC LTO can be implemented for the final link stage
by using $(CC) as the linker driver.
Then, it can determine which code is unreachable.
In other words, GCC LTO works only when building
the final executable.


On the other hand, a relocatable ELF is created
by $(LD) -r by combining some objects together.
The relocatable ELF can be fed to another $(LD) -r,
or the final link stage.


vmlinux is an executable ELF.
modules (*.ko files) are relocatable ELFs.


You can confirm it easily
by using the 'file' command.

masahiro@oscar:~/ref/linux$ file vmlinux
vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=ee0cef2ff3d9f490e0f5ee1d7e74b19aa167933b, not stripped
masahiro@oscar:~/ref/linux$ file net/ipv4/netfilter/iptable_nat.ko
net/ipv4/netfilter/iptable_nat.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), BuildID[sha1]=4829e82f9b9e7fd65be3c19c1cf0e16a7ddf0967, not stripped
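
The same distinction can be read straight from the ELF header, which is
roughly what 'file' reports. A minimal sketch, assuming only the
standard ELF layout (e_type is a 16-bit field at byte offset 16, with
byte order given by e_ident[5]); the demo_* names are hypothetical
helpers, not an existing API.

```c
#include <stddef.h>
#include <stdio.h>

/* e_type values from the ELF specification. */
enum { DEMO_ET_REL = 1, DEMO_ET_EXEC = 2, DEMO_ET_DYN = 3 };

/*
 * Return e_type from the first bytes of an ELF file, or -1 if the
 * buffer is too short or does not look like an ELF header.
 */
int demo_elf_type(const unsigned char *hdr, size_t len)
{
	if (len < 18)
		return -1;
	if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
		return -1;
	if (hdr[5] == 1)	/* ELFDATA2LSB: little-endian */
		return hdr[16] | (hdr[17] << 8);
	if (hdr[5] == 2)	/* ELFDATA2MSB: big-endian */
		return (hdr[16] << 8) | hdr[17];
	return -1;
}

/* Convenience wrapper: read the header bytes from a file on disk. */
int demo_elf_type_of_file(const char *path)
{
	unsigned char hdr[18];
	FILE *f = fopen(path, "rb");
	size_t got;

	if (!f)
		return -1;
	got = fread(hdr, 1, sizeof(hdr), f);
	fclose(f);
	return demo_elf_type(hdr, got);
}
```

A file that 'file' describes as "executable" should yield DEMO_ET_EXEC,
and one it describes as "relocatable" should yield DEMO_ET_REL.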



Modules are not filled with addresses yet
since we do not know which memory address
the module will be loaded to.
The addresses are resolved at modprobe time.

As I said above, modules are created by $(LD) -r.
It is not possible to implement GCC LTO for modules.



In contrast, Clang LTO is a feature of $(LD).
So, it can be implemented not only for executable ELFs,
but also for relocatable ELFs.
The problem is that Clang LTO cannot determine which code is
unreachable if it is implemented for a relocatable ELF,
since that is not a final image.

Did I answer your question?





> I've successfully used gcc LTO on the kernel quite a while ago.
>
> For a reference about binary size reduction with LTO and
> CONFIG_TRIM_UNUSED_KSYMS please read this article:
>
> https://lwn.net/Articles/746780/


Thanks for the great articles.

Just out of curiosity, I think you used GCC LTO from
Andy's GitHub.


In the article, you took stm32_defconfig as an example,
but ARM does not select ARCH_SUPPORTS_LTO.

Did you add some local hacks to make LTO work
for ARM?

I tried the lto-5.8.1 branch, but
I did not even succeed in building x86 + LTO.






>
> Nicolas



--
Best Regards
Masahiro Yamada

2021-03-09 16:51:24

by Nicolas Pitre

Subject: Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

On Tue, 9 Mar 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 4:24 AM Nicolas Pitre <[email protected]> wrote:
> >
> > If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
> > That comes with the feature.
>
> This patch set intends to change this.
> TRIM_UNUSED_KSYMS will build without additional cost,
> like LD_DEAD_CODE_DATA_ELIMINATION.

OK... I do see how you're going about it.

> > > Modules are relocatable ELF.
> > > Clang LTO cannot eliminate any code.
> > > GCC LTO does not work with relocatable ELF
> > > in the first place.
> >
> > I don't think I follow you here. What does relocatable ELF have to do with LTO?
>
> What is important is that
> GCC LTO is a feature of gcc, not binutils.
> That is, LD_FINAL is $(CC).

Exactly.

> GCC LTO can be implemented for the final link stage
> by using $(CC) as the linker driver.
> Then, it can determine which code is unreachable.
> In other words, GCC LTO works only when building
> the final executable.

Yes. And it does so by filling .o files with its intermediate code
representation and not ELF code.

> On the other hand, a relocatable ELF is created
> by $(LD) -r by combining some objects together.
> The relocatable ELF can be fed to another $(LD) -r,
> or the final link stage.

You still can create relocatable ELF using LTO. But LTO stops there.
From that point on, .o files will no longer contain data that LTO can
use if you further combine those object files together. But until that
point, LTO is still usable.

> As I said above, modules are created by $(LD) -r.
> It is not possible to implement GCC LTO for modules.

If I remember correctly (that was a while ago) the problem with LTO and
the kernel had to do with the fact that every subdirectory was gathering
object files in built-in.o using ld -r. At some point we switched to
gathering object files into built-in.a files where no linking is taking
place. The real linking happens in vmlinux.o where LTO may now do its
magic.

The same is true for modules. Compiling foo_module.c into foo_module.o
will create a .o file with LTO data rather than executable code. But
when you create the final .o for the module, LTO takes place and
produces the relocatable ELF executable.

> > I've successfully used gcc LTO on the kernel quite a while ago.
> >
> > For a reference about binary size reduction with LTO and
> > CONFIG_TRIM_UNUSED_KSYMS please read this article:
> >
> > https://lwn.net/Articles/746780/
>
> Thanks for the great articles.
>
> Just out of curiosity, I think you used GCC LTO from
> Andy's GitHub.

Right. I provided the reference in the preceding article:
https://lwn.net/Articles/744507/

> In the article, you took stm32_defconfig as an example,
> but ARM does not select ARCH_SUPPORTS_LTO.
>
> Did you add some local hacks to make LTO work
> for ARM?

Of course. This article was written in 2017 and no LTO support at all
was in mainline back then. But, besides adding CONFIG_LTO, very little
was needed to make it compile, and I did upstream most changes such as
commit 75fea300d7, commit a85b2257a5, commit 5d48417592, commit
19c233b79d, etc.

> I tried the lto-5.8.1 branch, but
> I did not even succeed in building x86 + LTO.

My latest working LTO branch (i.e. last time I worked on it) is much
older than that.

Maybe people aren't very excited about LTO because it makes recompiling
the kernel take many times longer: gcc redoes its optimization passes on
the whole kernel even if you modify a single file.


Nicolas