2020-02-22 00:40:50

by Alex Xu (Hello71)

Subject: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

-pipe reduces unnecessary disk wear for systems where /tmp is not a
tmpfs, slightly increases compilation speed, and avoids leaving behind
files when gcc crashes.

According to the gcc manual, "this fails to work on some systems where
the assembler is unable to read from a pipe; but the GNU assembler has
no trouble". We already require GNU ld on all platforms, so this is not
an additional dependency. LLVM as also supports pipes.

-pipe has always been used for most architectures; this change
standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
affected.

Signed-off-by: Alex Xu (Hello71) <[email protected]>
---
Makefile | 2 +-
arch/alpha/Makefile | 2 +-
arch/arc/Makefile | 2 +-
arch/arm/Makefile | 1 -
arch/csky/Makefile | 1 -
arch/ia64/Makefile | 2 +-
arch/m68k/Makefile | 2 +-
arch/mips/Makefile | 2 +-
arch/nios2/Makefile | 2 +-
arch/openrisc/Makefile | 2 +-
arch/parisc/Makefile | 2 +-
arch/powerpc/Makefile | 2 +-
arch/s390/Makefile | 2 +-
arch/sh/Makefile | 2 +-
arch/sparc/Makefile | 4 ++--
arch/xtensa/Makefile | 2 +-
16 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/Makefile b/Makefile
index aab38cb02b24..782c12267151 100644
--- a/Makefile
+++ b/Makefile
@@ -459,7 +459,7 @@ KBUILD_CFLAGS := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
-Werror=implicit-function-declaration -Werror=implicit-int \
-Wno-format-security \
- -std=gnu89
+ -std=gnu89 -pipe
KBUILD_CPPFLAGS := -D__KERNEL__
KBUILD_AFLAGS_KERNEL :=
KBUILD_CFLAGS_KERNEL :=
diff --git a/arch/alpha/Makefile b/arch/alpha/Makefile
index 12dee59b011c..b40a9be72d9b 100644
--- a/arch/alpha/Makefile
+++ b/arch/alpha/Makefile
@@ -12,7 +12,7 @@ NM := $(NM) -B

LDFLAGS_vmlinux := -static -N #-relax
CHECKFLAGS += -D__alpha__
-cflags-y := -pipe -mno-fp-regs -ffixed-8
+cflags-y := -mno-fp-regs -ffixed-8
cflags-y += $(call cc-option, -fno-jump-tables)

cpuflags-$(CONFIG_ALPHA_EV4) := -mcpu=ev4
diff --git a/arch/arc/Makefile b/arch/arc/Makefile
index 20e9ab6cc521..b6a2f553771c 100644
--- a/arch/arc/Makefile
+++ b/arch/arc/Makefile
@@ -9,7 +9,7 @@ ifeq ($(CROSS_COMPILE),)
CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)
endif

-cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
+cflags-y += -fno-common -fno-builtin -mmedium-calls -D__linux__
cflags-$(CONFIG_ISA_ARCOMPACT) += -mA7
cflags-$(CONFIG_ISA_ARCV2) += -mcpu=hs38

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index db857d07114f..7711467e0797 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -21,7 +21,6 @@ KBUILD_LDS_MODULE += $(srctree)/arch/arm/kernel/module.lds
endif

GZFLAGS :=-9
-#KBUILD_CFLAGS +=-pipe

# Never generate .eh_frame
KBUILD_CFLAGS += $(call cc-option,-fno-dwarf2-cfi-asm)
diff --git a/arch/csky/Makefile b/arch/csky/Makefile
index fb1bbbd91954..3ba8d936122c 100644
--- a/arch/csky/Makefile
+++ b/arch/csky/Makefile
@@ -42,7 +42,6 @@ KBUILD_CFLAGS += -msoft-float -mdiv
KBUILD_CFLAGS += -fno-tree-vectorize
endif

-KBUILD_CFLAGS += -pipe
ifeq ($(CSKYABI),abiv2)
KBUILD_CFLAGS += -mno-stack-size
endif
diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 32240000dc0c..554dc20873d8 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -24,7 +24,7 @@ KBUILD_LDS_MODULE += $(srctree)/arch/ia64/module.lds
KBUILD_AFLAGS_KERNEL := -mconstant-gp
EXTRA :=

-cflags-y := -pipe $(EXTRA) -ffixed-r13 -mfixed-range=f12-f15,f32-f127 \
+cflags-y := $(EXTRA) -ffixed-r13 -mfixed-range=f12-f15,f32-f127 \
-falign-functions=32 -frename-registers -fno-optimize-sibling-calls
KBUILD_CFLAGS_KERNEL := -mconstant-gp

diff --git a/arch/m68k/Makefile b/arch/m68k/Makefile
index 5d9288384096..99a226bbd06c 100644
--- a/arch/m68k/Makefile
+++ b/arch/m68k/Makefile
@@ -60,7 +60,7 @@ cpuflags-$(CONFIG_M5206) := $(call cc-option,-mcpu=5206,-m5200)
KBUILD_AFLAGS += $(cpuflags-y)
KBUILD_CFLAGS += $(cpuflags-y)

-KBUILD_CFLAGS += -pipe -ffreestanding
+KBUILD_CFLAGS += -ffreestanding

ifdef CONFIG_MMU
# without -fno-strength-reduce the 53c7xx.c driver fails ;-(
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index e1c44aed8156..05eb77328a13 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -95,7 +95,7 @@ all-$(CONFIG_SYS_SUPPORTS_ZBOOT)+= vmlinuz
# machines may also. Since BFD is incredibly buggy with respect to
# crossformat linking we rely on the elf2ecoff tool for format conversion.
#
-cflags-y += -G 0 -mno-abicalls -fno-pic -pipe
+cflags-y += -G 0 -mno-abicalls -fno-pic
cflags-y += -msoft-float
LDFLAGS_vmlinux += -G 0 -static -n -nostdlib
KBUILD_AFLAGS_MODULE += -mlong-calls
diff --git a/arch/nios2/Makefile b/arch/nios2/Makefile
index 52c03e60b114..3205cb5fd143 100644
--- a/arch/nios2/Makefile
+++ b/arch/nios2/Makefile
@@ -24,7 +24,7 @@ LIBGCC := $(shell $(CC) $(KBUILD_CFLAGS) $(KCFLAGS) -print-libgcc-file-n

KBUILD_AFLAGS += -march=r$(CONFIG_NIOS2_ARCH_REVISION)

-KBUILD_CFLAGS += -pipe -D__linux__ -D__ELF__
+KBUILD_CFLAGS += -D__linux__ -D__ELF__
KBUILD_CFLAGS += -march=r$(CONFIG_NIOS2_ARCH_REVISION)
KBUILD_CFLAGS += $(if $(CONFIG_NIOS2_HW_MUL_SUPPORT),-mhw-mul,-mno-hw-mul)
KBUILD_CFLAGS += $(if $(CONFIG_NIOS2_HW_MULX_SUPPORT),-mhw-mulx,-mno-hw-mulx)
diff --git a/arch/openrisc/Makefile b/arch/openrisc/Makefile
index bf10141c7426..86075fc673d9 100644
--- a/arch/openrisc/Makefile
+++ b/arch/openrisc/Makefile
@@ -22,7 +22,7 @@ KBUILD_DEFCONFIG := or1ksim_defconfig
OBJCOPYFLAGS := -O binary -R .note -R .comment -S
LIBGCC := $(shell $(CC) $(KBUILD_CFLAGS) -print-libgcc-file-name)

-KBUILD_CFLAGS += -pipe -ffixed-r10 -D__linux__
+KBUILD_CFLAGS += -ffixed-r10 -D__linux__

ifeq ($(CONFIG_OPENRISC_HAVE_INST_MUL),y)
KBUILD_CFLAGS += $(call cc-option,-mhard-mul)
diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index dca8f2de8cf5..88bee828400d 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -64,7 +64,7 @@ endif

OBJCOPY_FLAGS =-O binary -R .note -R .comment -S

-cflags-y := -pipe
+cflags-y :=

# These flags should be implied by an hppa-linux configuration, but they
# are not in gcc 3.2.
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index f35730548e42..0550b976157c 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -215,7 +215,7 @@ asinstr := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)
KBUILD_CPPFLAGS += -I $(srctree)/arch/$(ARCH) $(asinstr)
KBUILD_AFLAGS += $(AFLAGS-y)
KBUILD_CFLAGS += $(call cc-option,-msoft-float)
-KBUILD_CFLAGS += -pipe $(CFLAGS-y)
+KBUILD_CFLAGS += $(CFLAGS-y)
CPP = $(CC) -E $(KBUILD_CFLAGS)

CHECKFLAGS += -m$(BITS) -D__powerpc__ -D__powerpc$(BITS)__
diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index e0e3a465bbfd..3ca3e3a29864 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -118,7 +118,7 @@ endif
cfi := $(call as-instr,.cfi_startproc\n.cfi_val_offset 15$(comma)-160\n.cfi_endproc,-DCONFIG_AS_CFI_VAL_OFFSET=1)

KBUILD_CFLAGS += -mbackchain -msoft-float $(cflags-y)
-KBUILD_CFLAGS += -pipe -Wno-sign-compare
+KBUILD_CFLAGS += -Wno-sign-compare
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables $(cfi)
KBUILD_AFLAGS += $(aflags-y) $(cfi)
export KBUILD_AFLAGS_DECOMPRESSOR
diff --git a/arch/sh/Makefile b/arch/sh/Makefile
index b4a86f27e048..2e224b2a436b 100644
--- a/arch/sh/Makefile
+++ b/arch/sh/Makefile
@@ -194,7 +194,7 @@ drivers-$(CONFIG_OPROFILE) += arch/sh/oprofile/
cflags-y += $(foreach d, $(cpuincdir-y), -I $(srctree)/arch/sh/include/$(d)) \
$(foreach d, $(machdir-y), -I $(srctree)/arch/sh/include/$(d))

-KBUILD_CFLAGS += -pipe $(cflags-y)
+KBUILD_CFLAGS += $(cflags-y)
KBUILD_CPPFLAGS += $(cflags-y)
KBUILD_AFLAGS += $(cflags-y)

diff --git a/arch/sparc/Makefile b/arch/sparc/Makefile
index 4a0919581697..ad30e7e668e0 100644
--- a/arch/sparc/Makefile
+++ b/arch/sparc/Makefile
@@ -29,7 +29,7 @@ UTS_MACHINE := sparc
# versions of gcc. Some gcc versions won't pass -Av8 to binutils when you
# give -mcpu=v8. This silently worked with older bintutils versions but
# does not any more.
-KBUILD_CFLAGS += -m32 -mcpu=v8 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7
+KBUILD_CFLAGS += -m32 -mcpu=v8 -mno-fpu -fcall-used-g5 -fcall-used-g7
KBUILD_CFLAGS += -Wa,-Av8

KBUILD_AFLAGS += -m32 -Wa,-Av8
@@ -44,7 +44,7 @@ KBUILD_LDFLAGS := -m elf64_sparc
export BITS := 64
UTS_MACHINE := sparc64

-KBUILD_CFLAGS += -m64 -pipe -mno-fpu -mcpu=ultrasparc -mcmodel=medlow
+KBUILD_CFLAGS += -m64 -mno-fpu -mcpu=ultrasparc -mcmodel=medlow
KBUILD_CFLAGS += -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wno-sign-compare
KBUILD_CFLAGS += -Wa,--undeclared-regs
KBUILD_CFLAGS += $(call cc-option,-mtune=ultrasparc3)
diff --git a/arch/xtensa/Makefile b/arch/xtensa/Makefile
index 67a7d151d1e7..fdaa588c504c 100644
--- a/arch/xtensa/Makefile
+++ b/arch/xtensa/Makefile
@@ -42,7 +42,7 @@ export PLATFORM

# temporarily until string.h is fixed
KBUILD_CFLAGS += -ffreestanding -D__linux__
-KBUILD_CFLAGS += -pipe -mlongcalls -mtext-section-literals
+KBUILD_CFLAGS += -mlongcalls -mtext-section-literals
KBUILD_CFLAGS += $(call cc-option,-mforce-no-pic,)
KBUILD_CFLAGS += $(call cc-option,-mno-serialize-volatile,)

--
2.25.1


2020-02-22 02:08:29

by Masahiro Yamada

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

On Sat, Feb 22, 2020 at 9:40 AM Alex Xu (Hello71) <[email protected]> wrote:
>
> -pipe reduces unnecessary disk wear for systems where /tmp is not a
> tmpfs, slightly increases compilation speed, and avoids leaving behind
> files when gcc crashes.
>
> According to the gcc manual, "this fails to work on some systems where
> the assembler is unable to read from a pipe; but the GNU assembler has
> no trouble". We already require GNU ld on all platforms, so this is not
> an additional dependency. LLVM as also supports pipes.
>
> -pipe has always been used for most architectures, this change
> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> affected.
>
> Signed-off-by: Alex Xu (Hello71) <[email protected]>

<snip>

> diff --git a/arch/arc/Makefile b/arch/arc/Makefile
> index 20e9ab6cc521..b6a2f553771c 100644
> --- a/arch/arc/Makefile
> +++ b/arch/arc/Makefile
> @@ -9,7 +9,7 @@ ifeq ($(CROSS_COMPILE),)
> CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)
> endif
>
> -cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
> +cflags-y += -fno-common -fno-builtin -mmedium-calls -D__linux__
> cflags-$(CONFIG_ISA_ARCOMPACT) += -mA7
> cflags-$(CONFIG_ISA_ARCV2) += -mcpu=hs38
>
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index db857d07114f..7711467e0797 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -21,7 +21,6 @@ KBUILD_LDS_MODULE += $(srctree)/arch/arm/kernel/module.lds
> endif
>
> GZFLAGS :=-9
> -#KBUILD_CFLAGS +=-pipe


This was commented out by a very old commit,
which is available in the historical git tree.

https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=ce20ed858a20f6f04de475cae79e40d3697f4776

But, I could not parse the reason from the commit message.
Russell, do you remember why?


If arch maintainers are fine with this change,
I can pick it up.

Thanks.



--
Best Regards

Masahiro Yamada

2020-02-22 02:17:31

by Nathan Chancellor

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

Hi Alex,

On Fri, Feb 21, 2020 at 07:38:20PM -0500, Alex Xu (Hello71) wrote:
> -pipe reduces unnecessary disk wear for systems where /tmp is not a
> tmpfs, slightly increases compilation speed, and avoids leaving behind
> files when gcc crashes.
>
> According to the gcc manual, "this fails to work on some systems where
> the assembler is unable to read from a pipe; but the GNU assembler has
> no trouble". We already require GNU ld on all platforms, so this is not
> an additional dependency. LLVM as also supports pipes.
>
> -pipe has always been used for most architectures, this change
> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> affected.
>
> Signed-off-by: Alex Xu (Hello71) <[email protected]>

Do you have any numbers to show this is actually beneficial from a
compilation time perspective? I ask because I saw an improvement in
compilation time when removing -pipe from x86's KBUILD_CFLAGS in
commit 437e88ab8f9e ("x86/build: Remove -pipe from KBUILD_CFLAGS").

For what it's worth, clang ignores -pipe so this does not actually
matter for its integrated assembler.

That type of change could have been a fluke but I guarantee people
will care more about any change in compilation time than any of the
other things that you mention so it might be wise to check on major
architectures to make sure that it doesn't hurt.

Cheers,
Nathan

2020-02-22 04:03:08

by Alex Xu (Hello71)

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

Excerpts from Nathan Chancellor's message of February 21, 2020 9:16 pm:
> Hi Alex,
>
> On Fri, Feb 21, 2020 at 07:38:20PM -0500, Alex Xu (Hello71) wrote:
>> -pipe reduces unnecessary disk wear for systems where /tmp is not a
>> tmpfs, slightly increases compilation speed, and avoids leaving behind
>> files when gcc crashes.
>>
>> According to the gcc manual, "this fails to work on some systems where
>> the assembler is unable to read from a pipe; but the GNU assembler has
>> no trouble". We already require GNU ld on all platforms, so this is not
>> an additional dependency. LLVM as also supports pipes.
>>
>> -pipe has always been used for most architectures, this change
>> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
>> affected.
>>
>> Signed-off-by: Alex Xu (Hello71) <[email protected]>
>
> Do you have any numbers to show this is actually beneficial from a
> compilation time perspective? I ask because I saw an improvement in
> compilation time when removing -pipe from x86's KBUILD_CFLAGS in
> commit 437e88ab8f9e ("x86/build: Remove -pipe from KBUILD_CFLAGS").
>
> For what it's worth, clang ignores -pipe so this does not actually
> matter for its integrated assembler.
>
> That type of change could have been a fluke but I guarantee people
> will care more about any change in compilation time than any of the
> other things that you mention so it might be wise to check on major
> architectures to make sure that it doesn't hurt.
>
> Cheers,
> Nathan
>

Sorry, I should've checked the performance first. I have now run:

cd /tmp/linux # previously: make O=/tmp/linux
export MAKEFLAGS=12 # Ryzen 1600, 6 cores, 12 threads
make allnoconfig
for i in {1..10}; do
    make clean >/dev/null
    time make XPIPE=-pipe >/dev/null
    make clean >/dev/null
    time make >/dev/null
done

after patching -pipe to $(XPIPE) in Makefile.

Results (without ld warnings):

make > /dev/null 130.54s user 10.41s system 969% cpu 14.537 total
make XPIPE=-pipe > /dev/null 129.83s user 9.95s system 977% cpu 14.296 total
make > /dev/null 129.73s user 10.28s system 966% cpu 14.493 total
make XPIPE=-pipe > /dev/null 130.04s user 10.63s system 986% cpu 14.252 total
make > /dev/null 129.53s user 10.28s system 972% cpu 14.379 total
make XPIPE=-pipe > /dev/null 130.29s user 10.17s system 983% cpu 14.288 total
make > /dev/null 130.19s user 10.52s system 968% cpu 14.530 total
make XPIPE=-pipe > /dev/null 129.90s user 10.47s system 978% cpu 14.343 total
make > /dev/null 129.50s user 10.81s system 959% cpu 14.620 total
make XPIPE=-pipe > /dev/null 130.37s user 10.60s system 975% cpu 14.446 total
make > /dev/null 129.63s user 10.18s system 972% cpu 14.374 total
make XPIPE=-pipe > /dev/null 131.29s user 9.92s system 1016% cpu 13.899 total
make > /dev/null 129.96s user 10.39s system 961% cpu 14.596 total
make XPIPE=-pipe > /dev/null 131.63s user 10.16s system 1011% cpu 14.015 total
make > /dev/null 129.33s user 10.54s system 970% cpu 14.405 total
make XPIPE=-pipe > /dev/null 129.70s user 10.40s system 976% cpu 14.349 total
make > /dev/null 129.53s user 10.25s system 964% cpu 14.494 total
make XPIPE=-pipe > /dev/null 130.38s user 10.62s system 973% cpu 14.479 total
make > /dev/null 130.73s user 10.08s system 957% cpu 14.704 total
make XPIPE=-pipe > /dev/null 130.43s user 10.62s system 985% cpu 14.309 total
make > /dev/null 130.54s user 10.41s system 969% cpu 14.537 total

There is a fair bit of variance, probably due to cpufreq, schedutil, CPU
temperature, CPU scheduler, motherboard power delivery, etc. But, I
think it can be clearly seen that -pipe is, on average, about 0.1 to 0.2
seconds faster.
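
The averages can be checked with a short script. This is only a minimal
sketch, assuming the twenty wall-clock totals listed above are copied in by
hand (the repeated final line is left out); the `mean` helper and the
variable names are illustrative and were not part of the original run:

```shell
#!/bin/sh
# Wall-clock totals in seconds, copied from the allnoconfig results above.
# The 21st line repeats the first one and is therefore left out.
plain="14.537 14.493 14.379 14.530 14.620 14.374 14.596 14.405 14.494 14.704"
piped="14.296 14.252 14.288 14.343 14.446 13.899 14.015 14.349 14.479 14.309"

# Print the arithmetic mean of the arguments, rounded to three decimals.
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.3f\n", s / NR }'
}

# The variables are deliberately unquoted so that each number becomes
# its own argument.
plain_mean=$(mean $plain)
piped_mean=$(mean $piped)

echo "without -pipe: ${plain_mean} s"
echo "with -pipe:    ${piped_mean} s"
```

For these figures the means come to about 14.51 s without -pipe and about
14.27 s with it, a gap of roughly 0.25 s, which is the same order as the
run-to-run spread.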

I also tried "make defconfig":

make > /dev/null 1238.26s user 102.39s system 1095% cpu 2:02.33 total
make XPIPE=-pipe > /dev/null 1231.33s user 102.52s system 1081% cpu 2:03.29 total
make > /dev/null 1232.92s user 102.07s system 1096% cpu 2:01.71 total
make XPIPE=-pipe > /dev/null 1239.59s user 102.30s system 1096% cpu 2:02.39 total
make > /dev/null 1229.81s user 101.72s system 1093% cpu 2:01.74 total
make XPIPE=-pipe > /dev/null 1234.64s user 101.30s system 1098% cpu 2:01.64 total
make > /dev/null 1228.50s user 104.39s system 1093% cpu 2:01.91 total
make XPIPE=-pipe > /dev/null 1238.78s user 102.57s system 1099% cpu 2:01.99 total
make > /dev/null 1238.26s user 102.39s system 1095% cpu 2:02.33 total

I stopped after this because I needed to use the machine for other
tasks. The results are less clear, but I think there's not a big
difference one way or another, at least on my machine.

CPU: Ryzen 1600, overclocked to ~3.8 GHz
RAM: Corsair Vengeance, overclocked to ~3300 MHz, forgot timings
Motherboard: ASRock B450 Pro4

I would speculate that the recent pipe changes have caused a change in
the relative speed compared to 2018. I am using 5.6.0-rc2 with -O3
-march=native patches.

Regards,
Alex.

2020-02-22 08:02:00

by Nathan Chancellor

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

On Fri, Feb 21, 2020 at 11:01:24PM -0500, Alex Xu (Hello71) wrote:
> Excerpts from Nathan Chancellor's message of February 21, 2020 9:16 pm:
> > Hi Alex,
> >
> > On Fri, Feb 21, 2020 at 07:38:20PM -0500, Alex Xu (Hello71) wrote:
> >> -pipe reduces unnecessary disk wear for systems where /tmp is not a
> >> tmpfs, slightly increases compilation speed, and avoids leaving behind
> >> files when gcc crashes.
> >>
> >> According to the gcc manual, "this fails to work on some systems where
> >> the assembler is unable to read from a pipe; but the GNU assembler has
> >> no trouble". We already require GNU ld on all platforms, so this is not
> >> an additional dependency. LLVM as also supports pipes.
> >>
> >> -pipe has always been used for most architectures, this change
> >> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> >> affected.
> >>
> >> Signed-off-by: Alex Xu (Hello71) <[email protected]>
> >
> > Do you have any numbers to show this is actually beneficial from a
> > compilation time perspective? I ask because I saw an improvement in
> > compilation time when removing -pipe from x86's KBUILD_CFLAGS in
> > commit 437e88ab8f9e ("x86/build: Remove -pipe from KBUILD_CFLAGS").
> >
> > For what it's worth, clang ignores -pipe so this does not actually
> > matter for its integrated assembler.
> >
> > That type of change could have been a fluke but I guarantee people
> > will care more about any change in compilation time than any of the
> > other things that you mention so it might be wise to check on major
> > architectures to make sure that it doesn't hurt.
> >
> > Cheers,
> > Nathan
> >
>
> Sorry, I should've checked the performance first. I have now run:
>
> cd /tmp/linux # previously: make O=/tmp/linux
> export MAKEFLAGS=12 # Ryzen 1600, 6 cores, 12 threads
> make allnoconfig
> for i in {1..10}; do
> make clean >/dev/null
> time make XPIPE=-pipe >/dev/null
> make clean >/dev/null
> time make >/dev/null
> done
>
> after patching -pipe to $(XPIPE) in Makefile.
>
> Results (without ld warnings):
>
> make > /dev/null 130.54s user 10.41s system 969% cpu 14.537 total
> make XPIPE=-pipe > /dev/null 129.83s user 9.95s system 977% cpu 14.296 total
> make > /dev/null 129.73s user 10.28s system 966% cpu 14.493 total
> make XPIPE=-pipe > /dev/null 130.04s user 10.63s system 986% cpu 14.252 total
> make > /dev/null 129.53s user 10.28s system 972% cpu 14.379 total
> make XPIPE=-pipe > /dev/null 130.29s user 10.17s system 983% cpu 14.288 total
> make > /dev/null 130.19s user 10.52s system 968% cpu 14.530 total
> make XPIPE=-pipe > /dev/null 129.90s user 10.47s system 978% cpu 14.343 total
> make > /dev/null 129.50s user 10.81s system 959% cpu 14.620 total
> make XPIPE=-pipe > /dev/null 130.37s user 10.60s system 975% cpu 14.446 total
> make > /dev/null 129.63s user 10.18s system 972% cpu 14.374 total
> make XPIPE=-pipe > /dev/null 131.29s user 9.92s system 1016% cpu 13.899 total
> make > /dev/null 129.96s user 10.39s system 961% cpu 14.596 total
> make XPIPE=-pipe > /dev/null 131.63s user 10.16s system 1011% cpu 14.015 total
> make > /dev/null 129.33s user 10.54s system 970% cpu 14.405 total
> make XPIPE=-pipe > /dev/null 129.70s user 10.40s system 976% cpu 14.349 total
> make > /dev/null 129.53s user 10.25s system 964% cpu 14.494 total
> make XPIPE=-pipe > /dev/null 130.38s user 10.62s system 973% cpu 14.479 total
> make > /dev/null 130.73s user 10.08s system 957% cpu 14.704 total
> make XPIPE=-pipe > /dev/null 130.43s user 10.62s system 985% cpu 14.309 total
> make > /dev/null 130.54s user 10.41s system 969% cpu 14.537 total
>
> There is a fair bit of variance, probably due to cpufreq, schedutil, CPU
> temperature, CPU scheduler, motherboard power delivery, etc. But, I
> think it can be clearly seen that -pipe is, on average, about 0.1 to 0.2
> seconds faster.
>
> I also tried "make defconfig":
>
> make > /dev/null 1238.26s user 102.39s system 1095% cpu 2:02.33 total
> make XPIPE=-pipe > /dev/null 1231.33s user 102.52s system 1081% cpu 2:03.29 total
> make > /dev/null 1232.92s user 102.07s system 1096% cpu 2:01.71 total
> make XPIPE=-pipe > /dev/null 1239.59s user 102.30s system 1096% cpu 2:02.39 total
> make > /dev/null 1229.81s user 101.72s system 1093% cpu 2:01.74 total
> make XPIPE=-pipe > /dev/null 1234.64s user 101.30s system 1098% cpu 2:01.64 total
> make > /dev/null 1228.50s user 104.39s system 1093% cpu 2:01.91 total
> make XPIPE=-pipe > /dev/null 1238.78s user 102.57s system 1099% cpu 2:01.99 total
> make > /dev/null 1238.26s user 102.39s system 1095% cpu 2:02.33 total
>
> I stopped after this because I needed to use the machine for other
> tasks. The results are less clear, but I think there's not a big
> difference one way or another, at least on my machine.
>
> CPU: Ryzen 1600, overclocked to ~3.8 GHz
> RAM: Corsair Vengeance, overclocked to ~3300 MHz, forgot timings
> Motherboard: ASRock B450 Pro4
>
> I would speculate that the recent pipe changes have caused a change in
> the relative speed compared to 2018. I am using 5.6.0-rc2 with -O3
> -march=native patches.
>
> Regards,
> Alex.

I used hyperfine [1] to run a quick benchmark with a freshly built
GCC 9.2.0 for x86 and aarch64 and here are the results:

$ hyperfine -w 1 -r 25 \
-p 'rm -rf out.x86_64' \
'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- O=out.x86_64 defconfig all' \
'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- KCFLAGS=-pipe O=out.x86_64 defconfig all'

Benchmark #1: make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- O=out.x86_64 defconfig all
Time (mean ± σ): 68.535 s ± 0.275 s [User: 2241.681 s, System: 185.454 s]
Range (min … max): 67.855 s … 68.953 s 25 runs

Benchmark #2: make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- KCFLAGS=-pipe O=out.x86_64 defconfig all
Time (mean ± σ): 68.922 s ± 0.095 s [User: 2264.168 s, System: 190.297 s]
Range (min … max): 68.781 s … 69.126 s 25 runs

Summary
'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- O=out.x86_64 defconfig all' ran
1.01 ± 0.00 times faster than 'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- KCFLAGS=-pipe O=out.x86_64 defconfig all'

$ hyperfine -w 1 -r 25 \
-p 'rm -rf out.aarch64' \
'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- O=out.aarch64 defconfig all' \
'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- KCFLAGS=-pipe O=out.aarch64 defconfig all'

Benchmark #1: make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- O=out.aarch64 defconfig all
Time (mean ± σ): 166.732 s ± 0.594 s [User: 5654.780 s, System: 475.493 s]
Range (min … max): 165.873 s … 167.859 s 25 runs

Benchmark #2: make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- KCFLAGS=-pipe O=out.aarch64 defconfig all
Time (mean ± σ): 168.047 s ± 0.428 s [User: 5734.031 s, System: 488.392 s]
Range (min … max): 167.328 s … 168.959 s 25 runs

Summary
'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- O=out.aarch64 defconfig all' ran
1.01 ± 0.00 times faster than 'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- KCFLAGS=-pipe O=out.aarch64 defconfig all'

In both cases it seems like performance regresses (by 1% but still) but
maybe it is my machine, even though this benchmark was done on a
different machine than the one from my commit back in 2018.

I am not sure I would write off these results, since I did the benchmark
25 times on each one back to back, eliminating most of the variance that
you described.
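
For reference, the relative slowdown can be worked out from the hyperfine
means. This is a minimal sketch that takes the four mean times reported
above as its only inputs; the `slowdown_pct` helper is a hypothetical name
used for illustration:

```shell
#!/bin/sh
# Relative slowdown of the -pipe build in percent.
# $1: mean time without -pipe, $2: mean time with -pipe (both in seconds).
slowdown_pct() {
    awk -v base="$1" -v piped="$2" \
        'BEGIN { printf "%.2f\n", (piped - base) / base * 100 }'
}

# Mean wall-clock times taken from the two hyperfine runs above.
x86_pct=$(slowdown_pct 68.535 68.922)
arm64_pct=$(slowdown_pct 166.732 168.047)

echo "x86_64:  ${x86_pct}% slower with -pipe"
echo "aarch64: ${arm64_pct}% slower with -pipe"
```

On these means the difference is a little under 1% in both cases (about
0.6% on x86_64 and 0.8% on aarch64), which hyperfine rounds to 1.01 times.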

[1]: https://github.com/sharkdp/hyperfine

Cheers,
Nathan

2020-02-22 09:02:15

by Russell King (Oracle)

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

On Sat, Feb 22, 2020 at 11:07:14AM +0900, Masahiro Yamada wrote:
> On Sat, Feb 22, 2020 at 9:40 AM Alex Xu (Hello71) <[email protected]> wrote:
> >
> > -pipe reduces unnecessary disk wear for systems where /tmp is not a
> > tmpfs, slightly increases compilation speed, and avoids leaving behind
> > files when gcc crashes.
> >
> > According to the gcc manual, "this fails to work on some systems where
> > the assembler is unable to read from a pipe; but the GNU assembler has
> > no trouble". We already require GNU ld on all platforms, so this is not
> > an additional dependency. LLVM as also supports pipes.
> >
> > -pipe has always been used for most architectures, this change
> > standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> > affected.
> >
> > Signed-off-by: Alex Xu (Hello71) <[email protected]>
>
> <snip>
>
> > diff --git a/arch/arc/Makefile b/arch/arc/Makefile
> > index 20e9ab6cc521..b6a2f553771c 100644
> > --- a/arch/arc/Makefile
> > +++ b/arch/arc/Makefile
> > @@ -9,7 +9,7 @@ ifeq ($(CROSS_COMPILE),)
> > CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)
> > endif
> >
> > -cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
> > +cflags-y += -fno-common -fno-builtin -mmedium-calls -D__linux__
> > cflags-$(CONFIG_ISA_ARCOMPACT) += -mA7
> > cflags-$(CONFIG_ISA_ARCV2) += -mcpu=hs38
> >
> > diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> > index db857d07114f..7711467e0797 100644
> > --- a/arch/arm/Makefile
> > +++ b/arch/arm/Makefile
> > @@ -21,7 +21,6 @@ KBUILD_LDS_MODULE += $(srctree)/arch/arm/kernel/module.lds
> > endif
> >
> > GZFLAGS :=-9
> > -#KBUILD_CFLAGS +=-pipe
>
>
> This was commented out by a very old commit,
> which is available in the historical git tree.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=ce20ed858a20f6f04de475cae79e40d3697f4776
>
> But, I could not parse the reason from the commit message.
> Russell, do you remember why?

-pipe may reduce the disk load but increases the CPU load, so it's an
option that's up to the build environment. One may wish to pass a
lower parallelism to make when using -pipe to mitigate that, but both
options are up to the build environment to decide upon.

If we unconditionally add -pipe, then we take away choice.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

2020-02-22 14:24:41

by Alex Xu (Hello71)

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

Excerpts from Nathan Chancellor's message of February 22, 2020 3:01 am:
> I used hyperfine [1] to run a quick benchmark with a freshly built
> GCC 9.2.0 for x86 and aarch64 and here are the results:
>
> In both cases it seems like performance regresses (by 1% but still) but
> maybe it is my machine, even though this benchmark was done on a
> different machine than the one from my commit back in 2018.
>
> I am not sure I would write off these results, since I did the benchmark
> 25 times on each one back to back, eliminating most of the variance that
> you described.
>
> [1]: https://github.com/sharkdp/hyperfine
>
> Cheers,
> Nathan
>

What kernel version are you running? Do you have the 5.6 pipe reworks?

2020-02-22 18:13:07

by Nathan Chancellor

Subject: Re: [PATCH] kbuild: move -pipe to global KBUILD_CFLAGS

On Sat, Feb 22, 2020 at 09:24:14AM -0500, Alex Xu (Hello71) wrote:
> Excerpts from Nathan Chancellor's message of February 22, 2020 3:01 am:
> > I used hyperfine [1] to run a quick benchmark with a freshly built
> > GCC 9.2.0 for x86 and aarch64 and here are the results:
> >
> > In both cases it seems like performance regresses (by 1% but still) but
> > maybe it is my machine, even though this benchmark was done on a
> > different machine than the one from my commit back in 2018.
> >
> > I am not sure I would write off these results, since I did the benchmark
> > 25 times on each one back to back, eliminating most of the variance that
> > you described.
> >
> > [1]: https://github.com/sharkdp/hyperfine
> >
> > Cheers,
> > Nathan
> >
>
> What kernel version are you running? Do you have the 5.6 pipe reworks?

No, it is the stock Ubuntu 18.04 kernel, version 4.15.0.

$ uname -a
Linux c2-medium-x86 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

If you are curious about the specs:

$ neofetch --stdout
nathan@c2-medium-x86
--------------------
OS: Ubuntu 18.04.3 LTS x86_64
Host: PowerEdge R6415
Kernel: 4.15.0-50-generic
Uptime: 126 days, 12 hours, 39 mins
Packages: 686
Shell: zsh 5.4.2
Terminal: /dev/pts/0
CPU: AMD EPYC 7401P 24- (48) @ 2.794GHz
Memory: 2974MiB / 64018MiB

Cheers,
Nathan