2022-06-21 13:39:30

by Miko Larsson

[permalink] [raw]
Subject: [PATCH 0/2] Kconfig: -O3 enablement

Hi,

This very small series allows -O3 to be used for all architectures. The
first patch marks -O3 as experimental, with the reasoning being that it
might expose unwanted regressions to users, and the second patch
actually allows -O3 by removing the "depend on ARC" string.

The reasoning behind this series is to open up -O3 so that bugs related
to it (both compiler-related and kernel-related) can be discovered by
eyeballs wanting to improve the "-O3 experience," as that might be
beneficial to both compilers and the kernel. This has been attempted
before [1], but unfortunately nothing ever came of it.

[1] https://lore.kernel.org/lkml/[email protected]/

Cc: [email protected]
Cc: [email protected]
Cc: Nathan Chancellor <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Chris Down <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Elliot Berman <[email protected]>
Cc: Oleksandr Natalenko <[email protected]>

Miko Larsson (2):
Kconfig: Mark -O3 as experimental
Kconfig: Allow -O3 for all architectures

init/Kconfig | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--
2.36.1


2022-06-21 13:39:38

by Miko Larsson

[permalink] [raw]
Subject: [PATCH 2/2] Kconfig: Allow -O3 for all architectures

This commit allows all architectures to use the experimental -O3
optimization option. Previously, only ARC was allowed to use this
option.

Signed-off-by: Miko Larsson <[email protected]>
---
init/Kconfig | 1 -
1 file changed, 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index 9c292acb2..b88613cb5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1403,7 +1403,6 @@ config CC_OPTIMIZE_FOR_PERFORMANCE

config CC_OPTIMIZE_FOR_PERFORMANCE_O3
bool "Optimize more for performance (-O3) (EXPERIMENTAL)"
- depends on ARC
help
Choosing this option will pass "-O3" to your compiler to optimize
the kernel yet more for performance.
--
2.36.1

2022-06-21 13:39:40

by Miko Larsson

[permalink] [raw]
Subject: [PATCH 1/2] Kconfig: Mark -O3 as experimental

Mark -O3 as experimental, as it might cause unwanted regressions for
users.

Signed-off-by: Miko Larsson <[email protected]>
---
init/Kconfig | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index c7900e897..9c292acb2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1402,12 +1402,15 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
helpful compile-time warnings.

config CC_OPTIMIZE_FOR_PERFORMANCE_O3
- bool "Optimize more for performance (-O3)"
+ bool "Optimize more for performance (-O3) (EXPERIMENTAL)"
depends on ARC
help
Choosing this option will pass "-O3" to your compiler to optimize
the kernel yet more for performance.

+ This option is EXPERIMENTAL; you may encounter compiler bugs and/or
+ kernel bugs with this option enabled.
+
config CC_OPTIMIZE_FOR_SIZE
bool "Optimize for size (-Os)"
help
--
2.36.1

2022-06-21 16:17:58

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Tue, Jun 21, 2022 at 6:35 AM Miko Larsson <[email protected]> wrote:
>
> Hi,
>
> This very small series allows -O3 to be used for all architectures. The
> first patch marks -O3 as experimental, with the reasoning being that it
> might expose unwanted regressions to users, and the second patch
> actually allows -O3 by removing the "depend on ARC" string.

I think we should just remove -O3 support from KCONFIG.

If someone wants to mess around with "experimental features," there's
nothing stopping you from doing:

$ make KCFLAGS=-O3

>
> The reasoning behind this series is to open up -O3 so that bugs related
> to it (both compiler-related and kernel-related) can be discovered by
> eyeballs wanting to improve the "-O3 experience," as that might be
> beneficial to both compilers and the kernel. This has been attempted
> before [1], but unfortunately nothing ever came of it.
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: Nathan Chancellor <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Nick Desaulniers <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Sebastian Andrzej Siewior <[email protected]>
> Cc: Sean Christopherson <[email protected]>
> Cc: "Gustavo A. R. Silva" <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Chris Down <[email protected]>
> Cc: Dmitry Torokhov <[email protected]>
> Cc: John Ogness <[email protected]>
> Cc: Elliot Berman <[email protected]>
> Cc: Oleksandr Natalenko <[email protected]>
>
> Miko Larsson (2):
> Kconfig: Mark -O3 as experimental
> Kconfig: Allow -O3 for all architectures
>
> init/Kconfig | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> --
> 2.36.1
>


--
Thanks,
~Nick Desaulniers

2022-06-22 02:11:35

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Wed, Jun 22, 2022 at 1:17 AM Nick Desaulniers
<[email protected]> wrote:
>
> On Tue, Jun 21, 2022 at 6:35 AM Miko Larsson <[email protected]> wrote:
> >
> > Hi,
> >
> > This very small series allows -O3 to be used for all architectures. The
> > first patch marks -O3 as experimental, with the reasoning being that it
> > might expose unwanted regressions to users, and the second patch
> > actually allows -O3 by removing the "depend on ARC" string.
>
> I think we should just remove -O3 support from KCONFIG.
>
> If someone wants to mess around with "experimental features," there's
> nothing stopping you from doing:
>
> $ make KCFLAGS=-O3
>

ARC uses -O3 since day1.

"Generic build system uses -O2, we want -O3"
in commit cfdbc2e16e65c1ec1c23057640607cee98d1a1bd

If they want -O3, it is up to the ARC maintainer.



If you want to say "use this option carefully",
EXPERT might be another option.

depends on ARC || EXPERT







> >
> > The reasoning behind this series is to open up -O3 so that bugs related
> > to it (both compiler-related and kernel-related) can be discovered by
> > eyeballs wanting to improve the "-O3 experience," as that might be
> > beneficial to both compilers and the kernel. This has been attempted
> > before [1], but unfortunately nothing ever came of it.
> >
> > [1] https://lore.kernel.org/lkml/[email protected]/
> >
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: Nathan Chancellor <[email protected]>
> > Cc: Vlastimil Babka <[email protected]>
> > Cc: Nick Desaulniers <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Sebastian Andrzej Siewior <[email protected]>
> > Cc: Sean Christopherson <[email protected]>
> > Cc: "Gustavo A. R. Silva" <[email protected]>
> > Cc: Arnd Bergmann <[email protected]>
> > Cc: Chris Down <[email protected]>
> > Cc: Dmitry Torokhov <[email protected]>
> > Cc: John Ogness <[email protected]>
> > Cc: Elliot Berman <[email protected]>
> > Cc: Oleksandr Natalenko <[email protected]>
> >
> > Miko Larsson (2):
> > Kconfig: Mark -O3 as experimental
> > Kconfig: Allow -O3 for all architectures
> >
> > init/Kconfig | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > --
> > 2.36.1
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers



--
Best Regards
Masahiro Yamada

2022-06-23 15:55:03

by Miko Larsson

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Wednesday, 22 June 2022 03:57:34 CEST Masahiro Yamada wrote:
> If you want to say "use this option carefully",
> EXPERT might be another option.
>
> depends on ARC || EXPERT
>

Yeah, this would be a fair compromise, though I think it would be
better to use "visible if" instead of "depends on". I can get a v2 of
the series together if this is desired.

--
~miko


2022-06-23 16:06:12

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Fri, Jun 24, 2022 at 12:42 AM Miko Larsson <[email protected]> wrote:
>
> On Wednesday, 22 June 2022 03:57:34 CEST Masahiro Yamada wrote:
> > If you want to say "use this option carefully",
> > EXPERT might be another option.
> >
> > depends on ARC || EXPERT
> >
>
> Yeah, this would be a fair compromise, though I think it would be
> better to use "visible if" instead of "depends on". I can get a v2 of
> the series together if this is desired.


Why is "visible if" better than "depends on"?



> --
> ~miko
>
>


--
Best Regards
Masahiro Yamada

2022-06-23 19:03:41

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Fri, Jun 24, 2022 at 2:00 AM Miko Larsson <[email protected]> wrote:
>
> On Thursday, 23 June 2022 17:44:57 CEST Masahiro Yamada wrote:
> > On Fri, Jun 24, 2022 at 12:42 AM Miko Larsson <[email protected]> wrote:
> > > On Wednesday, 22 June 2022 03:57:34 CEST Masahiro Yamada wrote:
> > > > If you want to say "use this option carefully",
> > > > EXPERT might be another option.
> > > >
> > > > depends on ARC || EXPERT
> > >
> > > Yeah, this would be a fair compromise, though I think it would be
> > > better to use "visible if" instead of "depends on". I can get a v2 of
> > > the series together if this is desired.
> >
> > Why is "visible if" better than "depends on"?
> >
>
> Technically it most likely doesn't matter, but logically it makes more
> sense, since we'd make CC_OPTIMIZE_FOR_PERFORMANCE_O3 be visible if
> we're on ARC or if we have EXPERT enabled, instead of depending on
> them. But yeah, it probably doesn't matter.


Did you write and test the code?


"visible if" is only supported for "menu".
This is clearly documented at line 207
of Documentation/kbuild/kconfig-language.rst


Using "visible if" for config entry will just
result in the syntax error.





> --
> ~miko
>
>
>


--
Best Regards
Masahiro Yamada

2022-06-23 19:09:47

by Miko Larsson

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Thursday, 23 June 2022 19:15:14 CEST Masahiro Yamada wrote:
> On Fri, Jun 24, 2022 at 2:00 AM Miko Larsson <[email protected]> wrote:
> > On Thursday, 23 June 2022 17:44:57 CEST Masahiro Yamada wrote:
> > > On Fri, Jun 24, 2022 at 12:42 AM Miko Larsson <[email protected]>
wrote:
> > > > On Wednesday, 22 June 2022 03:57:34 CEST Masahiro Yamada wrote:
> > > > > If you want to say "use this option carefully",
> > > > > EXPERT might be another option.
> > > > >
> > > > > depends on ARC || EXPERT
> > > >
> > > > Yeah, this would be a fair compromise, though I think it would be
> > > > better to use "visible if" instead of "depends on". I can get a v2 of
> > > > the series together if this is desired.
> > >
> > > Why is "visible if" better than "depends on"?
> >
> > Technically it most likely doesn't matter, but logically it makes more
> > sense, since we'd make CC_OPTIMIZE_FOR_PERFORMANCE_O3 be visible if
> > we're on ARC or if we have EXPERT enabled, instead of depending on
> > them. But yeah, it probably doesn't matter.
>
> Did you write and test the code?
>

Admittedly, I didn't, since I had falsely assumed that "visible if" was
just an "alternative" to "depends on".

> "visible if" is only supported for "menu".
> This is clearly documented at line 207
> of Documentation/kbuild/kconfig-language.rst
>
>
> Using "visible if" for config entry will just
> result in the syntax error.
>

Oops, yeah, I wasn't aware of this. Sorry.

--
~miko


2022-06-23 19:11:25

by Miko Larsson

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Thursday, 23 June 2022 17:44:57 CEST Masahiro Yamada wrote:
> On Fri, Jun 24, 2022 at 12:42 AM Miko Larsson <[email protected]> wrote:
> > On Wednesday, 22 June 2022 03:57:34 CEST Masahiro Yamada wrote:
> > > If you want to say "use this option carefully",
> > > EXPERT might be another option.
> > >
> > > depends on ARC || EXPERT
> >
> > Yeah, this would be a fair compromise, though I think it would be
> > better to use "visible if" instead of "depends on". I can get a v2 of
> > the series together if this is desired.
>
> Why is "visible if" better than "depends on"?
>

Technically it most likely doesn't matter, but logically it makes more
sense, since we'd make CC_OPTIMIZE_FOR_PERFORMANCE_O3 be visible if
we're on ARC or if we have EXPERT enabled, instead of depending on
them. But yeah, it probably doesn't matter.

--
~miko



2022-06-23 19:12:40

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Wed, Jun 22, 2022 at 3:57 AM Masahiro Yamada <[email protected]> wrote:
> On Wed, Jun 22, 2022 at 1:17 AM Nick Desaulniers <[email protected]> wrote:
> >
> > On Tue, Jun 21, 2022 at 6:35 AM Miko Larsson <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > This very small series allows -O3 to be used for all architectures. The
> > > first patch marks -O3 as experimental, with the reasoning being that it
> > > might expose unwanted regressions to users, and the second patch
> > > actually allows -O3 by removing the "depend on ARC" string.
> >
> > I think we should just remove -O3 support from KCONFIG.

I agree that would be best

> > If someone wants to mess around with "experimental features," there's
> > nothing stopping you from doing:
> >
> > $ make KCFLAGS=-O3
> >
>
> ARC uses -O3 since day1.
>
> "Generic build system uses -O2, we want -O3"
> in commit cfdbc2e16e65c1ec1c23057640607cee98d1a1bd
>
> If they want -O3, it is up to the ARC maintainer.

I suppose whatever the reason for using -O3 at the time has
likely changed by now.

> If you want to say "use this option carefully",
> EXPERT might be another option.
>
> depends on ARC || EXPERT

This probably also needs a dependency on !COMPILE_TEST so we don't
report compile-time problems that are specific to -O3. Maybe a good first
step would be to turn this into

depends on ARCH && EXPERT && !COMPILE_TEST

which should help both with compile-testing on ARC, and it would
prevent it from being visible on other architectures.

Arnd

2022-06-24 18:45:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

On Thu, Jun 23 at 7:27:29 PM Arnd Bergmann <[email protected]> wrote:
>
> This probably also needs a dependency on !COMPILE_TEST so we don't
> report compile-time problems that are specific to -O3.

Honestly, let's just remove -O3 entirely.

Enabling it, and then not even build-testing the result, is just about
the *worst* possible case. That's just horrible.

The argument that "but ARC uses it" is not an argument. It was always
a bad argument, and ARC needs to just fix whatever it is that made it
an issue (likely already fixed with a compiler upgrade).

And there is no way I would ever accept this as a "let people try it" when

- as mentioned, just use KCFLAGS=-O3 if you want to

- -O3 has a *loong* history of generating worse code than -O2

so I will *not* be taking these kinds of patches without some very
serious explanations of why -O3 has suddenly become acceptable again.

Those explanations had better be more than "let people try". They
should have in-depth actual performance numbers for a real load, not
some made-up "bigger is better" logic.

Linus

2022-06-24 18:47:03

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 0/2] Kconfig: -O3 enablement

> On Fri, Jun 24, 2022 at 11:29 AM Linus Torvalds <[email protected]> wrote:

Bah. That was really me, just with a badly set up "reply through lore"
setup, so with the wrong email address.

That's what happens when you are

(a) incompetent

(b) stopped getting the mailing lists as regular email because you
think lore works so well

(c) normally rely on being cc'd

My apologies for the incompetence.

Linus

2022-06-28 21:51:35

by Nick Desaulniers

[permalink] [raw]
Subject: [PATCH] kbuild: drop support for CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3

The difference in most compilers between `-O3` and `-O2` is mostly down
to whether loops with statically determinable trip counts are fully
unrolled vs unrolled to a multiple of SIMD width.

This patch is effectively a revert of
commit 15f5db60a137 ("kbuild,arc: add
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC") without re-adding
ARCH_CFLAGS

Ever since
commit cfdbc2e16e65 ("ARC: Build system: Makefiles, Kconfig, Linker
script")
ARC has been built with -O3, though the reason for doing so was not
specified in inline comments or the commit message. This commit does not
re-add -O3 to arch/arc/Makefile.

Folks looking to experiment with `-O3` (or any compiler flag for that
matter) may pass them along to the command line invocation of make:

$ make KCFLAGS=-O3

Code that looks to re-add an explicit Kconfig option for `-O3` should
provide:
1. A rigorous and reproducible performance profile of a reasonable
userspace workload that demonstrates a hot loop in the kernel that
would benefit from `-O3` over `-O2`.
2. Disassembly of said loop body before and after.
3. Provides stats on terms of increase in file size.

Link: https://lore.kernel.org/linux-kbuild/CA+55aFz2sNBbZyg-_i8_Ldr2e8o9dfvdSfHHuRzVtP2VMAUWPg@mail.gmail.com/
Signed-off-by: Nick Desaulniers <[email protected]>
---
Makefile | 2 --
arch/arc/configs/axs101_defconfig | 1 -
arch/arc/configs/axs103_defconfig | 1 -
arch/arc/configs/axs103_smp_defconfig | 1 -
arch/arc/configs/haps_hs_defconfig | 1 -
arch/arc/configs/haps_hs_smp_defconfig | 1 -
arch/arc/configs/hsdk_defconfig | 1 -
arch/arc/configs/nsim_700_defconfig | 1 -
arch/arc/configs/nsimosci_defconfig | 1 -
arch/arc/configs/nsimosci_hs_defconfig | 1 -
arch/arc/configs/nsimosci_hs_smp_defconfig | 1 -
arch/arc/configs/tb10x_defconfig | 1 -
arch/arc/configs/vdk_hs38_defconfig | 1 -
arch/arc/configs/vdk_hs38_smp_defconfig | 1 -
init/Kconfig | 7 -------
15 files changed, 22 deletions(-)

diff --git a/Makefile b/Makefile
index 8973b285ce6c..b69f6cd7f2e2 100644
--- a/Makefile
+++ b/Makefile
@@ -755,8 +755,6 @@ KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)

ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE
KBUILD_CFLAGS += -O2
-else ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3
-KBUILD_CFLAGS += -O3
else ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
KBUILD_CFLAGS += -Os
endif
diff --git a/arch/arc/configs/axs101_defconfig b/arch/arc/configs/axs101_defconfig
index 0016149f9583..e31a8ebc3ecc 100644
--- a/arch/arc/configs/axs101_defconfig
+++ b/arch/arc/configs/axs101_defconfig
@@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/arch/arc/configs/axs103_defconfig b/arch/arc/configs/axs103_defconfig
index 5b031582a1cf..e0e8567f0d75 100644
--- a/arch/arc/configs/axs103_defconfig
+++ b/arch/arc/configs/axs103_defconfig
@@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/arch/arc/configs/axs103_smp_defconfig b/arch/arc/configs/axs103_smp_defconfig
index d4eec39e0112..fcbc952bc75b 100644
--- a/arch/arc/configs/axs103_smp_defconfig
+++ b/arch/arc/configs/axs103_smp_defconfig
@@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/arch/arc/configs/haps_hs_defconfig b/arch/arc/configs/haps_hs_defconfig
index 7337cdf4ffdd..d87ad7e88d62 100644
--- a/arch/arc/configs/haps_hs_defconfig
+++ b/arch/arc/configs/haps_hs_defconfig
@@ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EXPERT=y
CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
diff --git a/arch/arc/configs/haps_hs_smp_defconfig b/arch/arc/configs/haps_hs_smp_defconfig
index bc927221afc0..8d82cdb7f86a 100644
--- a/arch/arc/configs/haps_hs_smp_defconfig
+++ b/arch/arc/configs/haps_hs_smp_defconfig
@@ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/arch/arc/configs/hsdk_defconfig b/arch/arc/configs/hsdk_defconfig
index aa000075a575..f856b03e0fb5 100644
--- a/arch/arc/configs/hsdk_defconfig
+++ b/arch/arc/configs/hsdk_defconfig
@@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_BLK_DEV_RAM=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/arch/arc/configs/nsim_700_defconfig b/arch/arc/configs/nsim_700_defconfig
index 326f6cde7826..a1ce12bf5b16 100644
--- a/arch/arc/configs/nsim_700_defconfig
+++ b/arch/arc/configs/nsim_700_defconfig
@@ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_KALLSYMS_ALL=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
diff --git a/arch/arc/configs/nsimosci_defconfig b/arch/arc/configs/nsimosci_defconfig
index bf39a0091679..ca10f4a2c823 100644
--- a/arch/arc/configs/nsimosci_defconfig
+++ b/arch/arc/configs/nsimosci_defconfig
@@ -10,7 +10,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_KALLSYMS_ALL=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
diff --git a/arch/arc/configs/nsimosci_hs_defconfig b/arch/arc/configs/nsimosci_hs_defconfig
index 7121bd71c543..31b6ec3683c6 100644
--- a/arch/arc/configs/nsimosci_hs_defconfig
+++ b/arch/arc/configs/nsimosci_hs_defconfig
@@ -10,7 +10,6 @@ CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_KALLSYMS_ALL=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
diff --git a/arch/arc/configs/nsimosci_hs_smp_defconfig b/arch/arc/configs/nsimosci_hs_smp_defconfig
index f9863b294a70..41a0037f48a5 100644
--- a/arch/arc/configs/nsimosci_hs_smp_defconfig
+++ b/arch/arc/configs/nsimosci_hs_smp_defconfig
@@ -8,7 +8,6 @@ CONFIG_IKCONFIG_PROC=y
# CONFIG_UTS_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_KPROBES=y
diff --git a/arch/arc/configs/tb10x_defconfig b/arch/arc/configs/tb10x_defconfig
index a12656ec0072..d93b65008d4a 100644
--- a/arch/arc/configs/tb10x_defconfig
+++ b/arch/arc/configs/tb10x_defconfig
@@ -14,7 +14,6 @@ CONFIG_INITRAMFS_SOURCE="../tb10x-rootfs.cpio"
CONFIG_INITRAMFS_ROOT_UID=2100
CONFIG_INITRAMFS_ROOT_GID=501
# CONFIG_RD_GZIP is not set
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_AIO is not set
CONFIG_EMBEDDED=y
diff --git a/arch/arc/configs/vdk_hs38_defconfig b/arch/arc/configs/vdk_hs38_defconfig
index d7c858df520c..0c3b21416819 100644
--- a/arch/arc/configs/vdk_hs38_defconfig
+++ b/arch/arc/configs/vdk_hs38_defconfig
@@ -4,7 +4,6 @@ CONFIG_HIGH_RES_TIMERS=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/arch/arc/configs/vdk_hs38_smp_defconfig b/arch/arc/configs/vdk_hs38_smp_defconfig
index 015c1d43889e..f9ad9d3ee702 100644
--- a/arch/arc/configs/vdk_hs38_smp_defconfig
+++ b/arch/arc/configs/vdk_hs38_smp_defconfig
@@ -4,7 +4,6 @@ CONFIG_HIGH_RES_TIMERS=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_BLK_DEV_INITRD=y
-CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_VM_EVENT_COUNTERS is not set
diff --git a/init/Kconfig b/init/Kconfig
index c7900e8975f1..1b4d8acc3def 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1401,13 +1401,6 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
with the "-O2" compiler flag for best performance and most
helpful compile-time warnings.

-config CC_OPTIMIZE_FOR_PERFORMANCE_O3
- bool "Optimize more for performance (-O3)"
- depends on ARC
- help
- Choosing this option will pass "-O3" to your compiler to optimize
- the kernel yet more for performance.
-
config CC_OPTIMIZE_FOR_SIZE
bool "Optimize for size (-Os)"
help
--
2.37.0.rc0.161.g10f37bed90-goog

2022-06-29 11:13:40

by Miko Larsson

[permalink] [raw]
Subject: Re: [PATCH] kbuild: drop support for CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3

On Tuesday, 28 June 2022 23:04:07 CEST Nick Desaulniers wrote:
> The difference in most compilers between `-O3` and `-O2` is mostly down
> to whether loops with statically determinable trip counts are fully
> unrolled vs unrolled to a multiple of SIMD width.
>
> This patch is effectively a revert of
> commit 15f5db60a137 ("kbuild,arc: add
> CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC") without re-adding
> ARCH_CFLAGS
>
> Ever since
> commit cfdbc2e16e65 ("ARC: Build system: Makefiles, Kconfig, Linker
> script")
> ARC has been built with -O3, though the reason for doing so was not
> specified in inline comments or the commit message. This commit does not
> re-add -O3 to arch/arc/Makefile.
>
> Folks looking to experiment with `-O3` (or any compiler flag for that
> matter) may pass them along to the command line invocation of make:
>
> $ make KCFLAGS=-O3
>
> Code that looks to re-add an explicit Kconfig option for `-O3` should
> provide:
> 1. A rigorous and reproducible performance profile of a reasonable
> userspace workload that demonstrates a hot loop in the kernel that
> would benefit from `-O3` over `-O2`.
> 2. Disassembly of said loop body before and after.
> 3. Provides stats on terms of increase in file size.
>

Might be worth cleaning up the rest of the kernel of instances of -O3,
too. -O3 used to build lz4 and mips vdso, for instance. Might be a bit
of a digression, though

--
~miko


2022-07-11 05:34:44

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] kbuild: drop support for CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3

On Wed, Jun 29, 2022 at 6:06 AM Nick Desaulniers
<[email protected]> wrote:
>
> The difference in most compilers between `-O3` and `-O2` is mostly down
> to whether loops with statically determinable trip counts are fully
> unrolled vs unrolled to a multiple of SIMD width.
>
> This patch is effectively a revert of
> commit 15f5db60a137 ("kbuild,arc: add
> CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC") without re-adding
> ARCH_CFLAGS
>
> Ever since
> commit cfdbc2e16e65 ("ARC: Build system: Makefiles, Kconfig, Linker
> script")
> ARC has been built with -O3, though the reason for doing so was not
> specified in inline comments or the commit message. This commit does not
> re-add -O3 to arch/arc/Makefile.
>
> Folks looking to experiment with `-O3` (or any compiler flag for that
> matter) may pass them along to the command line invocation of make:
>
> $ make KCFLAGS=-O3
>
> Code that looks to re-add an explicit Kconfig option for `-O3` should
> provide:
> 1. A rigorous and reproducible performance profile of a reasonable
> userspace workload that demonstrates a hot loop in the kernel that
> would benefit from `-O3` over `-O2`.
> 2. Disassembly of said loop body before and after.
> 3. Provides stats on terms of increase in file size.
>
> Link: https://lore.kernel.org/linux-kbuild/CA+55aFz2sNBbZyg-_i8_Ldr2e8o9dfvdSfHHuRzVtP2VMAUWPg@mail.gmail.com/
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---

Applied to linux-kbuild.
Thanks.


> Makefile | 2 --
> arch/arc/configs/axs101_defconfig | 1 -
> arch/arc/configs/axs103_defconfig | 1 -
> arch/arc/configs/axs103_smp_defconfig | 1 -
> arch/arc/configs/haps_hs_defconfig | 1 -
> arch/arc/configs/haps_hs_smp_defconfig | 1 -
> arch/arc/configs/hsdk_defconfig | 1 -
> arch/arc/configs/nsim_700_defconfig | 1 -
> arch/arc/configs/nsimosci_defconfig | 1 -
> arch/arc/configs/nsimosci_hs_defconfig | 1 -
> arch/arc/configs/nsimosci_hs_smp_defconfig | 1 -
> arch/arc/configs/tb10x_defconfig | 1 -
> arch/arc/configs/vdk_hs38_defconfig | 1 -
> arch/arc/configs/vdk_hs38_smp_defconfig | 1 -
> init/Kconfig | 7 -------
> 15 files changed, 22 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 8973b285ce6c..b69f6cd7f2e2 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -755,8 +755,6 @@ KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
>
> ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE
> KBUILD_CFLAGS += -O2
> -else ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3
> -KBUILD_CFLAGS += -O3
> else ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
> KBUILD_CFLAGS += -Os
> endif
> diff --git a/arch/arc/configs/axs101_defconfig b/arch/arc/configs/axs101_defconfig
> index 0016149f9583..e31a8ebc3ecc 100644
> --- a/arch/arc/configs/axs101_defconfig
> +++ b/arch/arc/configs/axs101_defconfig
> @@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/arch/arc/configs/axs103_defconfig b/arch/arc/configs/axs103_defconfig
> index 5b031582a1cf..e0e8567f0d75 100644
> --- a/arch/arc/configs/axs103_defconfig
> +++ b/arch/arc/configs/axs103_defconfig
> @@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/arch/arc/configs/axs103_smp_defconfig b/arch/arc/configs/axs103_smp_defconfig
> index d4eec39e0112..fcbc952bc75b 100644
> --- a/arch/arc/configs/axs103_smp_defconfig
> +++ b/arch/arc/configs/axs103_smp_defconfig
> @@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/arch/arc/configs/haps_hs_defconfig b/arch/arc/configs/haps_hs_defconfig
> index 7337cdf4ffdd..d87ad7e88d62 100644
> --- a/arch/arc/configs/haps_hs_defconfig
> +++ b/arch/arc/configs/haps_hs_defconfig
> @@ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EXPERT=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_COMPAT_BRK is not set
> diff --git a/arch/arc/configs/haps_hs_smp_defconfig b/arch/arc/configs/haps_hs_smp_defconfig
> index bc927221afc0..8d82cdb7f86a 100644
> --- a/arch/arc/configs/haps_hs_smp_defconfig
> +++ b/arch/arc/configs/haps_hs_smp_defconfig
> @@ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/arch/arc/configs/hsdk_defconfig b/arch/arc/configs/hsdk_defconfig
> index aa000075a575..f856b03e0fb5 100644
> --- a/arch/arc/configs/hsdk_defconfig
> +++ b/arch/arc/configs/hsdk_defconfig
> @@ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_BLK_DEV_RAM=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/arch/arc/configs/nsim_700_defconfig b/arch/arc/configs/nsim_700_defconfig
> index 326f6cde7826..a1ce12bf5b16 100644
> --- a/arch/arc/configs/nsim_700_defconfig
> +++ b/arch/arc/configs/nsim_700_defconfig
> @@ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_KALLSYMS_ALL=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> diff --git a/arch/arc/configs/nsimosci_defconfig b/arch/arc/configs/nsimosci_defconfig
> index bf39a0091679..ca10f4a2c823 100644
> --- a/arch/arc/configs/nsimosci_defconfig
> +++ b/arch/arc/configs/nsimosci_defconfig
> @@ -10,7 +10,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_KALLSYMS_ALL=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> diff --git a/arch/arc/configs/nsimosci_hs_defconfig b/arch/arc/configs/nsimosci_hs_defconfig
> index 7121bd71c543..31b6ec3683c6 100644
> --- a/arch/arc/configs/nsimosci_hs_defconfig
> +++ b/arch/arc/configs/nsimosci_hs_defconfig
> @@ -10,7 +10,6 @@ CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_KALLSYMS_ALL=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> diff --git a/arch/arc/configs/nsimosci_hs_smp_defconfig b/arch/arc/configs/nsimosci_hs_smp_defconfig
> index f9863b294a70..41a0037f48a5 100644
> --- a/arch/arc/configs/nsimosci_hs_smp_defconfig
> +++ b/arch/arc/configs/nsimosci_hs_smp_defconfig
> @@ -8,7 +8,6 @@ CONFIG_IKCONFIG_PROC=y
> # CONFIG_UTS_NS is not set
> # CONFIG_PID_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_COMPAT_BRK is not set
> CONFIG_KPROBES=y
> diff --git a/arch/arc/configs/tb10x_defconfig b/arch/arc/configs/tb10x_defconfig
> index a12656ec0072..d93b65008d4a 100644
> --- a/arch/arc/configs/tb10x_defconfig
> +++ b/arch/arc/configs/tb10x_defconfig
> @@ -14,7 +14,6 @@ CONFIG_INITRAMFS_SOURCE="../tb10x-rootfs.cpio"
> CONFIG_INITRAMFS_ROOT_UID=2100
> CONFIG_INITRAMFS_ROOT_GID=501
> # CONFIG_RD_GZIP is not set
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_KALLSYMS_ALL=y
> # CONFIG_AIO is not set
> CONFIG_EMBEDDED=y
> diff --git a/arch/arc/configs/vdk_hs38_defconfig b/arch/arc/configs/vdk_hs38_defconfig
> index d7c858df520c..0c3b21416819 100644
> --- a/arch/arc/configs/vdk_hs38_defconfig
> +++ b/arch/arc/configs/vdk_hs38_defconfig
> @@ -4,7 +4,6 @@ CONFIG_HIGH_RES_TIMERS=y
> CONFIG_IKCONFIG=y
> CONFIG_IKCONFIG_PROC=y
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/arch/arc/configs/vdk_hs38_smp_defconfig b/arch/arc/configs/vdk_hs38_smp_defconfig
> index 015c1d43889e..f9ad9d3ee702 100644
> --- a/arch/arc/configs/vdk_hs38_smp_defconfig
> +++ b/arch/arc/configs/vdk_hs38_smp_defconfig
> @@ -4,7 +4,6 @@ CONFIG_HIGH_RES_TIMERS=y
> CONFIG_IKCONFIG=y
> CONFIG_IKCONFIG_PROC=y
> CONFIG_BLK_DEV_INITRD=y
> -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y
> CONFIG_EMBEDDED=y
> CONFIG_PERF_EVENTS=y
> # CONFIG_VM_EVENT_COUNTERS is not set
> diff --git a/init/Kconfig b/init/Kconfig
> index c7900e8975f1..1b4d8acc3def 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1401,13 +1401,6 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
> with the "-O2" compiler flag for best performance and most
> helpful compile-time warnings.
>
> -config CC_OPTIMIZE_FOR_PERFORMANCE_O3
> - bool "Optimize more for performance (-O3)"
> - depends on ARC
> - help
> - Choosing this option will pass "-O3" to your compiler to optimize
> - the kernel yet more for performance.
> -
> config CC_OPTIMIZE_FOR_SIZE
> bool "Optimize for size (-Os)"
> help
> --
> 2.37.0.rc0.161.g10f37bed90-goog
>


--
Best Regards
Masahiro Yamada

2022-07-11 06:02:04

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] kbuild: drop support for CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3

On Wed, Jun 29, 2022 at 7:48 PM Miko Larsson <[email protected]> wrote:
>
> On Tuesday, 28 June 2022 23:04:07 CEST Nick Desaulniers wrote:
> > The difference in most compilers between `-O3` and `-O2` is mostly down
> > to whether loops with statically determinable trip counts are fully
> > unrolled vs unrolled to a multiple of SIMD width.
> >
> > This patch is effectively a revert of
> > commit 15f5db60a137 ("kbuild,arc: add
> > CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC") without re-adding
> > ARCH_CFLAGS
> >
> > Ever since
> > commit cfdbc2e16e65 ("ARC: Build system: Makefiles, Kconfig, Linker
> > script")
> > ARC has been built with -O3, though the reason for doing so was not
> > specified in inline comments or the commit message. This commit does not
> > re-add -O3 to arch/arc/Makefile.
> >
> > Folks looking to experiment with `-O3` (or any compiler flag for that
> > matter) may pass them along to the command line invocation of make:
> >
> > $ make KCFLAGS=-O3
> >
> > Code that looks to re-add an explicit Kconfig option for `-O3` should
> > provide:
> > 1. A rigorous and reproducible performance profile of a reasonable
> > userspace workload that demonstrates a hot loop in the kernel that
> > would benefit from `-O3` over `-O2`.
> > 2. Disassembly of said loop body before and after.
> > 3. Provides stats on terms of increase in file size.
> >
>
> Might be worth cleaning up the rest of the kernel of instances of -O3,
> too. -O3 used to build lz4 and mips vdso, for instance. Might be a bit
> of a digression, though


This patch focuses on the removal of CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3,
so I think it is OK as-is.

The rest of cleanups, if needed,
should be submitted separately.




>
> --
> ~miko
>
>


--
Best Regards
Masahiro Yamada