2019-09-30 11:46:21

by Will Deacon

[permalink] [raw]
Subject: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

This reverts commit ac7c3e4ff401b304489a031938dbeaab585bfe0a for ARM and
arm64.

Building an arm64 kernel with CONFIG_OPTIMIZE_INLINING=y has been shown
to violate fixed register allocations of local variables passed to
inline assembly with GCC prior to version 9 which can lead to subtle
failures at runtime:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111

A very similar has been reported for 32-bit ARM as well:

https://lkml.kernel.org/r/[email protected]

Although GCC 9.1 appears to work for the specific case in the bugzilla
above, the exact issue has not been root-caused so play safe and disable
the option for now on these architectures.

Cc: Nicolas Saenz Julienne <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Russell King <[email protected]>
Cc: Masahiro Yamada <[email protected]>,
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
lib/Kconfig.debug | 1 +
1 file changed, 1 insertion(+)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 93d97f9b0157..c37c72adaeff 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -312,6 +312,7 @@ config HEADERS_CHECK

config OPTIMIZE_INLINING
def_bool y
+ depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
help
This option determines if the kernel forces gcc to inline the functions
developers have marked 'inline'. Doing so takes away freedom from gcc to
--
2.23.0.444.g18eeb5a265-goog


2019-10-01 09:44:15

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <[email protected]> wrote:
>
> This reverts commit ac7c3e4ff401b304489a031938dbeaab585bfe0a for ARM and
> arm64.
>
> Building an arm64 kernel with CONFIG_OPTIMIZE_INLINING=y has been shown
> to violate fixed register allocations of local variables passed to
> inline assembly with GCC prior to version 9 which can lead to subtle
> failures at runtime:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
>
> A very similar has been reported for 32-bit ARM as well:
>
> https://lkml.kernel.org/r/[email protected]


For reviewers:
The main discussion is here:

https://lore.kernel.org/patchwork/patch/1122097/



> Although GCC 9.1 appears to work for the specific case in the bugzilla
> above, the exact issue has not been root-caused so play safe and disable
> the option for now on these architectures.
>
> Cc: Nicolas Saenz Julienne <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Masahiro Yamada <[email protected]>,
> Cc: Arnd Bergmann <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> lib/Kconfig.debug | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 93d97f9b0157..c37c72adaeff 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -312,6 +312,7 @@ config HEADERS_CHECK
>
> config OPTIMIZE_INLINING
> def_bool y
> + depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111


This is a too big hammer.

For ARM, it is not a compiler bug, so I am trying to fix the kernel code.

For ARM64, even if it is a compiler bug, you can add __always_inline
to the functions in question.
(arch_atomic64_dec_if_positive in this case).

You do not need to force __always_inline globally.




> help
> This option determines if the kernel forces gcc to inline the functions
> developers have marked 'inline'. Doing so takes away freedom from gcc to
> --
> 2.23.0.444.g18eeb5a265-goog
>




--
Best Regards
Masahiro Yamada

2019-10-01 10:45:05

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <[email protected]> wrote:
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 93d97f9b0157..c37c72adaeff 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> >
> > config OPTIMIZE_INLINING
> > def_bool y
> > + depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
>
>
> This is a too big hammer.

It matches the previous default behaviour!

> For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
>
> For ARM64, even if it is a compiler bug, you can add __always_inline
> to the functions in question.
> (arch_atomic64_dec_if_positive in this case).
>
> You do not need to force __always_inline globally.

So you'd prefer I do something like the diff below? I mean, it's a start,
but I do worry that we're hanging arch/arm/ out to dry.

Will

--->8

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index c6bd87d2915b..574808b9df4c 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -321,7 +321,8 @@ static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v)
}

#define __CMPXCHG_CASE(w, sfx, name, sz, mb, cl...) \
-static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
+static __always_inline u##sz \
+__lse__cmpxchg_case_##name##sz(volatile void *ptr, \
u##sz old, \
u##sz new) \
{ \
@@ -362,7 +363,8 @@ __CMPXCHG_CASE(x, , mb_, 64, al, "memory")
#undef __CMPXCHG_CASE

#define __CMPXCHG_DBL(name, mb, cl...) \
-static inline long __lse__cmpxchg_double##name(unsigned long old1, \
+static __always_inline long \
+__lse__cmpxchg_double##name(unsigned long old1, \
unsigned long old2, \
unsigned long new1, \
unsigned long new2, \

2019-10-01 11:42:48

by Andrew Murray

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 11:42:54AM +0100, Will Deacon wrote:
> On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> > On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <[email protected]> wrote:
> > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > index 93d97f9b0157..c37c72adaeff 100644
> > > --- a/lib/Kconfig.debug
> > > +++ b/lib/Kconfig.debug
> > > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> > >
> > > config OPTIMIZE_INLINING
> > > def_bool y
> > > + depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
> >
> >
> > This is a too big hammer.
>
> It matches the previous default behaviour!
>
> > For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
> >
> > For ARM64, even if it is a compiler bug, you can add __always_inline
> > to the functions in question.
> > (arch_atomic64_dec_if_positive in this case).
> >
> > You do not need to force __always_inline globally.
>
> So you'd prefer I do something like the diff below? I mean, it's a start,
> but I do worry that we're hanging arch/arm/ out to dry.

If I've understood one part of this issue correctly - and using the
c2p_unsupported build failure as an example [1], there are instances in
the kernel where it is assumed that the compiler will optimise out a call
to an undefined function, and also assumed that the compiler will know
at compile time that the function will never get called. It's common to
satisfy this assumption when the calling function is inlined.

But I suspect there may be other cases similar to c2p_unsupported which
are still lurking.

For example the following functions are called but non-existent, and thus
may be an area worth investigating:

__buggy_use_of_MTHCA_PUT, __put_dbe_unknown, __cmpxchg_wrong_size,
__bad_percpu_size, __put_user_bad, __get_user_unknown,
__bad_unaligned_access_size, __bad_xchg

But more generally, as this is a common pattern - isn't there a benefit
here for changing all of these to BUILD_BUG? (So they can be found easily).

Or to avoid this class of issues, change them to BUG or unreachable - but
lose the benefit of compile time detection?

Thanks,

Andrew Murray

[1] https://lore.kernel.org/patchwork/patch/1122097/
>
> Will
>
> --->8
>
> diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
> index c6bd87d2915b..574808b9df4c 100644
> --- a/arch/arm64/include/asm/atomic_lse.h
> +++ b/arch/arm64/include/asm/atomic_lse.h
> @@ -321,7 +321,8 @@ static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v)
> }
>
> #define __CMPXCHG_CASE(w, sfx, name, sz, mb, cl...) \
> -static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
> +static __always_inline u##sz \
> +__lse__cmpxchg_case_##name##sz(volatile void *ptr, \
> u##sz old, \
> u##sz new) \
> { \
> @@ -362,7 +363,8 @@ __CMPXCHG_CASE(x, , mb_, 64, al, "memory")
> #undef __CMPXCHG_CASE
>
> #define __CMPXCHG_DBL(name, mb, cl...) \
> -static inline long __lse__cmpxchg_double##name(unsigned long old1, \
> +static __always_inline long \
> +__lse__cmpxchg_double##name(unsigned long old1, \
> unsigned long old2, \
> unsigned long new1, \
> unsigned long new2, \
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2019-10-01 14:38:54

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 12:41:30PM +0100, Andrew Murray wrote:
> On Tue, Oct 01, 2019 at 11:42:54AM +0100, Will Deacon wrote:
> > On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> > > On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <[email protected]> wrote:
> > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > index 93d97f9b0157..c37c72adaeff 100644
> > > > --- a/lib/Kconfig.debug
> > > > +++ b/lib/Kconfig.debug
> > > > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> > > >
> > > > config OPTIMIZE_INLINING
> > > > def_bool y
> > > > + depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
> > >
> > >
> > > This is a too big hammer.
> >
> > It matches the previous default behaviour!
> >
> > > For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
> > >
> > > For ARM64, even if it is a compiler bug, you can add __always_inline
> > > to the functions in question.
> > > (arch_atomic64_dec_if_positive in this case).
> > >
> > > You do not need to force __always_inline globally.
> >
> > So you'd prefer I do something like the diff below? I mean, it's a start,
> > but I do worry that we're hanging arch/arm/ out to dry.
>
> If I've understood one part of this issue correctly - and using the
> c2p_unsupported build failure as an example [1], there are instances in
> the kernel where it is assumed that the compiler will optimise out a call
> to an undefined function, and also assumed that the compiler will know
> at compile time that the function will never get called. It's common to
> satisfy this assumption when the calling function is inlined.
>
> But I suspect there may be other cases similar to c2p_unsupported which
> are still lurking.
>
> For example the following functions are called but non-existent, and thus
> may be an area worth investigating:
>
> __buggy_use_of_MTHCA_PUT, __put_dbe_unknown, __cmpxchg_wrong_size,
> __bad_percpu_size, __put_user_bad, __get_user_unknown,
> __bad_unaligned_access_size, __bad_xchg
>
> But more generally, as this is a common pattern - isn't there a benefit
> here for changing all of these to BUILD_BUG? (So they can be found easily).

Precisely, what is your suggestion?

If you think that replacing the call to __get_user_bad with BUILD_BUG(),
BUILD_BUG() becomes a no-op when __OPTIMIZE__ is not defined (see the
definition of __compiletime_assert() in linux/compiler.h); this means
such places will be reachable, which leads to uninitialised variables.

> Or to avoid this class of issues, change them to BUG or unreachable - but
> lose the benefit of compile time detection?

I think you ought to read the GCC manual wrt __builtin_unreachable().
"If control flow reaches the point of the `__builtin_unreachable',
the program is undefined. It is useful in situations where the
compiler cannot deduce the unreachability of the code."

I have seen cases where the instructions following an unreachable
code section have been the literal pool for the function - which,
if reached, would be quite confusing to debug. If you're lucky, you
might get an undefined instruction exception. If not, you could
continue and start executing another part of the function, leading
to possibly no crash at all - but unexpected results (which may end
up leaking sensitive data.)

For example, in our BUG() implementation on 32-bit ARM, we use
unreachable() after the asm() statement creating the bug table
entry and inserting the undefined instruction into the text.
Here's the resulting disassembly:

278: ebfffffe bl 0 <page_mapped>
278: R_ARM_CALL page_mapped
27c: e3500000 cmp r0, #0
280: 1a00006c bne 438 <invalidate_inode_pages2_range+0x3ac>
...
2d4: ebfffffe bl 0 <_raw_spin_lock_irqsave>
2d4: R_ARM_CALL _raw_spin_lock_irqsave
2d8: e5943008 ldr r3, [r4, #8]
2dc: e3130001 tst r3, #1
2e0: e1a02000 mov r2, r0
2e4: 1a000054 bne 43c <invalidate_inode_pages2_range+0x3b0>
...
438: e7f001f2 .word 0xe7f001f2
43c: e2433001 sub r3, r3, #1
440: eaffffa9 b 2ec <invalidate_inode_pages2_range+0x260>

Now, consider what unreachable() actually gets you here - it tells
the compiler that we do not expect to reach this point (that being
the point between 438 and 43c.) If we were to reach that point, we
would continue executing the code at 43c.

In this case, it would be like...

if (BUG_ON(page_mapped(page)))
goto random-location-in-xa_lock_irqsave()-inside-invalidate_complete_page2();

So no. unreachable() is not an option.

We really do want these places to be compile-time detected - relying
on triggering them at runtime is just not good enough IMHO (think
about how much testing the kernel would require to discover one of
these suckers buried in the depths of the code.)

Here's the question to ask: do we want to reliably detect issues
that we know are bad, which can lead to:
- unreliable kernel operation,
- user exploitable crashes,
or do we want to hide them for the sake of allowing -O0 compilation?

Given that the kernel as a general rule has very poor run-time test
coverage at the moment, I don't think this is the time to consider
giving up the protection that we have against this badness.

We've had several instances where these checks have triggered in the
user access code, and people have noticed when doing build tests.
They probably don't have the ability to do run-time testing on every
arch.

So, the existing facility of detecting these at build time is, IMHO,
an absolute must.

It would be different if the kernel community as a whole had the
ability to run-test every code path through the kernel source on
every arch, but I don't see that's a remotely realistic prospect.

If we want -O0 to work, but still want to preserve the ability to
detect these adequately, I think the easiest solution to that would
be to provide these dummy functions only when building with -O0,
making them all BUG().

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

2019-10-01 15:29:48

by Andrew Murray

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 03:36:26PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Oct 01, 2019 at 12:41:30PM +0100, Andrew Murray wrote:
> > On Tue, Oct 01, 2019 at 11:42:54AM +0100, Will Deacon wrote:
> > > On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> > > > On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <[email protected]> wrote:
> > > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > > index 93d97f9b0157..c37c72adaeff 100644
> > > > > --- a/lib/Kconfig.debug
> > > > > +++ b/lib/Kconfig.debug
> > > > > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> > > > >
> > > > > config OPTIMIZE_INLINING
> > > > > def_bool y
> > > > > + depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
> > > >
> > > >
> > > > This is a too big hammer.
> > >
> > > It matches the previous default behaviour!
> > >
> > > > For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
> > > >
> > > > For ARM64, even if it is a compiler bug, you can add __always_inline
> > > > to the functions in question.
> > > > (arch_atomic64_dec_if_positive in this case).
> > > >
> > > > You do not need to force __always_inline globally.
> > >
> > > So you'd prefer I do something like the diff below? I mean, it's a start,
> > > but I do worry that we're hanging arch/arm/ out to dry.
> >
> > If I've understood one part of this issue correctly - and using the
> > c2p_unsupported build failure as an example [1], there are instances in
> > the kernel where it is assumed that the compiler will optimise out a call
> > to an undefined function, and also assumed that the compiler will know
> > at compile time that the function will never get called. It's common to
> > satisfy this assumption when the calling function is inlined.
> >
> > But I suspect there may be other cases similar to c2p_unsupported which
> > are still lurking.
> >
> > For example the following functions are called but non-existent, and thus
> > may be an area worth investigating:
> >
> > __buggy_use_of_MTHCA_PUT, __put_dbe_unknown, __cmpxchg_wrong_size,
> > __bad_percpu_size, __put_user_bad, __get_user_unknown,
> > __bad_unaligned_access_size, __bad_xchg
> >
> > But more generally, as this is a common pattern - isn't there a benefit
> > here for changing all of these to BUILD_BUG? (So they can be found easily).
>
> Precisely, what is your suggestion?
>
> If you think that replacing the call to __get_user_bad with BUILD_BUG(),
> BUILD_BUG() becomes a no-op when __OPTIMIZE__ is not defined (see the
> definition of __compiletime_assert() in linux/compiler.h); this means
> such places will be reachable, which leads to uninitialised variables.

I hadn't noticed the use of __OPTIMIZE__ - indeed if __compiletime_assert
is no-op'd and you reach it then you won't have a build error - but you
may get uninitialised values instead.

Presumably the purpose of __OPTIMIZE__ in this case is to prevent getting
an undefined function error for the __compiletime_assert line, even though
it doesn't get called (when using a compiler that doesn't optimize out the
call to the unused function).

Why is the call to __get_user_bad not guarded in this way for when
__OPTIMIZE__ isn't set, i.e. why doesn't it suffer from the issue
that the following fixes?

c03567a8e8d5 ("include/linux/compiler.h: don't perform compiletime_assert with -O0")


>
> > Or to avoid this class of issues, change them to BUG or unreachable - but
> > lose the benefit of compile time detection?
>
> I think you ought to read the GCC manual wrt __builtin_unreachable().
> "If control flow reaches the point of the `__builtin_unreachable',
> the program is undefined. It is useful in situations where the
> compiler cannot deduce the unreachability of the code."

Thanks, I've now read it. My suggestion arose from looking at existing
uses in the kernel - e.g. drivers/spi/spi-rockchip.c,
drivers/pinctrl/pinctrl-ingenic.c, etc. I guess these types of uses
should use BUG or similar instead right?

>
> I have seen cases where the instructions following an unreachable
> code section have been the literal pool for the function - which,
> if reached, would be quite confusing to debug. If you're lucky, you
> might get an undefined instruction exception. If not, you could
> continue and start executing another part of the function, leading
> to possibly no crash at all - but unexpected results (which may end
> up leaking sensitive data.)
>
> For example, in our BUG() implementation on 32-bit ARM, we use
> unreachable() after the asm() statement creating the bug table
> entry and inserting the undefined instruction into the text.
> Here's the resulting disassembly:
>
> 278: ebfffffe bl 0 <page_mapped>
> 278: R_ARM_CALL page_mapped
> 27c: e3500000 cmp r0, #0
> 280: 1a00006c bne 438 <invalidate_inode_pages2_range+0x3ac>
> ...
> 2d4: ebfffffe bl 0 <_raw_spin_lock_irqsave>
> 2d4: R_ARM_CALL _raw_spin_lock_irqsave
> 2d8: e5943008 ldr r3, [r4, #8]
> 2dc: e3130001 tst r3, #1
> 2e0: e1a02000 mov r2, r0
> 2e4: 1a000054 bne 43c <invalidate_inode_pages2_range+0x3b0>
> ...
> 438: e7f001f2 .word 0xe7f001f2
> 43c: e2433001 sub r3, r3, #1
> 440: eaffffa9 b 2ec <invalidate_inode_pages2_range+0x260>
>
> Now, consider what unreachable() actually gets you here - it tells
> the compiler that we do not expect to reach this point (that being
> the point between 438 and 43c.) If we were to reach that point, we
> would continue executing the code at 43c.
>
> In this case, it would be like...
>
> if (BUG_ON(page_mapped(page)))
> goto random-location-in-xa_lock_irqsave()-inside-invalidate_complete_page2();
>
> So no. unreachable() is not an option.

Thanks for the example.

>
> We really do want these places to be compile-time detected - relying
> on triggering them at runtime is just not good enough IMHO (think
> about how much testing the kernel would require to discover one of
> these suckers buried in the depths of the code.)
>
> Here's the question to ask: do we want to reliably detect issues
> that we know are bad, which can lead to:
> - unreliable kernel operation,
> - user exploitable crashes,
> or do we want to hide them for the sake of allowing -O0 compilation?
>
> Given that the kernel as a general rule has very poor run-time test
> coverage at the moment, I don't think this is the time to consider
> giving up the protection that we have against this badness.
>
> We've had several instances where these checks have triggered in the
> user access code, and people have noticed when doing build tests.
> They probably don't have the ability to do run-time testing on every
> arch.
>
> So, the existing facility of detecting these at build time is, IMHO,
> an absolute must.
>
> It would be different if the kernel community as a whole had the
> ability to run-test every code path through the kernel source on
> every arch, but I don't see that's a remotely realistic prospect.

I completely agree.

>
> If we want -O0 to work, but still want to preserve the ability to
> detect these adequately, I think the easiest solution to that would
> be to provide these dummy functions only when building with -O0,
> making them all BUG().

Though please note that this issue isn't limited to -O0, it relates to
building without CONFIG_OPTIMIZE_INLINING - resulting in functions
marked as inline not always being inlined.

I'm interested to determine if there is a benefit for all the functions
I mentioned (there are more) to have the same name which may be something
other than BUILD_BUG.

Thanks,

Andrew Murray

>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up

2019-10-01 15:50:16

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 04:28:27PM +0100, Andrew Murray wrote:
> I hadn't noticed the use of __OPTIMIZE__ - indeed if __compiletime_assert
> is no-op'd and you reach it then you won't have a build error - but you
> may get uninitialised values instead.
>
> Presumably the purpose of __OPTIMIZE__ in this case is to prevent getting
> an undefined function error for the __compiletime_assert line, even though
> it doesn't get called (when using a compiler that doesn't optimize out the
> call to the unused function).
>
> Why is the call to __get_user_bad not guarded in this way for when
> __OPTIMIZE__ isn't set, i.e. why doesn't it suffer from the issue
> that the following fixes?

Officially, the kernel does not support building with -O0. To start
with, the top level makefile has:

ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
KBUILD_CFLAGS += -Os
else
KBUILD_CFLAGS += -O2
endif

and we've said for years that the kernel relies upon the compiler
optimiser to build correctly. You may be lucky if you pass it via
some method to 'make' but that's going to rely on the argument order
to the compiler, and the order in which the compiler processes its
arguments, and whether it (for example) correctly disables all
optimisations if it encounters -O0 somewhere.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

2019-10-01 16:51:20

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

> So you'd prefer I do something like the diff below?

Yes, I find that diff preferable. Use __always_inline only when absolutely
necessary. Even then, it sounds like this is a workaround for one compiler,
so it should probably also have a comment. (I don't mind changing this for
all compilers).

2019-10-01 16:52:58

by Andrew Murray

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 04:48:14PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Oct 01, 2019 at 04:28:27PM +0100, Andrew Murray wrote:
> > I hadn't noticed the use of __OPTIMIZE__ - indeed if __compiletime_assert
> > is no-op'd and you reach it then you won't have a build error - but you
> > may get uninitialised values instead.
> >
> > Presumably the purpose of __OPTIMIZE__ in this case is to prevent getting
> > an undefined function error for the __compiletime_assert line, even though
> > it doesn't get called (when using a compiler that doesn't optimize out the
> > call to the unused function).
> >
> > Why is the call to __get_user_bad not guarded in this way for when
> > __OPTIMIZE__ isn't set, i.e. why doesn't it suffer from the issue
> > that the following fixes?
>
> Officially, the kernel does not support building with -O0. To start
> with, the top level makefile has:
>
> ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
> KBUILD_CFLAGS += -Os
> else
> KBUILD_CFLAGS += -O2
> endif
>
> and we've said for years that the kernel relies upon the compiler
> optimiser to build correctly. You may be lucky if you pass it via
> some method to 'make' but that's going to rely on the argument order
> to the compiler, and the order in which the compiler processes its
> arguments, and whether it (for example) correctly disables all
> optimisations if it encounters -O0 somewhere.

So in practice, __OPTIMIZE__ will likely always be set, as far as I
can tell it's supported in GCC, Clang and Intel compilers. Though
the exception to this is for crypto/jitterentropy.c where the -O0 flag
is unconditionally set.

Are there other exceptions to this in terms of compilers?

Perhaps it may be possible to use BUILD_BUG after all.

Thanks,

Andrew Murray

>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up

2019-10-01 17:13:30

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

On Tue, Oct 01, 2019 at 09:21:21AM -0700, Nick Desaulniers wrote:
> > So you'd prefer I do something like the diff below?
>
> Yes, I find that diff preferable. Use __always_inline only when absolutely
> necessary. Even then, it sounds like this is a workaround for one compiler,
> so it should probably also have a comment. (I don't mind changing this for
> all compilers).

I wondered about a comment, but I couldn't find a good place to put it.
We basically need to say "If you're using explicit register variables in
your function, then you should use __always_inline", but there are other
functions too (such as arch/arm64/include/asm/vdso/gettimeofday.h) so
a random comment buried in the atomic code doesn't feel very helpful to
me.

Will