2022-07-14 09:11:40

by Sudip Mukherjee

[permalink] [raw]
Subject: mainline build failure of powerpc allmodconfig for prom_init_check

Hi All,

Not sure if it has been reported before but the latest mainline kernel
branch fails to build for powerpc allmodconfig with gcc-12 and the error is:

Error: External symbol 'memset' referenced from prom_init.c
make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The commit ca5dabcff1df ("powerpc/prom_init: Fix build failure with
GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN") looks similar but the error
is still there with gcc-12.

Note: I don't see this error with gcc-11.


I will be happy to test any patch or provide any extra log if needed.

--
Regards
Sudip


2022-07-17 09:26:25

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Thu, Jul 14, 2022 at 9:55 AM Sudip Mukherjee (Codethink)
<[email protected]> wrote:
>
> Hi All,
>
> Not sure if it has been reported before but the latest mainline kernel
> branch fails to build for powerpc allmodconfig with gcc-12 and the error is:
>
> Error: External symbol 'memset' referenced from prom_init.c
> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

I was trying to check it. With gcc-11 the assembly code generated is
not using memset, but using __memset.
But with gcc-12, I can see the assembly code is using memset. One
example from the assembly:

call_prom:
.quad .call_prom,.TOC.@tocbase,0
.previous
.size call_prom,24
.type .call_prom,@function
.call_prom:
mflr 0 #,
std 29,-24(1) #,
std 30,-16(1) #,
std 31,-8(1) #,
mr 29,3 # tmp166, service
mr 31,4 # nargs, tmp167
mr 30,5 # tmp168, nret
# arch/powerpc/kernel/prom_init.c:396: struct prom_args args;
li 4,254 #,
li 5,52 #,
# arch/powerpc/kernel/prom_init.c:394: {
std 0,16(1) #,
stdu 1,-208(1) #,,
# arch/powerpc/kernel/prom_init.c:396: struct prom_args args;
addi 3,1,112 # tmp174,,
# arch/powerpc/kernel/prom_init.c:394: {
std 9,304(1) #,
std 10,312(1) #,
std 6,280(1) #,
std 7,288(1) #,
std 8,296(1) #,
# arch/powerpc/kernel/prom_init.c:396: struct prom_args args;
bl .memset #
nop
rldicl 9,31,0,32 # nargs, nargs
addi 9,9,1 # tmp163, nargs,
mtctr 9 # tmp163, tmp163



--
Regards
Sudip

2022-07-17 14:57:35

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
<[email protected]> wrote:
>
> I was trying to check it. With gcc-11 the assembly code generated is
> not using memset, but using __memset.
> But with gcc-12, I can see the assembly code is using memset. One
> example from the assembly:

You could try making the 'args' array in 'struct prom_args' be marked
'volatile'.

Ie something like this:

--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -115,6 +115,6 @@ struct prom_args {
__be32 service;
__be32 nargs;
__be32 nret;
- __be32 args[10];
+ volatile __be32 args[10];
};

because I think it's just the compilers turning the small loop over
those fields into a "memset()".

Linus

2022-07-17 20:29:57

by Segher Boessenkool

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 07:44:22AM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
> <[email protected]> wrote:
> > I was trying to check it. With gcc-11 the assembly code generated is
> > not using memset, but using __memset.
> > But with gcc-12, I can see the assembly code is using memset. One
> > example from the assembly:
>
> You could try making the 'args' array in 'struct prom_args' be marked
> 'volatile'.
>
> Ie something like this:
>
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -115,6 +115,6 @@ struct prom_args {
> __be32 service;
> __be32 nargs;
> __be32 nret;
> - __be32 args[10];
> + volatile __be32 args[10];
> };
>
> because I think it's just the compilers turning the small loop over
> those fields into a "memset()".

Yes. See <https://gcc.gnu.org/onlinedocs/gcc/Standards.html#C-Language>
near the end:
Most of the compiler support routines used by GCC are present in
libgcc, but there are a few exceptions. GCC requires the freestanding
environment provide memcpy, memmove, memset and memcmp. Finally, if
__builtin_trap is used, and the target does not implement the trap
pattern, then GCC emits a call to abort.

Can't we simply have a small simple implementation of these functions in
arch/powerpc/boot/? This stuff is not performance-critical, and this is
not the first time we hit these problems.


Segher

2022-07-17 20:31:41

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 3:44 PM Linus Torvalds
<[email protected]> wrote:
>
> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
> <[email protected]> wrote:
> >
> > I was trying to check it. With gcc-11 the assembly code generated is
> > not using memset, but using __memset.
> > But with gcc-12, I can see the assembly code is using memset. One
> > example from the assembly:
>
> You could try making the 'args' array in 'struct prom_args' be marked
> 'volatile'.
>
> Ie something like this:
>
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -115,6 +115,6 @@ struct prom_args {
> __be32 service;
> __be32 nargs;
> __be32 nret;
> - __be32 args[10];
> + volatile __be32 args[10];
> };
>
> because I think it's just the compilers turning the small loop over
> those fields into a "memset()".

That didn't work.
"Error: External symbol 'memset' referenced from prom_init.c" is still
there with this change.
And the generated assembly still has the memset for "struct prom_args".


--
Regards
Sudip

2022-07-17 21:05:46

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
<[email protected]> wrote:
>
> And the generated assembly still has the memset for "struct prom_args".

Strange. That smells like a compiler bug to me.

But I can't read powerpc assembly code - it's been too many years, and
even back when I did read it I hated how the register "names" worked.

Maybe it was never the args array, and it was about the other fields.
Not that that makes any sense either, but it makes more sense than the
compiler turning a series of volatile accesses into a memset.

Linus

2022-07-17 21:13:50

by Segher Boessenkool

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 01:29:07PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
> <[email protected]> wrote:
> >
> > And the generated assembly still has the memset for "struct prom_args".
>
> Strange. That smells like a compiler bug to me.
>
> But I can't read powerpc assembly code - it's been too many years, and
> even back when I did read it I hated how the register "names" worked.
>
> Maybe it was never the args array, and it was about the other fields.
> Not that that makes any sense either, but it makes more sense than the
> compiler turning a series of volatile accesses into a memset.

Calling mem* on a volatile object (or a struct containing one) is not
valid. I opened gcc.gnu.org/PR106335. Thanks for bringing this up!


Segher

2022-07-17 21:35:59

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 9:29 PM Linus Torvalds
<[email protected]> wrote:
>
> On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
> <[email protected]> wrote:
> >
> > And the generated assembly still has the memset for "struct prom_args".
>
> Strange. That smells like a compiler bug to me.

Both gcc-12 and clang gives this error.

>
> But I can't read powerpc assembly code - it's been too many years, and
> even back when I did read it I hated how the register "names" worked.
>
> Maybe it was never the args array, and it was about the other fields.
> Not that that makes any sense either, but it makes more sense than the
> compiler turning a series of volatile accesses into a memset.

I have also tried adding volatile to all the members of that struct. :(


--
Regards
Sudip

2022-07-17 21:37:50

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 1:38 PM Sudip Mukherjee
<[email protected]> wrote:
>
> I have also tried adding volatile to all the members of that struct. :(

Can you read the code to figure otu what the memcpy is all about?

Or maybe there is something that disables 'volatile' with pre-processor hackery.

Because a compiler that turns a loop over volatile members into
'memset()' isn't a C compiler, it's just a random noise generator.
'volatile' is fundamental enough that I really doubt both gcc and
clang can be that broken.

I just tested this

struct hello {
volatile int array[100];
};

void test(void)
{
int i;
struct hello hello;
for (i = 0; i < 100; i++)
hello.array[i] = 0;
}

on x86-64, and sure enough, gcc-12 turns turns it into a memset
without the volatile (in fact, the above will just be optimized away
entirely since it has no user), but with the volatile it's a proper
regular loop that does 32-byte accesses one by one (and in the proper
ascending oder). Something that memset() most definitely does not
guarantee:

.L2:
movslq %eax, %rdx
addl $1, %eax
movl $0, -120(%rsp,%rdx,4)
cmpl $100, %eax
jne .L2

and honestly, anything else sounds completely unacceptable.

So I suspect there is something wrong with your testing, because gcc
simply isn't that incredibly broken. Clang is interesting in that it
seems to unroll the loop five times, but it still does the proper
"write individual 32-bit entities in ascending order".

The other alternative is that it's something else than that 'struct
prom_args'. Again, I don't read powerpc asm good.

Linus

2022-07-17 21:38:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 2:00 PM Segher Boessenkool
<[email protected]> wrote:
>
> Calling mem* on a volatile object (or a struct containing one) is not
> valid. I opened gcc.gnu.org/PR106335.

Well, that very quickly got marked as a duplicate of a decade-old bug.

So I guess we shouldn't expect this to be fixed any time soon.

That said, your test-case of copying the whole structure is very
different from the one in the kernel that works on them one structure
member at a time.

I can *kind of* see the logic that when you do a whole struct
assignment, it turns into a "memcpy" without regard for volatile
members. You're not actually accessing the volatile members in some
particular order, so the struct assignment arguably does not really
have an access ordering that needs to be preserved.

But the kernel code in question very much does access the members
individually, and so I think that the compiler quite unequivocally did
something horribly horribly bad by turning them into a memset.

So I don't think your test-case is really particularly good, and maybe
that's why that old bug has languished for over a decade - people
didn't realize just *how* incredibly broken it was.

Linus

2022-07-17 22:06:19

by Segher Boessenkool

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 02:11:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 2:00 PM Segher Boessenkool
> <[email protected]> wrote:
> > Calling mem* on a volatile object (or a struct containing one) is not
> > valid. I opened gcc.gnu.org/PR106335.
>
> Well, that very quickly got marked as a duplicate of a decade-old bug.
>
> So I guess we shouldn't expect this to be fixed any time soon.

It shouldn't be all that hard to implement. GCC wants all ports to
define their own mem* because these functions are so critical for
performance, but it isn't hard to do a straightforward by-field copy
for assignments if using memcpy would not be valid at all. Also, if
we would have this we could make a compiler flag saying to always
open-code this, getting rid of this annoyance (namely, that extetnal
mem* are required) for -ffreestanding.

> That said, your test-case of copying the whole structure is very
> different from the one in the kernel that works on them one structure
> member at a time.
>
> I can *kind of* see the logic that when you do a whole struct
> assignment, it turns into a "memcpy" without regard for volatile
> members. You're not actually accessing the volatile members in some
> particular order, so the struct assignment arguably does not really
> have an access ordering that needs to be preserved.

The order is not defined, correct. But a "volatile int" can only be
accessed as an int, and an external memcpy will typically use different
size accesses, and can even access some fields more than once (or
partially); all not okay for a volatile object.

> But the kernel code in question very much does access the members
> individually, and so I think that the compiler quite unequivocally did
> something horribly horribly bad by turning them into a memset.
>
> So I don't think your test-case is really particularly good, and maybe
> that's why that old bug has languished for over a decade - people
> didn't realize just *how* incredibly broken it was.

People haven't looked at my test case for all that time, it sprouted
from my demented mind just minutes ago ;-) The purpose of writing it
this way was to make sure that memcpy will be called for this (on any
target etc.), not some shorter and/or smarter thing.

I don't know what the real reason is that this bugs hasn't been fixed
yet. It should be quite easy to make this more correct. In
<https://patchwork.ozlabs.org/project/gcc/patch/[email protected]/#843066>
Richard suggested doing it in the frontend, which seems reasonable (but
more work than the patch there).

There have been no follow-up patches as far as I can see :-(


Segher

2022-07-18 01:51:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 2:49 PM Segher Boessenkool
<[email protected]> wrote:
>
> > I can *kind of* see the logic that when you do a whole struct
> > assignment, it turns into a "memcpy" without regard for volatile
> > members. You're not actually accessing the volatile members in some
> > particular order, so the struct assignment arguably does not really
> > have an access ordering that needs to be preserved.
>
> The order is not defined, correct. But a "volatile int" can only be
> accessed as an int, and an external memcpy will typically use different
> size accesses, and can even access some fields more than once (or
> partially); all not okay for a volatile object.

That is not actually a valid or realistic argument in the general case.

The thing is, an operation on an aggregate type in C is fundamentally
different from the "do the same operation on the individual parts of
the struct".

Just to make a very concrete example of that, it's not at all
unreasonable to have a struct like this:

struct io_accessor {
union {
volatile unsigned char byte[8];
volatile unsigned short word[4];
...

and while that wasn't the example here, it's not completely insane as
a concept to use as a helper type so that you can do volatile accesses
of different sizes.

And while you'd be right to say that "assigning that kind of struct is
probably insane", and I wouldn't argue against it, I also think that
basically *any* union member is basically an argument that a structure
assignment is *NOT* about "assign all the individual members", and
never really can be.

In the above union, make one union member be a non-volatile type, and
suddenly it actually *can* be ok to copy the struct. Even though it
also has volatile bytes.

So once you start doing structure assignments but argue about
individual fields being volatile, I think you're on very shaky ground.

And I think "memcpy" is a reasonable way to say "we don't care - and
in the general case we CANNOT know - what the individual members are,
so we'll just copy it as one thing".

So the compiler emitting a "memcpy()" to assign a structure sounds
fine. Even in theory. Because the "but individual fields.." argument
just cannot work in general.

In contrast, when you access the members individually (like the kernel
does in this powerpc case), there is no such ambiguity.

There is no way in hell that it is ever ok to do a "memcpy()" when the
user has done the assignments one volatile member at a time.

So that's why I don't think your test-case with the struct assignment
is very good. I think it's very reasonable for a compiler person to
say "you assigned the whole struct, you get what you asked for, you
get a memcpy".

But when you do a loop that assigns individual volatile fields? No
such problem. Completely unambiguous that you need to do them one at a
time as individual accesses.

Linus

2022-07-18 04:08:09

by Michael Ellerman

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

Segher Boessenkool <[email protected]> writes:
> On Sun, Jul 17, 2022 at 07:44:22AM -0700, Linus Torvalds wrote:
>> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
>> <[email protected]> wrote:
>> > I was trying to check it. With gcc-11 the assembly code generated is
>> > not using memset, but using __memset.
>> > But with gcc-12, I can see the assembly code is using memset. One
>> > example from the assembly:
>>
>> You could try making the 'args' array in 'struct prom_args' be marked
>> 'volatile'.
>>
>> Ie something like this:
>>
>> --- a/arch/powerpc/kernel/prom_init.c
>> +++ b/arch/powerpc/kernel/prom_init.c
>> @@ -115,6 +115,6 @@ struct prom_args {
>> __be32 service;
>> __be32 nargs;
>> __be32 nret;
>> - __be32 args[10];
>> + volatile __be32 args[10];
>> };
>>
>> because I think it's just the compilers turning the small loop over
>> those fields into a "memset()".
>
> Yes. See <https://gcc.gnu.org/onlinedocs/gcc/Standards.html#C-Language>
> near the end:
> Most of the compiler support routines used by GCC are present in
> libgcc, but there are a few exceptions. GCC requires the freestanding
> environment provide memcpy, memmove, memset and memcmp. Finally, if
> __builtin_trap is used, and the target does not implement the trap
> pattern, then GCC emits a call to abort.
>
> Can't we simply have a small simple implementation of these functions in
> arch/powerpc/boot/? This stuff is not performance-critical, and this is
> not the first time we hit these problems.

prom_init.c isn't in arch/powerpc/boot :)

It's linked into the kernel proper, but we want it to behave like a
pre-boot environment (because not all boot paths run it) which is why we
restrict what symbols it can call.

We could have a prom_memset() etc. but we'd need to do some tricks to
rewrite references to memset() to prom_memset() before linking.

cheers

2022-07-18 05:36:33

by Michael Ellerman

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

Sudip Mukherjee <[email protected]> writes:
> On Thu, Jul 14, 2022 at 9:55 AM Sudip Mukherjee (Codethink)
> <[email protected]> wrote:
>>
>> Hi All,
>>
>> Not sure if it has been reported before but the latest mainline kernel
>> branch fails to build for powerpc allmodconfig with gcc-12 and the error is:
>>
>> Error: External symbol 'memset' referenced from prom_init.c
>> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
>
> I was trying to check it. With gcc-11 the assembly code generated is
> not using memset, but using __memset.
> But with gcc-12, I can see the assembly code is using memset. One
> example from the assembly:
>
> call_prom:
> .quad .call_prom,.TOC.@tocbase,0
> .previous
> .size call_prom,24
> .type .call_prom,@function
> .call_prom:
> mflr 0 #,
> std 29,-24(1) #,
> std 30,-16(1) #,
> std 31,-8(1) #,
> mr 29,3 # tmp166, service
> mr 31,4 # nargs, tmp167
> mr 30,5 # tmp168, nret
> # arch/powerpc/kernel/prom_init.c:396: struct prom_args args;
> li 4,254 #,

Here we load 254 into r4, which is the 2nd parameter to memset (c).

> li 5,52 #,

This is r5, the 3rd parameter (n), ie. the size of the structure.

That tells us we're memsetting the entire structure, ie. the 10 x 4
bytes of args.args plus 3 x 4 bytes for the other members.

> # arch/powerpc/kernel/prom_init.c:394: {
> std 0,16(1) #,
> stdu 1,-208(1) #,,
> # arch/powerpc/kernel/prom_init.c:396: struct prom_args args;
> addi 3,1,112 # tmp174,,

Here we load (calculate) the address of "args" into r3, the first
parameter to memset.

> # arch/powerpc/kernel/prom_init.c:394: {
> std 9,304(1) #,
> std 10,312(1) #,
> std 6,280(1) #,
> std 7,288(1) #,
> std 8,296(1) #,
> # arch/powerpc/kernel/prom_init.c:396: struct prom_args args;
> bl .memset #

So we're memsetting all of args to 254, not zero.

That's happening because allmodconfig with gcc 12 enables
CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I think the simplest fix in the short term is to just disable stack
initialisation for prom_init.c. It only runs at boot so there's no real
security impact to disabling it.

cheers

2022-07-18 08:25:11

by David Laight

[permalink] [raw]
Subject: RE: mainline build failure of powerpc allmodconfig for prom_init_check

From: Michael Ellerman
> Sent: 18 July 2022 05:41
...
> So we're memsetting all of args to 254, not zero.
>
> That's happening because allmodconfig with gcc 12 enables
> CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I can't help feeling it would be better if that generated
a call to a memset64() function.
Saving loads of tests at the top of the function,
and (most of?) the constant expansion to 64bit.

Although and explicit 'stack clear' function would be better
for the kernel - since it would give the option of patching
it away at startup.

I really can't help feeling that initialising on-stack
arrays will kill performance.
While kernel stack frames have to be relatively small,
in userspace very large on-stack arrays can be allocated
(and correctly bound checked) knowing that the cost is
minimal (maybe a TLB miss).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-07-18 14:30:59

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init

With GCC 12 allmodconfig prom_init fails to build:

Error: External symbol 'memset' referenced from prom_init.c
make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/kernel/Makefile | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f91f0f29a566..c8cf924bf9c0 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += -fno-stack-protector
CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_prom_init.o += -ffreestanding
+CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)

ifdef CONFIG_FUNCTION_TRACER
# Do not trace early boot code
--
2.35.3

2022-07-18 15:22:14

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init

On Mon, Jul 18, 2022 at 2:44 PM Michael Ellerman <[email protected]> wrote:
>
> With GCC 12 allmodconfig prom_init fails to build:
>
> Error: External symbol 'memset' referenced from prom_init.c
> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
>

<snip>

>
> Reported-by: Sudip Mukherjee <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>

And, this has fixed the build failure.
Thanks Michael.


--
Regards
Sudip

2022-07-18 15:24:30

by Segher Boessenkool

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Mon, Jul 18, 2022 at 01:52:38PM +1000, Michael Ellerman wrote:
> Segher Boessenkool <[email protected]> writes:
> > Can't we simply have a small simple implementation of these functions in
> > arch/powerpc/boot/? This stuff is not performance-critical, and this is
> > not the first time we hit these problems.
>
> prom_init.c isn't in arch/powerpc/boot :)

Ah duh :-)

> It's linked into the kernel proper, but we want it to behave like a
> pre-boot environment (because not all boot paths run it) which is why we
> restrict what symbols it can call.
>
> We could have a prom_memset() etc. but we'd need to do some tricks to
> rewrite references to memset() to prom_memset() before linking.

You can do it in its linker script?


Segher

2022-07-18 19:13:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init

On Mon, Jul 18, 2022 at 6:44 AM Michael Ellerman <[email protected]> wrote:
>
> With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
> causes the compiler to emit memset calls to initialise on-stack
> variables with a pattern.

Ahh, and that explains why "volatile" made no difference. That did
seem very odd.

Thanks for figuring it out,

Linus

2022-07-18 19:29:43

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <[email protected]> wrote:
>
> > li 4,254 #,
>
> Here we load 254 into r4, which is the 2nd parameter to memset (c).

I love how even powerpc people know that "4" is bogus, and have to
make it clear that it means "r4".

I don't understand why the powerpc assembler is so messed up, and uses
random integer constants for register "names".

And it gets even worse, when you start mixing FP, vector and integer "names".

I've seen many bad assemblers (in fact, I have *written* a couple of
bad assemblers myself), but I have never seen anything quite that
broken on any other architecture.

Oddities, yes ("$" as a prefix for register? Alpha asm is also very
odd), but nothing *quite* as broken as "simple constants have entirely
different meanings depending on the exact instruction and argument
position".

It's not even an IBM thing. S390 uses perfectly sane register syntax,
and calls things '%r4" etc.

The human-written asm files have those #define's in headers just to
make things slightly more legible, because apparently the assembler
doesn't even *accept* the sane names. So it's not even a "the compiler
generates this abbreviated illegible mess". It's literally that the
assembler is so horrid.

Why do people put up with that?

Linus

2022-07-18 22:20:04

by Segher Boessenkool

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Mon, Jul 18, 2022 at 12:06:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <[email protected]> wrote:
> > > li 4,254 #,
> >
> > Here we load 254 into r4, which is the 2nd parameter to memset (c).
>
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

This is compiler output. Compiler output is mainly meant for the
assembler to produce object code from. It isn't meant to be readable
(and e.g. -fverbose-asm didn't help much here, that's the "#," ;-) ).

The mnemonic determines what the operands mean. It is much easier to
read and write "li 4,254" than "li r4,254" or "li %r4,254", all of which
are valid. You can also write "li 3+1,2*127", but not with the other
forms (this is useful if you use assembler macros, which are way more
powerful and appropriate than abusing the C preprocessor, when writing
assembler code).

It matters more if you have three or four or five or six operands to an
assembler instruction, all the extra line noise makes things illegible.

The "%r4" variant hails from winnt. It is a bit problematic in inline
assembler, because you need to escape the % in extended inline asm, but
not in basic inline asm. It also is pure line noise to read.

The "r4" variant is problematic if you have symbols named the same.
When you use the -mregnames assembler option it is taken to mean the
register; you can write "(r6)" to mean the symbol. (There also are "sp"
and "rtos" and "xer" and whatnot, not just "r4").

> I don't understand why the powerpc assembler is so messed up, and uses
> random integer constants for register "names".

360 was the same. 370 was the same. 390 is the same. 801 was the
same. RIOS (aka POWER) was the same. So yes, PowerPC inherited it, I
don't know how much thought was put into this, don't change a winning
team etc.

> And it gets even worse, when you start mixing FP, vector and integer "names".

It is clear from the mnemonic what the operands are: some register, an
immediate, a constant, etc. An expression (which can include object
symbols) can be any of those.

Assembler language is unforgiving. It isn't easy to write, and most
mistakes will not be diagnosed. If the assmbler language makes it
easier to read the code, that makes it more likely correct code will be
written, and that correct code will be written in less time.

> I've seen many bad assemblers (in fact, I have *written* a couple of
> bad assemblers myself), but I have never seen anything quite that
> broken on any other architecture.
>
> Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> odd), but nothing *quite* as broken as "simple constants have entirely
> different meanings depending on the exact instruction and argument
> position".

What is broken about that? It makes everything very consistent, and
very readable. Sigils are just nasty, and having the register names the
same as valid symbol names is also problematic.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

s390 has the same syntax, and even inherited the GAS code for this from
the ppc port.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

That was true a long time ago. And the "#define r0 0" thing caused
quite a few bugs itself btw.

> So it's not even a "the compiler
> generates this abbreviated illegible mess". It's literally that the
> assembler is so horrid.

The disassembler has shown "r4" etc. by default since ages. The
assembler needs -mregnames to accept it; enabling this by default would
be a compatibility break, not acceptable.

> Why do people put up with that?

Why are people misinformed?

Is there anything in particular in the documentation we could improve?


Hope this helps,


Segher

2022-07-18 23:00:14

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

On Mon, Jul 18, 2022 at 3:12 PM Segher Boessenkool
<[email protected]> wrote:
>
> Assembler language is unforgiving. It isn't easy to write, and most
> mistakes will not be diagnosed. If the assmbler language makes it
> easier to read the code, that makes it more likely correct code will be
> written, and that correct code will be written in less time.

What's your argument? That it's unforgiving, so it has to be
unreadable and easy to make mistakes in too?

You can get the order of operands wrong, and it will still build -
just to completely the wrong thing.

> > Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> > odd), but nothing *quite* as broken as "simple constants have entirely
> > different meanings depending on the exact instruction and argument
> > position".
>
> What is broken about that? It makes everything very consistent, and
> very readable. Sigils are just nasty, and having the register names the
> same as valid symbol names is also problematic.

Oh, I agree that sigils are good to make the type clear. So '%r4' is
better than 'r4' because the latter could be ambiguous and you could
have a symbol 'r4'.

But just '4' is plain garbage. There's no "could be ambiguous" about it.

Type checking matters. Not just in C. In asm too.

The reason '$0' is odd because it's *just* a sigil and a number.

Which certainly is not unusual in itself, but it reads like it's a
number to me. Not just because of x86 gas ('$' means 'immediate'), but
Z80 ('$' means HEX), or at least 'Nth argument' (shell).

And yeah, alpha got it from MIPS, afaik.

And presumably MIPS got it from "we're hacking up the simplest
assembler we can".

> > The human-written asm files have those #define's in headers just to
> > make things slightly more legible, because apparently the assembler
> > doesn't even *accept* the sane names.
>
> That was true a long time ago. And the "#define r0 0" thing caused
> quite a few bugs itself btw.

Those #define's still exist. Look it up.

And yes, they are horrible, and they are wrong, and they shouldn't exist.

Linus

2022-07-19 14:27:03

by Michael Ellerman

[permalink] [raw]
Subject: Re: mainline build failure of powerpc allmodconfig for prom_init_check

Linus Torvalds <[email protected]> writes:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <[email protected]> wrote:
>>
>> > li 4,254 #,
>>
>> Here we load 254 into r4, which is the 2nd parameter to memset (c).
>
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

I wouldn't say it's bogus, I was just translating from asm to English :)

But I agree it's preferable to use a proper register name rather than a
bare integer. I never write asm using bare integers, I always use r4 or
%r4, because as you say it's too easy to get mixed up otherwise.

When looking at generated code I usually use objdump -d output, which
uses the "r4" syntax.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

as accepts that syntax if you tell it to.

We use that syntax in some of our newer inline asm blocks.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

I would like to switch to using %rX everywhere and get rid of those
defines, but it's never seemed like it's worth the churn. We have ~48K
lines of asm in arch/powerpc.

cheers