2018-01-12 22:16:39

by Gabriel C

Subject: Broken boot on tip/master ( and now linux-next )

Hey guys,

I have a Supermicro H11DSi-NT box with 2 x EPYC 7281 CPUs.

I noticed a bit earlier that something goes wrong when booting tip/master
on that box, but didn't have time to investigate.


Today I did a linux-next build, which failed the same way tip/master did,
so I did a tip/master build too, which failed to boot as well.

Some things I noticed:

With CONFIG_AMD_MEM_ENCRYPT=y and mem_encrypt=on the box hangs right after grub
with no way to see what is going on.

With mem_encrypt=off the box boots up to a point, but something trashes the ACPI tables.

With:

CONFIG_AMD_MEM_ENCRYPT=n
CONFIG_RETPOLINE=n

The box boots up to a point, but with the same result: ACPI seems broken, e.g.:

...

[ 0.000000] ACPI: \xc0\xde\xdb\xc2 0x00000000C2DC8DA0 000000 (v10 ?(<- 00000000 C2DB56A0)
[ 0.000000] ACPI: 0x000000002D3C2808 000000 (v00 00000000 00000000)
[ 0.000000] ACPI BIOS Error (bug): Invalid table length 0x0 in RSDT/XSDT (20170831/tbutils-325)
[ 0.000000] No NUMA configuration found

...

From here on all hell breaks loose :)


I got a dmesg from the broken boot; it can be found here, along with the config:

http://sigsegv.24-7.ro/~crazy/tip-master/dmesg-tip-master-broken-boot.txt
http://sigsegv.24-7.ro/~crazy/tip-master/config-4.15.0-rc7-00557-g16ccd38ce1c1

A good dmesg from Linus' tree plus the patches from this series (https://marc.info/?l=linux-kernel&m=151561236821659&w=2):

http://sigsegv.24-7.ro/~crazy/tip-master/dmesg-OK.txt

Does someone have any idea what could have broken that?

Would be nice to have some hints before starting to bisect that.

Regards,

Gabriel C


2018-01-12 22:32:28

by Borislav Petkov

Subject: Re: Broken boot on tip/master ( and now linux-next )

On Fri, Jan 12, 2018 at 11:16:34PM +0100, Gabriel C wrote:
> The box boots up to a point, but with the same result: ACPI seems broken, e.g.:
>
> ...
>
> [ 0.000000] ACPI: \xc0\xde\xdb\xc2 0x00000000C2DC8DA0 000000 (v10 ?(<- 00000000 C2DB56A0)
> [ 0.000000] ACPI: 0x000000002D3C2808 000000 (v00 00000000 00000000)
> [ 0.000000] ACPI BIOS Error (bug): Invalid table length 0x0 in RSDT/XSDT (20170831/tbutils-325)
> [ 0.000000] No NUMA configuration found
>

https://lkml.kernel.org/r/[email protected]

tip/master is fixed now.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-12 23:45:43

by Gabriel C

Subject: Re: Broken boot on tip/master ( and now linux-next )

On 12.01.2018 23:32, Borislav Petkov wrote:
> On Fri, Jan 12, 2018 at 11:16:34PM +0100, Gabriel C wrote:
>> The box boots up to a point, but with the same result: ACPI seems broken, e.g.:
>>
>> ...
>>
>> [ 0.000000] ACPI: \xc0\xde\xdb\xc2 0x00000000C2DC8DA0 000000 (v10 ?(<- 00000000 C2DB56A0)
>> [ 0.000000] ACPI: 0x000000002D3C2808 000000 (v00 00000000 00000000)
>> [ 0.000000] ACPI BIOS Error (bug): Invalid table length 0x0 in RSDT/XSDT (20170831/tbutils-325)
>> [ 0.000000] No NUMA configuration found
>>
>
> https://lkml.kernel.org/r/[email protected]
>
> tip/master is fixed now.
>

Thanks Borislav,

without these commits all is fine.
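
For anyone comparing a suspect dmesg against a known-good one, the symptom quoted in this thread (garbled ACPI table signatures, a zero table length, and the "ACPI BIOS Error" line) can be spotted mechanically. The following is only a rough sketch, not part of the original report: the function name `find_acpi_anomalies` is hypothetical, and the regular expression assumes the table header layout seen in the quoted log (signature, 64-bit address, 6-digit hex length) and that valid signatures are four uppercase ASCII characters, digits, or underscores.

```python
import re

# Assumed layout of an ACPI table header line in dmesg, based on the
# lines quoted in this thread: "ACPI: <sig> 0x<16 hex digits> <6 hex digits> (...)".
ACPI_HEADER = re.compile(
    r"ACPI:\s+(?P<sig>\S*?)\s*0x(?P<addr>[0-9A-Fa-f]{16})\s+(?P<length>[0-9A-Fa-f]{6})\b"
)

# Assumption: a sane signature is exactly four characters from A-Z, 0-9, "_".
VALID_SIG = re.compile(r"[A-Z0-9_]{4}")


def find_acpi_anomalies(dmesg_text):
    """Return (line_number, reason) pairs for suspicious ACPI lines."""
    findings = []
    for number, line in enumerate(dmesg_text.splitlines(), start=1):
        # Errors such as "Invalid table length 0x0 in RSDT/XSDT" carry this prefix.
        if "ACPI BIOS Error" in line:
            findings.append((number, "ACPI BIOS Error reported"))
            continue
        match = ACPI_HEADER.search(line)
        if match is None:
            continue  # not a table header line
        if not VALID_SIG.fullmatch(match["sig"]):
            findings.append((number, "unexpected table signature"))
        elif int(match["length"], 16) == 0:
            findings.append((number, "zero table length"))
    return findings
```

Passing the contents of a saved dmesg file to the function returns the line numbers worth looking at first; an empty list only means that none of these particular patterns matched, not that the boot was healthy.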