2023-06-28 14:21:13

by Bagas Sanjaya

Subject: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

Hi,

I notice a regression report on Bugzilla [1]. Quoting from it:

> Since yesterday my builds of the https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git no longer boot with a black screen immediately upon booting. Today I finished git bisecting the issue and arrived at the following:
>
> 9df9d2f0471b4c4702670380b8d8a45b40b23a7d is the first bad commit
> commit 9df9d2f0471b4c4702670380b8d8a45b40b23a7d
> Author: Thomas Gleixner <[email protected]>
> Date: Wed Jun 14 01:39:39 2023 +0200
>
> init: Invoke arch_cpu_finalize_init() earlier
>
> X86 is reworking the boot process so that initializations which are not
> required during early boot can be moved into the late boot process and out
> of the fragile and restricted initial boot phase.
>
> arch_cpu_finalize_init() is the obvious place to do such initializations,
> but arch_cpu_finalize_init() is invoked too late in start_kernel() e.g. for
> initializing the FPU completely. fork_init() requires that the FPU is
> initialized as the size of task_struct on X86 depends on the size of the
> required FPU register buffer.
>
> Fortunately none of the init calls between calibrate_delay() and
> arch_cpu_finalize_init() is relevant for the functionality of
> arch_cpu_finalize_init().
>
> Invoke it right after calibrate_delay() where everything which is relevant
> for arch_cpu_finalize_init() has been set up already.
>
> No functional change intended.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Reviewed-by: Rick Edgecombe <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
>
> init/main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Since it might be relevant, my CPU is Intel Core i5-12400 with UEFI from december 2022 and the compiler is gcc (Gentoo Hardened 13.1.1_p20230527 p3) 13.1.1 20230527. If additional information such as the kernel configuration is required, let me know.

See Bugzilla for the full thread.

The reporter can't provide the requested dmesg because this is an early
boot failure, unfortunately.

Nevertheless, this regression has already been taken care of on
Bugzilla, but to ensure it is tracked and doesn't fall through the
cracks unnoticed, I'm adding it to regzbot:

#regzbot introduced: 9df9d2f0471b https://bugzilla.kernel.org/show_bug.cgi?id=217602
#regzbot title: early arch_cpu_finalize_init() cause immediate boot failure

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=217602

--
An old man doll... just what I always wanted! - Clara


2023-07-01 13:11:27

by Thorsten Leemhuis

Subject: Re: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

On 01.07.23 14:27, Bagas Sanjaya wrote:
> On Wed, Jun 28, 2023 at 09:06:22PM +0700, Bagas Sanjaya wrote:
>> I notice a regression report on Bugzilla [1]. Quoting from it:
> [....]
>> See Bugzilla for the full thread.
>>
>> The reporter can't provide the requested dmesg because this is an early
>> boot failure, unfortunately.
>>
>> Nevertheless, this regression has already been taken care of on
>> Bugzilla, but to ensure it is tracked and doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot:
>>
>> #regzbot introduced: 9df9d2f0471b https://bugzilla.kernel.org/show_bug.cgi?id=217602
>> #regzbot title: early arch_cpu_finalize_init() cause immediate boot failure
>
> #regzbot fix: 0303c9729afc40

Bagas, FWIW, there was no need for this at all. Regzbot would have
noticed that patch automatically due to the "Link:
https://bugzilla.kernel.org/show_bug.cgi?id=217602" in the patch
description (thx for this, tglx) once it landed in next or mainline
(just like it noticed
https://lore.kernel.org/lkml/168813193932.404.2885732890333911092.tip-bot2@tip-bot2/
earlier).

Ciao, Thorsten
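
The automatic matching described above relies on the fix carrying a
"Link:" trailer that points at the tracked report. A minimal Python
sketch of that idea follows. It is not regzbot's actual code: the
function names, the sample commit message, and the exact-match rule are
all assumptions made for illustration.

```python
import re

# Matches a "Link:" trailer at the start of a line and captures its URL.
# Assumption: one URL per trailer, no trailing comment after it.
LINK_TRAILER = re.compile(r"^Link:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def find_link_trailers(commit_message: str) -> list[str]:
    """Return every URL that appears in a Link: trailer."""
    return LINK_TRAILER.findall(commit_message)


def fixes_tracked_report(commit_message: str, report_url: str) -> bool:
    """True if any Link: trailer points exactly at the tracked report."""
    return report_url in find_link_trailers(commit_message)


# Hypothetical commit message; only the Bugzilla URL comes from this thread.
message = """\
x86/example: hypothetical fix subject

Hypothetical body text.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217602
Signed-off-by: Some Developer <dev@example.org>
"""

report = "https://bugzilla.kernel.org/show_bug.cgi?id=217602"
print(fixes_tracked_report(message, report))
```

A tracker built along these lines can only react once the commit is
visible in a tree it watches, which fits the remark that the fix is
noticed once it lands in next or mainline.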


2023-07-01 13:17:45

by Bagas Sanjaya

Subject: Re: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

On Wed, Jun 28, 2023 at 09:06:22PM +0700, Bagas Sanjaya wrote:
> Hi,
>
> I notice a regression report on Bugzilla [1]. Quoting from it:
>
> > Since yesterday my builds of the https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git no longer boot with a black screen immediately upon booting. Today I finished git bisecting the issue and arrived at the following:
> >
> > 9df9d2f0471b4c4702670380b8d8a45b40b23a7d is the first bad commit
> > commit 9df9d2f0471b4c4702670380b8d8a45b40b23a7d
> > Author: Thomas Gleixner <[email protected]>
> > Date: Wed Jun 14 01:39:39 2023 +0200
> >
> > init: Invoke arch_cpu_finalize_init() earlier
> >
> > X86 is reworking the boot process so that initializations which are not
> > required during early boot can be moved into the late boot process and out
> > of the fragile and restricted initial boot phase.
> >
> > arch_cpu_finalize_init() is the obvious place to do such initializations,
> > but arch_cpu_finalize_init() is invoked too late in start_kernel() e.g. for
> > initializing the FPU completely. fork_init() requires that the FPU is
> > initialized as the size of task_struct on X86 depends on the size of the
> > required FPU register buffer.
> >
> > Fortunately none of the init calls between calibrate_delay() and
> > arch_cpu_finalize_init() is relevant for the functionality of
> > arch_cpu_finalize_init().
> >
> > Invoke it right after calibrate_delay() where everything which is relevant
> > for arch_cpu_finalize_init() has been set up already.
> >
> > No functional change intended.
> >
> > Signed-off-by: Thomas Gleixner <[email protected]>
> > Reviewed-by: Rick Edgecombe <[email protected]>
> > Link: https://lore.kernel.org/r/[email protected]
> >
> > init/main.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > Since it might be relevant, my CPU is Intel Core i5-12400 with UEFI from december 2022 and the compiler is gcc (Gentoo Hardened 13.1.1_p20230527 p3) 13.1.1 20230527. If additional information such as the kernel configuration is required, let me know.
>
> See Bugzilla for the full thread.
>
> The reporter can't provide the requested dmesg because this is an early
> boot failure, unfortunately.
>
> Nevertheless, this regression has already been taken care of on
> Bugzilla, but to ensure it is tracked and doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot:
>
> #regzbot introduced: 9df9d2f0471b https://bugzilla.kernel.org/show_bug.cgi?id=217602
> #regzbot title: early arch_cpu_finalize_init() cause immediate boot failure
>

#regzbot fix: 0303c9729afc40

--
An old man doll... just what I always wanted! - Clara


2023-07-01 13:55:11

by Bagas Sanjaya

Subject: Re: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

On 7/1/23 19:43, Linux regression tracking #update (Thorsten Leemhuis) wrote:
> On 01.07.23 14:27, Bagas Sanjaya wrote:
>> On Wed, Jun 28, 2023 at 09:06:22PM +0700, Bagas Sanjaya wrote:
>>> I notice a regression report on Bugzilla [1]. Quoting from it:
>> [....]
>>> See Bugzilla for the full thread.
>>>
>>> The reporter can't provide the requested dmesg because this is an early
>>> boot failure, unfortunately.
>>>
>>> Nevertheless, this regression has already been taken care of on
>>> Bugzilla, but to ensure it is tracked and doesn't fall through the
>>> cracks unnoticed, I'm adding it to regzbot:
>>>
>>> #regzbot introduced: 9df9d2f0471b https://bugzilla.kernel.org/show_bug.cgi?id=217602
>>> #regzbot title: early arch_cpu_finalize_init() cause immediate boot failure
>>
>> #regzbot fix: 0303c9729afc40
>
> Bagas, FWIW, there was no need for this at all. Regzbot would have
> noticed that patch automatically due to the "Link:
> https://bugzilla.kernel.org/show_bug.cgi?id=217602" in the patch
> description (thx for this, tglx) once it landed in next or mainline
> (just like it noticed
> https://lore.kernel.org/lkml/168813193932.404.2885732890333911092.tip-bot2@tip-bot2/
> earlier).
>

OK, thanks for another tip! I did the above because at that time
regzbot hadn't marked the regression as resolved, so I had to tell it
manually.

--
An old man doll... just what I always wanted! - Clara


2023-07-01 15:34:19

by Thorsten Leemhuis

Subject: Re: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

[dropping people from CC that likely don't care]

On 01.07.23 15:42, Bagas Sanjaya wrote:
> On 7/1/23 19:43, Linux regression tracking #update (Thorsten Leemhuis) wrote:
>> On 01.07.23 14:27, Bagas Sanjaya wrote:
>>> On Wed, Jun 28, 2023 at 09:06:22PM +0700, Bagas Sanjaya wrote:
>>>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>> [....]
>>>> See Bugzilla for the full thread.
>>>>
>>>> #regzbot introduced: 9df9d2f0471b https://bugzilla.kernel.org/show_bug.cgi?id=217602
>>>> #regzbot title: early arch_cpu_finalize_init() cause immediate boot failure
>>> #regzbot fix: 0303c9729afc40
>>
>> Bagas, FWIW, there was no need for this at all. Regzbot would have
>> noticed that patch automatically due to the "Link:
>> https://bugzilla.kernel.org/show_bug.cgi?id=217602" in the patch
>> description (thx for this, tglx) once it landed in next or mainline
>> (just like it noticed
>> https://lore.kernel.org/lkml/168813193932.404.2885732890333911092.tip-bot2@tip-bot2/
>> earlier).
>
> OK, thanks for another tip! I did the above because at that time
> regzbot hadn't marked the regression as resolved, so I had to tell it
> manually.

regzbot could notice this sooner by monitoring all the for-linus and
for-next subsystem development trees. But I'm not sure if that's worth
it, because normally things that land there show up in -next within 24
hours anyway.

Ciao, Thorsten


2023-07-01 19:21:22

by Thomas Gleixner

Subject: Re: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

On Sat, Jul 01 2023 at 14:43, Linux regression tracking #update (Thorsten Leemhuis) wrote:
> Bagas, FWIW, there was no need for this at all. Regzbot would have
> noticed that patch automatically due to the "Link:
> https://bugzilla.kernel.org/show_bug.cgi?id=217602" in the patch
> description (thx for this, tglx) once it landed in next or mainline
> (just like it noticed
> https://lore.kernel.org/lkml/168813193932.404.2885732890333911092.tip-bot2@tip-bot2/
> earlier).

I just looked at your tracking site and noticed a small hiccup. There is
"Noteworthy: [1]", where [1] is a link, but that does not really work:

https://bugzilla.kernel.org/show_bug.cgi?id=87jzvm12q0.ffs@tglx

Makes bugzilla unhappy :)

Thanks,

tglx

2023-07-02 12:42:19

by Thorsten Leemhuis

Subject: Re: Fwd: commit 9df9d2f0471b causes boot failure in pre-rc1 6.5 kernel

On 01.07.23 20:13, Thomas Gleixner wrote:
> On Sat, Jul 01 2023 at 14:43, Linux regression tracking #update (Thorsten Leemhuis) wrote:
>> Bagas, FWIW, there was no need for this at all. Regzbot would have
>> noticed that patch automatically due to the "Link:
>> https://bugzilla.kernel.org/show_bug.cgi?id=217602" in the patch
>> description (thx for this, tglx) once it landed in next or mainline
>> (just like it noticed
>> https://lore.kernel.org/lkml/168813193932.404.2885732890333911092.tip-bot2@tip-bot2/
>> earlier).
>
> I just looked at your tracking site and noticed a small hiccup. There is
> "Noteworthy: [1]", where [1] is a link, but that does not really work:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=87jzvm12q0.ffs@tglx
>
> Makes bugzilla unhappy :)

Well, it could have at least tried to understand what's wanted here ;)
[Just kidding of course.]

Took me a while to find where things went sideways in regzbot, but I found
and fixed it so it won't happen again; a few wrong entries in the DB
will remain, but they will become history within the next few weeks (and
I might manually fix up one or two).

Thx for letting me know; I had seen this earlier myself, but had forgotten
about it.

Ohh, and thx for addressing the regression so quickly!

Ciao, Thorsten