2022-07-05 15:01:11

by Tor Vic

Subject: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

Hi,

Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
rc3 and rc4 were still fine, so I guess something between rc4 and rc5
introduced a regression.

Unfortunately, there are no errors or warning messages.
It gets stuck quite early on boot, about the time USB is initialized,
so less than 1 second into post-bootloader boot.
It then just sits there doing nothing - SysRq still works though.

I don't have time for a bisect, but I thought I'll let you know about
this issue, and maybe someone already has an idea.

Some system information below. Root filesystem is f2fs.

Machine:
  Type: Laptop System: LENOVO product: 20HN0016GE v: ThinkPad X270
CPU:
  Info: dual core Intel Core i5-7200U [MT MCP] speed (MHz): avg: 1563
    min/max: 400/3100
Graphics:
  Device-1: Intel HD Graphics 620 driver: i915 v: kernel
  Device-2: Acer Integrated Camera type: USB driver: uvcvideo
  Display: x11 server: X.Org v: 21.1.3 with: Xwayland v: 22.1.2 driver: X:
    loaded: intel unloaded: modesetting,vesa gpu: i915
    resolution: 1920x1080~60Hz
  OpenGL: renderer: Mesa Intel HD Graphics 620 (KBL GT2) v: 4.6 Mesa 22.1.3
Network:
  Device-1: Intel Ethernet I219-V driver: e1000e
  Device-2: Intel Wireless 8265 / 8275 driver: iwlwifi
  Device-3: Intel Bluetooth wireless interface type: USB driver: btusb
Drives:
  Local Storage: total: 238.47 GiB used: 76.38 GiB (32.0%)
Info:
  Processes: 178 Uptime: 9m Memory: 7.54 GiB used: 1.74 GiB (23.1%)
  Shell: Zsh inxi: 3.3.19

% lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 02)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)
00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller (rev 21)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP Thermal subsystem (rev 21)
00:15.0 Signal processing controller: Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0 (rev 21)
00:15.1 Signal processing controller: Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1 (rev 21)
00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME HECI #1 (rev 21)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #1 (rev f1)
00:1c.2 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #3 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-LP LPC Controller (rev 21)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (4) I219-V (rev 21)
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader (rev 01)
03:00.0 Network controller: Intel Corporation Wireless 8265 / 8275 (rev 78)
04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963

Thank you,
Tor Vic


2022-07-05 17:04:21

by Linus Torvalds

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

On Tue, Jul 5, 2022 at 6:50 AM <[email protected]> wrote:
>
> Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
> rc3 and rc4 were still fine, so I guess something between rc4 and rc5
> introduced a regression.

Sounds that way.

> Unfortunately, there are no errors or warning messages.
> It gets stuck quite early on boot, about the time USB is initialized,
> so less than 1 second into post-bootloader boot.
> It then just sits there doing nothing - SysRq still works though.

There aren't all that many changes in rc5, and your hardware looks
*very* standard (all intel chipset, and a Samsung SM961 SSD).

And with the lack of details, we'll either need a bisect:

> I don't have time for a bisect, but I thought I'll let you know about
> this issue, and maybe someone already has an idea.

or we'll need more reports..

> Some system information below. Root filesystem is f2fs.

Ok, f2fs is certainly unusual, but there are no f2fs changes in rc5.

There's some PM changes for i915 ("drm/i915/dgfx: Disable d3cold at
gfx root port") and a couple of thinkpad-acpi platform driver updates,
so I'm adding a few random people to the cc in case somebody goes
"ahh..."

But otherwise I think we'll just need more reports or info to even
start guessing.

Linus
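
For anyone who does have time for the bisect mentioned above, a rough sketch follows. It assumes a kernel git checkout that contains the v5.19-rc4 and v5.19-rc5 tags; the helper names bisect_steps and start_bisect are made up for illustration and are not part of git. Since git bisect halves the candidate range on each round, the number of build-and-boot cycles grows only with the logarithm of the commit count.

```shell
# Hypothetical helper: estimate how many build-and-boot rounds a bisect
# over N commits needs (smallest k such that 2^k >= N).
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    printf '%s\n' "$steps"
}

# Hypothetical wrapper; it is only defined here, not called. Run it from
# inside a kernel checkout that has both tags.
start_bisect() {
    count=$(git rev-list --count v5.19-rc4..v5.19-rc5)
    echo "$count commits, roughly $(bisect_steps "$count") rounds"
    git bisect start
    git bisect bad v5.19-rc5     # gets stuck on boot
    git bisect good v5.19-rc4    # last version known to boot
}
```

After each build and boot test, the result is recorded with "git bisect good" or "git bisect bad" until git names the first bad commit; "git bisect reset" returns to the original checkout.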

2022-07-05 17:26:22

by Mario Limonciello

Subject: RE: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

[Public]

> -----Original Message-----
> From: Linus Torvalds <[email protected]>
> Sent: Tuesday, July 5, 2022 11:40
> To: Tor Vic <[email protected]>
> Cc: [email protected]; [email protected];
> Hans de Goede <[email protected]>; Jani Nikula
> <[email protected]>
> Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4
>
> On Tue, Jul 5, 2022 at 6:50 AM <[email protected]> wrote:
> >
> > Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
> > rc3 and rc4 were still fine, so I guess something between rc4 and rc5
> > introduced a regression.
>
> Sounds that way.
>
> > Unfortunately, there are no errors or warning messages.
> > It gets stuck quite early on boot, about the time USB is initialized,
> > so less than 1 second into post-bootloader boot.
> > It then just sits there doing nothing - SysRq still works though.
>
> There aren't all that many changes in rc5, and your hardware looks
> *very* standard (all intel chipset, and a Samsung SM961 SSD).
>
> And with the lack of details, we'll either need a bisect:
>
> > I don't have time for a bisect, but I thought I'll let you know about
> > this issue, and maybe someone already has an idea.
>
> or we'll need more reports..
>
> > Some system information below. Root filesystem is f2fs.
>
> Ok, f2fs is certainly unusual, but there are no f2fs changes in rc5.
>
> There's some PM changes for i915 ("drm/i915/dgfx: Disable d3cold at
> gfx root port") and a couple of thinkpad-acpi platform driver updates,
> so I'm adding a few random people to the cc in case somebody goes
> "ahh..."
>

If a bisect isn't possible for you, the kernel command line should be pretty
helpful for isolating the area in which the problem was introduced.
I'd say start out with "nomodeset" on the kernel command line to prevent
i915 from loading. If that fixes it, hopefully it's a small number of commits
to peel back like the one Linus mentioned.

For thinkpad_acpi you can try modprobe.blacklist=thinkpad_acpi.

> But otherwise I think we'll just need more reports or info to even
> start guessing.
>
> Linus
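
The two parameters suggested here can be added for a single boot by editing the kernel line in the bootloader (in GRUB this is typically done by pressing 'e' at the menu; that detail is an assumption about the setup, not something stated in this thread). A small sketch for confirming afterwards that a parameter really took effect; has_param is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if PARAM appears as a whole word in the
# given kernel command line. Without a second argument it reads the
# running kernel's command line from /proc/cmdline.
has_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with an explicit string (the root device is illustrative only):
# has_param nomodeset "root=/dev/nvme0n1p2 ro nomodeset" && echo "nomodeset active"
# has_param modprobe.blacklist=thinkpad_acpi && echo "thinkpad_acpi blacklisted"
```

Matching on whole words avoids a false positive when one parameter name happens to be a prefix of another.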

2022-07-06 09:01:22

by Tor Vic

Subject: RE: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4


> Limonciello, Mario <[email protected]> wrote on 05.07.2022 at 17:10 GMT:
>
>
> [Public]
>
> > -----Original Message-----
> > From: Linus Torvalds <[email protected]>
> > Sent: Tuesday, July 5, 2022 11:40
> > To: Tor Vic <[email protected]>
> > Cc: [email protected]; [email protected];
> > Hans de Goede <[email protected]>; Jani Nikula
> > <[email protected]>
> > Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4
> >
> > On Tue, Jul 5, 2022 at 6:50 AM <[email protected]> wrote:
> > >
> > > Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
> > > rc3 and rc4 were still fine, so I guess something between rc4 and rc5
> > > introduced a regression.
> >
> > Sounds that way.
> >
> > > Unfortunately, there are no errors or warning messages.
> > > It gets stuck quite early on boot, about the time USB is initialized,
> > > so less than 1 second into post-bootloader boot.
> > > It then just sits there doing nothing - SysRq still works though.
> >
> > There aren't all that many changes in rc5, and your hardware looks
> > *very* standard (all intel chipset, and a Samsung SM961 SSD).
> >
> > And with the lack of details, we'll either need a bisect:
> >
> > > I don't have time for a bisect, but I thought I'll let you know about
> > > this issue, and maybe someone already has an idea.
> >
> > or we'll need more reports..
> >
> > > Some system information below. Root filesystem is f2fs.
> >
> > Ok, f2fs is certainly unusual, but there are no f2fs changes in rc5.
> >
> > There's some PM changes for i915 ("drm/i915/dgfx: Disable d3cold at
> > gfx root port") and a couple of thinkpad-acpi platform driver updates,
> > so I'm adding a few random people to the cc in case somebody goes
> > "ahh..."
> >
>
> If a bisect isn't possible for you, the kernel command line should be pretty
> helpful for isolating the area in which the problem was introduced.
> I'd say start out with "nomodeset" on the kernel command line to prevent
> i915 from loading. If that fixes it, hopefully it's a small number of commits
> to peel back like the one Linus mentioned.

Good advice!
Using "nomodeset" makes the computer boot again.
I will now try to revert the commit Linus mentioned above and report back.

>
> For thinkpad_acpi you can try modprobe.blacklist=thinkpad_acpi.
>
> > But otherwise I think we'll just need more reports or info to even
> > start guessing.
> >
> > Linus

2022-07-06 12:33:16

by Tor Vic

Subject: RE: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4


> Limonciello, Mario <[email protected]> wrote on 05.07.2022 at 17:10 GMT:
>
>
> [Public]
>
> > -----Original Message-----
> > From: Linus Torvalds <[email protected]>
> > Sent: Tuesday, July 5, 2022 11:40
> > To: Tor Vic <[email protected]>
> > Cc: [email protected]; [email protected];
> > Hans de Goede <[email protected]>; Jani Nikula
> > <[email protected]>
> > Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4
> >
> > On Tue, Jul 5, 2022 at 6:50 AM <[email protected]> wrote:
> > >
> > > Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
> > > rc3 and rc4 were still fine, so I guess something between rc4 and rc5
> > > introduced a regression.
> >
> > Sounds that way.
> >
> > > Unfortunately, there are no errors or warning messages.
> > > It gets stuck quite early on boot, about the time USB is initialized,
> > > so less than 1 second into post-bootloader boot.
> > > It then just sits there doing nothing - SysRq still works though.
> >
> > There aren't all that many changes in rc5, and your hardware looks
> > *very* standard (all intel chipset, and a Samsung SM961 SSD).
> >
> > And with the lack of details, we'll either need a bisect:
> >
> > > I don't have time for a bisect, but I thought I'll let you know about
> > > this issue, and maybe someone already has an idea.
> >
> > or we'll need more reports..
> >
> > > Some system information below. Root filesystem is f2fs.
> >
> > Ok, f2fs is certainly unusual, but there are no f2fs changes in rc5.
> >
> > There's some PM changes for i915 ("drm/i915/dgfx: Disable d3cold at
> > gfx root port") and a couple of thinkpad-acpi platform driver updates,
> > so I'm adding a few random people to the cc in case somebody goes
> > "ahh..."
> >
>
> If a bisect isn't possible for you, the kernel command line should be pretty
> helpful for isolating the area in which the problem was introduced.
> I'd say start out with "nomodeset" on the kernel command line to prevent
> i915 from loading. If that fixes it, hopefully it's a small number of commits
> to peel back like the one Linus mentioned.
>
> For thinkpad_acpi you can try modprobe.blacklist=thinkpad_acpi.
>
> > But otherwise I think we'll just need more reports or info to even
> > start guessing.
> >
> > Linus

Reverting the three i915 and the two thinkpad_acpi commits introduced with rc5
does not solve this problem. Maybe I missed something else...

Tor

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

On 06.07.22 10:42, [email protected] wrote:
>
>> Limonciello, Mario <[email protected]> wrote on 05.07.2022 at 17:10 GMT:
>>> -----Original Message-----
>>> From: Linus Torvalds <[email protected]>
>>> Sent: Tuesday, July 5, 2022 11:40
>>> To: Tor Vic <[email protected]>
>>> Cc: [email protected]; [email protected];
>>> Hans de Goede <[email protected]>; Jani Nikula
>>> <[email protected]>
>>> Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4
>>>
>>> On Tue, Jul 5, 2022 at 6:50 AM <[email protected]> wrote:
>>>>
>>>> Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
>>>> rc3 and rc4 were still fine, so I guess something between rc4 and rc5
>>>> introduced a regression.
>>>
>>> Sounds that way.
>>>
>>>> Unfortunately, there are no errors or warning messages.
>>>> It gets stuck quite early on boot, about the time USB is initialized,
>>>> so less than 1 second into post-bootloader boot.
>>>> It then just sits there doing nothing - SysRq still works though.
>>>
>>> There aren't all that many changes in rc5, and your hardware looks
>>> *very* standard (all intel chipset, and a Samsung SM961 SSD).
>>>
>>> And with the lack of details, we'll either need a bisect:
>>>
>>>> I don't have time for a bisect, but I thought I'll let you know about
>>>> this issue, and maybe someone already has an idea.
>>>
>>> or we'll need more reports..
>>>
>>>> Some system information below. Root filesystem is f2fs.
>>>
>>> Ok, f2fs is certainly unusual, but there are no f2fs changes in rc5.
>>>
>>> There's some PM changes for i915 ("drm/i915/dgfx: Disable d3cold at
>>> gfx root port") and a couple of thinkpad-acpi platform driver updates,
>>> so I'm adding a few random people to the cc in case somebody goes
>>> "ahh..."
>>>
>>
>> If a bisect isn't possible for you, the kernel command line should be pretty
>> helpful for isolating the area in which the problem was introduced.
>> I'd say start out with "nomodeset" on the kernel command line to prevent
>> i915 from loading. If that fixes it, hopefully it's a small number of commits
>> to peel back like the one Linus mentioned.
>
> Good advice!
> Using "nomodeset" makes the computer boot again.

Wild guess, I'm not involved at all in any of the following, I just
noticed it and thought it might be worth mentioning:

I heard Fedora rawhide added this patch to solve a boot problem that
sounded similar to yours:
https://patchwork.freedesktop.org/patch/489982/

See also this thread:
https://lore.kernel.org/all/[email protected]/

A few config options are mentioned there that seem to have an impact.
Maybe it's worth changing those or trying that patch.

But as I said, I'm not involved, so maybe this is bad advice.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

2022-07-06 19:29:03

by Linus Torvalds

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

On Wed, Jul 6, 2022 at 11:21 AM Thorsten Leemhuis
<[email protected]> wrote:
>
> Wild guess, I'm not involved at all in any of the following, I just
> noticed it and thought it might be worth mentioning:
>
> I heard Fedora rawhide added this patch to solve a boot problem that
> sounded similar to yours:
> https://patchwork.freedesktop.org/patch/489982/

Yes, this looks likely, and matches that "starts in 5.19-rc5" since
the offending commit came in as

ee7a69aa38d8 ("fbdev: Disable sysfb device registration when
removing conflicting FBs")

so that does look like a likely cause.

Linus

2022-07-06 19:47:24

by Tor Vic

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4



On 06.07.22 18:21, Thorsten Leemhuis wrote:
> On 06.07.22 10:42, [email protected] wrote:
>>
>>> Limonciello, Mario <[email protected]> wrote on 05.07.2022 at 17:10 GMT:
>>>> -----Original Message-----
>>>> From: Linus Torvalds <[email protected]>
>>>> Sent: Tuesday, July 5, 2022 11:40
>>>> To: Tor Vic <[email protected]>
>>>> Cc: [email protected]; [email protected];
>>>> Hans de Goede <[email protected]>; Jani Nikula
>>>> <[email protected]>
>>>> Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4
>>>>
>>>> On Tue, Jul 5, 2022 at 6:50 AM <[email protected]> wrote:
>>>>>
>>>>> Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
>>>>> rc3 and rc4 were still fine, so I guess something between rc4 and rc5
>>>>> introduced a regression.
>>>>
>>>> Sounds that way.
>>>>
>>>>> Unfortunately, there are no errors or warning messages.
>>>>> It gets stuck quite early on boot, about the time USB is initialized,
>>>>> so less than 1 second into post-bootloader boot.
>>>>> It then just sits there doing nothing - SysRq still works though.
>>>>
>>>> There aren't all that many changes in rc5, and your hardware looks
>>>> *very* standard (all intel chipset, and a Samsung SM961 SSD).
>>>>
>>>> And with the lack of details, we'll either need a bisect:
>>>>
>>>>> I don't have time for a bisect, but I thought I'll let you know about
>>>>> this issue, and maybe someone already has an idea.
>>>>
>>>> or we'll need more reports..
>>>>
>>>>> Some system information below. Root filesystem is f2fs.
>>>>
>>>> Ok, f2fs is certainly unusual, but there are no f2fs changes in rc5.
>>>>
>>>> There's some PM changes for i915 ("drm/i915/dgfx: Disable d3cold at
>>>> gfx root port") and a couple of thinkpad-acpi platform driver updates,
>>>> so I'm adding a few random people to the cc in case somebody goes
>>>> "ahh..."
>>>>
>>>
>>> If a bisect isn't possible for you, the kernel command line should be pretty
>>> helpful for isolating the area in which the problem was introduced.
>>> I'd say start out with "nomodeset" on the kernel command line to prevent
>>> i915 from loading. If that fixes it, hopefully it's a small number of commits
>>> to peel back like the one Linus mentioned.
>>
>> Good advice!
>> Using "nomodeset" makes the computer boot again.
>
> Wild guess, I'm not involved at all in any of the following, I just
> noticed it and thought it might be worth mentioning:
>
> I heard Fedora rawhide added this patch to solve a boot problem that
> sounded similar to yours:
> https://patchwork.freedesktop.org/patch/489982/
>
> See also this thread:
> https://lore.kernel.org/all/[email protected]/

Hi Thorsten,

Yep, that sounds just like it!
I do have 'CONFIG_SYSFB_SIMPLEFB=y' in my laptop's config.

Will try this tomorrow and report back again.

>
> A few config options are mentioned there that seem to have an impact.
> Maybe it's worth changing those or trying that patch.
>
> But as I said, I'm not involved, so maybe this is bad advice.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.
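
Checking a kernel config for the option mentioned above can be scripted. A minimal sketch, assuming the config is available as a plain-text file (commonly /boot/config-$(uname -r) or .config in the build tree; a compressed /proc/config.gz would need to be unpacked first). config_enabled is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if OPTION is built in (=y) according to
# the given plain-text kernel config file.
config_enabled() {
    option=$1
    file=$2
    grep -q "^${option}=y\$" "$file"
}

# Example (the path is an assumption about where the config lives):
# config_enabled CONFIG_SYSFB_SIMPLEFB "/boot/config-$(uname -r)" \
#     && echo "CONFIG_SYSFB_SIMPLEFB=y is set"
```

Anchoring the pattern at both ends keeps a commented-out "# CONFIG_... is not set" line, or an option that merely shares the prefix, from matching.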

2022-07-07 08:02:16

by Tor Vic

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4


> Linus Torvalds <[email protected]> wrote on 06.07.2022 at 18:59 GMT:
>
>
> On Wed, Jul 6, 2022 at 11:21 AM Thorsten Leemhuis
> <[email protected]> wrote:
> >
> > Wild guess, I'm not involved at all in any of the following, I just
> > noticed it and thought it might be worth mentioning:
> >
> > I heard Fedora rawhide added this patch to solve a boot problem that
> > sounded similar to yours:
> > https://patchwork.freedesktop.org/patch/489982/
>
> Yes, this looks likely, and matches that "starts in 5.19-rc5" since
> the offending commit came in as
>
> ee7a69aa38d8 ("fbdev: Disable sysfb device registration when
> removing conflicting FBs")
>
> so that does look like a likely cause.
>

I confirm that applying
"drm/aperture: Run fbdev removal before internal helpers"
on top of rc5 solves this issue, computer boots normally again.
I suppose this commit will be cherry-picked for rc6.

Thanks to all!
Tor

> Linus

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

On 05.07.22 15:50, [email protected] wrote:

> Linux 5.19-rc5 does not boot on my Kaby Lake Thinkpad.
> rc3 and rc4 were still fine, so I guess something between rc4 and rc5
> introduced a regression.
>
> Unfortunately, there are no errors or warning messages.
> It gets stuck quite early on boot, about the time USB is initialized,
> so less than 1 second into post-bootloader boot.
> It then just sits there doing nothing - SysRq still works though.
>
> I don't have time for a bisect, but I thought I'll let you know about
> this issue, and maybe someone already has an idea.
> [...]

This is mostly dealt with already, as can be seen from this thread;
nevertheless, adding it to the tracking:

#regzbot ^introduced ee7a69aa38d8
#regzbot title drm: fbdev/simplefb: Linux 5.19-rc5 gets stuck on boot,
not rc4
#regzbot ignore-activity
#regzbot fixed-by: bf43e4521ff3223a

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

Subject: Re: [Regression?] Linux 5.19-rc5 gets stuck on boot, not rc4

On 07.07.22 09:54, [email protected] wrote:
>> Linus Torvalds <[email protected]> wrote on 06.07.2022 at 18:59 GMT:
>> On Wed, Jul 6, 2022 at 11:21 AM Thorsten Leemhuis
>> <[email protected]> wrote:
>>>
>>> Wild guess, I'm not involved at all in any of the following, I just
>>> noticed it and thought it might be worth mentioning:
>>>
>>> I heard Fedora rawhide added this patch to solve a boot problem that
>>> sounded similar to yours:
>>> https://patchwork.freedesktop.org/patch/489982/
>>
>> Yes, this looks likely, and matches that "starts in 5.19-rc5" since
>> the offending commit came in as
>>
>> ee7a69aa38d8 ("fbdev: Disable sysfb device registration when
>> removing conflicting FBs")
>>
>> so that does look like a likely cause.
>>
>
> I confirm that applying
> "drm/aperture: Run fbdev removal before internal helpers"
> on top of rc5 solves this issue, computer boots normally again.
> I suppose this commit will be cherry-picked for rc6.

Thx for confirming. Yes, that patch is in the drm-misc-fixes tree as
bf43e4521ff3 and thus will likely be merged with this week's DRM merge.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.