Subject: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

Hi, this is your Linux kernel regression tracker.

I noticed a regression report in bugzilla.kernel.org that apparently
affects 6.2 and later as well as 6.1.13 and later, as it was already
backported there.

As many (most?) kernel developers don't keep an eye on bugzilla, I
decided to forward the report by mail. Quoting from
https://bugzilla.kernel.org/show_bug.cgi?id=217114 :

> [email protected] 2023-03-02 11:25:00 UTC
>
> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
>
> Commit 104ff59af73aba524e57ae0fef70121643ff270e

[FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
Simon Gaiser]

> seems to have broken Intel Tiger Lake SATA controllers in a way that prevents boot, as the sysroot partition will not be found.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=104ff59af73aba524e57ae0fef70121643ff270e
>
> Comment 1 [email protected] 2023-03-02 17:31:53 UTC
>
> As some people in the reference arch forum post reported this seems to have started in 6.1.13. 6.1.12 loads as expected.
>
> The problem is the sata disks can not be recognized any longer which is why the reported sysroot partition can't be found.
>
> My primary disk is nvme and as long as I remove all sata references from my fstab I can boot but then can't mount the device partitions because the devices are not present in /dev.
>
> Any attempts to boot with a sata disk in fstab results in a boot failure with emergency shell.
>
> Comment 2 [email protected] 2023-03-02 19:31:28 UTC
>
> I can provide any details required
>
> My sata controller:
> 10000:e0:17.0 SATA controller: Intel Corporation Tiger Lake-LP SATA Controller (rev 20) (prog-if 01 [AHCI 1.0])
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0
> Interrupt: pin A routed to IRQ 146
> Region 0: Memory at 50100000 (32-bit, non-prefetchable) [size=8K]
> Region 1: Memory at 50102800 (32-bit, non-prefetchable) [size=256]
> Region 5: Memory at 50102000 (32-bit, non-prefetchable) [size=2K]
> Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
> Address: fee01000 Data: 0000
> Capabilities: [70] Power Management version 3
> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
> Kernel driver in use: ahci
>


See the ticket for more details.


[TLDR for the rest of this mail: I'm adding this report to the list of
tracked Linux kernel regressions; the text you find below is based on a
few template paragraphs you might have encountered already in similar
form.]

BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:

#regzbot introduced: 104ff59af73a
https://bugzilla.kernel.org/show_bug.cgi?id=217114
#regzbot title: ata: ahci: Tiger Lake SATA Controller not operating
correctly
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it is already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (e.g. the bugzilla ticket and maybe this mail as well, if
this thread sees some discussion). See page linked in footer for details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
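
For readers who have not seen the "#regzbot" lines above before, the following minimal sketch shows one way such commands could be pulled out of a plain-text mail body. It is not regzbot's own parser: the function name `parse_regzbot_commands`, the regular expression, and the rule that non-command lines continue the previous command until a blank line are all assumptions made for illustration.

```python
import re

# Matches lines such as "#regzbot introduced: 104ff59af73a" or
# "#regzbot ignore-activity"; the argument after the colon is optional.
# (Hypothetical pattern, not taken from regzbot itself.)
REGZBOT_LINE = re.compile(r"^#regzbot\s+([\w^-]+)(?::\s*(.*))?$")


def parse_regzbot_commands(body):
    """Return (command, argument) pairs found in a plain-text mail body.

    Assumption: a non-empty line that follows a command and is not itself
    a command continues that command's argument; a blank line ends it.
    """
    commands = []
    in_block = False
    for raw in body.splitlines():
        line = raw.strip()
        match = REGZBOT_LINE.match(line)
        if match:
            commands.append([match.group(1), match.group(2) or ""])
            in_block = True
        elif line and in_block:
            # Continuation line, e.g. a wrapped title or a report URL.
            commands[-1][1] = (commands[-1][1] + " " + line).strip()
        else:
            in_block = False
    return [tuple(command) for command in commands]
```

Applied to the block in this mail, the sketch yields three commands: `introduced` (commit id followed by the report URL), `title` (with the wrapped line rejoined), and `ignore-activity` with an empty argument.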


Subject: Re: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

On 03.03.23 08:10, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, this is your Linux kernel regression tracker.
>
> I noticed a regression report in bugzilla.kernel.org that apparently
> affects 6.2 and later as well as 6.1.13 and later, as it was already
> backported there.
>
> As many (most?) kernel developers don't keep an eye on bugzilla, I
> decided to forward the report by mail. Quoting from
> https://bugzilla.kernel.org/show_bug.cgi?id=217114 :
>
>> [email protected] 2023-03-02 11:25:00 UTC
>>
>> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
>>
>> Commit 104ff59af73aba524e57ae0fef70121643ff270e
>
> [FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
> Simon Gaiser]

BTW, there is one thing I wondered after sending the above mail: was it
really wise to merge this to mainline two days before 6.2 was released?
Yes, the change's subject makes it sound like this is a hardware
enablement, but the `Mark the Tiger Lake UP{3,4} AHCI controller as
"low_power"` at the beginning of the change description shines a
different light on it.

Ciao, Thorsten


2023-03-03 08:12:27

by Damien Le Moal

Subject: Re: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

On 3/3/23 16:30, Thorsten Leemhuis wrote:
> On 03.03.23 08:10, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Hi, this is your Linux kernel regression tracker.
>>
>> I noticed a regression report in bugzilla.kernel.org that apparently
>> affects 6.2 and later as well as 6.1.13 and later, as it was already
>> backported there.
>>
>> As many (most?) kernel developers don't keep an eye on bugzilla, I
>> decided to forward the report by mail. Quoting from
>> https://bugzilla.kernel.org/show_bug.cgi?id=217114 :
>>
>>> [email protected] 2023-03-02 11:25:00 UTC
>>>
>>> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
>>>
>>> Commit 104ff59af73aba524e57ae0fef70121643ff270e
>>
>> [FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
>> Simon Gaiser]
>
> BTW, there is one thing I wondered after sending the above mail: was it
> really wise to merge this to mainline two days before 6.2 was released?
> Yes, the change's subject makes it sound like this is a hardware
> enablement, but the `Mark the Tiger Lake UP{3,4} AHCI controller as
> "low_power"` at the beginning of the change description shines a
> different light on it.

Yes, I made the decision to send this patch as a "fix" rather than a change, and
that was rc8. In retrospect, maybe not the best decision. But the patch was
fixing issues for Simon, so...

Anyway, I will follow this. I requested more information on Bugzilla. The issue
here is that it may be due to the device having bad LPM support (there are
many such devices) rather than the controller itself. Need to sort this out.



--
Damien Le Moal
Western Digital Research
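
As a rough aid for collecting the kind of information requested above, the sketch below reads the link power management (LPM) policy that libata exposes per SCSI host in sysfs as `link_power_management_policy`. The function name `read_lpm_policies` and the `sysfs_root` parameter are hypothetical, and the sketch assumes the attribute is present and readable on the system in question; it only reports the current policy and does not show whether the drive or the controller is at fault.

```python
from pathlib import Path


def read_lpm_policies(sysfs_root="/sys/class/scsi_host"):
    """Map each SCSI host name to its link power management policy.

    Hosts without a link_power_management_policy attribute (for example
    non-ATA hosts) are skipped. sysfs_root is a parameter only so the
    function can be pointed at a test directory.
    """
    policies = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return policies
    for host in sorted(root.iterdir()):
        attr = host / "link_power_management_policy"
        try:
            policies[host.name] = attr.read_text().strip()
        except OSError:
            # Attribute missing or unreadable: not an ATA host, skip it.
            continue
    return policies


if __name__ == "__main__":
    for name, policy in read_lpm_policies().items():
        print(f"{name}: {policy}")
```

Comparing the reported policy between a working kernel (6.1.12 in the report) and a failing one (6.1.13) could help confirm whether the default policy changed for the affected ports.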


2023-03-03 09:49:10

by Damien Le Moal

Subject: Re: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

On 3/3/23 16:10, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, this is your Linux kernel regression tracker.
>
> I noticed a regression report in bugzilla.kernel.org that apparently
> affects 6.2 and later as well as 6.1.13 and later, as it was already
> backported there.
>
> As many (most?) kernel developers don't keep an eye on bugzilla, I
> decided to forward the report by mail. Quoting from
> https://bugzilla.kernel.org/show_bug.cgi?id=217114 :
>
>> [email protected] 2023-03-02 11:25:00 UTC
>>
>> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
>>
>> Commit 104ff59af73aba524e57ae0fef70121643ff270e
>
> [FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
> Simon Gaiser]

I sent a revert with cc: stable.

Simon,

Let's work on finding a better solution for enabling LPM for that adapter
without causing regressions. I will need your help for testing as I do not have
this hardware.

--
Damien Le Moal
Western Digital Research


2023-03-03 10:37:51

by Simon Gaiser

Subject: Re: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

Damien Le Moal:
> On 3/3/23 16:10, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Hi, this is your Linux kernel regression tracker.
>>
>> I noticed a regression report in bugzilla.kernel.org that apparently
>> affects 6.2 and later as well as 6.1.13 and later, as it was already
>> backported there.
>>
>> As many (most?) kernel developers don't keep an eye on bugzilla, I
>> decided to forward the report by mail. Quoting from
>> https://bugzilla.kernel.org/show_bug.cgi?id=217114 :
>>
>>> [email protected] 2023-03-02 11:25:00 UTC
>>>
>>> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
>>>
>>> Commit 104ff59af73aba524e57ae0fef70121643ff270e
>>
>> [FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
>> Simon Gaiser]
>
> I sent a revert with cc: stable.
>
> Simon,
>
> Let's work on finding a better solution for enabling LPM for that
> adapter without causing regressions. I will need your help for testing
> as I do not have this hardware.

Sure, let me know what I can do to help.

Simon


Subject: Re: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

On 03.03.23 10:48, Damien Le Moal wrote:
> On 3/3/23 16:10, Linux regression tracking (Thorsten Leemhuis) wrote:
>>
>> I noticed a regression report in bugzilla.kernel.org that apparently
>> affects 6.2 and later as well as 6.1.13 and later, as it was already
>> backported there.
>>
>> As many (most?) kernel developers don't keep an eye on bugzilla, I
>> decided to forward the report by mail. Quoting from
>> https://bugzilla.kernel.org/show_bug.cgi?id=217114 :
>>
>>> [email protected] 2023-03-02 11:25:00 UTC
>>>
>>> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
>>>
>>> Commit 104ff59af73aba524e57ae0fef70121643ff270e
>>
>> [FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
>> Simon Gaiser]
>
> I sent a revert with cc: stable.

Many thx for this and your quick actions.

@Greg, @Sasha: that revert landed as 6210038aeaf4 ("ata: ahci: Revert
"ata: ahci: Add Tiger Lake UP{3,4} AHCI controller""); you might want to
ensure you have it in the first batch of 6.1 backports in case you need
to split the backports from the merge window over multiple 6.1.y releases.

Ciao, Thorsten

2023-03-04 14:12:19

by Greg KH

Subject: Re: [regression] Bug 217114 - Tiger Lake SATA Controller not operating correctly [bisected]

On Sat, Mar 04, 2023 at 02:58:28PM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 03.03.23 10:48, Damien Le Moal wrote:
> > On 3/3/23 16:10, Linux regression tracking (Thorsten Leemhuis) wrote:
> >>
> >> I noticed a regression report in bugzilla.kernel.org that apparently
> >> affects 6.2 and later as well as 6.1.13 and later, as it was already
> >> backported there.
> >>
> >> As many (most?) kernel developers don't keep an eye on bugzilla, I
> >> decided to forward the report by mail. Quoting from
> >> https://bugzilla.kernel.org/show_bug.cgi?id=217114 :
> >>
> >>> [email protected] 2023-03-02 11:25:00 UTC
> >>>
> >>> As per kernel problem found in https://bbs.archlinux.org/viewtopic.php?id=283906 ,
> >>>
> >>> Commit 104ff59af73aba524e57ae0fef70121643ff270e
> >>
> >> [FWIW: That's "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller" from
> >> Simon Gaiser]
> >
> > I sent a revert with cc: stable.
>
> Many thx for this and your quick actions.
>
> @Greg, @Sasha: that revert landed as 6210038aeaf4 ("ata: ahci: Revert
> "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller""); you might want to
> ensure you have it in the first batch of 6.1 backports in case you need
> to split the backports from the merge window over multiple 6.1.y releases.

I've queued this up now, thanks for letting us know.

greg k-h