Hello,
We have identified a regression in the acpi-fan driver probe between
v6.9-rc7 and v6.10-rc1 on some Intel Chromebooks in the Collabora LAVA
lab.
For the Acer Chromebook Spin 514 (CP514-2H), the following error is
reported in the logs:
[ 0.651202] acpi-fan INTC1044:00: probe with driver acpi-fan failed with error -22
Similar errors are reported on other devices with fans compatible with
the same driver.
On Acer Chromebox CXI4, ASUS Chromebook Flip C436FA and
HP Chromebook x360 14 G1:
[ 0.488001] acpi-fan INT3404:00: probe with driver acpi-fan failed with error -22
On ASUS Chromebook Vero 514 CBV514-1H:
[ 1.168905] acpi-fan INTC1048:00: probe with driver acpi-fan failed with error -22
The issue is still present on next-20240529.
I'm sending this report to track the regression while a fix is
identified. I'll investigate the issue/run a bisection and report back
with the results.
This regression was discovered during some preliminary tests with the
ACPI probe kselftest [1] in KernelCI. The config used was the upstream
x86_64 defconfig with a fragment applied on top [2].
Best,
Laura
[1] https://lore.kernel.org/all/[email protected]/
[2] https://pastebin.com/raw/0tFM0Zyg
#regzbot introduced: v6.9-rc7..v6.10-rc1
Hello,
On 5/30/24 17:37, Laura Nao wrote:
> Hello,
>
> We have identified a regression in the acpi-fan driver probe between
> v6.9-rc7 and v6.10-rc1 on some Intel Chromebooks in the Collabora LAVA
> lab.
>
> For the Acer Chromebook Spin 514 (CP514-2H), the following error is
> reported in the logs:
>
> [ 0.651202] acpi-fan INTC1044:00: probe with driver acpi-fan failed with error -22
>
> Similar errors are reported on other devices with fans compatible with
> the same driver.
>
> On Acer Chromebox CXI4, ASUS Chromebook Flip C436FA and
> HP Chromebook x360 14 G1:
>
> [ 0.488001] acpi-fan INT3404:00: probe with driver acpi-fan failed with error -22
>
> On ASUS Chromebook Vero 514 CBV514-1H:
>
> [ 1.168905] acpi-fan INTC1048:00: probe with driver acpi-fan failed with error -22
>
> The issue is still present on next-20240529.
>
> I'm sending this report to track the regression while a fix is
> identified. I'll investigate the issue/run a bisection and report back
> with the results.
>
> This regression was discovered during some preliminary tests with the
> ACPI probe kselftest [1] in KernelCI. The config used was the upstream
> x86_64 defconfig with a fragment applied on top [2].
>
> Best,
>
> Laura
>
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://pastebin.com/raw/0tFM0Zyg
>
> #regzbot introduced: v6.9-rc7..v6.10-rc1
The issue started happening after:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/thermal/thermal_core.c?h=v6.10-rc1&id=31a0fa0019b022024cc082ae292951a596b06f8c
Before this commit, get_cur_state() was not called by
__thermal_cooling_device_register, so the error was not triggered.
After enabling debugging for the acpi-fan driver, I noticed these errors
in the logs:
[ 0.682224] acpi INTC1044:00: Invalid control value returned
[ 0.682635] acpi INTC1044:00: Invalid control value returned
The value stored in fst.control is 255, which is indeed not a valid
value.
I suspect this might be a firmware issue that is now manifesting due to
the addition of the extra get_cur_state() call.
I'll dig a bit more and report back.
Best,
Laura
#regzbot introduced: 31a0fa0019
Hi,
On Fri, May 31, 2024 at 1:35 PM Laura Nao <[email protected]> wrote:
>
> Hello,
>
> On 5/30/24 17:37, Laura Nao wrote:
> > Hello,
> >
> > We have identified a regression in the acpi-fan driver probe between
> > v6.9-rc7 and v6.10-rc1 on some Intel Chromebooks in the Collabora LAVA
> > lab.
> >
> > For the Acer Chromebook Spin 514 (CP514-2H), the following error is
> > reported in the logs:
> >
> > [ 0.651202] acpi-fan INTC1044:00: probe with driver acpi-fan failed with error -22
> >
> > Similar errors are reported on other devices with fans compatible with
> > the same driver.
> >
> > On Acer Chromebox CXI4, ASUS Chromebook Flip C436FA and
> > HP Chromebook x360 14 G1:
> >
> > [ 0.488001] acpi-fan INT3404:00: probe with driver acpi-fan failed with error -22
> >
> > On ASUS Chromebook Vero 514 CBV514-1H:
> >
> > [ 1.168905] acpi-fan INTC1048:00: probe with driver acpi-fan failed with error -22
> >
> > The issue is still present on next-20240529.
> >
> > I'm sending this report to track the regression while a fix is
> > identified. I'll investigate the issue/run a bisection and report back
> > with the results.
> >
> > This regression was discovered during some preliminary tests with the
> > ACPI probe kselftest [1] in KernelCI. The config used was the upstream
> > x86_64 defconfig with a fragment applied on top [2].
> >
> > Best,
> >
> > Laura
> >
> > [1] https://lore.kernel.org/all/[email protected]/
> > [2] https://pastebin.com/raw/0tFM0Zyg
> >
> > #regzbot introduced: v6.9-rc7..v6.10-rc1
>
> The issue started happening after:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/thermal/thermal_core.c?h=v6.10-rc1&id=31a0fa0019b022024cc082ae292951a596b06f8c
>
> Before this commit, get_cur_state() was not called by
> __thermal_cooling_device_register, so the error was not triggered.
>
> After enabling debugging for the acpi-fan driver, I noticed these errors
> in the logs:
>
> [ 0.682224] acpi INTC1044:00: Invalid control value returned
> [ 0.682635] acpi INTC1044:00: Invalid control value returned
>
> The value stored in fst.control is 255, which is indeed not a valid
> value.
>
> I suspect this might be a firmware issue that is now manifesting due to
> the addition of the extra get_cur_state() call.
>
> I'll dig a bit more and report back.
It looks like _FST returns all ones if it is evaluated before _FSL on
the affected platforms.
It shouldn't do that, but then it is not particularly useful to fail
cdev registration for this reason.
The attached patch should work around this issue, please try it and report back.
On Mon, Jun 3, 2024 at 8:20 PM Rafael J. Wysocki <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 31, 2024 at 1:35 PM Laura Nao <[email protected]> wrote:
> >
> > Hello,
> >
> > On 5/30/24 17:37, Laura Nao wrote:
> > > Hello,
> > >
> > > We have identified a regression in the acpi-fan driver probe between
> > > v6.9-rc7 and v6.10-rc1 on some Intel Chromebooks in the Collabora LAVA
> > > lab.
> > >
> > > For the Acer Chromebook Spin 514 (CP514-2H), the following error is
> > > reported in the logs:
> > >
> > > [ 0.651202] acpi-fan INTC1044:00: probe with driver acpi-fan failed with error -22
> > >
> > > Similar errors are reported on other devices with fans compatible with
> > > the same driver.
> > >
> > > On Acer Chromebox CXI4, ASUS Chromebook Flip C436FA and
> > > HP Chromebook x360 14 G1:
> > >
> > > [ 0.488001] acpi-fan INT3404:00: probe with driver acpi-fan failed with error -22
> > >
> > > On ASUS Chromebook Vero 514 CBV514-1H:
> > >
> > > [ 1.168905] acpi-fan INTC1048:00: probe with driver acpi-fan failed with error -22
> > >
> > > The issue is still present on next-20240529.
> > >
> > > I'm sending this report to track the regression while a fix is
> > > identified. I'll investigate the issue/run a bisection and report back
> > > with the results.
> > >
> > > This regression was discovered during some preliminary tests with the
> > > ACPI probe kselftest [1] in KernelCI. The config used was the upstream
> > > x86_64 defconfig with a fragment applied on top [2].
> > >
> > > Best,
> > >
> > > Laura
> > >
> > > [1] https://lore.kernel.org/all/[email protected]/
> > > [2] https://pastebin.com/raw/0tFM0Zyg
> > >
> > > #regzbot introduced: v6.9-rc7..v6.10-rc1
> >
> > The issue started happening after:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/thermal/thermal_core.c?h=v6.10-rc1&id=31a0fa0019b022024cc082ae292951a596b06f8c
> >
> > Before this commit, get_cur_state() was not called by
> > __thermal_cooling_device_register, so the error was not triggered.
> >
> > After enabling debugging for the acpi-fan driver, I noticed these errors
> > in the logs:
> >
> > [ 0.682224] acpi INTC1044:00: Invalid control value returned
> > [ 0.682635] acpi INTC1044:00: Invalid control value returned
> >
> > The value stored in fst.control is 255, which is indeed not a valid
> > value.
> >
> > I suspect this might be a firmware issue that is now manifesting due to
> > the addition of the extra get_cur_state() call.
> >
> > I'll dig a bit more and report back.
>
> It looks like _FST returns all ones if it is evaluated before _FSL on
> the affected platforms.
>
> It shouldn't do that, but then it is not particularly useful to fail
> cdev registration for this reason.
>
> The attached patch should work around this issue, please try it and report back.
A !ret check is missing in that patch, so please try the attached new
version of it instead.
Thanks!
Hi Rafael,
On 6/4/24 14:07, Rafael J. Wysocki wrote:
> On Mon, Jun 3, 2024 at 8:20 PM Rafael J. Wysocki <[email protected]>
> wrote:
>>
>> Hi,
>>
>> On Fri, May 31, 2024 at 1:35 PM Laura Nao <[email protected]>
>> wrote:
>>>
>>> Hello,
>>>
>>> On 5/30/24 17:37, Laura Nao wrote:
>>>> Hello,
>>>>
>>>> We have identified a regression in the acpi-fan driver probe
>>>> between
>>>> v6.9-rc7 and v6.10-rc1 on some Intel Chromebooks in the Collabora
>>>> LAVA
>>>> lab.
>>>>
>>>> For the Acer Chromebook Spin 514 (CP514-2H), the following error is
>>>> reported in the logs:
>>>>
>>>> [ 0.651202] acpi-fan INTC1044:00: probe with driver acpi-fan
>>>> failed with error -22
>>>>
>>>> Similar errors are reported on other devices with fans compatible
>>>> with
>>>> the same driver.
>>>>
>>>> On Acer Chromebox CXI4, ASUS Chromebook Flip C436FA and
>>>> HP Chromebook x360 14 G1:
>>>>
>>>> [ 0.488001] acpi-fan INT3404:00: probe with driver acpi-fan
>>>> failed with error -22
>>>>
>>>> On ASUS Chromebook Vero 514 CBV514-1H:
>>>>
>>>> [ 1.168905] acpi-fan INTC1048:00: probe with driver acpi-fan
>>>> failed with error -22
>>>>
>>>> The issue is still present on next-20240529.
>>>>
>>>> I'm sending this report to track the regression while a fix is
>>>> identified. I'll investigate the issue/run a bisection and report
>>>> back
>>>> with the results.
>>>>
>>>> This regression was discovered during some preliminary tests with
>>>> the
>>>> ACPI probe kselftest [1] in KernelCI. The config used was the
>>>> upstream
>>>> x86_64 defconfig with a fragment applied on top [2].
>>>>
>>>> Best,
>>>>
>>>> Laura
>>>>
>>>> [1]
>>>> https://lore.kernel.org/all/[email protected]/
>>>> [2] https://pastebin.com/raw/0tFM0Zyg
>>>>
>>>> #regzbot introduced: v6.9-rc7..v6.10-rc1
>>>
>>> The issue started happening after:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/thermal/thermal_core.c?h=v6.10-rc1&id=31a0fa0019b022024cc082ae292951a596b06f8c
>>>
>>> Before this commit, get_cur_state() was not called by
>>> __thermal_cooling_device_register, so the error was not triggered.
>>>
>>> After enabling debugging for the acpi-fan driver, I noticed these
>>> errors
>>> in the logs:
>>>
>>> [ 0.682224] acpi INTC1044:00: Invalid control value returned
>>> [ 0.682635] acpi INTC1044:00: Invalid control value returned
>>>
>>> The value stored in fst.control is 255, which is indeed not a valid
>>> value.
>>>
>>> I suspect this might be a firmware issue that is now manifesting due
>>> to
>>> the addition of the extra get_cur_state() call.
>>>
>>> I'll dig a bit more and report back.
>>
>> It looks like _FST returns all ones if it is evaluated before _FSL on
>> the affected platforms.
>>
Right, I'll look into that.
>> It shouldn't do that, but then it is not particularly useful to fail
>> cdev registration for this reason.
>>
>> The attached patch should work around this issue, please try it and
>> report back.
>
> A !ret check is missing in that patch, so please try the attached new
> version of it instead.
>
> Thanks!
I confirm the patch works as expected and fixes the probe issue.
Thank you!
Laura