2022-02-09 06:24:24

by Dmitry Torokhov

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

On Mon, Feb 07, 2022 at 01:41:36PM -0800, Rajat Jain wrote:
> [email protected]
>
> On Mon, Feb 7, 2022 at 1:09 PM Rajat Jain <[email protected]> wrote:
> >
> > +Rafael (for any inputs on asynchronous suspend / resume)
> > +Dmitry Torokhov (since no other maintainer of rmi4 in MAINTAINERS file)
> > [email protected] (who fixed RMI device hierarchy recently)
> > + Some Synaptics folks (from recent commits - Vincent Huang, Andrew
> > Duggan, Cheiny)
> >
> > On Mon, Feb 7, 2022 at 12:23 PM Wolfram Sang <[email protected]> wrote:
> > >
> > > Hello Hugh,
> > >
> > > > Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
> > > > ("i2c: enable async suspend/resume on i2c client devices")
> > > > and reverting that fixes it for me.
> > >
> > > Thank you for the report plus bisection and sorry for the regression!
> >
> > +1, Thanks for the bisection, and apologies for the inconveniences.
> >
> > The problem here seems to be that for some reason, some devices (all
> > connected to rmi4 adapter) failed to resume, but only when
> > asynchronous suspend is enabled (by 172d931910e1):
> >
> > [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!
> > [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
> > to read current IRQ mask.
> > [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
> > [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
> > [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
> > [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
> > [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
> >
> > A resume failure that only shows up during asynchronous resume,
> > typically means that the device is dependent on some other device to
> > resume first, but this dependency is NOT established in a parent child
> > relationship (which is wrong and needs to be fixed, perhaps using
> > device_add_link()). Thus the kernel may be resuming these devices
> > without first resuming some other device that these devices need to
> > depend on.
> >
> > TBH, I'm not sure how to fix this. The only hint I see is that all of
> > these devices seem to be attached to rmi4 device so perhaps something
> > there? I see 6e4860410b828f recently fixed device hierarchy for rmi4,
> > and so seemingly should have fixed this very issue (as also seen in
> > commit message)?
> >
> > >
> > > I will wait a few days if people come up with a fix. If not, I will
> > > revert the offending commit.
> >
> > While I'll be sad because this means no i2c-client can now resume in
> > parallel and increases resume latency by a *LOT* (hundreds of ms on
> > all Linux systems), I understand that this needs to be done unless
> > someone comes up with a fix.

There is an intricate dance happening when switching the touchpad from
legacy PS/2 into RMI mode, with the touchpad being dependent not only on
the SMBus controller, but also on the i8042 keyboard controller and its
PS/2 port (or rather their emulation by the system firmware).

I wonder if we could apply a little bit more targeted patch:

diff --git a/drivers/input/rmi4/rmi_smbus.c b/drivers/input/rmi4/rmi_smbus.c
index 2407ea43de59..3901d06d38ca 100644
--- a/drivers/input/rmi4/rmi_smbus.c
+++ b/drivers/input/rmi4/rmi_smbus.c
@@ -335,6 +335,7 @@ static int rmi_smb_probe(struct i2c_client *client,
 		return error;
 	}

+	device_disable_async_suspend(&client->dev);
 	return 0;
 }


... and if that works then we can try to establish proper dependencies
via device links later.

Hugh, could you please try this out and see if it helps?

Thanks!

--
Dmitry
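
The ordering problem described in this message can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct toy_dev`, `toy_link_add()`, `toy_may_resume()` and `toy_demo()` are hypothetical names invented for illustration. The assumption is that an asynchronously resumed device waits only for dependencies that have actually been recorded, that is, its parent and any linked suppliers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_SUPPLIERS 4

/* Toy stand-in for struct device: only what the ordering argument needs. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;
	struct toy_dev *suppliers[MAX_SUPPLIERS];
	size_t n_suppliers;
	bool resumed;
};

/* Hypothetical analogue of recording a consumer -> supplier link. */
static bool toy_link_add(struct toy_dev *consumer, struct toy_dev *supplier)
{
	if (consumer->n_suppliers >= MAX_SUPPLIERS)
		return false;
	consumer->suppliers[consumer->n_suppliers++] = supplier;
	return true;
}

/* A resume may start once every *recorded* dependency has resumed. */
static bool toy_may_resume(const struct toy_dev *dev)
{
	if (dev->parent && !dev->parent->resumed)
		return false;
	for (size_t i = 0; i < dev->n_suppliers; i++)
		if (!dev->suppliers[i]->resumed)
			return false;
	return true;
}

/* Walk through the touchpad case: parent is up, PS/2 port is not yet. */
static void toy_demo(void)
{
	struct toy_dev smbus = { .name = "smbus-adapter", .resumed = true };
	struct toy_dev serio = { .name = "ps2-port", .resumed = false };
	struct toy_dev touchpad = { .name = "rmi4-smbus", .parent = &smbus };

	/* Only the parent is recorded, so nothing holds the touchpad back. */
	printf("no link:   may resume = %d\n", toy_may_resume(&touchpad));

	/* Recording the PS/2 port as a supplier makes the touchpad wait. */
	toy_link_add(&touchpad, &serio);
	printf("with link: may resume = %d\n", toy_may_resume(&touchpad));

	serio.resumed = true;
	printf("port up:   may resume = %d\n", toy_may_resume(&touchpad));
}
```

Calling `toy_demo()` from a `main()` shows the touchpad being allowed to resume too early while the dependency on the PS/2 port is unrecorded, and being held back once the link exists.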


2022-02-09 07:57:52

by Hugh Dickins

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

On Mon, 7 Feb 2022, Dmitry Torokhov wrote:
> On Mon, Feb 07, 2022 at 01:41:36PM -0800, Rajat Jain wrote:
> > [email protected]
> >
> > On Mon, Feb 7, 2022 at 1:09 PM Rajat Jain <[email protected]> wrote:
> > >
> > > +Rafael (for any inputs on asynchronous suspend / resume)
> > > +Dmitry Torokhov (since no other maintainer of rmi4 in MAINTAINERS file)
> > > [email protected] (who fixed RMI device hierarchy recently)
> > > + Some Synaptics folks (from recent commits - Vincent Huang, Andrew
> > > Duggan, Cheiny)
> > >
> > > On Mon, Feb 7, 2022 at 12:23 PM Wolfram Sang <[email protected]> wrote:
> > > >
> > > > Hello Hugh,
> > > >
> > > > > Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
> > > > > ("i2c: enable async suspend/resume on i2c client devices")
> > > > > and reverting that fixes it for me.
> > > >
> > > > Thank you for the report plus bisection and sorry for the regression!
> > >
> > > +1, Thanks for the bisection, and apologies for the inconveniences.
> > >
> > > The problem here seems to be that for some reason, some devices (all
> > > connected to rmi4 adapter) failed to resume, but only when
> > > asynchronous suspend is enabled (by 172d931910e1):
> > >
> > > [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!
> > > [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
> > > to read current IRQ mask.
> > > [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
> > > [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
> > > [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
> > > [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
> > > [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
> > >
> > > A resume failure that only shows up during asynchronous resume,
> > > typically means that the device is dependent on some other device to
> > > resume first, but this dependency is NOT established in a parent child
> > > relationship (which is wrong and needs to be fixed, perhaps using
> > > device_add_link()). Thus the kernel may be resuming these devices
> > > without first resuming some other device that these devices need to
> > > depend on.
> > >
> > > TBH, I'm not sure how to fix this. The only hint I see is that all of
> > > these devices seem to be attached to rmi4 device so perhaps something
> > > there? I see 6e4860410b828f recently fixed device hierarchy for rmi4,
> > > and so seemingly should have fixed this very issue (as also seen in
> > > commit message)?
> > >
> > > >
> > > > I will wait a few days if people come up with a fix. If not, I will
> > > > revert the offending commit.
> > >
> > > While I'll be sad because this means no i2c-client can now resume in
> > > parallel and increases resume latency by a *LOT* (hundreds of ms on
> > > all Linux systems), I understand that this needs to be done unless
> > > someone comes up with a fix.
>
> There is intricate dance happening switching touchpad from legacy PS/2
> into RMI mode, with touchpad being dependent not only on SMbus
> controller, but also on i8042 keyboard controller and its PS/2 port (or
> rather their emulation by the system firmware).
>
> I wonder if we could apply a little bit more targeted patch:
>
> diff --git a/drivers/input/rmi4/rmi_smbus.c b/drivers/input/rmi4/rmi_smbus.c
> index 2407ea43de59..3901d06d38ca 100644
> --- a/drivers/input/rmi4/rmi_smbus.c
> +++ b/drivers/input/rmi4/rmi_smbus.c
> @@ -335,6 +335,7 @@ static int rmi_smb_probe(struct i2c_client *client,
> return error;
> }
>
> + device_disable_async_suspend(&client->dev);
> return 0;
> }
>
>
> ... and if that works then we can try to establish proper dependencies
> via device links later.
>
> Hugh, could you please try this out and see if it helps?

Yes, that works nicely, thanks Dmitry.

By the way, my memory's been jogged by "rmi4" and the discussion above:
I had a similar-ish problem with it a year ago, discussed with PM guys,

https://lore.kernel.org/linux-pm/[email protected]/

I'm not saying you have to read through that thread, but you may find
some relevance in it - Saravana concluded rmi4 driver isn't capturing
parent/child relationship correctly (at that time, anyway).

Hugh

2022-02-09 10:02:50

by Jarkko Nikula

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

Hi

On 2/8/22 04:50, Hugh Dickins wrote:
> On Mon, 7 Feb 2022, Dmitry Torokhov wrote:
>> On Mon, Feb 07, 2022 at 01:41:36PM -0800, Rajat Jain wrote:
>>>>>> Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
>>>>>> ("i2c: enable async suspend/resume on i2c client devices")
>>>>>> and reverting that fixes it for me.
>>>>>
>>>>> Thank you for the report plus bisection and sorry for the regression!
>>>>
>>>> +1, Thanks for the bisection, and apologies for the inconveniences.
>>>>
>>>> The problem here seems to be that for some reason, some devices (all
>>>> connected to rmi4 adapter) failed to resume, but only when
>>>> asynchronous suspend is enabled (by 172d931910e1):
>>>>
>>>> [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!
>>>> [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
>>>> to read current IRQ mask.
>>>> [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
>>>> [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
>>>> [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
>>>> [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
>>>> [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
>>>>

v5.17-rc3 on a Lenovo ThinkPad X1 Carbon 8th doesn't even suspend due to
the same commit 172d931910e1. Sadly I tested the original patch on other
machine(s) but not on this one with rmi4 :-(

[ 39.957293] PM: suspend entry (s2idle)
[ 40.938666] Filesystems sync: 0.980 seconds
[ 40.942751] Freezing user space processes ... (elapsed 0.001 seconds)
done.
[ 40.945511] OOM killer disabled.
[ 40.946111] Freezing remaining freezable tasks ... (elapsed 0.001
seconds) done.
[ 40.948590] printk: Suspending console(s) (use no_console_suspend to
debug)
[ 40.993123] i801_smbus 0000:00:1f.4: No response
[ 40.993218] rmi4_f01 rmi4-00.fn01: Failed to write sleep mode: -6.
[ 40.993232] rmi4_f01 rmi4-00.fn01: Suspend failed with code -6.
[ 40.993241] rmi4_physical rmi4-00: Failed to suspend functions: -6
[ 40.993404] rmi4_smbus 1-002c: Failed to suspend device: -6
[ 40.993414] PM: dpm_run_callback(): rmi_smb_suspend+0x0/0x30
[rmi_smbus] returns -6
[ 40.993438] rmi4_smbus 1-002c: PM: failed to suspend async: error -6
[ 41.014198] PM: Some devices failed to suspend, or early wake event
detected
[ 41.021544] i801_smbus 0000:00:1f.4: No response
[ 41.021612] rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write
to F03 TX register (-6).
[ 41.022189] i801_smbus 0000:00:1f.4: No response
[ 41.022230] rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write
to F03 TX register (-6).
[ 41.023480] i801_smbus 0000:00:1f.4: No response
[ 41.023542] rmi4_physical rmi4-00: rmi_driver_clear_irq_bits: Failed
to change enabled interrupts!
[ 41.033850] i801_smbus 0000:00:1f.4: No response
[ 41.034006] OOM killer enabled.
[ 41.035449] i801_smbus 0000:00:1f.4: No response
[ 41.035722] Restarting tasks ...
[ 41.036705] rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to
change enabled interrupts!
[ 41.038367] done.
[ 41.039003] psmouse: probe of serio2 failed with error -1
[ 41.071700] PM: suspend exit

>> I wonder if we could apply a little bit more targeted patch:
>>
>> diff --git a/drivers/input/rmi4/rmi_smbus.c b/drivers/input/rmi4/rmi_smbus.c
>> index 2407ea43de59..3901d06d38ca 100644
>> --- a/drivers/input/rmi4/rmi_smbus.c
>> +++ b/drivers/input/rmi4/rmi_smbus.c
>> @@ -335,6 +335,7 @@ static int rmi_smb_probe(struct i2c_client *client,
>> return error;
>> }
>>
>> + device_disable_async_suspend(&client->dev);
>> return 0;
>> }
>>
>>
>> ... and if that works then we can try to establish proper dependencies
>> via device links later.
>>
>> Hugh, could you please try this out and see if it helps?
>
> Yes, that works nicely, thanks Dmitry.
>
Happily, it fixes the issue on the ThinkPad X1 Carbon 8th too:

Tested-by: Jarkko Nikula <[email protected]>

2022-02-09 10:04:07

by Loic Poulain

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

Hi folks,

On Tue, 8 Feb 2022 at 03:50, Hugh Dickins <[email protected]> wrote:
>
> On Mon, 7 Feb 2022, Dmitry Torokhov wrote:
> > On Mon, Feb 07, 2022 at 01:41:36PM -0800, Rajat Jain wrote:
> > > [email protected]
> > >
> > > On Mon, Feb 7, 2022 at 1:09 PM Rajat Jain <[email protected]> wrote:
> > > >
> > > > +Rafael (for any inputs on asynchronous suspend / resume)
> > > > +Dmitry Torokhov (since no other maintainer of rmi4 in MAINTAINERS file)
> > > > [email protected] (who fixed RMI device hierarchy recently)
> > > > + Some Synaptics folks (from recent commits - Vincent Huang, Andrew
> > > > Duggan, Cheiny)
> > > >
> > > > On Mon, Feb 7, 2022 at 12:23 PM Wolfram Sang <[email protected]> wrote:
> > > > >
> > > > > Hello Hugh,
> > > > >
> > > > > > Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
> > > > > > ("i2c: enable async suspend/resume on i2c client devices")
> > > > > > and reverting that fixes it for me.
> > > > >
> > > > > Thank you for the report plus bisection and sorry for the regression!
> > > >
> > > > +1, Thanks for the bisection, and apologies for the inconveniences.
> > > >
> > > > The problem here seems to be that for some reason, some devices (all
> > > > connected to rmi4 adapter) failed to resume, but only when
> > > > asynchronous suspend is enabled (by 172d931910e1):
> > > >
> > > > [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!

Looks like this is the initial issue. Does the rmi device lose power
while suspended? If so, could it be that enabling async_suspend makes
the device resume earlier, at a time when it is not yet ready? What if
you simply start with a naive msleep(200) in rmi_smb_resume()?

The rmi4 bus does not rely on generic device suspend/resume
infrastructure for its subdevices, so async_suspend should only impact
the moment at which the smbus rmi4 root device is resumed, but not the
way it and its subdevices are resumed.

Would be interesting to get some pm_debug/pm_trace to compare the
good/bad cases.




> > > > [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
> > > > to read current IRQ mask.
> > > > [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
> > > > [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
> > > > [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
> > > > [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
> > > > [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
> > > >
> > > > A resume failure that only shows up during asynchronous resume,
> > > > typically means that the device is dependent on some other device to
> > > > resume first, but this dependency is NOT established in a parent child
> > > > relationship (which is wrong and needs to be fixed, perhaps using
> > > > device_add_link()). Thus the kernel may be resuming these devices
> > > > without first resuming some other device that these devices need to
> > > > depend on.
> > > >
> > > > TBH, I'm not sure how to fix this. The only hint I see is that all of
> > > > these devices seem to be attached to rmi4 device so perhaps something
> > > > there? I see 6e4860410b828f recently fixed device hierarchy for rmi4,
> > > > and so seemingly should have fixed this very issue (as also seen in
> > > > commit message)?
> > > >
> > > > >
> > > > > I will wait a few days if people come up with a fix. If not, I will
> > > > > revert the offending commit.
> > > >
> > > > While I'll be sad because this means no i2c-client can now resume in
> > > > parallel and increases resume latency by a *LOT* (hundreds of ms on
> > > > all Linux systems), I understand that this needs to be done unless
> > > > someone comes up with a fix.
> >
> > There is intricate dance happening switching touchpad from legacy PS/2
> > into RMI mode, with touchpad being dependent not only on SMbus
> > controller, but also on i8042 keyboard controller and its PS/2 port (or
> > rather their emulation by the system firmware).
> >
> > I wonder if we could apply a little bit more targeted patch:
> >
> > diff --git a/drivers/input/rmi4/rmi_smbus.c b/drivers/input/rmi4/rmi_smbus.c
> > index 2407ea43de59..3901d06d38ca 100644
> > --- a/drivers/input/rmi4/rmi_smbus.c
> > +++ b/drivers/input/rmi4/rmi_smbus.c
> > @@ -335,6 +335,7 @@ static int rmi_smb_probe(struct i2c_client *client,
> > return error;
> > }
> >
> > + device_disable_async_suspend(&client->dev);
> > return 0;
> > }
> >
> >
> > ... and if that works then we can try to establish proper dependencies
> > via device links later.
> >
> > Hugh, could you please try this out and see if it helps?
>
> Yes, that works nicely, thanks Dmitry.
>
> By the way, my memory's been jogged by "rmi4" and the discussion above:
> I had a similar-ish problem with it a year ago, discussed with PM guys,
>
> https://lore.kernel.org/linux-pm/[email protected]/
>
> I'm not saying you have to read through that thread, but you may find
> some relevance in it - Saravana concluded rmi4 driver isn't capturing
> parent/child relationship correctly (at that time, anyway).

2022-02-14 02:57:12

by Dmitry Torokhov

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

Hi Hugh, Jarkko,

On Tue, Feb 08, 2022 at 04:57:53PM +0200, Jarkko Nikula wrote:
> Hi
>
> On 2/8/22 04:50, Hugh Dickins wrote:
> > On Mon, 7 Feb 2022, Dmitry Torokhov wrote:
> > > On Mon, Feb 07, 2022 at 01:41:36PM -0800, Rajat Jain wrote:
> > > > > > > Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
> > > > > > > ("i2c: enable async suspend/resume on i2c client devices")
> > > > > > > and reverting that fixes it for me.
> > > > > >
> > > > > > Thank you for the report plus bisection and sorry for the regression!
> > > > >
> > > > > +1, Thanks for the bisection, and apologies for the inconveniences.
> > > > >
> > > > > The problem here seems to be that for some reason, some devices (all
> > > > > connected to rmi4 adapter) failed to resume, but only when
> > > > > asynchronous suspend is enabled (by 172d931910e1):
> > > > >
> > > > > [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!
> > > > > [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
> > > > > to read current IRQ mask.
> > > > > [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
> > > > > [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
> > > > > [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
> > > > > [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
> > > > > [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
> > > > >
>
> v5.17-rc3 on Lenovo ThinkPad X1 Carbon 8th don't even suspend due the same
> commit 172d931910e1. Sadly I tested the original patch on other machine(s)
> but not on this one with rmi4 :-(
>
> [ 39.957293] PM: suspend entry (s2idle)
> [ 40.938666] Filesystems sync: 0.980 seconds
> [ 40.942751] Freezing user space processes ... (elapsed 0.001 seconds)
> done.
> [ 40.945511] OOM killer disabled.
> [ 40.946111] Freezing remaining freezable tasks ... (elapsed 0.001
> seconds) done.
> [ 40.948590] printk: Suspending console(s) (use no_console_suspend to
> debug)
> [ 40.993123] i801_smbus 0000:00:1f.4: No response
> [ 40.993218] rmi4_f01 rmi4-00.fn01: Failed to write sleep mode: -6.
> [ 40.993232] rmi4_f01 rmi4-00.fn01: Suspend failed with code -6.
> [ 40.993241] rmi4_physical rmi4-00: Failed to suspend functions: -6
> [ 40.993404] rmi4_smbus 1-002c: Failed to suspend device: -6
> [ 40.993414] PM: dpm_run_callback(): rmi_smb_suspend+0x0/0x30 [rmi_smbus]
> returns -6
> [ 40.993438] rmi4_smbus 1-002c: PM: failed to suspend async: error -6
> [ 41.014198] PM: Some devices failed to suspend, or early wake event
> detected
> [ 41.021544] i801_smbus 0000:00:1f.4: No response
> [ 41.021612] rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to
> F03 TX register (-6).
> [ 41.022189] i801_smbus 0000:00:1f.4: No response
> [ 41.022230] rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to
> F03 TX register (-6).
> [ 41.023480] i801_smbus 0000:00:1f.4: No response
> [ 41.023542] rmi4_physical rmi4-00: rmi_driver_clear_irq_bits: Failed to
> change enabled interrupts!
> [ 41.033850] i801_smbus 0000:00:1f.4: No response
> [ 41.034006] OOM killer enabled.
> [ 41.035449] i801_smbus 0000:00:1f.4: No response
> [ 41.035722] Restarting tasks ...
> [ 41.036705] rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to
> change enabled interrupts!
> [ 41.038367] done.
> [ 41.039003] psmouse: probe of serio2 failed with error -1
> [ 41.071700] PM: suspend exit

Sorry for the delay, but I wonder if you could try the patch below and
tell me if that also fixes the issue for you?

Also adding Hans to make sure changes to psmouse_smbus make sense to
him.

Thanks!

diff --git a/drivers/input/mouse/psmouse-smbus.c b/drivers/input/mouse/psmouse-smbus.c
index a472489ccbad..164f6c757f6b 100644
--- a/drivers/input/mouse/psmouse-smbus.c
+++ b/drivers/input/mouse/psmouse-smbus.c
@@ -75,6 +75,8 @@ static void psmouse_smbus_detach_i2c_client(struct i2c_client *client)
 				    "Marking SMBus companion %s as gone\n",
 				    dev_name(&smbdev->client->dev));
 			smbdev->dead = true;
+			device_link_remove(&smbdev->client->dev,
+					   &smbdev->psmouse->ps2dev.serio->dev);
 			serio_rescan(smbdev->psmouse->ps2dev.serio);
 		} else {
 			list_del(&smbdev->node);
@@ -174,6 +176,8 @@ static void psmouse_smbus_disconnect(struct psmouse *psmouse)
 		kfree(smbdev);
 	} else {
 		smbdev->dead = true;
+		device_link_remove(&smbdev->client->dev,
+				   &psmouse->ps2dev.serio->dev);
 		psmouse_dbg(smbdev->psmouse,
 			    "posting removal request for SMBus companion %s\n",
 			    dev_name(&smbdev->client->dev));
@@ -270,6 +274,12 @@ int psmouse_smbus_init(struct psmouse *psmouse,

 	if (smbdev->client) {
 		/* We have our companion device */
+		if (!device_link_add(&smbdev->client->dev,
+				     &psmouse->ps2dev.serio->dev,
+				     DL_FLAG_STATELESS))
+			psmouse_warn(psmouse,
+				     "failed to set up link with iSMBus companion %s\n",
+				     dev_name(&smbdev->client->dev));
 		return 0;
 	}


--
Dmitry
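
In the patch above, `device_link_add()` takes the consumer first (the SMBus companion) and the supplier second (the PS/2 serio port). The effect on ordering can be sketched in userspace as a depth-first walk that emits every supplier before its consumers, with the suspend order being the reverse. This is an illustrative model only: the `needs` matrix, `visit()`, `resume_order()` and `print_orders()` are hypothetical names, and the real PM core is not claimed to compute a list this way.

```c
#include <stddef.h>
#include <stdio.h>

#define N_DEVS 3

enum { SERIO = 0, SMBUS = 1, TOUCHPAD = 2 };

static const char *const dev_names[N_DEVS] = {
	"ps2-port", "smbus-adapter", "rmi4-smbus",
};

/* needs[c][s] != 0 means device c must be resumed after device s. */
static int needs[N_DEVS][N_DEVS];

/* Depth-first walk: a device is emitted only after all it depends on. */
static void visit(int dev, int *done, int *order, size_t *n)
{
	if (done[dev])
		return;
	done[dev] = 1;
	for (int s = 0; s < N_DEVS; s++)
		if (needs[dev][s])
			visit(s, done, order, n);
	order[(*n)++] = dev;
}

/* Fill order[] with a resume order: every supplier before its consumers. */
static void resume_order(int *order)
{
	int done[N_DEVS] = { 0 };
	size_t n = 0;

	for (int d = N_DEVS - 1; d >= 0; d--)
		visit(d, done, order, &n);
}

/* Print the resume order, then the suspend order as its reverse. */
static void print_orders(void)
{
	int order[N_DEVS];

	resume_order(order);
	printf("resume: ");
	for (int i = 0; i < N_DEVS; i++)
		printf(" %s", dev_names[order[i]]);
	printf("\nsuspend:");
	for (int i = N_DEVS - 1; i >= 0; i--)
		printf(" %s", dev_names[order[i]]);
	printf("\n");
}
```

To model the patched driver, set `needs[TOUCHPAD][SMBUS] = 1` for the parent relationship and `needs[TOUCHPAD][SERIO] = 1` for the new link before calling `print_orders()`. The touchpad then comes last on resume and first on suspend, which is the ordering the link is meant to guarantee.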

2022-02-14 10:04:55

by Hans de Goede

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

Hi All,

On 2/14/22 03:31, Dmitry Torokhov wrote:
> Hi Hugh, Jarkko,
>
> On Tue, Feb 08, 2022 at 04:57:53PM +0200, Jarkko Nikula wrote:
>> Hi
>>
>> On 2/8/22 04:50, Hugh Dickins wrote:
>>> On Mon, 7 Feb 2022, Dmitry Torokhov wrote:
>>>> On Mon, Feb 07, 2022 at 01:41:36PM -0800, Rajat Jain wrote:
>>>>>>>> Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
>>>>>>>> ("i2c: enable async suspend/resume on i2c client devices")
>>>>>>>> and reverting that fixes it for me.
>>>>>>>
>>>>>>> Thank you for the report plus bisection and sorry for the regression!
>>>>>>
>>>>>> +1, Thanks for the bisection, and apologies for the inconveniences.
>>>>>>
>>>>>> The problem here seems to be that for some reason, some devices (all
>>>>>> connected to rmi4 adapter) failed to resume, but only when
>>>>>> asynchronous suspend is enabled (by 172d931910e1):
>>>>>>
>>>>>> [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!
>>>>>> [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
>>>>>> to read current IRQ mask.
>>>>>> [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
>>>>>> [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
>>>>>> [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
>>>>>> [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
>>>>>> [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
>>>>>>
>>
>> v5.17-rc3 on Lenovo ThinkPad X1 Carbon 8th don't even suspend due the same
>> commit 172d931910e1. Sadly I tested the original patch on other machine(s)
>> but not on this one with rmi4 :-(
>>
>> [ 39.957293] PM: suspend entry (s2idle)
>> [ 40.938666] Filesystems sync: 0.980 seconds
>> [ 40.942751] Freezing user space processes ... (elapsed 0.001 seconds)
>> done.
>> [ 40.945511] OOM killer disabled.
>> [ 40.946111] Freezing remaining freezable tasks ... (elapsed 0.001
>> seconds) done.
>> [ 40.948590] printk: Suspending console(s) (use no_console_suspend to
>> debug)
>> [ 40.993123] i801_smbus 0000:00:1f.4: No response
>> [ 40.993218] rmi4_f01 rmi4-00.fn01: Failed to write sleep mode: -6.
>> [ 40.993232] rmi4_f01 rmi4-00.fn01: Suspend failed with code -6.
>> [ 40.993241] rmi4_physical rmi4-00: Failed to suspend functions: -6
>> [ 40.993404] rmi4_smbus 1-002c: Failed to suspend device: -6
>> [ 40.993414] PM: dpm_run_callback(): rmi_smb_suspend+0x0/0x30 [rmi_smbus]
>> returns -6
>> [ 40.993438] rmi4_smbus 1-002c: PM: failed to suspend async: error -6
>> [ 41.014198] PM: Some devices failed to suspend, or early wake event
>> detected
>> [ 41.021544] i801_smbus 0000:00:1f.4: No response
>> [ 41.021612] rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to
>> F03 TX register (-6).
>> [ 41.022189] i801_smbus 0000:00:1f.4: No response
>> [ 41.022230] rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to
>> F03 TX register (-6).
>> [ 41.023480] i801_smbus 0000:00:1f.4: No response
>> [ 41.023542] rmi4_physical rmi4-00: rmi_driver_clear_irq_bits: Failed to
>> change enabled interrupts!
>> [ 41.033850] i801_smbus 0000:00:1f.4: No response
>> [ 41.034006] OOM killer enabled.
>> [ 41.035449] i801_smbus 0000:00:1f.4: No response
>> [ 41.035722] Restarting tasks ...
>> [ 41.036705] rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to
>> change enabled interrupts!
>> [ 41.038367] done.
>> [ 41.039003] psmouse: probe of serio2 failed with error -1
>> [ 41.071700] PM: suspend exit
>
> Sorry for the delay, but I wonder if you could try the patch below and
> tell me if that also fixes the issue for you?
>
> Also adding Hans as to make sure changes to psmouse_smbus make sense to
> him.

I'm not really familiar with the whole psmouse intertouch code. I've added Benjamin
Tissoires to the Cc, who knows this code a lot better than I do.

Regards,

Hans



> diff --git a/drivers/input/mouse/psmouse-smbus.c b/drivers/input/mouse/psmouse-smbus.c
> index a472489ccbad..164f6c757f6b 100644
> --- a/drivers/input/mouse/psmouse-smbus.c
> +++ b/drivers/input/mouse/psmouse-smbus.c
> @@ -75,6 +75,8 @@ static void psmouse_smbus_detach_i2c_client(struct i2c_client *client)
> "Marking SMBus companion %s as gone\n",
> dev_name(&smbdev->client->dev));
> smbdev->dead = true;
> + device_link_remove(&smbdev->client->dev,
> + &smbdev->psmouse->ps2dev.serio->dev);
> serio_rescan(smbdev->psmouse->ps2dev.serio);
> } else {
> list_del(&smbdev->node);
> @@ -174,6 +176,8 @@ static void psmouse_smbus_disconnect(struct psmouse *psmouse)
> kfree(smbdev);
> } else {
> smbdev->dead = true;
> + device_link_remove(&smbdev->client->dev,
> + &psmouse->ps2dev.serio->dev);
> psmouse_dbg(smbdev->psmouse,
> "posting removal request for SMBus companion %s\n",
> dev_name(&smbdev->client->dev));
> @@ -270,6 +274,12 @@ int psmouse_smbus_init(struct psmouse *psmouse,
>
> if (smbdev->client) {
> /* We have our companion device */
> + if (!device_link_add(&smbdev->client->dev,
> + &psmouse->ps2dev.serio->dev,
> + DL_FLAG_STATELESS))
> + psmouse_warn(psmouse,
> + "failed to set up link with iSMBus companion %s\n",
> + dev_name(&smbdev->client->dev));
> return 0;
> }
>
>

2022-02-14 10:40:30

by Hugh Dickins

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

On Sun, 13 Feb 2022, Dmitry Torokhov wrote:
>
> Sorry for the delay, but I wonder if you could try the patch below and
> tell me if that also fixes the issue for you?

It fixes it for me, thanks Dmitry; with nothing unpleasant in dmesg.

Hugh

2022-02-14 18:41:18

by Jarkko Nikula

Subject: Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

On 2/14/22 09:36, Hugh Dickins wrote:
> On Sun, 13 Feb 2022, Dmitry Torokhov wrote:
>>
>> Sorry for the delay, but I wonder if you could try the patch below and
>> tell me if that also fixes the issue for you?
>
> It fixes it for me, thanks Dmitry; with nothing unpleasant in dmesg.
>
Also for me.

Tested-by: Jarkko Nikula <[email protected]>