2013-07-17 12:14:41

by Winkler, Tomas

Subject: [char-misc 0/4 3.11] Fix mei suspend/resume failure

This series should fix the long-standing reset storm
seen on mei suspend/resume.

The issue was reported in a few places:
https://lkml.org/lkml/2013/6/26/693
https://lkml.org/lkml/2013/7/14/69


Tomas Winkler (4):
mei: hbm: fix typo in error message
mei: me: fix reset state machine
mei: don't have to clean the state on power up
mei: me: fix waiting for hw ready

drivers/misc/mei/hbm.c | 2 +-
drivers/misc/mei/hw-me.c | 14 ++++++++++----
drivers/misc/mei/init.c | 3 ++-
3 files changed, 13 insertions(+), 6 deletions(-)

--
1.8.1.2


2013-07-17 12:14:45

by Winkler, Tomas

Subject: [char-misc 3.11 2/4] mei: me: fix reset state machine

The ME HW ready bit is down after a hw reset was asserted or on error.
We need to enter the reset flow only on error; an additional reset
must be prevented when the reset was triggered during initialization
or power up/down, or when a reset is already in progress.

Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/hw-me.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c
index e4f8dec..a0e19e6 100644
--- a/drivers/misc/mei/hw-me.c
+++ b/drivers/misc/mei/hw-me.c
@@ -483,7 +483,9 @@ irqreturn_t mei_me_irq_thread_handler(int irq, void *dev_id)
/* check if ME wants a reset */
if (!mei_hw_is_ready(dev) &&
dev->dev_state != MEI_DEV_RESETTING &&
- dev->dev_state != MEI_DEV_INITIALIZING) {
+ dev->dev_state != MEI_DEV_INITIALIZING &&
+ dev->dev_state != MEI_DEV_POWER_DOWN &&
+ dev->dev_state != MEI_DEV_POWER_UP) {
dev_dbg(&dev->pdev->dev, "FW not ready.\n");
mei_reset(dev, 1);
mutex_unlock(&dev->device_lock);
--
1.8.1.2
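
The guard added in this hunk can be read as a small predicate over the device state. The following is only a userspace sketch of that reading, not the driver code: the `sketch_*` and `SKETCH_*` names are hypothetical stand-ins for the `MEI_DEV_*` states named in the patch, and `SKETCH_ENABLED` is an assumed "normal operation" state used for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MEI_DEV_* states named in the patch. */
enum sketch_dev_state {
	SKETCH_INITIALIZING,
	SKETCH_ENABLED,		/* assumed "normal operation" state */
	SKETCH_RESETTING,
	SKETCH_POWER_DOWN,
	SKETCH_POWER_UP,
};

/*
 * Returns true when a cleared "ready" bit should be treated as an error
 * that calls for a new reset, i.e. when no reset or power transition is
 * already under way.
 */
bool sketch_needs_reset(bool hw_ready, enum sketch_dev_state state)
{
	if (hw_ready)
		return false;

	switch (state) {
	case SKETCH_RESETTING:
	case SKETCH_INITIALIZING:
	case SKETCH_POWER_DOWN:
	case SKETCH_POWER_UP:
		return false;	/* ready bit is expected to be down here */
	default:
		return true;	/* unexpected: ready dropped during normal operation */
	}
}

/* Small self-check showing the intended use. */
void sketch_needs_reset_selfcheck(void)
{
	assert(sketch_needs_reset(false, SKETCH_ENABLED));
	assert(!sketch_needs_reset(false, SKETCH_POWER_UP));
}
```

Written this way, each state in which a cleared ready bit is expected is listed once, so a missing state (as POWER_DOWN and POWER_UP were before this patch) is easier to spot.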

2013-07-17 12:14:50

by Winkler, Tomas

Subject: [char-misc 3.11 4/4] mei: me: fix waiting for hw ready

1. MEI_INTEROP_TIMEOUT is in seconds, not in jiffies,
so we use the mei_secs_to_jiffies macro.
While cold boot is fast, this is relevant in resume.
2. wait_event_interruptible_timeout can return
-ERESTARTSYS, so do not override it with -ETIMEDOUT.
3. Adjust the error message.

Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/hw-me.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c
index a0e19e6..b22c7e2 100644
--- a/drivers/misc/mei/hw-me.c
+++ b/drivers/misc/mei/hw-me.c
@@ -239,14 +239,18 @@ static int mei_me_hw_ready_wait(struct mei_device *dev)
if (mei_me_hw_is_ready(dev))
return 0;

+ dev->recvd_hw_ready = false;
mutex_unlock(&dev->device_lock);
err = wait_event_interruptible_timeout(dev->wait_hw_ready,
- dev->recvd_hw_ready, MEI_INTEROP_TIMEOUT);
+ dev->recvd_hw_ready,
+ mei_secs_to_jiffies(MEI_INTEROP_TIMEOUT));
mutex_lock(&dev->device_lock);
if (!err && !dev->recvd_hw_ready) {
+ if (!err)
+ err = -ETIMEDOUT;
dev_err(&dev->pdev->dev,
- "wait hw ready failed. status = 0x%x\n", err);
- return -ETIMEDOUT;
+ "wait hw ready failed. status = %d\n", err);
+ return err;
}

dev->recvd_hw_ready = false;
--
1.8.1.2
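
The two points in the changelog (unit conversion and return-code handling) can be illustrated outside the kernel. This is a userspace sketch under stated assumptions, not the driver code: `SKETCH_HZ` is an assumed tick rate (the kernel's HZ is a build-time choice), `SKETCH_ERESTARTSYS` mirrors the kernel-internal value 512, and the `sketch_*` functions are hypothetical names. The sketch checks for a negative wait result explicitly, which, as far as the hunk shows, the unchanged `!err` guard would not let through.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_HZ 250L			/* assumed tick rate, for illustration only */
#define SKETCH_ERESTARTSYS 512		/* kernel-internal code, not in userspace errno.h */

/* Stand-in for a seconds-to-jiffies conversion at the assumed tick rate. */
long sketch_secs_to_jiffies(long secs)
{
	assert(secs >= 0);
	return secs * SKETCH_HZ;
}

/*
 * Map the result of an interruptible timed wait to a return code.
 *   wait_ret  > 0: the condition became true with time to spare
 *   wait_ret == 0: the timeout elapsed
 *   wait_ret  < 0: the wait was interrupted; keep that error as-is
 * 'ready' is the flag the waiter was sleeping on.
 */
int sketch_wait_result(long wait_ret, bool ready)
{
	if (ready)
		return 0;
	if (wait_ret < 0)
		return (int)wait_ret;	/* e.g. -ERESTARTSYS, not overridden */
	return -ETIMEDOUT;		/* flag never set within the timeout */
}
```

Passing a bare number of seconds where jiffies are expected shortens the wait by a factor of the tick rate, which would matter most on slow paths such as resume.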

2013-07-17 12:15:29

by Winkler, Tomas

Subject: [char-misc 3.11 3/4] mei: don't have to clean the state on power up

When powering up, we don't have to clean up the device state
since nothing is connected.

Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/init.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mei/init.c b/drivers/misc/mei/init.c
index ed1d752..e6f16f8 100644
--- a/drivers/misc/mei/init.c
+++ b/drivers/misc/mei/init.c
@@ -148,7 +148,8 @@ void mei_reset(struct mei_device *dev, int interrupts_enabled)

dev->hbm_state = MEI_HBM_IDLE;

- if (dev->dev_state != MEI_DEV_INITIALIZING) {
+ if (dev->dev_state != MEI_DEV_INITIALIZING &&
+ dev->dev_state != MEI_DEV_POWER_UP) {
if (dev->dev_state != MEI_DEV_DISABLED &&
dev->dev_state != MEI_DEV_POWER_DOWN)
dev->dev_state = MEI_DEV_RESETTING;
--
1.8.1.2
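
The nested condition in this hunk decides two things: whether the reset path touches the device state at all, and whether the state moves to RESETTING. The following userspace sketch models only what the hunk shows; the `S3_*` names are hypothetical stand-ins for the `MEI_DEV_*` states, `S3_ENABLED` is an assumed "normal operation" state, and whatever the driver does after the state assignment is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MEI_DEV_* states named in the patch. */
enum s3_state {
	S3_INITIALIZING,
	S3_ENABLED,	/* assumed "normal operation" state */
	S3_RESETTING,
	S3_DISABLED,
	S3_POWER_DOWN,
	S3_POWER_UP,
};

/* Outer condition: with the patch, power up is skipped like initialization. */
bool s3_reset_touches_state(enum s3_state state)
{
	return state != S3_INITIALIZING && state != S3_POWER_UP;
}

/* State after the block shown in the hunk has run. */
enum s3_state s3_state_after_reset(enum s3_state state)
{
	if (!s3_reset_touches_state(state))
		return state;
	if (state != S3_DISABLED && state != S3_POWER_DOWN)
		return S3_RESETTING;
	return state;
}

/* Small self-check showing the intended use. */
void s3_selfcheck(void)
{
	assert(s3_state_after_reset(S3_ENABLED) == S3_RESETTING);
	assert(s3_state_after_reset(S3_POWER_UP) == S3_POWER_UP);
}
```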

2013-07-17 12:14:43

by Winkler, Tomas

Subject: [char-misc 3.11 1/4] mei: hbm: fix typo in error message

writet -> write

Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/hbm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/mei/hbm.c b/drivers/misc/mei/hbm.c
index f9296ab..6127ab6 100644
--- a/drivers/misc/mei/hbm.c
+++ b/drivers/misc/mei/hbm.c
@@ -167,7 +167,7 @@ int mei_hbm_start_req(struct mei_device *dev)

dev->hbm_state = MEI_HBM_IDLE;
if (mei_write_message(dev, mei_hdr, dev->wr_msg.data)) {
- dev_err(&dev->pdev->dev, "version message writet failed\n");
+ dev_err(&dev->pdev->dev, "version message write failed\n");
dev->dev_state = MEI_DEV_RESETTING;
mei_reset(dev, 1);
return -ENODEV;
--
1.8.1.2

2013-07-18 12:41:24

by Winkler, Tomas

Subject: RE: [char-misc 0/4 3.11] Fix mei suspend/resume failure



>
> This series should fix the long standing reset storm issue in mei
> suspend/resume failure
>
> The issue was reported in few places:
> https://lkml.org/lkml/2013/6/26/693
> https://lkml.org/lkml/2013/7/14/69
>
>
> Tomas Winkler (4):
> mei: hbm: fix typo in error message
> mei: me: fix reset state machine
> mei: don't have to clean the state on power up
> mei: me: fix waiting for hw ready
>
> drivers/misc/mei/hbm.c | 2 +-
> drivers/misc/mei/hw-me.c | 14 ++++++++++----
> drivers/misc/mei/init.c | 3 ++-
> 3 files changed, 13 insertions(+), 6 deletions(-)


CC: Konstantin and Shuah

Tomas

2013-07-22 21:36:04

by Winkler, Tomas

Subject: RE: [char-misc 0/4 3.11] Fix mei suspend/resume failure

>
>
> >
> > This series should fix the long standing reset storm issue in mei
> > suspend/resume failure
> >

Can you guys confirm that these fix the issue for you?
Thanks
Tomas

> > The issue was reported in few places:
> > https://lkml.org/lkml/2013/6/26/693
> > https://lkml.org/lkml/2013/7/14/69
> >
> >
> > Tomas Winkler (4):
> > mei: hbm: fix typo in error message
> > mei: me: fix reset state machine
> > mei: don't have to clean the state on power up
> > mei: me: fix waiting for hw ready
> >
> > drivers/misc/mei/hbm.c | 2 +-
> > drivers/misc/mei/hw-me.c | 14 ++++++++++----
> > drivers/misc/mei/init.c | 3 ++-
> > 3 files changed, 13 insertions(+), 6 deletions(-)
>
>
> CC: Konstantin and Shuah
>
> Tomas

2013-07-22 22:24:23

by Shuah Khan

Subject: Re: [char-misc 0/4 3.11] Fix mei suspend/resume failure

On 07/22/2013 03:36 PM, Winkler, Tomas wrote:
>>
>>
>>>
>>> This series should fix the long standing reset storm issue in mei
>>> suspend/resume failure
>>>
>
> Can you guys confirm these fixes the issue for you.
> Thanks
> Tomas

Yes. I am getting started with testing. Will let you know in a bit.

-- Shuah


--
Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research
America (Silicon Valley) [email protected] | (970) 672-0658

2013-07-23 01:18:05

by Shuah Khan

Subject: Re: [char-misc 0/4 3.11] Fix mei suspend/resume failure

On 07/22/2013 04:24 PM, Shuah Khan wrote:
> On 07/22/2013 03:36 PM, Winkler, Tomas wrote:
>>>
>>>
>>>>
>>>> This series should fix the long standing reset storm issue in mei
>>>> suspend/resume failure
>>>>
>>
>> Can you guys confirm these fixes the issue for you.
>> Thanks
>> Tomas
>
> Yes. I am getting started with testing. Will let you know in a bit.
>
> -- Shuah
>
>

Did several suspend to disk tests (reboot, platform, and suspend) and
didn't see the mei reset problem.

-- Shuah

Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research
America (Silicon Valley) [email protected] | (970) 672-0658

2013-07-24 19:59:31

by Konstantin Khlebnikov

Subject: Re: [char-misc 0/4 3.11] Fix mei suspend/resume failure

Winkler, Tomas wrote:
>>
>>
>>>
>>> This series should fix the long standing reset storm issue in mei
>>> suspend/resume failure
>>>
>
> Can you guys confirm these fixes the issue for you.
> Thanks
> Tomas

Nope, it's still broken

kernel: 3.10.2 + my patch for i915 + your four patches. Config in attachment.

I see an endless flood in dmesg after several suspend-resume cycles:

Jul 24 23:39:41 zurg kernel: [ 1.737907] mei_me 0000:00:16.0: setting latency timer to 64
Jul 24 23:39:41 zurg kernel: [ 1.738335] mei_me 0000:00:16.0: irq 47 for MSI/MSI-X
Jul 24 23:40:29 zurg kernel: [ 47.040300] mei_me 0000:00:16.0: suspend
Jul 24 23:40:29 zurg kernel: [ 47.759004] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:40:29 zurg kernel: [ 47.859078] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:40:29 zurg kernel: [ 47.859082] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:40:36 zurg kernel: [ 54.846016] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:40:50 zurg kernel: [ 64.161164] mei_me 0000:00:16.0: suspend
Jul 24 23:40:50 zurg kernel: [ 64.881192] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:40:50 zurg kernel: [ 64.981346] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:40:50 zurg kernel: [ 64.981349] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:41:01 zurg kernel: [ 69.163476] mei_me 0000:00:16.0: suspend
Jul 24 23:41:01 zurg kernel: [ 71.968336] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:41:01 zurg kernel: [ 72.220902] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:41:01 zurg kernel: [ 72.321278] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:41:01 zurg kernel: [ 72.321282] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:41:12 zurg kernel: [ 76.497560] mei_me 0000:00:16.0: suspend
Jul 24 23:41:12 zurg kernel: [ 79.308237] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:41:12 zurg kernel: [ 79.561639] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:41:12 zurg kernel: [ 79.661996] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:41:12 zurg kernel: [ 79.661999] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:41:23 zurg kernel: [ 86.052274] mei_me 0000:00:16.0: suspend
Jul 24 23:41:23 zurg kernel: [ 86.648672] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:41:23 zurg kernel: [ 86.902364] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:41:23 zurg kernel: [ 87.001336] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:41:23 zurg kernel: [ 87.001339] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:41:34 zurg kernel: [ 91.183450] mei_me 0000:00:16.0: suspend
Jul 24 23:41:34 zurg kernel: [ 93.989014] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:41:34 zurg kernel: [ 94.243145] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:41:34 zurg kernel: [ 94.342737] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:41:34 zurg kernel: [ 94.342741] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:41:45 zurg kernel: [ 98.395809] mei_me 0000:00:16.0: suspend
Jul 24 23:41:45 zurg kernel: [ 101.330239] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:41:45 zurg kernel: [ 101.582884] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:41:45 zurg kernel: [ 101.682818] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:41:45 zurg kernel: [ 101.682822] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:41:56 zurg kernel: [ 105.746993] mei_me 0000:00:16.0: suspend
Jul 24 23:41:56 zurg kernel: [ 108.670366] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:41:56 zurg kernel: [ 108.924615] mei_me 0000:00:16.0: irq 46 for MSI/MSI-X
Jul 24 23:41:56 zurg kernel: [ 109.024898] mei_me 0000:00:16.0: reset: properties response hbm wrong status.
Jul 24 23:41:56 zurg kernel: [ 109.024901] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:03 zurg kernel: [ 116.011631] mei_me 0000:00:16.0: wait hw ready failed. status = -110
Jul 24 23:42:33 zurg kernel: [ 145.944986] mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
Jul 24 23:42:33 zurg kernel: [ 145.944994] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.946641] mei_me 0000:00:16.0: reset: wrong host start response
Jul 24 23:42:33 zurg kernel: [ 145.946654] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.950458] mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
Jul 24 23:42:33 zurg kernel: [ 145.950475] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.951480] mei_me 0000:00:16.0: reset: wrong host start response
Jul 24 23:42:33 zurg kernel: [ 145.951494] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.951517] mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
Jul 24 23:42:33 zurg kernel: [ 145.951523] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.951703] mei_me 0000:00:16.0: reset: wrong host start response
Jul 24 23:42:33 zurg kernel: [ 145.951710] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.952346] mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
Jul 24 23:42:33 zurg kernel: [ 145.952360] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.952689] mei_me 0000:00:16.0: reset: wrong host start response
Jul 24 23:42:33 zurg kernel: [ 145.952696] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Jul 24 23:42:33 zurg kernel: [ 145.952716] mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
Jul 24 23:42:33 zurg kernel: [ 145.952721] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING


>
>>> The issue was reported in few places:
>>> https://lkml.org/lkml/2013/6/26/693
>>> https://lkml.org/lkml/2013/7/14/69
>>>
>>>
>>> Tomas Winkler (4):
>>> mei: hbm: fix typo in error message
>>> mei: me: fix reset state machine
>>> mei: don't have to clean the state on power up
>>> mei: me: fix waiting for hw ready
>>>
>>> drivers/misc/mei/hbm.c | 2 +-
>>> drivers/misc/mei/hw-me.c | 14 ++++++++++----
>>> drivers/misc/mei/init.c | 3 ++-
>>> 3 files changed, 13 insertions(+), 6 deletions(-)
>>
>>
>> CC: Konstantin and Shuah
>>
>> Tomas


Attachments:
.config (88.04 kB)

2013-07-24 20:08:55

by Winkler, Tomas

Subject: RE: [char-misc 0/4 3.11] Fix mei suspend/resume failure



>
> Winkler, Tomas wrote:
> >>
> >>
> >>>
> >>> This series should fix the long standing reset storm issue in mei
> >>> suspend/resume failure
> >>>
> >
> > Can you guys confirm these fixes the issue for you.
> > Thanks
> > Tomas
>
> Nope, it's still broken
>
> kernel: 3.10.2 + my patch for i915 + four your patches. config in attachment
>

Very surprising.
Can you send me the PCI IDs of the mei device on your platform?
Thanks
Tomas

2013-07-24 20:36:28

by Konstantin Khlebnikov

Subject: Re: [char-misc 0/4 3.11] Fix mei suspend/resume failure

Winkler, Tomas wrote:
>
>
>>
>> Winkler, Tomas wrote:
>>>>
>>>>
>>>>>
>>>>> This series should fix the long standing reset storm issue in mei
>>>>> suspend/resume failure
>>>>>
>>>
>>> Can you guys confirm these fixes the issue for you.
>>> Thanks
>>> Tomas
>>
>> Nope, it's still broken
>>
>> kernel: 3.10.2 + my patch for i915 + four your patches. config in attachment
>>
>
> Very surprising,
> Can you send me the pci ids of the mei device on your platform.

8086:1c3a

00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
Subsystem: Lenovo Device 21da
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
Interrupt: pin A routed to IRQ 16
Region 0: Memory at f2625000 (64-bit, non-prefetchable) [size=16]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [8c] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 00000000fee0f00c Data: 4187


> Thanks
> Tomas