2014-11-05 01:48:16

by Liu, Chuansheng

[permalink] [raw]
Subject: [PATCH] PCI: Do not enable async suspend for JMicron chips

The JMicron 361/363/368 chips each contain one SATA controller and
one PATA controller. The two controllers are siblings in the PCI tree,
but they must be powered on one after the other; otherwise one of
them cannot be powered on successfully.

So disable the async suspend method for JMicron chips.

Bug link:
https://bugzilla.kernel.org/show_bug.cgi?id=81551
https://bugzilla.kernel.org/show_bug.cgi?id=84861

And we can revert the below commit after this patch is applied:
e6b7e41(ata: Disabling the async PM for JMicron chip 363/361)

Cc: [email protected] # 3.15+
Acked-by: Aaron Lu <[email protected]>
Signed-off-by: Chuansheng Liu <[email protected]>
---
drivers/pci/pci.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 625a4ac..53128f0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2046,7 +2046,17 @@ void pci_pm_init(struct pci_dev *dev)
 	pm_runtime_forbid(&dev->dev);
 	pm_runtime_set_active(&dev->dev);
 	pm_runtime_enable(&dev->dev);
-	device_enable_async_suspend(&dev->dev);
+
+	/*
+	 * The JMicron chip 361/363/368 contains one SATA controller and
+	 * one PATA controller, they are brother-relation ship in PCI tree,
+	 * but for powering on these both controller, we must follow the
+	 * sequence one by one, otherwise one of them can not be powered on
+	 * successfully, so here we disable the async suspend method for
+	 * Jmicron chip.
+	 */
+	if (dev->vendor != PCI_VENDOR_ID_JMICRON)
+		device_enable_async_suspend(&dev->dev);
 	dev->wakeup_prepared = false;
 
 	dev->pm_cap = 0;
--
1.7.9.5


2014-11-05 18:01:49

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

On Tue, Nov 4, 2014 at 6:31 PM, Chuansheng Liu <[email protected]> wrote:
> The JMicron chip 361/363/368 contains one SATA controller and
> one PATA controller, they are brother-relation ship in PCI tree,
> but for powering on these both controller, we must follow the
> sequence one by one, otherwise one of them can not be powered on
> successfully.

This should mention what's broken and what problem a user would see.
This changelog sounds a lot like the one for e6b7e41cdd8c, so I don't
know if this is for a new, related problem, or what.

> So here we disable the async suspend method for Jmicron chip.
>
> Bug link:
> https://bugzilla.kernel.org/show_bug.cgi?id=81551
> https://bugzilla.kernel.org/show_bug.cgi?id=84861
>
> And we can revert the below commit after this patch is applied:
> e6b7e41(ata: Disabling the async PM for JMicron chip 363/361)
>
> Cc: [email protected] # 3.15+
> Acked-by: Aaron Lu <[email protected]>
> Signed-off-by: Chuansheng Liu <[email protected]>
> ---
> drivers/pci/pci.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 625a4ac..53128f0 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2046,7 +2046,17 @@ void pci_pm_init(struct pci_dev *dev)
> pm_runtime_forbid(&dev->dev);
> pm_runtime_set_active(&dev->dev);
> pm_runtime_enable(&dev->dev);
> - device_enable_async_suspend(&dev->dev);
> +
> + /*
> + * The JMicron chip 361/363/368 contains one SATA controller and
> + * one PATA controller, they are brother-relation ship in PCI tree,
> + * but for powering on these both controller, we must follow the
> + * sequence one by one, otherwise one of them can not be powered on
> + * successfully, so here we disable the async suspend method for
> + * Jmicron chip.
> + */
> + if (dev->vendor != PCI_VENDOR_ID_JMICRON)
> + device_enable_async_suspend(&dev->dev);

I don't like littering the core PCI code with vendor tests like this.
This would be the only one, except for an ancient DECchip 21050 bridge
erratum.

And why would we want a test for *all* JMicron devices here, when you
claim the problem only affects a few specific ones?

And what's the story with the e6b7e41cdd8c ("ata: Disabling the async
PM for JMicron chip 363/361") connection? Is something broken even
with e6b7e41cdd8c, and this is a better fix? Or is this simply a
different way of fixing the same problem?

> dev->wakeup_prepared = false;
>
> dev->pm_cap = 0;
> --
> 1.7.9.5
>

2014-11-05 18:46:51

by Barto

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

This patch solves these 2 bug reports:

https://bugzilla.kernel.org/show_bug.cgi?id=84861

https://bugzilla.kernel.org/show_bug.cgi?id=81551

In simple words: the JMicron IDE/SATA controller family (JMBxxx) is not
fully compatible with the async_suspend feature. When a user puts
their PC into standby mode, the JMicron IDE/SATA controllers will not
work after wake-up, because of the sibling relationship between the SATA
and IDE parts of this JMicron PCI card.





2014-11-05 19:04:45

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]> wrote:
> this patch solves these 2 bug reports :
>
> https://bugzilla.kernel.org/show_bug.cgi?id=84861
>
> https://bugzilla.kernel.org/show_bug.cgi?id=81551

Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.

So the question is, why was e6b7e41cdd8c insufficient? Presumably it
was tested and somebody thought it did fix the problem.

> in simple words : JMicron IDE/Sata controlers family ( JMBxxx ) are not
> fully compatible with async_suspend feature, when a user tries to put
> his PC on standby mode then at wake-up JMicron IDE/Sata controlers will
> not work, because of a brother-relation between the SATA and IDE part on
> this JMicron PCI card

2014-11-06 01:36:22

by Barto

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

Bjorn: the patch initially created for bug 81551 (ATAPI-CD-ROM-drive
dead after resume from suspend/s2disk) was not enough for bug 84861
(JMicron Technology Corp. JMB368 IDE controller dead after resume when
async suspend is enabled).

The reason: there are too many models in the JMBxxx family of JMicron
SATA/IDE controller PCI cards, and the first patch targets ONLY the
JMB363/361 models, which is not enough:

if (pdev->vendor == PCI_VENDOR_ID_JMICRON &&
+ (pdev->device == PCI_DEVICE_ID_JMICRON_JMB363 ||
+ pdev->device == PCI_DEVICE_ID_JMICRON_JMB361))
+ device_disable_async_suspend(&pdev->dev);

For example, I have a JMB363/368 JMicron SATA/IDE PCI card, and the first
patch created for bug 81551 is not enough for it. That's why Chuansheng Liu
created a new patch that targets ALL models of JMicron JMBxxx SATA/IDE
cards, to be sure that these JMicron models have the async_suspend
feature disabled.

The patch that works for all models of JMicron JMBxxx SATA/IDE
controllers:

+ if (dev->vendor != PCI_VENDOR_ID_JMICRON)
+ device_enable_async_suspend(&dev->dev);
dev->wakeup_prepared = false;






2014-11-06 01:49:39

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] PCI: Do not enable async suspend for JMicron chips

Hello Bjorn,

> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Thursday, November 06, 2014 3:04 AM
> To: Barto
> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
> [email protected]; [email protected]
> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>
> On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]>
> wrote:
> > this patch solves these 2 bug reports :
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=84861
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=81551
>
> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
>
> So the question is, why was e6b7e41cdd8c insufficient? Presumably it
> was tested and somebody thought it did fix the problem.

The first patch, e6b7e41cdd8c, only excludes some JMicron chips (363/361) from async_suspend.
Barto then found the same issue on the JMicron 368, so we need a more general patch that excludes
JMicron chips from async_suspend. That is what this patch does.

Bjorn, tj,
Could you kindly take this patch? As Barto said, it does affect the user experience. Thanks.


2014-11-06 04:09:15

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng
<[email protected]> wrote:
> Hello Bjorn,
>
>> -----Original Message-----
>> From: Bjorn Helgaas [mailto:[email protected]]
>> Sent: Thursday, November 06, 2014 3:04 AM
>> To: Barto
>> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
>> [email protected]; [email protected]
>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>>
>> On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]>
>> wrote:
>> > this patch solves these 2 bug reports :
>> >
>> > https://bugzilla.kernel.org/show_bug.cgi?id=84861
>> >
>> > https://bugzilla.kernel.org/show_bug.cgi?id=81551
>>
>> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
>> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
>> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
>>
>> So the question is, why was e6b7e41cdd8c insufficient? Presumably it
>> was tested and somebody thought it did fix the problem.
>
> The first patch e6b7e41cdd8c which is just exclude some of JMicron chips(363/361) out of async_suspend,
> then Barto found the same issue on JMicron 368, so we need one more general patch to let JMicron chips
> out of async_suspend, so we make this patch.
>
> Bjorn, tj,
> Could you kindly take this patch? As Barto said, it effected the user experience indeed, thanks.

Thanks for clarifying the changelog as far as the different chips and
the different bugzillas.

But you haven't addressed my concerns about (1) putting a PCI vendor
ID check in the generic PCI core code, and (2) applying this to *all*
JMicron devices. You might want to explore a quirk-type solution or
maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.

Bjorn
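
The two scopes under discussion here (a list of specific device IDs versus a test on the whole vendor) can be contrasted with a small userspace C sketch. This is not kernel code: the `sketch_*` names, the struct, and the ID constants are assumptions made for illustration, and device ID 0x2380 is a hypothetical JMicron device that is not on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ID values for illustration; check pci_ids.h before relying on them. */
#define SKETCH_VENDOR_JMICRON 0x197b
#define SKETCH_DEVICE_JMB361  0x2361
#define SKETCH_DEVICE_JMB363  0x2363
#define SKETCH_DEVICE_JMB368  0x2368

/* Hypothetical stand-in for the vendor/device fields of struct pci_dev. */
struct sketch_dev {
	uint16_t vendor;
	uint16_t device;
};

/* Narrow scope: only the listed JMicron device IDs lose async suspend. */
bool sketch_async_ok_by_device(const struct sketch_dev *d)
{
	static const uint16_t affected[] = {
		SKETCH_DEVICE_JMB361, SKETCH_DEVICE_JMB363, SKETCH_DEVICE_JMB368,
	};
	size_t i;

	if (d->vendor != SKETCH_VENDOR_JMICRON)
		return true;
	for (i = 0; i < sizeof(affected) / sizeof(affected[0]); i++)
		if (d->device == affected[i])
			return false;
	return true;
}

/* Wide scope: every JMicron device loses async suspend. */
bool sketch_async_ok_by_vendor(const struct sketch_dev *d)
{
	return d->vendor != SKETCH_VENDOR_JMICRON;
}

/* Small self-check showing where the two policies disagree. */
void sketch_selfcheck(void)
{
	struct sketch_dev unlisted = { SKETCH_VENDOR_JMICRON, 0x2380 };

	assert(sketch_async_ok_by_device(&unlisted));   /* list: still async */
	assert(!sketch_async_ok_by_vendor(&unlisted));  /* vendor test: not async */
}
```

The two policies only differ for JMicron devices that are not on the list, which is exactly the set of devices the vendor-wide test would affect without evidence that they need it.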

2014-11-06 05:30:02

by Barto

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

Hello Bjorn,

In my bug report I already tried adding the JMicron 368 to the "if"
statement and it didn't work; see my message here:

https://bugzilla.kernel.org/show_bug.cgi?id=84861#c11

Chuansheng chose a more generic approach (applying the patch to all
JMicron devices) also because we don't know how many JMBxxx models
could be affected by this bug. Tomorrow a user might file a bug report
about a "JMB369" PCI card that has this same bug, and a month later
another user with a "JMB382". It could be a nightmare for Chuansheng
if he had to create a new patch every time for each JMicron model.

So for the moment the better approach, in my view, is to disable
async_suspend for all JMBxxx JMicron chips. Chuansheng's patch seems
reasonable: as long as we don't know the exact list of affected JMBxxx
models, we can assume that all JMicron SATA/IDE controllers are
affected by this problem.


On 06/11/2014 05:08, Bjorn Helgaas wrote:

>
> But you haven't addressed my concerns about (1) putting a PCI vendor
> ID check in the generic PCI core code, and (2) applying this to *all*
> JMicron devices. You might want to explore a quirk-type solution or
> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.
>
> Bjorn
>

2014-11-06 05:31:46

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] PCI: Do not enable async suspend for JMicron chips

Hello Bjorn,

> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Thursday, November 06, 2014 12:09 PM
> To: Liu, Chuansheng
> Cc: Barto; Tejun Heo ([email protected]); Lu, Aaron; Rafael Wysocki;
> [email protected]; [email protected]
> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>
> On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng
> <[email protected]> wrote:
> > Hello Bjorn,
> >
> >> -----Original Message-----
> >> From: Bjorn Helgaas [mailto:[email protected]]
> >> Sent: Thursday, November 06, 2014 3:04 AM
> >> To: Barto
> >> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
> >> [email protected]; [email protected]
> >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
> >>
> >> On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]>
> >> wrote:
> >> > this patch solves these 2 bug reports :
> >> >
> >> > https://bugzilla.kernel.org/show_bug.cgi?id=84861
> >> >
> >> > https://bugzilla.kernel.org/show_bug.cgi?id=81551
> >>
> >> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
> >> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
> >> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
> >>
> >> So the question is, why was e6b7e41cdd8c insufficient? Presumably it
> >> was tested and somebody thought it did fix the problem.
> >
> > The first patch e6b7e41cdd8c which is just exclude some of JMicron
> chips(363/361) out of async_suspend,
> > then Barto found the same issue on JMicron 368, so we need one more
> general patch to let JMicron chips
> > out of async_suspend, so we make this patch.
> >
> > Bjorn, tj,
> > Could you kindly take this patch? As Barto said, it effected the user
> experience indeed, thanks.
>
> Thanks for clarifying the changelog as far as the different chips and
> the different bugzillas.
>
> But you haven't addressed my concerns about (1) putting a PCI vendor
> ID check in the generic PCI core code, and (2) applying this to *all*
> JMicron devices. You might want to explore a quirk-type solution or
> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.

I understand your point. In fact, before submitting this patch I had written another patch, https://lkml.org/lkml/2014/9/24/68,
which added a quirk-type solution in the ATA code; Aaron then made the better suggestion of implementing it in pci_pm_init().
What do you think of it? Thanks.



2014-11-06 05:36:38

by Aaron Lu

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

On 11/06/2014 01:29 PM, Liu, Chuansheng wrote:
> Hello Bjorn,
>
>> -----Original Message-----
>> From: Bjorn Helgaas [mailto:[email protected]]
>> Sent: Thursday, November 06, 2014 12:09 PM
>> To: Liu, Chuansheng
>> Cc: Barto; Tejun Heo ([email protected]); Lu, Aaron; Rafael Wysocki;
>> [email protected]; [email protected]
>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>>
>> On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng
>> <[email protected]> wrote:
>>> Hello Bjorn,
>>>
>>>> -----Original Message-----
>>>> From: Bjorn Helgaas [mailto:[email protected]]
>>>> Sent: Thursday, November 06, 2014 3:04 AM
>>>> To: Barto
>>>> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
>>>> [email protected]; [email protected]
>>>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>>>>
>>>> On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]>
>>>> wrote:
>>>>> this patch solves these 2 bug reports :
>>>>>
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=84861
>>>>>
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551
>>>>
>>>> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
>>>> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
>>>>
>>>> So the question is, why was e6b7e41cdd8c insufficient? Presumably it
>>>> was tested and somebody thought it did fix the problem.
>>>
>>> The first patch e6b7e41cdd8c which is just exclude some of JMicron
>> chips(363/361) out of async_suspend,
>>> then Barto found the same issue on JMicron 368, so we need one more
>> general patch to let JMicron chips
>>> out of async_suspend, so we make this patch.
>>>
>>> Bjorn, tj,
>>> Could you kindly take this patch? As Barto said, it effected the user
>> experience indeed, thanks.
>>
>> Thanks for clarifying the changelog as far as the different chips and
>> the different bugzillas.
>>
>> But you haven't addressed my concerns about (1) putting a PCI vendor
>> ID check in the generic PCI core code, and (2) applying this to *all*
>> JMicron devices. You might want to explore a quirk-type solution or
>> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.
> Understand your point, in fact, before this patch submitted, I had written another patch https://lkml.org/lkml/2014/9/24/68
> which addressed to add the quirk-type solution in ATA code, and Aaron given better suggestion that implemented at pci_pm_init().
> How do you think of it? Thanks.

I think Bjorn means that we should place the code as a fixup somewhere
in the quirks.c. I didn't take a closer look but DECLARE_PCI_FIXUP_FINAL
for those JMicron PCI devices seems to be a proper phase.

Thanks,
Aaron
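
To make the fixup idea concrete, here is a minimal userspace C model of a table-driven quirk pass: each entry names a vendor and device (with a wildcard) plus a hook, and the hooks of matching entries run after the core has applied its default. This is only a sketch of the general pattern, not the kernel's implementation of `DECLARE_PCI_FIXUP_FINAL`; every `model_*` name and the ID values are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ANY_ID         0xffffu  /* wildcard, assumed for this model */
#define MODEL_VENDOR_JMICRON 0x197bu  /* assumed value; check pci_ids.h */

/* Hypothetical stand-in for the few struct pci_dev fields used here. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	bool async_suspend;
};

/* One quirk table entry: which devices it matches and what to run. */
struct model_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct model_dev *dev);
};

/* The quirk itself: opt the device out of async suspend. */
void model_disable_async(struct model_dev *dev)
{
	dev->async_suspend = false;
}

/* Quirk table kept apart from the core path; one vendor-wide entry. */
const struct model_fixup model_final_fixups[] = {
	{ MODEL_VENDOR_JMICRON, MODEL_ANY_ID, model_disable_async },
};

/* Run every table entry whose vendor and device match the device. */
void model_run_final_fixups(struct model_dev *dev)
{
	size_t n = sizeof(model_final_fixups) / sizeof(model_final_fixups[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct model_fixup *f = &model_final_fixups[i];
		bool vendor_ok = f->vendor == MODEL_ANY_ID || f->vendor == dev->vendor;
		bool device_ok = f->device == MODEL_ANY_ID || f->device == dev->device;

		if (vendor_ok && device_ok)
			f->hook(dev);
	}
}

/* Core path: apply the generic default first, then let quirks override it. */
void model_add_device(struct model_dev *dev)
{
	dev->async_suspend = true;
	model_run_final_fixups(dev);
}
```

The point of the pattern is that the core path stays free of vendor tests: narrowing or widening the set of affected devices only means editing the table.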

2014-11-06 06:41:40

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] PCI: Do not enable async suspend for JMicron chips



> -----Original Message-----
> From: Lu, Aaron
> Sent: Thursday, November 06, 2014 1:37 PM
> To: Liu, Chuansheng; Bjorn Helgaas
> Cc: Barto; Tejun Heo ([email protected]); Rafael Wysocki;
> [email protected]; [email protected]
> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>
> On 11/06/2014 01:29 PM, Liu, Chuansheng wrote:
> > Hello Bjorn,
> >
> >> -----Original Message-----
> >> From: Bjorn Helgaas [mailto:[email protected]]
> >> Sent: Thursday, November 06, 2014 12:09 PM
> >> To: Liu, Chuansheng
> >> Cc: Barto; Tejun Heo ([email protected]); Lu, Aaron; Rafael Wysocki;
> >> [email protected]; [email protected]
> >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
> >>
> >> On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng
> >> <[email protected]> wrote:
> >>> Hello Bjorn,
> >>>
> >>>> -----Original Message-----
> >>>> From: Bjorn Helgaas [mailto:[email protected]]
> >>>> Sent: Thursday, November 06, 2014 3:04 AM
> >>>> To: Barto
> >>>> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
> >>>> [email protected]; [email protected]
> >>>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
> >>>>
> >>>> On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]>
> >>>> wrote:
> >>>>> this patch solves these 2 bug reports :
> >>>>>
> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=84861
> >>>>>
> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551
> >>>>
> >>>> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
> >>>> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
> >>>>
> >>>> So the question is, why was e6b7e41cdd8c insufficient? Presumably it
> >>>> was tested and somebody thought it did fix the problem.
> >>>
> >>> The first patch e6b7e41cdd8c which is just exclude some of JMicron
> >> chips(363/361) out of async_suspend,
> >>> then Barto found the same issue on JMicron 368, so we need one more
> >> general patch to let JMicron chips
> >>> out of async_suspend, so we make this patch.
> >>>
> >>> Bjorn, tj,
> >>> Could you kindly take this patch? As Barto said, it effected the user
> >> experience indeed, thanks.
> >>
> >> Thanks for clarifying the changelog as far as the different chips and
> >> the different bugzillas.
> >>
> >> But you haven't addressed my concerns about (1) putting a PCI vendor
> >> ID check in the generic PCI core code, and (2) applying this to *all*
> >> JMicron devices. You might want to explore a quirk-type solution or
> >> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.
> > Understand your point, in fact, before this patch submitted, I had written
> another patch https://lkml.org/lkml/2014/9/24/68
> > which addressed to add the quirk-type solution in ATA code, and Aaron given
> better suggestion that implemented at pci_pm_init().
> > How do you think of it? Thanks.
>
> I think Bjorn means that we should place the code as a fixup somewhere
> in the quirks.c. I didn't take a closer look but DECLARE_PCI_FIXUP_FINAL
> for those JMicron PCI devices seems to be a proper phase.

Thanks Aaron, then how about below patch?

diff --git a/drivers/ata/pata_jmicron.c b/drivers/ata/pata_jmicron.c
index 47e418b..9e85f86 100644
--- a/drivers/ata/pata_jmicron.c
+++ b/drivers/ata/pata_jmicron.c
@@ -158,6 +158,21 @@ static int jmicron_init_one (struct pci_dev *pdev, const struct pci_device_id *i
 	return ata_pci_bmdma_init_one(pdev, ppi, &jmicron_sht, NULL, 0);
 }
 
+/*
+ * For JMicron chips, we need to disable the async_suspend method, otherwise
+ * they will hit the power-on issue when doing device resume, add one quick
+ * solution to disable the async_suspend method.
+ */
+static void pci_async_suspend_fixup(struct pci_dev *pdev)
+{
+	/*
+	 * disabling the async_suspend method for JMicron chips to
+	 * avoid device resuming issue.
+	 */
+	device_disable_async_suspend(&pdev->dev);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, pci_async_suspend_fixup);
+
 static const struct pci_device_id jmicron_pci_tbl[] = {
 	{ PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
 	  PCI_CLASS_STORAGE_IDE << 8, 0xffff00, 0 },

Barto,
Could you have a try on your side? Thanks.



2014-11-06 08:25:40

by Barto

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

I tried your patch and it doesn't work.

I think you have forgotten something; maybe you also need to modify the
file /drivers/ata/ahci.c and not only /drivers/ata/pata_jmicron.c.

Don't forget that I have a JMB363/368 SATA/IDE PCIe controller, which is
both a SATA and an IDE controller on one PCIe card, and an IDE hard disk
is connected to this JMB363/368 SATA/IDE PCIe controller.

For now, the patch that works is this:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2046,7 +2046,17 @@ void pci_pm_init(struct pci_dev *dev)
 	pm_runtime_forbid(&dev->dev);
 	pm_runtime_set_active(&dev->dev);
 	pm_runtime_enable(&dev->dev);
-	device_enable_async_suspend(&dev->dev);
+
+	/*
+	 * The JMicron chip 361/363/368 contains one SATA controller and
+	 * one PATA controller, they are brother-relation ship in PCI tree,
+	 * but for powering on these both controller, we must follow the
+	 * sequence one by one, otherwise one of them can not be powered on
+	 * successfully, so here we disable the async suspend method for
+	 * Jmicron chip.
+	 */
+	if (dev->vendor != PCI_VENDOR_ID_JMICRON)
+		device_enable_async_suspend(&dev->dev);
 	dev->wakeup_prepared = false;
 
 	dev->pm_cap = 0;


On 06/11/2014 07:39, Liu, Chuansheng wrote:
>

> diff --git a/drivers/ata/pata_jmicron.c b/drivers/ata/pata_jmicron.c
> index 47e418b..9e85f86 100644
> --- a/drivers/ata/pata_jmicron.c
> +++ b/drivers/ata/pata_jmicron.c
> @@ -158,6 +158,21 @@ static int jmicron_init_one (struct pci_dev *pdev, const struct pci_device_id *i
> return ata_pci_bmdma_init_one(pdev, ppi, &jmicron_sht, NULL, 0);
> }
>
> +/*
> + * For JMicron chips, we need to disable the async_suspend method, otherwise
> + * they will hit the power-on issue when doing device resume, add one quick
> + * solution to disable the async_suspend method.
> + */
> +static void pci_async_suspend_fixup(struct pci_dev *pdev)
> +{
> + /*
> + * disabling the async_suspend method for JMicron chips to
> + * avoid device resuming issue.
> + */
> + device_disable_async_suspend(&pdev->dev);
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, pci_async_suspend_fixup);
> +
> static const struct pci_device_id jmicron_pci_tbl[] = {
> { PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> PCI_CLASS_STORAGE_IDE << 8, 0xffff00, 0 },
>
> Barto,
> Could you have a try on your side? Thanks.
>
>

2014-11-06 17:39:40

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips

On Wed, Nov 5, 2014 at 11:39 PM, Liu, Chuansheng
<[email protected]> wrote:
>
>
>> -----Original Message-----
>> From: Lu, Aaron
>> Sent: Thursday, November 06, 2014 1:37 PM
>> To: Liu, Chuansheng; Bjorn Helgaas
>> Cc: Barto; Tejun Heo ([email protected]); Rafael Wysocki;
>> [email protected]; [email protected]
>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>>
>> On 11/06/2014 01:29 PM, Liu, Chuansheng wrote:
>> > Hello Bjorn,
>> >
>> >> -----Original Message-----
>> >> From: Bjorn Helgaas [mailto:[email protected]]
>> >> Sent: Thursday, November 06, 2014 12:09 PM
>> >> To: Liu, Chuansheng
>> >> Cc: Barto; Tejun Heo ([email protected]); Lu, Aaron; Rafael Wysocki;
>> >> [email protected]; [email protected]
>> >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>> >>
>> >> On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng
>> >> <[email protected]> wrote:
>> >>> Hello Bjorn,
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: Bjorn Helgaas [mailto:[email protected]]
>> >>>> Sent: Thursday, November 06, 2014 3:04 AM
>> >>>> To: Barto
>> >>>> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
>> >>>> [email protected]; [email protected]
>> >>>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>> >>>>
>> >>>> On Wed, Nov 5, 2014 at 11:46 AM, Barto <[email protected]>
>> >>>> wrote:
>> >>>>> this patch solves these 2 bug reports :
>> >>>>>
>> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=84861
>> >>>>>
>> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551
>> >>>>
>> >>>> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve
>> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
>> >>>> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
>> >>>>
>> >>>> So the question is, why was e6b7e41cdd8c insufficient? Presumably it
>> >>>> was tested and somebody thought it did fix the problem.
>> >>>
>> >>> The first patch e6b7e41cdd8c which is just exclude some of JMicron
>> >> chips(363/361) out of async_suspend,
>> >>> then Barto found the same issue on JMicron 368, so we need one more
>> >> general patch to let JMicron chips
>> >>> out of async_suspend, so we make this patch.
>> >>>
>> >>> Bjorn, tj,
>> >>> Could you kindly take this patch? As Barto said, it effected the user
>> >> experience indeed, thanks.
>> >>
>> >> Thanks for clarifying the changelog as far as the different chips and
>> >> the different bugzillas.
>> >>
>> >> But you haven't addressed my concerns about (1) putting a PCI vendor
>> >> ID check in the generic PCI core code, and (2) applying this to *all*
>> >> JMicron devices. You might want to explore a quirk-type solution or
>> >> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.
>> > Understand your point, in fact, before this patch submitted, I had written
>> another patch https://lkml.org/lkml/2014/9/24/68
>> > which addressed to add the quirk-type solution in ATA code, and Aaron given
>> better suggestion that implemented at pci_pm_init().
>> > How do you think of it? Thanks.
>>
>> I think Bjorn means that we should place the code as a fixup somewhere
>> in the quirks.c. I didn't take a closer look but DECLARE_PCI_FIXUP_FINAL
>> for those JMicron PCI devices seems to be a proper phase.
>
> Thanks Aaron, then how about below patch?
>
> diff --git a/drivers/ata/pata_jmicron.c b/drivers/ata/pata_jmicron.c
> index 47e418b..9e85f86 100644
> --- a/drivers/ata/pata_jmicron.c
> +++ b/drivers/ata/pata_jmicron.c
> @@ -158,6 +158,21 @@ static int jmicron_init_one (struct pci_dev *pdev, const struct pci_device_id *i
> return ata_pci_bmdma_init_one(pdev, ppi, &jmicron_sht, NULL, 0);
> }
>
> +/*
> + * For JMicron chips, we need to disable the async_suspend method, otherwise
> + * they will hit the power-on issue when doing device resume, add one quick
> + * solution to disable the async_suspend method.

A "quick solution" is a red flag for me, because it's a hint that you
just want the problem to go away without fully understanding it.
That's probably not the case; it's probably just that *I* don't
understand it all yet.

> +static void pci_async_suspend_fixup(struct pci_dev *pdev)
> +{
> + /*
> + * disabling the async_suspend method for JMicron chips to
> + * avoid device resuming issue.
> + */
> + device_disable_async_suspend(&pdev->dev);
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, pci_async_suspend_fixup);

I know Barto tested this and it didn't work, but this *strategy* is
far better than adding the JMicron test in pci_pm_init(). It's ideal
if a quirk like this could be in the pata_jmicron.c file, so it's only
included when that driver is loaded. But it would still be OK if it
had to be in drivers/pci/quirks.c, e.g., if something has to be done
even before the driver is loaded.

The idea of a quirk is to work around a defect in a device. What is
the defect in this case? It seems there are two devices involved,
e.g. (from https://bugzilla.kernel.org/show_bug.cgi?id=81551):

02:00.0 JMicron Technology Corp. JMB363 SATA/IDE Controller
02:00.1 JMicron Technology Corp. JMB363 SATA/IDE Controller

The PCI power management code is designed to work correctly with all
devices that conform to the spec. So either one of these devices
doesn't conform to the spec, or the PM code is assuming some behavior
that the spec doesn't actually require. If the latter, we need to fix
the PM code, because it won't work with other non-JMicron devices
either.

I haven't seen reports of other devices, so my guess is that it really
*is* a defect in these JMicron devices, but I still need a better
understanding of exactly what's broken. Maybe it's something like:

- 02:00.0 and 02:00.1 are both in D3
- if we try to put 02:00.1 in D0 first, it fails ("Refused to change
power state, currently in D3")

and the resolution is to change 02:00.0 from D3 to D0 first, then
change 02:00.1 from D3 to D0. Or maybe it's just that there needs to
be a delay between changing 02:00.0 and changing 02:00.1 (since I
suspect we *start* those changes in that order anyway)?

In either case it sounds like a device defect, because I don't see
anything in the PCI Power Management spec about ordering requirements
for multifunction devices.

Speaking of ordering, what is it that guarantees 02:00.0 will be
powered up before 02:00.1 when async suspend is disabled? Is it the
dpm_noirq_list order in dpm_resume_noirq()?

I don't know how much we gain by allowing async resume of
multifunction devices. Is it possible that Windows always powers up
multifunction devices in order? I doubt that Windows would have a
quirk similar to what you're proposing here, so I wonder how they deal
with this issue.

It sounds like this issue only affects multifunction JMicron devices,
so possibly the quirk could be made smarter by only disabling async
suspend when it finds those.
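
As a rough illustration of that "smarter quirk" idea, here is a minimal user-space sketch. It is not kernel code: `struct model_pci_dev`, `model_async_suspend_fixup()` and `MODEL_VENDOR_ID_JMICRON` are hypothetical stand-ins, and the only behaviour modeled is "turn async suspend off for JMicron functions that belong to a multifunction device". In the kernel the equivalent pieces would presumably be the `multifunction` bit in `struct pci_dev` and `device_disable_async_suspend()`, but that mapping is an assumption here, not a tested patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PCI_VENDOR_ID_JMICRON. */
#define MODEL_VENDOR_ID_JMICRON 0x197b

/* Hypothetical, simplified model of a PCI function; not struct pci_dev. */
struct model_pci_dev {
	unsigned short vendor;	/* PCI vendor ID */
	bool multifunction;	/* function belongs to a multifunction device */
	bool async_suspend;	/* true if async suspend/resume is allowed */
};

/*
 * Model of a narrower quirk: only JMicron functions that are part of a
 * multifunction device lose async suspend; everything else is untouched.
 */
static void model_async_suspend_fixup(struct model_pci_dev *pdev)
{
	assert(pdev != NULL);

	if (pdev->vendor == MODEL_VENDOR_ID_JMICRON && pdev->multifunction)
		pdev->async_suspend = false;
}
```

With this shape, a single-function JMicron device or a multifunction device from another vendor keeps async suspend enabled, which addresses the concern about applying the workaround to *all* JMicron devices.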

Per https://bugzilla.kernel.org/show_bug.cgi?id=81551#c8, this is a
regression introduced by 76569faa62c4 ("PM / sleep: Asynchronous
threads for resume_noirq"), which appeared in v3.15. That information
needs to be in the changelog.

Rafael, do you want to jump in here? I know almost nothing about the
PCI power management code, so if you want to take over this whole
thing, that would be fine by me.

Bjorn

2014-11-06 21:02:54

by Barto

[permalink] [raw]
Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips


> The idea of a quirk is to work around a defect in a device. What is
> the defect in this case? It seems there are two devices involved,
> e.g. (from https://bugzilla.kernel.org/show_bug.cgi?id=81551):
>
> 02:00.0 JMicron Technology Corp. JMB363 SATA/IDE Controller
> 02:00.1 JMicron Technology Corp. JMB363 SATA/IDE Controller
>

In my case I don't have exactly the same lines.

My JMicron JMB363/368 seems to have a different design; it's not exactly
identical to the JMB363 SATA/IDE Controller. In dmesg I can read this:

dmesg | grep micron

[ 0.860659] pata_jmicron 0000:03:00.1: enabling device (0000 -> 0001)
[ 0.866760] scsi0 : pata_jmicron
[ 0.870045] scsi1 : pata_jmicron

lspci :

lspci | grep JMicron

03:00.0 SATA controller: JMicron Technology Corp. JMB363 SATA/IDE Controller (rev 10)
03:00.1 IDE interface: JMicron Technology Corp. JMB368 IDE controller (rev 10)



2014-11-07 01:11:19

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] PCI: Do not enable async suspend for JMicron chips

Hello Bjorn,

I will send out a new quirk-based solution; some responses are inline below :)

> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Friday, November 07, 2014 1:39 AM
> To: Liu, Chuansheng
> Cc: Lu, Aaron; Barto; Tejun Heo ([email protected]); Rafael Wysocki;
> [email protected]; [email protected]
> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
>
> On Wed, Nov 5, 2014 at 11:39 PM, Liu, Chuansheng
> <[email protected]> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Lu, Aaron
> >> Sent: Thursday, November 06, 2014 1:37 PM
> >> To: Liu, Chuansheng; Bjorn Helgaas
> >> Cc: Barto; Tejun Heo ([email protected]); Rafael Wysocki;
> >> [email protected]; [email protected]
> >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
> >>
> >> On 11/06/2014 01:29 PM, Liu, Chuansheng wrote:
> >> > Hello Bjorn,
> >> >
> >> >> -----Original Message-----
> >> >> From: Bjorn Helgaas [mailto:[email protected]]
> >> >> Sent: Thursday, November 06, 2014 12:09 PM
> >> >> To: Liu, Chuansheng
> >> >> Cc: Barto; Tejun Heo ([email protected]); Lu, Aaron; Rafael Wysocki;
> >> >> [email protected]; [email protected]
> >> >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips
> >> >>
> >> >> On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng
> >> >> <[email protected]> wrote:
> >> >>> Hello Bjorn,
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: Bjorn Helgaas [mailto:[email protected]]
> >> >>>> Sent: Thursday, November 06, 2014 3:04 AM
> >> >>>> To: Barto
> >> >>>> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki;
> >> >>>> [email protected]; [email protected]
> >> >>>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron
> chips
> >> >>>>
> >> >>>> On Wed, Nov 5, 2014 at 11:46 AM, Barto
> <[email protected]>
> >> >>>> wrote:
> >> >>>>> this patch solves these 2 bug reports :
> >> >>>>>
> >> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=84861
> >> >>>>>
> >> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551
> >> >>>>
> >> >>>> Those bugs were already mentioned. But e6b7e41cdd8c claims to
> solve
> >> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a
> >> >>>> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c.
> >> >>>>
> >> >>>> So the question is, why was e6b7e41cdd8c insufficient? Presumably
> it
> >> >>>> was tested and somebody thought it did fix the problem.
> >> >>>
> >> >>> The first patch e6b7e41cdd8c which is just exclude some of JMicron
> >> >> chips(363/361) out of async_suspend,
> >> >>> then Barto found the same issue on JMicron 368, so we need one more
> >> >> general patch to let JMicron chips
> >> >>> out of async_suspend, so we make this patch.
> >> >>>
> >> >>> Bjorn, tj,
> >> >>> Could you kindly take this patch? As Barto said, it effected the user
> >> >> experience indeed, thanks.
> >> >>
> >> >> Thanks for clarifying the changelog as far as the different chips and
> >> >> the different bugzillas.
> >> >>
> >> >> But you haven't addressed my concerns about (1) putting a PCI vendor
> >> >> ID check in the generic PCI core code, and (2) applying this to *all*
> >> >> JMicron devices. You might want to explore a quirk-type solution or
> >> >> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c.
> >> > Understand your point, in fact, before this patch submitted, I had written
> >> another patch https://lkml.org/lkml/2014/9/24/68
> >> > which addressed to add the quirk-type solution in ATA code, and Aaron
> given
> >> better suggestion that implemented at pci_pm_init().
> >> > How do you think of it? Thanks.
> >>
> >> I think Bjorn means that we should place the code as a fixup somewhere
> >> in the quirks.c. I didn't take a closer look but DECLARE_PCI_FIXUP_FINAL
> >> for those JMicron PCI devices seems to be a proper phase.
> >
> > Thanks Aaron, then how about below patch?
> >
> > diff --git a/drivers/ata/pata_jmicron.c b/drivers/ata/pata_jmicron.c
> > index 47e418b..9e85f86 100644
> > --- a/drivers/ata/pata_jmicron.c
> > +++ b/drivers/ata/pata_jmicron.c
> > @@ -158,6 +158,21 @@ static int jmicron_init_one (struct pci_dev *pdev,
> const struct pci_device_id *i
> > return ata_pci_bmdma_init_one(pdev, ppi, &jmicron_sht, NULL,
> 0);
> > }
> >
> > +/*
> > + * For JMicron chips, we need to disable the async_suspend method,
> otherwise
> > + * they will hit the power-on issue when doing device resume, add one quick
> > + * solution to disable the async_suspend method.
>
> A "quick solution" is a red flag for me, because it's a hint that you
> just want the problem to go away without fully understanding it.
> That's probably not the case; it's probably just that *I* don't
> understand it all yet.
>
> > +static void pci_async_suspend_fixup(struct pci_dev *pdev)
> > +{
> > + /*
> > + * disabling the async_suspend method for JMicron chips to
> > + * avoid device resuming issue.
> > + */
> > + device_disable_async_suspend(&pdev->dev);
> > +}
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID,
> pci_async_suspend_fixup);
>
> I know Barto tested this and it didn't work, but this *strategy* is
> far better than adding the JMicron test in pci_pm_init(). It's ideal
> if a quirk like this could be in the pata_jmicron.c file, so it's only
> included when that driver is loaded. But it would still be OK if it
> had to be in drivers/pci/quirks.c, e.g., if something has to be done
> even before the driver is loaded.
>
> The idea of a quirk is to work around a defect in a device. What is
> the defect in this case? It seems there are two devices involved,
> e.g. (from https://bugzilla.kernel.org/show_bug.cgi?id=81551):
>
> 02:00.0 JMicron Technology Corp. JMB363 SATA/IDE Controller
> 02:00.1 JMicron Technology Corp. JMB363 SATA/IDE Controller
>
> The PCI power management code is designed to work correctly with all
> devices that conform to the spec. So either one of these devices
> doesn't conform to the spec, or the PM code is assuming some behavior
> that the spec doesn't actually require. If the latter, we need to fix
> the PM code, because it won't work with other non-JMicron devices
> either.
>
> I haven't seen reports of other devices, so my guess is that it really
> *is* a defect in these JMicron devices, but I still need a better
> understanding of exactly what's broken. Maybe it's something like:
>
> - 02:00.0 and 02:00.1 are both in D3
> - if we try to put 02:00.1 in D0 first, it fails ("Refused to change
> power state, currently in D3")
Yes, that is the case: 02:00.0 and 02:00.1 are siblings, but they have a
power-on sequence dependency.

>
> and the resolution is to change 02:00.0 from D3 to D0 first, then
> change 02:00.1 from D3 to D0. Or maybe it's just that there needs to
> be a delay between changing 02:00.0 and changing 02:00.1 (since I
> suspect we *start* those changes in that order anyway)?
A delay does not help with this issue; we must power them on one by one, in order.

>
> In either case it sounds like a device defect, because I don't see
> anything in the PCI Power Management spec about ordering requirements
> for multifunction devices.
>
> Speaking of ordering, what is it that guarantees 02:00.0 will be
> powered up before 02:00.1 when async suspend is disabled? Is it the
> dpm_noirq_list order in dpm_resume_noirq()?
Yes, it depends on that list to power on in order.
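
To make the ordering dependency concrete, here is a small user-space model. It is not kernel code: `model_fn`, `model_power_on()` and `model_resume()` are hypothetical names, and the rule that function 1 refuses D3 -> D0 until function 0 is already in D0 is the behaviour assumed in this thread, not something taken from a JMicron datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum power_state { D0, D3 };

struct model_fn {
	enum power_state state;
};

/*
 * Model of the suspected defect: any function other than function 0
 * only accepts D3 -> D0 once function 0 of the same chip is in D0.
 */
static bool model_power_on(struct model_fn *fns, size_t idx)
{
	assert(fns != NULL);

	if (idx > 0 && fns[0].state != D0)
		return false;	/* "Refused to change power state" */

	fns[idx].state = D0;
	return true;
}

/* Power on functions in the given order; return how many reached D0. */
static size_t model_resume(struct model_fn *fns, const size_t *order, size_t n)
{
	size_t ok = 0;

	for (size_t i = 0; i < n; i++)
		if (model_power_on(fns, order[i]))
			ok++;

	return ok;
}
```

Walking the functions in list order (0, then 1) brings both to D0 in this model, while an order that reaches function 1 first leaves it stuck in D3. That corresponds to what asynchronous resume can do when the two functions are resumed in parallel, and it is also why adding a delay alone would not help.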

>
> I don't know how much we gain by allowing async resume of
> multifunction devices. Is it possible that Windows always powers up
> multifunction devices in order? I doubt that Windows would have a
> quirk similar to what you're proposing here, so I wonder how they deal
> with this issue.
>
> It sounds like this issue only affects multifunction JMicron devices,
> so possibly the quirk could be made smarter by only disabling async
> suspend when it finds those.
Yes, the new quirk-solution will be sent out soon.

>
> Per https://bugzilla.kernel.org/show_bug.cgi?id=81551#c8, this is a
> regression introduced by 76569faa62c4 ("PM / sleep: Asynchronous
> threads for resume_noirq"), which appeared in v3.15. That information
> needs to be in the changelog.
>
Fair enough. The new quirk-based patch has been tested successfully by Barto;
I will update the changelog with more detail.
It is a regression, though not entirely a real one, since what we actually found is a characteristic of the JMicron devices.

