LinuxLists.cc - Re: PCI / PM: Crashes in PME scan during system suspend

2017-04-14 08:22:49

Subject: Re: PCI / PM: Crashes in PME scan during system suspend

On Tue, Feb 14, 2017 at 12:26:01PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, February 14, 2017 10:31:38 AM Geert Uytterhoeven wrote:
> > Laurent Pinchart reported that r8a7790/Lager crashes during suspend tests.
> >
> > I managed to reproduce the issue on r8a7791/koelsch:
> > - It only happens during suspend tests, after writing either "platform"
> > or "processors" to /sys/power/pm_test,
> > - It does not (or is less likely) to happen during full system suspend
> > ("core" or "none").
> >
> > More investigation shows this happens when the PME scan runs, once per
> > second. During PME scan, the PCI host bridge (rcar-pci) registers are
> > accessed while the host bridge's module clock has already been disabled,
> > leading to a crash.
>
> OK, so clearly PME scans should be suspended before the host bridge
> registers become inaccessible.
>
> Another question, though, is whether or not PME scans are actually necessary
> on the affected platforms at all.

I'm not seeing a fix for this in linux-next, am I missing something?
Has anyone looked into it or is the issue still open?

Below is a tentative patch which moves PME polling to a freezable
workqueue, so it is frozen before the host bridge is suspended.
Geert, Laurent, could you test this?

The patch may be problematic in that pci_pme_list_scan() acquires
pci_pme_list_mutex, which is also acquired by pci_pme_active(),
which gets called when devices are suspended -- *after* the worker
has been frozen. I'm not really familiar with the freezer, can it
happen that the worker is frozen while holding the mutex? If so
this would deadlock. Rafael?

Alternative approaches would be to
(a) skip devices in pci_pme_list_scan() if their is_prepared or
is_suspended flags are set, or
(b) disable PME polling via a PM notifier.

The latter seems preferable performance-wise. (To avoid checking
these flags once per second.)

Best regards,

Lukas

-- >8 --

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7904d02..d35c016 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1782,8 +1782,8 @@ static void pci_pme_list_scan(struct work_struct *work)
}
}
if (!list_empty(&pci_pme_list))
- schedule_delayed_work(&pci_pme_work,
- msecs_to_jiffies(PME_TIMEOUT));
+ queue_delayed_work(system_freezable_wq, &pci_pme_work,
+ msecs_to_jiffies(PME_TIMEOUT));
mutex_unlock(&pci_pme_list_mutex);
}

@@ -1848,8 +1848,9 @@ void pci_pme_active(struct pci_dev *dev, bool enable)
mutex_lock(&pci_pme_list_mutex);
list_add(&pme_dev->list, &pci_pme_list);
if (list_is_singular(&pci_pme_list))
- schedule_delayed_work(&pci_pme_work,
- msecs_to_jiffies(PME_TIMEOUT));
+ queue_delayed_work(system_freezable_wq,
+ &pci_pme_work,
+ msecs_to_jiffies(PME_TIMEOUT));
mutex_unlock(&pci_pme_list_mutex);
} else {
mutex_lock(&pci_pme_list_mutex);

2017-04-14 22:33:36

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: PCI / PM: Crashes in PME scan during system suspend

On Friday, April 14, 2017 10:22:49 AM Lukas Wunner wrote:
> On Tue, Feb 14, 2017 at 12:26:01PM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, February 14, 2017 10:31:38 AM Geert Uytterhoeven wrote:
> > > Laurent Pinchart reported that r8a7790/Lager crashes during suspend tests.
> > >
> > > I managed to reproduce the issue on r8a7791/koelsch:
> > > - It only happens during suspend tests, after writing either "platform"
> > > or "processors" to /sys/power/pm_test,
> > > - It does not (or is less likely) to happen during full system suspend
> > > ("core" or "none").
> > >
> > > More investigation shows this happens when the PME scan runs, once per
> > > second. During PME scan, the PCI host bridge (rcar-pci) registers are
> > > accessed while the host bridge's module clock has already been disabled,
> > > leading to a crash.
> >
> > OK, so clearly PME scans should be suspended before the host bridge
> > registers become inaccessible.
> >
> > Another question, though, is whether or not PME scans are actually necessary
> > on the affected platforms at all.
>
> I'm not seeing a fix for this in linux-next, am I missing something?
> Has anyone looked into it or is the issue still open?

It is still open AFAICS.

> Below is a tentative patch which moves PME polling to a freezable
> workqueue, so it is frozen before the host bridge is suspended.
> Geert, Laurent, could you test this?
>
> The patch may be problematic in that pci_pme_list_scan() acquires
> pci_pme_list_mutex, which is also acquired by pci_pme_active(),
> which gets called when devices are suspended -- *after* the worker
> has been frozen. I'm not really familiar with the freezer, can it
> happen that the worker is frozen while holding the mutex? If so
> this would deadlock. Rafael?

That depends on the worker, precisely on where it calls try_to_freeze().

That said I think it won't do that while holding any locks. :-)

Thanks,
Rafael

2017-04-16 07:55:31

by Lukas Wunner

[permalink] [raw]

Subject: Re: PCI / PM: Crashes in PME scan during system suspend

On Sat, Apr 15, 2017 at 12:27:31AM +0200, Rafael J. Wysocki wrote:
> On Friday, April 14, 2017 10:22:49 AM Lukas Wunner wrote:
> > Below is a tentative patch which moves PME polling to a freezable
> > workqueue, so it is frozen before the host bridge is suspended.
> > Geert, Laurent, could you test this?
> >
> > The patch may be problematic in that pci_pme_list_scan() acquires
> > pci_pme_list_mutex, which is also acquired by pci_pme_active(),
> > which gets called when devices are suspended -- *after* the worker
> > has been frozen. I'm not really familiar with the freezer, can it
> > happen that the worker is frozen while holding the mutex? If so
> > this would deadlock. Rafael?
>
> That depends on the worker, precisely on where it calls try_to_freeze().

pci_pme_list_scan() doesn't call try_to_freeze() at all, that's what had
me confused. Neither do many other workers that are scheduled to
system_freezable_wq.

However after familiarizing myself a bit more with the freezer it seems
that is fine. The freezer prevents new workers from being scheduled and
then schedules the existing workers for up to 20 seconds to allow them
to finish. The call to try_to_freeze() is thus only needed for long
running workers or for threads which are in some infinite loop.

Below is the same patch with a proper commit message, it's mostly copied
and pasted from Geert's comprehensive report. Still needs to be tested
by Geert and/or Laurent as I was unable reproduce the issue on my x86
laptop. Perhaps on x86 the OS is not responsible for disabling the PCI
clock (but presumably the BIOS) or x86 doesn't emit this kind of fault.

Thanks,

Lukas

-- >8 --
Subject: [PATCH] PCI: Freeze PME scan before suspending devices

Laurent Pinchart reported that the Renesas R-Car H2 Lager board
(r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to
reproduce the issue on an M2-W Koelsch board (r8a7791):

It occurs when the PME scan runs, once per second. During PME scan, the
PCI host bridge (rcar-pci) registers are accessed while its module clock
has already been disabled, leading to the crash.

The issue only occurs during suspend tests, after writing either
"platform" or "processors" to /sys/power/pm_test. It does not (or is
less likely) to happen during full system suspend ("core" or "none")
because system suspend also disables timers, and thus the workqueue
handling PME scans no longer runs. Geert believes the issue may still
happen in the small window between disabling module clocks and disabling
timers.

Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible. To that end, queue the task on a
workqueue that gets frozen before devices suspend.

How to reproduce:

# echo 0 > /sys/module/printk/parameters/console_suspend
# echo platform > /sys/power/pm_test # Or "processors"
# echo mem > /sys/power/state

Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are
enabled.

PM: Syncing filesystems ... [ 38.566237] done.
PM: Preparing system for sleep (mem)
Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Suspending system (mem)
PM: suspend of devices complete after 152.456 msecs
PM: late suspend of devices complete after 2.809 msecs
PM: noirq suspend of devices complete after 29.863 msecs
suspend debug: Waiting for 5 second(s).
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000040004003, *pmd=00000000
Internal error: : 1211 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
Hardware name: Generic R8A7791 (Flattened Device Tree)
Workqueue: events pci_pme_list_scan
task: eb56e140 task.stack: eb58e000
PC is at pci_generic_config_read+0x64/0x6c
LR is at rcar_pci_cfg_base+0x64/0x84
pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093
sp : eb58fe98 ip : c041d750 fp : 00000008
r10: c0e2283c r9 : 00000000 r8 : 600d0013
r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4
r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 30c5387d Table: 6a9f6c80 DAC: 55555555
Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
Stack: (0xeb58fe98 to 0xeb590000)
fe80: 00000002 00000044
fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
[<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
(pci_bus_read_config_word+0x58/0x80)
[<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
(pci_check_pme_status+0x34/0x78)
[<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
[<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
[<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
(process_one_work+0x1bc/0x308)
[<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
[<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
[<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
---[ end trace 667d43ba3aa9e589 ]---

Reported-by: Laurent Pinchart <[email protected]>
Reported-by: Geert Uytterhoeven <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Mika Westerberg <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: Yinghai Lu <[email protected]>
Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices")
Signed-off-by: Lukas Wunner <[email protected]>
---
drivers/pci/pci.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aa55501..c561a9e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1784,8 +1784,8 @@ static void pci_pme_list_scan(struct work_struct *work)
}
}
if (!list_empty(&pci_pme_list))
- schedule_delayed_work(&pci_pme_work,
- msecs_to_jiffies(PME_TIMEOUT));
+ queue_delayed_work(system_freezable_wq, &pci_pme_work,
+ msecs_to_jiffies(PME_TIMEOUT));
mutex_unlock(&pci_pme_list_mutex);
}

@@ -1850,8 +1850,9 @@ void pci_pme_active(struct pci_dev *dev, bool enable)
mutex_lock(&pci_pme_list_mutex);
list_add(&pme_dev->list, &pci_pme_list);
if (list_is_singular(&pci_pme_list))
- schedule_delayed_work(&pci_pme_work,
- msecs_to_jiffies(PME_TIMEOUT));
+ queue_delayed_work(system_freezable_wq,
+ &pci_pme_work,
+ msecs_to_jiffies(PME_TIMEOUT));
mutex_unlock(&pci_pme_list_mutex);
} else {
mutex_lock(&pci_pme_list_mutex);
--
2.11.0

2017-04-18 06:49:44

by Geert Uytterhoeven

[permalink] [raw]

Subject: Re: PCI / PM: Crashes in PME scan during system suspend

Hi Lukas,

On Sun, Apr 16, 2017 at 9:55 AM, Lukas Wunner <[email protected]> wrote:
> On Sat, Apr 15, 2017 at 12:27:31AM +0200, Rafael J. Wysocki wrote:
>> On Friday, April 14, 2017 10:22:49 AM Lukas Wunner wrote:
>> > Below is a tentative patch which moves PME polling to a freezable
>> > workqueue, so it is frozen before the host bridge is suspended.
>> > Geert, Laurent, could you test this?
>> >
>> > The patch may be problematic in that pci_pme_list_scan() acquires
>> > pci_pme_list_mutex, which is also acquired by pci_pme_active(),
>> > which gets called when devices are suspended -- *after* the worker
>> > has been frozen. I'm not really familiar with the freezer, can it
>> > happen that the worker is frozen while holding the mutex? If so
>> > this would deadlock. Rafael?
>>
>> That depends on the worker, precisely on where it calls try_to_freeze().
>
> pci_pme_list_scan() doesn't call try_to_freeze() at all, that's what had
> me confused. Neither do many other workers that are scheduled to
> system_freezable_wq.
>
> However after familiarizing myself a bit more with the freezer it seems
> that is fine. The freezer prevents new workers from being scheduled and
> then schedules the existing workers for up to 20 seconds to allow them
> to finish. The call to try_to_freeze() is thus only needed for long
> running workers or for threads which are in some infinite loop.
>
> Below is the same patch with a proper commit message, it's mostly copied
> and pasted from Geert's comprehensive report. Still needs to be tested
> by Geert and/or Laurent as I was unable reproduce the issue on my x86
> laptop. Perhaps on x86 the OS is not responsible for disabling the PCI
> clock (but presumably the BIOS) or x86 doesn't emit this kind of fault.

Thanks, works fine (see below)!

> -- >8 --
> Subject: [PATCH] PCI: Freeze PME scan before suspending devices
>
> Laurent Pinchart reported that the Renesas R-Car H2 Lager board
> (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to
> reproduce the issue on an M2-W Koelsch board (r8a7791):
>
> It occurs when the PME scan runs, once per second. During PME scan, the
> PCI host bridge (rcar-pci) registers are accessed while its module clock
> has already been disabled, leading to the crash.
>
> The issue only occurs during suspend tests, after writing either
> "platform" or "processors" to /sys/power/pm_test. It does not (or is
> less likely) to happen during full system suspend ("core" or "none")
> because system suspend also disables timers, and thus the workqueue
> handling PME scans no longer runs. Geert believes the issue may still
> happen in the small window between disabling module clocks and disabling
> timers.

It can also be reproduced easily by configuring s2ram to use s2idle instead
of deep suspend, which is a real usecase:

# echo 0 > /sys/module/printk/parameters/console_suspend
# echo s2idle > /sys/power/mem_sleep
# echo mem > /sys/power/state

> Rafael Wysocki agrees that PME scans should be suspended before the host
> bridge registers become inaccessible. To that end, queue the task on a
> workqueue that gets frozen before devices suspend.
>
> How to reproduce:
>
> # echo 0 > /sys/module/printk/parameters/console_suspend
> # echo platform > /sys/power/pm_test # Or "processors"
> # echo mem > /sys/power/state
>
> Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are
> enabled.
>
> PM: Syncing filesystems ... [ 38.566237] done.
> PM: Preparing system for sleep (mem)
> Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> PM: Suspending system (mem)
> PM: suspend of devices complete after 152.456 msecs
> PM: late suspend of devices complete after 2.809 msecs
> PM: noirq suspend of devices complete after 29.863 msecs
> suspend debug: Waiting for 5 second(s).
> Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
> pgd = c0003000
> [00000000] *pgd=80000040004003, *pmd=00000000
> Internal error: : 1211 [#1] SMP ARM
> Modules linked in:
> CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
> 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
> Hardware name: Generic R8A7791 (Flattened Device Tree)
> Workqueue: events pci_pme_list_scan
> task: eb56e140 task.stack: eb58e000
> PC is at pci_generic_config_read+0x64/0x6c
> LR is at rcar_pci_cfg_base+0x64/0x84
> pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093
> sp : eb58fe98 ip : c041d750 fp : 00000008
> r10: c0e2283c r9 : 00000000 r8 : 600d0013
> r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4
> r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000
> Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
> Control: 30c5387d Table: 6a9f6c80 DAC: 55555555
> Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
> Stack: (0xeb58fe98 to 0xeb590000)
> fe80: 00000002 00000044
> fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
> fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
> fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
> ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
> ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
> ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
> ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
> ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
> ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
> ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
> [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
> (pci_bus_read_config_word+0x58/0x80)
> [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
> (pci_check_pme_status+0x34/0x78)
> [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
> [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
> [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
> (process_one_work+0x1bc/0x308)
> [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
> [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
> [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
> Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
> ---[ end trace 667d43ba3aa9e589 ]---
>
> Reported-by: Laurent Pinchart <[email protected]>
> Reported-by: Geert Uytterhoeven <[email protected]>
> Cc: Rafael J. Wysocki <[email protected]>
> Cc: Mika Westerberg <[email protected]>
> Cc: Simon Horman <[email protected]>
> Cc: Yinghai Lu <[email protected]>
> Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices")
> Signed-off-by: Lukas Wunner <[email protected]>

Tested-by: Geert Uytterhoeven <[email protected]>

> ---
> drivers/pci/pci.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index aa55501..c561a9e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1784,8 +1784,8 @@ static void pci_pme_list_scan(struct work_struct *work)
> }
> }
> if (!list_empty(&pci_pme_list))
> - schedule_delayed_work(&pci_pme_work,
> - msecs_to_jiffies(PME_TIMEOUT));
> + queue_delayed_work(system_freezable_wq, &pci_pme_work,
> + msecs_to_jiffies(PME_TIMEOUT));
> mutex_unlock(&pci_pme_list_mutex);
> }
>
> @@ -1850,8 +1850,9 @@ void pci_pme_active(struct pci_dev *dev, bool enable)
> mutex_lock(&pci_pme_list_mutex);
> list_add(&pme_dev->list, &pci_pme_list);
> if (list_is_singular(&pci_pme_list))
> - schedule_delayed_work(&pci_pme_work,
> - msecs_to_jiffies(PME_TIMEOUT));
> + queue_delayed_work(system_freezable_wq,
> + &pci_pme_work,
> + msecs_to_jiffies(PME_TIMEOUT));
> mutex_unlock(&pci_pme_list_mutex);
> } else {
> mutex_lock(&pci_pme_list_mutex);
> --
> 2.11.0

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2017-04-18 14:12:39

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: PCI / PM: Crashes in PME scan during system suspend

On Tuesday, April 18, 2017 08:49:39 AM Geert Uytterhoeven wrote:
> Hi Lukas,
>
> On Sun, Apr 16, 2017 at 9:55 AM, Lukas Wunner <[email protected]> wrote:
> > On Sat, Apr 15, 2017 at 12:27:31AM +0200, Rafael J. Wysocki wrote:
> >> On Friday, April 14, 2017 10:22:49 AM Lukas Wunner wrote:
> >> > Below is a tentative patch which moves PME polling to a freezable
> >> > workqueue, so it is frozen before the host bridge is suspended.
> >> > Geert, Laurent, could you test this?
> >> >
> >> > The patch may be problematic in that pci_pme_list_scan() acquires
> >> > pci_pme_list_mutex, which is also acquired by pci_pme_active(),
> >> > which gets called when devices are suspended -- *after* the worker
> >> > has been frozen. I'm not really familiar with the freezer, can it
> >> > happen that the worker is frozen while holding the mutex? If so
> >> > this would deadlock. Rafael?
> >>
> >> That depends on the worker, precisely on where it calls try_to_freeze().
> >
> > pci_pme_list_scan() doesn't call try_to_freeze() at all, that's what had
> > me confused. Neither do many other workers that are scheduled to
> > system_freezable_wq.
> >
> > However after familiarizing myself a bit more with the freezer it seems
> > that is fine. The freezer prevents new workers from being scheduled and
> > then schedules the existing workers for up to 20 seconds to allow them
> > to finish. The call to try_to_freeze() is thus only needed for long
> > running workers or for threads which are in some infinite loop.
> >
> > Below is the same patch with a proper commit message, it's mostly copied
> > and pasted from Geert's comprehensive report. Still needs to be tested
> > by Geert and/or Laurent as I was unable reproduce the issue on my x86
> > laptop. Perhaps on x86 the OS is not responsible for disabling the PCI
> > clock (but presumably the BIOS) or x86 doesn't emit this kind of fault.
>
> Thanks, works fine (see below)!
>
> > -- >8 --
> > Subject: [PATCH] PCI: Freeze PME scan before suspending devices
> >
> > Laurent Pinchart reported that the Renesas R-Car H2 Lager board
> > (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to
> > reproduce the issue on an M2-W Koelsch board (r8a7791):
> >
> > It occurs when the PME scan runs, once per second. During PME scan, the
> > PCI host bridge (rcar-pci) registers are accessed while its module clock
> > has already been disabled, leading to the crash.
> >
> > The issue only occurs during suspend tests, after writing either
> > "platform" or "processors" to /sys/power/pm_test. It does not (or is
> > less likely) to happen during full system suspend ("core" or "none")
> > because system suspend also disables timers, and thus the workqueue
> > handling PME scans no longer runs. Geert believes the issue may still
> > happen in the small window between disabling module clocks and disabling
> > timers.
>
> It can also be reproduced easily by configuring s2ram to use s2idle instead
> of deep suspend, which is a real usecase:
>
> # echo 0 > /sys/module/printk/parameters/console_suspend
> # echo s2idle > /sys/power/mem_sleep
> # echo mem > /sys/power/state
>
> > Rafael Wysocki agrees that PME scans should be suspended before the host
> > bridge registers become inaccessible. To that end, queue the task on a
> > workqueue that gets frozen before devices suspend.
> >
> > How to reproduce:
> >
> > # echo 0 > /sys/module/printk/parameters/console_suspend
> > # echo platform > /sys/power/pm_test # Or "processors"
> > # echo mem > /sys/power/state
> >
> > Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are
> > enabled.
> >
> > PM: Syncing filesystems ... [ 38.566237] done.
> > PM: Preparing system for sleep (mem)
> > Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done.
> > Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> > PM: Suspending system (mem)
> > PM: suspend of devices complete after 152.456 msecs
> > PM: late suspend of devices complete after 2.809 msecs
> > PM: noirq suspend of devices complete after 29.863 msecs
> > suspend debug: Waiting for 5 second(s).
> > Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
> > pgd = c0003000
> > [00000000] *pgd=80000040004003, *pmd=00000000
> > Internal error: : 1211 [#1] SMP ARM
> > Modules linked in:
> > CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
> > 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
> > Hardware name: Generic R8A7791 (Flattened Device Tree)
> > Workqueue: events pci_pme_list_scan
> > task: eb56e140 task.stack: eb58e000
> > PC is at pci_generic_config_read+0x64/0x6c
> > LR is at rcar_pci_cfg_base+0x64/0x84
> > pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093
> > sp : eb58fe98 ip : c041d750 fp : 00000008
> > r10: c0e2283c r9 : 00000000 r8 : 600d0013
> > r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4
> > r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000
> > Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
> > Control: 30c5387d Table: 6a9f6c80 DAC: 55555555
> > Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
> > Stack: (0xeb58fe98 to 0xeb590000)
> > fe80: 00000002 00000044
> > fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
> > fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
> > fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
> > ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
> > ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
> > ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
> > ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
> > ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
> > ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
> > ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
> > [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
> > (pci_bus_read_config_word+0x58/0x80)
> > [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
> > (pci_check_pme_status+0x34/0x78)
> > [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
> > [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
> > [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
> > (process_one_work+0x1bc/0x308)
> > [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
> > [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
> > [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
> > Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
> > ---[ end trace 667d43ba3aa9e589 ]---
> >
> > Reported-by: Laurent Pinchart <[email protected]>
> > Reported-by: Geert Uytterhoeven <[email protected]>
> > Cc: Rafael J. Wysocki <[email protected]>
> > Cc: Mika Westerberg <[email protected]>
> > Cc: Simon Horman <[email protected]>
> > Cc: Yinghai Lu <[email protected]>
> > Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices")
> > Signed-off-by: Lukas Wunner <[email protected]>
>
> Tested-by: Geert Uytterhoeven <[email protected]>

There is a small concern here that some wakeup events may be missed if they
are delivered via PME without a working IRQ, but that's fairly minor and it
cannot be avoided entirely, so

Acked-by: Rafael J. Wysocki <[email protected]>

Thanks!

2017-04-18 18:39:29

by Lukas Wunner

[permalink] [raw]

Subject: Re: PCI / PM: Crashes in PME scan during system suspend

On Tue, Apr 18, 2017 at 04:06:27PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, April 18, 2017 08:49:39 AM Geert Uytterhoeven wrote:
> > On Sun, Apr 16, 2017 at 9:55 AM, Lukas Wunner <[email protected]> wrote:
> > > Subject: [PATCH] PCI: Freeze PME scan before suspending devices
> > >
> > > Laurent Pinchart reported that the Renesas R-Car H2 Lager board
> > > (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to
> > > reproduce the issue on an M2-W Koelsch board (r8a7791):
> > >
> > > It occurs when the PME scan runs, once per second. During PME scan, the
> > > PCI host bridge (rcar-pci) registers are accessed while its module clock
> > > has already been disabled, leading to the crash.
> > >
> > > The issue only occurs during suspend tests, after writing either
> > > "platform" or "processors" to /sys/power/pm_test. It does not (or is
> > > less likely) to happen during full system suspend ("core" or "none")
> > > because system suspend also disables timers, and thus the workqueue
> > > handling PME scans no longer runs. Geert believes the issue may still
> > > happen in the small window between disabling module clocks and disabling
> > > timers.
> >
> > It can also be reproduced easily by configuring s2ram to use s2idle instead
> > of deep suspend, which is a real usecase:
> >
> > # echo 0 > /sys/module/printk/parameters/console_suspend
> > # echo s2idle > /sys/power/mem_sleep
> > # echo mem > /sys/power/state
> >
> > Tested-by: Geert Uytterhoeven <[email protected]>
>
> There is a small concern here that some wakeup events may be missed if they
> are delivered via PME without a working IRQ, but that's fairly minor and it
> cannot be avoided entirely, so

Well, that's a conundrum. I don't know which devices depend on PME polling
and whether they may signal PME between freezing the workqueue and suspending
the host bridge. If this unexpectedly turns out to be a problem in practice,
it might be possible to solve it by calling pci_pme_list_scan() once directly
from one of the host bridge's pm_ops callbacks.

I've amended the commit message with the tags and additional information
provided by you and Geert and will resend the patch to the list shortly.

Thanks,

Lukas