2015-05-17 08:26:04

by Geert Uytterhoeven

Subject: Calling irq_set_irq_wake() from .set_irq_wake()? (was: Re: [PATCH] gpio: pcf875x: Revert "gpio: pcf857x: Propagate wake-up setting to parent irq controller")

Hi Grygorii, Thomas, Ingo,

On Thu, May 14, 2015 at 2:32 PM, [email protected]
<[email protected]> wrote:
> On 05/11/2015 08:36 PM, Geert Uytterhoeven wrote:
>> On Mon, May 11, 2015 at 4:13 PM, Roger Quadros <[email protected]> wrote:
>>> commit b80eef95beb0 ('gpio: pcf857x: Propagate wake-up setting to parent irq controller')
>>> introduces the following recursive locking warning while suspending dra7-evm.
>>>
>>> The issue addressed by that commit has been already resolved by
>>> commit 10a50f1ab5f0 ('genirq: Set IRQCHIP_SKIP_SET_WAKE flag for dummy_irq_chip')
>>
>> That's not 100% correct: commit b80eef95beb0 ('gpio: pcf857x: Propagate wake-up
>> setting to parent irq controller') fixes _two_ things:
>> 1. warning due to missing irq_set_wake / IRQCHIP_SKIP_SET_WAKE,
>> 2. propagating set_wake, so the parent interrupt controller stays awake, as
>> it's needed for wake-up,
>>
>> Only the first issue is addressed by commit 10a50f1ab5f0 ('genirq: Set
>> IRQCHIP_SKIP_SET_WAKE flag for dummy_irq_chip').
>>
>>> and so let's revert commit b80eef95beb0 ('gpio: pcf857x: Propagate wake-up setting to parent irq controller')
>>>
>>> At least the recursive locking message no longer appears after the revert.
>>>
>>> [ 30.591905] PM: Syncing filesystems ... done.
>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>> [ 30.663678]
>>> [ 30.663681] =============================================
>>> [ 30.663683] [ INFO: possible recursive locking detected ]
>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
>>> [ 30.663693] ---------------------------------------------
>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>> [ 30.663722]
>>> [ 30.663722] but task is already holding lock:
>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>
>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
>> Many GPIO drivers do that, as they need to propagate wake-up state to the
>> parent interrupt controller?
>
> As I remember, there was similar problem, so I found corresponding patch (just FYI)
>
> ab2b926 mfd: Fix twl6030 lockdep recursion warning on setting wake IRQs
>
> Not sure such kind of solution is the best choice (

That looks like a convoluted solution...

Thomas, Ingo, can you please chime in w.r.t. calling irq_set_irq_wake()
from .set_irq_wake()?

The thread starts at http://www.spinics.net/lists/linux-gpio/msg05844.html

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2015-05-18 14:31:07

by Thomas Gleixner

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()? (was: Re: [PATCH] gpio: pcf875x: Revert "gpio: pcf857x: Propagate wake-up setting to parent irq controller")

On Sun, 17 May 2015, Geert Uytterhoeven wrote:
> >>> At least the recursive locking message no longer appears after the revert.
> >>>
> >>> [ 30.591905] PM: Syncing filesystems ... done.
> >>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
> >>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
> >>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> >>> [ 30.663678]
> >>> [ 30.663681] =============================================
> >>> [ 30.663683] [ INFO: possible recursive locking detected ]
> >>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
> >>> [ 30.663693] ---------------------------------------------
> >>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
> >>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
> >>> [ 30.663722]
> >>> [ 30.663722] but task is already holding lock:
> >>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
> >>
> >> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?

It can call it, if it's guaranteed that this won't deadlock.

To tell lockdep that you are sure about that, you need to set a different
lock class for the child interrupts. irq_set_lockdep_class() is what
you want to use here.
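
To make the lock-class point concrete, here is a toy userspace model of
the check involved. It is not kernel code: struct lock_class,
model_lock(), demo_shared_class() and demo_separate_class() are names
made up for this illustration. The only behaviour borrowed from lockdep
is the assumption that it reports when a task takes a lock whose class
matches one it already holds.

```c
/* Toy userspace model of lockdep's same-class check. NOT kernel code. */
#include <assert.h>
#include <stddef.h>

struct lock_class { const char *name; };          /* hypothetical stand-in for a class key */
struct desc_lock  { const struct lock_class *class; };

#define MAX_HELD 8
static const struct desc_lock *held[MAX_HELD];    /* locks currently held */
static size_t nheld;
static unsigned int reports;                      /* same-class nestings seen */

/* Take a lock; count a report if a lock of the same class is already held. */
static void model_lock(const struct desc_lock *l)
{
	size_t i;

	assert(nheld < MAX_HELD);
	for (i = 0; i < nheld; i++) {
		if (held[i]->class == l->class) {
			reports++;
			break;
		}
	}
	held[nheld++] = l;
}

/* Release the most recently taken lock. */
static void model_unlock(void)
{
	assert(nheld > 0);
	nheld--;
}

/* Model a child's set_wake callback that takes the parent's lock as well. */
static unsigned int nested_set_wake(const struct desc_lock *child,
				    const struct desc_lock *parent)
{
	reports = 0;
	model_lock(child);     /* taken on behalf of the child interrupt */
	model_lock(parent);    /* taken again when propagating to the parent */
	model_unlock();
	model_unlock();
	return reports;
}

/* Child and parent share one class: the nesting is reported. */
unsigned int demo_shared_class(void)
{
	static const struct lock_class common = { "common" };
	const struct desc_lock parent = { &common };
	const struct desc_lock child = { &common };

	return nested_set_wake(&child, &parent);
}

/* Child has its own class: the same nesting is not reported. */
unsigned int demo_separate_class(void)
{
	static const struct lock_class parent_class = { "parent" };
	static const struct lock_class child_class = { "child" };
	const struct desc_lock parent = { &parent_class };
	const struct desc_lock child = { &child_class };

	return nested_set_wake(&child, &parent);
}
```

In this model demo_shared_class() returns 1 and demo_separate_class()
returns 0. The locking order is identical in both cases; only the class
assigned to the child differs.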

> >> Many GPIO drivers do that, as they need to propagate wake-up state to the
> >> parent interrupt controller?
> >
> > As I remember, there was similar problem, so I found corresponding patch (just FYI)
> >
> > ab2b926 mfd: Fix twl6030 lockdep recursion warning on setting wake IRQs
> >
> > Not sure such kind of solution is the best choice (
>
> That looks like a convoluted solution...

Indeed. See above.

Thanks,

tglx

2015-05-18 14:53:02

by Grygorii Strashko

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()?

On 05/18/2015 05:31 PM, Thomas Gleixner wrote:
> On Sun, 17 May 2015, Geert Uytterhoeven wrote:
>>>>> At least the recursive locking message no longer appears after the revert.
>>>>>
>>>>> [ 30.591905] PM: Syncing filesystems ... done.
>>>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
>>>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
>>>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>>>> [ 30.663678]
>>>>> [ 30.663681] =============================================
>>>>> [ 30.663683] [ INFO: possible recursive locking detected ]
>>>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
>>>>> [ 30.663693] ---------------------------------------------
>>>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
>>>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>> [ 30.663722]
>>>>> [ 30.663722] but task is already holding lock:
>>>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>
>>>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
>
> It can call it, if it's guaranteed that this wont deadlock.
>
> To tell lockdep that you sure about that, you need to set a different
> lock class for the child interrupts. irq_set_lockdep_class() is what
> you want to use here.

Hm. Seems we already have corresponding call in gpiochip_irq_map:

static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
			    irq_hw_number_t hwirq)
{
	struct gpio_chip *chip = d->host_data;

	irq_set_chip_data(irq, chip);
	irq_set_lockdep_class(irq, &gpiochip_irq_lock_class);
	^^^^

commit e45d1c80c0eee88e82751461e9cac49d9ed287bc
Author: Linus Walleij <[email protected]>
Date: Tue Apr 22 14:01:46 2014 +0200

gpio: put GPIO IRQs into their own lock class

added in Kernel v3.16

Roger, can you confirm that you've observed this issue with the latest kernel, please?

>
>>>> Many GPIO drivers do that, as they need to propagate wake-up state to the
>>>> parent interrupt controller?
>>>
>>> As I remember, there was similar problem, so I found corresponding patch (just FYI)
>>>
>>> ab2b926 mfd: Fix twl6030 lockdep recursion warning on setting wake IRQs
>>>
>>> Not sure such kind of solution is the best choice (
>>
>> That looks like a convoluted solution...
>
> Indeed. See above.


--
regards,
-grygorii

2015-05-19 09:38:35

by Geert Uytterhoeven

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()?

Hi Grygorii,

On Mon, May 18, 2015 at 4:52 PM, [email protected]
<[email protected]> wrote:
> On 05/18/2015 05:31 PM, Thomas Gleixner wrote:
>> On Sun, 17 May 2015, Geert Uytterhoeven wrote:
>>>>>> At least the recursive locking message no longer appears after the revert.
>>>>>>
>>>>>> [ 30.591905] PM: Syncing filesystems ... done.
>>>>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
>>>>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
>>>>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>>>>> [ 30.663678]
>>>>>> [ 30.663681] =============================================
>>>>>> [ 30.663683] [ INFO: possible recursive locking detected ]
>>>>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
>>>>>> [ 30.663693] ---------------------------------------------
>>>>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
>>>>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>> [ 30.663722]
>>>>>> [ 30.663722] but task is already holding lock:
>>>>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>
>>>>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
>>
>> It can call it, if it's guaranteed that this wont deadlock.
>>
>> To tell lockdep that you sure about that, you need to set a different
>> lock class for the child interrupts. irq_set_lockdep_class() is what
>> you want to use here.
>
> Hm. Seems we already have corresponding call in gpiochip_irq_map:
>
> static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
> irq_hw_number_t hwirq)
> {
> struct gpio_chip *chip = d->host_data;
>
> irq_set_chip_data(irq, chip);
> irq_set_lockdep_class(irq, &gpiochip_irq_lock_class);
> ^^^^

That piece of code sets the lockdep class of the gpiochip's interrupts, not
the parent interrupt.

Found out the hard way by adding some debug code ;-)

gpiochip_irq_map: setting lockdep class for irq 111
gpiochip_irq_map: setting lockdep class for irq 112
gpiochip_irq_map: setting lockdep class for irq 113
gpiochip_irq_map: setting lockdep class for irq 114
gpiochip_irq_map: setting lockdep class for irq 115
gpiochip_irq_map: setting lockdep class for irq 116
gpiochip_irq_map: setting lockdep class for irq 117
gpiochip_irq_map: setting lockdep class for irq 118
gpiochip_irq_map: setting lockdep class for irq 119
gpiochip_irq_map: setting lockdep class for irq 120
gpiochip_irq_map: setting lockdep class for irq 121
gpiochip_irq_map: setting lockdep class for irq 122
gpiochip_irq_map: setting lockdep class for irq 123
gpiochip_irq_map: setting lockdep class for irq 124
gpiochip_irq_map: setting lockdep class for irq 125
gpiochip_irq_map: setting lockdep class for irq 126

pcf857x_irq_set_wake: setting wake for irq 96

However, I cannot reproduce the problem on sh73a0/kzm9g with
s2ram on a current tree (renesas-drivers-2015-05-19-v4.1-rc4 from
https://git.kernel.org/cgit/linux/kernel/git/geert/renesas-drivers.git), using

CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y

Wake-up from gpio-keys works fine, no scary messages.

> commit e45d1c80c0eee88e82751461e9cac49d9ed287bc
> Author: Linus Walleij <[email protected]>
> Date: Tue Apr 22 14:01:46 2014 +0200
>
> gpio: put GPIO IRQs into their own lock clas
>
> added in Kernel v3.16
>
> Roger, can you confirm that you've observed this issue with latest kernel, pls?

Yes please. Thanks!

>>>>> Many GPIO drivers do that, as they need to propagate wake-up state to the
>>>>> parent interrupt controller?
>>>>
>>>> As I remember, there was similar problem, so I found corresponding patch (just FYI)
>>>>
>>>> ab2b926 mfd: Fix twl6030 lockdep recursion warning on setting wake IRQs
>>>>
>>>> Not sure such kind of solution is the best choice (
>>>
>>> That looks like a convoluted solution...
>>
>> Indeed. See above.

Gr{oetje,eeting}s,

Geert


2015-06-03 19:53:18

by Grygorii Strashko

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()?

Hi Geert,

On 05/19/2015 12:38 PM, Geert Uytterhoeven wrote:
> On Mon, May 18, 2015 at 4:52 PM, [email protected]
> <[email protected]> wrote:
>> On 05/18/2015 05:31 PM, Thomas Gleixner wrote:
>>> On Sun, 17 May 2015, Geert Uytterhoeven wrote:
>>>>>>> At least the recursive locking message no longer appears after the revert.
>>>>>>>
>>>>>>> [ 30.591905] PM: Syncing filesystems ... done.
>>>>>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
>>>>>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
>>>>>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>>>>>> [ 30.663678]
>>>>>>> [ 30.663681] =============================================
>>>>>>> [ 30.663683] [ INFO: possible recursive locking detected ]
>>>>>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
>>>>>>> [ 30.663693] ---------------------------------------------
>>>>>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
>>>>>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>>> [ 30.663722]
>>>>>>> [ 30.663722] but task is already holding lock:
>>>>>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>>
>>>>>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
>>>
>>> It can call it, if it's guaranteed that this wont deadlock.
>>>
>>> To tell lockdep that you sure about that, you need to set a different
>>> lock class for the child interrupts. irq_set_lockdep_class() is what
>>> you want to use here.
>>
>> Hm. Seems we already have corresponding call in gpiochip_irq_map:
>>
>> static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
>> irq_hw_number_t hwirq)
>> {
>> struct gpio_chip *chip = d->host_data;
>>
>> irq_set_chip_data(irq, chip);
>> irq_set_lockdep_class(irq, &gpiochip_irq_lock_class);
>> ^^^^
>
> That piece of code sets the lockdep class of the gpiochip's interrupts, not
> the parent interrupt.
>
> Found out the hard way by adding some debug code ;-)
[..]
>
> However, I cannot reproduce the problem on sh73a0/kzm9g with
> s2ram on a current tree (renesas-drivers-2015-05-19-v4.1-rc4 from
> (https://git.kernel.org/cgit/linux/kernel/git/geert/renesas-drivers.git), using
>
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_LOCKDEP=y
> CONFIG_DEBUG_LOCKDEP=y
> CONFIG_PROVE_LOCKING=y
>
> Wake-up from gpio-keys works fine, no scary messages.
>
>> commit e45d1c80c0eee88e82751461e9cac49d9ed287bc
>> Author: Linus Walleij <[email protected]>
>> Date: Tue Apr 22 14:01:46 2014 +0200
>>
>> gpio: put GPIO IRQs into their own lock clas
>>
>> added in Kernel v3.16
>>
>> Roger, can you confirm that you've observed this issue with latest kernel, pls?
>
> Yes please. Thanks!

Unfortunately, I was able to reproduce it, but have no clue how to fix it gracefully.
lockdep_set_class_and_subclass(..,gpio_chip->base)?

HW configuration which generates the lockdep warning:

[SOC GPIO bankA.gpioX] <- irq - [pcf857x.gpioY] <- irq - DevZ.enable_irq_wake(pcf_gpioY_irq);

These are stacked GPIO chips, but gpiolib uses only one lockdep class,
gpiochip_irq_lock_class, for all GPIO irqchips.
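
The stacked configuration above can be sketched as a small userspace
model (again, not kernel code). struct irq_model, propagate_wake() and
the "per-chip" class are hypothetical names for this illustration only.
The model assumes that each level keeps its descriptor lock held while
it propagates the wake setting to its parent, and that a report is
raised whenever a newly taken lock has the same class as one already
held.

```c
/* Toy model of wake-up propagation through stacked irqchips. NOT kernel code. */
#include <assert.h>
#include <stddef.h>

struct lock_class { const char *name; };

struct irq_model {
	const struct lock_class *class;   /* class of this irq's descriptor lock */
	const struct irq_model *parent;   /* irq the wake setting is propagated to */
};

#define MAX_DEPTH 8

/*
 * Walk from the leaf irq up to the root, as nested set_wake callbacks
 * would, with every lock taken so far still held. Returns the number
 * of acquisitions that found a lock of the same class already held.
 */
unsigned int propagate_wake(const struct irq_model *irq)
{
	const struct lock_class *held[MAX_DEPTH];
	size_t nheld = 0, i;
	unsigned int reports = 0;

	for (; irq; irq = irq->parent) {
		assert(nheld < MAX_DEPTH);
		for (i = 0; i < nheld; i++) {
			if (held[i] == irq->class) {
				reports++;
				break;
			}
		}
		held[nheld++] = irq->class;
	}
	return reports;
}

static const struct lock_class gpiochip_class = { "gpiochip_irq_lock_class" };
static const struct lock_class per_chip_class = { "hypothetical per-chip class" };

/* pcf857x irq stacked on a SoC GPIO irq, both in the shared gpiolib class. */
unsigned int stacked_shared_class(void)
{
	const struct irq_model soc_gpio = { &gpiochip_class, NULL };
	const struct irq_model pcf = { &gpiochip_class, &soc_gpio };

	return propagate_wake(&pcf);
}

/* Same stack, but the expander's irqs get a class of their own. */
unsigned int stacked_per_chip_class(void)
{
	const struct irq_model soc_gpio = { &gpiochip_class, NULL };
	const struct irq_model pcf = { &per_chip_class, &soc_gpio };

	return propagate_wake(&pcf);
}
```

Here stacked_shared_class() returns 1 and stacked_per_chip_class()
returns 0. Whether a per-chip class, a subclass, or some other scheme
is the right fix for gpiolib is left open by this model.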


>
>>>>>> Many GPIO drivers do that, as they need to propagate wake-up state to the
>>>>>> parent interrupt controller?
>>>>>
>>>>> As I remember, there was similar problem, so I found corresponding patch (just FYI)
>>>>>
>>>>> ab2b926 mfd: Fix twl6030 lockdep recursion warning on setting wake IRQs
>>>>>
>>>>> Not sure such kind of solution is the best choice (
>>>>
>>>> That looks like a convoluted solution...
>>>

regards,
-grygorii

2015-06-05 02:36:14

by Roger Quadros

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()?

Hi,

On Wed, 3 Jun 2015 22:52:47 +0300
Grygorii Strashko <[email protected]> wrote:

> Hi Geert,
>
> On 05/19/2015 12:38 PM, Geert Uytterhoeven wrote:
> > On Mon, May 18, 2015 at 4:52 PM, [email protected]
> > <[email protected]> wrote:
> >> On 05/18/2015 05:31 PM, Thomas Gleixner wrote:
> >>> On Sun, 17 May 2015, Geert Uytterhoeven wrote:
> >>>>>>> At least the recursive locking message no longer appears after the revert.
> >>>>>>>
> >>>>>>> [ 30.591905] PM: Syncing filesystems ... done.
> >>>>>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
> >>>>>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
> >>>>>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> >>>>>>> [ 30.663678]
> >>>>>>> [ 30.663681] =============================================
> >>>>>>> [ 30.663683] [ INFO: possible recursive locking detected ]
> >>>>>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
> >>>>>>> [ 30.663693] ---------------------------------------------
> >>>>>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
> >>>>>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
> >>>>>>> [ 30.663722]
> >>>>>>> [ 30.663722] but task is already holding lock:
> >>>>>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
> >>>>>>
> >>>>>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
> >>>
> >>> It can call it, if it's guaranteed that this wont deadlock.
> >>>
> >>> To tell lockdep that you sure about that, you need to set a different
> >>> lock class for the child interrupts. irq_set_lockdep_class() is what
> >>> you want to use here.
> >>
> >> Hm. Seems we already have corresponding call in gpiochip_irq_map:
> >>
> >> static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
> >> irq_hw_number_t hwirq)
> >> {
> >> struct gpio_chip *chip = d->host_data;
> >>
> >> irq_set_chip_data(irq, chip);
> >> irq_set_lockdep_class(irq, &gpiochip_irq_lock_class);
> >> ^^^^
> >
> > That piece of code sets the lockdep class of the gpiochip's interrupts, not
> > the parent interrupt.
> >
> > Found out the hard way by adding some debug code ;-)
> [..]
> >
> > However, I cannot reproduce the problem on sh73a0/kzm9g with
> > s2ram on a current tree (renesas-drivers-2015-05-19-v4.1-rc4 from
> > (https://git.kernel.org/cgit/linux/kernel/git/geert/renesas-drivers.git), using
> >
> > CONFIG_LOCKDEP_SUPPORT=y
> > CONFIG_LOCKDEP=y
> > CONFIG_DEBUG_LOCKDEP=y
> > CONFIG_PROVE_LOCKING=y
> >
> > Wake-up from gpio-keys works fine, no scary messages.
> >
> >> commit e45d1c80c0eee88e82751461e9cac49d9ed287bc
> >> Author: Linus Walleij <[email protected]>
> >> Date: Tue Apr 22 14:01:46 2014 +0200
> >>
> >> gpio: put GPIO IRQs into their own lock clas
> >>
> >> added in Kernel v3.16
> >>
> >> Roger, can you confirm that you've observed this issue with latest kernel, pls?
> >
> > Yes please. Thanks!

The issue is reproducible on v4.1-rc6.

>
> Unfortunately, I was able to reproduce it, but have no clue how to fix it gracefully.
> lockdep_set_class_and_subclass(..,gpio_chip->base)?
>
> HW configuration which generates lockdep warning:
>
> [SOC GPIO bankA.gpioX] <- irq - [pcf875x.gpioY] <- irq - DevZ.enable_irq_wake(pcf_gpioY_irq);
>
> There stacked GPIO chips, but gpiolib uses only one lockdep class for all GPIOirqchips -
> - gpiochip_irq_lock_class.

If this is a gpiolib core issue, are we (dra7-evm) the only stacked GPIO users facing
this problem?

Linus/Alexandre/Geert,

Please advise what can be done for v4.1. The warning is annoying for dra7-evm users.
Should we temporarily revert the patch even though it is correct and add it back when the
gpiolib core issue is fixed?

cheers,
-roger

2015-06-05 09:47:38

by Grygorii Strashko

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()?

On 06/05/2015 05:35 AM, Roger Quadros wrote:
> Hi,
>
> On Wed, 3 Jun 2015 22:52:47 +0300
> Grygorii Strashko <[email protected]> wrote:
>
>> Hi Geert,
>>
>> On 05/19/2015 12:38 PM, Geert Uytterhoeven wrote:
>>> On Mon, May 18, 2015 at 4:52 PM, [email protected]
>>> <[email protected]> wrote:
>>>> On 05/18/2015 05:31 PM, Thomas Gleixner wrote:
>>>>> On Sun, 17 May 2015, Geert Uytterhoeven wrote:
>>>>>>>>> At least the recursive locking message no longer appears after the revert.
>>>>>>>>>
>>>>>>>>> [ 30.591905] PM: Syncing filesystems ... done.
>>>>>>>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
>>>>>>>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
>>>>>>>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>>>>>>>> [ 30.663678]
>>>>>>>>> [ 30.663681] =============================================
>>>>>>>>> [ 30.663683] [ INFO: possible recursive locking detected ]
>>>>>>>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
>>>>>>>>> [ 30.663693] ---------------------------------------------
>>>>>>>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
>>>>>>>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>>>>> [ 30.663722]
>>>>>>>>> [ 30.663722] but task is already holding lock:
>>>>>>>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>>>>
>>>>>>>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
>>>>>
>>>>> It can call it, if it's guaranteed that this wont deadlock.
>>>>>
>>>>> To tell lockdep that you sure about that, you need to set a different
>>>>> lock class for the child interrupts. irq_set_lockdep_class() is what
>>>>> you want to use here.
>>>>
>>>> Hm. Seems we already have corresponding call in gpiochip_irq_map:
>>>>
>>>> static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
>>>> irq_hw_number_t hwirq)
>>>> {
>>>> struct gpio_chip *chip = d->host_data;
>>>>
>>>> irq_set_chip_data(irq, chip);
>>>> irq_set_lockdep_class(irq, &gpiochip_irq_lock_class);
>>>> ^^^^
>>>
>>> That piece of code sets the lockdep class of the gpiochip's interrupts, not
>>> the parent interrupt.
>>>
>>> Found out the hard way by adding some debug code ;-)
>> [..]
>>>
>>> However, I cannot reproduce the problem on sh73a0/kzm9g with
>>> s2ram on a current tree (renesas-drivers-2015-05-19-v4.1-rc4 from
>>> (https://git.kernel.org/cgit/linux/kernel/git/geert/renesas-drivers.git), using
>>>
>>> CONFIG_LOCKDEP_SUPPORT=y
>>> CONFIG_LOCKDEP=y
>>> CONFIG_DEBUG_LOCKDEP=y
>>> CONFIG_PROVE_LOCKING=y
>>>
>>> Wake-up from gpio-keys works fine, no scary messages.
>>>
>>>> commit e45d1c80c0eee88e82751461e9cac49d9ed287bc
>>>> Author: Linus Walleij <[email protected]>
>>>> Date: Tue Apr 22 14:01:46 2014 +0200
>>>>
>>>> gpio: put GPIO IRQs into their own lock clas
>>>>
>>>> added in Kernel v3.16
>>>>
>>>> Roger, can you confirm that you've observed this issue with latest kernel, pls?
>>>
>>> Yes please. Thanks!
>
> Issue is reproducible on v4.1-rc6
>
>>
>> Unfortunately, I was able to reproduce it, but have no clue how to fix it gracefully.
>> lockdep_set_class_and_subclass(..,gpio_chip->base)?
>>
>> HW configuration which generates lockdep warning:
>>
>> [SOC GPIO bankA.gpioX] <- irq - [pcf875x.gpioY] <- irq - DevZ.enable_irq_wake(pcf_gpioY_irq);
>>
>> There stacked GPIO chips, but gpiolib uses only one lockdep class for all GPIOirqchips -
>> - gpiochip_irq_lock_class.
>
> If this is a gpiolib core issue are we (dra7-evm) the only stacked GPIO users facing
> this problem?
>
> Linus/Alexandre/Geert,
>
> Please advise what can be done for v4.1. The warning is annoying for dra7-evm users.
> Should we temporarily revert the patch even though it is correct and add it back when the
> gpiolib core issue is fixed?

No, please don't do that. See https://lkml.org/lkml/2015/6/3/965

A simple revert is not a good solution.

We probably need to allow GPIO drivers to specify their own lockdep class somehow.

--
regards,
-grygorii

2015-06-06 08:16:59

by Geert Uytterhoeven

Subject: Re: Calling irq_set_irq_wake() from .set_irq_wake()?

On Fri, Jun 5, 2015 at 11:47 AM, Grygorii Strashko
<[email protected]> wrote:
> On 06/05/2015 05:35 AM, Roger Quadros wrote:
>> On Wed, 3 Jun 2015 22:52:47 +0300
>> Grygorii Strashko <[email protected]> wrote:
>>> On 05/19/2015 12:38 PM, Geert Uytterhoeven wrote:
>>>> On Mon, May 18, 2015 at 4:52 PM, [email protected]
>>>> <[email protected]> wrote:
>>>>> On 05/18/2015 05:31 PM, Thomas Gleixner wrote:
>>>>>> On Sun, 17 May 2015, Geert Uytterhoeven wrote:
>>>>>>>>>> At least the recursive locking message no longer appears after the revert.
>>>>>>>>>>
>>>>>>>>>> [ 30.591905] PM: Syncing filesystems ... done.
>>>>>>>>>> [ 30.623060] Freezing user space processes ... (elapsed 0.003 seconds) done.
>>>>>>>>>> [ 30.634470] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
>>>>>>>>>> [ 30.658288] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>>>>>>>>> [ 30.663678]
>>>>>>>>>> [ 30.663681] =============================================
>>>>>>>>>> [ 30.663683] [ INFO: possible recursive locking detected ]
>>>>>>>>>> [ 30.663688] 4.1.0-rc3 #1115 Not tainted
>>>>>>>>>> [ 30.663693] ---------------------------------------------
>>>>>>>>>> [ 30.663697] suspend.sh/2319 is trying to acquire lock:
>>>>>>>>>> [ 30.663719] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>>>>>> [ 30.663722]
>>>>>>>>>> [ 30.663722] but task is already holding lock:
>>>>>>>>>> [ 30.663734] (class){......}, at: [<c0096ebc>] __irq_get_desc_lock+0x48/0x88
>>>>>>>>>
>>>>>>>>> Does this mean .set_irq_wake() cannot call irq_set_irq_wake()?
>>>>>>
>>>>>> It can call it, if it's guaranteed that this wont deadlock.
>>>>>>
>>>>>> To tell lockdep that you sure about that, you need to set a different
>>>>>> lock class for the child interrupts. irq_set_lockdep_class() is what
>>>>>> you want to use here.
>>>>>
>>>>> Hm. Seems we already have corresponding call in gpiochip_irq_map:
>>>>>
>>>>> static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
>>>>> irq_hw_number_t hwirq)
>>>>> {
>>>>> struct gpio_chip *chip = d->host_data;
>>>>>
>>>>> irq_set_chip_data(irq, chip);
>>>>> irq_set_lockdep_class(irq, &gpiochip_irq_lock_class);
>>>>> ^^^^
>>>>
>>>> That piece of code sets the lockdep class of the gpiochip's interrupts, not
>>>> the parent interrupt.
>>>>
>>>> Found out the hard way by adding some debug code ;-)
>>> [..]
>>>>
>>>> However, I cannot reproduce the problem on sh73a0/kzm9g with
>>>> s2ram on a current tree (renesas-drivers-2015-05-19-v4.1-rc4 from
>>>> (https://git.kernel.org/cgit/linux/kernel/git/geert/renesas-drivers.git), using
>>>>
>>>> CONFIG_LOCKDEP_SUPPORT=y
>>>> CONFIG_LOCKDEP=y
>>>> CONFIG_DEBUG_LOCKDEP=y
>>>> CONFIG_PROVE_LOCKING=y
>>>>
>>>> Wake-up from gpio-keys works fine, no scary messages.
>>>>
>>>>> commit e45d1c80c0eee88e82751461e9cac49d9ed287bc
>>>>> Author: Linus Walleij <[email protected]>
>>>>> Date: Tue Apr 22 14:01:46 2014 +0200
>>>>>
>>>>> gpio: put GPIO IRQs into their own lock clas
>>>>>
>>>>> added in Kernel v3.16
>>>>>
>>>>> Roger, can you confirm that you've observed this issue with latest kernel, pls?
>>>>
>>>> Yes please. Thanks!
>>
>> Issue is reproducible on v4.1-rc6
>>
>>>
>>> Unfortunately, I was able to reproduce it, but have no clue how to fix it gracefully.
>>> lockdep_set_class_and_subclass(..,gpio_chip->base)?
>>>
>>> HW configuration which generates lockdep warning:
>>>
>>> [SOC GPIO bankA.gpioX] <- irq - [pcf875x.gpioY] <- irq - DevZ.enable_irq_wake(pcf_gpioY_irq);
>>>
>>> There stacked GPIO chips, but gpiolib uses only one lockdep class for all GPIOirqchips -
>>> - gpiochip_irq_lock_class.
>>
>> If this is a gpiolib core issue are we (dra7-evm) the only stacked GPIO users facing
>> this problem?
>>
>> Linus/Alexandre/Geert,
>>
>> Please advise what can be done for v4.1. The warning is annoying for dra7-evm users.
>> Should we temporarily revert the patch even though it is correct and add it back when the
>> gpiolib core issue is fixed?
>
> No. Pls. don't do that. See https://lkml.org/lkml/2015/6/3/965

I'm about to leave for a business trip to Japan. I will give it a try when I'm
back home.

> Simple revert is not good solution.
>
> Probably we need to allow GPIO drivers to specify own lockdep class somehow.

Indeed.

Gr{oetje,eeting}s,

Geert
