2023-05-10 00:44:08

by Chris Packham

[permalink] [raw]
Subject: [PATCH] gpiolib: Don't implicitly disable irq when masking

When preparing to kexec into a new kernel the kexec code will mask all
interrupts for all interrupt domains before disabling them. In the case
of a gpio chip which has a mix of gpio and irq pins a warning would be
triggered as follows

[root@localhost ~]# echo c >/proc/sysrq-trigger
[ 175.677728] sysrq: Trigger a crash
<snip expected dump from SysRq>
[ 175.803409] WARNING: CPU: 1 PID: 2345 at drivers/gpio/gpiolib.c:3284 gpiochip_disable_irq+0x68/0xac
[ 175.918321] CPU: 1 PID: 2345 Comm: sh Kdump: loaded Tainted: G O 5.15.109+ #5
[ 175.926742] Hardware name: Allied Telesis x240-26GHXm (DT)
[ 175.932216] pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 175.939165] pc : gpiochip_disable_irq+0x68/0xac
[ 175.943686] lr : gpiochip_disable_irq+0x58/0xac
[ 175.948208] sp : ffff80000c03ba00
[ 175.951513] x29: ffff80000c03ba00 x28: ffff00002843c4c0 x27: 0000000000000000
[ 175.958638] x26: 0000000000000000 x25: ffff800008df0790 x24: 0000000000000000
[ 175.965763] x23: ffff8000095397f0 x22: ffff800009485178 x21: ffff000016ea38f0
[ 175.972888] x20: ffff000016ea3a20 x19: ffff000016073cc0 x18: ffffffffffffffff
[ 175.980012] x17: 20204f2020202020 x16: 2020202020204720 x15: ffff80008c03b667
[ 175.987136] x14: 0000000000000000 x13: ffff800009415148 x12: 0000000000000555
[ 175.994260] x11: 00000000000001c7 x10: ffff800009415148 x9 : ffff800009415148
[ 176.001385] x8 : 00000000ffffefff x7 : ffff80000946d148 x6 : ffff80000946d148
[ 176.008510] x5 : ffff00003fdc59a0 x4 : 0000000000000000 x3 : 0000000000000000
[ 176.015634] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000007
[ 176.022757] Call trace:
[ 176.025197] gpiochip_disable_irq+0x68/0xac
[ 176.029373] gpiochip_irq_mask+0x30/0x40
[ 176.033289] machine_crash_shutdown+0xb4/0x11c
[ 176.037727] __crash_kexec+0x88/0x16c
[ 176.041384] panic+0x194/0x348
[ 176.044433] sysrq_reset_seq_param_set+0x0/0x90
[ 176.048954] __handle_sysrq+0x8c/0x1a0
[ 176.052695] write_sysrq_trigger+0x88/0xc0
[ 176.056783] proc_reg_write+0xa8/0x100
[ 176.060527] vfs_write+0xf0/0x290
[ 176.063835] ksys_write+0x68/0xf4
[ 176.067144] __arm64_sys_write+0x1c/0x2c
[ 176.071058] invoke_syscall+0x48/0x114
[ 176.074800] el0_svc_common.constprop.0+0x44/0xfc
[ 176.079495] do_el0_svc+0x28/0xa0
[ 176.082803] el0_svc+0x28/0x80
[ 176.085851] el0t_64_sync_handler+0xa4/0x130
[ 176.090112] el0t_64_sync+0x1a0/0x1a4
[ 176.093768] ---[ end trace 486d53a4eb15ab42 ]---
<more warnings for each gpio pin that is not used as an interrupt>

This is because gpiochip_irq_mask was being used to mask all possible
irqs in the domain but gpiochip_disable_irq will WARN if any of those
gpios haven't been requested as interrupts yet. Remove the call to
gpiochip_disable_irq to stop the warning.

Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable")
Signed-off-by: Chris Packham <[email protected]>
---
drivers/gpio/gpiolib.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 8c041a8dd9d8..903f5185ae55 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1451,7 +1451,6 @@ static void gpiochip_irq_mask(struct irq_data *d)

if (gc->irq.irq_mask)
gc->irq.irq_mask(d);
- gpiochip_disable_irq(gc, d->hwirq);
}

static void gpiochip_irq_unmask(struct irq_data *d)
--
2.40.1


2023-05-10 08:16:41

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: Don't implicitly disable irq when masking

Wed, May 10, 2023 at 12:11:51PM +1200, Chris Packham kirjoitti:
> When preparing to kexec into a new kernel the kexec code will mask all
> interrupts for all interrupt domains before disabling them. In the case
> of a gpio chip which has a mix of gpio and irq pins a warning would be
> triggered as follows

> [root@localhost ~]# echo c >/proc/sysrq-trigger

Besides the very noisy traceback in the commit message (read
https://kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages)
see below.

> This is because gpiochip_irq_mask was being used to mask all possible

We refer to the functions in the form as follows gpiochip_irq_mask().


> irqs in the domain but gpiochip_disable_irq will WARN if any of those

IRQs
gpiochip_disable_irq()

> gpios haven't been requested as interrupts yet. Remove the call to

GPIOs

> gpiochip_disable_irq to stop the warning.

gpiochip_disable_irq()

> Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable")
> Signed-off-by: Chris Packham <[email protected]>
> ---
> drivers/gpio/gpiolib.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 8c041a8dd9d8..903f5185ae55 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -1451,7 +1451,6 @@ static void gpiochip_irq_mask(struct irq_data *d)
>
> if (gc->irq.irq_mask)
> gc->irq.irq_mask(d);
> - gpiochip_disable_irq(gc, d->hwirq);
> }

At the same time the gpiochip_irq_unmask() has the symmetrical call. Why?

Also it's obvious that you have used outdated repository. You need to rebase
against subsystem tree for-next branch.

P.S. It's also makes sense to Cc to Marc Zyngier <[email protected]>.

--
With Best Regards,
Andy Shevchenko



2023-05-10 21:13:00

by Chris Packham

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: Don't implicitly disable irq when masking

Hi Andy,

On 10/05/23 19:42, [email protected] wrote:
> Wed, May 10, 2023 at 12:11:51PM +1200, Chris Packham kirjoitti:
>> When preparing to kexec into a new kernel the kexec code will mask all
>> interrupts for all interrupt domains before disabling them. In the case
>> of a gpio chip which has a mix of gpio and irq pins a warning would be
>> triggered as follows
>> [root@localhost ~]# echo c >/proc/sysrq-trigger
> Besides the very noisy traceback in the commit message (read
> https://kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages)
> see below.
>
>> This is because gpiochip_irq_mask was being used to mask all possible
> We refer to the functions in the form as follows gpiochip_irq_mask().
>
>
>> irqs in the domain but gpiochip_disable_irq will WARN if any of those
> IRQs
> gpiochip_disable_irq()
>
>> gpios haven't been requested as interrupts yet. Remove the call to
> GPIOs
>
>> gpiochip_disable_irq to stop the warning.
> gpiochip_disable_irq()
Will take the above points onboard for v2.
>
>> Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable")
>> Signed-off-by: Chris Packham <[email protected]>
>> ---
>> drivers/gpio/gpiolib.c | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
>> index 8c041a8dd9d8..903f5185ae55 100644
>> --- a/drivers/gpio/gpiolib.c
>> +++ b/drivers/gpio/gpiolib.c
>> @@ -1451,7 +1451,6 @@ static void gpiochip_irq_mask(struct irq_data *d)
>>
>> if (gc->irq.irq_mask)
>> gc->irq.irq_mask(d);
>> - gpiochip_disable_irq(gc, d->hwirq);
>> }
> At the same time the gpiochip_irq_unmask() has the symmetrical call. Why?

Hmm you're right I never noticed that. I think that would also trigger a
similar warning if it were ever hit. It's not hit in my use-case because
nothing is running through all the irq domains unmasking interrupts.

The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
It's not immediately obvious to me why the coupling is needed. I was
hoping that someone seeing my patch would confirm that it's not needed
or say why it's needed suggest an alternative approach.

> Also it's obvious that you have used outdated repository. You need to rebase
> against subsystem tree for-next branch.
Yeah that's the tricky part. I'm currently based on lts-5.15 and in
order to actually test this I need all of the support for my platform so
I can use kdump to demonstrate the issue. I might be able to use a
different platform that is already supported in a newer kernel
>
> P.S. It's also makes sense to Cc to Marc Zyngier <[email protected]>.
>
Added.

2023-05-11 08:06:54

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: Don't implicitly disable irq when masking

On Wed, May 10, 2023 at 10:59 PM Chris Packham
<[email protected]> wrote:

> The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
> gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
> commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
> It's not immediately obvious to me why the coupling is needed.

That is just a refactoring of what existed before.

The use case is here:
drivers/media/cec/platform/cec-gpio/cec-gpio.c

The driver needs to switch, at runtime, between actively driving a GPIO
line with gpiod_set_value(), and setting the same line into input mode
and listening for signalling triggering IRQs on it, and then back to
output mode and driving the line again. It's a bidirectional GPIO line.
This use case yields a high need of control.

> I was
> hoping that someone seeing my patch would confirm that it's not needed
> or say why it's needed suggest an alternative approach.

Which IRQ-enabled gpiochip is this? Has it been converted to be immutable?
I think that could be part of the problem.

Yours,
Linus Walleij

2023-05-11 21:07:29

by Chris Packham

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: Don't implicitly disable irq when masking

Hi Linus,

On 11/05/23 20:00, Linus Walleij wrote:
> On Wed, May 10, 2023 at 10:59 PM Chris Packham
> <[email protected]> wrote:
>
>> The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
>> gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
>> commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
>> It's not immediately obvious to me why the coupling is needed.
> That is just a refactoring of what existed before.
>
> The use case is here:
> drivers/media/cec/platform/cec-gpio/cec-gpio.c
>
> The driver needs to switch, at runtime, between actively driving a GPIO
> line with gpiod_set_value(), and setting the same line into input mode
> and listening for signalling triggering IRQs on it, and then back to
> output mode and driving the line again. It's a bidirectional GPIO line.
> This use case yields a high need of control.
>
>> I was
>> hoping that someone seeing my patch would confirm that it's not needed
>> or say why it's needed suggest an alternative approach.
> Which IRQ-enabled gpiochip is this? Has it been converted to be immutable?
> I think that could be part of the problem.

For me it's a pca9555. I spent yesterday trying to demonstrate the
problem on a newer kernel. Some teething issues aside I can trigger the
warning if I have a gpio-button using one of the pca9555 pins as an
interrupt and then I export some of the other pins via sysfs.

Interestingly the warning isn't triggered if I use a gpio-hog instead of
exporting the pins. I haven't figured out why that is but I'm assuming
it's something to do with the hogged pins being excluded from the irq
domain before it is registered.

2023-05-11 21:15:50

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: Don't implicitly disable irq when masking

On Thu, May 11, 2023 at 10:36 PM Chris Packham
<[email protected]> wrote:

> I spent yesterday trying to demonstrate the
> problem on a newer kernel. Some teething issues aside I can trigger the
> warning if I have a gpio-button using one of the pca9555 pins as an
> interrupt and then I export some of the other pins via sysfs.
>
> Interestingly the warning isn't triggered if I use a gpio-hog instead of
> exporting the pins.

What happens if you use the gpio character device instead of sysfs?

Like for example with the tools in tools/gpio or using libgpiod
example tools?

> I haven't figured out why that is but I'm assuming
> it's something to do with the hogged pins being excluded from the irq
> domain before it is registered.

If you write something to the "edge" file I can easily see things
going sidewise. The sysfs is really not a nice tool, which is why
it is deprecated.

Yours,
Linus Walleij

2023-05-12 03:53:30

by Chris Packham

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: Don't implicitly disable irq when masking


On 12/05/23 08:36, Chris Packham wrote:
> Hi Linus,
>
> On 11/05/23 20:00, Linus Walleij wrote:
>> On Wed, May 10, 2023 at 10:59 PM Chris Packham
>> <[email protected]> wrote:
>>
>>> The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
>>> gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
>>> commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
>>> It's not immediately obvious to me why the coupling is needed.
>> That is just a refactoring of what existed before.
>>
>> The use case is here:
>> drivers/media/cec/platform/cec-gpio/cec-gpio.c
>>
>> The driver needs to switch, at runtime, between actively driving a GPIO
>> line with gpiod_set_value(), and setting the same line into input mode
>> and listening for signalling triggering IRQs on it, and then back to
>> output mode and driving the line again. It's a bidirectional GPIO line.
>> This use case yields a high need of control.
>>
>>> I was
>>> hoping that someone seeing my patch would confirm that it's not needed
>>> or say why it's needed suggest an alternative approach.
>> Which IRQ-enabled gpiochip is this? Has it been converted to be
>> immutable?
>> I think that could be part of the problem.
>
> For me it's a pca9555. I spent yesterday trying to demonstrate the
> problem on a newer kernel. Some teething issues aside I can trigger
> the warning if I have a gpio-button using one of the pca9555 pins as
> an interrupt and then I export some of the other pins via sysfs.
>
> Interestingly the warning isn't triggered if I use a gpio-hog instead
> of exporting the pins. I haven't figured out why that is but I'm
> assuming it's something to do with the hogged pins being excluded from
> the irq domain before it is registered.

I'm starting to understand things.

When the gpio is exported to userland the irq_desc is created via
device_add()/gpio_is_visible()/gpiod_to_irq()/gpiochip_to_irq(). I think
that might be a bug because if the user wanted an interrupt they would
have said so via edge_store() which also does the gpiod_to_irq() that
ultimately creates the irq_desc. Having the gpio turned into an
interrupt seems like an odd side-effect of gpio_is_visible().