When preparing to kexec into a new kernel the kexec code will mask all
interrupts for all interrupt domains before disabling them. In the case
of a gpio chip which has a mix of gpio and irq pins a warning would be
triggered as follows
[root@localhost ~]# echo c >/proc/sysrq-trigger
[ 175.677728] sysrq: Trigger a crash
<snip expected dump from SysRq>
[ 175.803409] WARNING: CPU: 1 PID: 2345 at drivers/gpio/gpiolib.c:3284 gpiochip_disable_irq+0x68/0xac
[ 175.918321] CPU: 1 PID: 2345 Comm: sh Kdump: loaded Tainted: G O 5.15.109+ #5
[ 175.926742] Hardware name: Allied Telesis x240-26GHXm (DT)
[ 175.932216] pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 175.939165] pc : gpiochip_disable_irq+0x68/0xac
[ 175.943686] lr : gpiochip_disable_irq+0x58/0xac
[ 175.948208] sp : ffff80000c03ba00
[ 175.951513] x29: ffff80000c03ba00 x28: ffff00002843c4c0 x27: 0000000000000000
[ 175.958638] x26: 0000000000000000 x25: ffff800008df0790 x24: 0000000000000000
[ 175.965763] x23: ffff8000095397f0 x22: ffff800009485178 x21: ffff000016ea38f0
[ 175.972888] x20: ffff000016ea3a20 x19: ffff000016073cc0 x18: ffffffffffffffff
[ 175.980012] x17: 20204f2020202020 x16: 2020202020204720 x15: ffff80008c03b667
[ 175.987136] x14: 0000000000000000 x13: ffff800009415148 x12: 0000000000000555
[ 175.994260] x11: 00000000000001c7 x10: ffff800009415148 x9 : ffff800009415148
[ 176.001385] x8 : 00000000ffffefff x7 : ffff80000946d148 x6 : ffff80000946d148
[ 176.008510] x5 : ffff00003fdc59a0 x4 : 0000000000000000 x3 : 0000000000000000
[ 176.015634] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000007
[ 176.022757] Call trace:
[ 176.025197] gpiochip_disable_irq+0x68/0xac
[ 176.029373] gpiochip_irq_mask+0x30/0x40
[ 176.033289] machine_crash_shutdown+0xb4/0x11c
[ 176.037727] __crash_kexec+0x88/0x16c
[ 176.041384] panic+0x194/0x348
[ 176.044433] sysrq_reset_seq_param_set+0x0/0x90
[ 176.048954] __handle_sysrq+0x8c/0x1a0
[ 176.052695] write_sysrq_trigger+0x88/0xc0
[ 176.056783] proc_reg_write+0xa8/0x100
[ 176.060527] vfs_write+0xf0/0x290
[ 176.063835] ksys_write+0x68/0xf4
[ 176.067144] __arm64_sys_write+0x1c/0x2c
[ 176.071058] invoke_syscall+0x48/0x114
[ 176.074800] el0_svc_common.constprop.0+0x44/0xfc
[ 176.079495] do_el0_svc+0x28/0xa0
[ 176.082803] el0_svc+0x28/0x80
[ 176.085851] el0t_64_sync_handler+0xa4/0x130
[ 176.090112] el0t_64_sync+0x1a0/0x1a4
[ 176.093768] ---[ end trace 486d53a4eb15ab42 ]---
<more warnings for each gpio pin that is not used as an interrupt>
This is because gpiochip_irq_mask was being used to mask all possible
irqs in the domain but gpiochip_disable_irq will WARN if any of those
gpios haven't been requested as interrupts yet. Remove the call to
gpiochip_disable_irq to stop the warning.
Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable")
Signed-off-by: Chris Packham <[email protected]>
---
drivers/gpio/gpiolib.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 8c041a8dd9d8..903f5185ae55 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1451,7 +1451,6 @@ static void gpiochip_irq_mask(struct irq_data *d)
if (gc->irq.irq_mask)
gc->irq.irq_mask(d);
- gpiochip_disable_irq(gc, d->hwirq);
}
static void gpiochip_irq_unmask(struct irq_data *d)
--
2.40.1
Wed, May 10, 2023 at 12:11:51PM +1200, Chris Packham kirjoitti:
> When preparing to kexec into a new kernel the kexec code will mask all
> interrupts for all interrupt domains before disabling them. In the case
> of a gpio chip which has a mix of gpio and irq pins a warning would be
> triggered as follows
> [root@localhost ~]# echo c >/proc/sysrq-trigger
Besides the very noisy traceback in the commit message (read
https://kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages)
see below.
> This is because gpiochip_irq_mask was being used to mask all possible
We refer to the functions in the form as follows gpiochip_irq_mask().
> irqs in the domain but gpiochip_disable_irq will WARN if any of those
IRQs
gpiochip_disable_irq()
> gpios haven't been requested as interrupts yet. Remove the call to
GPIOs
> gpiochip_disable_irq to stop the warning.
gpiochip_disable_irq()
> Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable")
> Signed-off-by: Chris Packham <[email protected]>
> ---
> drivers/gpio/gpiolib.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 8c041a8dd9d8..903f5185ae55 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -1451,7 +1451,6 @@ static void gpiochip_irq_mask(struct irq_data *d)
>
> if (gc->irq.irq_mask)
> gc->irq.irq_mask(d);
> - gpiochip_disable_irq(gc, d->hwirq);
> }
At the same time the gpiochip_irq_unmask() has the symmetrical call. Why?
Also it's obvious that you have used outdated repository. You need to rebase
against subsystem tree for-next branch.
P.S. It's also makes sense to Cc to Marc Zyngier <[email protected]>.
--
With Best Regards,
Andy Shevchenko
Hi Andy,
On 10/05/23 19:42, [email protected] wrote:
> Wed, May 10, 2023 at 12:11:51PM +1200, Chris Packham kirjoitti:
>> When preparing to kexec into a new kernel the kexec code will mask all
>> interrupts for all interrupt domains before disabling them. In the case
>> of a gpio chip which has a mix of gpio and irq pins a warning would be
>> triggered as follows
>> [root@localhost ~]# echo c >/proc/sysrq-trigger
> Besides the very noisy traceback in the commit message (read
> https://kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages)
> see below.
>
>> This is because gpiochip_irq_mask was being used to mask all possible
> We refer to the functions in the form as follows gpiochip_irq_mask().
>
>
>> irqs in the domain but gpiochip_disable_irq will WARN if any of those
> IRQs
> gpiochip_disable_irq()
>
>> gpios haven't been requested as interrupts yet. Remove the call to
> GPIOs
>
>> gpiochip_disable_irq to stop the warning.
> gpiochip_disable_irq()
Will take the above points onboard for v2.
>
>> Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable")
>> Signed-off-by: Chris Packham <[email protected]>
>> ---
>> drivers/gpio/gpiolib.c | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
>> index 8c041a8dd9d8..903f5185ae55 100644
>> --- a/drivers/gpio/gpiolib.c
>> +++ b/drivers/gpio/gpiolib.c
>> @@ -1451,7 +1451,6 @@ static void gpiochip_irq_mask(struct irq_data *d)
>>
>> if (gc->irq.irq_mask)
>> gc->irq.irq_mask(d);
>> - gpiochip_disable_irq(gc, d->hwirq);
>> }
> At the same time the gpiochip_irq_unmask() has the symmetrical call. Why?
Hmm you're right I never noticed that. I think that would also trigger a
similar warning if it were ever hit. It's not hit in my use-case because
nothing is running through all the irq domains unmasking interrupts.
The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
It's not immediately obvious to me why the coupling is needed. I was
hoping that someone seeing my patch would confirm that it's not needed
or say why it's needed suggest an alternative approach.
> Also it's obvious that you have used outdated repository. You need to rebase
> against subsystem tree for-next branch.
Yeah that's the tricky part. I'm currently based on lts-5.15 and in
order to actually test this I need all of the support for my platform so
I can use kdump to demonstrate the issue. I might be able to use a
different platform that is already supported in a newer kernel
>
> P.S. It's also makes sense to Cc to Marc Zyngier <[email protected]>.
>
Added.
On Wed, May 10, 2023 at 10:59 PM Chris Packham
<[email protected]> wrote:
> The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
> gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
> commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
> It's not immediately obvious to me why the coupling is needed.
That is just a refactoring of what existed before.
The use case is here:
drivers/media/cec/platform/cec-gpio/cec-gpio.c
The driver needs to switch, at runtime, between actively driving a GPIO
line with gpiod_set_value(), and setting the same line into input mode
and listening for signalling triggering IRQs on it, and then back to
output mode and driving the line again. It's a bidirectional GPIO line.
This use case yields a high need of control.
> I was
> hoping that someone seeing my patch would confirm that it's not needed
> or say why it's needed suggest an alternative approach.
Which IRQ-enabled gpiochip is this? Has it been converted to be immutable?
I think that could be part of the problem.
Yours,
Linus Walleij
Hi Linus,
On 11/05/23 20:00, Linus Walleij wrote:
> On Wed, May 10, 2023 at 10:59 PM Chris Packham
> <[email protected]> wrote:
>
>> The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
>> gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
>> commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
>> It's not immediately obvious to me why the coupling is needed.
> That is just a refactoring of what existed before.
>
> The use case is here:
> drivers/media/cec/platform/cec-gpio/cec-gpio.c
>
> The driver needs to switch, at runtime, between actively driving a GPIO
> line with gpiod_set_value(), and setting the same line into input mode
> and listening for signalling triggering IRQs on it, and then back to
> output mode and driving the line again. It's a bidirectional GPIO line.
> This use case yields a high need of control.
>
>> I was
>> hoping that someone seeing my patch would confirm that it's not needed
>> or say why it's needed suggest an alternative approach.
> Which IRQ-enabled gpiochip is this? Has it been converted to be immutable?
> I think that could be part of the problem.
For me it's a pca9555. I spent yesterday trying to demonstrate the
problem on a newer kernel. Some teething issues aside I can trigger the
warning if I have a gpio-button using one of the pca9555 pins as an
interrupt and then I export some of the other pins via sysfs.
Interestingly the warning isn't triggered if I use a gpio-hog instead of
exporting the pins. I haven't figured out why that is but I'm assuming
it's something to do with the hogged pins being excluded from the irq
domain before it is registered.
On Thu, May 11, 2023 at 10:36 PM Chris Packham
<[email protected]> wrote:
> I spent yesterday trying to demonstrate the
> problem on a newer kernel. Some teething issues aside I can trigger the
> warning if I have a gpio-button using one of the pca9555 pins as an
> interrupt and then I export some of the other pins via sysfs.
>
> Interestingly the warning isn't triggered if I use a gpio-hog instead of
> exporting the pins.
What happens if you use the gpio character device instead of sysfs?
Like for example with the tools in tools/gpio or using libgpiod
example tools?
> I haven't figured out why that is but I'm assuming
> it's something to do with the hogged pins being excluded from the irq
> domain before it is registered.
If you write something to the "edge" file I can easily see things
going sidewise. The sysfs is really not a nice tool, which is why
it is deprecated.
Yours,
Linus Walleij
On 12/05/23 08:36, Chris Packham wrote:
> Hi Linus,
>
> On 11/05/23 20:00, Linus Walleij wrote:
>> On Wed, May 10, 2023 at 10:59 PM Chris Packham
>> <[email protected]> wrote:
>>
>>> The coupling of gpiochip_irq_mask()/gpiochip_irq_unmask() with
>>> gpiochip_disable_irq()/gpiochip_enable_irq() goes back to the same
>>> commit a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable").
>>> It's not immediately obvious to me why the coupling is needed.
>> That is just a refactoring of what existed before.
>>
>> The use case is here:
>> drivers/media/cec/platform/cec-gpio/cec-gpio.c
>>
>> The driver needs to switch, at runtime, between actively driving a GPIO
>> line with gpiod_set_value(), and setting the same line into input mode
>> and listening for signalling triggering IRQs on it, and then back to
>> output mode and driving the line again. It's a bidirectional GPIO line.
>> This use case yields a high need of control.
>>
>>> I was
>>> hoping that someone seeing my patch would confirm that it's not needed
>>> or say why it's needed suggest an alternative approach.
>> Which IRQ-enabled gpiochip is this? Has it been converted to be
>> immutable?
>> I think that could be part of the problem.
>
> For me it's a pca9555. I spent yesterday trying to demonstrate the
> problem on a newer kernel. Some teething issues aside I can trigger
> the warning if I have a gpio-button using one of the pca9555 pins as
> an interrupt and then I export some of the other pins via sysfs.
>
> Interestingly the warning isn't triggered if I use a gpio-hog instead
> of exporting the pins. I haven't figured out why that is but I'm
> assuming it's something to do with the hogged pins being excluded from
> the irq domain before it is registered.
I'm starting to understand things.
When the gpio is exported to userland the irq_desc is created via
device_add()/gpio_is_visible()/gpiod_to_irq()/gpiochip_to_irq(). I think
that might be a bug because if the user wanted an interrupt they would
have said so via edge_store() which also does the gpiod_to_irq() that
ultimately creates the irq_desc. Having the gpio turned into an
interrupt seems like an odd side-effect of gpio_is_visible().