2019-10-25 19:30:41

by Daniel Wagner

Subject: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

lan78xx_status() will run with interrupts enabled due to the change in
ed194d136769 ("usb: core: remove local_irq_save() around ->complete()
handler"). generic_handle_irq() expects to be run with IRQs disabled.

[ 4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
[ 4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
[ 4.896294] 000: Modules linked in:
[ 4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
[ 4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
[ 4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
[ 4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
[ 4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
[ 4.896339] 000: sp : ffff000010003cc0
[ 4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
[ 4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
[ 4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
[ 4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
[ 4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
[ 4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
[ 4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
[ 4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
[ 4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
[ 4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
[ 4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
[ 4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
[ 4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
[ 4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
[ 4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
[ 4.896466] 000: Call trace:
[ 4.896471] 000: __handle_irq_event_percpu+0x154/0x168
[ 4.896481] 000: handle_irq_event_percpu+0x50/0xb0
[ 4.896489] 000: handle_irq_event+0x40/0x98
[ 4.896497] 000: handle_simple_irq+0xa4/0xf0
[ 4.896505] 000: generic_handle_irq+0x24/0x38
[ 4.896513] 000: intr_complete+0xb0/0xe0
[ 4.896525] 000: __usb_hcd_giveback_urb+0x58/0xd8
[ 4.896533] 000: usb_giveback_urb_bh+0xd0/0x170
[ 4.896539] 000: tasklet_action_common.isra.0+0x9c/0x128
[ 4.896549] 000: tasklet_hi_action+0x24/0x30
[ 4.896556] 000: __do_softirq+0x120/0x23c
[ 4.896564] 000: irq_exit+0xb8/0xd8
[ 4.896571] 000: __handle_domain_irq+0x64/0xb8
[ 4.896579] 000: bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[ 4.896586] 000: el1_irq+0xb8/0x140
[ 4.896592] 000: arch_cpu_idle+0x10/0x18
[ 4.896601] 000: do_idle+0x200/0x280
[ 4.896608] 000: cpu_startup_entry+0x20/0x28
[ 4.896615] 000: rest_init+0xb4/0xc0
[ 4.896623] 000: arch_call_rest_init+0xc/0x14
[ 4.896632] 000: start_kernel+0x454/0x480

Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
Cc: Woojung Huh <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Stefan Wahren <[email protected]>
Cc: Jisheng Zhang <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---

Hi,

This patch just fixes the warning. There are still problems left (the
unstable NFS I reported), but I suggest looking at those
separately. The initial patch to revert all the irqdomain code might
just hide the problem. At this point I don't know what's going on, so
I'd rather take baby steps. The revert is still possible if nothing
else works.

Thanks,
Daniel

drivers/net/usb/lan78xx.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 62948098191f..f24a1b0b801f 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1264,8 +1264,11 @@ static void lan78xx_status(struct lan78xx_net *dev, struct urb *urb)
 		netif_dbg(dev, link, dev->net, "PHY INTR: 0x%08x\n", intdata);
 		lan78xx_defer_kevent(dev, EVENT_LINK_RESET);
 
-		if (dev->domain_data.phyirq > 0)
+		if (dev->domain_data.phyirq > 0) {
+			local_irq_disable();
 			generic_handle_irq(dev->domain_data.phyirq);
+			local_irq_enable();
+		}
 	} else
 		netdev_warn(dev->net,
 			    "unexpected interrupt: 0x%08x\n", intdata);
--
2.23.0
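
For readers following the thread, the invariant behind the patch can be sketched in plain userspace C. This is only a model under stated assumptions: every `model_*` name is hypothetical, the interrupt-enable flag is an ordinary variable, and the "warning" is a counter standing in for the WARNING in the log above. It is not the driver or the genirq code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel state (not real kernel APIs). */
bool model_irqs_enabled = true; /* models the CPU interrupt-enable flag */
int model_warnings;             /* counts "enabled interrupts"-style warnings */
int model_handled;              /* counts handler invocations */

void model_local_irq_disable(void) { model_irqs_enabled = false; }
void model_local_irq_enable(void)  { model_irqs_enabled = true; }

/* Models generic_handle_irq(): runs the handler and records a warning
 * if interrupts are found enabled when the handler returns. */
void model_generic_handle_irq(int irq)
{
    (void)irq;
    model_handled++;
    if (model_irqs_enabled)
        model_warnings++;
}

/* Before the patch: the URB completion path runs with interrupts
 * enabled and calls the IRQ core directly. */
void model_status_unpatched(int phyirq)
{
    if (phyirq > 0)
        model_generic_handle_irq(phyirq);
}

/* After the patch: interrupts are disabled around the call. */
void model_status_patched(int phyirq)
{
    if (phyirq > 0) {
        /* The unconditional enable below is only correct if interrupts
         * were enabled on entry, which is what the commit message states. */
        assert(model_irqs_enabled);
        model_local_irq_disable();
        model_generic_handle_irq(phyirq);
        model_local_irq_enable();
    }
}
```

Calling `model_status_unpatched(79)` bumps `model_warnings`, while `model_status_patched(79)` does not. In a context where the entry state is not known, saving and restoring the flag would be the safer pattern than an unconditional enable; the patch relies on the completion handler always running with interrupts enabled.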


2019-10-27 12:16:35

by Stefan Wahren

Subject: Re: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

Hi Daniel,

On 25.10.19 at 10:04, Daniel Wagner wrote:
> ...
>
> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
> Cc: Woojung Huh <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Andrew Lunn <[email protected]>
> Cc: Stefan Wahren <[email protected]>
> Cc: Jisheng Zhang <[email protected]>
> Cc: Sebastian Andrzej Siewior <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: David Miller <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
>
> Hi,
>
> This patch just fixes the warning. There are still problems left (the
> unstable NFS I reported), but I suggest looking at those
> separately. The initial patch to revert all the irqdomain code might
> just hide the problem. At this point I don't know what's going on, so
> I'd rather take baby steps. The revert is still possible if nothing
> else works.

Did you ever see these pseudo lan78xx-irqs fire? I examined
/proc/interrupts on an RPi 3B+ and always saw a count of 0.

FWIW you can have:

Tested-by: Stefan Wahren <[email protected]>

for this patch.

Regards
Stefan

2019-10-28 23:58:46

by David Miller

Subject: Re: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

From: Daniel Wagner <[email protected]>
Date: Fri, 25 Oct 2019 10:04:13 +0200

> lan78xx_status() will run with interrupts enabled due to the change in
> ed194d136769 ("usb: core: remove local_irq_save() around ->complete()
> handler"). generic_handle_irq() expects to be run with IRQs disabled.
...
> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
...
> Signed-off-by: Daniel Wagner <[email protected]>

Applied and queued up for -stable, thanks Daniel.

2019-10-29 19:45:24

by Daniel Wagner

Subject: Re: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

Hi Stefan,

On Sun, Oct 27, 2019 at 01:14:41PM +0100, Stefan Wahren wrote:
> Did you ever see these pseudo lan78xx-irqs fire? I examined
> /proc/interrupts on an RPi 3B+ and always saw a count of 0.

# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  2:         15         10         20         14  ARMCTRL-level   1 Edge      3f00b880.mailbox
 41:     127709     127690     127596     127783  ARMCTRL-level  41 Edge      3f980000.usb, dwc2_hsotg:usb1
 61:        219        208        183        192  ARMCTRL-level  61 Edge      ttyS1
 65:       1285       1340       2112       1483  ARMCTRL-level  88 Edge      mmc0
 71:         11         15         13         13  ARMCTRL-level  94 Edge      mmc1
147:       2823       2995       3648       3615  bcm2836-timer   1 Edge      arch_timer
148:          0          0          0          0  bcm2836-timer   3 Edge      kvm guest timer
150:          0          1          2          0  lan78xx-irqs   17 Edge      usb-001:004:01
IPI0:     11102      11331      12204      11011  Rescheduling interrupts
IPI1:        34        537        547        523  Function call interrupts
IPI2:         0          0          0          0  CPU stop interrupts
IPI3:         0          0          0          0  CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0  Timer broadcast interrupts
IPI5:         0          0          0          0  IRQ work interrupts
IPI6:         0          0          0          0  CPU wake-up interrupts


Yes, this seems to work fine now with the current version.

Thanks,
Daniel
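
As an aside for anyone repeating this check, whether an interrupt ever fired can be read off a /proc/interrupts line by summing its per-CPU counters. The helper below is a minimal sketch, assuming the line layout shown above ("label:" followed by one counter per CPU); the function name `sum_irq_counts` and its `ncpus` parameter are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sum the first ncpus per-CPU counters of one /proc/interrupts line.
 * Returns -1 if the line has no "label:" prefix or holds fewer than
 * ncpus counters. Hypothetical helper, not part of any kernel tool. */
long sum_irq_counts(const char *line, int ncpus)
{
    const char *p;
    long total = 0;

    assert(line != NULL && ncpus > 0);

    p = strchr(line, ':');      /* first ':' ends the IRQ label */
    if (p == NULL)
        return -1;
    p++;

    for (int i = 0; i < ncpus; i++) {
        char *end;
        long n = strtol(p, &end, 10);   /* skips leading whitespace */

        if (end == p)
            return -1;          /* ran out of counters early */
        total += n;
        p = end;
    }
    return total;
}
```

Applied to the lan78xx-irqs line from the output above with `ncpus` set to 4, the helper returns 3, so the interrupt did fire on this system.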

2019-11-04 09:00:10

by Daniel Wagner

Subject: Re: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

On Fri, Oct 25, 2019 at 10:04:13AM +0200, Daniel Wagner wrote:
> This patch just fixes the warning. There are still problems left (the
> unstable NFS I reported), but I suggest looking at those
> separately. The initial patch to revert all the irqdomain code might
> just hide the problem. At this point I don't know what's going on, so
> I'd rather take baby steps. The revert is still possible if nothing
> else works.

I replaced my power supply with the official RPi one and the NFS
timeout problems are gone. A long test session with different
network loads didn't show any problems either. I feel so stupid...

Thanks,
Daniel

2019-11-04 18:07:22

by Stefan Wahren

Subject: Re: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

Hi Daniel,

On 04.11.19 at 09:57, Daniel Wagner wrote:
> On Fri, Oct 25, 2019 at 10:04:13AM +0200, Daniel Wagner wrote:
>> This patch just fixes the warning. There are still problems left (the
>> unstable NFS I reported), but I suggest looking at those
>> separately. The initial patch to revert all the irqdomain code might
>> just hide the problem. At this point I don't know what's going on, so
>> I'd rather take baby steps. The revert is still possible if nothing
>> else works.
> I replaced my power supply with the official RPi one and the NFS
> timeout problems are gone. A long test session with different
> network loads didn't show any problems either. I feel so stupid...

Did you never see a warning about undervoltage from the Raspberry Pi
hwmon driver?

>
> Thanks,
> Daniel
>

2019-11-05 09:25:56

by Daniel Wagner

Subject: Re: [PATCH] net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()

> Did you never see a warning about undervoltage from the Raspberry Pi
> hwmon driver?

Guess why I feel so stupid? I just ignored it... /me goes back to
shaming in the corner.