2019-10-19 08:00:13

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

handle_simple_irq() expect interrupts to be disabled. The USB
framework is using threaded interrupts, which implies that interrupts
are re-enabled as soon as it has run.

This reverts the changes from cc89c323a30e ("lan78xx: Use irq_domain
for phy interrupt from USB Int. EP").

[ 4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
[ 4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
[ 4.896294] 000: Modules linked in:
[ 4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
[ 4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
[ 4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
[ 4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
[ 4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
[ 4.896339] 000: sp : ffff000010003cc0
[ 4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
[ 4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
[ 4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
[ 4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
[ 4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
[ 4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
[ 4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
[ 4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
[ 4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
[ 4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
[ 4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
[ 4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
[ 4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
[ 4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
[ 4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
[ 4.896466] 000: Call trace:
[ 4.896471] 000: __handle_irq_event_percpu+0x154/0x168
[ 4.896481] 000: handle_irq_event_percpu+0x50/0xb0
[ 4.896489] 000: handle_irq_event+0x40/0x98
[ 4.896497] 000: handle_simple_irq+0xa4/0xf0
[ 4.896505] 000: generic_handle_irq+0x24/0x38
[ 4.896513] 000: intr_complete+0xb0/0xe0
[ 4.896525] 000: __usb_hcd_giveback_urb+0x58/0xd8
[ 4.896533] 000: usb_giveback_urb_bh+0xd0/0x170
[ 4.896539] 000: tasklet_action_common.isra.0+0x9c/0x128
[ 4.896549] 000: tasklet_hi_action+0x24/0x30
[ 4.896556] 000: __do_softirq+0x120/0x23c
[ 4.896564] 000: irq_exit+0xb8/0xd8
[ 4.896571] 000: __handle_domain_irq+0x64/0xb8
[ 4.896579] 000: bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[ 4.896586] 000: el1_irq+0xb8/0x140
[ 4.896592] 000: arch_cpu_idle+0x10/0x18
[ 4.896601] 000: do_idle+0x200/0x280
[ 4.896608] 000: cpu_startup_entry+0x20/0x28
[ 4.896615] 000: rest_init+0xb4/0xc0
[ 4.896623] 000: arch_call_rest_init+0xc/0x14
[ 4.896632] 000: start_kernel+0x454/0x480

[dwagner: Updated Jisheng's initial patch]

Fixes: cc89c323a30e ("lan78xx: Use irq_domain for phy interrupt from USB Int. EP")
Cc: Woojung Huh <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Stefan Wahren <[email protected]>
Cc: Jisheng Zhang <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
Hi,

With Andrew's "net: usb: lan78xx: Connect PHY before registering MAC"
and this patch I am able to boot and use the RPi3 with -rt.

There was already a lot of dicussion on this topic but no fixes so
far. So I just suggest to revert the original commit since it is not
clear to me what it fixes:

https://www.spinics.net/lists/netdev/msg542290.html
https://marc.info/?l=linux-netdev&m=154604180927252&w=2
https://patchwork.kernel.org/patch/10888797/

Without this revert RPi3 is not usable for -rt at this point.

Thanks,
Daniel

drivers/net/usb/lan78xx.c | 208 +++-----------------------------------
1 file changed, 16 insertions(+), 192 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 58f5a219fb65..f96c7ff3edd4 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -20,10 +20,6 @@
#include <linux/mdio.h>
#include <linux/phy.h>
#include <net/ip6_checksum.h>
-#include <linux/interrupt.h>
-#include <linux/irqdomain.h>
-#include <linux/irq.h>
-#include <linux/irqchip/chained_irq.h>
#include <linux/microchipphy.h>
#include <linux/phy_fixed.h>
#include <linux/of_mdio.h>
@@ -87,38 +83,6 @@
/* statistic update interval (mSec) */
#define STAT_UPDATE_TIMER (1 * 1000)

-/* defines interrupts from interrupt EP */
-#define MAX_INT_EP (32)
-#define INT_EP_INTEP (31)
-#define INT_EP_OTP_WR_DONE (28)
-#define INT_EP_EEE_TX_LPI_START (26)
-#define INT_EP_EEE_TX_LPI_STOP (25)
-#define INT_EP_EEE_RX_LPI (24)
-#define INT_EP_MAC_RESET_TIMEOUT (23)
-#define INT_EP_RDFO (22)
-#define INT_EP_TXE (21)
-#define INT_EP_USB_STATUS (20)
-#define INT_EP_TX_DIS (19)
-#define INT_EP_RX_DIS (18)
-#define INT_EP_PHY (17)
-#define INT_EP_DP (16)
-#define INT_EP_MAC_ERR (15)
-#define INT_EP_TDFU (14)
-#define INT_EP_TDFO (13)
-#define INT_EP_UTX (12)
-#define INT_EP_GPIO_11 (11)
-#define INT_EP_GPIO_10 (10)
-#define INT_EP_GPIO_9 (9)
-#define INT_EP_GPIO_8 (8)
-#define INT_EP_GPIO_7 (7)
-#define INT_EP_GPIO_6 (6)
-#define INT_EP_GPIO_5 (5)
-#define INT_EP_GPIO_4 (4)
-#define INT_EP_GPIO_3 (3)
-#define INT_EP_GPIO_2 (2)
-#define INT_EP_GPIO_1 (1)
-#define INT_EP_GPIO_0 (0)
-
static const char lan78xx_gstrings[][ETH_GSTRING_LEN] = {
"RX FCS Errors",
"RX Alignment Errors",
@@ -350,15 +314,6 @@ struct statstage {
struct lan78xx_statstage64 curr_stat;
};

-struct irq_domain_data {
- struct irq_domain *irqdomain;
- unsigned int phyirq;
- struct irq_chip *irqchip;
- irq_flow_handler_t irq_handler;
- u32 irqenable;
- struct mutex irq_lock; /* for irq bus access */
-};
-
struct lan78xx_net {
struct net_device *net;
struct usb_device *udev;
@@ -415,8 +370,6 @@ struct lan78xx_net {

int delta;
struct statstage stats;
-
- struct irq_domain_data domain_data;
};

/* define external phy id */
@@ -1251,6 +1204,7 @@ static void lan78xx_defer_kevent(struct lan78xx_net *dev, int work)
static void lan78xx_status(struct lan78xx_net *dev, struct urb *urb)
{
u32 intdata;
+ struct phy_device *phydev = dev->net->phydev;

if (urb->actual_length != 4) {
netdev_warn(dev->net,
@@ -1264,8 +1218,7 @@ static void lan78xx_status(struct lan78xx_net *dev, struct urb *urb)
netif_dbg(dev, link, dev->net, "PHY INTR: 0x%08x\n", intdata);
lan78xx_defer_kevent(dev, EVENT_LINK_RESET);

- if (dev->domain_data.phyirq > 0)
- generic_handle_irq(dev->domain_data.phyirq);
+ phy_mac_interrupt(phydev);
} else
netdev_warn(dev->net,
"unexpected interrupt: 0x%08x\n", intdata);
@@ -1874,127 +1827,6 @@ static void lan78xx_link_status_change(struct net_device *net)
}
}

-static int irq_map(struct irq_domain *d, unsigned int irq,
- irq_hw_number_t hwirq)
-{
- struct irq_domain_data *data = d->host_data;
-
- irq_set_chip_data(irq, data);
- irq_set_chip_and_handler(irq, data->irqchip, data->irq_handler);
- irq_set_noprobe(irq);
-
- return 0;
-}
-
-static void irq_unmap(struct irq_domain *d, unsigned int irq)
-{
- irq_set_chip_and_handler(irq, NULL, NULL);
- irq_set_chip_data(irq, NULL);
-}
-
-static const struct irq_domain_ops chip_domain_ops = {
- .map = irq_map,
- .unmap = irq_unmap,
-};
-
-static void lan78xx_irq_mask(struct irq_data *irqd)
-{
- struct irq_domain_data *data = irq_data_get_irq_chip_data(irqd);
-
- data->irqenable &= ~BIT(irqd_to_hwirq(irqd));
-}
-
-static void lan78xx_irq_unmask(struct irq_data *irqd)
-{
- struct irq_domain_data *data = irq_data_get_irq_chip_data(irqd);
-
- data->irqenable |= BIT(irqd_to_hwirq(irqd));
-}
-
-static void lan78xx_irq_bus_lock(struct irq_data *irqd)
-{
- struct irq_domain_data *data = irq_data_get_irq_chip_data(irqd);
-
- mutex_lock(&data->irq_lock);
-}
-
-static void lan78xx_irq_bus_sync_unlock(struct irq_data *irqd)
-{
- struct irq_domain_data *data = irq_data_get_irq_chip_data(irqd);
- struct lan78xx_net *dev =
- container_of(data, struct lan78xx_net, domain_data);
- u32 buf;
- int ret;
-
- /* call register access here because irq_bus_lock & irq_bus_sync_unlock
- * are only two callbacks executed in non-atomic contex.
- */
- ret = lan78xx_read_reg(dev, INT_EP_CTL, &buf);
- if (buf != data->irqenable)
- ret = lan78xx_write_reg(dev, INT_EP_CTL, data->irqenable);
-
- mutex_unlock(&data->irq_lock);
-}
-
-static struct irq_chip lan78xx_irqchip = {
- .name = "lan78xx-irqs",
- .irq_mask = lan78xx_irq_mask,
- .irq_unmask = lan78xx_irq_unmask,
- .irq_bus_lock = lan78xx_irq_bus_lock,
- .irq_bus_sync_unlock = lan78xx_irq_bus_sync_unlock,
-};
-
-static int lan78xx_setup_irq_domain(struct lan78xx_net *dev)
-{
- struct device_node *of_node;
- struct irq_domain *irqdomain;
- unsigned int irqmap = 0;
- u32 buf;
- int ret = 0;
-
- of_node = dev->udev->dev.parent->of_node;
-
- mutex_init(&dev->domain_data.irq_lock);
-
- lan78xx_read_reg(dev, INT_EP_CTL, &buf);
- dev->domain_data.irqenable = buf;
-
- dev->domain_data.irqchip = &lan78xx_irqchip;
- dev->domain_data.irq_handler = handle_simple_irq;
-
- irqdomain = irq_domain_add_simple(of_node, MAX_INT_EP, 0,
- &chip_domain_ops, &dev->domain_data);
- if (irqdomain) {
- /* create mapping for PHY interrupt */
- irqmap = irq_create_mapping(irqdomain, INT_EP_PHY);
- if (!irqmap) {
- irq_domain_remove(irqdomain);
-
- irqdomain = NULL;
- ret = -EINVAL;
- }
- } else {
- ret = -EINVAL;
- }
-
- dev->domain_data.irqdomain = irqdomain;
- dev->domain_data.phyirq = irqmap;
-
- return ret;
-}
-
-static void lan78xx_remove_irq_domain(struct lan78xx_net *dev)
-{
- if (dev->domain_data.phyirq > 0) {
- irq_dispose_mapping(dev->domain_data.phyirq);
-
- if (dev->domain_data.irqdomain)
- irq_domain_remove(dev->domain_data.irqdomain);
- }
- dev->domain_data.phyirq = 0;
- dev->domain_data.irqdomain = NULL;
-}
-
static int lan8835_fixup(struct phy_device *phydev)
{
int buf;
@@ -2123,16 +1955,15 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
return -EIO;
}

- /* if phyirq is not set, use polling mode in phylib */
- if (dev->domain_data.phyirq > 0)
- phydev->irq = dev->domain_data.phyirq;
- else
- phydev->irq = 0;
- netdev_dbg(dev->net, "phydev->irq = %d\n", phydev->irq);
-
/* set to AUTOMDIX */
phydev->mdix = ETH_TP_MDI_AUTO;

+ ret = phy_read(phydev, LAN88XX_INT_STS);
+ ret = phy_write(phydev, LAN88XX_INT_MASK,
+ LAN88XX_INT_MASK_MDINTPIN_EN_ |
+ LAN88XX_INT_MASK_LINK_CHANGE_);
+ phydev->irq = PHY_IGNORE_INTERRUPT;
+
ret = phy_connect_direct(dev->net, phydev,
lan78xx_link_status_change,
dev->interface);
@@ -2570,6 +2401,11 @@ static int lan78xx_reset(struct lan78xx_net *dev)
}
ret = lan78xx_write_reg(dev, MAC_CR, buf);

+ /* enable PHY interrupts */
+ ret = lan78xx_read_reg(dev, INT_EP_CTL, &buf);
+ buf |= INT_ENP_PHY_INT;
+ ret = lan78xx_write_reg(dev, INT_EP_CTL, buf);
+
ret = lan78xx_read_reg(dev, MAC_TX, &buf);
buf |= MAC_TX_TXEN_;
ret = lan78xx_write_reg(dev, MAC_TX, buf);
@@ -2977,13 +2813,6 @@ static int lan78xx_bind(struct lan78xx_net *dev, struct usb_interface *intf)

dev->net->hw_features = dev->net->features;

- ret = lan78xx_setup_irq_domain(dev);
- if (ret < 0) {
- netdev_warn(dev->net,
- "lan78xx_setup_irq_domain() failed : %d", ret);
- goto out1;
- }
-
dev->net->hard_header_len += TX_OVERHEAD;
dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len;

@@ -2991,13 +2820,13 @@ static int lan78xx_bind(struct lan78xx_net *dev, struct usb_interface *intf)
ret = lan78xx_reset(dev);
if (ret) {
netdev_warn(dev->net, "Registers INIT FAILED....");
- goto out2;
+ goto out;
}

ret = lan78xx_mdio_init(dev);
if (ret) {
netdev_warn(dev->net, "MDIO INIT FAILED.....");
- goto out2;
+ goto out;
}

dev->net->flags |= IFF_MULTICAST;
@@ -3006,10 +2835,7 @@ static int lan78xx_bind(struct lan78xx_net *dev, struct usb_interface *intf)

return ret;

-out2:
- lan78xx_remove_irq_domain(dev);
-
-out1:
+out:
netdev_warn(dev->net, "Bind routine FAILED");
cancel_work_sync(&pdata->set_multicast);
cancel_work_sync(&pdata->set_vlan);
@@ -3021,8 +2847,6 @@ static void lan78xx_unbind(struct lan78xx_net *dev, struct usb_interface *intf)
{
struct lan78xx_priv *pdata = (struct lan78xx_priv *)(dev->data[0]);

- lan78xx_remove_irq_domain(dev);
-
lan78xx_remove_mdio(dev);

if (pdata) {
--
2.21.0


2019-10-19 08:16:17

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On Fri, 18 Oct 2019 09:28:17 +0100,
Daniel Wagner <[email protected]> wrote:
>
> handle_simple_irq() expect interrupts to be disabled. The USB
> framework is using threaded interrupts, which implies that interrupts
> are re-enabled as soon as it has run.
>
> This reverts the changes from cc89c323a30e ("lan78xx: Use irq_domain
> for phy interrupt from USB Int. EP").
>
> [ 4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
> [ 4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
> [ 4.896294] 000: Modules linked in:
> [ 4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
> [ 4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
> [ 4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
> [ 4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
> [ 4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
> [ 4.896339] 000: sp : ffff000010003cc0
> [ 4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
> [ 4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
> [ 4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
> [ 4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
> [ 4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
> [ 4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
> [ 4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
> [ 4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
> [ 4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
> [ 4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
> [ 4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
> [ 4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
> [ 4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
> [ 4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
> [ 4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
> [ 4.896466] 000: Call trace:
> [ 4.896471] 000: __handle_irq_event_percpu+0x154/0x168
> [ 4.896481] 000: handle_irq_event_percpu+0x50/0xb0
> [ 4.896489] 000: handle_irq_event+0x40/0x98
> [ 4.896497] 000: handle_simple_irq+0xa4/0xf0
> [ 4.896505] 000: generic_handle_irq+0x24/0x38
> [ 4.896513] 000: intr_complete+0xb0/0xe0
> [ 4.896525] 000: __usb_hcd_giveback_urb+0x58/0xd8
> [ 4.896533] 000: usb_giveback_urb_bh+0xd0/0x170
> [ 4.896539] 000: tasklet_action_common.isra.0+0x9c/0x128
> [ 4.896549] 000: tasklet_hi_action+0x24/0x30
> [ 4.896556] 000: __do_softirq+0x120/0x23c
> [ 4.896564] 000: irq_exit+0xb8/0xd8
> [ 4.896571] 000: __handle_domain_irq+0x64/0xb8
> [ 4.896579] 000: bcm2836_arm_irqchip_handle_irq+0x60/0xc0
> [ 4.896586] 000: el1_irq+0xb8/0x140
> [ 4.896592] 000: arch_cpu_idle+0x10/0x18
> [ 4.896601] 000: do_idle+0x200/0x280
> [ 4.896608] 000: cpu_startup_entry+0x20/0x28
> [ 4.896615] 000: rest_init+0xb4/0xc0
> [ 4.896623] 000: arch_call_rest_init+0xc/0x14
> [ 4.896632] 000: start_kernel+0x454/0x480
>
> [dwagner: Updated Jisheng's initial patch]
>
> Fixes: cc89c323a30e ("lan78xx: Use irq_domain for phy interrupt from USB Int. EP")
> Cc: Woojung Huh <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Andrew Lunn <[email protected]>
> Cc: Stefan Wahren <[email protected]>
> Cc: Jisheng Zhang <[email protected]>
> Cc: Sebastian Andrzej Siewior <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> Hi,
>
> With Andrew's "net: usb: lan78xx: Connect PHY before registering MAC"
> and this patch I am able to boot and use the RPi3 with -rt.
>
> There was already a lot of dicussion on this topic but no fixes so
> far. So I just suggest to revert the original commit since it is not
> clear to me what it fixes:
>
> https://www.spinics.net/lists/netdev/msg542290.html
> https://marc.info/?l=linux-netdev&m=154604180927252&w=2
> https://patchwork.kernel.org/patch/10888797/
>
> Without this revert RPi3 is not usable for -rt at this point.

Acked-by: Marc Zyngier <[email protected]>

M.

--
Jazz is not dead, it just smells funny.

Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On 2019-10-18 10:28:17 [+0200], Daniel Wagner wrote:
> handle_simple_irq() expect interrupts to be disabled. The USB
> framework is using threaded interrupts, which implies that interrupts
> are re-enabled as soon as it has run.

Without threading interrupts, this is invoked in pure softirq context
since commit ed194d1367698 ("usb: core: remove local_irq_save() around
->complete() handler") where the local_irq_disable() has been removed.

This is probably not a problem because the lock is never observed with
in IRQ context.

Wouldn't handle_nested_irq() work here instead of the simple thingy?

Sebastian

2019-10-22 17:34:20

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On Fri, 18 Oct 2019 15:15:32 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-10-18 10:28:17 [+0200], Daniel Wagner wrote:
> > handle_simple_irq() expect interrupts to be disabled. The USB
> > framework is using threaded interrupts, which implies that interrupts
> > are re-enabled as soon as it has run.
>
> Without threading interrupts, this is invoked in pure softirq context
> since commit ed194d1367698 ("usb: core: remove local_irq_save() around
> ->complete() handler") where the local_irq_disable() has been removed.
>
> This is probably not a problem because the lock is never observed with
> in IRQ context.
>
> Wouldn't handle_nested_irq() work here instead of the simple thingy?

Daniel could you try this suggestion? Would it work?

I'm not sure we are at the stage yet where "doesn't work on -rt" is
sufficient reason to revert a working upstream patch. Please correct
me if I'm wrong.

2019-10-22 17:39:04

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On Tue, 22 Oct 2019 10:17:47 -0700
Jakub Kicinski <[email protected]> wrote:

> On Fri, 18 Oct 2019 15:15:32 +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-10-18 10:28:17 [+0200], Daniel Wagner wrote:
> > > handle_simple_irq() expect interrupts to be disabled. The USB
> > > framework is using threaded interrupts, which implies that interrupts
> > > are re-enabled as soon as it has run.
> >
> > Without threading interrupts, this is invoked in pure softirq context
> > since commit ed194d1367698 ("usb: core: remove local_irq_save() around
> > ->complete() handler") where the local_irq_disable() has been removed.
> >
> > This is probably not a problem because the lock is never observed with
> > in IRQ context.
> >
> > Wouldn't handle_nested_irq() work here instead of the simple thingy?
>
> Daniel could you try this suggestion? Would it work?
>
> I'm not sure we are at the stage yet where "doesn't work on -rt" is
> sufficient reason to revert a working upstream patch. Please correct
> me if I'm wrong.

But that's the thing: it doesn't work at all, RT or not (it spits an
awful warning). See the various reports Daniel linked to. Maintainers
have been completely unresponsive, and the RPI folks have their own out
of tree hack, I believe (which probably reverts to the previous,
working situation where the driver uses polling for some of its PHY
handling business).

Sebastian's suggestion is definitely worth trying if you have the HW
though.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2019-10-23 08:07:39

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

Sebastian suggested to try this here:

--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1264,8 +1264,11 @@ static void lan78xx_status(struct lan78xx_net *dev, struct urb *urb)
netif_dbg(dev, link, dev->net, "PHY INTR: 0x%08x\n", intdata);
lan78xx_defer_kevent(dev, EVENT_LINK_RESET);

- if (dev->domain_data.phyirq > 0)
+ if (dev->domain_data.phyirq > 0) {
+ local_irq_disable();
generic_handle_irq(dev->domain_data.phyirq);
+ local_irq_enable();
+ }
} else
netdev_warn(dev->net,
"unexpected interrupt: 0x%08x\n", intdata);

While this gets rid of the warning, the networking interface is not
really stable:

[ 43.999628] nfs: server 192.168.19.2 not responding, still trying
[ 43.999633] nfs: server 192.168.19.2 not responding, still trying
[ 43.999649] nfs: server 192.168.19.2 not responding, still trying
[ 43.999674] nfs: server 192.168.19.2 not responding, still trying
[ 43.999678] nfs: server 192.168.19.2 not responding, still trying
[ 44.006712] nfs: server 192.168.19.2 OK
[ 44.018443] nfs: server 192.168.19.2 OK
[ 44.024765] nfs: server 192.168.19.2 OK
[ 44.025361] nfs: server 192.168.19.2 OK
[ 44.025420] nfs: server 192.168.19.2 OK
[ 256.991659] nfs: server 192.168.19.2 not responding, still trying
[ 256.991664] nfs: server 192.168.19.2 not responding, still trying
[ 256.991669] nfs: server 192.168.19.2 not responding, still trying
[ 256.991685] nfs: server 192.168.19.2 not responding, still trying
[ 256.991713] nfs: server 192.168.19.2 not responding, still trying
[ 256.998797] nfs: server 192.168.19.2 OK
[ 256.999745] nfs: server 192.168.19.2 OK
[ 256.999828] nfs: server 192.168.19.2 OK
[ 257.000438] nfs: server 192.168.19.2 OK
[ 257.004784] nfs: server 192.168.19.2 OK


Eventually, the rootfs can be loaded and the system boots. Though the
system is not really usable because it often stalls:


root@debian:~# apt update
Ign:1 http://deb.debian.org/debian stretch InRelease
Hit:2 http://deb.debian.org/debian stretch Release
Reading package lists... 0%


I don't see this with the irqdomain code reverted.

2019-10-23 12:32:45

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

> > Wouldn't handle_nested_irq() work here instead of the simple thingy?
>
> Daniel could you try this suggestion? Would it work?


[ 6.427289] ------------[ cut here ]------------
[ 6.431977] kernel BUG at drivers/net/phy/mdio_bus.c:626!
[ 6.437453] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 6.443013] Modules linked in:
[ 6.446116] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3-00100-g70cc5ab156c3-dirty #50
[ 6.454763] Hardware name: Raspberry Pi 3 Model B+ (DT)
[ 6.460062] pstate: 00000005 (nzcv daif -PAN -UAO)
[ 6.464928] pc : mdiobus_read+0x68/0x70
[ 6.468821] lr : lan88xx_phy_ack_interrupt+0x1c/0x30
[ 6.473852] sp : ffff800010003d70
[ 6.477208] x29: ffff800010003d70 x28: 0000000000000000
[ 6.482597] x27: 0000000000000001 x26: 0000000000000060
[ 6.487985] x25: 00000000000000e0 x24: ffff8000113ca680
[ 6.493373] x23: ffff0000372ac174 x22: ffff00003768f6d4
[ 6.498762] x21: ffff00003768f600 x20: 0000000000000000
[ 6.504149] x19: ffff0000372c1000 x18: 000000000000000e
[ 6.509537] x17: 0000000000000001 x16: 0000000000000007
[ 6.514923] x15: 000000000000000e x14: 0000000000000013
[ 6.520311] x13: 0000000000000000 x12: 0000000000000000
[ 6.525699] x11: 000000000000057b x10: 0000000000000003
[ 6.531087] x9 : ffff0000383f4750 x8 : ffff0000383f3dc0
[ 6.536474] x7 : ffff0000374be100 x6 : ffff0000383f4750
[ 6.541862] x5 : ffff000037cab700 x4 : 0000000000000008
[ 6.547249] x3 : 0000000000000101 x2 : 000000000000001a
[ 6.552636] x1 : 0000000000000001 x0 : ffff0000372c0800
[ 6.558023] Call trace:
[ 6.560506] mdiobus_read+0x68/0x70
[ 6.564045] phy_interrupt+0x5c/0xb0
[ 6.567672] handle_nested_irq+0xb8/0x130
[ 6.571740] intr_complete+0xb0/0xe0
[ 6.575368] __usb_hcd_giveback_urb+0x58/0xf8
[ 6.579787] usb_giveback_urb_bh+0xac/0x108
[ 6.584032] tasklet_action_common.isra.0+0x154/0x1a0
[ 6.589155] tasklet_hi_action+0x24/0x30
[ 6.593134] __do_softirq+0x120/0x23c
[ 6.596847] irq_exit+0xb8/0xd8
[ 6.600031] __handle_domain_irq+0x64/0xb8
[ 6.604183] bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[ 6.609305] el1_irq+0xb8/0x180
[ 6.612490] arch_cpu_idle+0x10/0x18
[ 6.616115] do_idle+0x200/0x280
[ 6.619387] cpu_startup_entry+0x20/0x40
[ 6.623366] rest_init+0xd4/0xe0
[ 6.626642] arch_call_rest_init+0xc/0x14
[ 6.630706] start_kernel+0x420/0x44c
[ 6.634425] Code: a94153f3 a9425bf5 a8c37bfd d65f03c0 (d4210000)
[ 6.640614] ---[ end trace 97efd7bf12ed0c65 ]---
[ 6.645295] Kernel panic - not syncing: Fatal exception in interrupt
[ 6.651740] SMP: stopping secondary CPUs
[ 6.655719] Kernel Offset: disabled
[ 6.659254] CPU features: 0x0002,24002004
[ 6.663315] Memory Limit: none
[ 6.666416] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---


Not really. It turns out phy_interrupt() wants to be run in a threaded
context. As Sebastian says on IRC "but this means, that it was broken from
the beginning"

Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On 2019-10-23 10:06:40 [+0200], Daniel Wagner wrote:
> Sebastian suggested to try this here:
>
> --- a/drivers/net/usb/lan78xx.c
> +++ b/drivers/net/usb/lan78xx.c
> @@ -1264,8 +1264,11 @@ static void lan78xx_status(struct lan78xx_net *dev, struct urb *urb)
> netif_dbg(dev, link, dev->net, "PHY INTR: 0x%08x\n", intdata);
> lan78xx_defer_kevent(dev, EVENT_LINK_RESET);
>
> - if (dev->domain_data.phyirq > 0)
> + if (dev->domain_data.phyirq > 0) {
> + local_irq_disable();
> generic_handle_irq(dev->domain_data.phyirq);
> + local_irq_enable();
> + }
> } else
> netdev_warn(dev->net,
> "unexpected interrupt: 0x%08x\n", intdata);

This should should be applied as a regression fix introduced by commit
ed194d1367698 ("usb: core: remove local_irq_save() around ->complete() handler")

> While this gets rid of the warning, the networking interface is not
> really stable:
>
> [ 43.999628] nfs: server 192.168.19.2 not responding, still trying
> [ 43.999633] nfs: server 192.168.19.2 not responding, still trying
> [ 43.999649] nfs: server 192.168.19.2 not responding, still trying
> [ 43.999674] nfs: server 192.168.19.2 not responding, still trying
> [ 43.999678] nfs: server 192.168.19.2 not responding, still trying
> [ 44.006712] nfs: server 192.168.19.2 OK
> [ 44.018443] nfs: server 192.168.19.2 OK
> [ 44.024765] nfs: server 192.168.19.2 OK
> [ 44.025361] nfs: server 192.168.19.2 OK
> [ 44.025420] nfs: server 192.168.19.2 OK
> [ 256.991659] nfs: server 192.168.19.2 not responding, still trying
> [ 256.991664] nfs: server 192.168.19.2 not responding, still trying
> [ 256.991669] nfs: server 192.168.19.2 not responding, still trying
> [ 256.991685] nfs: server 192.168.19.2 not responding, still trying
> [ 256.991713] nfs: server 192.168.19.2 not responding, still trying
> [ 256.998797] nfs: server 192.168.19.2 OK
> [ 256.999745] nfs: server 192.168.19.2 OK
> [ 256.999828] nfs: server 192.168.19.2 OK
> [ 257.000438] nfs: server 192.168.19.2 OK
> [ 257.004784] nfs: server 192.168.19.2 OK

Since this does not improve the situation as a whole it might be best to
remove the code as suggested by Daniel.

Sebastian

2019-10-25 09:49:46

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On Thu, Oct 24, 2019 at 12:43:17PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-10-23 10:06:40 [+0200], Daniel Wagner wrote:
>
> Since this does not improve the situation as a whole it might be best to
> remove the code as suggested by Daniel.

I have tried to fix the above issue. It looks like the interrupt
handler doesn't work at all. Below is the log with all debug prints
enabled. I just see one PHY interrupt. Don't know if that is okay or
not.


[ 3.719647] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): deferred multicast write 0x00007ca0
[ 3.861125] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
[ 3.872500] libphy: lan78xx-mdiobus: probed
[ 3.883927] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): registered mdiobus bus usb-001:004
[ 3.893600] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): phydev->irq = 79
[ 4.274367] random: crng init done
[ 4.929478] lan78xx 1-1.1.1:1.0 eth0: receive multicast hash filter
[ 4.935922] lan78xx 1-1.1.1:1.0 eth0: deferred multicast write 0x00007ca2
[ 6.537962] lan78xx 1-1.1.1:1.0 eth0: PHY INTR: 0x00020000
[ 6.549129] lan78xx 1-1.1.1:1.0 eth0: speed: 1000 duplex: 1 anadv: 0x05e1 anlpa: 0xc1e1
[ 6.557293] lan78xx 1-1.1.1:1.0 eth0: rx pause disabled, tx pause disabled
[ 6.572581] Sending DHCP requests ..., OK
[ 12.200693] IP-Config: Got DHCP answer from 192.168.19.2, my address is 192.168.19.53
[ 12.208654] IP-Config: Complete:
[ 12.211929] device=eth0, hwaddr=b8:27:eb:85:c7:c9, ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
[ 12.222350] host=192.168.19.53, domain=, nis-domain=(none)
[ 12.228364] bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
[ 12.228369] nameserver0=192.168.19.2
[ 12.239812] ALSA device list:
[ 12.242839] No soundcards found.
[ 12.256896] VFS: Mounted root (nfs filesystem) on device 0:19.
[ 12.263501] devtmpfs: mounted
[ 12.273037] Freeing unused kernel memory: 5504K
[ 12.277769] Run /sbin/init as init process



and after this the NFS timeouts appear. I tried to figure out how the
PHY works [1] and played a bit around with fiddling with a few bits in
the registers. But now success at all.

I agree with Sebastian, with the revert the driver works at least.

Thanks,
Daniel

[1] http://ww1.microchip.com/downloads/en/DeviceDoc/LAN7800-Data-Sheet-DS00001992G.pdf

2019-10-25 16:59:19

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On Thu, Oct 24, 2019 at 01:06:10PM +0200, Daniel Wagner wrote:
> [ 3.719647] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): deferred multicast write 0x00007ca0
> [ 3.861125] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
> [ 3.872500] libphy: lan78xx-mdiobus: probed
> [ 3.883927] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): registered mdiobus bus usb-001:004
> [ 3.893600] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): phydev->irq = 79
> [ 4.274367] random: crng init done
> [ 4.929478] lan78xx 1-1.1.1:1.0 eth0: receive multicast hash filter
> [ 4.935922] lan78xx 1-1.1.1:1.0 eth0: deferred multicast write 0x00007ca2
> [ 6.537962] lan78xx 1-1.1.1:1.0 eth0: PHY INTR: 0x00020000
> [ 6.549129] lan78xx 1-1.1.1:1.0 eth0: speed: 1000 duplex: 1 anadv: 0x05e1 anlpa: 0xc1e1
> [ 6.557293] lan78xx 1-1.1.1:1.0 eth0: rx pause disabled, tx pause disabled
> [ 6.572581] Sending DHCP requests ..., OK
> [ 12.200693] IP-Config: Got DHCP answer from 192.168.19.2, my address is 192.168.19.53
> [ 12.208654] IP-Config: Complete:
> [ 12.211929] device=eth0, hwaddr=b8:27:eb:85:c7:c9, ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
> [ 12.222350] host=192.168.19.53, domain=, nis-domain=(none)
> [ 12.228364] bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
> [ 12.228369] nameserver0=192.168.19.2
> [ 12.239812] ALSA device list:
> [ 12.242839] No soundcards found.
> [ 12.256896] VFS: Mounted root (nfs filesystem) on device 0:19.
> [ 12.263501] devtmpfs: mounted
> [ 12.273037] Freeing unused kernel memory: 5504K
> [ 12.277769] Run /sbin/init as init process

Sebastians suggested to try the RPi kernel. The rpi-5.2.y kernel
behaves exactly the same. That is one PHY interrupt and later on NFS
timeouts.

According their website the current shipped RPi kernel is in version
4.18. Here is what happends with rpi-4.18.y:

[ 3.359058] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): deferred multicast write 0x00007ca0
[ 3.500910] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
[ 3.517693] libphy: lan78xx-mdiobus: probed
[ 3.521970] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): registered mdiobus bus usb-001:004
[ 3.532219] lan78xx 1-1.1.1:1.0 eth0: phydev->irq = 150
[ 3.843609] lan78xx 1-1.1.1:1.0 eth0: receive multicast hash filter
[ 3.850085] lan78xx 1-1.1.1:1.0 eth0: deferred multicast write 0x00007ca2
[ 3.876059] random: crng init done
[ 6.563736] lan78xx 1-1.1.1:1.0 eth0: PHY INTR: 0x00020000
[ 6.569331] lan78xx 1-1.1.1:1.0 eth0: PHY INTR: 0x00020000
[ 6.582759] lan78xx 1-1.1.1:1.0 eth0: speed: 1000 duplex: 1 anadv: 0x05e1 anlpa: 0xc1e1
[ 6.590889] lan78xx 1-1.1.1:1.0 eth0: rx pause disabled, tx pause disabled
[ 6.603222] Sending DHCP requests ..
[ 12.323251] Voltage normalised (0x00000000)
[ 13.723234] ., OK
[ 13.765259] IP-Config: Got DHCP answer from 192.168.19.2, my address is 192.168.19.53
[ 13.773250] IP-Config: Complete:
[ 13.776552] device=eth0, hwaddr=b8:27:eb:85:c7:c9, ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
[ 13.786990] host=192.168.19.53, domain=, nis-domain=(none)
[ 13.793029] bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
[ 13.793035] nameserver0=192.168.19.2

There are no NFS timeouts and commands like 'apt update' work reasoble
fast. So no long delays or hangs. Time to burn this hardware.

2019-10-25 18:46:53

by Stefan Wahren

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

Hi Daniel,

Am 24.10.19 um 16:12 schrieb Daniel Wagner:
> On Thu, Oct 24, 2019 at 01:06:10PM +0200, Daniel Wagner wrote:
>
> Sebastians suggested to try the RPi kernel. The rpi-5.2.y kernel
> behaves exactly the same. That is one PHY interrupt and later on NFS
> timeouts.
>
> According their website the current shipped RPi kernel is in version
> 4.18. Here is what happends with rpi-4.18.y:

No, it's 4.19. It's always a LTS kernel.

I'm curious, what's the motivation behind this? The rpi tree contains
additional hacks, so i'm not sure the results are comparable. Also the
USB host driver is a different one.

> There are no NFS timeouts and commands like 'apt update' work reasoble
> fast. So no long delays or hangs. Time to burn this hardware.

Since enabling lan78xx for Raspberry Pi 3B+, we found a lot of driver
issues. So i'm not really surprised, that there are still more of them.

Thanks Stefan

2019-10-25 19:09:34

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

From: Daniel Wagner <[email protected]>
Date: Fri, 18 Oct 2019 10:28:17 +0200

> handle_simple_irq() expect interrupts to be disabled. The USB
> framework is using threaded interrupts, which implies that interrupts
> are re-enabled as soon as it has run.
...

Where are we with this patch? I'm tossing it.

It seems Sebastian made a suggestion, someone else said his suggestion
should be tried, then everything died.

Please follow up and post when something is ready.

2019-10-25 19:23:57

by Stefan Wahren

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

Am 18.10.19 um 10:28 schrieb Daniel Wagner:
> handle_simple_irq() expect interrupts to be disabled. The USB
> framework is using threaded interrupts, which implies that interrupts
> are re-enabled as soon as it has run.
>
> This reverts the changes from cc89c323a30e ("lan78xx: Use irq_domain
> for phy interrupt from USB Int. EP").
>
> [ 4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
> [ 4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
> [ 4.896294] 000: Modules linked in:
> [ 4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
> [ 4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
> [ 4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
> [ 4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
> [ 4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
> [ 4.896339] 000: sp : ffff000010003cc0
> [ 4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
> [ 4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
> [ 4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
> [ 4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
> [ 4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
> [ 4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
> [ 4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
> [ 4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
> [ 4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
> [ 4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
> [ 4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
> [ 4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
> [ 4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
> [ 4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
> [ 4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
> [ 4.896466] 000: Call trace:
> [ 4.896471] 000: __handle_irq_event_percpu+0x154/0x168
> [ 4.896481] 000: handle_irq_event_percpu+0x50/0xb0
> [ 4.896489] 000: handle_irq_event+0x40/0x98
> [ 4.896497] 000: handle_simple_irq+0xa4/0xf0
> [ 4.896505] 000: generic_handle_irq+0x24/0x38
> [ 4.896513] 000: intr_complete+0xb0/0xe0
> [ 4.896525] 000: __usb_hcd_giveback_urb+0x58/0xd8
> [ 4.896533] 000: usb_giveback_urb_bh+0xd0/0x170
> [ 4.896539] 000: tasklet_action_common.isra.0+0x9c/0x128
> [ 4.896549] 000: tasklet_hi_action+0x24/0x30
> [ 4.896556] 000: __do_softirq+0x120/0x23c
> [ 4.896564] 000: irq_exit+0xb8/0xd8
> [ 4.896571] 000: __handle_domain_irq+0x64/0xb8
> [ 4.896579] 000: bcm2836_arm_irqchip_handle_irq+0x60/0xc0
> [ 4.896586] 000: el1_irq+0xb8/0x140
> [ 4.896592] 000: arch_cpu_idle+0x10/0x18
> [ 4.896601] 000: do_idle+0x200/0x280
> [ 4.896608] 000: cpu_startup_entry+0x20/0x28
> [ 4.896615] 000: rest_init+0xb4/0xc0
> [ 4.896623] 000: arch_call_rest_init+0xc/0x14
> [ 4.896632] 000: start_kernel+0x454/0x480
>
> [dwagner: Updated Jisheng's initial patch]
>
> Fixes: cc89c323a30e ("lan78xx: Use irq_domain for phy interrupt from USB Int. EP")
> Cc: Woojung Huh <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Andrew Lunn <[email protected]>
> Cc: Stefan Wahren <[email protected]>
> Cc: Jisheng Zhang <[email protected]>
> Cc: Sebastian Andrzej Siewior <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---

Acked-by: Stefan Wahren <[email protected]>

2019-10-25 19:29:12

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

Hi Stefan,

On Thu, Oct 24, 2019 at 07:25:37PM +0200, Stefan Wahren wrote:
> Am 24.10.19 um 16:12 schrieb Daniel Wagner:
> > On Thu, Oct 24, 2019 at 01:06:10PM +0200, Daniel Wagner wrote:
> >
> > Sebastians suggested to try the RPi kernel. The rpi-5.2.y kernel
> > behaves exactly the same. That is one PHY interrupt and later on NFS
> > timeouts.
> >
> > According their website the current shipped RPi kernel is in version
> > 4.18. Here is what happends with rpi-4.18.y:
>
> No, it's 4.19. It's always a LTS kernel.

Ah okay and obviously, 4.19 works also nicely. No surprise here.

> I'm curious, what's the motivation behind this? The rpi tree contains
> additional hacks, so i'm not sure the results are comparable. Also the
> USB host driver is a different one.

The idea was to see what the PHY interrupt is doing. As it turns out
the RPi tree and mainline have almost the same infrastructure code
here (irqdomain). There are some additional tweaks in the RPi
kernel. My initial revert patch removed all this infrastructure code,
which is probably not a good idea. If the way forward is to steal the
bits and pieces from the RPi tree which should keep this code in
place.

With the local_irq_disable() patch, which I am going to send asap, the
warning which everyone is seeing should be gone. So one bug down.

> > There are no NFS timeouts and commands like 'apt update' work reasoble
> > fast. So no long delays or hangs. Time to burn this hardware.
>
> Since enabling lan78xx for Raspberry Pi 3B+, we found a lot of driver
> issues. So i'm not really surprised, that there are still more of them.

If the vendor would work on fixing the bugs it would not be real
problem.

Thanks,
Daniel

Subject: Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

On 2019-10-24 14:57:16 [-0700], David Miller wrote:
> From: Daniel Wagner <[email protected]>
> Date: Fri, 18 Oct 2019 10:28:17 +0200
>
> > handle_simple_irq() expect interrupts to be disabled. The USB
> > framework is using threaded interrupts, which implies that interrupts
> > are re-enabled as soon as it has run.
> ...
>
> Where are we with this patch? I'm tossing it.
>
> It seems Sebastian made a suggestion, someone else said his suggestion
> should be tried, then everything died.
>
> Please follow up and post when something is ready.

My suggestion with the nested-handler was nonsense. The suggestion with
the local_irq_disable() before invoking generic_handle_irq() did avoid
the warning but the NFS-root was not stable (and Daniel reported a lot
of USB interrupts coming which indicates that the interrupt-handler is
not acknowledging the interrupt properly).

It works by reverting the IRQ-domain code as this patch does. "Works" as
in "no warnings" and "NFS-root looks stable".

You have two ACKs on that patch. I could ask Daniel to repost the patch
with additional informations.
My last information from Daniel was that the rpi tree works and I'm not
sure if he is looking for the missing ingredient or preferring the
removal of the non-working code.

Sebastian