2013-09-30 07:26:33

by Ruslan N. Marchenko

Subject: [BUG] Regression in 2fdac010 drivers/net/ethernet/via/via-velocity.c: update napi implementation

Hi Julia et al.,

With this commit, a VIA NANO board gets CPU lockups in random places on any network activity.
The error is random; most of the time the CPU deadlocks on boot.

After loading the via-velocity module, it hangs for a while and then spits out kernel messages
about hung tasks from various places, like ata, rcu_preempt or the network stack.
Also, if you boot the box with the network disconnected, it works until you plug in the network cable.

On some kernel builds (like the stock Ubuntu one) the behaviour varies: sometimes the box itself works,
but the network stack dies after multiple ICMPv6 skb errors, so the box ends up with the network disconnected.
Reloading the network subsystem helps it recover for some time (depending on how fast the eth LED is blinking).

Reverting the patch allows me to boot without problems, even from the latest Linus trunk, and send this mail.

Regards,
Ruslan


2013-09-30 11:43:39

by Julia Lawall

Subject: Re: [BUG] Regression in 2fdac010 drivers/net/ethernet/via/via-velocity.c: update napi implementation

On Mon, 30 Sep 2013, Ruslan N. Marchenko wrote:

> Hi Julia et al.,
>
> With this commit, a VIA NANO board gets CPU lockups in random places on any network activity.
> The error is random; most of the time the CPU deadlocks on boot.
>
> After loading the via-velocity module, it hangs for a while and then spits out kernel messages
> about hung tasks from various places, like ata, rcu_preempt or the network stack.
> Also, if you boot the box with the network disconnected, it works until you plug in the network cable.
>
> On some kernel builds (like the stock Ubuntu one) the behaviour varies: sometimes the box itself works,
> but the network stack dies after multiple ICMPv6 skb errors, so the box ends up with the network disconnected.
> Reloading the network subsystem helps it recover for some time (depending on how fast the eth LED is blinking).
>
> Reverting the patch allows me to boot without problems, even from the latest Linus trunk, and send this mail.

There has already been a discussion about this, and a patch has already
been proposed. It has to do with lock management. I will look for the
email.

julia

2013-09-30 11:54:10

by Julia Lawall

Subject: Re: [BUG] Regression in 2fdac010 drivers/net/ethernet/via/via-velocity.c: update napi implementation

On Mon, 30 Sep 2013, Ruslan N. Marchenko wrote:

> Hi Julia et al.,
>
> With this commit, a VIA NANO board gets CPU lockups in random places on any network activity.
> The error is random; most of the time the CPU deadlocks on boot.
>
> After loading the via-velocity module, it hangs for a while and then spits out kernel messages
> about hung tasks from various places, like ata, rcu_preempt or the network stack.
> Also, if you boot the box with the network disconnected, it works until you plug in the network cable.
>
> On some kernel builds (like the stock Ubuntu one) the behaviour varies: sometimes the box itself works,
> but the network stack dies after multiple ICMPv6 skb errors, so the box ends up with the network disconnected.
> Reloading the network subsystem helps it recover for some time (depending on how fast the eth LED is blinking).
>
> Reverting the patch allows me to boot without problems, even from the latest Linus trunk, and send this mail.

You can find the discussion and improved patch in this thread:

http://www.spinics.net/lists/netdev/msg251287.html
Bug - regression - Via velocity interface coming up freezes kernel
From: Dirk Kraft <dirk.kraft@xxxxxxxxx>
Date: Sun, 22 Sep 2013 19:28:52 +0200

julia

2013-09-30 13:54:34

by Ruslan N. Marchenko

Subject: Re: [BUG] Regression in 2fdac010 drivers/net/ethernet/via/via-velocity.c: update napi implementation

On Mon, Sep 30, 2013 at 01:54:06PM +0200, Julia Lawall wrote:
> On Mon, 30 Sep 2013, Ruslan N. Marchenko wrote:
>
> > Hi Julia et al.,
> >
> > With this commit, a VIA NANO board gets CPU lockups in random places on any network activity.
> > The error is random; most of the time the CPU deadlocks on boot.
> >
> > After loading the via-velocity module, it hangs for a while and then spits out kernel messages
> > about hung tasks from various places, like ata, rcu_preempt or the network stack.
> > Also, if you boot the box with the network disconnected, it works until you plug in the network cable.
> >
> > On some kernel builds (like the stock Ubuntu one) the behaviour varies: sometimes the box itself works,
> > but the network stack dies after multiple ICMPv6 skb errors, so the box ends up with the network disconnected.
> > Reloading the network subsystem helps it recover for some time (depending on how fast the eth LED is blinking).
> >
> > Reverting the patch allows me to boot without problems, even from the latest Linus trunk, and send this mail.
>
> You can find the discussion and improved patch in this thread:
>
> http://www.spinics.net/lists/netdev/msg251287.html
> Bug - regression - Via velocity interface coming up freezes kernel
> From: Dirk Kraft <dirk.kraft@xxxxxxxxx>
> Date: Sun, 22 Sep 2013 19:28:52 +0200
>
> julia

Oops, sorry, somehow I missed it even though I've been searching for via-velocity regressions.
I will give the newly proposed patch a try as well.

Thanks,
Ruslan

2013-10-01 06:30:09

by Ruslan N. Marchenko

Subject: Re: [BUG] Regression in 2fdac010 drivers/net/ethernet/via/via-velocity.c: update napi implementation

On Mon, Sep 30, 2013 at 03:54:27PM +0200, Ruslan N. Marchenko wrote:
> On Mon, Sep 30, 2013 at 01:54:06PM +0200, Julia Lawall wrote:
> > On Mon, 30 Sep 2013, Ruslan N. Marchenko wrote:
> >
> > > Hi Julia et al.,
> > >
> > > With this commit, a VIA NANO board gets CPU lockups in random places on any network activity.
> > > .
> > > Reverting the patch allows me to boot without problems, even from the latest Linus trunk, and send this mail.
> >
> > You can find the discussion and improved patch in this thread:
> >
> > http://www.spinics.net/lists/netdev/msg251287.html
> > Bug - regression - Via velocity interface coming up freezes kernel
> > From: Dirk Kraft <dirk.kraft@xxxxxxxxx>
> > Date: Sun, 22 Sep 2013 19:28:52 +0200
> >
> > julia
>
> Oops, sorry, somehow I missed it even though I've been searching for via-velocity regressions.
> I will give the newly proposed patch a try as well.
>
And a short confirmation: adding the patch from Francois Romieu works on the current -rc3+.

Regards,
Ruslan

2013-10-01 22:46:21

by Francois Romieu

Subject: Re: [BUG] Regression in 2fdac010 drivers/net/ethernet/via/via-velocity.c: update napi implementation

Julia Lawall <[email protected]> :
[...]
> There has already been a discussion about this, and a patch has already
> been proposed. It has to do with lock management. I will look for the
> email.

The underlying problem has to do with disabled irqs. netif_receive_skb
assumes irqs are enabled. The current via-velocity poll() method should
narrow its (spinlocked) irq-disabled section.

What I've done should not require much analysis (aka "what could race
with the rx bh processing in a napi driver?") and avoids a more intrusive
lockless napi design.

--
Ueimor