On Wed, 2004-10-06 at 16:39, Colin Leroy wrote:
> Hi,
>
> I noticed that, if you have netconsole set up and using a sungem card,
> and if the network cable is not plugged in, that the whole kernel hangs
> shortly after the "device not up yet, forcing it" netconsole
> message. I suspect this is due to the autoneg in sungem, but didn't have
> time to look further.
>
> Would you have any hints on the cause of this problem?
Not sure, I suppose the driver is doing printk's with spinlocks held
from the autoneg stuff and there is a spinlock deadlock happening ...
Ben.
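
To make the suspected failure mode concrete, here is a minimal userspace sketch of it, not sungem or netpoll code. Every name in it (sketch_lock, sketch_link_timer, sketch_printk, sketch_xmit) is hypothetical, and it assumes that a printk() routed through netconsole ends up in the driver's transmit path, which wants the same lock the caller already holds. A try-lock replaces the real spin so the example terminates and just counts the would-be deadlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive driver spinlock. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

/* Number of times a real spin_lock() would have spun forever. */
static int sketch_would_deadlock = 0;

/* Single attempt instead of spinning, so the sketch always returns. */
static bool sketch_trylock(void)
{
    return !atomic_flag_test_and_set(&sketch_lock);
}

static void sketch_unlock(void)
{
    atomic_flag_clear(&sketch_lock);
}

/* Stand-in for the driver's transmit path (assumed to need the lock). */
static void sketch_xmit(const char *msg)
{
    (void)msg;
    if (!sketch_trylock()) {
        /* The lock holder is our own caller: a real spin never ends. */
        sketch_would_deadlock++;
        return;
    }
    sketch_unlock();
}

/* Stand-in for printk() with a network console attached. */
static void sketch_printk(const char *msg)
{
    sketch_xmit(msg);
}

/* Stand-in for a link timer that may or may not log under the lock. */
static void sketch_link_timer(bool log_under_lock)
{
    bool got = sketch_trylock();
    assert(got);            /* the lock is free on entry in this sketch */
    (void)got;

    if (log_under_lock)
        sketch_printk("link is down\n");   /* re-enters the driver */

    sketch_unlock();

    if (!log_under_lock)
        sketch_printk("link is down\n");   /* safe: lock already dropped */
}
```

Calling sketch_link_timer(true) bumps sketch_would_deadlock, while sketch_link_timer(false) does not, which is the usual shape of the fix: drop the lock before logging, or defer the message.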
On 06 Oct 2004 at 18h10, Benjamin Herrenschmidt wrote:
Hi,
> On Wed, 2004-10-06 at 16:39, Colin Leroy wrote:
> > Hi,
> >
> > I noticed that, if you have netconsole set up and using a sungem
> > card, and if the network cable is not plugged in, that the whole
> > kernel hangs shortly after the "device not up yet, forcing it"
> > netconsole message. I suspect this is due to the autoneg in sungem,
> > but didn't have time to look further.
> >
> > Would you have any hints on the cause of this problem?
>
> Not sure, I suppose the driver is doing printk's with spinlocks held
> from the autoneg stuff and there is a spinlock deadlock happening ...
Thanks. I'll look into this. If I'm not mistaken, I've got no way of
catching it easily, do I? CONFIG_DEBUG_SPINLOCK's help seems to say
that I need NMI watchdog in order to catch deadlocks, which is only
available on x86(_64).
--
Colin
On Wed, 2004-10-06 at 18:42, Colin Leroy wrote:
> > Not sure, I suppose the driver is doing printk's with spinlocks held
> > from the autoneg stuff and there is a spinlock deadlock happening ...
>
> Thanks. I'll look into this. If I'm not mistaken, I've got no way of
> catching it easily, do I? CONFIG_DEBUG_SPINLOCK's help seems to say
> that I need NMI watchdog in order to catch deadlocks, which is only
> available on x86(_64).
Hrm... we have some sort of spinlock debugging, at least on ppc64...
BTW, did you have SMP or PREEMPT? If none of these, then you should
not see any spin deadlock...
The solution is to look at the code though and find what's wrong :)
Ben.
On 06 Oct 2004 at 19h10, Benjamin Herrenschmidt wrote:
Hi,
> Hrm... we have some sort of spinlock debugging, at least on ppc64...
> BTW, did you have SMP or PREEMPT? If none of these, then you should
> not see any spin deadlock...
No, in fact. You're right...
Indeed, if there were a deadlock, it would also happen when the cable is
plugged in, wouldn't it? (sungem prints "Link is up at xxx..." or
something similar once it is correctly initialized.)
> The solution is to look at the code though and find what's wrong :)
I'll try.
If I'm not mistaken, when netpoll calls
dev_change_flags(ndev, ndev->flags | IFF_UP), the driver method that
ends up being called is gem_open()?
Could some kind of infinite loop happen within gem_link_timer, maybe?
--
Colin