Hi
The Ultra 10 boots, receives a DHCP answer, and ssh works, but a few minutes
later the machine becomes completely unreachable over the network.
It does not even respond to ARP requests.
On the Sparc host I see that `TX packets` does not grow, but RX does.
The state of the interface is
UP BROADCAST RUNNING MULTICAST
There are no relevant kernel messages in dmesg.
next-20090317 worked fine.
Let me know if I can help.
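
For anyone who wants to watch for the same symptom, here is a rough Python sketch of the counter check described above. It assumes the counters are taken from two snapshots of /proc/net/dev; the function names and the interface name `eth0` are made up for illustration and are not from the report.

```python
def parse_proc_net_dev(text):
    """Return {iface: (rx_packets, tx_packets)} from /proc/net/dev content."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10:
            continue
        # Receive columns come first (bytes, packets, ...); the transmit
        # columns start at index 8, so TX packets is index 9.
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def tx_stalled(before, after, iface):
    """True if RX packets grew between two snapshots but TX packets did not."""
    rx0, tx0 = before[iface]
    rx1, tx1 = after[iface]
    return rx1 > rx0 and tx1 == tx0
```

On a live machine one would read /proc/net/dev twice, a few seconds apart, pass both strings through `parse_proc_net_dev`, and call `tx_stalled(before, after, "eth0")` with the real interface name.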
2009/3/18 Alexander Beregalov <[email protected]>:
> Hi
>
> The Ultra 10 boots, receives a DHCP answer, and ssh works, but a few minutes
> later the machine becomes completely unreachable over the network.
> It does not even respond to ARP requests.
>
> On the Sparc host I see that `TX packets` does not grow, but RX does.
> The state of the interface is
> UP BROADCAST RUNNING MULTICAST
>
> There are no relevant kernel messages in dmesg.
>
> next-20090317 worked fine.
>
>
> Let me know if I can help.
>
I have reproduced it on next-20090318 for the second time, but I still
do not know what triggers it.
next-20090317 might be affected as well.
Now both packet counters grow, but tcpdump cannot capture any packet.
From: Alexander Beregalov <[email protected]>
Date: Wed, 18 Mar 2009 19:58:40 +0300
> 2009/3/18 Alexander Beregalov <[email protected]>:
> > Hi
> >
> > The Ultra 10 boots, receives a DHCP answer, and ssh works, but a few minutes
> > later the machine becomes completely unreachable over the network.
> > It does not even respond to ARP requests.
> >
> > On the Sparc host I see that `TX packets` does not grow, but RX does.
> > The state of the interface is
> > UP BROADCAST RUNNING MULTICAST
> >
> > There are no relevant kernel messages in dmesg.
> >
> > next-20090317 worked fine.
> >
> >
> > Let me know if I can help.
> >
>
> I have reproduced it on next-20090318 for the second time, but I still
> do not know what triggers it.
> next-20090317 might be affected as well.
>
> Now both packet counters grow, but tcpdump cannot capture any packet.
Does the current vanilla kernel work fine?
2009/3/19 David Miller <[email protected]>:
> From: Alexander Beregalov <[email protected]>
> Date: Wed, 18 Mar 2009 19:58:40 +0300
>
>> 2009/3/18 Alexander Beregalov <[email protected]>:
>> > Hi
>> >
>> > The Ultra 10 boots, receives a DHCP answer, and ssh works, but a few minutes
>> > later the machine becomes completely unreachable over the network.
>> > It does not even respond to ARP requests.
>> >
>> > On the Sparc host I see that `TX packets` does not grow, but RX does.
>> > The state of the interface is
>> > UP BROADCAST RUNNING MULTICAST
>> >
>> > There are no relevant kernel messages in dmesg.
>> >
>> > next-20090317 worked fine.
>> >
>> >
>> > Let me know if I can help.
>> >
>>
>> I have reproduced it on next-20090318 for the second time, but I still
>> do not know what triggers it.
>> next-20090317 might be affected as well.
>>
>> Now both packet counters grow, but tcpdump cannot capture any packet.
>
> Does the current vanilla kernel work fine?
Yes. The problem happens only with the -next tree.
From: Alexander Beregalov <[email protected]>
Date: Thu, 19 Mar 2009 11:16:20 +0300
> 2009/3/19 David Miller <[email protected]>:
> > Does the current vanilla kernel work fine?
>
> Yes. The problem happens only with the -next tree.
I have one of my ultra10s up and will try to reproduce and diagnose
this locally.
Thanks.
>> 2009/3/19 David Miller <[email protected]>:
>> > Does the current vanilla kernel work fine?
>>
>> Yes. The problem happens only with the -next tree.
>
> I have one of my ultra10s up and will try to reproduce and diagnose
> this locally.
Hi David
2.6.29 does not work either.
The guilty patch should be between 2.6.29-rc8 and the final 2.6.29,
or, more narrowly, in 59fcbdd..v2.6.29.
I can try to bisect.
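
As a reminder of what bisecting over that range amounts to, here is a simplified Python sketch of the binary search. It assumes a linear, oldest-to-newest list of commits in which the last one is known bad and every commit after the first bad one is also bad; `first_bad_commit` and `is_bad` are hypothetical names for illustration, not git's interface.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an oldest-to-newest commit list for the first bad one.

    Assumes commits[-1] is bad and that badness is monotonic: once a
    commit is bad, every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo]
```

With a hang that only shows up after a few minutes, `is_bad` has to wait long enough on every step; one commit wrongly marked good is enough to send the search to the wrong result.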
2009/3/24 Alexander Beregalov <[email protected]>:
>>> 2009/3/19 David Miller <[email protected]>:
>>> > Does the current vanilla kernel work fine?
>>>
>>> Yes. The problem happens only with the -next tree.
>>
>> I have one of my ultra10s up and will try to reproduce and diagnose
>> this locally.
>
> Hi David
>
> 2.6.29 does not work either.
>
> The guilty patch should be between 2.6.29-rc8 and the final 2.6.29,
> or, more narrowly, in 59fcbdd..v2.6.29.
>
> I can try to bisect.
e4a389a9b5c892446b5de2038bdc0cca8703c615 is first bad commit
commit e4a389a9b5c892446b5de2038bdc0cca8703c615
Author: Roel Kluin <[email protected]>
Date: Wed Mar 18 23:12:13 2009 -0700
net: kfree(napi->skb) => kfree_skb
2009/3/24 Alexander Beregalov <[email protected]>:
> 2009/3/24 Alexander Beregalov <[email protected]>:
>>>> 2009/3/19 David Miller <[email protected]>:
>>>> > Does the current vanilla kernel work fine?
>>>>
>>>> Yes. The problem happens only with the -next tree.
>>>
>>> I have one of my ultra10s up and will try to reproduce and diagnose
>>> this locally.
>>
>> Hi David
>>
>> 2.6.29 does not work either.
>>
>> The guilty patch should be between 2.6.29-rc8 and the final 2.6.29,
>> or, more narrowly, in 59fcbdd..v2.6.29.
>>
>> I can try to bisect.
>
> e4a389a9b5c892446b5de2038bdc0cca8703c615 is first bad commit
No, it is wrong, sorry.
I will do more tests tomorrow.
From: Alexander Beregalov <[email protected]>
Date: Tue, 24 Mar 2009 18:50:55 +0300
> 2009/3/24 Alexander Beregalov <[email protected]>:
> > e4a389a9b5c892446b5de2038bdc0cca8703c615 is first bad commit
>
> No, it is wrong, sorry.
> I will do more tests tomorrow.
I bet it is the problem we are discussing here:
http://marc.info/?l=linux-kernel&m=123789980524715&w=2
Herbert has posted two patches to try and fix this issue,
you can try the second one to see if it solves your bug
too.