LinuxLists.cc - bug report

2008-06-06 20:06:23

Subject: bug report

Hi!

I have a problem.

http://www.cyberszeg.hu/log/kern.log
http://www.cyberszeg.hu/log/config-2.6.25.4
http://www.cyberszeg.hu/log/lspci.txt
http://www.cyberszeg.hu/log/ifconfig.txt

Best regards Zsirmo

2008-06-07 01:44:47

by Oliver Pinter

Subject: Re: bug report

Add Ingon and netdev to CC

On 6/6/08, Zsiros Attila <[email protected]> wrote:
> Hy!
>
> I have a problem.
>
> http://www.cyberszeg.hu/log/kern.log
> http://www.cyberszeg.hu/log/config-2.6.25.4
> http://www.cyberszeg.hu/log/lspci.txt
> http://www.cyberszeg.hu/log/ifconfig.txt
>
> Best regards Zsirmo

--
Thanks,
Oliver

2008-06-07 01:45:30

by Oliver Pinter

Subject: Re: bug report

Sorry, s/Ingon/Ingo/

On 6/7/08, Oliver Pinter <[email protected]> wrote:
> Add Ingon and netdev to CC
>
>
> On 6/6/08, Zsiros Attila <[email protected]> wrote:
>> Hy!
>>
>> I have a problem.
>>
>> http://www.cyberszeg.hu/log/kern.log
>> http://www.cyberszeg.hu/log/config-2.6.25.4
>> http://www.cyberszeg.hu/log/lspci.txt
>> http://www.cyberszeg.hu/log/ifconfig.txt
>>
>> Best regards Zsirmo
>
>
> --
> Thanks,
> Oliver
>

--
Thanks,
Oliver

2008-06-07 05:57:17

by Andrew Morton

On Sat, 7 Jun 2008 03:44:32 +0200 "Oliver Pinter" <[email protected]> wrote:

> Add Ingon and netdev to CC
>
>
> On 6/6/08, Zsiros Attila <[email protected]> wrote:
> > Hy!
> >
> > I have a problem.
> >
> > http://www.cyberszeg.hu/log/kern.log
> > http://www.cyberszeg.hu/log/config-2.6.25.4
> > http://www.cyberszeg.hu/log/lspci.txt
> > http://www.cyberszeg.hu/log/ifconfig.txt
> >

: Jun 6 14:03:07 www kernel: [ 5897.660390] NETDEV WATCHDOG: eth0: transmit timed out
: Jun 6 14:03:07 www kernel: [ 5897.660398] tg3: eth0: transmit timed out, resetting
: Jun 6 14:03:07 www kernel: [ 5897.660432] tg3: DEBUG: MAC_TX_STATUS[0000001e] MAC_RX_STATUS[0000000e]
: Jun 6 14:03:07 www kernel: [ 5897.660454] tg3: DEBUG: RDMAC_STATUS[00000000] WDMAC_STATUS[00000000]
: Jun 6 14:03:07 www kernel: [ 5897.762983] tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
: Jun 6 14:03:07 www kernel: [ 5897.864168] tg3: tg3_stop_block timed out, ofs=c00 enable_bit=2
: Jun 6 14:03:07 www kernel: [ 5897.965619] tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2

That looks like a driver failure.

: Jun 6 14:03:07 www kernel: [ 5898.096689] tg3: eth0: Link is down.
: Jun 6 14:03:11 www kernel: [ 5901.633931] tg3: eth0: Link is up at 1000 Mbps, full duplex.
: Jun 6 14:03:11 www kernel: [ 5901.633937] tg3: eth0: Flow control is on for TX and on for RX.
: Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation failure. order:3, mode:0x4020
: Jun 6 14:04:11 www kernel: [ 6464.556319] Pid: 24139, comm: clamscan Not tainted 2.6.25.4 #1
: Jun 6 14:04:11 www kernel: [ 6464.556325]
: Jun 6 14:04:11 www kernel: [ 6464.556326] Call Trace:
: Jun 6 14:04:11 www kernel: [ 6464.556329] <IRQ> [__alloc_pages+544/890] __alloc_pages+0x220/0x37a
: Jun 6 14:04:11 www kernel: [ 6464.556353] [tcp_v4_do_rcv+186/504] tcp_v4_do_rcv+0xba/0x1f8
: Jun 6 14:04:11 www kernel: [ 6464.556359] [ktime_get+12/98] ktime_get+0xc/0x62
: Jun 6 14:04:11 www kernel: [ 6464.556364] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
: Jun 6 14:04:11 www kernel: [ 6464.556370] [__slab_alloc+330/1403] __slab_alloc+0x14a/0x57b
: Jun 6 14:04:11 www kernel: [ 6464.556374] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
: Jun 6 14:04:11 www kernel: [ 6464.556379] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
: Jun 6 14:04:11 www kernel: [ 6464.556384] [__kmalloc_track_caller+185/190] __kmalloc_track_caller+0xb9/0xbe
: Jun 6 14:04:11 www kernel: [ 6464.556391] [__alloc_skb+86/305] __alloc_skb+0x56/0x131
: Jun 6 14:04:11 www kernel: [ 6464.556395] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
: Jun 6 14:04:11 www kernel: [ 6464.556408] [_end+128472975/2130444940] :tg3:tg3_alloc_rx_skb+0x8f/0x17e
: Jun 6 14:04:11 www kernel: [ 6464.556418] [_end+128493403/2130444940] :tg3:tg3_poll+0x6e8/0x922
: Jun 6 14:04:11 www kernel: [ 6464.556426] [net_rx_action+134/309] net_rx_action+0x86/0x135
: Jun 6 14:04:11 www kernel: [ 6464.556433] [__do_softirq+102/212] __do_softirq+0x66/0xd4
: Jun 6 14:04:11 www kernel: [ 6464.556439] [call_softirq+28/48] call_softirq+0x1c/0x30
: Jun 6 14:04:11 www kernel: [ 6464.556444] [do_softirq+48/107] do_softirq+0x30/0x6b
: Jun 6 14:04:11 www kernel: [ 6464.556448] [do_IRQ+114/212] do_IRQ+0x72/0xd4
: Jun 6 14:04:11 www kernel: [ 6464.556453] [ret_from_intr+0/10] ret_from_intr+0x0/0xa

The driver is trying to do a 32 kbyte GFP_ATOMIC memory allocation.
rofl, good luck with that.
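
For anyone following along, the 32 kbyte figure comes from the order:3 in
the failure line: an order-n request asks for 2^n physically contiguous
pages. A minimal sketch of that arithmetic, assuming the common 4 KiB page
size (the log does not state the page size, and the function name is only
illustrative):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on typical x86 systems


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Return the size in bytes of an order-`order` page allocation.

    An order-n allocation is 2**n contiguous pages.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (1 << order) * page_size


if __name__ == "__main__":
    size = order_to_bytes(3)
    print(f"order:3 -> {size} bytes ({size // 1024} KiB)")
```

Eight contiguous pages have to be free at once, and an atomic allocation
cannot sleep to wait for them, which is why such requests tend to fail once
memory is fragmented.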

But the networking code should survive this.

<12 billion more page allocation failures>

Are you using jumbo frames or have you manually set the MTU to
something enormous? Because 32k is a pretty crazy amount of memory for
the driver to be trying to allocate - it's going to fail all over the
place, as you have discovered.
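
One way to answer the MTU question is to read the value the kernel exposes
under /sys/class/net/<iface>/mtu. The sketch below is only an illustration:
the interface name eth0 is taken from the log above, the 1500-byte
threshold is the usual Ethernet default rather than anything stated in this
thread, and the helper names are made up for the example.

```python
import sys
from pathlib import Path

STANDARD_MTU = 1500  # assumption: usual Ethernet default; larger suggests jumbo frames


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Read the configured MTU of `iface` from sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def is_jumbo(mtu, threshold=STANDARD_MTU):
    """Return True if the MTU is larger than the standard Ethernet MTU."""
    return mtu > threshold


if __name__ == "__main__":
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth0"
    try:
        mtu = read_mtu(iface)
    except (OSError, ValueError) as exc:
        print(f"could not read MTU for {iface}: {exc}")
    else:
        kind = "larger than standard" if is_jumbo(mtu) else "standard or smaller"
        print(f"{iface}: mtu {mtu} ({kind})")
```

The ifconfig output linked in the original report should show the same
value in its MTU field.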

2008-06-07 08:48:21

by Ilpo Järvinen

Subject: Re: bug report

...Added Brian Vowell.

On Fri, 6 Jun 2008, Andrew Morton wrote:

> On Sat, 7 Jun 2008 03:44:32 +0200 "Oliver Pinter" <[email protected]> wrote:
>
> > Add Ingon and netdev to CC
> >
> >
> > On 6/6/08, Zsiros Attila <[email protected]> wrote:
> > > Hy!
> > >
> > > I have a problem.
> > >
> > > http://www.cyberszeg.hu/log/kern.log
> > > http://www.cyberszeg.hu/log/config-2.6.25.4
> > > http://www.cyberszeg.hu/log/lspci.txt
> > > http://www.cyberszeg.hu/log/ifconfig.txt
> > >
>
> : Jun 6 14:03:07 www kernel: [ 5897.660390] NETDEV WATCHDOG: eth0: transmit timed out
> : Jun 6 14:03:07 www kernel: [ 5897.660398] tg3: eth0: transmit timed out, resetting
> : Jun 6 14:03:07 www kernel: [ 5897.660432] tg3: DEBUG: MAC_TX_STATUS[0000001e] MAC_RX_STATUS[0000000e]
> : Jun 6 14:03:07 www kernel: [ 5897.660454] tg3: DEBUG: RDMAC_STATUS[00000000] WDMAC_STATUS[00000000]
> : Jun 6 14:03:07 www kernel: [ 5897.762983] tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
> : Jun 6 14:03:07 www kernel: [ 5897.864168] tg3: tg3_stop_block timed out, ofs=c00 enable_bit=2
> : Jun 6 14:03:07 www kernel: [ 5897.965619] tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
>
> That looks like a driver failure.
>
> : Jun 6 14:03:07 www kernel: [ 5898.096689] tg3: eth0: Link is down.
> : Jun 6 14:03:11 www kernel: [ 5901.633931] tg3: eth0: Link is up at 1000 Mbps, full duplex.
> : Jun 6 14:03:11 www kernel: [ 5901.633937] tg3: eth0: Flow control is on for TX and on for RX.
> : Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation failure. order:3, mode:0x4020
> : Jun 6 14:04:11 www kernel: [ 6464.556319] Pid: 24139, comm: clamscan Not tainted 2.6.25.4 #1
> : Jun 6 14:04:11 www kernel: [ 6464.556325]
> : Jun 6 14:04:11 www kernel: [ 6464.556326] Call Trace:
> : Jun 6 14:04:11 www kernel: [ 6464.556329] <IRQ> [__alloc_pages+544/890] __alloc_pages+0x220/0x37a
> : Jun 6 14:04:11 www kernel: [ 6464.556353] [tcp_v4_do_rcv+186/504] tcp_v4_do_rcv+0xba/0x1f8
> : Jun 6 14:04:11 www kernel: [ 6464.556359] [ktime_get+12/98] ktime_get+0xc/0x62
> : Jun 6 14:04:11 www kernel: [ 6464.556364] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
> : Jun 6 14:04:11 www kernel: [ 6464.556370] [__slab_alloc+330/1403] __slab_alloc+0x14a/0x57b
> : Jun 6 14:04:11 www kernel: [ 6464.556374] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
> : Jun 6 14:04:11 www kernel: [ 6464.556379] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
> : Jun 6 14:04:11 www kernel: [ 6464.556384] [__kmalloc_track_caller+185/190] __kmalloc_track_caller+0xb9/0xbe
> : Jun 6 14:04:11 www kernel: [ 6464.556391] [__alloc_skb+86/305] __alloc_skb+0x56/0x131
> : Jun 6 14:04:11 www kernel: [ 6464.556395] [__netdev_alloc_skb+23/49] __netdev_alloc_skb+0x17/0x31
> : Jun 6 14:04:11 www kernel: [ 6464.556408] [_end+128472975/2130444940] :tg3:tg3_alloc_rx_skb+0x8f/0x17e
> : Jun 6 14:04:11 www kernel: [ 6464.556418] [_end+128493403/2130444940] :tg3:tg3_poll+0x6e8/0x922
> : Jun 6 14:04:11 www kernel: [ 6464.556426] [net_rx_action+134/309] net_rx_action+0x86/0x135
> : Jun 6 14:04:11 www kernel: [ 6464.556433] [__do_softirq+102/212] __do_softirq+0x66/0xd4
> : Jun 6 14:04:11 www kernel: [ 6464.556439] [call_softirq+28/48] call_softirq+0x1c/0x30
> : Jun 6 14:04:11 www kernel: [ 6464.556444] [do_softirq+48/107] do_softirq+0x30/0x6b
> : Jun 6 14:04:11 www kernel: [ 6464.556448] [do_IRQ+114/212] do_IRQ+0x72/0xd4
> : Jun 6 14:04:11 www kernel: [ 6464.556453] [ret_from_intr+0/10] ret_from_intr+0x0/0xa
>
> The driver is trying to do a 32 kbyte GFP_ATOMIC memory allocation.
> rofl, good luck with that.
>
> But the netwoking code sould survive this.
>
> <12 billion more page allocation failures>
>
> Are you using jumbo frames or have you manually set the MTU to
> something enormous? Because 32k is a pretty crazy amount of memory for
> the driver to be trying to allocate - it's going to fail all over the
> place, as you have discovered.

The same allocation failure (along with a TCP issue that has since been
fixed) was also reported by Brian Vowell:
http://bugzilla.kernel.org/show_bug.cgi?id=10767

--
i.

2008-06-07 12:50:22

by Oliver Pinter

Subject: Re: bug report

[snip]
Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation
failure. order:3, mode:0x4020 <------------- this
Jun 6 14:04:11 www kernel: [ 6464.556319] Pid: 24139, comm: clamscan
Not tainted 2.6.25.4 #1
Jun 6 14:04:11 www kernel: [ 6464.556325]
Jun 6 14:04:11 www kernel: [ 6464.556326] Call Trace:
Jun 6 14:04:11 www kernel: [ 6464.556329] <IRQ>
[__alloc_pages+544/890] __alloc_pages+0x220/0x37a
Jun 6 14:04:11 www kernel: [ 6464.556353] [tcp_v4_do_rcv+186/504]
tcp_v4_do_rcv+0xba/0x1f8
[snip]

...

[snip]
Jun 6 14:04:11 www kernel: [ 6464.558231] Pid: 24139, comm: clamscan
Not tainted 2.6.25.4 #1
Jun 6 14:04:11 www kernel: [ 6464.558234]
Jun 6 14:04:11 www kernel: [ 6464.558235] Call Trace:
Jun 6 14:04:11 www kernel: [ 6464.558239] <IRQ>
[__alloc_pages+544/890] __alloc_pages+0x220/0x37a
Jun 6 14:04:11 www kernel: [ 6464.558257] [tcp_v4_do_rcv+186/504]
tcp_v4_do_rcv+0xba/0x1f8
Jun 6 14:04:11 www kernel: [ 6464.558263]
[apic_timer_interrupt+102/112] apic_timer_interrupt+0x6
[snip]

On 6/7/08, Ilpo Järvinen <[email protected]> wrote:
> ...Added Brian Vowell.
>
> On Fri, 6 Jun 2008, Andrew Morton wrote:
>
>> On Sat, 7 Jun 2008 03:44:32 +0200 "Oliver Pinter" <[email protected]>
>> wrote:
>>
>> > Add Ingon and netdev to CC
>> >
>> >
>> > On 6/6/08, Zsiros Attila <[email protected]> wrote:
>> > > Hy!
>> > >
>> > > I have a problem.
>> > >
>> > > http://www.cyberszeg.hu/log/kern.log
>> > > http://www.cyberszeg.hu/log/config-2.6.25.4
>> > > http://www.cyberszeg.hu/log/lspci.txt
>> > > http://www.cyberszeg.hu/log/ifconfig.txt
>> > >
>>
>> : Jun 6 14:03:07 www kernel: [ 5897.660390] NETDEV WATCHDOG: eth0:
>> transmit timed out
>> : Jun 6 14:03:07 www kernel: [ 5897.660398] tg3: eth0: transmit timed
>> out, resetting
>> : Jun 6 14:03:07 www kernel: [ 5897.660432] tg3: DEBUG:
>> MAC_TX_STATUS[0000001e] MAC_RX_STATUS[0000000e]
>> : Jun 6 14:03:07 www kernel: [ 5897.660454] tg3: DEBUG:
>> RDMAC_STATUS[00000000] WDMAC_STATUS[00000000]
>> : Jun 6 14:03:07 www kernel: [ 5897.762983] tg3: tg3_stop_block timed
>> out, ofs=1800 enable_bit=2
>> : Jun 6 14:03:07 www kernel: [ 5897.864168] tg3: tg3_stop_block timed
>> out, ofs=c00 enable_bit=2
>> : Jun 6 14:03:07 www kernel: [ 5897.965619] tg3: tg3_stop_block timed
>> out, ofs=4800 enable_bit=2
>>
>> That looks like a driver failure.
>>
>> : Jun 6 14:03:07 www kernel: [ 5898.096689] tg3: eth0: Link is down.
>> : Jun 6 14:03:11 www kernel: [ 5901.633931] tg3: eth0: Link is up at 1000
>> Mbps, full duplex.
>> : Jun 6 14:03:11 www kernel: [ 5901.633937] tg3: eth0: Flow control is on
>> for TX and on for RX.
>> : Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation
>> failure. order:3, mode:0x4020
>> : Jun 6 14:04:11 www kernel: [ 6464.556319] Pid: 24139, comm: clamscan
>> Not tainted 2.6.25.4 #1
>> : Jun 6 14:04:11 www kernel: [ 6464.556325]
>> : Jun 6 14:04:11 www kernel: [ 6464.556326] Call Trace:
>> : Jun 6 14:04:11 www kernel: [ 6464.556329] <IRQ>
>> [__alloc_pages+544/890] __alloc_pages+0x220/0x37a
>> : Jun 6 14:04:11 www kernel: [ 6464.556353] [tcp_v4_do_rcv+186/504]
>> tcp_v4_do_rcv+0xba/0x1f8
>> : Jun 6 14:04:11 www kernel: [ 6464.556359] [ktime_get+12/98]
>> ktime_get+0xc/0x62
>> : Jun 6 14:04:11 www kernel: [ 6464.556364] [__netdev_alloc_skb+23/49]
>> __netdev_alloc_skb+0x17/0x31
>> : Jun 6 14:04:11 www kernel: [ 6464.556370] [__slab_alloc+330/1403]
>> __slab_alloc+0x14a/0x57b
>> : Jun 6 14:04:11 www kernel: [ 6464.556374] [__netdev_alloc_skb+23/49]
>> __netdev_alloc_skb+0x17/0x31
>> : Jun 6 14:04:11 www kernel: [ 6464.556379] [__netdev_alloc_skb+23/49]
>> __netdev_alloc_skb+0x17/0x31
>> : Jun 6 14:04:11 www kernel: [ 6464.556384]
>> [__kmalloc_track_caller+185/190] __kmalloc_track_caller+0xb9/0xbe
>> : Jun 6 14:04:11 www kernel: [ 6464.556391] [__alloc_skb+86/305]
>> __alloc_skb+0x56/0x131
>> : Jun 6 14:04:11 www kernel: [ 6464.556395] [__netdev_alloc_skb+23/49]
>> __netdev_alloc_skb+0x17/0x31
>> : Jun 6 14:04:11 www kernel: [ 6464.556408] [_end+128472975/2130444940]
>> :tg3:tg3_alloc_rx_skb+0x8f/0x17e
>> : Jun 6 14:04:11 www kernel: [ 6464.556418] [_end+128493403/2130444940]
>> :tg3:tg3_poll+0x6e8/0x922
>> : Jun 6 14:04:11 www kernel: [ 6464.556426] [net_rx_action+134/309]
>> net_rx_action+0x86/0x135
>> : Jun 6 14:04:11 www kernel: [ 6464.556433] [__do_softirq+102/212]
>> __do_softirq+0x66/0xd4
>> : Jun 6 14:04:11 www kernel: [ 6464.556439] [call_softirq+28/48]
>> call_softirq+0x1c/0x30
>> : Jun 6 14:04:11 www kernel: [ 6464.556444] [do_softirq+48/107]
>> do_softirq+0x30/0x6b
>> : Jun 6 14:04:11 www kernel: [ 6464.556448] [do_IRQ+114/212]
>> do_IRQ+0x72/0xd4
>> : Jun 6 14:04:11 www kernel: [ 6464.556453] [ret_from_intr+0/10]
>> ret_from_intr+0x0/0xa
>>
>> The driver is trying to do a 32 kbyte GFP_ATOMIC memory allocation.
>> rofl, good luck with that.
>>
>> But the netwoking code sould survive this.
>>
>> <12 billion more page allocation failures>
>>
>> Are you using jumbo frames or have you manually set the MTU to
>> something enormous? Because 32k is a pretty crazy amount of memory for
>> the driver to be trying to allocate - it's going to fail all over the
>> place, as you have discovered.
>
> Same allocation failed problem (among an TCP issue that is nowadays
> fixed) was also reported by Brian Vowell:
> http://bugzilla.kernel.org/show_bug.cgi?id=10767
>
>
> --
> i.
>

--
Thanks,
Oliver

2008-06-07 15:10:03

by Phil Oester

Subject: Re: bug report

On Sat, Jun 07, 2008 at 02:50:06PM +0200, Oliver Pinter wrote:
> [snip]
> Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation
> failure. order:3, mode:0x4020 <------------- this

"Me too". Lots of order 3 allocation failures on e1000 nics
since upgrading some heavy traffic boxes to 2.6.25 (though from 2.6.21,
so unclear on when it began).

Phil

Jun 2 11:11:24 px01 kernel: swapper: page allocation failure. order:3, mode:0x4020
Jun 2 11:11:24 px01 kernel: Pid: 0, comm: swapper Not tainted 2.6.25.2-x86_64.1 #1
Jun 2 11:11:24 px01 kernel:
Jun 2 11:11:24 px01 kernel: Call Trace:
Jun 2 11:11:24 px01 kernel: <IRQ> [<ffffffff8024ce51>] __alloc_pages+0x2dd/0x2f6
Jun 2 11:11:24 px01 kernel: [<ffffffff8026578d>] __slab_alloc+0x16e/0x4f9
Jun 2 11:11:24 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
Jun 2 11:11:25 px01 kernel: [<ffffffff802656d0>] __slab_alloc+0xb1/0x4f9
Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
Jun 2 11:11:25 px01 kernel: [<ffffffff8026685d>] __kmalloc_track_caller+0x82/0xb8
Jun 2 11:11:25 px01 kernel: [<ffffffff8036a124>] __alloc_skb+0x5b/0x121
Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
Jun 2 11:11:25 px01 kernel: [<ffffffff80395261>] tcp_prune_queue+0x1ad/0x21e
Jun 2 11:11:25 px01 kernel: [<ffffffff803954ba>] tcp_data_queue+0x1e8/0xbab
Jun 2 11:11:25 px01 kernel: [<ffffffff803974e8>] tcp_rcv_established+0x64c/0x6fc
Jun 2 11:11:25 px01 kernel: [<ffffffff8039c76f>] tcp_v4_do_rcv+0x2c/0x1b4
Jun 2 11:11:25 px01 kernel: [<ffffffff8039e43e>] tcp_v4_rcv+0x6b3/0x705
Jun 2 11:11:25 px01 kernel: [<ffffffff80384cdc>] ip_local_deliver_finish+0xf6/0x1b3
Jun 2 11:11:25 px01 kernel: [<ffffffff80384bc3>] ip_rcv_finish+0x2bf/0x2e2
Jun 2 11:11:25 px01 kernel: [<ffffffff8023ada3>] getnstimeofday+0x2f/0x83
Jun 2 11:11:25 px01 kernel: [<ffffffff8036e6b2>] netif_receive_skb+0x1af/0x21b
Jun 2 11:11:25 px01 kernel: [<ffffffff8032da6d>] e1000_clean_rx_irq+0x3f3/0x4e7
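
To see how often this happens and at which order, the failure lines can be
tallied from a kernel log. This is a rough sketch, not a polished tool: the
regular expression is written against the two line formats quoted in this
thread (with and without the bracketed timestamp), and the function name is
hypothetical.

```python
import re
import sys
from collections import Counter

# Matches e.g. "swapper: page allocation failure. order:3, mode:0x4020"
FAILURE_RE = re.compile(
    r"(\S+): page allocation failure\. order:(\d+), mode:(0x[0-9a-fA-F]+)"
)


def count_failures(lines):
    """Count page allocation failures per (process name, order)."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            comm, order, _mode = match.groups()
            counts[(comm, int(order))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_failures.py < kern.log
    for (comm, order), n in sorted(count_failures(sys.stdin).items()):
        print(f"{comm} order:{order} failures:{n}")
```

Lines that were wrapped by a mail client, as in some of the quoted copies
in this thread, will not match; run it against the original log file.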

2008-06-07 18:53:39

by Oliver Pinter

Subject: Re: bug report

Hi!

Attila first tried another NIC (rtl8169), and then yet another (3Com
BCM570). Every NIC produced this error.
The full thread is on the Hungarian Unix Portal: http://hup.hu/node/56295

On 6/7/08, Phil Oester <[email protected]> wrote:
> On Sat, Jun 07, 2008 at 02:50:06PM +0200, Oliver Pinter wrote:
>> [snip]
>> Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation
>> failure. order:3, mode:0x4020 <------------- this
>
> "Me too". Lots of order 3 allocation failures on e1000 nics
> since upgrading some heavy traffic boxes to 2.6.25 (though from 2.6.21,
> so unclear on when it began).
>
> Phil
>
> Jun 2 11:11:24 px01 kernel: swapper: page allocation failure. order:3,
> mode:0x4020
> Jun 2 11:11:24 px01 kernel: Pid: 0, comm: swapper Not tainted
> 2.6.25.2-x86_64.1 #1
> Jun 2 11:11:24 px01 kernel:
> Jun 2 11:11:24 px01 kernel: Call Trace:
> Jun 2 11:11:24 px01 kernel: <IRQ> [<ffffffff8024ce51>]
> __alloc_pages+0x2dd/0x2f6
> Jun 2 11:11:24 px01 kernel: [<ffffffff8026578d>] __slab_alloc+0x16e/0x4f9
> Jun 2 11:11:24 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
> Jun 2 11:11:25 px01 kernel: [<ffffffff802656d0>] __slab_alloc+0xb1/0x4f9
> Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
> Jun 2 11:11:25 px01 kernel: [<ffffffff8026685d>]
> __kmalloc_track_caller+0x82/0xb8
> Jun 2 11:11:25 px01 kernel: [<ffffffff8036a124>] __alloc_skb+0x5b/0x121
> Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
> Jun 2 11:11:25 px01 kernel: [<ffffffff80395261>]
> tcp_prune_queue+0x1ad/0x21e
> Jun 2 11:11:25 px01 kernel: [<ffffffff803954ba>]
> tcp_data_queue+0x1e8/0xbab
> Jun 2 11:11:25 px01 kernel: [<ffffffff803974e8>]
> tcp_rcv_established+0x64c/0x6fc
> Jun 2 11:11:25 px01 kernel: [<ffffffff8039c76f>] tcp_v4_do_rcv+0x2c/0x1b4
> Jun 2 11:11:25 px01 kernel: [<ffffffff8039e43e>] tcp_v4_rcv+0x6b3/0x705
> Jun 2 11:11:25 px01 kernel: [<ffffffff80384cdc>]
> ip_local_deliver_finish+0xf6/0x1b3
> Jun 2 11:11:25 px01 kernel: [<ffffffff80384bc3>] ip_rcv_finish+0x2bf/0x2e2
> Jun 2 11:11:25 px01 kernel: [<ffffffff8023ada3>] getnstimeofday+0x2f/0x83
> Jun 2 11:11:25 px01 kernel: [<ffffffff8036e6b2>]
> netif_receive_skb+0x1af/0x21b
> Jun 2 11:11:25 px01 kernel: [<ffffffff8032da6d>]
> e1000_clean_rx_irq+0x3f3/0x4e7
>
>

--
Thanks,
Oliver

2008-06-08 12:06:18

by Zsiros Attila

Subject: Re: bug report

Hi!

Here is a newer log:
http://www.cyberszeg.hu/log/kern2.log

Zsirmo

Oliver Pinter wrote:
> Hi!
>
> Attila first use one other nic (rtl8169), and after one other (3com
> BCM570). All NIC producted this error...
> the full thread in Hungarian Unix Portal: http://hup.hu/node/56295
>
> On 6/7/08, Phil Oester <[email protected]> wrote:
>
>> On Sat, Jun 07, 2008 at 02:50:06PM +0200, Oliver Pinter wrote:
>>
>>> [snip]
>>> Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation
>>> failure. order:3, mode:0x4020 <------------- this
>>>
>> "Me too". Lots of order 3 allocation failures on e1000 nics
>> since upgrading some heavy traffic boxes to 2.6.25 (though from 2.6.21,
>> so unclear on when it began).
>>
>> Phil
>>
>> Jun 2 11:11:24 px01 kernel: swapper: page allocation failure. order:3,
>> mode:0x4020
>> Jun 2 11:11:24 px01 kernel: Pid: 0, comm: swapper Not tainted
>> 2.6.25.2-x86_64.1 #1
>> Jun 2 11:11:24 px01 kernel:
>> Jun 2 11:11:24 px01 kernel: Call Trace:
>> Jun 2 11:11:24 px01 kernel: <IRQ> [<ffffffff8024ce51>]
>> __alloc_pages+0x2dd/0x2f6
>> Jun 2 11:11:24 px01 kernel: [<ffffffff8026578d>] __slab_alloc+0x16e/0x4f9
>> Jun 2 11:11:24 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
>> Jun 2 11:11:25 px01 kernel: [<ffffffff802656d0>] __slab_alloc+0xb1/0x4f9
>> Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8026685d>]
>> __kmalloc_track_caller+0x82/0xb8
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8036a124>] __alloc_skb+0x5b/0x121
>> Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>] tcp_collapse+0x164/0x394
>> Jun 2 11:11:25 px01 kernel: [<ffffffff80395261>]
>> tcp_prune_queue+0x1ad/0x21e
>> Jun 2 11:11:25 px01 kernel: [<ffffffff803954ba>]
>> tcp_data_queue+0x1e8/0xbab
>> Jun 2 11:11:25 px01 kernel: [<ffffffff803974e8>]
>> tcp_rcv_established+0x64c/0x6fc
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8039c76f>] tcp_v4_do_rcv+0x2c/0x1b4
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8039e43e>] tcp_v4_rcv+0x6b3/0x705
>> Jun 2 11:11:25 px01 kernel: [<ffffffff80384cdc>]
>> ip_local_deliver_finish+0xf6/0x1b3
>> Jun 2 11:11:25 px01 kernel: [<ffffffff80384bc3>] ip_rcv_finish+0x2bf/0x2e2
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8023ada3>] getnstimeofday+0x2f/0x83
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8036e6b2>]
>> netif_receive_skb+0x1af/0x21b
>> Jun 2 11:11:25 px01 kernel: [<ffffffff8032da6d>]
>> e1000_clean_rx_irq+0x3f3/0x4e7
>>
>>
>>
>
>
>

2008-06-09 17:04:42

by Oliver Pinter

Subject: Re: bug report

Another case, but with nvidia:

http://frugalware.org/paste/2639

On 6/8/08, Zsiros Attila <[email protected]> wrote:
> Hy!
>
> Newer log
> http://www.cyberszeg.hu/log/kern2.log
>
> Zsirmo
>
> Oliver Pinter wrote:
>> Hi!
>>
>> Attila first use one other nic (rtl8169), and after one other (3com
>> BCM570). All NIC producted this error...
>> the full thread in Hungarian Unix Portal: http://hup.hu/node/56295
>>
>> On 6/7/08, Phil Oester <[email protected]> wrote:
>>
>>> On Sat, Jun 07, 2008 at 02:50:06PM +0200, Oliver Pinter wrote:
>>>
>>>> [snip]
>>>> Jun 6 14:04:11 www kernel: [ 6464.556309] clamscan: page allocation
>>>> failure. order:3, mode:0x4020 <------------- this
>>>>
>>> "Me too". Lots of order 3 allocation failures on e1000 nics
>>> since upgrading some heavy traffic boxes to 2.6.25 (though from 2.6.21,
>>> so unclear on when it began).
>>>
>>> Phil
>>>
>>> Jun 2 11:11:24 px01 kernel: swapper: page allocation failure. order:3,
>>> mode:0x4020
>>> Jun 2 11:11:24 px01 kernel: Pid: 0, comm: swapper Not tainted
>>> 2.6.25.2-x86_64.1 #1
>>> Jun 2 11:11:24 px01 kernel:
>>> Jun 2 11:11:24 px01 kernel: Call Trace:
>>> Jun 2 11:11:24 px01 kernel: <IRQ> [<ffffffff8024ce51>]
>>> __alloc_pages+0x2dd/0x2f6
>>> Jun 2 11:11:24 px01 kernel: [<ffffffff8026578d>]
>>> __slab_alloc+0x16e/0x4f9
>>> Jun 2 11:11:24 px01 kernel: [<ffffffff80394e84>]
>>> tcp_collapse+0x164/0x394
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff802656d0>]
>>> __slab_alloc+0xb1/0x4f9
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>]
>>> tcp_collapse+0x164/0x394
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8026685d>]
>>> __kmalloc_track_caller+0x82/0xb8
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8036a124>] __alloc_skb+0x5b/0x121
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff80394e84>]
>>> tcp_collapse+0x164/0x394
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff80395261>]
>>> tcp_prune_queue+0x1ad/0x21e
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff803954ba>]
>>> tcp_data_queue+0x1e8/0xbab
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff803974e8>]
>>> tcp_rcv_established+0x64c/0x6fc
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8039c76f>]
>>> tcp_v4_do_rcv+0x2c/0x1b4
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8039e43e>] tcp_v4_rcv+0x6b3/0x705
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff80384cdc>]
>>> ip_local_deliver_finish+0xf6/0x1b3
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff80384bc3>]
>>> ip_rcv_finish+0x2bf/0x2e2
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8023ada3>]
>>> getnstimeofday+0x2f/0x83
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8036e6b2>]
>>> netif_receive_skb+0x1af/0x21b
>>> Jun 2 11:11:25 px01 kernel: [<ffffffff8032da6d>]
>>> e1000_clean_rx_irq+0x3f3/0x4e7
>>>
>>>
>>>
>>
>>
>>
>
>

--
Thanks,
Oliver