2008-12-15 17:49:38

by Artur Skawina

[permalink] [raw]
Subject: wireless-testing, p54 and sinus 154 data no longer works

I have a sinus 154 data usb adapter, which used to work w/ p54 in 2.6.27:

> Dec 13 13:21:30 kernel: usb 1-7: new high speed USB device using ehci_hcd and address 22
> Dec 13 13:21:31 kernel: usb 1-7: configuration #1 chosen from 1 choice
> Dec 13 13:21:31 kernel: firmware: requesting isl3890usb
> Dec 13 13:21:31 kernel: p54: LM86 firmware
> Dec 13 13:21:31 kernel: p54: FW rev 2.5.6.0 - Softmac protocol 3.0
> Dec 13 13:21:33 kernel: p54: unknown eeprom code : 0x1
> Dec 13 13:21:33 kernel: p54: unknown eeprom code : 0x3
> Dec 13 13:21:33 kernel: p54: unknown eeprom code : 0x1905
> Dec 13 13:21:33 kernel: p54: unknown eeprom code : 0x1007
> Dec 13 13:21:33 kernel: p54: unknown eeprom code : 0x1008
> Dec 13 13:21:33 kernel: p54: unknown eeprom code : 0x1100
> Dec 13 13:21:33 kernel: phy9: Selected rate control algorithm 'pid'
> Dec 13 13:21:33 kernel: phy9: hwaddr 00:30:f1:12:34:56, isl3886
> Dec 13 13:21:33 kernel: udev: renamed network interface wlan0 to wlan1
> Dec 13 13:21:33 kernel: usb 1-7: New USB device found, idVendor=0846, idProduct=4200
> Dec 13 13:21:33 kernel: usb 1-7: New USB device strings: Mfr=0, Product=0, SerialNumber=0

but does not work w/ current wireless-testing.git/pending:

> Dec 15 17:43:17 kernel: usb 1-7: new high speed USB device using ehci_hcd and address 29
> Dec 15 17:43:17 kernel: usb 1-7: configuration #1 chosen from 1 choice
> Dec 15 17:43:17 kernel: usb 1-7: firmware: requesting isl3886usb
> Dec 15 17:43:17 kernel: phy16: p54 detected a LM86 firmware
> Dec 15 17:43:17 kernel: phy16: FW rev 2.5.6.0 - Softmac protocol 3.0
> Dec 15 17:43:17 kernel: phy16: you are using an obsolete firmware. visit http://wireless.kernel.org/en/users/Drivers/p54 and grab one for "kernel >= 2.6.28"!
> Dec 15 17:43:17 kernel: phy16: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:no
> Dec 15 17:43:20 kernel: phy16: device does not respond!
> Dec 15 17:43:20 kernel: p54usb: probe of 1-7:1.0 failed with error -16
> Dec 15 17:43:20 kernel: usb 1-7: New USB device found, idVendor=0846, idProduct=4200
> Dec 15 17:43:20 kernel: usb 1-7: New USB device strings: Mfr=0, Product=0, SerialNumber=0

Using different firmwares only changes the cryptographic-accelerator line (CCMP:yes) [1],
but i still get the "device does not respond!" error.

artur

> Dec 15 17:06:29 kernel: usb 1-7: new high speed USB device using ehci_hcd and address 17
> Dec 15 17:06:29 kernel: usb 1-7: configuration #1 chosen from 1 choice
> Dec 15 17:06:29 kernel: usb 1-7: firmware: requesting isl3886usb
> Dec 15 17:06:30 kernel: phy4: p54 detected a LM86 firmware
> Dec 15 17:06:30 kernel: phy4: FW rev 2.13.1.0 - Softmac protocol 5.5
> Dec 15 17:06:30 kernel: phy4: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:YES
> Dec 15 17:06:32 kernel: phy4: device does not respond!
> Dec 15 17:06:32 kernel: p54usb: probe of 1-7:1.0 failed with error -16
> Dec 15 17:06:32 kernel: usb 1-7: New USB device found, idVendor=0846, idProduct=4200
> Dec 15 17:06:32 kernel: usb 1-7: New USB device strings: Mfr=0, Product=0, SerialNumber=0



2008-12-15 20:20:33

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Monday 15 December 2008 19:41:13 Larry Finger wrote:
>> Artur Skawina wrote:
>>> I have a sinus 154 data usb adapter, which used to work w/ p54 in 2.6.27:
>> Can you bisect this problem?
> Naa I guess its "p54: fix memory management".
> 872037115977dc07ae05e4b6b769bf8726da4e87
>
> so tell us if this patch works.

Nope, the attached patch did not improve things, as the log below shows.
I'll try bisecting.

Thanks,

artur

> Dec 15 21:11:08 kernel: usb 1-7: new high speed USB device using ehci_hcd and address 12
> Dec 15 21:11:09 kernel: usb 1-7: configuration #1 chosen from 1 choice
> Dec 15 21:11:09 kernel: cfg80211: Using static regulatory domain info
> Dec 15 21:11:09 kernel: cfg80211: Regulatory domain: US
> Dec 15 21:11:09 kernel: ^I(start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
> Dec 15 21:11:09 kernel: ^I(2402000 KHz - 2472000 KHz @ 40000 KHz), (600 mBi, 2700 mBm)
> Dec 15 21:11:09 kernel: ^I(5170000 KHz - 5190000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
> Dec 15 21:11:09 kernel: ^I(5190000 KHz - 5210000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
> Dec 15 21:11:09 kernel: ^I(5210000 KHz - 5230000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
> Dec 15 21:11:09 kernel: ^I(5230000 KHz - 5330000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
> Dec 15 21:11:09 kernel: ^I(5735000 KHz - 5835000 KHz @ 40000 KHz), (600 mBi, 3000 mBm)
> Dec 15 21:11:09 kernel: cfg80211: Calling CRDA for country: US
> Dec 15 21:11:09 kernel: usb 1-7: New USB device found, idVendor=0846, idProduct=4200
> Dec 15 21:11:09 kernel: usb 1-7: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> Dec 15 21:11:09 kernel: usb 1-7: firmware: requesting isl3886usb
> Dec 15 21:11:09 kernel: phy0: p54 detected a LM86 firmware
> Dec 15 21:11:09 kernel: phy0: FW rev 2.5.6.0 - Softmac protocol 3.0
> Dec 15 21:11:09 kernel: phy0: you are using an obsolete firmware. visit http://wireless.kernel.org/en/users/Drivers/p54 and grab one for "kernel >= 2.6.28"!
> Dec 15 21:11:09 kernel: phy0: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:no
> Dec 15 21:11:12 kernel: phy0: device does not respond!
> Dec 15 21:11:12 kernel: p54usb: probe of 1-7:1.0 failed with error -16
> Dec 15 21:11:12 kernel: usbcore: registered new interface driver p54usb

2008-12-15 23:24:32

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Tuesday 16 December 2008 00:03:42 Artur Skawina wrote:
> Artur Skawina wrote:
> > Christian Lamparter wrote:
> >> On Monday 15 December 2008 19:41:13 Larry Finger wrote:
> >>> Artur Skawina wrote:
> >>>> I have a sinus 154 data usb adapter, which used to work w/ p54 in 2.6.27:
> >>> Can you bisect this problem?
> >> Naa I guess its "p54: fix memory management".

> Done:
> 78e74789bf2dab82f9539cd46e7f7583ff9516c1 is first bad commit
> commit 78e74789bf2dab82f9539cd46e7f7583ff9516c1
> Author: Christian Lamparter <[email protected]>
> Date: Wed Oct 15 04:07:16 2008 +0200
>
> p54: fix memory management

ahh, well there went the 100% chance...

Anyway can you fire up usbmon
(see Linux-Source/Documentation/usb/usbmon.txt ,
or right there: http://www.mjmwired.net/kernel/Documentation/usb/usbmon.txt)
and capture the frames which are sent to the device?
for the p54usb in 2.6.27 and the not working one in the git tree?

Regards,
Chr

2008-12-15 19:43:20

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Monday 15 December 2008 19:41:13 Larry Finger wrote:
> Artur Skawina wrote:
> > I have a sinus 154 data usb adapter, which used to work w/ p54 in 2.6.27:
>
> Can you bisect this problem?
Naa I guess its "p54: fix memory management".
872037115977dc07ae05e4b6b769bf8726da4e87

so tell us if this patch works.

Regards,
Chr


Attachments:
(No filename) (326.00 B)
p54usb.patch (1.03 kB)
Download all attachments

2008-12-15 23:04:09

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Christian Lamparter wrote:
>> On Monday 15 December 2008 19:41:13 Larry Finger wrote:
>>> Artur Skawina wrote:
>>>> I have a sinus 154 data usb adapter, which used to work w/ p54 in 2.6.27:
>>> Can you bisect this problem?
>> Naa I guess its "p54: fix memory management".
>> 872037115977dc07ae05e4b6b769bf8726da4e87
>>
>> so tell us if this patch works.
>
> Nope, the attached patch did not improve things, as the log below shows.
> I'll try bisecting.

Done:
78e74789bf2dab82f9539cd46e7f7583ff9516c1 is first bad commit
commit 78e74789bf2dab82f9539cd46e7f7583ff9516c1
Author: Christian Lamparter <[email protected]>
Date: Wed Oct 15 04:07:16 2008 +0200

p54: fix memory management

artur

2008-12-15 18:41:17

by Larry Finger

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> I have a sinus 154 data usb adapter, which used to work w/ p54 in 2.6.27:

Can you bisect this problem?

Larry

2009-01-15 20:06:29

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Thursday 15 January 2009 20:12:03 Artur Skawina wrote:
>>>> Upgraded to current wireless-testing/pending on the problematic box
>>>> and almost immediately got [1]. No slab corruption this time (at least
>>>> not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.

>>> yes that's a bug, p54_set_tim must use GFP_ATOMIC, do you want to post the patch?
>> nah, too trivial :) fwiw i used this:
> well, it should be "done", right? :-)
>
> just add "John W. Linville" <[email protected]>" to the cc,
> change the subject line to "[PATCH] p54: set_tim must be atomic."
> and put a Signed-off-by: your name <your@mail> in the mail body and a "---" to seperate the commit message
> and then the patch. And it will be in the next round of -rcX fixes.

well, i really meant the 'too trivial' part and was hoping you'd do it ;)
ok, will resend.

Is this something that could interfere w/ testing:
> mac80211-phy0: failed to remove key (0, 00:1b:fb:12:34:56) from hardware (-12)
?

thanks,

artur

2009-01-20 20:18:54

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Tuesday 20 January 2009 00:17:13 Artur Skawina wrote:
>> Artur Skawina wrote:
>>> yes, that's what i was thinking. Will FREE_AFTER_TX(skb) be false for all skbs
>>> that are dropped in p54_rx_frame_sent()?
> Not exactly, but it's the right direction. FREE_AFTER_TX is true for _setting_ control frames... e.g frequency/filter/dcf.

Which means "yes", i guess. Because if it were ever to be true the race would
reappear, and given how hard it was to reproduce, i'd rather not take any
chances (in all the testing i hit the BUG in skb_pull only twice).

> if your device finally works. (Well "works" is maybe a bit too much,
> as you'll probably run into other bugs, e.g truesize bug or the "shot rc", or
> more problems with powersaving station that only have apsd).

It seems to work quite well, i'm seeing the occasional truesize warning and
haven't yet figured out why the client won't reconnect before hostapds
ap_max_inactivity timeout triggers, but other than that everything seems fine.

Thanks,

artur

2009-01-19 18:15:13

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
>> Artur Skawina wrote:
>>> didn't trigger anything here, just the usual:
>>>
>>> BUG kmalloc-4096: Poison overwritten

>> This is almost 100% reproducible; sometimes the machine freezes instead.

Here's an interesting sequence:

1) a TX urb is submitted.
2) p54u_rx_cb() => p54_rx_frame_sent(), which does kfree_skb( the_skb_in_(1) ).
3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kaboom.

IOW the skb is freed before the usb completion runs.

Somehow i don't think this is the reason for the corruption, but it certainly
seems to be responsible for some, if not all, of the crashes/panics.

artur

2009-01-19 18:53:01

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

> Here's an interesting sequence:
>
> 1) a TX urb is submitted.
> 2) p54u_rx_cb() => p54_rx_frame_sent(), which does kfree_skb( the_skb_in_(1) ).
> 3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kaboom.
>
> IOW the skb is freed before the usb completion runs.
>
> Somehow i don't think this is the reason for the corruption, but it certainly
> seems to be responsible for some, if not all, of the crashes/panics.

Hmm, if it happens for beacons, it's probably also happening for other frames,
which are not freed, but given to ieee80211_tx_status_irqsafe() and that could
explain the corruption. Will verify, but probably not before tomorrow.

artur

2009-01-18 23:32:49

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Christian Lamparter wrote:
>> On Friday 16 January 2009 10:13:07 Johannes Berg wrote:
>>> On Thu, 2009-01-15 at 21:18 -0600, Larry Finger wrote:
>>>
>>>>>> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk
>>>> I too have seen real single bit changes - in my case 6b went to 6a,
>>>> and my memory is fine. I wouldn't necessarily blame your hardware.
>>> 6b to 6a is often the result of a refcounting bug that happens to unref
>>> a value _after_ it has been freed. But that doesn't explain 6b to 69,
>>> unless you happen to have _two_ refcounting bugs. Not that I necessarily
>>> think that memory is bad
>> Well, this idiotic debug patch (kref-kernel-debug-patch) could shed some light into
>> the problem who's using a freed skb.
>
> didn't trigger anything here, just the usual:
>
> BUG kmalloc-4096: Poison overwritten

I've run memtest, even swapped both ram and cpu, and symptoms stay the same, so it's
very likely not bad hw.

Still haven't found the corruptor, but at least i've narrowed it down a bit;
what i'm seeing is:

1) an skb "S" gets allocated in p54u_rx_cb and is submitted together w/ the urb.
2) "S" later comes back to p54u_rx_cb, where it is given to p54_rx (eventually
ieee80211_rx_irqsafe) and a new one is allocated.
3) a few (~15) rx/tx packets pass.
4) SLUB detects modified poison in what used to be S->head in (1) and (2) above;
usually 0x6b turns into 0x6a, but i have also seen 0x69, just a few times.
(the offset from skb->head to the decremented byte seems to stay the same,
at least during the few times i tried w/ the same kernel, last one was eg
684 bytes)

This is almost 100% reproducible; sometimes the machine freezes instead.

artur

2009-01-20 20:50:14

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Tuesday 20 January 2009 21:18:48 Artur Skawina wrote:
> Christian Lamparter wrote:
> > On Tuesday 20 January 2009 00:17:13 Artur Skawina wrote:
> > if your device finally works. (Well "works" is maybe a bit too much,
> > as you'll probably run into other bugs, e.g truesize bug or the "shot rc", or
> > more problems with powersaving station that only have apsd).
>
> It seems to work quite well, i'm seeing the occasional truesize warning and
> haven't yet figured out why the client won't reconnect before hostapds
> ap_max_inactivity timeout triggers, but other than that everything seems fine.

So, p54usb first generation needs the ZERO_PACKET flag as well?
(e.g a patch like the attached p54usb-net2280-zeropacket.diff ?)

Regards,
Chr


Attachments:
(No filename) (748.00 B)
p54usb-net2280-zeropacket.diff (831.00 B)
Download all attachments

2009-01-13 22:47:08

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

(added lkml - so please keep the CC!)

On Tuesday 13 January 2009 22:39:00 Artur Skawina wrote:
> Artur Skawina wrote:
> >>> The machine has 512M, ~100M should be (usually is) free, is under constant light
> >>> load (typically <2k ints/s, 60% idle) and is running fine for weeks/months between
> >>> reboots, but locks up after only a few packets go over the hostap driven
> >>> p54usb device. I need the box to be up, that limits the number of tests i can
> >>> run, at least as long as the lockups w/o any diagnostics happen...
> >> Do keyboard-leds "flash" when it locks up, or does console respond
> >> if you press alt-sysrq-m / alt-sysrq-w on the connected keyboard?
> >
> > most of the times it happened there was no kbd attached. At least once
> > when it _was_ connected, sysrq was working, and i saw 0*8KB; that's why
> > i initially suspected fragmentation.
> >
> >> ( If your box has a serial port, you can try to get the logs from there... )
>
> after switching from SLUB to SLAB and enabling some debugging i finally caught this:

arg, that's not good... I hoped for a obvious BUG in p54, or mac80211. But not in the other part of the kernel.
I've no idea what's going on in the timer/mm part (but maybe someone else @ lkml ??!)
since "cache_free_debugcheck" has about 3 (well, there are 4, but the first one is unlikely) BUG_ON?

This smells like a memory corruption. Have you tried to enable CONFIG_DEBUG_SLAB?
Is this related to the "truesize bug", Or how long does the box survive if you don't allow named to bind/listen to wlanX ?

> ------------[ cut here ]------------
> Kernel BUG at c016a8a3 [verbose debug info unavailable]
> invalid opcode: 0000 [#1]
> last sysfs file: /sys/devices/pci0000:00/0000:00:07.2/usb1/1-1/1-1.1/uevent
> Modules linked in: netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 budget videobuf_dma_sg budget_ci budget_core saa7146 ttpci_eeprom videobuf_core tveeprom serio_raw ir_common [last unloaded: netconsole]
>
> Pid: 1885, comm: named Not tainted (2.6.28-rc8-00519-g90435df #42)
> EIP: 0060:[<c016a8a3>] EFLAGS: 00210012 CPU: 0
> EIP is at cache_free_debugcheck+0x203/0x250
> EAX: dfb6c71f EBX: df803d20 ECX: dfb6c03f EDX: 00000002
> ESI: dfb6c720 EDI: 00000370 EBP: c1000000 ESP: c0669f74
> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process named (pid: 1885, ti=c0669000 task=df8443d0 task.ti=deb85000)
> Stack:
> 00000000 df809660 d31d4528 00000003 00000000 00000002 c137c440 c060e2dc
> c01483e2 dfb6c000 df808d38 df803d20 c069cb40 00200286 c016a911 00000000
> 00000005 c069cb40 00000009 c01483e2 00000020 00000001 00000100 c014850f
> Call Trace:
> [<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
> [<c016a911>] kmem_cache_free+0x21/0x60
> [<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
> [<c014850f>] rcu_process_callbacks+0xf/0x20
> [<c0127a37>] __do_softirq+0x57/0xf0
> [<c01279e0>] __do_softirq+0x0/0xf0
> <IRQ> <0> [<c01277e5>] irq_exit+0x45/0x70
> [<c0112590>] smp_apic_timer_interrupt+0x40/0x70
> [<c0103d9c>] apic_timer_interrupt+0x28/0x30
> Code: 8b 44 24 24 b9 fe ff ff ff 89 4c 90 1c f6 43 19 08 74 0e b9 6b 00 00 00 89 f2 89 d8 e8 e7 fa ff ff 83 c4 28 89 f0 5b 5e 5f 5d c3 <0f> 0b eb fe 0f 0b eb fe 8b 43 10 8d 44 06 f8 8d b6 00 00 00 00
> EIP: [<c016a8a3>] cache_free_debugcheck+0x203/0x250 SS:ESP 0068:c0669f74
> Kernel panic - not syncing: Fatal exception in interrupt
>
> followed after some time by lots of page alloc failures [1].
>
> artur
>
> [1]
> [...]
> __ratelimit: 1551 callbacks suppressed
> named: page allocation failure. order:0, mode:0x20
> Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
> Call Trace:
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> named: page allocation failure. order:0, mode:0x20
> Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
> Call Trace:
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c016b573>] cache_alloc_refill+0x363/0x710
> [<c03a52c4>] __alloc_skb+0x34/0x120
> [<c016bcc1>] kmem_cache_alloc+0xe1/0xf0
> [<c03a52c4>] __alloc_skb+0x34/0x120
> [<c03b8205>] find_skb+0x35/0x90
> [<c03b840e>] netpoll_send_udp+0x2e/0x200
> [<e33661ad>] write_msg+0x9d/0xe0 [netconsole]
> [<e3366110>] write_msg+0x0/0xe0 [netconsole]
> [<c0123443>] __call_console_drivers+0x43/0x50
> [<c01238bb>] release_console_sem+0x13b/0x1c0
> [<c0123dd7>] vprintk+0x227/0x2d0
> [<c0123443>] __call_console_drivers+0x43/0x50
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c04c30c0>] printk+0x17/0x1f
> [<c0105909>] print_trace_address+0x49/0x60
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c01059a4>] dump_trace+0x84/0x100
> [<c0105fde>] show_trace+0x4e/0x60
> [<c04c2fc1>] dump_stack+0x6e/0x73
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c016b573>] cache_alloc_refill+0x363/0x710
> [<c03a52c4>] __alloc_skb+0x34/0x120
> [<c03a539e>] __alloc_skb+0x10e/0x120
> [<c016ba6e>] __kmalloc_track_caller+0x14e/0x160
> [<c016bc53>] kmem_cache_alloc+0x73/0xf0
> [<c03a5da9>] dev_alloc_skb+0x19/0x30
> [<c03a52e5>] __alloc_skb+0x55/0x120
> [<c03a5da9>] dev_alloc_skb+0x19/0x30
> [<c02ced8e>] boomerang_rx+0x15e/0x520
> [<c02d04cf>] boomerang_interrupt+0x13f/0x480
> [<e109d6a9>] budget_ci_irq+0xa9/0x100 [budget_ci]
> [<c0103d9c>] apic_timer_interrupt+0x28/0x30
> [<c0146348>] handle_IRQ_event+0x28/0x50
> [<c0147600>] handle_level_irq+0x0/0xb0
> [<c014764b>] handle_level_irq+0x4b/0xb0
> <IRQ> [<c0103d6f>] common_interrupt+0x23/0x28
> [<c024007b>] prio_tree_right+0xab/0x100
> [<c02442f7>] delay_tsc+0x17/0x20
> [<c0244298>] __const_udelay+0x18/0x20
> [<c04c304a>] panic+0x84/0xe3
> [<c010584c>] oops_end+0x7c/0x90
> [<c01045d0>] do_invalid_op+0x0/0xa0
> [<c0104651>] do_invalid_op+0x81/0xa0
> [<c016a8a3>] cache_free_debugcheck+0x203/0x250
> [<c011d233>] __wake_up_common+0x43/0x70
> [<c04c4b82>] error_code+0x6a/0x70
> [<c016a8a3>] cache_free_debugcheck+0x203/0x250
> [<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
> [<c016a911>] kmem_cache_free+0x21/0x60
> [<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
> [<c014850f>] rcu_process_callbacks+0xf/0x20
> [<c0127a37>] __do_softirq+0x57/0xf0
> [<c01279e0>] __do_softirq+0x0/0xf0
> <IRQ> [<c01277e5>] irq_exit+0x45/0x70
> [<c0112590>] smp_apic_timer_interrupt+0x40/0x70
> [<c0103d9c>] apic_timer_interrupt+0x28/0x30
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 174
> Active_anon:13626 active_file:3702 inactive_anon:11682
> inactive_file:91928 unevictable:5 dirty:48 writeback:0 unstable:0
> free:737 slab:3377 mapped:2606 pagetables:219 bounce:0
> DMA free:2004kB min:84kB low:104kB high:124kB active_anon:24kB inactive_anon:28kB active_file:104kB inactive_file:8164kB unevictable:0kB present:15872kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 492 492
> Normal free:944kB min:2792kB low:3488kB high:4188kB active_anon:54480kB inactive_anon:46700kB active_file:14704kB inactive_file:359548kB unevictable:20kB present:503928kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2004kB
> Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 944kB
> 95760 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 530104kB
> Total swap = 530104kB
> 131070 pages RAM
> 2635 pages reserved
> 10978 pages shared
> 121856 pages non-shared
> named: page allocation failure. order:0, mode:0x20
> Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
> Call Trace:
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c016b573>] cache_alloc_refill+0x363/0x710
> [<c03a52c4>] __alloc_skb+0x34/0x120
> [<c016bcc1>] kmem_cache_alloc+0xe1/0xf0
> [<c03a52c4>] __alloc_skb+0x34/0x120
> [<c03b739b>] refill_skbs+0x5b/0x70
> [<c03b81e9>] find_skb+0x19/0x90
> [<c0266d90>] bit_cursor+0x0/0x610
> [<c03b840e>] netpoll_send_udp+0x2e/0x200
> [<e33661ad>] write_msg+0x9d/0xe0 [netconsole]
> [<e3366110>] write_msg+0x0/0xe0 [netconsole]
> [<c0123443>] __call_console_drivers+0x43/0x50
> [<c01238bb>] release_console_sem+0x13b/0x1c0
> [<c0123dd7>] vprintk+0x227/0x2d0
> [<c0123443>] __call_console_drivers+0x43/0x50
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c04c30c0>] printk+0x17/0x1f
> [<c0105909>] print_trace_address+0x49/0x60
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c01059a4>] dump_trace+0x84/0x100
> [<c0105fde>] show_trace+0x4e/0x60
> [<c04c2fc1>] dump_stack+0x6e/0x73
> [<c01505cd>] __alloc_pages_internal+0x35d/0x470
> [<c016b573>] cache_alloc_refill+0x363/0x710
> [<c03a52c4>] __alloc_skb+0x34/0x120
> [<c03a539e>] __alloc_skb+0x10e/0x120
> [<c016ba6e>] __kmalloc_track_caller+0x14e/0x160
> [<c016bc53>] kmem_cache_alloc+0x73/0xf0
> [<c03a5da9>] dev_alloc_skb+0x19/0x30
> [<c03a52e5>] __alloc_skb+0x55/0x120
> [<c03a5da9>] dev_alloc_skb+0x19/0x30
> [<c02ced8e>] boomerang_rx+0x15e/0x520
> [<c02d04cf>] boomerang_interrupt+0x13f/0x480
> [<e109d6a9>] budget_ci_irq+0xa9/0x100 [budget_ci]
> [<c0103d9c>] apic_timer_interrupt+0x28/0x30
> [<c0146348>] handle_IRQ_event+0x28/0x50
> [<c0147600>] handle_level_irq+0x0/0xb0
> [<c014764b>] handle_level_irq+0x4b/0xb0
> <IRQ> [<c0103d6f>] common_interrupt+0x23/0x28
> [<c024007b>] prio_tree_right+0xab/0x100
> [<c02442f7>] delay_tsc+0x17/0x20
> [<c0244298>] __const_udelay+0x18/0x20
> [<c04c304a>] panic+0x84/0xe3
> [<c010584c>] oops_end+0x7c/0x90
> [<c01045d0>] do_invalid_op+0x0/0xa0
> [<c0104651>] do_invalid_op+0x81/0xa0
> [<c016a8a3>] cache_free_debugcheck+0x203/0x250
> [<c011d233>] __wake_up_common+0x43/0x70
> [<c04c4b82>] error_code+0x6a/0x70
> [<c016a8a3>] cache_free_debugcheck+0x203/0x250
> [<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
> [<c016a911>] kmem_cache_free+0x21/0x60
> [<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
> [<c014850f>] rcu_process_callbacks+0xf/0x20
> [<c0127a37>] __do_softirq+0x57/0xf0
> [<c01279e0>] __do_softirq+0x0/0xf0
> <IRQ> [<c01277e5>] irq_exit+0x45/0x70
> [<c0112590>] smp_apic_timer_interrupt+0x40/0x70
> [<c0103d9c>] apic_timer_interrupt+0x28/0x30
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 174
> Active_anon:13626 active_file:3702 inactive_anon:11682
> inactive_file:91928 unevictable:5 dirty:48 writeback:0 unstable:0
> free:737 slab:3377 mapped:2606 pagetables:219 bounce:0
> DMA free:2004kB min:84kB low:104kB high:124kB active_anon:24kB inactive_anon:28kB active_file:104kB inactive_file:8164kB unevictable:0kB present:15872kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 492 492
> Normal free:944kB min:2792kB low:3488kB high:4188kB active_anon:54480kB inactive_anon:46700kB active_file:14704kB inactive_file:359548kB unevictable:20kB present:503928kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2004kB
> Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 944kB
> 95760 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 530104kB
> Total swap = 530104kB
> 131070 pages RAM
> 2635 pages reserved
> 10978 pages shared
> 121856 pages non-shared
> named: page allocation failure. order:0, mode:0x20
> [...]
>

2009-01-13 18:06:09

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Tuesday 13 January 2009 17:45:08 Artur Skawina wrote:
> Christian Lamparter wrote:
> > On Monday 12 January 2009 18:09:47 Artur Skawina wrote:
> >> kernel: phy0: p54 detected a LM86 firmware
> >> kernel: p54: rx_mtu reduced from 3240 to 2392
> >> kernel: phy0: FW rev 2.13.1.0 - Softmac protocol 5.5
> > Well, I guess you got the latest wireless-testing tree, right?
[...]
> will do, after upgrading the kernel (could take a few days).
hmm... OT: who old is your hostapd version? As far as I can tell AP-mode is still WIP.
Maybe you can save you alot trouble if your mobile can connect to ad-hoc wifis.

> >> so to test the device further i set up hostapd, which mostly worked; after
> >> a few tweaks (eg patching out country code setting in hostapd)
> > patching out country code?! why that?
> > currently, hostapd is the only thing that works out-of-the-box :-D
>
> hostapd was aborting no matter what country code i configured, so i
> disabled that, leaving figuring this out for later.
have you installed crda and udev on your box?
( http://wireless.kernel.org/en/developers/Regulatory/CRDA )

> [also hostapd is reporting some TX failures, didn't investigate yet]
TX failures? do you have hostapd's log of it?

> it certainly looks like a leak, at first i suspected memory fragmentation, but
> Larrys patch didn't help; later the order-0 alloc failures appeared.
well, you have a 32-bit architecture, unlike me or larry.
So, with a bit of luck the skbs always fitted into a single page and you never
had problems with memory fragmentation.

> The machine has 512M, ~100M should be (usually is) free, is under constant light
> load (typically <2k ints/s, 60% idle) and is running fine for weeks/months between
> reboots, but locks up after only a few packets go over the hostap driven
> p54usb device. I need the box to be up, that limits the number of tests i can
> run, at least as long as the lockups w/o any diagnostics happen...
Do keyboard-leds "flash" when it locks up, or does console respond
if you press alt-sysrq-m / alt-sysrq-w on the connected keyboard?
( If your box has a serial port, you can try to get the logs from there... )

Regards,
Chr

2009-01-16 23:46:17

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Friday 16 January 2009 23:10:25 Artur Skawina wrote:
>> Christian Lamparter wrote:
>>> On Friday 16 January 2009 10:13:07 Johannes Berg wrote:
>>>> On Thu, 2009-01-15 at 21:18 -0600, Larry Finger wrote:
>>>>
>>>>>>> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk
>>>>> I too have seen real single bit changes - in my case 6b went to 6a,
>>>>> and my memory is fine. I wouldn't necessarily blame your hardware.
>>>> 6b to 6a is often the result of a refcounting bug that happens to unref
>>>> a value _after_ it has been freed. But that doesn't explain 6b to 69,
>>>> unless you happen to have _two_ refcounting bugs. Not that I necessarily
>>>> think that memory is bad
>>> Well, this idiotic debug patch (kref-kernel-debug-patch) could shed some light into
>>> the problem who's using a freed skb.
>> didn't trigger anything here, just the usual:
>>
>> BUG kmalloc-4096: Poison overwritten
>> -----------------------------------------------------------------------------
>>
>> INFO: 0xdc8161b0-0xdc8161b0. First byte 0x6a instead of 0x6b
>> INFO: Allocated in dev_alloc_skb+0x19/0x30 age=1762 cpu=0 pid=3530
>> INFO: Freed in __kfree_skb+0xf/0x90 age=613 cpu=0 pid=3527
>> INFO: Slab 0xc1390200 objects=7 used=5 fp=0xdc816120 flags=0x400020c2
>> INFO: Object 0xdc816120 @offset=24864 fp=0xdc8140c0
>>
>> but no other warnings...
> but the refcount-test-module.c works on your system?
>
> It should produce something like:

yes, that works.

RefTest
RefTest - Static
released dfb02f10
RefTest - Watermark test
------------[ cut here ]------------
WARNING: at lib/kref.c:67 kref_put+0x65/0xb0()
Modules linked in: refcount_test_module(+) netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 videobuf_dma_sg budget videobuf_core tveeprom budget_ci budget_core saa7146 ttpci_eeprom serio_raw ir_common [last unloaded: microcode]
Pid: 3592, comm: insmod Not tainted 2.6.29-rc1-00280-g0fe8b7d-dirty #49
Call Trace:
[<c04d3a46>] ? printk+0x18/0x1a
[<c0122cb3>] warn_slowpath+0x73/0xd0
[<c01274ba>] ? irq_exit+0x2a/0x80
[<c0110803>] ? smp_apic_timer_interrupt+0x43/0x80
[<c0103714>] ? apic_timer_interrupt+0x28/0x30
[<c0123800>] ? vprintk+0x110/0x2e0
[<e2060000>] ? reftst_release+0x0/0x18 [refcount_test_module]
[<c04d3a46>] ? printk+0x18/0x1a
[<e2060016>] ? reftst_release+0x16/0x18 [refcount_test_module]
[<e2060000>] ? reftst_release+0x0/0x18 [refcount_test_module]
[<c0245955>] kref_put+0x65/0xb0
[<e2063000>] ? reftst_init+0x0/0xbd [refcount_test_module]
[<e2063074>] reftst_init+0x74/0xbd [refcount_test_module]
[<e2063000>] ? reftst_init+0x0/0xbd [refcount_test_module]
[<c010102a>] _stext+0x2a/0x150
[<c0145cc8>] sys_init_module+0x88/0x1c0
[<c016face>] ? sys_close+0x6e/0xb0
[<c0103146>] sysenter_do_call+0x12/0x25
[<c04d0000>] ? p54u_probe+0xa0/0x13c0
---[ end trace 5d36c2dda20ece59 ]---
RefTest - slXb poison test
------------[ cut here ]------------
WARNING: at lib/kref.c:67 kref_put+0x65/0xb0()
Modules linked in: refcount_test_module(+) netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 videobuf_dma_sg budget videobuf_core tveeprom budget_ci budget_core saa7146 ttpci_eeprom serio_raw ir_common [last unloaded: microcode]
Pid: 3592, comm: insmod Tainted: G W 2.6.29-rc1-00280-g0fe8b7d-dirty #49
Call Trace:
[<c04d3a46>] ? printk+0x18/0x1a
[<c0122cb3>] warn_slowpath+0x73/0xd0
[...]
=============================================================================
BUG kmalloc-96: Poison overwritten
-----------------------------------------------------------------------------

INFO: 0xdeaedda0-0xdeaedda0. First byte 0x6a instead of 0x6b
INFO: Allocated in 0xe2063083 age=6932 cpu=0 pid=3592
INFO: Freed in 0xe20630af age=6928 cpu=0 pid=3592
INFO: Slab 0xc13d5da0 objects=28 used=26 fp=0xdeaedd80 flags=0x400000c2
INFO: Object 0xdeaedd80 @offset=3456 fp=0xdeaed480

artur

2009-01-16 03:19:09

by Larry Finger

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Christian Lamparter wrote:
>> well... I'm still looking for an explanation for your other problems:
>> =============================================================================
>> BUG kmalloc-4096: Poison overwritten
>> -----------------------------------------------------------------------------
>>
>> Object 0xddec18c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
>> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk
>> Object 0xddec18e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
>>
>> The odd thing is here that it's only a "single" bit in the object flipped and not more!
>
> yes, that does not look good. i didn't find anything obvious under wireless/p54
> that could be responsible for it, didn't look at the rest of the stack yet.
> I'd say ignore this for now, until i'm able to reproduce it; one try on a very
> different machine didn't help and i didn't have a lot of time to retest after
> the kernel upgrade(s) today.
>
>> And we all have slub/slab debug options enabled as well (in fact, we had to fix the usbdriver for that,
>> see patch "p54usb: rewriting rx/tx routines to make use of usb_anchor's facilities" )
>>
>> can you check your RAM with memtest or something?
>
> yes, i just need to plan for enough downtime...

I too have seen real single bit changes - in my case 6b went to 6a,
and my memory is fine. I wouldn't necessarily blame your hardware.

Larry

2009-01-16 03:31:20

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Larry Finger wrote:
> Artur Skawina wrote:
>> Christian Lamparter wrote:
>>> well... I'm still looking for an explanation for your other problem=
s:
>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D
>>> BUG kmalloc-4096: Poison overwritten
>>> -------------------------------------------------------------------=
----------
>>>
>>> Object 0xddec18c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b =
6b kkkkkkkkkkkkkkkk
>>> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6=
b 6b ikkkkkkkkkkkkkkk
>>> Object 0xddec18e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b =
6b kkkkkkkkkkkkkkkk
>>>
>>> The odd thing is here that it's only a "single" bit in the object f=
lipped and not more!
>> yes, that does not look good. i didn't find anything obvious under w=
ireless/p54
>> that could be responsible for it, didn't look at the rest of the sta=
ck yet.
>> I'd say ignore this for now, until i'm able to reproduce it; one try=
on a very
>> different machine didn't help and i didn't have a lot of time to ret=
est after
>> the kernel upgrade(s) today.=20
>>
>>> And we all have slub/slab debug options enabled as well (in fact, w=
e had to fix the usbdriver for that,
>>> see patch "p54usb: rewriting rx/tx routines to make use of usb_anch=
or's facilities" )
>>>
>>> can you check your RAM with memtest or something?
>> yes, i just need to plan for enough downtime...=20
>=20
> I too have seen real single bit changes - in my case 6b went to 6a,
> and my memory is fine. I wouldn't necessarily blame your hardware.

In fact i was just trying Christian's URB_ZERO_PACKET suggestion, and g=
ot this
(about the time when some useful packets started coming in):

BUG kmalloc-4096: Poison overwritten
-----------------------------------------------------------------------=
------

INFO: 0xdc8351c8-0xdc8351c8. First byte 0x6a instead of 0x6b
INFO: Allocated in dev_alloc_skb+0x19/0x30 age=3D3720 cpu=3D0 pid=3D195=
2
INFO: Freed in __kfree_skb+0xf/0x90 age=3D238 cpu=3D0 pid=3D0
INFO: Slab 0xc1390600 objects=3D7 used=3D6 fp=3D0xdc8350f0 flags=3D0x40=
0020c2
INFO: Object 0xdc8350f0 @offset=3D20720 fp=3D0x(null)

Bytes b4 0xdc8350e0: fb 00 00 00 ae a2 fd ff 5a 5a 5a 5a 5a 5a 5a 5a =C3=
=BB...=C2=AE=C2=A2=C3=BD=C3=BFZZZZZZZZ
Object 0xdc8350f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
[...]
Object 0xdc8351b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdc8351c0: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkjkkkkkkk
Object 0xdc8351d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
[...]
Object 0xdc8360e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 k=
kkkkkkkkkkkkkk=C2=A5
Redzone 0xdc8360f0: bb bb bb bb =C2=
=BB=C2=BB=C2=BB=C2=BB =20
Padding 0xdc836118: 5a 5a 5a 5a 5a 5a 5a 5a Z=
ZZZZZZZ =20
Pid: 0, comm: swapper Not tainted 2.6.29-rc1-00273-g0d39407-dirty #47
Call Trace:
[<c016b79e>] check_bytes_and_report+0xce/0xf0
[<c016b9f7>] check_object+0x1e7/0x230
[<c016d115>] __slab_alloc+0x245/0x4f0
[<c016e3ff>] __kmalloc_track_caller+0xcf/0x110
[<c03af9f9>] dev_alloc_skb+0x19/0x30
[<c03af9f9>] dev_alloc_skb+0x19/0x30
[<c03aed95>] __alloc_skb+0x55/0x110
[<c03af9f9>] dev_alloc_skb+0x19/0x30
[<c02f1fcc>] p54u_rx_cb+0x1cc/0x1f0
[<c024552a>] kref_put+0x2a/0x80
[<c0340f51>] usb_hcd_giveback_urb+0x41/0xa0
[<c034f81f>] uhci_giveback_urb+0x7f/0x1f0
[<c034fdef>] uhci_scan_schedule+0x33f/0xa40
[<c035213d>] uhci_irq+0x7d/0x140
[<c0340bf5>] usb_hcd_irq+0x25/0x60
[<c0145ff8>] handle_IRQ_event+0x28/0x50
[<c0147210>] handle_level_irq+0x0/0xb0
[<c014725b>] handle_level_irq+0x4b/0xb0
<IRQ> [<c0103707>] common_interrupt+0x27/0x2c
[<c0108392>] default_idle+0x42/0x50
[<c0102139>] cpu_idle+0x39/0x80
=46IX kmalloc-4096: Restoring 0xdc8351c8-0xdc8351c8=3D0x6b

=46IX kmalloc-4096: Marking all objects used

artur

2009-01-15 20:07:09

by Artur Skawina

[permalink] [raw]
Subject: [PATCH] p54: set_tim must be atomic.

Fix for:

BUG: scheduling while atomic: named/2004/0x10000200
Pid: 2004, comm: named Not tainted 2.6.29-rc1-00271-ge9fa6b0 #45
Call Trace:
[<c04d4ef7>] schedule+0x2a7/0x320
[<c03aed74>] __alloc_skb+0x34/0x110
[<c011f5b3>] __cond_resched+0x13/0x30
[<c04d501d>] _cond_resched+0x2d/0x40
[<c016d8c5>] kmem_cache_alloc+0x95/0xc0
[<c016b8d4>] check_object+0xc4/0x230
[<c03aed74>] __alloc_skb+0x34/0x110
[<c02ede91>] p54_alloc_skb+0x71/0xf0
[<c02ee36f>] p54_set_tim+0x3f/0xa0
[<c04ae064>] sta_info_set_tim_bit+0x64/0x80
[<c04c1017>] invoke_tx_handlers+0xd57/0xd80
[<c016c397>] free_debug_processing+0x197/0x210
[<c03ae215>] pskb_expand_head+0xf5/0x170
[<c04bfd94>] __ieee80211_tx_prepare+0x164/0x2f0
[<c04c1a8d>] ieee80211_skb_resize+0x6d/0xe0
[<c04c250f>] ieee80211_master_start_xmit+0x23f/0x550
[<c016d188>] __slab_alloc+0x2b8/0x4f0
[<c013a711>] getnstimeofday+0x51/0x120
[<c03b5e7b>] dev_hard_start_xmit+0x1db/0x240
[<c03c6a4b>] __qdisc_run+0x1ab/0x200
[<c0136aa1>] __run_hrtimer+0x31/0xf0
[<c03b6247>] dev_queue_xmit+0x247/0x500
[<c04c1e56>] ieee80211_subif_start_xmit+0x356/0x7d0
[<c0466ff7>] packet_rcv_spkt+0x37/0x150
[<c0466ff7>] packet_rcv_spkt+0x37/0x150
[<c03b5e7b>] dev_hard_start_xmit+0x1db/0x240
[<c03c6a4b>] __qdisc_run+0x1ab/0x200
[<c03b6247>] dev_queue_xmit+0x247/0x500
[<c03bc1e2>] neigh_resolve_output+0xe2/0x200
[<c0410080>] ip_finish_output+0x0/0x290
[<c0410267>] ip_finish_output+0x1e7/0x290
[<c040f355>] ip_local_out+0x15/0x20
[<c040f5d2>] ip_push_pending_frames+0x272/0x380
[<c042bbc6>] udp_push_pending_frames+0x146/0x3a0
[<c042d52a>] udp_sendmsg+0x2fa/0x6b0
[<c0433bc7>] inet_sendmsg+0x37/0x70
[<c03a7b7e>] sock_sendmsg+0xbe/0x100
[<c0133cd0>] autoremove_wake_function+0x0/0x50
[<c011c043>] __wake_up_common+0x43/0x70
[<c024a892>] copy_from_user+0x32/0x130
[<c024a892>] copy_from_user+0x32/0x130
[<c03b001e>] verify_iovec+0x2e/0xb0
[<c03a7d3f>] sys_sendmsg+0x17f/0x290
[<c017730a>] pipe_write+0x29a/0x570
[<c013a172>] update_wall_time+0x492/0x8e0
[<c013a711>] getnstimeofday+0x51/0x120
[<c011b05d>] sched_slice+0x3d/0x80
[<c013a711>] getnstimeofday+0x51/0x120
[<c0136657>] hrtimer_forward+0x147/0x1a0
[<c01101b0>] lapic_next_event+0x10/0x20
[<c013ccb3>] clockevents_program_event+0xa3/0x170
[<c03a9054>] sys_socketcall+0xa4/0x290
[<c0110920>] smp_apic_timer_interrupt+0x40/0x70
[<c0103165>] sysenter_do_call+0x12/0x25


Signed-off-by: Artur Skawina <[email protected]>
Acked-by: Christian Lamparter <[email protected]>

---
drivers/net/wireless/p54/p54common.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/p54/p54common.c b/drivers/net/wireless/p54/p54common.c
index 635b423..579bc6c 100644
--- a/drivers/net/wireless/p54/p54common.c
+++ b/drivers/net/wireless/p54/p54common.c
@@ -1295,7 +1295,7 @@ static int p54_set_tim(struct ieee80211_hw *dev, struct ieee80211_sta *sta,
struct p54_tim *tim;

skb = p54_alloc_skb(dev, P54_HDR_FLAG_CONTROL_OPSET, sizeof(*tim),
- P54_CONTROL_TYPE_TIM, GFP_KERNEL);
+ P54_CONTROL_TYPE_TIM, GFP_ATOMIC);
if (!skb)
return -ENOMEM;


2009-01-19 23:17:17

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Christian Lamparter wrote:
>> On Monday 19 January 2009 22:53:03 Artur Skawina wrote:
>>> We can move the FREE_AFTER_TX(skb) check from the completion to the submission
>>> path, right? Then find a way to do the pull _before_ giving away the skbs.
>>> I can't shutdown the machine where i can reproduce this today, so it will have
>>> to wait until at least tomorrow.
>>>
>>> artur
>> Like this?
>>
> yes, that's what i was thinking. Will FREE_AFTER_TX(skb) be false for all skbs
> that are dropped in p54_rx_frame_sent()?
> Also most of the push/pull business in completions can probably go, we just have to set
> up the pointers right on submission, ie transfer the rx header into the headroom, then
> no skb manipulations in completions should be needed, and these kind of races should
> be harmless.

I only looked at the FREE_AFTER_TX(skb) part and completely missed that you had already
removed the pull, sorry.

artur

2009-01-19 22:54:16

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Monday 19 January 2009 22:53:03 Artur Skawina wrote:
>>>> Here's an interesting sequence:
>>>>
>>>> 1) a TX urb is submitted.
>>>> 2) p54u_rx_cb() => p54_rx_frame_sent(), which does kfree_skb( the_skb_in_(1) ).
>>>> 3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kaboom.
>>>>
>>>> IOW the skb is freed before the usb completion runs.
>>> Well, the sequence should be:
>>>
>>> 1) p54_tx gets called
>>> 1.1) one IRQ urb is submitted
>>> 1.2) one BULK urb is submitted
>>> 2) the firmware acks that it got the urbs
>>> 2.1) p54u_tx_cb is called for the IRQ urb. which frees the small buffer
>>> 2.2) p54u_tx_cb is called for the BULK urb. which only removes the net2280_tx_hdr from the skb.
>>> [time passes]
>>> 3) firmware is finished sending.
>>> 3.1) p54u_rx_cb gets called
>>> => p54_rx_frame_sent passed the feedback to mac80211
>> That's what one would expect, and is probably why i couldn't see anything
>> wrong in the code despite going over it several times. Until i got a crash
>> which left no doubt as to what happened, and made me notice the "wrong"
>> completion order, log attached [1].
>> In theory, theory and practice do not differ, in practice...
>>
>>>> Somehow i don't think this is the reason for the corruption, but it certainly
>>>> seems to be responsible for some, if not all, of the crashes/panics.
>>> dunno... we should see a bit more fallout, because skb_pull changes skb->data and skb->len.
>> Doing an skb_pull in p54u_tx_cb on skbs that have already been given to mac80211
>> cannot be good.
>> We can move the FREE_AFTER_TX(skb) check from the completion to the submission
>> path, right? Then find a way to do the pull _before_ giving away the skbs.
>> I can't shutdown the machine where i can reproduce this today, so it will have
>> to wait until at least tomorrow.
>>
>> artur
>
> Like this?
>
yes, that's what i was thinking. Will FREE_AFTER_TX(skb) be false for all skbs
that are dropped in p54_rx_frame_sent()?
Also most of the push/pull business in completions can probably go, we just have to set
up the pointers right on submission, ie transfer the rx header into the headroom, then
no skb manipulations in completions should be needed, and these kind of races should
be harmless.
Will test this; it's still possible that what i saw was the result of the tons of
debugging i put into that kernel. But this kind of problem could explain everything
i'm seeing, including the huge packet loss.

Thanks,

artur


2009-01-19 21:53:09

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Monday 19 January 2009 19:15:09 Artur Skawina wrote:
>> Christian Lamparter wrote:
>>> On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
>>>> Artur Skawina wrote:
>>>>> didn't trigger anything here, just the usual:
>>>>>
>>>>> BUG kmalloc-4096: Poison overwritten
>>>> This is almost 100% reproducible; sometimes the machine freezes in=
stead.
>> Here's an interesting sequence:
>>
>> 1) a TX urb is submitted.
>> 2) p54u_rx_cb() =3D> p54_rx_frame_sent(), which does kfree_skb( the_=
skb_in_(1) ).
>> 3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kab=
oom.
>>
>> IOW the skb is freed before the usb completion runs.
>=20
> Well, the sequence should be:
>=20
> 1) p54_tx gets called
> 1.1) one IRQ urb is submitted
> 1.2) one BULK urb is submitted
> 2) the firmware acks that it got the urbs
> 2.1) p54u_tx_cb is called for the IRQ urb. which frees the small buff=
er
> 2.2) p54u_tx_cb is called for the BULK urb. which only removes the ne=
t2280_tx_hdr from the skb.
> [time passes]
> 3) firmware is finished sending.
> 3.1) p54u_rx_cb gets called
> =3D> p54_rx_frame_sent passed the feedback to mac80211

That's what one would expect, and is probably why i couldn't see anythi=
ng
wrong in the code despite going over it several times. Until i got a cr=
ash
which left no doubt as to what happened, and made me notice the "wrong"
completion order, log attached [1].
In theory, theory and practice do not differ, in practice...

>> Somehow i don't think this is the reason for the corruption, but it =
certainly
>> seems to be responsible for some, if not all, of the crashes/panics.
> dunno... we should see a bit more fallout, because skb_pull changes s=
kb->data and skb->len.=20

Doing an skb_pull in p54u_tx_cb on skbs that have already been given to=
mac80211
cannot be good.
We can move the FREE_AFTER_TX(skb) check from the completion to the sub=
mission
path, right? Then find a way to do the pull _before_ giving away the sk=
bs.
I can't shutdown the machine where i can reproduce this today, so it wi=
ll have
to wait until at least tomorrow.

artur

[1]: note there are two issues, first the skb_pull triggers a BUG, whic=
h
is then interrupted by the SLUB error, then the rest of the first BUG
follows. i've added '#'-comments explaining some of the numbers.

[...]
TX: LEN=3D 20 SKB=3Dd7eb2b60 HEAD=3Dd78f28c0
TX: DONE=20
TX: LEN=3D 32 SKB=3Dda7c0620 HEAD=3Dde02b3b0
TX: DONE=20
TX: LEN=3D 436 SKB=3Dd7eb2ee0 HEAD=3Ddd4ee180
TXCB: LEN=3D 36 SKB=3Dd7eb2b60 HEAD=3Dd78f28c0
TXCB: LEN=3D 48 SKB=3Dda7c0620 HEAD=3Dde02b3b0
TX: DONE=20
TX: LEN=3D 436 SKB=3Ddd55e2a0 HEAD=3Ddd4ef670
TX: DONE=20
TX: LEN=3D 176 SKB=3Dde01e8c0 HEAD=3Dd9812060
TX: DONE=20
TXCB: LEN=3D 452 SKB=3Dd7eb2ee0 HEAD=3Ddd4ee180
TXCB: LEN=3D 452 SKB=3Ddd55e2a0 HEAD=3Ddd4ef670
TXCB: LEN=3D 192 SKB=3Dde01e8c0 HEAD=3Dd9812060
RXCB: LEN=3D 36 SKB=3Ddd5dfee0 HEAD=3Dd9813090
RXCB: LEN=3D 36 SKB=3Ddd433700 HEAD=3Dd9811030
RXCB: LEN=3D 36 SKB=3Ddfbf3a80 HEAD=3Dd9816120
TX: LEN=3D 160 SKB=3Ddfb2da80 HEAD=3Dd9889030 # (1) urb submiss=
ion, SKB=3Ddfb2da80
TX: DONE=20
TX: LEN=3D 168 SKB=3Dce1388c0 HEAD=3Dd9888000
RXCB: LEN=3D 36 SKB=3Dce138d20 HEAD=3Dd98140c0 # (2) this eventu=
ally calls p54_rx_frame_sent() and skb_kfree on SKB=3Ddfb2da80
TXCB: LEN=3D1802201963 SKB=3Ddfb2da80 HEAD=3D6b6b6b6b # (3) p54u_tx_cb(=
) was called with the now freed and poisoned SKB=3Ddfb2da80
------------[ cut here ]------------
kernel BUG at include/linux/skbuff.h:1073! # BUG() in skb_pull(), =
see stacktrace below, after the SLUB error
invalid opcode: 0000 [#1] DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:07.2/usb1/1-1/1-1.1/ue=
vent
Modules linked in: p54usb p54common<3>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
BUG skbuff_head_cache: Poison overwritten
-----------------------------------------------------------------------=
------

INFO: 0xdfb2dad0-0xdfb2dad0. First byte 0x5b instead of 0x6b =
# 0x10 is the amount that was pulled (net2280 rx header)
INFO: Allocated in __alloc_skb+0x34/0x110 age=3D28 cpu=3D0 pid=3D0
INFO: Freed in p54_rx+0x760/0x770 [p54common] age=3D0 cpu=3D0 pid=3D0 =
# p54_rx+0x760/0x770 is the skb_free in p54_rx_frame_sent() (gcc=
decided to inline it)
INFO: Slab 0xc13f65a0 objects=3D18 used=3D16 fp=3D0xdfb2da80 flags=3D0x=
400000c2
INFO: Object 0xdfb2da80 @offset=3D2688 fp=3D0xdfb2dee0 =
# the freed skb

Bytes b4 0xdfb2da70: 00 00 00 00 0f 23 00 00 ba e7 1b 02 5a 5a 5a 5a .=
=2E...#..=C2=BA=C3=A7..ZZZZ
Object 0xdfb2da80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2da90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2daa0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2dab0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2dac0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2dad0: 5b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [=
kkkkkkkkkkkkkkk
Object 0xdfb2dae0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2daf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2db00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2db10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2db20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xdfb2db30: 6b 6b 6b a5 k=
kk=C2=A5 =20
Redzone 0xdfb2db34: bb bb bb bb =C2=
=BB=C2=BB=C2=BB=C2=BB =20
Padding 0xdfb2db5c: 5a 5a 5a 5a Z=
ZZZ =20
Pid: 0, comm: swapper Tainted: G W 2.6.29-rc1-00222-g27956bf-di=
rty #69
Call Trace:
[<c016cf1e>] check_bytes_and_report+0xce/0xf0
[<c016d177>] check_object+0x1e7/0x230
[<c016e915>] __slab_alloc+0x245/0x4f0
[<c03afe04>] __alloc_skb+0x34/0x110
[<c016f0eb>] kmem_cache_alloc+0xbb/0xc0
[<c03afe04>] __alloc_skb+0x34/0x110
[<c03afe04>] __alloc_skb+0x34/0x110
[<c03c5695>] find_skb+0x35/0x90
[<c03c58ae>] netpoll_send_udp+0x2e/0x200
[<e31371ad>] write_msg+0x9d/0xe0 [netconsole]
[<e3137110>] write_msg+0x0/0xe0 [netconsole]
[<c0122533>] __call_console_drivers+0x43/0x50
[<c01228f7>] release_console_sem+0x87/0x1c0
[<c0122ec7>] vprintk+0x227/0x2d0
[<e3137110>] write_msg+0x0/0xe0 [netconsole]
[<c04dcb00>] printk+0x17/0x1f
[<c0142ea4>] print_modules+0x74/0xb0
[<c01051ba>] show_registers+0x1a/0x200
[<c0138647>] atomic_notifier_call_chain+0x17/0x20
[<c0138740>] notify_die+0x30/0x40
[<c0105df6>] __die+0xa6/0x100
[<c0106070>] die+0x40/0x80
[<c0103f60>] do_invalid_op+0x0/0xa0
[<c0103fe1>] do_invalid_op+0x81/0xa0
[<c03aced8>] skb_pull+0x18/0x40
[<e3137110>] write_msg+0x0/0xe0 [netconsole]
[<c0122533>] __call_console_drivers+0x43/0x50
[<c013ad72>] update_wall_time+0x492/0x8e0
[<c04de722>] error_code+0x6a/0x70
[<c016007b>] sys_mincore+0x26b/0x350
[<c03aced8>] skb_pull+0x18/0x40
[<e1f1d815>] p54u_tx_cb+0x95/0x120 [p54usb]
[<c0341489>] usb_hcd_giveback_urb+0x49/0xe0
[<c034fe4f>] uhci_giveback_urb+0x5f/0x1f0
[<c034fe6f>] uhci_giveback_urb+0x7f/0x1f0
[<c035043f>] uhci_scan_schedule+0x33f/0xa40
[<c035278d>] uhci_irq+0x7d/0x140
[<c0341125>] usb_hcd_irq+0x25/0x60
[<c0146de8>] handle_IRQ_event+0x28/0x50
[<c0148000>] handle_level_irq+0x0/0xb0
[<c014804b>] handle_level_irq+0x4b/0xb0
<IRQ> [<c0103707>] common_interrupt+0x27/0x2c
[<c0122dbc>] vprintk+0x11c/0x2d0
[<c034301f>] usb_alloc_urb+0xf/0x30
[<c034301f>] usb_alloc_urb+0xf/0x30
[<c04dcb00>] printk+0x17/0x1f
[<e1f1d212>] p54u_tx_net2280+0x52/0x320 [p54usb]
[<e1d4845d>] p54_tx+0x51d/0x820 [p54common]
[<c04ca789>] __ieee80211_tx+0x49/0x130
[<c04cb61d>] ieee80211_master_start_xmit+0x28d/0x610
[<c0137401>] __run_hrtimer+0x31/0xf0
[<c013e894>] tick_program_event+0x14/0x20
[<c013b311>] getnstimeofday+0x51/0x120
[<c03b702b>] dev_hard_start_xmit+0x1db/0x240
[<c03c883b>] __qdisc_run+0x1ab/0x200
[<c03b4dd1>] net_tx_action+0x81/0xb0
[<c0126f07>] __do_softirq+0x57/0xf0
[<c0126eb0>] __do_softirq+0x0/0xf0
<IRQ> [<c0137c38>] ktime_get+0x18/0x40
[<c0148000>] handle_level_irq+0x0/0xb0
[<c0126c65>] irq_exit+0x55/0x70
[<c0104f1b>] do_IRQ+0x9b/0x120
[<c0103707>] common_interrupt+0x27/0x2c
[<c0108392>] default_idle+0x42/0x50
[<c0102139>] cpu_idle+0x39/0x80
=46IX skbuff_head_cache: Restoring 0xdfb2dad0-0xdfb2dad0=3D0x6b

=46IX skbuff_head_cache: Marking all objects used
netconsole budget saa7134_empress saa6752hs saa7134 serio_raw budget_c=
i videobuf_dma_sg budget_core saa7146 ttpci_eeprom videobuf_core tveepr=
om ir_common [last unloaded: p54common]

Pid: 0, comm: swapper Tainted: G W (2.6.29-rc1-00222-g27956bf-d=
irty #69) =20
EIP: 0060:[<c03aced8>] EFLAGS: 00010087 CPU: 0
EIP is at skb_pull+0x18/0x40
EAX: 6b6b6b5b EBX: 6b6b6b6b ECX: dfb2da80 EDX: 00000010
ESI: dfb2da80 EDI: d744bdd0 EBP: d744b1f0 ESP: c0690ef0
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=3Dc0690000 task=3Dc05fc300 task.ti=3Dc06460=
00)
Stack:
d980a370 e1f1d815 e1f1f440 6b6b6b6b dfb2da80 6b6b6b6b d744bf74 0000008=
6
d980a370 00000000 df9fb180 d980a370 c0341489 c034fe4f 00000046 df965de=
0
df9fb180 df994900 dfa46a68 c034fe6f df9948bc df99491c 00a464d8 df9fb24=
c
Call Trace:
[<e1f1d815>] p54u_tx_cb+0x95/0x120 [p54usb] =
# this is the skb_pull in p54u_tx_cb
[<c0341489>] usb_hcd_giveback_urb+0x49/0xe0
[<c034fe4f>] uhci_giveback_urb+0x5f/0x1f0
[<c034fe6f>] uhci_giveback_urb+0x7f/0x1f0
[<c035043f>] uhci_scan_schedule+0x33f/0xa40
[<c035278d>] uhci_irq+0x7d/0x140
[<c0341125>] usb_hcd_irq+0x25/0x60
[<c0146de8>] handle_IRQ_event+0x28/0x50
[<c0148000>] handle_level_irq+0x0/0xb0
[<c014804b>] handle_level_irq+0x4b/0xb0
<IRQ> <0> [<c0103707>] common_interrupt+0x27/0x2c
[<c0122dbc>] vprintk+0x11c/0x2d0
[<c034301f>] usb_alloc_urb+0xf/0x30
[<c034301f>] usb_alloc_urb+0xf/0x30
[<c04dcb00>] printk+0x17/0x1f
[<e1f1d212>] p54u_tx_net2280+0x52/0x320 [p54usb]
[<e1d4845d>] p54_tx+0x51d/0x820 [p54common]
[<c04ca789>] __ieee80211_tx+0x49/0x130
[<c04cb61d>] ieee80211_master_start_xmit+0x28d/0x610
[<c0137401>] __run_hrtimer+0x31/0xf0
[<c013e894>] tick_program_event+0x14/0x20
[<c013b311>] getnstimeofday+0x51/0x120
[<c03b702b>] dev_hard_start_xmit+0x1db/0x240
[<c03c883b>] __qdisc_run+0x1ab/0x200
[<c03b4dd1>] net_tx_action+0x81/0xb0
[<c0126f07>] __do_softirq+0x57/0xf0
[<c0126eb0>] __do_softirq+0x0/0xf0
<IRQ> <0> [<c0137c38>] ktime_get+0x18/0x40
[<c0148000>] handle_level_irq+0x0/0xb0
[<c0126c65>] irq_exit+0x55/0x70
[<c0104f1b>] do_IRQ+0x9b/0x120
[<c0103707>] common_interrupt+0x27/0x2c
[<c0108392>] default_idle+0x42/0x50
[<c0102139>] cpu_idle+0x39/0x80
Code: 83 c4 0c c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 53 89 c1=
8b 58 50 31 c0 39 d3 72 24 89 d8 29 d0 3b 41 54 89 41 50 73 08 <0f> 0b=
eb fe 8d 74 26 00 8b 99 a8 00 00 00 89 d0 01 d8 89 81 a8=20
EIP: [<c03aced8>] skb_pull+0x18/0x40 SS:ESP 0068:c0690ef0
---[ end trace bba3fc5c928ed0e0 ]---
Kernel panic - not syncing: Fatal exception in interrupt


2009-01-13 22:31:29

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
>>>> reboots, but locks up after only a few packets go over the hostap =
driven
>>>> p54usb device. I need the box to be up, that limits the number of =
tests i can

> after switching from SLUB to SLAB and enabling some debugging i final=
ly caught this:

and slub debugging gave something a bit more useful:

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D
BUG kmalloc-4096: Poison overwritten
-----------------------------------------------------------------------=
------

INFO: 0xddec18d0-0xddec18d0. First byte 0x69 instead of 0x6b
INFO: Allocated in dev_alloc_skb+0x19/0x30 age=3D5064 cpu=3D0 pid=3D0
INFO: Freed in __kfree_skb+0xf/0x90 age=3D205 cpu=3D0 pid=3D0
INFO: Slab 0xc13bd800 objects=3D7 used=3D4 fp=3D0xddec1030 flags=3D0x40=
0020c2
INFO: Object 0xddec1030 @offset=3D4144 fp=3D0xddec6120

Bytes b4 0xddec1020: f7 00 00 00 d7 e3 00 00 5a 5a 5a 5a 5a 5a 5a 5a =C3=
=B7...=C3=97=C3=A3..ZZZZZZZZ
Object 0xddec1030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1040: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1050: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1060: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1070: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1090: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec10a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec10b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec10c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec10d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec10e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec10f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1100: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1110: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1120: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1130: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1140: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1150: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1160: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1170: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1180: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec11a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec11b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec11c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec11d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec11e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec11f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1200: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1210: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1220: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1230: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1240: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1250: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1260: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1270: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1280: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1290: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec12a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec12b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec12c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec12d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec12e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec12f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1300: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1310: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1320: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1330: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1340: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1350: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1360: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1370: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1380: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1390: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec13a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec13b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec13c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec13d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec13e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec13f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1400: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1410: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1420: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1430: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1440: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1450: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1460: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1470: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1480: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1490: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec14a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec14b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec14c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec14d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec14e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec14f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1500: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1510: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1520: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1530: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1540: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1550: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1560: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1570: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1580: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1590: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec15a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec15b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec15c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec15d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec15e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec15f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1600: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1610: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1620: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1630: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1640: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1650: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1660: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1670: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1680: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1690: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec16a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec16b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec16c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec16d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec16e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec16f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1700: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1710: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1720: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1730: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1740: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1750: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1760: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1770: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1780: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1790: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec17a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec17b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec17c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec17d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec17e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec17f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1800: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1880: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1890: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec18a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec18b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec18c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec18d0: 69 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b i=
kkkkkkkkkkkkkkk
Object 0xddec18e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec18f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1900: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1910: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1920: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1930: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1940: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1950: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1960: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1970: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1980: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1990: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec19a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec19b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec19c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec19d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec19e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec19f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1a90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1aa0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ab0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ac0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ad0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ae0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1af0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1b90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ba0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1bb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1bc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1bd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1be0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1bf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1c90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ca0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1cb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1cc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1cd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ce0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1cf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1d90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1da0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1db0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1dc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1dd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1de0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1df0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1e90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ea0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1eb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ec0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ed0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ee0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ef0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1f90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1fa0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1fb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1fc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1fd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1fe0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec1ff0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec2000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec2010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b k=
kkkkkkkkkkkkkkk
Object 0xddec2020: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 k=
kkkkkkkkkkkkkk=C2=A5
Redzone 0xddec2030: bb bb bb bb =C2=
=BB=C2=BB=C2=BB=C2=BB =20
Padding 0xddec2058: 5a 5a 5a 5a 5a 5a 5a 5a Z=
ZZZZZZZ =20
Pid: 0, comm: swapper Not tainted 2.6.28-rc8-00519-g90435df #43
Call Trace:
[<c016a88e>] check_bytes_and_report+0xce/0xf0
[<c016aae7>] check_object+0x1e7/0x230
[<c016c1d5>] __slab_alloc+0x245/0x4e0
[<c03a6d09>] dev_alloc_skb+0x19/0x30
[<c016d453>] __kmalloc_track_caller+0xc3/0x100
[<c03a6d09>] dev_alloc_skb+0x19/0x30
[<c03a6245>] __alloc_skb+0x55/0x120
[<c03a6d09>] dev_alloc_skb+0x19/0x30
[<c02eb11c>] p54u_rx_cb+0x1bc/0x1e0
[<c0339d91>] usb_hcd_giveback_urb+0x41/0xa0
[<c03482bf>] uhci_giveback_urb+0x7f/0x1f0
[<c034886f>] uhci_scan_schedule+0x33f/0xa20
[<c034ab7d>] uhci_irq+0x7d/0x150
[<e0866a5c>] interrupt_hw+0x4c/0x230 [saa7146]
[<c0339a35>] usb_hcd_irq+0x25/0x60
[<c0146348>] handle_IRQ_event+0x28/0x50
[<c0147600>] handle_level_irq+0x0/0xb0
[<c014764b>] handle_level_irq+0x4b/0xb0
<IRQ> [<c0103d6f>] common_interrupt+0x23/0x28
[<c0108942>] default_idle+0x42/0x50
[<c01021a9>] cpu_idle+0x39/0x80
=46IX kmalloc-4096: Restoring 0xddec18d0-0xddec18d0=3D0x6b

=46IX kmalloc-4096: Marking all objects used

2009-01-19 00:26:52

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
> Artur Skawina wrote:
> >
> > didn't trigger anything here, just the usual:
> >
> > BUG kmalloc-4096: Poison overwritten
>

> Still haven't found the corruptor, but at least i've narrowed it down a bit;
> what i'm seeing is:
>
> 1) an skb "S" gets allocated in p54u_rx_cb and is submitted together w/ the urb.
> 2) "S" later comes back to p54u_rx_cb, where it is given to p54_rx (eventually
> ieee80211_rx_irqsafe) and a new one is allocated.
> 3) a few (~15) rx/tx packets pass.
> 4) SLUB detects modified poison in what used to be S->head in (1) and (2) above;
> usually 0x6b turns into 0x6a, but i have also seen 0x69, just a few times.
> (the offset from skb->head to the decremented byte seems to stay the same,
> at least during the few times i tried w/ the same kernel, last one was eg
> 684 bytes)
>
> This is almost 100% reproducible; sometimes the machine freezes instead.

Do you know what is inside "S" when it handed over to ieee80211_rx_irqsafe?
Is it always the same content, or is it sometimes a data or mgmt frame?

Another shot in the dark: do you have a daemon that listen to mon.wlan0, wmaster / "any" other than hostapd?

Regards,
Chr

2009-01-20 22:22:27

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
>
> So, p54usb first generation needs the ZERO_PACKET flag as well?
> (e.g a patch like the attached p54usb-net2280-zeropacket.diff ?)

yes, the client would often not even associate w/o that flag in the
int urb.

artur

2009-01-15 19:42:38

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Thursday 15 January 2009 20:12:03 Artur Skawina wrote:
> Christian Lamparter wrote:
> >> tried reproducing (using the old w-t/pending pull) on different machine,
> >> but no oops or crash there...
> >> (similar connectivity issues though, such as connecting only works
> >> exactly once after starting hostapd, to reconnect i have to restart
> >> hostapd)
> > hmm, maybe Windows Mobile uses another PS mechanism, e.g WMM-PS.
>
> hmm, i'll have to set up another device to sniff the traffic i guess.
probably... or do you have another client with a mac80211 driver?

> >> Upgraded to current wireless-testing/pending on the problematic box
> >> and almost immediately got [1]. No slab corruption this time (at least
> >> not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.
> >>
> >> artur
> > yes that's a bug, p54_set_tim must use GFP_ATOMIC, do you want to post the patch?
>
> nah, too trivial :) fwiw i used this:
well, it should be "done", right? :-)

just add "John W. Linville" <[email protected]>" to the cc,
change the subject line to "[PATCH] p54: set_tim must be atomic."
and put a Signed-off-by: your name <your@mail> in the mail body and a "---" to seperate the commit message
and then the patch. And it will be in the next round of -rcX fixes.

> diff --git a/drivers/net/wireless/p54/p54common.c b/drivers/net/wireless/p54/p54common.c
> index 635b423..579bc6c 100644
> --- a/drivers/net/wireless/p54/p54common.c
> +++ b/drivers/net/wireless/p54/p54common.c
> @@ -1295,7 +1295,7 @@ static int p54_set_tim(struct ieee80211_hw *dev, struct ieee80211_sta *sta,
> struct p54_tim *tim;
>
> skb = p54_alloc_skb(dev, P54_HDR_FLAG_CONTROL_OPSET, sizeof(*tim),
> - P54_CONTROL_TYPE_TIM, GFP_KERNEL);
> + P54_CONTROL_TYPE_TIM, GFP_ATOMIC);
> if (!skb)
> return -ENOMEM;

Regards,
Chr

2009-01-13 19:03:00

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Tuesday 13 January 2009 17:45:08 Artur Skawina wrote:
>> Christian Lamparter wrote:
>>> On Monday 12 January 2009 18:09:47 Artur Skawina wrote:
>>>> kernel: phy0: p54 detected a LM86 firmware
>>>> kernel: p54: rx_mtu reduced from 3240 to 2392
>>>> kernel: phy0: FW rev 2.13.1.0 - Softmac protocol 5.5
>>> Well, I guess you got the latest wireless-testing tree, right?
> [...]
>> will do, after upgrading the kernel (could take a few days).
> hmm... OT: who old is your hostapd version? As far as I can tell AP-mode is still WIP.

hostap-0_6_6-148-g3cf8523 ie git build from ~ dec 24.

> Maybe you can save you alot trouble if your mobile can connect to ad-hoc wifis.

It can, but i'd like to try to get hostap running w/ this hw anyway...

>>>> so to test the device further i set up hostapd, which mostly worked; after
>>>> a few tweaks (eg patching out country code setting in hostapd)
>>> patching out country code?! why that?
>>> currently, hostapd is the only thing that works out-of-the-box :-D
>> hostapd was aborting no matter what country code i configured, so i
>> disabled that, leaving figuring this out for later.
> have you installed crda and udev on your box?
> ( http://wireless.kernel.org/en/developers/Regulatory/CRDA )

yes, but only after patching hostapd; the lockups prevented further
testing...

>> [also hostapd is reporting some TX failures, didn't investigate yet]
> TX failures? do you have hostapd's log of it?

I'll send it in a few minutes, as soon as i'll reproduce it.

>> The machine has 512M, ~100M should be (usually is) free, is under constant light
>> load (typically <2k ints/s, 60% idle) and is running fine for weeks/months between
>> reboots, but locks up after only a few packets go over the hostap driven
>> p54usb device. I need the box to be up, that limits the number of tests i can
>> run, at least as long as the lockups w/o any diagnostics happen...
> Do keyboard-leds "flash" when it locks up, or does console respond
> if you press alt-sysrq-m / alt-sysrq-w on the connected keyboard?

most of the times it happened there was no kbd attached. At least once
when it _was_ connected, sysrq was working, and i saw 0*8KB; that's why
i initially suspected fragmentation.

> ( If your box has a serial port, you can try to get the logs from there... )

I'll consider setting up another similar machine for testing, so that i'll
be able to run w/ more debugging enabled. I was hoping to avoid it, esp.
as i'm not sure i'll be able to reproduce the problems w/o the load.

thanks,

artur




2009-01-16 22:52:34

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Friday 16 January 2009 23:10:25 Artur Skawina wrote:
> Christian Lamparter wrote:
> > On Friday 16 January 2009 10:13:07 Johannes Berg wrote:
> >> On Thu, 2009-01-15 at 21:18 -0600, Larry Finger wrote:
> >>
> >>>>> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b =
6b 6b 6b ikkkkkkkkkkkkkkk
> >>> I too have seen real single bit changes - in my case 6b went to 6=
a,
> >>> and my memory is fine. I wouldn't necessarily blame your hardware=
=2E
> >> 6b to 6a is often the result of a refcounting bug that happens to =
unref
> >> a value _after_ it has been freed. But that doesn't explain 6b to =
69,
> >> unless you happen to have _two_ refcounting bugs. Not that I neces=
sarily
> >> think that memory is bad
> > Well, this idiotic debug patch (kref-kernel-debug-patch) could shed=
some light into
> > the problem who's using a freed skb.=20
>=20
> didn't trigger anything here, just the usual:
>=20
> BUG kmalloc-4096: Poison overwritten
> ---------------------------------------------------------------------=
--------
>=20
> INFO: 0xdc8161b0-0xdc8161b0. First byte 0x6a instead of 0x6b
> INFO: Allocated in dev_alloc_skb+0x19/0x30 age=3D1762 cpu=3D0 pid=3D3=
530
> INFO: Freed in __kfree_skb+0xf/0x90 age=3D613 cpu=3D0 pid=3D3527
> INFO: Slab 0xc1390200 objects=3D7 used=3D5 fp=3D0xdc816120 flags=3D0x=
400020c2
> INFO: Object 0xdc816120 @offset=3D24864 fp=3D0xdc8140c0
>=20
> but no other warnings...
but the refcount-test-module.c works on your system?

It should produce something like:

[ 608.177429] RefTest =
=20
[ 608.179753] RefTest - Static =
=20
[ 608.182861] released ffff8800b767dec8 =
=20
[ 608.186760] RefTest - Watermark test =
=20
[ 608.190563] ------------[ cut here ]------------ =
=20
[ 608.195412] WARNING: at lib/kref.c:67 kref_put+0x56/0x72() =
=20
[ 608.201139] Hardware name: System Product Name =
=20
[ 608.205813] Modules linked in: refcount(+) [...] [last unloaded: ref=
count] =
=20
[ 608.288005] Pid: 8793, comm: insmod Tainted: P W 2.6.29-rc1-=
wl #5 =20
[ 608.295036] Call Trace: =
=20
[ 608.297719] [<ffffffff80233389>] warn_slowpath+0xd8/0x112 =
=20
[ 608.304057] [<ffffffff80233edd>] vprintk+0x28a/0x2b6 =
=20
[ 608.309546] [<ffffffffa0089000>] reftst_release+0x0/0x14 [refcount]=
=20
[ 608.316241] [<ffffffffa00a5000>] reftst_init+0x0/0xd5 [refcount] =
=20
[ 608.322572] [<ffffffffa00a5000>] reftst_init+0x0/0xd5 [refcount] =
=20
[ 608.328904] [<ffffffffa0089000>] reftst_release+0x0/0x14 [refcount]=
=20
[ 608.335497] [<ffffffffa00a5000>] reftst_init+0x0/0xd5 [refcount] =
=20
[ 608.341833] [<ffffffff803b853a>] kref_put+0x56/0x72 =
=20
[ 608.347034] [<ffffffffa00a5082>] reftst_init+0x82/0xd5 [refcount] =
=20
[ 608.353461] [<ffffffff80209051>] _stext+0x51/0x120 =
=20
[ 608.358576] [<ffffffff803ba5b2>] __up_read+0x13/0x8d =
=20
[ 608.363862] [<ffffffff8024823d>] __blocking_notifier_call_chain+0x5=
1/0x5f =20
[ 608.370980] [<ffffffff80256080>] sys_init_module+0xa0/0x1ba =
=20
[ 608.376878] [<ffffffff8020b41b>] system_call_fastpath+0x16/0x1b =
=20
[ 608.383116] ---[ end trace e0829545e9e114bf ]--- =
=20
[ 608.388194] RefTest - slXb poison test =
=20
[ 608.392223] ------------[ cut here ]------------ =
=20
[ 608.397066] WARNING: at lib/kref.c:67 kref_put+0x56/0x72() =
=20
[ 608.402783] Hardware name: System Product Name =
=20
[ 608.407454] Modules linked in: refcount(+) [...] ehci_hcd [last unlo=
aded: refcount] =
=20
[ 608.488299] Pid: 8793, comm: insmod Tainted: P W 2.6.29-rc1-=
wl #5 =20
[ 608.495325] Call Trace: =
=20
[ 608.498015] [<ffffffff80233389>] warn_slowpath+0xd8/0x112 =
=20
[ 608.503735] [<ffffffff80233edd>] vprintk+0x28a/0x2b6 =
=20
[ 608.509023] [<ffffffff80595798>] thread_return+0x3d/0xc2 =
=20
[ 608.514670] [<ffffffffa0089000>] reftst_release+0x0/0x14 [refcount]=
=20
[ 608.521268] [<ffffffffa00a5000>] reftst_init+0x0/0xd5 [refcount] =
=20
[ 608.527601] [<ffffffffa00a50bd>] reftst_init+0xbd/0xd5 [refcount] =
=20
[ 608.534025] [<ffffffff8029147a>] check_object+0x156/0x205 =
=20
[ 608.539751] [<ffffffff80291c89>] __slab_free+0x271/0x2a8 =
=20
[ 608.545384] [<ffffffffa0089000>] reftst_release+0x0/0x14 [refcount]=
=20
[ 608.551980] [<ffffffffa00a5000>] reftst_init+0x0/0xd5 [refcount] =
=20
[ 608.558315] [<ffffffff803b853a>] kref_put+0x56/0x72 =
=20
[ 608.563513] [<ffffffffa00a5000>] reftst_init+0x0/0xd5 [refcount] =
=20
[ 608.569849] [<ffffffffa00a50cc>] reftst_init+0xcc/0xd5 [refcount] =
=20
[ 608.576274] [<ffffffff80209051>] _stext+0x51/0x120 =
=20
[ 608.581384] [<ffffffff803ba5b2>] __up_read+0x13/0x8d =
=20
[ 608.586671] [<ffffffff8024823d>] __blocking_notifier_call_chain+0x5=
1/0x5f =20
[ 608.593789] [<ffffffff80256080>] sys_init_module+0xa0/0x1ba =
=20
[ 608.599685] [<ffffffff8020b41b>] system_call_fastpath+0x16/0x1b =
=20
[ 608.605923] ---[ end trace e0829545e9e114c0 ]---

[[[some delay - until the object is alloced by someone else]]]
=
=20
[ 613.331424] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =
=20
[ 613.335408] BUG kmalloc-96: Poison overwritten =
=20
[ 613.335408] --------------------------------------------------------=
--------------------- =20
[ 613.335408] =
=20
[ 613.335408] INFO: 0xffff8800be0aa950-0xffff8800be0aa950. First byte =
0x6a instead of 0x6b =20
[ 613.335408] INFO: Allocated in 0xffffffffa00a5093 age=3D1243 cpu=3D0=
pid=3D8793 =
=20
[ 613.335408] INFO: Freed in 0xffffffffa00a50bd age=3D1244 cpu=3D0 pid=
=3D8793 =
=20
[ 613.335408] INFO: Slab 0xffffe20002992530 objects=3D24 used=3D23 fp=3D=
0xffff8800be0aa930 flags=3D0x40000000000000c3 =
=20
[ 613.335408] INFO: Object 0xffff8800be0aa930 @offset=3D2352 fp=3D0x(n=
ull) =
=20
[ 613.335408]
[ 613.335408] Bytes b4 0xffff8800be0aa920: ec ca 00 00 01 00 00 00 5a=
5a 5a 5a 5a 5a 5a 5a =EF=BF=BD=EF=BF=BD......ZZZZZZZZ
[ 613.335408] Object 0xffff8800be0aa930: 6b 6b 6b 6b 6b 6b 6b 6b 6b=
6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 613.335408] Object 0xffff8800be0aa940: 6b 6b 6b 6b 6b 6b 6b 6b 6b=
6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 613.335408] Object 0xffff8800be0aa950: 6a 6b 6b 6b 6b 6b 6b 6b 6b=
6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk
[ 613.335408] Object 0xffff8800be0aa960: 6b 6b 6b 6b 6b 6b 6b 6b 6b=
6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 613.335408] Object 0xffff8800be0aa970: 6b 6b 6b 6b 6b 6b 6b 6b 6b=
6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 613.335408] Object 0xffff8800be0aa980: 6b 6b 6b 6b 6b 6b 6b 6b 6b=
6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk=EF=BF=BD
[ 613.335408] Redzone 0xffff8800be0aa990: bb bb bb bb bb bb bb bb =
=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=
=BD=EF=BF=BD=EF=BF=BD
[ 613.473011] Padding 0xffff8800be0aa9d0: 5a 5a 5a 5a 5a 5a 5a 5a =
ZZZZZZZZ
[ 613.473011] Pid: 1671, comm: kcryptd Tainted: P W 2.6.29-rc1=
-wl #5
[ 613.473011] Call Trace:
[ 613.473011] [<ffffffff802912e8>] check_bytes_and_report+0xb5/0xf1 [=
no sign of refcount test in this backtrace here]
[ 613.473011] [<ffffffff8022a2d7>] update_curr+0x4a/0xcd
[ 613.473011] [<ffffffff802913ec>] check_object+0xc8/0x205
[ 613.473011] [<ffffffff80292af2>] __slab_alloc+0x3cb/0x4ba
[ 613.473011] [<ffffffff804c4b9c>] r1bio_pool_alloc+0x24/0x3d
[ 613.473011] [<ffffffff804c4b9c>] r1bio_pool_alloc+0x24/0x3d
[ 613.473011] [<ffffffff80293381>] __kmalloc+0x98/0xc1
[...]
[ 613.473011] FIX kmalloc-96: Restoring 0xffff8800be0aa950-0xffff8800b=
e0aa950=3D0x6b
[ 613.473011]
[ 613.473011] FIX kmalloc-96: Marking all objects used

2009-01-15 19:12:11

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Thursday 15 January 2009 18:55:19 Artur Skawina wrote:
>> Artur Skawina wrote:
>>> Artur Skawina wrote:
>>>>>>> reboots, but locks up after only a few packets go over the hostap driven
>>>>>>> p54usb device. I need the box to be up, that limits the number of tests i can
>>>> after switching from SLUB to SLAB and enabling some debugging i finally caught this:
>>> and slub debugging gave something a bit more useful:
>>>
>>> =============================================================================
>>> BUG kmalloc-4096: Poison overwritten
>> tried reproducing (using the old w-t/pending pull) on different machine,
>> but no oops or crash there...
>> (similar connectivity issues though, such as connecting only works
>> exactly once after starting hostapd, to reconnect i have to restart
>> hostapd)
> hmm, maybe Windows Mobile uses another PS mechanism, e.g WMM-PS.

hmm, i'll have to set up another device to sniff the traffic i guess.

>> Upgraded to current wireless-testing/pending on the problematic box
>> and almost immediately got [1]. No slab corruption this time (at least
>> not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.
>>
>> artur
> yes that's a bug, p54_set_tim must use GFP_ATOMIC, do you want to post the patch?

nah, too trivial :) fwiw i used this:

diff --git a/drivers/net/wireless/p54/p54common.c b/drivers/net/wireless/p54/p54common.c
index 635b423..579bc6c 100644
--- a/drivers/net/wireless/p54/p54common.c
+++ b/drivers/net/wireless/p54/p54common.c
@@ -1295,7 +1295,7 @@ static int p54_set_tim(struct ieee80211_hw *dev, struct ieee80211_sta *sta,
struct p54_tim *tim;

skb = p54_alloc_skb(dev, P54_HDR_FLAG_CONTROL_OPSET, sizeof(*tim),
- P54_CONTROL_TYPE_TIM, GFP_KERNEL);
+ P54_CONTROL_TYPE_TIM, GFP_ATOMIC);
if (!skb)
return -ENOMEM;


Will try to do some more tests in the next few days.

Thanks,

artur

2009-01-16 20:38:56

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Friday 16 January 2009 10:13:07 Johannes Berg wrote:
> On Thu, 2009-01-15 at 21:18 -0600, Larry Finger wrote:
>
> > >> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk
>
> > I too have seen real single bit changes - in my case 6b went to 6a,
> > and my memory is fine. I wouldn't necessarily blame your hardware.
>
> 6b to 6a is often the result of a refcounting bug that happens to unref
> a value _after_ it has been freed. But that doesn't explain 6b to 69,
> unless you happen to have _two_ refcounting bugs. Not that I necessarily
> think that memory is bad
Well, this idiotic debug patch (kref-kernel-debug-patch) could shed some light into
the problem who's using a freed skb.
(and please, I _know_ there is already a better solution for this refcount-thing... but where?)

Of course, there's some sample code to test it... see refcount-test-module.c

BTW, & 0xfffffff0) == 0x6b6b6b60); can detect even more...

Regards,
Chr


Attachments:
(No filename) (980.00 B)
kref-kernel-debug.patch (420.00 B)
refcount-test-module.c (1.31 kB)
Download all attachments

2009-01-15 18:53:45

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Thursday 15 January 2009 18:55:19 Artur Skawina wrote:
> Artur Skawina wrote:
> > Artur Skawina wrote:
> >>>>> reboots, but locks up after only a few packets go over the hostap driven
> >>>>> p54usb device. I need the box to be up, that limits the number of tests i can
> >
> >> after switching from SLUB to SLAB and enabling some debugging i finally caught this:
> >
> > and slub debugging gave something a bit more useful:
> >
> > =============================================================================
> > BUG kmalloc-4096: Poison overwritten
>
> tried reproducing (using the old w-t/pending pull) on different machine,
> but no oops or crash there...
> (similar connectivity issues though, such as connecting only works
> exactly once after starting hostapd, to reconnect i have to restart
> hostapd)
hmm, maybe Windows Mobile uses another PS mechanism, e.g WMM-PS.

> Upgraded to current wireless-testing/pending on the problematic box
> and almost immediately got [1]. No slab corruption this time (at least
> not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.
>
> artur
yes that's a bug, p54_set_tim must use GFP_ATOMIC, do you want to post the patch?

> [1]:
> BUG: scheduling while atomic: named/2004/0x10000200
> Modules linked in: netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 videobuf_dma_sg serio_raw budget videobuf_core tveeprom budget_ci budget_core saa7146 ttpci_eeprom ir_common [last unloaded: microcode]
> Pid: 2004, comm: named Not tainted 2.6.29-rc1-00271-ge9fa6b0 #45
> Call Trace:
> [<c04d4ef7>] schedule+0x2a7/0x320
> [<c03aed74>] __alloc_skb+0x34/0x110
> [<c011f5b3>] __cond_resched+0x13/0x30
> [<c04d501d>] _cond_resched+0x2d/0x40
> [<c016d8c5>] kmem_cache_alloc+0x95/0xc0
> [<c016b8d4>] check_object+0xc4/0x230
> [<c03aed74>] __alloc_skb+0x34/0x110
> [<c02ede91>] p54_alloc_skb+0x71/0xf0
> [<c02ee36f>] p54_set_tim+0x3f/0xa0
> [<c04ae064>] sta_info_set_tim_bit+0x64/0x80

Regards,
Chr

2009-01-19 01:18:02

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
>> Artur Skawina wrote:
>>> didn't trigger anything here, just the usual:
>>>
>>> BUG kmalloc-4096: Poison overwritten
>
>> Still haven't found the corruptor, but at least i've narrowed it down a bit;
>> what i'm seeing is:
>>
>> 1) an skb "S" gets allocated in p54u_rx_cb and is submitted together w/ the urb.
>> 2) "S" later comes back to p54u_rx_cb, where it is given to p54_rx (eventually
>> ieee80211_rx_irqsafe) and a new one is allocated.
>> 3) a few (~15) rx/tx packets pass.
>> 4) SLUB detects modified poison in what used to be S->head in (1) and (2) above;
>> usually 0x6b turns into 0x6a, but i have also seen 0x69, just a few times.
>> (the offset from skb->head to the decremented byte seems to stay the same,
>> at least during the few times i tried w/ the same kernel, last one was eg
>> 684 bytes)
>>
>> This is almost 100% reproducible; sometimes the machine freezes instead.
>
> Do you know what is inside "S" when it handed over to ieee80211_rx_irqsafe?
> Is it always the same content, or is it sometimes a data or mgmt frame?

Last time the urb was 88 bytes, that would make the packet <=72 bytes; i didn't
log the content. will do.

> Another shot in the dark: do you have a daemon that listen to mon.wlan0, wmaster / "any" other than hostapd?

I usually have tcpdump listening on wlan0, nothing else (other than dhcpd).

artur

2009-01-13 21:39:05

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Christian Lamparter wrote:

>>> The machine has 512M, ~100M should be (usually is) free, is under constant light
>>> load (typically <2k ints/s, 60% idle) and is running fine for weeks/months between
>>> reboots, but locks up after only a few packets go over the hostap driven
>>> p54usb device. I need the box to be up, that limits the number of tests i can
>>> run, at least as long as the lockups w/o any diagnostics happen...
>> Do keyboard-leds "flash" when it locks up, or does console respond
>> if you press alt-sysrq-m / alt-sysrq-w on the connected keyboard?
>
> most of the times it happened there was no kbd attached. At least once
> when it _was_ connected, sysrq was working, and i saw 0*8KB; that's why
> i initially suspected fragmentation.
>
>> ( If your box has a serial port, you can try to get the logs from there... )

after switching from SLUB to SLAB and enabling some debugging i finally caught this:

------------[ cut here ]------------
Kernel BUG at c016a8a3 [verbose debug info unavailable]
invalid opcode: 0000 [#1]
last sysfs file: /sys/devices/pci0000:00/0000:00:07.2/usb1/1-1/1-1.1/uevent
Modules linked in: netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 budget videobuf_dma_sg budget_ci budget_core saa7146 ttpci_eeprom videobuf_core tveeprom serio_raw ir_common [last unloaded: netconsole]

Pid: 1885, comm: named Not tainted (2.6.28-rc8-00519-g90435df #42)
EIP: 0060:[<c016a8a3>] EFLAGS: 00210012 CPU: 0
EIP is at cache_free_debugcheck+0x203/0x250
EAX: dfb6c71f EBX: df803d20 ECX: dfb6c03f EDX: 00000002
ESI: dfb6c720 EDI: 00000370 EBP: c1000000 ESP: c0669f74
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process named (pid: 1885, ti=c0669000 task=df8443d0 task.ti=deb85000)
Stack:
00000000 df809660 d31d4528 00000003 00000000 00000002 c137c440 c060e2dc
c01483e2 dfb6c000 df808d38 df803d20 c069cb40 00200286 c016a911 00000000
00000005 c069cb40 00000009 c01483e2 00000020 00000001 00000100 c014850f
Call Trace:
[<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
[<c016a911>] kmem_cache_free+0x21/0x60
[<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
[<c014850f>] rcu_process_callbacks+0xf/0x20
[<c0127a37>] __do_softirq+0x57/0xf0
[<c01279e0>] __do_softirq+0x0/0xf0
<IRQ> <0> [<c01277e5>] irq_exit+0x45/0x70
[<c0112590>] smp_apic_timer_interrupt+0x40/0x70
[<c0103d9c>] apic_timer_interrupt+0x28/0x30
Code: 8b 44 24 24 b9 fe ff ff ff 89 4c 90 1c f6 43 19 08 74 0e b9 6b 00 00 00 89 f2 89 d8 e8 e7 fa ff ff 83 c4 28 89 f0 5b 5e 5f 5d c3 <0f> 0b eb fe 0f 0b eb fe 8b 43 10 8d 44 06 f8 8d b6 00 00 00 00
EIP: [<c016a8a3>] cache_free_debugcheck+0x203/0x250 SS:ESP 0068:c0669f74
Kernel panic - not syncing: Fatal exception in interrupt

followed after some time by lots of page alloc failures [1].

artur

[1]:
[...]
__ratelimit: 1551 callbacks suppressed
named: page allocation failure. order:0, mode:0x20
Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
Call Trace:
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
named: page allocation failure. order:0, mode:0x20
Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
Call Trace:
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c016b573>] cache_alloc_refill+0x363/0x710
[<c03a52c4>] __alloc_skb+0x34/0x120
[<c016bcc1>] kmem_cache_alloc+0xe1/0xf0
[<c03a52c4>] __alloc_skb+0x34/0x120
[<c03b8205>] find_skb+0x35/0x90
[<c03b840e>] netpoll_send_udp+0x2e/0x200
[<e33661ad>] write_msg+0x9d/0xe0 [netconsole]
[<e3366110>] write_msg+0x0/0xe0 [netconsole]
[<c0123443>] __call_console_drivers+0x43/0x50
[<c01238bb>] release_console_sem+0x13b/0x1c0
[<c0123dd7>] vprintk+0x227/0x2d0
[<c0123443>] __call_console_drivers+0x43/0x50
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c04c30c0>] printk+0x17/0x1f
[<c0105909>] print_trace_address+0x49/0x60
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c01059a4>] dump_trace+0x84/0x100
[<c0105fde>] show_trace+0x4e/0x60
[<c04c2fc1>] dump_stack+0x6e/0x73
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c016b573>] cache_alloc_refill+0x363/0x710
[<c03a52c4>] __alloc_skb+0x34/0x120
[<c03a539e>] __alloc_skb+0x10e/0x120
[<c016ba6e>] __kmalloc_track_caller+0x14e/0x160
[<c016bc53>] kmem_cache_alloc+0x73/0xf0
[<c03a5da9>] dev_alloc_skb+0x19/0x30
[<c03a52e5>] __alloc_skb+0x55/0x120
[<c03a5da9>] dev_alloc_skb+0x19/0x30
[<c02ced8e>] boomerang_rx+0x15e/0x520
[<c02d04cf>] boomerang_interrupt+0x13f/0x480
[<e109d6a9>] budget_ci_irq+0xa9/0x100 [budget_ci]
[<c0103d9c>] apic_timer_interrupt+0x28/0x30
[<c0146348>] handle_IRQ_event+0x28/0x50
[<c0147600>] handle_level_irq+0x0/0xb0
[<c014764b>] handle_level_irq+0x4b/0xb0
<IRQ> [<c0103d6f>] common_interrupt+0x23/0x28
[<c024007b>] prio_tree_right+0xab/0x100
[<c02442f7>] delay_tsc+0x17/0x20
[<c0244298>] __const_udelay+0x18/0x20
[<c04c304a>] panic+0x84/0xe3
[<c010584c>] oops_end+0x7c/0x90
[<c01045d0>] do_invalid_op+0x0/0xa0
[<c0104651>] do_invalid_op+0x81/0xa0
[<c016a8a3>] cache_free_debugcheck+0x203/0x250
[<c011d233>] __wake_up_common+0x43/0x70
[<c04c4b82>] error_code+0x6a/0x70
[<c016a8a3>] cache_free_debugcheck+0x203/0x250
[<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
[<c016a911>] kmem_cache_free+0x21/0x60
[<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
[<c014850f>] rcu_process_callbacks+0xf/0x20
[<c0127a37>] __do_softirq+0x57/0xf0
[<c01279e0>] __do_softirq+0x0/0xf0
<IRQ> [<c01277e5>] irq_exit+0x45/0x70
[<c0112590>] smp_apic_timer_interrupt+0x40/0x70
[<c0103d9c>] apic_timer_interrupt+0x28/0x30
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 174
Active_anon:13626 active_file:3702 inactive_anon:11682
inactive_file:91928 unevictable:5 dirty:48 writeback:0 unstable:0
free:737 slab:3377 mapped:2606 pagetables:219 bounce:0
DMA free:2004kB min:84kB low:104kB high:124kB active_anon:24kB inactive_anon:28kB active_file:104kB inactive_file:8164kB unevictable:0kB present:15872kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 492 492
Normal free:944kB min:2792kB low:3488kB high:4188kB active_anon:54480kB inactive_anon:46700kB active_file:14704kB inactive_file:359548kB unevictable:20kB present:503928kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2004kB
Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 944kB
95760 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 530104kB
Total swap = 530104kB
131070 pages RAM
2635 pages reserved
10978 pages shared
121856 pages non-shared
named: page allocation failure. order:0, mode:0x20
Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
Call Trace:
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c016b573>] cache_alloc_refill+0x363/0x710
[<c03a52c4>] __alloc_skb+0x34/0x120
[<c016bcc1>] kmem_cache_alloc+0xe1/0xf0
[<c03a52c4>] __alloc_skb+0x34/0x120
[<c03b739b>] refill_skbs+0x5b/0x70
[<c03b81e9>] find_skb+0x19/0x90
[<c0266d90>] bit_cursor+0x0/0x610
[<c03b840e>] netpoll_send_udp+0x2e/0x200
[<e33661ad>] write_msg+0x9d/0xe0 [netconsole]
[<e3366110>] write_msg+0x0/0xe0 [netconsole]
[<c0123443>] __call_console_drivers+0x43/0x50
[<c01238bb>] release_console_sem+0x13b/0x1c0
[<c0123dd7>] vprintk+0x227/0x2d0
[<c0123443>] __call_console_drivers+0x43/0x50
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c04c30c0>] printk+0x17/0x1f
[<c0105909>] print_trace_address+0x49/0x60
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c01059a4>] dump_trace+0x84/0x100
[<c0105fde>] show_trace+0x4e/0x60
[<c04c2fc1>] dump_stack+0x6e/0x73
[<c01505cd>] __alloc_pages_internal+0x35d/0x470
[<c016b573>] cache_alloc_refill+0x363/0x710
[<c03a52c4>] __alloc_skb+0x34/0x120
[<c03a539e>] __alloc_skb+0x10e/0x120
[<c016ba6e>] __kmalloc_track_caller+0x14e/0x160
[<c016bc53>] kmem_cache_alloc+0x73/0xf0
[<c03a5da9>] dev_alloc_skb+0x19/0x30
[<c03a52e5>] __alloc_skb+0x55/0x120
[<c03a5da9>] dev_alloc_skb+0x19/0x30
[<c02ced8e>] boomerang_rx+0x15e/0x520
[<c02d04cf>] boomerang_interrupt+0x13f/0x480
[<e109d6a9>] budget_ci_irq+0xa9/0x100 [budget_ci]
[<c0103d9c>] apic_timer_interrupt+0x28/0x30
[<c0146348>] handle_IRQ_event+0x28/0x50
[<c0147600>] handle_level_irq+0x0/0xb0
[<c014764b>] handle_level_irq+0x4b/0xb0
<IRQ> [<c0103d6f>] common_interrupt+0x23/0x28
[<c024007b>] prio_tree_right+0xab/0x100
[<c02442f7>] delay_tsc+0x17/0x20
[<c0244298>] __const_udelay+0x18/0x20
[<c04c304a>] panic+0x84/0xe3
[<c010584c>] oops_end+0x7c/0x90
[<c01045d0>] do_invalid_op+0x0/0xa0
[<c0104651>] do_invalid_op+0x81/0xa0
[<c016a8a3>] cache_free_debugcheck+0x203/0x250
[<c011d233>] __wake_up_common+0x43/0x70
[<c04c4b82>] error_code+0x6a/0x70
[<c016a8a3>] cache_free_debugcheck+0x203/0x250
[<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
[<c016a911>] kmem_cache_free+0x21/0x60
[<c01483e2>] __rcu_process_callbacks+0xd2/0x1f0
[<c014850f>] rcu_process_callbacks+0xf/0x20
[<c0127a37>] __do_softirq+0x57/0xf0
[<c01279e0>] __do_softirq+0x0/0xf0
<IRQ> [<c01277e5>] irq_exit+0x45/0x70
[<c0112590>] smp_apic_timer_interrupt+0x40/0x70
[<c0103d9c>] apic_timer_interrupt+0x28/0x30
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 174
Active_anon:13626 active_file:3702 inactive_anon:11682
inactive_file:91928 unevictable:5 dirty:48 writeback:0 unstable:0
free:737 slab:3377 mapped:2606 pagetables:219 bounce:0
DMA free:2004kB min:84kB low:104kB high:124kB active_anon:24kB inactive_anon:28kB active_file:104kB inactive_file:8164kB unevictable:0kB present:15872kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 492 492
Normal free:944kB min:2792kB low:3488kB high:4188kB active_anon:54480kB inactive_anon:46700kB active_file:14704kB inactive_file:359548kB unevictable:20kB present:503928kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2004kB
Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 944kB
95760 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 530104kB
Total swap = 530104kB
131070 pages RAM
2635 pages reserved
10978 pages shared
121856 pages non-shared
named: page allocation failure. order:0, mode:0x20
[...]

2009-01-15 18:56:37

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Artur Skawina wrote:
>> Artur Skawina wrote:
>>>>>> reboots, but locks up after only a few packets go over the hostap driven
>>>>>> p54usb device. I need the box to be up, that limits the number of tests i can
>>> after switching from SLUB to SLAB and enabling some debugging i finally caught this:
>> and slub debugging gave something a bit more useful:
>>
>> =============================================================================
>> BUG kmalloc-4096: Poison overwritten
>
> tried reproducing (using the old w-t/pending pull) on different machine,
> but no oops or crash there...
> (similar connectivity issues though, such as connecting only works
> exactly once after starting hostapd, to reconnect i have to restart
> hostapd)
>
> Upgraded to current wireless-testing/pending on the problematic box
> and almost immediately got [1]. No slab corruption this time (at least
> not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.

Switching p54_set_tim to use GFP_ATOMIC seemed to help, a few
quick tests resulted in no crashes and oopses; unfortunately
ipv4 wasn't working all that well -- arp didn't work at all in
fact.

Tried the newer fw, but it does not seem to like my hw:

usb 1-1.1: new full speed USB device using uhci_hcd and address 6
usb 1-1.1: New USB device found, idVendor=0846, idProduct=4200
usb 1-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1.1: configuration #1 chosen from 1 choice
usb 1-1.1: firmware: requesting isl3886usb
phy1: p54 detected a LM86 firmware
p54: rx_mtu reduced from 3240 to 2392
phy1: FW rev 2.13.24.0 - Softmac protocol 5.9
phy1: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:YES
p54usb: probe of 1-1.1:1.0 failed with error -110

Last one working is:

usb 1-1.1: new full speed USB device using uhci_hcd and address 5
usb 1-1.1: New USB device found, idVendor=0846, idProduct=4200
usb 1-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1.1: configuration #1 chosen from 1 choice
usb 1-1.1: firmware: requesting isl3886usb
phy0: p54 detected a LM86 firmware
p54: rx_mtu reduced from 3240 to 2392
phy0: FW rev 2.13.1.0 - Softmac protocol 5.5
phy0: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:YES
phy0: hwaddr 00:30:f1:b1:09:e6, MAC:isl3886 RF:Frisbee
wmaster0 (p54usb): not using net_device_ops yet
phy0: Selected rate control algorithm 'minstrel'
wlan0 (p54usb): not using net_device_ops yet
udev: renamed network interface wlan0 to ap0
mon.ap0 (p54usb): not using net_device_ops yet
cfg80211: Calling CRDA for country: US


Disconnecting/reconnecting the wireless client resulted in this:
> mac80211-phy0: failed to remove key (0, 00:1b:fb:12:34:56) from hardware (-12)
followed later, when killing hostapd, by:
> mac80211-phy0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-12)


artur

2009-01-19 23:32:37

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Tuesday 20 January 2009 00:17:13 Artur Skawina wrote:
> Artur Skawina wrote:
> > Christian Lamparter wrote:
> >> On Monday 19 January 2009 22:53:03 Artur Skawina wrote:
> >>> We can move the FREE_AFTER_TX(skb) check from the completion to the submission
> >>> path, right? Then find a way to do the pull _before_ giving away the skbs.
> >>> I can't shutdown the machine where i can reproduce this today, so it will have
> >>> to wait until at least tomorrow.
> >>>
> >>> artur
> >> Like this?
> >>
> > yes, that's what i was thinking. Will FREE_AFTER_TX(skb) be false for all skbs
> > that are dropped in p54_rx_frame_sent()?
Not exactly, but it's the right direction. FREE_AFTER_TX is true for _setting_ control frames... e.g frequency/filter/dcf.

> > Also most of the push/pull business in completions can probably go, we just have to set
> > up the pointers right on submission, ie transfer the rx header into the headroom, then
> > no skb manipulations in completions should be needed, and these kind of races should
> > be harmless.
>
> I only looked at the FREE_AFTER_TX(skb) part and completely missed that you had already
> removed the pull, sorry.
Yeah, this will make life easier for the "stuck package" detection...
Although, I'll miss the skb_push & skb_pull checks. But that's a small price,
if your device finally works. (Well "works" is maybe a bit too much,
as you'll probably run into other bugs, e.g truesize bug or the "shot rc", or
more problems with powersaving station that only have apsd).

Regards,
Chr

2009-01-15 17:55:26

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Artur Skawina wrote:
> Artur Skawina wrote:
>>>>> reboots, but locks up after only a few packets go over the hostap driven
>>>>> p54usb device. I need the box to be up, that limits the number of tests i can
>
>> after switching from SLUB to SLAB and enabling some debugging i finally caught this:
>
> and slub debugging gave something a bit more useful:
>
> =============================================================================
> BUG kmalloc-4096: Poison overwritten

tried reproducing (using the old w-t/pending pull) on different machine,
but no oops or crash there...
(similar connectivity issues though, such as connecting only works
exactly once after starting hostapd, to reconnect i have to restart
hostapd)

Upgraded to current wireless-testing/pending on the problematic box
and almost immediately got [1]. No slab corruption this time (at least
not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.

artur


[1]:
BUG: scheduling while atomic: named/2004/0x10000200
Modules linked in: netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 videobuf_dma_sg serio_raw budget videobuf_core tveeprom budget_ci budget_core saa7146 ttpci_eeprom ir_common [last unloaded: microcode]
Pid: 2004, comm: named Not tainted 2.6.29-rc1-00271-ge9fa6b0 #45
Call Trace:
[<c04d4ef7>] schedule+0x2a7/0x320
[<c03aed74>] __alloc_skb+0x34/0x110
[<c011f5b3>] __cond_resched+0x13/0x30
[<c04d501d>] _cond_resched+0x2d/0x40
[<c016d8c5>] kmem_cache_alloc+0x95/0xc0
[<c016b8d4>] check_object+0xc4/0x230
[<c03aed74>] __alloc_skb+0x34/0x110
[<c02ede91>] p54_alloc_skb+0x71/0xf0
[<c02ee36f>] p54_set_tim+0x3f/0xa0
[<c04ae064>] sta_info_set_tim_bit+0x64/0x80
[<c04c1017>] invoke_tx_handlers+0xd57/0xd80
[<c016c397>] free_debug_processing+0x197/0x210
[<c03ae215>] pskb_expand_head+0xf5/0x170
[<c04bfd94>] __ieee80211_tx_prepare+0x164/0x2f0
[<c04c1a8d>] ieee80211_skb_resize+0x6d/0xe0
[<c04c250f>] ieee80211_master_start_xmit+0x23f/0x550
[<c016d188>] __slab_alloc+0x2b8/0x4f0
[<c013a711>] getnstimeofday+0x51/0x120
[<c03b5e7b>] dev_hard_start_xmit+0x1db/0x240
[<c03c6a4b>] __qdisc_run+0x1ab/0x200
[<c0136aa1>] __run_hrtimer+0x31/0xf0
[<c03b6247>] dev_queue_xmit+0x247/0x500
[<c04c1e56>] ieee80211_subif_start_xmit+0x356/0x7d0
[<c0466ff7>] packet_rcv_spkt+0x37/0x150
[<c0466ff7>] packet_rcv_spkt+0x37/0x150
[<c03b5e7b>] dev_hard_start_xmit+0x1db/0x240
[<c03c6a4b>] __qdisc_run+0x1ab/0x200
[<c03b6247>] dev_queue_xmit+0x247/0x500
[<c03bc1e2>] neigh_resolve_output+0xe2/0x200
[<c0410080>] ip_finish_output+0x0/0x290
[<c0410267>] ip_finish_output+0x1e7/0x290
[<c040f355>] ip_local_out+0x15/0x20
[<c040f5d2>] ip_push_pending_frames+0x272/0x380
[<c042bbc6>] udp_push_pending_frames+0x146/0x3a0
[<c042d52a>] udp_sendmsg+0x2fa/0x6b0
[<c0433bc7>] inet_sendmsg+0x37/0x70
[<c03a7b7e>] sock_sendmsg+0xbe/0x100
[<c0133cd0>] autoremove_wake_function+0x0/0x50
[<c011c043>] __wake_up_common+0x43/0x70
[<c024a892>] copy_from_user+0x32/0x130
[<c024a892>] copy_from_user+0x32/0x130
[<c03b001e>] verify_iovec+0x2e/0xb0
[<c03a7d3f>] sys_sendmsg+0x17f/0x290
[<c017730a>] pipe_write+0x29a/0x570
[<c013a172>] update_wall_time+0x492/0x8e0
[<c013a711>] getnstimeofday+0x51/0x120
[<c011b05d>] sched_slice+0x3d/0x80
[<c013a711>] getnstimeofday+0x51/0x120
[<c0136657>] hrtimer_forward+0x147/0x1a0
[<c01101b0>] lapic_next_event+0x10/0x20
[<c013ccb3>] clockevents_program_event+0xa3/0x170
[<c03a9054>] sys_socketcall+0xa4/0x290
[<c0110920>] smp_apic_timer_interrupt+0x40/0x70
[<c0103165>] sysenter_do_call+0x12/0x25

2009-01-16 22:10:31

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Friday 16 January 2009 10:13:07 Johannes Berg wrote:
>> On Thu, 2009-01-15 at 21:18 -0600, Larry Finger wrote:
>>
>>>>> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk
>>> I too have seen real single bit changes - in my case 6b went to 6a,
>>> and my memory is fine. I wouldn't necessarily blame your hardware.
>> 6b to 6a is often the result of a refcounting bug that happens to unref
>> a value _after_ it has been freed. But that doesn't explain 6b to 69,
>> unless you happen to have _two_ refcounting bugs. Not that I necessarily
>> think that memory is bad
> Well, this idiotic debug patch (kref-kernel-debug-patch) could shed some light into
> the problem who's using a freed skb.

didn't trigger anything here, just the usual:

BUG kmalloc-4096: Poison overwritten
-----------------------------------------------------------------------------

INFO: 0xdc8161b0-0xdc8161b0. First byte 0x6a instead of 0x6b
INFO: Allocated in dev_alloc_skb+0x19/0x30 age=1762 cpu=0 pid=3530
INFO: Freed in __kfree_skb+0xf/0x90 age=613 cpu=0 pid=3527
INFO: Slab 0xc1390200 objects=7 used=5 fp=0xdc816120 flags=0x400020c2
INFO: Object 0xdc816120 @offset=24864 fp=0xdc8140c0

but no other warnings...

artur

2009-01-19 22:38:06

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Monday 19 January 2009 22:53:03 Artur Skawina wrote:
> Christian Lamparter wrote:
> > On Monday 19 January 2009 19:15:09 Artur Skawina wrote:
> >> Christian Lamparter wrote:
> >>> On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
> >>>> Artur Skawina wrote:
> >>>>> didn't trigger anything here, just the usual:
> >>>>>
> >>>>> BUG kmalloc-4096: Poison overwritten
> >>>> This is almost 100% reproducible; sometimes the machine freezes instead.
> >> Here's an interesting sequence:
> >>
> >> 1) a TX urb is submitted.
> >> 2) p54u_rx_cb() => p54_rx_frame_sent(), which does kfree_skb( the_skb_in_(1) ).
> >> 3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kaboom.
> >>
> >> IOW the skb is freed before the usb completion runs.
> >
> > Well, the sequence should be:
> >
> > 1) p54_tx gets called
> > 1.1) one IRQ urb is submitted
> > 1.2) one BULK urb is submitted
> > 2) the firmware acks that it got the urbs
> > 2.1) p54u_tx_cb is called for the IRQ urb. which frees the small buffer
> > 2.2) p54u_tx_cb is called for the BULK urb. which only removes the net2280_tx_hdr from the skb.
> > [time passes]
> > 3) firmware is finished sending.
> > 3.1) p54u_rx_cb gets called
> > => p54_rx_frame_sent passed the feedback to mac80211
>
> That's what one would expect, and is probably why i couldn't see anything
> wrong in the code despite going over it several times. Until i got a crash
> which left no doubt as to what happened, and made me notice the "wrong"
> completion order, log attached [1].
> In theory, theory and practice do not differ, in practice...
>
> >> Somehow i don't think this is the reason for the corruption, but it certainly
> >> seems to be responsible for some, if not all, of the crashes/panics.
> > dunno... we should see a bit more fallout, because skb_pull changes skb->data and skb->len.
>
> Doing an skb_pull in p54u_tx_cb on skbs that have already been given to mac80211
> cannot be good.
> We can move the FREE_AFTER_TX(skb) check from the completion to the submission
> path, right? Then find a way to do the pull _before_ giving away the skbs.
> I can't shutdown the machine where i can reproduce this today, so it will have
> to wait until at least tomorrow.
>
> artur

Like this?


Attachments:
(No filename) (2.20 kB)
p54-diff (3.58 kB)
Download all attachments

2009-01-15 22:40:58

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Thursday 15 January 2009 21:06:24 Artur Skawina wrote:
> Christian Lamparter wrote:
> > On Thursday 15 January 2009 20:12:03 Artur Skawina wrote:
> >>>> Upgraded to current wireless-testing/pending on the problematic =
box
> >>>> and almost immediately got [1]. No slab corruption this time (at=
least
> >>>> not yet). Will switch to GFP_ATOMIC and retry w/ the new fw.
>=20
> >>> yes that's a bug, p54_set_tim must use GFP_ATOMIC, do you want to=
post the patch?
> >> nah, too trivial :) fwiw i used this:
> > well, it should be "done", right? :-)
> >=20
> > just add "John W. Linville" <[email protected]>" to the cc,
> > change the subject line to "[PATCH] p54: set_tim must be atomic."
> > and put a Signed-off-by: your name <your@mail> in the mail body and=
a "---" to seperate the commit message
> > and then the patch. And it will be in the next round of -rcX fixes.
>=20
> well, i really meant the 'too trivial' part and was hoping you'd do i=
t ;)
> ok, will resend.
>=20
well... I'm still looking for an explanation for your other problems:
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D
BUG kmalloc-4096: Poison overwritten
-----------------------------------------------------------------------=
------

=C2=A0 Object 0xddec18c0: =C2=A06b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b =
6b 6b 6b kkkkkkkkkkkkkkkk
=C2=A0 Object 0xddec18d0: =C2=A0>69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6=
b 6b 6b 6b ikkkkkkkkkkkkkkk
=C2=A0 Object 0xddec18e0: =C2=A06b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b =
6b 6b 6b kkkkkkkkkkkkkkkk

The odd thing is here that it's only a "single" bit in the object flipp=
ed and not more!
And we all have slub/slab debug options enabled as well (in fact, we ha=
d to fix the usbdriver for that,
see patch "p54usb: rewriting rx/tx routines to make use of usb_anchor's=
facilities" )

can you check your RAM with memtest or something?

--------------------------
Disconnecting/reconnecting the wireless client resulted in this:
> mac80211-phy0: failed to remove key (0, 00:1b:fb:12:34:56) from hardw=
are (-12)
followed later, when killing hostapd, by:
> mac80211-phy0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardw=
are (-12)

well, -12 is -ENOMEM. Well we can use GFP_KERNEL in p54_set_key so its =
unlikely that your "box"
doesn't have enough space, but your p54usb device 'itself' has no more =
room left... and
this is unlikely too, so it probably stalled/crashed.

We fixed a something like this before for the 2nd generation devices:
http://www.spinics.net/lists/linux-wireless/msg25692.html

However, I don't know if the net2280 chip on your device needs it or no=
t, but it should hurt if you try it.

Regards,
Chr

2009-01-16 09:13:40

by Johannes Berg

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Thu, 2009-01-15 at 21:18 -0600, Larry Finger wrote:

> >> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk

> I too have seen real single bit changes - in my case 6b went to 6a,
> and my memory is fine. I wouldn't necessarily blame your hardware.

6b to 6a is often the result of a refcounting bug that happens to unref
a value _after_ it has been freed. But that doesn't explain 6b to 69,
unless you happen to have _two_ refcounting bugs. Not that I necessarily
think that memory is bad

johannes


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2009-01-13 16:45:12

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> On Monday 12 January 2009 18:09:47 Artur Skawina wrote:
>> kernel: phy0: p54 detected a LM86 firmware
>> kernel: p54: rx_mtu reduced from 3240 to 2392
>> kernel: phy0: FW rev 2.13.1.0 - Softmac protocol 5.5
> Well, I guess you got the latest wireless-testing tree, right?

Actually that was with the wireless testing tree from ~ mid-december,
plus the fix from this thread, as i wanted to get things working
before upgrading and running into new regressions. As that didn't
work that well, i've applied some p54 patches, hence the rx_mtu
line above; it didn't make any significant difference - the box
still dies within minutes.

> Then you can try if the "latest" firmware (the ones for => 2.6.29) improve the situation.

will do, after upgrading the kernel (could take a few days).

>> so to test the device further i set up hostapd, which mostly worked; after
>> a few tweaks (eg patching out country code setting in hostapd)
> patching out country code?! why that?
> currently, hostapd is the only thing that works out-of-the-box :-D

hostapd was aborting no matter what country code i configured, so i
disabled that, leaving figuring this out for later.
[also hostapd is reporting some TX failures, didn't investigate yet]

> you should execute "iw reg set XY" where XY is a 2-(upper-case)-letter ID of your country, e.g US = USA, CZ, DE, FR, etc...
> (see http://wireless.kernel.org/en/users/Documentation/iw )
> and then you have to wait a bit (e.g sleep 1) before you can start hostapd.

will try.

>> i got an almost working AP. "Almost" because connecting a client works sometimes,
>> but not always, and there seems to be way too much packet loss going on.
>> "dtim_period=1" seemed to improve things enough to achieve basic ipv4
>> connectivity (w/o it dhcp seemed to work, but arp usually did not).
> well, do you have another wifi device which has sniffer/monitor mode?

Another identical p54usb device will do i guess.

> It sounds like your client or AP has a problem with power save management.

yes, the client in this case was a windows mobile device, hostap being the ap.

>> But then i ran into a more serious issue -- the machine locks up completely
>> when trying to use the device (just connecting and disconnecting a client is
>> often enough). Happens after ~ a minute, the computer freezes, isn't
>> responsible over network, nothing in logs. It's a headless box, tried
>> hooking up a kbd and monitor, but that didn't help - console was dead too.
>> After setting up netconsole i first got two skb truesize warnings [1], then
>> after a short while again a complete lock up, nothing on the console.
>> Then tried plugging in a keyboard and that provoked a stream of order 0 page
>> allocation failures, but the machine remained dead.
> "order 0 page allocation failures"? Your machine suffers from a serios memory starvation.
> Either something is leaking, or you have not very much memory do you in your "headless box", or both...

it certainly looks like a leak, at first i suspected memory fragmentation, but
Larrys patch didn't help; later the order-0 alloc failures appeared.
The machine has 512M, ~100M should be (usually is) free, is under constant light
load (typically <2k ints/s, 60% idle) and is running fine for weeks/months between
reboots, but locks up after only a few packets go over the hostap driven
p54usb device. I need the box to be up, that limits the number of tests i can
run, at least as long as the lockups w/o any diagnostics happen...

artur

2009-01-13 19:59:54

by Larry Finger

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> well, you have a 32-bit architecture, unlike me or larry.
> So, with a bit of luck the skbs always fitted into a single page and you never
> had problems with memory fragmentation.

Whether it is a 32- or 64-bit kernel, it has a 4K page size. If
that were not true, the rx_mtu reduction would not have triggered.

I have additional information obtained from a kmemleak kernel. Driver
p54usb and mac80211 do not show any memory leaks, at least in STA
mode. I tested this by running normal traffic for about 15 minutes,
and then rmmod'ed p54usb. If kernel leaks are the problem, then they
are arising from hostapd. Userland memory leaks can be checked with
valgrind.

Larry



2009-01-15 23:59:30

by Artur Skawina

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

Christian Lamparter wrote:
> well... I'm still looking for an explanation for your other problems:
> =============================================================================
> BUG kmalloc-4096: Poison overwritten
> -----------------------------------------------------------------------------
>
> Object 0xddec18c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> Object 0xddec18d0: >69< 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ikkkkkkkkkkkkkkk
> Object 0xddec18e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
>
> The odd thing is here that it's only a "single" bit in the object flipped and not more!

yes, that does not look good. i didn't find anything obvious under wireless/p54
that could be responsible for it, didn't look at the rest of the stack yet.
I'd say ignore this for now, until i'm able to reproduce it; one try on a very
different machine didn't help and i didn't have a lot of time to retest after
the kernel upgrade(s) today.

> And we all have slub/slab debug options enabled as well (in fact, we had to fix the usbdriver for that,
> see patch "p54usb: rewriting rx/tx routines to make use of usb_anchor's facilities" )
>
> can you check your RAM with memtest or something?

yes, i just need to plan for enough downtime...

> --------------------------
> Disconnecting/reconnecting the wireless client resulted in this:
>> mac80211-phy0: failed to remove key (0, 00:1b:fb:12:34:56) from hardware (-12)
> followed later, when killing hostapd, by:
>> mac80211-phy0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-12)
>
> well, -12 is -ENOMEM. Well we can use GFP_KERNEL in p54_set_key so its unlikely that your "box"
> doesn't have enough space, but your p54usb device 'itself' has no more room left... and
> this is unlikely too, so it probably stalled/crashed.

Hmm, could having it plugged into a full speed usb port be relevant?
On the other computer, i tried both w/ the card connected directly (high speed)
and through a full speed hub - didn't make any difference (no crash, same
arp etc problems).

> We fixed a something like this before for the 2nd generation devices:
> http://www.spinics.net/lists/linux-wireless/msg25692.html
>
> However, I don't know if the net2280 chip on your device needs it or not, but it should hurt if you try it.

ok, will look into it.

Thanks,

artur

2009-01-19 18:48:52

by Christian Lamparter

[permalink] [raw]
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works

On Monday 19 January 2009 19:15:09 Artur Skawina wrote:
> Christian Lamparter wrote:
> > On Monday 19 January 2009 00:27:02 Artur Skawina wrote:
> >> Artur Skawina wrote:
> >>> didn't trigger anything here, just the usual:
> >>>
> >>> BUG kmalloc-4096: Poison overwritten
>
> >> This is almost 100% reproducible; sometimes the machine freezes instead.
>
> Here's an interesting sequence:
>
> 1) a TX urb is submitted.
> 2) p54u_rx_cb() => p54_rx_frame_sent(), which does kfree_skb( the_skb_in_(1) ).
> 3) p54u_tx_cb() for (1) is called with the same, now freed, skb. kaboom.
>
> IOW the skb is freed before the usb completion runs.

Well, the sequence should be:

1) p54_tx gets called
1.1) one IRQ urb is submitted
1.2) one BULK urb is submitted
2) the firmware acks that it got the urbs
2.1) p54u_tx_cb is called for the IRQ urb. which frees the small buffer
2.2) p54u_tx_cb is called for the BULK urb. which only removes the net2280_tx_hdr from the skb.
[time passes]
3) firmware is finished sending.
3.1) p54u_rx_cb gets called
=> p54_rx_frame_sent passed the feedback to mac80211

> Somehow i don't think this is the reason for the corruption, but it certainly
> seems to be responsible for some, if not all, of the crashes/panics.
dunno... we should see a bit more fallout, because skb_pull changes skb->data and skb->len.

Regards,
Chr