From: Christian Lamparter
To: Artur Skawina
Cc: linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: wireless-testing, p54 and sinus 154 data no longer works
Date: Tue, 13 Jan 2009 23:47:11 +0100
Message-Id: <200901132347.11383.chunkeey@web.de>
In-Reply-To: <496D09F4.40209@gmail.com>
References: <494698AF.4020204@gmail.com> <496CE55F.2080506@gmail.com> <496D09F4.40209@gmail.com>

(added lkml - so please keep the CC!)

On Tuesday 13 January 2009 22:39:00 Artur Skawina wrote:
> Artur Skawina wrote:
> >>> The machine has 512M, ~100M should be (usually is) free, is under constant light
> >>> load (typically <2k ints/s, 60% idle) and is running fine for weeks/months between
> >>> reboots, but locks up after only a few packets go over the hostap driven
> >>> p54usb device. I need the box to be up, that limits the number of tests i can
> >>> run, at least as long as the lockups w/o any diagnostics happen...
> >> Do keyboard-leds "flash" when it locks up, or does console respond
> >> if you press alt-sysrq-m / alt-sysrq-w on the connected keyboard?
> >
> > most of the times it happened there was no kbd attached. At least once
> > when it _was_ connected, sysrq was working, and i saw 0*8KB; that's why
> > i initially suspected fragmentation.
> >
> >> ( If your box has a serial port, you can try to get the logs from there... )
>
> after switching from SLUB to SLAB and enabling some debugging i finally caught this:

Argh, that's not good... I was hoping for an obvious bug in p54 or mac80211,
not somewhere else in the kernel. I have no idea what's going on in the
timer/mm code (but maybe someone else on lkml does?), since
"cache_free_debugcheck" has about 3 BUG_ONs (well, there are 4, but the
first one is unlikely).

This smells like memory corruption. Have you tried enabling CONFIG_DEBUG_SLAB?

Is this related to the "truesize bug"? Or how long does the box survive
if you don't allow named to bind/listen to wlanX?

> ------------[ cut here ]------------
> Kernel BUG at c016a8a3 [verbose debug info unavailable]
> invalid opcode: 0000 [#1]
> last sysfs file: /sys/devices/pci0000:00/0000:00:07.2/usb1/1-1/1-1.1/uevent
> Modules linked in: netconsole saa7134_empress saa6752hs lnbp21 s5h1420 saa7134 budget videobuf_dma_sg budget_ci budget_core saa7146 ttpci_eeprom videobuf_core tveeprom serio_raw ir_common [last unloaded: netconsole]
>
> Pid: 1885, comm: named Not tainted (2.6.28-rc8-00519-g90435df #42)
> EIP: 0060:[] EFLAGS: 00210012 CPU: 0
> EIP is at cache_free_debugcheck+0x203/0x250
> EAX: dfb6c71f EBX: df803d20 ECX: dfb6c03f EDX: 00000002
> ESI: dfb6c720 EDI: 00000370 EBP: c1000000 ESP: c0669f74
> DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process named (pid: 1885, ti=c0669000 task=df8443d0 task.ti=deb85000)
> Stack:
> 00000000 df809660 d31d4528 00000003 00000000 00000002 c137c440 c060e2dc
> c01483e2 dfb6c000 df808d38 df803d20 c069cb40 00200286 c016a911 00000000
> 00000005 c069cb40 00000009 c01483e2 00000020 00000001 00000100 c014850f
> Call Trace:
> [] __rcu_process_callbacks+0xd2/0x1f0
> [] kmem_cache_free+0x21/0x60
> [] __rcu_process_callbacks+0xd2/0x1f0
> [] rcu_process_callbacks+0xf/0x20
> [] __do_softirq+0x57/0xf0
> [] __do_softirq+0x0/0xf0
> <0> [] irq_exit+0x45/0x70
> [] smp_apic_timer_interrupt+0x40/0x70
> [] apic_timer_interrupt+0x28/0x30
> Code: 8b 44 24 24 b9 fe ff ff ff 89 4c 90 1c f6 43 19 08 74 0e b9 6b 00 00 00 89 f2 89 d8 e8 e7 fa ff ff 83 c4 28 89 f0 5b 5e 5f 5d c3 <0f> 0b eb fe 0f 0b eb fe 8b 43 10 8d 44 06 f8 8d b6 00 00 00 00
> EIP: [] cache_free_debugcheck+0x203/0x250 SS:ESP 0068:c0669f74
> Kernel panic - not syncing: Fatal exception in interrupt
>
> followed after some time by lots of page alloc failures [1].
>
> artur
>
> [1]
> [...]
> __ratelimit: 1551 callbacks suppressed
> named: page allocation failure. order:0, mode:0x20
> Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
> Call Trace:
> [] __alloc_pages_internal+0x35d/0x470
> named: page allocation failure. order:0, mode:0x20
> Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
> Call Trace:
> [] __alloc_pages_internal+0x35d/0x470
> [] cache_alloc_refill+0x363/0x710
> [] __alloc_skb+0x34/0x120
> [] kmem_cache_alloc+0xe1/0xf0
> [] __alloc_skb+0x34/0x120
> [] find_skb+0x35/0x90
> [] netpoll_send_udp+0x2e/0x200
> [] write_msg+0x9d/0xe0 [netconsole]
> [] write_msg+0x0/0xe0 [netconsole]
> [] __call_console_drivers+0x43/0x50
> [] release_console_sem+0x13b/0x1c0
> [] vprintk+0x227/0x2d0
> [] __call_console_drivers+0x43/0x50
> [] __alloc_pages_internal+0x35d/0x470
> [] printk+0x17/0x1f
> [] print_trace_address+0x49/0x60
> [] __alloc_pages_internal+0x35d/0x470
> [] __alloc_pages_internal+0x35d/0x470
> [] dump_trace+0x84/0x100
> [] show_trace+0x4e/0x60
> [] dump_stack+0x6e/0x73
> [] __alloc_pages_internal+0x35d/0x470
> [] cache_alloc_refill+0x363/0x710
> [] __alloc_skb+0x34/0x120
> [] __alloc_skb+0x10e/0x120
> [] __kmalloc_track_caller+0x14e/0x160
> [] kmem_cache_alloc+0x73/0xf0
> [] dev_alloc_skb+0x19/0x30
> [] __alloc_skb+0x55/0x120
> [] dev_alloc_skb+0x19/0x30
> [] boomerang_rx+0x15e/0x520
> [] boomerang_interrupt+0x13f/0x480
> [] budget_ci_irq+0xa9/0x100 [budget_ci]
> [] apic_timer_interrupt+0x28/0x30
> [] handle_IRQ_event+0x28/0x50
> [] handle_level_irq+0x0/0xb0
> [] handle_level_irq+0x4b/0xb0
> [] common_interrupt+0x23/0x28
> [] prio_tree_right+0xab/0x100
> [] delay_tsc+0x17/0x20
> [] __const_udelay+0x18/0x20
> [] panic+0x84/0xe3
> [] oops_end+0x7c/0x90
> [] do_invalid_op+0x0/0xa0
> [] do_invalid_op+0x81/0xa0
> [] cache_free_debugcheck+0x203/0x250
> [] __wake_up_common+0x43/0x70
> [] error_code+0x6a/0x70
> [] cache_free_debugcheck+0x203/0x250
> [] __rcu_process_callbacks+0xd2/0x1f0
> [] kmem_cache_free+0x21/0x60
> [] __rcu_process_callbacks+0xd2/0x1f0
> [] rcu_process_callbacks+0xf/0x20
> [] __do_softirq+0x57/0xf0
> [] __do_softirq+0x0/0xf0
> [] irq_exit+0x45/0x70
> [] smp_apic_timer_interrupt+0x40/0x70
> [] apic_timer_interrupt+0x28/0x30
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 174
> Active_anon:13626 active_file:3702 inactive_anon:11682
> inactive_file:91928 unevictable:5 dirty:48 writeback:0 unstable:0
> free:737 slab:3377 mapped:2606 pagetables:219 bounce:0
> DMA free:2004kB min:84kB low:104kB high:124kB active_anon:24kB inactive_anon:28kB active_file:104kB inactive_file:8164kB unevictable:0kB present:15872kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 492 492
> Normal free:944kB min:2792kB low:3488kB high:4188kB active_anon:54480kB inactive_anon:46700kB active_file:14704kB inactive_file:359548kB unevictable:20kB present:503928kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2004kB
> Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 944kB
> 95760 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 530104kB
> Total swap = 530104kB
> 131070 pages RAM
> 2635 pages reserved
> 10978 pages shared
> 121856 pages non-shared
> named: page allocation failure. order:0, mode:0x20
> Pid: 1885, comm: named Tainted: G D 2.6.28-rc8-00519-g90435df #42
> Call Trace:
> [] __alloc_pages_internal+0x35d/0x470
> [] cache_alloc_refill+0x363/0x710
> [] __alloc_skb+0x34/0x120
> [] kmem_cache_alloc+0xe1/0xf0
> [] __alloc_skb+0x34/0x120
> [] refill_skbs+0x5b/0x70
> [] find_skb+0x19/0x90
> [] bit_cursor+0x0/0x610
> [] netpoll_send_udp+0x2e/0x200
> [] write_msg+0x9d/0xe0 [netconsole]
> [] write_msg+0x0/0xe0 [netconsole]
> [] __call_console_drivers+0x43/0x50
> [] release_console_sem+0x13b/0x1c0
> [] vprintk+0x227/0x2d0
> [] __call_console_drivers+0x43/0x50
> [] __alloc_pages_internal+0x35d/0x470
> [] printk+0x17/0x1f
> [] print_trace_address+0x49/0x60
> [] __alloc_pages_internal+0x35d/0x470
> [] __alloc_pages_internal+0x35d/0x470
> [] dump_trace+0x84/0x100
> [] show_trace+0x4e/0x60
> [] dump_stack+0x6e/0x73
> [] __alloc_pages_internal+0x35d/0x470
> [] cache_alloc_refill+0x363/0x710
> [] __alloc_skb+0x34/0x120
> [] __alloc_skb+0x10e/0x120
> [] __kmalloc_track_caller+0x14e/0x160
> [] kmem_cache_alloc+0x73/0xf0
> [] dev_alloc_skb+0x19/0x30
> [] __alloc_skb+0x55/0x120
> [] dev_alloc_skb+0x19/0x30
> [] boomerang_rx+0x15e/0x520
> [] boomerang_interrupt+0x13f/0x480
> [] budget_ci_irq+0xa9/0x100 [budget_ci]
> [] apic_timer_interrupt+0x28/0x30
> [] handle_IRQ_event+0x28/0x50
> [] handle_level_irq+0x0/0xb0
> [] handle_level_irq+0x4b/0xb0
> [] common_interrupt+0x23/0x28
> [] prio_tree_right+0xab/0x100
> [] delay_tsc+0x17/0x20
> [] __const_udelay+0x18/0x20
> [] panic+0x84/0xe3
> [] oops_end+0x7c/0x90
> [] do_invalid_op+0x0/0xa0
> [] do_invalid_op+0x81/0xa0
> [] cache_free_debugcheck+0x203/0x250
> [] __wake_up_common+0x43/0x70
> [] error_code+0x6a/0x70
> [] cache_free_debugcheck+0x203/0x250
> [] __rcu_process_callbacks+0xd2/0x1f0
> [] kmem_cache_free+0x21/0x60
> [] __rcu_process_callbacks+0xd2/0x1f0
> [] rcu_process_callbacks+0xf/0x20
> [] __do_softirq+0x57/0xf0
> [] __do_softirq+0x0/0xf0
> [] irq_exit+0x45/0x70
> [] smp_apic_timer_interrupt+0x40/0x70
> [] apic_timer_interrupt+0x28/0x30
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 174
> Active_anon:13626 active_file:3702 inactive_anon:11682
> inactive_file:91928 unevictable:5 dirty:48 writeback:0 unstable:0
> free:737 slab:3377 mapped:2606 pagetables:219 bounce:0
> DMA free:2004kB min:84kB low:104kB high:124kB active_anon:24kB inactive_anon:28kB active_file:104kB inactive_file:8164kB unevictable:0kB present:15872kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 492 492
> Normal free:944kB min:2792kB low:3488kB high:4188kB active_anon:54480kB inactive_anon:46700kB active_file:14704kB inactive_file:359548kB unevictable:20kB present:503928kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2004kB
> Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 944kB
> 95760 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 530104kB
> Total swap = 530104kB
> 131070 pages RAM
> 2635 pages reserved
> 10978 pages shared
> 121856 pages non-shared
> named: page allocation failure. order:0, mode:0x20
> [...]
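
For anyone who wants to double-check the per-zone buddy lists in the Mem-Info dump quoted above, here is a rough Python sketch. It is only an illustration: the function names (parse_buddy_line, summarize) are made up, and the regex assumes the "count*sizekB ... = totalkB" layout exactly as it appears in this log. It recomputes the zone total from the listed blocks and reports the largest block size that still has a free entry.

```python
import re

# The two buddy-list lines from the quoted Mem-Info dump (quote prefix removed).
LINES = [
    "DMA: 1*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB "
    "1*1024kB 0*2048kB 0*4096kB = 2004kB",
    "Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB "
    "0*1024kB 0*2048kB 0*4096kB = 944kB",
]


def parse_buddy_line(line):
    """Split one buddy line into (zone, {block_kB: count}, reported_total_kB)."""
    zone, rest = line.split(":", 1)
    # Each entry looks like "<count>*<size>kB".
    blocks = {int(size): int(count)
              for count, size in re.findall(r"(\d+)\*(\d+)kB", rest)}
    reported = int(re.search(r"=\s*(\d+)kB", rest).group(1))
    return zone.strip(), blocks, reported


def summarize(line):
    """Recompute the free total and find the largest block with a free entry."""
    zone, blocks, reported = parse_buddy_line(line)
    computed = sum(size * count for size, count in blocks.items())
    largest = max((size for size, count in blocks.items() if count), default=0)
    return {
        "zone": zone,
        "computed_kB": computed,
        "reported_kB": reported,
        "largest_free_block_kB": largest,
    }


if __name__ == "__main__":
    for line in LINES:
        print(summarize(line))
```

Run on the two lines above, the recomputed totals agree with the totals the kernel printed. The Normal zone shows no free 4kB or 8kB blocks, and its free total (944kB) sits below the min value (2792kB) reported for that zone in the same dump.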