Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965299AbXHIMy3 (ORCPT ); Thu, 9 Aug 2007 08:54:29 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S940264AbXHIMxt (ORCPT ); Thu, 9 Aug 2007 08:53:49 -0400 Received: from hellhawk.shadowen.org ([80.68.90.175]:3674 "EHLO hellhawk.shadowen.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S940115AbXHIMxp (ORCPT ); Thu, 9 Aug 2007 08:53:45 -0400 Date: Thu, 9 Aug 2007 13:53:06 +0100 From: Andy Whitcroft To: Andrew Morton , Peter Zijlstra Cc: linux-kernel@vger.kernel.org, drfickle@us.ibm.com, mel@csn.ul.ie, Andy Whitcroft Subject: Re: 2.6.23-rc2-mm1 -- spinlock bad magic Message-ID: <20070809125306.GA3091@shadowen.org> References: <20070809015106.cd0bfc53.akpm@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070809015106.cd0bfc53.akpm@linux-foundation.org> User-Agent: Mutt/1.5.13 (2006-08-11) X-SPF-Guess: neutral Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6322 Lines: 145 Seeing spinlock bad magic BUG's from x86 and x86_64 test boxes, fsx-linux seems to be tickling it. Adding Peter as prop_norm_single seems to be his: x86 (oai5-hs20): BUG: spinlock bad magic on CPU#1, fsx-linux/25600 lock: c3c15a18, .magic: 00000000, .owner: /-1, .owner_cpu: 0 [] _raw_spin_lock+0x16/0x67 [] _spin_lock_irqsave+0x9/0xd [] prop_norm_single+0x27/0x70 [] task_dirty_inc+0x1f/0x48 [] set_page_dirty+0x17/0x1b [] set_page_dirty_balance+0x8/0x35 [] __do_fault+0x34c/0x363 [] prio_tree_insert+0x55/0x154 [] do_linear_fault+0x43/0x49 [] handle_mm_fault+0x17d/0x2f2 [] do_page_fault+0x214/0x60f [] sys_mmap2+0x71/0x78 [] do_page_fault+0x0/0x60f [] error_code+0x72/0x78 [] ioctl_standard_call+0x215/0x29e x86_64 (bl6-13): BUG: spinlock bad magic on CPU#1, fsx-linux/19721 lock: ffff8100028cef48, .magic: 00000000, .owner: /-1, .owner_cpu: 0 Call Trace: [] _raw_spin_lock+0x22/0xf6 [] _spin_lock_irqsave+0x9/0xe [] prop_norm_single+0x40/0x9a [] set_page_dirty+0x8d/0xc9 [] set_page_dirty_balance+0x9/0x39 [] __do_fault+0x37a/0x395 [] handle_mm_fault+0x342/0x6c3 [] do_page_fault+0x3e5/0x7ab [] arch_get_unmapped_area+0x184/0x1f9 [] _spin_lock_irqsave+0x9/0xe [] __up_write+0x21/0x10d [] error_exit+0x0/0x84 We also have a third machine which dies when running fs-linux but that trips a soft-lockup trying to flush the tlb, subsequent stacks also show prop_norm_single. x86_64 (elm3b6): First bug: BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272] CPU 3: Modules linked in: Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1 RIP: 0010:[] [] flush_tlb_others+0x69/0x95 RSP: 0000:ffff810001f15a90 EFLAGS: 00000202 RAX: 0000000000000003 RBX: ffff810001f15ac0 RCX: 0000000000000008 RDX: 00000000000008f3 RSI: 00000000000000f3 RDI: 0000000000000002 RBP: 0000000000000000 R08: ffff810082f05210 R09: ffffffff802e60c1 R10: ffff8100815e6e70 R11: 0000000000000000 R12: ffff8101ffc38080 R13: ffffffff80358b47 R14: ffff810001f15a40 R15: ffff810081d73208 FS: 0000000000000000(0000) GS:ffff810180724280(0000) knlGS:00000000f7f75b80 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000f7e20494 CR3: 00000000029f0000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: [] flush_tlb_page+0x8f/0x97 [] page_mkclean+0x120/0x171 [] ext3_ordered_writepage+0x13f/0x16c [] clear_page_dirty_for_io+0x52/0xba [] write_cache_pages+0x1b2/0x33a [] update_curr+0xd9/0xf8 [] __writepage+0x0/0x2a [] generic_writepages+0x1f/0x25 [] do_writepages+0x2c/0x35 [] __writeback_single_inode+0x1c9/0x346 [] try_to_del_timer_sync+0x55/0x60 [] del_timer_sync+0x12/0x1f [] update_curr+0xd9/0xf8 [] dequeue_entity+0x7d/0x92 [] generic_sync_sb_inodes+0x216/0x372 [] sync_sb_inodes+0x1d/0x1f [] writeback_inodes+0x83/0xd6 [] wb_kupdate+0xa0/0x113 [] pdflush+0x156/0x206 [] wb_kupdate+0x0/0x113 [] pdflush+0x0/0x206 [] kthread+0x44/0x6d [] child_rip+0xa/0x12 [] kthread+0x0/0x6d [] child_rip+0x0/0x12 Later: BUG: soft lockup - CPU#0 stuck for 11s! [syslogd:2167] CPU 0: Modules linked in: Pid: 2167, comm: syslogd Not tainted 2.6.23-rc2-mm1-autokern1 #1 RIP: 0010:[] [] _spin_lock_irqsave+0x19/0x29 RSP: 0000:ffff8101819358b8 EFLAGS: 00000286 RAX: 0000000000000287 RBX: ffff8101819358b8 RCX: 0000000000000012 RDX: ffff810181935980 RSI: ffff81018073eba8 RDI: ffff81018073ebc0 RBP: 0000000046badbf6 R08: 000000000001761f R09: 0000000000000064 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: ffffffff802f56ac R15: ffff8101819358e8 FS: 0000000000000000(0000) GS:ffffffff805c8000(0063) knlGS:00000000f7e39460 CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b CR2: 00000000f7e2b960 CR3: 000000018197a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: [] prop_norm_single+0x3e/0x9a [] prop_fraction_single+0x2d/0x53 [] task_dirties_fraction+0x38/0x50 [] task_dirty_limit+0x20/0x63 [] bdi_writeout_fraction+0x4e/0x67 [] get_dirty_limits+0x19e/0x1ad [] __wake_up+0x40/0x4c [] balance_dirty_pages_ratelimited_nr+0xca/0x2ac [] generic_file_buffered_write+0x1e9/0x619 [] skb_dequeue+0x50/0x5b [] __generic_file_aio_write_nolock+0x41e/0x451 [] skb_free_datagram+0xc/0xe [] sock_recvmsg+0xee/0x107 [] generic_file_aio_write+0x64/0xc0 [] ext3_file_write+0x1c/0xa2 [] ext3_file_write+0x0/0xa2 [] do_sync_readv_writev+0xe0/0x128 [] autoremove_wake_function+0x0/0x37 [] sys_recvfrom+0xbb/0x11f [] compat_do_readv_writev+0x18a/0x271 [] do_gettimeofday+0x3e/0xc8 [] compat_sys_writev+0x5f/0x76 [] ia32_sysret+0x0/0xa -apw - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/