Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751861Ab1CYMeP (ORCPT ); Fri, 25 Mar 2011 08:34:15 -0400 Received: from li9-11.members.linode.com ([67.18.176.11]:40506 "EHLO test.thunk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751224Ab1CYMeL (ORCPT ); Fri, 25 Mar 2011 08:34:11 -0400 Date: Fri, 25 Mar 2011 08:33:50 -0400 From: "Ted Ts'o" To: Jens Axboe Cc: Dave Chinner , Markus Trippelsdorf , Linus Torvalds , "linux-kernel@vger.kernel.org" , Chris Mason Subject: Re: [GIT PULL] Core block IO bits for 2.6.39 - early Oops Message-ID: <20110325123350.GE2548@thunk.org> Mail-Followup-To: Ted Ts'o , Jens Axboe , Dave Chinner , Markus Trippelsdorf , Linus Torvalds , "linux-kernel@vger.kernel.org" , Chris Mason References: <4D8B4A89.80608@fusionio.com> <20110324183019.GA1676@gentoo.trippels.de> <4D8B8F34.5000203@fusionio.com> <4D8B92AE.8090308@fusionio.com> <20110324185445.GB1696@gentoo.trippels.de> <4D8B9457.2020608@fusionio.com> <20110324193441.GA1723@gentoo.trippels.de> <20110325044128.GJ26611@dastard> <91CCAB14-F9CC-4676-94C3-FBCDD0663FD5@mit.edu> <4D8C872C.1030805@fusionio.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4D8C872C.1030805@fusionio.com> User-Agent: Mutt/1.5.20 (2009-06-14) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: tytso@thunk.org X-SA-Exim-Scanned: No (on test.thunk.org); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4071 Lines: 94 On Fri, Mar 25, 2011 at 01:14:36PM +0100, Jens Axboe wrote: > But this plugging change is merged, so it is a very likely candidate. > With the oddness going on, I suspect that we end up flushing a plug that > resides on a stack that is no longer valid. Ah, sorry, I was typing this while I was still doing my morning e-mail reading/deleting in bed, so I couldn't check to see if it had been merged already. I'm downstairs now, and you could very well be right, since here's the the stack trace from the kvm console output, and it features flush_plug_list very prominently in the stack trace (see attached). > So I'd really like to have something ala: > > if (is_str_ptr_valid(current, ptr, size)) > ... > > to aid the debugging. Well, I can repro the problem in an hour or two of running xfstests under KVM, so I'm happy to try any debugging patches. Let me know where you'd like that placed, and I'll do a run. - Ted P.S. Hmm, and I this one does have ext4 in the stack trace. My bad; I misremembered. The other one didn't, though. (Oh, and this one died in test #130, but if it's related to some kind of plug/unplug race, it's probably quite random when it happens.) BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: [] cfq_insert_request+0x39/0x4a2 *pdpt = 00000000017f0001 *pde = 0000000000000000 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC last sysfs file: Modules linked in: Pid: 15723, comm: xfs_io Not tainted 2.6.38-08184-g108052f #1489 Bochs Bochs EIP: 0060:[] EFLAGS: 00010046 CPU: 0 EIP is at cfq_insert_request+0x39/0x4a2 EAX: 00000000 EBX: 00000000 ECX: c096e710 EDX: cb599870 ESI: f596c000 EDI: cb599870 EBP: f6db3bfc ESP: f6db3bd0 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process xfs_io (pid: 15723, ti=f6db2000 task=c5f6e7c0 task.ti=f6db2000) Stack: f6db3c04 00000046 00000028 f6db3c14 00000046 00000000 c03573c8 c03573c8 cb599870 f5839d60 f6db3e84 f6db3c14 c0354613 00000006 02a00091 f5839d60 f6db3e84 f6db3c24 c03546a9 cb599870 f5839d60 f6db3c40 c03573dd f6db3e88 Call Trace: [] ? flush_plug_list+0xa5/0x114 [] ? flush_plug_list+0xa5/0x114 [] elv_insert+0x173/0x1a6 [] __elv_add_request+0x63/0x67 [] flush_plug_list+0xba/0x114 [] __blk_flush_plug+0x15/0x31 [] schedule+0x25b/0x6d9 [] ? sched_clock_cpu+0x134/0x144 [] ? __make_request+0x1fa/0x231 [] schedule_timeout+0x1b/0x9b [] ? mark_lock+0x1e/0x1df [] ? mark_held_locks+0x43/0x5b [] ? _raw_spin_unlock_irq+0x27/0x30 [] ? trace_hardirqs_on_caller+0x104/0x125 [] ? trace_hardirqs_on+0xb/0xd [] wait_for_common+0xc6/0x11f [] ? default_wake_function+0x0/0x12 [] wait_for_completion+0x17/0x19 [] blkdev_issue_flush+0x96/0xcf [] ? wait_for_common+0x36/0x11f [] ext4_sync_file+0x205/0x250 [] vfs_fsync_range+0x46/0x68 [] generic_write_sync+0x55/0x64 [] generic_file_aio_write+0x99/0xb8 [] ext4_file_write+0x1df/0x22f [] do_sync_write+0x8f/0xca [] ? security_file_permission+0x27/0x2b [] ? rw_verify_area+0xca/0xed [] ? do_sync_write+0x0/0xca [] vfs_write+0x85/0xe3 [] ? restore_all_notrace+0x0/0x18 [] sys_pwrite64+0x48/0x61 [] syscall_call+0x7/0xb Code: 8b 40 0c 8b 5a 5c 89 d7 8b 70 04 8b 06 8b 80 74 04 00 00 85 c0 74 11 ff 73 74 68 33 0a 87 c0 50 e8 ce a7 e5 ff 83 c4 0c 8b 47 58 <8b> 50 0c 89 d8 e8 68 f0 ff ff 8b 57 20 a1 40 6a 92 c0 83 e2 11 EIP: [] cfq_insert_request+0x39/0x4a2 SS:ESP 0068:f6db3bd0 CR2: 000000000000000c ---[ end trace da176424940e586f ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/