Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756585Ab2FODFe (ORCPT );
        Thu, 14 Jun 2012 23:05:34 -0400
Received: from mx1.redhat.com ([209.132.183.28]:50281 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754010Ab2FODFc (ORCPT );
        Thu, 14 Jun 2012 23:05:32 -0400
Message-ID: <4FDAA6CC.5050906@redhat.com>
Date: Fri, 15 Jun 2012 11:06:52 +0800
From: Asias He
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0
MIME-Version: 1.0
To: Jens Axboe
CC: Jiri Slaby, Tejun Heo, LKML, Jiri Slaby
Subject: Re: NULL pointer dereference at blk_drain_queue
References: <4FD9A908.90307@suse.cz> <4FD9ABFE.8060603@kernel.dk>
        <4FD9B1F7.1040107@suse.cz> <4FD9E508.8090405@redhat.com>
        <4FD9E64E.7080505@kernel.dk>
In-Reply-To: <4FD9E64E.7080505@kernel.dk>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4567
Lines: 101

On 06/14/2012 09:25 PM, Jens Axboe wrote:
> On 06/14/2012 03:20 PM, Asias He wrote:
>> On 06/14/2012 05:42 PM, Jiri Slaby wrote:
>>> On 06/14/2012 11:16 AM, Jens Axboe wrote:
>>>> On 06/14/2012 11:04 AM, Jiri Slaby wrote:
>>>>> Hi,
>>>>>
>>>>> with today's -next I'm (reproducibly) getting this while updating packages:
>>>>> BUG: unable to handle kernel NULL pointer dereference at (null)
>>>>> IP: [] __wake_up_common+0x26/0x90
>>>>> PGD 463f1067 PUD 463f2067 PMD 0
>>>>> Oops: 0000 [#1] SMP
>>>>> CPU 1
>>>>> Modules linked in:
>>>>> Pid: 2711, comm: kworker/1:0 Not tainted 3.5.0-rc2-next-20120614_64+
>>>>> #1752 Bochs Bochs
>>>>> RIP: 0010:[] []
>>>>> __wake_up_common+0x26/0x90
>>>>> RSP: 0018:ffff880047221cb0 EFLAGS: 00010082
>>>>> RAX: 0000000000000086 RBX: ffff880046350888 RCX: 0000000000000000
>>>>> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff880046350888
>>>>> RBP: ffff880047221cf0 R08: 0000000000000000 R09: 00000001000c0009
>>>>> R10: ffff880047804480 R11: 0000000000000000 R12: ffff880046350890
>>>>> R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
>>>>> FS: 0000000000000000(0000) GS:ffff880049700000(0000) knlGS:0000000000000000
>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>> CR2: 0000000000000000 CR3: 0000000045ced000 CR4: 00000000000006e0
>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>> Process kworker/1:0 (pid: 2711, threadinfo ffff880047220000, task
>>>>> ffff8800435bc5c0)
>>>>> Stack:
>>>>> 000000004628da68 0000000000000000 ffff88004970d340 ffff880046350888
>>>>> 0000000000000086 0000000000000003 0000000000000000 0000000000000000
>>>>> ffff880047221d30 ffffffff8108d9a3 ffff88004970d340 ffff880046350848
>>>>> Call Trace:
>>>>> [] __wake_up+0x43/0x70
>>>>> [] blk_drain_queue+0xf6/0x120
>>>>> [] blk_cleanup_queue+0x7f/0xd0
>>>>> [] md_free+0x50/0x70
>>>>> [] kobject_cleanup+0x82/0x1b0
>>>>> [] kobject_put+0x2b/0x60
>>>>> [] mddev_delayed_delete+0x2f/0x40
>>>>> [] process_one_work+0x11b/0x3f0
>>>>> [] ? restart_array+0xc0/0xc0
>>>>> [] worker_thread+0x12e/0x340
>>>>> [] ? manage_workers.isra.29+0x1f0/0x1f0
>>>>> [] kthread+0x8e/0xa0
>>>>> [] kernel_thread_helper+0x4/0x10
>>>>> [] ? flush_kthread_worker+0x70/0x70
>>>>> [] ? gs_change+0xb/0xb
>>>>> Code: 80 00 00 00 00 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41
>>>>> 54 4c 8d 67 08 53 48 83 ec 18 89 55 c4 48 8b 57 08 4c 89 45 c8 <4c> 8b
>>>>> 2a 48 8d 42 e8 49 83 ed 18 49 39 d4 75 0d eb 40 0f 1f 84
>>>>
>>>> It's a bug in local commit bc85cf83, for stacked devices we have not
>>>> initialized the wait queues. So the below should fix it, as would always
>>>> initializing all queue structures even for the partial use case.
>>>>
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index b477fa0..93eb3e4 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -415,10 +415,12 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
>>>>           * allocation path, so the wakeup chaining is lost and we're
>>>>           * left with hung waiters. We need to wake up those waiters.
>>>>           */
>>>> -        spin_lock_irq(q->queue_lock);
>>>> -        for (i = 0; i < ARRAY_SIZE(q->rq.wait); i++)
>>>> -                wake_up_all(&q->rq.wait[i]);
>>>> -        spin_unlock_irq(q->queue_lock);
>>>> +        if (q->request_fn) {
>>>> +                spin_lock_irq(q->queue_lock);
>>>> +                for (i = 0; i < ARRAY_SIZE(q->rq.wait); i++)
>>>> +                        wake_up_all(&q->rq.wait[i]);
>>>> +                spin_unlock_irq(q->queue_lock);
>>>> +        }
>>>
>>> Yes, that fixed it.
>>
>> Jiri, good to hear this fixes it for you. BTW, how do you trigger this
>> issue?
>>
>> Jens, do you prefer to fix it up in your tree yourself or wait for a
>> patch from me?
>
> I will fixup the existing patch, so we don't have this problem in a
> bisection point after it's merged.

OK. Thanks.

--
Asias
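
For readers following the thread, the failure mode Jens describes can be
modelled in userspace. This is only a rough sketch, not kernel code: every
type and function name below (list_node, wait_head_model,
request_queue_model, queue_init_model, wake_up_all_model,
drain_wakeup_model, add_waiter_model, noop_request_fn) is hypothetical and
stands in for the real structures. It assumes, as the thread suggests, that
a stacked queue has request_fn == NULL and that its wait queue heads are
left zeroed, so walking them would dereference a NULL next pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for a circular doubly linked list node. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical stand-in for a wait queue head. */
struct wait_head_model {
    struct list_node task_list;
};

/* Hypothetical stand-in for a request queue. */
struct request_queue_model {
    /* NULL models a stacked (bio-based) queue. */
    void (*request_fn)(struct request_queue_model *q);
    struct wait_head_model wait[2];
};

static void noop_request_fn(struct request_queue_model *q) { (void)q; }

/* Model of the partial initialisation: wait heads are only set up
 * for request-based queues; stacked queues keep them zeroed. */
static void queue_init_model(struct request_queue_model *q,
                             void (*fn)(struct request_queue_model *))
{
    memset(q, 0, sizeof(*q));
    q->request_fn = fn;
    if (fn) {
        for (size_t i = 0; i < ARRAY_SIZE(q->wait); i++) {
            q->wait[i].task_list.next = &q->wait[i].task_list;
            q->wait[i].task_list.prev = &q->wait[i].task_list;
        }
    }
}

/* Link a waiter at the tail of an initialised wait head. */
static void add_waiter_model(struct wait_head_model *w, struct list_node *n)
{
    assert(w->task_list.next != NULL); /* head must be initialised */
    n->prev = w->task_list.prev;
    n->next = &w->task_list;
    w->task_list.prev->next = n;
    w->task_list.prev = n;
}

/* Walk the list and count entries. On a zeroed head, task_list.next
 * is NULL and the first p->next load faults, like the oops above. */
static int wake_up_all_model(struct wait_head_model *w)
{
    int n = 0;
    for (struct list_node *p = w->task_list.next; p != &w->task_list; p = p->next)
        n++;
    return n;
}

/* Model of the guarded wakeup loop from the patch: skip queues
 * without a request_fn, whose wait heads were never initialised. */
static int drain_wakeup_model(struct request_queue_model *q)
{
    int woken = 0;
    if (q->request_fn) {
        for (size_t i = 0; i < ARRAY_SIZE(q->wait); i++)
            woken += wake_up_all_model(&q->wait[i]);
    }
    return woken;
}
```

Calling drain_wakeup_model() on a queue set up with
queue_init_model(&q, NULL) returns 0 without touching the zeroed heads;
dropping the request_fn check would make the same call fault. The other
fix Jens mentions, always initialising every wait head, would correspond
to removing the "if (fn)" test in queue_init_model().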