Date: Tue, 25 Nov 2008 09:51:09 +0100
From: Jens Axboe
To: malahal@us.ibm.com
Cc: Stephen Rothwell, Thomas Gleixner, Mike Anderson, James Bottomley,
    Alexander Beregalov, LKML, linux-next@vger.kernel.org, Ingo Molnar,
    linux-scsi@vger.kernel.org, David Miller
Subject: Re: next-20081119: general protection fault: get_next_timer_interrupt()
Message-ID: <20081125085109.GR26308@kernel.dk>
In-Reply-To: <20081125020852.GA27280@us.ibm.com>

On Mon, Nov 24 2008, malahal@us.ibm.com wrote:
> Stephen Rothwell [sfr@canb.auug.org.au] wrote:
> > > The block timer code calls del_timer(), should it call del_timer_sync()?
> > > It is possible although unlikely that you are hitting del_timer_sync vs
> > > del_timer problem in the block timeout code. Can only be seen on SMP
> > > systems though!
> >
> > Is this still a problem in next-20081121? In that tree, the block commit
> > "block: leave the request timeout timer running even on an empty list"
> > was changed to add this:
> >
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 04267d6..44f547c 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -391,6 +391,7 @@ EXPORT_SYMBOL(blk_stop_queue);
> >  void blk_sync_queue(struct request_queue *q)
> >  {
> >  	del_timer_sync(&q->unplug_timer);
> > +	del_timer_sync(&q->timeout);
> >  	kblockd_flush_work(&q->unplug_work);
> >  }
> >  EXPORT_SYMBOL(blk_sync_queue);
>
> I was looking at the Linux tree. Clearly same problem doesn't exist with
> the above commit! I wonder why kblockd_flush_work() is called after the
> del_timer_sync(). It makes sense to cancel the work and then shutdown
> the timer(s). I doubt if you are running into this problem though.

If the kernel tested doesn't include the above fix, it'll surely go
boom. Can someone verify that this is the case?

-- 
Jens Axboe