Subject: Re: [bug, 2.6.26-rc4/rc5] sporadic bootup crashes in blk_lookup_devt()/prepare_namespace()
From: Kay Sievers
To: Linus Torvalds
Cc: Cornelia Huck, Vegard Nossum, Adrian Bunk, Andrew Morton, Ingo Molnar,
    Linux Kernel Mailing List, Jens Axboe, Greg Kroah-Hartman,
    "Rafael J. Wysocki", Neil Brown, Mariusz Kozlowski, Dave Young
References: <20080609080312.GA32458@elte.hu>
    <20080609020623.b6727f2b.akpm@linux-foundation.org>
    <19f34abd0806090209l541d93c6jaba2704314b34418@mail.gmail.com>
    <20080609133426.GB20194@cs181133002.pp.htv.fi>
    <19f34abd0806090658v54f3a912n2ed30ad6cc20d00@mail.gmail.com>
    <19f34abd0806090728s3b3fdbeq7dd3d31d02c8f28e@mail.gmail.com>
    <20080609165757.184724ff@gondolin.boeblingen.de.ibm.com>
Date: Mon, 09 Jun 2008 17:46:53 +0200
Message-Id: <1213026413.14898.5.camel@linux.site>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2008-06-09 at 08:29 -0700, Linus Torvalds wrote:
> On Mon, 9 Jun 2008, Cornelia Huck wrote:
> >
> > Does this crash happen with the conversion to the class iterator
> > functions (should be in linux-next) as well? They take the class
> > mutex...
>
> I really don't think it's the locking, although I do agree that the
> locking looks bogus _too_.
>
> I suspect that the problem is even simpler than that.
> On the "block_class.devices" list we can have two types of devices: the
> ones that have been added by the block/genhd.c code (disks: dev->type
> "disk_type"), and the ones that are added by the class layer for
> partitions (partitions: dev.type "part_type").
>
> And *all* the block/genhd.c loops over that device list look like this:
>
>     list_for_each_entry(dev, &block_class.devices, node) {
>         if (dev->type != &disk_type)
>             continue;
>         sgp = dev_to_disk(dev);
>         ...
>
> because you cannot do that "dev_to_disk()" on a partition entry (it won't
> have a container of type gendisk, it will be of type hd_struct).
>
> Well, all except one. Guess which one..
>
> So I suspect that (a) yes, we need to fix the locking, but (b) the fix for
> this particular bug is probably the trivial one appended.
>
> And yes, this bug was introduced by commit 30f2f0eb4b ("block: do_mounts -
> accept root="), so the alternative is to revert it
> entirely. Kay?

Yeah, the patch looks fine. That could be the reason.

I think we should keep the patch, as it fixed a different issue, and it
seems the bug was there even before the patch; the function just wasn't
called 3 times, so it was even less likely to trigger.

Thanks,
Kay
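
For illustration, the type check Linus describes can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the kernel code: the struct layouts, the singly linked `next` pointer (standing in for the class device list and list_for_each_entry()), the assert() inside dev_to_disk(), and the helper total_disk_minors() are all hypothetical and exist only to show why the check has to come before the container lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel definitions. */
struct device_type { const char *name; };

struct device {
    const struct device_type *type;
    struct device *next;   /* hypothetical stand-in for the class list node */
};

static const struct device_type disk_type = { "disk" };
static const struct device_type part_type = { "partition" };

/* Both containers embed a struct device, but at different offsets. */
struct gendisk   { int minors;      struct device dev; };
struct hd_struct { long start_sect; struct device dev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Only meaningful when dev really is embedded in a struct gendisk. */
static struct gendisk *dev_to_disk(struct device *dev)
{
    /* Assumption: this assert is added here to make misuse visible. */
    assert(dev->type == &disk_type);
    return container_of(dev, struct gendisk, dev);
}

/* Hypothetical helper: walk a mixed list, touching disks only. */
static int total_disk_minors(struct device *head)
{
    int total = 0;

    for (struct device *dev = head; dev != NULL; dev = dev->next) {
        if (dev->type != &disk_type)
            continue;   /* partition entry: its container is an hd_struct */
        total += dev_to_disk(dev)->minors;
    }
    return total;
}
```

Dropping the `dev->type != &disk_type` check in this sketch would make dev_to_disk() compute a gendisk pointer from a device that actually lives inside an hd_struct, which is the kind of misread the quoted explanation attributes to the one loop that lacks the check.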