Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756118Ab0A1CmR (ORCPT ); Wed, 27 Jan 2010 21:42:17 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1756057Ab0A1CmQ (ORCPT ); Wed, 27 Jan 2010 21:42:16 -0500
Received: from cantor2.suse.de ([195.135.220.15]:55671 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754672Ab0A1CmP (ORCPT ); Wed, 27 Jan 2010 21:42:15 -0500
Date: Thu, 28 Jan 2010 13:42:05 +1100
From: Neil Brown
To: Andrew Morton
Cc: François Figarola, linux-kernel@vger.kernel.org,
        linux-raid@vger.kernel.org, Al Viro, dm-devel@redhat.com
Subject: Re: [BUG] kernel 2.6.32.x hangs during boot process
Message-ID: <20100128134205.352044bd@notabene>
In-Reply-To: <20100122160740.6c16c22d.akpm@linux-foundation.org>
References: <20100122160740.6c16c22d.akpm@linux-foundation.org>
X-Mailer: Claws Mail 3.7.3 (GTK+ 2.18.5; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 9273
Lines: 200

On Fri, 22 Jan 2010 16:07:40 -0800
Andrew Morton wrote:

> (cc's added)

(another cc added, one that might actually be useful.....)

>
> On Sat, 16 Jan 2010 10:58:30 +0100
> François Figarola wrote:
>
> > Dear all,
> >
> > First, I apologize for my poor English...
> >
> > Ever since I tried to boot a 2.6.32.x kernel, my system hangs during the
> > boot process, and I think it could be related to the problem reported
> > earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
> >
> > The hardware is a Dell PowerEdge 2950 which runs fine with the
> > 2.6.31.x kernel series (currently running the latest, 2.6.31.11),
> > and the system is Debian etch.
> >
> > Here is the trace of the bug I've got (using netconsole) with a
> > 2.6.32.3 kernel:
> >
> > BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
> > [unmount of ext3 dm-4]
> > ------------[ cut here ]------------
> > kernel BUG at fs/dcache.c:670!
>
> That's
>
>         if (atomic_read(&dentry->d_count) != 0) {
>                 printk(KERN_ERR
>                        "BUG: Dentry %p{i=%lx,n=%s}"
>                        " still in use (%d)"
>                        " [unmount of %s %s]\n",
>                        dentry,
>                        dentry->d_inode ?
>                        dentry->d_inode->i_ino : 0UL,
>                        dentry->d_name.name,
>                        atomic_read(&dentry->d_count),
>                        dentry->d_sb->s_type->name,
>                        dentry->d_sb->s_id);
>                 BUG();
>         }
>
> I'm a bit surprised that the system is doing a dm suspend/resume during
> the boot process.

It could be that a dm_resume is how you activate a dm device once it is
built, but I'm not sure....

Maybe the guys on dm-devel can help.

NeilBrown

>
> I assume it's a DM bug, dunno.
>
> > invalid opcode: 0000 [#1] SMP
> > last sysfs file: /sys/block/dm-2/removable
> > CPU 0
> > Modules linked in: i5k_amb hwmon button processor thermal fan [last
> > unloaded: scsi_wait_scan]
> > Pid: 3311, comm: kpartx Not tainted 2.6.32.3 #2 PowerEdge 2950
> > RIP: 0010:[]  []
> > shrink_dcache_for_umount_subtree+0x280/0x290
> > RSP: 0018:ffff88066670dcf8  EFLAGS: 00010296
> > RAX: 000000000000005c RBX: ffff8806677696c0 RCX: 0000000000000096
> > RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
> > RBP: ffff880667690000 R08: 0000000000000000 R09: ffff8806670d1628
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667690060
> > R13: 0000000000000007 R14: ffff8806654d1a88 R15: 0000000000dec0b0
> > FS:  00007f176e96b770(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 00007fff0a2e0080 CR3: 0000000666607000 CR4: 00000000000006f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process kpartx (pid: 3311, threadinfo ffff88066670c000, task ffff8806652997d0)
> > Stack:
> > ffff880665b8b178 ffff880665b8af18 ffffffff81619600 0000000000000001
> > <0> ffff880667408e00 ffffffff810f9629 ffff880665b8af18 ffffffff810e8049
> > <0> ffff8806651333f8 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
> > Call Trace:
> > [] ? shrink_dcache_for_umount+0x29/0x50
> > [] ? generic_shutdown_super+0x19/0x100
> > [] ? kill_block_super+0x29/0x50
> > [] ? deactivate_locked_super+0x58/0x80
> > [] ? thaw_bdev+0xd2/0x110
> > [] ? dm_resume+0xf7/0x160
> > [] ? dev_suspend+0x0/0x220
> > [] ? dev_suspend+0x1b1/0x220
> > [] ? ctl_ioctl+0x1eb/0x260
> > [] ? handle_mm_fault+0x63b/0x990
> > [] ? dm_ctl_ioctl+0xe/0x20
> > [] ? finish_task_switch+0x3a/0xc0
> > [] ? vfs_ioctl+0x2f/0xb0
> > [] ? do_vfs_ioctl+0x3fb/0x580
> > [] ? thread_return+0x3e/0x64d
> > [] ? sys_ioctl+0xa1/0xb0
> > [] ? system_call_fastpath+0x16/0x1b
> > Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
> > 00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
> > 0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
> > RIP  [] shrink_dcache_for_umount_subtree+0x280/0x290
> > RSP
> > ---[ end trace 3cc1cb65fcc6a8ca ]---
> >
> > Another trace with the same behavior on a newly compiled kernel with more
> > debug options;
> > but I can't see any difference:
> >
> > BUG: Dentry ffff880667556738{i=41a46,n=sleep} still in use (8)
> > [unmount of ext3 dm-4]
> > ------------[ cut here ]------------
> > kernel BUG at fs/dcache.c:670!
> > invalid opcode: 0000 [#1] SMP
> > last sysfs file: /sys/block/dm-3/removable
> > CPU 1
> > Modules linked in: i5k_amb(+) button hwmon processor thermal fan [last
> > unloaded: scsi_wait_scan]
> > Pid: 3315, comm: kpartx Not tainted 2.6.32.3 #3 PowerEdge 2950
> > RIP: 0010:[]  []
> > shrink_dcache_for_umount_subtree+0x280/0x290
> > RSP: 0018:ffff880667089cf8  EFLAGS: 00010296
> > RAX: 000000000000005c RBX: ffff880667790a60 RCX: 0000000000000096
> > RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
> > RBP: ffff880667556738 R08: 0000000000000000 R09: ffff88066604b420
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667556798
> > R13: 0000000000000007 R14: ffff880665842360 R15: 0000000000b3c0b0
> > FS:  00007f7b1006c770(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 00007f6e67f1c350 CR3: 0000000664ff1000 CR4: 00000000000006e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process kpartx (pid: 3315, threadinfo ffff880667088000, task ffff880664f55f40)
> > Stack:
> > ffff880667058af0 ffff880667058890 ffffffff81619600 0000000000000001
> > <0> ffff880667408e00 ffffffff810f9629 ffff880667058890 ffffffff810e8049
> > <0> ffff88067f83e758 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
> > Call Trace:
> > [] ? shrink_dcache_for_umount+0x29/0x50
> > [] ? generic_shutdown_super+0x19/0x100
> > [] ? kill_block_super+0x29/0x50
> > [] ? deactivate_locked_super+0x58/0x80
> > [] ? thaw_bdev+0xd2/0x110
> > [] ? dm_resume+0xf7/0x160
> > [] ? dev_suspend+0x0/0x220
> > [] ? dev_suspend+0x1b1/0x220
> > [] ? ctl_ioctl+0x1eb/0x260
> > [] ? handle_mm_fault+0x63b/0x990
> > [] ? dm_ctl_ioctl+0xe/0x20
> > [] ? finish_task_switch+0x3a/0xc0
> > [] ? vfs_ioctl+0x2f/0xb0
> > [] ? do_vfs_ioctl+0x3fb/0x580
> > [] ? thread_return+0x3e/0x64d
> > [] ? sys_ioctl+0xa1/0xb0
> > [] ? system_call_fastpath+0x16/0x1b
> > Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
> > 00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
> > 0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
> > RIP  [] shrink_dcache_for_umount_subtree+0x280/0x290
> > RSP
> > ---[ end trace a9fb3c2286e56cbd ]---
> >
> >
> > I think the problem is related to lvm or device mapper, because
> > I could boot a 2.6.32.2 kernel perfectly on another PowerEdge 2950
> > without any kind of lvm or dm configured...
> > but I'm really no expert in kernel debugging.
> >
> > Here is the fstab of the buggy system:
> >
> > # /etc/fstab: static file system information.
> > #
> > #
> > proc            /proc           proc    defaults        0       0
> > /dev/dm-4       /               ext3    errors=remount-ro 0       1
> > /dev/dm-1       /boot           ext3    defaults        0       2
> > /dev/dm-7       /home           ext3    defaults        0       2
> > /dev/dm-5       /usr            ext3    defaults        0       2
> > /dev/dm-6       /var            ext3    defaults        0       2
> > /dev/dm-2       none            swap    sw              0       0
> > /dev/hda        /media/cdrom0   udf,iso9660 user,noauto     0       0
> > debugfs /sys/kernel/debug debugfs noauto 0 0
> >
> > I hope it can help, and I'll try to give more information if necessary.
> >
> > François.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/