Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933268Ab2FWAxS (ORCPT ); Fri, 22 Jun 2012 20:53:18 -0400
Received: from mail-vb0-f46.google.com ([209.85.212.46]:57098 "EHLO
	mail-vb0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S933197Ab2FWAxR (ORCPT );
	Fri, 22 Jun 2012 20:53:17 -0400
From: Sadasivan Shaiju
References: <797b2bac7e6fb198ea25433e302856b9@mail.gmail.com> <20120622142936.a6e6b4d8.akpm@linux-foundation.org>
In-Reply-To: <20120622142936.a6e6b4d8.akpm@linux-foundation.org>
MIME-Version: 1.0
X-Mailer: Microsoft Outlook 14.0
Thread-Index: AQJQg4vj5B9HX+tRPErgvKS2U5x0eQKlOHwYlet/MFA=
Date: Fri, 22 Jun 2012 17:53:14 -0700
Message-ID:
Subject: RE: crash in filesytem during reboot . (and proposed patch)
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4707
Lines: 155

Hi Andrew,

Please see inline.

-----Original Message-----
From: Andrew Morton [mailto:akpm@linux-foundation.org]
Sent: Friday, June 22, 2012 2:30 PM
To: Sadasivan Shaiju
Cc: linux-kernel@vger.kernel.org
Subject: Re: crash in filesytem during reboot . (and proposed patch)

On Fri, 15 Jun 2012 11:12:09 -0700 Sadasivan Shaiju wrote:

> Hi
>
>

Your email is quadruple-spaced.  Please, fix that.

Sure I will fix this.

> I am getting the following crashes during a reboot of the system.
> It looks like a race condition during unmount.
>
> <4>Call Trace:
> <4>[] clear_inode+0x28/0xe8
> <4>[] generic_drop_inode+0x3c/0xa8
> <4>[] d_kill+0x4c/0x78
> <4>[] __shrink_dcache_sb+0x258/0x360
> <4>[] shrink_dcache_parent+0x140/0x190
> <4>[] proc_flush_task+0xac/0x2e8
> <4>[] release_task+0x80/0x4c0
> <4>[] wait_consider_task+0x608/0xa80
> <4>[] do_wait+0x10c/0x2b8
> <4>[] SyS_wait4+0x88/0x120
> <4>[] compat_sys_wait4+0xc8/0xd0
> <4>[] handle_sysn32+0x44/0x84
>
> Call Trace:
> [] file_ra_state_init+0x0/0x20
> [] __dentry_open+0x26c/0x3d0
> [] do_filp_open+0x70c/0xbc8
> [] do_sys_open+0x78/0x1e0
> [] handle_sysn32+0x44/0x84
>
> Call Trace:
> [] iput+0x3c/0x88
> [] d_kill+0x4c/0x78
> [] __shrink_dcache_sb+0x258/0x360
> [] shrink_dcache_parent+0x140/0x190
> [] proc_flush_task+0xac/0x2e8
> [] release_task+0x80/0x4c0
> [] do_exit+0x6f8/0x908
> [] unregister_module_notifier+0x0/0x10
>
> Call Trace:
> [] iput+0x3c/0x88
> [] d_kill+0x4c/0x78
> [] dput+0x120/0x220
> [] do_lookup+0xdc/0x210
> [] __link_path_walk+0x910/0x1408
> [] path_walk+0x64/0x108
> [] do_path_lookup+0x60/0x68
> [] do_filp_open+0xdc/0xbc8
> [] do_sys_open+0x78/0x1e0
> [] handle_sysn32+0x44/0x84
>
> ...
>
> I am thinking of putting the following fix in shrink_dcache_parent().
> Please let me know if there is any problem with this fix.
>
> ...
>
> --- linux-2.6.32.orig/fs/dcache.c	2012-05-30 15:59:18.000000000 -0700
> +++ linux-2.6.32/fs/dcache.c	2012-06-11 17:10:33.000000000 -0700
> @@ -881,8 +881,14 @@
>  	struct super_block *sb = parent->d_sb;
>  	int found;
>
> -	while ((found = select_parent(parent)) != 0)
> -		__shrink_dcache_sb(sb, &found, 0);
> +	while ((found = select_parent(parent)) != 0) {
> +		if (down_read_trylock(&sb->s_umount)) {
> +			if ((sb->s_root != NULL)) {
> +				__shrink_dcache_sb(sb, &found, 0);
> +			}
> +			up_read(&sb->s_umount);
> +		}
> +	}
>  }

Please fully describe the race which you believe you have found.  What
races against what?

The race is between generic_shutdown_super() and __shrink_dcache_sb().
Under high memory pressure, one of our user processes crashed and the
parent was trying to clean up, with the following stack flow:

<4>[] clear_inode+0x28/0xe8
<4>[] generic_drop_inode+0x3c/0xa8
<4>[] d_kill+0x4c/0x78
<4>[] __shrink_dcache_sb+0x258/0x360
<4>[] shrink_dcache_parent+0x140/0x190
<4>[] proc_flush_task+0xac/0x2e8
<4>[] release_task+0x80/0x4c0
<4>[] wait_consider_task+0x608/0xa80
<4>[] do_wait+0x10c/0x2b8
<4>[] SyS_wait4+0x88/0x120
<4>[] compat_sys_wait4+0xc8/0xd0
<4>[] handle_sysn32+0x44/0x84

During that time the system got rebooted and unmounting started.
Meanwhile the parent process was still trying to clean up the child's
dentries, so clear_inode() referenced a stale inode and crashed.

So I try to grab the s_umount lock so that __shrink_dcache_sb() won't be
called during unmounts. This prevents accessing the stale inode in
clear_inode(). A similar race condition (between generic_shutdown_super()
and __shrink_dcache_sb()) is already prevented in prune_dcache().

Please also confirm that the bug is still present in current kernels -
2.6.32 is rather old.

I am not sure whether the bug is still present in current kernels, but I
do see some RCU locks in this area in the current kernel. We are moving
to the 3.4 kernel, but the current product is still based on 2.6.32, so
we need to fix this issue in 2.6.32.

Regards,
Shaiju.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/