Date: Mon, 22 Oct 2012 15:45:41 +0200
From: Jan Kara
To: Fabio Coatti
Cc: NeilBrown, Jan Kara, "Myklebust, Trond", Paul Bolle, linux-kernel@vger.kernel.org, Jeff Layton
Subject: Re: ext3 issue on 3.6.1
Message-ID: <20121022134541.GA9438@quack.suse.cz>
References: <20121018212357.GA17646@quack.suse.cz> <1350596425.1366.15.camel@x61.thuisdomein> <20121018220809.GD17646@quack.suse.cz> <20121019124454.316ecfac@notabene.brown>
User-Agent: Mutt/1.5.20 (2009-06-14)

On Mon 22-10-12 12:23:03, Fabio Coatti wrote:
> 2012/10/19 Fabio Coatti:
> > 2012/10/19 NeilBrown:
> >> On Fri, 19 Oct 2012 00:08:09 +0200 Jan Kara wrote:
> >>
> >>> On Thu 18-10-12 23:40:25, Paul Bolle wrote:
> >>> > On Thu, 2012-10-18 at 23:23 +0200, Jan Kara wrote:
> >>> > > On Fri 12-10-12 14:57:55, Fabio Coatti wrote:
> >>> > > > [13031.051521] ------------[ cut here ]------------
> >>> > > > [13031.051576] WARNING: at fs/inode.c:280 drop_nlink+0x1b/0x35()
> >>> > > > [13031.051624] Hardware name: ProLiant BL465c G7
> >>> > > > [13031.051668] Pid: 3344, comm: php Tainted: G W
> >>> > > > 3.6.1-1000hz-preempt #2
> >>> > > > [13031.051746] Call Trace:
> >>> > > > [13031.051787] [] ? warn_slowpath_common+0x73/0x87
> >>> > > > [13031.051837] [] ? drop_nlink+0x1b/0x35
> >>> > > > [13031.051885] [] ? nfs_dentry_iput+0x33/0x49
> >>> > > > [13031.051934] [] ? d_kill+0xe8/0x108
> >>> > > > [13031.051980] [] ? dput+0x147/0x154
> >>> > > > [13031.052027] [] ? __fput+0x19a/0x1b2
> >>> > > > [13031.052073] [] ? task_work_run+0x4c/0x60
> >>> > > > [13031.052123] [] ? int_signal+0x12/0x17
> >>> > > > [13031.052169] ---[ end trace e60232a455c8e2dd ]---
> >>> > > And this seems unrelated - likely an NFS problem... Let's sort this out
> >>> > > if you still see it after ext3 issue is solved.
> >>> >
> >>> > Looks rather similar too https://lkml.org/lkml/2012/8/29/165 , doesn't
> >>> > it?
> >>> Yup. I wonder why that patch didn't get merged. Neil?
> >>>
> >>> Honza
> >>
> >> Don't know. Maybe I slipped under Trond's radar some how.
> >>
> >> Trond: can you comment on and hopefully apply this patch?
> >>
> >> Subject of original email was "WARNING: at fs/inode.c:280 drop_nlink+0x31/0x33()
> >
> > I'll apply this patch and see what happens, I guess it applies also to
> > 3.6.2 where I still see the warning. Could this be a culprit for
> > several server lockups that we are seeing in 3.6.X machines and not in
> > 2.6.39.X? I'm running some tests with 3.6.X with same setup of other
> > machines wth 2.6.39.X and where the new kernel is installed at least
> > once a day the machines lockups (not a reassuring thing :) . To answer
> > to previous questions, yes, the server has a ext3 read only mount and
> > no, the logs shows no other weird things besides the one I posted
> > before (see below for a fresh one on 3.6.2). The server has several
> > nfs mounts, all R/W.
> >
>
> Ok, after some days of running the modified kernel, the news are not so good :(
>
> the kernel (3.6.2) message reported above disappeared (dmesg is
> clean), however the server is not usable and now I get several 100%CPU
> eating processes (namely, apache) and on reboot the console spits out
> the message attached (unfortunately a ugly picture, the message was
> visible only in a remote console with no history).
Sorry, not much I can say about that one...

> Then I've given a try to 3.6.3 with the same suggested patch, as I see
> nothing related on changelog, but I got the following message:
>
> [ 228.849355] ------------[ cut here ]------------
> [ 228.849529] WARNING: at fs/ext3/inode.c:1754
> ext3_journalled_writepage+0x55/0x1a7()
> [ 228.849706] Hardware name: ProLiant BL465c G7
> [ 228.849833] Pid: 2749, comm: flush-8:0 Not tainted 3.6.3-p #1
> [ 228.849953] Call Trace:
> [ 228.850070] [] ? warn_slowpath_common+0x73/0x87
> [ 228.850192] [] ? ext3_journalled_writepage+0x55/0x1a7
> [ 228.850343] [] ? __writepage+0xa/0x21
> [ 228.850474] [] ? write_cache_pages+0x206/0x2f8
> [ 228.850598] [] ? set_page_dirty+0x5e/0x5e
> [ 228.850721] [] ? queue_unplugged+0x28/0x34
> [ 228.850823] [] ? generic_writepages+0x3e/0x55
> [ 228.850919] [] ? __writeback_single_inode+0x39/0xd1
> [ 228.851016] [] ? writeback_sb_inodes+0x206/0x392
> [ 228.851112] [] ? __writeback_inodes_wb+0x67/0xa2
> [ 228.851208] [] ? wb_writeback+0xfd/0x18b
> [ 228.851315] [] ? wb_do_writeback+0x13d/0x1a2
> [ 228.851436] [] ? add_timer_on+0x61/0x61
> [ 228.851529] [] ? bdi_writeback_thread+0x7f/0x13e
> [ 228.851624] [] ? wb_do_writeback+0x1a2/0x1a2
> [ 228.851719] [] ? wb_do_writeback+0x1a2/0x1a2
> [ 228.851815] [] ? kthread+0x81/0x89
> [ 228.851909] [] ? kernel_thread_helper+0x4/0x10
> [ 228.852004] [] ? kthread_worker_fn+0xe0/0xe0
> [ 228.852098] [] ? gs_change+0xb/0xb
> [ 228.852189] ---[ end trace 67e723d93533674a ]---
We had this one previously, didn't we? And I asked: Can you post full
kernel log (dmesg)? Do you have any filesystem mounted read-only when you
see the message?

Honza
-- 
Jan Kara
SUSE Labs, CR