From: Nikola Pajkovsky
Subject: Re: xfstests generic/130 hang with non-4k block size ext4 on 4.7-rc1 kernel
Date: Fri, 10 Jun 2016 07:52:56 +0200
Message-ID: <87oa79h9on.fsf@gooddata.com>
References: <20160531154017.GC5357@thunk.org>
 <20160601063822.GH10350@eguan.usersys.redhat.com>
 <20160601165800.GI10350@eguan.usersys.redhat.com>
 <20160602085840.GH19636@quack2.suse.cz>
 <20160602121750.GC32574@quack2.suse.cz>
 <20160603101612.GJ10350@eguan.usersys.redhat.com>
 <20160603115844.GB2470@quack2.suse.cz>
 <20160608125631.GA19589@quack2.suse.cz>
 <87oa7a6d1q.fsf@gooddata.com>
 <20160609150405.GB19882@quack2.suse.cz>
In-Reply-To: <20160609150405.GB19882@quack2.suse.cz> (Jan Kara's message of "Thu, 9 Jun 2016 17:04:05 +0200")
To: Jan Kara
Cc: Holger Hoffstätte, linux-ext4@vger.kernel.org, Jan Kara

Jan Kara writes:

> On Thu 09-06-16 09:23:29, Nikola Pajkovsky wrote:
>> Holger Hoffstätte writes:
>>
>> > On Wed, 08 Jun 2016 14:56:31 +0200, Jan Kara wrote:
>> > (snip)
>> >> Attached patch fixes the issue for me. I'll submit it once a full xfstests
>> >> run finishes for it (which may take a while as our server room is currently
>> >> moving to a different place).
>> >>
>> >> 								Honza
>> >> --
>> >> Jan Kara
>> >> SUSE Labs, CR
>> >> From 3a120841a5d9a6c42bf196389467e9e663cf1cf8 Mon Sep 17 00:00:00 2001
>> >> From: Jan Kara
>> >> Date: Wed, 8 Jun 2016 10:01:45 +0200
>> >> Subject: [PATCH] ext4: Fix deadlock during page writeback
>> >>
>> >> Commit 06bd3c36a733 (ext4: fix data exposure after a crash) uncovered a
>> >> deadlock in ext4_writepages() which was previously much harder to hit.
>> >> After this commit xfstest generic/130 reproduces the deadlock on small
>> >> filesystems.
>> >
>> > Since you marked this for -stable, just a heads-up that the previous patch
>> > for the data exposure was rejected from -stable (see [1]) because it
>> > has the mismatching "!IS_NOQUOTA(inode) &&" line, which didn't exist
>> > until 4.6. I removed it locally but Greg probably wants an official patch.
>> >
>> > So both this and the previous patch need to be submitted.
>> >
>> > [1] http://permalink.gmane.org/gmane.linux.kernel.stable/18074{4,5,6}
>>
>> I'm just wondering if the Jan's patch is not related to blocked
>> processes in following trace. It very hard to hit it and I don't have
>> any reproducer.
>
> This looks like a different issue. Does the machine recover itself or is it
> a hard hang and you have to press a reset button?

The machine is a bit bigger than I made it sound. It has 18 vCPUs and
160 GB of RAM, with a dedicated mount point used only for PostgreSQL
data.

Nevertheless, I was always able to ssh into the machine, so the machine
itself was not in a hard hang, and ext4 mostly recovers by itself (it
took 30 min). But I have also seen a situation where every process that
touches the ext4 filesystem immediately goes into D state and does not
recover even after an hour.

--
Nikola