References: <20180706174324.GA3049@xps13.dannf> <20180707041018.GB3546@thunk.org> <20180710165143.GA20459@xps13.dannf> <20180710204329.GB20459@xps13.dannf> <20180712230826.GB28610@thunk.org>
In-Reply-To: <20180712230826.GB28610@thunk.org>
From: dann frazier
Date: Sat, 14 Jul 2018 05:21:14 -0600
Subject: Re: [Bisect] ext4_validate_inode_bitmap:98: comm stress-ng: Corrupt inode bitmap
To: tytso@mit.edu, Ike Pan, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, yanaijie@huawei.com, Colin King, Kamal Mostafa

On Thu, Jul 12, 2018 at 5:08 PM Theodore Y. Ts'o wrote:
>
> >
> > Review console log and on each run I have filesystem rebuild. The problem
> > is that mke2fs I am using is 1.44.3-rc2. I am now reseting the environment
> > and re-test.
> >
>
> Could it be that you saw the error in ext4_validate_block_bitmap()?

Looks like it. From Ike's report:

# grep EXT4 d05-4-ipmi.log
[   26.215587] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: (null)
[   29.844105] EXT4-fs (sdb2): re-mounted. Opts: errors=remount-ro
[ 3586.211348] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 4705: bad block bitmap checksum
[ 8254.776992] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 4193: bad block bitmap checksum

I've run my test case for several days w/ just the inode bitmap fix
and haven't been able to reproduce it - but perhaps that's just the
nature of the chdir test.

> The patch which I sent Dann only fixed the problem for inode bitmaps;
> I noticed today that we need to fix it for block allocation bitmaps as
> well.

I've also now run several iterations w/ the block bitmap fix as well,
and still no problems, so:

Tested-by: dann frazier

> commit 8d5a803c6a6ce4ec258e31f76059ea5153ba46ef
> Author: Theodore Ts'o
> Date:   Thu Jul 12 19:08:05 2018 -0400
>
>     ext4: check for allocation block validity with block group locked
>
>     With commit 044e6e3d74a3: "ext4: don't update checksum of new
>     initialized bitmaps" the buffer valid bit will get set without
>     actually setting up the checksum for the allocation bitmap, since the
>     checksum will get calculated once we actually allocate an inode or
>     block.
>
>     If we are doing this, then we need to (re-)check the verified bit
>     after we take the block group lock. Otherwise, we could race with
>     another process reading and verifying the bitmap, which would then
>     complain about the checksum being invalid.
>
>     https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137
>
>     Signed-off-by: Theodore Ts'o
>     Cc: stable@kernel.org

Would it also make sense to add the following?

Fixes: 044e6e3d74a3 ("ext4: don't update checksum of new initialized bitmaps")

 -dann

> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index e68cefe08261..aa52d87985aa 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -368,6 +368,8 @@ static int ext4_validate_block_bitmap(struct super_block *sb,
>  		return -EFSCORRUPTED;
>
>  	ext4_lock_group(sb, block_group);
> +	if (buffer_verified(bh))
> +		goto verified;
>  	if (unlikely(!ext4_block_bitmap_csum_verify(sb, block_group,
>  			desc, bh))) {
>  		ext4_unlock_group(sb, block_group);
> @@ -386,6 +388,7 @@ static int ext4_validate_block_bitmap(struct super_block *sb,
>  		return -EFSCORRUPTED;
>  	}
>  	set_buffer_verified(bh);
> +verified:
>  	ext4_unlock_group(sb, block_group);
>  	return 0;
>  }
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index fb83750c1a14..e9d8e2667ab5 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -90,6 +90,8 @@ static int ext4_validate_inode_bitmap(struct super_block *sb,
>  		return -EFSCORRUPTED;
>
>  	ext4_lock_group(sb, block_group);
> +	if (buffer_verified(bh))
> +		goto verified;
>  	blk = ext4_inode_bitmap(sb, desc);
>  	if (!ext4_inode_bitmap_csum_verify(sb, block_group, desc, bh,
>  					   EXT4_INODES_PER_GROUP(sb) / 8)) {
> @@ -101,6 +103,7 @@ static int ext4_validate_inode_bitmap(struct super_block *sb,
>  		return -EFSBADCRC;
>  	}
>  	set_buffer_verified(bh);
> +verified:
>  	ext4_unlock_group(sb, block_group);
>  	return 0;
>  }
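
For anyone following along, the re-check that the quoted commit message
describes can be sketched in userspace C roughly as below. This is only an
analogy and not the kernel code: struct bitmap_buf, toy_csum(),
init_fresh_bitmap() and validate_bitmap() are hypothetical names made up for
illustration, a pthread mutex stands in for the block group lock, and a plain
bool stands in for the buffer verified bit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a bitmap buffer; not a kernel type. */
struct bitmap_buf {
	pthread_mutex_t group_lock;	/* plays the role of the block group lock */
	bool verified;			/* plays the role of the buffer verified bit */
	uint32_t stored_csum;		/* checksum recorded for the bitmap */
	uint32_t data;			/* bitmap contents, shrunk to one word */
};

/* Toy checksum, for illustration only. */
static uint32_t toy_csum(uint32_t data)
{
	return data ^ 0x5a5a5a5au;
}

/*
 * Mimics a freshly initialized bitmap: it is marked verified, but the
 * stored checksum is deliberately left stale, on the assumption that it
 * will be computed later, at first allocation.
 */
static void init_fresh_bitmap(struct bitmap_buf *b)
{
	pthread_mutex_init(&b->group_lock, NULL);
	b->data = 0;
	b->stored_csum = 0;
	b->verified = true;
}

/* Returns 0 if the bitmap may be used, -1 on a checksum mismatch. */
static int validate_bitmap(struct bitmap_buf *b)
{
	int ret = 0;

	pthread_mutex_lock(&b->group_lock);
	/* Re-check under the lock: another thread may have set the flag. */
	if (b->verified)
		goto out;
	if (toy_csum(b->data) != b->stored_csum) {
		ret = -1;
		goto out;
	}
	b->verified = true;
out:
	pthread_mutex_unlock(&b->group_lock);
	return ret;
}
```

In this sketch, calling validate_bitmap() on a buffer set up by
init_fresh_bitmap() returns 0 even though the stored checksum is stale,
because the flag is consulted after the lock is taken. Without that check,
the same call would report a mismatch.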