Subject: Re: Filesystem corruption after unreachable storage
To: "Theodore Y. Ts'o"
Cc: linux-ext4@vger.kernel.org
References: <20200124203725.GH147870@mit.edu>
 <3a7bc899-31d9-51f2-1ea9-b3bef2a98913@dupond.be>
 <20200220155022.GA532518@mit.edu>
 <7376c09c-63e3-488f-fcf8-89c81832ef2d@dupond.be>
 <20200225172355.GA14617@mit.edu>
From: Jean-Louis Dupond
Date: Fri, 28 Feb 2020 12:06:17 +0100
In-Reply-To: <20200225172355.GA14617@mit.edu>

On 25/02/2020 18:23, Theodore Y. Ts'o wrote:
> On Tue, Feb 25, 2020 at 02:19:09PM +0100, Jean-Louis Dupond wrote:
>> FYI,
>>
>> Just did same test with e2fsprogs 1.45.5 (from buster backports) and kernel
>> 5.4.13-1~bpo10+1.
>> And having exactly the same issue.
>> The VM needs a manual fsck after storage outage.
>>
>> Don't know if its useful to test with 5.5 or 5.6?
>> But it seems like the issue still exists.
> This is going to be a long shot, but if you could try testing with
> 5.6-rc3, or with this commit cherry-picked into a 5.4 or later kernel:
>
> commit 8eedabfd66b68a4623beec0789eac54b8c9d0fb6
> Author: wangyan
> Date: Thu Feb 20 21:46:14 2020 +0800
>
>     jbd2: fix ocfs2 corrupt when clearing block group bits
>
>     I found a NULL pointer dereference in ocfs2_block_group_clear_bits().
>     The running environment:
>         kernel version: 4.19
>         A cluster with two nodes, 5 luns mounted on two nodes, and do some
>         file operations like dd/fallocate/truncate/rm on every lun with storage
>         network disconnection.
>
>     The fallocate operation on dm-23-45 caused an null pointer dereference.
>     ...
>
> ... it would be interesting to see if fixes things for you. I can't
> guarantee that it will, but the trigger of the failure which wangyan
> found is very similar indeed.
>
> Thanks,
>
> - Ted

Unfortunately it was too long a shot :)

Tested with a 5.4 kernel with that patch included, and also with 5.6-rc3.
But both had the same issue:

- Filesystem goes read-only when the storage comes back.
- Manual fsck needed on bootup to recover from it.

It would be great if we could make it not corrupt the filesystem on
storage recovery.
I'm happy to test some patches if they are available :)

Thanks
Jean-Louis