Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp9411978ybc; Sat, 30 Nov 2019 06:52:57 -0800 (PST) X-Google-Smtp-Source: APXvYqxUiHpryMC0RT5T9Eo+V4dpDzGJMckAq9L34B23jF94Oc0HZHRy0WvTgwfgD53NUwor1lxf X-Received: by 2002:a17:906:941:: with SMTP id j1mr13017055ejd.185.1575125576930; Sat, 30 Nov 2019 06:52:56 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1575125576; cv=none; d=google.com; s=arc-20160816; b=0cmm7Qx1CFrzd89IszVl7nt6XwrYL/MDidY28GV5V0Bh4yAerI66bbVWh2J8RrZaO/ dU++9kdrj0VpSnhmj9v4RZfQoLU77WdRvN7qjwMDKFX8ySQn16VsTgQkhuTw5XKtaPcH gAnAeLY/7e0B8DKQV712PBdictItD8I+qx+Nx28rn6j7DkxTIJy7tz1BFqF/ZIEgSH8a 1nIgrem8aDyveSSiSTho/1RW/XMoZ8c6CSGwuXPBIsyO2wgYj760IvAxai+RRXRForgQ CJsixOlNUOnl+VR+i0tVFb1GxpzxY/uPNTUZNXwTcFA+nnf+1BpIvI8GSzGmFzLzx8M5 DFuA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:references:cc:to:from:subject; bh=7b+hfPb4FXTEHbtqDEKA6+YxLY6m0Hxz4aT1FxDoVzo=; b=mC0HfehbbOeAgp81k1TllqZXOklEmLVlpXiboVmdYPnqT7wi+sFVB6V2ZjRMhv1Gcj yS3gU/VhSV/G4g7ASiQ79tth1bAph9lktJQKUjBOTWZ63q72/aErz6KVmEF4fMJNy0tt lmf2B1MbBF0+r2lUHuQPk5goZfhGOkmr9h1/tl8MOsiSKl6T0KrITlkCeE3aM4aLQAAV yM1b97/SsBoCKvhT00T+BwjUlU34hhjOj1TJth9EiR4HOCQ5w7SZEvWpeQPZkoeXDws6 fdE2FN/QA2sr1ASXzSNRd+mBCa7KP/W0hOc4M933JDAOWIJejFyGA3h0yarIIkxmzTu6 1KJg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id ec6si16113215ejb.174.2019.11.30.06.52.12; Sat, 30 Nov 2019 06:52:56 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726965AbfK3Oua (ORCPT + 99 others); Sat, 30 Nov 2019 09:50:30 -0500 Received: from szxga04-in.huawei.com ([45.249.212.190]:7183 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726385AbfK3Ou3 (ORCPT ); Sat, 30 Nov 2019 09:50:29 -0500 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id C56233B173EF51170D9B; Sat, 30 Nov 2019 22:50:18 +0800 (CST) Received: from [127.0.0.1] (10.173.220.179) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Sat, 30 Nov 2019 22:50:09 +0800 Subject: Re: [PATCH] ext4, jbd2: ensure panic when there is no need to record errno in the jbd2 sb From: "zhangyi (F)" To: Jan Kara , CC: , , , References: <20191126144537.30020-1-yi.zhang@huawei.com> <20191129144611.GA27588@quack2.suse.cz> <0aa529fe-a881-aa4c-3b8f-980c8eceb64b@huawei.com> Message-ID: Date: Sat, 30 Nov 2019 22:50:08 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <0aa529fe-a881-aa4c-3b8f-980c8eceb64b@huawei.com> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.173.220.179] X-CFilter-Loop: Reflected Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On 2019/11/30 11:24, zhangyi (F) wrote: > On 2019/11/29 22:46, Jan Kara wrote: >> On Tue 26-11-19 22:45:37, zhangyi (F) wrote: >>> JBD2_REC_ERR flag used to indicate the errno has been updated when jbd2 >>> aborted, and then __ext4_abort() and ext4_handle_error() can invoke >>> panic if ERRORS_PANIC is specified. But there is one exception, if jbd2 >>> thread failed to submit commit record, it abort journal through >>> invoking __jbd2_journal_abort_hard() without set this flag, so we can >>> no longer panic. Fix this by set such flag even if there is no need to >>> record errno in the jbd2 super block. >>> >>> Fixes: 4327ba52afd03 ("ext4, jbd2: ensure entering into panic after recording an error in superblock") >>> Signed-off-by: zhangyi (F) >>> Cc: >> >> Thanks for the patch. This indeed looks like a bug. I was trying hard to >> understand why are we actually using __jbd2_journal_abort_hard() in >> fs/jbd2/commit.c in the first place. And after some digging, I think it is >> an oversight and we should just use jbd2_journal_abort(). The calls have been >> introduced by commit 818d276ceb83a "ext4: Add the journal checksum >> feature". Before that commit, we were just using jbd2_journal_abort() when >> writing commit block failed. And when we use jbd2_journal_abort() from >> everywhere, that will also deal with the problem you've found. >> >> Also as a nice cleanup we could then just drop __jbd2_journal_abort_hard(), >> __jbd2_journal_abort_soft() and have all the functionality in a single >> function jbd2_journal_abort(). >> > > Indeed, it seems that we also need to record the errno if we failed to > submit commit block, I will remove __jbd2_journal_abort_hard() and combine > them in my next iteration. > Hi Ted and Jan, I am confusing about the commit fb7c02445c49 "ext4: pass -ESHUTDOWN code to jbd2 layer" when I trying to cleanup the __journal_abort_soft() and __jbd2_journal_abort_hard(). Before this commit, we will not record the errno if we shutdown the filesystem no matter it has been aborted or not, so the errno will not change. After this commit, we record 0 to "sb->s_errno" for the first jbd2_journal_abort(-ESHUTDOWN), and we also do not update the errno if it has been aborted and record a no-zero errno because of the follow checking. + if (journal->j_flags & JBD2_ABORT) { + write_unlock(&journal->j_state_lock); + if (!old_errno && old_errno != -ESHUTDOWN && ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + errno == -ESHUTDOWN) + jbd2_journal_update_sb_errno(journal); + return; + } So the only modification of this patch is: 1) fix the lock; 2) set journal->j_errno = -ESHUTDOWN and JBD2_REC_ERR flag when we invoke jbd2_journal_abort(-ESHUTDOWN). These two modifications do not relate to the git log you mentioned. I guess do you want to mean if (old_errno && old_errno != -ESHUTDOWN && errno == -ESHUTDOWN) ? If so, why we need to overwrite the last aborted errno to 0, if the filesystem was already aborted for some reasons, will it cover up the issue? Am I miss something? Thanks, Yi.