Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp6280750rwb; Mon, 5 Dec 2022 10:05:15 -0800 (PST) X-Google-Smtp-Source: AA0mqf4RVilgK+IRRD7XZ9WygwoiFGOn9aBwjcILr9gWu654FvqbRkenaFDES8Gm92Zdj53K7XXi X-Received: by 2002:a05:6402:f13:b0:46c:e627:13bb with SMTP id i19-20020a0564020f1300b0046ce62713bbmr2702491eda.204.1670263515006; Mon, 05 Dec 2022 10:05:15 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670263514; cv=none; d=google.com; s=arc-20160816; b=NMN7lNJ/8clRjgr4AmBieA8Vp/MMEF9gvAIYIDvZQYGonpuJh2ZEEJaI7E/4ymIATt wL8JiINosv6xsIV62Voo/tyv1c9n22u/VMi7nvWRuhZcwENpfcyIBCpxtMf3m7tghHGj FI9ZOPSxWn20IEmk8T2MQK5sfdPPxTlX4U9uG8d3RvfFys5+IO2FZnsDxVE5OJYld3bI okEKrzfR9tgOKbdC13O5R60aoYvl0HlLVCrYvALY8kWuv5/IdmQPjKw0LtiG+iIf2+0G Bp08j88oMsJzmRGyExVePFui82jCAgDfztjQwn7F/WVgegIqKWpfDyH/lhlRvXIzkLy7 2H1Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=EyxV8qZhIzGczmqYjJSp/unx1C6FSCFFVqHS1FGWrSc=; b=h0qfPCHTYUbxoFBqu3mkqFQ9/uXOxPHLEEnYXxJ7i7wdp3E6JFX5Y2ibNIWRV8vZJt 61pj/GQHYAPv8PRYmrbiGYeVzqQYKkvc3A//H1WSzkHjUuTa8AQlHeqtk5erJliw/hJn NvEe36c7wJotYVFuWNdX/eXCmnf/Keqy3piwWtJfojKmC4LVfGluSRcylbd1g0vS94LK 237h94t8IPaflHe+nLsp52pscYR+GOsy7fQ3wi5jMdE2dYZKdGI3qMCc/MMAs8+j+Nlv P9Jfq+KiaYZ6s+HSMsv4sMp+frJ6KV35yO88PsNgPq2KkiCeZaxIrZ09aX11xuHQ/7Ir F7JQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@mit.edu header.s=outgoing header.b=DFmWZlMI; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mit.edu Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qo14-20020a170907874e00b007b271fa8e7asi11153224ejc.197.2022.12.05.10.04.43; Mon, 05 Dec 2022 10:05:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=fail header.i=@mit.edu header.s=outgoing header.b=DFmWZlMI; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mit.edu Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232762AbiLERvs (ORCPT + 99 others); Mon, 5 Dec 2022 12:51:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232541AbiLERv3 (ORCPT ); Mon, 5 Dec 2022 12:51:29 -0500 Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3DEAC20188; Mon, 5 Dec 2022 09:49:28 -0800 (PST) Received: from cwcc.thunk.org (pool-173-48-120-46.bstnma.fios.verizon.net [173.48.120.46]) (authenticated bits=0) (User authenticated as tytso@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 2B5Hmp1V006666 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 5 Dec 2022 12:48:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mit.edu; s=outgoing; t=1670262534; bh=EyxV8qZhIzGczmqYjJSp/unx1C6FSCFFVqHS1FGWrSc=; h=Date:From:To:Cc:Subject:References:In-Reply-To; b=DFmWZlMI8RgxOi8rLeDWHiz7WyxufuRbmz42fSAF2T9p89fGPNnkEL/ygeraPArkM 2Ehn8OFIC1QkNzpmQ9SmzOy23xPFjOMOEIpUbibyaLWegYnyLt15O+Ddt/4dtOA0N8 7ubNpN2u6JieouauF/rVYGukq1Yj3CHiD3oCh8Qb8EGPLo8ipBO4xYXdHwgyot3474 1m05wQ0VCex2QxQ5IEwPy66picVF1bxivFFqZ7//pZCiqG8YZAwQtp2O1Z7+LCr7jT 7RF7StYyEU9YkCDwvt839APAqit8o+g6FrSdVR2m9NaS+9oCa+5ioq5hkb7aCYw1Df 4Mf2GFYC4IYYg== Received: by cwcc.thunk.org (Postfix, from userid 15806) id D789015C3489; Mon, 5 Dec 2022 12:48:50 -0500 (EST) Date: Mon, 5 Dec 2022 12:48:50 -0500 From: "Theodore Ts'o" To: zhanchengbin Cc: Ye Bin , adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, jack@suse.cz, Ye Bin Subject: Re: [PATCH RFC] ext4:record error information when insert extent failed in 'ext4_split_extent_at' Message-ID: References: <20221024122725.3083432-1-yebin@huaweicloud.com> <2ab8e268-4e1a-8e97-4798-48fcdb651cdf@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2ab8e268-4e1a-8e97-4798-48fcdb651cdf@huawei.com> X-Spam-Status: No, score=-4.0 required=5.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Fri, Oct 28, 2022 at 09:41:55AM +0800, zhanchengbin wrote: > There have been a lot of problems here before, but the problem has not been > fundamentally solved. Indeed, this only papers only the problem. The commit description is pretty good, in that it describes the the root cause of the problem: > > If 'ext4_ext_insert_extent' return '-ENOMEM' which will not fix 'ex->ee_len' by > > old length. 'ext4_ext_insert_extent' will trigger extent tree merge, fix like > > 'ex->ee_len = orig_ex.ee_len' may lead to new issues. > > To solve above issue, record error messages when 'ext4_ext_insert_extent' return > > 'err' not equal '(-ENOSPC && -EDQUOT)'. If filesysten is mounted with 'errors=continue' > > as filesystem is not clean 'fsck' will repair issue. If filesystem is mounted with > > 'errors=remount-ro' filesystem will be remounted by read-only. What should actually happen here is that we undo the change in orig_ex if ext4_ext_insert_extent fails. As you point out, in an earlier part of the code path, this gets handled by a "goto fix_extent_len". The problem is that it's possible that the shape of the extent tree may have changed before ext4_insert_extent() fails with the ENOMEM. So the simplistic fix of just jumping to fix_extent_len isn't going to work. But fixing it by much marking the file system is corrupt is a bit of a cop-out. Consider that if you were running ext4 in a Huawei Cloud, and you run into the memory allocation failure, what would you prefer? For the individual system call to fail, and propagating the failure to userspace? Or to leave the file system corrupted? (And then either forcing a reboot so the file system to be fixed, or allow the system to stumble along, with further unknown unexpected behaviour from userspace?) The real correct fix is that ext4_ext_insert_extent() needs to rollback any changes to the extent tree that was made, *or* it needs to make sure that the operation will succeed before starting to make any changes, *or* we need to look up the orig_extent via orig_ex->ee_block, and then undo the change. It might be that we can't always reliably rollback the change, or we might think that the operation will succeed, but then it fail due to an I/O error. If it's due to an I/O error, then it's fine to bail and mark the file system as corrupted. But if the failure is caused by an ENOMEM, we should be able to handle this case more gracefully. Can you look into a better fix? Thanks, - Ted