Date: Thu, 7 Apr 2022 10:54:32 -0400
From: "Theodore Ts'o"
To: Ritesh Harjani
Cc: anserper@ya.ru, linux-ext4@vger.kernel.org, Andrew Perepechko
Subject: Re: [PATCH v3] ext4: truncate during setxattr leads to kernel panic

On Tue, Apr 05, 2022 at 03:24:51PM +0530, Ritesh Harjani wrote:
> On 22/04/02 11:40AM, anserper@ya.ru wrote:
> > From: Andrew Perepechko
> >
> > When changing a large xattr value to a different large xattr value,
> > the old xattr inode is freed. Truncate during the final iput causes
> > current transaction restart. Eventually, parent inode bh is marked
> > dirty and kernel panic happens when jbd2 figures out that this bh
> > belongs to the committed transaction.
> >
> > A possible fix is to call this final iput in a separate thread.
> > This way, setxattr transactions will never be split into two.
> > Since the setxattr code adds xattr inodes with nlink=0 into the
> > orphan list, old xattr inodes will be properly cleaned up in
> > any case.
>
> Ok, I think there is a lot happening in above description. I think part of the
> problem I am unable to understand it easily is because I haven't spend much time
> with xattr code. But I think below 2 requests will be good to have -
>
> 1. Do we have the call stack for this problem handy. I think it will be good to
> mention it in the commit message itself. It is sometimes easy to look at the
> call stack if someone else encounters a similar problem. That also gives more
> idea about where the problem is occuring.
>
> 2. Do we have a easy reproducer for this problem?
> I think it will be a good
> addition to fstests given that this adds another context in calling iput on
> old_ea_inode.

Andrew, would it be possible for you to supply a call stack and a
reproducer?

It sounds like what's going on is this: if the file system has the
ea_inode feature enabled and a large xattr value is stored in its own
inode, then truncating that inode can end up spread across two
transactions.

The problem is that when iput(ea_inode) is called from
ext4_xattr_set_entry(), a handle has already been passed into that
function, since the xattr operation is part of its own transaction, and
so the truncate operation becomes part of a "nested handle". That's OK,
so long as the initial handle is started with sufficient credits for
the nested start_handle. But when that handle is closed and then
re-opened, there are two problems. The first is that the xattr
operation is no longer atomic (it is spread across two transactions).
The second is that if write access to the inode table's bh was
requested before the implied truncate from iput(ea_inode), then when we
call handle_dirty_metadata() on that bh, we get a jbd2 assertion.
(Which is good, because it catches and reports the first problem.)

Moving the iput to a separate thread avoids this problem, since the
truncate can then take place in its own handle.

The other solution would be to somehow pass the inode all the way up
through the call chain, and only call iput(ea_inode) after the handle
is stopped. But that would require quite a lot of code surgery, since
ext4_xattr_set_entry() is called in a number of places, and the iput()
would have to be plumbed up through two callers to where the handle is
actually stopped.

- Ted
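
To make the ordering issue concrete, here is a toy userspace model of
the two placements of the final iput. It is only a sketch and not
kernel code: every name in it (toy_handle, toy_inode, toy_iput and so
on) is hypothetical, and it assumes the worst case, namely that a
truncate issued under an active outer handle always pushes that handle
into a new transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these are kernel APIs. */
struct toy_handle {
    int  tid;        /* transaction this handle currently belongs to */
    bool active;     /* true between start and stop */
    int  restarts;   /* times the handle was moved to a new transaction */
};

struct toy_inode {
    int  refcount;
    bool truncated;  /* set once the final put has freed the blocks */
};

static int toy_next_tid = 1;

static void toy_handle_start(struct toy_handle *h)
{
    h->tid = toy_next_tid++;
    h->active = true;
    h->restarts = 0;
}

static void toy_handle_stop(struct toy_handle *h)
{
    h->active = false;
}

/*
 * Drop a reference. The final put truncates the inode. In this model
 * (an assumption), a truncate issued under an active outer handle
 * always forces that handle into a new transaction.
 */
static void toy_iput(struct toy_inode *inode, struct toy_handle *outer)
{
    if (--inode->refcount > 0)
        return;
    if (outer != NULL && outer->active) {
        outer->tid = toy_next_tid++;
        outer->restarts++;
    }
    inode->truncated = true;
}

/* Variant A: final put inside the xattr handle. Returns restart count. */
static int toy_setxattr_put_inside(struct toy_inode *old_ea)
{
    struct toy_handle h;

    toy_handle_start(&h);
    toy_iput(old_ea, &h);      /* truncate runs nested in h */
    toy_handle_stop(&h);
    return h.restarts;
}

/* Variant B: remember the inode, put it only after the handle stops. */
static int toy_setxattr_put_deferred(struct toy_inode *old_ea)
{
    struct toy_handle h;
    struct toy_inode *deferred;

    toy_handle_start(&h);
    deferred = old_ea;         /* carried out of the handle, not released here */
    toy_handle_stop(&h);
    toy_iput(deferred, &h);    /* h is inactive; truncate stands alone */
    return h.restarts;
}
```

In this model variant A splits the xattr operation across two
transactions, while variant B leaves it in one and still truncates the
old inode. Handing the inode to a separate thread has the same effect
as variant B as far as the xattr handle is concerned; the difference is
only in who ends up calling the final put.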