Date: Wed, 23 Feb 2022 09:50:10 +0100
From: Miklos Szeredi
To: Jiachen Zhang
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, xieyongji@bytedance.com
Subject: Re: [PATCH v2] fuse: fix deadlock between atomic O_TRUNC open() and page invalidations
In-Reply-To: <20211229040239.66075-1-zhangjiachen.jaycee@bytedance.com>

On Wed, Dec 29, 2021 at 12:02:39PM +0800, Jiachen Zhang wrote:
> fuse_finish_open() will be called with FUSE_NOWRITE set in case of atomic
> O_TRUNC open(), so commit 76224355db75 ("fuse: truncate pagecache on
> atomic_o_trunc") replaced invalidate_inode_pages2() by truncate_pagecache()
> in such a case to avoid the A-A deadlock. However, we found another A-B-B-A
> deadlock related to the case above, which will cause the xfstests
> generic/464 testcase hung in our virtio-fs test environment.
>
> For example, consider two processes concurrently open one same file, one
> with O_TRUNC and another without O_TRUNC. The deadlock case is described
> below, if open(O_TRUNC) is already set_nowrite(acquired A), and is trying
> to lock a page (acquiring B), open() could have held the page lock
> (acquired B), and waiting on the page writeback (acquiring A). This would
> lead to deadlocks.
>
> open(O_TRUNC)
> ----------------------------------------------------------------
> fuse_open_common
>   inode_lock            [C acquire]
>   fuse_set_nowrite      [A acquire]
>
>   fuse_finish_open
>     truncate_pagecache
>       lock_page         [B acquire]
>       truncate_inode_page
>       unlock_page       [B release]
>
>   fuse_release_nowrite  [A release]
>   inode_unlock          [C release]
> ----------------------------------------------------------------
>
> open()
> ----------------------------------------------------------------
> fuse_open_common
>   fuse_finish_open
>     invalidate_inode_pages2
>       lock_page         [B acquire]
>         fuse_launder_page
>           fuse_wait_on_page_writeback [A acquire & release]
>       unlock_page       [B release]
> ----------------------------------------------------------------
>
> Besides this case, all calls of invalidate_inode_pages2() and
> invalidate_inode_pages2_range() in fuse code also can deadlock with
> open(O_TRUNC). This commit tries to fix it by adding a new lock,
> atomic_o_trunc, to protect the areas with the A-B-B-A deadlock risk.

Thanks.

Can you please try the following patch? Instead of introducing a new lock
it tries to fix this by moving the truncate_pagecache() out of the nowrite
protected section.

Thanks,
Miklos

---
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 656e921f3506..56f439719129 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -537,6 +537,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	struct fuse_file *ff;
 	void *security_ctx = NULL;
 	u32 security_ctxlen;
+	bool trunc = flags & O_TRUNC;
 
 	/* Userspace expects S_IFREG in create mode */
 	BUG_ON((mode & S_IFMT) != S_IFREG);
@@ -561,7 +562,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	inarg.mode = mode;
 	inarg.umask = current_umask();
 
-	if (fm->fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
+	if (fm->fc->handle_killpriv_v2 && trunc &&
 	    !(flags & O_EXCL) && !capable(CAP_FSETID)) {
 		inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
 	}
@@ -623,6 +624,8 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	} else {
 		file->private_data = ff;
 		fuse_finish_open(inode, file);
+		if (fm->fc->atomic_o_trunc && trunc)
+			truncate_pagecache(inode, 0);
 	}
 	return err;
 
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 829094451774..2e041708ef44 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -210,7 +210,6 @@ void fuse_finish_open(struct inode *inode, struct file *file)
 		fi->attr_version = atomic64_inc_return(&fc->attr_version);
 		i_size_write(inode, 0);
 		spin_unlock(&fi->lock);
-		truncate_pagecache(inode, 0);
 		file_update_time(file);
 		fuse_invalidate_attr_mask(inode, FUSE_STATX_MODSIZE);
 	} else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) {
@@ -239,30 +238,32 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
 	if (err)
 		return err;
 
-	if (is_wb_truncate || dax_truncate) {
+	if (is_wb_truncate || dax_truncate)
 		inode_lock(inode);
-		fuse_set_nowrite(inode);
-	}
 
 	if (dax_truncate) {
 		filemap_invalidate_lock(inode->i_mapping);
 		err = fuse_dax_break_layouts(inode, 0, 0);
 		if (err)
-			goto out;
+			goto out_inode_unlock;
 	}
 
+	if (is_wb_truncate || dax_truncate)
+		fuse_set_nowrite(inode);
+
 	err = fuse_do_open(fm, get_node_id(inode), file, isdir);
 	if (!err)
 		fuse_finish_open(inode, file);
 
-out:
+	if (is_wb_truncate | dax_truncate)
+		fuse_release_nowrite(inode);
+	if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC))
+		truncate_pagecache(inode, 0);
 	if (dax_truncate)
 		filemap_invalidate_unlock(inode->i_mapping);
-
-	if (is_wb_truncate | dax_truncate) {
-		fuse_release_nowrite(inode);
+out_inode_unlock:
+	if (is_wb_truncate | dax_truncate)
 		inode_unlock(inode);
-	}
 
 	return err;
 }
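
For illustration only (not part of the patch): the ordering argument can be
checked with a small userspace model. The sketch below is an assumption-laden
simplification, not kernel code. It treats A (FUSE_NOWRITE / writeback wait),
B (page lock) and C (inode lock) as plain locks, records which lock is taken
while another is held, and reports a cycle in that order graph. All names in it
(struct step, paths_can_deadlock(), patch_removes_inversion(), the path
arrays) are hypothetical. A cycle only indicates a possible deadlock, and the
A "lock" is really a wait on in-flight writes, so this is a rough analogue of
what a lock-order checker would flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Letters follow the quoted commit message; the mapping is a simplification. */
enum { LOCK_A, LOCK_B, LOCK_C, NLOCKS }; /* A: nowrite, B: page, C: inode */

struct step {
	int lock;
	bool acquire; /* true = acquire, false = release */
};

#define ACQ(l) { (l), true }
#define REL(l) { (l), false }
#define NSTEPS(a) (sizeof(a) / sizeof((a)[0]))

/* open(O_TRUNC) before the patch: page lock taken inside the nowrite section. */
const struct step trunc_open_before[] = {
	ACQ(LOCK_C), ACQ(LOCK_A), ACQ(LOCK_B), REL(LOCK_B), REL(LOCK_A), REL(LOCK_C),
};

/* open(O_TRUNC) after the patch: nowrite released before truncate_pagecache(). */
const struct step trunc_open_after[] = {
	ACQ(LOCK_C), ACQ(LOCK_A), REL(LOCK_A), ACQ(LOCK_B), REL(LOCK_B), REL(LOCK_C),
};

/* open() without O_TRUNC: waits for writeback while holding the page lock. */
const struct step plain_open[] = {
	ACQ(LOCK_B), ACQ(LOCK_A), REL(LOCK_A), REL(LOCK_B),
};

/* Set edge[h][l] whenever lock l is acquired while lock h is held. */
static void record_order(const struct step *path, size_t n,
			 bool edge[NLOCKS][NLOCKS])
{
	bool held[NLOCKS] = { false };

	for (size_t i = 0; i < n; i++) {
		int l = path[i].lock;

		assert(l >= 0 && l < NLOCKS);
		if (!path[i].acquire) {
			held[l] = false;
			continue;
		}
		for (int h = 0; h < NLOCKS; h++)
			if (held[h])
				edge[h][l] = true;
		held[l] = true;
	}
}

/* True if the combined lock-order graph of both paths contains a cycle. */
bool paths_can_deadlock(const struct step *p1, size_t n1,
			const struct step *p2, size_t n2)
{
	bool reach[NLOCKS][NLOCKS];

	memset(reach, 0, sizeof(reach));
	record_order(p1, n1, reach);
	record_order(p2, n2, reach);

	/* Transitive closure; a lock that can reach itself lies on a cycle. */
	for (int k = 0; k < NLOCKS; k++)
		for (int i = 0; i < NLOCKS; i++)
			for (int j = 0; j < NLOCKS; j++)
				if (reach[i][k] && reach[k][j])
					reach[i][j] = true;

	for (int i = 0; i < NLOCKS; i++)
		if (reach[i][i])
			return true;
	return false;
}

/* The old ordering has an A/B inversion against open(); the new one does not. */
bool patch_removes_inversion(void)
{
	return paths_can_deadlock(trunc_open_before, NSTEPS(trunc_open_before),
				  plain_open, NSTEPS(plain_open)) &&
	       !paths_can_deadlock(trunc_open_after, NSTEPS(trunc_open_after),
				   plain_open, NSTEPS(plain_open));
}
```

In this model the pre-patch O_TRUNC path contributes the edge A -> B and the
plain open() path contributes B -> A, which closes the cycle. With
truncate_pagecache() moved after fuse_release_nowrite(), the O_TRUNC path no
longer takes B while holding A, so only B -> A remains.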