References: <20220408061809.12324-1-dharamhans87@gmail.com> <20220408061809.12324-2-dharamhans87@gmail.com>
From: Miklos Szeredi
Date: Fri, 22 Apr 2022 16:48:16 +0200
Subject: Re: [PATCH 1/1] FUSE: Allow parallel direct writes on the same file
To: Dharmendra Hans
Cc: linux-fsdevel@vger.kernel.org, fuse-devel, linux-kernel@vger.kernel.org, Bernd Schubert, Dharmendra Singh
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 22 Apr 2022 at 16:30, Dharmendra Hans wrote:
>
> On Thu, Apr 21, 2022 at 8:52 PM Miklos Szeredi wrote:
> >
> > On Fri, 8 Apr 2022 at 08:18, Dharmendra Singh wrote:
> > >
> > > As of now, in FUSE, direct writes on the same file are serialized
> > > over inode lock, i.e. we hold inode lock for the whole duration of
> > > the write request.
> > > This serialization works pretty well for the FUSE
> > > user space implementations which rely on this inode lock for their
> > > cache/data integrity etc. But it badly hurts FUSE implementations
> > > which have their own ways of maintaining data/cache integrity and do not
> > > use this serialization at all.
> > >
> > > This patch allows parallel direct writes on the same file with the
> > > help of a flag called FOPEN_PARALLEL_WRITES. If this flag is set on
> > > the file (flag is passed from libfuse to fuse kernel as part of file
> > > open/create), we do not hold inode lock for the whole duration of the
> > > request, instead acquire it only to protect updates on certain fields
> > > of the inode. FUSE implementations which rely on this inode lock can
> > > continue to do so and this is default behaviour.
> > >
> > > Signed-off-by: Dharmendra Singh
> > > ---
> > >  fs/fuse/file.c            | 38 ++++++++++++++++++++++++++++++++++----
> > >  include/uapi/linux/fuse.h |  2 ++
> > >  2 files changed, 36 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > > index 37eebfb90500..d3e8f44c1228 100644
> > > --- a/fs/fuse/file.c
> > > +++ b/fs/fuse/file.c
> > > @@ -1465,6 +1465,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
> > >  	int err = 0;
> > >  	struct fuse_io_args *ia;
> > >  	unsigned int max_pages;
> > > +	bool p_write = write &&
> > > +		(ff->open_flags & FOPEN_PARALLEL_WRITES) ?
true : false;
> > >
> > >  	max_pages = iov_iter_npages(iter, fc->max_pages);
> > >  	ia = fuse_io_alloc(io, max_pages);
> > > @@ -1472,10 +1474,11 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
> > >  		return -ENOMEM;
> > >
> > >  	if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
> > > -		if (!write)
> > > +		/* Parallel write does not come with inode lock held */
> > > +		if (!write || p_write)
> >
> > Probably would be good to add an inode_is_locked() assert in
> > fuse_sync_writes() to make sure we don't miss cases silently.
>
> I think fuse_set_nowrite() called from fuse_sync_writes() already has
> this assertion.

Ah, okay.

> > >
> > >  			inode_lock(inode);
> > >  		fuse_sync_writes(inode);
> > > -		if (!write)
> > > +		if (!write || p_write)
> > >  			inode_unlock(inode);
> > >  	}
> > >
> > > @@ -1568,22 +1571,36 @@ static ssize_t fuse_direct_read_iter(struct kiocb *iocb, struct iov_iter *to)
> > >  static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
> > >  {
> > >  	struct inode *inode = file_inode(iocb->ki_filp);
> > > +	struct file *file = iocb->ki_filp;
> > > +	struct fuse_file *ff = file->private_data;
> > >  	struct fuse_io_priv io = FUSE_IO_PRIV_SYNC(iocb);
> > >  	ssize_t res;
> > > +	bool p_write = ff->open_flags & FOPEN_PARALLEL_WRITES ? true : false;
> > > +	bool unlock_inode = true;
> > >
> > >  	/* Don't allow parallel writes to the same file */
> > >  	inode_lock(inode);
> > >  	res = generic_write_checks(iocb, from);
> >
> > I don't think this needs inode lock. At least nfs_file_direct_write()
> > doesn't have it.
> >
> > What it does have, however is taking the inode lock for shared for the
> > actual write operation, which is probably something that fuse needs as
> > well.
> >
> > Also I worry about size extending writes not holding the inode lock
> > exclusive. Would that be a problem in your use case?
>
> Thanks for pointing out this issue. Actually there is an issue in
> appending writes.
> Until the current appending write has finished and updated i_size,
> the next appending write can't be allowed, as otherwise one request
> would overwrite data written by another request.
> For other kinds of writes, I do not see the issue, as the i_size update
> can be handled as it is done currently: these writes are based on a
> fixed offset instead of generating the offset from i_size.

That's true, but I still worry... Does your workload include
non-append extending writes? Seems to me making those run in parallel
is asking for trouble.

> If we agree, I will send the updated patch shortly.
> (Also, please take a look at the other patches I posted for atomic-open;
> these patches have been pending for a couple of weeks.)

I'm looking at that currently.

Thanks,
Miklos
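
For illustration only, the locking rule this exchange is circling around can be sketched as a small userspace C function. This is not code from the patch or from fs/fuse: the names dio_lock_mode, dio_write_lock_mode() and its parameters are hypothetical, and treating every size-extending write as exclusive is an assumption that follows the concern raised above, not a decision reached in the thread. Only FOPEN_PARALLEL_WRITES comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock modes standing in for inode_lock() and inode_lock_shared(). */
enum dio_lock_mode {
	DIO_LOCK_SHARED,
	DIO_LOCK_EXCLUSIVE,
};

/*
 * Pick the inode lock mode for one direct write (illustrative only).
 *
 * parallel_ok: the file was opened with FOPEN_PARALLEL_WRITES
 * append:      the offset is generated from i_size
 * pos, len:    offset and length of the write
 * i_size:      file size at the time of the check
 */
enum dio_lock_mode dio_write_lock_mode(bool parallel_ok, bool append,
				       uint64_t pos, uint64_t len,
				       uint64_t i_size)
{
	/* Default behaviour: every direct write is serialized. */
	if (!parallel_ok)
		return DIO_LOCK_EXCLUSIVE;

	/* Appending writes derive their offset from i_size, so serialize them. */
	if (append)
		return DIO_LOCK_EXCLUSIVE;

	/*
	 * Assumption: fixed-offset writes that end past i_size also serialize.
	 * Written this way to avoid overflow in pos + len.
	 */
	if (pos > i_size || len > i_size - pos)
		return DIO_LOCK_EXCLUSIVE;

	/* The write stays inside the current file size. */
	return DIO_LOCK_SHARED;
}

/* Usage examples against an 8192-byte file. */
void dio_write_lock_mode_examples(void)
{
	/* Flag not set: serialized as before. */
	assert(dio_write_lock_mode(false, false, 0, 4096, 8192) == DIO_LOCK_EXCLUSIVE);
	/* Overwrite inside the file: may run in parallel. */
	assert(dio_write_lock_mode(true, false, 4096, 4096, 8192) == DIO_LOCK_SHARED);
	/* Appending write: serialized. */
	assert(dio_write_lock_mode(true, true, 8192, 4096, 8192) == DIO_LOCK_EXCLUSIVE);
	/* Non-append write that extends the file: serialized. */
	assert(dio_write_lock_mode(true, false, 4096, 4097, 8192) == DIO_LOCK_EXCLUSIVE);
}
```

In this sketch, only overwrites that stay within the current i_size take the shared mode, so parallelism is limited to the case where neither the offset nor the resulting size depends on another in-flight write.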