Date: Thu, 26 May 2022 14:14:23 -0400
From: Vivek Goyal
To: Bernd Schubert
Cc: Dharmendra Singh, miklos@szeredi.hu, linux-fsdevel@vger.kernel.org,
    fuse-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org,
    Dharmendra Singh
Subject: Re: [PATCH v3 1/1] FUSE: Allow non-extending parallel direct
    writes on the same file.
References: <20220520043443.17439-1-dharamhans87@gmail.com>
    <20220520043443.17439-2-dharamhans87@gmail.com>
    <3350e4e2-bad5-7f2f-2b09-c1807815a29c@ddn.com>
In-Reply-To: <3350e4e2-bad5-7f2f-2b09-c1807815a29c@ddn.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 25, 2022 at 10:49:49PM +0200, Bernd Schubert wrote:
>
>
> On 5/25/22 22:31, Vivek Goyal wrote:
> > On Fri, May 20, 2022 at 10:04:43AM +0530, Dharmendra Singh wrote:
> > > From: Dharmendra Singh
> > >
> > > In general, as of now, in FUSE, direct writes on the same file are
> > > serialized over inode lock, i.e. we hold inode lock for the full
> > > duration of the write request. I could not find in fuse code a
> > > comment which clearly explains why this exclusive lock is taken for
> > > direct writes. Our guess is some user space fuse implementations
> > > might be relying on this lock for serialization, and also it
> > > protects against issues arising due to file size assumptions or
> > > write failures. This patch relaxes this exclusive lock in some
> > > cases of direct writes.
> >
> > I have this question as well. My understanding was that in general,
> > reads can do shared lock while writes have to take exclusive lock.
> > And I assumed that extends to both buffered as well as direct
> > writes.
> >
> > I would also like to understand what's the fundamental restriction
> > and why O_DIRECT is special that this restriction does not apply.
> >
> > Is any other file system doing this as well?
> >
> > If fuse server dir is shared with other fuse clients, it is possible
> > that i_size in this client is stale. Will that be a problem? I guess
> > if that's the problem then even a single write will be a problem,
> > because two fuse clients might be trying to write.
> >
> > Just trying to make sure that it is safe to allow parallel direct
> > writes.
>
> I think what is missing in this series is a comment explaining when
> this lock is needed at all. Our network file system is log structured -
> any parallel writes to the same file from different remote clients are
> handled through addition of fragments on the network server side -
> lockless safe due to byte level accuracy. With the exception of
> conflicting writes - last client wins - application is then doing
> 'silly' things - locking would not help either. And random parallel
> writes from the same (network) client are even an ideal case for us, as
> that is handled through shared blocks for different fragments (file
> offset + len). So for us shared writes are totally safe.
>
> When Dharmendra and I discussed the lock we came up with a few write
> error handling cases where that lock might be needed - I guess that
> should be added as a comment.

Right, please update the changelog to make the thought process clear.

So there are no restrictions on the fuse client side, from a parallelism
point of view, on the direct write path?

Why are file-extending writes not safe? Is this a restriction from the
fuse client's point of view, or is it just being safe from the server's
point of view? If fuse user space is talking to another filesystem, I
guess it is not a problem, because that filesystem will take care of
locking as needed.

I see ext4 allows parallel direct writes in certain cases, and where it
can't allow them, the reasons are documented in comments (see
ext4_dio_write_checks()). I think we need a similar rationale, especially
from the fuse client's perspective, with comments in the code that specify
when it is ok to have parallel direct writes, when it is not, and why.
This will help people reading the code later.

Thanks
Vivek