From: Linus Torvalds
Date: Thu, 5 Aug 2021 10:27:05 -0700
Subject: Re: Canvassing for network filesystem write size vs page size
To: David Howells
Cc: Anna Schumaker, Trond Myklebust, Jeff Layton, Steve French,
    Dominique Martinet, Mike Marshall, Miklos Szeredi,
    "Matthew Wilcox (Oracle)", Shyam Prasad N, linux-cachefs@redhat.com,
    linux-afs@lists.infradead.org, "open list:NFS, SUNRPC, AND...", CIFS,
    ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net,
    devel@lists.orangefs.org, Linux-MM, linux-fsdevel,
    Linux Kernel Mailing List
X-Mailing-List: linux-nfs@vger.kernel.org

On Thu, Aug 5, 2021 at 9:36 AM David Howells wrote:
>
> Some network filesystems, however, currently keep track of which byte ranges
> are modified within a dirty page (AFS does; NFS seems to also) and only write
> out the modified data.

NFS definitely does. I haven't used NFS in two decades, but I worked
on some of the code (read: I made nfs use the page cache both for
reading and writing) back in my Transmeta days, because NFSv2 was the
default filesystem setup back then.

See fs/nfs/write.c, although I have to admit that I don't recognize
that code any more.

It's fairly important to be able to do streaming writes without having
to read the old contents for some loads. And read-modify-write cycles
are death for performance, so you really want to coalesce writes until
you have the whole page.

That said, I suspect it's also *very* filesystem-specific, to the
point where it might not be worth trying to do in some generic manner.

In particular, NFS had things like interesting credential issues, so
if you have multiple concurrent writers that used different 'struct
file *' to write to the file, you can't just mix the writes. You have
to sync the writes from one writer before you start the writes for the
next one, because one might succeed and the other not.

So you can't just treat it as some random "page cache with dirty byte
extents". You really have to be careful about credentials, timeouts,
etc, and the pending writes have to keep a fair amount of state
around.

At least that was the case two decades ago.

[ goes off and looks.
See "nfs_write_begin()" and friends in fs/nfs/file.c for some of the
examples of these things, although it looks like the code is less
aggressive about avoiding the read-modify-write case than I thought I
remembered, and only does it for write-only opens ]

              Linus
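
Editor's illustration, not part of the original message: a rough user-space sketch of the two ideas discussed above, namely coalescing dirty byte extents within a page so that no read of the old contents is needed, and syncing one writer's pending data before accepting writes under a different credential. This is not the kernel's implementation. The class `DirtyPage`, its methods, the 4096-byte `PAGE_SIZE`, and the string "credentials" are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for this toy model


@dataclass
class DirtyPage:
    """Toy model of one cached page that tracks dirty byte extents."""
    data: bytearray = field(default_factory=lambda: bytearray(PAGE_SIZE))
    extents: list = field(default_factory=list)  # sorted, non-touching (start, end)
    owner: object = None                         # credential of the pending writes
    flushed: list = field(default_factory=list)  # log of (owner, start, bytes) written out

    def write(self, cred, offset, payload):
        end = offset + len(payload)
        if offset < 0 or end > PAGE_SIZE:
            raise ValueError("write must fit inside the page")
        if not payload:
            return
        # Writes under different credentials are not mixed: the previous
        # writer's data is synced first, since one write might succeed
        # and the other not.
        if self.extents and self.owner != cred:
            self.flush()
        self.owner = cred
        self.data[offset:end] = payload
        self._merge(offset, end)

    def _merge(self, start, end):
        merged = []
        for s, e in self.extents:
            if e < start or s > end:   # disjoint and not adjacent: keep as-is
                merged.append((s, e))
            else:                      # overlapping or touching: coalesce
                start, end = min(s, start), max(e, end)
        merged.append((start, end))
        self.extents = sorted(merged)

    def fully_dirty(self):
        # True once the writes cover the whole page, so the old contents
        # never had to be read.
        return self.extents == [(0, PAGE_SIZE)]

    def flush(self):
        # Write out only the modified byte ranges, then forget them.
        for s, e in self.extents:
            self.flushed.append((self.owner, s, bytes(self.data[s:e])))
        self.extents = []
        self.owner = None
```

In this model, two back-to-back half-page writes by the same writer coalesce into a single full-page extent, while a write from a second writer forces the first writer's extents to be flushed before the new data is recorded. Real code would also have to carry the timeout and error state the message mentions, which the sketch leaves out.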