From: David Howells
To: Matthew Wilcox
Cc: dhowells@redhat.com, Linus Torvalds, Anna Schumaker, Trond Myklebust,
    Jeff Layton, Steve French, Dominique Martinet, Mike Marshall,
    Miklos Szeredi, Shyam Prasad N, linux-cachefs@redhat.com,
    linux-afs@lists.infradead.org, "open list:NFS, SUNRPC, AND...", CIFS,
    ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net,
    devel@lists.orangefs.org, Linux-MM, linux-fsdevel,
    Linux Kernel Mailing List
Subject: Re: Canvassing for network filesystem write size vs page size
Date: Fri, 06 Aug 2021 16:04:53 +0100
Message-ID: <1306894.1628262293@warthog.procyon.org.uk>
X-Mailing-List: linux-nfs@vger.kernel.org

Matthew Wilcox wrote:

> No, that is very much not the same thing.  Look at what NFS does, like
> Linus said.  Consider this test program:
>
>     fd = open();
>     lseek(fd, 5, SEEK_SET);
>     write(fd, buf, 3);
>     write(fd, buf2, 10);
>     write(fd, buf3, 2);
>     close(fd);

Yes, I get that.  I can do that when there isn't a local cache or content
encryption.

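For reference, a compilable rendering of the quoted test program might look
like the sketch below.  The quoted code elides the arguments to open(), so the
scratch path, the open flags, the buffer contents and the function name
small_writes() are all assumptions made for illustration; only the sequence of
lseek() and write() calls comes from the quoted mail.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Issue the three small back-to-back writes from the quoted test program.
 * 'path' is a hypothetical scratch file.  Returns the file offset after the
 * last write, or -1 on any error.
 */
static off_t small_writes(const char *path)
{
	static const char buf[] = "abc";          /* assumed contents */
	static const char buf2[] = "0123456789";  /* assumed contents */
	static const char buf3[] = "xy";          /* assumed contents */
	off_t pos;
	int fd;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	if (lseek(fd, 5, SEEK_SET) != 5 ||
	    write(fd, buf, 3) != 3 ||     /* covers bytes 5..7 */
	    write(fd, buf2, 10) != 10 ||  /* covers bytes 8..17 */
	    write(fd, buf3, 2) != 2) {    /* covers bytes 18..19 */
		close(fd);
		return -1;
	}

	pos = lseek(fd, 0, SEEK_CUR);     /* offset after the last write */
	if (close(fd) < 0)
		return -1;
	return pos;
}
```

The three writes are contiguous and together dirty only bytes 5..19 of the
first page; bytes 0..4 are never written by the program.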
Note that, currently, if the pages (or cache blocks) being read/modified are
beyond the EOF at the point when the file is opened, truncated down or last
subject to 3rd-party invalidation, I don't go to the server at all.

> > But that kind of screws with local caching.  The local cache might need to
> > track the missing bits, and we are likely to be using blocks larger than a
> > page.
>
> There's nothing to cache.  Pages which are !Uptodate aren't going to get
> locally cached.

Eh?  Of course there is.  You've just written some data.  That needs to get
copied to the cache as well as the server if that file is supposed to be being
cached (for filesystems that support local caching of files open for writing,
which AFS does).

> > Basically, there are a lot of scenarios where not having fully populated
> > pages sucks.  And for streaming writes, wouldn't it be better if you used
> > DIO writes?
>
> DIO can't do sub-512-byte writes.

Yes it can - and it works for my AFS client at least with the patches in my
fscache-iter-2 branch.  This is mainly a restriction for block storage devices
we're doing DMA to - but we're not doing direct DMA to block storage devices
typically when talking to a network filesystem.

For AFS, at least, I can just make one big FetchData/StoreData RPC that
reads/writes the entire DIO request in a single op; for other filesystems
(NFS, ceph for example), it needs breaking up into a sequence of RPCs, but
there's no particular reason that I know of that requires it to be 512-byte
aligned on any of these.

Things get more interesting if you're doing DIO to a content-encrypted file
because the block size may be 4096 or even a lot larger - in which case we
would have to do local RMW to handle misaligned writes, but it presents no
particular difficulty.

> You might not be trying to do anything for block filesystems, but we
> should think about what makes sense for block filesystems as well as
> network filesystems.

Whilst that's a good principle, they have very different characteristics that
might make that difficult.

David
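
As an illustration of the local RMW mentioned above for misaligned writes to a
file with a large block size, a minimal userspace sketch might look like this.
It is not code from the fscache-iter-2 branch or from any filesystem: the
in-memory 'store' stands in for the file, rmw_write() and its parameters are
hypothetical names, and the decrypt/encrypt steps a content-encrypted file
would need are only marked in comments.

```c
#include <stddef.h>
#include <string.h>

/* Round x down or up to a multiple of the block size. */
static size_t blk_round_down(size_t x, size_t bsize)
{
	return x - x % bsize;
}

static size_t blk_round_up(size_t x, size_t bsize)
{
	return blk_round_down(x + bsize - 1, bsize);
}

/*
 * Read-modify-write a misaligned write of 'len' bytes at 'pos'.
 *
 * 'store' is a stand-in for the file contents and 'bounce' is a scratch
 * buffer large enough for the block-aligned span.  Returns the number of
 * whole blocks rewritten, or -1 if the request does not fit.
 */
static long rmw_write(unsigned char *store, size_t store_len, size_t bsize,
		      size_t pos, const void *data, size_t len,
		      unsigned char *bounce, size_t bounce_len)
{
	size_t start = blk_round_down(pos, bsize);
	size_t end = blk_round_up(pos + len, bsize);

	if (len == 0 || end > store_len || end - start > bounce_len)
		return -1;

	/* Read: fetch the covering blocks (a real client would decrypt here). */
	memcpy(bounce, store + start, end - start);
	/* Modify: patch the caller's bytes in at their offset in the span. */
	memcpy(bounce + (pos - start), data, len);
	/* Write: store whole blocks back (a real client would encrypt first). */
	memcpy(store + start, bounce, end - start);

	return (long)((end - start) / bsize);
}
```

With a 4096-byte block size, a 10-byte write at offset 4090 straddles a block
boundary and so rewrites two whole blocks, whereas a 3-byte write at offset 5
rewrites one.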