From: David Howells
To: Andrew Morton
Cc: David Howells, Matthew Wilcox, Linus Torvalds, Jeff Layton,
    Christoph Hellwig, linux-afs@lists.infradead.org,
    linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org,
    ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net,
    linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org,
    linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org,
    Rohith Surabattula, Steve French, Shyam Prasad N, Dave Wysochanski,
    Dominique Martinet, Ilya Dryomov, linux-mm@kvack.org
Subject: [PATCH v7 2/2] mm, netfs, fscache: Stop read optimisation when folio removed from pagecache
Date: Wed, 28 Jun 2023 11:48:52 +0100
Message-ID: <20230628104852.3391651-3-dhowells@redhat.com>
In-Reply-To: <20230628104852.3391651-1-dhowells@redhat.com>
References: <20230628104852.3391651-1-dhowells@redhat.com>

Fscache has an optimisation by which reads from the cache are skipped
until we know that (a) there's data there to be read and (b) that data
isn't entirely covered by pages resident in the netfs pagecache.
This is done with two flags manipulated by fscache_note_page_release():

	if (...
	    test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
	    test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
		clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);

where the NO_DATA_TO_READ flag causes cachefiles_prepare_read() to
indicate that netfslib should download from the server or clear the page
instead.

The fscache_note_page_release() function is intended to be called from
->releasepage() - but that only gets called if PG_private or PG_private_2
is set - and currently the former is at the discretion of the network
filesystem and the latter is only set whilst a page is being written to
the cache, so sometimes we miss clearing the optimisation.

Fix this by following Willy's suggestion[1] and adding an address_space
flag, AS_RELEASE_ALWAYS, that causes filemap_release_folio() to always
call ->release_folio() if it's set, even if PG_private or PG_private_2
aren't set.

Note that this would otherwise require folio_test_private() and
page_has_private() to become more complicated.  To avoid that, in the
places[*] where these are used to conditionalise calls to
filemap_release_folio() and try_to_release_page(), the tests are removed,
those functions are simply called unconditionally, and the test is
performed inside them.

[*] There are some exceptions in vmscan.c where the check guards more
than just a call to the releaser.  I've added a function,
folio_needs_release(), to wrap all the checks for that.

AS_RELEASE_ALWAYS should be set if a non-NULL cookie is obtained from
fscache, and cleared in ->evict_inode() before
truncate_inode_pages_final() is called.  Additionally, the
FSCACHE_COOKIE_NO_DATA_TO_READ flag needs to be cleared and the
optimisation cancelled if a cachefiles object already contains data when
we open it.

Fixes: 1f67e6d0b188 ("fscache: Provide a function to note the release of a page")
Fixes: 047487c947e8 ("cachefiles: Implement the I/O routines")
Reported-by: Rohith Surabattula
Suggested-by: Matthew Wilcox
Signed-off-by: David Howells
cc: Matthew Wilcox
cc: Linus Torvalds
cc: Steve French
cc: Shyam Prasad N
cc: Rohith Surabattula
cc: Dave Wysochanski
cc: Dominique Martinet
cc: Ilya Dryomov
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: v9fs-developer@lists.sourceforge.net
cc: ceph-devel@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
Notes:
    ver #7)
     - Make NFS set AS_RELEASE_ALWAYS.

    ver #4)
     - Split out the merging of folio_has_private()/filemap_release_folio()
       call pairs into a preceding patch.
     - Don't need to clear AS_RELEASE_ALWAYS in ->evict_inode().

    ver #3)
     - Fixed mapping_clear_release_always() to use clear_bit(), not set_bit().
     - Moved a '&&' to the correct line.

    ver #2)
     - Rewrote entirely according to Willy's suggestion[1].
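For reference, the mm-side consumer of AS_RELEASE_ALWAYS is not in this
diff (it was split into the preceding patch of the series), so here is a
minimal sketch of how filemap_release_folio() is expected to behave with
the flag in place; treat it as illustrative, not as the exact hunk:

	/* Sketch only: approximates the mm/filemap.c change made by the
	 * preceding patch in this series.
	 */
	bool filemap_release_folio(struct folio *folio, gfp_t gfp)
	{
		struct address_space * const mapping = folio->mapping;

		BUG_ON(!folio_test_locked(folio));

		/* folio_needs_release() (see the mm/internal.h hunk below)
		 * now also consults AS_RELEASE_ALWAYS, so folios carrying
		 * no private data still reach ->release_folio().
		 */
		if (!folio_needs_release(folio))
			return true;
		if (folio_test_writeback(folio))
			return false;

		if (mapping && mapping->a_ops->release_folio)
			return mapping->a_ops->release_folio(folio, gfp);
		return try_to_free_buffers(folio);
	}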
 fs/9p/cache.c           |  2 ++
 fs/afs/internal.h       |  2 ++
 fs/cachefiles/namei.c   |  2 ++
 fs/ceph/cache.c         |  2 ++
 fs/nfs/fscache.c        |  3 +++
 fs/smb/client/fscache.c |  2 ++
 include/linux/pagemap.h | 16 ++++++++++++++++
 mm/internal.h           |  5 ++++-
 8 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/fs/9p/cache.c b/fs/9p/cache.c
index cebba4eaa0b5..12c0ae29f185 100644
--- a/fs/9p/cache.c
+++ b/fs/9p/cache.c
@@ -68,6 +68,8 @@ void v9fs_cache_inode_get_cookie(struct inode *inode)
 				       &path, sizeof(path),
 				       &version, sizeof(version),
 				       i_size_read(&v9inode->netfs.inode));
+	if (v9inode->netfs.cache)
+		mapping_set_release_always(inode->i_mapping);
 
 	p9_debug(P9_DEBUG_FSC, "inode %p get cookie %p\n",
 		 inode, v9fs_inode_cookie(v9inode));
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 9d3d64921106..da73b97e19a9 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -681,6 +681,8 @@ static inline void afs_vnode_set_cache(struct afs_vnode *vnode,
 {
 #ifdef CONFIG_AFS_FSCACHE
 	vnode->netfs.cache = cookie;
+	if (cookie)
+		mapping_set_release_always(vnode->netfs.inode.i_mapping);
 #endif
 }
 
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index d9d22d0ec38a..7bf7a5fcc045 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -585,6 +585,8 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
 	if (ret < 0)
 		goto check_failed;
 
+	clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &object->cookie->flags);
+
 	object->file = file;
 
 	/* Always update the atime on an object we've just looked up (this is
diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c
index 177d8e8d73fe..de1dee46d3df 100644
--- a/fs/ceph/cache.c
+++ b/fs/ceph/cache.c
@@ -36,6 +36,8 @@ void ceph_fscache_register_inode_cookie(struct inode *inode)
 				       &ci->i_vino, sizeof(ci->i_vino),
 				       &ci->i_version, sizeof(ci->i_version),
 				       i_size_read(inode));
+	if (ci->netfs.cache)
+		mapping_set_release_always(inode->i_mapping);
 }
 
 void ceph_fscache_unregister_inode_cookie(struct ceph_inode_info *ci)
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 8c35d88a84b1..b05717fe0d4e 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -180,6 +180,9 @@ void nfs_fscache_init_inode(struct inode *inode)
 					       &auxdata,      /* aux_data */
 					       sizeof(auxdata),
 					       i_size_read(inode));
+
+	if (netfs_inode(inode)->cache)
+		mapping_set_release_always(inode->i_mapping);
 }
 
 /*
diff --git a/fs/smb/client/fscache.c b/fs/smb/client/fscache.c
index 8f6909d633da..3677525ee993 100644
--- a/fs/smb/client/fscache.c
+++ b/fs/smb/client/fscache.c
@@ -108,6 +108,8 @@ void cifs_fscache_get_inode_cookie(struct inode *inode)
 					       &cifsi->uniqueid, sizeof(cifsi->uniqueid),
 					       &cd, sizeof(cd),
 					       i_size_read(&cifsi->netfs.inode));
+	if (cifsi->netfs.cache)
+		mapping_set_release_always(inode->i_mapping);
 }
 
 void cifs_fscache_unuse_inode_cookie(struct inode *inode, bool update)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a56308a9d1a4..a1176ceb4a0c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -199,6 +199,7 @@ enum mapping_flags {
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
 	AS_LARGE_FOLIO_SUPPORT = 6,
+	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
 };
 
 /**
@@ -269,6 +270,21 @@ static inline int mapping_use_writeback_tags(struct address_space *mapping)
 	return !test_bit(AS_NO_WRITEBACK_TAGS, &mapping->flags);
 }
 
+static inline bool mapping_release_always(const struct address_space *mapping)
+{
+	return test_bit(AS_RELEASE_ALWAYS, &mapping->flags);
+}
+
+static inline void mapping_set_release_always(struct address_space *mapping)
+{
+	set_bit(AS_RELEASE_ALWAYS, &mapping->flags);
+}
+
+static inline void mapping_clear_release_always(struct address_space *mapping)
+{
+	clear_bit(AS_RELEASE_ALWAYS, &mapping->flags);
+}
+
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
 	return mapping->gfp_mask;
diff --git a/mm/internal.h b/mm/internal.h
index a76314764d8c..86aef26df905 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -175,7 +175,10 @@ static inline void set_page_refcounted(struct page *page)
  */
 static inline bool folio_needs_release(struct folio *folio)
 {
-	return folio_has_private(folio);
+	struct address_space *mapping = folio->mapping;
+
+	return folio_has_private(folio) ||
+		(mapping && mapping_release_always(mapping));
 }
 
 extern unsigned long highest_memmap_pfn;