From: Benjamin Coddington
To: linux-nfs@vger.kernel.org
Subject: [PATCH v1 00/10] NFS client readdir per-page validation
Date: Wed, 20 Jan 2021 11:59:44 -0500

Due to the constraint that the NFS readdir page cache must contain every entry in cookie order from zero up to the entry of interest, the time or operations required to complete a directory listing increase exponentially with the size of the directory if the client is unable to keep the pagecache stable. The pagecache can be invalidated by a changing directory, or by memory pressure on the client. This can cause some trouble for the NFS client reading large directories over slow connections. We have a heuristic that allows eventual completion, but it only works as long as there are no other readers simultaneously filling the pagecache.

I think we can resolve this problem by implementing per-page validation. By storing the directory's change version on the page, and checking for changes to the directory on every READDIR, we can validate pages against each reader's version of entry alignment. Rather than attempting to assemble the entire directory in a consistent manner in the pagecache, we can just retrieve the section we're interested in emitting.

This set is a first pass at implementing this idea. Please help me pound it into acceptable shape or point out problems! Thanks for any feedback.
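To make the per-page validation idea a little more concrete, here is a rough userspace model of the check described above. It is only a sketch under assumptions: the struct and function names (dir_page, dir_reader, page_valid_for, page_fill, reader_get_page) are hypothetical and do not come from the patches, and "alignment" is modeled simply as the cookie of the first entry on the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; none of these names exist in fs/nfs/dir.c. */
struct dir_page {
	uint64_t change_attr;   /* directory change version when the page was filled */
	uint64_t first_cookie;  /* cookie of the first entry stored on this page */
	bool filled;            /* false if never populated or since evicted */
};

struct dir_reader {
	uint64_t change_attr;   /* directory change version this reader last saw */
	uint64_t cookie;        /* cookie the reader wants to emit next */
};

/*
 * A page is usable by a reader only if it was filled under the same
 * directory version and starts at the cookie the reader is positioned on.
 */
static bool page_valid_for(const struct dir_page *page,
			   const struct dir_reader *rd)
{
	assert(page != NULL && rd != NULL);
	return page->filled &&
	       page->change_attr == rd->change_attr &&
	       page->first_cookie == rd->cookie;
}

/* Stand-in for a READDIR that starts at rd->cookie and fills one page. */
static void page_fill(struct dir_page *page, const struct dir_reader *rd)
{
	page->change_attr = rd->change_attr;
	page->first_cookie = rd->cookie;
	page->filled = true;
}

/*
 * Returns true when the cached page could be reused, false when it had
 * to be refilled. server_change_attr models the directory change
 * attribute the client learns on each READDIR.
 */
static bool reader_get_page(struct dir_page *page, struct dir_reader *rd,
			    uint64_t server_change_attr)
{
	rd->change_attr = server_change_attr;
	if (page_valid_for(page, rd))
		return true;
	page_fill(page, rd);
	return false;
}
```

In this model a miss costs one refill of the page the reader is actually positioned on, instead of a walk from cookie zero. Calling reader_get_page() twice with the same change attribute refills on the first call and reuses the page on the second; bumping the change attribute or moving rd.cookie forces another refill.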
Here's a small program that does a great job of demonstrating the client's current readdir pagecache performance problem by dropping the directory's pagecache at an interval while trying to emit every entry:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_SIZE 1024
#define EVICT_INTERVAL 5

int evict_pagecache(int fd)
{
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}

int main(int argc, char **argv)
{
	int dir_fd;
	pid_t pid;
	cpu_set_t cpuset;
	off_t off;
	char buf[BUF_SIZE];

	if (argc < 2) {
		printf("%s \n", argv[0]);
		return 1;
	}

	dir_fd = open(argv[1], O_RDONLY|O_DIRECTORY|O_CLOEXEC);
	if (dir_fd < 0) {
		printf("cannot open dir\n");
		return 1;
	}

	CPU_ZERO(&cpuset);
	pid = fork();
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	if (pid == 0) {
		/* child: pinned to CPU 1, evicts the pagecache every EVICT_INTERVAL seconds */
		CPU_SET(1, &cpuset);
		sched_setaffinity(0, sizeof(cpuset), &cpuset);
		do {
			evict_pagecache(dir_fd);
			/* the file offset is shared with the parent across fork() */
			off = lseek(dir_fd, 0, SEEK_CUR);
			printf("currently at %llu\n", (unsigned long long)off);
			usleep(EVICT_INTERVAL * 1000000);
		} while (1);
	} else {
		/* parent: pinned to CPU 0, reads entries until end of directory or error */
		CPU_SET(0, &cpuset);
		sched_setaffinity(0, sizeof(cpuset), &cpuset);
		while (syscall(SYS_getdents, dir_fd, buf, BUF_SIZE) > 0) {}
		kill(pid, SIGINT);
	}

	close(dir_fd);
	return 0;
}

Benjamin Coddington (10):
  NFS: save the directory's change attribute on pagecache pages
  NFSv4: Send GETATTR with READDIR
  NFS: Add a struct to track readdir pagecache location
  NFS: Keep the readdir pagecache cursor updated
  NFS: readdir per-page cache validation
  NFS: stash the readdir pagecache cursor on the open directory context
  NFS: Support headless readdir pagecache pages
  NFS: Reset pagecache cursor on llseek
  NFS: Remove nfs_readdir_dont_search_cache()
  NFS: Revalidate the directory pagecache on every nfs_readdir()

 fs/nfs/dir.c              | 210 +++++++++++++++++++++++++++-----------
 fs/nfs/nfs42proc.c        |   2 +-
 fs/nfs/nfs4proc.c         |  27 +++--
 fs/nfs/nfs4xdr.c          |   6 ++
 include/linux/nfs_fs.h    |   8 +-
 include/linux/nfs_fs_sb.h |   5 +
 include/linux/nfs_xdr.h   |   2 +
 7 files changed, 188 insertions(+), 72 deletions(-)

-- 
2.25.4