Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp2293621pxb; Wed, 9 Feb 2022 15:23:09 -0800 (PST) X-Google-Smtp-Source: ABdhPJywgnOAb2e1lQSv+Pf9CyRE/uOcNNTY8dQ8YTo5nFNaeKkwmbfApEtbvrYBgWsa7KIKfPgl X-Received: by 2002:a62:4e44:: with SMTP id c65mr3685067pfb.1.1644448989419; Wed, 09 Feb 2022 15:23:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644448989; cv=none; d=google.com; s=arc-20160816; b=Q3fRdpG3AZzYx7/92y2nZw0eQbXavXYq/n3IpC+LvCu0zTNkTV936vTqWVFuViT1yE yCgwe/I+XQWzZBH8Vs+R6UShSTRBGpKE3hjFnc/ukMkQdTKZXt9gXRTb1PdHFuNOlJI3 c6wcslJEGNssAzOtvo8XYg8h+QTU6WJ51AIk/RIzwu9uS/VqUyM5Surp0nu8PY2rA8VT Oqewh6i4NxfeZPnTlHNpWp0f0ffCnIQdZsZfzgMN5CXuyp5Qoe/QoWxII4RznJPthOfq LAuedX0vWfzd4I1uuVMDTjUALPs3Odr+I6x5g/IGFzdrvTIjgUY1vDHuUkR8tjgMjPcR 0y3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=oiI2AIPhpyxhDFmK6miH1RwOaEsyvPzIdi91i1E1UeI=; b=jbIfHJdiAEsMSoHV6GuWfCnXxXXBJXIXX/hditMfhgQ8Sf1IplxyquvP2ydU+QJ7y/ jrn6IuSZlLB/5CgbZ0PJU+itn/tV4OUcyeJHZ5zI0d7mLaLXki3qvte3ruo7H13tRZ32 3xsjfVUMxnkRyTBCku633dz+y149wiYPgvBA4HZ49Okf79J79hbloNp+BdX3iy4qB+hU GJn1UibLJ43F6+SqO+IQMC86ncVge9NU67OtgJBhc6h494oa6NRMpof86nzVeBhm3n8Y dXl9kiAWQv7q3XrXYT+qrJ2M08QyzztT/Noy89wLvEFfXi4KsqQz/eSJb/Zkuhv+OZVR n0/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=tT+6rEUn; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id o8si17135714pgn.270.2022.02.09.15.23.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Feb 2022 15:23:09 -0800 (PST) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=tT+6rEUn; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 0C00FE0536CD; Wed, 9 Feb 2022 15:18:41 -0800 (PST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241286AbiBISyY (ORCPT + 99 others); Wed, 9 Feb 2022 13:54:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241173AbiBISyI (ORCPT ); Wed, 9 Feb 2022 13:54:08 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95C39C1038D3 for ; Wed, 9 Feb 2022 10:49:41 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3EB96B82384 for ; Wed, 9 Feb 2022 18:49:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9F1A1C340E7 for ; Wed, 9 Feb 2022 18:49:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1644432578; bh=5+CVVaeilXukvinn/cQ50dVTffooXSiBsulgngtDGFw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=tT+6rEUneU0HG6wWI1eiozMyP/JPzLF9DmFfNDxqfi7oHdY49qBAESO1Km07icNsl r2TX81vWbtOTZc38Hos8o6w91yEzPEMZW5e9Hbz4iAr/rfhlANb3GVCqDRwRq9iONU IS3TfLQUwyCTd7PriZmbqQAxtKhD5vT2luL2p8/g2i88Va6q8zzdqO5es4gvzeS0AT LEVyZJTF5f72sKqKTPuwrDG9ZrO54NhThg7ThbHkYBfCRpp2TlfjOg9nAw67NdMy1T wewmK2Lx5JGgoI1y7n6XbstAptrYelCiRCDYg8mEFNiFGIWBjRB5gxNpieJ0+LkelL NDfs5rzX1vy8A== From: trondmy@kernel.org To: linux-nfs@vger.kernel.org Subject: [PATCH v3 1/2] NFS: Adjust the amount of readahead performed by NFS readdir Date: Wed, 9 Feb 2022 13:42:50 -0500 Message-Id: <20220209184251.23909-2-trondmy@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220209184251.23909-1-trondmy@kernel.org> References: <20220209184251.23909-1-trondmy@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Trond Myklebust The current NFS readdir code will always try to maximise the amount of readahead it performs on the assumption that we can cache anything that isn't immediately read by the process. There are several cases where this assumption breaks down, including when the 'ls -l' heuristic kicks in to try to force use of readdirplus as a batch replacement for lookup/getattr. This patch therefore tries to tone down the amount of readahead we perform, and adjust it to try to match the amount of data being requested by user space. Signed-off-by: Trond Myklebust --- fs/nfs/dir.c | 55 +++++++++++++++++++++++++++++++++++++++++- include/linux/nfs_fs.h | 1 + 2 files changed, 55 insertions(+), 1 deletion(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 993725b95df6..1323eef34f7e 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -69,6 +69,8 @@ const struct address_space_operations nfs_dir_aops = { .freepage = nfs_readdir_clear_array, }; +#define NFS_INIT_DTSIZE PAGE_SIZE + static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *dir) { struct nfs_inode *nfsi = NFS_I(dir); @@ -80,6 +82,7 @@ static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *dir ctx->dir_cookie = 0; ctx->dup_cookie = 0; ctx->page_index = 0; + ctx->dtsize = NFS_INIT_DTSIZE; ctx->eof = false; spin_lock(&dir->i_lock); if (list_empty(&nfsi->open_files) && @@ -155,6 +158,7 @@ struct nfs_readdir_descriptor { struct page *page; struct dir_context *ctx; pgoff_t page_index; + pgoff_t page_index_max; u64 dir_cookie; u64 last_cookie; u64 dup_cookie; @@ -167,12 +171,36 @@ struct nfs_readdir_descriptor { unsigned long gencount; unsigned long attr_gencount; unsigned int cache_entry_index; + unsigned int buffer_fills; + unsigned int dtsize; signed char duped; bool plus; bool eob; bool eof; }; +static void nfs_set_dtsize(struct nfs_readdir_descriptor *desc, unsigned int sz) +{ + struct nfs_server *server = NFS_SERVER(file_inode(desc->file)); + unsigned int maxsize = server->dtsize; + + if (sz > maxsize) + sz = maxsize; + if (sz < NFS_MIN_FILE_IO_SIZE) + sz = NFS_MIN_FILE_IO_SIZE; + desc->dtsize = sz; +} + +static void nfs_shrink_dtsize(struct nfs_readdir_descriptor *desc) +{ + nfs_set_dtsize(desc, desc->dtsize >> 1); +} + +static void nfs_grow_dtsize(struct nfs_readdir_descriptor *desc) +{ + nfs_set_dtsize(desc, desc->dtsize << 1); +} + static void nfs_readdir_array_init(struct nfs_cache_array *array) { memset(array, 0, sizeof(struct nfs_cache_array)); @@ -759,6 +787,7 @@ static int nfs_readdir_page_filler(struct nfs_readdir_descriptor *desc, break; arrays++; *arrays = page = new; + desc->page_index_max++; } else { new = nfs_readdir_page_get_next(mapping, page->index + 1, @@ -768,6 +797,7 @@ static int nfs_readdir_page_filler(struct nfs_readdir_descriptor *desc, if (page != *arrays) nfs_readdir_page_unlock_and_put(page); page = new; + desc->page_index_max = new->index; } status = nfs_readdir_add_to_array(entry, page); } while (!status && !entry->eof); @@ -833,7 +863,7 @@ static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor *desc, struct nfs_entry *entry; size_t array_size; struct inode *inode = file_inode(desc->file); - size_t dtsize = NFS_SERVER(inode)->dtsize; + unsigned int dtsize = desc->dtsize; int status = -ENOMEM; entry = kzalloc(sizeof(*entry), GFP_KERNEL); @@ -869,6 +899,7 @@ static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor *desc, status = nfs_readdir_page_filler(desc, entry, pages, pglen, arrays, narrays); + desc->buffer_fills++; } while (!status && nfs_readdir_page_needs_filling(page) && page_mapping(page)); @@ -916,6 +947,7 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc) if (!desc->page) return -ENOMEM; if (nfs_readdir_page_needs_filling(desc->page)) { + desc->page_index_max = desc->page_index; res = nfs_readdir_xdr_to_array(desc, nfsi->cookieverf, verf, &desc->page, 1); if (res < 0) { @@ -1047,6 +1079,7 @@ static int uncached_readdir(struct nfs_readdir_descriptor *desc) desc->cache_entry_index = 0; desc->last_cookie = desc->dir_cookie; desc->duped = 0; + desc->page_index_max = 0; status = nfs_readdir_xdr_to_array(desc, desc->verf, verf, arrays, sz); @@ -1056,10 +1089,22 @@ static int uncached_readdir(struct nfs_readdir_descriptor *desc) } desc->page = NULL; + /* + * Grow the dtsize if we have to go back for more pages, + * or shrink it if we're reading too many. + */ + if (!desc->eof) { + if (!desc->eob) + nfs_grow_dtsize(desc); + else if (desc->buffer_fills == 1 && + i < (desc->page_index_max >> 1)) + nfs_shrink_dtsize(desc); + } for (i = 0; i < sz && arrays[i]; i++) nfs_readdir_page_array_free(arrays[i]); out: + desc->page_index_max = -1; kfree(arrays); dfprintk(DIRCACHE, "NFS: %s: returns %d\n", __func__, status); return status; @@ -1102,6 +1147,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx) desc->file = file; desc->ctx = ctx; desc->plus = nfs_use_readdirplus(inode, ctx); + desc->page_index_max = -1; spin_lock(&file->f_lock); desc->dir_cookie = dir_ctx->dir_cookie; @@ -1110,6 +1156,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx) page_index = dir_ctx->page_index; desc->attr_gencount = dir_ctx->attr_gencount; desc->eof = dir_ctx->eof; + nfs_set_dtsize(desc, dir_ctx->dtsize); memcpy(desc->verf, dir_ctx->verf, sizeof(desc->verf)); spin_unlock(&file->f_lock); @@ -1151,6 +1198,11 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx) nfs_do_filldir(desc, nfsi->cookieverf); nfs_readdir_page_unlock_and_put_cached(desc); + if (desc->eob || desc->eof) + break; + /* Grow the dtsize if we have to go back for more pages */ + if (desc->page_index == desc->page_index_max) + nfs_grow_dtsize(desc); } while (!desc->eob && !desc->eof); spin_lock(&file->f_lock); @@ -1160,6 +1212,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx) dir_ctx->attr_gencount = desc->attr_gencount; dir_ctx->page_index = desc->page_index; dir_ctx->eof = desc->eof; + dir_ctx->dtsize = desc->dtsize; memcpy(dir_ctx->verf, desc->verf, sizeof(dir_ctx->verf)); spin_unlock(&file->f_lock); out_free: diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 333ea05e2531..034d95809b97 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -106,6 +106,7 @@ struct nfs_open_dir_context { __u64 dir_cookie; __u64 dup_cookie; pgoff_t page_index; + unsigned int dtsize; signed char duped; bool eof; }; -- 2.34.1