From: "Benjamin Coddington"
To: "Trond Myklebust"
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH v9 23/27] NFS: Convert readdir page cache to use a cookie based index
Date: Fri, 11 Mar 2022 06:58:11 -0500
Message-ID: <466F8F77-E052-4D06-A016-946FCBD9C9BF@redhat.com>
In-Reply-To: <9099fead49c961a53027c8ed309a8efd2222d679.camel@hammerspace.com>

On 10 Mar 2022, at 16:07, Trond Myklebust wrote:

> On Wed, 2022-03-09 at 15:01 -0500, Benjamin Coddington wrote:
>> On 27 Feb 2022, at 18:12, trondmy@kernel.org wrote:
>>
>>> From: Trond Myklebust
>>>
>>> Instead of using a linear index to address the pages, use the cookie of
>>> the first entry, since that is what we use to match the page anyway.
>>>
>>> This allows us to avoid re-reading the entire cache on a seekdir() type
>>> of operation. The latter is very common when re-exporting NFS, and is a
>>> major performance drain.
>>>
>>> The change does affect our duplicate cookie detection, since we can no
>>> longer rely on the page index as a linear offset for detecting whether
>>> we looped backwards. However since we no longer do a linear search
>>> through all the pages on each call to nfs_readdir(), this is less of a
>>> concern than it was previously.
>>> The other downside is that invalidate_mapping_pages() no longer can use
>>> the page index to avoid clearing pages that have been read.
>>> A subsequent patch will restore the functionality this provides to the
>>> 'ls -l' heuristic.
>>
>> I didn't realize the approach was to also hash out the linearly-cached
>> entries. I thought we'd do something like flag the context for hashed page
>> indexes after a seekdir event, and if there are collisions with the linear
>> entries, they'll get fixed up when found.
>
> Why? What's the point of using 2 models where 1 will do?

I don't think the hashed model is quite as simple and efficient overall, and
it may have an impact on the system beyond NFS.

>> Doesn't that mean that with this approach seekdir() only hits the same pages
>> when the entry offset is page-aligned? That's 1 in 127 odds.
>
> The point is not to stomp all over the pages that contain aligned data
> when the application does call seekdir().
>
> IOW: we always optimise for the case where we do a linear read of the
> directory, but we support random seekdir() + read too.

And that could be done just by bumping the seekdir users to some constant
offset (index 262144?), or something else equally dead-nuts simple. That
keeps tightly clustered page indexes, so walking the cache is faster. That
reduces the "buckshot" effect the hashing has of eating up pagecache pages
they'll never use again. That doesn't cap our caching ability at 33 million
entries.

It's weird to me that we're doing exactly what XArray says not to do, hash
the index, when we don't have to.

>> It also means we're amplifying the pagecache's useage for slightly changing
>> directories - rather than re-using the same pages we're scattering our usage
>> across the index. Eh, maybe not a big deal if we just expect the page
>> cache's LRU to do the work.
>>
>
> I don't understand your point about 'not reusing'. If the user seeks to
> the same cookie, we reuse the page.
> However I don't know how you would
> go about setting up a schema that allows you to seek to an arbitrary
> cookie without doing a linear search.

So when I was talking about 'reusing' a page, that's about re-filling the
same pages rather than constantly conjuring new ones, which requires less of
the pagecache's resources in total. Maybe the pagecache can handle that
without it negatively impacting other users of the cache that /will/ re-use
their cached pages, but I worry it might be irresponsible of us to fill the
pagecache with pages we know we're never going to find again.

Ben
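
For anyone following the numbers in this thread, here is a rough sketch of
the arithmetic, not code from the patch. It assumes 127 entries per page
(from the "1 in 127" figure) and an 18-bit page index space (an assumption
that happens to give both 262144 and roughly 33 million). The function names
are hypothetical, entry offsets stand in for the opaque server cookies, and
the hash is a multiplicative hash in the style of the kernel's hash_64();
whether the patch hashes exactly this way is also an assumption.

```python
# Illustration only. All constants below are assumptions chosen to match the
# figures quoted in this thread; they are not copied from the patch.
ENTRIES_PER_PAGE = 127                 # from the "1 in 127 odds" remark
INDEX_BITS = 18                        # assumed: 2**18 == 262144
GOLDEN_RATIO_64 = 0x61C8864680B583EB   # multiplier used by the kernel's hash_64()
MASK_64 = (1 << 64) - 1                # emulate 64-bit wraparound


def max_cached_entries(index_bits=INDEX_BITS, per_page=ENTRIES_PER_PAGE):
    """Entries addressable when page indexes are limited to index_bits bits."""
    return (1 << index_bits) * per_page


def linear_index(entry_offset, per_page=ENTRIES_PER_PAGE):
    """Linear model: page index is the entry position divided by page capacity."""
    return entry_offset // per_page


def hashed_index(cookie, index_bits=INDEX_BITS):
    """Hashed model (hypothetical): multiplicative hash of the page's first cookie."""
    return ((cookie * GOLDEN_RATIO_64) & MASK_64) >> (64 - index_bits)


def seekdir_index(pages_since_seek, base=1 << INDEX_BITS):
    """Constant-offset alternative (hypothetical): seekdir users start at a fixed base."""
    return base + pages_since_seek


if __name__ == "__main__":
    # Stand-in cookies: the first entry offset of eight consecutive pages.
    offsets = [n * ENTRIES_PER_PAGE for n in range(8)]
    print("entry cap:      ", max_cached_entries())
    print("linear indexes: ", [linear_index(o) for o in offsets])
    print("hashed indexes: ", [hashed_index(o) for o in offsets])
    print("offset indexes: ", [seekdir_index(n) for n in range(8)])
```

Under these assumptions the linear and constant-offset models keep the eight
pages on adjacent indexes, while the hashed model spreads them across the
whole 18-bit range, which is the "buckshot" effect described above.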