Date: Tue, 1 Feb 2022 17:41:11 -0800
From: Simon Kirby
To: linux-nfs
Subject: Cache flush on file lock
Message-ID: <20220202014111.GA7467@hostway.ca>
List-ID: <linux-nfs.vger.kernel.org>

Hello!

I noticed high traffic in an NFS environment and tracked it down to some
users who moved SQLite databases over from previously-local storage. The
usage pattern of SQLite here seems particularly bad on NFSv3 clients,
where a combination of F_RDLCK-to-F_WRLCK upgrades and lock polling
entirely discards the cache for other processes on the same client.
Our load balancing configuration typically sticks most file accesses to
individual hosts (NFS clients), so I figured it was time to re-evaluate
the status of NFSv4 and file delegations here: the files could be
delegated to one client, and then the page cache might work as it does
on a local file system. It turns out this isn't happening...

First, SQLite always opens the file O_RDWR. knfsd does not seem to
create a delegation in this case; I see one only for O_RDONLY.

Second, do_setlk() in fs/nfs/file.c always calls nfs_zap_caches()
unless there is a ->have_delegation(inode, FMODE_READ). That condition
has changed slightly over the years, but the basic concept of
invalidating the cache in do_setlk() has been around since pre-git.

Since the intention seems to be to preserve the cache when a read
delegation is held, I wrote a simplified testcase to simulate SQLite
locking. With the open changed to O_RDONLY (and F_RDLCK only), the v3
mount and server show "POSIX ADVISORY READ" in /proc/locks. The v4
mount shows "DELEG ACTIVE READ" on the server and "POSIX ADVISORY READ"
on the client. With O_RDONLY, I can see that the cache is zapped
following F_RDLCK on v3 and not zapped on v4, so this appears to be
working as expected.

With O_RDWR restored, both server and client show "POSIX ADVISORY READ"
with v3 or v4 mounts, and since there is no read delegation, the cache
gets zapped.

RFC 8881 section 10.4.2 discusses locking when an OPEN_DELEGATE_WRITE
delegation is present, so it seems this was perhaps intended to work.
How far off would we be from write delegations happening here?

I can post the testcase code if it would be helpful.

Simon-