Hello!
I noticed high traffic in an NFS environment and tracked it down to some
users who moved SQLite databases over from previously-local storage.
The usage pattern of SQLite here seems particularly bad on NFSv3 clients,
where a combination of F_RDLCK-to-F_WRLCK upgrades and lock polling
entirely discards the cache for other processes on the same client.
Our load balancing configuration typically sticks most file accesses to
individual hosts (NFS clients), so I figured it was time to re-evaluate
the status of NFSv4 and file delegations here, since the files could be
delegated to one client, and then maybe the page cache could work as it
does on a local file system. It turns out this isn't happening...
First, it seems that SQLite always opens the file O_RDWR. knfsd does not
seem to create a delegation in this case; I see it only for O_RDONLY.
Second, it seems that do_setlk() in fs/nfs/file.c always calls
nfs_zap_caches() unless ->have_delegation(inode, FMODE_READ) returns
true. That condition has changed slightly over the years, but the basic
concept of invalidating the cache in do_setlk() has been around since
pre-git.
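To make the rule above concrete, here is a toy model of it. This is not the
kernel code: ToyClientCache and after_lock() are hypothetical names, and the
model only captures "drop the cached pages after a lock unless a read
delegation is held".

```python
class ToyClientCache:
    """Toy stand-in for one file's cached pages on a client (not kernel code)."""

    def __init__(self):
        self.pages = {}       # page index -> cached bytes
        self.zap_count = 0    # how many times the cache was dropped

    def after_lock(self, has_read_delegation):
        # Mirrors the rule described above: keep the cache only when a
        # read delegation is held, otherwise drop everything.
        if not has_read_delegation:
            self.pages.clear()
            self.zap_count += 1
```

In this model every lock taken without a delegation empties the cache, which
is the effect lock polling would have on other readers on the same client.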
Since it seems like there's the intention to preserve cache with a read
delegation, I wrote a simplified testcase to simulate SQLite locking.
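For illustration only (this is not the testcase mentioned above), a minimal
sketch of such a lock cycle might look like the following. lock_cycle() is a
hypothetical helper; it relies on Python's fcntl.lockf(), which wraps the
fcntl() byte-range locking calls, so LOCK_SH and LOCK_EX stand in for
F_RDLCK and F_WRLCK. Real SQLite locks specific byte ranges, which this
sketch does not attempt to reproduce.

```python
import fcntl
import os
import tempfile


def lock_cycle(path, writable):
    """Open path, take a shared lock, optionally upgrade it, then unlock.

    Returns the list of lock steps that were performed.
    """
    flags = os.O_RDWR if writable else os.O_RDONLY
    fd = os.open(path, flags)
    steps = []
    try:
        # Shared (read) lock over the whole file.
        fcntl.lockf(fd, fcntl.LOCK_SH)
        steps.append("F_RDLCK")
        os.pread(fd, 4096, 0)  # read while holding the shared lock
        if writable:
            # Upgrade in place; an exclusive lock needs a writable descriptor.
            fcntl.lockf(fd, fcntl.LOCK_EX)
            steps.append("F_WRLCK")
        fcntl.lockf(fd, fcntl.LOCK_UN)
        steps.append("F_UNLCK")
    finally:
        os.close(fd)
    return steps
```

Running it against a file on the v3 and v4 mounts, once with writable=False
and once with writable=True, should exercise the O_RDONLY and O_RDWR cases
described here.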
With the open changed to O_RDONLY (and F_RDLCK only), the v3 mount and
server show "POSIX ADVISORY READ" in /proc/locks. The v4 mount shows
"DELEG ACTIVE READ" on the server and "POSIX ADVISORY READ" on the
client.
With O_RDONLY, I can see that cache is zapped following F_RDLCK on v3 and
not zapped on v4, so this appears to be working as expected.
With O_RDWR restored, both server and client show "POSIX ADVISORY READ"
with v3 or v4 mounts, and since there is no read delegation, the cache
gets zapped.
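The /proc/locks comparisons above can be scripted. The sketch below is an
assumption about how one might do it: summarize_locks() and
read_proc_locks() are hypothetical helpers that count the type, mode and
access columns (for example "POSIX ADVISORY READ" or "DELEG ACTIVE READ")
and ignore the rest of each line.

```python
from collections import Counter


def summarize_locks(text):
    """Count "TYPE MODE ACCESS" triples in /proc/locks-style text."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: "<n>:" TYPE MODE ACCESS PID DEV:INODE START END
        if not fields or not fields[0].endswith(":"):
            continue
        fields = fields[1:]
        # Blocked waiters are listed with a leading "->" marker.
        if fields and fields[0] == "->":
            fields = fields[1:]
        if len(fields) < 3:
            continue
        counts[" ".join(fields[:3])] += 1
    return counts


def read_proc_locks(path="/proc/locks"):
    """Summarize the lock table of the machine this runs on."""
    with open(path) as f:
        return summarize_locks(f.read())
```

Calling read_proc_locks() on the client and on the server before and after
the lock cycle gives a quick view of whether a delegation was handed out.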
RFC 8881 10.4.2 seems to talk about locking when an OPEN_DELEGATE_WRITE
delegation is present, so it seems this was perhaps intended to work.
How far off would we be from write delegations happening here?
I can post the testcase code if it would be helpful.
Simon-
On 1 Feb 2022, at 20:41, Simon Kirby wrote:
> How far off would we be from write delegations happening here?
The Linux server doesn't have write delegations implemented yet; I suspect
that's why you're not seeing them.
Ben
> On Feb 1, 2022, at 8:41 PM, Simon Kirby <[email protected]> wrote:
>
> How far off would we be from write delegations happening here?
We are tracking a "feature request" for write delegation support
in the Linux NFS server:
https://bugzilla.linux-nfs.org/show_bug.cgi?id=348
At this point the effort is not resourced. It's not clear how
much benefit it would be.
That said, it seems to me that your use case might benefit if
the Linux NFS server offered a READ delegation for the SQLite
database file even when it is open R/W. It might be appropriate
if the server offered such a delegation when there are no other
clients that have the file open for write or that hold write
delegations.
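As a sketch of the condition suggested above, and only that: FileState and
may_offer_read_delegation() are hypothetical names, not knfsd data
structures, and the model ignores everything else a server would have to
check before handing out a delegation.

```python
from dataclasses import dataclass, field


@dataclass
class FileState:
    """Hypothetical per-file server state; not the knfsd data structures."""
    write_openers: set = field(default_factory=set)      # clients with the file open for write
    write_delegations: set = field(default_factory=set)  # clients holding write delegations


def may_offer_read_delegation(state, client):
    """True if no *other* client has the file open for write or holds a write delegation."""
    others_writing = state.write_openers - {client}
    others_delegated = state.write_delegations - {client}
    return not others_writing and not others_delegated
```

With a single client holding the file open R/W the check passes; as soon as
a second client opens it for write, it fails.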
Patches and performance data are, as always, welcome.
--
Chuck Lever
> On Feb 2, 2022, at 2:55 PM, Chuck Lever III <[email protected]> wrote:
> That said, it seems to me that your use case might benefit if
> the Linux NFS server offered a READ delegation for the SQLite
> database file even when it is open R/W. It might be appropriate
> if the server offered such a delegation when there are no other
> clients that have the file open for write or that hold write
> delegations.
>
> Patches and performance data are, as always, welcome.
You didn't mention which version of the server you are using.
In fact, commit aba2072f4523 ("nfsd: grant read delegations to
clients holding writes") ought to make the server offer a READ
delegation in this situation. It appears in v5.13, if I'm
reading my "git describe" output correctly.
--
Chuck Lever
On 2/2/22 13:55, Chuck Lever III wrote:
>
>
>> On Feb 1, 2022, at 8:41 PM, Simon Kirby <[email protected]> wrote:
>>
>> How far off would we be from write delegations happening here?
>
> We are tracking a "feature request" for write delegation support
> in the Linux NFS server:
>
> https://bugzilla.linux-nfs.org/show_bug.cgi?id=348
>
> At this point the effort is not resourced. It's not clear how
> much benefit it would be.
>
My first shock is that I thought write delegation was already supported.
But especially for users who have NFS-mounted home directories, how
could there not be a major performance benefit? To cite one example that
has been a problem before (in particular, on Mac OS X clients with NFSv3
home directories mounted from a Linux file server): the Firefox
"awesome" bar, which writes to a SQLite database in ~/.mozilla/firefox
with alarming frequency in order to keep up the awesomeness.
This flat out didn't work previously unless you went to about:config and
set
storage.nfs_filesystem = true
I'm not at all sure how this changed the behavior of Firefox, though.
> On Feb 4, 2022, at 3:55 PM, Patrick Goetz <[email protected]> wrote:
>
>
>
> On 2/2/22 13:55, Chuck Lever III wrote:
>>> On Feb 1, 2022, at 8:41 PM, Simon Kirby <[email protected]> wrote:
>>>
>>> How far off would we be from write delegations happening here?
>> We are tracking a "feature request" for write delegation support
>> in the Linux NFS server:
>> https://bugzilla.linux-nfs.org/show_bug.cgi?id=348
>> At this point the effort is not resourced. It's not clear how
>> much benefit it would be.
>
> First shock is I thought write delegation was already supported.
WRITE delegation and READ delegation are both supported on
the Linux NFS client.
The Linux NFS server offers only READ delegations.
> But especially for users who have NFS mounted home directories, how
> could there not be a major performance benefit? To cite one example that
> has been a problem before (in particular, on Mac OSX clients with NFSv3
> home directories mounted from a linux fileserver), facilitating the
> firefox "awesome" bar, which writes to a SQLite database in
> ~/.mozilla/firefox with alarming frequency in order to keep up the
> awesomeness.
I might be missing something, but if SQLite is using fsync()
then there isn't much choice for the NFS client but to push
dirty data to the server and see that the data is committed.
The purpose of that fsync() is to ensure the written data is
robust against client or server failure.
WRITE delegation would probably not have much impact on that.
Its purpose is to enable the client to cache more aggressively
in cases where immediate data persistence is not needed.
The fact that there is only one client reading and updating
that database is beside the point.
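The write-then-fsync() pattern in question can be sketched as below.
durable_append() is a hypothetical helper, and SQLite's real commit path is
more involved than a single append; the point is only where the fsync()
sits relative to the write.

```python
import os
import tempfile


def durable_append(path, payload):
    """Append payload and return only after fsync() has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        written = os.write(fd, payload)
        # On an NFS mount this is the point at which the client has to push
        # the dirty data to the server and wait for it to be committed.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

Each call pays for a round trip to stable storage regardless of how many
clients have the file open, which is why a delegation would not remove
that cost.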
--
Chuck Lever