2021-01-27 09:19:50

by David Wysochanski

Subject: [PATCH 0/8] Convert NFS fscache read paths to netfs API

This minimal set of patches updates the NFS client to use the new
readahead method and converts the fscache read paths to use the new
netfs API. The patches are available at:
https://github.com/DaveWysochanskiRH/kernel/releases/tag/fscache-iter-lib-nfs-20210127
https://github.com/DaveWysochanskiRH/kernel/commit/8693c3602ce02e4038fa732bef02b2641f739e02

The patches are based on David Howells' fscache-netfs-lib tree at
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib

The first 6 patches refactor some of the NFS read code to facilitate
re-use; the next 2 patches do the conversion to the new API. Note
that the last patch converts nfs_readpages to nfs_readahead.

Changes since dhowells' posting on Jan 25, 2021
- address feedback from Willy (Chuck's feedback in TODO list)
- ensure kernel builds cleanly with each patch
- readahead patch: clean up handling of 'ret'

Still TODO
1. Fix known bugs (some may disappear with unconditional use of the netfs API)
   a) One oops with fscache enabled on parallel read unit test
   b) nfs_issue_op: takes rcu_read_lock but may call nfs_page_alloc()
      with GFP_KERNEL which may sleep (dhowells noted this in a review)
   c) nfs_refresh_inode() takes inode->i_lock but may call
      __fscache_invalidate() which may sleep (found with lockdep)
2. Fix up NFS fscache stats (NFSIOS_FSCACHE_*)
   * Compare with netfs stats and determine if still needed
3. Clean up dfprintks and/or convert to tracepoints
4. Further tests (see "Not tested yet")

Tests run
1. Custom NFS+fscache unit tests for basic operation: PASS*
   * vers=3,4.0,4.1,4.2,sec=sys,server=localhost (same kernel)
   * one fscache-enabled parallel read test oopses due to page lock state
     * probably goes away if we unconditionally call the netfs API
2. cthon04: PASS
   * test options "-b -g -s -l", fsc,vers=3,4.0,4.1,4.2,sec=sys
   * No failures, oopses, or hangs
3. iozone tests: PASS
   * nofsc,fsc,vers=3,4.0,4.1,4.2,sec=sys,server=rhel7,rhel8
   * No failures, oopses, or hangs
4. xfstests/generic: PASS*
   * no hangs or oopses; a few failures unrelated to these patches
   * Ran the following configurations:
     * vers=4.2,fsc,sec=sys,rhel7-server
     * vers=4.2,nofsc,sec=sys,rhel8-server (test still running, ok so far)
     * vers=4.1,nofsc,sec=sys,netapp-server(pnfs/files)
     * vers=4.0,fsc (test still running, ok so far)
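
For anyone wanting to reproduce a similar spread of configurations, the
combinations of NFS version and fsc/nofsc listed above can be enumerated
with a short script. This is only a sketch: the helper name
mount_option_matrix() and the option lists are assumptions made for
illustration, not part of the test harness actually used for these runs.

```python
from itertools import product


def mount_option_matrix(versions, cache_modes, sec="sys"):
    """Return one comma-separated mount option string per combination.

    Hypothetical helper: it only builds strings such as
    "vers=4.2,fsc,sec=sys" and does not mount anything.
    """
    return [
        f"vers={vers},{cache},sec={sec}"
        for vers, cache in product(versions, cache_modes)
    ]


if __name__ == "__main__":
    # Values taken from the configurations listed in the test summary.
    versions = ["3", "4.0", "4.1", "4.2"]
    cache_modes = ["nofsc", "fsc"]
    for opts in mount_option_matrix(versions, cache_modes):
        print(opts)
```

Each resulting string could then be passed to "mount -o" against the
server under test; the server names and export paths are left out here
because they are site specific.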

Not tested yet:
* error injection (for example, connection disruptions or server errors during IO)
* many processes doing mixed reads/writes on the same file
* performance