changes since v4:
- erofs: add reviewed-by tag from Chao Yu (patch 8)
- cachefiles: rename CACHEFILES_OP_INIT to CACHEFILES_OP_OPEN (patch 4)
- cachefiles: add a new message type (CACHEFILES_OP_CLOSE). It will be
sent to user daemon when withdrawing cookie. It is used to notify user daemon
to close the attached anon_fd. (patch 5)
- cachefiles: add a read-write spinlock @cache->reqs_lock (patch 3) to protect
parallel accessing to the xarray (patch 4).
- cachefiles: remove the logic of automaticlly flushing all associated
requests when anon_fd gets closed (in cachefiles_ondemand_fd_release()).
The reason is that, the reordering of cread (response to READ request) and
close(anon_fd) may unexpectedly complete another READ request which reuses
the ID of previous READ request.
```
Process 1 Process 2
close(anon_fd)
complete READ request A with ID X
on-demand read
enqueue READ request B into xarray,
now READ request B reuses ID X
cread(ID X) of READ request A
now ID X responds to READ request B
complete READ request B // unexpected
```
So now closing anon_fd won't flush all associated requests. A
mandatory response (cread) is required for each READ request.
RFC: https://lore.kernel.org/all/[email protected]/t/
v1: https://lore.kernel.org/lkml/[email protected]/T/
v2: https://lore.kernel.org/all/[email protected]/t/
v3: https://lore.kernel.org/lkml/[email protected]/T/
v4: https://lore.kernel.org/lkml/[email protected]/T/#t
[Background]
============
Nydus [1] is a container image distribution service specially optimised
for distribution over network. Nydus is an excellent container image
acceleration solution, since it only pulls data from remote when it's
really needed, a.k.a. on-demand reading.
erofs (Enhanced Read-Only File System) is a filesystem specially
optimised for read-only scenarios. (Documentation/filesystem/erofs.rst)
Recently we are focusing on erofs in container images distribution
scenario [2], trying to combine it with nydus. In this case, erofs can
be mounted from one bootstrap file (metadata) with (optional) multiple
data blob files (data) stored on another local filesystem. (All these
files are actually image files in erofs disk format.)
To accelerate the container startup (fetching container image from remote
and then start the container), we do hope that the bootstrap blob file
could support demand read. That is, erofs can be mounted and accessed
even when the bootstrap/data blob files have not been fully downloaded.
That means we have to manage the cache state of the bootstrap/data blob
files (if cache hit, read directly from the local cache; if cache miss,
fetch the data somehow). It would be painful and may be dumb for erofs to
implement the cache management itself. Thus we prefer fscache/cachefiles
to do the cache management. Besides, the demand-read feature shall be
general and it can benefit other using scenarios if it can be implemented
in fscache level.
[1] https://nydus.dev
[2] https://sched.co/pcdL
[Overall Design]
================
Please refer to patch 6 ("cachefiles: document on-demand read mode") for
more details.
When working in original mode, cachefiles mainly serves as a local cache for
remote networking fs, while in on-demand read mode, cachefiles can boost the
scenario where on-demand read semantics is needed, e.g. container image
distribution.
The essential difference between these two modes is that, in original mode,
when cache miss, netfs itself will fetch data from remote, and then write the
fetched data into cache file. While in on-demand read mode, a user daemon is
responsible for fetching data and then writing to the cache file.
The on-demand read mode relies on a simple protocol used for communication
between kernel and user daemon.
The current implementation relies on the anonymous fd mechanism to avoid
the dependence on the format of cache file. When cache file is opened
for the first time, an anon_fd associated with the cache file is sent to
user daemon. With the given anon_fd, user daemon could fetch and write data
into the cache file in the background, even when kernel has not triggered
the cache miss. Besides, the write() syscall to the anon_fd will finally
call cachefiles kernel module, which will write data to cache file in
the latest format of cache file.
1. cache miss
When cache miss, cachefiles kernel module will notify user daemon the
anon_fd, along with the requested file range. When notified, user dameon
needs to fetch data of the requested file range, and then write the fetched
data into cache file with the given anonymous fd. When finished
processing the request, user daemon needs to notify the kernel.
After notifying the user daemon, the kernel read routine will hang there,
until the request is handled by user daemon. When it's awaken by the
notification from user daemon, i.e. the corresponding hole has been filled
by the user daemon, it will retry to read from the same file range.
2. cache hit
Once data is already ready in cache file, netfs will read from cache file directly.
[Advantage of fscache-based demand-read]
========================================
1. Asynchronous Prefetch
In current mechanism, fscache is responsible for cache state management,
while the data plane (fetch data from local/remote on cache miss) is
done on the user daemon side.
If data has already been ready in the backing file, the upper fs (e.g.
erofs) will read from the backing file directly and won't be trapped to
user space anymore. Thus the user daemon could fetch data (from remote)
asynchronously on the background, and thus accelerate the backing file
accessing in some degree.
2. Support massive blob files
Besides this mechanism supports a large amount of backing files, and
thus can benefit the densely employed scenario.
In our using scenario, one container image can correspond to one
bootstrap file (required) and multiple data blob files (optional). For
example, one container image for node.js will corresponds to ~20 files
in total. In densely employed environment, there could be as many as
hundreds of containers and thus thousands of backing files on one
machine.
[Test]
==========
You could start a quick test by
https://github.com/lostjeffle/demand-read-cachefilesd
Jeffle Xu (22):
fscache: export fscache_end_operation()
cachefiles: extract write routine
cachefiles: introduce on-demand read mode
cachefiles: notify user daemon with anon_fd when looking up cookie
cachefiles: notify user daemon when withdrawing cookie
cachefiles: implement on-demand read
cachefiles: document on-demand read mode
erofs: use meta buffers for erofs_read_superblock()
erofs: make erofs_map_blocks() generally available
erofs: add mode checking helper
erofs: register global fscache volume
erofs: add cookie context helper functions
erofs: add anonymous inode managing page cache of blob file
erofs: add erofs_fscache_read_pages() helper
erofs: register cookie context for bootstrap blob
erofs: implement fscache-based metadata read
erofs: implement fscache-based data read for non-inline layout
erofs: implement fscache-based data read for inline layout
erofs: register cookie context for data blobs
erofs: implement fscache-based data read for data blobs
erofs: implement fscache-based data readahead
erofs: add 'uuid' mount option
.../filesystems/caching/cachefiles.rst | 176 ++++++
fs/cachefiles/Kconfig | 11 +
fs/cachefiles/daemon.c | 587 +++++++++++++++++-
fs/cachefiles/interface.c | 2 +
fs/cachefiles/internal.h | 53 ++
fs/cachefiles/io.c | 72 ++-
fs/cachefiles/namei.c | 16 +-
fs/erofs/Makefile | 3 +-
fs/erofs/data.c | 18 +-
fs/erofs/fscache.c | 492 +++++++++++++++
fs/erofs/inode.c | 6 +-
fs/erofs/internal.h | 30 +
fs/erofs/super.c | 106 +++-
fs/fscache/internal.h | 11 -
fs/nfs/fscache.c | 8 -
include/linux/fscache.h | 15 +
include/linux/netfs.h | 1 +
include/trace/events/cachefiles.h | 2 +
include/uapi/linux/cachefiles.h | 51 ++
19 files changed, 1560 insertions(+), 100 deletions(-)
create mode 100644 fs/erofs/fscache.c
create mode 100644 include/uapi/linux/cachefiles.h
--
2.27.0
Document new user interface introduced by on-demand read mode.
Signed-off-by: Jeffle Xu <[email protected]>
---
.../filesystems/caching/cachefiles.rst | 176 ++++++++++++++++++
1 file changed, 176 insertions(+)
diff --git a/Documentation/filesystems/caching/cachefiles.rst b/Documentation/filesystems/caching/cachefiles.rst
index 8bf396b76359..c8286c901eae 100644
--- a/Documentation/filesystems/caching/cachefiles.rst
+++ b/Documentation/filesystems/caching/cachefiles.rst
@@ -28,6 +28,8 @@ Cache on Already Mounted Filesystem
(*) Debugging.
+ (*) On-demand Read.
+
Overview
@@ -482,3 +484,177 @@ the control file. For example::
echo $((1|4|8)) >/sys/module/cachefiles/parameters/debug
will turn on all function entry debugging.
+
+
+On-demand Read
+==============
+
+When working in original mode, cachefiles mainly serves as a local cache for
+remote networking fs, while in on-demand read mode, cachefiles can boost the
+scenario where on-demand read semantics is needed, e.g. container image
+distribution.
+
+The essential difference between these two modes is that, in original mode,
+when cache miss, netfs itself will fetch data from remote, and then write the
+fetched data into cache file. While in on-demand read mode, a user daemon is
+responsible for fetching data and then writing to the cache file.
+
+``CONFIG_CACHEFILES_ONDEMAND`` shall be enabled to support on-demand read mode.
+
+
+Protocol Communication
+----------------------
+
+The on-demand read mode relies on a simple protocol used for communication
+between kernel and user daemon. The model is like::
+
+ kernel --[request]--> user daemon --[reply]--> kernel
+
+The cachefiles kernel module will send requests to user daemon when needed.
+User daemon needs to poll on the devnode ('/dev/cachefiles') to check if
+there's pending request to be processed. A POLLIN event will be returned
+when there's pending request.
+
+Then user daemon needs to read the devnode to fetch one request and process it
+accordingly. It is worth nothing that each read only gets one request. When
+finished processing the request, user dameon needs to write the reply to the
+devnode.
+
+Each request is started with a message header like::
+
+ struct cachefiles_msg {
+ __u32 id;
+ __u32 opcode;
+ __u32 len;
+ __u8 data[];
+ };
+
+ * ``id`` identifies the position of this request in an internal xarray
+ managing all pending requests.
+
+ * ``opcode`` identifies the type of this request.
+
+ * ``data`` identifies the payload of this request.
+
+ * ``len`` identifies the whole length of this request, including the
+ header and following type specific payload.
+
+
+Turn on On-demand Mode
+----------------------
+
+An optional parameter is added to "bind" command::
+
+ bind [ondemand]
+
+When "bind" command takes without argument, it defaults to the original mode.
+When "bind" command takes with "ondemand" argument, i.e. "bind ondemand",
+on-demand read mode will be enabled.
+
+
+OPEN Request
+------------
+
+When netfs opens a cache file for the first time, a request with
+CACHEFILES_OP_OPEN opcode, a.k.a OPEN request will be sent to user daemon. The
+payload format is like::
+
+ struct cachefiles_open {
+ __u32 volume_key_len;
+ __u32 cookie_key_len;
+ __u32 fd;
+ __u32 flags;
+ __u8 data[];
+ };
+
+ * ``data`` contains volume_key and cookie_key in sequence.
+
+ * ``volume_key_len`` identifies the length of the volume key of the
+ cache file, in bytes. volume_key is of string format, with a suffix
+ '\0'.
+
+ * ``cookie_key_len`` identifies the length of the cookie key of the
+ cache file, in bytes. The format of cookie_key is netfs specific. It
+ can be of binary format.
+
+ * ``fd`` identifies the anonymous fd of the cache file, with which user
+ daemon can perform write/llseek file operations on the cache file.
+
+
+OPEN request contains (volume_key, cookie_key, anon_fd) triple for corresponding
+cache file. With this triple, user daemon could fetch and write data into the
+cache file in the background, even when kernel has not triggered the cache miss
+yet. User daemon is able to distinguish the requested cache file with the given
+(volume_key, cookie_key), and write the fetched data into cache file with the
+given anon_fd.
+
+After recording the (volume_key, cookie_key, anon_fd) triple, user daemon shall
+reply with "cinit" (complete init) command::
+
+ cinit <id>
+
+ * ``id`` is exactly the id field of the previous OPEN request.
+
+
+Besides, CACHEFILES_OPEN_WANT_CACHE_SIZE flag may be set in flags field of
+OPEN request. This flag is used in the scenario where one cache file can contain
+multiple netfs files for the purpose of deduplication, e.g. In this case, netfs
+itself has no idea the cache file size, whilst user daemon needs to offer the
+hint on the cache file size.
+
+Thus when receiving an OPEN request with CACHEFILES_OPEN_WANT_CACHE_SIZE flag
+set, user daemon must reply with the cache file size::
+
+ cinit <id>,<cache_size>
+
+ * ``id`` is exactly the id field of the previous OPEN request.
+
+ * ``cache_size`` identifies the size of the cache file.
+
+
+CLOSE Request
+-------------
+When cookie withdrawed, a request with CACHEFILES_OP_CLOSE opcode, a.k.a CLOSE
+request, will be sent to user daemon. It will notify user daemon to close the
+attached anon_fd. The payload format is like::
+
+ struct cachefiles_close {
+ __u32 fd;
+ };
+
+ * ``fd`` identifies the anon_fd to be closed, which is exactly the same
+ with that in OPEN request.
+
+
+READ Request
+------------
+
+When on-demand read mode is turned on, and cache miss encountered, kernel will
+send a request with CACHEFILES_OP_READ opcode, a.k.a READ request, to user
+daemon. It will notify user daemon to fetch data in the requested file range.
+The payload format is like::
+
+ struct cachefiles_read {
+ __u64 off;
+ __u64 len;
+ __u32 fd;
+ };
+
+ * ``off`` identifies the starting offset of the requested file range.
+
+ * ``len`` identifies the length of the requested file range.
+
+ * ``fd`` identifies the anonymous fd of the requested cache file. It is
+ guaranteed that it shall be the same with the fd field in the previous
+ OPEN request.
+
+When receiving one READ request, user daemon needs to fetch data of the
+requested file range, and then write the fetched data into cache file with the
+given anonymous fd.
+
+When finished processing the READ request, user daemon needs to reply with
+"cread" (complete read) command::
+
+ cread <id>
+
+ * ``id`` is exactly the id field of the previous READ request.
--
2.27.0
The only change is that, meta buffers read cache page without __GFP_FS
flag, which shall not matter.
Signed-off-by: Jeffle Xu <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
---
fs/erofs/super.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 915eefe0d7e2..12755217631f 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -281,21 +281,19 @@ static int erofs_init_devices(struct super_block *sb,
static int erofs_read_superblock(struct super_block *sb)
{
struct erofs_sb_info *sbi;
- struct page *page;
+ struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
struct erofs_super_block *dsb;
unsigned int blkszbits;
void *data;
int ret;
- page = read_mapping_page(sb->s_bdev->bd_inode->i_mapping, 0, NULL);
- if (IS_ERR(page)) {
+ data = erofs_read_metabuf(&buf, sb, 0, EROFS_KMAP);
+ if (IS_ERR(data)) {
erofs_err(sb, "cannot read erofs superblock");
- return PTR_ERR(page);
+ return PTR_ERR(data);
}
sbi = EROFS_SB(sb);
-
- data = kmap(page);
dsb = (struct erofs_super_block *)(data + EROFS_SUPER_OFFSET);
ret = -EINVAL;
@@ -365,8 +363,7 @@ static int erofs_read_superblock(struct super_block *sb)
if (erofs_sb_has_ztailpacking(sbi))
erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. Use at your own risk!");
out:
- kunmap(page);
- put_page(page);
+ erofs_put_metabuf(&buf);
return ret;
}
--
2.27.0
Export fscache_end_operation() to avoid code duplication.
Besides, considering the paired fscache_begin_read_operation() is
already exported, it shall make sense to also export
fscache_end_operation().
Signed-off-by: Jeffle Xu <[email protected]>
Reviewed-by: Liu Bo <[email protected]>
---
fs/fscache/internal.h | 11 -----------
fs/nfs/fscache.c | 8 --------
include/linux/fscache.h | 14 ++++++++++++++
3 files changed, 14 insertions(+), 19 deletions(-)
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index f121c21590dc..ed1c9ed737f2 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -70,17 +70,6 @@ static inline void fscache_see_cookie(struct fscache_cookie *cookie,
where);
}
-/*
- * io.c
- */
-static inline void fscache_end_operation(struct netfs_cache_resources *cres)
-{
- const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
-
- if (ops)
- ops->end_operation(cres);
-}
-
/*
* main.c
*/
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index cfe901650ab0..39654ca72d3d 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -249,14 +249,6 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
}
}
-static inline void fscache_end_operation(struct netfs_cache_resources *cres)
-{
- const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
-
- if (ops)
- ops->end_operation(cres);
-}
-
/*
* Fallback page reading interface.
*/
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 296c5f1d9f35..d2430da8aa67 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -456,6 +456,20 @@ int fscache_begin_read_operation(struct netfs_cache_resources *cres,
return -ENOBUFS;
}
+/**
+ * fscache_end_operation - End the read operation for the netfs lib
+ * @cres: The cache resources for the read operation
+ *
+ * Clean up the resources at the end of the read request.
+ */
+static inline void fscache_end_operation(struct netfs_cache_resources *cres)
+{
+ const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
+
+ if (ops)
+ ops->end_operation(cres);
+}
+
/**
* fscache_read - Start a read from the cache.
* @cres: The cache resources to use
--
2.27.0
... so that it can be used in the following introduced fs/erofs/fscache.c.
Signed-off-by: Jeffle Xu <[email protected]>
---
fs/erofs/data.c | 4 ++--
fs/erofs/internal.h | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 226a57c57ee6..6e2a28242453 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -104,8 +104,8 @@ static int erofs_map_blocks_flatmode(struct inode *inode,
return 0;
}
-static int erofs_map_blocks(struct inode *inode,
- struct erofs_map_blocks *map, int flags)
+int erofs_map_blocks(struct inode *inode,
+ struct erofs_map_blocks *map, int flags)
{
struct super_block *sb = inode->i_sb;
struct erofs_inode *vi = EROFS_I(inode);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 5aa2cf2c2f80..e424293f47a2 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -484,6 +484,8 @@ void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *dev);
int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 len);
+int erofs_map_blocks(struct inode *inode,
+ struct erofs_map_blocks *map, int flags);
/* inode.c */
static inline unsigned long erofs_inode_hash(erofs_nid_t nid)
--
2.27.0
This patch implements the data plane of reading data from bootstrap blob
file over fscache for inline layout.
For the heading non-inline part, the data plane for non-inline layout is
resued, while only the tail packing part needs special handling.
Signed-off-by: Jeffle Xu <[email protected]>
---
fs/erofs/fscache.c | 43 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index df56562f33c4..254b3e72ab4d 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -85,8 +85,9 @@ static int erofs_fscache_readpage_noinline(struct page *page,
{
struct fscache_cookie *cookie = fsmap->m_ctx->cookie;
/*
- * 1) For FLAT_PLAIN layout, the output map.m_la shall be equal to o_la,
- * and the output map.m_pa is exactly the physical address of o_la.
+ * 1) For FLAT_PLAIN and FLAT_INLINE (the heading non tail packing part)
+ * layout, the output map.m_la shall be equal to o_la, and the output
+ * map.m_pa is exactly the physical address of o_la.
* 2) For CHUNK_BASED layout, the output map.m_la is rounded down to the
* nearest chunk boundary, and the output map.m_pa is actually the
* physical address of this chunk boundary. So we need to recalculate
@@ -97,6 +98,40 @@ static int erofs_fscache_readpage_noinline(struct page *page,
return erofs_fscache_read_page(cookie, page, start);
}
+static int erofs_fscache_readpage_inline(struct page *page,
+ struct erofs_fscache_map *fsmap)
+{
+ struct inode *inode = page->mapping->host;
+ struct super_block *sb = inode->i_sb;
+ struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
+ erofs_blk_t blknr;
+ size_t offset, len;
+ void *src, *dst;
+
+ /*
+ * For inline (tail packing) layout, the offset may be non-zero, which
+ * can be calculated from corresponding physical address directly.
+ * Currently only flat layout supports inline (FLAT_INLINE), and the
+ * output map.m_pa is exactly the physical address of o_la in this case.
+ */
+ offset = erofs_blkoff(fsmap->m_pa);
+ blknr = erofs_blknr(fsmap->m_pa);
+ len = fsmap->m_llen;
+
+ src = erofs_read_metabuf(&buf, sb, blknr, EROFS_KMAP);
+ if (IS_ERR(src))
+ return PTR_ERR(src);
+
+ dst = kmap(page);
+ memcpy(dst, src + offset, len);
+ memset(dst + len, 0, PAGE_SIZE - len);
+ kunmap(page);
+
+ erofs_put_metabuf(&buf);
+
+ return 0;
+}
+
static int erofs_fscache_do_readpage(struct page *page)
{
struct inode *inode = page->mapping->host;
@@ -126,8 +161,12 @@ static int erofs_fscache_do_readpage(struct page *page)
if (ret)
return ret;
+ if (map.m_flags & EROFS_MAP_META)
+ return erofs_fscache_readpage_inline(page, &fsmap);
+
switch (vi->datalayout) {
case EROFS_INODE_FLAT_PLAIN:
+ case EROFS_INODE_FLAT_INLINE:
case EROFS_INODE_CHUNK_BASED:
return erofs_fscache_readpage_noinline(page, &fsmap);
default:
--
2.27.0
Introduce 'struct erofs_cookie_ctx' for managing cookie for backing
file, and the following introduced API for reading from backing file.
Besides, introduce two helper functions for initializing and cleaning
up erofs_cookie_ctx.
struct erofs_cookie_ctx *
erofs_fscache_get_ctx(struct super_block *sb, char *path);
void erofs_fscache_put_ctx(struct erofs_cookie_ctx *ctx);
Signed-off-by: Jeffle Xu <[email protected]>
---
fs/erofs/fscache.c | 74 +++++++++++++++++++++++++++++++++++++++++++++
fs/erofs/internal.h | 8 +++++
2 files changed, 82 insertions(+)
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 9c32f42e1056..28ec7c69744a 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -6,6 +6,80 @@
static struct fscache_volume *volume;
+static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx,
+ char *path)
+{
+ struct fscache_cookie *cookie;
+
+ cookie = fscache_acquire_cookie(volume, FSCACHE_ADV_WANT_CACHE_SIZE,
+ path, strlen(path),
+ NULL, 0, 0);
+ if (!cookie)
+ return -EINVAL;
+
+ fscache_use_cookie(cookie, false);
+ ctx->cookie = cookie;
+ return 0;
+}
+
+static inline
+void erofs_fscache_cleanup_cookie(struct erofs_fscache_context *ctx)
+{
+ struct fscache_cookie *cookie = ctx->cookie;
+
+ fscache_unuse_cookie(cookie, NULL, NULL);
+ fscache_relinquish_cookie(cookie, false);
+ ctx->cookie = NULL;
+}
+
+static int erofs_fscache_init_ctx(struct erofs_fscache_context *ctx,
+ struct super_block *sb, char *path)
+{
+ int ret;
+
+ ret = erofs_fscache_init_cookie(ctx, path);
+ if (ret) {
+ erofs_err(sb, "failed to init cookie");
+ return ret;
+ }
+
+ return 0;
+}
+
+static inline
+void erofs_fscache_cleanup_ctx(struct erofs_fscache_context *ctx)
+{
+ erofs_fscache_cleanup_cookie(ctx);
+}
+
+struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
+ char *path)
+{
+ struct erofs_fscache_context *ctx;
+ int ret;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx)
+ return ERR_PTR(-ENOMEM);
+
+ ret = erofs_fscache_init_ctx(ctx, sb, path);
+ if (ret) {
+ kfree(ctx);
+ return ERR_PTR(ret);
+ }
+
+ return ctx;
+}
+
+void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx)
+{
+ if (!ctx)
+ return;
+
+ erofs_fscache_cleanup_ctx(ctx);
+ kfree(ctx);
+}
+
int __init erofs_init_fscache(void)
{
volume = fscache_acquire_volume("erofs", NULL, NULL, 0);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 51fe5c2a419d..123a8dfc179b 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -97,6 +97,10 @@ struct erofs_sb_lz4_info {
u16 max_pclusterblks;
};
+struct erofs_fscache_context {
+ struct fscache_cookie *cookie;
+};
+
struct erofs_sb_info {
struct erofs_mount_opts opt; /* options */
#ifdef CONFIG_EROFS_FS_ZIP
@@ -621,6 +625,10 @@ static inline int z_erofs_load_lzma_config(struct super_block *sb,
int erofs_init_fscache(void);
void erofs_exit_fscache(void);
+struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb,
+ char *path);
+void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx);
+
#define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */
#endif /* __EROFS_INTERNAL_H */
--
2.27.0
Send the anonymous fd to user daemon when looking up cookie, no matter
whether the cache file exist there or not. With the given anonymous fd,
user daemon can fetch and then write data into cache file in advance,
even when cache miss has not happended yet.
Also add one advisory flag (FSCACHE_ADV_WANT_CACHE_SIZE) suggesting that
cache file size shall be retrieved at runtime. This helps the scenario
where one cache file can contain multiple netfs files for the purpose of
deduplication, e.g. In this case, netfs itself has no idea the cache
file size, whilst user daemon needs to offer the hint on the cache file
size.
Signed-off-by: Jeffle Xu <[email protected]>
---
fs/cachefiles/daemon.c | 365 +++++++++++++++++++++++++++++-
fs/cachefiles/internal.h | 24 ++
fs/cachefiles/namei.c | 16 +-
include/linux/fscache.h | 1 +
include/trace/events/cachefiles.h | 2 +
include/uapi/linux/cachefiles.h | 39 ++++
6 files changed, 444 insertions(+), 3 deletions(-)
create mode 100644 include/uapi/linux/cachefiles.h
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index c0c3a3cbee28..3c3a461f8cd8 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -19,6 +19,8 @@
#include <linux/ctype.h>
#include <linux/string.h>
#include <linux/fs_struct.h>
+#include <linux/fdtable.h>
+#include <linux/anon_inodes.h>
#include "internal.h"
static int cachefiles_daemon_open(struct inode *, struct file *);
@@ -43,6 +45,9 @@ static int cachefiles_daemon_secctx(struct cachefiles_cache *, char *);
static int cachefiles_daemon_tag(struct cachefiles_cache *, char *);
static int cachefiles_daemon_bind(struct cachefiles_cache *, char *);
static void cachefiles_daemon_unbind(struct cachefiles_cache *);
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+static int cachefiles_ondemand_cinit(struct cachefiles_cache *, char *);
+#endif
static unsigned long cachefiles_open;
@@ -75,6 +80,9 @@ static const struct cachefiles_daemon_cmd cachefiles_daemon_cmds[] = {
{ "inuse", cachefiles_daemon_inuse },
{ "secctx", cachefiles_daemon_secctx },
{ "tag", cachefiles_daemon_tag },
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+ { "cinit", cachefiles_ondemand_cinit },
+#endif
{ "", NULL }
};
@@ -87,6 +95,21 @@ static inline void cachefiles_ondemand_open(struct cachefiles_cache *cache)
static inline void cachefiles_ondemand_release(struct cachefiles_cache *cache)
{
+ struct cachefiles_req *req;
+ unsigned long index;
+
+ /*
+ * 1) Cache has been marked as dead state, and then 2) flush all pending
+ * requests in @reqs xarray. The barrier inside set_bit() will ensure
+ * that above two ops won't be reordered.
+ */
+ write_lock(&cache->reqs_lock);
+ xa_for_each(&cache->reqs, index, req) {
+ req->error = -EIO;
+ complete(&req->done);
+ }
+ write_unlock(&cache->reqs_lock);
+
xa_destroy(&cache->reqs);
}
@@ -114,6 +137,346 @@ bool cachefiles_ondemand_daemon_bind(struct cachefiles_cache *cache, char *args)
return false;
}
+static int cachefiles_ondemand_fd_release(struct inode *inode, struct file *file)
+{
+ struct cachefiles_object *object = file->private_data;
+
+ /*
+ * Uninstall anon_fd to the cachefiles object, so that no further
+ * associated requests will get enqueued.
+ */
+ object->fd = -1;
+
+ cachefiles_put_object(object, cachefiles_obj_put_ondemand_fd);
+ return 0;
+}
+
+static ssize_t cachefiles_ondemand_fd_write_iter(struct kiocb *kiocb,
+ struct iov_iter *iter)
+{
+ struct cachefiles_object *object = kiocb->ki_filp->private_data;
+ struct cachefiles_cache *cache = object->volume->cache;
+ struct file *file = object->file;
+ size_t len = iter->count;
+ loff_t pos = kiocb->ki_pos;
+ const struct cred *saved_cred;
+ int ret;
+
+ if (!file)
+ return -ENOBUFS;
+
+ cachefiles_begin_secure(cache, &saved_cred);
+ ret = __cachefiles_prepare_write(object, file, &pos, &len, true);
+ cachefiles_end_secure(cache, saved_cred);
+ if (ret < 0)
+ return ret;
+
+ ret = __cachefiles_write(object, file, pos, iter, NULL, NULL);
+ if (!ret)
+ ret = len;
+
+ return ret;
+}
+
+static loff_t cachefiles_ondemand_fd_llseek(struct file *filp, loff_t pos, int whence)
+{
+ struct cachefiles_object *object = filp->private_data;
+ struct file *file = object->file;
+
+ if (!file)
+ return -ENOBUFS;
+
+ return vfs_llseek(file, pos, whence);
+}
+
+static const struct file_operations cachefiles_ondemand_fd_fops = {
+ .owner = THIS_MODULE,
+ .release = cachefiles_ondemand_fd_release,
+ .write_iter = cachefiles_ondemand_fd_write_iter,
+ .llseek = cachefiles_ondemand_fd_llseek,
+};
+
+/*
+ * Init request completion
+ * - command: "cinit <id>[,<cache_size>]"
+ */
+static int cachefiles_ondemand_cinit(struct cachefiles_cache *cache, char *args)
+{
+ struct cachefiles_req *req;
+ struct cachefiles_open *load;
+ struct fscache_cookie *cookie;
+ char *tmp, *pid, *psize;
+ unsigned long id, flags, size = 0;
+ int ret;
+
+ if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
+ return -EOPNOTSUPP;
+
+ if (!*args) {
+ pr_err("Empty id specified\n");
+ return -EINVAL;
+ }
+
+ tmp = kstrdup(args, GFP_KERNEL);
+ if (!tmp)
+ return -ENOMEM;
+
+ pid = tmp;
+ psize = strchr(tmp, ',');
+ if (psize) {
+ *psize = 0;
+ psize++;
+
+ ret = kstrtoul(psize, 0, &size);
+ if (ret)
+ goto out;
+ }
+
+ ret = kstrtoul(pid, 0, &id);
+ if (ret)
+ goto out;
+
+ req = xa_erase(&cache->reqs, id);
+ if (!req) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ load = (void *)req->msg.data;
+ flags = load->flags;
+
+ if (test_bit(CACHEFILES_OPEN_WANT_CACHE_SIZE, &flags)) {
+ if (WARN_ON_ONCE(!size)) {
+ req->error = -EINVAL;
+ } else {
+ cookie = req->object->cookie;
+ cookie->object_size = size;
+ if (size)
+ set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
+ else
+ clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
+ }
+ }
+
+ complete(&req->done);
+out:
+ kfree(tmp);
+ return ret;
+}
+
+static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
+{
+ struct cachefiles_object *object;
+ struct cachefiles_open *load;
+ struct fd f;
+ int ret;
+
+ object = cachefiles_grab_object(req->object,
+ cachefiles_obj_get_ondemand_fd);
+
+ ret = anon_inode_getfd("[cachefiles]", &cachefiles_ondemand_fd_fops,
+ object, O_WRONLY);
+ if (ret < 0) {
+ cachefiles_put_object(object, cachefiles_obj_put_ondemand_fd);
+ return ret;
+ }
+
+ f = fdget_pos(ret);
+ if (WARN_ON_ONCE(!f.file))
+ return -EBADFD;
+
+ f.file->f_mode |= FMODE_PWRITE | FMODE_LSEEK;
+ fdput_pos(f);
+
+ load = (void *)req->msg.data;
+ load->fd = object->fd = ret;
+
+ return 0;
+}
+
+static ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
+ char __user *_buffer,
+ size_t buflen)
+{
+ struct cachefiles_req *req;
+ struct cachefiles_msg *msg;
+ unsigned long id = 0;
+ size_t n;
+ int ret = 0;
+ XA_STATE(xas, &cache->reqs, 0);
+
+ /*
+ * Search for request that has not ever been processed, to prevent
+ * requests from being sent to user daemon repeatedly.
+ */
+ xa_lock(&cache->reqs);
+ req = xas_find_marked(&xas, UINT_MAX, CACHEFILES_REQ_NEW);
+ if (req)
+ xas_clear_mark(&xas, CACHEFILES_REQ_NEW);
+ xa_unlock(&cache->reqs);
+
+ if (!req)
+ return 0;
+
+ msg = &req->msg;
+ msg->id = id = xas.xa_index;
+
+ n = msg->len;
+ if (n > buflen) {
+ ret = -EMSGSIZE;
+ goto error;
+ }
+
+ if (msg->opcode == CACHEFILES_OP_OPEN) {
+ ret = cachefiles_ondemand_get_fd(req);
+ if (ret)
+ goto error;
+ }
+
+ if (copy_to_user(_buffer, msg, n) != 0) {
+ ret = -EFAULT;
+ goto err_put_fd;
+ }
+
+ return n;
+
+err_put_fd:
+ if (msg->opcode == CACHEFILES_OP_OPEN)
+ close_fd(req->object->fd);
+error:
+ xa_erase(&cache->reqs, id);
+ req->error = ret;
+ complete(&req->done);
+ return ret;
+}
+
+typedef int (*init_req_fn)(struct cachefiles_req *req, void *private);
+
+static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
+ enum cachefiles_opcode opcode,
+ size_t data_len,
+ init_req_fn init_req,
+ void *private)
+{
+ struct cachefiles_cache *cache = object->volume->cache;
+ struct cachefiles_req *req;
+ struct xarray *xa = &cache->reqs;
+ int ret;
+ u32 id;
+
+ if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
+ return -EOPNOTSUPP;
+
+ if (test_bit(CACHEFILES_DEAD, &cache->flags))
+ return -EIO;
+
+ req = kzalloc(sizeof(*req) + data_len, GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ req->object = object;
+ init_completion(&req->done);
+ req->msg.opcode = opcode;
+ req->msg.len = sizeof(struct cachefiles_msg) + data_len;
+
+ ret = init_req(req, private);
+ if (ret)
+ goto out;
+
+ /*
+ * Enqueue the pending request.
+ *
+ * Stop enqueuing the request when daemon is dying. So we need to
+ * 1) check cache state, and 2) enqueue request if cache is alive.
+ *
+ * The above two ops need to be atomic as a whole. @reqs_lock is used
+ * here to ensure that. Otherwise, request may be enqueued after xarray
+ * has been flushed, in which case the orphan request will never be
+ * completed and thus netfs will hang there forever.
+ */
+ read_lock(&cache->reqs_lock);
+
+ /* recheck dead state under lock */
+ if (test_bit(CACHEFILES_DEAD, &cache->flags)) {
+ read_unlock(&cache->reqs_lock);
+ ret = -EIO;
+ goto out;
+ }
+
+ xa_lock(xa);
+ ret = __xa_alloc(xa, &id, req, xa_limit_32b, GFP_KERNEL);
+ if (!ret)
+ __xa_set_mark(xa, id, CACHEFILES_REQ_NEW);
+ xa_unlock(xa);
+
+ read_unlock(&cache->reqs_lock);
+
+ if (ret)
+ goto out;
+
+ wake_up_all(&cache->daemon_pollwq);
+ wait_for_completion(&req->done);
+ ret = req->error;
+out:
+ kfree(req);
+ return ret;
+}
+
+static int init_open_req(struct cachefiles_req *req, void *private)
+{
+ struct cachefiles_object *object = req->object;
+ struct fscache_cookie *cookie = object->cookie;
+ struct fscache_volume *volume = object->volume->vcookie;
+ struct cachefiles_open *load = (void *)req->msg.data;
+ size_t volume_key_len, cookie_key_len;
+ void *volume_key, *cookie_key;
+ unsigned long flags = 0;
+
+ /* volume key is of string format */
+ volume_key_len = volume->key[0] + 1;
+ volume_key = volume->key + 1;
+
+ /* cookie key is of binary format */
+ cookie_key_len = cookie->key_len;
+ cookie_key = fscache_get_key(cookie);
+
+ if (object->cookie->advice & FSCACHE_ADV_WANT_CACHE_SIZE)
+ __set_bit(CACHEFILES_OPEN_WANT_CACHE_SIZE, &flags);
+
+ load->flags = flags;
+ load->volume_key_len = volume_key_len;
+ load->cookie_key_len = cookie_key_len;
+ memcpy(load->data, volume_key, volume_key_len);
+ memcpy(load->data + volume_key_len, cookie_key, cookie_key_len);
+
+ return 0;
+}
+
+int cachefiles_ondemand_init_object(struct cachefiles_object *object)
+{
+ struct fscache_cookie *cookie = object->cookie;
+ struct fscache_volume *volume = object->volume->vcookie;
+ size_t volume_key_len, cookie_key_len, data_len;
+
+ /*
+ * Cachefiles will firstly check cache file under the root cache
+ * directory. If coherency check failed, it will fallback to creating a
+ * new tmpfile as the cache file. Reuse the previously created anon_fd
+ * if any.
+ */
+ if (object->fd > 0)
+ return 0;
+
+ volume_key_len = volume->key[0] + 1;
+ cookie_key_len = cookie->key_len;
+ data_len = sizeof(struct cachefiles_open) +
+ volume_key_len + cookie_key_len;
+
+ return cachefiles_ondemand_send_req(object,
+ CACHEFILES_OP_OPEN, data_len,
+ init_open_req, NULL);
+}
+
#else
static inline void cachefiles_ondemand_open(struct cachefiles_cache *cache) {}
static inline void cachefiles_ondemand_release(struct cachefiles_cache *cache) {}
@@ -129,7 +492,6 @@ bool cachefiles_ondemand_daemon_bind(struct cachefiles_cache *cache, char *args)
{
return false;
}
-#endif
static inline
ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
@@ -137,6 +499,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
{
return -EOPNOTSUPP;
}
+#endif
/*
* Prepare a cache for caching.
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 3f791882fa3f..8450ebd77949 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -16,6 +16,7 @@
#include <linux/cred.h>
#include <linux/security.h>
#include <linux/xarray.h>
+#include <linux/cachefiles.h>
#define CACHEFILES_DIO_BLOCK_SIZE 4096
@@ -59,6 +60,9 @@ struct cachefiles_object {
enum cachefiles_content content_info:8; /* Info about content presence */
unsigned long flags;
#define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Have an unlinked tmpfile */
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+ int fd; /* anonymous fd */
+#endif
};
/*
@@ -109,6 +113,15 @@ struct cachefiles_cache {
#endif
};
+struct cachefiles_req {
+ struct cachefiles_object *object;
+ struct completion done;
+ int error;
+ struct cachefiles_msg msg;
+};
+
+#define CACHEFILES_REQ_NEW XA_MARK_1
+
#include <trace/events/cachefiles.h>
static inline
@@ -152,6 +165,17 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
*/
extern const struct file_operations cachefiles_daemon_fops;
+#ifdef CONFIG_CACHEFILES_ONDEMAND
+extern int cachefiles_ondemand_init_object(struct cachefiles_object *object);
+
+#else
+static inline
+int cachefiles_ondemand_init_object(struct cachefiles_object *object)
+{
+ return 0;
+}
+#endif
+
/*
* error_inject.c
*/
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index f256c8aff7bb..22aba4c6a762 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -444,10 +444,9 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
struct dentry *fan = volume->fanout[(u8)object->cookie->key_hash];
struct file *file;
struct path path;
- uint64_t ni_size = object->cookie->object_size;
+ uint64_t ni_size;
long ret;
- ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);
cachefiles_begin_secure(cache, &saved_cred);
@@ -473,6 +472,15 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
goto out_dput;
}
+ ret = cachefiles_ondemand_init_object(object);
+ if (ret < 0) {
+ file = ERR_PTR(ret);
+ goto out_dput;
+ }
+
+ ni_size = object->cookie->object_size;
+ ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);
+
if (ni_size > 0) {
trace_cachefiles_trunc(object, d_backing_inode(path.dentry), 0, ni_size,
cachefiles_trunc_expand_tmpfile);
@@ -573,6 +581,10 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
}
_debug("file -> %pd positive", dentry);
+ ret = cachefiles_ondemand_init_object(object);
+ if (ret < 0)
+ goto error_fput;
+
ret = cachefiles_check_auxdata(object, file);
if (ret < 0)
goto check_failed;
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index d2430da8aa67..a330354f33ca 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -39,6 +39,7 @@ struct fscache_cookie;
#define FSCACHE_ADV_SINGLE_CHUNK 0x01 /* The object is a single chunk of data */
#define FSCACHE_ADV_WRITE_CACHE 0x00 /* Do cache if written to locally */
#define FSCACHE_ADV_WRITE_NOCACHE 0x02 /* Don't cache if written to locally */
+#define FSCACHE_ADV_WANT_CACHE_SIZE 0x04 /* Retrieve cache size at runtime */
#define FSCACHE_INVAL_DIO_WRITE 0x01 /* Invalidate due to DIO write */
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index c6f5aa74db89..371e5816e98c 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -31,6 +31,8 @@ enum cachefiles_obj_ref_trace {
cachefiles_obj_see_lookup_failed,
cachefiles_obj_see_withdraw_cookie,
cachefiles_obj_see_withdrawal,
+ cachefiles_obj_get_ondemand_fd,
+ cachefiles_obj_put_ondemand_fd,
};
enum fscache_why_object_killed {
diff --git a/include/uapi/linux/cachefiles.h b/include/uapi/linux/cachefiles.h
new file mode 100644
index 000000000000..5ea7285863f1
--- /dev/null
+++ b/include/uapi/linux/cachefiles.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _LINUX_CACHEFILES_H
+#define _LINUX_CACHEFILES_H
+
+#include <linux/types.h>
+
+#define CACHEFILES_MSG_MAX_SIZE 512
+
+enum cachefiles_opcode {
+ CACHEFILES_OP_OPEN,
+};
+
+/*
+ * @id identifying position of this message in the radix tree
+ * @opcode message type, CACHEFILE_OP_*
+ * @len message length, including message header and following data
+ * @data message type specific payload
+ */
+struct cachefiles_msg {
+ __u32 id;
+ __u32 opcode;
+ __u32 len;
+ __u8 data[];
+};
+
+struct cachefiles_open {
+ __u32 volume_key_len;
+ __u32 cookie_key_len;
+ __u32 fd;
+ __u32 flags;
+ /* following data contains volume_key and cookie_key in sequence */
+ __u8 data[];
+};
+
+enum cachefiles_open_flags {
+ CACHEFILES_OPEN_WANT_CACHE_SIZE,
+};
+
+#endif
--
2.27.0
On 3/21/22 10:01 PM, David Howells wrote:
> Jeffle Xu <[email protected]> wrote:
>
>> + read_lock(&cache->reqs_lock);
>> +
>> + /* recheck dead state under lock */
>> + if (test_bit(CACHEFILES_DEAD, &cache->flags)) {
>> + read_unlock(&cache->reqs_lock);
>> + ret = -EIO;
>> + goto out;
>> + }
>> +
>> + xa_lock(xa);
>> + ret = __xa_alloc(xa, &id, req, xa_limit_32b, GFP_KERNEL);
>
> You're holding a spinlock. You can't use GFP_KERNEL.
Oh yes... I've dropped into this for second time... Sorry for that.
>
>> +static int cachefiles_ondemand_cinit(struct cachefiles_cache *cache, char *args)
>> +{
>> ...
>> + tmp = kstrdup(args, GFP_KERNEL);
>
> No need to copy the string. The caller already did that and added a NUL for
> good measure.
Right.
>
> I would probably move most of the functions added in this patch to
> fs/cachefiles/ondemand.c.
Alright.
--
Thanks,
Jeffle
Jeffle Xu <[email protected]> wrote:
> + read_lock(&cache->reqs_lock);
> +
> + /* recheck dead state under lock */
> + if (test_bit(CACHEFILES_DEAD, &cache->flags)) {
> + read_unlock(&cache->reqs_lock);
> + ret = -EIO;
> + goto out;
> + }
> +
> + xa_lock(xa);
> + ret = __xa_alloc(xa, &id, req, xa_limit_32b, GFP_KERNEL);
You're holding a spinlock. You can't use GFP_KERNEL.
> +static int cachefiles_ondemand_cinit(struct cachefiles_cache *cache, char *args)
> +{
> ...
> + tmp = kstrdup(args, GFP_KERNEL);
No need to copy the string. The caller already did that and added a NUL for
good measure.
I would probably move most of the functions added in this patch to
fs/cachefiles/ondemand.c.
David