Document the new user interface introduced by on-demand read mode.
Signed-off-by: Jeffle Xu <[email protected]>
---
.../filesystems/caching/cachefiles.rst | 170 ++++++++++++++++++
1 file changed, 170 insertions(+)
diff --git a/Documentation/filesystems/caching/cachefiles.rst b/Documentation/filesystems/caching/cachefiles.rst
index 8bf396b76359..c10a16957141 100644
--- a/Documentation/filesystems/caching/cachefiles.rst
+++ b/Documentation/filesystems/caching/cachefiles.rst
@@ -28,6 +28,7 @@ Cache on Already Mounted Filesystem
(*) Debugging.
+ (*) On-demand Read.
Overview
@@ -482,3 +483,172 @@ the control file. For example::
echo $((1|4|8)) >/sys/module/cachefiles/parameters/debug
will turn on all function entry debugging.
+
+
+On-demand Read
+==============
+
+When working in its original mode, cachefiles mainly serves as a local cache
+for a remote networking fs - while in on-demand read mode, cachefiles can boost
+the scenario where on-demand read semantics is needed, e.g. container image
+distribution.
+
+The essential difference between these two modes is that, in original mode,
+when a cache miss occurs, the netfs will fetch the data from the remote server
+and then write it to the cache file. With on-demand read mode, however,
+fetching the data and writing it into the cache is delegated to a user daemon.
+
+``CONFIG_CACHEFILES_ONDEMAND`` should be enabled to support on-demand read mode.
+
+
+Protocol Communication
+----------------------
+
+The on-demand read mode relies on a simple protocol used for communication
+between kernel and user daemon. The protocol can be modeled as::
+
+ kernel --[request]--> user daemon --[reply]--> kernel
+
+The cachefiles kernel module will send requests to the user daemon when needed.
+The user daemon needs to poll on the devnode ('/dev/cachefiles') to check if
+there's a pending request to be processed. A POLLIN event will be returned
+when there's a pending request.
+
+The user daemon then reads the devnode to fetch a request and process it
+accordingly. It is worth noting that each read only gets one request. When
+finished processing the request, the user daemon needs to write the reply to
+the devnode.
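
The poll-then-read step described above can be sketched roughly as follows. This is only an illustration under assumptions: ``demo_wait_and_read`` and its timeout parameter are hypothetical names, and ``fd`` stands in for a descriptor the daemon has already opened on the devnode.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait until fd reports POLLIN, then fetch one request with a single read().
 * fd stands in for the opened devnode; this helper is hypothetical and not
 * part of the patch. Returns the number of bytes read, or -1 on error.
 */
static ssize_t demo_wait_and_read(int fd, void *buf, size_t size, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    assert(buf != NULL);
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);   /* retry if interrupted */
    if (ret < 0)
        return -1;
    if (ret == 0 || !(pfd.revents & POLLIN)) {
        errno = EAGAIN;                    /* timed out or nothing readable */
        return -1;
    }
    /* Each read returns at most one request, so one call is enough. */
    return read(fd, buf, size);
}
```

A daemon would presumably call this in a loop and hand each returned buffer to its request dispatcher.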
+
+Each request starts with a message header of the form::
+
+ struct cachefiles_msg {
+ __u32 msg_id;
+ __u32 opcode;
+ __u32 len;
+ __u32 object_id;
+ __u8 data[];
+ };
+
+ where:
+
+ * ``msg_id`` is a unique ID identifying this request among all pending
+ requests.
+
+ * ``opcode`` indicates the type of this request.
+
+ * ``object_id`` is a unique ID identifying the cache file operated on.
+
+ * ``data`` indicates the payload of this request.
+
+ * ``len`` indicates the whole length of this request, including the
+ header and following type-specific payload.
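
As a rough sketch of how a daemon might validate the header of a request it has just read, assuming the field layout shown above: ``demo_msg_hdr`` and ``demo_parse_msg`` are hypothetical names used for illustration, and a real daemon would take the structure definition from the kernel headers rather than redeclare it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative mirror of the fixed header fields documented above. */
struct demo_msg_hdr {
    uint32_t msg_id;
    uint32_t opcode;
    uint32_t len;        /* whole request: header plus payload */
    uint32_t object_id;
};

/*
 * Parse the header out of the nread bytes in buf and locate the payload.
 * Returns 0 on success, or -1 if the buffer is too short or len is
 * inconsistent with what was read.
 */
static int demo_parse_msg(const uint8_t *buf, size_t nread,
                          struct demo_msg_hdr *hdr,
                          const uint8_t **payload, size_t *payload_len)
{
    assert(buf != NULL && hdr != NULL);
    if (nread < sizeof(*hdr))
        return -1;
    memcpy(hdr, buf, sizeof(*hdr));        /* avoids unaligned access */
    if (hdr->len < sizeof(*hdr) || hdr->len > nread)
        return -1;
    *payload = buf + sizeof(*hdr);
    *payload_len = hdr->len - sizeof(*hdr);
    return 0;
}
```

Checking ``len`` against the number of bytes actually read keeps later payload parsing from running past the end of the buffer.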
+
+
+Turn on On-demand Mode
+----------------------
+
+An optional parameter is added to the "bind" command::
+
+ bind [ondemand]
+
+When the "bind" command takes without argument, it defaults to the original
+mode. When the "bind" command is given the "ondemand" argument, i.e.
+"bind ondemand", on-demand read mode will be enabled.
+
+
+The OPEN Request
+----------------
+
+When the netfs opens a cache file for the first time, a request with the
+CACHEFILES_OP_OPEN opcode, a.k.a. an OPEN request, will be sent to the user
+daemon. The payload format is of the form::
+
+ struct cachefiles_open {
+ __u32 volume_key_size;
+ __u32 cookie_key_size;
+ __u32 fd;
+ __u32 flags;
+ __u8 data[];
+ };
+
+ where:
+
+ * ``data`` contains the volume_key followed directly by the cookie_key.
+ The volume key is a NUL-terminated string; the cookie key is binary
+ data.
+
+ * ``volume_key_size`` indicates the size of the volume key in bytes.
+
+ * ``cookie_key_size`` indicates the size of the cookie key in bytes.
+
+ * ``fd`` indicates an anonymous fd referring to the cache file, through
+ which the user daemon can perform write/llseek file operations on the
+ cache file.
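
One possible way to split the OPEN payload into its two keys is sketched below. The names ``demo_open_hdr``, ``demo_open_keys`` and ``demo_parse_open`` are hypothetical, and the sketch assumes that ``volume_key_size`` counts the terminating NUL of the volume key; the text above does not state this explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative mirror of the fixed part of the OPEN payload. */
struct demo_open_hdr {
    uint32_t volume_key_size;
    uint32_t cookie_key_size;
    uint32_t fd;
    uint32_t flags;
};

/* Parsed view; the pointers refer into the caller's payload buffer. */
struct demo_open_keys {
    const char *volume_key;       /* NUL-terminated string */
    const uint8_t *cookie_key;    /* binary data */
    size_t cookie_key_size;
    int fd;                       /* anonymous fd for the cache file */
};

/* Returns 0 on success, -1 if the payload is malformed or truncated. */
static int demo_parse_open(const uint8_t *payload, size_t payload_len,
                           struct demo_open_keys *out)
{
    struct demo_open_hdr h;
    uint64_t need;

    assert(payload != NULL && out != NULL);
    if (payload_len < sizeof(h))
        return -1;
    memcpy(&h, payload, sizeof(h));
    need = (uint64_t)sizeof(h) + h.volume_key_size + h.cookie_key_size;
    if (need > payload_len)
        return -1;
    /* Assumption: the size includes the NUL, so the last byte must be 0. */
    if (h.volume_key_size == 0 ||
        payload[sizeof(h) + h.volume_key_size - 1] != '\0')
        return -1;
    out->volume_key = (const char *)(payload + sizeof(h));
    out->cookie_key = payload + sizeof(h) + h.volume_key_size;
    out->cookie_key_size = h.cookie_key_size;
    out->fd = (int)h.fd;
    return 0;
}
```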
+
+
+The user daemon is able to distinguish the requested cache file with the given
+(volume_key, cookie_key) pair. Each cache file has a unique object_id, while it
+may have multiple anonymous fds. The user daemon may duplicate anonymous fds
+from the initial anonymous fd indicated by the @fd field through dup(). Thus
+each object_id can be mapped to multiple anonymous fds, while the user daemon
+itself needs to maintain the mapping.
+
+With the given anonymous fd, the user daemon can fetch data and write it to the
+cache file in the background, even when kernel has not triggered a cache miss
+yet.
+
+The user daemon should complete the READ request by issuing a "copen" (complete
+open) command on the devnode::
+
+ copen <msg_id>,<cache_size>
+
+ * ``msg_id`` must match the msg_id field of the previous OPEN request.
+
+ * When >= 0, ``cache_size`` indicates the size of the cache file;
+ when < 0, ``cache_size`` indicates the error code ecountered by the
+ user daemon.
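
The "copen" reply is a plain text command, so building it could look like the sketch below. ``demo_format_copen`` is a hypothetical helper; it only formats the string, which the daemon would then write to the devnode with ``write()``.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format "copen <msg_id>,<cache_size>" into buf.
 * cache_size >= 0 is the size of the cache file; a negative value carries
 * an error code (for example -ENOENT). Returns the string length, or -1 if
 * buf is too small.
 */
static int demo_format_copen(char *buf, size_t size,
                             uint32_t msg_id, long long cache_size)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, size, "copen %u,%lld",
                 (unsigned int)msg_id, cache_size);
    if (n < 0 || (size_t)n >= size)
        return -1;      /* output was truncated or failed */
    return n;
}
```

For instance, a msg_id of 7 and a cache size of 4096 bytes yield the string ``copen 7,4096``.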
+
+
+The CLOSE Request
+-----------------
+
+When a cookie is withdrawn, a CLOSE request (opcode CACHEFILES_OP_CLOSE) will be
+sent to the user daemon. It will notify the user daemon to close all anonymous
+fds associated with the given object_id. The CLOSE request has no extea
+payload.
+
+
+The READ Request
+----------------
+
+When on-demand read mode is turned on, and a cache miss encountered, the kernel
+will send a READ request (opcode CACHEFILES_OP_READ) to the user daemon. This
+will tell the user daemon to fetch data of the requested file range. The payload
+is of the form::
+
+ struct cachefiles_read {
+ __u64 off;
+ __u64 len;
+ };
+
+ where:
+
+ * ``off`` indicates the starting offset of the requested file range.
+
+ * ``len`` indicates the length of the requested file range.
+
+
+When receiving a READ request, the user daemon needs to fetch the data of the
+requested file range, and then write it to the cache file identified by
+object_id.
+
+To finish processing the READ request, the user daemon should reply with the
+CACHEFILES_IOC_CREAD ioctl on one of the anonymous fds associated with the given
+object_id in the READ request. The ioctl is of the form::
+
+ ioctl(fd, CACHEFILES_IOC_CREAD, msg_id);
+
+ * ``fd`` is one of the anonymous fds associated with the given object_id
+ in the READ request.
+
+ * ``msg_id`` must match the msg_id field of the previous READ request.
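
Handling a READ request might be sketched as below: decode the payload, then store the fetched bytes through the anonymous fd using the llseek/write operations mentioned for the OPEN request. ``demo_read_req``, ``demo_parse_read`` and ``demo_fill_range`` are hypothetical names; the final ``CACHEFILES_IOC_CREAD`` ioctl is left as a comment because its definition comes from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative mirror of the READ payload documented above. */
struct demo_read_req {
    uint64_t off;    /* starting offset of the requested range */
    uint64_t len;    /* length of the requested range */
};

/* Copy the READ payload out of the message; -1 if it is too short. */
static int demo_parse_read(const uint8_t *payload, size_t payload_len,
                           struct demo_read_req *out)
{
    assert(payload != NULL && out != NULL);
    if (payload_len < sizeof(*out))
        return -1;
    memcpy(out, payload, sizeof(*out));
    return 0;
}

/*
 * Write len bytes of already-fetched data at offset off through fd, which
 * is assumed to be one of the anonymous fds of the object.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int demo_fill_range(int fd, uint64_t off, const void *data, size_t len)
{
    const uint8_t *p = data;

    if (lseek(fd, (off_t)off, SEEK_SET) == (off_t)-1)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n <= 0) {
            if (n == 0)
                errno = EIO;        /* no progress made */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    /* The daemon would now reply: ioctl(fd, CACHEFILES_IOC_CREAD, msg_id). */
    return 0;
}
```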
--
2.27.0
Jeffle Xu <[email protected]> wrote:
> +When working in its original mode, cachefiles mainly
I'd delete "mainly" there.
> serves as a local cache
> +for a remote networking fs - while in on-demand read mode, cachefiles can boost
> +the scenario where on-demand read semantics is
is -> are.
> +The essential difference between these two modes is that, in original mode,
> +when a cache miss occurs, the netfs will fetch the data from the remote server
> +and then write it to the cache file. With on-demand read mode, however,
> +fetching the data and writing it into the cache is delegated to a user daemon.
The starting sentence seems off. How about:
The essential difference between these two modes is seen when a cache miss
occurs: In the original mode, the netfs will fetch the data from the remote
server and then write it to the cache file; in on-demand read mode, fetching
data and writing it into the cache is delegated to a user daemon.
> +Protocol Communication
> +----------------------
> +
> +The on-demand read mode relies on
relies on -> uses
> a simple protocol used
Delete "used".
> for communication
> +between kernel and user daemon. The protocol can be modeled as::
> +
> + kernel --[request]--> user daemon --[reply]--> kernel
> +
> +The cachefiles kernel module will send requests to the user daemon when needed.
> +The user daemon needs to
needs to -> should
> poll on
poll on -> poll
> the devnode ('/dev/cachefiles') to check if
> +there's a pending request to be processed. A POLLIN event will be returned
> +when there's a pending request.
> +
> +The user daemon then reads the devnode to fetch a request and process it
> +accordingly.
Reading the devnode doesn't process the request, so I think something like:
"... and process it accordingly" -> "... that it can then process."
or:
"... and process it accordingly" -> "... to process."
> It is worth noting
"It should be noted"
> that each read only gets one request. When
... it has ...
> +finished processing the request, the user daemon needs to
needs to -> should write
> write the reply to
> +the devnode.
> +
> +Each request starts with a message header of the form::
> +
> + struct cachefiles_msg {
> + __u32 msg_id;
> + __u32 opcode;
> + __u32 len;
> + __u32 object_id;
> + __u8 data[];
> + };
> +
> + where:
> +
> + * ``msg_id`` is a unique ID identifying this request among all pending
> + requests.
> +
> + * ``opcode`` indicates the type of this request.
> +
> + * ``object_id`` is a unique ID identifying the cache file operated on.
> +
> + * ``data`` indicates the payload of this request.
> +
> + * ``len`` indicates the whole length of this request, including the
> + header and following type-specific payload.
> +
> +
> +Turn on On-demand Mode
Turning on
> +----------------------
> +
> +An optional parameter is added
is added -> becomes available
> to the "bind" command::
> +
> + bind [ondemand]
> +
> +When the "bind" command takes without
takes without -> is given no
> argument, it defaults to the original
> +mode. When the "bind" command is given
When it is given
> the "ondemand" argument, i.e.
> +"bind ondemand", on-demand read mode will be enabled.
> +
> +
> +The OPEN Request
> +----------------
> +
> +When the netfs opens a cache file for the first time, a request with the
> +CACHEFILES_OP_OPEN opcode, a.k.a. an OPEN request, will be sent to the user
> +daemon. The payload format is of the form::
> +
> + struct cachefiles_open {
> + __u32 volume_key_size;
> + __u32 cookie_key_size;
> + __u32 fd;
> + __u32 flags;
> + __u8 data[];
> + };
> +
> + where:
> +
> + * ``data`` contains the volume_key followed directly by the cookie_key.
> + The volume key is a NUL-terminated string; the cookie key is binary
> + data.
> +
> + * ``volume_key_size`` indicates the size of the volume key in bytes.
> +
> + * ``cookie_key_size`` indicates the size of the cookie key in bytes.
> +
> + * ``fd`` indicates an anonymous fd referring to the cache file, through
> + which the user daemon can perform write/llseek file operations on the
> + cache file.
> +
> +
> +The user daemon is able to distinguish the requested cache file with the given
> +(volume_key, cookie_key) pair.
"The user daemon can use the given (volume_key, cookie_key) pair to
distinguish the requested cache file." might sound better.
> Each cache file has a unique object_id, while it
> +may have multiple anonymous fds. The user daemon may duplicate anonymous fds
> +from the initial anonymous fd indicated by the @fd field through dup(). Thus
> +each object_id can be mapped to multiple anonymous fds, while the user daemon
> +itself needs to maintain the mapping.
> +
> +With the given anonymous fd, the user daemon can fetch data and write it to the
> +cache file in the background, even when kernel has not triggered a cache miss
> +yet.
> +
> +The user daemon should complete the READ request
READ request -> OPEN request?
> by issuing a "copen" (complete
> +open) command on the devnode::
> +
> + copen <msg_id>,<cache_size>
> +
> + * ``msg_id`` must match the msg_id field of the previous OPEN request.
> +
> + * When >= 0, ``cache_size`` indicates the size of the cache file;
> + when < 0, ``cache_size`` indicates the
the -> any
> error code ecountered
encountered
> by the
> + user daemon.
> +
> +
> +The CLOSE Request
> +-----------------
> +
> +When a cookie is withdrawn, a CLOSE request (opcode CACHEFILES_OP_CLOSE) will be
> +sent to the user daemon. It will notify
It will notify -> This tells
> the user daemon to close all anonymous
> +fds associated with the given object_id. The CLOSE request has no extea
extra
> +payload.
> +
> +
> +The READ Request
> +----------------
> +
> +When on-demand read mode is turned on, and a cache miss encountered,
"When a cache miss is encountered in on-demand read mode,"
> the kernel
> +will send a READ request (opcode CACHEFILES_OP_READ) to the user daemon. This
> +will tell
will tell -> tells/asks
> the user daemon to fetch data
data -> the contents
> of the requested file range. The payload
> +is of the form::
> +
> + struct cachefiles_read {
> + __u64 off;
> + __u64 len;
> + };
> +
> + where:
> +
> + * ``off`` indicates the starting offset of the requested file range.
> +
> + * ``len`` indicates the length of the requested file range.
> +
> +
> +When receiving
receiving -> it receives
> a READ request, the user daemon needs to
needs to -> should
> fetch the
requested
> data of the
> +requested file range,
"of the requested file range," -> "" (including the comma, I think)
> and then
"then" -> ""
> write it to the cache file identified by
> +object_id.
> +
> +To finish
When it has finished
> processing the READ request, the user daemon should reply with
with -> by using
> the
> +CACHEFILES_IOC_CREAD ioctl on one of the anonymous fds associated with the given
> +object_id
given object_id -> object_id given
> in the READ request. The ioctl is of the form::
> +
> + ioctl(fd, CACHEFILES_IOC_CREAD, msg_id);
> +
> + * ``fd`` is one of the anonymous fds associated with the given object_id
> + in the READ request.
the given object_id in the READ request -> object_id
> +
> + * ``msg_id`` must match the msg_id field of the previous READ request.
By "previous READ request" is this referring to something different to "the
READ request" you mentioned against the fd parameter?
David
Hi David, thanks for polishing the documentation. This is another detailed
and meticulous review; thank you very much for your time :) I will fix all
of these in the next version.
On 4/21/22 10:47 PM, David Howells wrote:
> Jeffle Xu <[email protected]> wrote:
>
>> +The essential difference between these two modes is that, in original mode,
>> +when a cache miss occurs, the netfs will fetch the data from the remote server
>> +and then write it to the cache file. With on-demand read mode, however,
>> +fetching the data and writing it into the cache is delegated to a user daemon.
>
> The starting sentence seems off. How about:
>
> The essential difference between these two modes is seen when a cache miss
> occurs: In the original mode, the netfs will fetch the data from the remote
> server and then write it to the cache file; in on-demand read mode, fetching
> data and writing it into the cache is delegated to a user daemon.
Okay, it sounds better.
>> the devnode ('/dev/cachefiles') to check if
>> +there's a pending request to be processed. A POLLIN event will be returned
>> +when there's a pending request.
>> +
>> +The user daemon then reads the devnode to fetch a request and process it
>> +accordingly.
>
> Reading the devnode doesn't process the request, so I think something like:
>
> "... and process it accordingly" -> "... that it can then process."
>
> or:
>
> "... and process it accordingly" -> "... to process."
Yeah, the original statement is indeed misleading.
>> Each cache file has a unique object_id, while it
>> +may have multiple anonymous fds. The user daemon may duplicate anonymous fds
>> +from the initial anonymous fd indicated by the @fd field through dup(). Thus
>> +each object_id can be mapped to multiple anonymous fds, while the user daemon
>> +itself needs to maintain the mapping.
>> +
>> +With the given anonymous fd, the user daemon can fetch data and write it to the
>> +cache file in the background, even when kernel has not triggered a cache miss
>> +yet.
>> +
>> +The user daemon should complete the READ request
>
> READ request -> OPEN request?
Good catch. Will be fixed.
>> in the READ request. The ioctl is of the form::
>> +
>> + ioctl(fd, CACHEFILES_IOC_CREAD, msg_id);
>> +
>> + * ``fd`` is one of the anonymous fds associated with the given object_id
>> + in the READ request.
>
> the given object_id in the READ request -> object_id
>
>> +
>> + * ``msg_id`` must match the msg_id field of the previous READ request.
>
> By "previous READ request" is this referring to something different to "the
> READ request" you mentioned against the fd parameter?
Actually it is referring to the same thing (the same READ request). I
will change the statement simply to:
``msg_id`` must match the msg_id field of the READ request.
--
Thanks,
Jeffle