Received: by 2002:a05:6512:3d0e:0:0:0:0 with SMTP id d14csp18247lfv; Tue, 12 Apr 2022 15:27:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxETqGONExCFtBLsU+6+DwrrJaKUIpmefSSRL7c+FvG934obCci74T/YaNC88hR4jjvcI1/ X-Received: by 2002:a63:5917:0:b0:39c:c450:3143 with SMTP id n23-20020a635917000000b0039cc4503143mr24348291pgb.531.1649802476168; Tue, 12 Apr 2022 15:27:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649802476; cv=none; d=google.com; s=arc-20160816; b=GJeH9nQbkFIoIr9iwbM6QdS30tZAsScfAaqbFWs9ozkrrjC1MDss1/m1geLq3ccEOQ R0ZfgFonfq6eSgzB9Xbs5CF9OUb9G3XVjcx7V5STQKIOIXrQ0YuVvwZGN+Xc7yPLGXJY 3ErUJM9Xz7fefpDHD4ZViMBcHWGmL6ytjpLu6aoNyVOzgQPWIVPB4z2WjVikg7J/s37O d4/FIrgsuVBF/aK2hZkTsb/ZLEjoeEIAeKSk71mkCjE5Q7kcjOTTE9CxfYnCeq+rg2Oz JL++9if8E1wtsngmcVc563odlcDDRSlfQmjEezePDBYOXQncDu3jlSY8RAx1vz0xrVTG zCyQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:content-transfer-encoding :content-id:mime-version:subject:cc:to:references:in-reply-to:from :organization:dkim-signature; bh=ZN2biAUnTrTodc0JZhas+iGpcRCK3gpzadZQYbc1pBk=; b=Hm3fPeJkFor/vJx30wxiBN8ggOZaKG2NTcZU/2OYP7kI2PYcIU0VDt8BuEeJoO4Lnp a+XiRH3GZ3O+C3olEZcyfDqcTPZaMgAIosW2YY6GDzX7L8c2UQNhx46A6+xsiqTf4PoR 4+C6RPDBD31JMakxkPCRT8hGLhquRuCLLhXd/jNfuJr+Nx0xTd0de+JTuGEz6JNHA/kW 8DYGhCjpJ6ukXlzQUXPtybiVT2Tu0rmktodlNL7kvxOdobn32uW1MKeIKPB24ACwkwyk Okud5s25Y0N3AzE2rgkgdnTyZkacE8IhzPxobHpe50QneJPK6UN7f4lclEW6LHBdcVGF Fpyg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=FddHh3zU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id c16-20020a6566d0000000b003816043ee9dsi4067684pgw.146.2022.04.12.15.27.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Apr 2022 15:27:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=FddHh3zU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 25224181DBB; Tue, 12 Apr 2022 14:05:54 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346600AbiDKNlW (ORCPT + 99 others); Mon, 11 Apr 2022 09:41:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346603AbiDKNlS (ORCPT ); Mon, 11 Apr 2022 09:41:18 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id F073F22B0E for ; Mon, 11 Apr 2022 06:39:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1649684339; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZN2biAUnTrTodc0JZhas+iGpcRCK3gpzadZQYbc1pBk=; b=FddHh3zUJtmv1BoZQxlgH2aD+LQBv4zHppdN/lnHQlxiJ0hSfqToobV/DyzubDBCfzfwJd PxapzJ4DUVItNnCVauO0DQyBzaxyr6/TV3JOtPiFFjuFWAeekl/M9/iRULpxPVDA2A0KSc EEP/7GqgdgHI1RKz5nZsk/dAGZj0EPM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-388-0L_x8Q6WPne-TfycMu9Vhg-1; Mon, 11 Apr 2022 09:38:56 -0400 X-MC-Unique: 0L_x8Q6WPne-TfycMu9Vhg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7031D18A65AA; Mon, 11 Apr 2022 13:38:55 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.37.45]) by smtp.corp.redhat.com (Postfix) with ESMTP id A6991C15D40; Mon, 11 Apr 2022 13:38:52 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 From: David Howells In-Reply-To: <20220406075612.60298-8-jefflexu@linux.alibaba.com> References: <20220406075612.60298-8-jefflexu@linux.alibaba.com> <20220406075612.60298-1-jefflexu@linux.alibaba.com> To: Jeffle Xu Cc: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org, torvalds@linux-foundation.org, gregkh@linuxfoundation.org, willy@infradead.org, linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org, luodaowen.backend@bytedance.com, tianzichen@kuaishou.com, fannaihao@baidu.com Subject: Re: [PATCH v8 07/20] cachefiles: document on-demand read mode MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <1094291.1649684331.1@warthog.procyon.org.uk> Content-Transfer-Encoding: quoted-printable Date: Mon, 11 Apr 2022 14:38:51 +0100 Message-ID: <1094292.1649684331@warthog.procyon.org.uk> X-Scanned-By: MIMEDefang 2.85 on 10.11.54.8 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Jeffle Xu wrote: > + (*) On-demand Read. > + Unnecessary extra blank line. Jeffle Xu wrote: > + > + > +On-demand Read > +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > + > +When working in original mode, cachefiles mainly serves as a local cach= e for > +remote networking fs, while in on-demand read mode, cachefiles can boos= t the > +scenario where on-demand read semantics is needed, e.g. container image > +distribution. > + > +The essential difference between these two modes is that, in original m= ode, > +when cache miss, netfs itself will fetch data from remote, and then wri= te the > +fetched data into cache file. While in on-demand read mode, a user daem= on is > +responsible for fetching data and then writing to the cache file. > + > +``CONFIG_CACHEFILES_ONDEMAND`` shall be enabled to support on-demand re= ad mode. You're missing a few articles there. How about: """ When working in its original mode, cachefiles mainly serves as a local cac= he for a remote networking fs - while in on-demand read mode, cachefiles can = boost the scenario where on-demand read semantics are needed, e.g. container ima= ge distribution. The essential difference between these two modes is that, in original mode= , when a cache miss occurs, the netfs will fetch the data from the remote se= rver and then write it to the cache file. With on-demand read mode, however, fetching and the data and writing it into the cache is delegated to a user daemon. ``CONFIG_CACHEFILES_ONDEMAND`` shall be enabled to support on-demand read = mode. """ "should be enabled". Also, two spaces after a full stop please (but not after the dot in a contraction, e.g. "e.g."). > +The on-demand read mode relies on a simple protocol used for communicat= ion > +between kernel and user daemon. The model is like:: "The protocol can be modelled as"? > +The cachefiles kernel module will send requests to the > user daemon when needed. > + the > User daemon needs to poll on the devnode ('/dev/cachefiles') to check if > +there's a > pending request to be processed. A POLLIN event will be returned > +when there's a > pending request. > +Then user daemon needs to read "The user daemon [than] reads " > the devnode to fetch one one -> a > request and process it > +accordingly. It is worth nothing nothing -> noting > that each read only gets one request. When > +finished processing the request, the > user daemon needs to write the reply to the > +devnode. > +Each request is started with a message header like:: "is started with" -> "starts with". "like" -> "of the form". > + * ``id`` is a unique ID identifying this request among all pending > + requests. What's the scope of the uniqueness of "id"? Is it just unique to a partic= ular cachefiles cache? > + * ``len`` identifies the whole length of this request, including the > + header and following type specific payload. type-specific. > +An optional parameter is added to "bind" command:: to the "bind" command. > +When the > "bind" command takes takes -> is given > without argument, it defaults to the original mode. > +When the > "bind" command takes is given > with the > "ondemand" argument, i.e. "bind ondemand", > +on-demand read mode will be enabled. > +OPEN Request The > +------------ > + > +When the > netfs opens a cache file for the first time, a request with the > +CACHEFILES_OP_OPEN opcode, a.k.a an > OPEN request will be sent to the > user daemon. The > +payload format is like:: format is like -> of the form > + > + struct cachefiles_open { > + __u32 volume_key_size; > + __u32 cookie_key_size; > + __u32 fd; > + __u32 flags; > + __u8 data[]; > + }; > + "where:" > + * ``data`` contains the > volume_key and cookie_key in sequence. Might be better to say "contains the volume_key followed directly by the cookie_key. The volume key is a NUL-terminated string; cookie_key is bina= ry data.". > + > + * ``volume_key_size`` identifies identifies -> indicates/supplies > the size of the > volume key of the cache > + file, in bytes. volume_key is of string format, with a suffix '\0'. > + > + * ``cookie_key_size`` identifies the size of cookie key of the cache > + file, in bytes. cookie_key is of binary format, which is netfs > + specific. "... indicates the size of the cookie key in bytes." > + > + * ``fd`` identifies the the -> an > anonymous fd of of -> referring to > the cache file, with with -> through > which user > + daemon can perform write/llseek file operations on the cache file. > + > + > + The > OPEN request contains a > (volume_key, cookie_key, anon_fd) triple for triplet for the I would probably also use {...} rather than (...). > corresponding > +cache file. With this triple, triplet, the > user daemon could could -> can > fetch and write data into the > +cache file in the background, even when kernel has not triggered the the -> a > cache miss > +yet. The > User daemon is able to distinguish the requested cache file with the giv= en > +(volume_key, cookie_key), and write the fetched data into the > cache file with with -> using > the > +given anon_fd. > + > +After recording the (volume_key, cookie_key, anon_fd) triple, triplet, the > user daemon shall shall -> should > +reply with reply with -> complete the request by issuing a > "copen" (complete open) command:: > + > + copen , > + > + * ``id`` is exactly the id field of the previous OPEN request. > + > + * When >=3D 0, ``cache_size`` identifies the size of the cache file; > + when < 0, ``cache_size`` identifies the error code ecountered by the > + user daemon. identifies -> indicates ecountered -> encountered > +CLOSE Request The > +------------- > +When a > cookie withdrawed, withdrawed -> withdrawn > a request with a > CACHEFILES_OP_CLOSE opcode, a.k.a CLOSE > +request, Maybe phrase as "... a close request (opcode CACHEFILES_OP_CLOSE), > will be sent to user daemon. It will notify the > user daemon to close the > +attached anon_fd. The payload format is like:: like -> of the form > + > + struct cachefiles_close { > + __u32 fd; > + }; > + "where:" > + * ``fd`` identifies the anon_fd to be closed, which is exactly the sam= e "... which should be the same as that provided to the OPEN request". Is it possible for userspace to move the fd around with dup() or whatever? > + with that in OPEN request. > + > + > +READ Request The > +------------ > + > +When on-demand read mode is turned on, and a > cache miss encountered, the > kernel will > +send a request with CACHEFILES_OP_READ opcode, a.k.a READ request, "send a READ request (opcode CACHEFILES_OP_READ)" > to the > user > +daemon. It will notify It will notify -> This will ask/tell > user daemon to fetch data in the requested file range. > +The payload format is like:: format is like -> is of the form > + > + struct cachefiles_read { > + __u64 off; > + __u64 len; > + __u32 fd; > + }; > + > + * ``off`` identifies the starting offset of the requested file range. identifies -> indicates > + > + * ``len`` identifies the length of the requested file range. > + identifies -> indicates (you could alternatively say "specified") > + * ``fd`` identifies the anonymous fd of the requested cache file. It i= s > + guaranteed that it shall be the same with "same with" -> "same as" Since the kernel cannot make such a guarantee, I think you may need to res= tate this as something like "Userspace must present the same fd as was given in= the previous OPEN request". > the fd field in the previous > + OPEN request. > + > +When receiving one one -> a > READ request, the > user daemon needs to fetch the > data of the > +requested file range, and then write the fetched data , and then write the fetched data -> and write it > into cache file cache file -> cache > with using > the > +given anonymous fd. + to indicate the destination. > + > +When finished When finished -> To finish > processing the READ request, the > user daemon needs to reply with the > +CACHEFILES_IOC_CREAD ioctl on the corresponding anon_fd:: > + > + ioctl(fd, CACHEFILES_IOC_CREAD, id); > + > + * ``fd`` is exactly the fd field of the previous READ request. Does that have to be true? What if userspace moves it somewhere else? > + > + * ``id`` is exactly the id field of the previous READ request. is exactly the -> must match the David