Date: Fri, 21 Jan 2022 18:57:38 +0800
From: JeffleXu <jefflexu@linux.alibaba.com>
To: David Howells, linux-cachefs@redhat.com
Cc: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org,
    Greg Kroah-Hartman, Linus Torvalds, linux-fsdevel@vger.kernel.org,
    joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com,
    tao.peng@linux.alibaba.com, gerry@linux.alibaba.com,
    eguan@linux.alibaba.com, linux-kernel@vger.kernel.org
Subject: Re: [Linux-cachefs] [PATCH v2 00/20] fscache, erofs: fscache-based demand-read semantics

Hi David,

Would you mind sharing whether you like this patch set or not? The use
case of file-based on-demand load seems quite general. As Gao Xiang
noted, we still prefer fscache to implement this scenario, since
fscache already works well as the local cache for remote netfs.

I'd like to know whether this potential new requirement for fscache
fits your expectations or future plans for fscache. If it does, we can
improve the patch set in later versions. Otherwise, please let me know
if it indeed deviates from the roadmap of fscache.

Thanks,
Jeffle

On 1/19/22 2:40 PM, Gao Xiang wrote:
> Hi David,
>
> On Tue, Jan 18, 2022 at 09:11:56PM +0800, Jeffle Xu wrote:
>> changes since v1:
>> - rebase to v5.17
>> - erofs: In the chunk-based layout, the logical file offset has the
>>   same remainder modulo PAGE_SIZE as the corresponding physical
>>   address inside the data blob file, so the file's page cache can be
>>   handed directly to the netfs library to hold the data read from the
>>   data blob file. (patch 15) (Gao Xiang)
>> - netfs,cachefiles: manage logical/physical offset separately. (patch 2)
>>   (It is used by erofs_begin_cache_operation() in patch 15.)
>> - cachefiles: introduce a new devnode specifically for on-demand
>>   reading. (patch 6)
>> - netfs,fscache,cachefiles: add new CONFIG_* options for on-demand
>>   reading.
>>   (patch 3/5)
>> - You could start a quick test with
>>   https://github.com/lostjeffle/demand-read-cachefilesd
>> - add more background information (mainly an introduction to nydus)
>>   in the "Background" part of this cover letter
>>
>> [Important Issues]
>> The following issues still need further discussion. Thanks for your
>> time and patience.
>>
>> 1. I noticed that there is a refactoring of the netfs library[1], and
>>    patch 1 is no longer needed since [2].
>>
>> 2. The current implementation will severely conflict with the
>>    refactoring of the netfs library[1][2]. The assumption of 'struct
>>    netfs_i_context' [2] is that every file in the upper netfs
>>    corresponds to only one backing file, while in our scenario one
>>    file in erofs can correspond to multiple backing files. That is,
>>    the content of one file can be divided into multiple chunks that
>>    are distributed over multiple blob files, i.e. multiple backing
>>    files. Currently I have no good idea for solving this conflict.
>>
>
> Would you mind giving more hints on this? Personally, I still think
> fscache is a useful and clean way to handle image-distribution
> on-demand load use cases, in addition to caching network fs data as a
> more generic in-kernel caching framework. Judging from the current
> diffstat, it makes only slight modifications to netfslib and
> cachefiles (apart from a new daemon):
>
>  fs/netfs/Kconfig         |   8 +
>  fs/netfs/read_helper.c   |  65 ++++++--
>  include/linux/netfs.h    |  10 ++
>
>  fs/cachefiles/Kconfig    |   8 +
>  fs/cachefiles/daemon.c   | 147 ++++++++++++++++-
>  fs/cachefiles/internal.h |  23 +++
>  fs/cachefiles/io.c       |  82 +++++++++-
>  fs/cachefiles/main.c     |  27 ++++
>  fs/cachefiles/namei.c    |  60 ++++++-
>
> Besides, I think that setting cookies according to the data mapping
> (instead of fixing one cookie per file) will benefit another scenario
> in addition to our on-demand load use cases: file cache data
> deduplication. From what I can see, netfslib may have some follow-on
> development to support encryption and compression, but I think cache
> data deduplication is also potentially useful to minimize cache
> storage, since many local fses already support reflink. However, I'm
> not sure it's a great idea for cachefiles to rely on the underlying
> fs's abilities for cache deduplication. So for cache deduplication
> scenarios, I'm not sure a per-file cookie is still a good idea for us
> (alternatively, maintaining a more complicated per-cookie mapping
> inside fscache besides the filesystem mapping seems too unnecessary
> IMO).
>
> By the way, in general, I'm not sure it's a great idea to cache on a
> per-file basis (especially with very many small files); that is why we
> introduced deduplicated data blobs. At least, it's simpler for
> read-only fses. Recently, I found another good article summarizing
> this:
> http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html
>
> Thanks,
> Gao Xiang
>

-- 
Thanks,
Jeffle
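[Editorial addendum: a minimal, purely illustrative C sketch of the layout
behind issue 2 above. The names are hypothetical, not the actual erofs or
netfs structures; the point is that each chunk of one upper file carries
its own (blob, offset) mapping, so a single per-inode backing file, as
assumed by 'struct netfs_i_context', cannot describe it.]

    /*
     * Hypothetical illustration only -- NOT the real erofs/netfs types.
     * One upper file is split into chunks, and each chunk may live in a
     * different backing blob file.
     */
    #include <stddef.h>
    #include <stdint.h>

    struct demo_chunk_map {
            uint64_t logical_off;   /* chunk offset within the upper file */
            uint64_t length;        /* chunk length */
            uint32_t blob_index;    /* which backing blob holds this chunk */
            uint64_t blob_off;      /* chunk offset inside that blob */
    };

    struct demo_inode {
            size_t nr_chunks;
            struct demo_chunk_map *chunks;  /* sorted by logical_off */
    };

    /* Map a logical read position to (blob_index, blob_off); -1 if unmapped. */
    int demo_map_read(const struct demo_inode *inode, uint64_t pos,
                      uint32_t *blob_index, uint64_t *blob_off)
    {
            size_t i;

            for (i = 0; i < inode->nr_chunks; i++) {
                    const struct demo_chunk_map *c = &inode->chunks[i];

                    if (pos >= c->logical_off &&
                        pos < c->logical_off + c->length) {
                            *blob_index = c->blob_index;
                            *blob_off = c->blob_off + (pos - c->logical_off);
                            return 0;
                    }
            }
            return -1;
    }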