Message-ID: <2067a5c7-4e24-f449-4676-811d12e9ab72@linux.alibaba.com>
Date: Fri, 22 Apr 2022 00:14:29 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101
 Thunderbird/91.6.1
Subject: Re: EMFILE/ENFILE mitigation needed in erofs?
Content-Language: en-US
To: David Howells
Cc: linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org,
 linux-erofs@lists.ozlabs.org, torvalds@linux-foundation.org,
 gregkh@linuxfoundation.org, willy@infradead.org,
 linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com,
 bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com,
 gerry@linux.alibaba.com, eguan@linux.alibaba.com,
 linux-kernel@vger.kernel.org, luodaowen.backend@bytedance.com,
 tianzichen@kuaishou.com, fannaihao@baidu.com,
 zhangjiachen.jaycee@bytedance.com
References: <20220415123614.54024-3-jefflexu@linux.alibaba.com>
 <20220415123614.54024-1-jefflexu@linux.alibaba.com>
 <1447543.1650552898@warthog.procyon.org.uk>
From: JeffleXu
In-Reply-To: <1447543.1650552898@warthog.procyon.org.uk>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/21/22 10:54 PM, David Howells wrote:
> Jeffle Xu wrote:
>
>> +    fd_install(fd, file);
>
> Do you need to mitigate potential EMFILE/ENFILE problems?  You're potentially
> trebling up the number of accounted systemwide fds: one for erofs itself, one
> anonfd per cache object file to communicate with the daemon and one in the
> daemon to talk to the server.  Cachefiles has a fourth internally, but it's
> kept off the books - further, cachefiles closes them fairly quickly after a
> period of nonuse.
>

Hi, thanks for pointing it out.

1. In our usage scenarios, one erofs filesystem is formed of several
   chunk-deduplicated blobs, and each blob can contain many erofs
   files. For example, one container image for node.js corresponds to
   ~20 blob files in total. Only these blob files are cached by
   Cachefiles, not the individual erofs files. In a densely deployed
   environment, there could be hundreds of containers and thus
   thousands of backing files on one machine. That is, only tens of
   thousands of fds/files are needed in this case.

2. Our user daemon will set RLIMIT_NOFILE to a reasonably large value
   (e.g. 1 million), so that it won't fail when trying to allocate
   fds.

   https://github.com/dragonflyoss/image-service/blob/master/src/bin/nydusd/main.rs#L152

3. Our user daemon will close the anonymous fd once the corresponding
   backing file has been fully downloaded, to free the fd resources.

4. Even if fd/file allocation fails (in cachefiles_ondemand_get_fd()),
   the INIT request will fail, and thus the erofs mount will fail.
   That is, the failure is reported at mount time and won't break the
   upper erofs.

5. If we later find that the number of fds/files is indeed an issue,
   we also plan to make the user daemon close some fds to free up
   resources. The Cachefiles kernel module would then need to
   reallocate an anonymous fd for the backing file on a cache miss.
   That remains to be done later, if it's really needed.

-- 
Thanks,
Jeffle
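
For illustration, the fd budget in (1) and the limit handling in (2) can be sketched as below. This is a minimal C sketch under stated assumptions, not the actual nydusd code (which is written in Rust). The function names estimate_fds() and raise_nofile_limit() are hypothetical, and the example figures (500 containers, 20 blobs per image, 3 fds per blob, the last following the "trebling" estimate quoted above) are assumed values chosen only to match the orders of magnitude in the text.

```c
#include <stdint.h>
#include <sys/resource.h>

/*
 * Worst-case fd count, assuming every blob of every container image is
 * open at once and each blob costs fds_per_blob descriptors.
 * All three inputs are illustrative assumptions, not measured values.
 */
uint64_t estimate_fds(uint64_t containers, uint64_t blobs_per_image,
		      uint64_t fds_per_blob)
{
	return containers * blobs_per_image * fds_per_blob;
}

/*
 * Raise the soft RLIMIT_NOFILE toward `wanted`, capped at the hard limit.
 * Returns the resulting soft limit, or 0 if getrlimit()/setrlimit() fails.
 */
rlim_t raise_nofile_limit(rlim_t wanted)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 0;
	if (rl.rlim_cur >= wanted)
		return rl.rlim_cur;	/* already large enough */

	/* An unprivileged process cannot go above its hard limit. */
	rl.rlim_cur = wanted < rl.rlim_max ? wanted : rl.rlim_max;
	if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 0;
	return rl.rlim_cur;
}
```

With the assumed figures, estimate_fds(500, 20, 3) gives 30000, which is well below a soft limit of 1000000. A daemon could compare the two at startup and warn if the value returned by raise_nofile_limit() is smaller than the estimate.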