From: Daire Byrne
Date: Mon, 25 Apr 2022 17:47:16 +0100
Subject: Re: parallel file create rates (+high latency)
To: "J. Bruce Fields"
Cc: NeilBrown, Patrick Goetz, linux-nfs
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, 25 Apr 2022 at 17:02, J. Bruce Fields wrote:
>
> On Mon, Apr 25, 2022 at 04:24:50PM +0100, Daire Byrne wrote:
> > On Mon, 25 Apr 2022 at 14:22, J. Bruce Fields wrote:
> > >
> > > On Mon, Apr 25, 2022 at 02:00:32PM +0100, Daire Byrne wrote:
> > > > On Mon, 21 Feb 2022 at 13:59, Daire Byrne wrote:
> > > > >
> > > > > On Fri, 18 Feb 2022 at 07:46, NeilBrown wrote:
> > > > > > I've ported it to mainline without much trouble. I started some
> > > > > > simple testing (parallel create/delete of the same file) and hit
> > > > > > a bug quite easily. I fixed that (eventually) and then tried with
> > > > > > more than 1 CPU, and hit another bug. But then it was quitting
> > > > > > time. If I can get rid of all the easy-to-find bugs, I'll post it
> > > > > > with a CC to you, and you can find some more for me!
> > > > >
> > > > > That would be awesome! I have a real-world production case for
> > > > > this and it's a pretty heavy workload. If that doesn't shake out
> > > > > any bugs, nothing will.
> > > > >
> > > > > The only caveat being that it will likely be restricted to NFSv3
> > > > > testing due to the concurrency limitations with NFSv4.1+ (from the
> > > > > other thread).
> > > > >
> > > > > Daire
> > > >
> > > > Just to follow up on this again - I have been using Neil's patch
> > > > for parallel file creates (thanks!), but I'm a bit confused as to
> > > > why it doesn't seem to help in my NFS re-export case.
> > > >
> > > > With the patch, I can achieve much higher parallel (multi-process)
> > > > create rates directly on my re-export server against a high-latency
> > > > remote server mount, but when I re-export that mount to multiple
> > > > clients, the aggregate create rate degrades to what we would expect
> > > > either without the patch or with a single process creating the
> > > > files in sequence.
> > > >
> > > > My assumption was that the nfsd threads of the re-export server
> > > > would act as multiple independent processes and its clients would
> > > > be spread across them, such that they would also benefit from the
> > > > parallel creates patch on the re-export server. So I expected many
> > > > clients creating files in the same directory to achieve much higher
> > > > aggregate performance.
> > >
> > > That's the idea.
> > >
> > > I've lost track, where's the latest version of Neil's patch?
> > >
> > > --b.
> >
> > The latest is still the one from this thread (with a minor update to
> > apply it to v5.18-rc):
> >
> > https://lore.kernel.org/lkml/893053D7-E5DD-43DB-941A-05C10FF5F396@dilger.ca/T/#m922999bf830cacb745f32cc464caf72d5ffa7c2c
>
> Thanks!
>
> I haven't really tried to understand that patch, but just looking at
> the diffstat, it doesn't touch fs/nfsd/. And nfsd calls into the VFS
> only after it locks the parent. So nfsd is probably still using the
> old (serialized) behavior, while local callers are using the new
> (parallel) behavior.
>
> So I bet what you're seeing is expected, and all that's needed is some
> updates to fs/nfsd/vfs.c to reflect whatever Neil did in fs/namei.c.
>
> --b.

Ah right, that would explain it then - thanks. I just naively assumed
that nfsd would pass straight into the VFS and rely on those locks.

I'll stare at fs/nfsd/vfs.c for a bit, but I probably lack the
expertise to make it work. It's also not entirely clear whether this
parallel creates RFC patch will ever make it into mainline.

Daire

> > My test is something like this:
> >
> > reexport1 # for x in {1..5000}; do
> >     echo /srv/server1/touch.$HOSTNAME.$x
> > done | xargs -n1 -P 200 -iX -t touch X 2>&1 | pv -l -a >|/dev/null
> >
> > Without the patch this results in 3 creates/s, and with the patch
> > it's ~250 creates/s with 200 threads/processes (200ms latency) when
> > run directly against a remote RHEL8 server (server1).
> >
> > Then I run something similar to this, but simultaneously across 200
> > clients of the "reexport1" server's re-export of the originating
> > "server1". I get an aggregate of around 3 creates/s even with the
> > patch applied to reexport1 (v5.18-rc2), which is suspiciously similar
> > to the performance without the parallel VFS create patch.
> >
> > The clients don't run any special kernels or configurations. I have
> > only tested NFSv3 so far.
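[Editor's note] For readers who want to reproduce the create test quoted in the thread, here is a self-contained sketch of the same xargs-driven parallel-create benchmark. It runs against a local scratch directory by default; the directory, counts and naming scheme are illustrative (the original test used /srv/server1, 5000 files and 200 processes against an NFS mount):

```shell
# Parallel file-create microbenchmark, modelled on the test quoted above.
# DIR, COUNT and PAR are illustrative; point DIR at an NFS (re-export)
# mount and raise the numbers to reproduce the thread's measurements.
DIR=$(mktemp -d)
COUNT=500
PAR=50

start=$(date +%s)
# One touch process per file name, up to $PAR running concurrently,
# mirroring the "echo ... | xargs -P 200 ... touch" pipeline above.
seq 1 "$COUNT" | xargs -P "$PAR" -I{} touch "$DIR/touch.$(hostname).{}"
end=$(date +%s)

created=$(ls "$DIR" | wc -l)
echo "created $created files in $((end - start))s"
rm -rf "$DIR"
```

On a local filesystem this completes almost instantly; the interesting numbers only appear when DIR sits behind a high-latency NFS mount, as in the thread.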
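[Editor's note] As a back-of-the-envelope sanity check on the figures in the thread: if the parent directory lock is held across each round trip, creates in one directory are fully serialized and aggregate throughput is capped near 1/RTT no matter how many clients or nfsd threads are active. With the ~200ms latency quoted, that ceiling is ~5 creates/s, the same ballpark as the ~3 creates/s observed through the re-export. This is pure arithmetic, not a measurement:

```shell
# Throughput ceilings implied by a 200ms round trip (value from the
# thread) with 200 concurrent clients; purely illustrative arithmetic.
rtt_ms=200
clients=200

serialized=$((1000 / rtt_ms))            # exclusive parent lock: ~5/s
overlapped=$((clients * 1000 / rtt_ms))  # ideal fully parallel bound

echo "serialized ceiling: $serialized creates/s"
echo "overlapped ceiling: $overlapped creates/s"
```

The measured ~250 creates/s with the patch applied directly on the re-export server sits well below the ideal overlapped bound, which is expected once per-operation overheads beyond raw latency are accounted for.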