Subject: Re: parallel file create rates (+high latency)
From: Daire Byrne
Date: Tue, 25 Jan 2022 22:41:55 +0000
To: Patrick Goetz
Cc: Bruce Fields, Chuck Lever III, Linux NFS Mailing List
X-Mailing-List: linux-nfs@vger.kernel.org

On Tue, 25 Jan 2022 at 22:11, Patrick Goetz wrote:
>
> IDK, 4000 images per collection, with hundreds of collections on
> disk? Say at least 500,000 files? Maybe a million? With most files
> about 1GB in size. I was trying to just rsync it all from the data
> server to a ZFS-based backup server in our data center, but the
> backup started failing constantly because the filesystem would
> change after rsync had already constructed an index. Even after an
> initial copy, a backup like that runs for over a week. The strategy
> I'm about to try and implement is to NFS mount the data server's
> data partition to the backup server and then have a script walk
> through the directory hierarchy, rsyncing collections one at a
> time. ZFS send/receive would probably be better, but the data
> server isn't configured with ZFS.

We've strayed slightly off topic (even if we are still talking about
file creates over NFS), because you can get good parallel performance
(creates, reads, writes etc.) over NFS with simultaneous copies using
lots of processes, as long as they are spread across lots of
directories.

Well, "good" being subjective: I get 1,500 creates/s in a single
directory on a LAN NFS server from a single client, and 160 creates/s
aggregate over my extreme 200ms (latency) link using 10 clients & 10
different directories. That seems fair, all things considered.

But seeing as I do a lot of these kinds of big data moves (TBs)
across both the LAN and WAN, I can perhaps offer some advice from
experience that might be useful:

* walk the filesystem (locally) first to build a file list, split
  it, and then use rsync --files-from (e.g.
  https://github.com/jbd/msrsync) to feed multiple simultaneous
  rsyncs (see the first sketch after this list).
* avoid NFS and use rsyncd directly between the servers (no ssh) so
  the filesystem walks are "local" (see the second sketch after this
  list).
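To make the first point concrete, here is a rough sketch of the
split-and-feed pattern (essentially what msrsync automates for you).
The paths, the "backup-server" host, its "data" rsyncd module and the
8-way fan-out are all made up for the example:

    # build a file list locally, then split it into 8 line-aligned
    # chunks (GNU split; the /tmp paths are just examples)
    cd /srv/data && find . -type f > /tmp/filelist
    split -n l/8 /tmp/filelist /tmp/chunk.

    # feed each chunk to its own rsync, talking straight to the
    # remote rsyncd module (rsync:// = no ssh encapsulation)
    for f in /tmp/chunk.*; do
        rsync -a --files-from="$f" /srv/data/ rsync://backup-server/data/ &
    done
    wait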
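For the second point, the receiving end only needs a small rsyncd
config. A minimal sketch, with the module name, path, uid and allowed
network all being placeholders to adjust:

    # /etc/rsyncd.conf on the backup server
    [data]
        path = /backup/data
        # the module must be writable to push backups into it
        read only = false
        uid = root
        # only let the data server's network connect
        hosts allow = 10.0.0.0/24

Start it with "rsync --daemon" (it listens on TCP port 873 by
default) and rsync:// URLs like the one above will talk to it
directly, skipping ssh entirely.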
The advantage of rsync is that it does the filesystem walks at both
ends locally and compares the directory trees as it goes along. The
other nice thing it does is open a single connection between sender
and receiver and stream all the file data down it, so it works really
well even for lists of small files. The TCP connection and window
scaling can sit at their maximum without any slow remote file
metadata latency disrupting them. Avoid the encapsulation of ssh and
use rsyncd instead, as it just speeds everything up.

And as always with any WAN connection, large buffers, window scaling,
no firewall DPI, and maybe some fancy congestion control like
BBR/BBRv2 all help.

Daire