From: Miklos Szeredi
Date: Fri, 21 Oct 2022 08:57:11 +0200
Subject: Re: [PATCH v5 1/1] Allow non-extending parallel direct writes on the same file.
To: Bernd Schubert
Cc: Dharmendra Singh, Vivek Goyal, linux-fsdevel@vger.kernel.org, fuse-devel, linux-kernel@vger.kernel.org, Dharmendra Singh, Horst Birthelmer

On Tue, 13 Sept 2022 at 10:44, Bernd Schubert wrote:
>
> On 6/17/22 14:43, Miklos Szeredi wrote:
> > On Fri, 17 Jun 2022 at 11:25, Bernd Schubert wrote:
> >>
> >> Hi Miklos,
> >>
> >> On 6/17/22 09:36, Miklos Szeredi wrote:
> >>> On Fri, 17 Jun 2022 at 09:10, Dharmendra Singh wrote:
> >>>
> >>>> This patch relaxes the exclusive lock for direct non-extending writes
> >>>> only. File size extending writes might not need the lock either,
> >>>> but we are not entirely sure if there is a risk to introduce any
> >>>> kind of regression. Furthermore, benchmarking with fio does not
> >>>> show a difference between patch versions that take on file size
> >>>> extension a) an exclusive lock and b) a shared lock.
> >>>
> >>> I'm okay with this, but ISTR Bernd noted a real-life scenario where
> >>> this is not sufficient. Maybe that should be mentioned in the patch
> >>> header?
> >>
> >>
> >> the above comment is actually directly from me.
> >>
> >> We didn't check if fio extends the file before the runs, but even if it
> >> does, my current thinking is that before we serialized n threads, and
> >> now we have an alternation of
> >>    - "parallel n-1 threads running" + 1 waiting thread
> >>    - "blocked n-1 threads" + 1 running
> >>
> >> I think we will come back to this anyway if we continue to see slow
> >> IO with MPIIO.
> >> Right now we want to get our patches merged first and
> >> then will create an updated module for RHEL8 (+derivatives) customers.
> >> Our benchmark machines are also running plain RHEL8 kernels - without
> >> backporting the modules first we don't know yet what the actual
> >> impact on things like io500 will be.
> >>
> >> Shall we still extend the commit message or are we good to go?
> >
> > Well, it would be nice to see the real workload on the backported
> > patch. Not just because it would tell us if this makes sense in the
> > first place, but also to have additional testing.
>
>
> Sorry for the delay, Dharmendra and I got busy with other tasks and
> Horst (in CC) took over the patches and did the MPIIO benchmarks on 5.19.
>
> Results with https://github.com/dchirikov/mpiio.git
>
>                unpatched     patched       patched
>                (extending)   (extending)   (non-extending)
> ----------------------------------------------------------
>                MB/s          MB/s          MB/s
> 2 threads      2275.00       2497.00       5688.00
> 4 threads      2438.00       2560.00       10240.00
> 8 threads      2925.00       3792.00       25600.00
> 16 threads     3792.00       10240.00      20480.00
>
>
> (Patched-nonextending is a manual operation on the file to extend the
> size, mpiio does not support that natively, as far as I know.)
>
>
>
> Results with IOR (HPC quasi standard benchmark)
>
> ior -w -E -k -o /tmp/test/home/hbi/test/test.1 -a mpiio -s 1280 -b 8m -t 8m
>
>
>                unpatched     patched
>                (extending)   (extending)
> -------------------------------------------
>                MB/s          MB/s
> 2 threads      2086.10       2027.76
> 4 threads      1858.94       2132.73
> 8 threads      1792.68       4609.05
> 16 threads     1786.48       8627.96
>
>
> (IOR does not allow manual file extension, without changing its code.)
>
> We can see that patched non-extending gives the best results, as
> Dharmendra has already posted before, but results are still
> much better with the patches in extending mode.
> My assumption here is that, instead of serializing N writers, there is
> an alternating run of
>    - 1 thread extending, N-1 waiting
>    - N-1 writing, 1 thread waiting
> in the patched version.
>

Okay, thanks for the heads up. I queued the patch up for v6.2.

Thanks,
Miklos

>
>
> Thanks,
> Bernd
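
The locking rule discussed in this thread (a shared lock for non-extending direct writes, an exclusive lock for file-size-extending ones) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the kernel code: the names `LockMode` and `pick_lock_mode` are hypothetical, and the sketch assumes a write counts as "extending" exactly when it ends past the current file size.

```python
from enum import Enum


class LockMode(Enum):
    SHARED = "shared"        # many writers may hold the lock at once
    EXCLUSIVE = "exclusive"  # only one writer at a time


def pick_lock_mode(offset: int, length: int, file_size: int) -> LockMode:
    """Hypothetical helper: choose the lock for one direct write.

    Assumption (not taken from the patch itself): the write extends the
    file if and only if its last byte lands beyond the current size.
    """
    extends_file = offset + length > file_size
    return LockMode.EXCLUSIVE if extends_file else LockMode.SHARED


if __name__ == "__main__":
    MIB = 1 << 20
    # An 8 MiB write inside a 16 MiB file does not change the size.
    print(pick_lock_mode(0, 8 * MIB, 16 * MIB))
    # The same write starting at the old end of file grows it.
    print(pick_lock_mode(16 * MIB, 8 * MIB, 16 * MIB))
```

Under this rule, writers into a preallocated file all take the shared lock and run in parallel, while a writer that grows the file briefly excludes the others, which matches the alternation described in the quoted text.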