Subject: Re: [PATCH v8 0/5] fs: multigrain timestamps for XFS's change_cookie
From: Jeff Layton
To: Amir Goldstein
Cc: Alexander Viro, Christian Brauner, Chuck Lever, Neil Brown,
    Olga Kornievskaia, Dai Ngo, Tom Talpey, Chandan Babu R,
    "Darrick J. Wong", Dave Chinner, Jan Kara, Linus Torvalds,
    Kent Overstreet, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
    linux-xfs@vger.kernel.org
Date: Mon, 25 Sep 2023 06:08:27 -0400
References: <20230922-ctime-v8-0-45f0c236ede1@kernel.org>
    <4b106847d5202aec0e14fdbbe93b070b7ea97477.camel@kernel.org>

On Sat, 2023-09-23 at 17:58 +0300, Amir Goldstein wrote:
> On Sat, Sep 23, 2023 at 1:22 PM Jeff Layton wrote:
> >
> > On Sat, 2023-09-23 at 10:15 +0300, Amir Goldstein wrote:
> > > On Fri, Sep 22, 2023 at 8:15 PM Jeff Layton wrote:
> > > >
> > > > My initial goal was to implement multigrain timestamps on most major
> > > > filesystems, so we could present them to userland, and use them for
> > > > NFSv3, etc.
> > > >
> > > > With the current implementation however, we can't guarantee that a file
> > > > with a coarse grained timestamp modified after one with a fine grained
> > > > timestamp will always appear to have a later value. This could confuse
> > > > some programs like make, rsync, find, etc. that depend on strict
> > > > ordering requirements for timestamps.
> > > >
> > > > The goal of this version is more modest: fix XFS' change attribute.
> > > > XFS's change attribute is bumped on atime updates in addition to other
> > > > deliberate changes. This makes it unsuitable for export via nfsd.
> > > >
> > > > Jan Kara suggested keeping this functionality internal-only for now and
> > > > plumbing the fine grained timestamps through getattr [1]. This set takes
> > > > a slightly different approach and has XFS use the fine-grained attr to
> > > > fake up STATX_CHANGE_COOKIE in its getattr routine itself.
> > > >
> > > > While we keep fine-grained timestamps in struct inode, when presenting
> > > > the timestamps via getattr, we truncate them at a granularity of number
> > > > of ns per jiffy,
> > >
> > > That's not good, because user explicitly set granular mtime would be
> > > truncated too and booting with different kernels (HZ) would change
> > > the observed timestamps of files.
> > >
> >
> > That's a very good point.
> >
> > > > which allows us to smooth over the fuzz that causes
> > > > ordering problems.
> > > >
> > >
> > > The reported ordering problems (i.e. cp -u) is not even limited to the
> > > scope of a single fs, right?
> > >
> >
> > It isn't. Most of the tools we're concerned with don't generally care
> > about filesystem boundaries.
> >
> > > Thinking out loud - if the QUERIED bit was not per inode timestamp
> > > but instead in a global fs_multigrain_ts variable, then all the inodes
> > > of all the mgtime fs would be using globally ordered timestamps
> > >
> > > That should eliminate the reported issues with time reorder for
> > > fine vs coarse grained timestamps.
> > >
> > > The risk of extra unneeded "change cookie" updates compared to
> > > per inode QUERIED bit may exist, but I think it is a rather small overhead
> > > and maybe worth the tradeoff of having to maintain a real per inode
> > > "change cookie" in addition to a "globally ordered mgtime"?
> > >
> > > If this idea is acceptable, you may still be able to salvage the reverted
> > > ctime series for 6.7, because the change to use global mgtime should
> > > be quite trivial?
> > >
> >
> > This is basically the idea I was going to look at next once I got some
> > other stuff settled here: Basically, when we apply a fine-grained
> > timestamp to an inode, we'd advance the coarse-grained clock that
> > filesystems use to that value.
> >
> > It could cause some write amplification: if you are streaming writes to
> > a bunch of files at the same time and someone stats one of them, then
> > they'd all end up getting an extra inode transaction. That doesn't sound
> > _too_ bad on its face, but I probably need to implement it and then run
> > some numbers to see.
> >
>
> Several journal transactions within a single jiffy tick?
> If ctime/change_cookie of an inode is updated once within the scope
> of a single running transaction, I don't think it matters how many
> times it would be updated, but maybe I am missing something.
>
> The problem is probably going to be that the seqlock of the coarse
> grained clock is going to be invalidated way too frequently to be
> "read mostly" in the presence of ls -lR workload, but again, I did
> not study the implementation, so I may be way off.
>

That may end up being the case, but I think if we can minimize the
number of fine-grained updates, then the number of invalidations will be
minimal too.

I haven't rolled an implementation of this yet. This is all very much
still in the "waving of hands" stage anyway.

Once the dust settles from the atime and mtime API rework, I may still
take a stab at doing this.
-- 
Jeff Layton
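
For illustration only, the ordering problem and the "advance the coarse-grained clock" idea discussed in this thread can be modelled in a few lines of userspace Python. This is a toy sketch, not kernel code and not the patch series' implementation: the names Clock, Inode, getattr_ctime, update_ctime and scenario are hypothetical, the jiffy length assumes HZ=250, and the rule "use a fine-grained stamp only if the inode was queried" is a simplification of the QUERIED-bit scheme described above.

```python
from dataclasses import dataclass

NSEC_PER_JIFFY = 4_000_000  # assumption: HZ=250, chosen only for illustration


@dataclass
class Inode:
    ctime: int = 0         # nanoseconds
    queried: bool = False  # set when ctime has been observed via getattr


class Clock:
    """Toy clock; 'now' is set by hand so the example is deterministic."""

    def __init__(self, use_floor: bool):
        self.now = 0        # fine-grained time in ns
        self.floor = 0      # latest fine-grained value handed out so far
        self.use_floor = use_floor

    def coarse(self) -> int:
        # Coarse time is the start of the current jiffy ...
        tick = self.now - self.now % NSEC_PER_JIFFY
        # ... optionally pushed forward to the last fine-grained stamp.
        return max(tick, self.floor) if self.use_floor else tick

    def fine(self) -> int:
        if self.use_floor:
            self.floor = max(self.floor, self.now)
        return self.now


def getattr_ctime(inode: Inode) -> int:
    inode.queried = True
    return inode.ctime


def update_ctime(inode: Inode, clock: Clock) -> None:
    # Simplified policy: fine-grained stamp only if someone looked at ctime.
    inode.ctime = clock.fine() if inode.queried else clock.coarse()
    inode.queried = False


def scenario(use_floor: bool) -> tuple[int, int]:
    clock = Clock(use_floor)
    a, b = Inode(), Inode()
    getattr_ctime(a)              # someone stats A
    clock.now = 5_500_000         # 1.5 ms into the second jiffy
    update_ctime(a, clock)        # A gets a fine-grained stamp
    clock.now += 1_000_000        # later, but still the same jiffy
    update_ctime(b, clock)        # B gets a coarse-grained stamp
    return a.ctime, b.ctime


if __name__ == "__main__":
    print("per-inode only:", scenario(use_floor=False))
    print("with global floor:", scenario(use_floor=True))
```

Without the floor, B is modified after A yet ends up with the earlier timestamp, which is the reordering that tools like make or cp -u could trip over. With the floor, B's coarse stamp is never older than the last fine-grained stamp handed out, at the cost of every fine-grained stamp touching shared clock state.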