Date: Fri, 22 Sep 2023 13:31:36 -0400
From: Kent Overstreet
To: Jeff Layton
Cc: Alexander Viro, Christian Brauner, Chuck Lever, Neil Brown,
    Olga Kornievskaia, Dai Ngo, Tom Talpey, Chandan Babu R,
    "Darrick J. Wong", Dave Chinner, Jan Kara, Linus Torvalds,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v8 1/5] fs: add infrastructure for multigrain timestamps
Message-ID: <20230922173136.qpodogsb26wq3ujj@moria.home.lan>
References: <20230922-ctime-v8-0-45f0c236ede1@kernel.org>
    <20230922-ctime-v8-1-45f0c236ede1@kernel.org>
In-Reply-To: <20230922-ctime-v8-1-45f0c236ede1@kernel.org>

On Fri, Sep 22, 2023 at 01:14:40PM -0400, Jeff Layton wrote:
> The VFS always uses coarse-grained timestamps when updating the ctime
> and mtime after a change. This has the benefit of allowing filesystems
> to optimize away a lot of metadata updates, down to around 1 per jiffy,
> even when a file is under heavy writes.
>
> Unfortunately, this has always been an issue when we're exporting via
> NFS, which traditionally relied on timestamps to validate caches. A lot
> of changes can happen in a jiffy, and that can lead to cache-coherency
> issues between hosts.
>
> NFSv4 added a dedicated change attribute that must change value after
> any change to an inode. Some filesystems (btrfs, ext4 and tmpfs) utilize
> the i_version field for this, but the NFSv4 spec allows a server to
> generate this value from the inode's ctime.
>
> What we need is a way to only use fine-grained timestamps when they are
> being actively queried.
>
> POSIX generally mandates that when the mtime changes, the ctime must
> also change. The kernel always stores normalized ctime values, so only
> the first 30 bits of the tv_nsec field are ever used.
>
> Use the 31st bit of the ctime tv_nsec field to indicate that something
> has queried the inode for the mtime or ctime. When this flag is set,
> on the next mtime or ctime update, the kernel will fetch a fine-grained
> timestamp instead of the usual coarse-grained one.
>
> Filesystems can opt into this behavior by setting the FS_MGTIME flag in
> the fstype. Filesystems that don't set this flag will continue to use
> coarse-grained timestamps.

Interesting...

So in bcachefs, for most inode fields the btree inode is the "master
copy"; we do inode updates via btree transactions, and then on
successful transaction commit we update the VFS inode to match.
(The exceptions are i_size and i_blocks.)

I'd been contemplating switching to that model for timestamp updates as
well, since that would allow us to get rid of our
super_operations.write_inode method - except we probably wouldn't want
to do that, since it would likely make timestamp updates too expensive.

And now with your scheme of stashing extra state in the timespec, I'm
glad we didn't.

Still, timestamp updates are a bit messier than I'd like; it would be
lovely to figure out a way to clean that up. Right now we have an
awkward mix of "sometimes timestamp updates happen in a btree
transaction first, other times just the VFS inode is updated and marked
dirty".

xfs doesn't have .write_inode, so it's probably time to study what it
does...
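
For anyone following along, the flag-in-tv_nsec scheme from the quoted
description can be sketched in plain userspace C roughly as below. This
is only an illustration of the idea, not the code from the patch: every
name (sketch_inode, sketch_query_ctime, SKETCH_CTIME_QUERIED, ...) is
made up here, the coarse and fine clock readings are passed in by the
caller instead of being read from a real clock, and concurrency is
ignored entirely, which real kernel code could not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; none are taken from the patch. */
#define SKETCH_NSEC_PER_SEC   1000000000u
/* A normalized nsec value is below 10^9 < 2^30, so bit 30 (the "31st bit") is free. */
#define SKETCH_CTIME_QUERIED  (1u << 30)

struct sketch_time {
	int64_t  sec;
	uint32_t nsec;
};

struct sketch_inode {
	int64_t  ctime_sec;
	uint32_t ctime_nsec;	/* bits 0-29: nanoseconds, bit 30: queried flag */
};

/* Reader side (think getattr): report the ctime and remember that someone looked. */
static struct sketch_time sketch_query_ctime(struct sketch_inode *inode)
{
	struct sketch_time t = {
		.sec  = inode->ctime_sec,
		.nsec = inode->ctime_nsec & ~SKETCH_CTIME_QUERIED,
	};

	inode->ctime_nsec |= SKETCH_CTIME_QUERIED;
	return t;
}

/*
 * Writer side: take the fine-grained reading only if the ctime has been
 * queried since the last update, otherwise the coarse one. Storing the
 * new normalized value also clears the flag. Returns true when the
 * fine-grained reading was used.
 */
static bool sketch_update_ctime(struct sketch_inode *inode,
				struct sketch_time coarse,
				struct sketch_time fine)
{
	bool queried = inode->ctime_nsec & SKETCH_CTIME_QUERIED;
	struct sketch_time now = queried ? fine : coarse;

	assert(now.nsec < SKETCH_NSEC_PER_SEC);
	inode->ctime_sec = now.sec;
	inode->ctime_nsec = now.nsec;
	return queried;
}
```

With this shape, back-to-back updates with no reader in between keep
getting the cheap coarse value, and only the first update after a query
pays for a fine-grained one.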