From: "NeilBrown"
To: "Martin Steigerwald"
Cc: "Miklos Szeredi", "Al Viro", "Christoph Hellwig", "Josef Bacik",
    "J. Bruce Fields", "Chuck Lever", "Chris Mason", "David Sterba",
    linux-fsdevel@vger.kernel.org, "Linux NFS list", "Btrfs BTRFS"
Subject: Re: A Third perspective on BTRFS nfsd subvol dev/inode number issues.
In-reply-to: <3318968.VgehHcluNF@ananda>
References: <162742539595.32498.13687924366155737575.stgit@noble.brown>,
    <162787790940.32159.14588617595952736785@noble.neil.brown.name>,
    <3318968.VgehHcluNF@ananda>
Date: Tue, 03 Aug 2021 07:40:24 +1000
Message-id: <162794042436.32159.11858951186865829131@noble.neil.brown.name>
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, 02 Aug 2021, Martin Steigerwald wrote:
> Hi Neil!
>
> Wow, this is a bit overwhelming for me.
> However, I got a very specific
> question for userspace developers in order to probably provide valuable
> input to the KDE Baloo desktop search developers:
>
> NeilBrown - 02.08.21, 06:18:29 CEST:
> > The "obvious" choice for a replacement is the file handle provided by
> > name_to_handle_at() (falling back to st_ino if name_to_handle_at isn't
> > supported by the filesystem). This returns an extensible opaque
> > byte-array. It is *already* more reliable than st_ino. Comparing
> > st_ino is only a reliable way to check if two files are the same if
> > you have both of them open. If you don't, then one of the files
> > might have been deleted and the inode number reused for the other. A
> > filehandle contains a generation number which protects against this.
> >
> > So I think we need to strongly encourage user-space to start using
> > name_to_handle_at() whenever there is a need to test if two things are
> > the same.
>
> How could that work for Baloo's use case to see whether a file it
> encounters is already in its database or whether it is a new file?
>
> Would Baloo compare the whole file handle or just certain fields or make a
> hash of the filehandle or whatever? Could you, in pseudo code or
> something, describe the approach you'd suggest. I'd then share it on:

Yes, the whole filehandle.

    struct file_handle {
        unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
        int           handle_type;    /* Handle type [out] */
        unsigned char f_handle[0];    /* File identifier (sized by
                                         caller) [out] */
    };

i.e. compare handle_type, handle_bytes, and handle_bytes worth of
f_handle.

This file_handle is local to the filesystem. Two different filesystems
can use the same filehandle for different files. So the identity of the
filesystem needs to be combined with the file_handle.

>
> Bug 438434 - Baloo appears to be indexing twice the number of files than
> are actually in my home directory
>
> https://bugs.kde.org/438434

This bug wouldn't be addressed by using the filehandle.
Using a filehandle allows you to compare two files within a single
filesystem. This bug is about comparing two filesystems either side of
a reboot, to see if they are the same. As has already been mentioned
in that bug, statfs().f_fsid is the best solution (unless comparing the
mount point is satisfactory).

NeilBrown
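
For illustration, here is a minimal sketch of the comparison described above, assuming Linux with glibc. It records statfs().f_fsid together with the whole file handle from name_to_handle_at(), and falls back to st_ino when the filesystem cannot supply a handle. The names struct file_id, file_id_get() and file_id_equal() are hypothetical and are not part of any existing API or of Baloo. Whether f_fsid stays the same across reboots depends on the filesystem, so treat that part as an assumption to verify.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/vfs.h>

/* Hypothetical per-file identity record an indexer could store. */
struct file_id {
    fsid_t        fsid;                   /* filesystem identity (statfs) */
    int           handle_type;            /* -1 marks the st_ino fallback */
    unsigned int  handle_bytes;           /* valid bytes in handle[] */
    unsigned char handle[MAX_HANDLE_SZ];  /* opaque handle, never parsed */
};

/* Fill *out for path. Returns 0 on success, -1 on error (errno is set). */
int file_id_get(const char *path, struct file_id *out)
{
    /* Room for the fixed header plus the largest possible handle. */
    union {
        struct file_handle fh;
        unsigned char buf[sizeof(struct file_handle) + MAX_HANDLE_SZ];
    } u;
    struct statfs sfs;
    struct stat st;
    int mount_id;  /* not stored: mount IDs do not identify a filesystem */

    memset(out, 0, sizeof *out);
    memset(&u, 0, sizeof u);

    if (statfs(path, &sfs) != 0)
        return -1;
    out->fsid = sfs.f_fsid;

    u.fh.handle_bytes = MAX_HANDLE_SZ;
    if (name_to_handle_at(AT_FDCWD, path, &u.fh, &mount_id,
                          AT_SYMLINK_FOLLOW) == 0) {
        out->handle_type = u.fh.handle_type;
        out->handle_bytes = u.fh.handle_bytes;
        memcpy(out->handle,
               u.buf + offsetof(struct file_handle, f_handle),
               u.fh.handle_bytes);
        return 0;
    }

    /* No handle available (e.g. unsupported filesystem): use st_ino. */
    if (stat(path, &st) != 0)
        return -1;
    out->handle_type = -1;
    out->handle_bytes = sizeof st.st_ino;
    memcpy(out->handle, &st.st_ino, sizeof st.st_ino);
    return 0;
}

/* Same file iff filesystem identity and the whole handle both match. */
int file_id_equal(const struct file_id *a, const struct file_id *b)
{
    return memcmp(&a->fsid, &b->fsid, sizeof a->fsid) == 0 &&
           a->handle_type == b->handle_type &&
           a->handle_bytes == b->handle_bytes &&
           memcmp(a->handle, b->handle, a->handle_bytes) == 0;
}
```

A caller would run file_id_get() on each path it encounters and use file_id_equal() (or a lookup keyed on the same fields) to decide whether the file is already known. The handle bytes are compared as a whole and never interpreted, which matches the "opaque byte-array" description quoted above.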