Date: Wed, 28 Jul 2021 15:35:36 -0400
From: "J. Bruce Fields"
To: NeilBrown
Cc: Christoph Hellwig, Josef Bacik, Chuck Lever, Chris Mason, David Sterba,
    Alexander Viro, linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
    linux-btrfs@vger.kernel.org
Subject: Re: [PATCH/RFC 00/11] expose btrfs subvols in mount table correctly
Message-ID: <20210728193536.GD3152@fieldses.org>
In-Reply-To: <162742539595.32498.13687924366155737575.stgit@noble.brown>

I'm still stuck trying to understand why subvolumes can't get their own
superblocks:

	- Why are the performance issues Josef raises insurmountable?
	  And why are they unique to btrfs?  (Surely there are other
	  cases where people need hundreds or thousands of
	  superblocks?)

	- If filehandle decoding can return a different vfs mount than
	  it's passed, why can't it return a different superblock?

--b.

On Wed, Jul 28, 2021 at 08:37:45AM +1000, NeilBrown wrote:
> There are long-standing problems with btrfs subvols, particularly in
> relation to whether and how they are exposed in the mount table.
>
>  - /proc/self/mountinfo reports the major:minor device number for each
>    filesystem and when a btrfs subvol is explicitly mounted, the number
>    reported is wrong - it does not match what stat() reports for the
>    mountpoint.
>
>  - when subvols are not explicitly mounted, they don't appear in
>    mountinfo at all.
>
> Consequences include that a tool which uses stat() to find the dev of the
> filesystem, then searches mountinfo for that filesystem, will not find
> it.
>
> Some tools (e.g. findmnt) appear to have been enhanced to cope with this
> strangeness, but it would be best to make btrfs behave more normally.
>
>  - nfsd cannot currently see the transition to subvol, so reports the
>    main volume and all subvols to the client as being in the same
>    filesystem.  As inode numbers are not unique across all subvols,
>    this can confuse clients.  In particular, 'find' is likely to report a
>    loop.
>
> subvols can be made to appear in mountinfo using automounts.  However
> nfsd does not cope well with automounts.  It assumes all filesystems to
> be exported are already mounted.  So adding automounts to btrfs would
> break nfsd.
>
> We can enhance nfsd to understand that some automounts can be managed.
> "internal mounts", where a filesystem provides an automount point and
> mounts its own directories, can be handled differently by nfsd.
>
> This series addresses all these issues.  After a few enhancements to the
> VFS to provide needed support, the patches enhance exportfs and nfsd to
> cope with the concept of internal mounts, and then enhance btrfs to
> provide them.
>
> The NFSv3 support is incomplete.  I'm not sure we can make it work
> "perfectly".  A normal nfsv3 mount seems to work well enough, but if
> mounted with '-o noac', it loses track of the mounted-on inode number
> and complains about inode numbers changing.
>
> My basic test for these is to mount a btrfs filesystem which contains
> subvols, nfs-export it and mount it with nfsv3 and nfsv4, then run
> 'find' in each of the filesystems and check the contents of
> /proc/self/mountinfo.
>
> The first patch simply fixes the dev number in mountinfo and could
> possibly be tagged for -stable.
>
> NeilBrown
>
> ---
>
> NeilBrown (11):
>       VFS: show correct dev num in mountinfo
>       VFS: allow d_automount to create in-place bind-mount.
>       VFS: pass lookup_flags into follow_down()
>       VFS: export lookup_mnt()
>       VFS: new function: mount_is_internal()
>       nfsd: include a vfsmount in struct svc_fh
>       exportfs: Allow filehandle lookup to cross internal mount points.
>       nfsd: change get_parent_attributes() to nfsd_get_mounted_on()
>       nfsd: Allow filehandle lookup to cross internal mount points.
>       btrfs: introduce mapping function from location to inum
>       btrfs: use automount to bind-mount all subvol roots.
>
>
>  fs/btrfs/btrfs_inode.h   |  12 +++
>  fs/btrfs/inode.c         | 111 ++++++++++++++++++++++++++-
>  fs/btrfs/super.c         |   1 +
>  fs/exportfs/expfs.c      | 100 ++++++++++++++++++++----
>  fs/fhandle.c             |   2 +-
>  fs/internal.h            |   1 -
>  fs/namei.c               |   6 +-
>  fs/namespace.c           |  32 +++++++-
>  fs/nfsd/export.c         |   4 +-
>  fs/nfsd/nfs3xdr.c        |  40 +++++++---
>  fs/nfsd/nfs4proc.c       |   9 ++-
>  fs/nfsd/nfs4xdr.c        | 106 ++++++++++++-------------
>  fs/nfsd/nfsfh.c          |  44 +++++++----
>  fs/nfsd/nfsfh.h          |   3 +-
>  fs/nfsd/nfsproc.c        |   5 +-
>  fs/nfsd/vfs.c            | 162 +++++++++++++++++++++++----------------
>  fs/nfsd/vfs.h            |  12 +--
>  fs/nfsd/xdr4.h           |   2 +-
>  fs/overlayfs/namei.c     |   5 +-
>  fs/xfs/xfs_ioctl.c       |  12 ++-
>  include/linux/exportfs.h |   4 +-
>  include/linux/mount.h    |   4 +
>  include/linux/namei.h    |   2 +-
>  23 files changed, 490 insertions(+), 189 deletions(-)
>
> --
> Signature
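
For readers unfamiliar with the lookup pattern the quoted cover letter
refers to (stat() a path to get its device number, then search mountinfo
for that number), here is a rough userspace sketch in Python.  It is only
an illustration, not code from the series: the names parse_mountinfo,
find_by_dev and lookup_path are made up for this example, and the field
layout is assumed to follow the mountinfo description in proc(5).

```python
import os
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    major: int
    minor: int
    root: str
    mount_point: str
    fstype: str


def parse_mountinfo(text: str) -> List[MountEntry]:
    """Parse mountinfo-style text (layout assumed from proc(5))."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        try:
            # Optional fields end at a lone "-"; it cannot appear
            # before index 6 in a well-formed line.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # skip lines that do not match the expected layout
        major, minor = fields[2].split(":")
        entries.append(
            MountEntry(int(major), int(minor), fields[3], fields[4],
                       fields[sep + 1])
        )
    return entries


def find_by_dev(entries: List[MountEntry], major: int,
                minor: int) -> List[MountEntry]:
    """Return every mount whose major:minor matches the given device."""
    return [e for e in entries if (e.major, e.minor) == (major, minor)]


def lookup_path(path: str,
                mountinfo_path: str = "/proc/self/mountinfo"
                ) -> List[MountEntry]:
    """stat() a path, then search mountinfo for the same device number."""
    st = os.stat(path)
    with open(mountinfo_path) as f:
        entries = parse_mountinfo(f.read())
    return find_by_dev(entries, os.major(st.st_dev), os.minor(st.st_dev))
```

If the device number in mountinfo differs from what stat() returns, as
the cover letter reports for explicitly mounted btrfs subvols,
lookup_path() would be expected to come back with an empty list for a
path inside such a subvol.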