2018-03-28 01:21:00

by Dai Qizhi

Subject: Re: client side see wrong directory entries with nfs_export on overlayfs

> 2018-03-27 13:24 GMT+03:00 Dai Qizhi <[email protected]>:
> > Dear Amir Goldstein,
> >
>
> Hi Dai,
>
> Do you mind if I re-post your question to the overlayfs mailing list?
> There are other people out there that would encounter the same
> problem and will benefit from my answer.

No problem.
>
> > I tested overlayfs with the nfs_export feature. The kernel is built by
> > Ubuntu, 4.16.0-041600rc7, http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc7/
> >
> > The problem is that when I export two different overlay mount points on
> > the server side and then mount them over NFS on the client side, the client
> > side views the two different mount points as the same.
> >
> >
> > on server side:
> >
> > none on /mnt/m1 type overlay (rw,relatime,lowerdir=m1/ro,upperdir=m1/rw,workdir=
> > m1/w,index=on,nfs_export=on)
> > none on /mnt/m2 type overlay (rw,relatime,lowerdir=m2/ro,upperdir=m2/rw,workdir=
> > m2/w,index=on,nfs_export=on)
> > [root@localhost data]# exportfs -rv
> > exporting *:/mnt/m4
> > exporting *:/mnt/m3
> > exporting *:/mnt/m2
> > exporting *:/mnt/m1
> > [root@localhost data]# ls /mnt/m1
> > this_is_m1
> > [root@localhost data]# ls /mnt/m2
> > this_is_m2
> >
> >
> > on client side:
> >
> > [root@localhost mnt]# mount 192.168.0.1:/mnt/m1 m1
> > [root@localhost mnt]# mount 192.168.0.1:/mnt/m2 m2
> > [root@localhost mnt]# mount |grep 192
> > 192.168.0.1:/mnt/m1 on /mnt/m1 type nfs (rw,relatime,vers=3,rsize=524288,wsize=5
> > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.
> > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=192.168.0.1)
> > 192.168.0.1:/mnt/m1 on /mnt/m2 type nfs (rw,relatime,vers=3,rsize=524288,wsize=5
> > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.
> > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=192.168.0.1)
> >
> > [root@localhost mnt]# ls /mnt/m1
> > this_is_m1
> > [root@localhost mnt]# ls /mnt/m2
> > this_is_m1
> > [root@localhost mnt]#
>
>
> Please refer to the man page of exports(5), the section about 'fsid' describes
> this problem related to exporting file systems that are not on a block device,
> such as overlayfs.

Yes, exporting each mount point with a different fsid= option works as
expected.
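
For readers who want to apply this workaround, here is a minimal sketch of
generating /etc/exports entries that give each overlay mount point its own
fsid. The helper name, the "rw,no_subtree_check" options and the choice to
start numbering at 1 (leaving fsid=0 for an NFSv4 root export) are
assumptions made for illustration and are not taken from this thread.

```python
def exports_lines(mount_points, client="*",
                  options=("rw", "no_subtree_check"), first_fsid=1):
    """Build one /etc/exports line per mount point, each with a unique fsid.

    Hypothetical helper: the option set and numbering scheme are assumptions.
    """
    lines = []
    for offset, path in enumerate(mount_points):
        # A distinct fsid per export lets the server tell the overlay
        # mounts apart even though none of them sits on a block device.
        opts = ",".join(tuple(options) + (f"fsid={first_fsid + offset}",))
        lines.append(f"{path} {client}({opts})")
    return lines


if __name__ == "__main__":
    for line in exports_lines(["/mnt/m1", "/mnt/m2"]):
        print(line)
```

The printed lines would be pasted into /etc/exports and followed by
"exportfs -rv", as in the server-side listing quoted above.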


>
> If you are interested to know if there is a way to fix this that does
> not involve manually
> configuring different fsid per export, I will have to consult with the
> NFS experts, so please
> reply to this message with CC to <[email protected]> and
> <[email protected]>

When exporting a parent directory with the crossmnt option and mounting
different overlayfs instances under that directory, we encounter the same
problem on the client side.

>
> Thanks,
> Amir.

Best regards,
Dai Qizhi



2018-03-28 04:45:50

by Amir Goldstein

Subject: Re: client side see wrong directory entries with nfs_export on overlayfs

On Wed, Mar 28, 2018 at 4:14 AM, Dai Qizhi <[email protected]> wrote:
[...]
>>
>> > I tested overlayfs with nfs_export feature , the kernel is built by
>> > ubuntu , 4.16.0-041600rc7, http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc7/
>> >
>> > the problem is , when i export two different overlay mount points on
>> > server side, then mount them use nfs on the client side, the client side
>> > views the two different mount points as the same.
>> >
>> >
>> > on server side:
>> >
>> > none on /mnt/m1 type overlay (rw,relatime,lowerdir=m1/ro,upperdir=m1/rw,workdir=
>> > m1/w,index=on,nfs_export=on)
>> > none on /mnt/m2 type overlay (rw,relatime,lowerdir=m2/ro,upperdir=m2/rw,workdir=
>> > m2/w,index=on,nfs_export=on)
>> > [root@localhost data]# exportfs -rv
>> > exporting *:/mnt/m4
>> > exporting *:/mnt/m3
>> > exporting *:/mnt/m2
>> > exporting *:/mnt/m1
>> > [root@localhost data]# ls /mnt/m1
>> > this_is_m1
>> > [root@localhost data]# ls /mnt/m2
>> > this_is_m2
>> >
>> >
>> > on client side:
>> >
>> > [root@localhost mnt]# mount 192.168.0.1:/mnt/m1 m1
>> > [root@localhost mnt]# mount 192.168.0.1:/mnt/m2 m2
>> > [root@localhost mnt]# mount |grep 192
>> > 192.168.0.1:/mnt/m1 on /mnt/m1 type nfs (rw,relatime,vers=3,rsize=524288,wsize=5
>> > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.
>> > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=192.168.0.1)
>> > 192.168.0.1:/mnt/m1 on /mnt/m2 type nfs (rw,relatime,vers=3,rsize=524288,wsize=5
>> > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.
>> > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=192.168.0.1)
>> >
>> > [root@localhost mnt]# ls /mnt/m1
>> > this_is_m1
>> > [root@localhost mnt]# ls /mnt/m2
>> > this_is_m1
>> > [root@localhost mnt]#
>>
>>
>> Please refer to the man page of exports(5), the section about 'fsid' describes
>> this problem related to exporting file systems that are not on a block device,
>> such as overlayfs.
>
> yes, export different mount point with different fsid= option works as
> expected.
>
>
>>
>> If you are interested to know if there is a way to fix this that does
>> not involve manually
>> configuring different fsid per export, I will have to consult with the
>> NFS experts, so please
>> reply to this message with CC to <[email protected]> and
>> <[email protected]>
>
> when exporting parent directory with crossmnt option and mount different overlayfs
> under that directory, we encounter the same problem on client side..
>

I see. Jeff, Bruce, is there a textbook solution to this issue?

Is there a way for a non-blockdev export to automatically identify itself
to knfsd? After all, the tuple ("overlayfs", <overlay root file handle>) should
be unique on the server. The <overlay root file handle> (struct ovl_fh)
contains the upper fs UUID and the upper root dir file handle.

Technically, if there is no out-of-tree fs on the system that is using the
value OVL_FILEID (0xfb) for its file handle type, <overlay root file handle>
itself would be unique.

Thanks,
Amir.

2018-03-28 11:15:46

by Jeffrey Layton

Subject: Re: client side see wrong directory entries with nfs_export on overlayfs

On Wed, 2018-03-28 at 07:45 +0300, Amir Goldstein wrote:
> On Wed, Mar 28, 2018 at 4:14 AM, Dai Qizhi <[email protected]>
> wrote:
> [...]
> > >
> > > > I tested overlayfs with nfs_export feature , the kernel is
> > > > built by
> > > > ubuntu , 4.16.0-041600rc7, http://kernel.ubuntu.com/~kernel-ppa
> > > > /mainline/v4.16-rc7/
> > > >
> > > > the problem is , when i export two different overlay mount
> > > > points on
> > > > server side, then mount them use nfs on the client side, the
> > > > client side
> > > > views the two different mount points as the same.
> > > >
> > > >
> > > > on server side:
> > > >
> > > > none on /mnt/m1 type overlay
> > > > (rw,relatime,lowerdir=m1/ro,upperdir=m1/rw,workdir=
> > > > m1/w,index=on,nfs_export=on)
> > > > none on /mnt/m2 type overlay
> > > > (rw,relatime,lowerdir=m2/ro,upperdir=m2/rw,workdir=
> > > > m2/w,index=on,nfs_export=on)
> > > > [root@localhost data]# exportfs -rv
> > > > exporting *:/mnt/m4
> > > > exporting *:/mnt/m3
> > > > exporting *:/mnt/m2
> > > > exporting *:/mnt/m1
> > > > [root@localhost data]# ls /mnt/m1
> > > > this_is_m1
> > > > [root@localhost data]# ls /mnt/m2
> > > > this_is_m2
> > > >
> > > >
> > > > on client side:
> > > >
> > > > [root@localhost mnt]# mount 192.168.0.1:/mnt/m1 m1
> > > > [root@localhost mnt]# mount 192.168.0.1:/mnt/m2 m2
> > > > [root@localhost mnt]# mount |grep 192
> > > > 192.168.0.1:/mnt/m1 on /mnt/m1 type nfs
> > > > (rw,relatime,vers=3,rsize=524288,wsize=5
> > > > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mou
> > > > ntaddr=192.168.0.
> > > > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,ad
> > > > dr=192.168.0.1)
> > > > 192.168.0.1:/mnt/m1 on /mnt/m2 type nfs
> > > > (rw,relatime,vers=3,rsize=524288,wsize=5
> > > > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mou
> > > > ntaddr=192.168.0.
> > > > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,ad
> > > > dr=192.168.0.1)
> > > >
> > > > [root@localhost mnt]# ls /mnt/m1
> > > > this_is_m1
> > > > [root@localhost mnt]# ls /mnt/m2
> > > > this_is_m1
> > > > [root@localhost mnt]#
> > >
> > >
> > > Please refer to the man page of exports(5), the section about
> > > 'fsid' describes
> > > this problem related to exporting file systems that are not on a
> > > block device,
> > > such as overlayfs.
> >
> > yes, export different mount point with different fsid= option works
> > as
> > expected.
> >
> >
> > >
> > > If you are interested to know if there is a way to fix this that
> > > does
> > > not involve manually
> > > configuring different fsid per export, I will have to consult
> > > with the
> > > NFS experts, so please
> > > reply to this message with CC to <[email protected]>
> > > and
> > > <[email protected]>
> >
> > when exporting parent directory with crossmnt option and mount
> > different overlayfs
> > under that directory, we encounter the same problem on client
> > side..
> >
>
> I see. Jeff, Bruce, is there a school book solution to this issue?
>
> Is there a way for a non blockdev export to automatically identify
> itself
> to knfsd? After all, the tuple ("overlayfs";<overlay root file
> handle>) should
> be unique on the server. <overlay root file handle> contains (struct
> ovl_fh)
> the upper fs UUID and the upper root dir file handle.
>
> Technically, if there is no out of tree fs on the system that is
> using the
> value OVL_FILEID (0xfb) for file handle type <overlay root file
> handle>
> itself would be unique.
>
> Thanks,
> Amir.

tl;dr: not currently, which is why when I did the reexport patches a
few years ago, they _required_ that you manually set the fsid= export
option.

Longer story:

Long, long ago, the fsid for the export was almost always determined by
the device major/minor tuple. That became really problematic whenever
devices got reordered after adding a disk to the system and rebooting.
So, Neil Brown added the ability to determine the fsid from the
libblkid uuid (see nfs-utils commit e91ff0175602c, and kernel commits
from around that time).

In principle, you could do something similar for overlayfs: add a new
FSID_* type for overlayfs that can reliably determine a unique fsid for
different overlays. That would require kernel and userland patches, of
course...
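
To make the idea concrete, here is a rough userspace sketch of how a stable,
per-overlay identifier could be derived from the two pieces of information
mentioned earlier in the thread: the upper fs UUID and the upper root dir
file handle. This is only an illustration under stated assumptions. It is
not how knfsd or nfs-utils work today, the function name is hypothetical,
and the inputs are made-up sample values.

```python
import uuid


def overlay_fsid(upper_fs_uuid: str, upper_root_fh: bytes) -> uuid.UUID:
    """Derive a deterministic UUID for one overlay mount.

    Hypothetical sketch: hashes the upper root dir file handle into a
    version 5 UUID, using the upper fs UUID as the namespace.
    """
    namespace = uuid.UUID(upper_fs_uuid)
    # The "overlayfs:" prefix mirrors the ("overlayfs", <root file handle>)
    # tuple discussed above; the hex form keeps the name a plain string.
    return uuid.uuid5(namespace, "overlayfs:" + upper_root_fh.hex())


if __name__ == "__main__":
    fs_uuid = "123e4567-e89b-12d3-a456-426614174000"  # sample value
    print(overlay_fsid(fs_uuid, bytes.fromhex("0001")))
    print(overlay_fsid(fs_uuid, bytes.fromhex("0002")))
```

Two overlays whose upper directories live on the same filesystem share the
namespace UUID but have different root file handles, so they would still get
different identifiers, and the result stays the same across reboots as long
as both inputs do.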

--
Jeff Layton <[email protected]>

2018-03-28 13:57:10

by Amir Goldstein

Subject: Re: client side see wrong directory entries with nfs_export on overlayfs

On Wed, Mar 28, 2018 at 2:15 PM, Jeff Layton <[email protected]> wrote:
> On Wed, 2018-03-28 at 07:45 +0300, Amir Goldstein wrote:
>> On Wed, Mar 28, 2018 at 4:14 AM, Dai Qizhi <[email protected]>
>> wrote:
>> [...]
>> > >
>> > > > I tested overlayfs with nfs_export feature , the kernel is
>> > > > built by
>> > > > ubuntu , 4.16.0-041600rc7, http://kernel.ubuntu.com/~kernel-ppa
>> > > > /mainline/v4.16-rc7/
>> > > >
>> > > > the problem is , when i export two different overlay mount
>> > > > points on
>> > > > server side, then mount them use nfs on the client side, the
>> > > > client side
>> > > > views the two different mount points as the same.
>> > > >
>> > > >
>> > > > on server side:
>> > > >
>> > > > none on /mnt/m1 type overlay
>> > > > (rw,relatime,lowerdir=m1/ro,upperdir=m1/rw,workdir=
>> > > > m1/w,index=on,nfs_export=on)
>> > > > none on /mnt/m2 type overlay
>> > > > (rw,relatime,lowerdir=m2/ro,upperdir=m2/rw,workdir=
>> > > > m2/w,index=on,nfs_export=on)
>> > > > [root@localhost data]# exportfs -rv
>> > > > exporting *:/mnt/m4
>> > > > exporting *:/mnt/m3
>> > > > exporting *:/mnt/m2
>> > > > exporting *:/mnt/m1
>> > > > [root@localhost data]# ls /mnt/m1
>> > > > this_is_m1
>> > > > [root@localhost data]# ls /mnt/m2
>> > > > this_is_m2
>> > > >
>> > > >
>> > > > on client side:
>> > > >
>> > > > [root@localhost mnt]# mount 192.168.0.1:/mnt/m1 m1
>> > > > [root@localhost mnt]# mount 192.168.0.1:/mnt/m2 m2
>> > > > [root@localhost mnt]# mount |grep 192
>> > > > 192.168.0.1:/mnt/m1 on /mnt/m1 type nfs
>> > > > (rw,relatime,vers=3,rsize=524288,wsize=5
>> > > > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mou
>> > > > ntaddr=192.168.0.
>> > > > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,ad
>> > > > dr=192.168.0.1)
>> > > > 192.168.0.1:/mnt/m1 on /mnt/m2 type nfs
>> > > > (rw,relatime,vers=3,rsize=524288,wsize=5
>> > > > 24288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mou
>> > > > ntaddr=192.168.0.
>> > > > 1,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,ad
>> > > > dr=192.168.0.1)
>> > > >
>> > > > [root@localhost mnt]# ls /mnt/m1
>> > > > this_is_m1
>> > > > [root@localhost mnt]# ls /mnt/m2
>> > > > this_is_m1
>> > > > [root@localhost mnt]#
>> > >
>> > >
>> > > Please refer to the man page of exports(5), the section about
>> > > 'fsid' describes
>> > > this problem related to exporting file systems that are not on a
>> > > block device,
>> > > such as overlayfs.
>> >
>> > yes, export different mount point with different fsid= option works
>> > as
>> > expected.
>> >
>> >
>> > >
>> > > If you are interested to know if there is a way to fix this that
>> > > does
>> > > not involve manually
>> > > configuring different fsid per export, I will have to consult
>> > > with the
>> > > NFS experts, so please
>> > > reply to this message with CC to <[email protected]>
>> > > and
>> > > <[email protected]>
>> >
>> > when exporting parent directory with crossmnt option and mount
>> > different overlayfs
>> > under that directory, we encounter the same problem on client
>> > side..
>> >
>>
>> I see. Jeff, Bruce, is there a school book solution to this issue?
>>
>> Is there a way for a non blockdev export to automatically identify
>> itself
>> to knfsd? After all, the tuple ("overlayfs";<overlay root file
>> handle>) should
>> be unique on the server. <overlay root file handle> contains (struct
>> ovl_fh)
>> the upper fs UUID and the upper root dir file handle.
>>
>> Technically, if there is no out of tree fs on the system that is
>> using the
>> value OVL_FILEID (0xfb) for file handle type <overlay root file
>> handle>
>> itself would be unique.
>>
>> Thanks,
>> Amir.
>
> tl;dr: not currently, which is why when I did the reexport patches a
> few years ago, they _required_ that you manually set the fsid= export
> option.
>
> Longer story:
>
> Long, long ago, the fsid for the export was almost always determined by
> the device major/minor tuple. That became really problematic whenever
> devices got reordered after adding a disk to the system and rebooting.
> So, Neil Brown added the ability to determine the fsid from the
> libblkid uuid (see nfs-utils commit e91ff0175602c, and kernel commits
> from around that time).
>
> In principle, you could do something similar for overlayfs: add a new
> FSID_* type for overlayfs that can reliably determine a unique fsid for
> different overlays. That would require kernel and userland patches, of
> course...
>

That's good to know, but I guess I won't be doing any of that anytime soon.
Thanks!
Amir.

2018-03-29 16:45:25

by Patrick Goetz

Subject: Re: client side see wrong directory entries with nfs_export on overlayfs

On 03/28/2018 06:15 AM, Jeff Layton wrote:
> Long, long ago, the fsid for the export was almost always determined by
> the device major/minor tuple. That became really problematic whenever
> devices got reordered after adding a disk to the system and rebooting.
> So, Neil Brown added the ability to determine the fsid from the
> libblkid uuid (see nfs-utils commit e91ff0175602c, and kernel commits
> from around that time).
>

I recently got burned by this, because it appears that bind-mounted XFS
exports somehow behave differently from bind-mounted ext4 exports. That is, I
was able to export one bind-mounted filesystem (under an NFSv4 root
export) with no fsid, but when I tried to add another one (also a bind
mount under the same root), the export failed in an odd way. The only
difference was that in the first case the underlying filesystem was ext4 and
in the second it was XFS. I'll post a question about this later, but for now ...

This seems like bad design. Wouldn't it make more sense to require that
an fsid be assigned to all exports? Then there would be no question in the
user's mind about whether an fsid is needed.