Note: I have cases open with Red Hat and NetApp, but I was curious whether
other people have also seen inconsistent mount attributes (ro/rw) when
mounting a NetApp filer running ONTAP 7.2 from a RHEL5 client.
---+ System environment
[greg@adcgar04 greg]$ uname -a
Linux adcgar04 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:14 EST 2007 x86_64
unknown unknown GNU/Linux
[greg@adcgar04 greg]$ rpm -qa | grep nfs-utils
nfs-utils-1.0.9-16.el5
nfs-utils-lib-1.0.8-7.2
nfs-utils-lib-devel-1.0.8-7.2
nfs-utils-lib-1.0.8-7.2
nfs-utils-lib-devel-1.0.8-7.2
[greg@adcgar04 greg]$ mount -V
mount (util-linux 2.13-pre7)
---+ NFS server (Netapp) environment
[greg@apathy greg]$ sudo rsh eng version
NetApp Release 7.2P4: Tue Nov 28 02:55:54 PST 2006
---++ NFS export file entry
[greg@apathy greg]$ sudo rsh eng exportfs | grep pandora
/vol/vol4/pandora
-sec=sys,ro,rw=@volexport-pandora,root=@volexport-pandora,anon=4058
---++ netgroup member (for export file entry above)
[greg@apathy greg]$ show_netgroup volexport-pandora | grep adcgar
adcgar04.amd.com
---+ Demonstration of inconsistent ro/rw mount reporting
[root@adcgar04 /]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 /]# mount -v | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 /]# cat /proc/mounts | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
[root@adcgar04 /]# cd /mnt2/
[root@adcgar04 mnt2]# touch asdf
touch: cannot touch `asdf': Read-only file system
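One way to spot this mismatch without eyeballing two listings is to pull the ro/rw flag for a mount point out of both tables. The following is only a sketch: the function name mount_mode and its optional second argument are my own, and it assumes /etc/mtab is a regular file maintained by mount(8) (as on RHEL5) rather than a symlink into /proc.

```shell
#!/bin/sh
# Sketch: print the ro/rw flag recorded for a mount point.
# mount_mode is a hypothetical helper name. The second argument is any
# file in fstab/mtab format and defaults to the kernel's /proc/mounts.
mount_mode() {
    mm_mountpoint=$1
    mm_table=${2:-/proc/mounts}
    # Fields: device mountpoint fstype options dump pass
    awk -v mp="$mm_mountpoint" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw")
                    mode = opts[i]   # the last matching entry wins
        }
        END { if (mode != "") print mode }
    ' "$mm_table"
}
```

Running `mount_mode /mnt2 /etc/mtab` and then `mount_mode /mnt2` on the client above would be expected to print rw and ro respectively, matching the two listings.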
---+ Discussion
---++ Linux (RHEL3/4) NFS servers that have similar exportfs options
/tmp *(ro,anonuid=4058) @volexport-pandora(rw,no_root_squash)
do not cause the inconsistent behavior between mount -v and /proc/mounts
(and it is mounted rw as expected on the client).
---++ A reply from NetApp had this info:
Starting with ONTAP 7.2.1 onward, ONTAP will display the "most
pessimistic" permissions to NFSv3 and NFSv4 clients. NFSv2 clients will
see permissions the same way as in previous releases of ONTAP, i.e. the
"most optimistic" permissions.
Mounting with NFSv2 (instead of v3) does give us the expected rw/rw
consistency and the ability to write.
---++ So now what?
Should the linux mount -v and cat /proc/mounts be consistent with what
is actually happening?
Should netapp exports syntax handle a wildcard ro and a netgroup rw?
Comments and feedback welcome.
Thanks,
--Greg
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
...closing the loop on what we've found out... it doesn't appear that a
convenient workaround (at the mount level) exists. We'll play around
with local executable automount maps to see if that will work for us.
Thanks,
--Greg
*****
Indeed, now that I read that comment by Trond and went to the source
code, I found out about upstream commit
54ceac4515986030c2502960be620198dd8fe25b.
The idea is that having a single super_block structure per server per
FSID prevents corner cases that can lead to corrupt dentry cache trees,
prevents conflicting buffer cache contents for what ends up being the
same file, and avoids some other scary situations.
So the deal is, the mount flags (and NFS options) are set only the first
time that a given combination of server and filesystem is mounted. If
you ever mount the same filesystem from the same server on another
mountpoint, you'll get the flags and options that were passed to the
first mount. There's no working around that.
*****
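Since the flags come from whichever mount of the filesystem happened first, it can help to list what is already mounted from the same export before adding a new mount point. A rough sketch follows; the function name nfs_mounts_under is hypothetical, and matching on the device-name prefix is only an approximation of the kernel's real key (server plus FSID), so treat the result as a hint.

```shell
#!/bin/sh
# Sketch: print "<mountpoint> <ro|rw>" for every NFS mount whose device
# name starts with the given prefix (for example eng:/vol/vol4/pandora).
# nfs_mounts_under is a hypothetical helper; the optional second argument
# is a mounts-format file and defaults to /proc/mounts.
nfs_mounts_under() {
    nm_prefix=$1
    nm_table=${2:-/proc/mounts}
    awk -v pfx="$nm_prefix" '
        ($3 == "nfs" || $3 == "nfs4") && index($1, pfx) == 1 {
            mode = "rw"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") mode = "ro"
            print $2, mode
        }
    ' "$nm_table"
}
```

If `nfs_mounts_under eng:/vol/vol4/pandora` already lists a mount point as ro, a second mount of the same filesystem should be expected to come up ro as well, whatever options are passed.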
Gregory Baker wrote:
> Trond Myklebust wrote:
> > On Thu, 2007-04-19 at 15:37 -0500, Gregory Baker wrote:
> >
> > That would be your problem right there. Is the same volume perhaps
> > mounted read-only somewhere else on the same client? That is no longer
> > allowed.
> >
> > Cheers
> > Trond
> >
>
> Ah, news to me!
>
> So if you mount a volume ro at one mount point /mnt1 and then try to
> mount the same volume rw at a second mount point /mnt2 you'll run into
> problems? Apologies, didn't catch this in release notes. Thanks!
>
> --Greg
>
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
Gregory Baker wrote:
> So the deal is, the mount flags (and NFS options) are set only the first
> time that a given combination of server and filesystem are mounted. If
> you ever mount the same filesystem from the same server on another
> mountpoint, you'll get the flags and options that were passed on to the
> first mount. There's no working around that.
> *****
Should these mount flags (and NFS options) expire/be removed from some
table somewhere when you unmount/remount the filesystem? The mount
flags (and NFS options) persist between complete unmounting / remounting
of said filesystem.
[greg@apathy 115656]$ cat typescript
* Look at me. Don't have anything mounted via NFS yet the mount
* gets stuck as ro when mounted...
[root@adcgar04 tmp]# umount -a -t nfs
[root@adcgar04 tmp]# mount -v | grep pandora
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
[root@adcgar04 tmp]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
[root@adcgar04 tmp]# touch /mnt2/asdf
touch: cannot touch `/mnt2/asdf': Read-only file system
* OK, let's try this fancy mount command..
[root@adcgar04 tmp]# mount -o remount,rw /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,remount,addr=163.181.34.137)
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
rw,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
* Everybody is happy!
[root@adcgar04 tmp]# touch /mnt2/asdf
* Let's get rid of NFS mount and see what happens now
[root@adcgar04 tmp]# umount /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
[root@adcgar04 tmp]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
rw,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
* Everybody is happy!
[root@adcgar04 tmp]# touch /mnt2/asdf
* Let's give "something" the notion that this is a ro mount...
[root@adcgar04 tmp]# mount -o remount,ro /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(ro,remount,addr=163.181.34.137)
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
* RO (expected)
[root@adcgar04 tmp]# touch /mnt2/asdf
touch: cannot touch `/mnt2/asdf': Read-only file system
* OK, unmount NFS mount, try again
[root@adcgar04 tmp]# umount /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
[root@adcgar04 tmp]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 tmp]# mount -v | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
* Boo-RO/RW (unexpected)
[root@adcgar04 tmp]# touch /mnt2/asdf
touch: cannot touch `/mnt2/asdf': Read-only file system
[root@adcgar04 tmp]# exit
* So it seems that something remembers the last mount permissions, even
* though mount -v and /proc/mounts do not show a filesystem mounted.
* The permissions remain associated with the exported filesystem.
* What causes this? Note: it's not the automounter process; this was
* stopped during testing.
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
Ignore this email; my foolish use of
[root@adcgar04 tmp]# mount -v | grep pandora
[root@adcgar04 tmp]# cat /proc/mounts | grep pandora
and
[root@adcgar04 tmp]# umount -a -t nfs
make the captured session useless in debugging (as I cannot reproduce
the behavior).
Sorry everyone!
--Greg
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
It becomes more interesting (well, to me anyway) when you compare the
behavior of RHEL3/4 (ro,rw netgroup export from the filer works) with
RHEL 5 (ro,rw netgroup export from the filer does not work).
---+ System environment
[root@build-el3-64 greg]# uname -a
Linux build-el3-64 2.4.21-47.ELsmp #1 SMP Wed Jul 5 20:30:30 EDT 2006
x86_64 unknown unknown GNU/Linux
[root@build-el3-64 greg]# rpm -qa | grep nfs-utils
nfs-utils-1.0.6-44EL
[root@build-el3-64 mnt2]# mount -V
mount: mount-2.11y
---+ NFS server (Netapp) environment
[greg@apathy greg]$ sudo rsh eng version
NetApp Release 7.2P4: Tue Nov 28 02:55:54 PST 2006
---++ NFS export file entry
[greg@apathy greg]$ sudo rsh eng exportfs | grep pandora
/vol/vol4/pandora
-sec=sys,ro,rw=@volexport-pandora,root=@volexport-pandora,anon=4058
---++ netgroup member (for export file entry above)
[root@build-el3-64 mnt2]# show_netgroup volexport-pandora | grep build-el3-64
build-el3-64.amd.com
---+ Demonstration of ro/rw mount reporting and working
[root@build-el3-64 greg]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@build-el3-64 greg]# mount -v | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs (rw,addr=163.181.34.137)
[root@build-el3-64 greg]# cat /proc/mounts | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=eng 0 0
[root@build-el3-64 greg]# cd /mnt2
[root@build-el3-64 mnt2]# touch asdf
[root@build-el3-64 mnt2]# ls -la asdf
-rw-r--r-- 1 root root 0 Apr 19 13:18 asdf
Ideas?
Thanks,
--Greg
Trond Myklebust wrote:
> On Wed, 2007-04-18 at 16:53 -0500, Gregory Baker wrote:
>> Should the linux mount -v and cat /proc/mounts be consistent with what
>> is actually happening?
>
> There is no way for the Linux client to find out that this is a
> read-only volume at mount time. Only when you try an operation that
> actually attempts to modify the filesystem will the protocol allow the
> server to return NFSERR_ROFS.
> Furthermore, there is nothing in the protocol that states that a
> filesystem that issues NFSERR_ROFS will return the same reply the next
> time the same modification is attempted. Filesystems may switch from
> being read only to not being read only at will.
>
>> Should netapp exports syntax handle a wildcard ro and a netgroup rw?
>
> That is a very good question, but it deserves to be directed to the
> appropriate people at netapp and is not really appropriate for
> [email protected] (which is more about Linux NFS). If you
> haven't already done so, I'd suggest opening an official escalation of
> the matter. Your embedded netapp sales representative (sorry, but I'm
> not entirely up to date on who that is) should be able to help you with
> this.
>
> Cheers
> Trond
>
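Given that, the only reliable client-side check is to attempt a write and see what the server says. A minimal sketch, with a hypothetical helper name can_write and a probe file name of my choosing:

```shell
#!/bin/sh
# Sketch: return 0 if a file can be created (and removed) in the given
# directory, 1 otherwise. can_write and the .rwprobe name are assumptions.
can_write() {
    cw_probe="$1/.rwprobe.$$"
    # The subshell keeps a failed redirection from aborting the caller
    # and lets us silence the shell's own error message.
    if ( : > "$cw_probe" ) 2>/dev/null; then
        rm -f "$cw_probe"
        return 0
    fi
    return 1
}
```

For example, `can_write /mnt2 || echo "/mnt2 is not writable"`. Note that a failure here does not distinguish a read-only filesystem from an ordinary permissions problem, and per the above the answer may change between two calls.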
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
On Thu, 2007-04-19 at 13:24 -0500, Gregory Baker wrote:
> Ideas?
Perhaps a limitation on the number of entries in the netgroup on the
server?
Cheers
Trond
Trond Myklebust wrote:
> On Thu, 2007-04-19 at 13:24 -0500, Gregory Baker wrote:
>
>> Ideas?
>
> Perhaps a limitation on the number of entries in the netgroup on the
> server?
>
> Cheers
> Trond
I've seen past Netapp/netgroups non-happiness. In this case the
netgroup is relatively small... I tested quickly by moving the rhel5
system (adcgar04) to the top of the netgroup list with no effect.
[greg@apathy greg]$ show_netgroup volexport-pandora
adcgar04.amd.com <-------RHEL 5 64 system
nextimon.amd.com
i-sif1.amd.com
apathy.amd.com
build-lipc2.amd.com
build-el3-32.amd.com
build-el3-64.amd.com <------RHEL 3 64 system
build-el5-32.amd.com
build-el5-64.amd.com
build-sl10-32.amd.com
build-sl10-64.amd.com
hummus01.amd.com
loriol.amd.com
coolhand.amd.com
orkney.amd.com
purgatory.amd.com
reverse.amd.com
sideways.amd.com
testpig3.amd.com
texas2s20.amd.com
xhaust.amd.com
---+ What the filer thinks
[greg@apathy greg]$ sudo rsh eng exportfs -c 163.181.61.69 /vol/vol4/pandora/pandora-k26_g25_64-2 root
exportfs: 163.181.61.69 has root access to /vol/vol4/pandora
[greg@apathy greg]$ sudo rsh eng exportfs -c 163.181.61.69 /vol/vol4/pandora/pandora-k26_g25_64-2 rw
exportfs: 163.181.61.69 has rw access to /vol/vol4/pandora
[greg@apathy greg]$ host 163.181.61.69
69.61.181.163.in-addr.arpa domain name pointer adcgar04.amd.com.
---+ BAD RHEL 5 64 system
[root@adcgar04 /]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 mnt2]# mount -v | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 mnt2]# cat /proc/mounts | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
[root@adcgar04 mnt2]# touch asdf
touch: cannot touch `asdf': Read-only file system
---+ GOOD RHEL 3 64 system
[root@build-el3-64 root]# mount -o tcp eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@build-el3-64 root]# mount -v | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,tcp,addr=163.181.34.137)
[root@build-el3-64 root]# cat /proc/mounts | grep mnt2
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
rw,v3,rsize=32768,wsize=32768,hard,tcp,lock,addr=eng 0 0
[root@build-el3-64 root]# cd /mnt2/
[root@build-el3-64 mnt2]# touch asdf
[root@build-el3-64 mnt2]# ls -la asdf
-rw-r--r-- 1 root root 0 Apr 19 15:35 asdf
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
On Thu, 2007-04-19 at 15:37 -0500, Gregory Baker wrote:
> ---+ BAD RHEL 5 64 system
...
> [root@adcgar04 mnt2]# cat /proc/mounts | grep mnt2
> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
> ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
> 0 0
That would be your problem right there. Is the same volume perhaps
mounted read-only somewhere else on the same client? That is no longer
allowed.
Cheers
Trond
Trond Myklebust wrote:
> On Thu, 2007-04-19 at 15:37 -0500, Gregory Baker wrote:
>
>> ---+ BAD RHEL 5 64 system
> ...
>> [root@adcgar04 mnt2]# cat /proc/mounts | grep mnt2
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
>> ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
>> 0 0
>
> That would be your problem right there. Is the same volume perhaps
> mounted read-only somewhere else on the same client? That is no longer
> allowed.
>
> Cheers
> Trond
>
Ah, news to me!
So if you mount a volume ro at one mount point /mnt1 and then try to
mount the same volume rw at a second mount point /mnt2 you'll run into
problems? Apologies, didn't catch this in release notes. Thanks!
--Greg
---+ A New Beginning
[root@adcgar04 /]# umount -a -t nfs
[root@adcgar04 /]# mount -v | grep pandora
[root@adcgar04 /]# cat /proc/mounts | grep pandora
[root@adcgar04 /]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 /]# mount -v | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 /]# cat /proc/mounts | grep pandora
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
[root@adcgar04 /]# touch /mnt2/asdf
---+ The ro Strikes Back
[root@adcgar04 /]# mount -o ro eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt1
[root@adcgar04 /]# mount -o rw eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
[root@adcgar04 /]# mount -v | grep mnt
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt1 type nfs
(ro,addr=163.181.34.137)
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs
(rw,addr=163.181.34.137)
[root@adcgar04 /]# cat /proc/mounts | grep mnt
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt1 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs
ro,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng
0 0
[root@adcgar04 /]# touch /mnt2/asdf
touch: cannot touch `/mnt2/asdf': Read-only file system
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
On Wed, 2007-05-16 at 13:46 +1000, Neil Brown wrote:
> On Tuesday May 15, [email protected] wrote:
> > From: Trond Myklebust <[email protected]>
> >
> > Add a flag to allow users to elect not to share the NFS super block with
> > another mount point, even if the fsids are the same. This will allow
> > users to set different mount options for the two different super blocks,
> > and is safe as long as they do not have inodes in common.
>
> Thanks.
> Just to clarify: By "is safe" you mean "won't cause cache-coherence
> strangeness", or is there some other safety issue that I'm missing?
cache coherency is what I'm referring to. I'll attempt to clarify that.
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> >
> > fs/nfs/super.c | 34 +++++++++++++++++++++++++++++-----
> > include/linux/nfs4_mount.h | 1 +
> > include/linux/nfs_mount.h | 1 +
> > 3 files changed, 31 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > index ca20d3c..c03120a 100644
> > --- a/fs/nfs/super.c
> > +++ b/fs/nfs/super.c
> > @@ -291,6 +291,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
> > { NFS_MOUNT_NONLM, ",nolock", "" },
> > { NFS_MOUNT_NOACL, ",noacl", "" },
> > { NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
> > + { NFS_MOUNT_UNSHARED, ",nosharecache", ""},
> ^^^^^^^^^^^^
>
> This is different to the spelling used in the subject line. Was that
> intentional, or a typo?
No. This was the correct spelling, and the title line is wrong. I'll fix
it up...
Trond
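For reference, here is how the option might be used once both the kernel patch and the matching nfs-utils patch are in place. This is a sketch only: the helper name nosharecache_cmd is made up, and it merely prints the mount command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch: print (do not run) a mount command that asks for a private
# superblock via nosharecache. nosharecache_cmd is a hypothetical helper.
# Arguments: ro|rw, server:/export, local mount point.
nosharecache_cmd() {
    nc_mode=$1
    nc_device=$2
    nc_mountpoint=$3
    printf 'mount -t nfs -o %s,nosharecache %s %s\n' \
        "$nc_mode" "$nc_device" "$nc_mountpoint"
}
```

For example, `nosharecache_cmd rw eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2` prints a command that should let /mnt2 be rw even while /mnt1 holds the same filesystem ro. As the patch description says, this is safe only as long as the two mounts have no inodes in common.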
From: Trond Myklebust <[email protected]>
Adds support for the 'nosharecache' mount option to nfs-utils.
Signed-off-by: Trond Myklebust <[email protected]>
---
utils/mount/nfs.man | 34 ++++++++++++++++++++++++++++++++++
utils/mount/nfs4_mount.h | 1 +
utils/mount/nfs4mount.c | 14 ++++++++++----
utils/mount/nfs_mount.h | 1 +
utils/mount/nfsmount.c | 4 ++++
5 files changed, 50 insertions(+), 4 deletions(-)
diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
index 673556c..e66daba 100644
--- a/utils/mount/nfs.man
+++ b/utils/mount/nfs.man
@@ -288,6 +288,23 @@ Mount the NFS filesystem using the UDP protocol.
Disables NFSv3 READDIRPLUS RPCs. Use this option when
mounting servers that don't support or have broken
READDIRPLUS implementations.
+.TP 1.5i
+.I nosharecache
+As of kernel 2.6.18, it is no longer possible to mount the same
+filesystem with different mount options to a new mountpoint.
+It was deemed unsafe to do so, since cached data cannot be shared
+between the two mountpoints. In consequence, files or directories
+that were common to both mountpoint subtrees could often be seen to
+be out of sync following an update.
+.br
+This option allows administrators to select the pre-2.6.18 behaviour,
+permitting the same filesystem to be mounted with different mount
+options.
+.br
+.B Beware:
+Use of this option is not recommended unless you are certain that there
+are no hard links or subtrees of this mountpoint that are mounted
+elsewhere.
.P
All of the non-value options have corresponding nooption forms.
For example, nointr means don't allow file operations to be
@@ -444,6 +461,23 @@ This extracts a
server performance penalty but it allows two different NFS clients
to get reasonable good results when both clients are actively
writing to common filesystem on the server.
+.TP 1.5i
+.I nosharecache
+As of kernel 2.6.18, it is no longer possible to mount the same
+filesystem with different mount options to a new mountpoint.
+It was deemed unsafe to do so, since cached data cannot be shared
+between the two mountpoints. In consequence, files or directories
+that were common to both mountpoint subtrees could often be seen to
+be out of sync following an update.
+.br
+This option allows administrators to select the pre-2.6.18 behaviour,
+permitting the same filesystem to be mounted with different mount
+options.
+.br
+.B Beware:
+Use of this option is not recommended unless you are certain that there
+are no hard links or subtrees of this mountpoint that are mounted
+elsewhere.
.P
All of the non-value options have corresponding nooption forms.
For example, nointr means don't allow file operations to be
diff --git a/utils/mount/nfs4_mount.h b/utils/mount/nfs4_mount.h
index 74c9b95..2fcca6d 100644
--- a/utils/mount/nfs4_mount.h
+++ b/utils/mount/nfs4_mount.h
@@ -65,6 +65,7 @@ struct nfs4_mount_data {
#define NFS4_MOUNT_NOCTO 0x0010 /* 1 */
#define NFS4_MOUNT_NOAC 0x0020 /* 1 */
#define NFS4_MOUNT_STRICTLOCK 0x1000 /* 1 */
+#define NFS4_MOUNT_UNSHARED 0x8000 /* 5 */
#define NFS4_MOUNT_FLAGMASK 0xFFFF
/* pseudoflavors: */
diff --git a/utils/mount/nfs4mount.c b/utils/mount/nfs4mount.c
index 2a58d0a..0376f32 100644
--- a/utils/mount/nfs4mount.c
+++ b/utils/mount/nfs4mount.c
@@ -201,7 +201,7 @@ int nfs4mount(const char *spec, const char *node, int *flags,
char *s;
int val;
int bg, soft, intr;
- int nocto, noac;
+ int nocto, noac, unshared;
int retry;
int retval;
time_t timeout, t;
@@ -252,6 +252,7 @@ int nfs4mount(const char *spec, const char *node, int *flags,
intr = NFS4_MOUNT_INTR;
nocto = 0;
noac = 0;
+ unshared = 0;
retry = 10000; /* 10000 minutes ~ 1 week */
/*
@@ -336,6 +337,8 @@ int nfs4mount(const char *spec, const char *node, int *flags,
nocto = !val;
else if (!strcmp(opt, "ac"))
noac = !val;
+ else if (!strcmp(opt, "sharecache"))
+ unshared = !val;
else if (!sloppy) {
printf(_("unknown nfs mount option: "
"%s%s\n"), val ? "" : "no", opt);
@@ -347,7 +350,8 @@ int nfs4mount(const char *spec, const char *node, int *flags,
data.flags = (soft ? NFS4_MOUNT_SOFT : 0)
| (intr ? NFS4_MOUNT_INTR : 0)
| (nocto ? NFS4_MOUNT_NOCTO : 0)
- | (noac ? NFS4_MOUNT_NOAC : 0);
+ | (noac ? NFS4_MOUNT_NOAC : 0)
+ | (unshared ? NFS4_MOUNT_UNSHARED : 0);
/*
* Give a warning if the rpc.idmapd daemon is not running
@@ -388,11 +392,13 @@ int nfs4mount(const char *spec, const char *node, int *flags,
data.acregmin, data.acregmax, data.acdirmin, data.acdirmax);
printf("port = %d, bg = %d, retry = %d, flags = %.8x\n",
ntohs(server_addr.sin_port), bg, retry, data.flags);
- printf("soft = %d, intr = %d, nocto = %d, noac = %d\n",
+ printf("soft = %d, intr = %d, nocto = %d, noac = %d, "
+ "nosharecache = %d\n",
(data.flags & NFS4_MOUNT_SOFT) != 0,
(data.flags & NFS4_MOUNT_INTR) != 0,
(data.flags & NFS4_MOUNT_NOCTO) != 0,
- (data.flags & NFS4_MOUNT_NOAC) != 0);
+ (data.flags & NFS4_MOUNT_NOAC) != 0,
+ (data.flags & NFS4_MOUNT_UNSHARED) != 0);
if (num_flavour > 0) {
int pf_cnt, i;
diff --git a/utils/mount/nfs_mount.h b/utils/mount/nfs_mount.h
index 4a061d8..50ce2a8 100644
--- a/utils/mount/nfs_mount.h
+++ b/utils/mount/nfs_mount.h
@@ -64,6 +64,7 @@ struct nfs_mount_data {
#define NFS_MOUNT_NOACL 0x0800 /* 4 */
#define NFS_MOUNT_SECFLAVOUR 0x2000 /* 5 */
#define NFS_MOUNT_NORDIRPLUS 0x4000 /* 5 */
+#define NFS_MOUNT_UNSHARED 0x8000 /* 5 */
/* security pseudoflavors */
diff --git a/utils/mount/nfsmount.c b/utils/mount/nfsmount.c
index 815064a..f21aaff 100644
--- a/utils/mount/nfsmount.c
+++ b/utils/mount/nfsmount.c
@@ -804,6 +804,10 @@ parse_options(char *old_opts, struct nfs_mount_data *data,
data->flags &= ~NFS_MOUNT_NORDIRPLUS;
if (!val)
data->flags |= NFS_MOUNT_NORDIRPLUS;
+ } else if (!strcmp(opt, "sharecache")) {
+ data->flags &= ~NFS_MOUNT_UNSHARED;
+ if (!val)
+ data->flags |= NFS_MOUNT_UNSHARED;
#endif
} else {
bad_option:
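The option handling in the nfs-utils patch above relies on the parser stripping a leading "no" before comparing names, so "nosharecache" arrives as opt = "sharecache" with val = 0. A minimal user-space sketch of that idea follows; the function name apply_bool_option() is hypothetical and not part of nfs-utils, and only the flag value is taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Flag value as defined in the patch above. */
#define NFS_MOUNT_UNSHARED 0x8000u

/*
 * Hypothetical helper: apply one boolean mount option to *flags.
 * A leading "no" selects the negative form, as in "nosharecache".
 * Returns 1 if the option was recognised, 0 otherwise.
 */
int apply_bool_option(const char *opt, unsigned int *flags)
{
    int val = 1;                        /* 1 = positive form, 0 = "no" form */

    assert(opt != NULL && flags != NULL);

    if (strncmp(opt, "no", 2) == 0) {   /* strip the "no" prefix */
        val = 0;
        opt += 2;
    }
    if (strcmp(opt, "sharecache") == 0) {
        *flags &= ~NFS_MOUNT_UNSHARED;  /* "sharecache": default, flag clear */
        if (!val)
            *flags |= NFS_MOUNT_UNSHARED; /* "nosharecache": set the flag */
        return 1;
    }
    return 0;                           /* not handled by this sketch */
}
```

Calling apply_bool_option("nosharecache", &flags) sets bit 0x8000, and a later "sharecache" clears it again, which is why the patch compares against "sharecache" rather than "nosharecache".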
From: Trond Myklebust <[email protected]>
Unless the user sets the NFS_MOUNT_UNSHARED mount flag (the 'nosharecache'
option), we should return EBUSY if the filesystem is already mounted on a
superblock that has set conflicting mount options.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/super.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 42 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index c03120a..f7f8844 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -601,7 +601,9 @@ static int nfs_compare_super(struct super_block *sb, void *data)
{
struct nfs_server *server = data, *old = NFS_SB(sb);
- if (old->nfs_client != server->nfs_client)
+ if (memcmp(&old->nfs_client->cl_addr,
+ &server->nfs_client->cl_addr,
+ sizeof(old->nfs_client->cl_addr)) != 0)
return 0;
/* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
if (old->flags & NFS_MOUNT_UNSHARED)
@@ -611,6 +613,39 @@ static int nfs_compare_super(struct super_block *sb, void *data)
return 1;
}
+#define NFS_MS_MASK (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_SYNCHRONOUS)
+
+static int nfs_compare_mount_options(const struct super_block *s, const struct nfs_server *b, int flags)
+{
+ const struct nfs_server *a = s->s_fs_info;
+ const struct rpc_clnt *clnt_a = a->client;
+ const struct rpc_clnt *clnt_b = b->client;
+
+ if ((s->s_flags & NFS_MS_MASK) != (flags & NFS_MS_MASK))
+ goto Ebusy;
+ if (a->nfs_client != b->nfs_client)
+ goto Ebusy;
+ if (a->flags != b->flags)
+ goto Ebusy;
+ if (a->wsize != b->wsize)
+ goto Ebusy;
+ if (a->rsize != b->rsize)
+ goto Ebusy;
+ if (a->acregmin != b->acregmin)
+ goto Ebusy;
+ if (a->acregmax != b->acregmax)
+ goto Ebusy;
+ if (a->acdirmin != b->acdirmin)
+ goto Ebusy;
+ if (a->acdirmax != b->acdirmax)
+ goto Ebusy;
+ if (clnt_a->cl_auth->au_flavor != clnt_b->cl_auth->au_flavor)
+ goto Ebusy;
+ return 0;
+Ebusy:
+ return -EBUSY;
+}
+
static int nfs_get_sb(struct file_system_type *fs_type,
int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
{
@@ -645,8 +680,11 @@ static int nfs_get_sb(struct file_system_type *fs_type,
}
if (s->s_fs_info != server) {
+ error = nfs_compare_mount_options(s, server, flags);
nfs_free_server(server);
server = NULL;
+ if (error < 0)
+ goto error_splat_super;
}
if (!s->s_root) {
@@ -903,8 +941,11 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
}
if (s->s_fs_info != server) {
+ error = nfs_compare_mount_options(s, server, flags);
nfs_free_server(server);
server = NULL;
+ if (error < 0)
+ goto error_splat_super;
}
if (!s->s_root) {
From: Trond Myklebust <[email protected]>
Prior to David Howells' mount changes in 2.6.18, users who mounted
different directories which happened to be from the same filesystem on the
server would get different super blocks, and hence could choose different
mount options. As long as there were no hard linked files that crossed from
one subtree to another, this was quite safe.
Post the changes, if the two directories are on the same filesystem (have
the same 'fsid'), they will share the same super block, and hence the same
mount options.
Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks, as
was previously possible. It is still up to the user to ensure that there
are no cache coherency issues when doing this, however the default
behaviour will be to share super blocks whenever two paths result in
the same fsid.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/super.c | 34 +++++++++++++++++++++++++++++-----
include/linux/nfs4_mount.h | 1 +
include/linux/nfs_mount.h | 1 +
3 files changed, 31 insertions(+), 5 deletions(-)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index ca20d3c..c03120a 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -291,6 +291,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
{ NFS_MOUNT_NONLM, ",nolock", "" },
{ NFS_MOUNT_NOACL, ",noacl", "" },
{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
+ { NFS_MOUNT_UNSHARED, ",nosharecache", ""},
{ 0, NULL, NULL }
};
const struct proc_nfs_info *nfs_infop;
@@ -602,6 +603,9 @@ static int nfs_compare_super(struct super_block *sb, void *data)
if (old->nfs_client != server->nfs_client)
return 0;
+ /* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
+ if (old->flags & NFS_MOUNT_UNSHARED)
+ return 0;
if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0)
return 0;
return 1;
@@ -615,6 +619,7 @@ static int nfs_get_sb(struct file_system_type *fs_type,
struct nfs_fh mntfh;
struct nfs_mount_data *data = raw_data;
struct dentry *mntroot;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
/* Validate the mount data */
@@ -629,8 +634,11 @@ static int nfs_get_sb(struct file_system_type *fs_type,
goto out_err_noserver;
}
+ if (server->flags & NFS_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
@@ -691,6 +699,7 @@ static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags,
struct super_block *s;
struct nfs_server *server;
struct dentry *mntroot;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
dprintk("--> nfs_xdev_get_sb()\n");
@@ -702,8 +711,11 @@ static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags,
goto out_err_noserver;
}
+ if (server->flags & NFS_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(&nfs_fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
@@ -808,6 +820,7 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
struct dentry *mntroot;
char *mntpath = NULL, *hostname = NULL, ip_addr[16];
void *p;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
if (data == NULL) {
@@ -879,8 +892,11 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
goto out_err_noserver;
}
+ if (server->flags & NFS4_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_free;
@@ -949,6 +965,7 @@ static int nfs4_xdev_get_sb(struct file_system_type *fs_type, int flags,
struct super_block *s;
struct nfs_server *server;
struct dentry *mntroot;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
dprintk("--> nfs4_xdev_get_sb()\n");
@@ -960,8 +977,11 @@ static int nfs4_xdev_get_sb(struct file_system_type *fs_type, int flags,
goto out_err_noserver;
}
+ if (server->flags & NFS4_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(&nfs_fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
@@ -1016,6 +1036,7 @@ static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
struct nfs_server *server;
struct dentry *mntroot;
struct nfs_fh mntfh;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
dprintk("--> nfs4_referral_get_sb()\n");
@@ -1027,8 +1048,11 @@ static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
goto out_err_noserver;
}
+ if (server->flags & NFS4_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(&nfs_fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
diff --git a/include/linux/nfs4_mount.h b/include/linux/nfs4_mount.h
index 26b4c83..ad1bd4a 100644
--- a/include/linux/nfs4_mount.h
+++ b/include/linux/nfs4_mount.h
@@ -65,6 +65,7 @@ struct nfs4_mount_data {
#define NFS4_MOUNT_NOCTO 0x0010 /* 1 */
#define NFS4_MOUNT_NOAC 0x0020 /* 1 */
#define NFS4_MOUNT_STRICTLOCK 0x1000 /* 1 */
+#define NFS4_MOUNT_UNSHARED 0x8000 /* 1 */
#define NFS4_MOUNT_FLAGMASK 0xFFFF
#endif
diff --git a/include/linux/nfs_mount.h b/include/linux/nfs_mount.h
index cc8b9c5..3e3b521 100644
--- a/include/linux/nfs_mount.h
+++ b/include/linux/nfs_mount.h
@@ -62,6 +62,7 @@ struct nfs_mount_data {
#define NFS_MOUNT_STRICTLOCK 0x1000 /* reserved for NFSv4 */
#define NFS_MOUNT_SECFLAVOUR 0x2000 /* 5 */
#define NFS_MOUNT_NORDIRPLUS 0x4000 /* 5 */
+#define NFS_MOUNT_UNSHARED 0x8000 /* 5 */
#define NFS_MOUNT_FLAGMASK 0xFFFF
#endif
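The effect of passing a NULL comparison callback can be modelled in user space. The sketch below is a simplified illustration, not kernel code: demo_sget() imitates the idea that a lookup with no comparison function never reuses an existing entry, and every demo_* name, the fixed-size table, and the integer server_id are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NFS_MOUNT_UNSHARED 0x8000u   /* value from the patch above */
#define DEMO_MAX_SB 8

struct demo_fsid { unsigned long long major, minor; };

/* Hypothetical stand-in for a superblock plus its nfs_server. */
struct demo_sb {
    int server_id;               /* stands in for the nfs_client pointer */
    struct demo_fsid fsid;
    unsigned int flags;
};

typedef int (*demo_compare_fn)(const struct demo_sb *old,
                               const struct demo_sb *want);

static struct demo_sb demo_table[DEMO_MAX_SB];
static size_t demo_nr_sb;

/* Mirrors the tests in nfs_compare_super(): 1 means "share this one". */
int demo_compare_super(const struct demo_sb *old, const struct demo_sb *want)
{
    if (old->server_id != want->server_id)
        return 0;
    if (old->flags & NFS_MOUNT_UNSHARED)   /* existing mount opted out */
        return 0;
    if (old->fsid.major != want->fsid.major ||
        old->fsid.minor != want->fsid.minor)
        return 0;
    return 1;
}

/* Reuse a matching entry, or add a new one; NULL compare never reuses. */
struct demo_sb *demo_sget(demo_compare_fn compare, const struct demo_sb *want)
{
    size_t i;

    if (compare != NULL)
        for (i = 0; i < demo_nr_sb; i++)
            if (compare(&demo_table[i], want))
                return &demo_table[i];
    if (demo_nr_sb == DEMO_MAX_SB)
        return NULL;                       /* table full */
    demo_table[demo_nr_sb] = *want;
    return &demo_table[demo_nr_sb++];
}

/* A new mount that sets UNSHARED skips the comparison entirely. */
struct demo_sb *demo_mount(const struct demo_sb *want)
{
    demo_compare_fn compare = demo_compare_super;

    assert(want != NULL);
    if (want->flags & NFS_MOUNT_UNSHARED)
        compare = NULL;
    return demo_sget(compare, want);
}
```

Both halves of the patch show up here: a new mount with the flag set never looks for a superblock to share, and an existing superblock with the flag set is never offered to later mounts.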
On Wed, 2007-05-16 at 22:13 -0400, Trond Myklebust wrote:
> From: Trond Myklebust <[email protected]>
>
> Prior to David Howells' mount changes in 2.6.18, users who mounted
> different directories which happened to be from the same filesystem on the
> server would get different super blocks, and hence could choose different
> mount options. As long as there were no hard linked files that crossed from
> one subtree to another, this was quite safe.
> Post the changes, if the two directories are on the same filesystem (have
> the same 'fsid'), they will share the same super block, and hence the same
> mount options.
>
> Add a flag to allow users to elect not to share the NFS super block with
> another mount point, even if the fsids are the same. This will allow
> users to set different mount options for the two different super blocks, as
> was previously possible. It is still up to the user to ensure that there
> are no cache coherency issues when doing this, however the default
> behaviour will be to share super blocks whenever two paths result in
> the same fsid.
>
> Signed-off-by: Trond Myklebust <[email protected]>
This looks great, thanks Trond.
I've been wondering about a couple things.
How will the "read-only bind mounts" VFS changes affect this?
When the fscache patches are merged, how will this be affected by them?
Comments please?
Ian
On Fri, 2007-05-18 at 11:12 +0800, Ian Kent wrote:
> How will the "read-only bind mounts" VFS changes affect this?
They won't. Dave's patches move the MS_RDONLY flag out of the
super_block and into the vfsmount. That makes it possible to share the
same super_block (and hence the same inode/dentry caches) between a
read-only mount and a read-write mount.
If you want to change any of the other flags, though, you will still
have to use nosharecache to do so: for instance if you want to set up a
'noac' mount for one directory, but want all other directories to cache
attributes, then you will have to mount the "noac" directory using the
nosharecache flag.
> When the fscache patches are merged, how will this be affected by them?
I'd suggest simply returning an error if the user tries to set both
fscache and nosharecache. David's reasoning is still valid: if two
superblocks try to write to the same fscache inode, then caching havoc
will ensue.
Note, however, that the fscache patches already have a problem. I found
out recently that RHEL-5 uses NFS_MOUNT_FSCACHE==0x4000, which is taken
by NFS_MOUNT_NORDIRPLUS in mainline.
Trond
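Both points in this reply, the clashing flag bit and the suggestion to reject fscache combined with nosharecache, can be written as small checks. The sketch below is an assumption-laden illustration: DEMO_MOUNT_FSCACHE and its value 0x10000 are hypothetical (no replacement bit is chosen in this thread), and the demo_* functions are not kernel APIs. The other two values come from the patches above.

```c
#include <assert.h>
#include <errno.h>

#define NFS_MOUNT_NORDIRPLUS 0x4000u   /* mainline value, from the patch */
#define NFS_MOUNT_UNSHARED   0x8000u   /* from the patch */
#define RHEL5_MOUNT_FSCACHE  0x4000u   /* value reported in the message above */
#define DEMO_MOUNT_FSCACHE   0x10000u  /* hypothetical non-clashing bit */

/* Returns 1 if two flag definitions share any bit. */
int demo_flags_collide(unsigned int a, unsigned int b)
{
    return (a & b) != 0;
}

/*
 * Hypothetical validation step: refuse a request that asks for both
 * fscache and an unshared superblock, since two superblocks writing
 * to the same cache object would corrupt the cache.
 */
int demo_validate_flags(unsigned int flags)
{
    if ((flags & DEMO_MOUNT_FSCACHE) && (flags & NFS_MOUNT_UNSHARED))
        return -EINVAL;
    return 0;
}
```

The error code -EINVAL is a guess at what such a check might return; the thread only says that an error should be returned.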
On Fri, 2007-05-18 at 09:20 -0400, Trond Myklebust wrote:
> On Fri, 2007-05-18 at 11:12 +0800, Ian Kent wrote:
> > How will the "read-only bind mounts" VFS changes affect this?
>
> They won't. Dave's patches move the MS_RDONLY flag out of the
> super_block and into the vfsmount. That makes it possible to share the
> same super_block (and hence the same inode/dentry caches) between a
> read-only mount and a read-write mount.
>
> If you want to change any of the other flags, though, you will still
> have to use nosharecache to do so: for instance if you want to set up a
> 'noac' mount for one directory, but want all other directories to cache
> attributes, then you will have to mount the "noac" directory using the
> nosharecache flag.
>
> > When the fscache patches are merged, how will this be affected by them?
>
> I'd suggest simply returning an error if the user tries to set both
> fscache and nosharecache. David's reasoning is still valid: if two
> superblocks try to write to the same fscache inode, then caching havoc
> will ensue.
Yep. David suggested that also.
>
> Note, however, that the fscache patches already have a problem. I found
> out recently that RHEL-5 uses NFS_MOUNT_FSCACHE==0x4000, which is taken
> by NFS_MOUNT_NORDIRPLUS in mainline.
Yep, we'll need to work on fixing that. I'll talk with David.
Ian
On Sat, May 05, 2007 at 11:37:53PM +0800, Ian Kent wrote:
> On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > On Fri, 2007-05-04 at 15:30 -0500, Gregory Baker wrote:
> > > Are all NFS options from an export inherited by subsequent mounts since
> > > the upstream commit 54ceac4515986030c2502960be620198dd8fe25b?
>
> I wish I'd paid more attention to the shared superblock patches but I
> was consumed by autofs at the time. Even so I likely wouldn't have
> realized the implications.
>
> > >
> > > I'm pasting an email from a colleague at AMD that's also trying to gain
> > > an understanding of the new mount behavior (Paul Krizak)...
> > >
> > > Thanks,
> > >
> > > --Greg
> >
> > If two mounts refer to the same _filesystem_ (i.e. the NFS server
> > exports them with the same value for the 'fsid' attribute), then they
> > will share the same options.
> >
> > If you are trying to use different security models for different
> > mountpoint, you should in any case be considering using different
> > filesystems on the server: NFS filehandles are normally not tied to a
> > pathname since they are required to remain the same after a
> > cross-directory rename().
> > The exception is if the file and the directory live on different
> > filesystems (since then a cross-directory rename() is not permitted
> > anyway) and so the filehandles will usually encode the fsid in some way
> > or form.
> >
> > IOW: if /foo and /bar refer to directories the same filesystem, then the
> > server cannot figure out if the file baz.txt lives in one or the other
> > directory. Giving /foo and /bar different security permissions is really
> > the same as giving _both_ sets of permissions. Even if the server
> > encodes the permissions in the filehandle (usually by recording the
> > export point) then the client is free to spoof a filehandle for the same
> > file with the other set of permissions.
>
> I don't see how doing this on the client adds any security at all as the
> spoofing issue is still present. It does prevent people from being able
> to do read-only mounts for administrative purposes. Like when used to
> guard against accidents rather than provide real security. But then I
> see this isn't the only reason these patches were so important.
>
> So is there "anything at all" we can do to change this behavior now,
> apart from teaching mount to about it. Having the /etc/mtab
> and /proc/mounts recording different mount options for a mount is wrong.
>
> How should mount(8) handle the situation?
I'd like to include the following note in the next util-linux
release, in the BUGS section of mount.8:
  It is possible that the files /etc/mtab and /proc/mounts don't match.
  The first file is based only on the mount command options, but the
  content of the second file also depends on the kernel and other
  settings (e.g. the remote NFS server). In particular cases the mount
  command may report unreliable information about an NFS mount point,
  and the /proc/mounts file usually contains more reliable
  information.
Karel "mtab hater"
--
Karel Zak <[email protected]>
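The mismatch between /etc/mtab and /proc/mounts discussed in this message can be detected programmatically. The following is a rough sketch using the glibc <mntent.h> interface (setmntent, getmntent, hasmntopt, endmntent); the two function names are hypothetical, and the sketch only looks at the ro option.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the comma-separated option string contains "ro", else 0. */
int opts_read_only(const char *opts)
{
    struct mntent m;
    char buf[1024];

    assert(opts != NULL);
    snprintf(buf, sizeof(buf), "%s", opts);  /* mnt_opts is not const */
    memset(&m, 0, sizeof(m));
    m.mnt_opts = buf;
    return hasmntopt(&m, "ro") != NULL;
}

/*
 * Look up mount point 'dir' in 'table' ("/proc/mounts" or "/etc/mtab").
 * Returns 1 if it is listed read-only, 0 if not, and -1 if the table
 * cannot be opened or has no entry for 'dir'.
 */
int table_read_only(const char *table, const char *dir)
{
    FILE *fp;
    struct mntent *ent;
    int result = -1;

    assert(table != NULL && dir != NULL);
    fp = setmntent(table, "r");
    if (fp == NULL)
        return -1;
    while ((ent = getmntent(fp)) != NULL) {
        /* Later entries override earlier ones for the same mount point. */
        if (strcmp(ent->mnt_dir, dir) == 0)
            result = hasmntopt(ent, "ro") != NULL;
    }
    endmntent(fp);
    return result;
}
```

For the /mnt2 example at the top of this thread, table_read_only("/etc/mtab", "/mnt2") and table_read_only("/proc/mounts", "/mnt2") would be expected to disagree, since mount -v reported rw while /proc/mounts reported ro.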
On Sat, May 05, 2007 at 01:17:52PM -0400, Trond Myklebust wrote:
> On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > So is there "anything at all" we can do to change this behavior now,
> > apart from teaching mount to about it. Having the /etc/mtab
> > and /proc/mounts recording different mount options for a mount is wrong.
> >
> > How should mount(8) handle the situation?
>
> I'd rather like to have mount(2) handle the situation by returning an
> error if the mount options cannot be satisfied.
Agree, that makes sense. Unfortunately this change is probably not
backward compatible. Maybe add a new mount(2) flag, MS_STRICT, which
fails the mount when the mount options cannot be satisfied.
Karel
--
Karel Zak <[email protected]>
On Mon, 2007-05-14 at 15:17 +0200, Karel Zak wrote:
> On Sat, May 05, 2007 at 01:17:52PM -0400, Trond Myklebust wrote:
> > On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > > So is there "anything at all" we can do to change this behavior now,
> > > apart from teaching mount to about it. Having the /etc/mtab
> > > and /proc/mounts recording different mount options for a mount is wrong.
> > >
> > > How should mount(8) handle the situation?
> >
> > I'd rather like to have mount(2) handle the situation by returning an
> > error if the mount options cannot be satisfied.
>
> Agree, that makes sense. Unfortunately this change is probably not
> backward compatible. Maybe add a new mount(2) flag, MS_STRICT, which
> fails the mount when the mount options cannot be satisfied.
The kernel would be sending the error to the mount program when it
cannot satisfy the request. There is no backwards compatibility issue.
Trond
On Mon, 2007-05-14 at 15:17 +0200, Karel Zak wrote:
> On Sat, May 05, 2007 at 01:17:52PM -0400, Trond Myklebust wrote:
> > On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > > So is there "anything at all" we can do to change this behavior now,
> > > apart from teaching mount to about it. Having the /etc/mtab
> > > and /proc/mounts recording different mount options for a mount is wrong.
> > >
> > > How should mount(8) handle the situation?
> >
> > I'd rather like to have mount(2) handle the situation by returning an
> > error if the mount options cannot be satisfied.
>
> Agree, that makes sense. Unfortunately this change is probably not
> backward compatible. Maybe add a new mount(2) flag, MS_STRICT, which
> fails the mount when the mount options cannot be satisfied.
I really don't think that failing mounts in this case will get us any
relief from people that need to be able to remotely mount an exported
filesystem with different options. We already have several that will
incur significant ongoing maintenance overhead due to this.
The VFS read-only bind mount patches need to be completed as a matter of
urgency and we can consider what we need to do to mount in parallel as a
secondary priority.
If anyone can provide input regarding the status of those patches that
would be great. I haven't got any word back from Dave Hansen (the
author) at this stage.
Ian
On Mon, 2007-05-14 at 09:24 -0400, Trond Myklebust wrote:
> On Mon, 2007-05-14 at 15:17 +0200, Karel Zak wrote:
> > On Sat, May 05, 2007 at 01:17:52PM -0400, Trond Myklebust wrote:
> > > On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > > > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > > > So is there "anything at all" we can do to change this behavior now,
> > > > apart from teaching mount to about it. Having the /etc/mtab
> > > > and /proc/mounts recording different mount options for a mount is wrong.
> > > >
> > > > How should mount(8) handle the situation?
> > >
> > > I'd rather like to have mount(2) handle the situation by returning an
> > > error if the mount options cannot be satisfied.
> >
> > Agree, that makes sense. Unfortunately this change is probably not
> > backward compatible. Maybe add a new mount(2) flag, MS_STRICT, which
> > fails the mount when the mount options cannot be satisfied.
>
> The kernel would be sending the error to the mount program when it
> cannot satisfy the request. There is no backwards compatibility issue.
I guess that depends on how you interpret backwardly compatible.
Deciding to now fail mounts in this case would turn an unfortunate side
effect of an urgently needed change into a disaster for some users. I
think it's best to leave it as it is and push on with the needed VFS
changes.
Ian
On Mon, 2007-05-14 at 22:39 +0800, Ian Kent wrote:
> On Mon, 2007-05-14 at 09:24 -0400, Trond Myklebust wrote:
> > On Mon, 2007-05-14 at 15:17 +0200, Karel Zak wrote:
> > > On Sat, May 05, 2007 at 01:17:52PM -0400, Trond Myklebust wrote:
> > > > On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > > > > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > > > > So is there "anything at all" we can do to change this behavior now,
> > > > > apart from teaching mount about it. Having the /etc/mtab
> > > > > and /proc/mounts recording different mount options for a mount is wrong.
> > > > >
> > > > > How should mount(8) handle the situation?
> > > >
> > > > I'd rather like to have mount(2) handle the situation by returning an
> > > > error if the mount options cannot be satisfied.
> > >
> > > Agree, that makes sense. Unfortunately this change is probably not
> > > backwardly compatible. Maybe add a new mount(2) flag MS_STRICT which
> > > disables a mount when the mount options cannot be satisfied.
> >
> > The kernel would be sending the error to the mount program when it
> > cannot satisfy the request. There is no backwards compatibility issue.
>
> I guess that depends on how you interpret backwardly compatible.
>
> Deciding to now fail mounts in this case would turn an unfortunate side
> effect of an urgently needed change into a disaster for some users. I
> think it's best to leave it as it is and push on with the needed VFS
> changes.
No. It would alert people who think they are mounting with different
mount options to the reality that they are not.
We might then be able to add an '--unshared-cache' option (any
suggestions for a better name?) to mount in order to allow those people
who fully understand the consequences to override, and hence to mount
the same filesystem with different mount options, but no sharing of the
page cache with the original filesystem.
Trond
> We might then be able to add an '--unshared-cache' option (any
> suggestions for a better name?) to mount in order to allow those people
> who fully understand the consequences to override, and hence to mount
> the same filesystem with different mount options, but no sharing of the
> page cache with the original filesystem.
I can assure you that AMD systems engineering would *much* prefer having
the VFS stack fixed such that we don't need to do something like this.
It would be much better to make the kernel understand that it *can*
mount the same NFS filesystem twice, once read-only and again
read-write, with the appropriate cache coherency, rather than force us
to run in a potentially dangerous configuration that might lead to data
corruption.
I agree with Ian that the VFS changes need to take priority. The only
change I see that needs to be made to mount is that it needs to spit out
a warning (or fail completely) if the requested options cannot be satisfied.
Paul Krizak 5900 E. Ben White Blvd. MS 625
Advanced Micro Devices Austin, TX 78741
Linux/Unix Systems Engineering Phone: (512) 602-8775
Silicon Design Division Cell: (512) 791-0686
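
The warning asked for above could be sketched in user space as a comparison between the options that were requested and the options the kernel actually reports (for example in /proc/mounts). This is only a minimal sketch: the helper names has_option() and first_unsatisfied() are hypothetical, mount(8) does not contain this code, and the check treats every option as a plain comma-separated token.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `opt` appears as a whole comma-separated token in `list`. */
static bool has_option(const char *list, const char *opt)
{
    size_t len = strlen(opt);
    const char *p = list;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t tok = end ? (size_t)(end - p) : strlen(p);

        if (tok == len && strncmp(p, opt, len) == 0)
            return true;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return false;
}

/*
 * Hypothetical helper: copy the first requested option that is missing
 * from the effective option string into `out` and return true.
 * Returns false (and an empty `out`) if every requested option is present.
 * Tokens that do not fit into `out` are skipped.
 */
static bool first_unsatisfied(const char *requested, const char *effective,
                              char *out, size_t outlen)
{
    const char *p = requested;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t tok = end ? (size_t)(end - p) : strlen(p);

        if (tok > 0 && tok < outlen) {
            memcpy(out, p, tok);
            out[tok] = '\0';
            if (!has_option(effective, out))
                return true;
        }
        if (end == NULL)
            break;
        p = end + 1;
    }
    if (outlen > 0)
        out[0] = '\0';
    return false;
}
```

With the strings from the original report, a requested "rw" checked against an effective option list beginning "ro,vers=3,..." would be flagged, which is the point at which mount could warn or fail.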
On Mon, 2007-05-14 at 10:56 -0500, Paul Krizak wrote:
> > We might then be able to add an '--unshared-cache' option (any
> > suggestions for a better name?) to mount in order to allow those people
> > who fully understand the consequences to override, and hence to mount
> > the same filesystem with different mount options, but no sharing of the
> > page cache with the original filesystem.
>
> I can assure you that AMD systems engineering would *much* prefer having
> the VFS stack fixed such that we don't need to do something like this.
>
> It would be much better to make the kernel understand that it *can*
> mount the same NFS filesystem twice, once read-only and again
> read-write, with the appropriate cache coherency, rather than force us
> to run in a potentially dangerous configuration that might lead to data
> corruption.
>
> I agree with Ian that the VFS changes need to take priority. The only
> change I see that needs to be made to mount is that it needs to spit out
> a warning (or fail completely) if the requested options cannot be satisfied.
I agree that the read-only flag needs to be fixed at the VFS level. I
wasn't talking about that.
I'm talking about people who want to force different security flavours
on different paths for the same filesystem, or who want to set different
rsize/wsize, attribute timeouts etc. There are applications that might
benefit from that, and that do not care about sharing caches (either
because they know that there are no cross-linked files, or because they
are using a locking system that can deal with the cache consistency
issue).
Trond
On Mon, 2007-05-14 at 11:47 -0400, Trond Myklebust wrote:
> On Mon, 2007-05-14 at 22:39 +0800, Ian Kent wrote:
> > On Mon, 2007-05-14 at 09:24 -0400, Trond Myklebust wrote:
> > > On Mon, 2007-05-14 at 15:17 +0200, Karel Zak wrote:
> > > > On Sat, May 05, 2007 at 01:17:52PM -0400, Trond Myklebust wrote:
> > > > > On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > > > > > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > > > > > So is there "anything at all" we can do to change this behavior now,
> > > > > > apart from teaching mount about it. Having the /etc/mtab
> > > > > > and /proc/mounts recording different mount options for a mount is wrong.
> > > > > >
> > > > > > How should mount(8) handle the situation?
> > > > >
> > > > > I'd rather like to have mount(2) handle the situation by returning an
> > > > > error if the mount options cannot be satisfied.
> > > >
> > > > Agree, that makes sense. Unfortunately this change is probably not
> > > > backwardly compatible. Maybe add a new mount(2) flag MS_STRICT which
> > > > disables a mount when the mount options cannot be satisfied.
> > >
> > > The kernel would be sending the error to the mount program when it
> > > cannot satisfy the request. There is no backwards compatibility issue.
> >
> > I guess that depends on how you interpret backwardly compatible.
> >
> > Deciding to now fail mounts in this case would turn an unfortunate side
> > effect of an urgently needed change into a disaster for some users. I
> > think it's best to leave it as it is and push on with the needed VFS
> > changes.
>
> No. It would alert people who think they are mounting with different
> mount options to the reality that they are not.
>
> We might then be able to add an '--unshared-cache' option (any
> suggestions for a better name?) to mount in order to allow those people
> who fully understand the consequences to override, and hence to mount
> the same filesystem with different mount options, but no sharing of the
> page cache with the original filesystem.
Not sure, perhaps "--force-options" or "--force-samefs-options".
I've heard back from Dave Hansen and he is planning on posting the VFS
patches in the relatively near future. As you said, it would be best to
resolve this in the VFS so I'm not sure how much effort should be put
into this here.
From what I can see Dave has put quite a bit of effort into his patches
already so they should be close to ready for submission. Hopefully they
are what's needed.
Ian
On Tuesday May 15, [email protected] wrote:
> On Mon, 2007-05-14 at 11:47 -0400, Trond Myklebust wrote:
> >
> > We might then be able to add an '--unshared-cache' option (any
> > suggestions for a better name?) to mount in order to allow those people
> > who fully understand the consequences to override, and hence to mount
> > the same filesystem with different mount options, but no sharing of the
> > page cache with the original filesystem.
>
> Not sure, perhaps "--force-options" or "--force-samefs-options".
Presumably this would be a mount option (-o xxx) rather than a new
flag to mount? It would cause nfs_compare_super to return 0 if either
'data' or 'sb' contained the "no_share" flag.
I think "shared" is an important concept to have in there as it is
sharing the cache, the connection and the options. For consistency
with other options, I would have an optional "no" at the front to
invert the flag. Current nfs options don't have punctuation, so I
would probably go for something like:
-o [no]sharedcache
-o [no]shareconnection
Then comes the question of what the default should be.
The original default was nosharedcache, but the more recent default
has been sharedcache. In hindsight it would have been better not to
change the default, but things are always much clearer in hindsight.
I would lean towards restoring the default to nosharedcache, and
having to explicitly request sharedcache if you want that, and are
happy to have the same mount option enforced on all sharing mounts.
Having nosharedcache be the default would mean that sharedcache could
fail if other mount options are not an exact match, and there would be
no backward compatibility problem with that.
NeilBrown
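
The comparison rule proposed above (refuse to share if either the existing superblock or the new mount carries the no-share flag) can be modelled in user space roughly as follows. This is a sketch under assumptions: the struct, the function name and the flag value are stand-ins invented for illustration and are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical flag value, chosen only for this sketch. */
#define SKETCH_MOUNT_NOSHARE (1u << 15)

/* Stand-in for the filesystem id the server reports. */
struct sketch_fsid {
    unsigned long long major;
    unsigned long long minor;
};

/* Stand-in for the per-mount server state the comparison looks at. */
struct sketch_server {
    const void *client;         /* identifies the server connection */
    unsigned int flags;         /* mount flags, including the no-share bit */
    struct sketch_fsid fsid;
};

/*
 * Return 1 if the new mount may reuse the existing superblock, 0 if a
 * separate superblock (and therefore a separate cache) is needed.
 */
static int sketch_compare_super(const struct sketch_server *old,
                                const struct sketch_server *new_mount)
{
    if (old->client != new_mount->client)
        return 0;
    /* Either side asking for "no sharing" is enough to prevent it. */
    if ((old->flags | new_mount->flags) & SKETCH_MOUNT_NOSHARE)
        return 0;
    if (old->fsid.major != new_mount->fsid.major ||
        old->fsid.minor != new_mount->fsid.minor)
        return 0;
    return 1;
}
```

Checking both sides means a mount made with the flag is never picked up later by a sharing mount, and a mount requesting the flag never joins an existing shared superblock.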
On Tue, 2007-05-15 at 08:41 +1000, Neil Brown wrote:
> I think "shared" is an important concept to have in there as it is
> sharing the cache, the connection and the options. For consistency
> with other options, I would have an optional "no" at the front to
> invert the flag. Current nfs options don't have punctuation, so I
> would probably go for something like:
> -o [no]sharedcache
> -o [no]shareconnection
>
> Then comes the question of what the default should be.
> The original default was nosharedcache, but the more recent default
> has been sharedcache. In hindsight it would have been better not to
> change the default, but things are always much clearer in hindsight.
>
> I would lean towards restoring the default to nosharedcache, and
> having to explicitly request sharedcache if you want that, and are
> happy to have the same mount option enforced on all sharing mounts.
I disagree with that. The default was changed for a very good reason,
namely that people were making assumptions that were wrong: i.e. that
the cache remains consistent when you change the ro/rw flag or try to
mount a subdirectory.
In fact, if you mounted the _same_ directory twice, then the default was
always 'sharedcache'.
So all we did in 2.6.18, was to make a consistent set of rules for how
this works.
The default should therefore remain 'sharedcache', preferably returning
an error if the user tries to mix metaphors.
Cheers
Trond
On Monday May 14, [email protected] wrote:
> On Tue, 2007-05-15 at 08:41 +1000, Neil Brown wrote:
> >
> > I would lean towards restoring the default to nosharedcache, and
> > having to explicitly request sharedcache if you want that, and are
> > happy to have the same mount option enforced on all sharing mounts.
>
> I disagree with that. The default was changed for a very good reason,
> namely that people were making assumptions that were wrong: i.e. that
> the cache remains consistent when you change the ro/rw flag or try to
> mount a subdirectory.
And fixing the assumptions made by some people broke different
assumptions made by other people - a bit of a no-win situation I
guess.
> In fact, if you mounted the _same_ directory twice, then the default was
> always 'sharedcache'.
Ahh.. I didn't realise that.
>
> So all we did in 2.6.18, was to make a consistent set of rules for how
> this works.
>
> The default should therefore remain 'sharedcache', preferably returning
> an error if the user tries to mix metaphors.
Would we need to rev the mount_data version to add such a flag, or are
we sure that unused flags are 0, and so simply add
#define NFS_MOUNT_UNSHARED 0x8000 /* 5 */
(I don't understand the NFS_MOUNT_FLAGMASK. Can the top 16 bits of
flags be used?)
If the "you have mixed metaphors" error was unique, the mount.nfs
program could conceivably respond to it by setting the UNSHARED flag,
trying again, and printing a big loud warning.... I wonder if that
would be a good idea...
But then what if you wanted sharedcache and weren't fussed about exact
options, how would mount.nfs handle that? Finding a matching entry
in /etc/mtab would be hard because fsid matching would be non-trivial.
Maybe we want two flags "UNSHARED" and "SHARE_AND_IGNORE_MY_SETTINGS"??
NeilBrown
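
The "retry with UNSHARED and warn loudly" idea floated above might look roughly like this in a mount helper. Everything here is hypothetical: the flag value, the error code used as the "unique" conflict error (-EBUSY is only a stand-in), and the function names. fake_mount() simulates a kernel that refuses a shared mount with conflicting options; it is not a real interface.

```c
#include <errno.h>
#include <stdio.h>

#define SKETCH_MOUNT_UNSHARED 0x8000u   /* hypothetical flag value */
#define SKETCH_ERR_CONFLICT   (-EBUSY)  /* stand-in for a unique error */

/* A mount attempt: takes the flags and an opaque context, returns 0 or -errno. */
typedef int (*mount_fn)(unsigned int flags, void *ctx);

/*
 * Try the mount as requested. If it fails with the conflict error and the
 * caller did not already ask for an unshared mount, warn and retry once
 * with the unshared flag set.
 */
static int mount_with_fallback(mount_fn try_mount, unsigned int flags, void *ctx)
{
    int err = try_mount(flags, ctx);

    if (err != SKETCH_ERR_CONFLICT || (flags & SKETCH_MOUNT_UNSHARED))
        return err;

    fprintf(stderr,
            "warning: mount options conflict with an existing mount; "
            "retrying without cache sharing\n");
    return try_mount(flags | SKETCH_MOUNT_UNSHARED, ctx);
}

/* Simulated kernel: refuses a shared mount, accepts an unshared one. */
static int fake_mount(unsigned int flags, void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return (flags & SKETCH_MOUNT_UNSHARED) ? 0 : SKETCH_ERR_CONFLICT;
}
```

The sketch covers only the fallback half of the question. It does not address the case where the user wants sharing and does not care about the exact options.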
On Tue, 2007-05-15 at 09:58 -0400, Chuck Lever wrote:
> Just kicking it out there: why not revert the kernel back to the
> previous state of affairs where "nosharedcache" was the default, and
> then let user space handle sharing or not sharing, documenting clearly
> what the implications are in the mount(8) or nfs(5) man pages? User
> space is smart enough to emit a warning about mixing security flavors,
> for instance, as suggested above.
Read the thread. See my previous answers.
Trond
Trond Myklebust wrote:
> On Tue, 2007-05-15 at 08:41 +1000, Neil Brown wrote:
>> I think "shared" is an important concept to have in there as it is
>> sharing the cache, the connection and the options. For consistency
>> with other options, I would have an optional "no" at the front to
>> invert the flag. Current nfs options don't have punctuation, so I
>> would probably go for something like:
>> -o [no]sharedcache
>> -o [no]shareconnection
>>
>> Then comes the question of what the default should be.
>> The original default was nosharedcache, but the more recent default
>> has been sharedcache. In hindsight it would have been better not to
>> change the default, but things are always much clearer in hindsight.
>>
>> I would lean towards restoring the default to nosharedcache, and
>> having to explicitly request sharedcache if you want that, and are
>> happy to have the same mount option enforced on all sharing mounts.
>
> I disagree with that. The default was changed for a very good reason,
> namely that people were making assumptions that were wrong: i.e. that
> the cache remains consistent when you change the ro/rw flag or try to
> mount a subdirectory.
I admit to being biased in representing what people's assumptions are.
My (main) assumptions of the default behavior of NFS exports and
filesystems had nothing to do with cache consistency; they were based on
"what the heck is the 'best' way to manage filesystems and NFS exports
of petabytes of data that are grouped in large RAID aggregates?"
A "typical" NFS server (for us) has the ~following characteristics:
* 40 TB of data
* 5 RAID dual parity Aggregates
* 36 Volumes
* 83 Qtrees
The volumes are roughly equivalent to what NFS views as the "filesystem"
export/superblock. Volumes are sized/created for best performance
across the disk/network/backplane subsystem. Within volumes, qtrees(1)
are created to administer quotas, data ownership, etc. The qtrees are
what are mounted by automount and the compute cluster with (potentially)
different NFS mount options.
Raise the "typical" NFS server to the power of a few to get the
administrative headache of an enterprise environment.
This is my biased point of view, offered up for informational purposes only.
Thanks,
--Greg
(1) Qtrees are logical divisions of data structures that can be managed
individually. The data space allocation of a qtree can be dynamically
sized at will and without interruption to the user community. With the
advent of flexvols this feature is also available for volumes, but that
is a recent development. One of the beneficial features of exports on the
filer side is nested exports. Most exports can be handled consistently
at the volume level, with only the exceptions exported individually. We
do not make a practice of individually exporting every qtree.
> In fact, if you mounted the _same_ directory twice, then the default was
> always 'sharedcache'.
>
> So all we did in 2.6.18, was to make a consistent set of rules for how
> this works.
>
> The default should therefore remain 'sharedcache', preferably returning
> an error if the user tries to mix metaphors.
>
> Cheers
> Trond
>
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
On Tue, 2007-05-15 at 09:58 +1000, Neil Brown wrote:
> Would we need to rev the mount_data version to add such a flag, or are
> we sure that unused flags are 0, and so simply add
>
> #define NFS_MOUNT_UNSHARED 0x8000 /* 5 */
>
> (I don't understand the NFS_MOUNT_FLAGMASK. Can the top 16 bits of
> flags be used?)
I don't see why not. It is a private field in a private data structure.
I've really never understood the point of that mask in the first place.
> If the "you have mixed metaphors" error was unique, the mount.nfs
> program could conceivably respond to it by setting the UNSHARED flag,
> trying again, and printing a big loud warning.... I wonder if that
> would be a good idea...
>
> But then what if you wanted sharedcache and weren't fussed about exact
> options, how would mount.nfs handle that? Finding a matching entry
> in /etc/mtab would be hard because fsid matching would be non-trivial.
> Maybe we want two flags "UNSHARED" and "SHARE_AND_IGNORE_MY_SETTINGS"??
None of the other filesystems allow you to do this. They will simply
return EBUSY, and leave it up to you to figure out what is wrong.
Trond
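
Whether a new flag such as 0x8000 can simply be added comes down to three checks: it is a single bit, it lies inside the flag mask, and it does not overlap a bit already in use. A minimal sketch of those checks follows. The helper name is hypothetical, and the values in the usage note are the ones quoted in this thread, not the complete list from the kernel header.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if `candidate` is exactly one bit,
 * falls inside `mask`, and does not collide with any flag in `used`.
 */
static bool flag_bit_is_free(unsigned int candidate,
                             const unsigned int *used, size_t n_used,
                             unsigned int mask)
{
    size_t i;

    /* Must be non-zero and a power of two (a single bit). */
    if (candidate == 0 || (candidate & (candidate - 1)) != 0)
        return false;
    /* Must not have any bit outside the mask. */
    if ((candidate & mask) != candidate)
        return false;
    /* Must not overlap a flag that is already assigned. */
    for (i = 0; i < n_used; i++) {
        if (used[i] & candidate)
            return false;
    }
    return true;
}
```

Called with the neighbouring values 0x1000, 0x2000 and 0x4000 and a mask of 0xFFFF, the helper accepts 0x8000, rejects 0x4000 as already taken, and rejects 0x10000 as outside the mask. Using the top 16 bits would therefore also mean widening the mask.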
From: Trond Myklebust <[email protected]>
Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks,
and is safe as long as they do not have inodes in common.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/super.c | 34 +++++++++++++++++++++++++++++-----
include/linux/nfs4_mount.h | 1 +
include/linux/nfs_mount.h | 1 +
3 files changed, 31 insertions(+), 5 deletions(-)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index ca20d3c..c03120a 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -291,6 +291,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
{ NFS_MOUNT_NONLM, ",nolock", "" },
{ NFS_MOUNT_NOACL, ",noacl", "" },
{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
+ { NFS_MOUNT_UNSHARED, ",nosharecache", ""},
{ 0, NULL, NULL }
};
const struct proc_nfs_info *nfs_infop;
@@ -602,6 +603,9 @@ static int nfs_compare_super(struct super_block *sb, void *data)
if (old->nfs_client != server->nfs_client)
return 0;
+ /* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
+ if (old->flags & NFS_MOUNT_UNSHARED)
+ return 0;
if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0)
return 0;
return 1;
@@ -615,6 +619,7 @@ static int nfs_get_sb(struct file_system_type *fs_type,
struct nfs_fh mntfh;
struct nfs_mount_data *data = raw_data;
struct dentry *mntroot;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
/* Validate the mount data */
@@ -629,8 +634,11 @@ static int nfs_get_sb(struct file_system_type *fs_type,
goto out_err_noserver;
}
+ if (server->flags & NFS_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
@@ -691,6 +699,7 @@ static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags,
struct super_block *s;
struct nfs_server *server;
struct dentry *mntroot;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
dprintk("--> nfs_xdev_get_sb()\n");
@@ -702,8 +711,11 @@ static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags,
goto out_err_noserver;
}
+ if (server->flags & NFS_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(&nfs_fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
@@ -808,6 +820,7 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
struct dentry *mntroot;
char *mntpath = NULL, *hostname = NULL, ip_addr[16];
void *p;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
if (data == NULL) {
@@ -879,8 +892,11 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
goto out_err_noserver;
}
+ if (server->flags & NFS4_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_free;
@@ -949,6 +965,7 @@ static int nfs4_xdev_get_sb(struct file_system_type *fs_type, int flags,
struct super_block *s;
struct nfs_server *server;
struct dentry *mntroot;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
dprintk("--> nfs4_xdev_get_sb()\n");
@@ -960,8 +977,11 @@ static int nfs4_xdev_get_sb(struct file_system_type *fs_type, int flags,
goto out_err_noserver;
}
+ if (server->flags & NFS4_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(&nfs_fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
@@ -1016,6 +1036,7 @@ static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
struct nfs_server *server;
struct dentry *mntroot;
struct nfs_fh mntfh;
+ int (*compare_super)(struct super_block *,void *) = nfs_compare_super;
int error;
dprintk("--> nfs4_referral_get_sb()\n");
@@ -1027,8 +1048,11 @@ static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags,
goto out_err_noserver;
}
+ if (server->flags & NFS4_MOUNT_UNSHARED)
+ compare_super = NULL;
+
/* Get a superblock - note that we may end up sharing one that already exists */
- s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server);
+ s = sget(&nfs_fs_type, compare_super, nfs_set_super, server);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;
diff --git a/include/linux/nfs4_mount.h b/include/linux/nfs4_mount.h
index 26b4c83..ad1bd4a 100644
--- a/include/linux/nfs4_mount.h
+++ b/include/linux/nfs4_mount.h
@@ -65,6 +65,7 @@ struct nfs4_mount_data {
#define NFS4_MOUNT_NOCTO 0x0010 /* 1 */
#define NFS4_MOUNT_NOAC 0x0020 /* 1 */
#define NFS4_MOUNT_STRICTLOCK 0x1000 /* 1 */
+#define NFS4_MOUNT_UNSHARED 0x8000 /* 1 */
#define NFS4_MOUNT_FLAGMASK 0xFFFF
#endif
diff --git a/include/linux/nfs_mount.h b/include/linux/nfs_mount.h
index cc8b9c5..3e3b521 100644
--- a/include/linux/nfs_mount.h
+++ b/include/linux/nfs_mount.h
@@ -62,6 +62,7 @@ struct nfs_mount_data {
#define NFS_MOUNT_STRICTLOCK 0x1000 /* reserved for NFSv4 */
#define NFS_MOUNT_SECFLAVOUR 0x2000 /* 5 */
#define NFS_MOUNT_NORDIRPLUS 0x4000 /* 5 */
+#define NFS_MOUNT_UNSHARED 0x8000 /* 5 */
#define NFS_MOUNT_FLAGMASK 0xFFFF
#endif
From: Trond Myklebust <[email protected]>
Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should
return EBUSY if the filesystem is already mounted on a superblock that
has set conflicting mount options.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/super.c | 33 ++++++++++++++++++++++++++++++++-
1 files changed, 32 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index c03120a..fd6e330 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -601,7 +601,9 @@ static int nfs_compare_super(struct super_block *sb, void *data)
{
struct nfs_server *server = data, *old = NFS_SB(sb);
- if (old->nfs_client != server->nfs_client)
+ if (memcmp(&old->nfs_client->cl_addr,
+ &server->nfs_client->cl_addr,
+ sizeof(old->nfs_client->cl_addr)) != 0)
return 0;
/* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
if (old->flags & NFS_MOUNT_UNSHARED)
@@ -611,6 +613,29 @@ static int nfs_compare_super(struct super_block *sb, void *data)
return 1;
}
+static int nfs_compare_mount_options(const struct nfs_server *a, const struct nfs_server *b)
+{
+ if (a->nfs_client != b->nfs_client)
+ goto Ebusy;
+ if (a->flags != b->flags)
+ goto Ebusy;
+ if (a->wsize != b->wsize)
+ goto Ebusy;
+ if (a->rsize != b->rsize)
+ goto Ebusy;
+ if (a->acregmin != b->acregmin)
+ goto Ebusy;
+ if (a->acregmax != b->acregmax)
+ goto Ebusy;
+ if (a->acdirmin != b->acdirmin)
+ goto Ebusy;
+ if (a->acdirmax != b->acdirmax)
+ goto Ebusy;
+ return 0;
+Ebusy:
+ return -EBUSY;
+}
+
static int nfs_get_sb(struct file_system_type *fs_type,
int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt)
{
@@ -645,8 +670,11 @@ static int nfs_get_sb(struct file_system_type *fs_type,
}
if (s->s_fs_info != server) {
+ error = nfs_compare_mount_options(server, NFS_SB(s));
nfs_free_server(server);
server = NULL;
+ if (error < 0)
+ goto error_splat_super;
}
if (!s->s_root) {
@@ -903,8 +931,11 @@ static int nfs4_get_sb(struct file_system_type *fs_type,
}
if (s->s_fs_info != server) {
+ error = nfs_compare_mount_options(server, NFS_SB(s));
nfs_free_server(server);
server = NULL;
+ if (error < 0)
+ goto error_splat_super;
}
if (!s->s_root) {
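
For readers who want to experiment with the rule outside the kernel, the option comparison in the patch above can be modelled in user space as below. This is a sketch, not the kernel code: the struct is a stand-in that carries only the fields the comparison reads, and the sketch_ names are made up for illustration.

```c
#include <errno.h>

/* User-space stand-in holding only the fields the comparison examines. */
struct sketch_nfs_server {
    const void *nfs_client;   /* identifies the client/connection state */
    unsigned int flags;
    unsigned int rsize;
    unsigned int wsize;
    unsigned int acregmin;
    unsigned int acregmax;
    unsigned int acdirmin;
    unsigned int acdirmax;
};

/*
 * Return 0 if the two mounts agree on every compared option,
 * or -EBUSY if any of them differs.
 */
static int sketch_compare_mount_options(const struct sketch_nfs_server *a,
                                        const struct sketch_nfs_server *b)
{
    if (a->nfs_client != b->nfs_client ||
        a->flags != b->flags ||
        a->wsize != b->wsize ||
        a->rsize != b->rsize ||
        a->acregmin != b->acregmin ||
        a->acregmax != b->acregmax ||
        a->acdirmin != b->acdirmin ||
        a->acdirmax != b->acdirmax)
        return -EBUSY;
    return 0;
}
```

Under this rule, a second mount of the same filesystem that asks for rsize=65536 while the first was made with rsize=1024 would get -EBUSY, where before it would have silently inherited the first mount's options.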
On Tuesday May 15, [email protected] wrote:
> From: Trond Myklebust <[email protected]>
>
> Add a flag to allow users to elect not to share the NFS super block with
> another mount point, even if the fsids are the same. This will allow
> users to set different mount options for the two different super blocks,
> and is safe as long as they do not have inodes in common.
Thanks.
Just to clarify: by "is safe" you mean "won't cause cache-coherence
strangeness", or is there some other safety issue that I'm missing?
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
>
> fs/nfs/super.c | 34 +++++++++++++++++++++++++++++-----
> include/linux/nfs4_mount.h | 1 +
> include/linux/nfs_mount.h | 1 +
> 3 files changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index ca20d3c..c03120a 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -291,6 +291,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
> { NFS_MOUNT_NONLM, ",nolock", "" },
> { NFS_MOUNT_NOACL, ",noacl", "" },
> { NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
> + { NFS_MOUNT_UNSHARED, ",nosharecache", ""},
^^^^^^^^^^^^
This is different to the spelling used in the subject line. Was that
intentional, or a typo?
Thanks,
NeilBrown
-------------------------------------------------------------------------
Are all NFS options from an export inherited by subsequent mounts since
the upstream commit 54ceac4515986030c2502960be620198dd8fe25b?
I'm pasting an email from a colleague at AMD that's also trying to gain
an understanding of the new mount behavior (Paul Krizak)...
Thanks,
--Greg
[...snip...]
* Here are the various automount points that mount to eng:/vol/vol18,
which is a single entry in eng's exports file:
[skaven@byleth /tool/eng-vol0/etc]$ ypcat -k auto.tool | grep eng.*vol18
arc_syslog -intr eng:/vol/vol18/&
site-lib -intr eng:/vol/vol18/site-config/provision/site-lib
site-config -intr eng:/vol/vol18/&
eng-vol18 -intr eng:/vol/vol18
sysadmin_tmp -intr eng:/vol/vol18/&
linux -intr eng:/vol/vol18/&
[skaven@byleth /tool/eng-vol0/etc]$ grep vol18 exports
/vol/vol18 -sec=sys,rw=@pcd,root=@tx_admin_nodes,anon=4058
* So perhaps the problem is that things using the same *export* inherit
the same options?
* First mount sysadmin_tmp as the first (and only) mount to eng:/vol/vol18:
[root@adcgar05 mnt]# grep eng.*vol18 /proc/mounts
[root@adcgar05 mnt]# mount -t nfs -o rsize=1024,wsize=1024 eng:/vol/vol18/sysadmin_tmp local
[root@adcgar05 mnt]# grep eng.*vol18 /proc/mounts
eng:/vol/vol18/sysadmin_tmp /mnt/local nfs rw,vers=3,rsize=1024,wsize=1024,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
* Now let's mount something else with "default" options:
[root@adcgar05 mnt]# mount -t nfs eng:/vol/vol18/arc_syslog local2
[root@adcgar05 mnt]# grep eng.*vol18 /proc/mounts
eng:/vol/vol18/sysadmin_tmp /mnt/local nfs rw,vers=3,rsize=1024,wsize=1024,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
eng:/vol/vol18/arc_syslog /mnt/local2 nfs rw,vers=3,rsize=1024,wsize=1024,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
Ah ha! That seems to be it!!!
My "mental model" of how mount/automount should work does *not* think
this is correct. While multiple mounts may occur on a single export
from a filer, the options from the client side for each individual mount
(especially for separate subdirs) should be customizable.
Inheriting the options from the previous mount to that export is asinine
and simply unsupportable in our environment. I did some further testing
and found that this "inheritance" model even happens for ro/rw
attributes, causing HUGE security implications. For example, what if
something like this happened:
1. mount an innocuous directory like /tool/sysadmin_tmp read-write
2. mount an important directory like /tool/finance_data from the same
filer:/vol/volume, but with -o ro
3. My experimentation shows that the "finance_data" mount will be
read-write due to inheritance!!!
Given the obvious implications for automount usage and the like, I believe
this to be very *bad* behavior. I could see how a sysadmin could set up
something like this expecting it to be secure. For example, a filer
could be set up with one exported dir, depending on its clients (with,
say, static /etc/fstab mounts) setting up whether the NFS mount is
read-only or read-write.
But with the way RHEL5 appears to act, a clever user could carefully
"cd" to a read-write dir first, then to the read-only one, and they'd
get read-write privileges where they were not supposed to!
Comments?
--
Paul Krizak 5900 E. Ben White Blvd. MS 625
Advanced Micro Devices Austin, TX 78741
Linux/Unix Systems Engineering Phone: (512) 602-8775
Silicon Design Division Cell: (512) 791-0686
[...snip...]
[...followup from Ian Kent (autofs) on rhelv5-list...]
I first noticed this (in my opinion) regression more than 6 months ago.
For a long time I thought it was only restricted to the ro/rw attributes
but recently I've seen the rsize/wsize issue mentioned.
There have been several bugs logged and posts made to the NFS list but I
haven't seen much more than an acknowledgment that it's a limitation of
the NFS implementation.
So it would be good to be counted by posting this comprehensive analysis
to the NFS list (https://lists.sourceforge.net/lists/listinfo/nfs).
Maybe that will increase the priority of fixing this issue.
.
.
.
Sure is and it causes several of the autofs Connectathon tests to fail
which makes me look bad when it's not actually something I have control
over, grrr!
Ian
[...snip...]
Gregory Baker wrote:
>
> ...closing the loop on what we've found out... it doesn't appear that a
> convenient workaround (at the mount level) exists. We'll play around
> with local executable automount maps to see if that will work for us.
>
> Thanks,
>
> --Greg
>
> *****
> Indeed, now that I read that comment by Trond and went to the source
> code, I found out about upstream commit
> 54ceac4515986030c2502960be620198dd8fe25b.
>
> The idea is that having a single super_block structure per server per
> FSID prevents corner cases that can lead to corrupt dentry cache trees,
> prevents conflicting buffer cache contents to what ends up being the
> same file, and some other scary situations.
>
> So the deal is, the mount flags (and NFS options) are set only the first
> time that a given combination of server and filesystem are mounted. If
> you ever mount the same filesystem from the same server on another
> mountpoint, you'll get the flags and options that were passed on to the
> first mount. There's no working around that.
> *****
>
> Gregory Baker wrote:
>> Trond Myklebust wrote:
>> > On Thu, 2007-04-19 at 15:37 -0500, Gregory Baker wrote:
>> >
>> >> ---+ BAD RHEL 5 64 system
>> > ...
>> >> [root@adcgar04 mnt2]# cat /proc/mounts | grep mnt2
>> >> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs ro,vers=3,rsize=65536,wsize=65536,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
>> >
>> > That would be your problem right there. Is the same volume perhaps
>> > mounted read-only somewhere else on the same client? That is no longer
>> > allowed.
>> >
>> > Cheers
>> > Trond
>> >
>>
>> Ah, news to me!
>>
>> So if you mount a volume ro at one mount point /mnt1 and then try to
>> mount the same volume rw at a second mount point /mnt2 you'll run into
>> problems? Apologies, didn't catch this in release notes. Thanks!
>>
>> --Greg
>>
>> ---+ A New Beginning
>>
>> [root@adcgar04 /]# umount -a -t nfs
>>
>> [root@adcgar04 /]# mount -v | grep pandora
>>
>> [root@adcgar04 /]# cat /proc/mounts | grep pandora
>>
>> [root@adcgar04 /]# mount eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
>>
>> [root@adcgar04 /]# mount -v | grep pandora
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs (rw,addr=163.181.34.137)
>>
>> [root@adcgar04 /]# cat /proc/mounts | grep pandora
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
>>
>> [root@adcgar04 /]# touch /mnt2/asdf
>>
>> ---+ The ro Strikes Back
>>
>> [root@adcgar04 /]# mount -o ro eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt1
>>
>> [root@adcgar04 /]# mount -o rw eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2
>>
>> [root@adcgar04 /]# mount -v | grep mnt
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt1 type nfs (ro,addr=163.181.34.137)
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 on /mnt2 type nfs (rw,addr=163.181.34.137)
>>
>> [root@adcgar04 /]# cat /proc/mounts | grep mnt
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt1 nfs ro,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
>> eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs ro,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=eng 0 0
>>
>> [root@adcgar04 /]# touch /mnt2/asdf
>> touch: cannot touch `/mnt2/asdf': Read-only file system
>>
>>
>>
>>
>>
>
--
----------------------------------------------------------------------
Greg Baker 512-602-3287 (work)
[email protected] 512-602-6970 (fax)
5204 E. Ben White Blvd MS 625 512-555-1212 (info)
Austin, TX 78741
-------------------------------------------------------------------------
On Fri, 2007-05-04 at 15:30 -0500, Gregory Baker wrote:
> Are all NFS options from an export inherited by subsequent mounts since
> the upstream commit 54ceac4515986030c2502960be620198dd8fe25b?
>
> I'm pasting an email from a colleague at AMD that's also trying to gain
> an understanding of the new mount behavior (Paul Krizak)...
>
> Thanks,
>
> --Greg
If two mounts refer to the same _filesystem_ (i.e. the NFS server
exports them with the same value for the 'fsid' attribute), then they
will share the same options.
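As a rough illustration of the rule just stated, here is a toy Python
model (not the kernel code) in which the client keeps one option set per
(server, fsid) pair and the first mount decides it. The class name, the
"vol18" fsid label and the option dictionary layout are all assumptions
made up for this sketch.

```python
from typing import Dict, Tuple


class SharedSuperblockModel:
    """Toy model: one option set per (server, fsid); the first mount wins."""

    def __init__(self) -> None:
        # (server, fsid) -> options chosen by the first mount of that filesystem
        self._superblocks: Dict[Tuple[str, str], Dict[str, str]] = {}
        # mountpoint -> key of the superblock that mountpoint ended up using
        self._mounts: Dict[str, Tuple[str, str]] = {}

    def mount(self, server: str, fsid: str, mountpoint: str,
              options: Dict[str, str]) -> Dict[str, str]:
        key = (server, fsid)
        # setdefault keeps the existing options if the filesystem is already mounted
        effective = self._superblocks.setdefault(key, dict(options))
        self._mounts[mountpoint] = key
        return effective

    def effective_options(self, mountpoint: str) -> Dict[str, str]:
        return self._superblocks[self._mounts[mountpoint]]


model = SharedSuperblockModel()
# "vol18" stands in for whatever fsid the server reports (hypothetical label).
model.mount("eng", "vol18", "/mnt/local",
            {"access": "rw", "rsize": "1024", "wsize": "1024"})
model.mount("eng", "vol18", "/mnt/local2",
            {"access": "rw", "rsize": "65536", "wsize": "65536"})
print(model.effective_options("/mnt/local2"))
```

In this model the second mount reports rsize=1024 and wsize=1024, which
matches what the /proc/mounts output earlier in the thread shows for
/mnt/local2.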
If you are trying to use different security models for different
mountpoints, you should in any case be considering using different
filesystems on the server: NFS filehandles are normally not tied to a
pathname since they are required to remain the same after a
cross-directory rename().
The exception is if the file and the directory live on different
filesystems (since then a cross-directory rename() is not permitted
anyway) and so the filehandles will usually encode the fsid in some way
or form.
IOW: if /foo and /bar refer to directories on the same filesystem, then the
server cannot figure out if the file baz.txt lives in one or the other
directory. Giving /foo and /bar different security permissions is really
the same as giving _both_ sets of permissions. Even if the server
encodes the permissions in the filehandle (usually by recording the
export point) then the client is free to spoof a filehandle for the same
file with the other set of permissions.
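The filehandle argument above can be sketched with a toy server in
Python. This is only an illustration under simplifying assumptions: a
filehandle is modelled as an (fsid, fileid) pair, and ToyServer and its
methods are hypothetical names, not any real NFS server interface.

```python
from typing import Dict, NamedTuple


class FileHandle(NamedTuple):
    fsid: str    # identifies the filesystem on the server
    fileid: int  # identifies the file within it (think inode number)


class ToyServer:
    """Hypothetical server whose filehandles carry no pathname at all."""

    def __init__(self, fsid: str) -> None:
        self.fsid = fsid
        self._paths: Dict[str, int] = {}
        self._next_fileid = 1

    def create(self, path: str) -> FileHandle:
        self._paths[path] = self._next_fileid
        self._next_fileid += 1
        return self.lookup(path)

    def lookup(self, path: str) -> FileHandle:
        return FileHandle(self.fsid, self._paths[path])

    def rename(self, old: str, new: str) -> None:
        # Only the name changes; the file keeps its fileid.
        self._paths[new] = self._paths.pop(old)


server = ToyServer(fsid="vol18")
before = server.create("/foo/baz.txt")
server.rename("/foo/baz.txt", "/bar/baz.txt")
after = server.lookup("/bar/baz.txt")
print(before == after)  # True
```

Because the handle is the same before and after the cross-directory
rename, nothing in it tells the server whether a request arrived "via"
/foo or "via" /bar, which is why per-directory permissions on one
filesystem do not hold up.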
Cheers
Trond
> [...snip...]
-------------------------------------------------------------------------
On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> On Fri, 2007-05-04 at 15:30 -0500, Gregory Baker wrote:
> > Are all NFS options from an export inherited by subsequent mounts since
> > the upstream commit 54ceac4515986030c2502960be620198dd8fe25b?
I wish I'd paid more attention to the shared superblock patches but I
was consumed by autofs at the time. Even so I likely wouldn't have
realized the implications.
> >
> > I'm pasting an email from a colleague at AMD that's also trying to gain
> > an understanding of the new mount behavior (Paul Krizak)...
> >
> > Thanks,
> >
> > --Greg
>
> If two mounts refer to the same _filesystem_ (i.e. the NFS server
> exports them with the same value for the 'fsid' attribute), then they
> will share the same options.
>
> If you are trying to use different security models for different
> mountpoints, you should in any case be considering using different
> filesystems on the server: NFS filehandles are normally not tied to a
> pathname since they are required to remain the same after a
> cross-directory rename().
> The exception is if the file and the directory live on different
> filesystems (since then a cross-directory rename() is not permitted
> anyway) and so the filehandles will usually encode the fsid in some way
> or form.
>
> IOW: if /foo and /bar refer to directories on the same filesystem, then the
> server cannot figure out if the file baz.txt lives in one or the other
> directory. Giving /foo and /bar different security permissions is really
> the same as giving _both_ sets of permissions. Even if the server
> encodes the permissions in the filehandle (usually by recording the
> export point) then the client is free to spoof a filehandle for the same
> file with the other set of permissions.
I don't see how doing this on the client adds any security at all as the
spoofing issue is still present. It does prevent people from being able
to do read-only mounts for administrative purposes, such as when they are
used to guard against accidents rather than to provide real security. But then I
see this isn't the only reason these patches were so important.
So is there "anything at all" we can do to change this behavior now,
apart from teaching mount about it? Having /etc/mtab
and /proc/mounts recording different mount options for a mount is wrong.
How should mount(8) handle the situation?
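To make the /etc/mtab versus /proc/mounts disagreement concrete, here is
a small Python sketch that compares the ro/rw flag recorded in each for
every NFS mountpoint. It assumes both inputs use the usual
whitespace-separated "device mountpoint type options ..." layout; the
sample mtab lines are reconstructed from the mount -v output quoted in
this thread, and the function names are made up for the example.

```python
from typing import Dict, List, Tuple


def parse_mounts(text: str) -> Dict[str, List[str]]:
    """Map mountpoint -> option list for every NFS entry in the text."""
    mounts: Dict[str, List[str]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        mounts[fields[1]] = fields[3].split(",")
    return mounts


def access_mode(options: List[str]) -> str:
    return "ro" if "ro" in options else "rw"


def find_access_mismatches(mtab_text: str,
                           proc_text: str) -> List[Tuple[str, str, str]]:
    """Return (mountpoint, recorded, actual) where ro/rw disagree."""
    recorded = parse_mounts(mtab_text)
    actual = parse_mounts(proc_text)
    mismatches = []
    for mountpoint, options in sorted(recorded.items()):
        if mountpoint in actual:
            want, got = access_mode(options), access_mode(actual[mountpoint])
            if want != got:
                mismatches.append((mountpoint, want, got))
    return mismatches


# Sample data modelled on the "ro Strikes Back" transcript (assumed layout).
DEV = "eng:/vol/vol4/pandora/pandora-k26_g25_64-2"
mtab_sample = (
    f"{DEV} /mnt1 nfs ro,addr=163.181.34.137 0 0\n"
    f"{DEV} /mnt2 nfs rw,addr=163.181.34.137 0 0\n"
)
proc_sample = (
    f"{DEV} /mnt1 nfs ro,vers=3,rsize=65536,wsize=65536,hard,proto=tcp 0 0\n"
    f"{DEV} /mnt2 nfs ro,vers=3,rsize=65536,wsize=65536,hard,proto=tcp 0 0\n"
)
print(find_access_mismatches(mtab_sample, proc_sample))
```

On the sample data only /mnt2 is reported, as recorded rw but actually
ro. On a real client the two strings would be read from /etc/mtab and
/proc/mounts.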
>
> Cheers
> Trond
>
> > [...snip...]
-------------------------------------------------------------------------
On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > On Fri, 2007-05-04 at 15:30 -0500, Gregory Baker wrote:
> > > Are all NFS options from an export inherited by subsequent mounts since
> > > the upstream commit 54ceac4515986030c2502960be620198dd8fe25b?
>
> I wish I'd paid more attention to the shared superblock patches but I
> was consumed by autofs at the time. Even so I likely wouldn't have
> realized the implications.
>
> > >
> > > I'm pasting an email from a colleague at AMD that's also trying to gain
> > > an understanding of the new mount behavior (Paul Krizak)...
> > >
> > > Thanks,
> > >
> > > --Greg
> >
> > If two mounts refer to the same _filesystem_ (i.e. the NFS server
> > exports them with the same value for the 'fsid' attribute), then they
> > will share the same options.
> >
> > If you are trying to use different security models for different
> > mountpoints, you should in any case be considering using different
> > filesystems on the server: NFS filehandles are normally not tied to a
> > pathname since they are required to remain the same after a
> > cross-directory rename().
> > The exception is if the file and the directory live on different
> > filesystems (since then a cross-directory rename() is not permitted
> > anyway) and so the filehandles will usually encode the fsid in some way
> > or form.
> >
> > IOW: if /foo and /bar refer to directories on the same filesystem, then the
> > server cannot figure out if the file baz.txt lives in one or the other
> > directory. Giving /foo and /bar different security permissions is really
> > the same as giving _both_ sets of permissions. Even if the server
> > encodes the permissions in the filehandle (usually by recording the
> > export point) then the client is free to spoof a filehandle for the same
> > file with the other set of permissions.
>
> I don't see how doing this on the client adds any security at all as the
> spoofing issue is still present. It does prevent people from being able
> to do read-only mounts for administrative purposes, such as when they are
> used to guard against accidents rather than to provide real security. But then I
> see this isn't the only reason these patches were so important.
Why are people arguing that NFS should be working in a completely
different fashion to all other filesystems? Fix the VFS to allow
read-only bind mounts, and NFS will work just fine with that.

The problem with the old behaviour was that it screwed people over by
causing file caching to be inconsistent on the same client. IOW: if I
wrote to the file on the read-write partition, then those changes would
not be immediately guaranteed to be visible to a process that happened
to read the file on the read-only partition.
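
The caching problem described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the Server and Superblock classes and both scenario functions are invented for illustration, and the cache here never revalidates, whereas a real NFS client does revalidate after attribute timeouts (which is why the changes are "not immediately guaranteed" to be visible rather than never visible).

```python
class Server:
    """Stand-in for the NFS server: one authoritative copy of each file."""

    def __init__(self):
        self.files = {}


class Superblock:
    """Toy client-side superblock with its own private data cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        # Serve from the local cache when possible; this toy never revalidates.
        if name not in self.cache:
            self.cache[name] = self.server.files[name]
        return self.cache[name]

    def write(self, name, data):
        # Write through to the server and update only this superblock's cache.
        self.cache[name] = data
        self.server.files[name] = data


def separate_superblocks():
    """Old behaviour: the ro and rw mounts each get their own superblock."""
    server = Server()
    server.files["baz.txt"] = "old"
    rw_sb, ro_sb = Superblock(server), Superblock(server)
    ro_sb.read("baz.txt")            # populate the read-only mount's cache
    rw_sb.write("baz.txt", "new")    # update through the read-write mount
    return ro_sb.read("baz.txt")     # still served from the stale cache


def shared_superblock():
    """New behaviour: both mounts of the same filesystem share one superblock."""
    server = Server()
    server.files["baz.txt"] = "old"
    sb = Superblock(server)
    sb.read("baz.txt")
    sb.write("baz.txt", "new")
    return sb.read("baz.txt")


print(separate_superblocks())  # old
print(shared_superblock())     # new
```

With one cache per filesystem there is nothing to fall out of sync, at the cost that both mounts must also share the same options.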
> So is there "anything at all" we can do to change this behavior now,
> apart from teaching mount about it. Having the /etc/mtab
> and /proc/mounts recording different mount options for a mount is wrong.
>
> How should mount(8) handle the situation?
I'd rather like to have mount(2) handle the situation by returning an
error if the mount options cannot be satisfied.
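
Until something like that exists, the disagreement between /etc/mtab and /proc/mounts can at least be detected from userspace. The following is a minimal sketch, not part of mount(8): the function names are hypothetical, and the sample lines are adapted from the /mnt2 demonstration earlier in this thread (the mtab line is reconstructed from the "mount -v" output and the /proc/mounts option list is shortened, so both are assumptions).

```python
def parse_mounts(text):
    """Map each mount point to its set of options (fstab-style lines)."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        # Fields: device, mount point, fs type, comma-separated options, ...
        table[fields[1]] = set(fields[3].split(","))
    return table


def access_mode(options):
    """Return 'ro' or 'rw' for an option set, or None if neither is present."""
    if "ro" in options:
        return "ro"
    if "rw" in options:
        return "rw"
    return None


def find_mode_mismatches(mtab_text, proc_text):
    """List (mount point, mtab mode, /proc/mounts mode) where the two differ."""
    mtab = parse_mounts(mtab_text)
    proc = parse_mounts(proc_text)
    mismatches = []
    for mountpoint in sorted(mtab.keys() & proc.keys()):
        recorded = access_mode(mtab[mountpoint])
        actual = access_mode(proc[mountpoint])
        if recorded != actual:
            mismatches.append((mountpoint, recorded, actual))
    return mismatches


# Sample data adapted from the demonstration in this thread (assumed format).
SAMPLE_MTAB = (
    "eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs "
    "rw,addr=163.181.34.137 0 0\n"
)
SAMPLE_PROC = (
    "eng:/vol/vol4/pandora/pandora-k26_g25_64-2 /mnt2 nfs "
    "ro,vers=3,hard,intr,proto=tcp,sec=sys,addr=eng 0 0\n"
)

for mountpoint, recorded, actual in find_mode_mismatches(SAMPLE_MTAB, SAMPLE_PROC):
    print(f"{mountpoint}: mtab says {recorded}, kernel says {actual}")
```

On a live client the two strings would come from reading /etc/mtab and /proc/mounts. Where /etc/mtab is a symlink into /proc, the two always agree and the check is moot.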
Trond
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Sat, 2007-05-05 at 13:17 -0400, Trond Myklebust wrote:
> On Sat, 2007-05-05 at 23:37 +0800, Ian Kent wrote:
> > On Fri, 2007-05-04 at 17:41 -0400, Trond Myklebust wrote:
> > > On Fri, 2007-05-04 at 15:30 -0500, Gregory Baker wrote:
> > > > Are all NFS options from an export inherited by subsequent mounts since
> > > > the upstream commit 54ceac4515986030c2502960be620198dd8fe25b?
> >
> > I wish I'd paid more attention to the shared superblock patches but I
> > was consumed by autofs at the time. Even so I likely wouldn't have
> > realized the implications.
> >
> > > >
> > > > I'm pasting an email from a colleague at AMD that's also trying to gain
> > > > an understanding of the new mount behavior (Paul Krizak)...
> > > >
> > > > Thanks,
> > > >
> > > > --Greg
> > >
> > > If two mounts refer to the same _filesystem_ (i.e. the NFS server
> > > exports them with the same value for the 'fsid' attribute), then they
> > > will share the same options.
> > >
> > > > If you are trying to use different security models for different
> > > > mountpoints, you should in any case be considering using different
> > > filesystems on the server: NFS filehandles are normally not tied to a
> > > pathname since they are required to remain the same after a
> > > cross-directory rename().
> > > The exception is if the file and the directory live on different
> > > filesystems (since then a cross-directory rename() is not permitted
> > > anyway) and so the filehandles will usually encode the fsid in some way
> > > or form.
> > >
> > > > IOW: if /foo and /bar refer to directories on the same filesystem, then the
> > > server cannot figure out if the file baz.txt lives in one or the other
> > > directory. Giving /foo and /bar different security permissions is really
> > > the same as giving _both_ sets of permissions. Even if the server
> > > encodes the permissions in the filehandle (usually by recording the
> > > export point) then the client is free to spoof a filehandle for the same
> > > file with the other set of permissions.
> >
> > I don't see how doing this on the client adds any security at all as the
> > spoofing issue is still present. It does prevent people from being able
> > to do read-only mounts for administrative purposes, such as when used to
> > guard against accidents rather than provide real security. But then I
> > see this isn't the only reason these patches were so important.
>
> Why are people arguing that NFS should be working in a completely
> different fashion to all other filesystems? Fix the VFS to allow
> read-only bind mounts, and NFS will work just fine with that.
Sorry, I don't follow.
How does this affect a client doing two distinct mounts to a server?
Are you suggesting that such a bind mount should be done on the server
and then exported? I guess that would give us distinct super blocks.
>
> The problem with the old behaviour was that it screwed people over by
> causing file caching to be inconsistent on the same client. IOW: if I
> wrote to the file on the read-write partition, then those changes would
> not be immediately guaranteed to be visible to a process that happened
> to read the file on the read-only partition.
>
> > So is there "anything at all" we can do to change this behavior now,
> > apart from teaching mount about it. Having the /etc/mtab
> > and /proc/mounts recording different mount options for a mount is wrong.
> >
> > How should mount(8) handle the situation?
>
> I'd rather like to have mount(2) handle the situation by returning an
> error if the mount options cannot be satisfied.
>
> Trond
On Sun, 2007-05-06 at 02:27 +0800, Ian Kent wrote:
> On Sat, 2007-05-05 at 13:17 -0400, Trond Myklebust wrote:
> > Why are people arguing that NFS should be working in a completely
> > different fashion to all other filesystems? Fix the VFS to allow
> > read-only bind mounts, and NFS will work just fine with that.
>
> Sorry, I don't follow.
> How does this affect a client doing two distinct mounts to a server?
> Are you suggesting that such a bind mount should be done on the server
> and then exported? I guess that would give us distinct super blocks.
No. People are arguing that the client should allow the _same_
filesystem to be mounted both read-only and read-write. That is not
currently permitted for any other filesystem. The reason is that the
read-only flag acts on a per-superblock basis, and hence on a
per-filesystem basis. This is why you cannot do

mount --bind -oro /foo /bar

If you fix the VFS to allow the above by making the read-only flag a
per-mountpoint flag instead of a per-superblock flag, then NFS can
happily do the same.
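
The difference between the two designs can be sketched in a few lines. This is a toy model only, not kernel code: the Superblock and Mount classes, the per-mount read_only field, and both check functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    """One instance per mounted filesystem."""
    fsid: str
    read_only: bool = False   # the flag as described above: per-superblock


@dataclass
class Mount:
    """One instance per mountpoint; several may share a superblock."""
    path: str
    sb: Superblock
    read_only: bool = False   # hypothetical per-mountpoint flag


def writable_per_superblock(mount):
    # Current model: only the superblock flag is consulted, so every
    # mountpoint of the same filesystem gets the same answer.
    return not mount.sb.read_only


def writable_per_mount(mount):
    # Proposed model: a mountpoint may be read-only on its own, and a
    # read-only superblock still forbids writes everywhere.
    return not (mount.sb.read_only or mount.read_only)


sb = Superblock(fsid="fs1")
foo = Mount("/foo", sb)
bar = Mount("/bar", sb, read_only=True)   # like: mount --bind -oro /foo /bar

print(writable_per_superblock(foo), writable_per_superblock(bar))  # True True
print(writable_per_mount(foo), writable_per_mount(bar))            # True False
```

In the per-superblock model the read-only request for /bar has nowhere to live, which matches the observation that two mounts of the same filesystem end up sharing their options.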
Trond
On Sat, 2007-05-05 at 14:49 -0400, Trond Myklebust wrote:
> On Sun, 2007-05-06 at 02:27 +0800, Ian Kent wrote:
> > On Sat, 2007-05-05 at 13:17 -0400, Trond Myklebust wrote:
> > > Why are people arguing that NFS should be working in a completely
> > > different fashion to all other filesystems? Fix the VFS to allow
> > > read-only bind mounts, and NFS will work just fine with that.
> >
> > Sorry, I don't follow.
> > How does this affect a client doing two distinct mounts to a server?
> > Are you suggesting that such a bind mount should be done on the server
> > and then exported? I guess that would give us distinct super blocks.
>
> No. People are arguing that the client should allow the _same_
> filesystem to be mounted both read-only and read-write. That is not
> currently permitted for any other filesystem. The reason is that the
> read-only flag acts on a per-superblock basis, and hence on a
> per-filesystem basis. This is why you cannot do
>
> mount --bind -oro /foo /bar
>
> If you fix the VFS to allow the above by making the read-only flag a
> per-mountpoint flag instead of a per-superblock flag, then NFS can
> happily do the same.
When this work is completed, is there anything that needs to be done in
the NFS client, or won't you know until you see the patches for it?
Ian