2001-03-09 23:23:25

by L A Walsh

Subject: (struct dentry *)->vfsmnt;

Could someone enlighten me as to the purpose of this field in the
dentry struct? There is no elucidating comment in the header for this
particular field and the name/type only indicate it is pointing to
a list of vfsmounts. Can a dentry belong to more than one vfsmount?

If I have a 'dentry' and simply want to determine what the absolute
path from root is, in the 'd_path' macro, would I use 'rootmnt' of my
current->fs as the 'vfsmount' as well?

Thanks, in advance...
-linda


--
L A Walsh | Trust Technology, Core Linux, SGI
[email protected] | Voice: (650) 933-53


2001-03-10 01:01:12

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Fri, 9 Mar 2001, LA Walsh wrote:

> Could someone enlighten me as to the purpose of this field in the
> dentry struct? There is no elucidating comment in the header for this
> particular field and the name/type only indicate it is pointing to
> a list of vfsmounts. Can a dentry belong to more than one vfsmount?

Yes.

> If I have a 'dentry' and simply want to determine what the absolute
> path from root is, in the 'd_path' macro, would I use 'rootmnt' of my
> current->fs as the 'vfsmount' as well?

No such thing. The same fs may be present in many places. Please,
describe the situation - where do you get that dentry from?
Cheers,
Al

2001-03-10 02:02:39

by L A Walsh

Subject: Re: (struct dentry *)->vfsmnt;

Alexander Viro wrote:
> No such thing. The same fs may be present in many places. Please,
> describe the situation - where do you get that dentry from?
> Cheers,
> Al
---

Al,
I'm getting it from various places, 1) if I want to know the
path relative to the root of the dentry at the end of 'path_walk'
or __user_path_walk (as used in truncate) and
2) If I've gotten a dentry as in sys_fchdir/fchown/fstat/newfstat
from a file descriptor and I want the absolute path or if multiple
(such as multiple mounts of the same fs in different locations), the
one that the user used to access the dentry.

In 2.2 there was a way to get the path only from the
dentry (d_path) -- I'm looking for similar functionality for the
above cases.

Is it such that in 2.2 dentries were only relative to root
where in 2.4 they are relative to their mount point and instead of
duplicate dcache entries for each possible mount point, they get stored
as one?

If that's the case, then while I might get a path for user-path
walk, if I just have a 'fd', it may not be possible to backtrace into
the path the user used to access the file?

Just some wild speculations on my part....:-/...did
I refine the question enough?

thanks,
-linda


--
L A Walsh | Trust Technology, Core Linux, SGI
[email protected] | Voice: (650) 933-5338

2001-03-10 02:43:04

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Fri, 9 Mar 2001, LA Walsh wrote:

[getting path by dentry]
> I'm getting it from various places, 1) if I want to know the
> path relative to the root of the dentry at the end of 'path_walk'
> or __user_path_walk (as used in truncate) and

In that case you have nd->mnt and nd->dentry

> 2) If I've gotten a dentry as in sys_fchdir/fchown/fstat/newfstat
> from a file descriptor and I want the absolute path or if multiple

In that case you have file->f_vfsmnt and file->f_dentry

If you have the pair (mnt, dentry) - p = d_path(mnt, dentry, buf, buflen);
will put the path into buf and set p pointing to its beginning.

dentry alone is not enough - it simply doesn't describe a unique point
in the namespace. It does describe the unique point in the filesystem
tree, but that tree can occur in several places of the unified tree.
Thus the need of pair (vfsmount, dentry).

If you ever need to do something like "let's find any vfsmount that
could go in pair with our dentry" - you are doing something wrong
(or I had missed something last Spring). Table below gives the
list of such pairs.
                    vfsmount        dentry
file                ->f_vfsmnt      ->f_dentry
nameidata           ->mnt           ->dentry
swap component      ->swap_vfsmnt   ->swap_file
knfsd export        ->exp_mnt       ->exp_dentry
cwd                 ->pwdmnt        ->pwd
root                ->rootmnt       ->root
emul.root           ->altrootmnt    ->altroot
mountpoint          ->mnt_parent    ->mnt_mountpoint

Does that cover your case?
Cheers,
Al
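
A minimal userland sketch of the point made in this message, that a path is a property of the pair (vfsmount, dentry) and not of the dentry alone. The toy_dentry and toy_mount types and the toy_d_path() and toy_demo() functions are hypothetical stand-ins written for illustration; they are not the kernel structures or the kernel's d_path(). The sketch assumes that the dentry passed in really belongs to the tree mounted by mnt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, NOT the kernel structures: only what the walk needs. */
struct toy_dentry {
    const char *name;           /* component name */
    struct toy_dentry *parent;  /* parent in the same filesystem tree */
};

struct toy_mount {
    struct toy_mount *parent;       /* mount this one sits on; itself at the top */
    struct toy_dentry *mountpoint;  /* dentry in the parent mount that is covered */
    struct toy_dentry *root;        /* root dentry of the mounted tree */
};

/* Build the path right to left at the end of buf; NULL if buf is too small. */
static char *toy_d_path(const struct toy_mount *mnt,
                        const struct toy_dentry *dentry,
                        char *buf, size_t buflen)
{
    char *p;

    assert(mnt != NULL && dentry != NULL && buf != NULL);
    if (buflen < 2)
        return NULL;
    p = buf + buflen;
    *--p = '\0';

    for (;;) {
        size_t len;

        if (dentry == mnt->root) {
            if (mnt->parent == mnt)
                break;                 /* top of the unified tree */
            dentry = mnt->mountpoint;  /* step onto the covered dentry */
            mnt = mnt->parent;
            continue;
        }
        len = strlen(dentry->name);
        if ((size_t)(p - buf) < len + 1)
            return NULL;
        p -= len;
        memcpy(p, dentry->name, len);
        *--p = '/';
        dentry = dentry->parent;
    }
    if (*p == '\0')
        *--p = '/';  /* the pair named the top-level root itself */
    return p;
}

/* One filesystem tree, attached at /mnt and again at /tmp/foo. */
static int toy_demo(char *via_mnt, char *via_bind, size_t len)
{
    struct toy_dentry rootfs_root = { "", &rootfs_root };
    struct toy_dentry mnt_dir = { "mnt", &rootfs_root };
    struct toy_dentry tmp_dir = { "tmp", &rootfs_root };
    struct toy_dentry foo_dir = { "foo", &tmp_dir };
    struct toy_dentry fs_root = { "", &fs_root };
    struct toy_dentry data = { "data", &fs_root };
    struct toy_mount top = { &top, &rootfs_root, &rootfs_root };
    struct toy_mount at_mnt = { &top, &mnt_dir, &fs_root };
    struct toy_mount at_foo = { &top, &foo_dir, &fs_root };
    char buf[64];
    char *p;

    p = toy_d_path(&at_mnt, &data, buf, sizeof(buf));
    if (p == NULL || strlen(p) >= len)
        return -1;
    strcpy(via_mnt, p);

    p = toy_d_path(&at_foo, &data, buf, sizeof(buf));
    if (p == NULL || strlen(p) >= len)
        return -1;
    strcpy(via_bind, p);
    return 0;
}
```

Calling toy_demo() with two buffers fills them with two different paths for the very same toy dentry, which is why the table above always lists a vfsmount next to each dentry.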

2001-03-14 01:29:50

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

Al, you write:
> If you have the pair (mnt, dentry) - p = d_path(mnt, dentry, buf, buflen);
> will put the path into buf and set p pointing to its beginning.
>
> dentry alone is not enough - it simply doesn't describe a unique point
> in the namespace. It does describe the unique point in the filesystem
> tree, but that tree can occur in several places of the unified tree.
> Thus the need of pair (vfsmount, dentry).
>
> If you ever need to do something like "let's find any vfsmount that
> could go in pair with our dentry" - you are doing something wrong
> (or I had missed something last Spring). Table below gives the
> list of such pairs.
>                     vfsmount        dentry
> file                ->f_vfsmnt      ->f_dentry
> nameidata           ->mnt           ->dentry
> swap component      ->swap_vfsmnt   ->swap_file
> knfsd export        ->exp_mnt       ->exp_dentry
> cwd                 ->pwdmnt        ->pwd
> root                ->rootmnt       ->root
> emul.root           ->altrootmnt    ->altroot
> mountpoint          ->mnt_parent    ->mnt_mountpoint

What about if I want to know the mountpoint (inside the filesystem)
when it is mounted? The comments in the code say sb->s_type->kern_mnt
is only valid for in-kernel filesystems (FS_SINGLE).

Would it be possible to put a valid vfsmnt pointer in kern_mnt for
non-FS_SINGLE filesystems? Would only the vfsmnt information (maybe
d_path(kern_mnt, kern_mnt->mnt_mountpoint, buf, buflen)) be enough
to determine the pathname of the filesystem mount point?

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 02:33:47

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Tue, 13 Mar 2001, Andreas Dilger wrote:

> What about if I want to know the mountpoint (inside the filesystem)
> when it is mounted?

Which mountpoint? There can be a lot of them (quite possibly - some
of them out of the chroot jail you are in, so "any" is unlikely to
do you any good).
Cheers,
Al

2001-03-14 04:29:32

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

You write:
> > What about if I want to know the mountpoint (inside the filesystem)
> > when it is mounted?
>
> Which mountpoint? There can be a lot of them (quite possibly - some
> of them out of the chroot jail you are in, so "any" is unlikely to
> do you any good).

How about the first one? The one that calls the "read_super" method.
AFAICT, only the first mount calls down to the FS anyways (the rest
is VFS internal).

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 05:20:50

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

Al writes:
> > How about the first one? The one that calls the "read_super" method.
> > AFAICT, only the first mount calls down to the FS anyways (the rest
> > is VFS internal).
>
> And what should that be after
>
> mount -t ext2 /dev/sda1 /mnt
> mount --bind /mnt /tmp/foo
> umount /mnt

Yes, I know you _can_ do all sorts of tricks like this, but most people
don't really do it. In any case, I would be happy if I could even get
"/mnt" from the first mount. If it comes to the point where I can get
that, then I will start to worry about "mount --bind".

This is to store in the ext2 on-disk superblock, which is currently always
(from dumpe2fs -h /dev/hdX):

Last mounted on: <not available>

To be able to put _something_ there will suit my needs.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 05:12:17

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Tue, 13 Mar 2001, Andreas Dilger wrote:

> You write:
> > > What about if I want to know the mountpoint (inside the filesystem)
> > > when it is mounted?
> >
> > Which mountpoint? There can be a lot of them (quite possibly - some
> > of them out of the chroot jail you are in, so "any" is unlikely to
> > do you any good).
>
> How about the first one? The one that calls the "read_super" method.
> AFAICT, only the first mount calls down to the FS anyways (the rest
> is VFS internal).

And what should that be after

mount -t ext2 /dev/sda1 /mnt
mount --bind /mnt /tmp/foo
umount /mnt
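
The three commands above can be modelled in a few lines of userland C to show why "the first mountpoint" is a fragile thing to record. This is only a sketch: toy_super, toy_attach(), toy_detach() and toy_is_mounted_at() are hypothetical names invented for the illustration and do not correspond to kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MOUNTS 8

/* Toy superblock: where it was first mounted, and where it is visible now. */
struct toy_super {
    const char *first_mounted_on;            /* recorded once, at first attach */
    const char *mounted_at[TOY_MAX_MOUNTS];  /* current attachment points */
    size_t nr_mounts;
};

/* Attach the filesystem at 'where'; the first attach records the location. */
static int toy_attach(struct toy_super *sb, const char *where)
{
    assert(sb != NULL && where != NULL);
    if (sb->nr_mounts == TOY_MAX_MOUNTS)
        return -1;
    if (sb->first_mounted_on == NULL)
        sb->first_mounted_on = where;
    sb->mounted_at[sb->nr_mounts++] = where;
    return 0;
}

/* Detach from 'where'; the recorded first location is left untouched. */
static int toy_detach(struct toy_super *sb, const char *where)
{
    size_t i;

    for (i = 0; i < sb->nr_mounts; i++) {
        if (strcmp(sb->mounted_at[i], where) == 0) {
            sb->mounted_at[i] = sb->mounted_at[--sb->nr_mounts];
            return 0;
        }
    }
    return -1;
}

/* Is the filesystem currently visible at 'where'? */
static int toy_is_mounted_at(const struct toy_super *sb, const char *where)
{
    size_t i;

    for (i = 0; i < sb->nr_mounts; i++)
        if (strcmp(sb->mounted_at[i], where) == 0)
            return 1;
    return 0;
}
```

Attaching at "/mnt", attaching again at "/tmp/foo" and then detaching "/mnt" leaves first_mounted_on naming a place where the filesystem is no longer visible, while it is still mounted at "/tmp/foo".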


2001-03-14 05:50:56

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Tue, 13 Mar 2001, Andreas Dilger wrote:

> Yes, I know you _can_ do all sorts of tricks like this, but most people
> don't really do it. In any case, I would be happy if I could even get

Ugh. That sounds like a job for mount(8), not mount(2), then. BTW, userland
(mount(8)) looks like the only potential user for that.

> "/mnt" from the first mount. If it comes to the point where I can get
> that, then I will start to worry about "mount --bind".
>
> This is to store in the ext2 on-disk superblock, which is currently always
> (from dumpe2fs -h /dev/hdX):
>
> Last mounted on: <not available>
>
> To be able to put _something_ there will suit my needs.

OK... I don't like the idea of passing a vfsmount to read_super (for obvious
reasons - ->mnt_count, for one thing), but there may be other ways to do that.
What kind of use (aside of getting rid of <not available> in dumpe2fs
output) do you have in mind?
Cheers,
Al

2001-03-14 06:08:07

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

Al, you write:
> On Tue, 13 Mar 2001, Andreas Dilger wrote:
> > "/mnt" from the first mount. If it comes to the point where I can get
> > that, then I will start to worry about "mount --bind".
> >
> > This is to store in the ext2 on-disk superblock, which is currently always
> > (from dumpe2fs -h /dev/hdX):
> >
> > Last mounted on: <not available>
> >
> > To be able to put _something_ there will suit my needs.
>
> OK... I don't like the idea of passing a vfsmount to read_super (for obvious
> reasons - ->mnt_count, for one thing), but there may be other ways to do that.
> What kind of use (aside of getting rid of <not available> in dumpe2fs
> output) do you have in mind?

On AIX, it is possible to import a volume group, and it automatically
builds /etc/fstab entries from information stored in the fs. Having the
"last mounted on" would have the mount point info, and of course LVM
would hold the device names.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 06:51:42

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Tue, 13 Mar 2001, Andreas Dilger wrote:

> On AIX, it is possible to import a volume group, and it automatically
> builds /etc/fstab entries from information stored in the fs. Having the
> "last mounted on" would have the mount point info, and of course LVM
> would hold the device names.

Wait a minute. What happens if you bring /home from one box to another,
that already has /home? Corrupted /etc/fstab?

Let me put it that way: I don't understand why (if it is useful at all)
it is done in the fs. Looks like a wrong level...
Cheers,
Al

2001-03-14 17:29:08

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

Al writes:
> On Tue, 13 Mar 2001, Andreas Dilger wrote:
>
> > On AIX, it is possible to import a volume group, and it automatically
> > builds /etc/fstab entries from information stored in the fs. Having the
> > "last mounted on" would have the mount point info, and of course LVM
> > would hold the device names.
>
> Wait a minute. What happens if you bring /home from one box to another,
> that already has /home? Corrupted /etc/fstab?

The AIX vgimport will not corrupt /etc/fstab with duplicate mounts, nor for
that matter with duplicate LV names (AIX has a single namespace for all LVs).
If a conflict is found with an LV name, a new name like "lv01" is used (the
LV names are not that important anyways). I'm not sure what would
happen with a duplicate mount point (whether it would pick a new name, or
simply leave it out of /etc/fstab), but it isn't too hard to think of
easy ways to fix this (e.g. /home01 or /mnt/vgname/home or whatever).

It was really useful (i.e. easy to manage) to be able to move a bunch of
disks (making a whole volume group) from one system to another, import it,
and then not have to mount each filesystem to figure out what the contents
are before editing /etc/fstab to set up the correct mount point. In 99.9%
of the cases, the mountpoints were correct. I don't think you can ever
have a system that is 100% correct all of the time.

For AIX, the base filesystems in the rootvg (/, /usr, /var, /tmp, /home,
/boot, and swap) all moved as a single unit (sometimes /home was moved
out for systems that served lots of users). For data or application
specific filesystems, the normal practise was to put them into their own
volume group for backup, failover, etc. This made it easy to upgrade
systems, or move a critical application to another server in case of
hardware problems (whether manual or via HA auto failover).

> Let me put it that way: I don't understand why (if it is useful at all)
> it is done in the fs. Looks like a wrong level...

For the same reason that the UUID and LABEL are stored in the superblock:
you want this information kept with the filesystem and not anywhere else,
otherwise it will quickly get out-of-date. Wherever you mounted the
filesystem last is where it would be mounted if you import the VG on
another system. You can obviously edit /etc/fstab afterwards if it is
wrong, and then remount the filesystem(s), and this will store the
correct mountpoint into the filesystem for the next vgimport.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
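
The "/home01" idea mentioned in this message can be sketched as a small userland helper. This is an assumption-laden illustration, not a description of what AIX actually does (the message itself says that behaviour is unknown): mountpoint_in_use() and pick_mountpoint() are hypothetical names, and the two-digit suffix scheme is just one of the options floated above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Is 'candidate' already present in the list of mountpoints in use? */
static int mountpoint_in_use(const char *const *used, size_t nr_used,
                             const char *candidate)
{
    size_t i;

    for (i = 0; i < nr_used; i++)
        if (strcmp(used[i], candidate) == 0)
            return 1;
    return 0;
}

/*
 * Choose a mountpoint for an imported filesystem: the wanted name if it is
 * free, otherwise the wanted name plus a two-digit suffix (01..99).
 * Returns 0 and fills 'out' on success, -1 if nothing fits.
 */
static int pick_mountpoint(const char *const *used, size_t nr_used,
                           const char *wanted, char *out, size_t outlen)
{
    int suffix;
    int n;

    assert(wanted != NULL && out != NULL);
    n = snprintf(out, outlen, "%s", wanted);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    if (!mountpoint_in_use(used, nr_used, out))
        return 0;

    for (suffix = 1; suffix <= 99; suffix++) {
        n = snprintf(out, outlen, "%s%02d", wanted, suffix);
        if (n < 0 || (size_t)n >= outlen)
            return -1;
        if (!mountpoint_in_use(used, nr_used, out))
            return 0;
    }
    return -1;
}
```

The list of mountpoints in use would come from the importing system's /etc/fstab; how that list is gathered is left out of the sketch.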

2001-03-14 17:41:48

by Matthew Wilcox

Subject: Re: (struct dentry *)->vfsmnt;

On Wed, Mar 14, 2001 at 10:26:50AM -0700, Andreas Dilger wrote:
> > Let me put it that way: I don't understand why (if it is useful at all)
> > it is done in the fs. Looks like a wrong level...
>
> For the same reason that the UUID and LABEL are stored in the superblock:
> you want this information kept with the filesystem and not anywhere else,
> otherwise it will quickly get out-of-date. Wherever you mounted the
> filesystem last is where it would be mounted if you import the VG on
> another system. You can obviously edit /etc/fstab afterwards if it is
> wrong, and then remount the filesystem(s), and this will store the
> correct mountpoint into the filesystem for the next vgimport.

Al is saying `why not do this in mount(8) instead of mount(2)?' I haven't
seen you answer that yet.

--
Revolutions do not require corporate support.

2001-03-14 18:12:08

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Wed, 14 Mar 2001, Andreas Dilger wrote:

> The AIX vgimport will not corrupt /etc/fstab with duplicate mounts, nor for
> that matter with duplicate LV names (AIX has a single namespace for all LVs).
> If a conflict is found with an LV name, a new name like "lv01" is used (the
> LV names are not that important anyways). I'm not sure what would
> happen with a duplicate mount point (whether it would pick a new name, or
> simply leave it out of /etc/fstab), but it isn't too hard to think of
> easy ways to fix this (e.g. /home01 or /mnt/vgname/home or whatever).

[snip the rest of description]

Excuse me, but doesn't it scream "userland"? IOW, is there any reason to
do that in the kernel? If you want to spread /etc/fstab all over the
place storing bits in relevant filesystems - fine, you even don't
need to bother with superblocks. Just teach mount(8) to put the
mountpoint into /.last.mounted and be done with that...

It's a policy question - if somebody wants to play with such
schemes he can do it in the place where policy stuff belongs.
I.e. in userland. Since the reading side contains a bunch of heuristics
(obviously depending on the local naming policy for temp. mountpoints,
for one thing) you don't need anything special on the writing side...

Cheers,
Al


2001-03-14 18:09:28

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

You write:
> > For the same reason that the UUID and LABEL are stored in the superblock:
> > you want this information kept with the filesystem and not anywhere else,
> > otherwise it will quickly get out-of-date. Wherever you mounted the
> > filesystem last is where it would be mounted if you import the VG on
> > another system. You can obviously edit /etc/fstab afterwards if it is
> > wrong, and then remount the filesystem(s), and this will store the
> > correct mountpoint into the filesystem for the next vgimport.
>
> Al is saying `why not do this in mount(8) instead of mount(2)?' I haven't
> seen you answer that yet.

Because this is totally filesystem specific - why put extra knowledge
of filesystem internals into mount? I personally don't want it writing
into the ext2 or ext3 superblock. How can it possibly know what to do,
without embedding a lot of knowledge there? Yes, mount(8) can _read_
the UUID and LABEL for ext2 filesystems, but I would rather not have it
_write_ into the superblock. Also, InterMezzo and SnapFS have the same
on-disk format as ext2, but would mount(8) know that?

There are other filesystems (at least IBM JFS) that could also take
advantage of this feature, should we make mount(8) have code for each
and every filesystem? Yuck. Sort of ruins the whole modularity thing.
Yes, I know mount(8) does funny stuff for SMB and NFS, but that is a
reason to _not_ put more filesystem-specific information into mount(8).

Actually, one more reason to have this in the kernel is for InterMezzo
(distributed filesystem which uses ext3 for on-disk storage). Currently,
the mount point is passed as a mount parameter (yuck) because it is
needed internally to the InterMezzo kernel code. If the filesystem
could extract this information at mount time, it would remove the need
for the mount parameter.

The benefit of doing all of this in *_read_super() (probably would be in
ext2_setup_super() for ext2) is that filesystems which can use this feature
will do so, and others will not. It is a matter of a single "d_path()"
call at mount (or remount for R/O mounted filesystems), so it is not like
it's going to slow down the system a lot.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 19:16:47

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

Al writes:
> On Wed, 14 Mar 2001, Andreas Dilger wrote:
> > The AIX vgimport will not corrupt /etc/fstab with duplicate mounts, nor for
> > that matter with duplicate LV names (AIX has a single namespace for all LVs).
> > If a conflict is found with an LV name, a new name like "lv01" is used (the
> > LV names are not that important anyways). I'm not sure what would
> > happen with a duplicate mount point (whether it would pick a new name, or
> > simply leave it out of /etc/fstab), but it isn't too hard to think of
> > easy ways to fix this (e.g. /home01 or /mnt/vgname/home or whatever).
>
> Excuse me, but doesn't it scream "userland"? IOW, is there any reason to
> do that in the kernel? If you want to spread /etc/fstab all over the
> place storing bits in relevant filesystems - fine, you even don't
> need to bother with superblocks. Just teach mount(8) to put the
> mountpoint into /.last.mounted and be done with that...

Obviously, the whole vgimport stuff is going to be in userland. The only
part that needs to go in the kernel is storing the mountpoint in the
filesystem superblock. It is _not_ OK to just put it in /.last.mounted.
Quite often a data/application VG is moved independent of the root filesystem.
The info needs to stay with the filesystem itself.

> It's a policy question - if somebody wants to play with such
> schemes he can do it in the place where policy stuff belongs.
> I.e. in userland.

Yes, the policy for resolving conflicts and such will be in userland.
Yes, the policy for determining the initial mountpoints is done in
userland. The only thing I want to do in kernel space is store the
mountpoint in the "last mounted" field in the superblock. It also
will help InterMezzo to know this information.

> Since the reading side contains a bunch of heuristics
> (obviously depending on the local naming policy for temp. mountpoints,
> for one thing) you don't need anything special on the writing side...

The writing side can't be done in userland without basically making
mount(8) know about the superblock layout of each and every filesystem:

- you create a new filesystem
- you mount it

When can we update the superblock?

At filesystem creation time    -> not guaranteed to stay up-to-date.
With mount(8)                  -> needs superblock format of each filesystem.
Inside fs-specific kernel code -> about 2 lines of code, if we could just
                                  call d_path() or have mountpoint as param.

The information is already inside the kernel. I would _actually_ rather
just get dir_name from do_mount() down inside *_read_super. However, AFAICT
it is easier (i.e. doesn't change the VFS interface) to pass the d_dir
dentry in the generic superblock to *_read_super. If calling d_path() is the
wrong thing to do, then I would be happy to hear another way of getting
dir_name to *_read_super() without breaking the VFS interface.

Cheers, Andreas

PS - in 2.2 I can do this with < 10 lines of code (including ext2-specific
code). I'm just asking for some _help_ to understand what needs to
be done for 2.4.
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 19:33:17

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Wed, 14 Mar 2001, Andreas Dilger wrote:

> Obviously, the whole vgimport stuff is going to be in userland. The only
> part that needs to go in the kernel is storing the mountpoint in the
> filesystem superblock. It is _not_ OK to just put it in /.last.mounted.
> Quite often a data/application VG is moved independent of the root filesystem.
> The info needs to stay with the filesystem itself.

Sorry - .last.mounted in the root of filesystem, indeed.

> > Since the reading side contains a bunch of heuristics
> > (obviously depending on the local naming policy for temp. mountpoints,
> > for one thing) you don't need anything special on the writing side...
>
> The writing side can't be done in userland without basically making
> mount(8) know about the superblock layout of each and every filesystem:

That's a wonderful reason to put it _not_ into superblock... OK, what's
wrong with the variant above?

Cheers,
Al
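
A minimal sketch of the userland variant proposed in this message: after a successful mount, a mount(8)-like tool writes the mountpoint into a ".last.mounted" file in the root of the mounted filesystem. The function names record_last_mounted() and read_last_mounted() are hypothetical, as is the one-line file format; nothing here is taken from an existing mount(8) implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Record the mountpoint in <mountpoint>/.last.mounted.
 * Meant to run after the mount itself has succeeded.
 * Returns 0 on success, -1 on failure (e.g. a read-only filesystem).
 */
static int record_last_mounted(const char *mountpoint)
{
    char path[4096];
    FILE *f;
    int n;

    assert(mountpoint != NULL);
    n = snprintf(path, sizeof(path), "%s/.last.mounted", mountpoint);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%s\n", mountpoint) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/*
 * Read the recorded mountpoint back from a filesystem whose root is
 * currently reachable at fs_root. Returns 0 on success, -1 on failure.
 */
static int read_last_mounted(const char *fs_root, char *out, size_t outlen)
{
    char path[4096];
    FILE *f;
    int cap;
    int n;

    assert(fs_root != NULL && out != NULL);
    if (outlen == 0)
        return -1;
    n = snprintf(path, sizeof(path), "%s/.last.mounted", fs_root);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    cap = outlen > 4096 ? 4096 : (int)outlen;
    if (fgets(out, cap, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';  /* drop the trailing newline */
    return 0;
}
```

Note that the reading side needs the filesystem to be mounted somewhere before the file can be opened, which is the trade-off of keeping the information in a regular file.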

2001-03-14 19:32:37

by Dave Kleikamp

Subject: Re: (struct dentry *)->vfsmnt;

AIX stores all of this information in the LVM, not in the filesystem.
The filesystem itself has nothing to do with importing and exporting
volume groups. Having the information stored as part of LVM's metadata
allows the utilities to only deal with LVM instead of every individual
file system.

Andreas Dilger wrote:
>
> Al writes:
> > On Tue, 13 Mar 2001, Andreas Dilger wrote:
> >
> > > On AIX, it is possible to import a volume group, and it automatically
> > > builds /etc/fstab entries from information stored in the fs. Having the
> > > "last mounted on" would have the mount point info, and of course LVM
> > > would hold the device names.
> >
> > Wait a minute. What happens if you bring /home from one box to another,
> > that already has /home? Corrupted /etc/fstab?

> For the same reason that the UUID and LABEL are stored in the superblock:
> you want this information kept with the filesystem and not anywhere else,
> otherwise it will quickly get out-of-date. Wherever you mounted the
> filesystem last is where it would be mounted if you import the VG on
> another system. You can obviously edit /etc/fstab afterwards if it is
> wrong, and then remount the filesystem(s), and this will store the
> correct mountpoint into the filesystem for the next vgimport.

--
David J. Kleikamp
IBM Linux Technology Center

2001-03-14 19:47:37

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

David Kleikamp writes:
> AIX stores all of this information in the LVM, not in the filesystem.
> The filesystem itself has nothing to do with importing and exporting
> volume groups. Having the information stored as part of LVM's metadata
> allows the utilities to only deal with LVM instead of every individual
> file system.

So you are saying that mount(8) writes into a field in the LVM LVCB or
something? Might be possible on Linux LVM as well...

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-03-14 19:52:27

by Alexander Viro

Subject: Re: (struct dentry *)->vfsmnt;



On Wed, 14 Mar 2001, Andreas Dilger wrote:

> David Kleikamp writes:
> > AIX stores all of this information in the LVM, not in the filesystem.
> > The filesystem itself has nothing to do with importing and exporting
> > volume groups. Having the information stored as part of LVM's metadata
> > allows the utilities to only deal with LVM instead of every individual
> > file system.
>
> So you are saying that mount(8) writes into a field in the LVM LVCB or
> something? Might be possible on Linux LVM as well...

Makes sense. Even better than per-fs file in root on filesystems
affected by that policy. If the situation when you really want it is
LVM, putting that (and probably fs type and other mount options) into
LVM metadata looks like a good idea.
Cheers,
Al

2001-03-14 19:58:47

by Dave Kleikamp

Subject: Re: (struct dentry *)->vfsmnt;

Let me start with a disclaimer stating that it's been a few years since
I've worked with AIX, but this is what I believe happens.

mount itself doesn't do anything except read /etc/filesystems (AIX's
version of /etc/fstab). LVM maintains the information primarily in the
ODM (yuck). The utilities such as mkfs, mklv, chfs, etc. modify this
information in the ODM. The exportvg command extracts the information
from the ODM (and /etc/filesystems?) and stores it somewhere in the
volume group. Only then can the volume group be imported by another
system with the importvg command, which then populates the ODM and
/etc/filesystems.

Of course, I would NEVER suggest anything resembling AIX's ODM, but I do
think that the LVM is a reasonable place to store this kind of
information.

Andreas Dilger wrote:
>
> David Kleikamp writes:
> > AIX stores all of this information in the LVM, not in the filesystem.
> > The filesystem itself has nothing to do with importing and exporting
> > volume groups. Having the information stored as part of LVM's metadata
> > allows the utilities to only deal with LVM instead of every individual
> > file system.
>
> So you are saying that mount(8) writes into a field in the LVM LVCB or
> something? Might be possible on Linux LVM as well...
>
> Cheers, Andreas
> --
> Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
> \ would they cancel out, leaving him still hungry?"
> http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

--
David J. Kleikamp
IBM Linux Technology Center

2001-03-14 20:23:37

by Ragnar Kjørstad

Subject: Re: (struct dentry *)->vfsmnt;

On Wed, Mar 14, 2001 at 02:32:21PM -0500, Alexander Viro wrote:
> Sorry - .last.mounted in the root of filesystem, indeed.
>
> > The writing side can't be done in userland without basically making
> > mount(8) know about the superblock layout of each and every filesystem:
>
> That's a wonderful reason to put it _not_ into superblock... OK, what's
> wrong with the variant above?


The information will not be available without mounting the filesystem
first.

However - the LVM way sounded much better, so this may not matter.


--
Ragnar Kjørstad
Big Storage

2001-03-14 21:09:58

by Andreas Dilger

Subject: Re: (struct dentry *)->vfsmnt;

David Kleikamp writes:
> Let me start with a disclaimer stating that it's been a few years since
> I've worked with AIX, but this is what I believe happens.
>
> mount itself doesn't do anything except read /etc/filesystems (AIX's
> version of /etc/fstab). LVM maintains the information primarily in the
> ODM (yuck). The utilities such as mkfs, mklv, chfs, etc. modify this
> information in the ODM. The exportvg command extracts the information
> from the ODM (and /etc/filesystems?) and stores it somewhere in the
> volume group. Only then can the volume group be imported by another
> system with the importvg command, which then populates the ODM and
> /etc/filesystems.

Actually, I'm pretty sure you _never_ need to exportvg in order to have
it work on another system. That's one of the great things about AIX LVM,
because it means you can move a VG to another system after a hardware
problem, and not have any problems importing it (journaled fs also helps).
AFAIK, the only thing exportvg does is remove VG information from the
ODM and /etc/filesystems.

I suppose it is possible that because AIX is so tied into the ODM and
SMIT, that it updates the VGDA mountpoint info whenever a filesystem
mountpoint is changed, but this will _never_ work on Linux because of
different tools versions, distributions, etc. Also, it would mean on
AIX that anyone editing /etc/filesystems might have a broken system at
vgimport time (wouldn't be the first time that not using ODM/SMIT caused
such a problem).

> ... I do think that the LVM is a reasonable place to store this kind of
> information.

Yes, even though it would tie the user into using a specific version of
mount(), I suppose it is a better solution than storing it inside the
filesystem. It will work with non-ext2 filesystems, and it also allows
you to store more information than simply the mountpoint (e.g. mount
options, dump + fsck info, etc). In the end, I will probably just
save the whole /etc/fstab line into the LV header somewhere, and extract
it at importvg time (possibly with modifications for vgname and mountpoint).

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
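
The plan in the last paragraph of this message (keep the whole /etc/fstab line with the LV, and adjust the device and mountpoint at import time) comes down to some string handling, sketched below. Where the line would live in the LVM metadata is not covered. The struct fstab_fields type and the parse_fstab_line() and format_fstab_line() functions are hypothetical names; the six whitespace-separated fields follow the usual fstab layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The six whitespace-separated fields of one /etc/fstab line. */
struct fstab_fields {
    char spec[256];    /* device or LV path */
    char file[256];    /* mountpoint */
    char vfstype[64];  /* filesystem type */
    char mntops[256];  /* mount options */
    int freq;          /* dump frequency */
    int passno;        /* fsck pass number */
};

/* Parse one line; the last two fields are optional and default to 0. */
static int parse_fstab_line(const char *line, struct fstab_fields *out)
{
    int n;

    assert(line != NULL && out != NULL);
    if (line[0] == '#')
        return -1;  /* comment line */
    out->freq = 0;
    out->passno = 0;
    n = sscanf(line, "%255s %255s %63s %255s %d %d",
               out->spec, out->file, out->vfstype, out->mntops,
               &out->freq, &out->passno);
    return n >= 4 ? 0 : -1;
}

/*
 * Re-emit the line, optionally substituting a new device path and/or a new
 * mountpoint (pass NULL to keep the stored value).
 */
static int format_fstab_line(const struct fstab_fields *f,
                             const char *new_spec, const char *new_file,
                             char *buf, size_t buflen)
{
    int n;

    assert(f != NULL && buf != NULL);
    n = snprintf(buf, buflen, "%s %s %s %s %d %d",
                 new_spec != NULL ? new_spec : f->spec,
                 new_file != NULL ? new_file : f->file,
                 f->vfstype, f->mntops, f->freq, f->passno);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

An import tool would parse the stored line, decide on a device name and mountpoint that do not clash on the new system, and append the reformatted line to /etc/fstab.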

2001-03-15 12:34:49

by Suparna Bhattacharya

Subject: Re: (struct dentry *)->vfsmnt;



>Actually, I'm pretty sure you _never_ need to exportvg in order to have
>it work on another system. That's one of the great things about AIX LVM,
>because it means you can move a VG to another system after a hardware
>problem, and not have any problems importing it (journaled fs also helps).
>AFAIK, the only thing exportvg does is remove VG information from the
>ODM and /etc/filesystems.
>

Yes that's correct as far as I know too. The VGDA and LVCB contain all the
information required for import even without an exportvg.

>I suppose it is possible that because AIX is so tied into the ODM and
>SMIT, that it updates the VGDA mountpoint info whenever a filesystem
>mountpoint is changed, but this will _never_ work on Linux because of
>different tools versions, distributions, etc. Also, it would mean on
>AIX that anyone editing /etc/filesystems might have a broken system at
>vgimport time (wouldn't be the first time that not using ODM/SMIT caused
>such a problem).

Yes, you can think of crfs (or chfs) as a composite command that handles
this (writing to LVCB). These are more like
administrative/setup/configuration commands -- one time, or occasional
system configuration changes.

On the other hand a mount doesn't cause a persistent configuration
information change. You can issue a mount even if an entry doesn't exist in
/etc/filesystems.

>
>> ... I do think that the LVM is a reasonable place to store this kind of
>> information.
>
>Yes, even though it would tie the user into using a specific version of
>mount(), I suppose it is a better solution than storing it inside the
>filesystem. It will work with non-ext2 filesystems, and it also allows
>you to store more information than simply the mountpoint (e.g. mount
>options, dump + fsck info, etc). In the end, I will probably just
>save the whole /etc/fstab line into the LV header somewhere, and extract
>it at importvg time (possibly with modifications for vgname and
>mountpoint).
>
>Cheers, Andreas

Is mount the right time to do this ? A mount happens on every boot of the
system.
And then, one can issue a mount by explicitly specifying all the parameters
without having an entry in fstab. [Doesn't that also mean that you have a
possibility of inconsistency even here ?]



2001-03-15 13:13:55

by Suparna Bhattacharya

Subject: Re: (struct dentry *)->vfsmnt;


>Because this is totally filesystem specific - why put extra knowledge
>of filesystem internals into mount? I personally don't want it writing
>into the ext2 or ext3 superblock. How can it possibly know what to do,
>without embedding a lot of knowledge there? Yes, mount(8) can _read_
>the UUID and LABEL for ext2 filesystems, but I would rather not have it
>_write_ into the superblock. Also, InterMezzo and SnapFS have the same
>on-disk format as ext2, but would mount(8) know that?
>
>There are other filesystems (at least IBM JFS) that could also take
>advantage of this feature, should we make mount(8) have code for each
>and every filesystem? Yuck. Sort of ruins the whole modularity thing.
>Yes, I know mount(8) does funny stuff for SMB and NFS, but that is a
>reason to _not_ put more filesystem-specific information into mount(8).
>

Since you've brought up this point.
I have wondered why Linux doesn't seem to yet have the option of a generic
user space filesystem type specific mount helper command. I recall having
seen code in mount(8) implementation to call mount.<fstype>, but it's still
under an ifdef isn't it, except for smb or ncp perhaps ? (Hope I'm not
out-of-date on this)
Having something like that lets one stream-line userland filesystem
specific stuff like this, without having the generic part of mount(8) know
about it.

For example, in AIX, the association between type and the program for mount
helpers (and also for filesystem helpers for things like mkfs, fsck etc) is
configured in /etc/vfs, while SUN and HP look for them under particular
directory locations (by fstype name).

Actually, it'd be good to have this in such a way that if a specific helper
doesn't exist, default mount processing continues. This avoids the extra
work of writing such helpers for every new filesystem, unless we need
specialized behaviour there.



Suparna Bhattacharya
IBM Software Lab, India
E-mail : [email protected]
Phone : 91-80-5267117, Extn : 2525
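
The helper lookup with fallback described in this message could look roughly like the sketch below. It is not taken from any mount(8) source: find_mount_helper() is a hypothetical name, and the helper directory is a parameter because the message does not name one. The only convention borrowed from the text is the "mount.<fstype>" file name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Look for an executable helper named "mount.<fstype>" in helper_dir.
 * Returns 1 and leaves its path in 'path' if one exists, or 0 if the
 * caller should carry on with the generic mount processing.
 */
static int find_mount_helper(const char *helper_dir, const char *fstype,
                             char *path, size_t pathlen)
{
    int n;

    assert(helper_dir != NULL && fstype != NULL && path != NULL);
    if (strchr(fstype, '/') != NULL)
        return 0;  /* a type name must not be able to pick another directory */
    n = snprintf(path, pathlen, "%s/mount.%s", helper_dir, fstype);
    if (n < 0 || (size_t)n >= pathlen)
        return 0;
    return access(path, X_OK) == 0;
}
```

Because a missing helper simply yields 0, a new filesystem needs no helper of its own unless it wants specialized behaviour, which is the property asked for in the last paragraph.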