by Miklos Szeredi

[permalink] [raw]

Subject: Re: [PATCH -V16 07/12] vfs: Support null pathname in linkat

On Mon, 12 Jul 2010, Aneesh Kumar K. V wrote:
> /*
> * Return -ENOENT if we've raced with unlink and i_nlink is 0. Doing
> * otherwise has the potential to corrupt the orphan inode list.
> */
> if (inode->i_nlink == 0)
> return -ENOENT;
>
> I can move this check to VFS so that we do it for all file systems.

That makes sense. Hopefully filesystems which implement hard links
will change i_nlink to zero for the last unlink...

Thanks,
Miklos

2010-07-13 05:47:38

by Matt Helsley

[permalink] [raw]

Subject: Re: [PATCH -V16 07/12] vfs: Support null pathname in linkat

On Mon, Jul 12, 2010 at 10:05:39AM +0200, Miklos Szeredi wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > This enables to use linkat to create hardlinks from a
> > file descriptor pointing to the file. This can be used
> > with open_by_handle syscall that returns a file descriptor.
>
> This needs more thought, filesystems don't usually tolerate
> resurrecting a file which has already been unlinked (i_nlink == 0).

File resurrection support would be useful for more than the
open-by-handle patches.

Checkpoint/restart would make use of it too. Without it, programs that use
large unlinked files would take corresponding amounts of IO to checkpoint.
By resurrecting the file its contents can be checkpointed without copying
on filesystems or block devices that support CoW-sharing of snapshot data.
Then, during restart, we'd do something like:

1. Take a snapshot of the snapshot and remount it rw.
(Not necessary if userspace doesn't mind restart
destroying the checkpoint.)
2. Re-open the snapshot of the resurrected file.
<everything else we already do to restart open files (e.g. seek)>
N. Re-unlink the file.

Cheers,
-Matt Helsley

>
> Thanks,
> Miklos
>
> >
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/namei.c | 34 +++++++++++++++++++++++++---------
> > 1 files changed, 25 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index a6a8093..9a7b71a 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -2553,16 +2553,28 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > {
> > struct dentry *new_dentry;
> > struct nameidata nd;
> > - struct path old_path;
> > - int error;
> > + struct path old_path, *old_pathp;
> > + struct file *file = NULL;
> > + int error, fput_needed;
> > char *to;
> >
> > if ((flags & ~AT_SYMLINK_FOLLOW) != 0)
> > return -EINVAL;
> >
> > - error = user_path_at(olddfd, oldname,
> > - flags & AT_SYMLINK_FOLLOW ? LOOKUP_FOLLOW : 0,
> > - &old_path);
> > + if (oldname == NULL && olddfd != AT_FDCWD) {
> > + file = fget_light(olddfd, &fput_needed);
> > + if (file) {
> > + old_pathp = &file->f_path;
> > + error = 0;
> > + } else
> > + error = -EBADF;
> > + } else {
> > + error = user_path_at(olddfd, oldname,
> > + flags & AT_SYMLINK_FOLLOW ?
> > + LOOKUP_FOLLOW : 0,
> > + &old_path);
> > + old_pathp = &old_path;
> > + }
> > if (error)
> > return error;
> >
> > @@ -2570,7 +2582,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > if (error)
> > goto out;
> > error = -EXDEV;
> > - if (old_path.mnt != nd.path.mnt)
> > + if (old_pathp->mnt != nd.path.mnt)
> > goto out_release;
> > new_dentry = lookup_create(&nd, 0);
> > error = PTR_ERR(new_dentry);
> > @@ -2579,10 +2591,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > error = mnt_want_write(nd.path.mnt);
> > if (error)
> > goto out_dput;
> > - error = security_path_link(old_path.dentry, &nd.path, new_dentry);
> > + error = security_path_link(old_pathp->dentry, &nd.path, new_dentry);
> > if (error)
> > goto out_drop_write;
> > - error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
> > + error = vfs_link(old_pathp->dentry,
> > + nd.path.dentry->d_inode, new_dentry);
> > out_drop_write:
> > mnt_drop_write(nd.path.mnt);
> > out_dput:
> > @@ -2593,7 +2606,10 @@ out_release:
> > path_put(&nd.path);
> > putname(to);
> > out:
> > - path_put(&old_path);
> > + if (file)
> > + fput_light(file, fput_needed);
> > + else
> > + path_put(&old_path);
> >
> > return error;
> > }
> > --
> > 1.7.2.rc1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/