2005-10-06 20:17:15

by Jim McQuillan

Subject: pivot_root doesn't work for me in 2.6.14-rc3

I've found a problem with pivot_root that worked fine in 2.6.13.3, but
fails for me, starting in 2.6.14-rc3 (haven't tried rc1 or rc2).

This is for LTSP.org (Linux Terminal Server Project) thin clients.

In our initramfs, we have a '/init' script that creates a mountpoint for
a 2nd ramfs, and i'm trying to pivot_root to that mount point.

I'm getting:

pivot_root: Invalid Argument


This worked perfectly in 2.6.13.3, so I looked at the 2.6.14-rc3 patch,
and I found the code in fs/namespace.c that is causing it to fail for
me:


@@ -1334,8 +1332,12 @@ asmlinkage long sys_pivot_root(const cha
         error = -EINVAL;
         if (user_nd.mnt->mnt_root != user_nd.dentry)
                 goto out2; /* not a mountpoint */
+        if (user_nd.mnt->mnt_parent == user_nd.mnt)
+                goto out2; /* not attached */
         if (new_nd.mnt->mnt_root != new_nd.dentry)
                 goto out2; /* not a mountpoint */
+        if (new_nd.mnt->mnt_parent == new_nd.mnt)
+                goto out2; /* not attached */
         tmp = old_nd.mnt; /* make sure we can reach put_old from new_root */
         spin_lock(&vfsmount_lock);
         if (tmp != new_nd.mnt) {


The first of the 2 new tests is causing the pivot_root to fail for me.
If I comment out those lines, it works again.

I'm thinking that somebody put those lines there for a reason, so
there's possibly something wrong with the way i've been doing this for a
long time, and the tightening of the code has uncovered my problem.

I'll explain how we use the initramfs/nfsroot:

1) kernel boots, mounts initramfs
2) /init creates and mounts a ramfs on /newroot
3) create /newroot/nfsroot mountpoint
4) nfsmount /opt/ltsp/i386 from the server on /newroot/nfsroot
5) create a bunch of symlinks to things we need on the nfs filesystem
such as bin, etc, lib, sbin, usr
6) create a bunch of ram-based directories in /newroot, such as
tmp, dev, oldroot, proc and sys
7) cd /newroot; pivot_root . oldroot
8) mount /sys and /proc, start udev
9) exec /sbin/init

We don't do the pivot_root directly to the nfs-mounted filesystem,
because then EVERY file access we do causes NFS traffic.
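
As a rough illustration, steps 2 to 9 above could be sketched as a shell
function like the one below. This is not the actual LTSP /init script: the
run() dry-run helper, the ltsp_init name, the server argument, the exact
nfsmount and mount invocations and the relative symlink targets are all
assumptions made for the example. By default it only prints the commands it
would run. Step 7 is the call that returns EINVAL on 2.6.14-rc3.

```shell
#!/bin/sh
# Dry-run sketch of the /init sequence from steps 2-9 (assumed layout).
# run() only prints each command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $1: NFS server address (hypothetical parameter)
ltsp_init() {
    server=$1

    # 2) create and mount a second ramfs on /newroot
    run mkdir -p /newroot
    run mount -t ramfs ramfs /newroot

    # 3) and 4) mount the NFS export below the ramfs
    #    (nfsmount argument form is an assumption)
    run mkdir -p /newroot/nfsroot
    run nfsmount "$server:/opt/ltsp/i386" /newroot/nfsroot

    # 5) symlinks into the NFS tree; relative targets are an assumption
    for d in bin etc lib sbin usr; do
        run ln -s "nfsroot/$d" "/newroot/$d"
    done

    # 6) directories that live in RAM
    for d in tmp dev oldroot proc sys; do
        run mkdir -p "/newroot/$d"
    done

    # 7) the step that fails with "Invalid Argument" on 2.6.14-rc3
    run cd /newroot
    run pivot_root . oldroot

    # 8) mount /sys and /proc (starting udev is left out: the command
    #    is not given in the post)
    run mount -t sysfs sysfs /sys
    run mount -t proc proc /proc

    # 9) hand over to the real init
    run exec /sbin/init
}
```

Calling "ltsp_init 192.0.2.1" (a documentation address) prints the plan.
Setting DRY_RUN=0 would execute it, which only makes sense inside an
initramfs.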

If you'd like to see a diagram, check out

http://wiki.ltsp.org/twiki/bin/view/Ltsp/WorkInProgress#Diagram_of_initramfs_nfs_layout

Somebody recently told me that pivot_root has been put in the 'evil way
to do things' category, and that there was a new way, but he couldn't
remember what that was.

So, if anybody knows the new way, i'd appreciate hearing about it.

Thanks,

Jim McQuillan


2005-10-08 09:41:39

by Felix Möller

Subject: Re: pivot_root doesn't work for me in 2.6.14-rc3

Hi James,
> I've found a problem with pivot_root that worked fine in 2.6.13.3, but
> fails for me, starting in 2.6.14-rc3 (haven't tried rc1 or rc2).
>
> This is for LTSP.org (Linux Terminal Server Project) thin clients.
>
> In our initramfs, we have a '/init' script that creates a mountpoint for
> a 2nd ramfs, and i'm trying to pivot_root to that mount point.
>
> I'm getting:
>
> pivot_root: Invalid Argument
>
> This worked perfectly in 2.6.13.3, so I looked at the 2.6.14-rc3 patch,
> and I found the code in fs/namespace.c that is causing it to fail for
> me:
>
> @@ -1334,8 +1332,12 @@ asmlinkage long sys_pivot_root(const cha
>          error = -EINVAL;
>          if (user_nd.mnt->mnt_root != user_nd.dentry)
>                  goto out2; /* not a mountpoint */
> +        if (user_nd.mnt->mnt_parent == user_nd.mnt)
> +                goto out2; /* not attached */
>          if (new_nd.mnt->mnt_root != new_nd.dentry)
>                  goto out2; /* not a mountpoint */
> +        if (new_nd.mnt->mnt_parent == new_nd.mnt)
> +                goto out2; /* not attached */
>          tmp = old_nd.mnt; /* make sure we can reach put_old from new_root */
>          spin_lock(&vfsmount_lock);
>          if (tmp != new_nd.mnt) {
>
>
> The first of the 2 new tests is causing the pivot_root to fail for me.
> If I comment out those lines, it works again.
>
> I'm thinking that somebody put those lines there for a reason, so
> there's possibly something wrong with the way i've been doing this for a
> long time, and the tightening of the code has uncovered my problem.
Miklos Szeredi put these lines there with the following comment:
"[PATCH] pivot_root() circular reference fix

Fix http://bugzilla.kernel.org/show_bug.cgi?id=4857

When pivot_root is called from an init script in an initramfs
environment, it causes a circular reference in the mount tree.

The cause of this is that pivot_root() is not prepared to handle
pivoting an unattached mount. In an initramfs environment, rootfs is the
root of the namespace, and so it is not attached.

This patch fixes this and related problems, by returning -EINVAL if
either the current root or the new root is detached."

> I'll explain how we use the initramfs/nfsroot:
> [...]
>
> Somebody recently told me that pivot_root has been put in the 'evil way
> to do things' category, and that there was a new way, but he couldn't
> remember what that was.
Googling a little bit, I found the following link:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=137005

Felix Möller

2005-10-08 10:01:43

by Miklos Szeredi

Subject: Re: pivot_root doesn't work for me in 2.6.14-rc3

Felix, thanks for the forward. Please "reply to all" when posting on
LKML.

> Hi James,
> > I've found a problem with pivot_root that worked fine in 2.6.13.3, but
> > fails for me, starting in 2.6.14-rc3 (haven't tried rc1 or rc2).
> >
> > This is for LTSP.org (Linux Terminal Server Project) thin clients.
> >
> > In our initramfs, we have a '/init' script that creates a mountpoint for
> > a 2nd ramfs, and i'm trying to pivot_root to that mount point.
> >
> > I'm getting:
> >
> > pivot_root: Invalid Argument
> >
> > This worked perfectly in 2.6.13.3, so I looked at the 2.6.14-rc3 patch,
> > and I found the code in fs/namespace.c that is causing it to fail for
> > me:
> >
> > @@ -1334,8 +1332,12 @@ asmlinkage long sys_pivot_root(const cha
> >          error = -EINVAL;
> >          if (user_nd.mnt->mnt_root != user_nd.dentry)
> >                  goto out2; /* not a mountpoint */
> > +        if (user_nd.mnt->mnt_parent == user_nd.mnt)
> > +                goto out2; /* not attached */
> >          if (new_nd.mnt->mnt_root != new_nd.dentry)
> >                  goto out2; /* not a mountpoint */
> > +        if (new_nd.mnt->mnt_parent == new_nd.mnt)
> > +                goto out2; /* not attached */
> >          tmp = old_nd.mnt; /* make sure we can reach put_old from new_root */
> >          spin_lock(&vfsmount_lock);
> >          if (tmp != new_nd.mnt) {
> >
> >
> > The first of the 2 new tests is causing the pivot_root to fail for me.
> > If I comment out those lines, it works again.
> >
> > I'm thinking that somebody put those lines there for a reason, so
> > there's possibly something wrong with the way i've been doing this for a
> > long time, and the tightening of the code has uncovered my problem.
> Miklos Szeredi put these lines there with the following comment:
> "[PATCH] pivot_root() circular reference fix
>
> Fix http://bugzilla.kernel.org/show_bug.cgi?id=4857
>
> When pivot_root is called from an init script in an initramfs
> environment, it causes a circular reference in the mount tree.
>
> The cause of this is that pivot_root() is not prepared to handle
> pivoting an unattached mount. In an initramfs environment, rootfs is the
> root of the namespace, and so it is not attached.
>
> This patch fixes this and related problems, by returning -EINVAL if
> either the current root or the new root is detached."
>
> > I'll explain how we use the initramfs/nfsroot:
> > [...]
> >
> > Somebody recently told me that pivot_root has been put in the 'evil way
> > to do things' category, and that there was a new way, but he couldn't
> > remember what that was.
> Googling a little bit, I found the following link:
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=137005
>
> Felix Möller

Here's a post by Richard Fish explaining how to go about changing root
in initramfs:

> Tim Sia wrote:
>
> >
> >
> > From reading the kernel bug tracker:
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=4857
> >
> > I got the impression that pivot_root is not supported from inside
> > initramfs. I can change my initramfs
> > init script not to do pivot root, but the question is will I be able
> > to unmount the initramfs root
> > once I chrooted to the "real root fs" ?
> >
>
> Tim,
>
> The short answer is "no".
>
> I am guessing you want to unmount the initramfs to free the memory it is
> using. You can do this by simply deleting everything from the initramfs
> before you chroot. If you use only statically linked utilities in your
> initramfs, like busybox, then "rm -rf /etc /bin /sbin ..." will do the
> trick. Just make sure your PATH has the bin and sbin directories from
> your real root filesystem listed first, and run hash -r if necessary
> before removing the files. You will also need to symlink /lib to the
> lib directory on your real root filesystem to run dynamically linked
> programs (like chroot) from there. And obviously, be careful that you
> don't "rm -rf" your real root in the process (like I did once)!!
>
> If you use dynamically linked utilities (like me), things are a bit more
> complicated, because once you remove /lib/ld-*.so, you cannot run any
> dynamically linked programs. I was able to work around this by creating
> symlinks like this (my real root mounts on /new_root):
>
> /lib -> /new_root/lib
> /new_root/lib -> ../libdir
>
> All my dynamic libraries go in /libdir. The mount of /new_root hides
> the symlink to ../libdir, so as soon as root is mounted and I run hash
> -r, all programs and libraries are automatically loaded from /new_root,
> and I can delete /libdir without any problems.
>
> If you have other questions for me, feel free to ask. Unfortunately
> things are very busy at work, and I will be travelling some next week,
> so I cannot promise 'real-time' answers. But if I can help, I will.
>
> Cheers,
> -Richard
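
For the statically linked case Richard describes, the cleanup-then-chroot
sequence might look roughly like the sketch below. It is only an
illustration under stated assumptions: the leave_initramfs name, the run()
dry-run helper, the /sbin/init target and the guard against deleting the
real root are mine, not from his post, and it assumes the initramfs has no
/lib of its own. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: empty the initramfs, then chroot into the real root.
# run() only prints each command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $1: where the real root is mounted; remaining args: initramfs paths to delete
leave_initramfs() {
    new_root=$1
    [ -n "$new_root" ] || return 1
    shift

    # Check every path first so nothing is deleted if one of them is the
    # real root (or "/" itself, or anything below the real root).
    for d in "$@"; do
        case "$d" in
            /|"$new_root"|"$new_root"/*)
                echo "refusing to remove $d" >&2
                return 1
                ;;
        esac
    done

    # Let dynamically linked programs from the real root find their
    # libraries (assumes a static-only initramfs without its own /lib).
    run ln -s "$new_root/lib" /lib

    # Prefer binaries from the real root, then forget cached locations.
    run export "PATH=$new_root/sbin:$new_root/bin:$PATH"
    run hash -r

    # Free the memory held by the initramfs contents.
    for d in "$@"; do
        run rm -rf "$d"
    done

    run exec chroot "$new_root" /sbin/init
}
```

For example, "leave_initramfs /new_root /etc /bin /sbin" prints the plan,
while passing /new_root or a path below it makes the function return
non-zero before any command is issued.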