2001-10-17 02:20:54

by Eric

Subject: [Q] pivot_root and initrd

I've been doing some work to boot a 2.4.x linux system from onboard
flash and then change_root to an attached disk.

As the kernel documentation admonishes us to use pivot_root instead of
relying on the change_root facility (Documentation/initrd.txt: "Current
kernels still support it, but you should _not_ rely on its continued
availability") I have given it a shot with less than stellar results--
perhaps someone (Werner?) could enlighten me on a few points:

1) What is the current status of pivot_root from an initrd? Is anyone
using it for this purpose, and is it being deprecated by the "union
mounts" mentioned in the bootinglinux-current document by Werner?

2) The initrd.txt and pivot_root manpages seem incorrect on how to
execute /sbin/init on the pivot-root'ed filesystem. In general, the
examples suggest the following should work, but it will not:

pivot_root . old_root
exec chroot . sh -c 'umount /old_root; exec /sbin/init' \
<dev/console >dev/console 2>&1

The standard util-linux /sbin/init program will not allow itself to be
executed without command-line args unless its PID is 1, or it is invoked
as "sh" or "init.new." As we are exec'ing init from userspace instead
of from the kernel, we fail these tests. Perhaps we can update these
examples with something (admittedly hokey) like:

pivot_root . old_root
exec chroot . sh -c 'umount /old_root; exec -a init.new /sbin/init' \
<dev/console >dev/console 2>&1

... or am I misunderstanding a finer point of the standard linux init
process? Note, if we substitute in the above "exec -a sh /sbin/init" we
get a truncated process name in the resulting 'ps' listing as 'in' (bug?).

3) The kernel does not understand pivot_root in the context of an
initrd. As I understand the process, an initrd-aware kernel will spawn a
thread to handle the /linuxrc process in the initrd. Once it completes,
the kernel double-checks the real_root_dev against the initrd major and
minor and invokes change_root when the /linuxrc thread exits:

#ifdef CONFIG_BLK_DEV_INITRD
[ ... ]
	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
	if (pid > 0)
		while (pid != wait(&i));
	if (MAJOR(real_root_dev) != RAMDISK_MAJOR
	    || MINOR(real_root_dev) != 0) {
		error = change_root(real_root_dev, "/initrd");
[ ... ]
#endif

This poses a problem-- I load an initrd, perform a pivot_root and exec
the real /sbin/init on the new root filesystem. I am happily running;
now I do 'shutdown -r now' and the init process is terminated. Once the
init process goes away the kernel decides it is time to change_root and
exec "the real /sbin/init."

I would expect to see some sort of fall-through mechanism to prevent the
change_root once a pivot_root is done during an initrd run. The only
method that seems (im)plausible is that I am responsible for setting the
real_root_dev via sysctl to the major/minor of the initrd device after a
successful pivot_root. For those of us without sysctl, we have precious
little to do but accept a restart of init.
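
As a rough illustration of the sysctl workaround described above, the
following sketch computes the value to write. It assumes the 2.4-era
16-bit device number encoding (major * 256 + minor), the same form as
the 0x301 example in Documentation/initrd.txt. The helper name devnum
is hypothetical, and the write to /proc is left commented out because
it needs /proc mounted and the sysctl present:

```shell
# Hypothetical helper: encode major/minor as a 2.4-style device number,
# printed in hex (assumption: 16-bit kdev_t, major in the high byte).
devnum() {
    printf '0x%x\n' $(( $1 * 256 + $2 ))
}

# /dev/ram0 is major 1 (RAMDISK_MAJOR), minor 0.
ram0=$(devnum 1 0)
echo "value for real_root_dev: $ram0"

# After a successful pivot_root one would then run something like the
# following (commented out; the path depends on where /proc is mounted):
# echo "$ram0" > /proc/sys/kernel/real_root_dev
```

With real_root_dev equal to the ramdisk's major and minor, the test
quoted above should skip the change_root call.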

What say ye with a better view of the landscape?
I do not subscribe, so cc's to me would be appreciated...

E






2001-10-17 03:30:39

by H. Peter Anvin

Subject: Re: [Q] pivot_root and initrd

Followup to: <[email protected]>
By author: Eric <[email protected]>
In newsgroup: linux.dev.kernel
>
> I've been doing some work to boot a 2.4.x linux system from onboard
> flash and then change_root to an attached disk.
>
> As the kernel documentation admonishes us to use pivot_root instead of
> relying on the change_root facility (Documentation/initrd.txt: "Current
> kernels still support it, but you should _not_ rely on its continued
> availability") I have given it a shot with less than stellar results--
> perhaps someone (Werner?) could enlighten me on a few points:
>
> 1) What is the current status of pivot_root from an initrd? Is anyone
> using it for this purpose, and is it being deprecated by the "union
> mounts" mentioned in the bootinglinux-current document by Werner?
>

Works great. I use it in my SuperRescue CD for example; you can there
check out a complete, working example.

ftp://ftp.kernel.org/pub/dist/superresuce/

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-17 18:38:13

by Eric

Subject: Re: [Q] pivot_root and initrd

>>
>> Works great. I use it in my SuperRescue CD for example; you can
>> there check out a complete, working example.
>>
>> ftp://ftp.kernel.org/pub/dist/superresuce/
>>
>> -hpa


hpa,

Thanks for the example. The documentation for pivot_root must be just
plain lousy-- I thought I'd go "by the book" with the following
admonitions from the manpage:

1) cd to new_root prior to calling pivot_root
2) call pivot_root with relative new_root (.) and put_old
3) call 'chroot' immediately after pivot and redirect stdin/out

You are simply doing the following, I assume with success:

[ ... ]
# Switch roots and run init
cd /ram
pivot_root /ram /ram/initrd
exec /sbin/init "$@"

whereas I am doing something like the following:

[ ... ]
mount -o ro $ROOTDEV $NEWROOT
cd $NEWROOT
pivot_root . $OLDROOT

# export for visibility to exec'ed shell
export INITARGS="$@"
export OLDROOT

exec chroot . sh -c 'umount $OLDROOT; exec -a init.new /sbin/init $INITARGS' \
    <dev/console >dev/console 2>&1

I am mystified that the call to 'exec /sbin/init' works if you are using
the standard (you mention "based on RedHat7.1") util-linux /sbin/init
proggie, and that a standard RH7.1 initscripts would not complain when
the root filesystem is already mounted r/w.

I would also guess that you are susceptible to the kernel's change_root
call if your /sbin/init terminates. I'll have to play with the disk a bit.

E

2001-10-17 21:18:19

by H. Peter Anvin

Subject: Re: [Q] pivot_root and initrd

Followup to: <[email protected]>
By author: Eric <[email protected]>
In newsgroup: linux.dev.kernel
>
> I am mystified that the call to 'exec /sbin/init' works if you are using
> the standard (you mention "based on RedHat7.1") util-linux /sbin/init
> proggie, and that a standard RH7.1 initscripts would not complain when
> the root filesystem is already mounted r/w.
>
> I would also guess that you are susceptible to the kernel's change_root
> call if your /sbin/init terminates. I'll have to play with the disk a bit.
>

I modify the initscripts to not try to fsck and remount the root --
it's a ramfs (tmpfs in a later version) after all. If I had been
mounting a filesystem off the harddisk I would either have mounted it
readonly and left the init scripts as-is, or fscked it before
mounting.

I pass the following command line options to the kernel (this is set
up in isolinux.cfg):

append initrd=initrd.gz root=/dev/ram0 init=/linuxrc single

By specifying root=/dev/ram0 and an explicit init (which I'm calling
/linuxrc but could just as easily have called /sbin/init) I'm telling
the kernel that this is the final root, and effectively turn off most
of the initrd legacy weirdness.

If /sbin/init exits, the kernel panics, just like it would normally do
if init goes away.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-23 17:46:43

by Eric

Subject: Re: [Q] pivot_root and initrd

Would it even be worthwhile to propose a patch that would set a flag
when pivot_root is called during an initrd and prevent change_root from
occurring once the linuxrc thread exits?

Your method of placing "initrd=xxx" and "root=xxx" is similar to my
method of stuffing those values into /proc/sys/kernel/real_root_dev once
the pivot_root is complete; I am not really happy with that solution,
not least because it is an undocumented work-around and
somewhat unexpected behavior for a system call that is to (presumably)
replace or augment change_root.

I was also hoping that Werner or Hans would chime in, either in defense
of the current documentation or with clarifications...

E

HPA wrote:
>>
>> I am mystified that the call to 'exec /sbin/init' works
>> if you are using the standard (you mention "based on RedHat7.1")
>> util-linux /sbin/init proggie, and that a standard RH7.1
>> initscripts would not complain when the root filesystem is already
>> mounted r/w.
>>
>> I would also guess that you are susceptible to the kernel's
>> change_root call if your /sbin/init terminates. I'll have to
>> play with the disk a bit.
>>
>
> I modify the initscripts to not try to fsck and remount the root --
> it's a ramfs (tmpfs in a later version) after all. If I had been
> mounting a filesystem off the harddisk I would either have mounted it
> readonly and left the init scripts as-is, or fscked it before
> mounting.
>
> I pass the following command line options to the kernel (this is set
> up in isolinux.cfg):
>
> append initrd=initrd.gz root=/dev/ram0 init=/linuxrc single
>
> By specifying root=/dev/ram0 and an explicit init (which I'm calling
> /linuxrc but could just as easily have called /sbin/init) I'm telling
> the kernel that this is the final root, and effectively turn off most
> of the initrd legacy weirdness.
>
> If /sbin/init exits, the kernel panics, just like it would normally do
> if init goes away.


2001-10-23 18:14:35

by H. Peter Anvin

Subject: Re: [Q] pivot_root and initrd

Followup to: <[email protected]>
By author: Eric <[email protected]>
In newsgroup: linux.dev.kernel
>
> Would it even be worthwhile to propose a patch that would set a flag
> when pivot_root is called during an initrd and prevent change_root from
> occurring once the linuxrc thread exits?
>
> Your method of placing "initrd=xxx" and "root=xxx" is similar to my
> method of stuffing those values into /proc/sys/kernel/real_root_dev once
> the pivot_root is complete; I am not really happy with that solution,
> not least because it is an undocumented work-around and
> somewhat unexpected behavior for a system call that is to (presumably)
> replace or augment change_root.
>
> I was also hoping that Werner or Hans would chime in, either in defense
> of the current documentation or with clarifications...
>

The right thing is to get rid of the old initrd compatibility cruft,
but that's a 2.5 change.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-23 19:54:40

by Bill Davidsen

Subject: Re: [Q] pivot_root and initrd

In article <[email protected]>,
H. Peter Anvin <[email protected]> wrote:

| The right thing is to get rid of the old initrd compatibility cruft,
| but that's a 2.5 change.

Get rid of??? As long as you have some equivalent capability to get
the system up.

--
bill davidsen <[email protected]>
His first management concern is not solving the problem, but covering
his ass. If he lived in the middle ages he'd wear his codpiece backward.

2001-10-23 19:59:40

by Rik van Riel

Subject: Re: [Q] pivot_root and initrd

On Tue, 23 Oct 2001, bill davidsen wrote:
> In article <[email protected]>,
> H. Peter Anvin <[email protected]> wrote:
>
> | The right thing is to get rid of the old initrd compatibility cruft,
> | but that's a 2.5 change.
>
> Get rid of??? As long as you have some equivalent capability to get
> the system up.

pivot_root(2) in combination with pivot_root(8)

Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)

http://www.surriel.com/ http://distro.conectiva.com/

2001-10-23 20:21:32

by Bill Davidsen

Subject: Re: [Q] pivot_root and initrd

In article <[email protected]>,
Rik van Riel <[email protected]> wrote:
| On Tue, 23 Oct 2001, bill davidsen wrote:
| > In article <[email protected]>,
| > H. Peter Anvin <[email protected]> wrote:
| >
| > | The right thing is to get rid of the old initrd compatibility cruft,
| > | but that's a 2.5 change.
| >
| > Get rid of??? As long as you have some equivalent capability to get
| > the system up.
|
| pivot_root(2) in combination with pivot_root(8)

I wasn't really asking about changing root after the system is up, the
part needed is the uncompressing of the filesystem into a ramdisk root f/s
or some such. After that it's pretty much open to any of several techniques.

Getting the modules loaded to support the root f/s and run a little rc
file to get things going is the bootstrap operation, and that's where
initrd is vital. You really don't want to build a kernel for every
machine if you have more than a few! One kernel and a few config and
initrd files is vastly easier.

What replaces the initial step?

--
bill davidsen <[email protected]>
His first management concern is not solving the problem, but covering
his ass. If he lived in the middle ages he'd wear his codpiece backward.

2001-10-23 20:37:16

by Werner Almesberger

Subject: Re: [Q] pivot_root and initrd

H. Peter Anvin wrote:
> The right thing is to get rid of the old initrd compatibility cruft,
> but that's a 2.5 change.

Yes, change_root is obsolete (and relies on assumptions that are no
longer valid in several cases), and there has been plenty of time for
distributors to switch. An early funeral in 2.5 is a good idea.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Lausanne, CH [email protected] /
/_http://icawww.epfl.ch/almesberger/_____________________________________/

2001-10-23 20:45:56

by Richard B. Johnson

Subject: Re: [Q] pivot_root and initrd

On Tue, 23 Oct 2001, Werner Almesberger wrote:

> H. Peter Anvin wrote:
> > The right thing is to get rid of the old initrd compatibility cruft,
> > but that's a 2.5 change.
>
> Yes, change_root is obsolete (and relies on assumptions that are no
> longer valid in several cases), and there has been plenty of time for
> distributors to switch. An early funeral in 2.5 is a good idea.

Hmm. I need to install a SCSI driver, presumably from initrd
RAM disk as currently works. Will the new pivot-root be transparent?




Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2001-10-23 20:52:46

by H. Peter Anvin

Subject: Re: [Q] pivot_root and initrd

Richard B. Johnson wrote:

> On Tue, 23 Oct 2001, Werner Almesberger wrote:
>
>
>>H. Peter Anvin wrote:
>>
>>>The right thing is to get rid of the old initrd compatibility cruft,
>>>but that's a 2.5 change.
>>>
>>Yes, change_root is obsolete (and relies on assumptions that are no
>>longer valid in several cases), and there has been plenty of time for
>>distributors to switch. An early funeral in 2.5 is a good idea.
>>
>
> Hmm. I need to install a SCSI driver, presumably from initrd
> RAM disk as currently works. Will the new pivot-root be transparent?
>


It's not transparent, you need to change your initrd.

-hpa


2001-10-23 21:00:56

by H. Peter Anvin

Subject: Re: [Q] pivot_root and initrd

Followup to: <ujkB7.3878$1%[email protected]>
By author: [email protected] (bill davidsen)
In newsgroup: linux.dev.kernel
>
> I wasn't really asking about changing root after the system is up, the
> part needed is the uncompressing of the filesystem into a ramdisk root f/s
> or some such. After that it's pretty much open to any of several techniques.
>
> Getting the modules loaded to support the root f/s and run a little rc
> file to get things going is the bootstrap operation, and that's where
> initrd is vital. You really don't want to build a kernel for every
> machine if you have more than a few! One kernel and a few config and
> initrd files is vastly easier.
>
> What replaces the initial step?
>

We will definitely have initrd or initramfs to do this (initramfs is
using the initrd protocol to populate a ramfs from a tar/cpio image.)
However, when it comes up, it will be the root as far as the kernel is
concerned, and run /sbin/init (unless overridden on the kernel command
line, of course) like any other boot. None of this change_root and
/linuxrc special casing garbage.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-23 21:05:06

by Richard B. Johnson

Subject: Re: [Q] pivot_root and initrd

On Tue, 23 Oct 2001, H. Peter Anvin wrote:

> Richard B. Johnson wrote:
>
> > On Tue, 23 Oct 2001, Werner Almesberger wrote:
> >
> >
> >>H. Peter Anvin wrote:
> >>
> >>>The right thing is to get rid of the old initrd compatibility cruft,
> >>>but that's a 2.5 change.
> >>>
> >>Yes, change_root is obsolete (and relies on assumptions that are no
> >>longer valid in several cases), and there has been plenty of time for
> >>distributors to switch. An early funeral in 2.5 is a good idea.
> >>
> >
> > Hmm. I need to install a SCSI driver, presumably from initrd
> > RAM disk as currently works. Will the new pivot-root be transparent?
> >
>
>
> It's not transparent, you need to change your initrd.
>
> -hpa


Presently, when /initrd/{ash.static} runs off the end of the
/initrd/linuxrc script, the kernel tries to mount the root
defined for LILO. So I add some program that executes 'pivot-root'
instead of just letting the script run off the end?

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2001-10-23 21:06:56

by H. Peter Anvin

Subject: Re: [Q] pivot_root and initrd

Richard B. Johnson wrote:

>
> Presently, when /initrd/{ash.static} runs off the end of the
> /initrd/linuxrc script, the kernel tries to mount the root
> defined for LILO. So I add some program that executes 'pivot-root'
> instead of just letting the script run off the end?
>


You do something like:

cd /newroot
pivot_root /newroot /newroot/oldroot
exec /sbin/init < /dev/console > /dev/console 2>&1

2001-10-23 21:59:08

by Eric

Subject: Re: [Q] pivot_root and initrd

Both the pivot_root(8) manpage and the <linux>/Documentation/initrd.txt
document admonish us to do much more than shown below (chroot, relative
pathing of pivot_root arguments, etc).

I certainly trust HPA's example, but it is a far sight from the
'documented' procedure. If the pivot_root developers expect
everyone in the world who depended previously on an implicit
change_root to modify their procedures, they have the
responsibility to see that the "better way" is understood.

If HPA's example is adequate the documentation should be modified.

E

H. Peter Anvin wrote:
> Richard B. Johnson wrote:
>
>>
>> Presently, when /initrd/{ash.static} runs off the end of the
>> /initrd/linuxrc script, the kernel tries to mount the root
>> defined for LILO. So I add some program that executes 'pivot-root'
>> instead of just letting the script run off the end?
>>
>
>
> You do something like:
>
> cd /newroot
> pivot_root /newroot /newroot/oldroot
> exec /sbin/init < /dev/console > /dev/console 2>&1
>
> -

2001-10-24 00:20:16

by Werner Almesberger

Subject: Re: [Q] pivot_root and initrd

Eric wrote:
> Both the pivot_root(8) manpage and the <linux>/Documentation/initrd.txt
> document admonish us to do much more than shown below (chroot, relative
> pathing of pivot_root arguments, etc).

Correct, yes. Peter's procedure should work with the current
implementation, but it's safer to use the documented approach,
particularly if the solution is distributed to other people.

I currently don't have any plans for changing the pivot_root
implementation, but I wouldn't be surprised if something comes
up at some point in 2.5, since the overall boot architecture
needs a bit of work.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Lausanne, CH [email protected] /
/_http://icawww.epfl.ch/almesberger/_____________________________________/

2001-10-29 19:34:32

by kaih

Subject: Re: [Q] pivot_root and initrd

[email protected] (Eric) wrote on 17.10.01 in <[email protected]>:

> You are simply doing the following, I assume with success:

> exec /sbin/init "$@"

> whereas I am doing something like the following:

> exec chroot . sh -c 'umount $OLDROOT; exec -a init.new /sbin/init
> $INITARGS' <dev/console >dev/console 2>&1

> I am mystified that the call to 'exec /sbin/init' works if you are using
> the standard (you mention "based on RedHat7.1") util-linux /sbin/init
> proggie, and that a standard RH7.1 initscripts would not complain when
> the root filesystem is already mounted r/w.

It works because the PID is 1, of course.

/linuxrc (or however you call it) runs with PID=1, so when it exec's
/sbin/init, the PID is still 1.

OTOH, you have chroot run a shell as a child, which therefore does *not*
have PID=1.

MfG Kai

2001-11-05 21:52:58

by Andreas Schwab

Subject: Re: [Q] pivot_root and initrd

[email protected] (Kai Henningsen) writes:

|> [email protected] (Eric) wrote on 17.10.01 in <[email protected]>:
|>
|> > You are simply doing the following, I assume with success:
|>
|> > exec /sbin/init "$@"
|>
|> > whereas I am doing something like the following:
|>
|> > exec chroot . sh -c 'umount $OLDROOT; exec -a init.new /sbin/init
|> > $INITARGS' <dev/console >dev/console 2>&1
|>
|> > I am mystified that the call to 'exec /sbin/init' works if you are using
|> > the standard (you mention "based on RedHat7.1") util-linux /sbin/init
|> > proggie, and that a standard RH7.1 initscripts would not complain when
|> > the root filesystem is already mounted r/w.
|>
|> It works because the PID is 1, of course.
|>
|> /linuxrc (or however you call it) runs with PID=1, so when it exec's
|> /sbin/init, the PID is still 1.
|>
|> OTOH, you have chroot run a shell as a child, which therefore does *not*
|> have PID=1.

linuxrc does 'exec chroot', chroot does 'exec sh', sh does 'exec init'.
Thus init should end up with the same pid as linuxrc.
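
The point about exec can be checked from any shell. The following is
only a small demonstration under the assumption that a POSIX sh and sed
are available. It does not touch pivot_root or init at all; it just
shows that a process keeps its PID when it replaces itself with exec:

```shell
# Run a shell that prints its PID, then exec's a second shell that
# prints its own PID. Because exec replaces the process image without
# forking, both lines should carry the same number.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

first=$(printf '%s\n' "$pids" | sed -n 1p)
second=$(printf '%s\n' "$pids" | sed -n 2p)

if [ "$first" = "$second" ]; then
    echo "PID unchanged across exec"
else
    echo "PID changed: $first -> $second"
fi
```

The same reasoning applies to the linuxrc, chroot, sh, init chain: each
step is an exec, so no new PID is allocated along the way.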

Andreas.

--
Andreas Schwab "And now for something
[email protected] completely different."
SuSE Labs, SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5

2001-11-06 00:11:25

by Eric

Subject: Re: [Q] pivot_root and initrd

Andreas wrote:

>> linuxrc does 'exec chroot', chroot does 'exec sh',
>> sh does 'exec init'.
>> Thus init should end up with the same pid as linuxrc.

Exactly, but if init does not have PID 1, we fail. Kai,
it works for HPA because of the magic kernel command line
incantation:

root=/dev/ram0 (apparently with or without devfs)

Without this, init does NOT get PID 1 and therefore it
all goes south rather quickly. The pivot_root syscall
is handy, but its operation under 2.4.x is disingenuous
at best.
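
For anyone debugging the same thing, a /linuxrc could report which case
it is in before handing over to init. This is only a sketch: the helper
name is_init_pid is made up for illustration, and the messages reflect
the init behaviour described earlier in this thread rather than anything
verified against a particular init implementation:

```shell
# Hypothetical helper: succeed only when the given PID is 1.
is_init_pid() {
    [ "$1" -eq 1 ]
}

# $$ is the PID of the running script; exec'ing init from here keeps it.
if is_init_pid "$$"; then
    echo "running as PID 1: a plain 'exec /sbin/init' should be accepted"
else
    echo "running as PID $$: init may refuse to start without arguments"
fi
```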

E