2002-01-13 22:38:28

by Matthias Andree

Subject: Boot failure: msdos pushes in front of reiserfs

Hello,

I have been helping Ewald Peiszer (CC'd) to get his machine to boot.

My current analysis of his situation is this:

1. he junked some of his FAT16 partitions, joined two of them as hda13
and uses hda11 ... hda13 for linux, formatted as ext2, swap,
reiserfs, in that order

2. his boot fails after the initrd provided by SuSE's install process
has loaded the reiserfs.o module, his boot logs reveal that the
kernel mounts his hda13 (which is /) as msdos rather than reiserfs.

3. I presume that msdos is linked into the kernel, and is thus tried
first as root file system, the kernel then panics as it cannot find
/sbin/init (of course, it's in ReiserFS format, not msdos).

4. I asked Ewald to boot with rootfstype=reiserfs, but he reported that
this did not help, news:<[email protected]>
(German-language).

5. It seems as though some traces of FAT16 shining through reiserfs
still make msdos think it can actually mount the file system.
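
The "traces shining through" can be looked at directly in the raw bytes. The following is only an illustrative sketch, not what the kernel's msdos driver actually checks. It assumes the usual on-disk conventions: a 0x55 0xAA boot signature at byte offset 510 of the first sector, the FAT12/16 type label at offset 54, and a reiserfs magic string at offset 52 of a superblock that starts 64 KiB into the partition. The function name describe_signatures is hypothetical.

```python
SECTOR = 512
REISERFS_SB_OFFSET = 64 * 1024   # assumption: reiserfs 3.x superblock starts 64 KiB in
REISERFS_MAGIC_OFFSET = 52       # assumption: offset of the magic string inside it
REISERFS_MAGICS = (b"ReIsErFs", b"ReIsEr2Fs", b"ReIsEr3Fs")


def describe_signatures(dev):
    """Report which signatures are visible on a seekable binary stream."""
    dev.seek(0)
    boot = dev.read(SECTOR)
    dev.seek(REISERFS_SB_OFFSET + REISERFS_MAGIC_OFFSET)
    magic = dev.read(10)
    return {
        # Generic PC boot-sector signature; FAT boot sectors carry it too.
        "boot_signature": boot[510:512] == b"\x55\xaa",
        # Type label that FAT12/16 formatters conventionally write.
        "fat_label": boot[54:59] in (b"FAT12", b"FAT16"),
        # reiserfs lives far past the first sector, so both can coexist.
        "reiserfs": magic.startswith(REISERFS_MAGICS),
    }
```

On a partition in the state described above one would expect all three entries to be true, e.g. by passing open("/dev/hda13", "rb") from a rescue system (read-only, but it needs root).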

I see various points where this can be attacked:

1. SuSE and other distributors' installation tools, when formatting a
partition with mkfs, should zero out the first couple of MBytes with
dd if=/dev/zero of=/dev/hda13 bs=4096 count=1024 or something. I'm
not exactly sure how much is needed to get rid of the msdos traces.

2. mkreiserfs could also zero out so much of old data on the FS so that
the kernel reliably recognizes the FS as reiserfs and fails to mount
that stuff as msdos

3. Distributors, when making their initrd stuff, should make sure that
all Linux-native file systems are tried first.

4. rootfstype=reiserfs should be made to work for the actual root fs, it
may be broken through initrd mounts, can anyone verify this? (note: I
did not verify it's not working, and I cannot currently tell the kernel
version, Ewald can follow up).

Ewald has only recently migrated from Windows to Linux and direly wants
his installation to boot. For now, I asked him to recompile his kernel
to let msdos, umsdos and vfat be only modules rather than linked into
the kernel, rebuild his initrd with SuSE's mk_initrd and rerun lilo,
that should work around his problem, but it's certainly not good and may
turn away people from Linux who are less enduring and patient than
Ewald.

Thanks a lot in advance,

--
Matthias Andree

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." Benjamin Franklin


2002-01-14 06:50:50

by Oleg Drokin

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

Hello!

On Sun, Jan 13, 2002 at 11:38:03PM +0100, Matthias Andree wrote:

> 3. I presume that msdos is linked into the kernel, and is thus tried
> first as root file system, the kernel then panics as it cannot find
> /sbin/init (of course, it's in ReiserFS format, not msdos).
It is tried _and_ somewhat successfully, because there is still an MSDOS "superblock".

> 4. I asked Ewald to boot with rootfstype=reiserfs, but he reported that
> this did not help, news:<[email protected]>
> (German-language).
Hm, probably because reiserfs is not in kernel, but is an external module.

> 5. It seems as though some traces of FAT16 shining through reiserfs
> still make msdos think it can actually mount the file system.
Exactly.

> I see various points where this can be attacked:
> 1. SuSE and other distributors' installation tools, when formatting a
> partition with mkfs, should zero out the first couple of MBytes with
> dd if=/dev/zero of=/dev/hda13 bs=4096 count=1024 or something. I'm
> not exactly sure how much is needed to get rid of the msdos traces.
Erasing the first 512-byte block is enough to get rid of the msdos superblock, I think.

> 2. mkreiserfs could also zero out so much of old data on the FS so that
> the kernel reliably recognizes the FS as reiserfs and fails to mount
> that stuff as msdos
External tools (lilo and stuff) can live there, this will destroy them.

Correct solution, if you create filesystem with mkreiserfs, and you
have no reliable way to pass fstype to kernel, when this partition is mounted
should be to destroy all occurrences of other fs's superblocks by yourself, obviously.

> 3. Distributors, when making their initrd stuff, should make sure that
> all Linux-native file systems are tried first.
FS tryout order is hard-wired into the kernel (and depends on linking order, AFAIK).

> Ewald has only recently migrated from Windows to Linux and direly wants
> his installation to boot. For now, I asked him to recompile his kernel
> to let msdos, umsdos and vfat be only modules rather than linked into
> the kernel, rebuild his initrd with SuSE's mk_initrd and rerun lilo,
> that should work around his problem, but it's certainly not good and may
> turn away people from Linux who are less enduring and patient than
> Ewald.
Why haven't you asked him to do 'dd if=/dev/zero of=/dev/his_partition bs=512 count=1'?
(and this won't destroy existing reiserfs filesystem).
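
The same operation can be rehearsed on an image file before touching a real partition. A minimal sketch, assuming an image of the partition was made beforehand; the helper name zero_first_sector is hypothetical. It overwrites only bytes 0..511 in place and reports whether everything after them is byte-for-byte unchanged, which is the property the dd suggestion relies on.

```python
import hashlib

SECTOR = 512


def _digest_after_first_sector(f):
    """SHA-256 of everything following the first 512 bytes."""
    f.seek(SECTOR)
    h = hashlib.sha256()
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
    return h.hexdigest()


def zero_first_sector(path):
    """Overwrite bytes 0..511 in place; return True if the rest is unchanged."""
    # "r+b" updates the file without truncating it, so the image keeps its size.
    with open(path, "r+b") as f:
        before = _digest_after_first_sector(f)
        f.seek(0)
        f.write(b"\0" * SECTOR)
        f.flush()
        after = _digest_after_first_sector(f)
    return before == after
```

Note that when dd is pointed at a regular image file instead of a block device, it truncates the output unless conv=notrunc is given; opening with "r+b" avoids that here.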

Bye,
Oleg

2002-01-14 11:20:37

by Hans Reiser

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

Oleg Drokin wrote:

>Hello!
>
>On Sun, Jan 13, 2002 at 11:38:03PM +0100, Matthias Andree wrote:
>
>>3. I presume that msdos is linked into the kernel, and is thus tried
>> first as root file system, the kernel then panics as it cannot find
>> /sbin/init (of course, it's in ReiserFS format, not msdos).
>>
>It is tried _and_ somewhat successfully, because there is still an MSDOS "superblock".
>
>
>>4. I asked Ewald to boot with rootfstype=reiserfs, but he reported that
>> this did not help, news:<[email protected]>
>> (German-language).
>>
>Hm, probably because reiserfs is not in kernel, but is an external module.
>
>>5. It seems as though some traces of FAT16 shining through reiserfs
>> still make msdos think it can actually mount the file system.
>>
>Exactly.
>
>>I see various points where this can be attacked:
>>1. SuSE and other distributors' installation tools, when formatting a
>> partition with mkfs, should zero out the first couple of MBytes with
>> dd if=/dev/zero of=/dev/hda13 bs=4096 count=1024 or something. I'm
>> not exactly sure how much is needed to get rid of the msdos traces.
>>
>Erasing the first 512-byte block is enough to get rid of the msdos superblock, I think.
>
>>2. mkreiserfs could also zero out so much of old data on the FS so that
>> the kernel reliably recognizes the FS as reiserfs and fails to mount
>> that stuff as msdos
>>
>External tools (lilo and stuff) can live there, this will destroy them.
>
>Correct solution, if you create filesystem with mkreiserfs, and you
>have no reliable way to pass fstype to kernel, when this partition is mounted
>should be to destroy all occurrences of other fs's superblocks by yourself, obviously.
>
>>3. Distributors, when making their initrd stuff, should make sure that
>> all Linux-native file systems are tried first.
>>
>FS tryout order is hard-wired into the kernel (and depends on linking order, AFAIK).
>
>>Ewald has only recently migrated from Windows to Linux and direly wants
>>his installation to boot. For now, I asked him to recompile his kernel
>>to let msdos, umsdos and vfat be only modules rather than linked into
>>the kernel, rebuild his initrd with SuSE's mk_initrd and rerun lilo,
>>that should work around his problem, but it's certainly not good and may
>>turn away people from Linux who are less enduring and patient than
>>Ewald.
>>
>Why haven't you asked him to do 'dd if=/dev/zero of=/dev/his_partition bs=512 count=1'?
>(and this won't destroy existing reiserfs filesystem).
>
>Bye,
> Oleg
>
>
So what solution should we use, zeroing or fixing msdos to not try
something reiserfs can find, or both or what?

I want the solution to also fix the confusing error messages that msdos
issues when it sees reiserfs.

Hans


2002-01-14 11:40:27

by Oleg Drokin

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

Hello!

On Mon, Jan 14, 2002 at 02:16:30PM +0300, Hans Reiser wrote:

> So what solution should we use, zeroing or fixing msdos to not try
> something reiserfs can find, or both or what?

We can use both:
destroy MSDOS superblock (if any) at mkreiserfs (or don't touch 1st block of the device if there is no
msdos superblock).
And link reiserfs code into the kernel earlier than msdos code is linked in.

This second way is for those poor souls who ran mkreiserfs on top of their FAT partitions before
we released new mkreiserfs that can destroy FAT superblocks.

> I want the solution to also fix the confusing error messages that msdos
> issues when it sees reiserfs.
Changing the linking order will fix that, too.

Bye,
Oleg

2002-01-14 14:02:12

by Matthias Andree

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

On Mon, 14 Jan 2002, Oleg Drokin wrote:

> It is tried _and_ somewhat successfully, because there is still an MSDOS "superblock".

True.

> > 4. I asked Ewald to boot with rootfstype=reiserfs, but he reported that
> > this did not help, news:<[email protected]>
> > (German-language).
> Hm, probably because reiserfs is not in kernel, but is an external module.

Yes, but this module is loaded from SuSE's initrd, so it is loaded
before the root file system is mounted. Might it be that "rootfstype" is
already checked as the initrd is mounted rather than when the actual
root is mounted? If so, fs/super.c deserves fixing, but I'm not
acquainted with that code, so I cannot tell or fix that now.

> > 2. mkreiserfs could also zero out so much of old data on the FS so that
> > the kernel reliably recognizes the FS as reiserfs and fails to mount
> > that stuff as msdos
> External tools (lilo and stuff) can live there, this will destroy them.
>
> Correct solution, if you create filesystem with mkreiserfs, and you
> have no reliable way to pass fstype to kernel, when this partition is mounted
> should be to destroy all occurrences of other fs's superblocks by yourself, obviously.

Sure, but tell newbies how to do that. (Tell distributors first. :-)
>
> > 3. Distributors, when making their initrd stuff, should make sure that
> > all Linux-native file systems are tried first.
> FS tryout order is hard-wired into the kernel (and depends on linking order, AFAIK).

Yup, reiserfs is last in /proc/filesystems when loaded as module, but on
my private machine (where it's linked into the kernel), it's right after
ext2 and before vfat.
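
The order can be read off /proc/filesystems with a few lines of Python. This is just a sketch, assuming the usual format in which virtual filesystems are prefixed with "nodev" and the others are not; the function name block_fs_probe_order is made up for the example.

```python
def block_fs_probe_order(text):
    """Return the filesystems that need a block device, in listed order."""
    order = []
    for line in text.splitlines():
        fields = line.split()
        # Virtual filesystems appear as "nodev <name>"; skip those.
        if len(fields) == 1 and fields[0] != "nodev":
            order.append(fields[0])
    return order
```

On a running system: block_fs_probe_order(open("/proc/filesystems").read()). A driver loaded as a module shows up at the end of the returned list.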

> Why haven't you asked him to do 'dd if=/dev/zero
> of=/dev/his_partition bs=512 count=1'? (and this won't destroy
> existing reiserfs filesystem).

Because I wasn't sure about the superblock checking behaviour of msdos.o
and whether this would harm reiserfs.o. I'm not telling newbies things
when I'm not absolutely sure they don't make things worse, I don't want
to scare them away from Linux.

Ewald, can you try

dd if=/dev/zero of=/dev/hda13 bs=512 count=1

from your rescue system and then check if your actual system boots
properly without help of SuSE's install floppies/CD?

--
Matthias Andree

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." Benjamin Franklin

2002-01-14 14:31:05

by Oleg Drokin

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

Hello!

On Mon, Jan 14, 2002 at 03:00:21PM +0100, Matthias Andree wrote:

> > > 4. I asked Ewald to boot with rootfstype=reiserfs, but he reported that
> > > this did not help, news:<[email protected]>
> > > (German-language).
> > Hm, probably because reiserfs is not in kernel, but is an external module.
> Yes, but this module is loaded from SuSE's initrd, so it is loaded
> before the root file system is mounted. Might it be that "rootfstype" is
> already checked as the initrd is mounted rather than when the actual
> root is mounted? If so, fs/super.c deserves fixing, but I'm not
> acquainted with that code, so I cannot tell or fix that now.
Looking at init/main.c and fs/super.c,
the rootfsflags parameter is never saved; moreover, its original value is destroyed once the initrd fs is mounted.
I only see not very nice ways of fixing this, so perhaps someone more experienced can come up with a solution?
(My crappy idea is not to do putname() on fs_names if (real_root_dev != ROOT_DEV); all of this applies only when CONFIG_..._INITRD
is enabled.)

> > > 2. mkreiserfs could also zero out so much of old data on the FS so that
> > > the kernel reliably recognizes the FS as reiserfs and fails to mount
> > > that stuff as msdos
> > External tools (lilo and stuff) can live there, this will destroy them.
> >
> > Correct solution, if you create filesystem with mkreiserfs, and you
> > have no reliable way to pass fstype to kernel, when this partition is mounted
> > should be to destroy all occurrences of other fs's superblocks by yourself, obviously.
> Sure, but tell newbies how to do that. (Tell distributors first. :-)
Our internal decision is now to detect whether the device we are going to mkfs has a FAT superblock
and, if it does, to zero it out.
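
A rough sketch of that detect-then-zero idea, not the actual mkreiserfs change. The FAT test here is an assumption: a 0x55 0xAA boot signature at offset 510 plus a FAT type label at offset 54 (FAT12/16) or 82 (FAT32). The names looks_like_fat and zap_fat_boot_sector are hypothetical.

```python
SECTOR = 512


def looks_like_fat(boot):
    """Heuristic: boot signature plus a FAT type label in the first sector."""
    if len(boot) < SECTOR or boot[510:512] != b"\x55\xaa":
        return False
    return boot[54:57] == b"FAT" or boot[82:87] == b"FAT32"


def zap_fat_boot_sector(path):
    """Zero the first sector only if it looks like FAT; return True if zeroed."""
    with open(path, "r+b") as dev:
        if not looks_like_fat(dev.read(SECTOR)):
            return False          # leave anything else (e.g. a boot loader) alone
        dev.seek(0)
        dev.write(b"\0" * SECTOR)
    return True
```

Checking before writing is what keeps a first block that holds something other than FAT untouched.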

> > > 3. Distributors, when making their initrd stuff, should make sure that
> > > all Linux-native file systems are tried first.
> > FS tryout order is hard-wired into the kernel (and depends on linking order, AFAIK).
> Yup, reiserfs is last in /proc/filesystems when loaded as module, but on
> my private machine (where it's linked into the kernel), it's right after
> ext2 and before vfat.
Do you have vfat as a loadable module?

Bye,
Oleg

2002-01-14 17:43:12

by Andreas Dilger

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

On Jan 14, 2002 14:36 +0300, Oleg Drokin wrote:
> On Mon, Jan 14, 2002 at 02:16:30PM +0300, Hans Reiser wrote:
> > So what solution should we use, zeroing or fixing msdos to not try
> > something reiserfs can find, or both or what?
>
> We can use both:
> destroy MSDOS superblock (if any) at mkreiserfs (or don't touch 1st
> block of the device if there is no msdos superblock).
> And link reiserfs code into the kernel earlier than msdos code.

Hmm, I could have sworn I submitted patches already which did both of these
things. In general, it is perfectly safe to zero the bootsector of a
partition when you mkfs it (mke2fs has been doing this for a long time).
If you mkfs your boot partition (and zap the bootblock) you would have to
run LILO on it anyways after they install a new kernel, because the
location of the kernel would change.

'Re: 2.4.15-pre1: "bogus" message with reiserfs root and other weirdness'
dated Nov 21, 2001 for patch to clean up reiserfs boot messages and order.

'Re: [reiserfs-list] Re: Basic reiserfs question' dated Sep 7, 2001 for
patch which (among other things) zaps non-reiserfs data from the disk
when mkreiserfs is run (also referenced in a subsequent posting
'Re: [reiserfs-list] mkreiserfs /dev/hdb' dated Oct 1, 2001).

There was a patch submitted within the past week to clean up the FAT
messages when "silent" is passed. In any case, that is mostly irrelevant
if reiserfs is moved up in the probe order.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-01-14 19:50:52

by Hans Reiser

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

Andreas Dilger wrote:

>On Jan 14, 2002 14:36 +0300, Oleg Drokin wrote:
>
>>On Mon, Jan 14, 2002 at 02:16:30PM +0300, Hans Reiser wrote:
>>
>>>So what solution should we use, zeroing or fixing msdos to not try
>>>something reiserfs can find, or both or what?
>>>
>>We can use both:
>> destroy MSDOS superblock (if any) at mkreiserfs (or don't touch 1st
>> block of the device if there is no msdos superblock).
>> And link reiserfs code into the kernel earlier than msdos code.
>>
>
>Hmm, I could have sworn I submitted patches already which did both of these
>things. In general, it is perfectly safe to zero the bootsector of a
>partition when you mkfs it (mke2fs has been doing this for a long time).
>If you mkfs your boot partition (and zap the bootblock) you would have to
>run LILO on it anyways after they install a new kernel, because the
>location of the kernel would change.
>
Can the kernel be in a different partition from the boot partition? If
so, it is not safe, yes?

>
>
>'Re: 2.4.15-pre1: "bogus" message with reiserfs root and other weirdness'
>dated Nov 21, 2001 for patch to clean up reiserfs boot messages and order.
>
>'Re: [reiserfs-list] Re: Basic reiserfs question' dated Sep 7, 2001 for
>patch which (among other things) zaps non-reiserfs data from the disk
>when mkreiserfs is run (also referenced in a subsequent posting
>'Re: [reiserfs-list] mkreiserfs /dev/hdb' dated Oct 1, 2001).
>

Oleg, please review his patches and integrate them into our release process.

>
>
>There was a patch submitted within the past week to clean up the FAT
>messages when "silent" is passed. In any case, that is mostly irrelevant
>if reiserfs is moved up in the probe order.
>
>Cheers, Andreas
>--
>Andreas Dilger
>http://sourceforge.net/projects/ext2resize/
>http://www-mddsp.enel.ucalgary.ca/People/adilger/
>
>



2002-01-14 20:23:56

by Andrew Clausen

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

On Mon, Jan 14, 2002 at 10:46:05PM +0300, Hans Reiser wrote:
> >Hmm, I could have sworn I submitted patches already which did both of these
> >things. In general, it is perfectly safe to zero the bootsector of a
> >partition when you mkfs it (mke2fs has been doing this for a long time).
> >If you mkfs your boot partition (and zap the bootblock) you would have to
> >run LILO on it anyways after they install a new kernel, because the
> >location of the kernel would change.
>
> Can the kernel be in a different partition from the boot partition? If
> so, it is not safe, yes?

Correct.

OTOH, it seems sane to reinstall lilo anyway in such situations.

OTOH2, Parted only erases signatures, so it won't break LILO.
OTOH3, this requires parted knowing about different fs types
(which it does, to a large extent).

Andrew

2002-01-14 20:23:56

by Chris Mason

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs



On Monday, January 14, 2002 10:42:42 AM -0700 Andreas Dilger
<[email protected]> wrote:

>> We can use both:
>> destroy MSDOS superblock (if any) at mkreiserfs (or don't touch 1st
>> block of the device if there is no msdos superblock).
>> And link reiserfs code into the kernel earlier than msdos code.
>
> Hmm, I could have sworn I submitted patches already which did both of these
> things. In general, it is perfectly safe to zero the bootsector of a
> partition when you mkfs it (mke2fs has been doing this for a long time).
> If you mkfs your boot partition (and zap the bootblock) you would have to
> run LILO on it anyways after they install a new kernel, because the
> location of the kernel would change.

Hmmm mke2fs seems to always zero out the first 1024, except on sparcs (when
ZAP_BOOT_BLOCK not defined). I thought alphas stored the partition table on
the first block of the first partition as well, and that we didn't want to
zero it then.

-chris

2002-01-14 20:32:32

by Andrew Clausen

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

On Mon, Jan 14, 2002 at 03:21:15PM -0500, Chris Mason wrote:
> Hmmm mke2fs seems to always zero out the first 1024, except on sparcs (when
> ZAP_BOOT_BLOCK not defined). I thought alphas stored the partition table on
> the first block of the first partition as well, and that we didn't want to
> zero it then.

*nitpick*

I think that's Sun... Alphas use either BSD or MSDOS tables, neither
of which does that...

Andrew

2002-01-15 17:47:37

by Matthias Andree

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

On Mon, 14 Jan 2002, Oleg Drokin wrote:

> Looking at init/main.c and fs/super.c, rootfsflags parameter is never
> saved, moreover - its original value is destroyed, once initrd fs is
> mounted. And I only see not very nice ways of fixing this, so perhaps
> someone more experienced can come up with the solution? (my crappy
> idea is not to do putname() on fs_names, if (real_root_dev !=
> ROOT_DEV), all of this is only when CONFIG_..._INITRD enabled)

Thanks for confirming a bug, so I understand that mounting an initrd
loses the rootfsflags, and as the actual root= parameter is kept over an
initrd boot, it should also be possible for rootfsflags= -- can the
rootfsflags maybe be saved along with the root= parameter?

> > Yup, reiserfs is last in /proc/filesystems when loaded as module, but on
> > my private machine (where it's linked into the kernel), it's right after
> > ext2 and before vfat.
> Do you have vfat as a loadable module?

Hum, yes, but that's not the point, someone turned up with a SuSE 7.3
default kernel .config, and it had ...MSDOS=y ...REISERFS=m -- that says
it all: msdos is higher in the list, reiserfs is then loaded from the
initrd and thus ends up at the bottom of the list. Strangely enough, SuSE
compiles MSDOS, which hardly anyone needs at boot time, into the kernel,
but not reiserfs (admittedly, reiserfs takes up some memory, but then, it's a
native file system and should be loaded before non-native file systems
such as msdos, vfat, ntfs, freevxfs or whatever). This one is for the
distributors to fix.

Had they left MSDOS as a module, things would have worked out: 1. ext2
in the kernel 2. initrd loads reiserfs 3. actual root (reiserfs) is
mounted 4. only now, msdos.o becomes available.
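
To make the ordering argument concrete, here is a toy model in Python. It is not kernel code, and both function names are invented: it only assumes that built-in drivers are tried in link order, that modules are appended in the order they are loaded, and that the first driver that accepts the partition wins.

```python
def probe_order(builtin, modules_loaded):
    """Built-in drivers come first (link order); modules are appended as loaded."""
    return list(builtin) + list(modules_loaded)


def mount_root(order, accepts):
    """Return the first filesystem type whose driver accepts the partition."""
    for fs in order:
        if fs in accepts:
            return fs
    return None


# hda13 as described in this thread: a stale FAT boot sector plus a valid
# reiserfs superblock, so both drivers are willing to take it.
accepts = {"msdos", "reiserfs"}

# MSDOS=y, REISERFS=m with reiserfs.o loaded from the initrd:
suse_default = mount_root(probe_order(["ext2", "msdos"], ["reiserfs"]), accepts)

# msdos left as a module that is not loaded before root is mounted:
msdos_modular = mount_root(probe_order(["ext2"], ["reiserfs"]), accepts)
```

In this model suse_default ends up as "msdos" and msdos_modular as "reiserfs", which matches the behaviour reported above.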

--
Matthias Andree

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." Benjamin Franklin

2002-01-17 06:39:00

by Oleg Drokin

Subject: Re: [reiserfs-list] Boot failure: msdos pushes in front of reiserfs

Hello!

On Tue, Jan 15, 2002 at 06:47:12PM +0100, Matthias Andree wrote:
> > Looking at init/main.c and fs/super.c, rootfsflags parameter is never
> > saved, moreover - its original value is destroyed, once initrd fs is
> > mounted. And I only see not very nice ways of fixing this, so perhaps
> > someone more experienced can come up with the solution? (my crappy
> > idea is not to do putname() on fs_names, if (real_root_dev !=
> > ROOT_DEV), all of this is only when CONFIG_..._INITRD enabled)
> Thanks for confirming a bug, so I understand that mounting an initrd
> loses the rootfsflags, and as the actual root= parameter is kept over an
> initrd boot, it should also be possible for rootfsflags= -- can the
> rootfsflags maybe be saved along with the root= parameter?

No. rootfsflags is saved. What is not saved is rootfstype. And yes, it can be saved, of course.

> > > Yup, reiserfs is last in /proc/filesystems when loaded as module, but on
> > > my private machine (where it's linked into the kernel), it's right after
> > > ext2 and before vfat.
> > Do you have vfat as a loadable module?
> Hum, yes, but that's not the point, someone turned up with a SuSE 7.3
This is the point, in fact. If you had both reiserfs and vfat compiled in,
you'd see vfat before reiserfs in /proc/filesystems.

Bye,
Oleg