2008-11-14 05:16:55

by Yanmin Zhang

Subject: system fails to boot

Jens,

We ran into system boot failures with kernel 2.6.28-rc. We found them on several
machines, including a T61 notebook, a Nehalem machine, and an HPC NX6325 notebook.
All the machines run Fedora Core 8 or Fedora Core 9. With kernels prior to 2.6.28-rc,
system boot doesn't fail.

I debugged it and located the root causes. Please see
http://bugzilla.kernel.org/show_bug.cgi?id=11899
https://bugzilla.redhat.com/show_bug.cgi?id=471517

As a matter of fact, there are 2 bugs.

1) With root=/dev/sda1, system boot fails randomly, roughly once in every 5
boots. nash has a bug: some of its functions overload the return value 0.
Sometimes 0 means a timeout with no uevent available. Sometimes 0 means nash got
a uevent, but the uevent isn't block-related (for example, usb). If, by coincidence,
the kernel tells nash that uevents are available but also sets the timeout, nash
might stop collecting the other uevents in the queue when the current uevent isn't
block-related. I worked out a patch for nash to fix it.
http://bugzilla.kernel.org/attachment.cgi?id=18858
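
The ambiguity in 1) can be shown with a minimal sketch. This is not nash's code: the enum, the function names, and the "/block/" substring test are all assumptions made for illustration. The only point is that "timed out" and "got a uevent that isn't block-related" get distinct return values, so the caller keeps draining the queue in the second case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; the names are illustrative, not nash's. */
enum uevent_result {
    UEVENT_TIMEOUT = 0,  /* nothing arrived before the deadline */
    UEVENT_IGNORED = 1,  /* a uevent arrived, but not for a block device */
    UEVENT_BLOCK   = 2   /* a uevent for a block device arrived */
};

/* Classify one uevent message; NULL stands in for "the poll timed out".
 * Kernel uevents start with "ACTION@DEVPATH"; treating any devpath that
 * contains "/block/" as block-related is a simplification. */
enum uevent_result classify_uevent(const char *msg)
{
    const char *path;

    if (msg == NULL)
        return UEVENT_TIMEOUT;
    path = strchr(msg, '@');
    if (path != NULL && strstr(path, "/block/") != NULL)
        return UEVENT_BLOCK;
    return UEVENT_IGNORED;
}

/* Drain the queue and count block uevents. Only a real timeout ends
 * the loop; a non-block uevent is skipped and the loop continues. */
int count_block_events(const char *const *queue, size_t n)
{
    int blocks = 0;
    size_t i;

    assert(queue != NULL);
    for (i = 0; i < n; i++) {
        enum uevent_result r = classify_uevent(queue[i]);

        if (r == UEVENT_TIMEOUT)
            break;
        if (r == UEVENT_BLOCK)
            blocks++;
    }
    return blocks;
}
```

With a single return value of 0 for both UEVENT_TIMEOUT and UEVENT_IGNORED, the loop above would stop at the first usb uevent and miss any block uevents queued behind it.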

2) With root=LABEL=/, the system can never boot. The initrd init reports that
switchroot fails. Here is the execution path of nash when booting:
(1) nash reads /sys/block/sda/dev; assume the major is 8 (as on my desktop);
(2) nash queries /proc/devices with the major number and finds the line "8 sd";
(3) nash uses 'sd' to search its own probe table for the device (DISK) type of the device
and adds the device to its own list;
(4) later on, it probes all devices in its list to get filesystem labels.
scsi always registers "8 sd".
When the major is 259, nash fails to find the device (DISK) type. I enabled CONFIG_DEBUG_BLOCK_EXT_DEVT=y
when compiling the kernel, so 259 is picked for device /dev/sda1, which causes nash to fail
to find the device (DISK) type.
To fix issue 2), I created a patch for nash and another patch for the kernel.
http://bugzilla.kernel.org/attachment.cgi?id=18859
http://bugzilla.kernel.org/attachment.cgi?id=18837
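
Steps (2) and (3) can be sketched roughly as follows. This is only an illustration, not nash's implementation: the function names and the two-entry probe table are hypothetical, and only DEV_TYPE_DISK is a name taken from this thread. The sketch shows why a missing "259 blkext" line in /proc/devices makes the lookup fail before the probe table is ever consulted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum dev_type { DEV_TYPE_NONE = 0, DEV_TYPE_DISK = 1 };

/* Look up `major` in the "Block devices:" section of /proc/devices-style
 * text. Returns 1 and copies the driver name into `name` when found,
 * 0 otherwise. */
int block_major_name(const char *text, int major, char *name, size_t len)
{
    const char *p;

    assert(text != NULL && name != NULL && len > 0);
    p = strstr(text, "Block devices:");
    if (p == NULL)
        return 0;
    p = strchr(p, '\n');
    while (p != NULL && *++p != '\0') {  /* p is at the start of a line */
        int m;
        char buf[64];

        if (sscanf(p, "%d %63s", &m, buf) == 2 && m == major) {
            snprintf(name, len, "%s", buf);
            return 1;
        }
        p = strchr(p, '\n');
    }
    return 0;
}

/* Hypothetical probe table keyed by driver name. The "blkext" entry
 * stands in for what a patched nash would need to recognise. */
enum dev_type probe_type(const char *name)
{
    static const char *const disks[] = { "sd", "blkext" };
    size_t i;

    for (i = 0; i < sizeof disks / sizeof disks[0]; i++)
        if (strcmp(name, disks[i]) == 0)
            return DEV_TYPE_DISK;
    return DEV_TYPE_NONE;
}
```

Under these assumptions, both halves are needed: the kernel has to list major 259 in /proc/devices, and the user-space table has to know the name it is listed under.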

Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new block device, in /proc/devices.

With the 2 patches on nash and the 1 patch on the kernel, I booted my machines dozens of times
without failure.

Signed-off-by: Zhang Yanmin <[email protected]>

Would you accept the kernel patch into your testing tree? Please CC me when replying,
as I can't subscribe to LKML emails now.

---

--- linux-2.6.28-rc4/block/genhd.c 2008-11-11 08:37:24.000000000 +0800
+++ linux-2.6.28-rc4_label/block/genhd.c 2008-11-13 04:05:35.000000000 +0800
@@ -1028,6 +1028,7 @@ static int __init proc_genhd_init(void)
 {
 	proc_create("diskstats", 0, NULL, &proc_diskstats_operations);
 	proc_create("partitions", 0, NULL, &proc_partitions_operations);
+	register_blkdev(BLOCK_EXT_MAJOR, "blkext");
 	return 0;
 }
 module_init(proc_genhd_init);




2008-11-14 05:27:19

by Tejun Heo

Subject: Re: system fails to boot

Zhang, Yanmin wrote:
> Signed-off-by Zhang Yanmin <[email protected]>

Acked-by: Tejun Heo <[email protected]>

--
tejun

2008-11-14 06:15:29

by Alexey Dobriyan

Subject: Re: system fails to boot

On Fri, Nov 14, 2008 at 01:16:21PM +0800, Zhang, Yanmin wrote:
> Jens,
>
> We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
> machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
> All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
> system boot doesn't fail.
>
> I debug it and locate the root cause. Pls. see
> http://bugzilla.kernel.org/show_bug.cgi?id=11899
> https://bugzilla.redhat.com/show_bug.cgi?id=471517
>
> As a matter of fact, there are 2 bugs.
>
> 1)root=/dev/sda1, system boot randomly fails. Mostly, boot for 5
> times and fails once. nash has a bug. Some of its functions misuse return value 0.
> Sometimes, 0 means timeout and no uevent available. Sometimes, 0 means nash gets
> an uevent, but the uevent isn't block-related (for exmaple, usb). If by coincidence,
> kernel tells nash that uevents are available, but kernel also set timeout, nash
> might stops collecting other uevents in queue if current uevent isn't block-related.
> I work out a patch for nash to fix it.
> http://bugzilla.kernel.org/attachment.cgi?id=18858
>
> 2) root=LABEL=/, system always can't boot. initrd init reports
> switchroot fails. Here is an executation branch of nash when booting:
> (1) nash read /sys/block/sda/dev; Assume major is 8 (on my desktop)
> (2) nash query /proc/devices with the major number; It found line "8 sd";
> (3) nash use 'sd' to search its own probe table to find device (DISK) type for the device
> and add it to its own list;
> (4) Later on, it probes all devices in its list to get filesystem labels;
> scsi register "8 sd" always.
> When major is 259, nash fails to find the device(DISK) type. I enables CONFIG_DEBUG_BLOCK_EXT_DEVT=y
> when compiling kernel, so 259 is picked up for device /dev/sda1, which causes nash to fail
> to find device (DISK) type.
> To fixing issue 2), I create a patch for nash and another patch for kernel.
> http://bugzilla.kernel.org/attachment.cgi?id=18859
> http://bugzilla.kernel.org/attachment.cgi?id=18837
>
> Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new block device in proc/devices.
>
> With 2 patches on nash and 1 patch on kernel, I boot my machines for dozens of times
> without failure.
>
> Signed-off-by Zhang Yanmin <[email protected]>
> 
> Would you like to accept the kernel patch into your testing tree? Pls. do CC to me when replying
> as I couldn't subscribe LKML emails now.
>
> ---
>
> --- linux-2.6.28-rc4/block/genhd.c 2008-11-11 08:37:24.000000000 +0800
> +++ linux-2.6.28-rc4_label/block/genhd.c 2008-11-13 04:05:35.000000000 +0800
> @@ -1028,6 +1028,7 @@ static int __init proc_genhd_init(void)
> {
> proc_create("diskstats", 0, NULL, &proc_diskstats_operations);
> proc_create("partitions", 0, NULL, &proc_partitions_operations);
> + register_blkdev(BLOCK_EXT_MAJOR, "blkext");
> return 0;
> }
> module_init(proc_genhd_init);

It's procfs-specific init, what's up?

2008-11-14 06:23:19

by Tejun Heo

Subject: Re: system fails to boot

Alexey Dobriyan wrote:
>> Would you like to accept the kernel patch into your testing tree? Pls. do CC to me when replying
>> as I couldn't subscribe LKML emails now.
>>
>> ---
>>
>> --- linux-2.6.28-rc4/block/genhd.c 2008-11-11 08:37:24.000000000 +0800
>> +++ linux-2.6.28-rc4_label/block/genhd.c 2008-11-13 04:05:35.000000000 +0800
>> @@ -1028,6 +1028,7 @@ static int __init proc_genhd_init(void)
>> {
>> proc_create("diskstats", 0, NULL, &proc_diskstats_operations);
>> proc_create("partitions", 0, NULL, &proc_partitions_operations);
>> + register_blkdev(BLOCK_EXT_MAJOR, "blkext");
>> return 0;
>> }
>> module_init(proc_genhd_init);
>
> It's procfs-specific init, what's up?

Ah... right, better to move it to genhd_device_init(). Thanks.

--
tejun

2008-11-14 06:30:25

by Yanmin Zhang

Subject: Re: system fails to boot


On Fri, 2008-11-14 at 09:18 +0300, Alexey Dobriyan wrote:
> On Fri, Nov 14, 2008 at 01:16:21PM +0800, Zhang, Yanmin wrote:
> > Jens,
> >
> > We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
> > machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
> > All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
> > system boot doesn't fail.
> >
> > I debug it and locate the root cause. Pls. see
> > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> >
> > As a matter of fact, there are 2 bugs.
> >
> > 1)root=/dev/sda1, system boot randomly fails. Mostly, boot for 5
> > times and fails once. nash has a bug. Some of its functions misuse return value 0.
> > Sometimes, 0 means timeout and no uevent available. Sometimes, 0 means nash gets
> > an uevent, but the uevent isn't block-related (for exmaple, usb). If by coincidence,
> > kernel tells nash that uevents are available, but kernel also set timeout, nash
> > might stops collecting other uevents in queue if current uevent isn't block-related.
> > I work out a patch for nash to fix it.
> > http://bugzilla.kernel.org/attachment.cgi?id=18858
> >
> > 2) root=LABEL=/, system always can't boot. initrd init reports
> > switchroot fails. Here is an executation branch of nash when booting:
> > (1) nash read /sys/block/sda/dev; Assume major is 8 (on my desktop)
> > (2) nash query /proc/devices with the major number; It found line "8 sd";
> > (3) nash use 'sd' to search its own probe table to find device (DISK) type for the device
> > and add it to its own list;
> > (4) Later on, it probes all devices in its list to get filesystem labels;
> > scsi register "8 sd" always.
> > When major is 259, nash fails to find the device(DISK) type. I enables CONFIG_DEBUG_BLOCK_EXT_DEVT=y
> > when compiling kernel, so 259 is picked up for device /dev/sda1, which causes nash to fail
> > to find device (DISK) type.
> > To fixing issue 2), I create a patch for nash and another patch for kernel.
> > http://bugzilla.kernel.org/attachment.cgi?id=18859
> > http://bugzilla.kernel.org/attachment.cgi?id=18837
> >
> > Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new block device in proc/devices.
> >
> It's procfs-specific init, what's up?
nash (FC9 uses nash to interpret the init script in the initrd) reads /proc/devices to check the type of
the root device. When CONFIG_DEBUG_BLOCK_EXT_DEVT=y, the root device MAJOR is 259. The current kernel doesn't
register a block device for 259 in /proc/devices.

It's hard to explain in a short statement. Could you read the details at
http://bugzilla.kernel.org/show_bug.cgi?id=11899?
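
As a small illustration of the condition described above: /sys/block/<disk>/dev holds a "MAJOR:MINOR" string, and the problem appears when the major is 259. The helper names below are hypothetical, and the macro simply repeats the value quoted in this thread; this is a sketch, not code from nash or the kernel.

```c
#include <assert.h>
#include <stdio.h>

#define BLOCK_EXT_MAJOR 259  /* value quoted in this thread */

/* Parse a "MAJOR:MINOR" string as found in /sys/block/<disk>/dev.
 * Returns 1 when both numbers were read, 0 otherwise. */
int parse_dev(const char *s, unsigned *major, unsigned *minor)
{
    assert(s != NULL && major != NULL && minor != NULL);
    return sscanf(s, "%u:%u", major, minor) == 2;
}

/* Returns 1 when the device number uses the extended block major,
 * i.e. the case that needs a "259 blkext" line in /proc/devices. */
int uses_ext_major(const char *s)
{
    unsigned major, minor;

    return parse_dev(s, &major, &minor) && major == BLOCK_EXT_MAJOR;
}
```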

2008-11-14 07:17:59

by Jens Axboe

Subject: Re: system fails to boot

On Fri, Nov 14 2008, Tejun Heo wrote:
> Zhang, Yanmin wrote:
> > Signed-off-by Zhang Yanmin <[email protected]>
>
> Acked-by: Tejun Heo <[email protected]>

Good debugging, thanks a lot Zhang. I've applied it, going upstream
soon.

--
Jens Axboe

2008-11-14 07:22:31

by Yanmin Zhang

Subject: Re: system fails to boot


On Fri, 2008-11-14 at 15:22 +0900, Tejun Heo wrote:
> Alexey Dobriyan wrote:
> >> Would you like to accept the kernel patch into your testing tree? Pls. do CC to me when replying
> >> as I couldn't subscribe LKML emails now.
> >>
> >> ---
> >>
> >> --- linux-2.6.28-rc4/block/genhd.c 2008-11-11 08:37:24.000000000 +0800
> >> +++ linux-2.6.28-rc4_label/block/genhd.c 2008-11-13 04:05:35.000000000 +0800
> >> @@ -1028,6 +1028,7 @@ static int __init proc_genhd_init(void)
> >> {
> >> proc_create("diskstats", 0, NULL, &proc_diskstats_operations);
> >> proc_create("partitions", 0, NULL, &proc_partitions_operations);
> >> + register_blkdev(BLOCK_EXT_MAJOR, "blkext");
> >> return 0;
> >> }
> >> module_init(proc_genhd_init);
> >
> > It's procfs-specific init, what's up?
>
> Ah... right, better to move it to genhd_device_init(). Thanks.
Thanks. Since nash reads /proc/devices, I thought I could just add it there.
The patch below moves it to genhd_device_init(). I tested it on my Nehalem machine.

---

--- linux-2.6.28-rc4/block/genhd.c 2008-11-14 17:20:29.000000000 +0800
+++ linux-2.6.28-rc4_boot/block/genhd.c 2008-11-14 23:11:43.000000000 +0800
@@ -768,6 +768,8 @@ static int __init genhd_device_init(void
 	bdev_map = kobj_map_init(base_probe, &block_class_lock);
 	blk_dev_init();
 
+	register_blkdev(BLOCK_EXT_MAJOR, "blkext");
+
 #ifndef CONFIG_SYSFS_DEPRECATED
 	/* create top-level block dir */
 	block_depr = kobject_create_and_add("block", NULL);

2008-11-17 08:21:05

by Yanmin Zhang

Subject: Re: system fails to boot


On Fri, 2008-11-14 at 14:29 +0800, Zhang, Yanmin wrote:
> On Fri, 2008-11-14 at 09:18 +0300, Alexey Dobriyan wrote:
> > On Fri, Nov 14, 2008 at 01:16:21PM +0800, Zhang, Yanmin wrote:
> > > Jens,
> > >
> > > We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
> > > machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
> > > All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
> > > system boot doesn't fail.
> > >
> > > I debug it and locate the root cause. Pls. see
> > > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> > > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> > >
> > > As a matter of fact, there are 2 bugs.
> > >
> > > 2) root=LABEL=/, system always can't boot. initrd init reports
> > > switchroot fails. Here is an executation branch of nash when booting:
> > > (1) nash read /sys/block/sda/dev; Assume major is 8 (on my desktop)
> > > (2) nash query /proc/devices with the major number; It found line "8 sd";
> > > (3) nash use 'sd' to search its own probe table to find device (DISK) type for the device
> > > and add it to its own list;
> > > (4) Later on, it probes all devices in its list to get filesystem labels;
> > > scsi register "8 sd" always.
> > > When major is 259, nash fails to find the device(DISK) type. I enables CONFIG_DEBUG_BLOCK_EXT_DEVT=y
> > > when compiling kernel, so 259 is picked up for device /dev/sda1, which causes nash to fail
> > > to find device (DISK) type.
> > > To fixing issue 2), I create a patch for nash and another patch for kernel.
> > > http://bugzilla.kernel.org/attachment.cgi?id=18859
> > > http://bugzilla.kernel.org/attachment.cgi?id=18837
As for issue 2) with root=LABEL=/, I double-checked the nash code. It is really beyond what I imagined, and I'm not
a nash expert. The kernel might allocate a MINOR number under BLOCK_EXT_MAJOR (259) for any type of disk
(cciss/ataraid/sd/ide/floppy/md ...), while nash assumes a MAJOR number is used by one of them exclusively.
On the other hand, nash probes scsi/ide/usb serially as long as the type is DEV_TYPE_DISK. I won't say
the nash code is imperfect, but nash is still growing.

Peter Jones,

You maintain nash. What's your opinion?

-yanmin

2008-11-21 17:26:19

by Jike Song

Subject: Re: system fails to boot

On Fri, Nov 14, 2008 at 1:16 PM, Zhang, Yanmin
<[email protected]> wrote:
> Jens,
>
> We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
> machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
> All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
> system boot doesn't fail.
>
> I debug it and locate the root cause. Pls. see
> http://bugzilla.kernel.org/show_bug.cgi?id=11899
> https://bugzilla.redhat.com/show_bug.cgi?id=471517
>

Hi Yanmin,

It seems I still have this problem with -rc6, both root=LABEL and
root=/dev/sda8. I have replaced my nash/mkinitrd with patched ones.

Maybe it's a different issue that just looks the same? Can you boot
-rc6 smoothly on your T61?

--
Thanks,
Jike

2008-11-24 05:54:43

by Yanmin Zhang

Subject: Re: system fails to boot


On Sat, 2008-11-22 at 01:26 +0800, Jike Song wrote:
> On Fri, Nov 14, 2008 at 1:16 PM, Zhang, Yanmin
> <[email protected]> wrote:
> > Jens,
> >
> > We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
> > machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
> > All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
> > system boot doesn't fail.
> >
> > I debug it and locate the root cause. Pls. see
> > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> >
>
> Hi Yanmin,
>
> It seems I still have this problem with -rc6, both root=LABEL and
> root=/dev/sda8. I have replaced my nash/mkinitrd with patched ones.
Did you recreate the initrd after updating nash/mkinitrd tools?
What's the boot log?

>
> Maybe a different issue just with the same appearance?
Maybe, but that's unlikely.

> Can you boot
> -rc6 smothly on your T61?
Yes, I can boot 2.6.28-rc6 on my T61 and Nehalem machine. Without the patch,
2.6.28-rc6 boot fails randomly on the T61.

2008-11-24 06:40:21

by Jike Song

Subject: Re: system fails to boot

On Mon, Nov 24, 2008 at 1:52 PM, Zhang, Yanmin
<[email protected]> wrote:
>
> On Sat, 2008-11-22 at 01:26 +0800, Jike Song wrote:
>> On Fri, Nov 14, 2008 at 1:16 PM, Zhang, Yanmin
>> <[email protected]> wrote:
>> > Jens,
>> >
>> > We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
>> > machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
>> > All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
>> > system boot doesn't fail.
>> >
>> > I debug it and locate the root cause. Pls. see
>> > http://bugzilla.kernel.org/show_bug.cgi?id=11899
>> > https://bugzilla.redhat.com/show_bug.cgi?id=471517
>> >
>>
>> Hi Yanmin,
>>
>> It seems I still have this problem with -rc6, both root=LABEL and
>> root=/dev/sda8. I have replaced my nash/mkinitrd with patched ones.
> Did you recreate the initrd after updating nash/mkinitrd tools?

Yes, I ran "make install" first and found that boot failed.
Then I recreated the initrd manually with mkinitrd/nash (patched by you)
and booted -rc6 again; it failed again.

> What's the boot log?
Sorry, I don't have the laptop with me now; I will post it this
evening (only possible if I can get netconsole to work). Anyway,
your patches did work for me with -rc4, so I think I'll bisect it.

--
Thanks,
Jike

2008-11-24 06:59:51

by Yanmin Zhang

Subject: Re: system fails to boot


On Mon, 2008-11-24 at 14:40 +0800, Jike Song wrote:
> On Mon, Nov 24, 2008 at 1:52 PM, Zhang, Yanmin
> <[email protected]> wrote:
> >
> > On Sat, 2008-11-22 at 01:26 +0800, Jike Song wrote:
> >> On Fri, Nov 14, 2008 at 1:16 PM, Zhang, Yanmin
> >> <[email protected]> wrote:
> >> > Jens,
> >> >
> >> > We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of
> >> > machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook.
> >> > All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc,
> >> > system boot doesn't fail.
> >> >
> >> > I debug it and locate the root cause. Pls. see
> >> > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> >> > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> >> >
> >>
> >> Hi Yanmin,
> >>
> >> It seems I still have this problem with -rc6, both root=LABEL and
> >> root=/dev/sda8. I have replaced my nash/mkinitrd with patched ones.
> > Did you recreate the initrd after updating nash/mkinitrd tools?
>
> Yes, I run "make install" at first, and found the failure of boot;
> Then I recreated initrd manually with mkinitrd/nash (patched by you),
> and booted -rc6 again - failed again.
Could you write down the detailed steps?

>
> > What's the boot log?
> Sorry I don't have the laptop with me now, will post it this
> evening(Ah, only possible if I can get netconsole work). Anyway,
> your patches did work for me with -rc4, so I think I'll bisect it.
Bisecting works well if the boot always fails. But with root=/dev/sdaXXX, system
boot fails randomly.