2002-06-19 09:02:23

by Dave Jones

Subject: /proc/partitions broken in 2.5.23

I got a bug report about an issue with LVM in 2.5.22-dj1, which turns
out to be caused by broken /proc/partitions in mainline.

(davej@mesh:davej)$ cat /proc/partitions
major minor #blocks name

   8    0          0 sda
  22    0 1515870810 hdc
  22   64 1515870810 hdd
   3    0   29316672 hda
   3    1     117400 hda1
   3    2          1 hda2
   3    5     999904 hda5
   3    6    1499872 hda6
   3    7     683392 hda7
   3    8   26015944 hda8
   3   64 1515870810 hdb

Note the huge numbers in hex are 0x5a5a5a5a, so something
seems to be getting poisoned somewhere.

Also, should partitions with 0 blocks be showing up ?
I don't recall that happening with the old-style 2.4 code.

Dave

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
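
The observation above, that 1515870810 is 0x5a5a5a5a, can be checked mechanically. The following is a minimal sketch, not kernel code: `is_byte_fill` is a hypothetical helper name, and it only tests whether a 32-bit value consists of one repeated byte, which is the pattern the message describes as poisoning.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel): returns true if every
 * byte of the 32-bit value v equals the given fill byte.
 */
static bool is_byte_fill(uint32_t v, uint8_t fill)
{
    /* Multiplying by 0x01010101 replicates the byte into all four positions. */
    uint32_t pattern = (uint32_t)fill * 0x01010101u;

    return v == pattern;
}
```

Calling `is_byte_fill(1515870810u, 0x5a)` returns true for the hdb, hdc and hdd entries, while a real size such as 29316672 (hda) does not match the pattern.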


2002-06-19 11:32:35

by Andries Brouwer

Subject: Re: /proc/partitions broken in 2.5.23

On Wed, Jun 19, 2002 at 10:02:48AM +0100, Dave Jones wrote:

> I got a bug report about an issue with LVM in 2.5.22-dj1, which turns
> out to be caused by broken /proc/partitions in mainline.
>
> (davej@mesh:davej)$ cat /proc/partitions
> major minor #blocks name
>
> 8 0 0 sda
> 22 0 1515870810 hdc
> 22 64 1515870810 hdd
> 3 0 29316672 hda
> 3 1 117400 hda1
> 3 2 1 hda2
> 3 5 999904 hda5
> 3 6 1499872 hda6
> 3 7 683392 hda7
> 3 8 26015944 hda8
> 3 64 1515870810 hdb
>
> Note the huge numbers in hex are 0x5a5a5a5a, so something
> seems to be getting poisoned somewhere.
>
> Also, should partitions with 0 blocks be showing up ?
> I don't recall that happening with the old-style 2.4 code.

I changed something here a few weeks ago. The idea was to avoid
listing partitions of size 0 but do list full devices, regardless
of size. Especially in case of removable media that is useful.
For example, a
blockdev --rereadpt /dev/sda
might show that there is something there now.

Will look at what might cause your strange numbers. Unfortunately
recent 2.5 kernels do not boot for me.

Andries
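
The rule described above (always list a whole device, list a partition only if its size is non-zero) can be sketched as a small predicate. This is an illustration only: `should_list` is a hypothetical name, and it assumes the convention that the low `minor_shift` bits of a minor number hold the partition number, so a minor with those bits clear is the whole disk.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate (illustration, not the kernel function).
 * minor_shift: assumed number of low minor bits used for the partition number.
 * nr_sects:    size of the entry in sectors.
 */
static bool should_list(unsigned int minor, unsigned int minor_shift,
                        unsigned long nr_sects)
{
    unsigned int minormask = (1u << minor_shift) - 1;
    bool whole_disk = (minor & minormask) == 0;

    /* Whole devices are always listed; partitions only when non-empty. */
    return whole_disk || nr_sects != 0;
}
```

Assuming a shift of 6 for IDE (consistent with hdb sitting at minor 64 in the listing), `should_list(64, 6, 0)` is true for an empty hdb, while `should_list(3, 6, 0)` is false for an empty hda3.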

2002-06-19 11:44:03

by Dave Jones

Subject: Re: /proc/partitions broken in 2.5.23

On Wed, Jun 19, 2002 at 01:32:33PM +0200, Andries Brouwer wrote:

> > 22 0 1515870810 hdc
> > 22 64 1515870810 hdd
> > 3 0 29316672 hda
> > 3 1 117400 hda1
> > 3 2 1 hda2
> > 3 5 999904 hda5
> > 3 6 1499872 hda6
> > 3 7 683392 hda7
> > 3 8 26015944 hda8
> > 3 64 1515870810 hdb
> I changed something here a few weeks ago. The idea was to avoid
> listing partitions of size 0 but do list full devices, regardless
> of size. Especially in case of removable media that is useful.
> For example, a
> blockdev --rereadpt /dev/sda
> might show that there is something there now.

Seems it doesn't handle the case of 'no media in drive' too well.
hdc - cdrom, hdd - zip drive, hdb - no device there.

hda2 is odd looking too showing a #blocks of '1', when
it's actually..

   Device Boot    Start       End    Blocks   Id  System
/dev/hda2           234     58168  29199240    5  Extended


Oddly, on another machine, it detects an LS-120 drive with
no media correctly..
22 64 0 hdd

(but still gets the 'no device' case wrong on that box).

Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-06-19 12:51:56

by Andries Brouwer

Subject: Re: /proc/partitions broken in 2.5.23

On Wed, Jun 19, 2002 at 01:44:02PM +0200, Dave Jones wrote:

> hda2 is odd looking too showing a #blocks of '1', when
> it's actually..
>
>    Device Boot    Start       End    Blocks   Id  System
> /dev/hda2           234     58168  29199240    5  Extended

That is correct, and something I did before you were born.

An extended partition is a box containing logical partitions.
It is almost always an error when people want to write directly to it
(confusing the extended partition with some logical partition inside).
After a number of reports of people who messed up their disk
by doing mkswap or mkfs on an extended partition I changed
the length of an extended partition to 1 block, enough for LILO
but stopping mkswap and mkfs.
People who really want to access these blocks, like e.g. fdisk,
can do so via /dev/hda.

Andries
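
The effect described above can be sketched as a clamp on the size recorded for an extended partition. This is a hedged illustration, not the actual partition-table code: both function names are hypothetical, and it assumes 512-byte sectors and 1 KiB blocks, so that "1 block" corresponds to 2 sectors.

```c
/*
 * Hypothetical sketch: size (in sectors) that would be recorded for a
 * partition. An extended partition is cut down to 2 sectors, which is
 * 1 block under the assumptions above; other partitions are unchanged.
 */
static unsigned long listed_sectors(unsigned long nr_sects, int is_extended)
{
    if (is_extended && nr_sects > 2)
        return 2;
    return nr_sects;
}

/* Convert 512-byte sectors to 1 KiB blocks (assumed units of #blocks). */
static unsigned long sectors_to_blocks(unsigned long nr_sects)
{
    return nr_sects >> 1;
}
```

For hda2, fdisk reports 29199240 blocks, i.e. 58398480 sectors under these assumptions. `sectors_to_blocks(listed_sectors(58398480UL, 1))` yields 1, matching the /proc/partitions entry, while the same call with `is_extended` set to 0 yields 29199240.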

2002-06-19 13:16:14

by Dave Jones

Subject: Re: /proc/partitions broken in 2.5.23

On Wed, Jun 19, 2002 at 02:51:54PM +0200, Andries Brouwer wrote:

> An extended partition is a box containing logical partitions.
> It is almost always an error when people want to write directly to it
> (confusing the extended partition with some logical partition inside).
> After a number of reports of people who messed up their disk
> by doing mkswap or mkfs on an extended partition I changed
> the length of an extended partition to 1 block, enough for LILO
> but stopping mkswap and mkfs.

Ah, that makes perfect sense to me now.

Dave.


--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-06-19 17:19:42

by Wayne Whitney

Subject: Re: /proc/partitions broken in 2.5.23

In mailing-lists.linux-kernel, you wrote:

> I changed something here a few weeks ago. The idea was to avoid
> listing partitions of size 0 but do list full devices, regardless of
> size. Especially in case of removable media that is useful.

I traced the change to the part of mainline ChangeSet 1.496 given
below (warning: cut and pasted). It seems to cause every possible
device that a driver could provide to show up in /proc/partitions.
For LVM, that's a zillion devices, and /proc/partitions overflows,
showing some random pages from memory. Reverting the patch below
makes /proc/partitions and LVM happy again.

Cheers,
Wayne


--- a/drivers/block/genhd.c Wed May 8 09:53:06 2002
+++ b/drivers/block/genhd.c Sun Jun 9 18:58:36 2002
@@ -177,9 +177,10 @@
     if (sgp == gendisk_head)
         seq_puts(part, "major minor #blocks name\n\n");

-    /* show all non-0 size partitions of this disk */
+    /* show the full disk and all non-0 size partitions of it */
     for (n = 0; n < (sgp->nr_real << sgp->minor_shift); n++) {
-        if (sgp->part[n].nr_sects == 0)
+        int minormask = (1<<sgp->minor_shift) - 1;
+        if ((n & minormask) && sgp->part[n].nr_sects == 0)
             continue;
         seq_printf(part, "%4d %4d %10d %s\n",
             sgp->major, n, sgp->sizes[n],

2002-06-20 23:22:41

by Andries E. Brouwer

Subject: Re: /proc/partitions broken in 2.5.23

On Wed, Jun 19, 2002 at 10:02:48AM +0100, Dave Jones wrote:

> I got a bug report about an issue with LVM in 2.5.22-dj1, which turns
> out to be caused by broken /proc/partitions in mainline.
>
> (davej@mesh:davej)$ cat /proc/partitions
> major minor #blocks name
>
> 8 0 0 sda
> 22 0 1515870810 hdc
>
> Note the huge numbers in hex are 0x5a5a5a5a, so something
> seems to be getting poisoned somewhere.

Is this LVM?

I don't see how LVM could produce such values.
(And in fact LVM does not even compile, so only a patched LVM
could produce anything at all.)



From: Wayne Whitney

I traced the change to the part of mainline ChangeSet 1.496 given
below (warning: cut and pasted). It seems to cause every possible
device that a driver could provide to show up in /proc/partitions.
For LVM, that's a zillion devices, and /proc/partitions overflows,
showing some random pages from memory. Reverting the patch below
makes /proc/partitions and LVM happy again.


--- a/drivers/block/genhd.c Wed May 8 09:53:06 2002
+++ b/drivers/block/genhd.c Sun Jun 9 18:58:36 2002
@@ -177,9 +177,10 @@
     if (sgp == gendisk_head)
         seq_puts(part, "major minor #blocks name\n\n");

-    /* show all non-0 size partitions of this disk */
+    /* show the full disk and all non-0 size partitions of it */
     for (n = 0; n < (sgp->nr_real << sgp->minor_shift); n++) {
-        if (sgp->part[n].nr_sects == 0)
+        int minormask = (1<<sgp->minor_shift) - 1;
+        if ((n & minormask) && sgp->part[n].nr_sects == 0)
             continue;
         seq_printf(part, "%4d %4d %10d %s\n",
             sgp->major, n, sgp->sizes[n],

Yes.
Normally the nr_real field indicates how many devices are
present. But LVM sets that to 256 even when nothing is present.
So, indeed, when all size fields are set to 0 this would probably
yield a list of 256 absent LVM devices.
Maybe LVM has to be fixed, or this patch fragment reverted, or both.

Separately, if /proc/partitions can overflow, that must be
fixed independently of the zero-size entries or LVM.

Andries
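
The count of listed entries can be worked through with a small stand-alone model of the loop in the patch fragment. This is a sketch under stated assumptions, not the kernel code: `count_listed` is a hypothetical name, the `nr_sects` array stands in for the per-minor sizes, and the LVM case assumes nr_real of 256 (as stated above) with a minor_shift of 0, i.e. one minor per logical volume.

```c
#include <stddef.h>

/*
 * Hypothetical model of the listing loop: counts how many entries
 * would be printed for a driver with the given nr_real and
 * minor_shift, where nr_sects[n] is the size of minor n.
 */
static size_t count_listed(unsigned int nr_real, unsigned int minor_shift,
                           const unsigned long *nr_sects)
{
    unsigned int minormask = (1u << minor_shift) - 1;
    size_t total = (size_t)nr_real << minor_shift;
    size_t listed = 0;

    for (size_t n = 0; n < total; n++) {
        /* Skip empty partitions, but never skip a whole device. */
        if ((n & minormask) && nr_sects[n] == 0)
            continue;
        listed++;
    }
    return listed;
}
```

With 256 all-zero entries and a shift of 0, the mask is 0, so no entry is ever treated as a partition and all 256 are listed. With the same array, nr_real of 4 and a shift of 6, only the 4 whole-disk minors are listed.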

2002-06-20 23:34:41

by Dave Jones

Subject: Re: /proc/partitions broken in 2.5.23

On Fri, Jun 21, 2002 at 01:21:40AM +0200, Andries E. Brouwer wrote:

> > I got a bug report about an issue with LVM in 2.5.22-dj1, which turns
> > out to be caused by broken /proc/partitions in mainline.
> >
> > (davej@mesh:davej)$ cat /proc/partitions
> > major minor #blocks name
> >
> > 8 0 0 sda
> > 22 0 1515870810 hdc
> >
> > Note the huge numbers in hex are 0x5a5a5a5a, so something
> > seems to be getting poisoned somewhere.
>
> Is this LVM?

No.

> I don't see how LVM could produce such values.
> (And in fact LVM does not even compile, so only a patched LVM
> could produce anything at all.)

The original person who reported a problem to me used LVM, and
in the course of discussion, the /proc/partitions bug came to light.
The values pasted above are from a box with no LVM compiled in.

> Normally the nr_real field indicates how many devices are
> present. But LVM sets that to 256 even when nothing is present.
> So, indeed, when all size fields are set to 0 this would probably
> yield a list of 256 absent LVM devices.
> Maybe LVM has to be fixed, or this patch fragment reverted, or both.

As I mentioned in another mail, it seems that removable devices with
no media have no valid #blocks, and the field is thus getting poisoned.

Dave

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-06-20 23:42:59

by Anders Gustafsson

Subject: Re: /proc/partitions broken in 2.5.23

On Fri, Jun 21, 2002 at 01:21:40AM +0200, Andries E. Brouwer wrote:
> > I got a bug report about an issue with LVM in 2.5.22-dj1, which turns
> > out to be caused by broken /proc/partitions in mainline.
> >
> > (davej@mesh:davej)$ cat /proc/partitions
> > major minor #blocks name
> >
> > 8 0 0 sda
> > 22 0 1515870810 hdc
> >
> > Note the huge numbers in hex are 0x5a5a5a5a, so something
> > seems to be getting poisoned somewhere.
>
> Is this LVM?
>
> I don't see how LVM could produce such values.
> (And in fact LVM does not even compile, so only a patched LVM
> could produce anything at all.)

Me neither. And as a data point: with my lvm-cleanup-patch I get
correct size values for all my partitions (but, yes, all 0-sized LVs
show up). But there is a problem with my /proc/partitions: it starts with
2 pages of garbage (8k). Is this some kind of overflow problem, as LVM
creates 256 entries?


--

//anders/g

2002-06-26 03:22:26

by Peter Chubb

Subject: Re: /proc/partitions broken in 2.5.23


Here's a fix that works for me (the /proc/partitions output is still broken in
2.5.24 -- it appears to be following sgp->part[n] pointers that aren't
initialised)

--- /tmp/geta29820 Wed Jun 26 13:16:14 2002
+++ linux-2.5.24/drivers/block/genhd.c Wed Jun 26 11:50:36 2002
@@ -179,8 +179,7 @@

     /* show the full disk and all non-0 size partitions of it */
     for (n = 0; n < (sgp->nr_real << sgp->minor_shift); n++) {
-        int minormask = (1<<sgp->minor_shift) - 1;
-        if ((n & minormask) && sgp->part[n].nr_sects == 0)
+        if (sgp->sizes[n] == 0)
             continue;
         seq_printf(part, "%4d %4d %10llu %s\n",
             sgp->major, n, (unsigned long long)sgp->sizes[n],
--- /tmp/geta29832 Wed Jun 26 13:17:36 2002
+++ linux-2.5.24/drivers/ide/probe.c Wed Jun 26 11:10:00 2002
@@ -1146,6 +1146,7 @@
     gd->sizes = kmalloc(ATA_MINORS * sizeof(gd->sizes[0]), GFP_KERNEL);
     if (!gd->sizes)
         goto err_kmalloc_gd_sizes;
+    memset(gd->sizes, 0, ATA_MINORS*sizeof(gd->sizes[0]));

     gd->part = kmalloc(ATA_MINORS * sizeof(struct hd_struct), GFP_KERNEL);
     if (!gd->part)
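
The second hunk of the fix zeroes the freshly allocated sizes array, so that minors which are never filled in read as 0 instead of whatever the allocator left behind. A user-space analogue of that pattern is sketched below. It is an illustration only: `alloc_sizes` is a hypothetical name, `malloc` stands in for `kmalloc`, and the element type is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space analogue: allocate one size slot per minor
 * and clear it, so unused slots read as 0 rather than stale memory.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static unsigned long *alloc_sizes(size_t nminors)
{
    unsigned long *sizes = malloc(nminors * sizeof(sizes[0]));

    if (!sizes)
        return NULL;
    /* Without this, the contents of the array are indeterminate. */
    memset(sizes, 0, nminors * sizeof(sizes[0]));
    return sizes;
}
```

In user space, `calloc(nminors, sizeof(sizes[0]))` would achieve the same in one call; the explicit `memset` is kept here to mirror the shape of the patch.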