2002-10-12 13:55:03

by Alastair Stevens

Subject: Small oddity of the week: 2.4.20-pre

Guys - I use the excellent Mindi/Mondo backup & rescue tools on my RH7.3
box, which have worked perfectly throughout a whole range of recent
kernels, up to and including 2.4.19. But, since I started running
2.4.20-pre5 (and now -pre9), Mindi refused to work any more.

I consulted the developer, and we tracked the problem down to this
pathetically innocent command sequence in the script:

fdisk -l | grep -w "/dev/hda6"

For some reason, this now produces, entirely at _random_, either one or
two lines of output! It was the duplicated output that broke Mindi. It's
easily accommodated in the script, but this randomness was never
exhibited on any earlier kernels. Is it me, or is this weird?

I hope this is useful in some way - anyone got any ideas?

Cheers
Alastair

--
\\ Alastair Stevens Cambridge
\\ Technical Director / \..-^..^...
\\ |Linux solutions \
\\ 01223 813774 \ /........../
\\ http://www.camlinux.co.uk '-=-'
--


2002-10-12 17:54:45

by Andries Brouwer

Subject: Re: Small oddity of the week: 2.4.20-pre

On Sat, Oct 12, 2002 at 06:30:15PM +0100, Alastair Stevens wrote:
> > > fdisk -l | grep -w "/dev/hda6"
> > >
> > > For some reason, this now produces, entirely at _random_, either one or
> > > two lines of output! It was the duplicated output that broke Mindi.
>
> Here's a typical output:
>
> 1 root@dolphin:/home/alastair> fdisk -l | grep -w "/dev/hda6"
> /dev/hda6 4419 4749 2658726 83 Linux
> /dev/hda6 4419 4749 2658726 83 Linux
> 2 root@dolphin:/home/alastair> fdisk -l | grep -w "/dev/hda6"
> /dev/hda6 4419 4749 2658726 83 Linux
> 3 root@dolphin:/home/alastair> fdisk -l | grep -w "/dev/hda6"
> /dev/hda6 4419 4749 2658726 83 Linux
> 4 root@dolphin:/home/alastair> fdisk -l | grep -w "/dev/hda6"
> /dev/hda6 4419 4749 2658726 83 Linux
>
> ie - the first time, it gives me two repeated lines. This appears to be
> random. In a clean terminal, it'll sometimes give me only the one line
> on the first run, and then do two lines multiple times....

Could it be that you have statistics garbage in /proc/partitions?
That will break fdisk.

2002-10-12 22:40:22

by Andries Brouwer

Subject: Re: Small oddity of the week: 2.4.20-pre

On Sat, Oct 12, 2002 at 07:30:57PM +0100, Alastair Stevens wrote:
> On Sat, 2002-10-12 at 19:00, Andries Brouwer wrote:
> > > ie - the first time, it gives me two repeated lines. This appears to be
> > > random. In a clean terminal, it'll sometimes give me only the one line
> > > on the first run, and then do two lines multiple times....
> >
> > Could it be that you have statistics garbage in /proc/partitions?
> > That will break fdisk.
>
> Well, I'm no expert, but it looks OK to me:
>
> 953 alastair@dolphin:~> cat /proc/partitions
> major minor #blocks name rio rmerge rsect ruse wio wmerge wsect wuse running use aveq
>
> 3 0 58633344 hda 413162 1219920 12595886 1973640 268105 574049 6745668 8302590 -2 36995689 21817577
> 3 1 7253316 hda1 8 24 64 110 0 0 0 0 0 110 110

Some vendors added statistics to /proc/partitions.
Laziness on their side. Since they patch the kernel anyway
it would have been very easy to make /proc/diskstatistics.
But vendors may do as they wish.
This breaks some software, but these same vendors patch their
versions of this software.

Then some misguided people came and wanted this in the official kernel.
Bad. It is bad enough that vendors add pollution, but that good kernel
developers want to do this is inexplicable to me.
It only causes trouble, and gains precisely nothing.

In this particular case there are two problems:
(i) the format was changed
(ii) the content has become dynamic

Many proc files like /proc/filesystems or /proc/partitions may change,
but not many times a second, and most likely not without the operator
being aware of it. This means that programs like mount and fdisk can
read these files [1]. But a /proc/partitions that contains statistics
will change many times a second, causing problems for programs that
try to read it one line at a time.

My conjecture is that you were bitten by the latter phenomenon.
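
A toy model may make the second problem concrete. The sketch below is
not the kernel's code and not fdisk's; the names render and chunked_read,
the row layout, and the chunk size are all invented for illustration. It
assumes only that the file is regenerated on every read and that the
reader remembers nothing but a byte offset. The numbers are contrived so
that the overlap is exactly one line.

```python
def render(stat):
    """Render a toy partition table; `stat` stands in for live I/O counters."""
    # Hypothetical layout: one disk row with seven counters, two quiet partitions.
    rows = [("hda", [stat] * 7), ("hda5", [0]), ("hda6", [0])]
    return "".join(
        name + " " + " ".join(str(v) for v in values) + "\n"
        for name, values in rows
    )


def chunked_read(stats, chunk_size):
    """Read the table in chunks, regenerating it before each read.

    `stats` gives the counter value in effect at each successive read.
    The reader keeps only a byte offset, as a plain read() loop would.
    """
    offset, out = 0, ""
    for stat in stats:
        chunk = render(stat)[offset:offset + chunk_size]
        if not chunk:
            break  # end of file reached
        out += chunk
        offset += len(chunk)
    return out.splitlines()


# Counters unchanged between reads: every line is seen once.
quiet = chunked_read([9, 9], 32)

# Seven counters each gain a digit between reads: the text before the
# saved offset grows by 7 bytes, so the last 7 bytes are returned again.
busy = chunked_read([9, 10], 32)
```

In this model `quiet` holds the hda6 line once and `busy` holds it twice,
although no partition was added or removed in between.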

Andries


[1] The correct use of mount is to give an explicit type.
The correct use of fdisk is to give an explicit device.
Nothing else is guaranteed.
Shorthand versions work with high probability. But with a
dynamic /proc/partitions they work with lower probability.
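
For a script in Mindi's position, the advice in [1] might look roughly
like the sketch below. It assumes a Python wrapper rather than the
original shell script; fdisk_command, partition_lines and lookup are
hypothetical helper names. It names the device explicitly and, as a
second line of defence, drops repeated lines from the listing.

```python
import re
import subprocess


def fdisk_command(device):
    """Build an fdisk call that names the device explicitly."""
    return ["fdisk", "-l", device]


def partition_lines(listing, partition):
    """Return distinct lines containing `partition` as a whole word (like grep -w)."""
    pattern = re.compile(r"(?<!\w)" + re.escape(partition) + r"(?!\w)")
    seen = []
    for line in listing.splitlines():
        if pattern.search(line) and line not in seen:
            seen.append(line)
    return seen


def lookup(partition, device):
    """Run fdisk on one explicit device and filter its listing (needs root)."""
    result = subprocess.run(
        fdisk_command(device), capture_output=True, text=True, check=True
    )
    return partition_lines(result.stdout, partition)
```

A call such as lookup("/dev/hda6", "/dev/hda") would then return at most
one copy of the matching line.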

2002-10-22 09:58:27

by Alan

Subject: Re: Small oddity of the week: 2.4.20-pre

On Sat, 2002-10-12 at 15:00, Alastair Stevens wrote:
> I consulted the developer, and we tracked the problem down to this
> pathetically innocent command sequence in the script:
>
> fdisk -l | grep -w "/dev/hda6"
>
> For some reason, this now produces, entirely at _random_, either one or
> two lines of output! It was the duplicated output that broke Mindi. It's
> easily accommodated in the script, but this randomness was never
> exhibited on any earlier kernels. Is it me, or is this weird?

It would be weird but for having known people who hit the same. The file
can change as it's being read (especially with stats in it). You may find
that you need to do a single large read of the /proc file.
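
A minimal sketch of the single-read approach, assuming a Python reader;
read_snapshot, parse_partitions and the 64 KiB buffer size are
illustrative choices, not anything fdisk or Mindi actually does. The
idea is to take one snapshot with one read() call and do all parsing in
memory, ignoring any statistics columns after the name.

```python
import os


def read_snapshot(path="/proc/partitions", max_bytes=1 << 16):
    """Return the file's content as obtained from a single read() call.

    Assumption: the whole table fits in what the kernel returns for one
    read. If it does not, only the first part is returned.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, max_bytes).decode("ascii", "replace")
    finally:
        os.close(fd)


def parse_partitions(text):
    """Return partition names in order, ignoring trailing statistics columns."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows begin with numeric major, minor and #blocks;
        # the header and blank lines do not, so they are skipped.
        if len(fields) >= 4 and all(f.isdigit() for f in fields[:3]):
            if fields[3] not in names:
                names.append(fields[3])
    return names
```

Calling parse_partitions(read_snapshot()) then works on one consistent
copy of the data, whether or not the kernel carries the statistics patch.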