Someone reported a bug in the linux.kernel newsgroup, which I presumed
was somehow linked to this mailing list, and I finally found my way
here. Glad to be here.
Anyway, enough of the pleasantries. I'm a newbie to the Linux kernel,
but formerly a professional kernel developer. I left for mental health
reasons, long story.
The bug that someone reported was that when they did a "dd" using the
device "/dev/fd0u1440" with no disk in the drive, it would only fail
about half of the time. The rest of the time it would succeed, which is
wrong: with no disk in the drive, opening the device should fail with
ENXIO.
I was able to reproduce this bug, and fix it, on my machine.
The fix isn't pretty, though, and it's a one-line fix. I'll leave it to
others here (open for discussion) to suggest a more proper fix; after
all, I am a newbie here. My fix may be fine, though. I just don't know.
Basically, the error traced back to the floppy_revalidate(kdev_t dev)
function in drivers/block/floppy.c. There's a line in there which
reads "if (NO_GEOM) {", and my fix is to change that line to "if (1) {".
Yep, that means that the code inside that block gets executed every
time.
You see, the problem is that the read from the drive which tests the
initial drive status only happens when the code inside that block gets
executed.
Of course, if someone is opening a special "u1440" version of the
device (minor number 28 instead of minor number 0), then there may be a
good reason that this code shouldn't be executed. I just don't know
what that is because I don't know what those special device files are
for.
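For anyone who wants to double-check which node they are actually
opening, here is a small sketch that reads the major and minor numbers
from a device file. The helper name device_numbers() is made up for
this mail; it only uses stat(2) and the glibc major()/minor() macros.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper (not from floppy.c): look up the major and minor
 * numbers of a device node.
 * Returns 0 on success, -1 if stat() fails or the path is not a
 * block or character device. */
static int device_numbers(const char *path,
                          unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISBLK(st.st_mode) && !S_ISCHR(st.st_mode))
        return -1;

    /* st_rdev holds the device number the node refers to. */
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

Going by the numbers above, calling it on "/dev/fd0" should report
minor 0 and on "/dev/fd0u1440" minor 28.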
If those files are to specially handle, for example, 720kb disks in a
1440kb drive, then I'm not sure whether enabling this particular piece
of code will -break- that. Of course, I can only find one floppy disk,
and it's a 1440kb disk. I don't have tons of floppies hanging around.
Someone may have already addressed this issue, too. I don't know.
If anyone is interested, though, I was able to reproduce this problem
simply by not having a disk in the drive and repeatedly opening the
drive with simple test.c code (with O_WRONLY|O_CREAT|O_TRUNC as dd was
using). If I opened "/dev/fd0", I would consistently get a failure as
expected, but if I opened "/dev/fd0u1440" I would less consistently get
a failure.
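This is not my exact test.c, just a minimal sketch of the idea: open
the given path a number of times with the same flags dd uses and count
how many opens fail. The helper name count_open_failures() and the
0644 mode (which open() requires once O_CREAT is passed) are my own
choices for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` `attempts` times with the flags dd
 * uses for its output file and return how many of the opens failed.
 * The errno of the most recent failure is stored in *last_errno
 * (0 if every open succeeded). */
static int count_open_failures(const char *path, int attempts,
                               int *last_errno)
{
    int failures = 0;

    *last_errno = 0;
    for (int i = 0; i < attempts; i++) {
        /* O_CREAT needs a mode argument; it is unused when the node
         * already exists. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0) {
            failures++;
            *last_errno = errno;
        } else {
            close(fd);
        }
    }
    return failures;
}
```

A main() would just call this once for "/dev/fd0" and once for
"/dev/fd0u1440" with the drive empty and print the two counts. With no
disk in, every attempt should fail with ENXIO; on my 2.4.20 box the
u1440 node did not fail every time.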
I haven't checked a 2.5 kernel. This was in 2.4.20.
-Rob
p.s. Sorry for the long message. I figure too much detail is better
than too little.