2001-04-23 03:00:12

by Douglas Gilbert

Subject: MO drives (2048 byte block vfat fs) in lk 2.4

The "MO" bug (also 2048 byte block vfat problem) has been
reported several times in the lk 2.4 series. Since the
finger was being pointed at the SCSI subsystem I decided
to investigate. As far as I can see the sd driver offers
the same support for physical block sizes other than
512 bytes in lk 2.4 as it did in lk 2.2.

One error report stated that a MO drive with a vfat
fs based on 2048 byte sectors can be mounted and read
but any significant write causes a system lockup. I
have been able to replicate this behaviour. Luckily
Alt-SysRq-P did work. Pressing this sequence multiple
times gave similar addresses. Rebooting the machine
and rerunning the experiment multiple times gave
addresses in the same area.

The EIP resolved most often to cont_prepare_write() in
fs/buffer.c. A disassembly suggests line 1802 in buffer.c
[2.4.3ac11]. That is around a memset() between
__block_prepare_write() and __block_commit_write() calls
within the while loop. Most other addresses were within
the same while loop. Perhaps someone with expertise
in this area may like to examine that loop.
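
For anyone unfamiliar with the step from a SysRq-P EIP to a function
name: it amounts to finding the last symbol whose start address is not
above the EIP. A minimal sketch follows, assuming a sorted symbol table
in the style of System.map. The addresses, their ordering and the helper
name resolve_eip are made up for illustration and are not taken from the
2.4.3ac11 build.

```python
import bisect

# Hypothetical (address, symbol) pairs in the style of System.map.
# The addresses and their order are invented for illustration only.
SYMBOLS = [
    (0xC0130000, "__block_prepare_write"),
    (0xC0130400, "__block_commit_write"),
    (0xC0130800, "cont_prepare_write"),
    (0xC0130C00, "generic_commit_write"),
]

def resolve_eip(eip, symbols=SYMBOLS):
    """Return (symbol, offset) for the last symbol starting at or below eip."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, eip) - 1
    if i < 0:
        return None  # eip lies below the first known symbol
    addr, name = symbols[i]
    return name, eip - addr

# Repeated samples landing in one function are consistent with a loop
# that is not terminating; they do not prove it on their own.
samples = [0xC0130810, 0xC013089C, 0xC0130A04]
hits = [resolve_eip(e) for e in samples]
```

With the invented table above, all three sample addresses resolve to
cont_prepare_write at different offsets, which is the pattern described
in the report.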


Details: I modified the "scsi_debug" adapter driver to look
like it had one 2048 byte block MO drive connected to it.
The driver uses 8 MB of RAM to simulate a storage device.
[For anyone who wants to run similar experiments, I have
placed the driver at http://www.torque.net/sg/p/scsi_debug_mo.tgz ].
The sequence of commands that led up to the failure was:
$ modprobe scsi_debug
$ cat /proc/scsi/scsi # "optical" device should be there
$ fdisk -ul /dev/sdb # should see 3 partitions
$ mkdosfs -S 2048 /dev/sdb3
$ mount /dev/sdb3 /mnt/extra
$ cd /mnt/extra
$ touch t # worked ok
$ cp /boot/vml-2.2.18 u # system locks up

Doug Gilbert


2001-04-23 09:21:50

by Alan

Subject: Re: MO drives (2048 byte block vfat fs) in lk 2.4

> The EIP resolved most often to cont_prepare_write() in
> fs/buffer.c. A disassembly suggests line 1802 in buffer.c
> [2.4.3ac11]. That is around a memset() between
> __block_prepare_write() and __block_commit_write() calls
> within the while loop. Most other addresses were within
> the same while loop. Perhaps someone with expertise
> in this area may like to examine that loop.

I'll take a dig. The fat code pulled out the magic buffer stuff because
it was meant to be going lower down, which never happened.

Alan

2001-04-23 22:12:49

by Daniel Kobras

Subject: Re: MO drives (2048 byte block vfat fs) in lk 2.4

On Sun, Apr 22, 2001 at 10:59:18PM -0400, Douglas Gilbert wrote:
> One error report stated that a MO drive with a vfat
> fs based on 2048 byte sectors can be mounted and read

Read? I don't think so. bread, yes, but read follows a NULL pointer and
was never seen again.

> but any significant write causes a system lockup. I
> have been able to replicate this behaviour. Luckily
> Alt-SysRq-P did work. Pressing this sequence multiple
> times gave similar addresses. Rebooting the machine
> and rerunning the experiment multiple times gave
> addresses in the same area.

bigblock_fat_bread() in fs/fat/buffer.c kmalloc()s 2k dummy bhs for each
512 byte buffer but only partly initialises them. This works as long as those
bogus bhs don't leave the fat realm. Unfortunately, generic_file_write and
friends call into the generic block layer that wants to do such evil things
on bhs as using their wait_queues or checking their state for BH_Locked.
None of these are initialised and, while I haven't checked in detail,
that certainly leads the way into deadlock country. But even if these minor
disturbances magically haven't screwed you yet, the fat layer will hand out
a buffer address calculated for 512 byte buffers for your 2k buffer, and your
data goes bye-bye.
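
The last point can be illustrated with a little arithmetic. This is a
simplified model and not the fs/fat code: it assumes each 512 byte logical
sector is exposed as a window into the 2048 byte hardware block that
contains it, and the names window and overrun are hypothetical.

```python
HW_BLOCK = 2048   # hardware block size of the MO drive
LOGICAL = 512     # sector size the dummy buffer heads are laid out for

def window(sector):
    """Map a 512 byte logical sector to (hardware block, byte offset in it)."""
    per_block = HW_BLOCK // LOGICAL
    return sector // per_block, (sector % per_block) * LOGICAL

def overrun(sector, length):
    """Bytes that land past the end of the hardware block when `length`
    bytes are written starting at the sector's window."""
    _, offset = window(sector)
    return max(0, offset + length - HW_BLOCK)
```

Under this model a caller that honours the 512 byte size stays inside the
block, while one that treats the same address as the start of a 2k buffer
runs past the end whenever the sector is not the first in its block.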

The preferred fix seems to be to teach the loop device about reblocking and
rip all of the bigblock support from fat. I've spent this weekend curing
my utter and complete ignorance of the blkdev layer, in order to be able to
implement a working setup at least. Please allow me another couple of days
to spare a little hacking time. I'll take care of the issue. Promised.
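
To make "reblocking" concrete: the idea is to present 512 byte sectors on
top of a device that only transfers 2048 byte blocks, doing a
read-modify-write for each small write. The following is only a sketch of
that general technique under those assumptions; it is not loop device
code, and the class and method names are hypothetical.

```python
class Reblocker:
    """Expose 512 byte sectors on a device that transfers 2048 byte blocks."""

    def __init__(self, device, hw_block=2048, sector=512):
        self.dev = device      # bytearray standing in for the medium
        self.hw = hw_block
        self.sec = sector

    def _read_block(self, blk):
        # Always transfer one whole hardware block.
        return bytearray(self.dev[blk * self.hw:(blk + 1) * self.hw])

    def _write_block(self, blk, data):
        assert len(data) == self.hw
        self.dev[blk * self.hw:(blk + 1) * self.hw] = data

    def read_sector(self, n):
        blk, off = divmod(n * self.sec, self.hw)
        return bytes(self._read_block(blk)[off:off + self.sec])

    def write_sector(self, n, data):
        assert len(data) == self.sec
        blk, off = divmod(n * self.sec, self.hw)
        buf = self._read_block(blk)      # read the whole hardware block
        buf[off:off + self.sec] = data   # patch the one sector
        self._write_block(blk, buf)      # write the whole block back
```

With something like this below the filesystem, fat would only ever see
512 byte sectors and would not need its own bigblock handling.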

Regards,

Daniel.

--
GNU/Linux Audio Mechanics - http://www.glame.de
Cutting Edge Office - http://www.c10a02.de
GPG Key ID 89BF7E2B - http://www.keyserver.net