2003-05-15 02:46:02

by Adam J. Richter

Subject: disk corruption in 2.5.66-bk5 - 2.5.69

linux-2.5.66-bk5 changed blkdev_put in fs/block_dev.c
to only call sync_blockdev if bdev->bd_openers drops to zero.
For me, this has resulted in disk corruption when lilo runs
under kernels 2.5.66-bk5 through 2.5.69, even if I run
"sync" repeatedly after running lilo.

I am not sure why I seem to be the first person reporting
this disastrous problem. I suspect it may have something to do
with the fact that I boot from an initial ramdisk that changes
the root without using pivot_root (instead, process 1 does a chroot
at a time when it is the only process on the system) and the fact
that I use partitions defined by partx rather than kernel-based
partitioning. Perhaps bdev->bd_openers never drops to zero in this
configuration as a result of some other bug that only occurs in
such a configuration.

I notice also that that change in 2.5.66-bk5 causes
sync_blockdev to be called with the big kernel lock held, although
I do not think that is the cause of the data corruption.

With the following kludge, the lilo data corruption
in 2.5.69 seems to be gone. I have deliberately not cleaned
up this patch for integration at this point, because I don't
understand the issue enough yet to know the real problem
and the right fix.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Miplitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

--- linux-2.5.69/fs/block_dev.c.orig 2003-05-14 17:43:40.000000000 -0700
+++ linux-2.5.69/fs/block_dev.c 2003-05-14 17:44:29.000000000 -0700
@@ -635,14 +635,24 @@
 int blkdev_put(struct block_device *bdev, int kind)
 {
 	int ret = 0;
 	struct inode *bd_inode = bdev->bd_inode;
 	struct gendisk *disk = bdev->bd_disk;
 
 	down(&bdev->bd_sem);
+
+	/* AJR start */
+	switch (kind) {
+	case BDEV_FILE:
+	case BDEV_FS:
+		sync_blockdev(bd_inode->i_bdev);
+		break;
+	}
+	/* AJR end */
+
 	lock_kernel();
 	if (!--bdev->bd_openers) {
 		switch (kind) {
 		case BDEV_FILE:
 		case BDEV_FS:
 			sync_blockdev(bd_inode->i_bdev);
 			break;


2003-05-15 07:43:06

by Andrew Morton

Subject: Re: disk corruption in 2.5.66-bk5 - 2.5.69

"Adam J. Richter" <[email protected]> wrote:
>
> linux-2.5.66-bk5 changed blkdev_put in fs/block_dev.c
> to only call sync_blockdev if bdev->bd_openers drops to zero.
> For me, this has resulted in disk corruption when lilo runs
> under kernels 2.5.66-bk5 through 2.5.69, even if I run
> "sync" repeatedly after running lilo.

Works OK here with a simple test:

sleep 10000 < /dev/hda5 &
cat < some-file > /dev/hda5
sync
<press reset>
cmp some-file /dev/hda5

LILO does a global sync itself before quitting. Check to see if there's a
lot of writeout happening. See if there's writeout happening when you run
sync manually.

Maybe take a copy of /dev/hda1 into a regular file, then reboot, then
compare with what's on the disk.
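
The copy-and-compare check suggested above could be scripted roughly as follows. This is only a sketch: the helper names `snapshot_dev` and `verify_dev` and the paths in the usage comment are hypothetical, and it assumes the image is stored on a different filesystem from the one under test and that nothing else writes to the partition between the snapshot and the comparison.

```shell
#!/bin/sh
# Hypothetical helpers for the "copy, reboot, compare" check.
# Names and paths are illustrative, not from the original thread.

# Copy the full contents of a block device (or any file) into an image file.
#   $1 = source device, $2 = image file
snapshot_dev() {
    dd if="$1" of="$2" bs=64k 2>/dev/null
}

# Compare a saved image with the device. cmp reports the first differing
# byte on stdout; the function then prints a one-word verdict.
#   $1 = image file, $2 = device
verify_dev() {
    if cmp "$1" "$2"; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Usage (as root):
#   snapshot_dev /dev/hda1 /mnt/other/hda1.img
#   sync
#   ... reboot ...
#   verify_dev /mnt/other/hda1.img /dev/hda1
```

A read of the block device before the reboot goes through the kernel's cached view of the device, so a mismatch after the reboot would suggest that some cached data never reached the disk.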

Check that /proc/meminfo:Dirty goes to zero after `sync'.
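
One way to watch the Dirty figure around a `sync` is sketched below. The helper name `dirty_kb` is hypothetical, and the optional file argument exists only so the parser can be tried on a saved copy of /proc/meminfo. On a system with other write activity the value may not reach exactly zero.

```shell
#!/bin/sh
# Hypothetical helper: print the Dirty value (in kB) from a meminfo-style file.
#   $1 = file to read (defaults to /proc/meminfo)
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }' "${1:-/proc/meminfo}"
}

# Usage:
#   dirty_kb    # dirty page cache before the flush, in kB
#   sync
#   dirty_kb    # should be at or near zero if writeout completed
```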