2001-12-18 20:16:13

by Torrey Hoffman

[permalink] [raw]
Subject: RE: ramdisk corruption problems - was: RE: pivot_root and initrd kern el panic woes

More information! And a workaround!

I conjecture that the ramdisk driver (post 2.4.9) only grabs
VM pages properly if it is accessed directly, as a dd to
/dev/ram0 does. I further conjecture that accessing the
ramdisk through a mounted filesystem does not grab pages
properly.

The reason I believe this is that removing the call to
"freeramdisk" from my original script avoids corruption.

Another way to avoid ramdisk corruption is to
"dd if=/dev/zero of=/dev/ram0 bs=1k count=4000"
immediately after the call to freeramdisk.

If my conjecture is right, then the corruption is caused
because mke2fs on a "freed" /dev/ram0 doesn't touch every
block of the fs, leaving "holes" where pages are not properly
grabbed from the VM. The resulting filesystem appears to work,
but dd'ing from /dev/ram0 gets a broken filesystem image.

Note that "freeramdisk /dev/ram0" is pretty much just:
#define FLKFLSBUF _IO(0x12,97) /* flush buffer cache */
f = open("/dev/ram0", O_RDWR);
ioctl(f, BLKFLSBUF);

To experiment for yourself, stick the following script in
a subdirectory which also contains a "testdir" directory
with about 3 MB of data.

- - - - - - - -
#!/bin/bash

# to tickle the bug, do the freeramdisk but not the
# dd from /dev/zero to /dev/ram0.

freeramdisk /dev/ram0
#dd if=/dev/zero of=/dev/ram0 bs=1k count=4000

mke2fs -m0 /dev/ram0 4000
mount -t ext2 /dev/ram0 /mnt/ramdisk
rm -rf /mnt/ramdisk/*

cp -a ./testdir /mnt/ramdisk
umount /dev/ram0

dd if=/dev/ram0 of=ram0.img bs=1k count=4000
dd if=ram0.img of=/dev/ram0 bs=1k count=4000

mount -t ext2 /dev/ram0 /mnt/ramdisk
diff -q -r ./testdir /mnt/ramdisk/testdir

# If diff reports mismatches, you saw the bug.

umount /dev/ram0
- - - - - - - -

If the gods of the VM and VFS don't bother to look at
it, I might take a peek at the relevant kernel code myself.
Might take two months of study before I know enough though.

Torrey


2001-12-20 12:20:18

by Tachino Nobuhiro

[permalink] [raw]
Subject: Re: ramdisk corruption problems - was: RE: pivot_root and initrd kern el panic woes


Hello,

Following patch may fix your problem.

diff -r -u linux-2.4.17-rc2.org/drivers/block/rd.c linux-2.4.17-rc2/drivers/block/rd.c
--- linux-2.4.17-rc2.org/drivers/block/rd.c Thu Dec 20 20:30:57 2001
+++ linux-2.4.17-rc2/drivers/block/rd.c Thu Dec 20 20:46:53 2001
@@ -194,9 +194,11 @@
static int ramdisk_readpage(struct file *file, struct page * page)
{
if (!Page_Uptodate(page)) {
- memset(kmap(page), 0, PAGE_CACHE_SIZE);
- kunmap(page);
- flush_dcache_page(page);
+ if (!page->buffers) {
+ memset(kmap(page), 0, PAGE_CACHE_SIZE);
+ kunmap(page);
+ flush_dcache_page(page);
+ }
SetPageUptodate(page);
}
UnlockPage(page);


grow_dev_page() creates not Uptodate page which has valid buffers, so
it is wrong that ramdisk_readpage() clears whole page unconditionally.


At Tue, 18 Dec 2001 12:14:49 -0800,
Torrey Hoffman wrote:
>
> More information! And a workaround!
>
> I conjecture that the ramdisk driver (post 2.4.9) only grabs
> VM pages properly if it is accessed directly, as a dd to
> /dev/ram0 does. I further conjecture that accessing the
> ramdisk through a mounted filesystem does not grab pages
> properly.
>
> The reason I believe this is that removing the call to
> "freeramdisk" from my original script avoids corruption.
>
> Another way to avoid ramdisk corruption is to
> "dd if=/dev/zero of=/dev/ram0 bs=1k count=4000"
> immediately after the call to freeramdisk.
>
> If my conjecture is right, then the corruption is caused
> because mke2fs on a "freed" /dev/ram0 doesn't touch every
> block of the fs, leaving "holes" where pages are not properly
> grabbed from the VM. The resulting filesystem appears to work,
> but dd'ing from /dev/ram0 gets a broken filesystem image.
>
> Note that "freeramdisk /dev/ram0" is pretty much just:
> #define FLKFLSBUF _IO(0x12,97) /* flush buffer cache */
> f = open("/dev/ram0", O_RDWR);
> ioctl(f, BLKFLSBUF);
>
> To experiment for yourself, stick the following script in
> a subdirectory which also contains a "testdir" directory
> with about 3 MB of data.
>
> - - - - - - - -
> #!/bin/bash
>
> # to tickle the bug, do the freeramdisk but not the
> # dd from /dev/zero to /dev/ram0.
>
> freeramdisk /dev/ram0
> #dd if=/dev/zero of=/dev/ram0 bs=1k count=4000
>
> mke2fs -m0 /dev/ram0 4000
> mount -t ext2 /dev/ram0 /mnt/ramdisk
> rm -rf /mnt/ramdisk/*
>
> cp -a ./testdir /mnt/ramdisk
> umount /dev/ram0
>
> dd if=/dev/ram0 of=ram0.img bs=1k count=4000
> dd if=ram0.img of=/dev/ram0 bs=1k count=4000
>
> mount -t ext2 /dev/ram0 /mnt/ramdisk
> diff -q -r ./testdir /mnt/ramdisk/testdir
>
> # If diff reports mismatches, you saw the bug.
>
> umount /dev/ram0
> - - - - - - - -
>
> If the gods of the VM and VFS don't bother to look at
> it, I might take a peek at the relevant kernel code myself.
> Might take two months of study before I know enough though.
>
> Torrey
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/