2006-05-29 21:40:21

by Olaf Hering

Subject: cramfs corruption after BLKFLSBUF on loop device

This script will cause cramfs decompression errors, on SMP at least:

#!/bin/bash
while :;do blockdev --flushbufs /dev/loop0;done </dev/null &>/dev/null&
while :;do ps faxs </dev/null &>/dev/null&done </dev/null &>/dev/null&
while :;do dmesg </dev/null &>/dev/null&done </dev/null &>/dev/null&
while :;do find /mounts/instsys -type f -print0|xargs -0 cat &>/dev/null;done

...
Error -3 while decompressing!
c0000000009592a2(2649)->c0000000edf87000(4096)
Error -3 while decompressing!
c000000000959298(2520)->c0000000edbc7000(4096)
Error -3 while decompressing!
c000000000959c70(2489)->c0000000f1482000(4096)
Error -3 while decompressing!
c00000000095a629(2355)->c0000000edaff000(4096)
Error -3 while decompressing!
...
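
For readers who want to trigger the flush from a program rather than from the shell loop: `blockdev --flushbufs` issues the BLKFLSBUF ioctl on the device. The following is a minimal userspace sketch of that call, not code from this thread; the helper name `flush_block_device` is made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>    /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to flush the buffer cache of one
 * block device via BLKFLSBUF.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int flush_block_device(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, BLKFLSBUF, 0);
    int saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

Calling `flush_block_device("/dev/loop0")` in a loop would correspond to the first line of the script; it usually needs root privileges.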

evms_access issues this ioctl (many times) on the loop device.
It's a long-standing bug; 2.6.5 fails as well. cramfs_read() clears parts
of the src buffer because the page is not uptodate. invalidate_bdev()
touched the page last.
When the PageUptodate(page) test fails, cramfs_read() was called from
line 480 or 490.

...
464 static int cramfs_readpage(struct file *file, struct page * page)
..
479         if (page->index)
480                 start_offset = *(u32 *) cramfs_read(sb, blkptr_offset-4, 4);
..
488                 bytes_filled = cramfs_uncompress_block(pgdata,
489                          PAGE_CACHE_SIZE,
490                          cramfs_read(sb, start_offset, compr_len),
491                          compr_len);
...
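
To make the failure mode concrete, here is a small userspace model (not kernel code) of the behaviour described above: a page that is not uptodate is not copied, and the reader's buffer gets zeroes instead, so the decompressor is fed garbage. The error code -3 in the log is zlib's Z_DATA_ERROR, meaning the inflater saw corrupt input. The names `fake_page`, `model_read_page` and `MODEL_PAGE_SIZE` are invented for this sketch.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for struct page: payload plus an uptodate flag. */
struct fake_page {
    int uptodate;
    unsigned char data[MODEL_PAGE_SIZE];
};

/*
 * Model of the copy step: an uptodate page is copied into dst, anything
 * else (missing or not uptodate) is replaced by zeroes.
 * Returns 1 if real data was copied, 0 if dst was zero-filled.
 */
static int model_read_page(const struct fake_page *page, unsigned char *dst)
{
    if (page != NULL && page->uptodate) {
        memcpy(dst, page->data, MODEL_PAGE_SIZE);
        return 1;
    }
    memset(dst, 0, MODEL_PAGE_SIZE);
    return 0;
}
```

In this model, clearing `uptodate` between filling the page and calling `model_read_page()` is enough to hand a zeroed source buffer to whatever consumes it next.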

There are rumors that cramfs is not SMP-safe...
Maybe the only hope is to tell evms not to issue that ioctl on loop devices.


2006-05-30 13:19:37

by Olaf Hering

Subject: Re: cramfs corruption after BLKFLSBUF on loop device

On Mon, May 29, Olaf Hering wrote:

> This script will cause cramfs decompression errors, on SMP at least:
>
> #!/bin/bash
> while :;do blockdev --flushbufs /dev/loop0;done </dev/null &>/dev/null&
> while :;do ps faxs </dev/null &>/dev/null&done </dev/null &>/dev/null&
> while :;do dmesg </dev/null &>/dev/null&done </dev/null &>/dev/null&
> while :;do find /mounts/instsys -type f -print0|xargs -0 cat &>/dev/null;done
>
> ...
> Error -3 while decompressing!
> c0000000009592a2(2649)->c0000000edf87000(4096)
> Error -3 while decompressing!
> c000000000959298(2520)->c0000000edbc7000(4096)
> Error -3 while decompressing!
> c000000000959c70(2489)->c0000000f1482000(4096)
> Error -3 while decompressing!
> c00000000095a629(2355)->c0000000edaff000(4096)
> Error -3 while decompressing!
> ...

This change works for me; the added BUG_ON(1) does not trigger.
read_cache_page() returns the page in PageUptodate() state, but a few
ticks later invalidate_complete_page() calls ClearPageUptodate() on
another CPU.
The SetPageDirty() works for my testcase, but only together with the mb().
Does anyone know what side effects the SetPageDirty() has for the
loop-mounted cramfs?



---
fs/cramfs/inode.c | 2 ++
fs/cramfs/uncompress.c | 1 +
2 files changed, 3 insertions(+)

Index: linux-2.6.16.16-1.6/fs/cramfs/inode.c
===================================================================
--- linux-2.6.16.16-1.6.orig/fs/cramfs/inode.c
+++ linux-2.6.16.16-1.6/fs/cramfs/inode.c
@@ -186,6 +186,8 @@ static void *cramfs_read(struct super_bl
 			/* synchronous error? */
 			if (IS_ERR(page))
 				page = NULL;
+			SetPageDirty(page);
+			mb();
 		}
 		pages[i] = page;
 	}
Index: linux-2.6.16.16-1.6/fs/cramfs/uncompress.c
===================================================================
--- linux-2.6.16.16-1.6.orig/fs/cramfs/uncompress.c
+++ linux-2.6.16.16-1.6/fs/cramfs/uncompress.c
@@ -50,6 +50,7 @@ int cramfs_uncompress_block(void *dst, i
 err:
 	printk("Error %d while decompressing!\n", err);
 	printk("%p(%d)->%p(%d)\n", src, srclen, dst, dstlen);
+	BUG_ON(1);
 	return 0;
 }
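
The window described in this message (page returned uptodate, flag cleared on another CPU before the reader checks it) can be replayed deterministically in userspace. This is only a single-threaded sketch of the interleaving under simplified assumptions, not a reproduction of the kernel race and not a model of the memory barrier; `page_model`, `model_fill`, `model_invalidate` and `reader_gets_data` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of a page: only the uptodate flag matters here. */
struct page_model {
    bool uptodate;
};

/* Stand-in for read_cache_page() handing back an uptodate page. */
static void model_fill(struct page_model *p)
{
    p->uptodate = true;
}

/* Stand-in for the invalidation path calling ClearPageUptodate(). */
static void model_invalidate(struct page_model *p)
{
    p->uptodate = false;
}

/*
 * Replays one reader pass. If invalidate_between is true, the
 * invalidation runs in the window between the fill and the reader's
 * uptodate check.
 * Returns true if the reader would copy real data, false if it would
 * fall back to a zero-filled buffer.
 */
static bool reader_gets_data(bool invalidate_between)
{
    struct page_model page = { false };

    model_fill(&page);
    if (invalidate_between)
        model_invalidate(&page);
    return page.uptodate;
}
```

`reader_gets_data(false)` corresponds to the normal case; `reader_gets_data(true)` corresponds to the interleaving suspected here.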

2006-05-30 18:25:04

by Olaf Hering

Subject: Re: cramfs corruption after BLKFLSBUF on loop device

On Mon, May 29, Olaf Hering wrote:

> This script will cause cramfs decompression errors, on SMP at least:
>
> #!/bin/bash
> while :;do blockdev --flushbufs /dev/loop0;done </dev/null &>/dev/null&
> while :;do ps faxs </dev/null &>/dev/null&done </dev/null &>/dev/null&
> while :;do dmesg </dev/null &>/dev/null&done </dev/null &>/dev/null&
> while :;do find /mounts/instsys -type f -print0|xargs -0 cat &>/dev/null;done
>
> ...
> Error -3 while decompressing!
> c0000000009592a2(2649)->c0000000edf87000(4096)
> Error -3 while decompressing!
> c000000000959298(2520)->c0000000edbc7000(4096)
> Error -3 while decompressing!
> c000000000959c70(2489)->c0000000f1482000(4096)
> Error -3 while decompressing!
> c00000000095a629(2355)->c0000000edaff000(4096)
> Error -3 while decompressing!
> ...
>
> evms_access does the ioctl (lots of them) on the loop device.
> It's a long-standing bug; 2.6.5 fails as well. cramfs_read() clears parts
> of the src buffer because the page is not uptodate. invalidate_bdev()
> touched the page last.
> cramfs_read() was called from line 480 or 490 when the
> PageUptodate(page) test fails.

Al, you added the PageUptodate check for 2.6.2.

http://linux.bkbits.net:8080/linux-2.6/gnupatch@400c1cddyzRoKomOj57xxUAmKnMbZQ

Should there be some locking for blockdev --flushbufs, or is the check
just bogus?