2005-05-11 07:40:44

by fs

Subject: [PATCH] VFS mmap wrong behavior when I/O failure occurs

Related FS:
EXT2, EXT3, XFS

Related files:
fs/buffer.c mm/filemap.c

Bug description:
Create a partition on a USB hard disk and create a 64M file on it.
Write a program that does: open - mmap - read memory at offset 16M
(from disk) - munmap - close, pausing for a while (for example
3 seconds) after each operation. Between the mmap and the memory
read, unplug the USB cable to force an I/O error. The memory read
reports no error and returns 0x00, whereas the correct result is SIGBUS.

Bug analysis:
When a page of an mmapped file is accessed, the kernel calls
do_no_page->..->filemap_nopage->mapping->a_ops->readpage to read the
page from disk. In filemap_nopage(), if !PageUptodate(page) (i.e.
readpage() failed), it will ClearPageError(page) and try readpage()
again. Most filesystems call mpage_readpage(). For EXT2/EXT3/XFS,
this ends up in block_read_full_page().

1st readpage()->block_read_full_page():
get_block fails (because of I/O failure),
    if (iblock < lblock) {
        if (get_block(inode, iblock, bh, 0))
            SetPageError(page);
    }
    if (!buffer_mapped(bh)) {
        void *kaddr = kmap_atomic(page, KM_USER0);
        memset(kaddr + i * blocksize, 0, blocksize);
        flush_dcache_page(page);
        kunmap_atomic(kaddr, KM_USER0);
        set_buffer_uptodate(bh); //NOTICE HERE
        continue;
    }
    ...

Then ClearPageError(page);

2nd readpage()->block_read_full_page():
for every buffer:
    if (buffer_uptodate(bh))
        continue;
So in the end the page and the buffer are uptodate, with no error flag set.
filemap_nopage() will happily return a zero-filled page without any
error!
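
The two-pass sequence above can be sketched as a small userspace model. This is not kernel code: the toy_* names are made up, the page has a single buffer that always stays unmapped (either a hole or a failed lookup), and the rule that a page becomes uptodate only when no error flag is set is an assumption about the end of block_read_full_page(). The `fixed` switch selects the variant that skips marking the buffer uptodate when the lookup failed.

```c
#include <stdbool.h>

/* Toy stand-ins for the buffer_head and page flags discussed above. */
struct toy_buffer { bool uptodate; };
struct toy_page   { bool error; bool uptodate; struct toy_buffer bh; };

/*
 * One readpage() pass over a single-buffer page.
 * get_block_err simulates the return value of get_block();
 * fixed == true means "do not mark the buffer uptodate after an error".
 */
static void toy_readpage(struct toy_page *page, int get_block_err, bool fixed)
{
    struct toy_buffer *bh = &page->bh;

    if (!bh->uptodate) {            /* uptodate buffers are skipped */
        int err = get_block_err;    /* stand-in for get_block() */

        if (err)
            page->error = true;     /* SetPageError() */
        /* buffer is unmapped: it is zero-filled here */
        if (!fixed || !err)
            bh->uptodate = true;    /* set_buffer_uptodate() */
    }
    /* Assumed: with no I/O to submit, the page is uptodate unless in error. */
    if (!page->error)
        page->uptodate = true;
}

/*
 * Models the retry in filemap_nopage(): returns true if the fault hands a
 * (zero-filled) page back to the process, false if it would fail instead.
 */
bool toy_fault_returns_page(int get_block_err, bool fixed)
{
    struct toy_page page = { false, false, { false } };

    toy_readpage(&page, get_block_err, fixed);
    if (!page.uptodate) {
        page.error = false;         /* ClearPageError() */
        toy_readpage(&page, get_block_err, fixed);
    }
    return page.uptodate;
}
```

In this model, toy_fault_returns_page(-5, false) is true (the failure is masked by the uptodate buffer on the second pass), while toy_fault_returns_page(-5, true) is false. With get_block_err == 0, i.e. a genuine hole, both variants return the page.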

Possible fixes:
A. Do not set the buffer uptodate.
Or
B. Do not call readpage() twice.

Patch:
Since the buffer has already been memset to 0, there is no need to call
set_buffer_uptodate().
diff -uNp linux-2.6.11.8-orig/fs/buffer.c linux-2.6.11.8/fs/buffer.c
--- linux-2.6.11.8-orig/fs/buffer.c 2005-05-11 14:41:03.000000000
-0400
+++ linux-2.6.11.8/fs/buffer.c 2005-05-11 14:38:55.000000000 -0400
@@ -2105,7 +2105,6 @@ int block_read_full_page(struct page *pa
memset(kaddr + i * blocksize, 0,
blocksize);
flush_dcache_page(page);
kunmap_atomic(kaddr, KM_USER0);
- set_buffer_uptodate(bh);
continue;
}
/*
Signed-off-by: Qu Fuping <[email protected]>

----
Best Regards,
Qu Fuping



2005-05-11 08:20:27

by Andrew Morton

Subject: Re: [PATCH] VFS mmap wrong behavior when I/O failure occurs

fs <[email protected]> wrote:
>
> --- linux-2.6.11.8-orig/fs/buffer.c 2005-05-11 14:41:03.000000000
> -0400

Your email client wordwrapped the patch. Please fix it.

> +++ linux-2.6.11.8/fs/buffer.c 2005-05-11 14:38:55.000000000 -0400
> @@ -2105,7 +2105,6 @@ int block_read_full_page(struct page *pa
> memset(kaddr + i * blocksize, 0,
> blocksize);
> flush_dcache_page(page);
> kunmap_atomic(kaddr, KM_USER0);
> - set_buffer_uptodate(bh);
> continue;
> }
> /*

This patch will break the kernel's regular handling of file holes -
!buffer_mapped() means that there was no disk mapping for this buffer: it
sits over a hole in the file. Zeroing out the buffer and marking it
uptodate is correct behaviour.
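
The hole case can be observed from userspace with a sketch like the one below. The helper name hole_reads_as_zero() is made up for illustration. Extending an empty file with ftruncate() is used here on the assumption that the filesystem leaves the new range unallocated; POSIX only guarantees that the extended range reads back as zeros, not that it is stored as a hole.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` as an empty file, extend it to `len`
 * bytes without writing any data, map it, and check that every byte reads
 * back as zero.
 *
 * Returns 1 if all bytes are zero, 0 if a nonzero byte is found,
 * -1 on a setup failure.
 */
int hole_reads_as_zero(const char *path, size_t len)
{
    unsigned char *map;
    size_t i;
    int fd;
    int all_zero = 1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* Grow the file without writing: the new range has no data written. */
    if (len == 0 || ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }
    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    for (i = 0; i < len; i++) {
        if (map[i] != 0) {
            all_zero = 0;
            break;
        }
    }
    munmap(map, len);
    return all_zero;
}
```

Reads like these succeed with zeros and must keep doing so, which is why the zero-filled buffer has to be marked uptodate when get_block() reports no error.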

You probably want something like this:

--- 25/fs/buffer.c~a 2005-05-11 01:15:13.000000000 -0700
+++ 25-akpm/fs/buffer.c 2005-05-11 01:16:39.000000000 -0700
@@ -2094,9 +2094,12 @@ int block_read_full_page(struct page *pa
continue;

if (!buffer_mapped(bh)) {
+ int err = 0;
+
fully_mapped = 0;
if (iblock < lblock) {
- if (get_block(inode, iblock, bh, 0))
+ err = get_block(inode, iblock, bh, 0);
+ if (err)
SetPageError(page);
}
if (!buffer_mapped(bh)) {
@@ -2104,7 +2107,8 @@ int block_read_full_page(struct page *pa
memset(kaddr + i * blocksize, 0, blocksize);
flush_dcache_page(page);
kunmap_atomic(kaddr, KM_USER0);
- set_buffer_uptodate(bh);
+ if (!err)
+ set_buffer_uptodate(bh);
continue;
}
/*
_


2005-05-11 09:25:12

by fs

Subject: Re: [PATCH] VFS mmap wrong behavior when I/O failure occurs

On Wed, 2005-05-11 at 04:19, Andrew Morton wrote:
> fs <[email protected]> wrote:
> >
> > --- linux-2.6.11.8-orig/fs/buffer.c 2005-05-11 14:41:03.000000000
> > -0400
>
> Your email client wordwrapped the patch. Please fix it.
>
> > +++ linux-2.6.11.8/fs/buffer.c 2005-05-11 14:38:55.000000000 -0400
> > @@ -2105,7 +2105,6 @@ int block_read_full_page(struct page *pa
> > memset(kaddr + i * blocksize, 0,
> > blocksize);
> > flush_dcache_page(page);
> > kunmap_atomic(kaddr, KM_USER0);
> > - set_buffer_uptodate(bh);
> > continue;
> > }
> > /*
>
> This patch will break the kernel's regular handling of file holes -
> !buffer_mapped() means that there was no disk mapping for this buffer: it
> sits over a hole in the file. Zeroing out the buffer and marking it
> uptodate is correct behaviour.
>
> You probably want something like this:
Yes, you are right, that is what I really want.
>
> --- 25/fs/buffer.c~a 2005-05-11 01:15:13.000000000 -0700
> +++ 25-akpm/fs/buffer.c 2005-05-11 01:16:39.000000000 -0700
> @@ -2094,9 +2094,12 @@ int block_read_full_page(struct page *pa
> continue;
>
> if (!buffer_mapped(bh)) {
> + int err = 0;
> +
> fully_mapped = 0;
> if (iblock < lblock) {
> - if (get_block(inode, iblock, bh, 0))
> + err = get_block(inode, iblock, bh, 0);
> + if (err)
> SetPageError(page);
> }
> if (!buffer_mapped(bh)) {
> @@ -2104,7 +2107,8 @@ int block_read_full_page(struct page *pa
> memset(kaddr + i * blocksize, 0, blocksize);
> flush_dcache_page(page);
> kunmap_atomic(kaddr, KM_USER0);
> - set_buffer_uptodate(bh);
> + if (!err)
> + set_buffer_uptodate(bh);
> continue;
> }
> /*
> _
>
So the final patch will look like the one below.
P.S. My mail client is a little buggy and I can't get it to behave correctly :(

diff -uNp linux-2.6.11.8-orig/fs/buffer.c linux-2.6.11.8/fs/buffer.c
--- linux-2.6.11.8-orig/fs/buffer.c 2005-05-11 14:41:03.000000000 -0400
+++ linux-2.6.11.8/fs/buffer.c 2005-05-11 16:20:40.000000000 -0400
@@ -2095,17 +2095,21 @@ int block_read_full_page(struct page *pa
continue;

if (!buffer_mapped(bh)) {
+ int err = 0;
+
fully_mapped = 0;
if (iblock < lblock) {
- if (get_block(inode, iblock, bh, 0))
- SetPageError(page);
+ err = get_block(inode, iblock, bh, 0);
+ if (err)
+ SetPageError(page);
}
if (!buffer_mapped(bh)) {
void *kaddr = kmap_atomic(page, KM_USER0);
memset(kaddr + i * blocksize, 0, blocksize);
flush_dcache_page(page);
kunmap_atomic(kaddr, KM_USER0);
- set_buffer_uptodate(bh);
+ if (!err)
+ set_buffer_uptodate(bh);
continue;
}
/*