2009-07-01 09:46:17

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH] ext4: Return error if we fail to allocate block in noalloc_get_block_write

block_write_full_page consider a zero return from get_block as success.
noalloc_get_block_write returned zero even if we failed to find a mapping
blocks. Returning non zero ensures we fallback to the error handling path
of block_write_full_page which would properly redirty the page after
the below patch is applied.

http://article.gmane.org/gmane.linux.file-systems/33145

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---

NOTE: I am not sure whether -EGAIN is the right error to be returned error.
This patch should enable us to push the pending ext4 patches in the patch
queue without depending on the full series from Jan. Will reply to this
email with patch ordering.

fs/ext4/inode.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 25638bc..6c814af 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2514,7 +2514,10 @@ static int noalloc_get_block_write(struct inode *inode, sector_t iblock,
if (ret > 0) {
bh_result->b_size = (ret << inode->i_blkbits);
ret = 0;
- }
+ } else if (create && ret == 0)
+ /* write request on unmapped buffer head. */
+ ret = -EAGAIN;
+
return ret;
}

--
1.6.3.2.363.gc5764



2009-07-01 09:56:36

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH] ext4: Return error if we fail to allocate block in noalloc_get_block_write

On Wed, Jul 01, 2009 at 03:16:15PM +0530, Aneesh Kumar K.V wrote:
> block_write_full_page consider a zero return from get_block as success.
> noalloc_get_block_write returned zero even if we failed to find a mapping
> blocks. Returning non zero ensures we fallback to the error handling path
> of block_write_full_page which would properly redirty the page after
> the below patch is applied.
>
> http://article.gmane.org/gmane.linux.file-systems/33145
>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
>
> NOTE: I am not sure whether -EGAIN is the right error to be returned error.
> This patch should enable us to push the pending ext4 patches in the patch
> queue without depending on the full series from Jan. Will reply to this
> email with patch ordering.
>
> fs/ext4/inode.c | 5 ++++-
> 1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 25638bc..6c814af 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2514,7 +2514,10 @@ static int noalloc_get_block_write(struct inode *inode, sector_t iblock,
> if (ret > 0) {
> bh_result->b_size = (ret << inode->i_blkbits);
> ret = 0;
> - }
> + } else if (create && ret == 0)
> + /* write request on unmapped buffer head. */
> + ret = -EAGAIN;
> +
> return ret;
> }
>

The patch ordering I tested this with was
+ stable-boundary
+ stable-boundary-undo.patch
+ jan-kara-dont-clear-dirty-bits-in-block_write_full_page
+ dont-look-at-buffer_heads-outside-i_size
+ fix-mmap-truncate-race-with-subpage-blocksize
+ 01.patch <=============================================== This is the above patch
+ fix-mmap-truncate-race-with-nondelalloc
+ move-__ext4_journaled-writepage-function
+ add-WARN_ON-with-unmapped-dirty-bh-in-writepage
+ vfs-jan-kara-page_mkwrite-infrastructure
+ allocate-blocks-correctly-with-subpage-blocksize
> jan-kara-unmap-underlying-metadata-of-new-buffers-only-when-mapped

Now with jan-kara-dont-clear-dirty-bits-in-block_write_full_page
block_write_full_page will redirty the page when get_block fails. That
would ensure that we don't drop the pages if we find unmapped
buffer_heads in the writepage callback. The patch
fix-mmap-truncate-race-with-subpage-blocksize actually pass pages with
unmapped buffer heads to block_write_full_page.

This also means we can push patches upto
move-__ext4_journaled-writepage-function to upstream in 2.6.31

I am yet to run the full patch set on ABAT. So the patches had limited
testing.

-aneesh

2009-07-07 09:21:09

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH] ext4: Return error if we fail to allocate block in noalloc_get_block_write

On Wed, Jul 01, 2009 at 03:16:15PM +0530, Aneesh Kumar K.V wrote:
> block_write_full_page consider a zero return from get_block as success.
> noalloc_get_block_write returned zero even if we failed to find a mapping
> blocks. Returning non zero ensures we fallback to the error handling path
> of block_write_full_page which would properly redirty the page after
> the below patch is applied.
>
> http://article.gmane.org/gmane.linux.file-systems/33145
>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
>
> NOTE: I am not sure whether -EGAIN is the right error to be returned error.
> This patch should enable us to push the pending ext4 patches in the patch
> queue without depending on the full series from Jan. Will reply to this
> email with patch ordering.
>
> fs/ext4/inode.c | 5 ++++-
> 1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 25638bc..6c814af 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2514,7 +2514,10 @@ static int noalloc_get_block_write(struct inode *inode, sector_t iblock,
> if (ret > 0) {
> bh_result->b_size = (ret << inode->i_blkbits);
> ret = 0;
> - }
> + } else if (create && ret == 0)
> + /* write request on unmapped buffer head. */
> + ret = -EAGAIN;
> +
> return ret;
> }
>

Now that i tried this with a test case, this patch really doesn't make
the situation better. So we can drop the above patch. But i guess wei
should push the pending patches in the patchqueue even without
Jan kara's changes. Below is what i found.

Without the below set of patches we have some really bad bugs with ext4
on a 1k blocksize filesystem

dont-look-at-buffer_heads-outside-i_size
fix-mmap-truncate-race-with-subpage-blocksize
fix-mmap-truncate-race-with-nondelalloc
move-__ext4_journaled-writepage-function


a) As I mentioned in the call, without the changes, a simple mmap and
truncate on a file will result in us looping in writepage. The attached test
program shows that. The fsync() calls won't return.

b) But with the above changes we will now start dropping some file
contents after a truncate -> mmap -> truncate sequence. In the test
program attached, the value of *(addr + 2048 ) is dropped. Jan's patches
will fix the problem for us. But for 2.6.30 we are still ok even if the
contents are dropped because the man page for mmap states
"The effect of changing the size of the underlying file of a mapping on
the pages that correspond to added or removed regions of the file is unspecified."

considering that the above set of patches is really fixing the problem
mentioned in (a) . We may want to push the patches to 2.6.30.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <stdlib.h>

main(int argc, char *argv[])
{
int fd = open(argv[1], O_RDWR | O_CREAT, 0666);
char *addr;
ftruncate(fd, 1024);
addr = (char *)mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, 0);
if (addr == MAP_FAILED) {
perror("Error in mmap\n");
exit(1);
}
*addr = 'a';
*(addr+1) = 'b';
ftruncate(fd, 4096);
*(addr + 2048) = 'k';
fsync(fd);
close(fd);
}



2009-07-07 17:18:12

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] ext4: Return error if we fail to allocate block in noalloc_get_block_write

On Wed 01-07-09 15:26:36, Aneesh Kumar K.V wrote:
> On Wed, Jul 01, 2009 at 03:16:15PM +0530, Aneesh Kumar K.V wrote:
> > block_write_full_page consider a zero return from get_block as success.
> > noalloc_get_block_write returned zero even if we failed to find a mapping
> > blocks. Returning non zero ensures we fallback to the error handling path
> > of block_write_full_page which would properly redirty the page after
> > the below patch is applied.
> >
> > http://article.gmane.org/gmane.linux.file-systems/33145
> >
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> >
> > NOTE: I am not sure whether -EGAIN is the right error to be returned error.
> > This patch should enable us to push the pending ext4 patches in the patch
> > queue without depending on the full series from Jan. Will reply to this
> > email with patch ordering.
> >
> > fs/ext4/inode.c | 5 ++++-
> > 1 files changed, 4 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 25638bc..6c814af 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2514,7 +2514,10 @@ static int noalloc_get_block_write(struct inode *inode, sector_t iblock,
> > if (ret > 0) {
> > bh_result->b_size = (ret << inode->i_blkbits);
> > ret = 0;
> > - }
> > + } else if (create && ret == 0)
> > + /* write request on unmapped buffer head. */
> > + ret = -EAGAIN;
> > +
> > return ret;
> > }
> >
>
> The patch ordering I tested this with was
> + stable-boundary
> + stable-boundary-undo.patch
> + jan-kara-dont-clear-dirty-bits-in-block_write_full_page
> + dont-look-at-buffer_heads-outside-i_size
> + fix-mmap-truncate-race-with-subpage-blocksize
> + 01.patch <=============================================== This is the above patch
> + fix-mmap-truncate-race-with-nondelalloc
> + move-__ext4_journaled-writepage-function
> + add-WARN_ON-with-unmapped-dirty-bh-in-writepage
> + vfs-jan-kara-page_mkwrite-infrastructure
> + allocate-blocks-correctly-with-subpage-blocksize
> > jan-kara-unmap-underlying-metadata-of-new-buffers-only-when-mapped
>
> Now with jan-kara-dont-clear-dirty-bits-in-block_write_full_page
> block_write_full_page will redirty the page when get_block fails. That
> would ensure that we don't drop the pages if we find unmapped
> buffer_heads in the writepage callback. The patch
> fix-mmap-truncate-race-with-subpage-blocksize actually pass pages with
> unmapped buffer heads to block_write_full_page.
>
> This also means we can push patches upto
> move-__ext4_journaled-writepage-function to upstream in 2.6.31
>
> I am yet to run the full patch set on ABAT. So the patches had limited
> testing.
Yes, this looks as a good solution to me since Nick had some objections
to later patches in my patchset so they probably won't get merged for
2.6.31...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR