2008-06-05 09:51:13

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Patches for the patchqueue

Hi Mingming,

These are the patches for patch queue. They have been sent to the
mailing list before. I am attaching them below so that it can be easily
pushed to the patch queue. The patches will be sent as a follow up to
this mail.

I modified the delalloc-clear-bh-delay-bit-fix.patch
to address only the VFS bug. The delay bit is properly cleared in
0001-delalloc-enospc.patch. I guess we should also push
003-delalloc-clear-bh-delay-bit-fix.patch to the stable series as
it is a VFS bug in handling delay buffer.

002-blkcleanup.patch patch can be merged with
ext4-use-inode-preallocation-with-noextents.patch. I am just posting it
as a separate patch so that others can review the changes.


The series file looks like

+ ext4-use-inode-preallocation-with-noextents.patch
+ 002-blkcleanup.patch
+ ext4-call-blkdev_issue_flush-on-fsync.patch
+ ext4-mb-add-ext4_has_free_blocks-check.patch

......

+ jbd2-Remove-data-ordered-mode-support-using-jbd-buf.patch
+ delalloc-vfs.patch
+ 003-delalloc-clear-bh-delay-bit-fix.patch
+ ext4-fix-fs-corruption-with-delalloc.patch

.....

+ percpucounter-add-sum-and-set-function.patch
+ 0001-delalloc-enospc.patch
+ jbd-blocks-reservation-fix-for-large-blk.patch
+ jbd2-blocks-reservation-fix-for-large-blk.patch

-aneesh


2008-06-05 18:03:52

by Mingming Cao

[permalink] [raw]
Subject: Re: Patches for the patchqueue

On Thu, 2008-06-05 at 15:25 +0530, Aneesh Kumar K.V wrote:
> f) clear the delay bit in ext4_da_get_block_write instead of
> __block_write_full_page
> so that we clear the delay bit for every successfull block allocation.
> We may fail
> while marking inode dirty in ext4_da_get_block_write after allocating
> block. So
> it is better to clear the delay bit in ext4_da_get_block_write rather
> than
> __block_write_full_page
>

> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---

> @@ -1555,7 +1565,15 @@ static int ext4_da_get_block_write(struct inode
> *inode, sector_t iblock,
> bh_result->b_size = (ret << inode->i_blkbits);
>
> /* release reserved-but-unused meta blocks */
> - ext4_da_release_space(inode, ret, 0);
> + if (buffer_delay(bh_result)) {
> + ext4_da_release_space(inode, ret, 0);
> + /*
> + * clear the delay bit now that we allocated
> + * blocks. If it is not a single block request
> + * we clear the delay bit in
> mpage_put_bnr_to_bhs
> + */
> + clear_buffer_delay(bh_result);
> + }
>
> /*
> * Update on-disk size along with block allocation

It seems with this fix, the buffer_delay bit is still cleared before the
ext4_mark_inode_dirty() could return error? Actually the already
allocated blocks are leaked if mark_inode-dirty() returns error, and we
cleared the buffer_delay for the buffer needs block.

Mingming


2008-06-05 18:51:51

by Andreas Dilger

[permalink] [raw]
Subject: Re: Patches for the patchqueue

On Jun 05, 2008 15:23 +0530, Aneesh Kumar K.V wrote:
> vfs: Don't flush delay buffer to disk
>
> In block_write_full_page() error case, we need to check the
> delayed flag before flush bh to disk when trying to recover from
> error.

> @@ -1775,7 +1775,8 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
> bh = head;
> /* Recovery: lock and submit the mapped buffers */
> do {
> - if (buffer_mapped(bh) && buffer_dirty(bh)) {
> + if (buffer_mapped(bh) && buffer_dirty(bh)
> + && !buffer_delay(bh)) {

Please use proper CodingStyle here. Firstly, this can fit on the previous
line < 80 columns so it shouldn't be split. Secondly, if the line had to
be split, the "&&" should go at the end of the previous line. Thirdly,
the continued line should align with the 'if (' on the previous line.


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-06-05 18:59:25

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: Patches for the patchqueue

On Thu, Jun 05, 2008 at 11:02:40AM -0700, Mingming Cao wrote:
> On Thu, 2008-06-05 at 15:25 +0530, Aneesh Kumar K.V wrote:
> > f) clear the delay bit in ext4_da_get_block_write instead of
> > __block_write_full_page
> > so that we clear the delay bit for every successfull block allocation.
> > We may fail
> > while marking inode dirty in ext4_da_get_block_write after allocating
> > block. So
> > it is better to clear the delay bit in ext4_da_get_block_write rather
> > than
> > __block_write_full_page
> >
>
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
>
> > @@ -1555,7 +1565,15 @@ static int ext4_da_get_block_write(struct inode
> > *inode, sector_t iblock,
> > bh_result->b_size = (ret << inode->i_blkbits);
> >
> > /* release reserved-but-unused meta blocks */
> > - ext4_da_release_space(inode, ret, 0);
> > + if (buffer_delay(bh_result)) {
> > + ext4_da_release_space(inode, ret, 0);
> > + /*
> > + * clear the delay bit now that we allocated
> > + * blocks. If it is not a single block request
> > + * we clear the delay bit in
> > mpage_put_bnr_to_bhs
> > + */
> > + clear_buffer_delay(bh_result);
> > + }
> >
> > /*
> > * Update on-disk size along with block allocation
>
> It seems with this fix, the buffer_delay bit is still cleared before the
> ext4_mark_inode_dirty() could return error? Actually the already
> allocated blocks are leaked if mark_inode-dirty() returns error, and we
> cleared the buffer_delay for the buffer needs block.
>

How abou the below ? For single block request we are ok to clear the
delay bit as shown by the above patch. For multiple block request we
clear the delay bit if the buffer_head passed to the get_block have
its delay bit cleared. That should take care of the block leaking you
mentioned above.

diff --git a/fs/mpage.c b/fs/mpage.c
index c4376ec..2c90350 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -908,25 +908,41 @@ static void mpage_da_map_blocks(struct mpage_da_data *mpd)
new.b_blocknr = 0;
new.b_size = remain;
err = mpd->get_block(mpd->inode, next, &new, 1);
- if (err) {
+ /*
+ * we may have successfully allocated block. But
+ * failed to mark inode dirty. If we have allocated
+ * blocks update the buffer_head mappings
+ */
+ if (buffer_new(&new)) {
/*
- * Rather than implement own error handling
- * here, we just leave remaining blocks
- * unallocated and try again with ->writepage()
+ * buffer_head is only makred new if we have
+ * a successfull block allocation
*/
- break;
- }
- BUG_ON(new.b_size == 0);
-
- if (buffer_new(&new))
__unmap_underlying_blocks(mpd->inode, &new);
+ }

/*
* If blocks are delayed marked, we need to
* put actual blocknr and drop delayed bit
*/
- if (buffer_delay(lbh))
+ if (buffer_delay(lbh) && !buffer_delay(&new)) {
+ /*
+ * get_block if successfully allocated
+ * block will clear the delay bit of
+ * new buffer_head
+ */
mpage_put_bnr_to_bhs(mpd, next, &new);
+ }
+
+ if (err) {
+ /*
+ * Rather than implement own error handling
+ * here, we just leave remaining blocks
+ * unallocated and try again with ->writepage()
+ */
+ break;
+ }
+ BUG_ON(new.b_size == 0);

/* go for the remaining blocks */
next += new.b_size >> mpd->inode->i_blkbits;