2008-07-05 17:19:07

by Theodore Ts'o

Subject: New ext4 patchset 2.6.26-rc8-ext4-1


As a git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git 2.6.26-rc8-ext4-1
http://git.kernel.org/?p=linux/kernel/git/tytso/ext4.git;a=shortlog;h=2.6.26-rc8-ext4-1

As a patchset:

ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-patches/2.6.26-rc8-ext4-1

Patches marked with a (*) are planned to be pushed to Linus during the
next merge window.

- Ted

Akinobu Mita (1):
* ext4: fix ext4_init_block_bitmap() for metablock block group

Akira Fujita (7):
ext4: online defrag-- Main function of defrag and ioctl implementation
ext4: online defrag-- Allocate new contiguous blocks with mballoc
ext4: online defrag-- Read and write file data with memory page
ext4: online defrag-- Exchange the blocks between two inodes
ext4: online defrag-- Defragmentation for the relevant files (-r mode)
ext4: online defrag-- Check the free space fragmentation (-f mode)
ext4: online defrag-- Move victim files for the target file (-f mode)

Alex Tomas (2):
* vfs: add basic delayed allocation support
* ext4: Add basic delayed allocation support

Alexey Dobriyan (1):
* ext4: switch to seq_files

Aneesh Kumar (1):
* ext4: Handle page without buffers in ext4_*_writepage()

Aneesh Kumar K.V (11):
* ext4: start searching for the right extent from the goal group.
* ext4: Update i_disksize properly when allocating from fallocate area.
* ext4: Fix sparse warning
* ext4: Use inode preallocation with -o noextents
* ext4: cleanup block allocator
* ext4: Use page_mkwrite vma_operations to get mmap write notification.
* mm: Add range_cont mode for writeback
* ext4: Add ordered mode support for delalloc
* ext4: Enable delalloc by default.
* ext4: Don't allow nonextenst mount option for large filesystem
ext4: undo the stable boundary patch changes

Eric Sandeen (5):
* ext4: call blkdev_issue_flush on fsync
* ext4: use atomic functions to set bh_state
vfs: vfs-level fiemap interface
ext4: reinstate ext4_ext_walk_space()
ext4: fiemap implementation

Frederic Bohe (1):
* ext4: fix online resize with mballoc

Jan Kara (8):
* ext4: Set journal pointer to NULL when journal is released
* ext4: Add missing unlock to an error path in ext4_quota_write()
* vfs: Move mark_inode_dirty() from under page lock in generic_write_end()
* ext4: Invert the locking order of page_lock and transaction start
* vfs: export filemap_fdatawrite_range()
* jbd2: Implement data=ordered mode handling via inodes
* ext4: Use new framework for data=ordered mode in JBD2
* jbd2: Remove data=ordered mode support using jbd buffer heads

Jose R. Santos (2):
* ext4: New inode allocation for FLEX_BG meta-data groups.
* Ext4: Documentation updates.

Julia Lawall (1):
* ext4: Use BUG_ON() instead of BUG()

Li Zefan (2):
* ext4: remove redundant code in ext4_fill_super()
* ext4: cleanup never-used magic numbers from htree code

Mingming Cao (8):
* ext4: Fix ext4_mb_init_cache return error
* JBD2: fix race between jbd2_journal_try_to_free_buffers() and jbd2 commit transaction
* ext4: mballoc avoid use root reserved blocks for non root allocation
* percpu_counter: new function percpu_counter_sum_and_set
* ext4: delayed allocation ENOSPC handling
* ext4: Invert lock ordering of page_lock and transaction start in delalloc
* Ext4: fix delalloc i_disksize early update issue
* Ext4: Documention update for new ordered mode and delayed allocation

Shen Feng (9):
* ext4: fix comments to say "ext4"
* ext4: improve some code in rb tree part of dir.c
* ext4: add error processing when calling ext4_mb_init_cache in mballoc
* ext4: miscellaneous error checks and coding cleanups for mballoc
* ext4: remove double definitions of xattr macros
* ext4: remove quota allocation when ext4_mb_new_blocks fails
* ext4: return error when calling ext4_ext_split failed
* ext4: Make ext4_ext_find_extent fills ext_path completely
* ext4: Fix ext4_ext_journal_restart() to reflect errors up to the caller

Theodore Ts'o (5):
* ext4: Rename read_block_bitmap() to ext4_read_block_bitmap()
* ext4: Remove unused variable from ext4_show_options
* jbd2: Add commit time into the commit block
* ext4: Fix lock inversion in ext4_ext_truncate()
ext4: Stable/Unstable boundary



2008-07-05 17:50:47

by Christoph Hellwig

Subject: Re: New ext4 patchset 2.6.26-rc8-ext4-1

On Sat, Jul 05, 2008 at 01:19:04PM -0400, Theodore Ts'o wrote:
> Alex Tomas (2):
> * vfs: add basic delayed allocation support
> * ext4: Add basic delayed allocation support

Strong NACK. For one thing the code added to mpage.c doesn't belong
there. It's far inferior to the existing delalloc code we already have
and that could be made generic easily, or the next generation code
developed by Chris Mason. It's an ext4-specific hack and doesn't belong
in common code. I'm pretty sure we agreed on not having it in
common code long ago.

Also the code still deals with the !buffer_mapped and no buffers on page
cases all over which isn't needed anymore with ->page_mkwrite implemented.
Similarly the !get_block case in mpage_da_writepages doesn't make any
sense - it's never used and if people would want to use
generic_writepages they could trivially just call it directly.

And please fix up the indentation of the new buffer_delay checks in
fs/buffer.c, the && belongs on the end of the previous line, and the
second line of the conditional should not be indented the same amount
as the code inside the conditional block.
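For illustration, the layout being asked for looks roughly like the sketch below. This is a standalone userspace example, not the fs/buffer.c code: struct style_bh and style_needs_alloc() are hypothetical stand-ins for the kernel's buffer_head and its flag helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head; not the kernel type. */
struct style_bh {
	bool mapped;
	bool delay;
	bool dirty;
};

/* Returns 1 when the buffer still needs a block allocated, else 0. */
int style_needs_alloc(const struct style_bh *bh)
{
	assert(bh != NULL);

	/*
	 * The && ends the first line, and the continuation line is
	 * aligned under the condition, so it cannot be mistaken for
	 * the first statement of the block body.
	 */
	if ((!bh->mapped || bh->delay) &&
	    bh->dirty) {
		return 1;
	}
	return 0;
}
```

The continuation line uses one tab plus four spaces, which lines it up under the opening of the condition and keeps it visually distinct from the two-tab body.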

2008-07-06 02:41:15

by Theodore Ts'o

Subject: Re: New ext4 patchset 2.6.26-rc8-ext4-1

On Sat, Jul 05, 2008 at 01:50:47PM -0400, Christoph Hellwig wrote:
> Strong NACK. For one thing the code added to mpage.c doesn't belong
> there. It's far inferior to the existing delalloc code we already have
> and that could be made generic easily, or the next generation code
> developed by Chris Mason. It's an ext4-specific hack and doesn't belong
> in common code. I'm pretty sure we agreed on not having it in
> common code long ago.

Looking back at the mailing list history, yes, we did. I think the
reason why we didn't was Andrew had expressed a vague wish that the
ext4 and xfs developers would get together and try to hash out a
common layer that all filesystems could work with. But given that
nothing has happened in the many months since that discussion, you're
absolutely right.

I've reworked the vfs generic patch in the patch series to be much
smaller --- mainly what's left is simply exporting some mpage
interfaces that would be needed by ext4. I trust this is much more to
your liking?

- Ted

commit 2962918ce8fa666bdab5b189cc4ed4b3d08a33f8
Author: Alex Tomas <[email protected]>
Date: Sat Jul 5 21:25:20 2008 -0400

vfs: add hooks for ext4's delayed allocation support

Export mpage_bio_submit() and __mpage_writepage() for the benefit of
ext4's delayed allocation support. Also change __block_write_full_page
so that if a buffer has the BH_Delay flag set, it will call
get_block() to get the physical block allocated, just as in the
!BH_Mapped case.

Signed-off-by: Alex Tomas <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>

diff --git a/fs/buffer.c b/fs/buffer.c
index a413008..d2541a0 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1691,11 +1691,13 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
*/
clear_buffer_dirty(bh);
set_buffer_uptodate(bh);
- } else if (!buffer_mapped(bh) && buffer_dirty(bh)) {
+ } else if ((!buffer_mapped(bh) || buffer_delay(bh)) &&
+ buffer_dirty(bh)) {
WARN_ON(bh->b_size != blocksize);
err = get_block(inode, block, bh, 1);
if (err)
goto recover;
+ clear_buffer_delay(bh);
if (buffer_new(bh)) {
/* blockdev mappings never come here */
clear_buffer_new(bh);
@@ -1774,7 +1776,8 @@ recover:
bh = head;
/* Recovery: lock and submit the mapped buffers */
do {
- if (buffer_mapped(bh) && buffer_dirty(bh)) {
+ if (buffer_mapped(bh) && buffer_dirty(bh) &&
+ !buffer_delay(bh)) {
lock_buffer(bh);
mark_buffer_async_write(bh);
} else {
diff --git a/fs/mpage.c b/fs/mpage.c
index 235e4d3..dbcc7af 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -82,7 +82,7 @@ static void mpage_end_io_write(struct bio *bio, int err)
bio_put(bio);
}

-static struct bio *mpage_bio_submit(int rw, struct bio *bio)
+struct bio *mpage_bio_submit(int rw, struct bio *bio)
{
bio->bi_end_io = mpage_end_io_read;
if (rw == WRITE)
@@ -90,6 +90,7 @@ static struct bio *mpage_bio_submit(int rw, struct bio *bio)
submit_bio(rw, bio);
return NULL;
}
+EXPORT_SYMBOL(mpage_bio_submit);

static struct bio *
mpage_alloc(struct block_device *bdev,
@@ -435,15 +436,9 @@ EXPORT_SYMBOL(mpage_readpage);
* written, so it can intelligently allocate a suitably-sized BIO. For now,
* just allocate full-size (16-page) BIOs.
*/
-struct mpage_data {
- struct bio *bio;
- sector_t last_block_in_bio;
- get_block_t *get_block;
- unsigned use_writepage;
-};

-static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
- void *data)
+int __mpage_writepage(struct page *page, struct writeback_control *wbc,
+ void *data)
{
struct mpage_data *mpd = data;
struct bio *bio = mpd->bio;
@@ -651,6 +646,7 @@ out:
mpd->bio = bio;
return ret;
}
+EXPORT_SYMBOL(__mpage_writepage);

/**
* mpage_writepages - walk the list of dirty pages of the given address space & writepage() all of them
diff --git a/include/linux/mpage.h b/include/linux/mpage.h
index 068a0c9..5c42821 100644
--- a/include/linux/mpage.h
+++ b/include/linux/mpage.h
@@ -11,11 +11,21 @@
*/
#ifdef CONFIG_BLOCK

+struct mpage_data {
+ struct bio *bio;
+ sector_t last_block_in_bio;
+ get_block_t *get_block;
+ unsigned use_writepage;
+};
+
struct writeback_control;

+struct bio *mpage_bio_submit(int rw, struct bio *bio);
int mpage_readpages(struct address_space *mapping, struct list_head *pages,
unsigned nr_pages, get_block_t get_block);
int mpage_readpage(struct page *page, get_block_t get_block);
+int __mpage_writepage(struct page *page, struct writeback_control *wbc,
+ void *data);
int mpage_writepages(struct address_space *mapping,
struct writeback_control *wbc, get_block_t get_block);
int mpage_writepage(struct page *page, get_block_t *get_block,
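The effect of the fs/buffer.c hunk can be sketched in userspace as follows. This is only a toy model: struct toy_bh and the toy_* functions are hypothetical, and the model assumes that a delayed-allocation buffer may already carry the mapped flag, which is why the old test would skip it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy buffer state; fields loosely mirror the BH_* flags (hypothetical). */
struct toy_bh {
	bool mapped;	/* has a block number assigned */
	bool delay;	/* allocation was deferred (delalloc) */
	bool dirty;	/* holds data that must reach disk */
};

/* Condition before the patch: only unmapped dirty buffers get a block. */
bool toy_needs_get_block_old(const struct toy_bh *bh)
{
	assert(bh != NULL);
	return !bh->mapped && bh->dirty;
}

/* Condition after the patch: delayed dirty buffers qualify as well. */
bool toy_needs_get_block_new(const struct toy_bh *bh)
{
	assert(bh != NULL);
	return (!bh->mapped || bh->delay) &&
	       bh->dirty;
}

/*
 * Toy write-out step: "allocate" by marking the buffer mapped and
 * clearing the delay flag, mirroring the clear_buffer_delay() call
 * the patch adds after get_block() succeeds.
 * Returns true if an allocation happened.
 */
bool toy_write_out(struct toy_bh *bh)
{
	if (!toy_needs_get_block_new(bh))
		return false;
	bh->mapped = true;
	bh->delay = false;
	return true;
}
```

In this model a buffer that is mapped, delayed and dirty is ignored by the old condition but allocated by the new one, and a second write-out pass does not allocate it again because the delay flag has been cleared.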

2008-07-06 09:58:24

by Christoph Hellwig

Subject: Re: New ext4 patchset 2.6.26-rc8-ext4-1

On Sat, Jul 05, 2008 at 10:41:15PM -0400, Theodore Tso wrote:
> On Sat, Jul 05, 2008 at 01:50:47PM -0400, Christoph Hellwig wrote:
> > Strong NACK. For one thing the code added to mpage.c doesn't belong
> > there. It's far inferior to the existing delalloc code we already have
> > and that could be made generic easily, or the next generation code
> > developed by Chris Mason. It's an ext4-specific hack and doesn't belong
> > in common code. I'm pretty sure we agreed on not having it in
> > common code long ago.
>
> Looking back at the mailing list history, yes, we did. I think the
> reason why we didn't was Andrew had expressed a vague wish that the
> ext4 and xfs developers would get together and try to hash out a
> common layer that all filesystems could work with. But given that
> nothing has happened in the many months since that discussion, you're
> absolutely right.
>
> I've reworked the vfs generic patch in the patch series to be much
> smaller --- mainly what's left is simply exporting some mpage
> interfaces that would be needed by ext4. I trust this is much more to
> your liking?

Yes, this is much better. But please also add kerneldoc comments for
the newly exported functions.
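As a rough guide to the comment layout being requested, here is a kerneldoc-style comment in the form described by the kernel's current kernel-doc documentation. The function toy_count_dirty() is hypothetical and exists only to carry the comment; it is not one of the functions exported by the patch.

```c
#include <assert.h>
#include <stddef.h>

/**
 * toy_count_dirty() - Count the dirty entries in a flag array.
 * @flags: Array of per-buffer dirty flags (nonzero means dirty).
 * @nr: Number of entries in @flags.
 *
 * Hypothetical helper used only to illustrate the comment layout:
 * a one-line summary, one line per parameter, a free-form
 * description, and a Return section.
 *
 * Return: The number of nonzero entries in @flags.
 */
size_t toy_count_dirty(const int *flags, size_t nr)
{
	size_t i, count = 0;

	assert(flags != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		if (flags[i])
			count++;
	return count;
}
```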


2008-07-08 19:53:54

by Jan Kara

Subject: Re: New ext4 patchset 2.6.26-rc8-ext4-1

> On Sat, Jul 05, 2008 at 01:19:04PM -0400, Theodore Ts'o wrote:
> > Alex Tomas (2):
> > * vfs: add basic delayed allocation support
> > * ext4: Add basic delayed allocation support
>
> Strong NACK. For one thing the code added to mpage.c doesn't belong
> there. It's far inferior to the existing delalloc code we already have
> and that could be made generic easily, or the next generation code
> developed by Chris Mason. It's an ext4-specific hack and doesn't belong
> in common code. I'm pretty sure we agreed on not having it in
> common code long ago.
>
> Also the code still deals with the !buffer_mapped and no buffers on page
> cases all over which isn't needed anymore with ->page_mkwrite implemented.

I'd just comment on this: we've found experimentally that a page
without buffers *can* happen even with page_mkwrite() implemented.
One path I remember we identified as a possible cause is do_wp_page(),
where buffers can be removed from the page again before it is marked
dirty. So if a filesystem wants to be sure that buffers are really
attached to the page, I think it must mark the page (and through it the
buffers) dirty before unlocking it...
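
The ordering argument can be sketched with a toy userspace model. Everything here is hypothetical: struct toy_page and the toy_* functions are not kernel interfaces, and the model simply assumes that buffers may be stripped from any page that is neither locked nor dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page state; all names here are hypothetical. */
struct toy_page {
	bool locked;
	bool dirty;
	bool has_buffers;
};

/*
 * Toy reclaim step. Assumption of this model: buffers may be
 * stripped from a page that is neither locked nor dirty.
 * Returns true if buffers were removed.
 */
bool toy_try_strip_buffers(struct toy_page *p)
{
	assert(p != NULL);
	if (p->locked || p->dirty || !p->has_buffers)
		return false;
	p->has_buffers = false;
	return true;
}

/* mkwrite variant that attaches buffers but leaves dirtying for later. */
void toy_mkwrite_clean(struct toy_page *p)
{
	assert(p != NULL);
	p->locked = true;
	p->has_buffers = true;
	p->locked = false;	/* window opens: page is clean and unlocked */
}

/* mkwrite variant that dirties the page before unlocking it. */
void toy_mkwrite_dirty(struct toy_page *p)
{
	assert(p != NULL);
	p->locked = true;
	p->has_buffers = true;
	p->dirty = true;	/* dirtied while the page is still locked */
	p->locked = false;
}
```

In this model, a reclaim step that runs after toy_mkwrite_clean() removes the buffers again, while after toy_mkwrite_dirty() it leaves them attached.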

> Similarly the !get_block case in mpage_da_writepages doesn't make any
> sense - it's never used and if people would want to use
> generic_writepages they could trivially just call it directly.

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs