2007-04-23 23:35:31

by Avantika Mathur

Subject: Ext4 devel interlock meeting minutes (April 23, 2007)

Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes

Attendees: Mingming Cao, Dave Kleikamp, Avantika Mathur, Ted Ts'o,
Suparna Bhattacharya, Jean-Pierre Dion, Jean Noel Cordenner,
Valérie Clément, Jose Santos

Minutes can be accessed at:
http://ext4.wiki.kernel.org/index.php/Ext4_Developer%27s_Conference_Call

- Mingming proposed moving back to the 8am PST meeting time, since the
6am time is inconvenient for a few people. This discussion will be
continued through email, to find a time which works for everyone.

- Next week's meeting will be canceled, unless there is anyone who would
like to request a meeting.

PATCH STATUS

git-tree
- Mingming will be updating the git tree with the extents-fix patches
from Alex, the i_flags patch from Honza, and the i_extra_isize patch
from Kalpak.

Uninitialized Block Groups:
- The patch sent out by Andreas is against 2.6.16 and ext3. Need to
port this to current ext4, test and then add to git-tree. Avantika will
ask Andreas if he needs help with this.

JBD statistics:
- There is a patch to export JBD statistics to /proc. In order to get
this patch into mainline, there needs to be discussion about the correct
place for the statistics: /proc or perhaps debugfs.

e2fsprogs:
- Ted will post the current e2fsprogs patches in progress. Ted has been
working with these patches and making changes.
- Main work areas for making e2fsprogs compatible with extents and 64-bit:
  - block iterator: make a block iterator work with both extent and
    non-extent code. Code that is oblivious to extents will still work
    with the block iterator. This has been written by Andreas Dilger.
  - extents: in order to preserve ABI compatibility, add a new
    interface for extents which uses 64-bit logical and physical block
    numbers. The block iterator then translates from the on-disk to the
    in-memory format. This will allow for possible future increases of
    physical and logical block sizes in extents, without breaking the ABI.
  - bitmaps in e2fsprogs: this will be discussed in more detail at the
    next meeting, after people have had a chance to read the related email.

preallocation:
- fallocate syscall interface: the current plan, based on discussions
on the mailing list, is to create a separate wrapper for s390 in glibc.
All other architectures will use the regular parameter ordering, with a
different order on s390. Jakub Jelinek has said that the changes in
glibc can be made pretty easily.
- The preallocation patches in the ext4 git-tree are outdated, using
the ioctl interface. Once Amit re-posts the patches with the syscall
interface, they will be updated in the git-tree as well.
- Mingming mentioned the need to flush preallocation metadata changes to
disk if file size or file content is being tested. Discussed doing an
fsync at Bmap time.

TESTING
- extents testing
  - Discussed methods for testing extents on highly fragmented
    filesystems.
  - Jose will look into possible tests, including perhaps using the
    'aged' option in FFSB.
  - Ted suggested creating a mount option that enables a deliberately
    bad block allocator, which jumps to a new block group every 8
    blocks. This would force a very large number of extents, and may be
    a good test for extents.
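
As a rough illustration of why such an allocator would stress the extent
code, the arithmetic below estimates how many extents and leaf blocks a
file would need if it were broken up every 8 blocks. It is only a sketch:
it assumes 4 KiB blocks and the 12-byte header and 12-byte entries of the
ext4 on-disk extent format, and the function name is made up here.

```shell
# Estimate extent-tree size for a file fragmented every RUN blocks.
# Assumptions: 4 KiB filesystem blocks, 12-byte extent header and
# 12-byte extent entries (so 340 entries fit in one leaf block).
extent_estimate() {
    file_bytes=$1
    run=${2:-8}                     # blocks per extent (8, per the suggestion)
    block=4096
    blocks=$(( (file_bytes + block - 1) / block ))
    extents=$(( (blocks + run - 1) / run ))
    per_leaf=$(( (block - 12) / 12 ))
    leaves=$(( (extents + per_leaf - 1) / per_leaf ))
    echo "$blocks blocks, $extents extents, $leaves leaf blocks"
}

extent_estimate $(( 1 << 30 ))      # a 1 GiB file
```

Under these assumptions a 1 GiB file needs 32768 extents spread over 97
leaf blocks, far more than the handful that fit directly in the inode.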

- large filesystem
  - We would like to perform more testing on large (>16TB) filesystems.
  - Currently hardware limitations are preventing this testing. We
    have tested 10TB RAID disks and 16TB loopback devices. Avantika will
    look into creating very large sparse devices for testing.
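
For reference, 16TB is the point where 32-bit block numbers run out with
4 KiB blocks. The check below is a small sketch that assumes a shell with
64-bit integer arithmetic; the function name is hypothetical.

```shell
# Largest filesystem (in TiB) addressable with 32-bit block numbers
max_fs_tib() {
    block_size=${1:-4096}
    echo $(( (1 << 32) * block_size / (1 << 40) ))
}

max_fs_tib          # 4 KiB blocks
```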

- Large file deletion
  - Valerie had recently tested large file deletion on ext3/4, but did
    not see the performance gain expected with ext4 from the compact
    metadata when using extents.
  - Valerie will try re-running the test. Jose will also be looking
    into this test.


2007-04-24 06:28:11

by Alex Tomas

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

Avantika Mathur wrote:
> TESTING
> - extents testing
> - Discussed methods for testing extents on highly fragmented
> filesystems.
> - Jose will look into possible tests, including perhaps using the
> 'aged' option in FFSB
> - Ted suggested creating a mountoption that creates a bad block
> allocator which it jumps to a new block group every 8 blocks. This
> would force a very large number of extents, and may be a good test for
> extents.

There is an AGGRESSIVE_TEST define which limits the number of entries in index/leaf.

> - Large file deletion
> - Valerie had recently tested large file deletion on ext3/4, but did
> not see the expected performance gain with ext4 due to compact metadata
> when using extents.

any details?

thanks, Alex

2007-04-24 14:05:03

by Valerie Clement

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

Alex Tomas wrote:

>> - Large file deletion
>> - Valerie had recently tested large file deletion on ext3/4, but
>> did not see the expected performance gain with ext4 due to compact
>> metadata when using extents.
>
> any details?
>

Ok, I found my mistake. There was a typo in my test script and the
pagecache was not flushed between the file creation and the deletion.

Here are the results I obtain with a 2.6.17-rc7 kernel to delete a 100GB
file:

ext3 : real 2m35.048s user 0m0.000s sys 0m6.424s
ext4 : real 0m11.160s user 0m0.000s sys 0m5.532s
xfs : real 0m0.377s user 0m0.004s sys 0m0.004s

The performance gain with ext4 is much larger when running a good test...
Sorry for the wrong information,
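
A minimal sketch of a deletion test that does flush the cache between
creating and deleting the file. This is not the actual test script, which
was not posted. The function name, the file name and the use of
/proc/sys/vm/drop_caches are assumptions, and dropping caches needs root.

```shell
# Time the deletion of a large file with a cold page cache.
# Usage: time_delete DIR [SIZE_MB]   (DIR must be on the filesystem under test)
time_delete() {
    dir=$1
    size_mb=${2:-1024}
    f="$dir/bigfile.$$"

    dd if=/dev/zero of="$f" bs=1M count="$size_mb" 2>/dev/null || return 1
    sync                                  # write the dirty data out first

    # Drop page cache, dentries and inodes so the delete has to read the
    # metadata back from disk. This needs root; warn if it fails.
    if ! { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null; then
        echo "warning: could not drop caches, timing is warm-cache" >&2
    fi

    start=$(date +%s)
    rm "$f"
    sync
    end=$(date +%s)
    echo "delete took $(( end - start ))s"
}
```

The timing here is only accurate to a second; running "time rm" by hand,
as in the results above, gives the real/user/sys breakdown.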

Valérie

2007-04-24 14:21:21

by Alex Tomas

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

Valerie Clement wrote:
> Here are the results I obtain with a 2.6.17-rc7 kernel to delete a 100GB
> file:
>
> ext3 : real 2m35.048s user 0m0.000s sys 0m6.424s
> ext4 : real 0m11.160s user 0m0.000s sys 0m5.532s
> xfs : real 0m0.377s user 0m0.004s sys 0m0.004s

would be very interesting to know how much IO was done to remove the file
and actual fragmentation in all the cases.
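
One possible way to collect both numbers, sketched under assumptions: the
extent count is parsed from the summary line that filefrag (from
e2fsprogs) prints, and the IO is read from the sectors-read and
sectors-written columns of /proc/diskstats for a whole disk. The function
names and the device in the usage comment are hypothetical.

```shell
# Extract the extent count from filefrag output read on stdin,
# e.g. "bigfile: 812 extents found" -> 812
extent_count() {
    sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found.*/\1/p'
}

# Print "sectors_read sectors_written" for a block device, taken from
# fields 6 and 10 of /proc/diskstats. A second argument overrides the
# file to read, which is handy for trying this on saved output.
disk_sectors() {
    awk -v dev="$1" '$3 == dev { print $6, $10 }' "${2:-/proc/diskstats}"
}

# Example (hypothetical device and path):
#   filefrag /mnt/test/bigfile | extent_count
#   before=$(disk_sectors sdb)
#   rm /mnt/test/bigfile; sync
#   after=$(disk_sectors sdb)
```

The difference between the two samples, taken on an otherwise idle disk,
approximates the IO caused by the delete.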

thanks, Alex

2007-04-24 14:33:23

by Eric Sandeen

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

Avantika Mathur wrote:
> - large filesystem
> - We would like to perform more testing on large (>16TB) filesystems
> - currently hardware limitations are preventing this testing. We
> have tested 10TB raid dists, and 16TB loopback devices. Avantika will
> look into creating very large sparse devices for testing.

I've been hacking up some ext3@16T testing scripts to use sparse
devicemapper devices which make use of snapshots... loopback files don't
work for testing, at least not hosted on ext[234], because we still
can't do these large file offsets.

(Documentation/device-mapper/zero.txt in the kernel tree describes these
sparse dm devices)

Testing the whole range as a sparse snapshot can be slow, since
devicemapper has to do all the exception handling etc, and I think
essentially creates a fragmented block device.

I've been playing with something like this:

# 90% of the real device size is used for a "real" 1:1 mapping
# The other 10% is sparsely mapped out to add up to totalsize.
# i.e. -

# [large sparse-ish device]
#
# +----------------------~ ~------------------+-----------------------+
# |                  sparse                   |         real          |
# +----------------------~ ~------------------+-----------------------+
#
# |<------------ SPARSE_SIZE ---------------->|<----- REAL_SIZE ----->|

# is mapped on top of:

# [real block device]
# +----+-----------------------+
# | sp |         real          |
# +----+-----------------------+

and then marking the sparse range as full (maybe via lazy_bg, or other
methods). You could then also put a dm-error target under the "full"
sections so that any IO that may stray there will fail.

This way you can direct the real IO to the 1:1 mapping portion of the
large dm device, and shouldn't get the snapshot slowdowns.
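
A minimal sketch of the tables such a layout might use, under stated
assumptions: the sparse part is a snapshot of a zero target, as in
zero.txt, and the result is stitched together with two linear segments.
The function only prints the tables and does not call dmsetup. The
function name, the device names (zero0, sparse0, big0) and the chunk size
of 128 sectors are placeholders, not taken from the actual scripts.

```shell
# Print "name: table-line" for each device-mapper table in the layout.
# All sizes are in 512-byte sectors.
#   $1 size of the sparse (snapshot-backed) front part
#   $2 size of the 1:1 mapped tail
#   $3 small device holding the snapshot exceptions (the "sp" part)
#   $4 device backing the 1:1 mapped tail (the "real" part)
sparse_tables() {
    sparse=$1
    real=$2
    cow=$3
    realdev=$4
    echo "zero0: 0 $sparse zero"
    echo "sparse0: 0 $sparse snapshot /dev/mapper/zero0 $cow p 128"
    echo "big0: 0 $sparse linear /dev/mapper/sparse0 0"
    echo "big0: $sparse $real linear $realdev 0"
}

# Example with made-up sizes and partitions:
sparse_tables 32212254720 1887436800 /dev/sdX1 /dev/sdX2
```

Each table would then be loaded with "dmsetup create NAME", with the two
big0 lines passed together as one table.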

Anyway, just something I've been playing with...

-eric

2007-04-24 14:52:38

by Valerie Clement

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

Alex Tomas wrote:
> Valerie Clement wrote:
>> Here are the results I obtain with a 2.6.17-rc7 kernel to delete a
>> 100GB file:
>>
>> ext3 : real 2m35.048s user 0m0.000s sys 0m6.424s
>> ext4 : real 0m11.160s user 0m0.000s sys 0m5.532s
>> xfs : real 0m0.377s user 0m0.004s sys 0m0.004s
>
> would be very interesting to know how much IO was done to remove the file
> and actual fragmentation in all the cases.
>
> thanks, Alex
>
Ok, I will do it.

Valérie

2007-04-30 11:06:14

by Aneesh Kumar K.V

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

On 4/24/07, Avantika Mathur wrote:
> Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
>
> TESTING
> - extents testing
> - Discussed methods for testing extents on highly fragmented
> filesystems.
> - Jose will look into possible tests, including perhaps using the
> 'aged' option in FFSB
> - Ted suggested creating a mountoption that creates a bad block
> allocator which it jumps to a new block group every 8 blocks. This
> would force a very large number of extents, and may be a good test for
> extents.


What I am doing to create a large number of extents is:

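# write 10 blocks, then keep writing 10 blocks at every 20th block,
# leaving a gap of 10 dd blocks after each write (runs until interrupted)
dd if=/dev/zero of=myfile count=10
seek=20
while true; do
    dd if=/dev/zero of=myfile count=10 seek=$seek
    seek=$((seek + 20))
done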
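
A bounded variant of the same idea, as a sketch only: it stops after a
given number of chunks, and it uses bs=4k on the assumption that the
filesystem block size is 4 KiB, so every hole covers whole blocks and each
chunk needs an extent of its own. The function name is made up.

```shell
# Write N separate 10-block chunks, each followed by a 10-block hole.
# conv=notrunc keeps dd from truncating the file on each pass.
make_holey_file() {
    file=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        dd if=/dev/zero of="$file" bs=4k count=10 seek=$(( i * 20 )) \
           conv=notrunc 2>/dev/null || return 1
        i=$(( i + 1 ))
    done
}

# e.g.  make_holey_file myfile 100000
```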


-aneesh

2007-04-30 11:15:08

by Alex Tomas

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

Aneesh Kumar wrote:
> What i am doing for creating a large number of extents is
>
> dd if=/dev/zero of=myfile count=10
> seek=20
> while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
> seek=`expr $seek + 20`; done

With AGGRESSIVE_TEST defined in include/linux/ext4_fs_extents.h you may
get many more extents and index blocks.

2007-05-01 12:06:05

by Kalpak Shah

Subject: Re: Ext4 devel interlock meeting minutes (April 23, 2007)

On Mon, 2007-04-30 at 16:36 +0530, Aneesh Kumar wrote:
> On 4/24/07, Avantika Mathur wrote:
> > Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
> >
> > TESTING
> > - extents testing
> > - Discussed methods for testing extents on highly fragmented
> > filesystems.
> > - Jose will look into possible tests, including perhaps using the
> > 'aged' option in FFSB
> > - Ted suggested creating a mountoption that creates a bad block
> > allocator which it jumps to a new block group every 8 blocks. This
> > would force a very large number of extents, and may be a good test for
> > extents.
>
>
> What i am doing for creating a large number of extents is
>
> dd if=/dev/zero of=myfile count=10
> seek=20
> while [ 1 ]; do dd if=/dev/zero of=myfile count=10 seek=$seek;
> seek=`expr $seek + 20`; done
>
>

I had written a simple tool "bitmap_manip" with which you can actually
manipulate the number of free chunks and their sizes in a filesystem. It
uses libext2fs to set the bits in block bitmaps thereby leaving the
desired free extents. I had written it to test the allocator's
performance.

It can be used as:
./bitmap_manip /dev/sda9 1MA 4 16K 1 12K 3 8K 4 4K 6

This will leave only one 16K chunk, three 12K chunks, and so on free in
the filesystem. "1MA 4" will give us four free, ALIGNED 1MB chunks.
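
To check how much free space a given argument list leaves, the pairs can
be summed as below. This is a sketch that assumes the syntax is exactly
what the example shows: SIZE COUNT pairs, sizes ending in K or M, and a
trailing A marking aligned chunks. The helper name is hypothetical and is
not part of bitmap_manip.

```shell
# Sum the free space described by SIZE COUNT pairs, in KiB.
free_kib() {
    total=0
    while [ $# -ge 2 ]; do
        size=${1%A}                         # drop the "aligned" marker
        case $size in
            *K) kib=${size%K} ;;
            *M) kib=$(( ${size%M} * 1024 )) ;;
            *)  echo "bad size: $1" >&2; return 1 ;;
        esac
        total=$(( total + kib * $2 ))
        shift 2
    done
    echo "$total"
}

free_kib 1MA 4 16K 1 12K 3 8K 4 4K 6
```

For the command line above this prints 4204, so a little over 4MB would
be left free.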

It isn't very beautiful code since it was only used for testing but
maybe it can help.

Thanks,
Kalpak.

> -aneesh


Attachments:
bitmap_manip.c (4.96 kB)