2007-03-01 19:43:14

by Avantika Mathur

Subject: Ext4 devel interlock meeting minutes (Feb. 28, 2007)

Ext4 Developer Interlock Call: 02/28/2007 Meeting Minutes

Attendees: Mingming Cao, Suparna Bhattacharya, Dave Kleikamp, Eric
Sandeen, Takashi Sato, Avantika Mathur
Minutes can be accessed at:

Mingming sent out minutes from the Ext4 filesystem and storage workshop
which took place two weeks ago, and will be posting these on the ext4
wiki as well. Mingming gave a talk and led a BOF on ext4 at the summit
- feel free to update or add comments to these minutes.

- One thing that was not discussed at the conference is the overall
future plan for the Ext4 filesystem. Many people believe that Ext4 is
a new filesystem that will include many of the features that newer
filesystems have, including greater scalability. But such additions may
need massive changes and rewrites. Our question is: how long do we plan
to continue to support backwards compatibility?


Inode Versioning:
- Need to implement the high 32 bits for the i_version field. Andreas is
looking at adding the new field in i_extra_isize.
- The 64 bit i_version would therefore only be available in ext4; and we
would add the 32 bit patch to ext3. Need to verify with NFS that this
would be ok for them.
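
A minimal sketch of how a 64 bit version could be assembled from two
32 bit halves, with the high half used only where the extra inode space
is available. The struct, field, and function names here are
hypothetical and are not taken from any patch discussed on the call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical on-disk pieces; names are illustrative only. */
struct iversion_disk {
    uint32_t lo;  /* existing 32 bit field (ext3 and ext4) */
    uint32_t hi;  /* assumed new field in the extra inode space (ext4 only) */
};

/* Build the in-memory version; ignore the high half when it is absent. */
static uint64_t iversion_load(const struct iversion_disk *d, int has_hi)
{
    uint64_t v = d->lo;

    if (has_hi)
        v |= (uint64_t)d->hi << 32;
    return v;
}

/* Split the in-memory version back into its on-disk halves. */
static void iversion_store(struct iversion_disk *d, uint64_t v, int has_hi)
{
    d->lo = (uint32_t)v;
    if (has_hi)
        d->hi = (uint32_t)(v >> 32);
}
```

With has_hi set to 0 the counter behaves as a plain 32 bit value, which
is the ext3 case described above.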

Nanosecond Timestamps:
- Kalpak has resent the patches.
- CPU usage is a concern. Ted had suggested masking off different levels
of granularity and testing performance at each level.
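
One way to read "masking off different levels of granularity" is to
round the nanosecond part of a timestamp down to a coarser unit before
storing it. The helper below is a sketch under that assumption; the
function name and the choice of rounding down are not from the minutes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a nanosecond value down to a multiple of granularity_ns.
 * granularity_ns = 1 keeps full precision, 1000 keeps microseconds,
 * 1000000 keeps milliseconds, 1000000000 drops the sub-second part.
 */
static uint32_t mask_nsec(uint32_t nsec, uint32_t granularity_ns)
{
    assert(granularity_ns > 0);
    return nsec - (nsec % granularity_ns);
}
```

A performance test could then be repeated with each granularity value
and the CPU usage compared.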

- akpm suggested that we create and implement a system call for
fallocate. Amit Arora is working on a simple patch which implements the
system call for the i386 architecture.
- The main concern is the need to add an inode operation at the VFS layer.
There are mixed responses about whether we should add a system call for
preallocation. hch suggested we add a cmd parameter to the fallocate
system call to do preallocate, unprealloc, reserve, unreserve etc.
-- Mingming thinks it would be good to use this syscall
for reservation as well. The current interface to reservation is an ioctl.
- Before continuing development on the system call, it is a good idea to
discuss implementation details on lkml and linux-fsdevel.
-- Eric will send an email to linux-ext4 before extending the
discussion to other lists.
-- Mingming will ask Amit to resend patches and follow up with this.
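
To make the cmd parameter idea concrete, here is a toy user-space model
of a single entry point that dispatches on a command. Everything in it
is hypothetical: the enum values, the toy_file struct, and the function
name are invented for illustration and do not describe the interface in
Amit's patch or any interface agreed on the lists.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command values; the minutes only name the operations. */
enum falloc_cmd {
    FA_PREALLOCATE,
    FA_UNPREALLOCATE,
    FA_RESERVE,
    FA_UNRESERVE,
};

/* Toy per-file state standing in for real filesystem bookkeeping. */
struct toy_file {
    uint64_t prealloc_bytes;
    uint64_t reserved_bytes;
};

/* Returns 0 on success, -1 for an unknown command or an over-release. */
static int toy_fallocate(struct toy_file *f, enum falloc_cmd cmd, uint64_t len)
{
    switch (cmd) {
    case FA_PREALLOCATE:
        f->prealloc_bytes += len;
        return 0;
    case FA_UNPREALLOCATE:
        if (len > f->prealloc_bytes)
            return -1;
        f->prealloc_bytes -= len;
        return 0;
    case FA_RESERVE:
        f->reserved_bytes += len;
        return 0;
    case FA_UNRESERVE:
        if (len > f->reserved_bytes)
            return -1;
        f->reserved_bytes -= len;
        return 0;
    }
    return -1; /* command value outside the enum */
}
```

The point of the single entry point is that reservation, which
currently goes through an ioctl, could share the same call as
preallocation.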

Online Defragmentation:
- Takashi tested his online defrag patches and found a problem that he
is currently looking into.
- After fixing the problem he will upgrade and repost his patches.
- Need Alex's update on his mballoc patch as this online defrag patch is
currently depending on it.
-- Could we try to use preallocation in online defragmentation?
- In the filesystem workshop there was discussion on how locking works
if the file being defragmented is in use.
-- There were suggestions to do defragmentation at directory level as
-- Use page cache rather than O_DIRECT to avoid complexity.

e2fsprogs Changes:
- Ted has planned to support 64 bit block number and extents in e2fsprogs.
- This will require many changes and rewrites. We will ask Ted about
current status and distributing work items.

Migration Tool:
- Suparna and Mingming are working with Aneesh Veetil to create a tool
to migrate from regular files to extent files, and from 128 to 256 byte

- Andrew Morton had posted asking for help in testing positive return
value from prepare_write. Shaggy and Suparna will look into this.

- Mapped I/O with preallocation
-- David Chinner has discussed an issue with performing mapped IO with
unwritten extents in XFS.
-- Mapped I/O can read, write, and initialize unwritten extents without
notifying the underlying filesystem. The extent is therefore never
reflagged as initialized, and after the data is written to
disk the extent is still flagged as unwritten. If the filesystem is
remounted, reading would return zeros.
-- This problem should only apply to a cold cache. If the cache is in
use, the data would be retrieved from cache.
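
The sequence above can be reproduced with a toy model of one extent and
its page-cache copy. This is only a simulation of the described
behaviour; the struct and function names are invented, and it does not
reflect XFS or ext4 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EXT_SIZE 8

struct toy_extent {
    bool unwritten;        /* on-disk flag: extent holds no valid data yet */
    char disk[EXT_SIZE];   /* bytes actually on disk */
    bool cached;           /* is there a page-cache copy? */
    char cache[EXT_SIZE];  /* page-cache copy */
};

/* Read: serve from cache if warm; a cold read of an unwritten extent gives zeros. */
static void toy_read(struct toy_extent *e, char *out)
{
    if (!e->cached) {
        if (e->unwritten)
            memset(e->cache, 0, EXT_SIZE);
        else
            memcpy(e->cache, e->disk, EXT_SIZE);
        e->cached = true;
    }
    memcpy(out, e->cache, EXT_SIZE);
}

/* Mapped write: dirties the cached page; the filesystem is not told. */
static void toy_mmap_write(struct toy_extent *e, const char *data)
{
    memcpy(e->cache, data, EXT_SIZE);
    e->cached = true;
}

/* Writeback: data reaches disk, but the unwritten flag is left set. */
static void toy_writeback(struct toy_extent *e)
{
    if (e->cached)
        memcpy(e->disk, e->cache, EXT_SIZE);
}

/* Remount: the page cache is dropped, so the next read is cold. */
static void toy_remount(struct toy_extent *e)
{
    e->cached = false;
}
```

In this model a read right after the mapped write sees the new data,
while a read after toy_remount() returns zeros even though the data is
on disk, because the extent is still flagged as unwritten.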

- Mingming and Eric discussed a different method of implementing
preallocation, proposed by Arjan:
-- When you want to reserve or preallocate 1000 blocks, reduce the
superblock counter by 1000 and add 1000 to the inode counter. As more
writes are performed, the inode would decrement its allocated
blocks counter.
-- This could possibly be integrated with the current ext4
reservation. The reservation window would know that there are allocated
but unwritten blocks in memory, only accessible when blocks have been
-- But using the current reservation, contiguous preallocated blocks
would not be guaranteed. Having contiguous blocks is one of the
requirements of the feature.
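
The counter scheme can be sketched as two plain counters. The struct and
function names below are hypothetical, and the fallback to the
superblock counter once the inode counter is exhausted is an assumption
added for the sketch, not something stated on the call. Note that the
model tracks block counts only, so it says nothing about contiguity.

```c
#include <assert.h>
#include <stdint.h>

struct toy_sb    { uint64_t free_blocks; };      /* superblock counter */
struct toy_inode { uint64_t prealloc_blocks; };  /* per-inode counter */

/* Move n blocks from the superblock counter to the inode counter. */
static int toy_prealloc(struct toy_sb *sb, struct toy_inode *ino, uint64_t n)
{
    if (n > sb->free_blocks)
        return -1;  /* not enough free blocks to reserve */
    sb->free_blocks -= n;
    ino->prealloc_blocks += n;
    return 0;
}

/*
 * Account for a write of n blocks: consume the inode counter first,
 * then (assumption) fall back to the superblock counter.
 */
static int toy_write_blocks(struct toy_sb *sb, struct toy_inode *ino, uint64_t n)
{
    uint64_t from_inode = n < ino->prealloc_blocks ? n : ino->prealloc_blocks;
    uint64_t from_sb = n - from_inode;

    if (from_sb > sb->free_blocks)
        return -1;
    ino->prealloc_blocks -= from_inode;
    sb->free_blocks -= from_sb;
    return 0;
}
```

Preallocating 1000 blocks out of 5000 free leaves the superblock counter
at 4000 and the inode counter at 1000; a later write of 300 blocks
lowers only the inode counter, to 700.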

- Eric has benchmark data between ext3 and ext4; he will retest and post
results on the mailing list.