This is the first patch of a series that is aimed at cleaning up the
filesystem source so that it will no longer be necessary for the headers of
every Linux filesystem to be nested in fs.h. This patch is strictly local to
ext2. It is intended to serve as a pattern for the rest of the filesystems,
and as a first step towards eliminating the fs.h unions entirely.
The strategy is to abstract all references to the struct inode union through
an inline function:
static inline struct ext2_inode_info *ext2_i (struct inode *inode)
{
return &(inode->u.ext2_inode_info);
}
Mostly, this just requires changing 'inode->u.ext2_i.' to 'ext2_i(inode)->'
throughout the ext2 code. In addition, most subexpressions of type (struct
ext2_inode_info *) have been changed to local variables, reducing code
clutter (particularly in ext2/inode.c) and removing the onus from the
compiler of re-discovering common subexpressions generated by many inline
function expansions.
This patch also changes the 'ext2_i' union identifier to 'ext2_inode_info' to
help guard against stragglers - that is, ext2 union-using code that may have
been missed by this patch - and help maintainers of out-of-tree patches to
know what modifications they need to make to their own source.
Eventually, the unions will be removed entirely.
This work is similar to, and is in part inspired by Arnaldo Carvalho de
Melo's network sockets cleanup. A similar cleanup needs to be done for the
ext2 super_block union, and will look very similar to this patch. The
strategy used here extends easily to superblocks and inodes of all
filesystems, regardless of whether the union or the (void *) generic_ip
approach has been used. Furthermore, this is incremental - we can do the
filesystems one at a time if we want to.
<language-lawyers>
There is some grist here for the mills of language lawyers here. Note the
compilation warning:
ialloc.c:336: warning: passing arg 1 of `ext2_i' discards qualifiers from
pointer target type
This happens because 'dir' is declared as 'const struct inode *dir', and even
though the ext2_i inline is expanded only into expressions that cannot modify
the 'dir', the compiler ignores that and choses to produce the warning. The
question is: does the C spec mandate this behaviour or is this just a
compiler implementation oversight? If C mandates this, then do we 1) just
ignore the warnings 2) delete the 'const' 3) create a variant inline function
taking a 'const' parameter or 4) do something else?
</language-lawyers>
I'm running this patch on my server right now, and before that I did some
light testing in uml. More testing is needed. Good news would be of the
form: 'nothing changed, it didn't eat my root filesystem' ;-)
To apply:
cd /your/source/tree
cat this/patch | patch -p0
--
Daniel
--- ../2.4.17.clean/fs/ext2/dir.c Mon Sep 17 16:16:30 2001
+++ ./fs/ext2/dir.c Thu Dec 27 02:40:12 2001
@@ -300,6 +300,7 @@
struct ext2_dir_entry_2 * ext2_find_entry (struct inode * dir,
struct dentry *dentry, struct page ** res_page)
{
+ struct ext2_inode_info *ei = ext2_i(dir);
const char *name = dentry->d_name.name;
int namelen = dentry->d_name.len;
unsigned reclen = EXT2_DIR_REC_LEN(namelen);
@@ -311,7 +312,7 @@
/* OFFSET_CACHE */
*res_page = NULL;
- start = dir->u.ext2_i.i_dir_start_lookup;
+ start = ei->i_dir_start_lookup;
if (start >= npages)
start = 0;
n = start;
@@ -336,7 +337,7 @@
found:
*res_page = page;
- dir->u.ext2_i.i_dir_start_lookup = n;
+ ei->i_dir_start_lookup = n;
return de;
}
--- ../2.4.17.clean/fs/ext2/ialloc.c Sun Nov 11 12:59:56 2001
+++ ./fs/ext2/ialloc.c Thu Dec 27 02:51:24 2001
@@ -314,6 +314,7 @@
struct inode * ext2_new_inode (const struct inode * dir, int mode)
{
+ struct ext2_inode_info *di = ext2_i(dir), *ei;
struct super_block * sb;
struct buffer_head * bh;
struct buffer_head * bh2;
@@ -326,6 +327,7 @@
sb = dir->i_sb;
inode = new_inode(sb);
+ ei = ext2_i(inode);
if (!inode)
return ERR_PTR(-ENOMEM);
@@ -333,9 +335,9 @@
es = sb->u.ext2_sb.s_es;
repeat:
if (S_ISDIR(mode))
- group = find_group_dir(sb, dir->u.ext2_i.i_block_group);
+ group = find_group_dir(sb, di->i_block_group);
else
- group = find_group_other(sb, dir->u.ext2_i.i_block_group);
+ group = find_group_other(sb, di->i_block_group);
err = -ENOSPC;
if (group == -1)
@@ -386,12 +388,12 @@
inode->i_blksize = PAGE_SIZE; /* This is the optimal IO size (for stat),
not the fs block size */
inode->i_blocks = 0;
inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
- inode->u.ext2_i.i_new_inode = 1;
- inode->u.ext2_i.i_flags = dir->u.ext2_i.i_flags;
+ ei->i_new_inode = 1;
+ ei->i_flags = di->i_flags;
if (S_ISLNK(mode))
- inode->u.ext2_i.i_flags &= ~(EXT2_IMMUTABLE_FL|EXT2_APPEND_FL);
- inode->u.ext2_i.i_block_group = group;
- if (inode->u.ext2_i.i_flags & EXT2_SYNC_FL)
+ ei->i_flags &= ~(EXT2_IMMUTABLE_FL|EXT2_APPEND_FL);
+ ei->i_block_group = group;
+ if (ei->i_flags & EXT2_SYNC_FL)
inode->i_flags |= S_SYNC;
insert_inode_hash(inode);
inode->i_generation = event++;
--- ../2.4.17.clean/fs/ext2/inode.c Wed Nov 21 17:07:25 2001
+++ ./fs/ext2/inode.c Thu Dec 27 00:55:13 2001
@@ -57,7 +57,7 @@
inode->i_ino == EXT2_ACL_IDX_INO ||
inode->i_ino == EXT2_ACL_DATA_INO)
goto no_delete;
- inode->u.ext2_i.i_dtime = CURRENT_TIME;
+ ext2_i(inode)->i_dtime = CURRENT_TIME;
mark_inode_dirty(inode);
ext2_update_inode(inode, IS_SYNC(inode));
inode->i_size = 0;
@@ -75,13 +75,14 @@
void ext2_discard_prealloc (struct inode * inode)
{
#ifdef EXT2_PREALLOCATE
+ struct ext2_inode_info *ei = ext2_i(inode);
lock_kernel();
/* Writer: ->i_prealloc* */
- if (inode->u.ext2_i.i_prealloc_count) {
- unsigned short total = inode->u.ext2_i.i_prealloc_count;
- unsigned long block = inode->u.ext2_i.i_prealloc_block;
- inode->u.ext2_i.i_prealloc_count = 0;
- inode->u.ext2_i.i_prealloc_block = 0;
+ if (ext2_i(inode)->i_prealloc_count) {
+ unsigned short total = ext2_i(inode)->i_prealloc_count;
+ unsigned long block = ei->i_prealloc_block;
+ ei->i_prealloc_count = 0;
+ ei->i_prealloc_block = 0;
/* Writer: end */
ext2_free_blocks (inode, block, total);
}
@@ -91,6 +92,7 @@
static int ext2_alloc_block (struct inode * inode, unsigned long goal, int
*err)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
#ifdef EXT2FS_DEBUG
static unsigned long alloc_hits = 0, alloc_attempts = 0;
#endif
@@ -99,12 +101,12 @@
#ifdef EXT2_PREALLOCATE
/* Writer: ->i_prealloc* */
- if (inode->u.ext2_i.i_prealloc_count &&
- (goal == inode->u.ext2_i.i_prealloc_block ||
- goal + 1 == inode->u.ext2_i.i_prealloc_block))
+ if (ei->i_prealloc_count &&
+ (goal == ei->i_prealloc_block ||
+ goal + 1 == ei->i_prealloc_block))
{
- result = inode->u.ext2_i.i_prealloc_block++;
- inode->u.ext2_i.i_prealloc_count--;
+ result = ei->i_prealloc_block++;
+ ei->i_prealloc_count--;
/* Writer: end */
ext2_debug ("preallocation hit (%lu/%lu).\n",
++alloc_hits, ++alloc_attempts);
@@ -114,8 +116,8 @@
alloc_hits, ++alloc_attempts);
if (S_ISREG(inode->i_mode))
result = ext2_new_block (inode, goal,
- &inode->u.ext2_i.i_prealloc_count,
- &inode->u.ext2_i.i_prealloc_block, err);
+ &ei->i_prealloc_count,
+ &ei->i_prealloc_block, err);
else
result = ext2_new_block (inode, goal, 0, 0, err);
}
@@ -239,6 +241,7 @@
Indirect chain[4],
int *err)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
kdev_t dev = inode->i_dev;
int size = inode->i_sb->s_blocksize;
Indirect *p = chain;
@@ -246,7 +249,7 @@
*err = 0;
/* i_data is not going away, no lock needed */
- add_chain (chain, NULL, inode->u.ext2_i.i_data + *offsets);
+ add_chain (chain, NULL, ei->i_data + *offsets);
if (!p->key)
goto no_block;
while (--depth) {
@@ -288,7 +291,8 @@
static inline unsigned long ext2_find_near(struct inode *inode, Indirect
*ind)
{
- u32 *start = ind->bh ? (u32*) ind->bh->b_data : inode->u.ext2_i.i_data;
+ struct ext2_inode_info *ei = ext2_i(inode);
+ u32 *start = ind->bh? (u32*) ind->bh->b_data: ei->i_data;
u32 *p;
/* Try to find previous block */
@@ -304,8 +308,7 @@
* It is going to be refered from inode itself? OK, just put it into
* the same cylinder group then.
*/
- return (inode->u.ext2_i.i_block_group *
- EXT2_BLOCKS_PER_GROUP(inode->i_sb)) +
+ return (ei->i_block_group * EXT2_BLOCKS_PER_GROUP(inode->i_sb)) +
le32_to_cpu(inode->i_sb->u.ext2_sb.s_es->s_first_data_block);
}
@@ -328,10 +331,11 @@
Indirect *partial,
unsigned long *goal)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
/* Writer: ->i_next_alloc* */
- if (block == inode->u.ext2_i.i_next_alloc_block + 1) {
- inode->u.ext2_i.i_next_alloc_block++;
- inode->u.ext2_i.i_next_alloc_goal++;
+ if (block == ei->i_next_alloc_block + 1) {
+ ei->i_next_alloc_block++;
+ ei->i_next_alloc_goal++;
}
/* Writer: end */
/* Reader: pointers, ->i_next_alloc* */
@@ -340,8 +344,8 @@
* try the heuristic for sequential allocation,
* failing that at least try to get decent locality.
*/
- if (block == inode->u.ext2_i.i_next_alloc_block)
- *goal = inode->u.ext2_i.i_next_alloc_goal;
+ if (block == ei->i_next_alloc_block)
+ *goal = ei->i_next_alloc_goal;
if (!*goal)
*goal = ext2_find_near(inode, partial);
return 0;
@@ -408,7 +412,7 @@
mark_buffer_uptodate(bh, 1);
unlock_buffer(bh);
mark_buffer_dirty_inode(bh, inode);
- if (IS_SYNC(inode) || inode->u.ext2_i.i_osync) {
+ if (IS_SYNC(inode) || ext2_i(inode)->i_osync) {
ll_rw_block (WRITE, 1, &bh);
wait_on_buffer (bh);
}
@@ -448,6 +452,7 @@
Indirect *where,
int num)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
int i;
/* Verify that place we are splicing to is still there and vacant */
@@ -460,8 +465,8 @@
/* That's it */
*where->p = where->key;
- inode->u.ext2_i.i_next_alloc_block = block;
- inode->u.ext2_i.i_next_alloc_goal = le32_to_cpu(where[num-1].key);
+ ei->i_next_alloc_block = block;
+ ei->i_next_alloc_goal = le32_to_cpu(where[num-1].key);
/* Writer: end */
@@ -472,13 +477,13 @@
/* had we spliced it onto indirect block? */
if (where->bh) {
mark_buffer_dirty_inode(where->bh, inode);
- if (IS_SYNC(inode) || inode->u.ext2_i.i_osync) {
+ if (IS_SYNC(inode) || ei->i_osync) {
ll_rw_block (WRITE, 1, &where->bh);
wait_on_buffer(where->bh);
}
}
- if (IS_SYNC(inode) || inode->u.ext2_i.i_osync)
+ if (IS_SYNC(inode) || ei->i_osync)
ext2_sync_inode (inode);
else
mark_inode_dirty(inode);
@@ -788,7 +793,8 @@
void ext2_truncate (struct inode * inode)
{
- u32 *i_data = inode->u.ext2_i.i_data;
+ struct ext2_inode_info *ei = ext2_i(inode);
+ u32 *i_data = ei->i_data;
int addr_per_block = EXT2_ADDR_PER_BLOCK(inode->i_sb);
int offsets[4];
Indirect chain[4];
@@ -881,6 +887,7 @@
void ext2_read_inode (struct inode * inode)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
struct buffer_head * bh;
struct ext2_inode * raw_inode;
unsigned long block_group;
@@ -942,13 +949,13 @@
inode->i_atime = le32_to_cpu(raw_inode->i_atime);
inode->i_ctime = le32_to_cpu(raw_inode->i_ctime);
inode->i_mtime = le32_to_cpu(raw_inode->i_mtime);
- inode->u.ext2_i.i_dtime = le32_to_cpu(raw_inode->i_dtime);
+ ei->i_dtime = le32_to_cpu(raw_inode->i_dtime);
/* We now have enough fields to check if the inode was active or not.
* This is needed because nfsd might try to access dead inodes
* the test is that same one that e2fsck uses
* NeilBrown 1999oct15
*/
- if (inode->i_nlink == 0 && (inode->i_mode == 0 || inode->u.ext2_i.i_dtime))
{
+ if (inode->i_nlink == 0 && (inode->i_mode == 0 || ei->i_dtime)) {
/* this inode is deleted */
brelse (bh);
goto bad_inode;
@@ -956,25 +963,25 @@
inode->i_blksize = PAGE_SIZE; /* This is the optimal IO size (for stat),
not the fs block size */
inode->i_blocks = le32_to_cpu(raw_inode->i_blocks);
inode->i_version = ++event;
- inode->u.ext2_i.i_flags = le32_to_cpu(raw_inode->i_flags);
- inode->u.ext2_i.i_faddr = le32_to_cpu(raw_inode->i_faddr);
- inode->u.ext2_i.i_frag_no = raw_inode->i_frag;
- inode->u.ext2_i.i_frag_size = raw_inode->i_fsize;
- inode->u.ext2_i.i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
+ ei->i_flags = le32_to_cpu(raw_inode->i_flags);
+ ei->i_faddr = le32_to_cpu(raw_inode->i_faddr);
+ ei->i_frag_no = raw_inode->i_frag;
+ ei->i_frag_size = raw_inode->i_fsize;
+ ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
if (S_ISREG(inode->i_mode))
inode->i_size |= ((__u64)le32_to_cpu(raw_inode->i_size_high)) << 32;
else
- inode->u.ext2_i.i_dir_acl = le32_to_cpu(raw_inode->i_dir_acl);
+ ei->i_dir_acl = le32_to_cpu(raw_inode->i_dir_acl);
inode->i_generation = le32_to_cpu(raw_inode->i_generation);
- inode->u.ext2_i.i_prealloc_count = 0;
- inode->u.ext2_i.i_block_group = block_group;
+ ei->i_prealloc_count = 0;
+ ei->i_block_group = block_group;
/*
* NOTE! The in-memory inode i_data array is in little-endian order
* even on big-endian machines: we do NOT byteswap the block numbers!
*/
for (block = 0; block < EXT2_N_BLOCKS; block++)
- inode->u.ext2_i.i_data[block] = raw_inode->i_block[block];
+ ei->i_data[block] = raw_inode->i_block[block];
if (inode->i_ino == EXT2_ACL_IDX_INO ||
inode->i_ino == EXT2_ACL_DATA_INO)
@@ -999,19 +1006,19 @@
le32_to_cpu(raw_inode->i_block[0]));
brelse (bh);
inode->i_attr_flags = 0;
- if (inode->u.ext2_i.i_flags & EXT2_SYNC_FL) {
+ if (ei->i_flags & EXT2_SYNC_FL) {
inode->i_attr_flags |= ATTR_FLAG_SYNCRONOUS;
inode->i_flags |= S_SYNC;
}
- if (inode->u.ext2_i.i_flags & EXT2_APPEND_FL) {
+ if (ei->i_flags & EXT2_APPEND_FL) {
inode->i_attr_flags |= ATTR_FLAG_APPEND;
inode->i_flags |= S_APPEND;
}
- if (inode->u.ext2_i.i_flags & EXT2_IMMUTABLE_FL) {
+ if (ei->i_flags & EXT2_IMMUTABLE_FL) {
inode->i_attr_flags |= ATTR_FLAG_IMMUTABLE;
inode->i_flags |= S_IMMUTABLE;
}
- if (inode->u.ext2_i.i_flags & EXT2_NOATIME_FL) {
+ if (ei->i_flags & EXT2_NOATIME_FL) {
inode->i_attr_flags |= ATTR_FLAG_NOATIME;
inode->i_flags |= S_NOATIME;
}
@@ -1024,6 +1031,7 @@
static int ext2_update_inode(struct inode * inode, int do_sync)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
struct buffer_head * bh;
struct ext2_inode * raw_inode;
unsigned long block_group;
@@ -1080,7 +1088,7 @@
* Fix up interoperability with old kernels. Otherwise, old inodes get
* re-used with the upper 16 bits of the uid/gid intact
*/
- if(!inode->u.ext2_i.i_dtime) {
+ if(!ei->i_dtime) {
raw_inode->i_uid_high = cpu_to_le16(high_16_bits(inode->i_uid));
raw_inode->i_gid_high = cpu_to_le16(high_16_bits(inode->i_gid));
} else {
@@ -1099,14 +1107,14 @@
raw_inode->i_ctime = cpu_to_le32(inode->i_ctime);
raw_inode->i_mtime = cpu_to_le32(inode->i_mtime);
raw_inode->i_blocks = cpu_to_le32(inode->i_blocks);
- raw_inode->i_dtime = cpu_to_le32(inode->u.ext2_i.i_dtime);
- raw_inode->i_flags = cpu_to_le32(inode->u.ext2_i.i_flags);
- raw_inode->i_faddr = cpu_to_le32(inode->u.ext2_i.i_faddr);
- raw_inode->i_frag = inode->u.ext2_i.i_frag_no;
- raw_inode->i_fsize = inode->u.ext2_i.i_frag_size;
- raw_inode->i_file_acl = cpu_to_le32(inode->u.ext2_i.i_file_acl);
+ raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
+ raw_inode->i_flags = cpu_to_le32(ei->i_flags);
+ raw_inode->i_faddr = cpu_to_le32(ei->i_faddr);
+ raw_inode->i_frag = ei->i_frag_no;
+ raw_inode->i_fsize = ei->i_frag_size;
+ raw_inode->i_file_acl = cpu_to_le32(ei->i_file_acl);
if (S_ISDIR(inode->i_mode))
- raw_inode->i_dir_acl = cpu_to_le32(inode->u.ext2_i.i_dir_acl);
+ raw_inode->i_dir_acl = cpu_to_le32(ei->i_dir_acl);
else {
raw_inode->i_size_high = cpu_to_le32(inode->i_size >> 32);
if (inode->i_size > 0x7fffffffULL) {
@@ -1132,7 +1140,7 @@
if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode))
raw_inode->i_block[0] = cpu_to_le32(kdev_t_to_nr(inode->i_rdev));
else for (block = 0; block < EXT2_N_BLOCKS; block++)
- raw_inode->i_block[block] = inode->u.ext2_i.i_data[block];
+ raw_inode->i_block[block] = ei->i_data[block];
mark_buffer_dirty(bh);
if (do_sync) {
ll_rw_block (WRITE, 1, &bh);
--- ../2.4.17.clean/fs/ext2/ioctl.c Wed Sep 27 16:41:33 2000
+++ ./fs/ext2/ioctl.c Thu Dec 27 01:00:26 2001
@@ -16,13 +16,14 @@
int ext2_ioctl (struct inode * inode, struct file * filp, unsigned int cmd,
unsigned long arg)
{
+ struct ext2_inode_info *ei = ext2_i(inode);
unsigned int flags;
ext2_debug ("cmd = %u, arg = %lu\n", cmd, arg);
switch (cmd) {
case EXT2_IOC_GETFLAGS:
- flags = inode->u.ext2_i.i_flags & EXT2_FL_USER_VISIBLE;
+ flags = ei->i_flags & EXT2_FL_USER_VISIBLE;
return put_user(flags, (int *) arg);
case EXT2_IOC_SETFLAGS: {
unsigned int oldflags;
@@ -36,7 +37,7 @@
if (get_user(flags, (int *) arg))
return -EFAULT;
- oldflags = inode->u.ext2_i.i_flags;
+ oldflags = ei->i_flags;
/*
* The IMMUTABLE and APPEND_ONLY flags can only be changed by
@@ -51,7 +52,7 @@
flags = flags & EXT2_FL_USER_MODIFIABLE;
flags |= oldflags & ~EXT2_FL_USER_MODIFIABLE;
- inode->u.ext2_i.i_flags = flags;
+ ei->i_flags = flags;
if (flags & EXT2_SYNC_FL)
inode->i_flags |= S_SYNC;
--- ../2.4.17.clean/fs/ext2/namei.c Thu Oct 4 01:57:36 2001
+++ ./fs/ext2/namei.c Wed Dec 26 23:29:30 2001
@@ -134,7 +134,7 @@
if (IS_ERR(inode))
goto out;
- if (l > sizeof (inode->u.ext2_i.i_data)) {
+ if (l > sizeof (ext2_i(inode)->i_data)) {
/* slow symlink */
inode->i_op = &page_symlink_inode_operations;
inode->i_mapping->a_ops = &ext2_aops;
@@ -144,7 +144,7 @@
} else {
/* fast symlink */
inode->i_op = &ext2_fast_symlink_inode_operations;
- memcpy((char*)&inode->u.ext2_i.i_data,symname,l);
+ memcpy((char *) &ext2_i(inode)->i_data, symname, l);
inode->i_size = l-1;
}
mark_inode_dirty(inode);
--- ../2.4.17.clean/fs/ext2/symlink.c Wed Sep 27 16:41:33 2000
+++ ./fs/ext2/symlink.c Thu Dec 27 00:37:47 2001
@@ -22,14 +22,14 @@
static int ext2_readlink(struct dentry *dentry, char *buffer, int buflen)
{
- char *s = (char *)dentry->d_inode->u.ext2_i.i_data;
- return vfs_readlink(dentry, buffer, buflen, s);
+ struct ext2_inode_info *ei = ext2_i(dentry->d_inode);
+ return vfs_readlink(dentry, buffer, buflen, ei->i_data);
}
static int ext2_follow_link(struct dentry *dentry, struct nameidata *nd)
{
- char *s = (char *)dentry->d_inode->u.ext2_i.i_data;
- return vfs_follow_link(nd, s);
+ struct ext2_inode_info *ei = ext2_i(dentry->d_inode);
+ return vfs_follow_link(nd, ei->i_data);
}
struct inode_operations ext2_fast_symlink_inode_operations = {
--- ../2.4.17.clean/include/linux/ext2_fs.h Thu Nov 22 14:46:52 2001
+++ ./include/linux/ext2_fs.h Wed Dec 26 23:36:07 2001
@@ -545,6 +545,11 @@
# define ATTRIB_NORET __attribute__((noreturn))
# define NORET_AND noreturn,
+static inline struct ext2_inode_info *ext2_i (struct inode *inode)
+{
+ return &(inode->u.ext2_inode_info);
+}
+
/* balloc.c */
extern int ext2_bg_has_super(struct super_block *sb, int group);
extern unsigned long ext2_bg_num_gdb(struct super_block *sb, int group);
--- ../2.4.17.clean/include/linux/fs.h Fri Dec 21 12:42:03 2001
+++ ./include/linux/fs.h Wed Dec 26 23:30:55 2001
@@ -478,7 +478,7 @@
__u32 i_generation;
union {
struct minix_inode_info minix_i;
- struct ext2_inode_info ext2_i;
+ struct ext2_inode_info ext2_inode_info;
struct ext3_inode_info ext3_i;
struct hpfs_inode_info hpfs_i;
struct ntfs_inode_info ntfs_i;
On Thu, Dec 27, 2001 at 04:21:42AM +0100, Daniel Phillips wrote:
> --- ../2.4.17.clean/include/linux/fs.h Fri Dec 21 12:42:03 2001
> +++ ./include/linux/fs.h Wed Dec 26 23:30:55 2001
> @@ -478,7 +478,7 @@
> __u32 i_generation;
> union {
> struct minix_inode_info minix_i;
> - struct ext2_inode_info ext2_i;
> + struct ext2_inode_info ext2_inode_info;
> struct ext3_inode_info ext3_i;
> struct hpfs_inode_info hpfs_i;
> struct ntfs_inode_info ntfs_i;
Change in principle looks good except IMHO you should go ahead and
remove the ext2 stuff from the union... (with the additional changes
that implies)
Jeff
Em Wed, Dec 26, 2001 at 10:28:09PM -0500, Legacy Fishtank escreveu:
> On Thu, Dec 27, 2001 at 04:21:42AM +0100, Daniel Phillips wrote:
> > --- ../2.4.17.clean/include/linux/fs.h Fri Dec 21 12:42:03 2001
> > +++ ./include/linux/fs.h Wed Dec 26 23:30:55 2001
> > @@ -478,7 +478,7 @@
> > __u32 i_generation;
> > union {
> > struct minix_inode_info minix_i;
> > - struct ext2_inode_info ext2_i;
> > + struct ext2_inode_info ext2_inode_info;
> > struct ext3_inode_info ext3_i;
> > struct hpfs_inode_info hpfs_i;
> > struct ntfs_inode_info ntfs_i;
>
> Change in principle looks good except IMHO you should go ahead and
> remove the ext2 stuff from the union... (with the additional changes
> that implies)
Jeff, he's trying to go incremental, the idea is to make the union
something like ->i_fs_private (void pointer) or something like that, or do
like I did for the struct sock case, using a per fs slabcache, to avoid the
extra allocation for the private area, Al thinks that this is not needed,
but Daniel thinks it can be worth it, doing it incrementally we can test
both approaches and avoid getting to big a patch. But yes, the idea is to
kill the union completely and remove all the fs specific includes in fs.h.
After the access to the union is abstracted, we can do the next step, which
is to remove the union, touching then only the function/macro that
abstracts the access to the fs specific data.
So, yes, the ext2 stuff will be removed, as all the other fs specific stuff
8)
- Arnaldo
On December 27, 2001 04:28 am, Legacy Fishtank wrote:
> On Thu, Dec 27, 2001 at 04:21:42AM +0100, Daniel Phillips wrote:
> > --- ../2.4.17.clean/include/linux/fs.h Fri Dec 21 12:42:03 2001
> > +++ ./include/linux/fs.h Wed Dec 26 23:30:55 2001
> > @@ -478,7 +478,7 @@
> > __u32 i_generation;
> > union {
> > struct minix_inode_info minix_i;
> > - struct ext2_inode_info ext2_i;
> > + struct ext2_inode_info ext2_inode_info;
> > struct ext3_inode_info ext3_i;
> > struct hpfs_inode_info hpfs_i;
> > struct ntfs_inode_info ntfs_i;
>
> Change in principle looks good except IMHO you should go ahead and
> remove the ext2 stuff from the union... (with the additional changes
> that implies)
Hi Jeff,
Thanks for your confidence, but that would be a considerably bigger patch.
It's not just a matter of removing the includes - other bits and pieces have
to be put in place, such as per-filesystem inode slab. The support for this
goes outside ext2.
My idea is to just let people have a look and test this minimally intrusive
change. Getting rid of the includes for ext2 inodes will be a two-patch
change:
1) Abstract away the ext2 .u's (done)
2) Per-fs inode slab, initially only for ext2 (partly done)
Removing the includes for ext2 superblocks will need another two patches. By
the time all filesystems are done, it would be thousands of lines if it was
all in one patch. I think it's better to keep it broken up, and do it
incrementally.
--
Daniel
On Dec 27, 2001 04:21 +0100, Daniel Phillips wrote:
> The strategy is to abstract all references to the struct inode union through
> an inline function:
>
> static inline struct ext2_inode_info *ext2_i (struct inode *inode)
> {
> return &(inode->u.ext2_inode_info);
> }
>
> There is some grist here for the mills of language lawyers here. Note the
> compilation warning:
>
> ialloc.c:336: warning: passing arg 1 of `ext2_i' discards qualifiers from
> pointer target type
Why not just declare ext2_i like the following? It _should_ work:
static inline struct ext2_inode_info *ext2_i(const struct inode *inode)
{
return &(inode->u.ext2_inode_info);
}
Minor nit: this is already done for the ext3 code, but it looks like:
#define EXT3_I (&((inode)->u.ext3_i))
We already have the EXT3_SB, so I thought I would be consistent with it:
#define EXT3_SB (&((sb)->u.ext3_sb))
Do people like the inline version better? Either way, I would like to make
the ext2 and ext3 codes more similar, rather than less.
Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
Hi Andreas
On December 27, 2001 07:14 pm, Andreas Dilger wrote:
> On Dec 27, 2001 04:21 +0100, Daniel Phillips wrote:
> > The strategy is to abstract all references to the struct inode union through
> > an inline function:
> >
> > static inline struct ext2_inode_info *ext2_i (struct inode *inode)
> > {
> > return &(inode->u.ext2_inode_info);
> > }
> >
> > There is some grist here for the mills of language lawyers here. Note the
> > compilation warning:
> >
> > ialloc.c:336: warning: passing arg 1 of `ext2_i' discards qualifiers from
> > pointer target type
>
> Why not just declare ext2_i like the following? It _should_ work:
>
> static inline struct ext2_inode_info *ext2_i(const struct inode *inode)
> {
> return &(inode->u.ext2_inode_info);
> }
If that's all we do, then we get 'warning: return discards qualifiers from
pointer target type'. Adding an explicit cast gets rid of that:
static inline struct ext2_inode_info *ext2_i (const struct inode *inode)
{
return (struct ext2_inode_info *) &(inode->u.ext2_inode_info);
}
However, then we're lying to the compiler. I wonder how safe that is.
> Minor nit: this is already done for the ext3 code, but it looks like:
>
> #define EXT3_I (&((inode)->u.ext3_i))
>
> We already have the EXT3_SB, so I thought I would be consistent with it:
>
> #define EXT3_SB (&((sb)->u.ext3_sb))
>
> Do people like the inline version better? Either way, I would like to make
> the ext2 and ext3 codes more similar, rather than less.
An upcoming version will use an explicit cast:
return (struct ext2_inode_info *) (inode + 1);
In other words, it doesn't rely on the union, which we're trying to get rid of.
In this case, an inline function provides type saftey and a macro wouldn't. I
also prefer to pass the inode/sb explicitly, rather than having a macro pick it
up from context.
In the superblock patch, the inline will be 'ext2_s'. Al seems comfortable with
this terminology, so...
--
Daniel
On Thu, 27 Dec 2001, Andreas Dilger wrote:
> On Dec 27, 2001 04:21 +0100, Daniel Phillips wrote:
> > The strategy is to abstract all references to the struct inode union through
> > an inline function:
> >
> > static inline struct ext2_inode_info *ext2_i (struct inode *inode)
> > {
> > return &(inode->u.ext2_inode_info);
> > }
> >
> > There is some grist here for the mills of language lawyers here. Note the
> > compilation warning:
> >
> > ialloc.c:336: warning: passing arg 1 of `ext2_i' discards qualifiers from
> > pointer target type
>
> Why not just declare ext2_i like the following? It _should_ work:
>
> static inline struct ext2_inode_info *ext2_i(const struct inode *inode)
> {
> return &(inode->u.ext2_inode_info);
> }
This will get rid of the warning but it doesn't solve the underlying
problem of discarding the potentially useful const information. In C++,
you'd define a pair of functions to deal with this, one with and one
without const.
> Minor nit: this is already done for the ext3 code, but it looks like:
>
> #define EXT3_I (&((inode)->u.ext3_i))
>
> We already have the EXT3_SB, so I thought I would be consistent with it:
>
> #define EXT3_SB (&((sb)->u.ext3_sb))
>
> Do people like the inline version better? Either way, I would like to make
> the ext2 and ext3 codes more similar, rather than less.
The ext3 macros are rather revolting, simply because they assume the
variable name. A parameterized macro might be the best compromise:
#define EXT2_I(i) (&(i->u.ext2_inode_info))
--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."
On Dec 29, 2001 10:04 -0600, Oliver Xymoron wrote:
> On Thu, 27 Dec 2001, Andreas Dilger wrote:
> > Minor nit: this is already done for the ext3 code, but it looks like:
> >
> > #define EXT3_I (&((inode)->u.ext3_i))
> >
> > We already have the EXT3_SB, so I thought I would be consistent with it:
> >
> > #define EXT3_SB (&((sb)->u.ext3_sb))
> >
> > Do people like the inline version better? Either way, I would like to make
> > the ext2 and ext3 codes more similar, rather than less.
>
> The ext3 macros are rather revolting, simply because they assume the
> variable name. A parameterized macro might be the best compromise:
>
> #define EXT2_I(i) (&(i->u.ext2_inode_info))
My mistake, the Ext3 macros _do_ take an inode/sb parameter. It's not that
I'm a huge fan of macros over inline functions, it's just that I would like
to have a consensus about how it should be done so that it is consistent
between ext2 and ext3.
Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
Oliver Xymoron wrote:
>
> > Minor nit: this is already done for the ext3 code, but it looks like:
> >
> > #define EXT3_I (&((inode)->u.ext3_i))
> >
> > We already have the EXT3_SB, so I thought I would be consistent with it:
> >
> > #define EXT3_SB (&((sb)->u.ext3_sb))
> >
> > Do people like the inline version better? Either way, I would like to make
> > the ext2 and ext3 codes more similar, rather than less.
>
> The ext3 macros are rather revolting, simply because they assume the
> variable name. A parameterized macro might be the best compromise:
>
> #define EXT2_I(i) (&(i->u.ext2_inode_info))
>
They _would_ be revolting, except Andreas mistyped :) We have:
#define EXT3_SB(sb) (&((sb)->u.ext3_sb))
#define EXT3_I(inode) (&((inode)->u.ext3_i))
(A number of the mm macros accidentally only work correctly if their
argument is called "page". Dunno if this is stil the case though).
-
On Sat, 29 Dec 2001, Andreas Dilger wrote:
> > The ext3 macros are rather revolting, simply because they assume the
> > variable name. A parameterized macro might be the best compromise:
> >
> > #define EXT2_I(i) (&(i->u.ext2_inode_info))
>
> My mistake, the Ext3 macros _do_ take an inode/sb parameter. It's not that
> I'm a huge fan of macros over inline functions, it's just that I would like
> to have a consensus about how it should be done so that it is consistent
> between ext2 and ext3.
The inline route is the way to go. The const guarantee on *inode doesn't
get propagated down to the objects it points to by the compiler anyway so
when the unions go away being const-correct gains us nothing.
--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."
Hi!
> Why not just declare ext2_i like the following? It _should_ work:
>
> static inline struct ext2_inode_info *ext2_i(const struct inode *inode)
> {
> return &(inode->u.ext2_inode_info);
> }
>
>
> Minor nit: this is already done for the ext3 code, but it looks like:
>
> #define EXT3_I (&((inode)->u.ext3_i))
>
> We already have the EXT3_SB, so I thought I would be consistent with it:
>
> #define EXT3_SB (&((sb)->u.ext3_sb))
>
> Do people like the inline version better? Either way, I would like to make
> the ext2 and ext3 codes more similar, rather than less.
Maybe 2.5 is time to merge ext2 and ext3?
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
At 03:52 27/12/01, Daniel Phillips wrote:
>On December 27, 2001 04:28 am, Legacy Fishtank wrote:
> > Change in principle looks good except IMHO you should go ahead and
> > remove the ext2 stuff from the union... (with the additional changes
> > that implies)
>
>Thanks for your confidence, but that would be a considerably bigger patch.
>It's not just a matter of removing the includes - other bits and pieces have
>to be put in place, such as per-filesystem inode slab. The support for this
>goes outside ext2.
>
>My idea is to just let people have a look and test this minimally intrusive
>change. Getting rid of the includes for ext2 inodes will be a two-patch
>change:
>
> 1) Abstract away the ext2 .u's (done)
> 2) Per-fs inode slab, initially only for ext2 (partly done)
>
>Removing the includes for ext2 superblocks will need another two patches. By
>the time all filesystems are done, it would be thousands of lines if it was
>all in one patch. I think it's better to keep it broken up, and do it
>incrementally.
If anyone wants a look NTFS TNG already has gone all the way (for a while
now in fact). Both fs inode and super block are fs internal slab caches and
both use static inline NTFS_I / NTFS_SB functions and the ntfs includes
from linux/fs.h are removed altogether. Code is in sourceforge cvs. For
instructions how to download the code or to browse it online, see:
http://sourceforge.net/cvs/?group_id=13956
The module of interest is ntfs-driver-tng.
If anyone is doing patches for all kernel file systems, don't bother with
ntfs in 2.5 as NTFS TNG will replace the old ntfs driver soon (certainly in
2.5 time frame).
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
On January 5, 2002 03:29 pm, Anton Altaparmakov wrote:
> If anyone wants a look NTFS TNG already has gone all the way (for a while
> now in fact). Both fs inode and super block are fs internal slab caches and
> both use static inline NTFS_I / NTFS_SB functions and the ntfs includes
> from linux/fs.h are removed altogether. Code is in sourceforge cvs. For
> instructions how to download the code or to browse it online, see:
Nice, did you use the generic_ip fields?
--
Daniel
At 14:47 05/01/02, Daniel Phillips wrote:
>On January 5, 2002 03:29 pm, Anton Altaparmakov wrote:
> > If anyone wants a look NTFS TNG already has gone all the way (for a while
> > now in fact). Both fs inode and super block are fs internal slab caches
> and
> > both use static inline NTFS_I / NTFS_SB functions and the ntfs includes
> > from linux/fs.h are removed altogether. Code is in sourceforge cvs. For
> > instructions how to download the code or to browse it online, see:
>
>Nice, did you use the generic_ip fields?
Yes. From ntfs-driver-tng/linux/fs/ntfs/fs.h:
/**
* NTFS_SB - return the ntfs volume given a vfs super block
* @sb: VFS super block
*
* NTFS_SB() returns the ntfs volume associated with the VFS super block @sb.
*/
static inline ntfs_volume *NTFS_SB(struct super_block *sb)
{
return sb->u.generic_sbp;
}
/**
* NTFS_I - return the ntfs inode given a vfs inode
* @inode: VFS inode
*
* NTFS_I() returns the ntfs inode associated with the VFS @inode.
*/
static inline ntfs_inode *NTFS_I(struct inode *inode)
{
return inode->u.generic_ip;
}
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
On January 5, 2002 03:56 pm, Anton Altaparmakov wrote:
> At 14:47 05/01/02, Daniel Phillips wrote:
> >On January 5, 2002 03:29 pm, Anton Altaparmakov wrote:
> > > If anyone wants a look NTFS TNG already has gone all the way (for a
> > > while now in fact). Both fs inode and super block are fs internal slab
> > > caches and both use static inline NTFS_I / NTFS_SB functions and the
> > > ntfs includes from linux/fs.h are removed altogether. Code is in
> > > sourceforge cvs. For instructions how to download the code or to browse
> > > it online, see:
> >
> >Nice, did you use the generic_ip fields?
>
> Yes. From ntfs-driver-tng/linux/fs/ntfs/fs.h:
>
> [...]
>
> static inline ntfs_inode *NTFS_I(struct inode *inode)
> {
> return inode->u.generic_ip;
> }
OK, so are doing two kmem_cache_allocs for every new_inode. With the
unbork.fs patch you could save 50% of the kmem_cache_allocs by
rewriting as follows:
static inline ntfs_inode *NTFS_I(struct inode *inode)
{
/* should bug-check to be sure it's really one of ours */
return (ntfs_inode *) &(inode->u);
}
And you just fill in the inode_size field of the file_system_type
declaration. The vfs will then handle all the details of allocating/freeing
inodes and the inode slab cache. (Note that Al seems to think this is the
wrong way of doing it, but hasn't said why he thinks that yet.)
For superblocks - are you sure you want a dedicated slab cache for those? It
seems to me that kmalloc is perfectly appropriate for this, and saves the
code needed to set up, keep track of, and tear down the slab cache.
--
Daniel
At 03:32 06/01/2002, Daniel Phillips wrote:
>On January 5, 2002 03:56 pm, Anton Altaparmakov wrote:
> > At 14:47 05/01/02, Daniel Phillips wrote:
> > >On January 5, 2002 03:29 pm, Anton Altaparmakov wrote:
> > > > If anyone wants a look NTFS TNG already has gone all the way (for a
> > > > while now in fact). Both fs inode and super block are fs internal slab
> > > > caches and both use static inline NTFS_I / NTFS_SB functions and the
> > > > ntfs includes from linux/fs.h are removed altogether. Code is in
> > > > sourceforge cvs. For instructions how to download the code or to
> browse
> > > > it online, see:
> > >
> > >Nice, did you use the generic_ip fields?
> >
> > Yes. From ntfs-driver-tng/linux/fs/ntfs/fs.h:
> >
> > [...]
> >
> > static inline ntfs_inode *NTFS_I(struct inode *inode)
> > {
> > return inode->u.generic_ip;
> > }
>
>OK, so are doing two kmem_cache_allocs for every new_inode. With the
>unbork.fs patch you could save 50% of the kmem_cache_allocs by
>rewriting as follows:
>
> static inline ntfs_inode *NTFS_I(struct inode *inode)
> {
> /* should bug-check to be sure it's really one of ours */
> return (ntfs_inode *) &(inode->u);
> }
>
>And you just fill in the inode_size field of the file_system_type
>declaration. The vfs will then handle all the details of allocating/freeing
>inodes and the inode slab cache. (Note that Al seems to think this is the
>wrong way of doing it, but hasn't said why he thinks that yet.)
I will hold back any changes until all the details have been ironed out
first and Al has given his seal of approval... To be honest I fail to see
how one additional slab allocation will make any difference. Certainly for
NTFS where we attach dynamically allocated data to the fs specific part of
the inode (sometimes going via the slow vmalloc() at that) during the read
inode call, I doubt very much that there will be any visible performance
difference. But if it is decided that this is the Right Way(TM) to do it, I
will of course go with it.
>For superblocks - are you sure you want a dedicated slab cache for those? It
>seems to me that kmalloc is perfectly appropriate for this, and saves the
>code needed to set up, keep track of, and tear down the slab cache.
No, I was doing both at the same time and got slightly carried away. (-; I
will convert the super block allocations to simple kmalloc()s when I get bored.
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
> To be honest I fail to see how one additional slab allocation will make
> any difference. /
/
You could say the same about any aspect of Linux: and, relaxing your /
standards in such a way, you would inevitably end up with a dog. A /
good fast system emerges from its many small perfections. Half of /
the number of cache entries for inodes qualifies as one of those. /
_________________________________________________________________/
----------------------------------------------------------------
Daniel
At 22:42 06/01/02, Daniel Phillips wrote:
>I wrote:
> > To be honest I fail to see how one additional slab allocation will make
> > any difference. /
> /
>You could say the same about any aspect of Linux: and, relaxing your /
>standards in such a way, you would inevitably end up with a dog. A /
>good fast system emerges from its many small perfections. Half of /
>the number of cache entries for inodes qualifies as one of those. /
Big words but mere rhetoric IMHO... You would first have to prove that
combining the two structures (vfs and fs inodes) is an actual "perfection"
compared to the case where they are individual, which is what I am not
convinced about.
Due to the nature of the content in the vfs vs. fs inode I would expect
that one is used independent of the other in many, if not in the majority
of cases. If this is correct, then it might well be an actual benefit to
have the two separate and to benefit from the hwcache line alignment in the
fs specific part. Also considering that allocation happens once in
read_inode but the structure is then accessed many times.
Please note, I am not saying you are wrong, most likely you are quite right
in fact, I am just raising a caution flag that perhaps benchmarks of both
implementations for the same fs might be a Good Idea(TM)...
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Em Mon, Jan 07, 2002 at 12:30:58AM +0000, Anton Altaparmakov escreveu:
> At 22:42 06/01/02, Daniel Phillips wrote:
> >I wrote:
> >> To be honest I fail to see how one additional slab allocation will make
> >> any difference. /
> > /
> >You could say the same about any aspect of Linux: and, relaxing your /
> >standards in such a way, you would inevitably end up with a dog. A /
> >good fast system emerges from its many small perfections. Half of /
> >the number of cache entries for inodes qualifies as one of those. /
>
> Due to the nature of the content in the vfs vs. fs inode I would expect
> that one is used independent of the other in many, if not in the majority
> of cases. If this is correct, then it might well be an actual benefit to
> have the two separate and to benefit from the hwcache line alignment in the
> fs specific part. Also considering that allocation happens once in
> read_inode but the structure is then accessed many times.
>
> Please note, I am not saying you are wrong, most likely you are quite right
> in fact, I am just raising a caution flag that perhaps benchmarks of both
> implementations for the same fs might be a Good Idea(TM)...
Yes, having benchmarks done is a good idea and can clear any doubts about
the validity of such approach on a performace point of view.
When I did similar work for the network protocols, cleaning up
include/net/fs.h DaveM asked for benchmarks to see if the new approach,
i.e., using per network family slabcaches would lead to a performance drop,
I did it and we realized that it lead to performance _gains_, that in turn
made DaveM ask for a per network protocol slabcache, which made furter
memory savings and lead to further performance gains.
Yes, the usage pattern for sockets and inodes is different, thats why having
Daniel patches benchmarked against the current scheme can clear up the
matter about the validity of having the slabcaches.
Please note that we can have both approaches by leaving the
generic_ip/generic_sbp. In the struct sock case I left protinfo as a void
pointer, like generic_ip and some protocols use the slabcache approach
while others use the private area allocated separately and its pointer
stored in struct sock->protinfo.
- Arnaldo
On January 7, 2002 02:27 am, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 07, 2002 at 12:30:58AM +0000, Anton Altaparmakov escreveu:
> > At 22:42 06/01/02, Daniel Phillips wrote:
> > >I wrote:
> > >> To be honest I fail to see how one additional slab allocation will make
> > >> any difference. /
> > > /
> > >You could say the same about any aspect of Linux: and, relaxing your /
> > >standards in such a way, you would inevitably end up with a dog. A /
> > >good fast system emerges from its many small perfections. Half of /
> > >the number of cache entries for inodes qualifies as one of those. /
> >
> > Due to the nature of the content in the vfs vs. fs inode I would expect
> > that one is used independent of the other in many, if not in the majority
> > of cases. If this is correct, then it might well be an actual benefit to
> > have the two separate and to benefit from the hwcache line alignment in
> > the fs specific part. Also considering that allocation happens once in
> > read_inode but the structure is then accessed many times.
> >
> > Please note, I am not saying you are wrong, most likely you are quite
> > right in fact, I am just raising a caution flag that perhaps benchmarks
> > of both implementations for the same fs might be a Good Idea(TM)...
>
> Yes, having benchmarks done is a good idea and can clear any doubts about
> the validity of such approach on a performace point of view.
>
> When I did similar work for the network protocols, cleaning up
> include/net/fs.h DaveM asked for benchmarks to see if the new approach,
> i.e., using per network family slabcaches would lead to a performance drop,
> I did it and we realized that it lead to performance _gains_, that in turn
> made DaveM ask for a per network protocol slabcache, which made furter
> memory savings and lead to further performance gains.
Oh, so that's why you were too busy to do the fs.h patch ;-)
> Yes, the usage pattern for sockets and inodes is different, thats why having
> Daniel patches benchmarked against the current scheme can clear up the
> matter about the validity of having the slabcaches.
Yes, I can test against the good old version. Actually, my htree benchmarks
will do nicely here. They process a lot of inodes quickly and they put a lot
of pressure on the icache, which needs to be tested.
> Please note that we can have both approaches by leaving the
> generic_ip/generic_sbp. In the struct sock case I left protinfo as a void
> pointer, like generic_ip and some protocols use the slabcache approach
> while others use the private area allocated separately and its pointer
> stored in struct sock->protinfo.
Even if we leave the generic_ip in the common inode, we will for sure remove
the union at some point, meaning that even filesystems that use the
generic_ip now will have to do a big edit to clean up the fallout. Which
isn't such a bad thing I suppose.
If we wanted to be lazy, we could just leave the union there, with one
element, the generic_ip. How ugly would that be?
--
Daniel
Em Mon, Jan 07, 2002 at 03:12:06AM +0100, Daniel Phillips escreveu:
> On January 7, 2002 02:27 am, Arnaldo Carvalho de Melo wrote:
> > When I did similar work for the network protocols, cleaning up
> > include/net/fs.h DaveM asked for benchmarks to see if the new approach,
> > i.e., using per network family slabcaches would lead to a performance drop,
> > I did it and we realized that it lead to performance _gains_, that in turn
> > made DaveM ask for a per network protocol slabcache, which made furter
> > memory savings and lead to further performance gains.
>
> Oh, so that's why you were too busy to do the fs.h patch ;-)
*grin* And I still have to break the ipv6_sk_cachep into tcp6_sk_cachep,
udp6_sk_cachep and raw6_sk_cachep and test if moving the IPv4 identity
members in struct sock (sport, dport, saddr, rcv_saddr, daddr, etc) to
struct inet_opt is ok performance wise, doing that would remove the last
remnants of protocol specific stuff from include/net/sock.h 8)
<SNIP>
> Even if we leave the generic_ip in the common inode, we will for sure remove
> the union at some point, meaning that even filesystems that use the
> generic_ip now will have to do a big edit to clean up the fallout. Which
> isn't such a bad thing I suppose.
yes, in the sock.h cleanup the protinfo big union turned into just a void
pointer.
> If we wanted to be lazy, we could just leave the union there, with one
> element, the generic_ip. How ugly would that be?
Well, it can be left for later, first step would be to abstract the access
to the private areas in all the filesystems, but hey, don't worry as as
from the quick look I had some of the filesystems already use abstractions
for such access, like MSDOS_I, ext3 already does, NTFS NG also, etc 8)
- Arnaldo
Em Mon, Jan 07, 2002 at 03:12:06AM +0100, Daniel Phillips escreveu:
> On January 7, 2002 02:27 am, Arnaldo Carvalho de Melo wrote:
> > When I did similar work for the network protocols, cleaning up
> > include/net/fs.h DaveM asked for benchmarks to see if the new approach,
^^^^^^^^^^^^^^^^
include/net/sock.h
ouch, too much cleanups and lack of sleep(tm) 8)
- Arnaldo