2008-11-14 09:17:56

by Pekka Enberg

Subject: [PATCH] ext2/ext3: allocate ->s_blockgroup_lock separately to avoid wasting space

From: Pekka Enberg <[email protected]>

As spotted by kmemtrace, struct ext2_sb_info is 17024 bytes and ext3_sb_info is
17152 bytes on 64-bit, which makes them a very bad fit for SLAB allocators. In
fact, both allocations are rounded up to the next available page size of
order 3, which is 32 KB.

The culprit of the wasted memory is ->s_blockgroup_lock, which can be as big
as 16 KB when CONFIG_NR_CPUS is set to 32. As struct blockgroup_lock is a
perfect fit for an order-2 page allocation in the worst case, allocate
->s_blockgroup_lock separately to avoid wasting space.

The change shrinks struct ext2_sb_info to 592 bytes and struct ext3_sb_info to
640 bytes, both of which fit into a 1024-byte slab cache, so now we allocate
16 KB + 1 KB instead of 32 KB, saving 15 KB of memory!

Signed-off-by: Pekka Enberg <[email protected]>
---
fs/ext2/super.c | 10 +++++++++-
fs/ext3/super.c | 10 +++++++++-
include/linux/blockgroup_lock.h | 2 +-
include/linux/ext2_fs_sb.h | 2 +-
include/linux/ext3_fs_sb.h | 2 +-
5 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 647cd88..da8bdea 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -132,6 +132,7 @@ static void ext2_put_super (struct super_block * sb)
percpu_counter_destroy(&sbi->s_dirs_counter);
brelse (sbi->s_sbh);
sb->s_fs_info = NULL;
+ kfree(sbi->s_blockgroup_lock);
kfree(sbi);

return;
@@ -756,6 +757,13 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi)
return -ENOMEM;
+
+ sbi->s_blockgroup_lock =
+ kzalloc(sizeof(struct blockgroup_lock), GFP_KERNEL);
+ if (!sbi->s_blockgroup_lock) {
+ kfree(sbi);
+ return -ENOMEM;
+ }
sb->s_fs_info = sbi;
sbi->s_sb_block = sb_block;

@@ -983,7 +991,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
printk ("EXT2-fs: not enough memory\n");
goto failed_mount;
}
- bgl_lock_init(&sbi->s_blockgroup_lock);
+ bgl_lock_init(sbi->s_blockgroup_lock);
sbi->s_debts = kcalloc(sbi->s_groups_count, sizeof(*sbi->s_debts), GFP_KERNEL);
if (!sbi->s_debts) {
printk ("EXT2-fs: not enough memory\n");
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index f6c94f2..f41df22 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -439,6 +439,7 @@ static void ext3_put_super (struct super_block * sb)
ext3_blkdev_remove(sbi);
}
sb->s_fs_info = NULL;
+ kfree(sbi->s_blockgroup_lock);
kfree(sbi);
return;
}
@@ -1548,6 +1549,13 @@ static int ext3_fill_super (struct super_block *sb, void *data, int silent)
sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi)
return -ENOMEM;
+
+ sbi->s_blockgroup_lock =
+ kzalloc(sizeof(struct blockgroup_lock), GFP_KERNEL);
+ if (!sbi->s_blockgroup_lock) {
+ kfree(sbi);
+ return -ENOMEM;
+ }
sb->s_fs_info = sbi;
sbi->s_mount_opt = 0;
sbi->s_resuid = EXT3_DEF_RESUID;
@@ -1788,7 +1796,7 @@ static int ext3_fill_super (struct super_block *sb, void *data, int silent)
goto failed_mount;
}

- bgl_lock_init(&sbi->s_blockgroup_lock);
+ bgl_lock_init(sbi->s_blockgroup_lock);

for (i = 0; i < db_count; i++) {
block = descriptor_loc(sb, logic_sb_block, i);
diff --git a/include/linux/blockgroup_lock.h b/include/linux/blockgroup_lock.h
index 8607312..d6d4787 100644
--- a/include/linux/blockgroup_lock.h
+++ b/include/linux/blockgroup_lock.h
@@ -54,6 +54,6 @@ static inline void bgl_lock_init(struct blockgroup_lock *bgl)
* superblock types
*/
#define sb_bgl_lock(sb, block_group) \
- (&(sb)->s_blockgroup_lock.locks[(block_group) & (NR_BG_LOCKS-1)].lock)
+ (&(sb)->s_blockgroup_lock->locks[(block_group) & (NR_BG_LOCKS-1)].lock)

#endif
diff --git a/include/linux/ext2_fs_sb.h b/include/linux/ext2_fs_sb.h
index f273415..7e61de9 100644
--- a/include/linux/ext2_fs_sb.h
+++ b/include/linux/ext2_fs_sb.h
@@ -101,7 +101,7 @@ struct ext2_sb_info {
struct percpu_counter s_freeblocks_counter;
struct percpu_counter s_freeinodes_counter;
struct percpu_counter s_dirs_counter;
- struct blockgroup_lock s_blockgroup_lock;
+ struct blockgroup_lock *s_blockgroup_lock;
/* root of the per fs reservation window tree */
spinlock_t s_rsv_window_lock;
struct rb_root s_rsv_window_root;
diff --git a/include/linux/ext3_fs_sb.h b/include/linux/ext3_fs_sb.h
index b65f028..ec10d96 100644
--- a/include/linux/ext3_fs_sb.h
+++ b/include/linux/ext3_fs_sb.h
@@ -60,7 +60,7 @@ struct ext3_sb_info {
struct percpu_counter s_freeblocks_counter;
struct percpu_counter s_freeinodes_counter;
struct percpu_counter s_dirs_counter;
- struct blockgroup_lock s_blockgroup_lock;
+ struct blockgroup_lock *s_blockgroup_lock;

/* root of the per fs reservation window tree */
spinlock_t s_rsv_window_lock;
--
1.5.4.3
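
The size arithmetic in the commit message can be checked with a small userspace sketch. It is a simplified model, not the kernel's allocator: it assumes 4 KB pages and pure power-of-two size classes, and the helper names (kmalloc_footprint, page_order) are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u	/* assumption: 4 KB pages */

/*
 * Simplified model of a power-of-two size class: round the request up
 * to the next power of two. The starting bucket of 32 bytes is an
 * assumption made for this sketch.
 */
static size_t kmalloc_footprint(size_t size)
{
	size_t bucket = 32;

	assert(size > 0);
	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

/* Smallest page order whose span covers the given footprint. */
static unsigned int page_order(size_t footprint)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_BYTES;

	while (span < footprint) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions, a 17024-byte request lands in a 32768-byte (order 3) allocation, while the split version costs 16384 bytes (order 2) for the lock array plus 1024 bytes for the shrunken sb_info, which is where the 15 KB saving comes from.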



2008-11-14 21:26:28

by Andreas Dilger

Subject: Re: [PATCH] ext2/ext3: allocate ->s_blockgroup_lock separately to avoid wasting space

On Nov 14, 2008 11:17 +0200, Pekka J Enberg wrote:
> As spotted by kmemtrace, struct ext2_sb_info is 17024 bytes and ext3_sb_info is
> 17152 bytes on 64-bit which makes them a very bad fit for SLAB allocators. In
> fact, both allocations are rounded up to the next available page size of
> order 3 which is 32 KB.
>
> The culprit of the wasted memory is the ->s_blockgroup_lock which can be as
> big as 16 KB when CONFIG_NR_CPUS is set to 32. As struct blockgroup_lock is a
> perfect fit for order 2 page in the worst case, allocate ->s_blockgroup_lock
> separately to avoid wasting space.
>
> The change shrinks struct ext2_sb_info to 592 bytes and struct ext3_sb_info to
> 640 bytes which fits into a 1024 byte slab cache so now we allocate 16 KB + 1
> KB instead of 32 KB saving 15 KB of memory!
>
> Signed-off-by: Pekka Enberg <[email protected]>

This looks very reasonable, with some minor comments below.
Could you please also include a patch for ext4? Also, Andrew prefers that
the patches for ext2/ext3/ext4 be sent in separate emails.

> --- a/include/linux/blockgroup_lock.h
> +++ b/include/linux/blockgroup_lock.h
> #define sb_bgl_lock(sb, block_group) \
> - (&(sb)->s_blockgroup_lock.locks[(block_group) & (NR_BG_LOCKS-1)].lock)
> + (&(sb)->s_blockgroup_lock->locks[(block_group) & (NR_BG_LOCKS-1)].lock)

How the struct is allocated seems like an implementation detail that doesn't
belong in blockgroup_lock.h at all, because "sb" is not "struct super_block"
but rather "struct ext[23]_sb_info". In fact, changing this without also
patching ext4 would break ext4.

I would suggest changing this to take the s_blockgroup_lock as a parameter,

#define bgl_lock_ptr(bgl, block_group) \
	(&(bgl)->locks[(block_group) & (NR_BG_LOCKS - 1)].lock)

and then in ext[234]_fs_sb.h add a new helper in the same (first) patch:

#define sb_bgl_lock(sbi, block_group) \
	bgl_lock_ptr(&(sbi)->s_blockgroup_lock, block_group)

and remove sb_bgl_lock() from blockgroup_lock.h entirely. As part of the
later patches that change the s_blockgroup_lock allocation for each of
ext[234], the helper in ext[234]_fs_sb.h then changes to:

#define sb_bgl_lock(sbi, block_group) \
	bgl_lock_ptr((sbi)->s_blockgroup_lock, block_group)


This allows each of the later patches to be landed separately without
breaking the build.
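
The two-step staging suggested above can be sketched in userspace roughly as follows. This is only an illustration: spinlock_t is replaced by a stand-in type, the struct names sb_info_embedded and sb_info_pointer and the _embedded/_pointer macro suffixes are hypothetical (in the real series both steps would use the same sb_bgl_lock name), and NR_BG_LOCKS is fixed at the 128 value discussed in this thread.

```c
#include <assert.h>

#define NR_BG_LOCKS 128		/* value for >= 32 CPUs, per this thread */

/* The index mask below only works if the lock count is a power of two. */
static_assert((NR_BG_LOCKS & (NR_BG_LOCKS - 1)) == 0,
	      "NR_BG_LOCKS must be a power of two");

typedef int spinlock_t;		/* stand-in for the kernel type */

struct bgl_lock {
	spinlock_t lock;
};

struct blockgroup_lock {
	struct bgl_lock locks[NR_BG_LOCKS];
};

/* Generic helper: takes a pointer to the lock array, knows nothing about sb_info. */
#define bgl_lock_ptr(bgl, block_group) \
	(&(bgl)->locks[(block_group) & (NR_BG_LOCKS - 1)].lock)

/* Step 1: the lock array is still embedded in the per-filesystem sb_info. */
struct sb_info_embedded {
	struct blockgroup_lock s_blockgroup_lock;
};
#define sb_bgl_lock_embedded(sbi, block_group) \
	bgl_lock_ptr(&(sbi)->s_blockgroup_lock, block_group)

/* Step 2: the lock array is allocated separately and reached via a pointer. */
struct sb_info_pointer {
	struct blockgroup_lock *s_blockgroup_lock;
};
#define sb_bgl_lock_pointer(sbi, block_group) \
	bgl_lock_ptr((sbi)->s_blockgroup_lock, block_group)
```

Callers only ever see the per-filesystem helper, so switching a filesystem from the embedded form to the pointer form touches that filesystem's header alone and leaves the others building.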

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-11-16 12:58:32

by Peter Zijlstra

Subject: Re: [PATCH] ext2/ext3: allocate ->s_blockgroup_lock separately to avoid wasting space

On Fri, 2008-11-14 at 11:17 +0200, Pekka J Enberg wrote:
> From: Pekka Enberg <[email protected]>
>
> As spotted by kmemtrace, struct ext2_sb_info is 17024 bytes and ext3_sb_info is
> 17152 bytes on 64-bit which makes them a very bad fit for SLAB allocators. In
> fact, both allocations are rounded up to the next available page size of
> order 3 which is 32 KB.
>
> The culprit of the wasted memory is the ->s_blockgroup_lock which can be as big
> as 16 KB when CONFIG_NR_CPUS is set to 32. As struct blockgroup_lock is a
> perfect fit for order 2 page in the worst case, allocate ->s_blockgroup_lock
> separately to avoid wasting space.

And here I was thinking that NR_CPUS=4096 is currently our worst
case ;-)


2008-11-17 21:31:40

by Pekka Enberg

Subject: Re: [PATCH] ext2/ext3: allocate ->s_blockgroup_lock separately to avoid wasting space

Peter Zijlstra wrote:
> On Fri, 2008-11-14 at 11:17 +0200, Pekka J Enberg wrote:
>> From: Pekka Enberg <[email protected]>
>>
>> As spotted by kmemtrace, struct ext2_sb_info is 17024 bytes and ext3_sb_info is
>> 17152 bytes on 64-bit which makes them a very bad fit for SLAB allocators. In
>> fact, both allocations are rounded up to the next available page size of
>> order 3 which is 32 KB.
>>
>> The culprit of the wasted memory is the ->s_blockgroup_lock which can be as big
>> as 16 KB when CONFIG_NR_CPUS is set to 32. As struct blockgroup_lock is a
>> perfect fit for order 2 page in the worst case, allocate ->s_blockgroup_lock
>> separately to avoid wasting space.
>
> And here I was thinking that NR_CPUS=4096 is currently our worst
> case ;-)

Sure, but look at <linux/blockgroup_lock.h>. NR_BG_LOCKS is capped at 128
for >= 32 CPUs.
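
To illustrate why the worst case stays at 16 KB no matter how large NR_CPUS gets, here is a rough userspace model of the cap. Only the "128 locks for >= 32 CPUs" tier comes from this thread; the tiers below it and the 128-byte per-lock stride (inferred from 16 KB / 128 locks) are assumptions, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: each lock is padded to 128 bytes, which is what
 * 16 KB / 128 locks works out to. The thread does not state this.
 */
#define ASSUMED_LOCK_STRIDE 128u

/* Hypothetical model of how NR_BG_LOCKS is chosen from the CPU count. */
static unsigned int nr_bg_locks_for(unsigned int nr_cpus)
{
	assert(nr_cpus >= 1);

	if (nr_cpus >= 32)
		return 128;	/* the cap mentioned in this thread */

	/* Tiers below the cap are assumptions, for illustration only. */
	if (nr_cpus >= 16)
		return 64;
	if (nr_cpus >= 8)
		return 32;
	if (nr_cpus >= 4)
		return 16;
	if (nr_cpus >= 2)
		return 8;
	return 4;
}

/* Size of the lock array under the assumed per-lock stride. */
static size_t blockgroup_lock_bytes(unsigned int nr_cpus)
{
	return (size_t)nr_bg_locks_for(nr_cpus) * ASSUMED_LOCK_STRIDE;
}
```

With this model, NR_CPUS=32 and NR_CPUS=4096 both give 128 locks and therefore the same 16 KB lock array.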