I've been trying to track down the problems in ext4's online resizing,
and one of the most noticeable is that mballoc has some specific data
structures which need to be enlarged when the number of block groups in
the filesystem is grown dynamically.
Specifically, the s_group_info array needs to grow; in the current ext4
patch queue this isn't happening, which means that after an online
resize, when the filesystem is unmounted, ext4_put_super() calls
ext4_mb_release(), which then iterates over the s_group_info array and
triggers a kernel oops.
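A minimal userspace sketch of the failure mode described above, not the kernel code: a per-group pointer array sized at mount time has to be enlarged before the group count is raised, otherwise a release loop bounded by the group count walks past the end of the array. The names sb_model, group_info, model_add_groups and model_release are hypothetical stand-ins, not the real ext4 structures or functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct group_info { unsigned free_blocks; };

struct sb_model {
    unsigned groups_count;           /* number of block groups */
    unsigned info_slots;             /* entries allocated in group_info */
    struct group_info **group_info;  /* one pointer per block group */
};

/* Grow the per-group array first, then raise the group count. */
static int model_add_groups(struct sb_model *sb, unsigned new_count)
{
    if (new_count > sb->info_slots) {
        struct group_info **tmp =
            realloc(sb->group_info, new_count * sizeof(*tmp));
        if (!tmp)
            return -1;
        /* Clear only the newly added slots. */
        memset(tmp + sb->info_slots, 0,
               (new_count - sb->info_slots) * sizeof(*tmp));
        sb->group_info = tmp;
        sb->info_slots = new_count;
    }
    while (sb->groups_count < new_count) {
        struct group_info *gi = calloc(1, sizeof(*gi));
        if (!gi)
            return -1;
        sb->group_info[sb->groups_count++] = gi;
    }
    return 0;
}

/*
 * Walk every group and free its entry. This is only safe because
 * model_add_groups() keeps info_slots >= groups_count.
 */
static unsigned model_release(struct sb_model *sb)
{
    unsigned freed = 0;

    for (unsigned g = 0; g < sb->groups_count; g++) {
        free(sb->group_info[g]);
        freed++;
    }
    free(sb->group_info);
    sb->group_info = NULL;
    sb->groups_count = sb->info_slots = 0;
    return freed;
}
```

If the resize path bumped groups_count without the realloc step, the loop in model_release() would read entries that were never allocated, which is the userspace analogue of the oops at unmount.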
Is clusterfs running with mballoc in production? If so, how was this
problem fixed? Did we miss a patch to make sure that on-line resizing
worked with mballoc enabled?
Thanks, regards,
- Ted
On Jun 09, 2008 23:46 -0400, Theodore Ts'o wrote:
> I've been trying to track down the problems in ext4's online resizing,
> and one of the most noticeable is that mballoc has some specific data
> structures which need to be enlarged when the number of block groups in
> the filesystem is grown dynamically.
>
> Specifically, the s_group_info array needs to grow; in the current ext4
> patch queue this isn't happening, which means that after an online
> resize, when the filesystem is unmounted, ext4_put_super() calls
> ext4_mb_release(), which then iterates over the s_group_info array and
> triggers a kernel oops.
>
> Is clusterfs running with mballoc in production? If so, how was this
> problem fixed? Did we miss a patch to make sure that on-line resizing
> worked with mballoc enabled?
When Lustre is mounting the backing filesystem on the server, there is
no ext3 mountpoint visible to userspace, hence no access to the underlying
filesystem to pass the resize ioctl to, so we haven't had this problem
yet. We filed a bug on it, for when we are able to pass an ioctl through:
https://bugzilla.lustre.org/show_bug.cgi?id=15208
We have another open bug related to resize2fs and uninit_bg, but that
is for offline resizing:
https://bugzilla.lustre.org/show_bug.cgi?id=12002
Both of these bugs are mere placeholders; they don't have any patches.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Tue, Jun 10, 2008 at 12:24:45AM -0600, Andreas Dilger wrote:
> When Lustre is mounting the backing filesystem on the server, there is
> no ext3 mountpoint visible to userspace, hence no access to the underlying
> filesystem to pass the resize ioctl to, so we haven't had this problem
> yet. We filed a bug on it, for when we are able to pass an ioctl through:
>
> https://bugzilla.lustre.org/show_bug.cgi?id=15208
>
> We have another open bug related to resize2fs and uninit_bg, but that
> is for offline resizing:
>
> https://bugzilla.lustre.org/show_bug.cgi?id=12002
>
> Both of these bugs are mere placeholders; they don't have any patches.
There is a third (and possibly fourth) problem, which is that online
resizing with ext4dev (even without any patches from the ext4 patch
queue) is corrupting the filesystem, by not properly initializing the
block group descriptors:
Group 8: (Blocks 65537-73728)
Block bitmap at 0, Inode bitmap at 0
Inode table at 0-255
0 free blocks, 0 free inodes, 0 directories
Free blocks:
Free inodes:
Group 9: (Blocks 73729-79999)
Backup superblock at 73729, Group descriptors at 73730-73730
Reserved GDT blocks at 73731-73985
Block bitmap at 0, Inode bitmap at 0
Inode table at 0-255
0 free blocks, 0 free inodes, 0 directories
Free blocks:
Free inodes:
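One way to flag descriptors like the two in the listing above is a simple range check. This is only a sketch under stated assumptions: gd_model and gd_looks_initialized are hypothetical names, and the check assumes a layout without flex_bg, where each group's bitmaps and inode table live inside that group's own block range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three location fields shown by dumpe2fs. */
struct gd_model {
    uint64_t block_bitmap;
    uint64_t inode_bitmap;
    uint64_t inode_table;   /* first block of the inode table */
};

static bool in_range(uint64_t blk, uint64_t first, uint64_t last)
{
    return blk >= first && blk <= last;
}

/*
 * Assumption: no flex_bg, so all of a group's metadata must fall within
 * [first, last], the block range of that group. itb_blocks is the length
 * of the inode table in blocks.
 */
static bool gd_looks_initialized(const struct gd_model *gd,
                                 uint64_t first, uint64_t last,
                                 uint64_t itb_blocks)
{
    if (itb_blocks == 0)
        return false;
    return in_range(gd->block_bitmap, first, last) &&
           in_range(gd->inode_bitmap, first, last) &&
           in_range(gd->inode_table, first, last) &&
           in_range(gd->inode_table + itb_blocks - 1, first, last);
}
```

Fed the values for group 8 above (bitmaps at 0, inode table at 0-255, group range 65537-73728), the check fails, since block 0 lies outside the group.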
Furthermore, if the filesystem is grown to the point where a second
set of blocks needs to be pulled from the resize inode, apparently the
resize inode is getting corrupted:
Performing an on-line resize of /dev/ubd16 to 12582912 (1k) blocks.
EXT4-fs warning (device ubdb): verify_reserved_gdb: reserved GDT 3 missing grp 1 (8195)
resize2fs: Invalid argument While trying to add group #25
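For a sense of scale, the arithmetic behind "how many group descriptor blocks does a given size need" can be sketched as follows. The inputs are assumptions: 1k blocks, 8192 blocks per group and a first data block of 1 are read off the dumpe2fs listing above, while the 32-byte descriptor size is assumed rather than taken from this thread. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d;
}

/* Number of block groups needed to cover the filesystem. */
static uint64_t group_count(uint64_t blocks, uint64_t first_data_block,
                            uint64_t blocks_per_group)
{
    return div_round_up(blocks - first_data_block, blocks_per_group);
}

/* Number of blocks needed to hold one descriptor per group. */
static uint64_t gdt_blocks(uint64_t groups, uint64_t block_size,
                           uint64_t desc_size)
{
    /* desc_size is an assumption (32 bytes) in the usage note below. */
    return div_round_up(groups, block_size / desc_size);
}
```

Under those assumptions, the 80000-block filesystem in the listing has 10 groups and fits in 1 descriptor block, matching "Group descriptors at 73730-73730", while the 12582912-block target works out to 1536 groups and 48 descriptor blocks, so the resize has to take many blocks from the reserved GDT area.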
I'm not sure if this is related to the third problem above, since
until that problem is fixed it is hard to determine what is going on
with the fourth. They may end up having the same root cause.
I'm looking into it, but it seems pretty clear to me no one has really
tested online resizing on ext4 in quite a while, and the code has
bitrotted. Hopefully it won't be too hard to fix it. In the
meantime, it really makes me wonder how on earth Josef Bacik actually
tested this patch:
commit 944600930a37aa725ba6f93c3244e2d77a1e3581
Author: Josef Bacik <[email protected]>
Date: Fri Jun 6 18:05:52 2008 -0400
ext4: fix online resize bug
There is a bug when we are trying to verify that the reserve inode's
double indirect blocks point back to the primary gdt blocks. The fix is
obvious, we need to mod the gdb count by the addr's per block. This was
verified using the same testcase as with the ext3 equivalent of this
patch.
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Mingming Cao <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
- Ted
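The "mod the gdb count by the addr's per block" step in the commit message above can be illustrated with a small sketch. This shows the arithmetic only, under the assumption of 32-bit block addresses; addrs_per_block and dind_slot are hypothetical helper names, not functions from the ext4 source.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit block addresses that fit in one block. */
static unsigned addrs_per_block(unsigned block_size)
{
    return block_size / (unsigned)sizeof(uint32_t);
}

/*
 * Slot within a single indirect block for the given group descriptor
 * block count. Without the modulo, a count at or above
 * addrs_per_block() would index past the end of the block.
 */
static unsigned dind_slot(unsigned gdb_count, unsigned block_size)
{
    return gdb_count % addrs_per_block(block_size);
}
```

With 1k blocks there are 256 addresses per block, so a count of 300 maps to slot 44 rather than to an offset beyond the block.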