2011-09-06 07:24:14

by MaoXiaoyun

Subject: ext4 BUG in dom0 Kernel 2.6.32.36



Hi:

I've hit an ext4 bug in dom0 kernel 2.6.32.36 (see the kernel stack below).
2.6.32.36 kernel commit: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ae333e97552c81ab10395ad1ffc6d6daaadb144a

The bug only shows up in our cluster environment, which includes 300 physical machines; about one server runs into this bug per day.
Each server runs about 30 VMs, each with a heavy I/O workload inside (we are running stress tests).

We use our own distributed file system as the cluster's storage. Every VM image file is split into several equal-sized files on the
physical disk, and every file is created with ext4 fallocate (fallocate size 1 MB). So I believe quite a lot of uninitialized
extents get initialized during the test.

After going through the source code, the call path is:
ext4_da_writepages->mpage_da_map_blocks->ext4_get_blocks->ext4_ext_get_blocks->
ext4_ext_handle_uninitialized_extents->ext4_ext_convert_to_initialized->ext4_ext_insert_extent


If ext4_ext_handle_uninitialized_extents is called, then the check at line 3306 must have passed,
that is, in_range(iblock, ee_block, ee_len) is true,
so iblock >= ee_block.

fs/ext4/extents.c
3306         if (in_range(iblock, ee_block, ee_len)) {
3307             newblock = iblock - ee_block + ee_start;
3308             /* number of remaining blocks in the extent */
3309             allocated = ee_len - (iblock - ee_block);
3310             ext_debug("%u fit into %u:%d -> %llu\n", iblock,
3311                     ee_block, ee_len, newblock);
3312
3313             /* Do not put uninitialized extent in the cache */
3314             if (!ext4_ext_is_uninitialized(ex)) {
3315                 ext4_ext_put_in_cache(inode, ee_block,
3316                             ee_len, ee_start,
3317                             EXT4_EXT_CACHE_EXTENT);
3318                 goto out;
3319             }
3320             ret = ext4_ext_handle_uninitialized_extents(handle,
3321                     inode, iblock, max_blocks, path,
3322                     flags, allocated, bh_result, newblock);
3323             return ret;
3324         }
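
The range check and the two computations at lines 3307 and 3309 can be tried out in user space. The following is a minimal sketch, not kernel code: in_range() is rewritten as a function with the same shape as the ext4 macro as I understand it, and struct mapping and map_in_extent() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as ext4's in_range() macro: true when b lies in
 * [first, first + len - 1]. Assumes len > 0. */
static bool in_range(uint32_t b, uint32_t first, uint32_t len)
{
    return b >= first && b <= first + len - 1;
}

/* Hypothetical container for the two values computed at lines 3307/3309. */
struct mapping {
    uint64_t newblock;   /* physical block backing iblock */
    uint32_t allocated;  /* blocks left in the extent, iblock included */
};

/* Mirrors lines 3307 and 3309 for a logical block inside an extent that
 * starts at logical block ee_block, physical block ee_start, length ee_len. */
static struct mapping map_in_extent(uint32_t iblock, uint32_t ee_block,
                                    uint32_t ee_len, uint64_t ee_start)
{
    struct mapping m;

    /* Line 3306 guarantees this before the computations run. */
    assert(in_range(iblock, ee_block, ee_len));

    m.newblock = iblock - ee_block + ee_start;
    m.allocated = ee_len - (iblock - ee_block);
    return m;
}
```

For an extent covering logical blocks 100..355, in_range() accepts 100 and 355 and rejects 99 and 356, so any iblock that reaches ext4_ext_handle_uninitialized_extents satisfies iblock >= ee_block.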


The newext comes from line 2678; its ee_block is iblock + max_blocks.
The nearex is path[depth].p_ext (line 1683).

The BUG_ON at line 1716 fires when iblock + max_blocks == ee_block.
Since iblock >= ee_block, maybe that means we have iblock == ee_block and max_blocks == 0.


1716         BUG_ON(newext->ee_block == nearex->ee_block);
1717         len = (EXT_MAX_EXTENT(eh) - nearex) * sizeof(struct ext4_extent);
1718         len = len < 0 ? 0 : len;
1719         ext_debug("insert %d:%llu:[%d]%d before: nearest 0x%p, "
1720                 "move %d from 0x%p to 0x%p\n",
1721                 le32_to_cpu(newext->ee_block),
1722                 ext_pblock(newext),
1723                 ext4_ext_is_uninitialized(newext),
1724                 ext4_ext_get_actual_len(newext),
1725                 nearex, len, nearex + 1, nearex + 2);
1726         memmove(nearex + 1, nearex, len);
1727         path[depth].p_ext = nearex;
1728     }


2678         ex3 = &newex;
2679         ex3->ee_block = cpu_to_le32(iblock + max_blocks);
2680         ext4_ext_store_pblock(ex3, newblock + max_blocks);
2681         ex3->ee_len = cpu_to_le16(allocated - max_blocks);
2682         ext4_ext_mark_uninitialized(ex3);
2683         err = ext4_ext_insert_extent(handle, inode, path, ex3, 0);
2684         if (err == -ENOSPC && may_zeroout) {
2685             err = ext4_ext_zeroout(inode, &orig_ex);
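
The hypothesis above can be written down as a small user-space check. This is only a sketch of the poster's reasoning, under the assumption that nearex is the extent being split (so nearex->ee_block equals ee_block) and that the addition does not overflow; ex3_start() and hits_bug_on() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical start of the trailing uninitialized piece, as set at line 2679. */
static uint32_t ex3_start(uint32_t iblock, uint32_t max_blocks)
{
    return iblock + max_blocks;
}

/* Condition tested by the BUG_ON at line 1716, ASSUMING nearex is the
 * extent being split, i.e. nearex->ee_block == ee_block. */
static bool hits_bug_on(uint32_t iblock, uint32_t max_blocks,
                        uint32_t ee_block)
{
    /* Guaranteed by the in_range() check at line 3306. */
    assert(iblock >= ee_block);

    return ex3_start(iblock, max_blocks) == ee_block;
}
```

Under these assumptions the condition holds only for iblock == ee_block with max_blocks == 0; a write of 64 blocks at the start of the extent gives ex3_start() == ee_block + 64 and does not match. The patch description later in this thread points to a stale on-disk extent as the actual cause, which would break the assumption about nearex rather than require max_blocks == 0.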


If max_blocks is 0, it means that at line 2225, mpd->b_size >> mpd->inode->i_blkbits is 0.

fs/ext4/inode.c
2220 static int mpage_da_map_blocks(struct mpage_da_data *mpd)
2221 {
2222     int err, blks, get_blocks_flags;
2223     struct buffer_head new;
2224     sector_t next = mpd->b_blocknr;
2225     unsigned max_blocks = mpd->b_size >> mpd->inode->i_blkbits;
2226     loff_t disksize = EXT4_I(mpd->inode)->i_disksize;
2227     handle_t *handle = NULL;
2228
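
To see when the shift at line 2225 yields 0, here is a minimal user-space sketch. max_blocks_for() is a hypothetical helper that stands in for the expression mpd->b_size >> mpd->inode->i_blkbits; the 12-bit shift used in the usage note assumes a 4096-byte block size.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for line 2225: the number of whole blocks covered by b_size
 * bytes when the block size is (1 << blkbits). */
static unsigned max_blocks_for(uint64_t b_size, unsigned blkbits)
{
    assert(blkbits < 64);  /* keep the shift well defined */
    return (unsigned)(b_size >> blkbits);
}
```

With blkbits == 12, a 256 KiB extent of dirty buffers gives max_blocks_for(262144, 12) == 64, while any b_size below 4096 (including 0) gives 0. So max_blocks == 0 would require mpage_da_map_blocks to be entered with less than one block's worth of bytes in mpd->b_size.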


Could that be possible? Right now I am trying to reproduce this problem in a much
easier way. Any suggestions?

Many thanks.


------------[ cut here ]------------
kernel BUG at fs/ext4/extents.c:1716!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/tapdevk/stat
CPU 3
Modules linked in: xt_iprange xt_mac arptable_filter arp_tables xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack
iptable_filter ip_tables bridge autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 8021q garp stp llc xenfs
dm_multipath fuse xen_netback xen_blkback blktap blkback_pagemap loop nbd video output sbs sbshc parport_pc lp parport joydev ses
enclosure snd_seq_dummy snd_seq_oss bnx2 snd_seq_midi_event snd_seq snd_seq_device dcdbas snd_pcm_oss snd_mixer_oss serio_raw snd_pcm
snd_timer snd soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr shpchp [last unloaded: freq_table]
Pid: 9073, comm: flush-8:16 Not tainted 2.6.32.36xen #1 PowerEdge R710
RIP: e030:[<ffffffff811a6184>] [<ffffffff811a6184>] ext4_ext_insert_extent+0xac1/0xbe0
RSP: e02b:ffff8801499cd580 EFLAGS: 00010246
RAX: 0000000000002948 RBX: 0000000000000000 RCX: ffff8801499cd780
RDX: ffff8801499cd360 RSI: ffff88007dedb310 RDI: 0000000000000017
RBP: ffff8801499cd650 R08: ffff8801499cd340 R09: ffff880063488930
R10: 000000018100f8bf R11: dead000000200200 R12: ffff88005a29700c
R13: ffff88005a297000 R14: ffff8801158198c0 R15: ffff88003e9ea1b0
FS: 00007fd3cc4bf6e0(0000) GS:ffff88002808f000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000042a09e CR3: 00000000bf3bd000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-8:16 (pid: 9073, threadinfo ffff8801499cc000, task ffff880149ad5b40)
Stack:
ffff8801499cd780 ffff88003e9ea180 ffff8801c5b47300 01ffffff81103c0c
<0> ffff88003e9ea180 000000017dedb2a0 ffff880115819800 ffff88007dedb2a0
<0> ffff8801499cd5d0 ffffffff811c12ea ffff8801499cd5f0 ffffffff811c16ea
Call Trace:
[<ffffffff811c12ea>] ? jbd_unlock_bh_journal_head+0x16/0x18
[<ffffffff811c16ea>] ? jbd2_journal_put_journal_head+0x4d/0x52
[<ffffffff811bb7d6>] ? jbd2_journal_get_write_access+0x31/0x38
[<ffffffff811a88e9>] ? __ext4_journal_get_write_access+0x4c/0x5f
[<ffffffff811a6ce3>] ext4_ext_handle_uninitialized_extents+0xa40/0xef5
[<ffffffff8100f175>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8100f8d2>] ? check_events+0x12/0x20
[<ffffffff81042fcf>] ? need_resched+0x23/0x2d
[<ffffffff811a74e1>] ext4_ext_get_blocks+0x265/0x6eb
[<ffffffff81042fcf>] ? need_resched+0x23/0x2d
[<ffffffff81188b55>] ext4_get_blocks+0x140/0x204
[<ffffffff81188d2f>] mpage_da_map_blocks+0xb7/0x681
[<ffffffff810d3b29>] ? find_get_pages_tag+0x48/0xcc
[<ffffffff8100f8d2>] ? check_events+0x12/0x20
[<ffffffff810da8df>] ? pagevec_lookup_tag+0x27/0x30
[<ffffffff810d87cc>] ? write_cache_pages+0x175/0x35e
[<ffffffff811893f0>] ? __mpage_da_writepage+0x0/0x164
[<ffffffff81103c0c>] ? kmem_cache_alloc+0x94/0xf6
[<ffffffff811bbc40>] ? jbd2_journal_start+0xa1/0xcd
[<ffffffff8119957f>] ? ext4_journal_start_sb+0xdc/0x111
[<ffffffff81186852>] ? ext4_meta_trans_blocks+0x74/0xce
[<ffffffff8118bc42>] ext4_da_writepages+0x47a/0x6a7
[<ffffffff810d8a00>] do_writepages+0x21/0x2a
[<ffffffff8112cdb8>] writeback_single_inode+0xc8/0x1e3
[<ffffffff8112d5e4>] writeback_inodes_wb+0x30b/0x37e
[<ffffffff8102f82d>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff8100b459>] ? xen_end_context_switch+0x1e/0x22
[<ffffffff8112d788>] wb_writeback+0x131/0x1bb
[<ffffffff81064029>] ? try_to_del_timer_sync+0x73/0x81
[<ffffffff8112d9ef>] wb_do_writeback+0x13c/0x153
[<ffffffff8106425b>] ? process_timeout+0x0/0x10
[<ffffffff810e78d1>] ? bdi_start_fn+0x0/0xd0
[<ffffffff8112da32>] bdi_writeback_task+0x2c/0xb3
[<ffffffff810e793b>] bdi_start_fn+0x6a/0xd0
[<ffffffff810754b7>] kthread+0x6e/0x76
[<ffffffff81013daa>] child_rip+0xa/0x20
[<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
[<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
[<ffffffff81013da0>] ? child_rip+0x0/0x20
Code: 8d 04 85 f4 ff ff ff 85 c0 0f 49 d8 48 63 d3 e8 47 c7 07 00 49 8d 44 24 0c 49 89 47 10 eb 3a bb f4 ff ff ff e9 c2 00 00 00 75 04
<0f> 0b eb fe 41 0f b7 45 04 49 8d 7c 24 0c 48 6b c0 0c 4c 89 e6
RIP [<ffffffff811a6184>] ext4_ext_insert_extent+0xac1/0xbe0
RSP <ffff8801499cd580>
---[ end trace 035c7d09ed95fb32 ]---


2011-09-06 11:33:50

by MaoXiaoyun

Subject: RE: ext4 BUG in dom0 Kernel 2.6.32.36



Running fsck on some of the hard disks shows multiply-claimed blocks.
It also looks like I need this patch to fix the "should not have EOFBLOCKS_FL set" error.

http://git390.marist.edu/cgi-bin/gitweb.cgi?p=linux-2.6.git;a=commitdiff;h=58590b06d79f7ce5ab64ff3b6d537180fa50dc84

Inode 50343178 should not have EOFBLOCKS_FL set (size 67108864, lblk 16383)
Clear? yes
Inode 50345362 should not have EOFBLOCKS_FL set (size 67108864, lblk 16383)
Clear? yes
Inode 50345386 should not have EOFBLOCKS_FL set (size 63963136, lblk 15615)
Clear? yes
Inode 50345648 should not have EOFBLOCKS_FL set (size 3145728, lblk 767)
Clear? yes
Inode 50345690 should not have EOFBLOCKS_FL set (size 67108864, lblk 16383)
Clear? yes
Inode 50346361, i_blocks is 133136, should be 133256. Fix? yes

Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 50346361: 226854591 226854592 226854593 226854594 226854595 226854596 226854597 226854598 226854599 226854600 226854601 226854602 226854603 226854604 226854605 226854591 226854592 226854593 226854594 226854595 226854596 226854597 226854598 226854599 226854600 226854601 226854602 226854603 226854604 226854605
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 1 inodes containing multiply-claimed blocks.)
File /chunks/2410339941482498_637 (inode #50346361, mod time Tue Sep 6 16:25:33 2011)
has 30 multiply-claimed block(s), shared with 0 file(s):
Clone multiply-claimed blocks? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #0 (78, counted=63).
Fix? yes
Free blocks count wrong (7028646, counted=7028631).
Fix? yes
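
The numbers in this fsck run appear to be mutually consistent, which can be checked with a little arithmetic. The sketch below assumes that i_blocks is counted in 512-byte units (the usual case without huge-file accounting) and that each listed file size is an exact multiple of the block size; implied_block_size() and blocks_from_i_blocks_delta() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

enum { SECTOR_SIZE = 512 };  /* assumed unit of i_blocks */

/* Block size implied by a file whose last logical block is last_lblk,
 * assuming size is an exact multiple of the block size. */
static uint64_t implied_block_size(uint64_t size, uint64_t last_lblk)
{
    return size / (last_lblk + 1);
}

/* Number of filesystem blocks represented by a change in i_blocks. */
static uint64_t blocks_from_i_blocks_delta(uint64_t before, uint64_t after,
                                           uint64_t block_size)
{
    assert(after >= before && block_size >= SECTOR_SIZE);
    return (after - before) / (block_size / SECTOR_SIZE);
}
```

With these assumptions, size 67108864 with lblk 16383 implies 4096-byte blocks, and the i_blocks correction from 133136 to 133256 corresponds to 15 blocks. That matches the 15 distinct block numbers 226854591..226854605 (each listed twice, hence 30 entries shared with 0 other files) and the drop of 15 in both free-block counts after the blocks were cloned.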


2011-09-06 14:54:14

by Konrad Rzeszutek Wilk

Subject: Re: ext4 BUG in dom0 Kernel 2.6.32.36

On Tue, Sep 06, 2011 at 03:24:14PM +0800, MaoXiaoyun wrote:
>
>
> Hi:
>
> I've met an ext4 Bug in dom0 kernel 2.6.32.36. (See kernel stack below)

Did you try the 3.0 kernel?

2011-09-06 15:11:56

by MaoXiaoyun

Subject: RE: ext4 BUG in dom0 Kernel 2.6.32.36




> Date: Tue, 6 Sep 2011 10:53:47 -0400
> From: [email protected]
> To: [email protected]
> CC: [email protected]; [email protected]; [email protected]
> Subject: Re: ext4 BUG in dom0 Kernel 2.6.32.36
>
> On Tue, Sep 06, 2011 at 03:24:14PM +0800, MaoXiaoyun wrote:
> >
> >
> > Hi:
> >
> > I've met an ext4 Bug in dom0 kernel 2.6.32.36. (See kernel stack below)
>
> Did you try the 3.0 kernel?
No, I am afraid the change would be too much for our current environment
and may cause other stability issues.
So I want to dig out what really happens, hopefully.

Thanks.



2011-09-06 18:55:06

by Jeremy Fitzhardinge

Subject: Re: [Xen-devel] RE: ext4 BUG in dom0 Kernel 2.6.32.36

On 09/06/2011 08:11 AM, MaoXiaoyun wrote:
> > On Tue, Sep 06, 2011 at 03:24:14PM +0800, MaoXiaoyun wrote:
> > > I've hit an ext4 bug in dom0 kernel 2.6.32.36 (see the kernel stack below).
> >
> > Did you try the 3.0 kernel?
> No, I am afraid the change would be too much for our current environment
> and may cause other stability issues.
> So I want to dig out what really happens, hopefully.

Another question is whether this is a regression compared to earlier
versions of 2.6.32? Do you know if this problem exists in a non-Xen
environment?

Thanks,
J

2011-09-07 02:35:23

by MaoXiaoyun

Subject: RE: [Xen-devel] RE: ext4 BUG in dom0 Kernel 2.6.32.36




----------------------------------------
> Date: Tue, 6 Sep 2011 11:55:02 -0700
> From: [email protected]
> To: [email protected]
> CC: [email protected]; [email protected]; [email protected]
> Subject: Re: [Xen-devel] RE: ext4 BUG in dom0 Kernel 2.6.32.36
>
> On 09/06/2011 08:11 AM, MaoXiaoyun wrote:
> > > Did you try the 3.0 kernel?
> > No, I am afraid the change would be too much for our current environment
> > and may cause other stability issues.
> > So I want to dig out what really happens, hopefully.
>
> Another question is whether this is a regression compared to earlier
> versions of 2.6.32? Do you know if this problem exists in a non-Xen
> environment?
>

Others have reported this issue in non-Xen environments:
http://markmail.org/message/ywr4nfgiuvgdcr7y
http://www.spinics.net/lists/linux-ext4/msg21066.html

The difficulty is that I haven't found an efficient way to reproduce it.
(Currently it only shows up in our cluster, and redeploying the cluster may take 3 days or more.)


> Thanks,
> J
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-09-25 08:45:40

by MaoXiaoyun

Subject: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch



Hi:

We hit an ext4 BUG_ON at extents.c:1716 which crashes the kernel flush thread and leaves the disk unavailable.

For details of the bug, see: http://www.gossamer-threads.com/lists/xen/devel/217091?do=post_view_threaded

Attached is the fix, verified in our environment.

Without this patch, more than 3 of our several hundred servers hit the BUG_ON every day.


Many thanks.


Attachments:
0001-ext4-fix_dirty_extent_when_split_max&last_extent.patch (3.39 kB)

2011-09-26 14:28:30

by Konrad Rzeszutek Wilk

Subject: Re: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch

On Sun, Sep 25, 2011 at 04:45:39PM +0800, MaoXiaoyun wrote:
>
>
> Hi:
>
> We hit an ext4 BUG_ON at extents.c:1716 which crashes the kernel flush thread and leaves the disk unavailable.
>
> For details of the bug, see: http://www.gossamer-threads.com/lists/xen/devel/217091?do=post_view_threaded
>
> Attached is the fix, verified in our env.

So.. you are asking for this upstream git commit to be back-ported to 2.6.32, right?

>
> Without this patch, more than 3 of our several hundred servers hit the BUG_ON every day.
>
>
> many thanks.



2011-09-27 02:22:49

by MaoXiaoyun

Subject: RE: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch




----------------------------------------
> Date: Mon, 26 Sep 2011 10:28:08 -0400
> From: [email protected]
> To: [email protected]
> CC: [email protected]; [email protected]; [email protected]; [email protected]
> Subject: Re: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch
>
> On Sun, Sep 25, 2011 at 04:45:39PM +0800, MaoXiaoyun wrote:
> > We hit an ext4 BUG_ON at extents.c:1716 which crashes the kernel flush thread and leaves the disk unavailable.
> >
> > For details of the bug, see: http://www.gossamer-threads.com/lists/xen/devel/217091?do=post_view_threaded
> >
> > Attached is the fix, verified in our env.
>
> So.. you are asking for this upstream git commit to be back-ported to 2.6.32, right?
>

The patch is for 2.6.39. It can be applied to 2.6.32 too.
Thanks.


2011-09-27 09:09:50

by Jan Beulich

Subject: [Xen-devel] RE: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch

>>> On 27.09.11 at 04:22, MaoXiaoyun <[email protected]> wrote:

>> On Sun, Sep 25, 2011 at 04:45:39PM +0800, MaoXiaoyun wrote:
>> > Attached is the fix, verified in our env.
>>
>> So.. you are asking for this upstream git commit to be back-ported to 2.6.32,
>> right?
>
> The patch is for 2.6.39. It can be applied to 2.6.32 too.
> Thanks.

So why don't you suggest applying this to the stable tree maintainers
instead? xen-devel really isn't the right forum for this sort of bug fixes,
particularly when the underlying kernel.org tree is still being maintained.

Jan

> _______________________________________________
> Xen-devel mailing list
> [email protected]
> http://lists.xensource.com/xen-devel




2011-09-27 09:54:59

by Tao Ma

Subject: Re: [Xen-devel] RE: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch

On 09/27/2011 05:09 PM, Jan Beulich wrote:
>>>> On 27.09.11 at 04:22, MaoXiaoyun <[email protected]> wrote:
>>> So.. you are asking for this upstream git commit to be back-ported to 2.6.32,
>>> right?
>>
>> The patch is for 2.6.39. It can be applied to 2.6.32 too.
>> Thanks.
>
> So why don't you suggest applying this to the stable tree maintainers
> instead? xen-devel really isn't the right forum for this sort of bug fixes,
> particularly when the underlying kernel.org tree is still being maintained.

AFAIK, the upstream Linux kernel doesn't have this problem because this
part of the code has been refactored. So I am not sure whether Greg KH
will accept it or not.

By the way, I don't think the fix is appropriate. One of my colleagues is
working on another patch to resolve this (I will ask him to post the
patch when it is ready). And we will contact Red Hat about
merging it into the enterprise kernel.

Thanks
Tao

2011-09-27 19:35:26

by Theodore Ts'o

Subject: Re: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch

On Mon, Sep 26, 2011 at 10:28:08AM -0400, Konrad Rzeszutek Wilk wrote:
> >
> > Attached is the fix, verified in our env.
>
> So.. you are asking for this upstream git commit to be back-ported
> to 2.6.32, right?

I'm curious --- is there a good reason why Xen users are using an
upstream 2.6.32 kernel? If they are using a distro kernel, fine, but
then the distro kernel should be providing the support. But at this
point, 2.6.32 is so positively *ancient* that I'm personally not
interested in providing free, unpaid distro support for users who
aren't willing to either (a) pay $$$ and get a supported distro
kernel, or (b) use a much more modern kernel. At this point, Guest
and Host Xen support is available in 3.0 kernels, so there's really no
excuse, right?

- Ted

2011-09-28 04:09:40

by MaoXiaoyun

Subject: RE: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch




----------------------------------------
> Date: Tue, 27 Sep 2011 15:35:23 -0400
> From: [email protected]
> To: [email protected]
> CC: [email protected]; [email protected]; [email protected]; [email protected]
> Subject: Re: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch
>
> On Mon, Sep 26, 2011 at 10:28:08AM -0400, Konrad Rzeszutek Wilk wrote:
> > >
> > > Attached is the fix, verified in our env.
> >
> > So.. you are asking for this upstream git commit to be back-ported
> > to 2.6.32, right?
>
> I'm curious --- is there a good reason why Xen users are using an
> upstream 2.6.32 kernel? If they are using a distro kernel, fine, but
> then the distro kernel should be providing the support. But at this
> point, 2.6.32 is so positively *ancient* that I'm personally not
> interested in providing free, unpaid distro support for users who
> aren't willing to either (a) pay $$$ and get a supported distro
> kernel, or (b) use a much more modern kernel. At this point, Guest
> and Host Xen support is available in 3.0 kernels, so there's really no
> excuse, right?

Mmm...

We first hit this bug on the pvops kernel (Jeremy's tree, 2.6.32.36).

We failed to find any related fix via Google, so we debugged it ourselves.
Fortunately, we located the root cause and thought some other Xen users might
have this problem as well; that's why we sent the fix to Xen-devel.

We went through the code from 2.6.32 to 2.6.39, and the bug exists throughout.
People who use an *ancient* kernel need this.

Thanks.


2011-09-28 10:45:29

by Tao Ma

Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

Hi Ted,
On 09/28/2011 03:35 AM, Ted Ts'o wrote:
> On Mon, Sep 26, 2011 at 10:28:08AM -0400, Konrad Rzeszutek Wilk wrote:
>>> > >
>>> > > Attached is the fix, verified in our env.
>> >
>> > So.. you are asking for this upstream git commit to be back-ported
>> > to 2.6.32, right?
> I'm curious --- is there a good reason why Xen users are using an
> upstream 2.6.32 kernel? If they are using a distro kernel, fine, but
> then the distro kernel should be providing the support. But at this
> point, 2.6.32 is so positively *ancient* that I'm personally not
> interested in providing free, unpaid distro support for users who
> aren't willing to either (a) pay $$$ and get a supported distro
> kernel, or (b) use a much more modern kernel. At this point, Guest
> and Host Xen support is available in 3.0 kernels, so there's really no
> excuse, right?
Actually, this bug does show up in 2.6.39, and I think the stable tree
still needs this fix. After some careful testing, my colleague has
generated the patch. Please consider acking it so that Greg can add it
to the stable tree.

Thanks
Tao

From 600d493b14ebd776cf8ea0e9dcdccc0d54200403 Mon Sep 17 00:00:00 2001
From: Zheng Liu <[email protected]>
Date: Wed, 28 Sep 2011 16:26:05 +0800
Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

We will hit a BUG_ON() if the following script is run.

mkfs.ext4 -b 4096 /dev/sdb1 1000000
mount -t ext4 /dev/sdb1 /mnt/sdb1
fallocate -l 100M /mnt/sdb1/test
sync
for((i=0;i<170;i++))
do
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=`expr $i \* 2`
done
umount /mnt/sdb1
mount -t ext4 /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=341
umount /mnt/sdb1
mount /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=340
sync

The reason is that the code forgets to mark the extent dirty when splitting
extents in ext4_ext_convert_to_initialized(). Although ex has been updated
in memory, it is dirtied in neither ext4_ext_convert_to_initialized() nor
ext4_ext_insert_extent(). As a result, the on-disk layout is corrupted, and
a BUG_ON() is hit when writing at the start of that extent again.

Cc: [email protected] #for 2.6.39
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Xiaoyun Mao <[email protected]>
Cc: Yingbin Wang <[email protected]>
Cc: Jia Wan <[email protected]>
Signed-off-by: Zheng Liu <[email protected]>
---
fs/ext4/extents.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 4890d6f..cd20425 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2607,6 +2607,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
ex1 = ex;
ex1->ee_len = cpu_to_le16(map->m_lblk - ee_block);
ext4_ext_mark_uninitialized(ex1);
+ ext4_ext_dirty(handle, inode, path + depth);
ex2 = &newex;
}
/*
--
1.7.4.1
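
A note on the offsets used by the reproducer, as a rough sketch rather than
anything taken from the patch itself: with the 4096-byte block size given to
mkfs.ext4 and dd's bs=256k (262144 bytes), each dd call covers 64 filesystem
blocks, so a seek value maps to a logical block range by plain multiplication.
The helper name block_range is made up for illustration.

```shell
#!/bin/sh
# Map a dd "seek" value from the reproducer to the logical block range
# that the write covers.
BLOCK_SIZE=4096                 # from "mkfs.ext4 -b 4096"
DD_BS=$((256 * 1024))           # from "bs=256k"
BLOCKS_PER_WRITE=$((DD_BS / BLOCK_SIZE))

# block_range: hypothetical helper, not part of dd or e2fsprogs.
# Prints "first-last" (inclusive) for the given seek value.
block_range() {
    first=$(( $1 * BLOCKS_PER_WRITE ))
    last=$(( first + BLOCKS_PER_WRITE - 1 ))
    echo "$first-$last"
}

echo "blocks per write: $BLOCKS_PER_WRITE"
echo "last write of the loop (seek=338): $(block_range 338)"
echo "write after first remount (seek=341): $(block_range 341)"
echo "write after second remount (seek=340): $(block_range 340)"
```

Under these assumptions, the final write (seek=340) covers the 64 blocks
immediately before the range written with seek=341.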


2011-09-28 18:41:13

by Jeremy Fitzhardinge

Subject: Re: [Xen-devel] Re: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch

On 09/27/2011 12:35 PM, Ted Ts'o wrote:
> On Mon, Sep 26, 2011 at 10:28:08AM -0400, Konrad Rzeszutek Wilk wrote:
>>>
>>> Attached is the fix, verified in our env.
>> So.. you are asking for this upstream git commit to be back-ported
>> to 2.6.32, right?
> I'm curious --- is there a good reason why Xen users are using an
> upstream 2.6.32 kernel? If they are using a distro kernel, fine, but
> then the distro kernel should be providing the support. But at this
> point, 2.6.32 is so positively *ancient* that I'm personally not
> interested in providing free, unpaid distro support for users who
> aren't willing to either (a) pay $$$ and get a supported distro
> kernel, or (b) use a much more modern kernel. At this point, Guest
> and Host Xen support is available in 3.0 kernels, so there's really no
> excuse, right?

The 2.6.32.x-based kernel has been the preferred "stable" kernel for Xen
users for a while, and it is still considered to be more stable and
functional than what's upstream (obviously we're trying to fix that).
Also, because many current distros don't support Xen dom0, it has been
an ad-hoc distro kernel.

Since kernel.org 2.6.32 is still considered to be a maintained
long-term-stable kernel, I keep the xen.git version up-to-date with
stable-2.6.32 bugfixes and occasional separate Xen-specific fixes. But
I'd really prefer to avoid having any non-Xen private changes in that
tree, in favour of getting everything from upstream stable.

Do you not consider it worth continuing support of the 2.6.32 stable
tree with respect to ext4?

J

2011-09-28 19:46:51

by Theodore Ts'o

Subject: Re: [Xen-devel] Re: [patch 1/1] ext4-fix-dirty-extent-when-origin-leaf-extent-reac.patch

On Wed, Sep 28, 2011 at 11:41:11AM -0700, Jeremy Fitzhardinge wrote:
> Since kernel.org 2.6.32 is still considered to be a maintained
> long-term-stable kernel, I keep the xen.git version up-to-date with
> stable-2.6.32 bugfixes and occasional separate Xen-specific fixes. But
> I'd really prefer to avoid having any non-Xen private changes in that
> tree, in favour of getting everything from upstream stable.
>
> Do you not consider it worth continuing support of the 2.6.32 stable
> tree with respect to ext4?

I just don't have the *time* to maintain backports of ext4 fixes to
2.6.32. There have been so many bug fixes to ext4, and some of them
depend on changes in the quota subsystem, so trying to back port them
all would be hellish, and not something I'm willing to do on a
volunteer basis.

I'm busy enough with silly things like trying to help with the
kernel.org getting back on-line, that channelling my stay-really-late
hours to support users who are too cheap to pay distro support fees is
not really a way that I would choose to spend my personal time.

If someone would like to volunteer to be unpaid distro support, that's
great. It's worth it as long as I get to volunteer somebody else's
time. :-)

- Ted


2011-10-27 09:43:35

by Theodore Ts'o

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> Actually, this bug does show up in 2.6.39, and I think the stable tree
> still needs this fix. After some careful testing, my colleague has
> generated the patch. Please consider acking it so that Greg can add it
> to the stable tree.

Sorry for the delay, but yes. This patch would be good for the stable
tree for 2.6.39 (if Greg is still accepting patches for
2.6.39-stable). It doesn't apply for upstream ext4 since the code has
been changed/refactored since then, but it's a good fix.

- Ted

> From 600d493b14ebd776cf8ea0e9dcdccc0d54200403 Mon Sep 17 00:00:00 2001
> From: Zheng Liu <[email protected]>
> Date: Wed, 28 Sep 2011 16:26:05 +0800
> Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()
>
> We will hit a BUG_ON() if the following script is run.
>
> mkfs.ext4 -b 4096 /dev/sdb1 1000000
> mount -t ext4 /dev/sdb1 /mnt/sdb1
> fallocate -l 100M /mnt/sdb1/test
> sync
> for((i=0;i<170;i++))
> do
> dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=`expr $i \* 2`
> done
> umount /mnt/sdb1
> mount -t ext4 /dev/sdb1 /mnt/sdb1
> dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=341
> umount /mnt/sdb1
> mount /dev/sdb1 /mnt/sdb1
> dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=340
> sync
>
> The reason is that the code forgets to mark the extent dirty when splitting
> extents in ext4_ext_convert_to_initialized(). Although ex has been updated
> in memory, it is dirtied in neither ext4_ext_convert_to_initialized() nor
> ext4_ext_insert_extent(). As a result, the on-disk layout is corrupted, and
> a BUG_ON() is hit when writing at the start of that extent again.
>
> Cc: [email protected] #for 2.6.39
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: "Theodore Ts'o" <[email protected]>
> Cc: Xiaoyun Mao <[email protected]>
> Cc: Yingbin Wang <[email protected]>
> Cc: Jia Wan <[email protected]>
> Signed-off-by: Zheng Liu <[email protected]>
> ---
> fs/ext4/extents.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 4890d6f..cd20425 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2607,6 +2607,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
> ex1 = ex;
> ex1->ee_len = cpu_to_le16(map->m_lblk - ee_block);
> ext4_ext_mark_uninitialized(ex1);
> + ext4_ext_dirty(handle, inode, path + depth);
> ex2 = &newex;
> }
> /*
> --
> 1.7.4.1
>

2011-10-27 12:00:45

by Greg KH

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > still needs this fix. After some careful testing, my colleague has
> > generated the patch. Please consider acking it so that Greg can add it
> > to the stable tree.
>
> Sorry for the delay, but yes. This patch would be good for the stable
> tree for 2.6.39 (if Greg is still accepting patches for
> 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> been changed/refactored since then, but it's a good fix.

No, .39 has not been maintained for quite some time now, sorry.

greg k-h

2011-10-28 02:35:04

by Zheng Liu

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > still needs this fix. After some careful testing, my colleague has
> > > generated the patch. Please consider acking it so that Greg can add it
> > > to the stable tree.
> >
> > Sorry for the delay, but yes. This patch would be good for the stable
> > tree for 2.6.39 (if Greg is still accepting patches for
> > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > been changed/refactored since then, but it's a good fix.
>
> No, .39 has not been maintained for quite some time now, sorry.
Hi Greg,

Thank you for your attention. Actually, this bug is present from .32
through .39. Please consider applying this patch to the other stable or
longterm trees.

regards,
Zheng
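
As a rough aid for readers checking their own machines: the thread reports
the bug in mainline versions 2.6.32 through 2.6.39. The sketch below only
compares the version numbers printed by uname -r; it cannot tell whether a
distro or longterm kernel already carries the fix. in_reported_range is a
hypothetical helper name.

```shell
#!/bin/sh
# in_reported_range: hypothetical helper, not part of any existing tool.
# Succeeds if the version string is 2.6.32 through 2.6.39.
in_reported_range() {
    ver=${1%%-*}              # strip a suffix such as "-xen"
    old_ifs=$IFS
    IFS=.
    set -- $ver               # split on dots into $1 $2 $3 ...
    IFS=$old_ifs
    [ "$1" = 2 ] && [ "$2" = 6 ] &&
        [ "${3:-0}" -ge 32 ] && [ "${3:-0}" -le 39 ]
}

if in_reported_range "$(uname -r)"; then
    echo "kernel version is in the reported 2.6.32-2.6.39 range"
else
    echo "kernel version is outside the reported range"
fi
```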

2011-10-28 05:28:08

by Greg KH

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

On Fri, Oct 28, 2011 at 10:36:08AM +0800, Zheng Liu wrote:
> On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> > On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > > still needs this fix. After some careful testing, my colleague has
> > > > generated the patch. Please consider acking it so that Greg can add it
> > > > to the stable tree.
> > >
> > > Sorry for the delay, but yes. This patch would be good for the stable
> > > tree for 2.6.39 (if Greg is still accepting patches for
> > > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > > been changed/refactored since then, but it's a good fix.
> >
> > No, .39 has not been maintained for quite some time now, sorry.
> Hi Greg,
>
> Thank you for your attention. Actually, this bug is present from .32
> through .39. Please consider applying this patch to the other stable or
> longterm trees.

Ah, ok, that makes sense, can you provide me a patch that will apply to
the .32 and .33-longterm kernels?

thanks,

greg k-h

2011-10-28 08:45:48

by Zheng Liu

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

On Fri, Oct 28, 2011 at 07:24:06AM +0200, Greg KH wrote:
> On Fri, Oct 28, 2011 at 10:36:08AM +0800, Zheng Liu wrote:
> > On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> > > On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > > > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > > > still needs this fix. After some careful testing, my colleague has
> > > > > generated the patch. Please consider acking it so that Greg can add it
> > > > > to the stable tree.
> > > >
> > > > Sorry for the delay, but yes. This patch would be good for the stable
> > > > tree for 2.6.39 (if Greg is still accepting patches for
> > > > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > > > been changed/refactored since then, but it's a good fix.
> > >
> > > No, .39 has not been maintained for quite some time now, sorry.
> > Hi Greg,
> >
> > Thank you for your attention. Actually, this bug is present from .32
> > through .39. Please consider applying this patch to the other stable or
> > longterm trees.
>
> Ah, ok, that makes sense, can you provide me a patch that will apply to
> the .32 and .33-longterm kernels?
Hi Greg,

I couldn't download the .32 and .33-longterm kernels from kernel.org
because the full sources were not found on that server. Thus this patch
is generated from the .32-mainline kernel and can be applied to the .32
and .33 kernels.

regards,
Zheng

From ff98f00676657b797c426c80804a3fde3f86ea83 Mon Sep 17 00:00:00 2001
From: Zheng Liu <[email protected]>
Date: Fri, 28 Oct 2011 15:08:30 +0800
Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

We will hit a BUG_ON() if the following script is run.

mkfs.ext4 -b 4096 /dev/sdb1 1000000
mount -t ext4 /dev/sdb1 /mnt/sdb1
fallocate -l 100M /mnt/sdb1/test
sync
for((i=0;i<170;i++))
do
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=`expr $i \* 2`
done
umount /mnt/sdb1
mount -t ext4 /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=341
umount /mnt/sdb1
mount /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=340
sync

The reason is that the code forgets to mark the extent dirty when splitting
extents in ext4_ext_convert_to_initialized(). Although ex has been updated
in memory, it is dirtied in neither ext4_ext_convert_to_initialized() nor
ext4_ext_insert_extent(). As a result, the on-disk layout is corrupted, and
a BUG_ON() is hit when writing at the start of that extent again.

Cc: [email protected] #for 2.6.39
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Xiaoyun Mao <[email protected]>
Cc: Yingbin Wang <[email protected]>
Cc: Jia Wan <[email protected]>
Signed-off-by: Zheng Liu <[email protected]>
---
fs/ext4/extents.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 715264b..ab6fa35 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2555,6 +2555,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
ex1 = ex;
ex1->ee_len = cpu_to_le16(iblock - ee_block);
ext4_ext_mark_uninitialized(ex1);
+ ext4_ext_dirty(handle, inode, path + depth);
ex2 = &newex;
}
/*
--
1.7.4.1


2011-10-28 09:08:41

by Greg KH

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

On Fri, Oct 28, 2011 at 04:46:52PM +0800, Zheng Liu wrote:
> On Fri, Oct 28, 2011 at 07:24:06AM +0200, Greg KH wrote:
> > On Fri, Oct 28, 2011 at 10:36:08AM +0800, Zheng Liu wrote:
> > > On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> > > > On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > > > > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > > > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > > > > still needs this fix. After some careful testing, my colleague has
> > > > > > generated the patch. Please consider acking it so that Greg can add it
> > > > > > to the stable tree.
> > > > >
> > > > > Sorry for the delay, but yes. This patch would be good for the stable
> > > > > tree for 2.6.39 (if Greg is still accepting patches for
> > > > > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > > > > been changed/refactored since then, but it's a good fix.
> > > >
> > > > No, .39 has not been maintained for quite some time now, sorry.
> > > Hi Greg,
> > >
> > > Thank you for your attention. Actually, this bug is present from .32
> > > through .39. Please consider applying this patch to the other stable or
> > > longterm trees.
> >
> > Ah, ok, that makes sense, can you provide me a patch that will apply to
> > the .32 and .33-longterm kernels?
> Hi Greg,
>
> I couldn't download the .32 and .33-longterm kernels from kernel.org
> because the full sources were not found on that server. Thus this patch
> is generated from the .32-mainline kernel and can be applied to the .32
> and .33 kernels.

The .32 and .33 longterm kernels are part of the linux-stable tree on
git.kernel.org, they are in their own branch. Please redo this against
those trees, as I'm pretty sure that there will be conflicts, due to all
of the different changes since the .0 releases.

thanks,

greg k-h
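
For anyone following along, the rebase Greg asks for might look roughly like
the sketch below. It assumes a clone of the linux-stable tree with the remote
named origin, that the longterm branches follow the linux-2.6.32.y and
linux-2.6.33.y naming, and that the fix is saved as fix.patch; backport_cmds
and the topic-branch names are hypothetical. It prints the commands by
default and only runs them when RUN is set to an empty string.

```shell
#!/bin/sh
# Dry-run by default: with RUN unset, each command is echoed, not executed.
# Run as "RUN= sh this-script.sh" inside a linux-stable clone to execute.
RUN=${RUN-echo}

# backport_cmds: hypothetical helper. $1 is the longterm series, e.g. 2.6.32.
backport_cmds() {
    branch="linux-$1.y"                      # assumed branch naming
    $RUN git checkout -b "ext4-dirty-fix-$1" "origin/$branch"
    $RUN git am fix.patch                    # fix.patch is a placeholder name
    $RUN git format-patch -1                 # regenerate against this branch
}

backport_cmds 2.6.32
backport_cmds 2.6.33
```

If git am reports conflicts, as Greg expects it might, the hunk has to be
applied by hand on that branch before regenerating the patch.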

2011-10-28 12:33:06

by Zheng Liu

Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent() for .32 longterm

On Fri, Oct 28, 2011 at 11:07:59AM +0200, Greg KH wrote:
> On Fri, Oct 28, 2011 at 04:46:52PM +0800, Zheng Liu wrote:
> > On Fri, Oct 28, 2011 at 07:24:06AM +0200, Greg KH wrote:
> > > On Fri, Oct 28, 2011 at 10:36:08AM +0800, Zheng Liu wrote:
> > > > On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> > > > > On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > > > > > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > > > > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > > > > > still needs this fix. After some careful testing, my colleague has
> > > > > > > generated the patch. Please consider acking it so that Greg can add it
> > > > > > > to the stable tree.
> > > > > >
> > > > > > Sorry for the delay, but yes. This patch would be good for the stable
> > > > > > tree for 2.6.39 (if Greg is still accepting patches for
> > > > > > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > > > > > been changed/refactored since then, but it's a good fix.
> > > > >
> > > > > No, .39 has not been maintained for quite some time now, sorry.
> > > > Hi Greg,
> > > >
> > > > Thank you for your attention. Actually, this bug is present from .32
> > > > through .39. Please consider applying this patch to the other stable or
> > > > longterm trees.
> > >
> > > Ah, ok, that makes sense, can you provide me a patch that will apply to
> > > the .32 and .33-longterm kernels?
> > Hi Greg,
> >
> > I couldn't download the .32 and .33-longterm kernels from kernel.org
> > because the full sources were not found on that server. Thus this patch
> > is generated from the .32-mainline kernel and can be applied to the .32
> > and .33 kernels.
>
> The .32 and .33 longterm kernels are part of the linux-stable tree on
> git.kernel.org, they are in their own branch. Please redo this against
> those trees, as I'm pretty sure that there will be conflicts, due to all
> of the different changes since the .0 releases.
Hi Greg

This patch is for the .32 longterm kernel. Please try it again.

regards,
Zheng

From bc522003378af679afd227ff87497dfb4fd4d652 Mon Sep 17 00:00:00 2001
From: Zheng Liu <[email protected]>
Date: Fri, 28 Oct 2011 19:41:19 +0800
Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

We will hit a BUG_ON() if the following script is run.

mkfs.ext4 -b 4096 /dev/sdb1 1000000
mount -t ext4 /dev/sdb1 /mnt/sdb1
fallocate -l 100M /mnt/sdb1/test
sync
for((i=0;i<170;i++))
do
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=`expr $i \* 2`
done
umount /mnt/sdb1
mount -t ext4 /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=341
umount /mnt/sdb1
mount /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=340
sync

The reason is that the code forgets to mark the extent dirty when splitting
extents in ext4_ext_convert_to_initialized(). Although ex has been updated
in memory, it is dirtied in neither ext4_ext_convert_to_initialized() nor
ext4_ext_insert_extent(). As a result, the on-disk layout is corrupted, and
a BUG_ON() is hit when writing at the start of that extent again.

Cc: [email protected] #for 2.6.32
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Xiaoyun Mao <[email protected]>
Cc: Yingbin Wang <[email protected]>
Cc: Jia Wan <[email protected]>
Signed-off-by: Zheng Liu <[email protected]>
---
fs/ext4/extents.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index f375559..93f7999 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2592,6 +2592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
ex1 = ex;
ex1->ee_len = cpu_to_le16(iblock - ee_block);
ext4_ext_mark_uninitialized(ex1);
+ ext4_ext_dirty(handle, inode, path + depth);
ex2 = &newex;
}
/*
--
1.7.4.1


2011-10-28 12:35:18

by Zheng Liu

Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent() for .33 longterm

On Fri, Oct 28, 2011 at 11:07:59AM +0200, Greg KH wrote:
> On Fri, Oct 28, 2011 at 04:46:52PM +0800, Zheng Liu wrote:
> > On Fri, Oct 28, 2011 at 07:24:06AM +0200, Greg KH wrote:
> > > On Fri, Oct 28, 2011 at 10:36:08AM +0800, Zheng Liu wrote:
> > > > On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> > > > > On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > > > > > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > > > > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > > > > > still needs this fix. After some careful testing, my colleague has
> > > > > > > generated the patch. Please consider acking it so that Greg can add it
> > > > > > > to the stable tree.
> > > > > >
> > > > > > Sorry for the delay, but yes. This patch would be good for the stable
> > > > > > tree for 2.6.39 (if Greg is still accepting patches for
> > > > > > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > > > > > been changed/refactored since then, but it's a good fix.
> > > > >
> > > > > No, .39 has not been maintained for quite some time now, sorry.
> > > > Hi Greg,
> > > >
> > > > Thank you for your attention. Actually, this bug is present from .32
> > > > through .39. Please consider applying this patch to the other stable or
> > > > longterm trees.
> > >
> > > Ah, ok, that makes sense, can you provide me a patch that will apply to
> > > the .32 and .33-longterm kernels?
> > Hi Greg,
> >
> > I couldn't download the .32 and .33-longterm kernels from kernel.org
> > because the full sources were not found on that server. Thus this patch
> > is generated from the .32-mainline kernel and can be applied to the .32
> > and .33 kernels.
>
> The .32 and .33 longterm kernels are part of the linux-stable tree on
> git.kernel.org, they are in their own branch. Please redo this against
> those trees, as I'm pretty sure that there will be conflicts, due to all
> of the different changes since the .0 releases.
Hi Greg,

This patch is for the .33 longterm kernel. Please apply it.

regards,
Zheng

From 86c78ef9cd861146a48e8a643601c165b0e80849 Mon Sep 17 00:00:00 2001
From: Zheng Liu <[email protected]>
Date: Fri, 28 Oct 2011 19:45:31 +0800
Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()

We will hit a BUG_ON() if the following script is run.

mkfs.ext4 -b 4096 /dev/sdb1 1000000
mount -t ext4 /dev/sdb1 /mnt/sdb1
fallocate -l 100M /mnt/sdb1/test
sync
for((i=0;i<170;i++))
do
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=`expr $i \* 2`
done
umount /mnt/sdb1
mount -t ext4 /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=341
umount /mnt/sdb1
mount /dev/sdb1 /mnt/sdb1
dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=340
sync

The reason is that the code forgets to mark the extent dirty when splitting
extents in ext4_ext_convert_to_initialized(). Although ex has been updated
in memory, it is dirtied in neither ext4_ext_convert_to_initialized() nor
ext4_ext_insert_extent(). As a result, the on-disk layout is corrupted, and
a BUG_ON() is hit when writing at the start of that extent again.

Cc: [email protected] #for 2.6.33
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Xiaoyun Mao <[email protected]>
Cc: Yingbin Wang <[email protected]>
Cc: Jia Wan <[email protected]>
Signed-off-by: Zheng Liu <[email protected]>
---
fs/ext4/extents.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 505a281..6cb1bbd 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2566,6 +2566,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
ex1 = ex;
ex1->ee_len = cpu_to_le16(iblock - ee_block);
ext4_ext_mark_uninitialized(ex1);
+ ext4_ext_dirty(handle, inode, path + depth);
ex2 = &newex;
}
/*
--
1.7.4.1


2011-11-02 21:05:51

by Greg KH

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent() for .32 longterm

On Fri, Oct 28, 2011 at 08:34:02PM +0800, Zheng Liu wrote:
> On Fri, Oct 28, 2011 at 11:07:59AM +0200, Greg KH wrote:
> > On Fri, Oct 28, 2011 at 04:46:52PM +0800, Zheng Liu wrote:
> > > On Fri, Oct 28, 2011 at 07:24:06AM +0200, Greg KH wrote:
> > > > On Fri, Oct 28, 2011 at 10:36:08AM +0800, Zheng Liu wrote:
> > > > > On Thu, Oct 27, 2011 at 01:53:22PM +0200, Greg KH wrote:
> > > > > > On Thu, Oct 27, 2011 at 05:43:29AM -0400, Ted Ts'o wrote:
> > > > > > > On Wed, Sep 28, 2011 at 06:45:03PM +0800, Tao Ma wrote:
> > > > > > > > Actually, this bug does show up in 2.6.39, and I think the stable tree
> > > > > > > > still needs this fix. After some careful testing, my colleague has
> > > > > > > > generated the patch. Please consider acking it so that Greg can add it
> > > > > > > > to the stable tree.
> > > > > > >
> > > > > > > Sorry for the delay, but yes. This patch would be good for the stable
> > > > > > > tree for 2.6.39 (if Greg is still accepting patches for
> > > > > > > 2.6.39-stable). It doesn't apply for upstream ext4 since the code has
> > > > > > > been changed/refactored since then, but it's a good fix.
> > > > > >
> > > > > > No, .39 has not been maintained for quite some time now, sorry.
> > > > > Hi Greg,
> > > > >
> > > > > Thank you for your attention. Actually, this bug is present from .32
> > > > > through .39. Please consider applying this patch to the other stable or
> > > > > longterm trees.
> > > >
> > > > Ah, ok, that makes sense, can you provide me a patch that will apply to
> > > > the .32 and .33-longterm kernels?
> > > Hi Greg,
> > >
> > > I couldn't download the .32 and .33-longterm kernels from kernel.org
> > > because the full sources were not found on that server. Thus this patch
> > > is generated from the .32-mainline kernel and can be applied to the .32
> > > and .33 kernels.
> >
> > The .32 and .33 longterm kernels are part of the linux-stable tree on
> > git.kernel.org, they are in their own branch. Please redo this against
> > those trees, as I'm pretty sure that there will be conflicts, due to all
> > of the different changes since the .0 releases.
> Hi Greg
>
> This patch is for the .32 longterm kernel. Please try it again.
>
> regards,
> Zheng
>
> From bc522003378af679afd227ff87497dfb4fd4d652 Mon Sep 17 00:00:00 2001
> From: Zheng Liu <[email protected]>
> Date: Fri, 28 Oct 2011 19:41:19 +0800
> Subject: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent()
>
> We will hit a BUG_ON() if the following script is run.
>
> mkfs.ext4 -b 4096 /dev/sdb1 1000000
> mount -t ext4 /dev/sdb1 /mnt/sdb1
> fallocate -l 100M /mnt/sdb1/test
> sync
> for((i=0;i<170;i++))
> do
> dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=`expr $i \* 2`
> done
> umount /mnt/sdb1
> mount -t ext4 /dev/sdb1 /mnt/sdb1
> dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=341
> umount /mnt/sdb1
> mount /dev/sdb1 /mnt/sdb1
> dd if=/dev/zero of=/mnt/sdb1/test conv=notrunc bs=256k count=1 seek=340
> sync
>
> The reason is that the code forgets to mark the extent dirty when splitting
> extents in ext4_ext_convert_to_initialized(). Although ex has been updated
> in memory, it is dirtied in neither ext4_ext_convert_to_initialized() nor
> ext4_ext_insert_extent(). As a result, the on-disk layout is corrupted, and
> a BUG_ON() is hit when writing at the start of that extent again.
>
> Cc: [email protected] #for 2.6.32
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: "Theodore Ts'o" <[email protected]>
> Cc: Xiaoyun Mao <[email protected]>
> Cc: Yingbin Wang <[email protected]>
> Cc: Jia Wan <[email protected]>
> Signed-off-by: Zheng Liu <[email protected]>

Sorry for dragging this out, but what commit id does this correspond to
in Linus's tree? I can't seem to figure it out.

Or does it not correspond to anything? If so, I need a sentence that
says why it doesn't for the patch changelog.

thanks,

greg k-h

2011-11-03 03:05:25

by Zheng Liu

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent() for .32 longterm

On Wed, Nov 02, 2011 at 02:00:24PM -0700, Greg KH wrote:
[snip]
>
> Sorry for dragging this out, but what commit id does this correspond to
> in Linus's tree? I can't seem to figure it out.
>
> Or does it not correspond to anything? If so, I need a sentence that
> says why it doesn't for the patch changelog.

Hi Greg,

This patch doesn't apply for upstream because the code has been
refactored since 3.0.

Regards,
Zheng


2011-11-03 17:57:24

by Greg KH

Subject: Re: [PATCH] ext4: fix BUG_ON() in ext4_ext_insert_extent() for .32 longterm

On Thu, Nov 03, 2011 at 11:06:44AM +0800, Zheng Liu wrote:
> On Wed, Nov 02, 2011 at 02:00:24PM -0700, Greg KH wrote:
> [snip]
> >
> > Sorry for dragging this out, but what commit id does this correspond to
> > in Linus's tree? I can't seem to figure it out.
> >
> > Or does it not correspond to anything? If so, I need a sentence that
> > says why it doesn't for the patch changelog.
>
> Hi Greg,
>
> This patch doesn't apply for upstream because the code has been
> refactored since 3.0.

Thanks for letting me know, I'll go queue this up now.

greg k-h