2010-07-16 08:44:26

by Li Zefan

[permalink] [raw]
Subject: [BUG] ext4 trace events cause NULL pointer dereferences

To reproduce this bug, enable ext4 trace events, and then keep creating
files in a nealy fullly ocupied partition:

# echo 1 > debugfs/tracing/events/ext4/eanble
# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdb7 20158332 19072148 62184 100% /
...
# cat test.sh
#! /bin/sh

for ((i = 0; ; i++))
{
echo "create file: file_${i}.dat"

dd if=/dev/zero of=file_${i}.dat bs=1M count=10 > /dev/null 2>&1

if [ $? -ne 0 ]; then
break;
fi
}
# ./test.sh
create file: file_0.dat
create file: file_1.dat
...
create file: file_108.dat
# sync
(panic)


Seems ac->ac_inode can be NULL:

DECLARE_EVENT_CLASS(ext4__mballoc,
...
TP_fast_assign(
__entry->dev = ac->ac_inode->i_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
...
),
...
);



BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
IP: [<ffffffffa00e2e2c>] ftrace_raw_event_ext4__mballoc+0x6c/0xe0 [ext4]
PGD 37ab6067 PUD a78a4067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 0
Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 be2iscsi bnx2i cnic uio cxgb3i iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext3 jbd dm_mirror dm_region_hash dm_log dm_mod e1000e i5k_amb hwmon i5000_edac iTCO_wdt sg edac_core i2c_i801 i2c_core shpchp iTCO_vendor_support ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom pata_acpi ata_generic mptsas mptscsih mptbase ata_piix scsi_transport_sas [last unloaded: scsi_wait_scan]

Pid: 902, comm: flush-8:16 Not tainted 2.6.35-rc5 #1 D2671/PRIMERGY
RIP: 0010:[<ffffffffa00e2e2c>] [<ffffffffa00e2e2c>] ftrace_raw_event_ext4__mballoc+0x6c/0xe0 [ext4]
RSP: 0018:ffff880137fab6e0 EFLAGS: 00010206
RAX: ffff880137cee738 RBX: ffff880068e40910 RCX: ffff880137cee734
RDX: 0000000000000000 RSI: ffffffffa010ed38 RDI: ffff880137cee73c
RBP: ffff880137fab720 R08: 000000a2b2177ca4 R09: 000000a2b217565f
R10: 0000000000000755 R11: 0000000000000001 R12: ffffffffa010ed38
R13: 0000000000000000 R14: ffff880137cee734 R15: 0000000000000282
FS: 0000000000000000(0000) GS:ffff880002400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000100 CR3: 0000000037aba000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-8:16 (pid: 902, threadinfo ffff880137faa000, task ffff8801395a8040)
Stack:
ffff880137fab770 ffff88013b2978c0 ffff880137fab710 ffff880068e40910
<0> ffff880138462460 ffff880137fab7d0 0000000000000001 0000000000000001
<0> ffff880137fab770 ffffffffa00f6781 ffff880137fab770 00000022000046ce
Call Trace:
[<ffffffffa00f6781>] ext4_mb_release_group_pa+0x131/0x160 [ext4]
[<ffffffffa00f92a8>] ext4_mb_discard_group_preallocations+0x418/0x4d0 [ext4]
[<ffffffffa00fc21c>] ext4_mb_new_blocks+0x37c/0x4f0 [ext4]
[<ffffffffa00f3059>] ext4_ext_map_blocks+0x1449/0x1af0 [ext4]
[<ffffffff810d03d2>] ? ring_buffer_lock_reserve+0xa2/0x160
[<ffffffff810ff4c6>] ? __pagevec_release+0x26/0x40
[<ffffffffa00d2b10>] ext4_map_blocks+0xe0/0x200 [ext4]
[<ffffffffa00d3efd>] mpage_da_map_blocks+0xcd/0x420 [ext4]
[<ffffffffa00d4a6b>] ext4_da_writepages+0x2db/0x630 [ext4]
[<ffffffff8100ba2e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff810fdae1>] do_writepages+0x21/0x40
[<ffffffff81163e76>] writeback_single_inode+0xc6/0x2d0
[<ffffffff8116428e>] writeback_sb_inodes+0xce/0x180
[<ffffffff811643d9>] writeback_inodes_wb+0x99/0x180
[<ffffffff811646fb>] wb_writeback+0x23b/0x2a0
[<ffffffff811648cf>] wb_do_writeback+0x16f/0x180
[<ffffffff8106e1e0>] ? process_timeout+0x0/0x10
[<ffffffff81164937>] bdi_writeback_task+0x57/0x160
[<ffffffff8107d337>] ? bit_waitqueue+0x17/0xd0
[<ffffffff8110cc60>] ? bdi_start_fn+0x0/0xe0
[<ffffffff8110ccd1>] bdi_start_fn+0x71/0xe0
[<ffffffff8110cc60>] ? bdi_start_fn+0x0/0xe0
[<ffffffff8107cde6>] kthread+0x96/0xa0
[<ffffffff8100be84>] kernel_thread_helper+0x4/0x10
[<ffffffff8107cd50>] ? kthread+0x0/0xa0
[<ffffffff8100be80>] ? kernel_thread_helper+0x0/0x10
Code: ff ff 4c 89 f9 ba 28 00 00 00 45 89 e8 e8 9d f5 fe e0 48 85 c0 49 89 c6 74 51 48 89 c7 e8 1d a3 fe e0 48 8b 13 4c 89 f1 4c 89 e6 <48> 8b 92 00 01 00 00 8b 52 10 8950 0c 48 8b 13 48 8b 52 40 48
RIP [<ffffffffa00e2e2c>] ftrace_raw_event_ext4__mballoc+0x6c/0xe0 [ext4]
RSP <ffff880137fab6e0>
CR2: 0000000000000100
---[ end trace 28cc4a1689f1df47 ]---



BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffffa00d73fc>] ftrace_raw_event_ext4_mb_release_group_pa+0x7c/0xe0 [ext4]
PGD 1389fe067 PUD 1389b0067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 3
Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 be2iscsi bnx2i cnic uio cxgb3i iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext3 jbd dm_mirror dm_region_hash dm_log dm_mod iTCO_wdt iTCO_vendor_support sg i5k_amb hwmon i2c_i801 i2c_core i5000_edac edac_core shpchp e1000e ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom pata_acpi ata_generic mptsas mptscsih mptbase ata_piix scsi_transport_sas [last unloaded: scsi_wait_scan]

Pid: 938, comm: flush-8:16 Not tainted 2.6.35-rc5-lizf #2 D2671/PRIMERGY
RIP: 0010:[<ffffffffa00d73fc>] [<ffffffffa00d73fc>] ftrace_raw_event_ext4_mb_release_group_pa+0x7c/0xe0 [ext4]
RSP: 0018:ffff880136ebb6d0 EFLAGS: 00010206
RAX: ffff880137bdf21c RBX: ffffffffa0104470 RCX: ffff880137bdf218
RDX: 0000000000000000 RSI: ffffffffa0104470 RDI: ffff880137bdf220
RBP: ffff880136ebb720 R08: 0000003c4d0f4ef1 R09: 0000003c4d0f3c8b
R10: 0000000000000242 R11: 0000000000000000 R12: ffff88013904a748
R13: ffff8801392596d0 R14: ffff880137bdf218 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880002580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000040 CR3: 0000000138a16000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-8:16 (pid: 938, threadinfo ffff880136eba000, task ffff880136ddd540)
Stack:
ffff880136e2f000 0000000000000282 ffff880136ebb770 ffff88013b2978c0
<0> ffff880136ebb710 ffff8801392596d0 ffff88013904a748 ffff880136ebb7d0
<0> ffff880136e2f000 ffff8801388054e0 ffff880136ebb770 ffffffffa00eb886
Call Trace:
[<ffffffffa00eb886>] ext4_mb_release_group_pa+0x106/0x160 [ext4]
[<ffffffffa00ee3d8>] ext4_mb_discard_group_preallocations+0x418/0x4d0 [ext4]
[<ffffffffa00f134c>] ext4_mb_new_blocks+0x37c/0x4f0 [ext4]
[<ffffffffa00e8189>] ext4_ext_map_blocks+0x1449/0x1af0 [ext4]
[<ffffffff810d03d2>] ? ring_buffer_lock_reserve+0xa2/0x160
[<ffffffff812155b6>] ? __prop_inc_single+0x46/0x60
[<ffffffff810ff4c6>] ? __pagevec_release+0x26/0x40
[<ffffffffa00c7b10>] ext4_map_blocks+0xe0/0x200 [ext4]
[<ffffffffa00c8efd>] mpage_da_map_blocks+0xcd/0x420 [ext4]
[<ffffffffa00c9a6b>] ext4_da_writepages+0x2db/0x630 [ext4]
[<ffffffff810fdae1>] do_writepages+0x21/0x40
[<ffffffff81163e76>] writeback_single_inode+0xc6/0x2d0
[<ffffffff8116428e>] writeback_sb_inodes+0xce/0x180
[<ffffffff811643d9>] writeback_inodes_wb+0x99/0x180
[<ffffffff811646fb>] wb_writeback+0x23b/0x2a0
[<ffffffff811648cf>] wb_do_writeback+0x16f/0x180
[<ffffffff8106e1e0>] ? process_timeout+0x0/0x10
[<ffffffff81164937>] bdi_writeback_task+0x57/0x160
[<ffffffff8107d337>] ? bit_waitqueue+0x17/0xd0
[<ffffffff8110cc60>] ? bdi_start_fn+0x0/0xe0
[<ffffffff8110ccd1>] bdi_start_fn+0x71/0xe0
[<ffffffff8110cc60>] ? bdi_start_fn+0x0/0xe0
[<ffffffff8107cde6>] kthread+0x96/0xa0
[<ffffffff8100be84>] kernel_thread_helper+0x4/0x10
[<ffffffff8107cd50>] ? kthread+0x0/0xa0
[<ffffffff8100be80>] ? kernel_thread_helper+0x0/0x10
Code: 89 f8 e8 d8 af ff e0 48 85 c0 49 89 c6 74 45 48 89 c7 e8 58 5d ff e0 49 8b 55 08 4c 89 f1 48 89 de 8b 52 10 89 50 0c 49 8b 55 00 <48> 8b 52 40 48 89 50 10 49 8b 5424 40 48 89 50 18 41 8b 54 24
RIP [<ffffffffa00d73fc>] ftrace_raw_event_ext4_mb_release_group_pa+0x7c/0xe0 [ext4]
RSP <ffff880136ebb6d0>
CR2: 0000000000000040
---[ end trace 08bbe3845c7f3a09 ]---


2010-07-21 13:31:27

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

Hi Steven,

> create file: file_108.dat
> # sync
> (panic)
>
>
> Seems ac->ac_inode can be NULL:
>
> DECLARE_EVENT_CLASS(ext4__mballoc,
> ...
> TP_fast_assign(
> __entry->dev = ac->ac_inode->i_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> ...
> ),
> ...
> );

Can you teach us proper tracepint writing way?


ext4_mb_release_group_pa() has a concern when ac is NULL.

============================================================
static noinline_for_stack int
ext4_mb_release_group_pa(struct ext4_buddy *e4b,
struct ext4_prealloc_space *pa,
struct ext4_allocation_context *ac)
{
struct super_block *sb = e4b->bd_sb;
ext4_group_t group;
ext4_grpblk_t bit;

trace_ext4_mb_release_group_pa(ac, pa);
BUG_ON(pa->pa_deleted == 0);
ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, &bit);
BUG_ON(group != e4b->bd_group && pa->pa_len != 0);
mb_free_blocks(pa->pa_inode, e4b, bit, pa->pa_len);
atomic_add(pa->pa_len, &EXT4_SB(sb)->s_mb_discarded);

if (ac) { // here
ac->ac_sb = sb;
ac->ac_inode = NULL;
ac->ac_b_ex.fe_group = group;
ac->ac_b_ex.fe_start = bit;
ac->ac_b_ex.fe_len = pa->pa_len;
ac->ac_b_ex.fe_logical = 0;
trace_ext4_mballoc_discard(ac);
}

return 0;
}
===================================================

but trace_ext4_mb_release_group_pa() doesn't care it.
=====================================================
TRACE_EVENT(ext4_mb_release_group_pa,
TP_PROTO(struct ext4_allocation_context *ac,
struct ext4_prealloc_space *pa),

TP_ARGS(ac, pa),

TP_STRUCT__entry(
__field( dev_t, dev )
__field( ino_t, ino )
__field( __u64, pa_pstart )
__field( __u32, pa_len )

),

TP_fast_assign(
__entry->dev = ac->ac_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
),

TP_printk("dev %s pstart %llu len %u",
jbd2_dev_to_name(__entry->dev), __entry->pa_pstart, __entry->pa_len)
);
=====================================================

So, adding following branch should fix this issue.

if (ac)
trace_ext4_mb_release_group_pa(ac, pa);

But, I don't think this is proper fix because we don't want any overhead
if the tracepoint is disabled.

So, How do we check NULL in TP_fast_assign()?

Thanks.

2010-07-21 14:16:10

by Steven Rostedt

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

On Wed, 2010-07-21 at 22:31 +0900, KOSAKI Motohiro wrote:
> Hi Steven,

> if (ac)
> trace_ext4_mb_release_group_pa(ac, pa);
>
> But, I don't think this is proper fix because we don't want any overhead
> if the tracepoint is disabled.
>
> So, How do we check NULL in TP_fast_assign()?

You could do:

TP_fast_assign(
if (ac) {
__entry->dev = ac->ac_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
}
),

But this just makes the __entry null and wastes the ring buffer.

I may be able to add a __discard_entry that may help. Then we could do
something like this:

if (ac) {
__entry->dev = ac->ac_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
} else
__discard_entry;

Does this seem reasonable?

But for now, the wasting the entry seems to be the only choice we have,
or to do as you suggested and have the "if (ac) trace_...", but I don't
like that.

-- Steve


2010-07-21 14:21:21

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

On Wed, Jul 21, 2010 at 10:16:06AM -0400, Steven Rostedt wrote:
> On Wed, 2010-07-21 at 22:31 +0900, KOSAKI Motohiro wrote:
> > Hi Steven,
>
> > if (ac)
> > trace_ext4_mb_release_group_pa(ac, pa);
> >
> > But, I don't think this is proper fix because we don't want any overhead
> > if the tracepoint is disabled.
> >
> > So, How do we check NULL in TP_fast_assign()?
>
> You could do:
>
> TP_fast_assign(
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> }
> ),
>
> But this just makes the __entry null and wastes the ring buffer.
>
> I may be able to add a __discard_entry that may help. Then we could do
> something like this:
>
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> } else
> __discard_entry;
>
> Does this seem reasonable?
>
> But for now, the wasting the entry seems to be the only choice we have,
> or to do as you suggested and have the "if (ac) trace_...", but I don't
> like that.
>
> -- Steve


Is there no already existing branch in ext4 you could reuse in order to
send the trace only if (ac) ?

2010-07-22 05:40:27

by Li Zefan

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

Steven Rostedt wrote:
> On Wed, 2010-07-21 at 22:31 +0900, KOSAKI Motohiro wrote:
>> Hi Steven,
>
>> if (ac)
>> trace_ext4_mb_release_group_pa(ac, pa);
>>
>> But, I don't think this is proper fix because we don't want any overhead
>> if the tracepoint is disabled.
>>
>> So, How do we check NULL in TP_fast_assign()?
>
> You could do:
>
> TP_fast_assign(
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> }

This leaves __entry->dev etc as arbitrary value, since the entry returned
by the ring buffer is not zeroed, so I think better add an else branch to
zero those values.

> ),
>
> But this just makes the __entry null and wastes the ring buffer.
>
> I may be able to add a __discard_entry that may help. Then we could do
> something like this:
>
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> } else
> __discard_entry;
>
> Does this seem reasonable?
>
> But for now, the wasting the entry seems to be the only choice we have,
> or to do as you suggested and have the "if (ac) trace_...", but I don't
> like that.
>

As I was (and still am) not sure what is the best fix, I decided
to send out a bug report instead of a patch..

2010-07-22 05:50:06

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

On Wed, Jul 21, 2010 at 10:31:20PM +0900, KOSAKI Motohiro wrote:
> But, I don't think this is proper fix because we don't want any overhead
> if the tracepoint is disabled.
>
> So, How do we check NULL in TP_fast_assign()?

I think ext4 is simply using an incorrectly typed tracepoint here.
If you want it to be useful in any way it needs a sb paramter and
an optional inode paramter, not the allocation context.

Also the whole ext4_mb_release_group_pa function seems to be a bit
misdesigned. The code using ac is a totally separate block at the
end of the function and does work that's unrelated to the rest
of the function. Just making it a separate helper can calling it
only from those places that have the allocation context would make
the code more clear.

2010-07-23 01:13:29

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

On Thu, Jul 22, 2010 at 01:49:57AM -0400, Christoph Hellwig wrote:
>
> I think ext4 is simply using an incorrectly typed tracepoint here.
> If you want it to be useful in any way it needs a sb paramter and
> an optional inode paramter, not the allocation context.

I agree; this is the patch that I had whipped up to fix the problem.
(See below)

> Also the whole ext4_mb_release_group_pa function seems to be a bit
> misdesigned. The code using ac is a totally separate block at the
> end of the function and does work that's unrelated to the rest
> of the function. Just making it a separate helper can calling it
> only from those places that have the allocation context would make
> the code more clear.

I need to look more closely at this. If I had time there would be a
lot of things that I'd be refactoring and cleaning up in mballoc.c....

- Ted


>From 52f9a0d80ccdb1b23e364221167bb55b2886cc18 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <[email protected]>
Date: Thu, 22 Jul 2010 21:09:44 -0400
Subject: [PATCH] ext4: fix potential NULL dereference while tracing

The allocation_context pointer can be NULL.

Signed-off-by: "Theodore Ts'o" <[email protected]>
---
fs/ext4/mballoc.c | 4 ++--
include/trace/events/ext4.h | 20 ++++++++++++--------
2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 3dfad95..8b3b934 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3575,7 +3575,7 @@ ext4_mb_release_inode_pa(struct ext4_buddy *e4b, struct buffer_head *bitmap_bh,
trace_ext4_mballoc_discard(ac);
}

- trace_ext4_mb_release_inode_pa(ac, pa, grp_blk_start + bit,
+ trace_ext4_mb_release_inode_pa(sb, ac, pa, grp_blk_start + bit,
next - bit);
mb_free_blocks(pa->pa_inode, e4b, bit, next - bit);
bit = next + 1;
@@ -3606,7 +3606,7 @@ ext4_mb_release_group_pa(struct ext4_buddy *e4b,
ext4_group_t group;
ext4_grpblk_t bit;

- trace_ext4_mb_release_group_pa(ac, pa);
+ trace_ext4_mb_release_group_pa(sb, ac, pa);
BUG_ON(pa->pa_deleted == 0);
ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, &bit);
BUG_ON(group != e4b->bd_group && pa->pa_len != 0);
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index f3865c7..01e9e00 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -395,11 +395,12 @@ DEFINE_EVENT(ext4__mb_new_pa, ext4_mb_new_group_pa,
);

TRACE_EVENT(ext4_mb_release_inode_pa,
- TP_PROTO(struct ext4_allocation_context *ac,
+ TP_PROTO(struct super_block *sb,
+ struct ext4_allocation_context *ac,
struct ext4_prealloc_space *pa,
unsigned long long block, unsigned int count),

- TP_ARGS(ac, pa, block, count),
+ TP_ARGS(sb, ac, pa, block, count),

TP_STRUCT__entry(
__field( dev_t, dev )
@@ -410,8 +411,9 @@ TRACE_EVENT(ext4_mb_release_inode_pa,
),

TP_fast_assign(
- __entry->dev = ac->ac_sb->s_dev;
- __entry->ino = ac->ac_inode->i_ino;
+ __entry->dev = sb->s_dev;
+ __entry->ino = (ac && ac->ac_inode) ?
+ ac->ac_inode->i_ino : 0;
__entry->block = block;
__entry->count = count;
),
@@ -422,10 +424,11 @@ TRACE_EVENT(ext4_mb_release_inode_pa,
);

TRACE_EVENT(ext4_mb_release_group_pa,
- TP_PROTO(struct ext4_allocation_context *ac,
+ TP_PROTO(struct super_block *sb,
+ struct ext4_allocation_context *ac,
struct ext4_prealloc_space *pa),

- TP_ARGS(ac, pa),
+ TP_ARGS(sb, ac, pa),

TP_STRUCT__entry(
__field( dev_t, dev )
@@ -436,8 +439,9 @@ TRACE_EVENT(ext4_mb_release_group_pa,
),

TP_fast_assign(
- __entry->dev = ac->ac_sb->s_dev;
- __entry->ino = ac->ac_inode->i_ino;
+ __entry->dev = sb->s_dev;
+ __entry->ino = (ac && ac->ac_inode) ?
+ ac->ac_inode->i_ino : 0;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
),
--
1.7.0.4

2010-07-23 05:47:38

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

> On Thu, Jul 22, 2010 at 01:49:57AM -0400, Christoph Hellwig wrote:
> >
> > I think ext4 is simply using an incorrectly typed tracepoint here.
> > If you want it to be useful in any way it needs a sb paramter and
> > an optional inode paramter, not the allocation context.
>
> I agree; this is the patch that I had whipped up to fix the problem.
> (See below)

I'm not familiar ext4 so much. but you patch seems very nice!
thank you :)

>
> > Also the whole ext4_mb_release_group_pa function seems to be a bit
> > misdesigned. The code using ac is a totally separate block at the
> > end of the function and does work that's unrelated to the rest
> > of the function. Just making it a separate helper can calling it
> > only from those places that have the allocation context would make
> > the code more clear.
>
> I need to look more closely at this. If I had time there would be a
> lot of things that I'd be refactoring and cleaning up in mballoc.c....

2010-07-23 09:11:59

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences


On Jul 23, 2010, at 1:47 AM, KOSAKI Motohiro wrote:

>
> I'm not familiar ext4 so much. but you patch seems very nice!
> thank you :)

So, can someone confirm that this fixes the NULL pointer dereference? The patch itself makes sense, and it doesn't change the tracepoint output, so I'm inclined to include it, but I haven't been able to reproduce the failure myself using the reproduction recipe given at the beginning of this thread --- and I'd feel much better if someone who has been able to reproduce the crash can confirm this fixes things for them.

Thanks,

-- Ted
-

2010-07-23 09:14:17

by Li Zefan

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

Theodore Tso wrote:
> On Jul 23, 2010, at 1:47 AM, KOSAKI Motohiro wrote:
>
>> I'm not familiar ext4 so much. but you patch seems very nice!
>> thank you :)
>
> So, can someone confirm that this fixes the NULL pointer dereference? The patch itself makes sense, and it doesn't change the tracepoint output, so I'm inclined to include it, but I haven't been able to reproduce the failure myself using the reproduction recipe given at the beginning of this thread --- and I'd feel much better if someone who has been able to reproduce the crash can confirm this fixes things for them.
>

Thanks for the fix. I'll test the patch after the weekend.

2010-07-26 03:45:52

by Li Zefan

[permalink] [raw]
Subject: Re: [BUG] ext4 trace events cause NULL pointer dereferences

Theodore Tso wrote:
> On Jul 23, 2010, at 1:47 AM, KOSAKI Motohiro wrote:
>
>> I'm not familiar ext4 so much. but you patch seems very nice!
>> thank you :)
>
> So, can someone confirm that this fixes the NULL pointer dereference? The patch itself makes sense, and it doesn't change the tracepoint output, so I'm inclined to include it, but I haven't been able to reproduce the failure myself using the reproduction recipe given at the beginning of this thread --- and I'd feel much better if someone who has been able to reproduce the crash can confirm this fixes things for them.
>

The patch works!

Tested-by: Li Zefan <[email protected]>