2009-04-21 18:45:53

by Balbir Singh

[permalink] [raw]
Subject: [BUG] rbtree bug with mmotm 2009-04-14-17-24

Hi, Andrew,

I did a quick check on lkml to see if someone reported this issue
already, I could not find any reports. I am beginning to see several
of these on my machine. I saw recent refactoring of rbtrees, I've
cc'ed Wolfram Strepp.


BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff803ba4c8>] rb_erase+0x11f/0x2a7
PGD 77d1c067 PUD 77c96067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 2
Modules linked in: coretemp hwmon kvm_intel kvm usbhid rtc_cmos rtc_core rtc_lib usb_storage ata_piix libata mptsas mptscsih mptbase scsi_transport_sas uhci_hcd ohci_hcd ehci_hcd usbcore
Pid: 5379, comm: make Not tainted 2.6.30-rc1-mm1-dirty #163 IBM System x3250 -[436482A]-
RIP: 0010:[<ffffffff803ba4c8>] [<ffffffff803ba4c8>] rb_erase+0x11f/0x2a7
RSP: 0018:ffff8800640f7838 EFLAGS: 00010006
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88007f1d3bf8
RDX: ffff88007f1d3bf8 RSI: ffff88007e0208c8 RDI: 0000000000000000
RBP: ffff8800640f7848 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88007dcbb480 R11: ffff88006408e978 R12: ffff88007e0208c8
R13: ffff88007e020890 R14: ffff88007e0208c8 R15: 0000000000000010
FS: 00002ae490d5c6f0(0000) GS:ffffc20000400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000077d8a000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process make (pid: 5379, threadinfo ffff8800640f6000, task ffff880077cc6630)
Stack:
ffff88007e0403d0 ffff88007e0403d0 ffff8800640f7868 ffffffff803b21ea
ffff88007e020890 ffff88007e0403a0 ffff8800640f78a8 ffffffff803b2237
000000057e0208c8 ffff88007e0403a0 ffff88007e0403a0 ffff88007e020890
Call Trace:
[<ffffffff803b21ea>] rb_erase_init+0x11/0x21
[<ffffffff803b2237>] cfq_prio_tree_add+0x3d/0xab
[<ffffffff803b24ba>] cfq_resort_rr_list+0x25/0x2a
[<ffffffff803b2513>] __cfq_slice_expired+0x54/0x94
[<ffffffff803b3522>] cfq_insert_request+0x295/0x2d0
[<ffffffff803a8404>] elv_insert+0x10f/0x1cd
[<ffffffff803a8558>] __elv_add_request+0x96/0x9e
[<ffffffff803ab284>] __make_request+0x3c2/0x402
[<ffffffff803a995c>] generic_make_request+0x293/0x32d
[<ffffffff80262f0c>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff8033b600>] ? ext3_get_block+0x0/0xe5
[<ffffffff803aaeb9>] submit_bio+0xc7/0xd0
[<ffffffff802ed865>] mpage_bio_submit+0x22/0x26
[<ffffffff802ee567>] mpage_readpages+0xe2/0xf6
[<ffffffff8033b600>] ? ext3_get_block+0x0/0xe5
[<ffffffff8033aadf>] ext3_readpages+0x1a/0x1c
[<ffffffff802982fc>] __do_page_cache_readahead+0x158/0x1fa
[<ffffffff80298210>] ? __do_page_cache_readahead+0x6c/0x1fa
[<ffffffff802983ba>] ra_submit+0x1c/0x20
[<ffffffff802986c1>] ondemand_readahead+0x223/0x236
[<ffffffff8029874c>] page_cache_async_readahead+0x78/0x84
[<ffffffff80290e98>] generic_file_aio_read+0x295/0x5c8
[<ffffffff802c752c>] do_sync_read+0xe2/0x126
[<ffffffff80265a6f>] ? __lock_acquire+0x76c/0x7d7
[<ffffffff8025691c>] ? autoremove_wake_function+0x0/0x38
[<ffffffff802eec6f>] ? dnotify_parent+0x6c/0x74
[<ffffffff802c7d5c>] vfs_read+0xab/0x154
[<ffffffff802c80b9>] sys_read+0x47/0x70
[<ffffffff8020baab>] system_call_fastpath+0x16/0x1b
Code: 89 01 48 85 d2 74 12 48 39 7a 10 75 06 48 89 4a 10 eb 0a 48 89 4a 08 eb 04 49 89 0c 24 41 ff c8 0f 85 88 01 00 00 e9 4b 01 00 00 <48> 8b 7b 10 48 39 cf 0f 85 a1 00 00 00 48 8b 7b 08 48 8b 07 a8
RIP [<ffffffff803ba4c8>] rb_erase+0x11f/0x2a7
RSP <ffff8800640f7838>
CR2: 0000000000000010


--
Balbir


2009-04-21 22:03:17

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24



Balbir Singh wrote:
> Hi, Andrew,
>
> I did a quick check on lkml to see if someone reported this issue
> already, I could not find any reports. I am beginning to see several
> of these on my machine. I saw recent refactoring of rbtrees, I've
> cc'ed Wolfram Strepp.
>
>


I see a similar crash (null ptr deref in rb_erase()) booting up
2.6.30-rc2/x86_64/centos 5.3 distro.

Steve.

2009-04-22 13:16:18

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Balbir Singh wrote:
> Hi, Andrew,
>
> I did a quick check on lkml to see if someone reported this issue
> already, I could not find any reports. I am beginning to see several
> of these on my machine. I saw recent refactoring of rbtrees, I've
> cc'ed Wolfram Strepp.

This has already been fixed and the code in question is also in mainline
at this point. So it looks like mmotm is just somewhat out of date.

--
Jens Axboe

2009-04-22 13:17:19

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Tue, Apr 21 2009, Steve Wise wrote:
>
>
> Balbir Singh wrote:
>> Hi, Andrew,
>>
>> I did a quick check on lkml to see if someone reported this issue
>> already, I could not find any reports. I am beginning to see several
>> of these on my machine. I saw recent refactoring of rbtrees, I've
>> cc'ed Wolfram Strepp.
>>
>>
>
>
> I see a similar crash (null ptr deref in rb_erase()) booting up
> 2.6.30-rc2/x86_64/centos 5.3 distro.

Plain 2.6.30-rc2? Please also post the oops, thanks!

--
Jens Axboe

2009-04-22 14:23:19

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

Jens Axboe wrote:
> On Tue, Apr 21 2009, Steve Wise wrote:
>
>> Balbir Singh wrote:
>>
>>> Hi, Andrew,
>>>
>>> I did a quick check on lkml to see if someone reported this issue
>>> already, I could not find any reports. I am beginning to see several
>>> of these on my machine. I saw recent refactoring of rbtrees, I've
>>> cc'ed Wolfram Strepp.
>>>
>>>
>>>
>> I see a similar crash (null ptr deref in rb_erase()) booting up
>> 2.6.30-rc2/x86_64/centos 5.3 distro.
>>
>
> Plain 2.6.30-rc2? Please also post the oops, thanks!
>
>

No there are a few patches applied that are heading upstream, but one is
in the NFS RDMA server which isn't loaded yet and the rest are in
iw_cxgb3 (iwarp driver) which also hasn't loaded at the time we crash.

NOTE: Out of 4 power cycles, one booted up ok, 3 hit the crash.



Here's the OOPS:

Starting udev: BUG: unable to handle kernel NULL pointer dereference at
0000000000000010
IP: [<ffffffff80341faf>] __rb_rotate_left+0x7/0x5b
PGD 12c4f6067 PUD 12c486067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/class/sound/controlC0/dev
CPU 0
Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd
button sr_mod cdrom i2c_i801 rtc_cmos serio_raw rtc_core cxgb3 sg r8169
floppy soundcore mii shpchp i2c_core rtc_lib snd_page_alloc pcspkr
dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ata_piix
libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2364, comm: vol_id Not tainted 2.6.30-rc2-stevo #1 P5E-VM HDMI
RIP: 0010:[<ffffffff80341faf>] [<ffffffff80341faf>]
__rb_rotate_left+0x7/0x5b
RSP: 0018:ffff88012bdb9990 EFLAGS: 00010086
RAX: ffff88012b663ac0 RBX: ffff88012b663ac0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88012d479a30 RDI: ffff88012b663ac0
RBP: ffff88012b663ac0 R08: ffff88012b663ac0 R09: 0000000000000000
R10: ffff88012d547808 R11: 0000000000000200 R12: ffff88012b663ac0
R13: 0000000000000000 R14: ffff88012d479a30 R15: 0000000000000000
FS: 00000000006e3880(0063) GS:ffff88002804b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 000000012acd8000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vol_id (pid: 2364, threadinfo ffff88012bdb8000, task
ffff88012e102240)
Stack:
ffffffff80342110 ffff88012b663a90 ffff88012b663ac0 ffff88012d479a00
ffff88012d479a30 ffff88012d53c158 ffffffff8033a676 ffff88012d479a00
ffff88012b663ac0 ffff88012b663c10 0000000000000000 ffff88012b663a90
Call Trace:
[<ffffffff80342110>] ? rb_insert_color+0xb2/0xda
[<ffffffff8033a676>] ? cfq_prio_tree_add+0x9d/0xa8
[<ffffffff8033b668>] ? cfq_add_rq_rb+0xcb/0xde
[<ffffffff8033b716>] ? cfq_insert_request+0x5b/0x390
[<ffffffff8032fb2c>] ? elv_insert+0x112/0x1c0
[<ffffffff80332790>] ? __make_request+0x3cf/0x40b
[<ffffffff80331036>] ? generic_make_request+0x277/0x311
[<ffffffff803323ba>] ? submit_bio+0xae/0xb5
[<ffffffff802c2c2f>] ? submit_bh+0xd9/0xf9
[<ffffffff802c579c>] ? block_read_full_page+0x247/0x264
[<ffffffff802c8c99>] ? blkdev_get_block+0x0/0x47
[<ffffffff80281447>] ? __do_page_cache_readahead+0x144/0x178
[<ffffffff80281638>] ? ondemand_readahead+0x13a/0x149
[<ffffffff8027b2d8>] ? generic_file_aio_read+0x219/0x539
[<ffffffff802a6986>] ? do_sync_read+0xc9/0x10c
[<ffffffff8024a4ae>] ? autoremove_wake_function+0x0/0x2e
[<ffffffff8028ca6c>] ? handle_mm_fault+0x32f/0x6f1
[<ffffffff802a70a9>] ? vfs_read+0xaa/0x133
[<ffffffff802a742a>] ? sys_read+0x45/0x6e
[<ffffffff8020b96b>] ? system_call_fastpath+0x16/0x1b
Code: 00 31 c0 eb 19 ff c0 48 89 ee 48 c7 c7 88 a5 cd 80 89 43 08 e8 29
66 17 00 b8 01 00 00 00 5a 5b 5d c3 90 90 48 8b 4f 08 4c 8b 07 <48> 8b
51 10 49 83 e0 fc 48 85 d2 48 89 57 08 74 0c 48 8b 02 83
RIP [<ffffffff80341faf>] __rb_rotate_left+0x7/0x5b
RSP <ffff88012bdb9990>
CR2: 0000000000000010
---[ end trace c900f92beb0e53d4 ]---

2009-04-22 14:26:21

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24


Steve Wise wrote:
> Jens Axboe wrote:
>> On Tue, Apr 21 2009, Steve Wise wrote:
>>
>>> Balbir Singh wrote:
>>>
>>>> Hi, Andrew,
>>>>
>>>> I did a quick check on lkml to see if someone reported this issue
>>>> already, I could not find any reports. I am beginning to see several
>>>> of these on my machine. I saw recent refactoring of rbtrees, I've
>>>> cc'ed Wolfram Strepp.
>>>>
>>>>
>>> I see a similar crash (null ptr deref in rb_erase()) booting up
>>> 2.6.30-rc2/x86_64/centos 5.3 distro.
>>>
>>
>> Plain 2.6.30-rc2? Please also post the oops, thanks!
>>
>>
>
> No there are a few patches applied that are heading upstream, but one
> is in the NFS RDMA server which isn't loaded yet and the rest are in
> iw_cxgb3 (iwarp driver) which also hasn't loaded at the time we crash.
> NOTE: Out of 4 power cycles, one booted up ok, 3 hit the crash.
>
>
>

By the way, this one looks different from the last one I saw. So its
not consistently crashing in the same spot. The one I hit yesterday
(which I don't have the OOPs dump for was in rb_erase().

Steve.

> Here's the OOPS:
>
> Starting udev: BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000010
> IP: [<ffffffff80341faf>] __rb_rotate_left+0x7/0x5b
> PGD 12c4f6067 PUD 12c486067 PMD 0
> Oops: 0000 [#1] SMP
> last sysfs file: /sys/class/sound/controlC0/dev
> CPU 0
> Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_realtek
> snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss
> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> snd_pcm snd_timer snd button sr_mod cdrom i2c_i801 rtc_cmos serio_raw
> rtc_core cxgb3 sg r8169 floppy soundcore mii shpchp i2c_core rtc_lib
> snd_page_alloc pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash
> dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd
> ohci_hcd ehci_hcd
> Pid: 2364, comm: vol_id Not tainted 2.6.30-rc2-stevo #1 P5E-VM HDMI
> RIP: 0010:[<ffffffff80341faf>] [<ffffffff80341faf>]
> __rb_rotate_left+0x7/0x5b
> RSP: 0018:ffff88012bdb9990 EFLAGS: 00010086
> RAX: ffff88012b663ac0 RBX: ffff88012b663ac0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff88012d479a30 RDI: ffff88012b663ac0
> RBP: ffff88012b663ac0 R08: ffff88012b663ac0 R09: 0000000000000000
> R10: ffff88012d547808 R11: 0000000000000200 R12: ffff88012b663ac0
> R13: 0000000000000000 R14: ffff88012d479a30 R15: 0000000000000000
> FS: 00000000006e3880(0063) GS:ffff88002804b000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000010 CR3: 000000012acd8000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process vol_id (pid: 2364, threadinfo ffff88012bdb8000, task
> ffff88012e102240)
> Stack:
> ffffffff80342110 ffff88012b663a90 ffff88012b663ac0 ffff88012d479a00
> ffff88012d479a30 ffff88012d53c158 ffffffff8033a676 ffff88012d479a00
> ffff88012b663ac0 ffff88012b663c10 0000000000000000 ffff88012b663a90
> Call Trace:
> [<ffffffff80342110>] ? rb_insert_color+0xb2/0xda
> [<ffffffff8033a676>] ? cfq_prio_tree_add+0x9d/0xa8
> [<ffffffff8033b668>] ? cfq_add_rq_rb+0xcb/0xde
> [<ffffffff8033b716>] ? cfq_insert_request+0x5b/0x390
> [<ffffffff8032fb2c>] ? elv_insert+0x112/0x1c0
> [<ffffffff80332790>] ? __make_request+0x3cf/0x40b
> [<ffffffff80331036>] ? generic_make_request+0x277/0x311
> [<ffffffff803323ba>] ? submit_bio+0xae/0xb5
> [<ffffffff802c2c2f>] ? submit_bh+0xd9/0xf9
> [<ffffffff802c579c>] ? block_read_full_page+0x247/0x264
> [<ffffffff802c8c99>] ? blkdev_get_block+0x0/0x47
> [<ffffffff80281447>] ? __do_page_cache_readahead+0x144/0x178
> [<ffffffff80281638>] ? ondemand_readahead+0x13a/0x149
> [<ffffffff8027b2d8>] ? generic_file_aio_read+0x219/0x539
> [<ffffffff802a6986>] ? do_sync_read+0xc9/0x10c
> [<ffffffff8024a4ae>] ? autoremove_wake_function+0x0/0x2e
> [<ffffffff8028ca6c>] ? handle_mm_fault+0x32f/0x6f1
> [<ffffffff802a70a9>] ? vfs_read+0xaa/0x133
> [<ffffffff802a742a>] ? sys_read+0x45/0x6e
> [<ffffffff8020b96b>] ? system_call_fastpath+0x16/0x1b
> Code: 00 31 c0 eb 19 ff c0 48 89 ee 48 c7 c7 88 a5 cd 80 89 43 08 e8
> 29 66 17 00 b8 01 00 00 00 5a 5b 5d c3 90 90 48 8b 4f 08 4c 8b 07 <48>
> 8b 51 10 49 83 e0 fc 48 85 d2 48 89 57 08 74 0c 48 8b 02 83
> RIP [<ffffffff80341faf>] __rb_rotate_left+0x7/0x5b
> RSP <ffff88012bdb9990>
> CR2: 0000000000000010
> ---[ end trace c900f92beb0e53d4 ]---
>
>
>

2009-04-22 14:38:54

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24


>>
>> No there are a few patches applied that are heading upstream, but one
>> is in the NFS RDMA server which isn't loaded yet and the rest are in
>> iw_cxgb3 (iwarp driver) which also hasn't loaded at the time we
>> crash. NOTE: Out of 4 power cycles, one booted up ok, 3 hit the crash.
>>
>>
>>
>
> By the way, this one looks different from the last one I saw. So its
> not consistently crashing in the same spot. The one I hit yesterday
> (which I don't have the OOPs dump for was in rb_erase().
>
> Steve.

I'll start bisecting to isolate this.

Steve.

2009-04-22 16:30:42

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Steve Wise wrote:
>
>>>
>>> No there are a few patches applied that are heading upstream, but one
>>> is in the NFS RDMA server which isn't loaded yet and the rest are in
>>> iw_cxgb3 (iwarp driver) which also hasn't loaded at the time we
>>> crash. NOTE: Out of 4 power cycles, one booted up ok, 3 hit the
>>> crash.
>>>
>>>
>>>
>>
>> By the way, this one looks different from the last one I saw. So its
>> not consistently crashing in the same spot. The one I hit yesterday
>> (which I don't have the OOPs dump for was in rb_erase().
>>
>> Steve.
>
> I'll start bisecting to isolate this.

Don't bother, I see what the bug is now. It's caused by commit
a36e71f996e25d6213f57951f7ae1874086ec57e and it's due to ->ioprio being
changed without the prio rb root being adjusted.

I'll provide a fix shortly.

--
Jens Axboe

2009-04-22 16:37:26

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Jens Axboe wrote:
> On Wed, Apr 22 2009, Steve Wise wrote:
> >
> >>>
> >>> No there are a few patches applied that are heading upstream, but one
> >>> is in the NFS RDMA server which isn't loaded yet and the rest are in
> >>> iw_cxgb3 (iwarp driver) which also hasn't loaded at the time we
> >>> crash. NOTE: Out of 4 power cycles, one booted up ok, 3 hit the
> >>> crash.
> >>>
> >>>
> >>>
> >>
> >> By the way, this one looks different from the last one I saw. So its
> >> not consistently crashing in the same spot. The one I hit yesterday
> >> (which I don't have the OOPs dump for was in rb_erase().
> >>
> >> Steve.
> >
> > I'll start bisecting to isolate this.
>
> Don't bother, I see what the bug is now. It's caused by commit
> a36e71f996e25d6213f57951f7ae1874086ec57e and it's due to ->ioprio being
> changed without the prio rb root being adjusted.
>
> I'll provide a fix shortly.

Quick'n dirty, I think this should fix the issue.

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7e13f04..79ebb4c 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -594,7 +594,7 @@ cfq_prio_tree_lookup(struct cfq_data *cfqd, int ioprio, sector_t sector,

static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
- struct rb_root *root = &cfqd->prio_trees[cfqq->ioprio];
+ struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];
struct rb_node **p, *parent;
struct cfq_queue *__cfqq;

@@ -606,8 +606,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
if (!cfqq->next_rq)
return;

- __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->ioprio, cfqq->next_rq->sector,
- &parent, &p);
+ __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->org_ioprio,
+ cfqq->next_rq->sector, &parent, &p);
BUG_ON(__cfqq);

rb_link_node(&cfqq->p_node, parent, p);
@@ -656,8 +656,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)

if (!RB_EMPTY_NODE(&cfqq->rb_node))
cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
- if (!RB_EMPTY_NODE(&cfqq->p_node))
- rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
+ if (!RB_EMPTY_NODE(&cfqq->p_node)) {
+ rb_erase_init(&cfqq->p_node,
+ &cfqd->prio_trees[cfqq->org_ioprio]);
+ }

BUG_ON(!cfqd->busy_queues);
cfqd->busy_queues--;
@@ -976,7 +978,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
* First, if we find a request starting at the end of the last
* request, choose it.
*/
- __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->ioprio,
+ __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->org_ioprio,
sector, &parent, NULL);
if (__cfqq)
return __cfqq;
@@ -1559,8 +1561,9 @@ cfq_alloc_io_context(struct cfq_data *cfqd, gfp_t gfp_mask)

static void cfq_init_prio_data(struct cfq_queue *cfqq, struct io_context *ioc)
{
+ struct cfq_data *cfqd = cfqq->cfqd;
struct task_struct *tsk = current;
- int ioprio_class;
+ int ioprio_class, prio_readd = 0;

if (!cfq_cfqq_prio_changed(cfqq))
return;
@@ -1592,12 +1595,24 @@ static void cfq_init_prio_data(struct cfq_queue *cfqq, struct io_context *ioc)
}

/*
+ * Remove us from the prio_tree if we are present, since we index
+ * by ->org_ioprio
+ */
+ if (!RB_EMPTY_NODE(&cfqq->p_node)) {
+ rb_erase(&cfqq->p_node, &cfqd->prio_trees[cfqq->org_ioprio]);
+ prio_readd = 1;
+ }
+
+ /*
* keep track of original prio settings in case we have to temporarily
* elevate the priority of this queue
*/
cfqq->org_ioprio = cfqq->ioprio;
cfqq->org_ioprio_class = cfqq->ioprio_class;
cfq_clear_cfqq_prio_changed(cfqq);
+
+ if (prio_readd)
+ cfq_prio_tree_add(cfqd, cfqq);
}

static void changed_ioprio(struct io_context *ioc, struct cfq_io_context *cic)

--
Jens Axboe

2009-04-22 18:16:31

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

Jens Axboe wrote:
>>
>> Don't bother, I see what the bug is now. It's caused by commit
>> a36e71f996e25d6213f57951f7ae1874086ec57e and it's due to ->ioprio being
>> changed without the prio rb root being adjusted.
>>
>> I'll provide a fix shortly.
>>
>
> Quick'n dirty, I think this should fix the issue.
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 7e13f04..79ebb4c 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -594,7 +594,7 @@ cfq_prio_tree_lookup(struct cfq_data *cfqd, int ioprio, sector_t sector,
>
> static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> - struct rb_root *root = &cfqd->prio_trees[cfqq->ioprio];
> + struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];
> struct rb_node **p, *parent;
> struct cfq_queue *__cfqq;
>
> @@ -606,8 +606,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> if (!cfqq->next_rq)
> return;
>
> - __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->ioprio, cfqq->next_rq->sector,
> - &parent, &p);
> + __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->org_ioprio,
> + cfqq->next_rq->sector, &parent, &p);
> BUG_ON(__cfqq);
>
> rb_link_node(&cfqq->p_node, parent, p);
> @@ -656,8 +656,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>
> if (!RB_EMPTY_NODE(&cfqq->rb_node))
> cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
> - if (!RB_EMPTY_NODE(&cfqq->p_node))
> - rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
> + if (!RB_EMPTY_NODE(&cfqq->p_node)) {
> + rb_erase_init(&cfqq->p_node,
> + &cfqd->prio_trees[cfqq->org_ioprio]);
> + }
>
> BUG_ON(!cfqd->busy_queues);
> cfqd->busy_queues--;
> @@ -976,7 +978,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
> * First, if we find a request starting at the end of the last
> * request, choose it.
> */
> - __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->ioprio,
> + __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->org_ioprio,
> sector, &parent, NULL);
> if (__cfqq)
> return __cfqq;
> @@ -1559,8 +1561,9 @@ cfq_alloc_io_context(struct cfq_data *cfqd, gfp_t gfp_mask)
>
> static void cfq_init_prio_data(struct cfq_queue *cfqq, struct io_context *ioc)
> {
> + struct cfq_data *cfqd = cfqq->cfqd;
> struct task_struct *tsk = current;
> - int ioprio_class;
> + int ioprio_class, prio_readd = 0;
>
> if (!cfq_cfqq_prio_changed(cfqq))
> return;
> @@ -1592,12 +1595,24 @@ static void cfq_init_prio_data(struct cfq_queue *cfqq, struct io_context *ioc)
> }
>
> /*
> + * Remove us from the prio_tree if we are present, since we index
> + * by ->org_ioprio
> + */
> + if (!RB_EMPTY_NODE(&cfqq->p_node)) {
> + rb_erase(&cfqq->p_node, &cfqd->prio_trees[cfqq->org_ioprio]);
> + prio_readd = 1;
> + }
> +
> + /*
> * keep track of original prio settings in case we have to temporarily
> * elevate the priority of this queue
> */
> cfqq->org_ioprio = cfqq->ioprio;
> cfqq->org_ioprio_class = cfqq->ioprio_class;
> cfq_clear_cfqq_prio_changed(cfqq);
> +
> + if (prio_readd)
> + cfq_prio_tree_add(cfqd, cfqq);
> }
>
> static void changed_ioprio(struct io_context *ioc, struct cfq_io_context *cic)
>
>

Still crashes:

Starting udev: BUG: unable to handle kernel NULL pointer dereference at
(null)
IP: [<ffffffff8034ee05>] rb_erase+0x130/0x2a7
PGD 12cd05067 PUD 12e09f067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/block/sda/sda3/dev
CPU 0
Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm cxgb3 r8169
snd_timer snd sg sr_mod cdrom rtc_cmos rtc_core button i2c_i801
serio_raw floppy soundcore shpchp mii rtc_lib i2c_core snd_page_alloc
pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod
ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2381, comm: vol_id Not tainted 2.6.30-rc2-jens #11 P5E-VM HDMI
RIP: 0010:[<ffffffff8034ee05>] [<ffffffff8034ee05>] rb_erase+0x130/0x2a7
RSP: 0018:ffff88012cc7f7f8 EFLAGS: 00010046
RAX: ffff88012cde2a21 RBX: ffff88012cde2a20 RCX: 0000000000000000
RDX: ffff88012cde2a20 RSI: ffff88012d421e30 RDI: 0000000000000000
RBP: ffff88012cc7f808 R08: 0000000000000000 R09: ffff88012d421e30
R10: ffff88012fb8a140 R11: ffff88012f03e3c0 R12: ffff88012d421e30
R13: ffff88012d421e00 R14: ffff88012cde2630 R15: 0000000000000000
FS: 00000000006e3880(0063) GS:ffff88002804b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000012b454000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vol_id (pid: 2381, threadinfo ffff88012cc7e000, task
ffff88012a901530)
Stack:
ffff88012cde2660 ffff88012d5a1788 ffff88012cc7f828 ffffffff8034690d
ffff88012cc7f828 ffff88012cde2630 ffff88012cc7f868 ffffffff803478e5
ffff88012d421e00 ffff88012cde2630 ffff88012d5a1788 ffff88012d421e00
Call Trace:
[<ffffffff8034690d>] rb_erase_init+0x11/0x21
[<ffffffff803478e5>] cfq_remove_request+0x184/0x1e6
[<ffffffff80347993>] cfq_dispatch_insert+0x4c/0x70
[<ffffffff80348618>] cfq_dispatch_requests+0x2fb/0x40a
[<ffffffff8033b344>] elv_next_request+0x193/0x1a8
[<ffffffff8034c1e1>] ? kobject_get+0x1a/0x22
[<ffffffffa005fe52>] scsi_request_fn+0x7a/0x496 [scsi_mod]
[<ffffffff8033d720>] blk_start_queueing+0x1a/0x23
[<ffffffff80347d2a>] cfq_insert_request+0x244/0x385
[<ffffffff8033b46a>] elv_insert+0x111/0x1bd
[<ffffffff8033b5ac>] __elv_add_request+0x96/0x9e
[<ffffffff8033e306>] __make_request+0x3c2/0x400
[<ffffffff802efb2e>] ? proc_pid_instantiate+0x87/0x9c
[<ffffffff8033ca9b>] generic_make_request+0x27f/0x319
[<ffffffff802cd91e>] ? bio_init+0x18/0x32
[<ffffffff8033df3b>] submit_bio+0xb4/0xbd
[<ffffffff802c9970>] submit_bh+0xe5/0x109
[<ffffffff802cc659>] block_read_full_page+0x261/0x27f
[<ffffffff802cfd5e>] ? blkdev_get_block+0x0/0x4e
[<ffffffff802cefc3>] blkdev_readpage+0x13/0x15
[<ffffffff80285329>] __do_page_cache_readahead+0x134/0x16c
[<ffffffff8028552d>] ondemand_readahead+0x143/0x155
[<ffffffff802855d7>] page_cache_sync_readahead+0x17/0x19
[<ffffffff8027ecac>] generic_file_aio_read+0x245/0x594
[<ffffffff802abff4>] do_sync_read+0xe2/0x126
[<ffffffff8028f553>] ? __do_fault+0x362/0x3ac
[<ffffffff8024c30c>] ? autoremove_wake_function+0x0/0x38
[<ffffffff8029104c>] ? handle_mm_fault+0x1d9/0x6d1
[<ffffffff8031a144>] ? security_file_permission+0x11/0x13
[<ffffffff802ac748>] vfs_read+0xab/0x134
[<ffffffff802acae9>] sys_read+0x47/0x70
[<ffffffff8020ba2b>] system_call_fastpath+0x16/0x1b
Code: eb 0a 48 89 4a 08 eb 04 49 89 0c 24 41 ff c8 0f 85 88 01 00 00 e9
4b 01 00 00 48 8b 7b 10 48 39 cf 0f 85 a1 00 00 00 48 8b 7b 08 <48> 8b
07 a8 01 75 1a 48 83 c8 01 4c 89 e6 48 89 07 48 83 23 fe
RIP [<ffffffff8034ee05>] rb_erase+0x130/0x2a7
RSP <ffff88012cc7f7f8>
CR2: 0000000000000000
---[ end trace 98881b739b7ec57a ]---

2009-04-22 18:34:25

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Steve Wise wrote:
> Jens Axboe wrote:
>>>
>>> Don't bother, I see what the bug is now. It's caused by commit
>>> a36e71f996e25d6213f57951f7ae1874086ec57e and it's due to ->ioprio being
>>> changed without the prio rb root being adjusted.
>>>
>>> I'll provide a fix shortly.
>>>
>>
>> Quick'n dirty, I think this should fix the issue.
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index 7e13f04..79ebb4c 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -594,7 +594,7 @@ cfq_prio_tree_lookup(struct cfq_data *cfqd, int ioprio, sector_t sector,
>> static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue
>> *cfqq)
>> {
>> - struct rb_root *root = &cfqd->prio_trees[cfqq->ioprio];
>> + struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];
>> struct rb_node **p, *parent;
>> struct cfq_queue *__cfqq;
>> @@ -606,8 +606,8 @@ static void cfq_prio_tree_add(struct cfq_data
>> *cfqd, struct cfq_queue *cfqq)
>> if (!cfqq->next_rq)
>> return;
>> - __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->ioprio,
>> cfqq->next_rq->sector,
>> - &parent, &p);
>> + __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->org_ioprio,
>> + cfqq->next_rq->sector, &parent, &p);
>> BUG_ON(__cfqq);
>> rb_link_node(&cfqq->p_node, parent, p);
>> @@ -656,8 +656,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>> if (!RB_EMPTY_NODE(&cfqq->rb_node))
>> cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
>> - if (!RB_EMPTY_NODE(&cfqq->p_node))
>> - rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
>> + if (!RB_EMPTY_NODE(&cfqq->p_node)) {
>> + rb_erase_init(&cfqq->p_node,
>> + &cfqd->prio_trees[cfqq->org_ioprio]);
>> + }
>> BUG_ON(!cfqd->busy_queues);
>> cfqd->busy_queues--;
>> @@ -976,7 +978,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
>> * First, if we find a request starting at the end of the last
>> * request, choose it.
>> */
>> - __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->ioprio,
>> + __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->org_ioprio,
>> sector, &parent, NULL);
>> if (__cfqq)
>> return __cfqq;
>> @@ -1559,8 +1561,9 @@ cfq_alloc_io_context(struct cfq_data *cfqd, gfp_t gfp_mask)
>> static void cfq_init_prio_data(struct cfq_queue *cfqq, struct
>> io_context *ioc)
>> {
>> + struct cfq_data *cfqd = cfqq->cfqd;
>> struct task_struct *tsk = current;
>> - int ioprio_class;
>> + int ioprio_class, prio_readd = 0;
>> if (!cfq_cfqq_prio_changed(cfqq))
>> return;
>> @@ -1592,12 +1595,24 @@ static void cfq_init_prio_data(struct cfq_queue *cfqq, struct io_context *ioc)
>> }
>> /*
>> + * Remove us from the prio_tree if we are present, since we index
>> + * by ->org_ioprio
>> + */
>> + if (!RB_EMPTY_NODE(&cfqq->p_node)) {
>> + rb_erase(&cfqq->p_node, &cfqd->prio_trees[cfqq->org_ioprio]);
>> + prio_readd = 1;
>> + }
>> +
>> + /*
>> * keep track of original prio settings in case we have to temporarily
>> * elevate the priority of this queue
>> */
>> cfqq->org_ioprio = cfqq->ioprio;
>> cfqq->org_ioprio_class = cfqq->ioprio_class;
>> cfq_clear_cfqq_prio_changed(cfqq);
>> +
>> + if (prio_readd)
>> + cfq_prio_tree_add(cfqd, cfqq);
>> }
>> static void changed_ioprio(struct io_context *ioc, struct
>> cfq_io_context *cic)
>>
>>
>
> Still crashes:

[snip]

Odd... Can you try this variant?

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7e13f04..3a97c18 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -154,6 +154,7 @@ struct cfq_queue {
unsigned long rb_key;
/* prio tree member */
struct rb_node p_node;
+ struct rb_root *p_root;
/* sorted list of pending requests */
struct rb_root sort_list;
/* if fifo isn't expired, next request to serve */
@@ -594,22 +595,25 @@ cfq_prio_tree_lookup(struct cfq_data *cfqd, int ioprio, sector_t sector,

static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
- struct rb_root *root = &cfqd->prio_trees[cfqq->ioprio];
+ struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];
struct rb_node **p, *parent;
struct cfq_queue *__cfqq;

- if (!RB_EMPTY_NODE(&cfqq->p_node))
- rb_erase_init(&cfqq->p_node, root);
+ if (cfqq->p_root) {
+ rb_erase_init(&cfqq->p_node, cfqq->p_root);
+ cfqq->p_root = NULL;
+ }

if (cfq_class_idle(cfqq))
return;
if (!cfqq->next_rq)
return;

- __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->ioprio, cfqq->next_rq->sector,
- &parent, &p);
+ __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->org_ioprio,
+ cfqq->next_rq->sector, &parent, &p);
BUG_ON(__cfqq);

+ cfqq->p_root = root;
rb_link_node(&cfqq->p_node, parent, p);
rb_insert_color(&cfqq->p_node, root);
}
@@ -656,8 +660,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)

if (!RB_EMPTY_NODE(&cfqq->rb_node))
cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
- if (!RB_EMPTY_NODE(&cfqq->p_node))
- rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
+ if (cfqq->p_root) {
+ rb_erase_init(&cfqq->p_node, cfqq->p_root);
+ cfqq->p_root = NULL;
+ }

BUG_ON(!cfqd->busy_queues);
cfqd->busy_queues--;
@@ -976,7 +982,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
* First, if we find a request starting at the end of the last
* request, choose it.
*/
- __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->ioprio,
+ __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->org_ioprio,
sector, &parent, NULL);
if (__cfqq)
return __cfqq;

--
Jens Axboe

2009-04-22 18:59:05

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24


> Odd... Can you try this variant?
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 7e13f04..3a97c18 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -154,6 +154,7 @@ struct cfq_queue {
> unsigned long rb_key;
> /* prio tree member */
> struct rb_node p_node;
> + struct rb_root *p_root;
> /* sorted list of pending requests */
> struct rb_root sort_list;
> /* if fifo isn't expired, next request to serve */
> @@ -594,22 +595,25 @@ cfq_prio_tree_lookup(struct cfq_data *cfqd, int ioprio, sector_t sector,
>
> static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> - struct rb_root *root = &cfqd->prio_trees[cfqq->ioprio];
> + struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];
> struct rb_node **p, *parent;
> struct cfq_queue *__cfqq;
>
> - if (!RB_EMPTY_NODE(&cfqq->p_node))
> - rb_erase_init(&cfqq->p_node, root);
> + if (cfqq->p_root) {
> + rb_erase_init(&cfqq->p_node, cfqq->p_root);
> + cfqq->p_root = NULL;
> + }
>
> if (cfq_class_idle(cfqq))
> return;
> if (!cfqq->next_rq)
> return;
>
> - __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->ioprio, cfqq->next_rq->sector,
> - &parent, &p);
> + __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->org_ioprio,
> + cfqq->next_rq->sector, &parent, &p);
> BUG_ON(__cfqq);
>
> + cfqq->p_root = root;
> rb_link_node(&cfqq->p_node, parent, p);
> rb_insert_color(&cfqq->p_node, root);
> }
> @@ -656,8 +660,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>
> if (!RB_EMPTY_NODE(&cfqq->rb_node))
> cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
> - if (!RB_EMPTY_NODE(&cfqq->p_node))
> - rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
> + if (cfqq->p_root) {
> + rb_erase_init(&cfqq->p_node, cfqq->p_root);
> + cfqq->p_root = NULL;
> + }
>
> BUG_ON(!cfqd->busy_queues);
> cfqd->busy_queues--;
> @@ -976,7 +982,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
> * First, if we find a request starting at the end of the last
> * request, choose it.
> */
> - __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->ioprio,
> + __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->org_ioprio,
> sector, &parent, NULL);
> if (__cfqq)
> return __cfqq;
>
>

Still crashes with this variant:

Starting udev: BUG: unable to handle kernel NULL pointer dereference at
0000000000000010
IP: [<ffffffff8034ee7b>] rb_erase+0x1ee/0x2a7
PGD 12e539067 PUD 12d506067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/block/sda/sda8/dev
CPU 0
Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm sg cxgb3
rtc_cmos snd_timer snd soundcore shpchp r8169 mii snd_page_alloc floppy
rtc_core sr_mod i2c_i801 rtc_lib cdrom serio_raw button i2c_core pcspkr
dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ata_piix
libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2379, comm: vol_id Not tainted 2.6.30-rc2-jens2 #12 P5E-VM HDMI
RIP: 0010:[<ffffffff8034ee7b>] [<ffffffff8034ee7b>] rb_erase+0x1ee/0x2a7
RSP: 0018:ffff88012b9777f8 EFLAGS: 00010082
RAX: 0000000000000000 RBX: ffff88012acedde0 RCX: ffff88012cd8d7f8
RDX: 0000000000000000 RSI: ffff88012fb8aa30 RDI: 0000000000000000
RBP: ffff88012b977808 R08: 0000000000000000 R09: ffff88012d4f2b28
R10: ffff88012fb8b740 R11: ffff88012f054680 R12: ffff88012fb8aa30
R13: ffff88012fb8aa00 R14: ffff88012acedf00 R15: 0000000000000000
FS: 00000000006e3880(0063) GS:ffff88002804b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 000000012c512000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vol_id (pid: 2379, threadinfo ffff88012b976000, task
ffff88012dd2ce20)
Stack:
ffff88012acedf30 ffff88012d4f2b28 ffff88012b977828 ffffffff8034690d
ffff88012b977828 ffff88012acedf00 ffff88012b977868 ffffffff8034787d
ffff88012fb8aa00 ffff88012acedf00 ffff88012d4f2b28 ffff88012fb8aa00
Call Trace:
[<ffffffff8034690d>] rb_erase_init+0x11/0x21
[<ffffffff8034787d>] cfq_remove_request+0x174/0x1de
[<ffffffff80347933>] cfq_dispatch_insert+0x4c/0x70
[<ffffffff803485c4>] cfq_dispatch_requests+0x301/0x417
[<ffffffff8033b344>] elv_next_request+0x193/0x1a8
[<ffffffff8034c199>] ? kobject_get+0x1a/0x22
[<ffffffffa005fe52>] scsi_request_fn+0x7a/0x496 [scsi_mod]
[<ffffffff8033d720>] blk_start_queueing+0x1a/0x23
[<ffffffff80347cca>] cfq_insert_request+0x244/0x38c
[<ffffffff8033b46a>] elv_insert+0x111/0x1bd
[<ffffffff8033b5ac>] __elv_add_request+0x96/0x9e
[<ffffffff8033e306>] __make_request+0x3c2/0x400
[<ffffffff8033ca9b>] generic_make_request+0x27f/0x319
[<ffffffff802cd91e>] ? bio_init+0x18/0x32
[<ffffffff8033df3b>] submit_bio+0xb4/0xbd
[<ffffffff802c9970>] submit_bh+0xe5/0x109
[<ffffffff802cc659>] block_read_full_page+0x261/0x27f
[<ffffffff802cfd5e>] ? blkdev_get_block+0x0/0x4e
[<ffffffff802cefc3>] blkdev_readpage+0x13/0x15
[<ffffffff80285329>] __do_page_cache_readahead+0x134/0x16c
[<ffffffff8028552d>] ondemand_readahead+0x143/0x155
[<ffffffff802855d7>] page_cache_sync_readahead+0x17/0x19
[<ffffffff8027ecac>] generic_file_aio_read+0x245/0x594
[<ffffffff802abff4>] do_sync_read+0xe2/0x126
[<ffffffff8028f553>] ? __do_fault+0x362/0x3ac
[<ffffffff8024c30c>] ? autoremove_wake_function+0x0/0x38
[<ffffffff8029104c>] ? handle_mm_fault+0x1d9/0x6d1
[<ffffffff8031a144>] ? security_file_permission+0x11/0x13
[<ffffffff802ac748>] vfs_read+0xab/0x134
[<ffffffff802acae9>] sys_read+0x47/0x70
[<ffffffff8020ba2b>] system_call_fastpath+0x16/0x1b
Code: e8 d3 fb ff ff e9 86 00 00 00 48 8b 07 a8 01 75 1a 48 83 c8 01 4c
89 e6 48 89 07 48 83 23 fe 48 89 df e8 11 fc ff ff 48 8b 7b 10 <48> 8b
57 10 48 85 d2 74 05 f6 02 01 74 2c 48 8b 47 08 48 85 c0
RIP [<ffffffff8034ee7b>] rb_erase+0x1ee/0x2a7
RSP <ffff88012b9777f8>
CR2: 0000000000000010
---[ end trace e20a9dac927dc3b0 ]---

2009-04-22 19:22:28

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Steve Wise wrote:
>
>> Odd... Can you try this variant?
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index 7e13f04..3a97c18 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -154,6 +154,7 @@ struct cfq_queue {
>> unsigned long rb_key;
>> /* prio tree member */
>> struct rb_node p_node;
>> + struct rb_root *p_root;
>> /* sorted list of pending requests */
>> struct rb_root sort_list;
>> /* if fifo isn't expired, next request to serve */
>> @@ -594,22 +595,25 @@ cfq_prio_tree_lookup(struct cfq_data *cfqd, int ioprio, sector_t sector,
>> static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue
>> *cfqq)
>> {
>> - struct rb_root *root = &cfqd->prio_trees[cfqq->ioprio];
>> + struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];
>> struct rb_node **p, *parent;
>> struct cfq_queue *__cfqq;
>> - if (!RB_EMPTY_NODE(&cfqq->p_node))
>> - rb_erase_init(&cfqq->p_node, root);
>> + if (cfqq->p_root) {
>> + rb_erase_init(&cfqq->p_node, cfqq->p_root);
>> + cfqq->p_root = NULL;
>> + }
>> if (cfq_class_idle(cfqq))
>> return;
>> if (!cfqq->next_rq)
>> return;
>> - __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->ioprio,
>> cfqq->next_rq->sector,
>> - &parent, &p);
>> + __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->org_ioprio,
>> + cfqq->next_rq->sector, &parent, &p);
>> BUG_ON(__cfqq);
>> + cfqq->p_root = root;
>> rb_link_node(&cfqq->p_node, parent, p);
>> rb_insert_color(&cfqq->p_node, root);
>> }
>> @@ -656,8 +660,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>> if (!RB_EMPTY_NODE(&cfqq->rb_node))
>> cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
>> - if (!RB_EMPTY_NODE(&cfqq->p_node))
>> - rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
>> + if (cfqq->p_root) {
>> + rb_erase_init(&cfqq->p_node, cfqq->p_root);
>> + cfqq->p_root = NULL;
>> + }
>> BUG_ON(!cfqd->busy_queues);
>> cfqd->busy_queues--;
>> @@ -976,7 +982,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
>> * First, if we find a request starting at the end of the last
>> * request, choose it.
>> */
>> - __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->ioprio,
>> + __cfqq = cfq_prio_tree_lookup(cfqd, cur_cfqq->org_ioprio,
>> sector, &parent, NULL);
>> if (__cfqq)
>> return __cfqq;
>>
>>
>
> Still crashes with this variant:

OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
and see if that works?

Or, better yet, please try and revert
55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
try the above revert.

--
Jens Axboe

2009-04-22 20:16:20

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

Jens Axboe wrote:



>> Still crashes with this variant:
>>
>
> OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
> and see if that works?
>
> Or, better yet, please try and revert
> 55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
> try the above revert.
>
>

Reverting 55a63998b8967615a15e2211ba0ff3a84a565824 did the trick.


Steve.

2009-04-22 20:33:19

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

Steve Wise wrote:
> Jens Axboe wrote:
>
>
>
>>> Still crashes with this variant:
>>>
>>
>> OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
>> and see if that works?
>>
>> Or, better yet, please try and revert
>> 55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
>> try the above revert.
>>
>>
>
> Reverting 55a63998b8967615a15e2211ba0ff3a84a565824 did the trick.
>
>
> Steve.
>
>

<snipit from 55a63998b8967615a15e2211ba0ff3a84a565824>

@@ -200,17 +197,14 @@ static void __rb_erase_color(struct rb_node *node,
struct rb_node *parent,
{
if (!other->rb_left ||
rb_is_black(other->rb_left))
{
- register struct rb_node *o_right;
- if ((o_right = other->rb_right))
- rb_set_black(o_right);
+ rb_set_black(other->rb_right);
rb_set_red(other);
__rb_rotate_left(other, root);
other = parent->rb_left;
}
rb_set_color(other, rb_color(parent));
rb_set_black(parent);
- if (other->rb_left)
- rb_set_black(other->rb_left);
+ rb_set_black(other->rb_left);
__rb_rotate_right(parent, root);
node = root->rb_node;
break;


I don't know this code, but isn't the 'if (other->rb_left)' really
needed? Or is it always true that if '!other->rb_left' is true entering
this snipit, then after executing the first 'if' block, then
'other->rb_left' must be a valid ptr? (how's that for confusing english? :)

Steve

2009-04-23 05:34:19

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Steve Wise wrote:
> Steve Wise wrote:
>> Jens Axboe wrote:
>>
>>
>>
>>>> Still crashes with this variant:
>>>>
>>>
>>> OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
>>> and see if that works?
>>>
>>> Or, better yet, please try and revert
>>> 55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
>>> try the above revert.
>>>
>>>
>>
>> Reverting 55a63998b8967615a15e2211ba0ff3a84a565824 did the trick.
>>
>>
>> Steve.
>>
>>
>
> <snipit from 55a63998b8967615a15e2211ba0ff3a84a565824>
>
> @@ -200,17 +197,14 @@ static void __rb_erase_color(struct rb_node *node,
> struct rb_node *parent,
> {
> if (!other->rb_left ||
> rb_is_black(other->rb_left))
> {
> - register struct rb_node *o_right;
> - if ((o_right = other->rb_right))
> - rb_set_black(o_right);
> + rb_set_black(other->rb_right);
> rb_set_red(other);
> __rb_rotate_left(other, root);
> other = parent->rb_left;
> }
> rb_set_color(other, rb_color(parent));
> rb_set_black(parent);
> - if (other->rb_left)
> - rb_set_black(other->rb_left);
> + rb_set_black(other->rb_left);
> __rb_rotate_right(parent, root);
> node = root->rb_node;
> break;
>
>
> I don't know this code, but isn't the 'if (other->rb_left)' really
> needed? Or is it always true that if '!other->rb_left' is true entering
> this snipit, then after executing the first 'if' block, then
> 'other->rb_left' must be a valid ptr? (how's that for confusing english?
> :)

Heh, not sure. What is sure is that the commit in question is
problematic. Either because it itself has a bug, or because it exposes a
bug elsewhere in the rbtree code. So I'd suggest we just revert that
commit ASAP.

--
Jens Axboe

2009-04-23 10:41:12

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

On Wed, Apr 22 2009, Steve Wise wrote:
> Jens Axboe wrote:
>
>
>
>>> Still crashes with this variant:
>>>
>>
>> OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
>> and see if that works?
>>
>> Or, better yet, please try and revert
>> 55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
>> try the above revert.
>>
>>
>
> Reverting 55a63998b8967615a15e2211ba0ff3a84a565824 did the trick.

Steve, can you (and Balbir) try with pulling this into 2.6.30-rc3:

git://git.kernel.dk/linux-2.6-block.git for-linus

It has two fixes for possible rbtree misuse in cfq. If it still crashes
without reverting 55a63998b8967615a15e2211ba0ff3a84a565824 with that,
then we also need to look further into the rbtree issue.

Thanks!

--
Jens Axboe

2009-04-23 21:24:44

by Steve Wise

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

Jens Axboe wrote:
> On Wed, Apr 22 2009, Steve Wise wrote:
>
>> Jens Axboe wrote:
>>
>>
>>
>>
>>>> Still crashes with this variant:
>>>>
>>>>
>>> OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
>>> and see if that works?
>>>
>>> Or, better yet, please try and revert
>>> 55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
>>> try the above revert.
>>>
>>>
>>>
>> Reverting 55a63998b8967615a15e2211ba0ff3a84a565824 did the trick.
>>
>
> Steve, can you (and Balbir) try with pulling this into 2.6.30-rc3:
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> It has two fixes for possible rbtree misuse in cfq. If it still crashes
> without reverting 55a63998b8967615a15e2211ba0ff3a84a565824 with that,
> then we also need to look further into the rbtree issue.
>

rc3 + your commits in the above branch seem to resolve the issue as well.


Steve.

2009-04-24 05:29:42

by Balbir Singh

[permalink] [raw]
Subject: Re: [BUG] rbtree bug with mmotm 2009-04-14-17-24

* Jens Axboe <[email protected]> [2009-04-23 12:40:55]:

> On Wed, Apr 22 2009, Steve Wise wrote:
> > Jens Axboe wrote:
> >
> >
> >
> >>> Still crashes with this variant:
> >>>
> >>
> >> OK, so please try and revert a36e71f996e25d6213f57951f7ae1874086ec57e
> >> and see if that works?
> >>
> >> Or, better yet, please try and revert
> >> 55a63998b8967615a15e2211ba0ff3a84a565824 first. If that doesn't work,
> >> try the above revert.
> >>
> >>
> >
> > Reverting 55a63998b8967615a15e2211ba0ff3a84a565824 did the trick.
>
> Steve, can you (and Balbir) try with pulling this into 2.6.30-rc3:
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> It has two fixes for possible rbtree misuse in cfq. If it still crashes
> without reverting 55a63998b8967615a15e2211ba0ff3a84a565824 with that,
> then we also need to look further into the rbtree issue.
>

OK, I'll give it a try and get back with what I see.

--
Balbir