Date: Fri, 24 Apr 2009 00:47:01 -0400
From: "Josef 'Jeff' Sipek"
To: Jens Axboe, Jeff Moyer
Cc: linux-kernel@vger.kernel.org
Subject: CFQ Oops

I got an oops with CFQ (see below) while running the XFS QA test 133. I
managed to bisect it down to commit a36e71f996e25d6213f57951f7ae1874086ec57e.

From a quick glance at the code, I'd guess that prio_trees[..] happens to be
NULL, and so the rb_erase_init call on line 660 results in a NULL ptr deref.

657         if (!RB_EMPTY_NODE(&cfqq->rb_node))
658                 cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
659         if (!RB_EMPTY_NODE(&cfqq->p_node))
660                 rb_erase_init(&cfqq->p_node, &cfqd->prio_trees[cfqq->ioprio]);
661
662         BUG_ON(!cfqd->busy_queues);
663         cfqd->busy_queues--;

Josef 'Jeff' Sipek.

a36e71f996e25d6213f57951f7ae1874086ec57e is first bad commit
commit a36e71f996e25d6213f57951f7ae1874086ec57e
Author: Jens Axboe
Date:   Wed Apr 15 12:15:11 2009 +0200

    cfq-iosched: add close cooperator code

    If we have processes that are working in close proximity to each other
    on disk, we don't want to idle wait. Instead allow the close process to
    issue a request, getting better aggregate bandwidth.
    The anticipatory scheduler has similar checks; noop and deadline do not
    need it since they don't care about process <-> io mappings. The code
    for CFQ is a little more involved though, since we split request queues
    into per-process contexts.

    This fixes a performance problem with eg dump(8), since it uses several
    processes in some silly attempt to speed IO up. Even if dump(8) isn't
    really a valid case (it should be fixed by using CLONE_IO), there are
    other cases where we see close processes and where idling ends up
    hurting performance.

    Credit goes to Jeff Moyer for writing the initial implementation.

    Signed-off-by: Jens Axboe

:040000 040000 2e905502bf3c466baae407bcd654fc36d015c83f b4aafd7edc811aed69fa44ddf00a29ece4f32a33 M      block

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] rb_erase+0x174/0x340
PGD f8843067 PUD f8842067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0b.0/0000:01:03.0/local_cpus
CPU 0
Modules linked in: xfs exportfs sco bridge stp llc bnep l2cap bluetooth ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput amd_rng ata_generic tg3 s2io libphy serio_raw i2c_amd8111 pcspkr i2c_amd756 pata_amd shpchp
Pid: 2926, comm: xfs_io Not tainted 2.6.30-rc3 #8 To be filled by O.E.M.
RIP: 0010:[] [] rb_erase+0x174/0x340
RSP: 0018:ffff8800788e9788  EFLAGS: 00010046
RAX: ffff8800f9dd12b1 RBX: ffff8800f9dd12b0 RCX: 0000000000000001
RDX: ffff8800f9dd12b0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff8800788e9798 R08: ffff8800f9dd1320 R09: 0000000000000000
R10: ffff8800f9dd12b0 R11: 0000000000000000 R12: ffff8800fa958a38
R13: ffff880074cc8e60 R14: ffff8800fa958a00 R15: ffff8800f9dd1320
FS:  00007f6262cc46f0(0000) GS:ffffc20000000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000f5848000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process xfs_io (pid: 2926, threadinfo ffff8800788e8000, task ffff880074d0c4d0)
Stack:
 ffff8800f9dd1320 ffff8800f9dd1350 ffff8800788e97d8 ffffffff81190a40
 ffff8800fa958a00 ffff8800f9dd1320 ffff880074cc8e60 ffff8800fa958a00
 ffff8800fa9208e0 ffff880074cc8e60 ffff8800788e9808 ffffffff81190b28
Call Trace:
 [] cfq_remove_request+0x1e0/0x280
 [] cfq_dispatch_insert+0x48/0x90
 [] cfq_dispatch_requests+0x1a5/0x4d0
 [] elv_next_request+0x162/0x200
 [] ? kobject_get+0x1a/0x30
 [] scsi_request_fn+0x62/0x560
 [] ? _spin_unlock_irqrestore+0x2f/0x40
 [] blk_start_queueing+0x1a/0x40
 [] cfq_insert_request+0x2d5/0x430
 [] elv_insert+0x120/0x2a0
 [] __elv_add_request+0x7b/0xd0
 [] __make_request+0x111/0x460
 [] generic_make_request+0x3b5/0x490
 [] submit_bio+0x70/0xf0
 [] dio_bio_submit+0x5e/0x90
 [] __blockdev_direct_IO+0x5c5/0xd60
 [] xfs_vm_direct_IO+0x10e/0x130 [xfs]
 [] ? xfs_get_blocks_direct+0x0/0x20 [xfs]
 [] ? xfs_end_io_direct+0x0/0x80 [xfs]
 [] generic_file_direct_write+0xbf/0x220
 [] xfs_write+0x3fb/0x9a0 [xfs]
 [] ? filemap_fault+0x13c/0x440
 [] xfs_file_aio_write+0x5d/0x70 [xfs]
 [] do_sync_write+0xf1/0x140
 [] ? autoremove_wake_function+0x0/0x40
 [] ? finish_task_switch+0x5b/0xe0
 [] ? xfs_end_bio_written+0x0/0x30 [xfs]
 [] ? thread_return+0x3e/0x6ae
 [] vfs_write+0xcb/0x190
 [] sys_pwrite64+0x92/0xa0
 [] system_call_fastpath+0x16/0x1b
Code: 7a ff ff ff 0f 1f 00 48 3b 78 10 0f 1f 40 00 0f 84 7a 01 00 00 48 89 48 08 66 0f 1f 44 00 00 e9 28 ff ff ff 0f 1f 00 48 8b 7b 08 <48> 8b 07 a8 01 0f 84 d1 00 00 00 48 8b 47 10 48 85 c0 74 09 f6
RIP [] rb_erase+0x174/0x340
 RSP
CR2: 0000000000000000
---[ end trace 79edd488f3f3e8df ]---

--
A computer without Microsoft is like chocolate cake without mustard.