Message-ID: <4D670C14.9040504@cn.fujitsu.com>
Date: Fri, 25 Feb 2011 09:55:32 +0800
From: Gui Jianfeng
To: Vivek Goyal
CC: Jens Axboe, Justin TerAvest, "jmoyer@redhat.com", Chad Talbott, lkml
Subject: Re: [PATCH 0/6 v5.1] cfq-iosched: Introduce CFQ group hierarchical scheduling and "use_hierarchy" interface

Vivek Goyal wrote:
> On Wed, Feb 23, 2011 at 11:01:35AM +0800, Gui Jianfeng wrote:
>> Hi
>>
>> I rebased this series on top of the *for-next* branch, which should make merging easier.
>>
>> Previously, I posted a patchset adding support for CFQ group hierarchical scheduling
>> by putting all CFQ queues in a hidden group and scheduling it with the other
>> CFQ groups under their parent. The patchset is available here:
>> http://lkml.org/lkml/2010/8/30/30
>
> Gui,
>
> I was running some tests (iostest) with these patches and my system crashed
> after a while.
>
> To be precise I was running "brrmmap" test of iostest.

Vivek,

I ran iostest in brrmmap mode but couldn't reproduce this bug. Could you
give more details? Which iostest command-line options did you use? Did you
enable use_hierarchy in the root group?

Thanks,
Gui

>
> train.lab.bos.redhat.com login: [72194.404201] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
> [72642.818976] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
> [72931.409460] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [72931.410216] IP: [] __rb_rotate_left+0xb/0x64
> [72931.410216] PGD 134d80067 PUD 12f524067 PMD 0
> [72931.410216] Oops: 0000 [#1] SMP
> [72931.410216] last sysfs file: /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
> [72931.410216] CPU 3
> [72931.410216] Modules linked in: kvm_intel kvm qla2xxx scsi_transport_fc [last unloaded: scsi_wait_scan]
> [72931.410216]
> [72931.410216] Pid: 18675, comm: sh Not tainted 2.6.38-rc4+ #3 0A98h/HP xw8600 Workstation
> [72931.410216] RIP: 0010:[] [] __rb_rotate_left+0xb/0x64
> [72931.410216] RSP: 0000:ffff88012f461480 EFLAGS: 00010086
> [72931.410216] RAX: 0000000000000000 RBX: ffff880135f40c00 RCX: ffffffffffffdcc8
> [72931.410216] RDX: ffff880135f43800 RSI: ffff880135f43000 RDI: ffff880135f42c00
> [72931.410216] RBP: ffff88012f461480 R08: ffff880135f40c00 R09: ffff880135f43018
> [72931.410216] R10: 0000000000000000 R11: 0000001000000000 R12: ffff880135f42c00
> [72931.410216] R13: ffff880135f41808 R14: ffff880135f43000 R15: ffff880135f40c00
> [72931.410216] FS: 0000000000000000(0000) GS:ffff8800bfcc0000(0000) knlGS:0000000000000000
> [72931.410216] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [72931.410216] CR2: 0000000000000010 CR3: 000000013774f000 CR4: 00000000000006e0
> [72931.410216] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [72931.410216] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [72931.410216] Process sh (pid: 18675, threadinfo ffff88012f460000, task ffff8801376e6f90)
> [72931.410216] Stack:
> [72931.410216] ffff88012f4614b8 ffffffff81226778 ffff880135f43000 ffff880135f43000
> [72931.410216] ffff88011c5bed00 0000000000000000 0000000000000001 ffff88012f4614d8
> [72931.410216] ffffffff8121c521 0000001000000000 ffff880135f41800 ffff88012f461528
> [72931.410216] Call Trace:
> [72931.410216] [] rb_insert_color+0xbc/0xe5
> [72931.410216] [] __cfq_entity_service_tree_add+0x76/0xa5
> [72931.410216] [] cfq_service_tree_add+0x383/0x3eb
> [72931.410216] [] cfq_resort_rr_list+0x1a/0x2a
> [72931.410216] [] cfq_add_rq_rb+0xbd/0xff
> [72931.410216] [] cfq_insert_request+0xc2/0x556
> [72931.410216] [] elv_insert+0x118/0x188
> [72931.410216] [] __elv_add_request+0x6e/0x75
> [72931.410216] [] __make_request+0x3ac/0x42f
> [72931.410216] [] generic_make_request+0x2ec/0x356
> [72931.410216] [] submit_bio+0xd1/0xdc
> [72931.410216] [] submit_bh+0xe6/0x108
> [72931.410216] [] __bread+0x4c/0x6f
> [72931.410216] [] ext3_get_branch+0x64/0xdf
> [72931.410216] [] ext3_get_blocks_handle+0x9b/0x90b
> [72931.410216] [] ext3_get_block+0xb6/0xf6
> [72931.410216] [] do_mpage_readpage+0x198/0x4bd
> [72931.410216] [] ? __inc_zone_page_state+0x29/0x2b
> [72931.410216] [] ? add_to_page_cache_locked+0xb6/0x10d
> [72931.410216] [] mpage_readpages+0xd6/0x123
> [72931.410216] [] ? ext3_get_block+0x0/0xf6
> [72931.410216] [] ? ext3_get_block+0x0/0xf6
> [72931.410216] [] ? alloc_pages_current+0xa2/0xc5
> [72931.410216] [] ext3_readpages+0x18/0x1a
> [72931.410216] [] __do_page_cache_readahead+0x111/0x1a7
> [72931.410216] [] ra_submit+0x1c/0x20
> [72931.410216] [] filemap_fault+0x165/0x35b
> [72931.410216] [] __do_fault+0x50/0x3e2
> [72931.410216] [] handle_pte_fault+0x2ff/0x779
> [72931.410216] [] ? __free_pages+0x1b/0x24
> [72931.410216] [] handle_mm_fault+0x15f/0x173
> [72931.410216] [] do_page_fault+0x348/0x36a
> [72931.410216] [] ? path_put+0x1d/0x21
> [72931.410216] [] ? path_put+0x1d/0x21
> [72931.410216] [] page_fault+0x1f/0x30
> [72931.410216] Code: 48 83 c4 18 44 89 e8 5b 41 5c 41 5d c9 c3 48 83 7b 18 00 0f 84 71 ff ff ff e9 77 ff ff ff 90 90 48 8b 47 08 55 48 8b 17 48 89 e5 <48> 8b 48 10 48 83 e2 fc 48 85 c9 48 89 4f 08 74 10 4c 8b 40 10
> [72931.410216] RIP [] __rb_rotate_left+0xb/0x64
> [72931.410216] RSP
> [72931.410216] CR2: 0000000000000010
> [72931.410216] ---[ end trace cddc7a4456407f6a ]---
>
> Thanks
> Vivek
>
>> Vivek thinks this approach isn't very intuitive and that we should treat CFQ queues
>> and groups at the same level. Here is the new approach for hierarchical
>> scheduling based on Vivek's suggestion. The biggest change to CFQ is that
>> it gets rid of the cfq_slice_offset logic and makes use of vdisktime for CFQ
>> queue scheduling, just as CFQ group does. But I still give cfqq some jump
>> in vdisktime based on ioprio; thanks to Vivek for pointing this out. Now CFQ
>> queue and CFQ group use the same scheduling algorithm.
>>
>> The "use_hierarchy" interface is now added to switch between hierarchical mode
>> and flat mode. It works the same way as in memcg.
>>
>> V4 -> V5 Changes:
>> - Change boosting base to a smaller value.
>> - Rename repostion_time to position_time
>> - Replace duplicated code by calling cfq_scale_slice()
>> - Remove redundant use_hierarchy in cfqd
>> - Fix grp_service_tree comment
>> - Rename init_cfqe() to init_group_cfqe()
>>
>> --
>> V3 -> V4 Changes:
>> - Take io class into account when calculating the boost value.
>> - Refine the vtime boosting logic as per Vivek's suggestion.
>> - Make the calculation of group slice cross all service trees under a group.
>> - Modify Documentation according to Vivek's comments.
>>
>> --
>> V2 -> V3 Changes:
>> - Starting from cfqd->grp_service_tree for both hierarchical mode and flat mode
>> - Avoid recursion when allocating cfqg and force dispatch logic
>> - Fix a bug when boosting vdisktime
>> - Adjusting total_weight accordingly when changing weight
>> - Change group slice calculation into a hierarchical way
>> - Keep flat mode rather than deleting it first then adding it later
>> - kfree the parent cfqg if nobody references it
>> - Simplify select_queue logic by using some wrapper functions
>> - Make "use_hierarchy" interface work as memcg
>> - Make use of time_before() for vdisktime compare
>> - Update Document
>> - Fix some code style problems
>>
>> --
>> V1 -> V2 Changes:
>> - Rename "struct io_sched_entity" to "struct cfq_entity" and don't differentiate
>>   queue_entity and group_entity, just use cfqe instead.
>> - Give newly added cfqq a small vdisktime jump according to its ioprio.
>> - Make flat mode the default CFQ group scheduling mode.
>> - Introduce "use_hierarchy" interface.
>> - Update blkio cgroup documents
>>
>>  Documentation/cgroups/blkio-controller.txt |   81 +-
>>  block/blk-cgroup.c                         |   61 +
>>  block/blk-cgroup.h                         |    3
>>  block/cfq-iosched.c                        |  959 ++++++++++++++++++++---------
>>  4 files changed, 815 insertions(+), 289 deletions(-)
>>
>> Thanks,
>> Gui
>

--
Regards
Gui Jianfeng