Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758052AbZDPTR2 (ORCPT ); Thu, 16 Apr 2009 15:17:28 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754549AbZDPTRS (ORCPT ); Thu, 16 Apr 2009 15:17:18 -0400 Received: from mx2.redhat.com ([66.187.237.31]:59190 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757217AbZDPTRR (ORCPT ); Thu, 16 Apr 2009 15:17:17 -0400 Date: Thu, 16 Apr 2009 15:15:07 -0400 From: Vivek Goyal To: Gui Jianfeng Cc: nauman@google.com, dpshah@google.com, lizf@cn.fujitsu.com, mikew@google.com, fchecconi@gmail.com, paolo.valente@unimore.it, jens.axboe@oracle.com, ryov@valinux.co.jp, fernando@intellilink.co.jp, s-uchida@ap.jp.nec.com, taka@valinux.co.jp, arozansk@redhat.com, jmoyer@redhat.com, oz-kernel@redhat.com, dhaval@linux.vnet.ibm.com, balbir@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org, akpm@linux-foundation.org, menage@google.com, peterz@infradead.org Subject: Re: [PATCH] IO-Controller: Fix kernel panic after moving a task Message-ID: <20090416191507.GG8896@redhat.com> References: <1236823015-4183-1-git-send-email-vgoyal@redhat.com> <1236823015-4183-6-git-send-email-vgoyal@redhat.com> <49E6C14F.3090009@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <49E6C14F.3090009@cn.fujitsu.com> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7271 Lines: 193 On Thu, Apr 16, 2009 at 01:25:35PM +0800, Gui Jianfeng wrote: > Vivek Goyal wrote: > > +#ifdef CONFIG_IOSCHED_CFQ_HIER > > +static void changed_cgroup(struct io_context *ioc, struct cfq_io_context *cic) > > +{ > > + struct cfq_queue *async_cfqq = cic_to_cfqq(cic, 0); > > + struct cfq_queue *sync_cfqq = cic_to_cfqq(cic, 1); > > + struct cfq_data *cfqd = cic->key; > > + struct io_group *iog, *__iog; > > + unsigned long flags; > > + struct request_queue *q; > > + > > + if (unlikely(!cfqd)) > > + return; > > + > > + q = cfqd->q; > > + > > + spin_lock_irqsave(q->queue_lock, flags); > > + > > + iog = io_lookup_io_group_current(q); > > + > > Hi Vivek, > > I triggered another kernel panic when testing. When moving a task to another > cgroup, the corresponding iog may not be setup properly all the time. "iog" > might be NULL here. io_ioq_move() receives a NULL iog, kernel crash. > > Consider the following piece of code: > > 941 int elv_set_request(struct request_queue *q, struct request *rq, gfp_t gfp_mask) > 942 { > 943 struct elevator_queue *e = q->elevator; > 944 > 945 elv_fq_set_request_io_group(q, rq); > > -->task moving to a new group is happenning here. > > 946 > 947 /* > 948 * Optimization for noop, deadline and AS which maintain only single > 949 * ioq per io group > 950 */ > 951 if (elv_iosched_single_ioq(e)) > 952 return elv_fq_set_request_ioq(q, rq, gfp_mask); > 953 > 954 if (e->ops->elevator_set_req_fn) > 955 return e->ops->elevator_set_req_fn(q, rq, gfp_mask); > > cfq_set_request() will finally call io_ioq_move(), but the iog is NULL, beacause the iogs in the > hierarchy are not built yet. So kernel crashes. > > 956 > 957 rq->elevator_private = NULL; > 958 return 0; > 959 } > Thanks Gui. Good catch. > BUG: unable to handle kernel NULL pointer dereference at 000000bc > IP: [] io_ioq_move+0xf2/0x109 > *pde = 6cc00067 > Oops: 0000 [#1] SMP > last sysfs file: /sys/block/hdb/queue/slice_idle > Modules linked in: ipv6 cpufreq_ondemand acpi_cpufreq dm_mirror dm_multipath sbs sbshc battery ac lp snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm serio_raw snd_timer rtc_cmos parport_pc snd r8169 button rtc_core parport soundcore mii i2c_i801 rtc_lib snd_page_alloc pcspkr i2c_core dm_region_hash dm_log dm_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd > > Pid: 5431, comm: dd Not tainted (2.6.29-rc7-vivek #19) Veriton M460 > EIP: 0060:[] EFLAGS: 00010046 CPU: 0 > EIP is at io_ioq_move+0xf2/0x109 > EAX: f6203a88 EBX: f6792c94 ECX: f6203a84 EDX: 00000006 > ESI: 00000000 EDI: 00000000 EBP: f6203a60 ESP: f6304c28 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > Process dd (pid: 5431, ti=f6304000 task=f669dae0 task.ti=f6304000) > Stack: > f62478c0 0100dd40 f6247908 f62d995c 00000000 00000000 f675b54c c04e9182 > f638e9b0 00000282 f62d99a4 f6325a2c c04e9113 f5a707c0 c04e7ae0 f675b000 > f62d95fc f6325a2c c04e8501 00000010 f631e4e8 f675b000 00080000 ffffff10 > Call Trace: > [] changed_cgroup+0x6f/0x8d > [] changed_cgroup+0x0/0x8d > [] __call_for_each_cic+0x1b/0x25 > [] cfq_set_request+0x158/0x2c7 > [] _spin_unlock_irqrestore+0x5/0x6 > [] elv_fq_set_request_io_group+0x2b/0x3e > [] cfq_set_request+0x0/0x2c7 > [] elv_set_request+0x3e/0x4e > [] get_request+0x1ed/0x29b > [] get_request_wait+0xdf/0xf2 > [] __make_request+0x2c6/0x372 > [] do_mpage_readpage+0x4fe/0x5e3 > [] generic_make_request+0x2d0/0x355 > [] submit_bio+0x92/0x97 > [] add_to_page_cache_locked+0x8a/0xb7 > [] mpage_end_io_read+0x0/0x50 > [] mpage_bio_submit+0x19/0x1d > [] mpage_readpages+0x9b/0xa5 > [] ext3_readpages+0x0/0x15 [ext3] > [] __do_page_cache_readahead+0xea/0x154 > [] ext3_get_block+0x0/0xbe [ext3] > [] generic_file_aio_read+0x276/0x569 > [] do_sync_read+0xbf/0xfe > [] getnstimeofday+0x51/0xdb > [] autoremove_wake_function+0x0/0x2d > [] sched_slice+0x61/0x6a > [] task_tick_fair+0x3d/0x60 > [] security_file_permission+0xc/0xd > [] do_sync_read+0x0/0xfe > [] vfs_read+0x6c/0x8b > [] sys_read+0x3c/0x63 > [] sysenter_do_call+0x12/0x21 > [] schedule+0x551/0x830 > Code: 08 31 c9 89 da e8 77 fc ff ff 8b 86 bc 00 00 00 85 ff 89 43 38 8d 46 60 89 43 40 74 1d 83 c4 0c 89 d8 5b 5e 5f 5d e9 aa f9 ff ff <8b> 86 bc 00 00 00 89 43 38 8d 46 60 89 43 40 83 c4 0c 5b 5e 5f > EIP: [] io_ioq_move+0xf2/0x109 SS:ESP 0068:f6304c28 > > Changelog: > > Make sure iogs in the hierarchy are built properly after moving a task to a new cgroup. > > Signed-off-by: Gui Jianfeng > --- > block/cfq-iosched.c | 4 +++- > block/elevator-fq.c | 1 + > block/elevator-fq.h | 1 + > 3 files changed, 5 insertions(+), 1 deletions(-) > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c > index 0ecf7c7..6d7bb8a 100644 > --- a/block/cfq-iosched.c > +++ b/block/cfq-iosched.c > @@ -12,6 +12,8 @@ > #include > #include > #include > +#include "elevator-fq.h" > + I think above explicit inclusion of "elevator-fq.h" might be unnecessary as elevator.h includes elevator-fq.h and cfq-iosched.c is including elevator.h > /* > * tunables > */ > @@ -1086,7 +1088,7 @@ static void changed_cgroup(struct io_context *ioc, struct cfq_io_context *cic) > > spin_lock_irqsave(q->queue_lock, flags); > > - iog = io_lookup_io_group_current(q); > + iog = io_get_io_group(q); A one line comment here explaining the need to get_io_group instead of lookup_io_group will be nice. Thanks Vivek > > if (async_cfqq != NULL) { > __iog = cfqq_to_io_group(async_cfqq); > diff --git a/block/elevator-fq.c b/block/elevator-fq.c > index df53418..f81cf6a 100644 > --- a/block/elevator-fq.c > +++ b/block/elevator-fq.c > @@ -1191,6 +1191,7 @@ struct io_group *io_get_io_group(struct request_queue *q) > > return iog; > } > +EXPORT_SYMBOL(io_get_io_group); > > void io_free_root_group(struct elevator_queue *e) > { > diff --git a/block/elevator-fq.h b/block/elevator-fq.h > index fc4110d..f17e425 100644 > --- a/block/elevator-fq.h > +++ b/block/elevator-fq.h > @@ -459,6 +459,7 @@ static inline struct io_group *ioq_to_io_group(struct io_queue *ioq) > } > > #ifdef CONFIG_GROUP_IOSCHED > +extern struct io_group *io_get_io_group(struct request_queue *q); > extern int io_group_allow_merge(struct request *rq, struct bio *bio); > extern void io_ioq_move(struct elevator_queue *e, struct io_queue *ioq, > struct io_group *iog); > -- > 1.5.4.rc3 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/