Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751729AbZIWKNT (ORCPT ); Wed, 23 Sep 2009 06:13:19 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751466AbZIWKNS (ORCPT ); Wed, 23 Sep 2009 06:13:18 -0400
Received: from mail-ew0-f214.google.com ([209.85.219.214]:54515 "EHLO mail-ew0-f214.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751379AbZIWKNS convert rfc822-to-8bit (ORCPT ); Wed, 23 Sep 2009 06:13:18 -0400
X-Greylist: delayed 431 seconds by postgrey-1.27 at vger.kernel.org; Wed, 23 Sep 2009 06:13:17 EDT
DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma;
        h=mime-version:sender:in-reply-to:references:date
        :x-google-sender-auth:message-id:subject:from:to:cc:content-type
        :content-transfer-encoding;
        b=QZGkjVkgW7NpDCluFlmYHrsCWQ04ldCaa6TdZfGgPuqiPxR7gj0xfQqOD3lBK+kf10
        kiXahr7S+s2mjTrFNz8STnft4e5xqOnaBYPKNpH89LFyYltLUtLVad9V6r567LUnjOJp
        2JozfsqthG+vgiws75UhsxzrhaqLde3WqQXEY=
MIME-Version: 1.0
In-Reply-To:
References: <7863dc4c0909221409v7893bfd3o4b590d5951a233ba@mail.gmail.com> <20090922212453.GB6062@nowhere> <1253686585.7695.84.camel@twins> <20090923073253.GA18022@elte.hu> <20090923074028.GA3078@elte.hu> <7863dc4c0909230215u2fed3edciec84f93f24d3ae1@mail.gmail.com> <20090923092024.GA29323@elte.hu>
Date: Wed, 23 Sep 2009 11:06:06 +0100
X-Google-Sender-Auth: e47d94571a684ac0
Message-ID: <7863dc4c0909230306x3bc60775rc503919df83087ed@mail.gmail.com>
Subject: Re: perf sched record hangs machine
From: Chris Malley
To: Cyrill Gorcunov
Cc: Ingo Molnar, Peter Zijlstra, Frederic Weisbecker, linux-kernel@vger.kernel.org, Steven Rostedt
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 6028
Lines: 117

2009/9/23 Cyrill Gorcunov:
> On 9/23/09, Ingo Molnar wrote:
>>
>> Would still be important to fix the crash - there are boxes where lapics
>> are disabled permanently and cannot be re-enabled. (plus most people
>> dont touch their defaults and dont add funky boot options - so crashing
>> is not an option)
>>
>
> Ingo, Chris, could you try Peter's patch? It seems like what we need.
>
> (Peter, self-ipi shouldn't be separated from others ipi, yes it may
> not issue any cycle on fsb, but iirc it uses the same logic as other
> ipi use)
>

Applied Peter's patch, doesn't seem to have fixed the problem:

[ 246.408893] BUG: unable to handle kernel paging request at ffffb300
[ 246.408939] IP: [] default_send_IPI_self+0x1d/0x50
[ 246.408961] *pde = 0073f067 *pte = 00000000
[ 246.408985] Oops: 0000 [#1] SMP
[ 246.408996] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
[ 246.409007] Modules linked in: netconsole configfs binfmt_misc snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device ipw2200 libipw snd dcdbas cfg80211 intel_agp video soundcore sr_mod lib80211 output joydev pcspkr snd_page_alloc agpgart usb_storage usbhid ohci1394 tg3 ieee1394
[ 246.409112]
[ 246.409121] Pid: 4188, comm: firefox Not tainted (2.6.31-cjm-07092-g819307a #4) Latitude D400
[ 246.409126] EIP: 0060:[] EFLAGS: 00010046 CPU: 0
[ 246.409131] EIP is at default_send_IPI_self+0x1d/0x50
[ 246.409135] EAX: fffff000 EBX: 000000ec ECX: 00000800 EDX: ffffb300
[ 246.409140] ESI: f16cdc64 EDI: 00000000 EBP: f16cdc00 ESP: f16cdbfc
[ 246.409144] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 246.409150] Process firefox (pid: 4188, ti=f16cc000 task=f1465aa0 task.ti=f16cc000)
[ 246.409154] Stack:
[ 246.409158] f16c3e14 f16cdc08 c010e3b4 f16cdc28 c01b9751 f1602024 f1602020 00115838
[ 246.409179] <0> 00000000 f1602000 f16c2c00 f16cdc38 c01b981a f16cdc64 f16cdc84 f16cdc98
[ 246.409199] <0> c01ba690 f16c2c00 00000001 c030963e ffffffff ffffffff 00000000 00000001
[ 246.409223] Call Trace:
[ 246.409234] [] ? set_perf_event_pending+0x14/0x20
[ 246.409244] [] ? perf_output_unlock+0x121/0x1a0
[ 246.409249] [] ? perf_output_end+0x4a/0x70
[ 246.409255] [] ? __perf_event_overflow+0x240/0x2f0
[ 246.409264] [] ? atomic64_cmpxchg+0x1e/0x30
[ 246.409270] [] ? perf_swevent_ctx_event+0x1b4/0x1c0
[ 246.409276] [] ? perf_swevent_ctx_event+0x33/0x1c0
[ 246.409281] [] ? do_perf_sw_event+0xa7/0x160
[ 246.409286] [] ? perf_tp_event+0x82/0xa0
[ 246.409296] [] ? ftrace_profile_sched_stat_runtime+0xe6/0x120
[ 246.409301] [] ? ftrace_profile_sched_stat_runtime+0x0/0x120
[ 246.409307] [] ? update_curr+0x18a/0x230
[ 246.409313] [] ? enqueue_entity+0x15/0x460
[ 246.409319] [] ? task_rq_lock+0x47/0x80
[ 246.409324] [] ? enqueue_task_fair+0x31/0x70
[ 246.409331] [] ? enqueue_task+0x6d/0x90
[ 246.409336] [] ? activate_task+0x20/0x30
[ 246.409343] [] ? try_to_wake_up+0x1fb/0x2f0
[ 246.409351] [] ? hrtimer_wakeup+0x0/0x20
[ 246.409357] [] ? wake_up_process+0xf/0x20
[ 246.409365] [] ? hrtimer_wakeup+0x18/0x20
[ 246.409370] [] ? __run_hrtimer+0x6c/0xc0
[ 246.409379] [] ? _spin_lock+0x3a/0x40
[ 246.409384] [] ? hrtimer_interrupt+0x185/0x230
[ 246.409391] [] ? timer_interrupt+0x3c/0x50
[ 246.409402] [] ? handle_IRQ_event+0x50/0x140
[ 246.409407] [] ? _spin_unlock_irqrestore+0x55/0x60
[ 246.409413] [] ? handle_level_irq+0x64/0xf0
[ 246.409418] [] ? handle_level_irq+0x6e/0xf0
[ 246.409423] [] ? handle_irq+0x1a/0x30
[ 246.409428] [] ? do_IRQ+0x46/0xc0
[ 246.409437] [] ? trace_hardirqs_on_caller+0x12c/0x170
[ 246.409442] [] ? common_interrupt+0x2e/0x34
[ 246.409448] Code: 0f 44 c1 89 02 5b 5d c3 8d b6 00 00 00 00 55 89 e5 53 89 c3 a1 5c de 68 c0 8b 48 20 eb 02 f3 90 a1 c8 10 69 c0 8d 90 00 c3 ff ff <8b> 80 00 c3 ff ff f6 c4 10 75 e8 89 c8 81 c9 00 04 04 00 0d 00
[ 246.409591] EIP: [] default_send_IPI_self+0x1d/0x50 SS:ESP 0068:f16cdbfc
[ 246.409601] CR2: 00000000ffffb300
[ 246.409609] ---[ end trace 237505c339f73345 ]---
[ 246.409616] Kernel panic - not syncing: Fatal exception in interrupt
[ 246.409623] Pid: 4188, comm: firefox Tainted: G D 2.6.31-cjm-07092-g819307a #4
[ 246.409627] Call Trace:
[ 246.409633] [] ? printk+0x18/0x1b
[ 246.409638] [] panic+0x43/0x100
[ 246.409643] [] oops_end+0xb9/0xc0
[ 246.409648] [] no_context+0xb6/0x150
[ 246.409653] [] __bad_area_nosemaphore+0x63/0x180
[ 246.409659] [] ? __lock_acquire+0x193/0x1240
[ 246.409664] [] ? __lock_acquire+0x193/0x1240
[ 246.409670] [] ? __lock_acquire+0x193/0x1240
[ 246.409675] [] ? __lock_acquire+0x193/0x1240
[ 246.409680] [] bad_area_nosemaphore+0x12/0x20
[ 246.409687] [] do_page_fault+0x31c/0x3c0
[ 246.409692] [] ? do_page_fault+0x0/0x3c0
[ 246.409697] [] error_code+0x6b/0x70
[ 246.409703] [] ? down_write_trylock+0x1b/0x50
[ 246.409708] [] ? do_page_fault+0x0/0x3c0
[ 246.409714] [] ? default_send_IPI_self+0x1d/0x50
[ 246.409720] [] set_perf_event_pending+0x14/0x20
[ 246.409725] [] perf_output_unlock+0x121/0x1a0
[ 246.409732] [] perf_output_end+0x4a/0x70
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/