Date: Thu, 31 Jul 2008 18:56:50 +0200
From: Ingo Molnar
To: Dmitry Adamushko
Cc: Alistair John Strachan, Pekka Paalanen, Linus Torvalds, Linux Kernel Mailing List, shaohua.li@intel.com, tigran@aivazian.fsnet.co.uk, Thomas Gleixner, Steven Rostedt, Max Krasnyansky, Peter Zijlstra
Subject: Re: Oops in microcode sysfs registration,
Message-ID: <20080731165650.GJ26393@elte.hu>
References: <200807291457.58408.alistair@devzero.co.uk> <20080729192214.2d3a4ca5@daedalus.pq.iki.fi> <200807291750.41169.alistair@devzero.co.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

* Dmitry Adamushko wrote:

> 2008/7/30 Dmitry Adamushko:
> > 2008/7/29 Alistair John Strachan:
> >> On Tuesday 29 July 2008 17:22:14 Pekka Paalanen wrote:
> >>> > Also, I'm sure this is reproducible without the NVIDIA garbage, but I was
> >>> > too lazy to test it. If you want me to repeat the experiment without the
> >>> > driver I would be more than happy to do so.
> >>>
> >>> I'm not sure people are willing to look into this without a clean report,
> >>> so this would be cool. There's even a test module for mmiotrace in the
> >>> kernel, but I doubt it would make difference to use it or not, when trying
> >>> to reproduce the crash without the blob.
> >>
> >> Of course, and I should have attempted to reproduce without the driver.
> >> Fortunately that was easy: it is not an NVIDIA driver bug.
> >>
> >> Steps to reproduce: have CONFIG_MICROCODE=y and a suitable Intel
> >> processor, then do:
> >>
> >> echo mmiotrace >/debug/tracing/current_tracer
> >> echo none >/debug/tracing/current_tracer
> >>
> >> And you get this (snipped) oops:
> >>
> >> in mmio_trace_init
> >> mmiotrace: Disabling non-boot CPUs...
> >> kvm: disabling virtualization on CPU1
> >> CPU 1 is now offline
> >> SMP alternatives: switching to UP code
> >> CPU0 attaching NULL sched-domain.
> >> CPU1 attaching NULL sched-domain.
> >> CPU0 attaching NULL sched-domain.
> >> mmiotrace: CPU1 is down.
> >> mmiotrace: enabled.
> >> in mmio_trace_reset
> >> mmiotrace: Re-enabling CPUs...
> >> SMP alternatives: switching to SMP code
> >> Booting processor 1/1 ip 6000
> >> Initializing CPU#1
> >> Calibrating delay using timer specific routine.. <6>7204.76 BogoMIPS (lpj=3602381)
> >> CPU: L1 I cache: 32K, L1 D cache: 32K
> >> CPU: L2 cache: 4096K
> >> CPU: Physical Processor ID: 0
> >> CPU: Processor Core ID: 1
> >> x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
> >> CPU1: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz stepping 06
> >> checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> >> kvm: enabling virtualization on CPU1
> >> CPU0 attaching NULL sched-domain.
> >> Switched to high resolution mode on CPU 1
> >> CPU0 attaching sched-domain:
> >> domain 0: span 0-1 level MC
> >> groups: 0 1
> >> CPU1 attaching sched-domain:
> >> domain 0: span 0-1 level MC
> >> groups: 1 0
> >> ------------[ cut here ]------------
> >> Kernel BUG at ffffffff8021a31d [verbose debug info unavailable]
> >> invalid opcode: 0000 [1] PREEMPT SMP
> >> CPU 0
> >> Modules linked in: rfcomm l2cap kvm_intel kvm ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables bridge stp llc acpi_cpufreq freq_table coretemp hwmon
> >> snd_pcm_oss snd_mixer_oss firewire_sbp2 hci_usb bluetooth arc4 ecb crypto_blkcipher cryptomgr crypto_algapi usbhid zd1211rw mac80211 crypto cfg80211 snd_emu10k1 snd_rawmidi
> >> snd_ac97_codec ac97_bus sg snd_seq_device snd_hda_intel snd_pcm snd_util_mem snd_timer sr_mod snd_hwdep i2c_i801 ehci_hcd firewire_ohci uhci_hcd snd snd_page_alloc firewire_core
> >> soundcore r8169 cdrom usbcore i2c_core crc_itu_t
> >> Pid: 2757, comm: bash Tainted: G A 2.6.27-rc1-damocles #3
> >> RIP: 0010:[] [] __mc_sysdev_add+0xc3/0x1f1
> >> RSP: 0018:ffff8800b8905ce8 EFLAGS: 00010297
> >> RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff880080a04000
> >> RDX: ffffffff8062c680 RSI: 0000000000000003 RDI: ffffffff8059e830
> >> RBP: ffff8800b8905d48 R08: ffff8800b8904000 R09: ffffffff80229ca4
> >> R10: ffff8800010247b0 R11: ffff8800bf879de0 R12: 0000000000000018
> >> R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
> >> FS: 00007f8ddc78f6e0(0000) GS:ffffffff805da200(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> CR2: 00007f57cb9b2098 CR3: 00000000b8985000 CR4: 00000000000026e0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> Process bash (pid: 2757, threadinfo ffff8800b8904000, task ffff8800bd125640)
> >> Stack: ffffffff80627040 0000000000000000 0000000000000008 ffffffff8048bb28
> >> 0000000000000003 ffffffff802ce910 ffff8800b8905d28 0000000000000002
> >> 00000000ffffffe8 0000000000000001 0000000000000001 ffff880001028418
> >> Call Trace:
> >> [] ? sysfs_add_file+0xc/0xe
> >> [] mc_sysdev_add+0xb/0xd
> >> [] mc_cpu_callback+0x4b/0x208
> >> [] ? mce_cpu_callback+0x3e/0xbc
> >> [] notifier_call_chain+0x33/0x5b
> >> [] raw_notifier_call_chain+0xf/0x11
> >> [] _cpu_up+0xce/0x119
> >> [] cpu_up+0x5e/0x8a
> >> [] disable_mmiotrace+0xfe/0x173
> >> [] mmio_trace_reset+0x2d/0x44
> >> [] tracing_set_trace_write+0xd3/0x10f
> >> [] ? filp_close+0x67/0x72
> >> [] vfs_write+0xa7/0xe1
> >> [] sys_write+0x47/0x6f
> >> [] system_call_fastpath+0x16/0x1b
> >> [ 68.405002]
> >> [ 68.405002]
> >> Code: e8 59 80 e8 fd 69 26 00 48 c7 c2 80 c6 62 80 48 8b 05 c0 00 3c 00 48 8b 04 d8 48 8b 48 08 65 8b 04 25 24 00 00 00 44 39 e8 74 04 <0f> 0b eb fe 4c 8d 04 0a 41 c7 84 24 7c 36 64 80 00
> >> 00 00 00 41
> >> RIP [] __mc_sysdev_add+0xc3/0x1f1
> >> RSP
> >> ---[ end trace ee9c9240024cb48c ]---
> >>
> >> I've replaced the originally tainted dmesg with this new clean one, so
> >> there's no proprietary smell about it :-)
> >
> > Yes, it's kind of a known issue. Take a look at this explanation:
> > http://lkml.org/lkml/2008/7/24/260
> >
> > There were a few related discussions in other threads (mainly, Max
> > Krasnyansky and I were asking for additional info on possible
> > requirements from the 'microcode' driver...) heh, I think, we'd be
> > better off just fixing it one way or another.
>
> does a patch below fix it for you?
> [ not really what we wanted ]
>
> (non-white-space-damaged version is enclosed)

could you please send this patch with a changelog, explanation, etc.?

	Ingo