From: Chr <chunkeey@web.de>
To: "Pallipadi, Venkatesh"
Cc: "Jike Song", "Linux Kernel" <linux-kernel@vger.kernel.org>,
 "Ingo Molnar", "Thomas Gleixner", "Brown, Len"
Subject: Re: endless loop in native_flush_tlb_others in smp_64.c
Date: Tue, 11 Mar 2008 22:43:26 +0100
User-Agent: KMail/1.9.9
Message-Id: <200803112243.27056.chunkeey@web.de>
In-Reply-To: <924EFEDD5F540B4284297C4DC59F3DEEAF9755@orsmsx423.amr.corp.intel.com>
References: <200803102316.54457.chunkeey@web.de>
 <200803111130.55355.chunkeey@web.de>
 <924EFEDD5F540B4284297C4DC59F3DEEAF9755@orsmsx423.amr.corp.intel.com>

On Tuesday 11 March 2008 12:09:24 you wrote:
> On Tue, 11 Mar 2008, Jike Song wrote:
>
> Any chance that you can capture SYSRQ-T output via serial or
> netconsole, so we can see the stacktrace and what the other CPUs are
> doing, if they are doing anything.

This time with a 2.6.25-rc4-wl (unfortunately tainted again).

The serial console seems to work: GPFs all over the place... take a look
here: http://www.pastebin.ca/938757 (my capture setup is sketched at the
end of this mail).

Since I get so many different Oopses, I'm beginning to suspect that my fancy
JFS/ReiserFS/Ext3:DM-Crypt:LVM2:MD(RAID1) combo causes memory
corruptions/leaks/voodoo... like this other tragic incident:

loop0         D ffff810079331bd0     0 15716      2
 ffff810079331b40 0000000000000046 ffff810062295c90 ffffffff804028e0
 ffff810069608800 ffff810079331af0 ffffc20010af7040 ffffffff805f6700
 ffffffff805f6700 ffffffff805f2f50 ffffffff805f6700 ffff81007a7df830
Call Trace:
 [] __split_bio+0x367/0x378
 [] generic_unplug_device+0x18/0x24
 [] dm_table_unplug_all+0x2a/0x3d
 [] sync_buffer+0x0/0x3f
 [] io_schedule+0x28/0x34
 [] sync_buffer+0x3b/0x3f
 [] __wait_on_bit+0x40/0x6e
 [] sync_buffer+0x0/0x3f
 [] out_of_line_wait_on_bit+0x6c/0x78
 [] wake_bit_function+0x0/0x23
 [] ll_rw_block+0x8c/0xaf
 [] __block_prepare_write+0x366/0x3b9
 [] ext3_get_block+0x0/0xf9
 [] block_write_begin+0x78/0xc9
 [] ext3_write_begin+0xeb/0x1aa
 [] ext3_get_block+0x0/0xf9
 [] do_lo_send_aops+0x9f/0x177
 [] do_lo_send_aops+0x0/0x177
 [] loop_thread+0x2ce/0x425
 [] loop_thread+0x0/0x425
 [] kthread+0x47/0x76
 [] schedule_tail+0x28/0x5c
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x76
 [] child_rip+0x0/0x12

Situation: the system died after writing >2 GB from /dev/zero (gosh, only
about 1 MB/s-500 kB/s!!) into a file on a _mounted_ loop device backed by an
old hdd image file sitting on the jfs/dm-crypt/lvm2 combo (rough commands
below).

BTW: the bisect is still running... the regression seems to have sneaked in
between 2.6.24 and 2.6.25-rc1; however, ~4000 diffs will take a while
(recipe below)... (it takes so long since the raid has to resync on each
reboot... Thank *** that this is just a stress-testing system that can take
some beating without _falling_ apart. ;-) )
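For reference, the capture setup is roughly the following (the addresses,
MAC, and interface name are stand-ins for my box, adjust to taste):

  # serial console: append to the kernel command line
  console=ttyS0,115200 console=tty0

  # or netconsole (syntax: src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac):
  netconsole=6665@10.0.0.1/eth0,6666@10.0.0.2/00:11:22:33:44:55

  # on the receiving box:
  netcat -u -l -p 6666

  # then, to get the task dump on the console:
  sysctl -w kernel.sysrq=1
  echo t > /proc/sysrq-trigger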
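For the record, the reproducer boils down to something like this (the paths
are made up; the real image file lives on the jfs/dm-crypt/lvm2 stack):

  # attach the old hdd image to a loop device and mount it
  # (a whole-disk image would need "losetup -o <offset>" for the partition)
  losetup /dev/loop0 /mnt/crypt/old-hdd.img
  mount /dev/loop0 /mnt/image

  # write >2 GB of zeroes into a file on the mounted loop device;
  # throughput drops to ~1 MB/s-500 kB/s and eventually the box dies
  dd if=/dev/zero of=/mnt/image/bigfile bs=1M count=2500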
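And the bisect is just the usual dance, roughly:

  git bisect start
  git bisect bad v2.6.25-rc1
  git bisect good v2.6.24
  # build + boot each candidate, run the dd test above, then:
  git bisect good        # or "git bisect bad", depending on the outcome
  # ~4000 commits means about 12 rounds before git names the culprit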
Regards,
Chr