Message-ID: <19f34abd0807231545u5bc8b55fm768527a02268f111@mail.gmail.com>
Date: Thu, 24 Jul 2008 00:45:36 +0200
From: "Vegard Nossum"
To: "Dmitry Adamushko"
Subject: Re: recent -git: BUG in free_thread_xstate
Cc: "Suresh Siddha", LKML, "the arch/x86 maintainers", "Paul E. McKenney", "Ingo Molnar", "Peter Zijlstra", netdev@vger.kernel.org
In-Reply-To: <19f34abd0807231505w1a25c2bak329a622f3a287e97@mail.gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 24, 2008 at 12:05 AM, Vegard Nossum wrote:
> On Thu, Jul 24, 2008 at 12:01 AM, Dmitry Adamushko wrote:
>> So I guess, 'cpu' value is slightly, well, out of reality. Check the
>> address of "runqueues" in your kernel image...
>> I guess, it should be quite close to the "fault" address... then we
>> can even calculate 'cpu' :-)
>
> Yup, that's right.
>
> $ nm vmlinux | grep runqueues
> c0803f00 d per_cpu__runqueues

Hey, with this patch applied:

diff --git a/include/asm-x86/string_32.h b/include/asm-x86/string_32.h
index b49369a..7bef7ea 100644
--- a/include/asm-x86/string_32.h
+++ b/include/asm-x86/string_32.h
@@ -29,9 +29,14 @@ extern char *strchr(const char *s, int c);
 #define __HAVE_ARCH_STRLEN
 extern size_t strlen(const char *s);
 
+extern void warn_on_slowpath(const char *file, int line);
+
 static __always_inline void * __memcpy(void * to, const void * from, size_t n)
 {
 	int d0, d1, d2;
+	if (n == 0x6b)
+		warn_on_slowpath(__FILE__, __LINE__);
+
 	__asm__ __volatile__(
 		"rep ; movsl\n\t"
 		"movl %4,%%ecx\n\t"

I have found an important clue; it seems to be my network driver's fault:

------------[ cut here ]------------
WARNING: at include2/asm/string_32.h:38 skb_copy_and_csum_dev+0xee/0x100()
Pid: 3989, comm: bash Tainted: G W 2.6.26-dirty #3
 [] warn_on_slowpath+0x4f/0x70
 [] ? check_bytes_and_report+0x21/0xc0
 [] ? __kfree_skb+0x34/0x80
 [] ? check_bytes_and_report+0x21/0xc0
 [] ? check_object+0xdf/0x1f0
 [] ? check_bytes_and_report+0x21/0xc0
 [] ? __kfree_skb+0x34/0x80
 [] ? check_object+0xdf/0x1f0
 [] ? find_skb+0x3c/0x80
 [] skb_copy_and_csum_dev+0xee/0x100
 [] rtl8139_start_xmit+0x57/0x130
 [] ? __kmalloc_track_caller+0x8b/0x120
 [] netpoll_send_skb+0x14e/0x1a0
 [] netpoll_send_udp+0x1e4/0x210
 [] write_msg+0x8c/0xc0
 [] __call_console_drivers+0x53/0x60
 [] _call_console_drivers+0x4b/0x90
 [] release_console_sem+0xc5/0x1f0
 [] vprintk+0x2ce/0x420
 [] ? do_IRQ+0x4d/0xa0
 [] ? restore_nocheck+0x12/0x15
 [] ? delay_tsc+0x61/0xb8
 [] ? delay_tsc+0x86/0xb8
 [] printk+0x1b/0x20
 [] native_cpu_up+0x7cd/0x880
 [] ? internal_create_group+0xd1/0x180
 [] ? do_fork_idle+0x0/0x20
 [] ? __raw_notifier_call_chain+0x19/0x20
 [] _cpu_up+0x83/0x100
 [] cpu_up+0x49/0x70
 [] store_online+0x58/0x80
 [] ? store_online+0x0/0x80
 [] sysdev_store+0x2b/0x40
 [] sysfs_write_file+0xa2/0x100
 [] vfs_write+0x96/0x130
 [] ? sysfs_write_file+0x0/0x100
 [] sys_write+0x3d/0x70
 [] sysenter_past_esp+0x78/0xd1
 =======================
---[ end trace a7919e7f17c0a725 ]---

In particular, these are interesting:

 [] skb_copy_and_csum_dev+0xee/0x100

This is net/core/skbuff.c:1731:

	skb_copy_from_linear_data(skb, to, csstart);

 [] rtl8139_start_xmit+0x57/0x130

This is drivers/net/8139too.c:1711:

	dev_kfree_skb(skb);

(The line numbers are still from v2.6.26, but this reproduces on
current -git as well.)

Is this enough information to fix it? :-)


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
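
The patch in this message uses a simple debugging trick: trap one suspicious argument value in a hot copy helper and let the warning identify who passed it. Below is a minimal userspace sketch of the same idea in C, not the kernel code itself. The names SUSPECT_LEN, suspect_hits, warn_at, traced_memcpy and TRACED_MEMCPY are hypothetical and exist only for illustration; warn_at merely stands in for warn_on_slowpath(). The value 0x6b is taken from the patch. It also happens to equal the slab allocator's POISON_FREE byte, although the message does not say that this is why it was chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sentinel: the copy length the patch traps on. */
#define SUSPECT_LEN 0x6b

/* Hypothetical counter: how many times the trap has fired. */
static unsigned long suspect_hits;

/* Stand-in for warn_on_slowpath(): count the hit and report the call site. */
static void warn_at(const char *file, int line)
{
	suspect_hits++;
	fprintf(stderr, "WARNING: at %s:%d (suspicious copy length 0x%x)\n",
		file, line, SUSPECT_LEN);
}

/* Copy n bytes, reporting the caller first when n equals the sentinel. */
static void *traced_memcpy(void *to, const void *from, size_t n,
			   const char *file, int line)
{
	assert(to != NULL && from != NULL);

	if (n == SUSPECT_LEN)
		warn_at(file, line);

	return memcpy(to, from, n);
}

/* Capture the caller's file and line, since userspace has no stack dump. */
#define TRACED_MEMCPY(to, from, n) \
	traced_memcpy((to), (from), (n), __FILE__, __LINE__)
```

Unlike the kernel patch, which reports the header's own file and line and relies on the stack trace to name the caller, this sketch passes the caller's location through a macro. A call such as TRACED_MEMCPY(dst, src, len) behaves like memcpy() and additionally prints a warning whenever len is 0x6b.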