Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753172Ab0HPIXE (ORCPT ); Mon, 16 Aug 2010 04:23:04 -0400
Received: from bombadil.infradead.org ([18.85.46.34]:40000 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753137Ab0HPIXD convert rfc822-to-8bit (ORCPT );
	Mon, 16 Aug 2010 04:23:03 -0400
Subject: Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and
	touch_softlockup_watchdog
From: Peter Zijlstra
To: Sergey Senozhatsky
Cc: Andrew Morton, Ingo Molnar, linux-kernel@vger.kernel.org, Don Zickus
In-Reply-To: <20100813102158.GA5434@swordfish.minsk.epam.com>
References: <20100813102158.GA5434@swordfish.minsk.epam.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8BIT
Date: Mon, 16 Aug 2010 10:22:50 +0200
Message-ID: <1281946970.1926.998.camel@laptop>
Mime-Version: 1.0
X-Mailer: Evolution 2.28.3
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2269
Lines: 64

On Fri, 2010-08-13 at 13:21 +0300, Sergey Senozhatsky wrote:

> [ 67.703556] BUG: using smp_processor_id() in preemptible [00000000] code: s2disk/5139
> [ 67.703563] caller is touch_nmi_watchdog+0x15/0x2c
> [ 67.703566] Pid: 5139, comm: s2disk Not tainted 2.6.36-rc0-git12-07921-g60bf26a-dirty #116
> [ 67.703568] Call Trace:
> [ 67.703575] [] debug_smp_processor_id+0xc9/0xe4
> [ 67.703578] [] touch_nmi_watchdog+0x15/0x2c
> [ 67.703584] [] acpi_os_stall+0x34/0x40
> [ 67.703589] [] acpi_ex_system_do_stall+0x34/0x38

Which could mean two things, either ACPI got funny on us, or Don's new
watchdog stuff has a hole in it.

> ---
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 613bc1f..8822f1e 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void)
>  static void __touch_watchdog(void)
>  {
>  	int this_cpu = smp_processor_id();
> -
> -	__get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu);
> +	per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu);
>  }

That change seems sensible enough..

>  void touch_softlockup_watchdog(void)
>  {
> -	__get_cpu_var(watchdog_touch_ts) = 0;
> +	int this_cpu = get_cpu();
> +	per_cpu(watchdog_touch_ts, this_cpu) = 0;
> +	put_cpu();
>  }
>  EXPORT_SYMBOL(touch_softlockup_watchdog);
>
> @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void)
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>  void touch_nmi_watchdog(void)
>  {
> -	__get_cpu_var(watchdog_nmi_touch) = true;
> +	int this_cpu = get_cpu();
> +	per_cpu(watchdog_nmi_touch, this_cpu) = true;
> +	put_cpu();
>  	touch_softlockup_watchdog();
>  }
>  EXPORT_SYMBOL(touch_nmi_watchdog);

These other two really are about assumptions we make on the call sites,
which at the very least are violated by ACPI.

Don/Ingo, remember if we require touch_*_watchdog callers to have
preemption disabled? Or is the proposed patch sensible?
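
For illustration only, here is a rough userspace sketch of the two
options in that question. It is not kernel code: preempt_count,
current_cpu, debug_warnings and the *_unsafe/*_safe function names are
hypothetical stand-ins, and smp_processor_id() here only mimics the
check that debug_smp_processor_id() performs. The _unsafe variant
assumes the caller already disabled preemption; the _safe variant
brackets the per-CPU access with get_cpu()/put_cpu(), as the quoted
patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

/* Hypothetical model state; none of this exists under these names in the kernel. */
static int preempt_count;                 /* > 0 means preemption is disabled */
static int current_cpu;                   /* CPU the simulated task is running on */
static int debug_warnings;                /* how often the debug check fired */
static bool watchdog_nmi_touch[NR_CPUS];  /* stand-in for the per-CPU variable */

static void preempt_disable(void) { preempt_count++; }

static void preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Mimics the debug check: complain when called from preemptible context. */
static int smp_processor_id(void)
{
	if (preempt_count == 0) {
		debug_warnings++;
		fprintf(stderr, "BUG: using smp_processor_id() in preemptible code\n");
	}
	return current_cpu;
}

/* Pin the task to its CPU for the duration of the access. */
static int get_cpu(void)
{
	preempt_disable();
	return smp_processor_id();
}

static void put_cpu(void) { preempt_enable(); }

/* Option 1: require callers to have preemption disabled already. */
static void touch_nmi_watchdog_unsafe(void)
{
	watchdog_nmi_touch[smp_processor_id()] = true;
}

/* Option 2: make the function safe to call from preemptible context. */
static void touch_nmi_watchdog_safe(void)
{
	int this_cpu = get_cpu();

	watchdog_nmi_touch[this_cpu] = true;
	put_cpu();
}
```

In this model, calling touch_nmi_watchdog_unsafe() with preempt_count
at zero trips the check, much like the acpi_os_stall() path in the
trace, while touch_nmi_watchdog_safe() does not and leaves
preempt_count balanced.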