Date: Wed, 13 Aug 2008 18:47:28 +0200
From: Ingo Molnar
To: Mark Langsdorf
Cc: linux-kernel@vger.kernel.org, Linus Torvalds, "H. Peter Anvin", Thomas Gleixner
Subject: Re: invalidate caches before going into suspend
Message-ID: <20080813164728.GD5720@elte.hu>
In-Reply-To: <200808131141.18003.mark.langsdorf@amd.com>

* Mark Langsdorf wrote:

> When a CPU core is shut down, all of its caches need to be flushed to
> prevent stale data from causing errors if the core is resumed. Current
> Linux suspend code performs an assignment after the flush, which can
> add dirty data back to the cache. On some AMD platforms, additional
> speculative reads have caused crashes on resume because of this dirty
> data.
>
> Relocate the cache flush to be the very last thing done before
> halting.

nice catch! Applied to x86/urgent.

I'm really curious: how did you find this bug? Did you see a CPU come up
as !CPU_DEAD?

> Signed-off-by: Mark Langsdorf
> Acked-by: Mark Borden
> Acked-by: Michael Hohmuth
>
> diff -r f3f819497a68 arch/x86/kernel/process_64.c
> --- a/arch/x86/kernel/process_64.c	Thu Aug 07 04:24:53 2008 -0500
> +++ b/arch/x86/kernel/process_64.c	Tue Aug 12 07:11:36 2008 -0500
> @@ -93,11 +93,11 @@ static inline void play_dead(void)
>  static inline void play_dead(void)
>  {
>  	idle_task_exit();
> -	wbinvd();
>  	mb();
>  	/* Ack it */
>  	__get_cpu_var(cpu_state) = CPU_DEAD;
>  
> +	wbinvd();
>  	local_irq_disable();
>  	while (1)
>  		halt();

please send a patch for the 32-bit side too, it has the same bug.

also, we might be safer if the wbinvd(), the CLI and the halt were in a
single assembly sequence:

	if (cpu >= i486)
		asm ("cli; wbinvd; cli; 1: hlt; jmp 1b")
	else
		asm ("cli; 1: hlt; jmp 1b")

to make sure the compiler doesn't ever insert something into this
codepath?

[ And note the double cli which would be further robustification - in
  theory we could get a spurious interrupt straight after the wbinvd. ]

Hm?

	Ingo
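
The ordering problem the patch fixes can be illustrated with a small
user-space toy model. This is only a sketch under loud assumptions: the
"cache" is an array of dirty bits, toy_wbinvd() merely clears them, and
every toy_* name is hypothetical and not part of the kernel. It does not
execute wbinvd, cli or hlt, which are privileged; it only shows why a
store issued after the flush leaves a dirty line behind, while a flush
issued after the store does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a "cache" is a handful of lines, each with a dirty bit. */
#define TOY_CACHE_LINES 4

struct toy_cache {
	bool dirty[TOY_CACHE_LINES];
};

/* Models a store: the line holding the written variable becomes dirty. */
static void toy_store(struct toy_cache *c, size_t line)
{
	c->dirty[line % TOY_CACHE_LINES] = true;
}

/* Models wbinvd: write back and invalidate, so no dirty line remains. */
static void toy_wbinvd(struct toy_cache *c)
{
	for (size_t i = 0; i < TOY_CACHE_LINES; i++)
		c->dirty[i] = false;
}

/* Counts the lines that are still dirty. */
static size_t toy_dirty_count(const struct toy_cache *c)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_CACHE_LINES; i++)
		n += c->dirty[i] ? 1 : 0;
	return n;
}

/* Old ordering: flush first, then the cpu_state assignment. */
static size_t toy_play_dead_old(void)
{
	struct toy_cache c = { { false } };

	toy_store(&c, 1);	/* earlier work dirties some line */
	toy_wbinvd(&c);		/* flush happens too early */
	toy_store(&c, 0);	/* stands in for cpu_state = CPU_DEAD */
	return toy_dirty_count(&c);
}

/* Patched ordering: the flush is the last thing before halting. */
static size_t toy_play_dead_new(void)
{
	struct toy_cache c = { { false } };

	toy_store(&c, 1);	/* earlier work dirties some line */
	toy_store(&c, 0);	/* stands in for cpu_state = CPU_DEAD */
	toy_wbinvd(&c);		/* flush after the final store */
	return toy_dirty_count(&c);
}

/* Self-check of both orderings; call it from main() or a test harness. */
static void toy_selfcheck(void)
{
	assert(toy_play_dead_old() == 1);	/* one dirty line survives */
	assert(toy_play_dead_new() == 0);	/* cache is clean at halt */
}
```

In this model the old ordering leaves exactly one dirty line (the one
written by the stand-in for the cpu_state assignment) and the patched
ordering leaves none. The real hazard also depends on speculative reads
and on what the compiler emits between the flush and the halt, which the
model deliberately ignores.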