2008-08-13 17:23:51

by Langsdorf, Mark

[permalink] [raw]
Subject: [PATCH][retry 1] 2.6.27-rc2: invalidate caches before going into suspend

When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache. On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.


Signed-off-by: Mark Langsdorf <[email protected]>
Acked-by: Mark Borden <[email protected]>
Acked-by: Michael Hohmuth <[email protected]>

diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Wed Aug 13 07:03:35 2008 -0500
@@ -95,7 +95,6 @@ static inline void play_dead(void)
{
/* This must be done before dead CPU ack */
cpu_exit_clear();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;
@@ -104,6 +103,7 @@ static inline void play_dead(void)
* With physical CPU hotplug, we should halt the cpu
*/
local_irq_disable();
+ wbinvd();
while (1)
halt();
}
diff -r 1e74a821dd00 arch/x86/kernel/process_64.c
--- a/arch/x86/kernel/process_64.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_64.c Wed Aug 13 07:03:54 2008 -0500
@@ -93,12 +93,12 @@ static inline void play_dead(void)
static inline void play_dead(void)
{
idle_task_exit();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

local_irq_disable();
+ wbinvd();
while (1)
halt();
}


2008-08-13 17:31:39

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH][retry 1] 2.6.27-rc2: invalidate caches before going into suspend


(please keep Cc:s intact)

* Mark Langsdorf <[email protected]> wrote:

> @@ -104,6 +103,7 @@ static inline void play_dead(void)
> * With physical CPU hotplug, we should halt the cpu
> */
> local_irq_disable();
> + wbinvd();
> while (1)
> halt();

hm, why not do what i suggested in my first mail:

if (cpu >= i486)
asm("cli; wbinvd; cli; 1: hlt; jmp 1b")
else
halt();

perhaps turn it into a wbivd_halt() primitive, to make it clean and even
more obvious.

This sequence does matter to reliable suspend/resume, and in theory gcc
could insert something before a halt() as well.

[ i only have a pretty far-fetched example that in all likelyhood wont
happen in practice: for example halt could be decided to be uninlined
by a braindead compiler, ftrace could hook in there, and dirty some
state. But still - the point is that we had a difficult bug in this
code for a long time and in such situations we should just over-do
robustness by default. ]

Ingo