2008-08-13 16:36:18

by Langsdorf, Mark

[permalink] [raw]
Subject: invalidate caches before going into suspend

When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache. On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.


Signed-off-by: Mark Langsdorf <[email protected]>
Acked-by: Mark Borden <[email protected]>
Acked-by: Michael Hohmuth <[email protected]>

diff -r f3f819497a68 arch/x86/kernel/process_64.c
--- a/arch/x86/kernel/process_64.c Thu Aug 07 04:24:53 2008 -0500
+++ b/arch/x86/kernel/process_64.c Tue Aug 12 07:11:36 2008 -0500
@@ -93,11 +93,11 @@ static inline void play_dead(void)
static inline void play_dead(void)
{
idle_task_exit();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

+ wbinvd();
local_irq_disable();
while (1)
halt();


2008-08-13 16:47:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend


* Mark Langsdorf <[email protected]> wrote:

> When a CPU core is shut down, all of its caches need to be flushed to
> prevent stale data from causing errors if the core is resumed. Current
> Linux suspend code performs an assignment after the flush, which can
> add dirty data back to the cache. On some AMD platforms, additional
> speculative reads have caused crashes on resume because of this dirty
> data.
>
> Relocate the cache flush to be the very last thing done before
> halting.

nice catch! Applied to x86/urgent.

I'm really curious: how did you find this bug? Did you see a CPU come up
as !CPU_DEAD?

> Signed-off-by: Mark Langsdorf <[email protected]>
> Acked-by: Mark Borden <[email protected]>
> Acked-by: Michael Hohmuth <[email protected]>
>
> diff -r f3f819497a68 arch/x86/kernel/process_64.c
> --- a/arch/x86/kernel/process_64.c Thu Aug 07 04:24:53 2008 -0500
> +++ b/arch/x86/kernel/process_64.c Tue Aug 12 07:11:36 2008 -0500
> @@ -93,11 +93,11 @@ static inline void play_dead(void)
> static inline void play_dead(void)
> {
> idle_task_exit();
> - wbinvd();
> mb();
> /* Ack it */
> __get_cpu_var(cpu_state) = CPU_DEAD;
>
> + wbinvd();
> local_irq_disable();
> while (1)
> halt();

please send a patch for the 32-bit side too, it has the same bug.

also, we might be safer if the wbinvd(), the CLI and the halt was in a
single assembly sequence:

if (cpu >= i486)
asm ("cli; wbinvd; cli; 1: hlt; jmp 1b")
else
asm ("cli; 1: hlt; jmp 1b")

to make sure the compiler doesnt ever insert something into this
codepath? [ And note the double cli which would be further
robustification - in theory we could get a spurious interrupt straight
after the wbinvd. ] Hm?

Ingo

2008-08-13 16:54:23

by H. Peter Anvin

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend

Ingo Molnar wrote:
>
> also, we might be safer if the wbinvd(), the CLI and the halt was in a
> single assembly sequence:
>
> if (cpu >= i486)
> asm ("cli; wbinvd; cli; 1: hlt; jmp 1b")
> else
> asm ("cli; 1: hlt; jmp 1b")
>
> to make sure the compiler doesnt ever insert something into this
> codepath? [ And note the double cli which would be further
> robustification - in theory we could get a spurious interrupt straight
> after the wbinvd. ] Hm?
>

Spurious interrupt of what kind? The only things that could come in
would not be non-INT type interrupts, and those aren't affected by CLI.

-hpa

2008-08-13 17:01:56

by Ingo Molnar

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend


* H. Peter Anvin <[email protected]> wrote:

> Ingo Molnar wrote:
>>
>> also, we might be safer if the wbinvd(), the CLI and the halt was in a
>> single assembly sequence:
>>
>> if (cpu >= i486)
>> asm ("cli; wbinvd; cli; 1: hlt; jmp 1b")
>> else
>> asm ("cli; 1: hlt; jmp 1b")
>>
>> to make sure the compiler doesnt ever insert something into this
>> codepath? [ And note the double cli which would be further
>> robustification - in theory we could get a spurious interrupt straight
>> after the wbinvd. ] Hm?
>>
>
> Spurious interrupt of what kind? The only things that could come in
> would not be non-INT type interrupts, and those aren't affected by
> CLI.

nothing should come in really at that point - but say IRQ#7 on older
platforms used to trigger at various points in time, even unprompted. Or
an APIC error interrupt in the last moment? All device irqs should
indeed be turned off at this stage, but since it costs us nothing to add
another cli, and because the failure mode is subtle memory corruption,
does it hurt to have it?

Ingo

2008-08-13 17:05:08

by Langsdorf, Mark

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend

On Wednesday 13 August 2008, Ingo Molnar wrote:
>
> * Mark Langsdorf <[email protected]> wrote:
>
> > When a CPU core is shut down, all of its caches need to be flushed to
> > prevent stale data from causing errors if the core is resumed. Current
> > Linux suspend code performs an assignment after the flush, which can
> > add dirty data back to the cache. On some AMD platforms, additional
> > speculative reads have caused crashes on resume because of this dirty
> > data.
> >
> > Relocate the cache flush to be the very last thing done before
> > halting.
>
> nice catch! Applied to x86/urgent.
>
> I'm really curious: how did you find this bug? Did you see a CPU come up
> as !CPU_DEAD?

AMD's diagnostic code for new CPUs was hanging when coming out of suspend,
so I presume it was hitting a bug check for not !CPU_DEAD. I got the
debug lab reports second hand. They traced the root cause to dirty data
being preserved in the cache and suggested relocating the wbinvd().

> please send a patch for the 32-bit side too, it has the same bug.
>
> also, we might be safer if the wbinvd(), the CLI and the halt was in a
> single assembly sequence:

> to make sure the compiler doesnt ever insert something into this
> codepath? [ And note the double cli which would be further
> robustification - in theory we could get a spurious interrupt straight
> after the wbinvd. ] Hm?

I don't think it's necessary. I can submit a delta patch later if you
think it's really necessary.


Signed-off-by: Mark Langsdorf <[email protected]>

diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Wed Aug 13 06:40:00 2008 -0500
@@ -95,11 +95,11 @@ static inline void play_dead(void)
{
/* This must be done before dead CPU ack */
cpu_exit_clear();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

+ wbinvd();
/*
* With physical CPU hotplug, we should halt the cpu
*/



2008-08-13 17:18:24

by Linus Torvalds

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend



On Wed, 13 Aug 2008, Mark Langsdorf wrote:
>
> I don't think it's necessary. I can submit a delta patch later if you
> think it's really necessary.

Why not at least move it to after the local_irq_disable()?

That local_irq_disable() will do tons of writes if you have
lockdep enabled, it calls trace_hardirqs_off() etc. Maybe they don't end
up ever mattering, but wouldn't it make much more sense to just move the
wbinvd down to just before the

while (1)
halt();

which is also likely to make sure that the compiler won't do anything at
all because everything is dead at that point with no function calls etc
happening.

Linus

2008-08-13 17:25:08

by Langsdorf, Mark

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend

On Wednesday 13 August 2008, Linus Torvalds wrote:
>
> On Wed, 13 Aug 2008, Mark Langsdorf wrote:
> >
> > I don't think it's necessary. I can submit a delta patch later if you
> > think it's really necessary.
>
> Why not at least move it to after the local_irq_disable()?
>
> That local_irq_disable() will do tons of writes if you have
> lockdep enabled, it calls trace_hardirqs_off() etc. Maybe they don't end
> up ever mattering, but wouldn't it make much more sense to just move the
> wbinvd down to just before the
>
> while (1)
> halt();
>
> which is also likely to make sure that the compiler won't do anything at
> all because everything is dead at that point with no function calls etc
> happening.

I don't think we realized that local_irq_disable() did all that and
so we only tested the submitted patch. I've resubmitted the unified
patch after applying your suggestion. Thanks.

-Mark Langsdorf
Operating System Research Center
AMD

2008-08-13 17:29:46

by H. Peter Anvin

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend

Ingo Molnar wrote:
>>>
>> Spurious interrupt of what kind? The only things that could come in
>> would not be non-INT type interrupts, and those aren't affected by
>> CLI.
>
> nothing should come in really at that point - but say IRQ#7 on older
> platforms used to trigger at various points in time, even unprompted. Or
> an APIC error interrupt in the last moment? All device irqs should
> indeed be turned off at this stage, but since it costs us nothing to add
> another cli, and because the failure mode is subtle memory corruption,
> does it hurt to have it?
>

Not significantly, but I cannot for my life figure out how it could help.

Either the interrupts will be blocked by the CLI already in effect, or
the additional CLI will not help, either (in fact, it will just slightly
increase the window for something like that to slip in.)

-hpa

2008-08-13 17:35:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend


* H. Peter Anvin <[email protected]> wrote:

> Ingo Molnar wrote:
>>>>
>>> Spurious interrupt of what kind? The only things that could come in
>>> would not be non-INT type interrupts, and those aren't affected by
>>> CLI.
>>
>> nothing should come in really at that point - but say IRQ#7 on older
>> platforms used to trigger at various points in time, even unprompted.
>> Or an APIC error interrupt in the last moment? All device irqs should
>> indeed be turned off at this stage, but since it costs us nothing to
>> add another cli, and because the failure mode is subtle memory
>> corruption, does it hurt to have it?
>>
>
> Not significantly, but I cannot for my life figure out how it could
> help.
>
> Either the interrupts will be blocked by the CLI already in effect, or
> the additional CLI will not help, either (in fact, it will just
> slightly increase the window for something like that to slip in.)

ah, the main point i tried to make was to have the CLI _before_ the
WBINVD - which Mark's patch didnt do.

Note the original sequence:

wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

/*
* With physical CPU hotplug, we should halt the cpu
*/
local_irq_disable();

and Mark's patched sequence:

wbinvd();
local_irq_disable();
while (1)
halt();

both had wbinvd before the cli.

in my suggestion the second cli doesnt matter indeed - a single one
suffices.

Ingo

2008-08-13 17:38:28

by H. Peter Anvin

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend

Ingo Molnar wrote:
>>>
>> Not significantly, but I cannot for my life figure out how it could
>> help.
>>
>> Either the interrupts will be blocked by the CLI already in effect, or
>> the additional CLI will not help, either (in fact, it will just
>> slightly increase the window for something like that to slip in.)
>
> ah, the main point i tried to make was to have the CLI _before_ the
> WBINVD - which Mark's patch didnt do.
>

Ah yes, that's important.

-hpa

2008-08-13 17:38:42

by Ingo Molnar

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend


* Mark Langsdorf <[email protected]> wrote:

> > to make sure the compiler doesnt ever insert something into this
> > codepath? [ And note the double cli which would be further
> > robustification - in theory we could get a spurious interrupt
> > straight after the wbinvd. ] Hm?
>
> I don't think it's necessary. I can submit a delta patch later if you
> think it's really necessary.

this sequence:

> diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
> --- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
> +++ b/arch/x86/kernel/process_32.c Wed Aug 13 06:40:00 2008 -0500
> @@ -95,11 +95,11 @@ static inline void play_dead(void)
> {
> /* This must be done before dead CPU ack */
> cpu_exit_clear();
> - wbinvd();
> mb();
> /* Ack it */
> __get_cpu_var(cpu_state) = CPU_DEAD;
>
> + wbinvd();
> /*
> * With physical CPU hotplug, we should halt the cpu
> */

is still unsafe somewhat, as the cli only comes afterwards:

local_irq_disable();
while (1)
halt();

so there's a small race open for some stray irq to come before the CLI.

Also, any of the primitives could be instrumented in theory and more
code could be inserted - it's really the best approach to unify them
into a single primitive, like the wbinvd_halt() API i suggested in the
previous mail.

And it deserves a few comments as well.

Ingo

2008-08-13 17:53:04

by Ingo Molnar

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend


* Mark Langsdorf <[email protected]> wrote:

> On Wednesday 13 August 2008, Linus Torvalds wrote:
> >
> > On Wed, 13 Aug 2008, Mark Langsdorf wrote:
> > >
> > > I don't think it's necessary. I can submit a delta patch later if you
> > > think it's really necessary.
> >
> > Why not at least move it to after the local_irq_disable()?
> >
> > That local_irq_disable() will do tons of writes if you have
> > lockdep enabled, it calls trace_hardirqs_off() etc. Maybe they don't end
> > up ever mattering, but wouldn't it make much more sense to just move the
> > wbinvd down to just before the
> >
> > while (1)
> > halt();
> >
> > which is also likely to make sure that the compiler won't do anything at
> > all because everything is dead at that point with no function calls etc
> > happening.
>
> I don't think we realized that local_irq_disable() did all that and so
> we only tested the submitted patch. I've resubmitted the unified
> patch after applying your suggestion. Thanks.

Thanks, this looks much better. Please create a wbinvd_halt() variant
a'la safe_halt() and please also add comments why that is needed.

That way no instrumentation or other detail can ever get into that code
sequence. (full-blown debug labs with hw tracing facilities are really
an exception :-)

Note that the original bug was introduced with the original hotplug CPU
support, more than 3 years ago, via commit 76e4f660d9 - and the CLI was
inserted in the wrong place via commit 1fa744e6e, 2.5 years ago.

That gives a ballpark figure about the time it takes for CPU cache
corruption bugs to be found. So if we find a bug that matches such
patterns we absolutely want to overdo every single related detail there
- we really dont want to wait another 3 years if another bug creeps in
there sometime in the future.

( Even if you redo this patch it's still all v2.6.27-worthy material in
my opinion, obviously. It's a cool fix and it must have been
absolutely hard to debug it. )

Ingo

2008-08-13 18:28:19

by Langsdorf, Mark

[permalink] [raw]
Subject: [PATCH](retry 2) Re: invalidate caches before going into suspend

On Wednesday 13 August 2008, Ingo Molnar wrote:

> Thanks, this looks much better. Please create a wbinvd_halt() variant
> a'la safe_halt() and please also add comments why that is needed.


When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache. ?On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting. Tie into an assembly line so the compile will not
reorder it. Add some documentation explaining what is going
on and why we're doing this.

Signed-off-by: Mark Langsdorf <[email protected]>
Acked-by: Mark Borden <[email protected]>
Acked-by: Michael Hohmuth <[email protected]>
diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Wed Aug 13 08:05:36 2008 -0500
@@ -95,7 +95,6 @@ static inline void play_dead(void)
{
/* This must be done before dead CPU ack */
cpu_exit_clear();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;
@@ -104,8 +103,12 @@ static inline void play_dead(void)
* With physical CPU hotplug, we should halt the cpu
*/
local_irq_disable();
- while (1)
- halt();
+ /* mask all interrupts, flush any and all caches, and halt */
+ if (cpu_has_clflush)
+ wbinvd_halt();
+ else
+ while (1)
+ halt();
}
#else
static inline void play_dead(void)
diff -r 1e74a821dd00 arch/x86/kernel/process_64.c
--- a/arch/x86/kernel/process_64.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_64.c Wed Aug 13 08:05:41 2008 -0500
@@ -93,14 +93,17 @@ static inline void play_dead(void)
static inline void play_dead(void)
{
idle_task_exit();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

local_irq_disable();
- while (1)
- halt();
+ /* mask all interrupts, flush any and all caches, and halt */
+ if (cpu_has_clflush)
+ wbinvd_halt();
+ else
+ while (1)
+ halt();
}
#else
static inline void play_dead(void)
diff -r 1e74a821dd00 include/asm-x86/irqflags.h
--- a/include/asm-x86/irqflags.h Tue Aug 12 12:04:12 2008 -0500
+++ b/include/asm-x86/irqflags.h Wed Aug 13 07:52:10 2008 -0500
@@ -47,6 +47,16 @@ static inline void native_halt(void)
static inline void native_halt(void)
{
asm volatile("hlt": : :"memory");
+}
+
+/*
+ * flush all caches and go into a halt
+ * used to guarantee that caches are clean before halting
+ * during suspend/resume operations
+ */
+static inline void wbinvd_halt(void)
+{
+ asm volatile("cli; wbinvd; 1: hlt; jmp 1b": : :"memory");
}

#endif

2008-08-13 18:43:03

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH](retry 2) Re: invalidate caches before going into suspend

On Wed, 13 Aug 2008 13:33:17 -0500
Mark Langsdorf <[email protected]> wrote:
> + /* mask all interrupts, flush any and all caches, and halt */
> + if (cpu_has_clflush)
> + wbinvd_halt();
> + else
> + while (1)
> + halt();
> }

I like the asm version .. but this code makes me blink.
when you HAVE clflush you do wbinvd.
yeah I know it's correct but... it reads wonky ;)
also.. can we move that check into the wbinvd_halt() itself, so that
the callers don't need to care what is used to select it ?

--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-08-13 18:47:23

by Langsdorf, Mark

[permalink] [raw]
Subject: RE: [PATCH](retry 2) Re: invalidate caches before going into suspend

> Mark Langsdorf <[email protected]> wrote:
> > + /* mask all interrupts, flush any and all caches, and halt */
> > + if (cpu_has_clflush)
> > + wbinvd_halt();
> > + else
> > + while (1)
> > + halt();
> > }
>
> I like the asm version .. but this code makes me blink.
> when you HAVE clflush you do wbinvd.
> yeah I know it's correct but... it reads wonky ;)

It's the most convenient symbol to test against. I
suppose I should add a comment explaining it.

> also.. can we move that check into the wbinvd_halt() itself, so that
> the callers don't need to care what is used to select it ?

irqflags.h doesn't guarantee the presence of boot_cpu_data,
so I wasn't able to make that check cleanly. I could move
the macro definition to process_??.c if that would better.

-Mark Langsdorf
Operating System Research Center
AMD

2008-08-13 19:39:40

by Andi Kleen

[permalink] [raw]
Subject: Re: invalidate caches before going into suspend

Mark Langsdorf <[email protected]> writes:
> @@ -93,11 +93,11 @@ static inline void play_dead(void)
> static inline void play_dead(void)
> {
> idle_task_exit();
> - wbinvd();
> mb();
> /* Ack it */
> __get_cpu_var(cpu_state) = CPU_DEAD;
>
> + wbinvd();

You should put the mb() (or just a barrier()) directly above the wbinvd(),
otherwise the compiler is free to reorder this behind your back again
and move the store down after the wbinvd.

-Andi

2008-08-13 21:32:21

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH](retry 2) Re: invalidate caches before going into suspend

On Wed, 13 Aug 2008 13:47:05 -0500
"Langsdorf, Mark" <[email protected]> wrote:

> > Mark Langsdorf <[email protected]> wrote:
> > > + /* mask all interrupts, flush any and all caches, and
> > > halt */
> > > + if (cpu_has_clflush)
> > > + wbinvd_halt();
> > > + else
> > > + while (1)
> > > + halt();
> > > }
> >
> > I like the asm version .. but this code makes me blink.
> > when you HAVE clflush you do wbinvd.
> > yeah I know it's correct but... it reads wonky ;)
>
> It's the most convenient symbol to test against. I
> suppose I should add a comment explaining it.
>
> > also.. can we move that check into the wbinvd_halt() itself, so that
> > the callers don't need to care what is used to select it ?
>
> irqflags.h doesn't guarantee the presence of boot_cpu_data,
> so I wasn't able to make that check cleanly. I could move
> the macro definition to process_??.c if that would better.
>

it makes sense; it's hardly a hotpath, and making it simple to use
correctly is generally best ;)

2008-08-14 13:39:55

by Langsdorf, Mark

[permalink] [raw]
Subject: Re: [PATCH](retry 3) invalidate caches before going into suspend

When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache. ?On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting. ?Tie into an assembly line so the compile will not
reorder it. ?Add some documentation explaining what is going
on and why we're doing this.

Signed-off-by: Mark Langsdorf <[email protected]>
Acked-by: Mark Borden <[email protected]>
Acked-by: Michael Hohmuth <[email protected]>

diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Thu Aug 14 03:04:52 2008 -0500
@@ -65,6 +65,28 @@ EXPORT_PER_CPU_SYMBOL(cpu_number);
EXPORT_PER_CPU_SYMBOL(cpu_number);

/*
+ * on systems with caches, caches must be flashed as the absolute
+ * last instruction before going into a suspended halt. Otherwise,
+ * dirty data can linger in the cache and become stale on resume,
+ * leading to strange errors.
+ *
+ * perform a variety of operations to guarantee that the compiler
+ * will not reorder instructions. wbinvd itself is serializing
+ * so the processor will not reorder.
+ *
+ * Systems without cache can just go into halt.
+ */
+static void wbinvd_halt(void) {
+ mb();
+ /* check for clflush to determine if wbinvd is legal */
+ if (cpu_has_clflush)
+ asm volatile("cli; wbinvd; 1: hlt; jmp 1b": : :"memory");
+ else
+ while (1)
+ halt();
+}
+
+/*
* Return saved PC of a blocked thread.
*/
unsigned long thread_saved_pc(struct task_struct *tsk)
@@ -95,7 +117,6 @@ static inline void play_dead(void)
{
/* This must be done before dead CPU ack */
cpu_exit_clear();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;
@@ -104,8 +125,8 @@ static inline void play_dead(void)
* With physical CPU hotplug, we should halt the cpu
*/
local_irq_disable();
- while (1)
- halt();
+ /* mask all interrupts, flush any and all caches, and halt */
+ wbinvd_halt();
}
#else
static inline void play_dead(void)
diff -r 1e74a821dd00 arch/x86/kernel/process_64.c
--- a/arch/x86/kernel/process_64.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_64.c Thu Aug 14 03:04:42 2008 -0500
@@ -58,6 +58,28 @@ unsigned long kernel_thread_flags = CLON

static ATOMIC_NOTIFIER_HEAD(idle_notifier);

+/*
+ * on systems with caches, caches must be flashed as the absolute
+ * last instruction before going into a suspended halt. Otherwise,
+ * dirty data can linger in the cache and become stale on resume,
+ * leading to strange errors.
+ *
+ * perform a variety of operations to guarantee that the compiler
+ * will not reorder instructions. wbinvd itself is serializing
+ * so the processor will not reorder.
+ *
+ * Systems without cache can just go into halt.
+ */
+static void wbinvd_halt(void) {
+ mb();
+ /* check for clflush to determine if wbinvd is legal */
+ if (cpu_has_clflush)
+ asm volatile("cli; wbinvd; 1: hlt; jmp 1b": : :"memory");
+ else
+ while (1)
+ halt();
+}
+
void idle_notifier_register(struct notifier_block *n)
{
atomic_notifier_chain_register(&idle_notifier, n);
@@ -93,14 +115,13 @@ static inline void play_dead(void)
static inline void play_dead(void)
{
idle_task_exit();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

local_irq_disable();
- while (1)
- halt();
+ /* mask all interrupts, flush any and all caches, and halt */
+ wbinvd_halt();
}
#else
static inline void play_dead(void)

2008-08-14 13:50:27

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH](retry 3) invalidate caches before going into suspend


* Mark Langsdorf <[email protected]> wrote:

> + * Systems without cache can just go into halt.
> + */
> +static void wbinvd_halt(void) {

please put this into a unified include file, and as an inline - there's
just a single callsite. That avoids duplication and also removes a
function.

Also, please run the patch through scripts/checkpatch, it would have
suggested to change this:

> +static void wbinvd_halt(void) {

to:

> +static void wbinvd_halt(void)
> +{

Thanks,

Ingo

2008-08-14 14:01:00

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH](retry 3) invalidate caches before going into suspend

On Thu, 14 Aug 2008 08:45:04 -0500
Mark Langsdorf <[email protected]> wrote:

> When a CPU core is shut down, all of its caches need to be flushed
> to prevent stale data from causing errors if the core is resumed.
> Current Linux suspend code performs an assignment after the flush,
> which can add dirty data back to the cache.  On some AMD platforms,
> additional speculative reads have caused crashes on resume because
> of this dirty data.
>
> Relocate the cache flush to be the very last thing done before
> halting.  Tie into an assembly line so the compile will not
> reorder it.  Add some documentation explaining what is going
> on and why we're doing this.


looks good to me

Acked-by: Arjan van de Ven <[email protected]>


--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-08-14 14:06:20

by Langsdorf, Mark

[permalink] [raw]
Subject: Re: [PATCH](retry 4) invalidate caches before going into suspend

When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache. ?On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting. ?Tie into an assembly line so the compile will not
reorder it. ?Add some documentation explaining what is going
on and why we're doing this.

Signed-off-by: Mark Langsdorf <[email protected]>
Acked-by: Mark Borden <[email protected]>
Acked-by: Michael Hohmuth <[email protected]>

diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Thu Aug 14 03:35:14 2008 -0500
@@ -95,7 +95,6 @@ static inline void play_dead(void)
{
/* This must be done before dead CPU ack */
cpu_exit_clear();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;
@@ -104,8 +103,8 @@ static inline void play_dead(void)
* With physical CPU hotplug, we should halt the cpu
*/
local_irq_disable();
- while (1)
- halt();
+ /* mask all interrupts, flush any and all caches, and halt */
+ wbinvd_halt();
}
#else
static inline void play_dead(void)
diff -r 1e74a821dd00 arch/x86/kernel/process_64.c
--- a/arch/x86/kernel/process_64.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_64.c Thu Aug 14 03:36:13 2008 -0500
@@ -93,14 +93,13 @@ static inline void play_dead(void)
static inline void play_dead(void)
{
idle_task_exit();
- wbinvd();
mb();
/* Ack it */
__get_cpu_var(cpu_state) = CPU_DEAD;

local_irq_disable();
- while (1)
- halt();
+ /* mask all interrupts, flush any and all caches, and halt */
+ wbinvd_halt();
}
#else
static inline void play_dead(void)
diff -r 1e74a821dd00 include/asm-x86/processor.h
--- a/include/asm-x86/processor.h Tue Aug 12 12:04:12 2008 -0500
+++ b/include/asm-x86/processor.h Thu Aug 14 03:47:01 2008 -0500
@@ -728,6 +728,29 @@ extern unsigned long idle_halt;
extern unsigned long idle_halt;
extern unsigned long idle_nomwait;

+/*
+ * on systems with caches, caches must be flashed as the absolute
+ * last instruction before going into a suspended halt. Otherwise,
+ * dirty data can linger in the cache and become stale on resume,
+ * leading to strange errors.
+ *
+ * perform a variety of operations to guarantee that the compiler
+ * will not reorder instructions. wbinvd itself is serializing
+ * so the processor will not reorder.
+ *
+ * Systems without cache can just go into halt.
+ */
+static inline void wbinvd_halt(void)
+{
+ mb();
+ /* check for clflush to determine if wbinvd is legal */
+ if (cpu_has_clflush)
+ asm volatile("cli; wbinvd; 1: hlt; jmp 1b" : : : "memory");
+ else
+ while (1)
+ halt();
+}
+
extern void enable_sep_cpu(void);
extern int sysenter_setup(void);

2008-08-15 12:05:00

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH](retry 4) invalidate caches before going into suspend


* Mark Langsdorf <[email protected]> wrote:

> When a CPU core is shut down, all of its caches need to be flushed to
> prevent stale data from causing errors if the core is resumed. Current
> Linux suspend code performs an assignment after the flush, which can
> add dirty data back to the cache. ?On some AMD platforms, additional
> speculative reads have caused crashes on resume because of this dirty
> data.
>
> Relocate the cache flush to be the very last thing done before
> halting. ?Tie into an assembly line so the compile will not reorder
> it. ?Add some documentation explaining what is going on and why we're
> doing this.
>
> Signed-off-by: Mark Langsdorf <[email protected]>
> Acked-by: Mark Borden <[email protected]>
> Acked-by: Michael Hohmuth <[email protected]>

applied to tip/x86/urgent - thanks Mark.

Ingo