2011-03-03 16:01:40

by Mathieu Desnoyers

Subject: [PATCH] x86: stop machine text poke should issue sync core (v2)

Intel Architecture Software Developer's Manual section 7.1.3 specifies that a
core serializing instruction such as "cpuid" should be executed on _each_ core
before the new instruction is made visible.

Failure to do so can lead to unspecified behavior (Intel XMC errata include
General Protection Fault in the list), so we should avoid this at all costs.
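
As an illustration of what "core serializing instruction" means here, a minimal
user-space sketch follows. It is not the kernel's sync_core(): the name
sync_core_sketch and its return value are made up for this example, and the
non-x86 branch is only a compiler barrier so that the sketch builds everywhere.

```c
#include <assert.h>

/*
 * Hypothetical user-space stand-in for the kernel's sync_core().
 * On x86 it executes "cpuid", which is a serializing instruction.
 * On other architectures it is only a compiler barrier (an assumption
 * made so the sketch compiles, not a real core serialization).
 * Returns 1 if cpuid was executed, 0 otherwise.
 */
static int sync_core_sketch(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 1, ebx, ecx, edx;

	/* cpuid writes eax/ebx/ecx/edx; "memory" stops compiler reordering. */
	__asm__ __volatile__("cpuid"
			     : "+a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
			     :
			     : "memory");
	(void)ebx;
	(void)ecx;
	(void)edx;
	return 1;
#else
	__asm__ __volatile__("" ::: "memory");
	return 0;
#endif
}
```

The point of the sketch is only that the instruction has to run on the core
that will later fetch the modified bytes; running it on some other core does
not help.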

This problem can affect modified code executed by interrupt handlers after
interrupts are re-enabled at the end of stop_machine, because no core
serializing instruction is executed between the code modification and the moment
interrupts are re-enabled.

Because stop_machine_text_poke performs the text modification from the first CPU
decrementing stop_machine_first, modified code executed in thread context is
also affected by this problem. To explain why, we have to split the CPUs into
two categories: the CPU that initiates the text modification (calls
text_poke_smp) and all the others. The scheduler, executed on all other CPUs
after stop_machine, issues an "iret" core serializing instruction, and therefore
handles core serialization for all these CPUs. However, the text modification
initiator can continue its execution on the same thread and access the modified
text without any scheduler call. Given that the CPU that initiates the code
modification is not guaranteed to be the one actually performing the code
modification, the initiator falls into the XMC errata case.
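
The reasoning above can be sketched as a small single-threaded model. This is
an assumption-laden illustration, not kernel code: struct poke_model,
model_handler() and model_run() are hypothetical names, the CPUs enter the
handler one after another in a caller-chosen order instead of concurrently, and
serialized[] records only whether a CPU serialized inside the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_CPUS 8

/* Hypothetical model state; the names are illustrative, not the kernel's. */
struct poke_model {
	int first;                       /* models stop_machine_first, starts at 1 */
	bool wrote_text;                 /* set once the text has been modified */
	int writer;                      /* CPU that did the modification, -1 if none */
	bool serialized[MODEL_MAX_CPUS]; /* CPUs that serialized inside the handler */
	char text[4];                    /* stand-in for the text being patched */
};

static void model_init(struct poke_model *m)
{
	memset(m, 0, sizeof(*m));
	m->first = 1;
	m->writer = -1;
	memcpy(m->text, "old", 4);
}

/* One CPU running the handler; with_fix models the added sync_core(). */
static void model_handler(struct poke_model *m, int cpu, bool with_fix)
{
	assert(cpu >= 0 && cpu < MODEL_MAX_CPUS);

	if (--m->first == 0) {
		/* First CPU to decrement the counter performs the write. */
		memcpy(m->text, "new", 4);
		m->wrote_text = true;
		m->writer = cpu;
	}
	/*
	 * The other CPUs would spin until wrote_text is set; in this
	 * sequential model the writer has always run before them.
	 */
	if (with_fix)
		m->serialized[cpu] = true;
}

/*
 * Run the handler on n CPUs in arrival order. Returns true if the
 * initiator serialized before leaving the handler.
 */
static bool model_run(struct poke_model *m, const int *order, int n,
		      int initiator, bool with_fix)
{
	model_init(m);
	for (int i = 0; i < n; i++)
		model_handler(m, order[i], with_fix);
	return m->serialized[initiator];
}
```

With arrival order {2, 0, 1} and CPU 0 as the initiator, CPU 2 ends up being
the writer. Without the fix, model_run() reports that the initiator left the
handler unserialized; with the fix, every CPU serializes regardless of which
one wrote the text.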

Q: Isn't this executed from an IPI handler, which will return with IRET (a
serializing instruction) anyway?
A: No. stop_machine now uses per-cpu workqueues, so the handler is executed
from worker threads. There is no iret anymore.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Arjan van de Ven <[email protected]>
CC: Peter Zijlstra <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Andi Kleen <[email protected]>
CC: Frederic Weisbecker <[email protected]>
---
arch/x86/kernel/alternative.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-tip/arch/x86/kernel/alternative.c
===================================================================
--- linux-tip.orig/arch/x86/kernel/alternative.c
+++ linux-tip/arch/x86/kernel/alternative.c
@@ -620,7 +620,12 @@ static int __kprobes stop_machine_text_p
 		flush_icache_range((unsigned long)p->addr,
 				   (unsigned long)p->addr + p->len);
 	}
-
+	/*
+	 * Intel Architecture Software Developer's Manual section 7.1.3 specifies
+	 * that a core serializing instruction such as "cpuid" should be
+	 * executed on _each_ core before the new instruction is made visible.
+	 */
+	sync_core();
 	return 0;
 }


--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


2011-03-15 14:18:23

by Mathieu Desnoyers

Subject: Re: [PATCH] x86: stop machine text poke should issue sync core (v2)

* Mathieu Desnoyers ([email protected]) wrote:
> Intel Architecture Software Developer's Manual section 7.1.3 specifies that a
> core serializing instruction such as "cpuid" should be executed on _each_ core
> before the new instruction is made visible.

Hi,

Is anyone willing to merge this fix into the x86 tree?

Thanks,

Mathieu


--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

Subject: Re: [PATCH] x86: stop machine text poke should issue sync core (v2)

(2011/03/15 23:18), Mathieu Desnoyers wrote:
> * Mathieu Desnoyers ([email protected]) wrote:
>> Intel Architecture Software Developer's Manual section 7.1.3 specifies that a
>> core serializing instruction such as "cpuid" should be executed on _each_ core
>> before the new instruction is made visible.
>
> Hi,
>
> Is anyone willing to merge this fix into the x86 tree?

Hi Ingo,
Please merge this fix for safe self-modifying code!

Thanks



--
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: [email protected]

2011-03-15 16:43:38

by Mathieu Desnoyers

Subject: [tip:x86/urgent] x86: stop_machine_text_poke() should issue sync_core()

Commit-ID: 0e00f7aed6af21fc09b2a94d28bc34e449bd3a53
Gitweb: http://git.kernel.org/tip/0e00f7aed6af21fc09b2a94d28bc34e449bd3a53
Author: Mathieu Desnoyers <[email protected]>
AuthorDate: Thu, 3 Mar 2011 11:01:37 -0500
Committer: H. Peter Anvin <[email protected]>
CommitDate: Tue, 15 Mar 2011 08:36:37 -0700

x86: stop_machine_text_poke() should issue sync_core()

Intel Architecture Software Developer's Manual section 7.1.3 specifies that a
core serializing instruction such as "cpuid" should be executed on _each_ core
before the new instruction is made visible.

Failure to do so can lead to unspecified behavior (Intel XMC errata include
General Protection Fault in the list), so we should avoid this at all costs.

This problem can affect modified code executed by interrupt handlers after
interrupts are re-enabled at the end of stop_machine, because no core
serializing instruction is executed between the code modification and the moment
interrupts are re-enabled.

Because stop_machine_text_poke performs the text modification from the first CPU
decrementing stop_machine_first, modified code executed in thread context is
also affected by this problem. To explain why, we have to split the CPUs into
two categories: the CPU that initiates the text modification (calls
text_poke_smp) and all the others. The scheduler, executed on all other CPUs
after stop_machine, issues an "iret" core serializing instruction, and therefore
handles core serialization for all these CPUs. However, the text modification
initiator can continue its execution on the same thread and access the modified
text without any scheduler call. Given that the CPU that initiates the code
modification is not guaranteed to be the one actually performing the code
modification, the initiator falls into the XMC errata case.

Q: Isn't this executed from an IPI handler, which will return with IRET (a
serializing instruction) anyway?
A: No. stop_machine now uses per-cpu workqueues, so the handler is executed
from worker threads. There is no iret anymore.

Signed-off-by: Mathieu Desnoyers <[email protected]>
LKML-Reference: <20110303160137.GB1590@Krystal>
Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/alternative.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7038b95..4db3554 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -620,7 +620,12 @@ static int __kprobes stop_machine_text_poke(void *data)
 		flush_icache_range((unsigned long)p->addr,
 				   (unsigned long)p->addr + p->len);
 	}
-
+	/*
+	 * Intel Architecture Software Developer's Manual section 7.1.3 specifies
+	 * that a core serializing instruction such as "cpuid" should be
+	 * executed on _each_ core before the new instruction is made visible.
+	 */
+	sync_core();
 	return 0;
 }