Currently, init_new_context() only for each online CPU, this may cause
memory corruption when CPU hotplug and fork() happens at the same time.
To avoid this, we make init_new_context() cover each possible CPU.
Scenario:
1, CPU#1 is being offline;
2, On CPU#0, do_fork() call dup_mm() and copy a mm_struct to the child;
3, On CPU#0, dup_mm() call init_new_context(), since CPU#1 is offline
and init_new_context() only covers the online CPUs, child has the
same asid as its parent on CPU#1 (however, child's asid should be 0);
4, CPU#1 is being online;
5, Now, if both parent and child run on CPU#1, memory corruption (e.g.
segfault, bus error, etc.) will occur.
Signed-off-by: Huacai Chen <[email protected]>
---
arch/mips/include/asm/mmu_context.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/mips/include/asm/mmu_context.h b/arch/mips/include/asm/mmu_context.h
index e81d719..49d220c 100644
--- a/arch/mips/include/asm/mmu_context.h
+++ b/arch/mips/include/asm/mmu_context.h
@@ -133,7 +133,7 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
{
int i;
- for_each_online_cpu(i)
+ for_each_possible_cpu(i)
cpu_context(i, mm) = 0;
return 0;
--
1.7.7.3
2013/3/17 Huacai Chen <[email protected]>:
> Currently, init_new_context() only for each online CPU, this may cause
> memory corruption when CPU hotplug and fork() happens at the same time.
> To avoid this, we make init_new_context() cover each possible CPU.
>
> Scenario:
> 1, CPU#1 is being offline;
> 2, On CPU#0, do_fork() call dup_mm() and copy a mm_struct to the child;
> 3, On CPU#0, dup_mm() call init_new_context(), since CPU#1 is offline
> and init_new_context() only covers the online CPUs, child has the
> same asid as its parent on CPU#1 (however, child's asid should be 0);
> 4, CPU#1 is being online;
> 5, Now, if both parent and child run on CPU#1, memory corruption (e.g.
> segfault, bus error, etc.) will occur.
Adds some further explanations about point 5:
5.1) The parent process runs on CPU#1, the version field of its
asid[CPU#1] and CPU#1's asid cache may match, thus the parent
process's asid[CPU#1] is used in TLB refilling routine.
5.2) Then the child process runs on CPU#1 which probably has the same
asid[CPU#1] as its parent(as pointed by point 3).
5.3) The child process may access address space of its parent.
It can also be solved by increasing version field of asid cache when
plugin a CPU, may be a better way?
Regards,
-- Chen Jie
On 03/17/2013 05:50 AM, Huacai Chen wrote:
> Currently, init_new_context() only for each online CPU, this may cause
> memory corruption when CPU hotplug and fork() happens at the same time.
> To avoid this, we make init_new_context() cover each possible CPU.
>
> Scenario:
> 1, CPU#1 is being offline;
> 2, On CPU#0, do_fork() call dup_mm() and copy a mm_struct to the child;
> 3, On CPU#0, dup_mm() call init_new_context(), since CPU#1 is offline
> and init_new_context() only covers the online CPUs, child has the
> same asid as its parent on CPU#1 (however, child's asid should be 0);
> 4, CPU#1 is being online;
> 5, Now, if both parent and child run on CPU#1, memory corruption (e.g.
> segfault, bus error, etc.) will occur.
>
> Signed-off-by: Huacai Chen <[email protected]>
We were seeing the same crashes, this patch set seems to fix the problem.
Acked-by: David Daney <[email protected]>
> ---
> arch/mips/include/asm/mmu_context.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/mips/include/asm/mmu_context.h b/arch/mips/include/asm/mmu_context.h
> index e81d719..49d220c 100644
> --- a/arch/mips/include/asm/mmu_context.h
> +++ b/arch/mips/include/asm/mmu_context.h
> @@ -133,7 +133,7 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> {
> int i;
>
> - for_each_online_cpu(i)
> + for_each_possible_cpu(i)
> cpu_context(i, mm) = 0;
>
> return 0;
>