2011-04-18 12:14:17

by KOSAKI Motohiro

Subject: [PATCH 0/3] convert mm->cpu_vm_mask into cpumask_var_t

Recently, Hugh and I discussed the size of mm_struct, and I decided to
spend some time slimming it down.

Unfortunately, we haven't finished converting cpumask_size() to be fully
nr_cpu_ids based, so we can't get the full benefit of cpumask_var_t yet.
However, I expect that to be solved this year or next.


KOSAKI Motohiro (3):
mn10300: replace mm->cpu_vm_mask with mm_cpumask
tile: replace mm->cpu_vm_mask with mm_cpumask()
mm: convert mm->cpu_vm_cpumask into cpumask_var_t

Documentation/cachetlb.txt | 2 +-
arch/mn10300/kernel/smp.c | 2 +-
arch/mn10300/mm/tlb-smp.c | 6 ++--
arch/tile/include/asm/mmu_context.h | 4 +-
arch/tile/kernel/tlb.c | 12 +++++-----
include/linux/mm_types.h | 9 +++++--
include/linux/sched.h | 1 +
init/main.c | 2 +
kernel/fork.c | 37 ++++++++++++++++++++++++++++++++--
mm/init-mm.c | 1 -
10 files changed, 56 insertions(+), 20 deletions(-)

--
1.7.3.1
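
For a sense of scale, here is a rough user-space sketch of the sizing
arithmetic behind the cover letter. The helper name mask_bytes() and the
CPU counts in the note below are illustrative assumptions, not kernel code;
the sketch only models the usual "round the bit count up to whole unsigned
longs" calculation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes needed to hold a bitmap of nbits bits,
 * rounded up to a whole number of unsigned longs.
 */
static size_t mask_bytes(size_t nbits)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (nbits + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

Under these assumptions, a mask sized for a compile-time limit of 4096 CPUs
takes mask_bytes(4096) bytes in every mm_struct, whereas a mask sized for a
machine with 8 possible CPUs would need only mask_bytes(8), a single
unsigned long. That gap is the saving the series is aiming for once the
size follows nr_cpu_ids.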



2011-04-18 12:15:33

by KOSAKI Motohiro

Subject: [PATCH 1/3] mn10300: replace mm->cpu_vm_mask with mm_cpumask

We plan to change the definition of mm->cpu_vm_mask later. Thus, this
patch converts direct accesses to use mm_cpumask().

Signed-off-by: KOSAKI Motohiro <[email protected]>
Cc: David Howells <[email protected]>
Cc: Koichi Yasutake <[email protected]>
---
arch/mn10300/kernel/smp.c | 2 +-
arch/mn10300/mm/tlb-smp.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)

mn10300 is one of the last two direct users of cpu_vm_mask.

diff --git a/arch/mn10300/kernel/smp.c b/arch/mn10300/kernel/smp.c
index 83fb279..6d59726 100644
--- a/arch/mn10300/kernel/smp.c
+++ b/arch/mn10300/kernel/smp.c
@@ -986,7 +986,7 @@ int __cpu_disable(void)
return -EBUSY;

migrate_irqs();
- cpu_clear(cpu, current->active_mm->cpu_vm_mask);
+ cpumask_clear_cpu(cpu, mm_cpumask(current->active_mm));
return 0;
}

diff --git a/arch/mn10300/mm/tlb-smp.c b/arch/mn10300/mm/tlb-smp.c
index 0b6a5ad..9d357b4 100644
--- a/arch/mn10300/mm/tlb-smp.c
+++ b/arch/mn10300/mm/tlb-smp.c
@@ -146,7 +146,7 @@ void flush_tlb_mm(struct mm_struct *mm)
cpumask_t cpu_mask;

preempt_disable();
- cpu_mask = mm->cpu_vm_mask;
+ cpu_mask = *mm_cpumask(mm);
cpu_clear(smp_processor_id(), cpu_mask);

local_flush_tlb();
@@ -165,7 +165,7 @@ void flush_tlb_current_task(void)
cpumask_t cpu_mask;

preempt_disable();
- cpu_mask = mm->cpu_vm_mask;
+ cpu_mask = *mm_cpumask(mm);
cpu_clear(smp_processor_id(), cpu_mask);

local_flush_tlb();
@@ -186,7 +186,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long va)
cpumask_t cpu_mask;

preempt_disable();
- cpu_mask = mm->cpu_vm_mask;
+ cpu_mask = *mm_cpumask(mm);
cpu_clear(smp_processor_id(), cpu_mask);

local_flush_tlb_page(mm, va);
--
1.7.3.1


2011-04-18 12:18:18

by KOSAKI Motohiro

Subject: [PATCH 2/3] tile: replace mm->cpu_vm_mask with mm_cpumask()

We plan to change the definition of mm->cpu_vm_mask later. Thus, this patch
converts direct accesses to use the proper macro.

Signed-off-by: KOSAKI Motohiro <[email protected]>
Cc: Chris Metcalf <[email protected]>
---
arch/tile/include/asm/mmu_context.h | 4 ++--
arch/tile/kernel/tlb.c | 12 ++++++------
2 files changed, 8 insertions(+), 8 deletions(-)

tile is one of the last two direct users of cpu_vm_mask. I hope this
conversion is taken even if [patch 3/3] is rejected.

Chris, I couldn't get a cross compiler for tile, so I hope you can check
this carefully. Thanks.

diff --git a/arch/tile/include/asm/mmu_context.h b/arch/tile/include/asm/mmu_context.h
index 9bc0d07..15fb246 100644
--- a/arch/tile/include/asm/mmu_context.h
+++ b/arch/tile/include/asm/mmu_context.h
@@ -100,8 +100,8 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
__get_cpu_var(current_asid) = asid;

/* Clear cpu from the old mm, and set it in the new one. */
- cpumask_clear_cpu(cpu, &prev->cpu_vm_mask);
- cpumask_set_cpu(cpu, &next->cpu_vm_mask);
+ cpumask_clear_cpu(cpu, mm_cpumask(prev));
+ cpumask_set_cpu(cpu, mm_cpumask(next));

/* Re-load page tables */
install_page_table(next->pgd, asid);
diff --git a/arch/tile/kernel/tlb.c b/arch/tile/kernel/tlb.c
index 2dffc10..a5f241c 100644
--- a/arch/tile/kernel/tlb.c
+++ b/arch/tile/kernel/tlb.c
@@ -34,13 +34,13 @@ void flush_tlb_mm(struct mm_struct *mm)
{
HV_Remote_ASID asids[NR_CPUS];
int i = 0, cpu;
- for_each_cpu(cpu, &mm->cpu_vm_mask) {
+ for_each_cpu(cpu, mm_cpumask(mm)) {
HV_Remote_ASID *asid = &asids[i++];
asid->y = cpu / smp_topology.width;
asid->x = cpu % smp_topology.width;
asid->asid = per_cpu(current_asid, cpu);
}
- flush_remote(0, HV_FLUSH_EVICT_L1I, &mm->cpu_vm_mask,
+ flush_remote(0, HV_FLUSH_EVICT_L1I, mm_cpumask(mm),
0, 0, 0, NULL, asids, i);
}

@@ -54,8 +54,8 @@ void flush_tlb_page_mm(const struct vm_area_struct *vma, struct mm_struct *mm,
{
unsigned long size = hv_page_size(vma);
int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
- flush_remote(0, cache, &mm->cpu_vm_mask,
- va, size, size, &mm->cpu_vm_mask, NULL, 0);
+ flush_remote(0, cache, mm_cpumask(mm),
+ va, size, size, mm_cpumask(mm), NULL, 0);
}

void flush_tlb_page(const struct vm_area_struct *vma, unsigned long va)
@@ -70,8 +70,8 @@ void flush_tlb_range(const struct vm_area_struct *vma,
unsigned long size = hv_page_size(vma);
struct mm_struct *mm = vma->vm_mm;
int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
- flush_remote(0, cache, &mm->cpu_vm_mask, start, end - start, size,
- &mm->cpu_vm_mask, NULL, 0);
+ flush_remote(0, cache, mm_cpumask(mm), start, end - start, size,
+ mm_cpumask(mm), NULL, 0);
}

void flush_tlb_all(void)
--
1.7.3.1


2011-04-18 12:18:55

by KOSAKI Motohiro

Subject: [PATCH 3/3] mm: convert mm->cpu_vm_cpumask into cpumask_var_t

cpumask_t is a very big struct, and cpu_vm_mask is placed in the wrong
position in mm_struct. This may reduce the cache hit ratio.

This patch makes two changes.
1) Move the cpumask to the end of mm_struct, because usually only the front
bits of the cpumask are accessed when the system has cpu-hotplug capability.
2) Convert cpu_vm_mask into cpumask_var_t. This may help reduce the memory
footprint if cpumask_size() uses nr_cpumask_bits properly in the future.

In addition, this patch renames cpu_vm_mask to cpu_vm_mask_var, which
may help detect out-of-tree cpu_vm_mask users.

This patch makes no functional change.

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
Documentation/cachetlb.txt | 2 +-
include/linux/mm_types.h | 9 ++++++---
include/linux/sched.h | 1 +
init/main.c | 2 ++
kernel/fork.c | 37 ++++++++++++++++++++++++++++++++++---
mm/init-mm.c | 1 -
6 files changed, 44 insertions(+), 8 deletions(-)

This patch doesn't touch x86/kernel/tboot.c, because it can't be compiled.

diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index 9164ae3..9b728dc 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -16,7 +16,7 @@ on all processors in the system. Don't let this scare you into
thinking SMP cache/tlb flushing must be so inefficient, this is in
fact an area where many optimizations are possible. For example,
if it can be proven that a user address space has never executed
-on a cpu (see vma->cpu_vm_mask), one need not perform a flush
+on a cpu (see mm_cpumask()), one need not perform a flush
for this address space on that cpu.

First, the TLB flushing interfaces, since they are the simplest. The
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index ca01ab2..070c7f2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -267,8 +267,6 @@ struct mm_struct {

struct linux_binfmt *binfmt;

- cpumask_t cpu_vm_mask;
-
/* Architecture-specific MM context */
mm_context_t context;

@@ -318,9 +316,14 @@ struct mm_struct {
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
pgtable_t pmd_huge_pte; /* protected by page_table_lock */
#endif
+
+ cpumask_var_t cpu_vm_mask_var;
};

/* Future-safe accessor for struct mm_struct's cpu_vm_mask. */
-#define mm_cpumask(mm) (&(mm)->cpu_vm_mask)
+static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
+{
+ return mm->cpu_vm_mask_var;
+}

#endif /* _LINUX_MM_TYPES_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3f7d3f9..7068380 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2170,6 +2170,7 @@ static inline void mmdrop(struct mm_struct * mm)
if (unlikely(atomic_dec_and_test(&mm->mm_count)))
__mmdrop(mm);
}
+extern int mm_init_cpumask(struct mm_struct *mm, struct mm_struct *oldmm);

/* mmput gets rid of the mappings and all user-space */
extern void mmput(struct mm_struct *);
diff --git a/init/main.c b/init/main.c
index 4a9479e..8451425 100644
--- a/init/main.c
+++ b/init/main.c
@@ -509,6 +509,8 @@ asmlinkage void __init start_kernel(void)
sort_main_extable();
trap_init();
mm_init();
+ BUG_ON(mm_init_cpumask(&init_mm, NULL));
+
/*
* Set up the scheduler prior starting any interrupts (such as the
* timer interrupt). Full topology setup happens at smp_init()
diff --git a/kernel/fork.c b/kernel/fork.c
index cc04197..5d303a2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -486,6 +486,20 @@ static void mm_init_aio(struct mm_struct *mm)
#endif
}

+int mm_init_cpumask(struct mm_struct *mm, struct mm_struct *oldmm)
+{
+#ifdef CONFIG_CPUMASK_OFFSTACK
+ if (!alloc_cpumask_var(&mm->cpu_vm_mask_var, GFP_KERNEL))
+ return -ENOMEM;
+
+ if (oldmm)
+ cpumask_copy(mm_cpumask(mm), mm_cpumask(oldmm));
+ else
+ memset(mm_cpumask(mm), 0, cpumask_size());
+#endif
+ return 0;
+}
+
static struct mm_struct * mm_init(struct mm_struct * mm, struct task_struct *p)
{
atomic_set(&mm->mm_users, 1);
@@ -522,10 +536,20 @@ struct mm_struct * mm_alloc(void)
struct mm_struct * mm;

mm = allocate_mm();
- if (mm) {
- memset(mm, 0, sizeof(*mm));
- mm = mm_init(mm, current);
+ if (!mm)
+ return NULL;
+
+ memset(mm, 0, sizeof(*mm));
+ mm = mm_init(mm, current);
+ if (!mm)
+ return NULL;
+
+ if (mm_init_cpumask(mm, NULL)) {
+ mm_free_pgd(mm);
+ free_mm(mm);
+ return NULL;
}
+
return mm;
}

@@ -537,6 +561,7 @@ struct mm_struct * mm_alloc(void)
void __mmdrop(struct mm_struct *mm)
{
BUG_ON(mm == &init_mm);
+ free_cpumask_var(mm->cpu_vm_mask_var);
mm_free_pgd(mm);
destroy_context(mm);
mmu_notifier_mm_destroy(mm);
@@ -691,6 +716,9 @@ struct mm_struct *dup_mm(struct task_struct *tsk)
if (!mm_init(mm, tsk))
goto fail_nomem;

+ if (mm_init_cpumask(mm, oldmm))
+ goto fail_nocpumask;
+
if (init_new_context(tsk, mm))
goto fail_nocontext;

@@ -717,6 +745,9 @@ fail_nomem:
return NULL;

fail_nocontext:
+ free_cpumask_var(mm->cpu_vm_mask_var);
+
+fail_nocpumask:
/*
* If init_new_context() failed, we cannot use mmput() to free the mm
* because it calls destroy_context()
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 1d29cdf..4019979 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -21,6 +21,5 @@ struct mm_struct init_mm = {
.mmap_sem = __RWSEM_INITIALIZER(init_mm.mmap_sem),
.page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
.mmlist = LIST_HEAD_INIT(init_mm.mmlist),
- .cpu_vm_mask = CPU_MASK_ALL,
INIT_MM_CONTEXT(init_mm)
};
--
1.7.3.1
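
The shape of the change in patch 3/3 can be modeled in user space roughly as
follows. This is only a sketch under stated assumptions: struct toy_mm,
TOY_MASK_LONGS, and the toy_* helpers are hypothetical stand-ins, and
malloc()/free() take the place of the kernel's alloc_cpumask_var() and
free_cpumask_var(). It shows the three pieces the patch introduces: a
separately allocated mask stored at the end of the struct, an accessor that
hides the field, and an init helper that either copies the parent's mask or
starts from an empty one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MASK_LONGS 4	/* assumed mask size, for illustration only */

struct toy_mm {
	int other_fields;		/* stand-in for the rest of mm_struct */
	unsigned long *cpu_mask_var;	/* allocated separately, kept last */
};

/* Accessor: callers go through this instead of touching the field. */
static unsigned long *toy_mm_cpumask(struct toy_mm *mm)
{
	return mm->cpu_mask_var;
}

/* Allocate the mask; copy it from oldmm if given, otherwise clear it. */
static int toy_mm_init_cpumask(struct toy_mm *mm, struct toy_mm *oldmm)
{
	size_t size = TOY_MASK_LONGS * sizeof(unsigned long);

	mm->cpu_mask_var = malloc(size);
	if (!mm->cpu_mask_var)
		return -1;

	if (oldmm)
		memcpy(toy_mm_cpumask(mm), toy_mm_cpumask(oldmm), size);
	else
		memset(toy_mm_cpumask(mm), 0, size);
	return 0;
}

/* Release the mask; must run on every teardown and error path. */
static void toy_mm_free_cpumask(struct toy_mm *mm)
{
	free(mm->cpu_mask_var);
	mm->cpu_mask_var = NULL;
}
```

The design point this illustrates is that once the mask is a separate
allocation, every path that creates or destroys the owning struct has to
pair the init and free calls, which is why the patch touches mm_alloc(),
dup_mm(), and __mmdrop() together.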


2011-04-18 13:09:05

by Chris Metcalf

Subject: Re: [PATCH 2/3] tile: replace mm->cpu_vm_mask with mm_cpumask()

On 4/18/2011 8:18 AM, KOSAKI Motohiro wrote:
> We plan to change the definition of mm->cpu_vm_mask later. Thus, this patch
> converts direct accesses to use the proper macro.
>
> Signed-off-by: KOSAKI Motohiro <[email protected]>
> Cc: Chris Metcalf <[email protected]>

Thanks; I wasn't aware of this macro. I'll take this change into my tree
unless you would like to push it.

> Chris, I couldn't get a cross compiler for tile, so I hope you can check
> this carefully.

The toolchain support is currently only available from Tilera (at
http://www.tilera.com/scm/) but we are in the process of cleaning it up to
push it up to the community.

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com

2011-04-19 00:09:23

by KOSAKI Motohiro

Subject: Re: [PATCH 2/3] tile: replace mm->cpu_vm_mask with mm_cpumask()

> On 4/18/2011 8:18 AM, KOSAKI Motohiro wrote:
> > We plan to change the definition of mm->cpu_vm_mask later. Thus, this patch
> > converts direct accesses to use the proper macro.
> >
> > Signed-off-by: KOSAKI Motohiro <[email protected]>
> > Cc: Chris Metcalf <[email protected]>
>
> Thanks; I wasn't aware of this macro. I'll take this change into my tree
> unless you would like to push it.

Thanks.

I'd like this patch to go through your tree. I don't want to push patch 3/3
to Linus's tree until all architectures have finished converting to
mm_cpumask().


> > Chris, I couldn't get a cross compiler for tile, so I hope you can check
> > this carefully.
>
> The toolchain support is currently only available from Tilera (at
> http://www.tilera.com/scm/) but we are in the process of cleaning it up to
> push it up to the community.

Thank you, too.


2011-04-19 00:19:45

by KOSAKI Motohiro

Subject: Re: [PATCH 3/3] mm: convert mm->cpu_vm_cpumask into cpumask_var_t

> Signed-off-by: KOSAKI Motohiro <[email protected]>
> ---
> Documentation/cachetlb.txt | 2 +-
> include/linux/mm_types.h | 9 ++++++---
> include/linux/sched.h | 1 +
> init/main.c | 2 ++
> kernel/fork.c | 37 ++++++++++++++++++++++++++++++++++---
> mm/init-mm.c | 1 -
> 6 files changed, 44 insertions(+), 8 deletions(-)
>
> This patch doesn't touch x86/kernel/tboot.c, because it can't be compiled.

My bad. I confused CONFIG_HAVE_INTEL_TXT with CONFIG_INTEL_TXT. A proper
(incremental) fix is below.



From 0b443d8dbdf7ce97f92e6622840585ca41abca83 Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <[email protected]>
Date: Tue, 19 Apr 2011 08:38:01 +0900
Subject: [PATCH 4/4] fix tboot

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
arch/x86/kernel/tboot.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 998e972..0f0d1a3 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -110,7 +110,6 @@ static struct mm_struct tboot_mm = {
.mmap_sem = __RWSEM_INITIALIZER(init_mm.mmap_sem),
.page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
.mmlist = LIST_HEAD_INIT(init_mm.mmlist),
- .cpu_vm_mask = CPU_MASK_ALL,
};

static inline void switch_to_tboot_pt(void)
@@ -337,9 +336,18 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata =

static __init int tboot_late_init(void)
{
+ int ret;
+
if (!tboot_enabled())
return 0;

+ ret = mm_init_cpumask(&tboot_mm, NULL);
+ if (ret) {
+ pr_warning("tboot: Allocation failure, disabling tboot.\n");
+ tboot = NULL;
+ return ret;
+ }
+
tboot_create_trampoline();

atomic_set(&ap_wfs_count, 0);
--
1.7.3.1