2012-05-17 02:16:31

by Paul Mundt


Subject: Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

On Wed, May 16, 2012 at 11:09:29PM +0200, Peter Zijlstra wrote:
> On Wed, 2012-05-16 at 21:34 +0800, Alex Shi wrote:
> >
> > So, if the minimum change of tlb->start/end can be protected by
> > HAVE_GENERIC_MMU_GATHER, it is safe and harmless, am I right?
> >
> safe yes, but not entirely harmless. A quick look seems to suggest you
> fail for VM_HUGETLB. If your mmu_gather spans a vma with VM_HUGETLB
> you'll do a regular range flush not a full mm flush like the other paths
> do.
>
> Anyway, I did a quick refresh of my series on a recent -tip tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/mmu.git tlb-unify
>
> With that all you need is to "select HAVE_MMU_GATHER_RANGE" for x86 and
> implement a useful flush_tlb_range().
>
> In particular, see:
> http://git.kernel.org/?p=linux/kernel/git/peterz/mmu.git;a=commitdiff;h=05e53144177e6242fda404045f50f48114bcf185;hp=2cd7dc710652127522392f4b7ecb5fa6e954941e
>
> I've slightly changed the code to address an open issue with the
> vm_flags tracking. We now force flush the mmu_gather whenever VM_HUGETLB
> flips because most (all?) archs that look at that flag expect pure huge
> pages and not a mixture.
>
> I've seem to have misplaced my cross-compiler set, so I've only compiled
> x86-64 for now.

It was on my list to test when you sent out the series initially, but it
seems to have slipped my mind until I saw this thread. Here's a patch on
top of your tlb-unify branch that gets sh working (tested on all of
2-level, 3-level, and nommu).

I opted to shove the asm/cacheflush.h include into asm-generic/tlb.h
directly since it is calling flush_cache_range() openly now, and the rest
of the architectures are just getting at it through various whimsical
means. sh was getting it through pagemap.h -> highmem.h, while ARM
presently can't seem to make up its mind and includes pagemap.h for nommu
only, as well as cacheflush.h explicitly.

With the reworked interface we don't seem to actually need to stub out
the interface for the nommu case anymore anyway, since all of the users
are confined to mm/memory.c, which we don't build for nommu.

Signed-off-by: Paul Mundt <[email protected]>

---

diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
index 8c00785..bedc2ed 100644
--- a/arch/sh/include/asm/pgalloc.h
+++ b/arch/sh/include/asm/pgalloc.h
@@ -13,6 +13,8 @@ extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 extern void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmd);
 extern pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address);
 extern void pmd_free(struct mm_struct *mm, pmd_t *pmd);
+
+#define __pmd_free_tlb(tlb, pmdp, addr) pmd_free((tlb)->mm, pmdp)
 #endif

 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
diff --git a/arch/sh/include/asm/tlb.h b/arch/sh/include/asm/tlb.h
index 45e5925..71af915 100644
--- a/arch/sh/include/asm/tlb.h
+++ b/arch/sh/include/asm/tlb.h
@@ -6,18 +6,7 @@
 #endif

 #ifndef __ASSEMBLY__
-#include <linux/pagemap.h>
-
 #ifdef CONFIG_MMU
-#include <linux/swap.h>
-
-#define __tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)
-
-#define __pte_free_tlb(tlb, ptep, addr) pte_free((tlb)->mm, ptep)
-#define __pmd_free_tlb(tlb, pmdp, addr) pmd_free((tlb)->mm, pmdp)
-#define __pud_free_tlb(tlb, pudp, addr) pud_free((tlb)->mm, pudp)
-
-#include <asm-generic/tlb.h>

 #if defined(CONFIG_CPU_SH4) || defined(CONFIG_SUPERH64)
 extern void tlb_wire_entry(struct vm_area_struct *, unsigned long, pte_t);
@@ -35,8 +24,6 @@ static inline void tlb_unwire_entry(void)
 }
 #endif

-#else /* CONFIG_MMU */
-
 #define __tlb_remove_tlb_entry(tlb, pte, address) do { } while (0)

 #include <asm-generic/tlb.h>
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 90a725c..571e2cf 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -18,6 +18,7 @@
 #include <linux/swap.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
+#include <asm/cacheflush.h>

 static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page);
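
For readers unfamiliar with the hook macros touched by the patch above, here is a minimal userspace sketch of the pattern they follow. It is not kernel code: the struct layouts, the freed_pmds counter, and demo_free_count() are hypothetical stand-ins invented for illustration, and only the two macro bodies are copied from the diff. It shows that __pmd_free_tlb() simply forwards to pmd_free() using the mm stored in the gather structure, while __tlb_remove_tlb_entry() expands to an empty statement.

```c
/* Userspace mock-up: these types only imitate the kernel's. */
#include <stddef.h>

struct mm_struct { int freed_pmds; };        /* hypothetical counter for the demo */
struct mmu_gather { struct mm_struct *mm; }; /* only the field the macro needs */
typedef struct { unsigned long val; } pmd_t;

/* Stand-in for the arch pmd_free(): records the call instead of freeing. */
static void pmd_free(struct mm_struct *mm, pmd_t *pmd)
{
	(void)pmd;
	mm->freed_pmds++;
}

/* Macro bodies as they appear in the patch. */
#define __pmd_free_tlb(tlb, pmdp, addr) pmd_free((tlb)->mm, pmdp)
#define __tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)

/* Hypothetical driver: returns how many times pmd_free() was reached. */
static int demo_free_count(void)
{
	struct mm_struct mm = { 0 };
	struct mmu_gather tlb = { &mm };
	pmd_t pmd = { 0 };

	__tlb_remove_tlb_entry(&tlb, NULL, 0x1000UL); /* expands to nothing */
	__pmd_free_tlb(&tlb, &pmd, 0x1000UL);         /* forwards to pmd_free() */
	return mm.freed_pmds;
}
```

Calling demo_free_count() from a test harness returns the number of pmd_free() calls made through the macro. The addr argument is accepted but unused in both macros, which matches the definitions in the diff.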

2012-05-17 08:06:39

by Alex Shi


Subject: Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

On 05/17/2012 10:07 AM, Steven Rostedt wrote:

> On Thu, 2012-05-17 at 08:43 +0800, Alex Shi wrote:
>
>>> I've seem to have misplaced my cross-compiler set, so I've only compiled
>>> x86-64 for now.
>>
>>
>> Oh, I also need a cross-compiler for other archs. Thanks reminder!
>
> Here:
>
> http://kernel.org/pub/tools/crosstool/
>
> Oh, and if you want to automate this. I attached a ktest.pl config that
> does it for you. I'll be pushing this config and others into a examples
> directory come the next merge window.
>
> Ktest is located in the Linux tree under tools/testing/ktest/
>
> You can run a bunch of cross compiles by doing:
>
> ktest.pl crosstests.conf

It works fine. :)
But does ktest only do one kind of config testing?
How does it do randconfig testing?

>
> -- Steve
>
>
>
>