2017-04-22 07:01:31

by Andy Lutomirski

Subject: [PATCH v3 0/4] x86/mm: Straightforward TLB flush fixes/cleanups

I have some less straightforward cleanups coming, but these four
are easy and might even be okay for 4.12 assuming that someone feels
like reviewing them.

Changes from v2:
- Added patch 1 (I suck at grep)
- Deleted one more copy of flush_tlb() in patch 2 (ditto)

Changes from v1:
- Fixed the Cc list on patch 3. No code changes at all.

Andy Lutomirski (4):
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Fix flush_tlb_page() on Xen

arch/x86/include/asm/tlbflush.h | 9 ---------
arch/x86/kernel/vm86_32.c | 2 +-
arch/x86/mm/tlb.c | 33 ++++++++-------------------------
3 files changed, 9 insertions(+), 35 deletions(-)

--
2.9.3


2017-04-22 07:01:40

by Andy Lutomirski

Subject: [PATCH v3 3/4] x86/mm: Make flush_tlb_mm_range() more predictable

I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.

The fix is simple: avoid short-circuiting the calculation of
base_pages_to_flush.

As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
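
For clarity, the reordered decision can be modelled in isolation.
The sketch below is a made-up userspace approximation --
pages_to_flush(), the ceiling value and main() are invented for
illustration and are not the kernel code:

#include <stdio.h>

#define TLB_FLUSH_ALL	(~0UL)
#define PAGE_SHIFT	12

static unsigned long tlb_single_page_flush_ceiling = 33;

/* With the patch, this decision is taken once, up front, so the local
 * short-circuit paths and the remote-flush path all see the same
 * answer. */
static unsigned long pages_to_flush(unsigned long start, unsigned long end,
				    int is_hugetlb)
{
	unsigned long pages = TLB_FLUSH_ALL;

	if (end != TLB_FLUSH_ALL && !is_hugetlb)
		pages = (end - start) >> PAGE_SHIFT;
	if (pages > tlb_single_page_flush_ceiling)
		pages = TLB_FLUSH_ALL;

	return pages;
}

int main(void)
{
	/* 4 pages: small enough for individual page flushes */
	printf("%lu\n", pages_to_flush(0x1000, 0x5000, 0));
	/* whole address space: always reported as TLB_FLUSH_ALL */
	printf("%lx\n", pages_to_flush(0, TLB_FLUSH_ALL, 0));
	return 0;
}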

Cc: Rik van Riel <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/mm/tlb.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 92ec37f517ab..9db9260a5e9f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -309,6 +309,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long base_pages_to_flush = TLB_FLUSH_ALL;

preempt_disable();
+
+ if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
+ base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+ if (base_pages_to_flush > tlb_single_page_flush_ceiling)
+ base_pages_to_flush = TLB_FLUSH_ALL;
+
if (current->active_mm != mm) {
/* Synchronize with switch_mm. */
smp_mb();
@@ -325,15 +331,11 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
goto out;
}

- if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
- base_pages_to_flush = (end - start) >> PAGE_SHIFT;
-
/*
* Both branches below are implicit full barriers (MOV to CR or
* INVLPG) that synchronize with switch_mm.
*/
- if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
- base_pages_to_flush = TLB_FLUSH_ALL;
+ if (base_pages_to_flush == TLB_FLUSH_ALL) {
count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
local_flush_tlb();
} else {
--
2.9.3

2017-04-22 07:01:43

by Andy Lutomirski

Subject: [PATCH v3 4/4] x86/mm: Fix flush_tlb_page() on Xen

flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up. native_flush_tlb_others() has the
fixup but Xen's version doesn't. Move the fixup into
flush_tlb_page(), the caller.

AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs
when flush_tlb_page() was called.
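
To see roughly why, here is a small self-contained sketch.
remote_flush() below is a stand-in invented for this note -- it is
neither native_flush_tlb_others() nor Xen's implementation -- it
merely takes the range at face value, as an implementation without
the "end == 0 means one page" fixup would:

#include <stdio.h>

#define PAGE_SIZE 4096UL

static void remote_flush(unsigned long start, unsigned long end)
{
	/* With no fixup, the old-style call looks like a bogus range
	 * and the safe answer is to flush everything. */
	if (end <= start)
		printf("bogus range -> flush everything\n");
	else
		printf("flush %lu page(s)\n", (end - start) / PAGE_SIZE);
}

int main(void)
{
	unsigned long addr = 0x400000;

	remote_flush(addr, 0);			/* old call: (start, 0) */
	remote_flush(addr, addr + PAGE_SIZE);	/* new call: one page */
	return 0;
}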

Cc: Rik van Riel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/mm/tlb.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 9db9260a5e9f..6e7bedf69af7 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -263,8 +263,6 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
{
struct flush_tlb_info info;

- if (end == 0)
- end = start + PAGE_SIZE;
info.flush_mm = mm;
info.flush_start = start;
info.flush_end = end;
@@ -378,7 +376,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
}

if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, start, 0UL);
+ flush_tlb_others(mm_cpumask(mm), mm, start, start + PAGE_SIZE);

preempt_enable();
}
--
2.9.3

2017-04-22 07:01:45

by Andy Lutomirski

Subject: [PATCH v3 2/4] x86/mm: Remove flush_tlb() and flush_tlb_current_task()

I was trying to figure out how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.

Cc: Rik van Riel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/include/asm/tlbflush.h | 9 ---------
arch/x86/mm/tlb.c | 17 -----------------
2 files changed, 26 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index fc5abff9b7fd..8e7ae9e6c59a 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -205,7 +205,6 @@ static inline void __flush_tlb_one(unsigned long addr)
/*
* TLB flushing:
*
- * - flush_tlb() flushes the current mm struct TLBs
* - flush_tlb_all() flushes all processes TLBs
* - flush_tlb_mm(mm) flushes the specified mm context TLB's
* - flush_tlb_page(vma, vmaddr) flushes one page
@@ -237,11 +236,6 @@ static inline void flush_tlb_all(void)
__flush_tlb_all();
}

-static inline void flush_tlb(void)
-{
- __flush_tlb_up();
-}
-
static inline void local_flush_tlb(void)
{
__flush_tlb_up();
@@ -303,14 +297,11 @@ static inline void flush_tlb_kernel_range(unsigned long start,
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)

extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);

-#define flush_tlb() flush_tlb_current_task()
-
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a7655f6caf7d..92ec37f517ab 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -289,23 +289,6 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
smp_call_function_many(cpumask, flush_tlb_func, &info, 1);
}

-void flush_tlb_current_task(void)
-{
- struct mm_struct *mm = current->mm;
-
- preempt_disable();
-
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-
- /* This is an implicit full barrier that synchronizes with switch_mm. */
- local_flush_tlb();
-
- trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
- preempt_enable();
-}
-
/*
* See Documentation/x86/tlb.txt for details. We choose 33
* because it is large enough to cover the vast majority (at
--
2.9.3

2017-04-22 07:01:37

by Andy Lutomirski

Subject: [PATCH v3 1/4] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.

Compile-tested only because I don't know whether software that uses
this mechanism even exists.
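
For reference (not part of the change): 0xA0000 is the start of the
legacy VGA window that mark_screen_rdonly() write-protects, and with
4 KiB pages

    32 * PAGE_SIZE = 32 * 4096 = 0x20000

so the flushed range is 0xA0000..0xC0000, i.e. the 128 KiB at
0xA0000-0xBFFFF covering the 32 PTEs the function just marked
read-only.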

Cc: Rik van Riel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 23ee89ce59a9..3eda76b3c835 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -193,7 +193,7 @@ static void mark_screen_rdonly(struct mm_struct *mm)
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}


--
2.9.3

Subject: [tip:x86/mm] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

Commit-ID: 9ccee2373f0658f234727700e619df097ba57023
Gitweb: http://git.kernel.org/tip/9ccee2373f0658f234727700e619df097ba57023
Author: Andy Lutomirski <[email protected]>
AuthorDate: Sat, 22 Apr 2017 00:01:19 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 26 Apr 2017 10:02:06 +0200

x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.

Compile-tested only because I don't know whether software that uses
this mechanism even exists.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 62597c3..7924a53 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -197,7 +197,7 @@ static void mark_screen_rdonly(struct mm_struct *mm)
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}



Subject: [tip:x86/mm] x86/mm: Fix flush_tlb_page() on Xen

Commit-ID: dbd68d8e84c606673ebbcf15862f8c155fa92326
Gitweb: http://git.kernel.org/tip/dbd68d8e84c606673ebbcf15862f8c155fa92326
Author: Andy Lutomirski <[email protected]>
AuthorDate: Sat, 22 Apr 2017 00:01:22 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 26 Apr 2017 10:02:06 +0200

x86/mm: Fix flush_tlb_page() on Xen

flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up. native_flush_tlb_others() has the
fixup but Xen's version doesn't. Move the fixup into
flush_tlb_page(), the caller.

AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs
when flush_tlb_page() was called.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/tlb.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 9db9260..6e7bedf 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -263,8 +263,6 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
{
struct flush_tlb_info info;

- if (end == 0)
- end = start + PAGE_SIZE;
info.flush_mm = mm;
info.flush_start = start;
info.flush_end = end;
@@ -378,7 +376,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
}

if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, start, 0UL);
+ flush_tlb_others(mm_cpumask(mm), mm, start, start + PAGE_SIZE);

preempt_enable();
}

Subject: [tip:x86/mm] x86/mm: Remove flush_tlb() and flush_tlb_current_task()

Commit-ID: 29961b59a51f8c6838a26a45e871a7ed6771809b
Gitweb: http://git.kernel.org/tip/29961b59a51f8c6838a26a45e871a7ed6771809b
Author: Andy Lutomirski <[email protected]>
AuthorDate: Sat, 22 Apr 2017 00:01:20 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 26 Apr 2017 10:02:06 +0200

x86/mm: Remove flush_tlb() and flush_tlb_current_task()

I was trying to figure out how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/tlbflush.h | 9 ---------
arch/x86/mm/tlb.c | 17 -----------------
2 files changed, 26 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 75d002b..6ed9ea4 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -215,7 +215,6 @@ static inline void __flush_tlb_one(unsigned long addr)
/*
* TLB flushing:
*
- * - flush_tlb() flushes the current mm struct TLBs
* - flush_tlb_all() flushes all processes TLBs
* - flush_tlb_mm(mm) flushes the specified mm context TLB's
* - flush_tlb_page(vma, vmaddr) flushes one page
@@ -247,11 +246,6 @@ static inline void flush_tlb_all(void)
__flush_tlb_all();
}

-static inline void flush_tlb(void)
-{
- __flush_tlb_up();
-}
-
static inline void local_flush_tlb(void)
{
__flush_tlb_up();
@@ -313,14 +307,11 @@ static inline void flush_tlb_kernel_range(unsigned long start,
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)

extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);

-#define flush_tlb() flush_tlb_current_task()
-
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a7655f6..92ec37f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -289,23 +289,6 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
smp_call_function_many(cpumask, flush_tlb_func, &info, 1);
}

-void flush_tlb_current_task(void)
-{
- struct mm_struct *mm = current->mm;
-
- preempt_disable();
-
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-
- /* This is an implicit full barrier that synchronizes with switch_mm. */
- local_flush_tlb();
-
- trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
- preempt_enable();
-}
-
/*
* See Documentation/x86/tlb.txt for details. We choose 33
* because it is large enough to cover the vast majority (at

Subject: [tip:x86/mm] x86/mm: Make flush_tlb_mm_range() more predictable

Commit-ID: ce27374fabf553153c3f53efcaa9bfab9216bd8c
Gitweb: http://git.kernel.org/tip/ce27374fabf553153c3f53efcaa9bfab9216bd8c
Author: Andy Lutomirski <[email protected]>
AuthorDate: Sat, 22 Apr 2017 00:01:21 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 26 Apr 2017 10:02:06 +0200

x86/mm: Make flush_tlb_mm_range() more predictable

I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.

The fix is simple: avoid short-circuiting the calculation of
base_pages_to_flush.

As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.

Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/tlb.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 92ec37f..9db9260 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -309,6 +309,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long base_pages_to_flush = TLB_FLUSH_ALL;

preempt_disable();
+
+ if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
+ base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+ if (base_pages_to_flush > tlb_single_page_flush_ceiling)
+ base_pages_to_flush = TLB_FLUSH_ALL;
+
if (current->active_mm != mm) {
/* Synchronize with switch_mm. */
smp_mb();
@@ -325,15 +331,11 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
goto out;
}

- if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
- base_pages_to_flush = (end - start) >> PAGE_SHIFT;
-
/*
* Both branches below are implicit full barriers (MOV to CR or
* INVLPG) that synchronize with switch_mm.
*/
- if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
- base_pages_to_flush = TLB_FLUSH_ALL;
+ if (base_pages_to_flush == TLB_FLUSH_ALL) {
count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
local_flush_tlb();
} else {

2017-04-26 23:56:37

by Nadav Amit

Subject: Re: [PATCH v3 1/4] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

It may be benign, but I don’t think that flushing the TLB without
holding the ptl or the mmap_sem (for no apparent reason) is a good
practice.

On 4/22/17, 12:01 AM, "Andy Lutomirski" <[email protected]> wrote:

mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.

Compile-tested only because I don't know whether software that uses
this mechanism even exists.

Cc: Rik van Riel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 23ee89ce59a9..3eda76b3c835 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -193,7 +193,7 @@ static void mark_screen_rdonly(struct mm_struct *mm)
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}


--
2.9.3




2017-04-27 00:02:50

by Nadav Amit

Subject: Re: [PATCH v3 1/4] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

And besides, it looks as if the code was meant to flush the entire
TLB in some cases (e.g., if pgd_none_or_clear_bad() is true).

On 4/26/17, 4:56 PM, "Nadav Amit" <[email protected]> wrote:

It may be benign, but I don’t think that flushing the TLB without
holding the ptl or the mmap_sem (for no apparent reason) is a good
practice.

On 4/22/17, 12:01 AM, "Andy Lutomirski" <[email protected]> wrote:

mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.

Compile-tested only because I don't know whether software that uses
this mechanism even exists.

Cc: Rik van Riel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 23ee89ce59a9..3eda76b3c835 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -193,7 +193,7 @@ static void mark_screen_rdonly(struct mm_struct *mm)
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}


--
2.9.3






2017-04-27 16:08:36

by Andy Lutomirski

Subject: Re: [PATCH v3 1/4] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

On Wed, Apr 26, 2017 at 5:02 PM, Nadav Amit <[email protected]> wrote:
> And besides, it looks as if the code was meant to flush the entire
> TLB in some cases (e.g., if pgd_none_or_clear_bad() is true).
>
> On 4/26/17, 4:56 PM, "Nadav Amit" <[email protected]> wrote:
>
> It may be benign, but I don’t think that flushing the TLB without
> holding the ptl or the mmap_sem (for no apparent reason) is a good
> practice.
>
> On 4/22/17, 12:01 AM, "Andy Lutomirski" <[email protected]> wrote:
>
> mark_screen_rdonly() is the last remaining caller of flush_tlb().
> flush_tlb_mm_range() is potentially faster and isn't obsolete.
>
> Compile-tested only because I don't know whether software that uses
> this mechanism even exists.
>
> Cc: Rik van Riel <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Nadav Amit <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Sasha Levin <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Signed-off-by: Andy Lutomirski <[email protected]>
> ---
> arch/x86/kernel/vm86_32.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
> index 23ee89ce59a9..3eda76b3c835 100644
> --- a/arch/x86/kernel/vm86_32.c
> +++ b/arch/x86/kernel/vm86_32.c
> @@ -193,7 +193,7 @@ static void mark_screen_rdonly(struct mm_struct *mm)
> pte_unmap_unlock(pte, ptl);
> out:
> up_write(&mm->mmap_sem);
> - flush_tlb();
> + flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
> }
>

Those should probably be pgd_none(), not pgd_none_or_clear_bad().

But this whole function is just garbage. It mucks with page
protections without even looking up the VMA. What happens if the
pages are file-backed? How about chardevs?

I'd like to delete it. Stas, do you know if there's any code at all
that uses VM86_SCREEN_BITMAP? Some Googling didn't turn any up at
all.

2017-04-27 22:17:33

by Stas Sergeev

Subject: Re: [PATCH v3 1/4] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

27.04.2017 19:08, Andy Lutomirski wrote:
> Those should probably be pgd_none(), not pgd_none_or_clear_bad().
>
> But this whole function is just garbage. It mucks with page
> protections without even looking up the VMA. What happens if the
> pages are file-backed? How about chardevs?
>
> I'd like to delete it. Stas, do you know if there's any code at all
> that uses VM86_SCREEN_BITMAP? Some Googling didn't turn any up at
> all.
dosemu1 has this:
https://sourceforge.net/p/dosemu/code/ci/master/tree/src/env/video/video.c
Scroll down to line 255.

---

#if VIDEO_CHECK_DIRTY
if (!config_dualmon) {
vm86s.flags |= VM86_SCREEN_BITMAP;
}
#endif

---


The check expands to "if 0":
https://sourceforge.net/p/dosemu/code/ci/master/tree/src/include/video.h
line 27:

---

#define VIDEO_CHECK_DIRTY 0

---

Plus, in video.c you can see the comment that basically says that this
functionality was of no use (not sure what exactly they were saying
though). dosemu2 has no traces of this code at all.

So it's perfectly fine with me if you remove it. In fact, I've cleaned up
dosemu2 from any fancy vm86() stuff, so probably more cleanups are
possible on the kernel side. I even wanted to switch to vm86old() if not
for the very nasty bug that vm86old() generates SIGTRAP when int3 is
called in v86. If that is fixed (and it's a 1-line fix), we can remove
the entire vm86(). :)