2006-10-20 02:47:52

by yunfeng zhang

Subject: BUG: about flush TLB during unmapping a page in memory subsystem

In rmap.c::try_to_unmap_one() of 2.6.16.29, there is the following code:

.....
/* Nuke the page table entry. */
flush_cache_page(vma, address, page_to_pfn(page));
pteval = ptep_clear_flush(vma, address, pte);
// >>> The above line is expanded as below
// >>> pte_t __pte;
// >>> __pte = ptep_get_and_clear((__vma)->vm_mm, __address, __ptep);
// >>> flush_tlb_page(__vma, __address);
// >>> __pte;

/* Move the dirty bit to the physical page now the pte is gone. */
if (pte_dirty(pteval))
	set_page_dirty(page);
.....


It seems that this can only work on a UP system.

On SMP, suppose the pte was clean. After CPU A executes ptep_get_and_clear,
CPU B makes the pte dirty. Isn't that a fatal error for CPU A, since it now
holds a stale pte?


2006-10-20 03:02:11

by David Miller

Subject: Re: BUG: about flush TLB during unmapping a page in memory subsystem

From: "yunfeng zhang" <[email protected]>
Date: Fri, 20 Oct 2006 10:47:49 +0800

> In rmap.c::try_to_unmap_one() of 2.6.16.29, there is the following code:
>
> .....
> /* Nuke the page table entry. */
> flush_cache_page(vma, address, page_to_pfn(page));
> pteval = ptep_clear_flush(vma, address, pte);
> // >>> The above line is expanded as below
> // >>> pte_t __pte;
> // >>> __pte = ptep_get_and_clear((__vma)->vm_mm, __address, __ptep);
> // >>> flush_tlb_page(__vma, __address);
> // >>> __pte;
>
> /* Move the dirty bit to the physical page now the pte is gone. */
> if (pte_dirty(pteval))
> 	set_page_dirty(page);
> .....
>
>
> It seems that this can only work on a UP system.
>
> On SMP, suppose the pte was clean. After CPU A executes ptep_get_and_clear,
> CPU B makes the pte dirty. Isn't that a fatal error for CPU A, since it now
> holds a stale pte?

B can't make it dirty because it's been cleared to zero
and flush_tlb_page() has removed the TLB cached copy of
the PTE. B can therefore only see the new cleared PTE.
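
The point about ptep_get_and_clear can be illustrated with a small
user-space model. This is only a sketch built on C11 atomics, not the
kernel's implementation: the model_* names and the bit positions are
made up for illustration, and the TLB flush is not modeled at all. It
shows that when fetching the old entry and zeroing it happen as one
atomic exchange, any dirty bit set before the exchange comes back in
the returned value, and anything that looks at the entry afterwards
sees only zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only; not a real PTE format. */
#define MODEL_PTE_PRESENT ((uint64_t)1 << 0)
#define MODEL_PTE_DIRTY   ((uint64_t)1 << 6)

typedef _Atomic uint64_t model_pte_t;

/*
 * Model of get-and-clear: one atomic exchange returns the old entry
 * and leaves zero behind, so there is no window between "get" and
 * "clear" in which another thread could modify the old value unseen.
 */
static uint64_t model_ptep_get_and_clear(model_pte_t *ptep)
{
	assert(ptep != NULL);
	return atomic_exchange(ptep, (uint64_t)0);
}

/*
 * Model of the caller's check: clear the entry, then report whether
 * the value that was removed carried the dirty bit.
 */
static int model_unmap_sees_dirty(model_pte_t *ptep)
{
	uint64_t old = model_ptep_get_and_clear(ptep);

	return (old & MODEL_PTE_DIRTY) != 0;
}
```

In this model a dirty bit can end up in exactly one of two places: in
the value handed back to the unmapping CPU, or nowhere, because the
entry was already zero when the other CPU looked at it.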

2006-10-20 05:12:09

by yunfeng zhang

Subject: Re: BUG: about flush TLB during unmapping a page in memory subsystem

Maybe the solution is the following:

...
// >>> ptep_clear((__vma)->vm_mm, __address, __ptep);
// >>> flush_tlb_page(__vma, __address);
// >>> __ptep;
...

Even so, we would still get an odd pte, with present = 0 and dirty = 1.

Remember that B dirtied the pte before A executed flush_tlb_page.

2006-10-20 05:54:36

by Benjamin Herrenschmidt

Subject: Re: BUG: about flush TLB during unmapping a page in memory subsystem

On Fri, 2006-10-20 at 13:10 +0800, yunfeng zhang wrote:
> Maybe the solution is the following:
>
> ...
> // >>> ptep_clear((__vma)->vm_mm, __address, __ptep);
> // >>> flush_tlb_page(__vma, __address);
> // >>> __ptep;
> ...
>
> Even so, we would still get an odd pte, with present = 0 and dirty = 1.
>
> Remember that B dirtied the pte before A executed flush_tlb_page.

It's very much architecture specific. I suppose x86 hardware must
atomically check that the PTE is still present when setting the dirty
bit, but I can't tell for sure :)

On PowerPC, we don't use HW dirty bits; we track dirty state in SW, so
ptep_get_and_clear is enough to prevent the dirty bit from being set
afterwards.

Ben.
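
The "check present atomically when setting the dirty bit" idea can be
sketched in user space as a compare-and-swap loop. This is only a model
of the behaviour supposed above, not a statement of what any particular
CPU does: the sketch_* names and bit positions are hypothetical, and C11
atomics stand in for whatever the hardware or a software fault handler
would really use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only; not a real PTE format. */
#define SKETCH_PTE_PRESENT ((uint64_t)1 << 0)
#define SKETCH_PTE_DIRTY   ((uint64_t)1 << 6)

typedef _Atomic uint64_t sketch_pte_t;

/*
 * Set the dirty bit only if the entry is still present at the moment
 * of the update. Returns true if the bit was set, false if the entry
 * had already been cleared (a real writer would take a fault instead).
 */
static bool sketch_set_dirty_if_present(sketch_pte_t *ptep)
{
	uint64_t old;

	assert(ptep != NULL);
	old = atomic_load(ptep);
	while (old & SKETCH_PTE_PRESENT) {
		/* Succeeds only if the entry still equals 'old'. */
		if (atomic_compare_exchange_weak(ptep, &old,
						 old | SKETCH_PTE_DIRTY))
			return true;
		/* On failure 'old' was reloaded; re-check the present bit. */
	}
	return false;
}
```

With an update of this shape, an entry that another CPU has already
cleared can never end up as present = 0 with dirty = 1, because the
update is refused as soon as the present bit is gone.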