2012-10-01 13:52:26

by Will Deacon

Subject: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.

From: Steve Capper <[email protected]>

On x86, memory accesses to pages without the ACCESSED flag set result in
the ACCESSED flag being set automatically by the hardware. On the ARM
architecture, a page access fault is raised instead (and it will continue
to be raised until the ACCESSED flag is set for the appropriate PTE/PMD).

For normal memory pages, handle_pte_fault will call pte_mkyoung (effectively
setting the ACCESSED flag). For transparent huge pages, pmd_mkyoung will only
be called for a write fault.

This patch ensures that faults on transparent hugepages which do not result
in a CoW update the access flags for the faulting pmd.

Cc: Andrea Arcangeli <[email protected]>
Cc: Chris Metcalf <[email protected]>
Signed-off-by: Steve Capper <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---

Hello again,

This is another fix for an issue we discovered when porting THP to
ARM; it somehow managed to slip through the cracks.

Will
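As an aside, the read-fault loop the patch avoids can be modelled in a few
lines of plain userspace C. This is only an illustrative sketch, not kernel
code: struct entry, handle_fault and access_entry are hypothetical names, and
F_ACCESSED stands in for the hardware ACCESSED flag. A handler that only
marks the entry young on write faults (the pre-patch THP behaviour in the
model) makes no progress on a read access, so the same fault fires forever;
the patched handler sets the flag for read faults too.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an access-flag fault: an entry with the ACCESSED flag
 * clear faults on every access until a handler sets the flag. */

#define F_ACCESSED 0x1u

struct entry { unsigned flags; };

/* Fault handler. When 'patched' is false it only marks the entry
 * young on write faults, mirroring the pre-patch THP path. Returns
 * true if it made progress (set the flag). */
static bool handle_fault(struct entry *e, bool write, bool patched)
{
	if (write || patched) {
		e->flags |= F_ACCESSED;	/* the pmd_mkyoung step in the model */
		return true;
	}
	return false;			/* nothing changed */
}

/* Returns the number of faults taken before the access succeeds,
 * or -1 if the handler makes no progress and the fault repeats. */
static int access_entry(struct entry *e, bool write, bool patched)
{
	for (int faults = 0; faults < 10; faults++) {
		if (e->flags & F_ACCESSED)
			return faults;	/* access finally succeeds */
		if (!handle_fault(e, write, patched))
			return -1;	/* livelock: same fault again */
	}
	return -1;
}
```

In this model a write access converges after one fault either way, but a read
access only converges once the handler also sets the flag on read faults.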

 include/linux/huge_mm.h |    3 +++
 mm/huge_memory.c        |   12 ++++++++++++
 mm/memory.c             |    4 ++++
 3 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4c59b11..bbc62ad 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -8,6 +8,9 @@ extern int do_huge_pmd_anonymous_page(struct mm_struct *mm,
 extern int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                         pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
                         struct vm_area_struct *vma);
+extern void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma,
+                                 unsigned long address, pmd_t *pmd,
+                                 pmd_t orig_pmd);
 extern int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
                               unsigned long address, pmd_t *pmd,
                               pmd_t orig_pmd);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d684934..ee9cc3b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -914,6 +914,18 @@ out_free_pages:
        goto out;
 }
 
+void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma,
+                          unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
+{
+       pmd_t entry;
+
+       spin_lock(&mm->page_table_lock);
+       entry = pmd_mkyoung(orig_pmd);
+       if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0))
+               update_mmu_cache(vma, address, pmd);
+       spin_unlock(&mm->page_table_lock);
+}
+
 int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
                        unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
 {
diff --git a/mm/memory.c b/mm/memory.c
index 5736170..d5c007d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3537,7 +3537,11 @@ retry:
                        if (unlikely(ret & VM_FAULT_OOM))
                                goto retry;
                        return ret;
+               } else {
+                       huge_pmd_set_accessed(mm, vma, address, pmd,
+                                             orig_pmd);
                }
+
                return 0;
        }
 }
--
1.7.4.1


2012-10-01 14:02:36

by Kirill A. Shutemov

Subject: Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.

On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index 5736170..d5c007d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3537,7 +3537,11 @@ retry:
>                       if (unlikely(ret & VM_FAULT_OOM))
>                               goto retry;
>                       return ret;
> +             } else {
> +                     huge_pmd_set_accessed(mm, vma, address, pmd,
> +                                           orig_pmd);

I think putting it in the 'else' branch is wrong. You should not touch
the pmd if it's under splitting.

>               }
> +
>               return 0;
>       }
> }

--
Kirill A. Shutemov

2012-10-01 15:00:03

by Andrea Arcangeli

Subject: Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.

Hi Will,

On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote:
> +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma,
> +                        unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
> +{
> +     pmd_t entry;
> +
> +     spin_lock(&mm->page_table_lock);
> +     entry = pmd_mkyoung(orig_pmd);
> +     if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0))
> +             update_mmu_cache(vma, address, pmd);

If the pmd is being split, this may not be a transhuge pmd anymore
by the time you obtain the lock (orig_pmd could be stale, and it
wasn't verified with pmd_same either).

The lock should be obtained through pmd_trans_huge_lock.

	if (pmd_trans_huge_lock(orig_pmd, vma) == 1)
	{
		set young bit
		spin_unlock(&mm->page_table_lock);
	}
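Filling that pseudocode out, the revalidation pattern can be modelled in
plain userspace C. This is a toy sketch, not the kernel implementation:
pmd_model, pmd_same_model and set_accessed_model are hypothetical names, and
the lock calls are reduced to comments. 'snapshot' plays the role of
orig_pmd, read before the lock is taken, while 'live' is the real entry,
which may have changed in the meantime (e.g. because the huge pmd is being
split), so the update is applied only if the live entry still matches.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of checking a pre-lock snapshot against the live entry
 * before modifying it, as required when the pmd may be split. */

#define YOUNG 0x1ul

typedef struct { unsigned long val; } pmd_model;

static bool pmd_same_model(pmd_model a, pmd_model b)
{
	return a.val == b.val;
}

/* Returns true if the young bit was set, false if the snapshot was
 * stale and the update was skipped. */
static bool set_accessed_model(pmd_model *live, pmd_model snapshot)
{
	/* ...spin_lock(&mm->page_table_lock) would be taken here... */
	if (!pmd_same_model(*live, snapshot))
		return false;	/* entry changed under us: bail out */
	live->val |= YOUNG;	/* the pmd_mkyoung step in the model */
	/* ...spin_unlock(&mm->page_table_lock)... */
	return true;
}
```

An unchanged entry gets the young bit; an entry modified after the snapshot
was taken is detected by the comparison and left alone.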


On x86:

int pmdp_set_access_flags(struct vm_area_struct *vma,
			  unsigned long address, pmd_t *pmdp,
			  pmd_t entry, int dirty)
{
	int changed = !pmd_same(*pmdp, entry);

	VM_BUG_ON(address & ~HPAGE_PMD_MASK);

	if (changed && dirty) {
		*pmdp = entry;

with dirty == 0 it looks like it won't make any difference, but I
guess your ARM pmdp_set_access_flags is different.

However it seems "dirty" means write access and so the invocation
would better match the pte case:

	if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry,
				  flags & FAULT_FLAG_WRITE))


But note, you still have to update it even when "dirty" == 0, or it'll
still infinite loop for read accesses.
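That difference can be shown with a simplified plain-C contrast (a toy model
with hypothetical names, x86_style and generic_style, not the kernel
helpers): the x86-style helper only stores the new entry when both 'changed'
and 'dirty' are set, while the generic-style helper stores whenever
'changed' is set and ignores 'dirty'. With dirty == 0 the x86-style helper
leaves the old entry in place, which in the model is the read fault looping.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified contrast of the two pmdp_set_access_flags behaviours
 * discussed above, over a bare unsigned long "entry". */

static bool x86_style(unsigned long *entry, unsigned long newval, bool dirty)
{
	bool changed = (*entry != newval);

	if (changed && dirty)
		*entry = newval;	/* stores only for write faults */
	return changed;
}

static bool generic_style(unsigned long *entry, unsigned long newval, bool dirty)
{
	bool changed = (*entry != newval);

	(void)dirty;			/* the generic version ignores the hint */
	if (changed)
		*entry = newval;	/* stores for read faults too */
	return changed;
}
```

Starting from an old entry, an x86-style call with dirty == 0 leaves it
unmodified, while the generic-style call updates it even for a read.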

> +	spin_unlock(&mm->page_table_lock);
> +}
> +
>  int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  			unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
>  {
> diff --git a/mm/memory.c b/mm/memory.c
> index 5736170..d5c007d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3537,7 +3537,11 @@ retry:
>  			if (unlikely(ret & VM_FAULT_OOM))
>  				goto retry;
>  			return ret;
> +		} else {
> +			huge_pmd_set_accessed(mm, vma, address, pmd,
> +					      orig_pmd);
>  		}
> +
>  		return 0;

Thanks,
Andrea

2012-10-01 16:33:18

by Will Deacon

Subject: Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.

On Mon, Oct 01, 2012 at 03:59:44PM +0100, Andrea Arcangeli wrote:
> Hi Will,

Hi Andrea, Kirill,

Thanks for the comments.

> On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote:
> > +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma,
> > +			   unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
> > +{
> > +	pmd_t entry;
> > +
> > +	spin_lock(&mm->page_table_lock);
> > +	entry = pmd_mkyoung(orig_pmd);
> > +	if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0))
> > +		update_mmu_cache(vma, address, pmd);
>
> If the pmd is being split, this may not be a transhuge pmd anymore
> by the time you obtain the lock (orig_pmd could be stale, and it
> wasn't verified with pmd_same either).
>
> The lock should be obtained through pmd_trans_huge_lock.
>
> 	if (pmd_trans_huge_lock(orig_pmd, vma) == 1)
> 	{
> 		set young bit
> 		spin_unlock(&mm->page_table_lock);
> 	}

I didn't notice that -- thanks. I'll move the locking outside of the
_set_accessed function and direct it via that function instead.

> On x86:
>
> int pmdp_set_access_flags(struct vm_area_struct *vma,
> 			  unsigned long address, pmd_t *pmdp,
> 			  pmd_t entry, int dirty)
> {
> 	int changed = !pmd_same(*pmdp, entry);
>
> 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> 	if (changed && dirty) {
> 		*pmdp = entry;
>
> with dirty == 0 it looks like it won't make any difference, but I
> guess your ARM pmdp_set_access_flags is different.

We use the generic code, which ignores the dirty argument. Still, we
should pass the correct value through, so I'll fix that too.

> However it seems "dirty" means write access and so the invocation
> would better match the pte case:
>
> 	if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry,
> 				  flags & FAULT_FLAG_WRITE))
>
>
> But note, you still have to update it even when "dirty" == 0, or it'll
> still infinite loop for read accesses.

Yup. v2 to follow once we've re-run our testing.

Cheers,

Will

2012-10-16 09:38:01

by Ni zhan Chen

Subject: Re: [PATCH] mm: thp: Set the accessed flag for old pages on access fault.

On 10/01/2012 10:59 PM, Andrea Arcangeli wrote:
> Hi Will,
>
> On Mon, Oct 01, 2012 at 02:51:45PM +0100, Will Deacon wrote:
>> +void huge_pmd_set_accessed(struct mm_struct *mm, struct vm_area_struct *vma,
>> +			   unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
>> +{
>> +	pmd_t entry;
>> +
>> +	spin_lock(&mm->page_table_lock);
>> +	entry = pmd_mkyoung(orig_pmd);
>> +	if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry, 0))
>> +		update_mmu_cache(vma, address, pmd);
> If the pmd is being split, this may not be a transhuge pmd anymore
> by the time you obtain the lock (orig_pmd could be stale, and it
> wasn't verified with pmd_same either).

Could you tell me when pmd_same should be called, in general?

>
> The lock should be obtained through pmd_trans_huge_lock.
>
> 	if (pmd_trans_huge_lock(orig_pmd, vma) == 1)
> 	{
> 		set young bit
> 		spin_unlock(&mm->page_table_lock);
> 	}
>
>
> On x86:
>
> int pmdp_set_access_flags(struct vm_area_struct *vma,
> 			  unsigned long address, pmd_t *pmdp,
> 			  pmd_t entry, int dirty)
> {
> 	int changed = !pmd_same(*pmdp, entry);
>
> 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> 	if (changed && dirty) {
> 		*pmdp = entry;
>
> with dirty == 0 it looks like it won't make any difference, but I
> guess your ARM pmdp_set_access_flags is different.
>
> However it seems "dirty" means write access and so the invocation
> would better match the pte case:
>
> 	if (pmdp_set_access_flags(vma, address & HPAGE_PMD_MASK, pmd, entry,
> 				  flags & FAULT_FLAG_WRITE))
>
>
> But note, you still have to update it even when "dirty" == 0, or it'll
> still infinite loop for read accesses.
>
>> +	spin_unlock(&mm->page_table_lock);
>> +}
>> +
>>  int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
>>  			unsigned long address, pmd_t *pmd, pmd_t orig_pmd)
>>  {
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 5736170..d5c007d 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3537,7 +3537,11 @@ retry:
>>  			if (unlikely(ret & VM_FAULT_OOM))
>>  				goto retry;
>>  			return ret;
>> +		} else {
>> +			huge_pmd_set_accessed(mm, vma, address, pmd,
>> +					      orig_pmd);
>>  		}
>> +
>>  		return 0;
> Thanks,
> Andrea
>