2014-12-02 09:27:43

by Wang, Yalin

[permalink] [raw]
Subject: [RFC V2] mm:add zero_page _mapcount when mapped into user space

This patch add/dec zero_page's _mapcount to make sure
the mapcount is correct for zero_page,
so that when read from /proc/kpagecount, zero_page's
mapcount is also correct, userspace process like procrank can
calculate PSS correctly.

Signed-off-by: Yalin Wang <[email protected]>
---
mm/memory.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 3e50383..7215423 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2198,6 +2198,7 @@ gotten:
new_page = alloc_zeroed_user_highpage_movable(vma, address);
if (!new_page)
goto oom;
+ atomic_dec(&pfn_to_page(pte_pfn(orig_pte))->_mapcount);
} else {
new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
if (!new_page)
@@ -2647,6 +2648,7 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
if (!pte_none(*page_table))
goto unlock;
+ atomic_inc(&pfn_to_page(my_zero_pfn(address))->_mapcount);
goto setpte;
}

--
2.1.3


2014-12-02 11:30:49

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user space

On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> This patch add/dec zero_page's _mapcount to make sure
> the mapcount is correct for zero_page,
> so that when read from /proc/kpagecount, zero_page's
> mapcount is also correct, userspace process like procrank can
> calculate PSS correctly.

I don't have specific code path to point to, but I would expect zero page
with non-zero mapcount would cause a problem with rmap.

How do you test the change?

--
Kirill A. Shutemov

2014-12-04 06:10:58

by Wang, Yalin

[permalink] [raw]
Subject: RE: [RFC V2] mm:add zero_page _mapcount when mapped into user space

> -----Original Message-----
> From: Kirill A. Shutemov [mailto:[email protected]]
> Sent: Tuesday, December 02, 2014 7:30 PM
> To: Wang, Yalin
> Cc: '[email protected]'; '[email protected]'; 'linux-arm-
> [email protected]'
> Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> space
>
> On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> > This patch add/dec zero_page's _mapcount to make sure the mapcount is
> > correct for zero_page, so that when read from /proc/kpagecount,
> > zero_page's mapcount is also correct, userspace process like procrank
> > can calculate PSS correctly.
>
> I don't have specific code path to point to, but I would expect zero page
> with non-zero mapcount would cause a problem with rmap.
>
> How do you test the change?
>
I just test it to see the mapcount from /proc/pid/pagemap and /proc/kpagecount ,
It works well,
The problem is that when I see /proc/pid/smaps ,
The Rss / Pss don't calculate zero_page map,
Because smaps_pte_entry() --> vm_normal_page( ),
Will return NULL for zero_page,

But when userspace process cat /proc/pid/pagemap ,
It will see zero_page mapped,
And will treat as Rss ,
This is weird, should we also omit zero_page in /proc/pid/pagemap ?
Or add zero_page as Rss in /proc/pid/smaps ?

I think we should add zero_page into Rss ,
Because it is really mapped into userspace address space.
And will let userspace memory analysis more accurate .

Thanks




2014-12-04 12:28:52

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user space

On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Kirill A. Shutemov [mailto:[email protected]]
> > Sent: Tuesday, December 02, 2014 7:30 PM
> > To: Wang, Yalin
> > Cc: '[email protected]'; '[email protected]'; 'linux-arm-
> > [email protected]'
> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> > space
> >
> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> > > This patch add/dec zero_page's _mapcount to make sure the mapcount is
> > > correct for zero_page, so that when read from /proc/kpagecount,
> > > zero_page's mapcount is also correct, userspace process like procrank
> > > can calculate PSS correctly.
> >
> > I don't have specific code path to point to, but I would expect zero page
> > with non-zero mapcount would cause a problem with rmap.
> >
> > How do you test the change?
> >
> I just test it to see the mapcount from /proc/pid/pagemap and /proc/kpagecount ,
> It works well,

I took a closer look and your patch is broken in multiple places:
- on zap_pte_range() you don't decrement mapcount;
- you don't update rss counters for mm;
- copy_one_pte() doesn't increase mapcount;
- ...

Basically, each and every vm_normal_page() call must be audited. As first
step. And you totally skip huge zero page.

Proper mapcount handling for zero page would require a lot more work and I
don't think it worth it. Gain is too small.

NAK.

> The problem is that when I see /proc/pid/smaps ,
> The Rss / Pss don't calculate zero_page map,
> Because smaps_pte_entry() --> vm_normal_page( ),
> Will return NULL for zero_page,
>
> But when userspace process cat /proc/pid/pagemap ,
> It will see zero_page mapped,
> And will treat as Rss ,
> This is weird, should we also omit zero_page in /proc/pid/pagemap ?
> Or add zero_page as Rss in /proc/pid/smaps ?
>
> I think we should add zero_page into Rss ,
> Because it is really mapped into userspace address space.
> And will let userspace memory analysis more accurate .

It would be easier for userspace to find out pfn of zero page and take it
into account.

Note: some architectures have multiple zero page due to coloring.

--
Kirill A. Shutemov

2014-12-05 06:39:09

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user space

On Thu, Dec 4, 2014 at 3:28 PM, Kirill A. Shutemov <[email protected]> wrote:
> On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
>> > -----Original Message-----
>> > From: Kirill A. Shutemov [mailto:[email protected]]
>> > Sent: Tuesday, December 02, 2014 7:30 PM
>> > To: Wang, Yalin
>> > Cc: '[email protected]'; '[email protected]'; 'linux-arm-
>> > [email protected]'
>> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
>> > space
>> >
>> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
>> > > This patch add/dec zero_page's _mapcount to make sure the mapcount is
>> > > correct for zero_page, so that when read from /proc/kpagecount,
>> > > zero_page's mapcount is also correct, userspace process like procrank
>> > > can calculate PSS correctly.

Instead of tweaking mapcount you could mark zero-pages in /proc/kpageflags
and handle them accordingly in userspace. Or mark zero pages with special
magic _mapcount and detect it in /proc/kpagecount.

>> >
>> > I don't have specific code path to point to, but I would expect zero page
>> > with non-zero mapcount would cause a problem with rmap.
>> >
>> > How do you test the change?
>> >
>> I just test it to see the mapcount from /proc/pid/pagemap and /proc/kpagecount ,
>> It works well,
>
> I took a closer look and your patch is broken in multiple places:
> - on zap_pte_range() you don't decrement mapcount;
> - you don't update rss counters for mm;
> - copy_one_pte() doesn't increase mapcount;
> - ...
>
> Basically, each and every vm_normal_page() call must be audited. As first
> step. And you totally skip huge zero page.
>
> Proper mapcount handling for zero page would require a lot more work and I
> don't think it worth it. Gain is too small.
>
> NAK.
>
>> The problem is that when I see /proc/pid/smaps ,
>> The Rss / Pss don't calculate zero_page map,
>> Because smaps_pte_entry() --> vm_normal_page( ),
>> Will return NULL for zero_page,
>>
>> But when userspace process cat /proc/pid/pagemap ,
>> It will see zero_page mapped,
>> And will treat as Rss ,
>> This is weird, should we also omit zero_page in /proc/pid/pagemap ?
>> Or add zero_page as Rss in /proc/pid/smaps ?
>>
>> I think we should add zero_page into Rss ,
>> Because it is really mapped into userspace address space.
>> And will let userspace memory analysis more accurate .
>
> It would be easier for userspace to find out pfn of zero page and take it
> into account.
>
> Note: some architectures have multiple zero page due to coloring.
>
> --
> Kirill A. Shutemov
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>

2014-12-05 08:08:48

by Wang, Yalin

[permalink] [raw]
Subject: RE: [RFC V2] mm:add zero_page _mapcount when mapped into user space

> -----Original Message-----
> From: Konstantin Khlebnikov [mailto:[email protected]]
> Sent: Friday, December 05, 2014 2:39 PM
> To: Kirill A. Shutemov
> Cc: Wang, Yalin; [email protected]; [email protected]; linux-
> [email protected]
> Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> space
>
> On Thu, Dec 4, 2014 at 3:28 PM, Kirill A. Shutemov <[email protected]>
> wrote:
> > On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
> >> > -----Original Message-----
> >> > From: Kirill A. Shutemov [mailto:[email protected]]
> >> > Sent: Tuesday, December 02, 2014 7:30 PM
> >> > To: Wang, Yalin
> >> > Cc: '[email protected]'; '[email protected]';
> >> > 'linux-arm- [email protected]'
> >> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into
> >> > user space
> >> >
> >> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> >> > > This patch add/dec zero_page's _mapcount to make sure the
> >> > > mapcount is correct for zero_page, so that when read from
> >> > > /proc/kpagecount, zero_page's mapcount is also correct, userspace
> >> > > process like procrank can calculate PSS correctly.
>
> Instead of tweaking mapcount you could mark zero-pages in /proc/kpageflags
> and handle them accordingly in userspace. Or mark zero pages with special
> magic _mapcount and detect it in /proc/kpagecount.
>
I think add KPF_ZERO_PAGE in kpageflags is better.
I will make another patch for review .

Thanks
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?