I am not sure we have to account zero page with file_rss.
Hugh and Kame's new zero page doesn't do it.
As side effect of this, we can prevent innocent process which have a lot
of zero page when OOM happens.
(But I am not sure there is a process like this :)
So I think not file_rss counting is not bad.
RSS counting zero page with file_rss helps any program using smaps?
If we have to keep the old behavior, I have to remake this patch.
== CUT_HERE ==
Long time ago, We regards zero page as file_rss and
vm_normal_page doesn't return NULL.
But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
can return NULL in case of zero page. Also we don't count it with
file_rss any more.
Then, RSS and PSS can't be matched.
For consistency, Let's ignore zero page in smaps_pte_range.
CC: Matt Mackall <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
---
fs/proc/task_mmu.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 47c03f4..f277c4a 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -361,12 +361,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
if (!pte_present(ptent))
continue;
- mss->resident += PAGE_SIZE;
-
page = vm_normal_page(vma, addr, ptent);
if (!page)
continue;
+ mss->resident += PAGE_SIZE;
/* Accumulate the size in pages that have been accessed. */
if (pte_young(ptent) || PageReferenced(page))
mss->referenced += PAGE_SIZE;
--
1.5.6.3
--
Kind regards,
Minchan Kim
On Mon, 28 Dec 2009 13:46:19 +0900
Minchan Kim <[email protected]> wrote:
>
> I am not sure we have to account zero page with file_rss.
> Hugh and Kame's new zero page doesn't do it.
> As side effect of this, we can prevent innocent process which have a lot
> of zero page when OOM happens.
> (But I am not sure there is a process like this :)
> So I think not file_rss counting is not bad.
>
> RSS counting zero page with file_rss helps any program using smaps?
> If we have to keep the old behavior, I have to remake this patch.
>
> == CUT_HERE ==
>
> Long time ago, We regards zero page as file_rss and
> vm_normal_page doesn't return NULL.
>
> But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
> can return NULL in case of zero page. Also we don't count it with
> file_rss any more.
>
> Then, RSS and PSS can't be matched.
> For consistency, Let's ignore zero page in smaps_pte_range.
>
> CC: Matt Mackall <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
BTW, how about counting ZERO page in smaps ? Ignoring them completely sounds
not very good.
Thanks,
-Kame
> ---
> fs/proc/task_mmu.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 47c03f4..f277c4a 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -361,12 +361,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> if (!pte_present(ptent))
> continue;
>
> - mss->resident += PAGE_SIZE;
> -
> page = vm_normal_page(vma, addr, ptent);
> if (!page)
> continue;
>
> + mss->resident += PAGE_SIZE;
> /* Accumulate the size in pages that have been accessed. */
> if (pte_young(ptent) || PageReferenced(page))
> mss->referenced += PAGE_SIZE;
> --
> 1.5.6.3
>
>
>
> --
> Kind regards,
> Minchan Kim
>
Hi, Kame.
On Mon, 28 Dec 2009 13:47:52 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:
> On Mon, 28 Dec 2009 13:46:19 +0900
> Minchan Kim <[email protected]> wrote:
>
> >
> > I am not sure we have to account zero page with file_rss.
> > Hugh and Kame's new zero page doesn't do it.
> > As side effect of this, we can prevent innocent process which have a lot
> > of zero page when OOM happens.
> > (But I am not sure there is a process like this :)
> > So I think not file_rss counting is not bad.
> >
> > RSS counting zero page with file_rss helps any program using smaps?
> > If we have to keep the old behavior, I have to remake this patch.
> >
> > == CUT_HERE ==
> >
> > Long time ago, We regards zero page as file_rss and
> > vm_normal_page doesn't return NULL.
> >
> > But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
> > can return NULL in case of zero page. Also we don't count it with
> > file_rss any more.
> >
> > Then, RSS and PSS can't be matched.
> > For consistency, Let's ignore zero page in smaps_pte_range.
> >
> > CC: Matt Mackall <[email protected]>
> > Signed-off-by: Minchan Kim <[email protected]>
>
> Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Thanks for ACK. :)
>
> BTW, how about counting ZERO page in smaps? Ignoring them completely sounds
> not very good.
I am not use it is useful.
zero page snapshot of ongoing process is useful?
Doesn't Admin need to know about zero page?
Let's admins use it well. If we remove zero page again?
How many are applications use smaps?
Did we have a problem without it?
When I think of it, there are too many qeustions.
Most important thing to add new statistics is just need of customer.
Frankly speaking, I don't have good scenario of using zero page.
Do you have any scenario it is valueable?
>
> Thanks,
> -Kame
--
Kind regards,
Minchan Kim
On Mon, 28 Dec 2009 14:31:54 +0900
Minchan Kim <[email protected]> wrote:
>
> > BTW, how about counting ZERO page in smaps? Ignoring them completely sounds
> > not very good.
>
> I am not use it is useful.
>
> zero page snapshot of ongoing process is useful?
> Doesn't Admin need to know about zero page?
> Let's admins use it well. If we remove zero page again?
> How many are applications use smaps?
> Did we have a problem without it?
>
My concern is that hiding indormation which was exported before.
No more than that and no strong demand.
> When I think of it, there are too many qeustions.
> Most important thing to add new statistics is just need of customer.
>
> Frankly speaking, I don't have good scenario of using zero page.
> Do you have any scenario it is valueable?
>
read before write ? maybe sometimes happens.
For example. current glibc's calloc() avoids memset() if the pages are
dropped by MADVISE (without unmap).
Before starting zero-page works, I checked "questions" in lkml and
found some reports that some applications start to go OOM after zero-page
removal.
For me, I know one of my customer's application depends on behavior of
zero page (on RHEL5). So, I tried to add again it before RHEL6 because
I think removal of zero-page corrupts compatibility.
Thanks,
-Kame
On Mon, Dec 28, 2009 at 2:43 PM, KAMEZAWA Hiroyuki
<[email protected]> wrote:
> On Mon, 28 Dec 2009 14:31:54 +0900
> Minchan Kim <[email protected]> wrote:
>>
>> > BTW, how about counting ZERO page in smaps? Ignoring them completely sounds
>> > not very good.
>>
>> I am not use it is useful.
>>
>> zero page snapshot of ongoing process is useful?
>> Doesn't Admin need to know about zero page?
>> Let's admins use it well. If we remove zero page again?
>> How many are applications use smaps?
>> Did we have a problem without it?
>>
> My concern is that hiding indormation which was exported before.
> No more than that and no strong demand.
>
>
>> When I think of it, there are too many qeustions.
>> Most important thing to add new statistics is just need of customer.
>>
>> Frankly speaking, I don't have good scenario of using zero page.
>> Do you have any scenario it is valueable?
>>
> read before write ? maybe sometimes happens.
>
> For example. current glibc's calloc() avoids memset() if the pages are
> dropped by MADVISE (without unmap).
>
> Before starting zero-page works, I checked "questions" in lkml and
> found some reports that some applications start to go OOM after zero-page
> removal.
>
> For me, I know one of my customer's application depends on behavior of
> zero page (on RHEL5). So, I tried to add again it before RHEL6 because
> I think removal of zero-page corrupts compatibility.
>
Okay. I will repost the patch.
> Thanks,
> -Kame
>
>
--
Kind regards,
Minchan Kim
On Mon, 2009-12-28 at 13:46 +0900, Minchan Kim wrote:
> I am not sure we have to account zero page with file_rss.
> Hugh and Kame's new zero page doesn't do it.
> As side effect of this, we can prevent innocent process which have a lot
> of zero page when OOM happens.
> (But I am not sure there is a process like this :)
> So I think not file_rss counting is not bad.
>
> RSS counting zero page with file_rss helps any program using smaps?
> If we have to keep the old behavior, I have to remake this patch.
>
> == CUT_HERE ==
>
> Long time ago, We regards zero page as file_rss and
> vm_normal_page doesn't return NULL.
>
> But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
> can return NULL in case of zero page. Also we don't count it with
> file_rss any more.
>
> Then, RSS and PSS can't be matched.
> For consistency, Let's ignore zero page in smaps_pte_range.
>
Not counting the zero page in RSS is fine with me. But will this patch
make the total from smaps agree with get_mm_rss()?
Regarding OOM handling: arguably RSS should play no role in OOM as it's
practically meaningless in a shared memory system. If we were instead
used per-process unshared pages as the metric (aka USS), we'd have a
much better notion of how much memory an OOM kill would recover.
Unfortunately, that's not trivial to track as the accounting on COW
operations is not lightweight.
> CC: Matt Mackall <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>
> ---
> fs/proc/task_mmu.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 47c03f4..f277c4a 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -361,12 +361,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> if (!pte_present(ptent))
> continue;
>
> - mss->resident += PAGE_SIZE;
> -
> page = vm_normal_page(vma, addr, ptent);
> if (!page)
> continue;
>
> + mss->resident += PAGE_SIZE;
> /* Accumulate the size in pages that have been accessed. */
> if (pte_young(ptent) || PageReferenced(page))
> mss->referenced += PAGE_SIZE;
> --
> 1.5.6.3
>
>
>
--
http://selenic.com : development and support for Mercurial and Linux
Hi, Matt.
On Tue, 29 Dec 2009 14:08:59 -0600
Matt Mackall <[email protected]> wrote:
> On Mon, 2009-12-28 at 13:46 +0900, Minchan Kim wrote:
> > I am not sure we have to account zero page with file_rss.
> > Hugh and Kame's new zero page doesn't do it.
> > As side effect of this, we can prevent innocent process which have a lot
> > of zero page when OOM happens.
> > (But I am not sure there is a process like this :)
> > So I think not file_rss counting is not bad.
> >
> > RSS counting zero page with file_rss helps any program using smaps?
> > If we have to keep the old behavior, I have to remake this patch.
> >
> > == CUT_HERE ==
> >
> > Long time ago, We regards zero page as file_rss and
> > vm_normal_page doesn't return NULL.
> >
> > But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
> > can return NULL in case of zero page. Also we don't count it with
> > file_rss any more.
> >
> > Then, RSS and PSS can't be matched.
> > For consistency, Let's ignore zero page in smaps_pte_range.
> >
>
> Not counting the zero page in RSS is fine with me. But will this patch
> make the total from smaps agree with get_mm_rss()?
Yes. Anon page fault handler also don't count zero page any more, now.
Nonetheless, smaps counts it with resident.
It's point of this patch.
But I reposted both anon fault handler and here counts it as file_rss
as compatibility with old zero page counting.
Pz, Look at that. :)
>
> Regarding OOM handling: arguably RSS should play no role in OOM as it's
> practically meaningless in a shared memory system. If we were instead
It's very arguable issue for us that OOM depens on RSS.
> used per-process unshared pages as the metric (aka USS), we'd have a
> much better notion of how much memory an OOM kill would recover.
> Unfortunately, that's not trivial to track as the accounting on COW
> operations is not lightweight.
I think we can approximate it with the size of VM_SHARED vma of process
when VM calculate badness.
What do you think about it?
Thanks for good idea, Matt.
>
> > CC: Matt Mackall <[email protected]>
> > Signed-off-by: Minchan Kim <[email protected]>
> > ---
> > fs/proc/task_mmu.c | 3 +--
> > 1 files changed, 1 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index 47c03f4..f277c4a 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -361,12 +361,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> > if (!pte_present(ptent))
> > continue;
> >
> > - mss->resident += PAGE_SIZE;
> > -
> > page = vm_normal_page(vma, addr, ptent);
> > if (!page)
> > continue;
> >
> > + mss->resident += PAGE_SIZE;
> > /* Accumulate the size in pages that have been accessed. */
> > if (pte_young(ptent) || PageReferenced(page))
> > mss->referenced += PAGE_SIZE;
> > --
> > 1.5.6.3
> >
> >
> >
>
>
>
> --
> http://selenic.com : development and support for Mercurial and Linux
>
>
--
Kind regards,
Minchan Kim
On Wed, 2009-12-30 at 10:33 +0900, Minchan Kim wrote:
> Hi, Matt.
>
> On Tue, 29 Dec 2009 14:08:59 -0600
> Matt Mackall <[email protected]> wrote:
>
> > On Mon, 2009-12-28 at 13:46 +0900, Minchan Kim wrote:
> > > I am not sure we have to account zero page with file_rss.
> > > Hugh and Kame's new zero page doesn't do it.
> > > As side effect of this, we can prevent innocent process which have a lot
> > > of zero page when OOM happens.
> > > (But I am not sure there is a process like this :)
> > > So I think not file_rss counting is not bad.
> > >
> > > RSS counting zero page with file_rss helps any program using smaps?
> > > If we have to keep the old behavior, I have to remake this patch.
> > >
> > > == CUT_HERE ==
> > >
> > > Long time ago, We regards zero page as file_rss and
> > > vm_normal_page doesn't return NULL.
> > >
> > > But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
> > > can return NULL in case of zero page. Also we don't count it with
> > > file_rss any more.
> > >
> > > Then, RSS and PSS can't be matched.
> > > For consistency, Let's ignore zero page in smaps_pte_range.
> > >
> >
> > Not counting the zero page in RSS is fine with me. But will this patch
> > make the total from smaps agree with get_mm_rss()?
>
> Yes. Anon page fault handler also don't count zero page any more, now.
> Nonetheless, smaps counts it with resident.
Ok, great.
Acked-by: Matt Mackall <[email protected]>
--
http://selenic.com : development and support for Mercurial and Linux
On Wed, 30 Dec 2009, Minchan Kim wrote:
> On Tue, 29 Dec 2009 14:08:59 -0600
> Matt Mackall <[email protected]> wrote:
> > On Mon, 2009-12-28 at 13:46 +0900, Minchan Kim wrote:
> > > I am not sure we have to account zero page with file_rss.
> > > Hugh and Kame's new zero page doesn't do it.
> > > As side effect of this, we can prevent innocent process which have a lot
> > > of zero page when OOM happens.
> > > (But I am not sure there is a process like this :)
> > > So I think not file_rss counting is not bad.
> > >
> > > RSS counting zero page with file_rss helps any program using smaps?
> > > If we have to keep the old behavior, I have to remake this patch.
> > >
> > > == CUT_HERE ==
> > >
> > > Long time ago, We regards zero page as file_rss and
> > > vm_normal_page doesn't return NULL.
> > >
> > > But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
> > > can return NULL in case of zero page. Also we don't count it with
> > > file_rss any more.
> > >
> > > Then, RSS and PSS can't be matched.
> > > For consistency, Let's ignore zero page in smaps_pte_range.
> > >
> >
> > Not counting the zero page in RSS is fine with me. But will this patch
> > make the total from smaps agree with get_mm_rss()?
>
> Yes. Anon page fault handler also don't count zero page any more, now.
> Nonetheless, smaps counts it with resident.
>
> It's point of this patch.
>
> But I reposted both anon fault handler and here counts it as file_rss
> as compatibility with old zero page counting.
> Pz, Look at that. :)
I am getting confused between your different patches in this area,
heading in different directions, not increments in the same series.
But I think this is the one to which, like Matt, I'll say
Acked-by: Hugh Dickins <[email protected]>
>
> >
> > Regarding OOM handling: arguably RSS should play no role in OOM as it's
> > practically meaningless in a shared memory system. If we were instead
>
> It's very arguable issue for us that OOM depens on RSS.
>
> > used per-process unshared pages as the metric (aka USS), we'd have a
> > much better notion of how much memory an OOM kill would recover.
> > Unfortunately, that's not trivial to track as the accounting on COW
> > operations is not lightweight.
>
> I think we can approximate it with the size of VM_SHARED vma of process
> when VM calculate badness.
> What do you think about it?
Sounds like it'll end up even harder to understand than by size or by rss.
>
> Thanks for good idea, Matt.
>
> >
> > > CC: Matt Mackall <[email protected]>
> > > Signed-off-by: Minchan Kim <[email protected]>
> > > ---
> > > fs/proc/task_mmu.c | 3 +--
> > > 1 files changed, 1 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > > index 47c03f4..f277c4a 100644
> > > --- a/fs/proc/task_mmu.c
> > > +++ b/fs/proc/task_mmu.c
> > > @@ -361,12 +361,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> > > if (!pte_present(ptent))
> > > continue;
> > >
> > > - mss->resident += PAGE_SIZE;
> > > -
> > > page = vm_normal_page(vma, addr, ptent);
> > > if (!page)
> > > continue;
> > >
> > > + mss->resident += PAGE_SIZE;
> > > /* Accumulate the size in pages that have been accessed. */
> > > if (pte_young(ptent) || PageReferenced(page))
> > > mss->referenced += PAGE_SIZE;
> > > --
On Thu, Dec 31, 2009 at 1:19 AM, Hugh Dickins
<[email protected]> wrote:
> On Wed, 30 Dec 2009, Minchan Kim wrote:
>> On Tue, 29 Dec 2009 14:08:59 -0600
>> Matt Mackall <[email protected]> wrote:
>> > On Mon, 2009-12-28 at 13:46 +0900, Minchan Kim wrote:
>> > > I am not sure we have to account zero page with file_rss.
>> > > Hugh and Kame's new zero page doesn't do it.
>> > > As side effect of this, we can prevent innocent process which have a lot
>> > > of zero page when OOM happens.
>> > > (But I am not sure there is a process like this :)
>> > > So I think not file_rss counting is not bad.
>> > >
>> > > RSS counting zero page with file_rss helps any program using smaps?
>> > > If we have to keep the old behavior, I have to remake this patch.
>> > >
>> > > == CUT_HERE ==
>> > >
>> > > Long time ago, We regards zero page as file_rss and
>> > > vm_normal_page doesn't return NULL.
>> > >
>> > > But now, we reinstated ZERO_PAGE and vm_normal_page's implementation
>> > > can return NULL in case of zero page. Also we don't count it with
>> > > file_rss any more.
>> > >
>> > > Then, RSS and PSS can't be matched.
>> > > For consistency, Let's ignore zero page in smaps_pte_range.
>> > >
>> >
>> > Not counting the zero page in RSS is fine with me. But will this patch
>> > make the total from smaps agree with get_mm_rss()?
>>
>> Yes. Anon page fault handler also don't count zero page any more, now.
>> Nonetheless, smaps counts it with resident.
>>
>> It's point of this patch.
>>
>> But I reposted both anon fault handler and here counts it as file_rss
>> as compatibility with old zero page counting.
>> Pz, Look at that. :)
>
> I am getting confused between your different patches in this area,
> heading in different directions, not increments in the same series.
> But I think this is the one to which, like Matt, I'll say
>
> Acked-by: Hugh Dickins <[email protected]>
Thanks for ACK, Hugh.
It's my old version.
This patch can be changed according to account zero page as file_rss or not.
Anyway, We need consistency regardless of it.
>
>>
>> >
>> > Regarding OOM handling: arguably RSS should play no role in OOM as it's
>> > practically meaningless in a shared memory system. If we were instead
>>
>> It's very arguable issue for us that OOM depens on RSS.
>>
>> > used per-process unshared pages as the metric (aka USS), we'd have a
>> > much better notion of how much memory an OOM kill would recover.
>> > Unfortunately, that's not trivial to track as the accounting on COW
>> > operations is not lightweight.
>>
>> I think we can approximate it with the size of VM_SHARED vma of process
>> when VM calculate badness.
>> What do you think about it?
>
> Sounds like it'll end up even harder to understand than by size or by rss.
Yes. If we settle down OOM issue, I will repost this issue. :)
>
>>
>> Thanks for good idea, Matt.
>>
>> >
>> > > CC: Matt Mackall <[email protected]>
>> > > Signed-off-by: Minchan Kim <[email protected]>
>> > > ---
>> > > fs/proc/task_mmu.c | 3 +--
>> > > 1 files changed, 1 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> > > index 47c03f4..f277c4a 100644
>> > > --- a/fs/proc/task_mmu.c
>> > > +++ b/fs/proc/task_mmu.c
>> > > @@ -361,12 +361,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>> > > if (!pte_present(ptent))
>> > > continue;
>> > >
>> > > - mss->resident += PAGE_SIZE;
>> > > -
>> > > page = vm_normal_page(vma, addr, ptent);
>> > > if (!page)
>> > > continue;
>> > >
>> > > + mss->resident += PAGE_SIZE;
>> > > /* Accumulate the size in pages that have been accessed. */
>> > > if (pte_young(ptent) || PageReferenced(page))
>> > > mss->referenced += PAGE_SIZE;
>> > > --
>
--
Kind regards,
Minchan Kim