2009-09-01 06:47:11

by Fengguang Wu

Subject: Re: [RFC][PATCH 0/4] memcg: add support for hwpoison testing

On Tue, Sep 01, 2009 at 10:32:14AM +0800, KAMEZAWA Hiroyuki wrote:
> On Tue, 1 Sep 2009 10:25:14 +0800
> Wu Fengguang wrote:
> > > 4. I can't understand why you need this. I wonder if you can get the pfn via
> > > /proc/<pid>/????. And this may insert HWPOISON into the page cache of a shared
> > > library, and an "unexpected" process will be poisoned.
> >
> > Sorry, I should have explained this. It's mainly for correctness.
> > When a user space tool queries the task PFNs in /proc/pid/pagemap and
> > then sends them to /debug/hwpoison/corrupt-pfn, there is a racy window in which
> > the page could be reclaimed and allocated by someone else. It would
> > be awkward to try to pin the pages in user space. So we need the
> > guarantees provided by /debug/hwpoison/corrupt-filter-memcg, which
> > will be checked under the page lock with an elevated reference count.
> >
>
> memcg never holds a refcnt on a page, and the kernel's vmscan.c can reclaim
> any page under memcg without checking anything related to memcg.
> *And* your code has no "pin" code.
> This patch set does nothing to address your concern.

We do grab the page here, in code that is outside the scope of this patchset:

static int try_memory_failure(unsigned long pfn)
{
	struct page *p;
	int res = -EINVAL;

	if (!pfn_valid(pfn))
		return res;

	p = pfn_to_page(pfn);
	if (!get_page_unless_zero(compound_head(p)))
		return res;

	lock_page_nosync(compound_head(p));

	if (hwpoison_filter(p))
		goto out;

	res = __memory_failure(pfn, 18,
			       MEMORY_FAILURE_FLAG_COUNTED |
			       MEMORY_FAILURE_FLAG_LOCKED);
out:
	unlock_page(p);
	return res;
}
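
For illustration, the user-space half of the flow discussed in the quoted
text can be sketched as below: look up the pagemap entry for a virtual
address and extract the PFN. This is a minimal sketch and not part of the
patchset. The function names are hypothetical, and it assumes the documented
pagemap layout (64-bit entries, PFN in bits 0-54, present flag in bit 63).
Newer kernels report the PFN as zero to unprivileged readers.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_PFN_MASK ((1ULL << 55) - 1)	/* bits 0-54: page frame number */
#define PM_PRESENT  (1ULL << 63)	/* bit 63: page is present in RAM */

/* Decode one pagemap entry. Returns 0 and stores the PFN if present. */
int pagemap_entry_to_pfn(uint64_t entry, uint64_t *pfn)
{
	assert(pfn != NULL);
	if (!(entry & PM_PRESENT))
		return -1;
	*pfn = entry & PM_PFN_MASK;
	return 0;
}

/* Read the pagemap entry that covers vaddr, e.g. from /proc/self/pagemap. */
int read_pagemap_entry(const char *pagemap_path, uintptr_t vaddr,
		       uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset = (off_t)(vaddr / (uintptr_t)page_size) *
		       (off_t)sizeof(*entry);
	int fd = open(pagemap_path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);
	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Nothing pins the page between this lookup and the later write to
corrupt-pfn, which is the race window in question.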

> I recommend adding
> /debug/hwpoison/pin-pfn
>
> Then,
> echo pfn > /debug/hwpoison/pin-pfn
> # add the pfn to the hwpoison debug watch list and elevate its refcnt
> check that 'pfn' is still in use.
> echo pfn > /debug/hwpoison/corrupt-pfn
> # check the watch list, corrupt the page, and release the refcnt.
> or something like that.

Looks like a good alternative. At least no more memcg dependency..

Cheers,
Fengguang
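
The two-step pin-then-corrupt protocol suggested above can be modelled in
plain user-space C as follows. This is only a sketch of the bookkeeping,
under the assumption that a pfn must be on the watch list before it may be
corrupted. All names (watch_list, watch_pin, watch_corrupt) are hypothetical,
and list membership merely stands in for the elevated page refcount that real
kernel code would hold.

```c
#include <assert.h>
#include <stddef.h>

#define WATCH_MAX 16

/* Hypothetical watch list: every listed pfn counts as holding one page ref. */
struct watch_list {
	unsigned long pfn[WATCH_MAX];
	size_t used;
};

/* Step 1 (pin-pfn): record the pfn, i.e. take the reference. */
int watch_pin(struct watch_list *wl, unsigned long pfn)
{
	assert(wl != NULL);
	if (wl->used == WATCH_MAX)
		return -1;
	wl->pfn[wl->used++] = pfn;
	return 0;
}

/* Step 2 (corrupt-pfn): act only on a pinned pfn, then drop the reference. */
int watch_corrupt(struct watch_list *wl, unsigned long pfn)
{
	assert(wl != NULL);
	for (size_t i = 0; i < wl->used; i++) {
		if (wl->pfn[i] == pfn) {
			/* Real code would poison the page at this point. */
			wl->pfn[i] = wl->pfn[wl->used - 1];
			wl->used--;
			return 0;
		}
	}
	return -1;	/* never pinned: the page may belong to someone else */
}
```

Between the two steps the caller can check that the pfn is still in use by
the intended task, since the held reference keeps the page from being
reclaimed and reallocated.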


2009-09-01 07:14:32

by Kamezawa Hiroyuki

Subject: Re: [RFC][PATCH 0/4] memcg: add support for hwpoison testing

On Tue, 1 Sep 2009 14:46:52 +0800
Wu Fengguang wrote:

> On Tue, Sep 01, 2009 at 10:32:14AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Tue, 1 Sep 2009 10:25:14 +0800
> > Wu Fengguang wrote:
> > > > 4. I can't understand why you need this. I wonder if you can get the pfn via
> > > > /proc/<pid>/????. And this may insert HWPOISON into the page cache of a shared
> > > > library, and an "unexpected" process will be poisoned.
> > >
> > > Sorry, I should have explained this. It's mainly for correctness.
> > > When a user space tool queries the task PFNs in /proc/pid/pagemap and
> > > then sends them to /debug/hwpoison/corrupt-pfn, there is a racy window in which
> > > the page could be reclaimed and allocated by someone else. It would
> > > be awkward to try to pin the pages in user space. So we need the
> > > guarantees provided by /debug/hwpoison/corrupt-filter-memcg, which
> > > will be checked under the page lock with an elevated reference count.
> > >
> >
> > memcg never holds a refcnt on a page, and the kernel's vmscan.c can reclaim
> > any page under memcg without checking anything related to memcg.
> > *And* your code has no "pin" code.
> > This patch set does nothing to address your concern.
>
> We do grab the page here, in code that is outside the scope of this patchset:
>
> static int try_memory_failure(unsigned long pfn)
> {
> 	struct page *p;
> 	int res = -EINVAL;
>
> 	if (!pfn_valid(pfn))
> 		return res;
>
> 	p = pfn_to_page(pfn);
> 	if (!get_page_unless_zero(compound_head(p)))
> 		return res;
>
> 	lock_page_nosync(compound_head(p));
>
> 	if (hwpoison_filter(p))
> 		goto out;
>
> 	res = __memory_failure(pfn, 18,
> 			       MEMORY_FAILURE_FLAG_COUNTED |
> 			       MEMORY_FAILURE_FLAG_LOCKED);
> out:
> 	unlock_page(p);
> 	return res;
> }

Hmm, maybe off-topic, but why is lock_page() necessary?


> > I recommend adding
> > /debug/hwpoison/pin-pfn
> >
> > Then,
> > echo pfn > /debug/hwpoison/pin-pfn
> > # add the pfn to the hwpoison debug watch list and elevate its refcnt
> > check that 'pfn' is still in use.
> > echo pfn > /debug/hwpoison/corrupt-pfn
> > # check the watch list, corrupt the page, and release the refcnt.
> > or something like that.
>
> Looks like a good alternative. At least no more memcg dependency..
>

My point is that memcg can show the 'owner' of pages, but a page may
be shared with some important task _and_, if a task is migrated,
its pages' memcg information is currently not updated. So you can kill
a task which is not in the memcg.

So I don't recommend using memcg. I think you'll see too many
pitfalls.

Thanks,
-Kame

2009-09-01 08:56:10

by Fengguang Wu

Subject: Re: [RFC][PATCH 0/4] memcg: add support for hwpoison testing

On Tue, Sep 01, 2009 at 03:12:28PM +0800, KAMEZAWA Hiroyuki wrote:
> On Tue, 1 Sep 2009 14:46:52 +0800
> Wu Fengguang wrote:
>
> > On Tue, Sep 01, 2009 at 10:32:14AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Tue, 1 Sep 2009 10:25:14 +0800
> > > Wu Fengguang wrote:
> > > > > 4. I can't understand why you need this. I wonder if you can get the pfn via
> > > > > /proc/<pid>/????. And this may insert HWPOISON into the page cache of a shared
> > > > > library, and an "unexpected" process will be poisoned.
> > > >
> > > > Sorry, I should have explained this. It's mainly for correctness.
> > > > When a user space tool queries the task PFNs in /proc/pid/pagemap and
> > > > then sends them to /debug/hwpoison/corrupt-pfn, there is a racy window in which
> > > > the page could be reclaimed and allocated by someone else. It would
> > > > be awkward to try to pin the pages in user space. So we need the
> > > > guarantees provided by /debug/hwpoison/corrupt-filter-memcg, which
> > > > will be checked under the page lock with an elevated reference count.
> > > >
> > >
> > > memcg never holds a refcnt on a page, and the kernel's vmscan.c can reclaim
> > > any page under memcg without checking anything related to memcg.
> > > *And* your code has no "pin" code.
> > > This patch set does nothing to address your concern.
> >
> > We do grab the page here, in code that is outside the scope of this patchset:
> >
> > static int try_memory_failure(unsigned long pfn)
> > {
> > 	struct page *p;
> > 	int res = -EINVAL;
> >
> > 	if (!pfn_valid(pfn))
> > 		return res;
> >
> > 	p = pfn_to_page(pfn);
> > 	if (!get_page_unless_zero(compound_head(p)))
> > 		return res;
> >
> > 	lock_page_nosync(compound_head(p));
> >
> > 	if (hwpoison_filter(p))
> > 		goto out;
> >
> > 	res = __memory_failure(pfn, 18,
> > 			       MEMORY_FAILURE_FLAG_COUNTED |
> > 			       MEMORY_FAILURE_FLAG_LOCKED);
> > out:
> > 	unlock_page(p);
> > 	return res;
> > }
>
> Hmm, maybe off-topic, but why is lock_page() necessary?

Because we also have a filter that tests page flags, which requires
lock_page() to be correct.

>
> > > I recommend adding
> > > /debug/hwpoison/pin-pfn
> > >
> > > Then,
> > > echo pfn > /debug/hwpoison/pin-pfn
> > > # add the pfn to the hwpoison debug watch list and elevate its refcnt
> > > check that 'pfn' is still in use.
> > > echo pfn > /debug/hwpoison/corrupt-pfn
> > > # check the watch list, corrupt the page, and release the refcnt.
> > > or something like that.
> >
> > Looks like a good alternative. At least no more memcg dependency..
> >
>
> My point is that memcg can show the 'owner' of pages, but a page may
> be shared with some important task _and_, if a task is migrated,
> its pages' memcg information is currently not updated. So you can kill
> a task which is not in the memcg.

Ah thanks! I wasn't aware of that tricky fact, and it is a
very good reason not to use memcg, although I guess a locked page won't
be migrated.

> So I don't recommend using memcg. I think you'll see too many
> pitfalls.

Thanks,
Fengguang

2009-09-01 16:32:14

by Balbir Singh

Subject: Re: [RFC][PATCH 0/4] memcg: add support for hwpoison testing

* Wu Fengguang [2009-09-01 16:55:49]:

> > My point is that memcg can show the 'owner' of pages, but a page may
> > be shared with some important task _and_, if a task is migrated,
> > its pages' memcg information is currently not updated. So you can kill
> > a task which is not in the memcg.
>
> Ah thanks! I wasn't aware of that tricky fact, and it is a
> very good reason not to use memcg, although I guess a locked page won't
> be migrated.
>

I think what Kamezawa-San is pointing out is that the task can migrate,
leaving its pages behind in the memcg, and poisoning those pages can
kill a task outside the memcg.

--
Balbir

2009-09-02 02:47:24

by Fengguang Wu

Subject: Re: [RFC][PATCH 0/4] memcg: add support for hwpoison testing

On Wed, Sep 02, 2009 at 12:31:52AM +0800, Balbir Singh wrote:
> * Wu Fengguang [2009-09-01 16:55:49]:
>
> > > My point is that memcg can show the 'owner' of pages, but a page may
> > > be shared with some important task _and_, if a task is migrated,
> > > its pages' memcg information is currently not updated. So you can kill
> > > a task which is not in the memcg.
> >
> > Ah thanks! I wasn't aware of that tricky fact, and it is a
> > very good reason not to use memcg, although I guess a locked page won't
> > be migrated.
> >
>
> I think what Kamezawa-San is pointing out is that the task can migrate,
> leaving its pages behind in the memcg, and poisoning those pages can
> kill a task outside the memcg.

Yeah, Kame's words reminded me of the memcg goal: it may not have to
track task pages 100% accurately through all the tricky racy windows/cases.
So it could be risky to use memcg for hwpoison testing.

Otherwise I was inclined to use memcg for hwpoison testing, because the
exported things are not that bad, and our hwpoison stress testing
efforts may also be a very good exercise for some aspects of memcg ;)

Back to the page sharing problem. For hwpoison testing, it is
acceptable for the test program and the init process to share _clean_
libc.so pages, because hwpoison of such pages can be recovered from
gracefully by simply unmapping and dropping the hwpoisoned ones.

But if two tasks share some dirty pages (eg. shmem), then it could
kill more tasks than expected. However
- this is a general problem independent of the use of memcg
- it could be avoided by checking page dirtiness and map count
- our test schemes simply won't try to create such insane conditions
  (they will include both tasks as the target.)
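
The dirtiness and map count check mentioned in the list above could look
roughly like this from a user-space test tool. It is a minimal sketch under
the assumption that the tool reads the page's flags from /proc/kpageflags
(where bit 4 is KPF_DIRTY) and its map count from /proc/kpagecount. The
function name is hypothetical and not part of the patchset.

```c
#include <assert.h>
#include <stdint.h>

#define KPF_DIRTY (1ULL << 4)	/* bit 4 of a /proc/kpageflags entry */

/*
 * Return 1 if poisoning the page should not kill more tasks than expected:
 * a clean page can simply be unmapped and dropped, while a dirty page is
 * only acceptable when at most one mapping refers to it.
 */
int poison_is_contained(uint64_t kpageflags, uint64_t mapcount)
{
	if (!(kpageflags & KPF_DIRTY))
		return 1;
	return mapcount <= 1;
}
```

A clean libc.so page shared by many tasks passes this check, while a dirty
shmem page mapped by two tasks does not.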

btw, hwpoison testing also allows "mis-killing" of no-owner pages (ie.
pages newly freed by the target task in some racy windows), which won't
affect the test correctness.

Thanks,
Fengguang