Hi Andrea, Peter,
I have a question on page refcounting in your NUMA
page migration code.
In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?
In Andrea's case, we have a comment suggesting an
extra refcount is needed, immediately followed by
a put_page:
/*
* Pin the head subpage at least until the first
* __isolate_lru_page succeeds (__isolate_lru_page pins it
* again when it succeeds). If we unpin before
* __isolate_lru_page successd, the page could be freed and
* reallocated out from under us. Thus our previous checks on
* the page, and the split_huge_page, would be worthless.
*
* We really only need to do this if "ret > 0" but it doesn't
* hurt to do it unconditionally as nobody can reference
* "page" anymore after this and so we can avoid an "if (ret >
* 0)" branch here.
*/
put_page(page);
This also confuses me.
If we do not need the extra refcount (and I do not
understand why NUMA migrate-on-fault needs one more
refcount than normal page migration), we can get
rid of the MIGRATE_FAULT mode.
If we do need the extra refcount, why is normal
page migration safe? :)
--
All rights reversed
On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
>
> If we do need the extra refcount, why is normal
> page migration safe? :)
Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get
Normal does:
follow_page(FOLL_GET) +1
isolate_lru_page() +1
put_page() -1
ending up with a page with a single reference (for anon, or one extra
each for the mapping and buffer).
And while I suppose I could do a put_page() in migrate_misplaced_page()
that makes the function frob the page-count depending on the return
value.
I always try and avoid conditional locks/refs, therefore the code ends
up doing:
page = vm_normal_page()
if (page) {
get_page()
migrate_misplaced_page()
put_page()
}
where migrate_misplaced_page() does isolate_lru_page()/putback_lru_page,
and this leaves the page-count invariant.
We got a ref, therefore we must put a ref, is easier than we got a ref
and must put except when...
On 10/19/2012 12:39 PM, Peter Zijlstra wrote:
> On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
>>
>> If we do need the extra refcount, why is normal
>> page migration safe? :)
>
> Its mostly a matter of how convoluted you make the code, regular page
> migration is about as bad as you can get
>
> Normal does:
>
> follow_page(FOLL_GET) +1
>
> isolate_lru_page() +1
>
> put_page() -1
>
> ending up with a page with a single reference (for anon, or one extra
> each for the mapping and buffer).
Would it make sense to have the normal page migration code always
work with the extra refcount, so we do not have to introduce a new
MIGRATE_FAULT migration mode?
On the other hand, compaction does not take the extra reference...
Another alternative might be to do the put_page inside
do_prot_none_numa(). That would be analogous to do_wp_page
disposing of the old page for the caller.
I am not real happy about NUMA migration introducing its own
migration mode...
--
All rights reversed
On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> Would it make sense to have the normal page migration code always
> work with the extra refcount, so we do not have to introduce a new
> MIGRATE_FAULT migration mode?
>
> On the other hand, compaction does not take the extra reference...
Right, it appears to not do this, it gets pages from the pfn and
zone->lock and the isolate_lru_page() call is the first reference.
> Another alternative might be to do the put_page inside
> do_prot_none_numa(). That would be analogous to do_wp_page
> disposing of the old page for the caller.
It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.
> I am not real happy about NUMA migration introducing its own
> migration mode...
You didn't seem to mind too much earlier, but I can remove it if you
want.
On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
>> Another alternative might be to do the put_page inside
>> do_prot_none_numa(). That would be analogous to do_wp_page
>> disposing of the old page for the caller.
>
> It'd have to be inside migrate_misplaced_page(), can't do before
> isolate_lru_page() or the page might disappear. Doing it after is
> (obviously) too late.
Keeping an extra refcount on the page might _still_
result in it disappearing from the process by some
other means, in-between you grabbing the refcount
and invoking migration of the page.
>> I am not real happy about NUMA migration introducing its own
>> migration mode...
>
> You didn't seem to mind too much earlier, but I can remove it if you
> want.
Could have been reviewing fatigue :)
And yes, it would have been nice to not have a special
migration mode for sched/numa.
Speaking of, when do you guys plan to submit a (cleaned up)
version of the sched/numa patch series for review on lkml?
--
All rights reversed
* Rik van Riel <[email protected]> wrote:
> On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> >On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
>
> >>Another alternative might be to do the put_page inside
> >>do_prot_none_numa(). That would be analogous to do_wp_page
> >>disposing of the old page for the caller.
> >
> >It'd have to be inside migrate_misplaced_page(), can't do before
> >isolate_lru_page() or the page might disappear. Doing it after is
> >(obviously) too late.
>
> Keeping an extra refcount on the page might _still_
> result in it disappearing from the process by some
> other means, in-between you grabbing the refcount
> and invoking migration of the page.
>
> >>I am not real happy about NUMA migration introducing its own
> >>migration mode...
> >
> >You didn't seem to mind too much earlier, but I can remove it if you
> >want.
>
> Could have been reviewing fatigue :)
:-)
> And yes, it would have been nice to not have a special
> migration mode for sched/numa.
>
> Speaking of, when do you guys plan to submit a (cleaned up)
> version of the sched/numa patch series for review on lkml?
Which commit(s) worry you specifically?
Thanks,
Ingo
On 10/19/2012 09:23 PM, Ingo Molnar wrote:
>
> * Rik van Riel <[email protected]> wrote:
>
>> On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
>>> On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
>>
>>>> Another alternative might be to do the put_page inside
>>>> do_prot_none_numa(). That would be analogous to do_wp_page
>>>> disposing of the old page for the caller.
>>>
>>> It'd have to be inside migrate_misplaced_page(), can't do before
>>> isolate_lru_page() or the page might disappear. Doing it after is
>>> (obviously) too late.
>>
>> Keeping an extra refcount on the page might _still_
>> result in it disappearing from the process by some
>> other means, in-between you grabbing the refcount
>> and invoking migration of the page.
>>
>>>> I am not real happy about NUMA migration introducing its own
>>>> migration mode...
>>>
>>> You didn't seem to mind too much earlier, but I can remove it if you
>>> want.
>>
>> Could have been reviewing fatigue :)
>
> :-)
>
>> And yes, it would have been nice to not have a special
>> migration mode for sched/numa.
>>
>> Speaking of, when do you guys plan to submit a (cleaned up)
>> version of the sched/numa patch series for review on lkml?
>
> Which commit(s) worry you specifically?
One of them would be the commit that introduces MIGRATE_FAULT.
Adding it in one patch, and removing it into a next just makes
things harder on the reviewers.
03a040f6c17ab81659579ba6abe267c0562097e4
If the changesets with NUMA syscalls are still in your tree's
history, they should not be submitted as part of the patch
series, either.
--
All rights reversed
On 10/19/2012 11:53 PM, Rik van Riel wrote:
> Hi Andrea, Peter,
>
> I have a question on page refcounting in your NUMA
> page migration code.
>
> In Peter's case, I wonder why you introduce a new
> MIGRATE_FAULT migration mode. If the normal page
> migration / compaction logic can do without taking
> an extra reference count, why does your code need it?
Hi Rik van Riel,
This is which part of codes? Why I can't find MIGRATE_FAULT in latest
v3.7-rc2?
Regards,
Chen
>
> In Andrea's case, we have a comment suggesting an
> extra refcount is needed, immediately followed by
> a put_page:
>
> /*
> * Pin the head subpage at least until the first
> * __isolate_lru_page succeeds (__isolate_lru_page pins it
> * again when it succeeds). If we unpin before
> * __isolate_lru_page successd, the page could be freed and
> * reallocated out from under us. Thus our previous checks on
> * the page, and the split_huge_page, would be worthless.
> *
> * We really only need to do this if "ret > 0" but it doesn't
> * hurt to do it unconditionally as nobody can reference
> * "page" anymore after this and so we can avoid an "if (ret >
> * 0)" branch here.
> */
> put_page(page);
>
> This also confuses me.
>
> If we do not need the extra refcount (and I do not
> understand why NUMA migrate-on-fault needs one more
> refcount than normal page migration), we can get
> rid of the MIGRATE_FAULT mode.
>
> If we do need the extra refcount, why is normal
> page migration safe? :)
>
On 10/20/2012 10:39 PM, Ni zhan Chen wrote:
> On 10/19/2012 11:53 PM, Rik van Riel wrote:
>> Hi Andrea, Peter,
>>
>> I have a question on page refcounting in your NUMA
>> page migration code.
>>
>> In Peter's case, I wonder why you introduce a new
>> MIGRATE_FAULT migration mode. If the normal page
>> migration / compaction logic can do without taking
>> an extra reference count, why does your code need it?
>
> Hi Rik van Riel,
>
> This is which part of codes? Why I can't find MIGRATE_FAULT in latest
> v3.7-rc2?
It is in tip.git in the numa/core branch.
--
All rights reversed
* Rik van Riel <[email protected]> wrote:
> On 10/19/2012 09:23 PM, Ingo Molnar wrote:
> >
> >* Rik van Riel <[email protected]> wrote:
> >
> >>On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> >>>On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> >>
> >>>>Another alternative might be to do the put_page inside
> >>>>do_prot_none_numa(). That would be analogous to do_wp_page
> >>>>disposing of the old page for the caller.
> >>>
> >>>It'd have to be inside migrate_misplaced_page(), can't do before
> >>>isolate_lru_page() or the page might disappear. Doing it after is
> >>>(obviously) too late.
> >>
> >>Keeping an extra refcount on the page might _still_
> >>result in it disappearing from the process by some
> >>other means, in-between you grabbing the refcount
> >>and invoking migration of the page.
> >>
> >>>>I am not real happy about NUMA migration introducing its own
> >>>>migration mode...
> >>>
> >>>You didn't seem to mind too much earlier, but I can remove it if you
> >>>want.
> >>
> >>Could have been reviewing fatigue :)
> >
> >:-)
> >
> >>And yes, it would have been nice to not have a special
> >>migration mode for sched/numa.
> >>
> >>Speaking of, when do you guys plan to submit a (cleaned up)
> >>version of the sched/numa patch series for review on lkml?
> >
> >Which commit(s) worry you specifically?
>
> One of them would be the commit that introduces MIGRATE_FAULT.
MIGRATE_FAULT is still used.
> Adding it in one patch, and removing it into a next just makes
> things harder on the reviewers.
Yes.
> 03a040f6c17ab81659579ba6abe267c0562097e4
It's still used:
comet:~/tip> git grep MIGRATE_FAULT
include/linux/migrate_mode.h: * MIGRATE_FAULT called from the fault path to migrate-on-fault for mempolicy
include/linux/migrate_mode.h: MIGRATE_FAULT,
mm/migrate.c: if (mode != MIGRATE_ASYNC && mode != MIGRATE_FAULT) {
mm/migrate.c: if (mode == MIGRATE_FAULT) {
mm/migrate.c: * MIGRATE_FAULT has an extra reference on the page and
mm/migrate.c: if ((mode == MIGRATE_ASYNC || mode == MIGRATE_FAULT) && head &&
mm/migrate.c: if (mode != MIGRATE_ASYNC && mode != MIGRATE_FAULT)
mm/migrate.c: if (!force || mode == MIGRATE_ASYNC || mode == MIGRATE_FAULT)
mm/migrate.c: ret = __unmap_and_move(page, newpage, 0, 0, MIGRATE_FAULT);
> If the changesets with NUMA syscalls are still in your tree's
> history, they should not be submitted as part of the patch
> series, either.
No, the syscalls have not been there for quite some time.
If you have trouble with any specific commit, please quote the
specific SHA1 so that I can have a look and resolve any specific
concerns.
Otherwise, lets continue with the integration?
Thanks,
Ingo
* Rik van Riel <[email protected]> wrote:
> On 10/20/2012 10:39 PM, Ni zhan Chen wrote:
> >On 10/19/2012 11:53 PM, Rik van Riel wrote:
> >>Hi Andrea, Peter,
> >>
> >>I have a question on page refcounting in your NUMA
> >>page migration code.
> >>
> >>In Peter's case, I wonder why you introduce a new
> >>MIGRATE_FAULT migration mode. If the normal page
> >>migration / compaction logic can do without taking
> >>an extra reference count, why does your code need it?
> >
> >Hi Rik van Riel,
> >
> >This is which part of codes? Why I can't find MIGRATE_FAULT in latest
> >v3.7-rc2?
>
> It is in tip.git in the numa/core branch.
The Git access URI is:
git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core
git clone git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core
Thanks,
Ingo