On 2019-10-31 at 12:08 Andrew Morton wrote:
>(cc [email protected])
>
>On Tue, 29 Oct 2019 17:56:06 +0800 "Li Xinhai" <[email protected]> wrote:
>
>> queue_pages_range() will check for unmapped holes besides queue pages for
>> migration. The rules for checking unmapped holes are:
>> 1 Unmapped holes at any part of the specified range should be reported as
>> EFAULT if mbind() for none MPOL_DEFAULT cases;
>> 2 Unmapped holes at any part of the specified range should be ignored if
>> mbind() for MPOL_DEFAULT case;
>> Note that the second rule is the current implementation, but it seems
>> conflicts the Linux API definition.
>
>Can you quote the part of the API definition which you're looking at?
>
>My mbind(2) manpage says
>
>ERRORS
> EFAULT Part or all of the memory range specified by nodemask and maxn-
> ode points outside your accessible address space. Or, there was
> an unmapped hole in the specified memory range specified by addr
> and len.
>
>(I assume the first sentence meant to say "specified by addr and len")
>
This part:
"Or, there was an unmapped hole in the specified memory range specified by addr
and len."
is what my patch is concerned with.
>I agree with your interpretation, but there's no mention here that
>MPOL_DEFAULT is treated differently and I don't see why it should be.
>
The first rule matches the manpage, but the current mempolicy implementation only
reports EFAULT if a hole is in the middle of the range or at its head. No EFAULT is
reported if the hole is at the tail of the range. I think the implementation of the
first rule has to be fixed.
The second rule, for the MPOL_DEFAULT case, is my own summary of what the mempolicy
implementation does. This rule does not follow the manpage, and it has existed for a
long time. In my understanding the rule is reasonable (in the code, the internal flag
MPOL_MF_DISCONTIG_OK is used for that purpose, and a comment there gives the reason),
and we had better keep it.
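
To make the first rule concrete, here is a minimal user-space sketch of the
check as I read the manpage. It is not the kernel code: struct mapping and
range_has_hole() are hypothetical names used only for illustration, and the
mappings are assumed to be sorted by start address and non-overlapping. The
point is that a hole at the tail is treated the same as one at the head or in
the middle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: a mapped interval [start, end). */
struct mapping {
    unsigned long start;
    unsigned long end;
};

/*
 * Return true if [start, end) contains any address that is not covered by
 * maps[], which must be sorted by start and non-overlapping. Holes at the
 * head, in the middle, and at the tail of the range are all detected.
 */
static bool range_has_hole(const struct mapping *maps, size_t n,
                           unsigned long start, unsigned long end)
{
    unsigned long covered = start;  /* [start, covered) is known to be mapped */

    for (size_t i = 0; i < n && covered < end; i++) {
        if (maps[i].end <= covered)
            continue;               /* mapping lies entirely below the cursor */
        if (maps[i].start > covered)
            return true;            /* gap before this mapping: head or middle hole */
        covered = maps[i].end;
    }
    return covered < end;           /* coverage stopped early: tail hole */
}
```

Under rule 1 a true result would map to EFAULT; under rule 2 (MPOL_DEFAULT) the
result would simply be ignored.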
>
>More broadly, I worry that it's too late to change this - existing
>applications might fail if we change the implementation in the proposed
>fashion. So perhaps what we should do here is to change the manpage to
>match reality?
>
I would prefer to document the second rule in the manpage, so the code for that
rule stays as it is, and to fix only the first rule in the code.
>Is the current behavior causing you any problems in a real-world use
>case?
I was using mbind() with MPOL_DEFAULT (or MPOL_BIND) to reset a range of addresses
(which may or may not be contiguous over the whole range) to the default policy (or
to a specific node), and observed this issue. If mbind() is called for each mapping
one by one, we don't see the issue.
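
A rough reproducer for the case I describe, as a sketch only: it maps three
pages, unmaps the middle one, and calls mbind() with MPOL_DEFAULT over the
whole span. The helper names map_range_with_hole() and reset_policy() are made
up for this mail, the raw syscall is used so that libnuma is not needed, and
MPOL_DEFAULT is defined locally on the assumption that it is 0 as in the
kernel's uapi header. Whether the call returns 0 or EFAULT is exactly the
behavior under discussion, so the sketch only reports the result.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0  /* assumed value, taken from the kernel uapi header */
#endif

/* Map three pages and unmap the middle one, leaving a hole in the range. */
static void *map_range_with_hole(size_t *len_out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return NULL;
    if (munmap(base + page, page) != 0) {
        munmap(base, 3 * page);
        return NULL;
    }
    *len_out = 3 * page;
    return base;
}

/* Reset [addr, addr + len) to the default policy; returns 0 or an errno value. */
static int reset_policy(void *addr, size_t len)
{
    /* MPOL_DEFAULT requires an empty node set: NULL nodemask, maxnode 0. */
    if (syscall(SYS_mbind, addr, (unsigned long)len,
                (unsigned long)MPOL_DEFAULT, NULL, 0UL, 0UL) == 0)
        return 0;
    return errno;
}
```

Calling reset_policy() once on the whole span, and then once per remaining page,
shows the difference between the two ways of calling mbind() mentioned above.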
- Xinhai