hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
again loop of its own, so take part in that if pte_offset_map() fails.
Signed-off-by: Hugh Dickins <[email protected]>
---
mm/hmm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/hmm.c b/mm/hmm.c
index e23043345615..b1a9159d7c92 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	}
 
 	ptep = pte_offset_map(pmdp, addr);
+	if (!ptep)
+		goto again;
 	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
 		int r;
 
--
2.35.3
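For context, the shape of the loop that the new check joins, as a
simplified sketch (most of hmm_vma_walk_pmd()'s PMD handling is elided,
so this is not the exact upstream code):

static int hmm_vma_walk_pmd(pmd_t *pmdp, unsigned long start,
			    unsigned long end, struct mm_walk *walk)
{
	unsigned long addr = start;
	pte_t *ptep;
	pmd_t pmd;

again:
	pmd = pmdp_get_lockless(pmdp);
	if (pmd_none(pmd))
		return hmm_vma_walk_hole(start, end, -1, walk);

	/*
	 * Huge, migration and bad PMD cases are handled here; several of
	 * them already "goto again" once the condition has been dealt with.
	 */

	ptep = pte_offset_map(pmdp, addr);
	if (!ptep)
		goto again;	/* pmd changed under us: re-evaluate it */

	for (; addr < end; addr += PAGE_SIZE, ptep++) {
		/* record one hmm_pfn per pte; may pte_unmap() and return */
	}
	pte_unmap(ptep - 1);
	return 0;
}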
On 2023/5/22 13:05, Hugh Dickins wrote:
> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
> again loop of its own, so take part in that if pte_offset_map() fails.
>
> Signed-off-by: Hugh Dickins <[email protected]>
> ---
> mm/hmm.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index e23043345615..b1a9159d7c92 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
> }
>
> ptep = pte_offset_map(pmdp, addr);
> + if (!ptep)
> + goto again;
> for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
> int r;
>
I haven't read the entire patch set yet, but taking a note here.
hmm_vma_handle_pte() will unmap the pte and then call
migration_entry_wait(), which has to remap it; that remap may now
fail, so we need to handle this case like below:
diff --git a/mm/hmm.c b/mm/hmm.c
index 6a151c09de5e..eb726ff0981c 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -276,7 +276,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		if (is_migration_entry(entry)) {
 			pte_unmap(ptep);
 			hmm_vma_walk->last = addr;
-			migration_entry_wait(walk->mm, pmdp, addr);
+			if (!migration_entry_wait(walk->mm, pmdp, addr))
+				return -EAGAIN;
 			return -EBUSY;
 		}
 
@@ -386,6 +387,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 
 		r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep, hmm_pfns);
 		if (r) {
+			if (r == -EAGAIN)
+				goto again;
 			/* hmm_vma_handle_pte() did pte_unmap() */
 			return r;
 		}
Of course, the migration_entry_wait() also needs to be modified.
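Something like the following is what I have in mind, just an
illustrative sketch of the idea rather than a tested patch (the bool
return is the only intended change):

bool migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
			  unsigned long address)
{
	pte_t *ptep = pte_offset_map(pmd, address);
	spinlock_t *ptl;

	if (!ptep)
		return false;	/* pte table has gone: tell caller to retry */
	ptl = pte_lockptr(mm, pmd);
	__migration_entry_wait(mm, ptep, ptl);	/* waits, unmaps and unlocks */
	return true;
}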
--
Thanks,
Qi
Qi Zheng <[email protected]> writes:
> On 2023/5/22 13:05, Hugh Dickins wrote:
>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>> again loop of its own, so take part in that if pte_offset_map() fails.
>> Signed-off-by: Hugh Dickins <[email protected]>
>> ---
>> mm/hmm.c | 2 ++
>> 1 file changed, 2 insertions(+)
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index e23043345615..b1a9159d7c92 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>> }
>> ptep = pte_offset_map(pmdp, addr);
>> + if (!ptep)
>> + goto again;
>> for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>> int r;
>>
>
> I haven't read the entire patch set yet, but taking a note here.
> hmm_vma_handle_pte() will unmap the pte and then call
> migration_entry_wait(), which has to remap it; that remap may now
> fail, so we need to handle this case like below:
I don't see a problem here. Sure, hmm_vma_handle_pte() might return
-EBUSY but that will get returned up to hmm_range_fault() which will
retry the whole thing again and presumably fail when looking at the PMD.
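For reference, the retry loop in question looks roughly like this
(abridged from hmm_range_fault(), so details may differ):

int hmm_range_fault(struct hmm_range *range)
{
	struct hmm_vma_walk hmm_vma_walk = {
		.range = range,
		.last = range->start,
	};
	struct mm_struct *mm = range->notifier->mm;
	int ret;

	mmap_assert_locked(mm);

	do {
		/* If the range is no longer valid, force a retry. */
		if (mmu_interval_check_retry(range->notifier,
					     range->notifier_seq))
			return -EBUSY;
		ret = walk_page_range(mm, hmm_vma_walk.last, range->end,
				      &hmm_walk_ops, &hmm_vma_walk);
		/*
		 * -EBUSY restarts the walk from ->last, so a migration
		 * entry (or a cleared pmd) is simply examined again on
		 * the next pass.
		 */
	} while (ret == -EBUSY);
	return ret;
}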
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 6a151c09de5e..eb726ff0981c 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -276,7 +276,8 @@ static int hmm_vma_handle_pte(struct mm_walk
> *walk, unsigned long addr,
> if (is_migration_entry(entry)) {
> pte_unmap(ptep);
> hmm_vma_walk->last = addr;
> - migration_entry_wait(walk->mm, pmdp, addr);
> + if (!migration_entry_wait(walk->mm, pmdp, addr))
> + return -EAGAIN;
> return -EBUSY;
> }
>
> @@ -386,6 +387,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>
> r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep,
> hmm_pfns);
> if (r) {
> + if (r == -EAGAIN)
> + goto again;
> /* hmm_vma_handle_pte() did pte_unmap() */
> return r;
> }
>
> Of course, the migration_entry_wait() also needs to be modified.
On 2023/5/23 10:39, Alistair Popple wrote:
>
> Qi Zheng <[email protected]> writes:
>
>> On 2023/5/22 13:05, Hugh Dickins wrote:
>>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>>> again loop of its own, so take part in that if pte_offset_map() fails.
>>> Signed-off-by: Hugh Dickins <[email protected]>
>>> ---
>>> mm/hmm.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>> index e23043345615..b1a9159d7c92 100644
>>> --- a/mm/hmm.c
>>> +++ b/mm/hmm.c
>>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>>> }
>>> ptep = pte_offset_map(pmdp, addr);
>>> + if (!ptep)
>>> + goto again;
>>> for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>>> int r;
>>>
>>
>> I haven't read the entire patch set yet, but taking a note here.
>> hmm_vma_handle_pte() will unmap the pte and then call
>> migration_entry_wait(), which has to remap it; that remap may now
>> fail, so we need to handle this case like below:
>
> I don't see a problem here. Sure, hmm_vma_handle_pte() might return
> -EBUSY but that will get returned up to hmm_range_fault() which will
> retry the whole thing again and presumably fail when looking at the PMD.
Yeah. There is no problem with this and the modification to
migration_entry_wait() can be simplified. My previous thought was that
we can finish the retry logic in hmm_vma_walk_pmd() without handing it
over to the caller. :)
>
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index 6a151c09de5e..eb726ff0981c 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -276,7 +276,8 @@ static int hmm_vma_handle_pte(struct mm_walk
>> *walk, unsigned long addr,
>> if (is_migration_entry(entry)) {
>> pte_unmap(ptep);
>> hmm_vma_walk->last = addr;
>> - migration_entry_wait(walk->mm, pmdp, addr);
>> + if (!migration_entry_wait(walk->mm, pmdp, addr))
>> + return -EAGAIN;
>> return -EBUSY;
>> }
>>
>> @@ -386,6 +387,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>>
>> r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep,
>> hmm_pfns);
>> if (r) {
>> + if (r == -EAGAIN)
>> + goto again;
>> /* hmm_vma_handle_pte() did pte_unmap() */
>> return r;
>> }
>>
>> Of course, the migration_entry_wait() also needs to be modified.
>
--
Thanks,
Qi
On Tue, 23 May 2023, Qi Zheng wrote:
> On 2023/5/23 10:39, Alistair Popple wrote:
> > Qi Zheng <[email protected]> writes:
> >> On 2023/5/22 13:05, Hugh Dickins wrote:
> >>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
> >>> again loop of its own, so take part in that if pte_offset_map() fails.
> >>> Signed-off-by: Hugh Dickins <[email protected]>
> >>> ---
> >>> mm/hmm.c | 2 ++
> >>> 1 file changed, 2 insertions(+)
> >>> diff --git a/mm/hmm.c b/mm/hmm.c
> >>> index e23043345615..b1a9159d7c92 100644
> >>> --- a/mm/hmm.c
> >>> +++ b/mm/hmm.c
> >>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
> >>> }
> >>> ptep = pte_offset_map(pmdp, addr);
> >>> + if (!ptep)
> >>> + goto again;
> >>> for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
> >>> int r;
> >>>
> >>
> >> I haven't read the entire patch set yet, but taking a note here.
> >> hmm_vma_handle_pte() will unmap the pte and then call
> >> migration_entry_wait(), which has to remap it; that remap may now
> >> fail, so we need to handle this case like below:
> >
> > I don't see a problem here. Sure, hmm_vma_handle_pte() might return
> > -EBUSY but that will get returned up to hmm_range_fault() which will
> > retry the whole thing again and presumably fail when looking at the PMD.
>
> Yeah. There is no problem with this and the modification to
> migration_entry_wait() can be simplified. My previous thought was that
> we can finish the retry logic in hmm_vma_walk_pmd() without handing it
> over to the caller. :)
Okay, Alistair has resolved this one, thanks, I agree; but what is
"the modification to migration_entry_wait()" that you refer to there?
I don't think there's any need to make it a bool, it's normal for there
to be races on entry to migration_entry_wait(), and we're used to just
returning to caller (and back up to userspace) when it does not wait.
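A minimal sketch of the usual caller pattern, modeled loosely on the
do_swap_page() path (names and details here are illustrative, not the
exact upstream code):

static vm_fault_t handle_migration_entry(struct vm_fault *vmf)
{
	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);

	if (is_migration_entry(entry)) {
		/*
		 * Wait for the migration to complete if we can.  If the
		 * pte has already changed again, or the pte table is gone,
		 * migration_entry_wait() returns without waiting: we just
		 * return 0 and the fault is retried from userspace.
		 */
		migration_entry_wait(vmf->vma->vm_mm, vmf->pmd, vmf->address);
		return 0;
	}
	return VM_FAULT_SIGBUS;
}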
Hugh
Hugh Dickins <[email protected]> writes:
> On Tue, 23 May 2023, Qi Zheng wrote:
>> On 2023/5/23 10:39, Alistair Popple wrote:
>> > Qi Zheng <[email protected]> writes:
>> >> On 2023/5/22 13:05, Hugh Dickins wrote:
>> >>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>> >>> again loop of its own, so take part in that if pte_offset_map() fails.
>> >>> Signed-off-by: Hugh Dickins <[email protected]>
>> >>> ---
>> >>> mm/hmm.c | 2 ++
>> >>> 1 file changed, 2 insertions(+)
>> >>> diff --git a/mm/hmm.c b/mm/hmm.c
>> >>> index e23043345615..b1a9159d7c92 100644
>> >>> --- a/mm/hmm.c
>> >>> +++ b/mm/hmm.c
>> >>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>> >>> }
>> >>> ptep = pte_offset_map(pmdp, addr);
>> >>> + if (!ptep)
>> >>> + goto again;
>> >>> for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>> >>> int r;
>> >>>
>> >>
>> >> I haven't read the entire patch set yet, but taking a note here.
>> >> hmm_vma_handle_pte() will unmap the pte and then call
>> >> migration_entry_wait(), which has to remap it; that remap may now
>> >> fail, so we need to handle this case like below:
>> >
>> > I don't see a problem here. Sure, hmm_vma_handle_pte() might return
>> > -EBUSY but that will get returned up to hmm_range_fault() which will
>> > retry the whole thing again and presumably fail when looking at the PMD.
>>
>> Yeah. There is no problem with this and the modification to
>> migration_entry_wait() can be simplified. My previous thought was that
>> we can finish the retry logic in hmm_vma_walk_pmd() without handing it
>> over to the caller. :)
>
> Okay, Alistair has resolved this one, thanks, I agree; but what is
> "the modification to migration_entry_wait()" that you refer to there?
>
> I don't think there's any need to make it a bool, it's normal for there
> to be races on entry to migration_entry_wait(), and we're used to just
> returning to caller (and back up to userspace) when it does not wait.
Agreed. I didn't spot any places where returning to the caller without
actually waiting would cause looping. I assume any retries or refaults
will find the cleared PMD and fault/error out in some other manner
anyway.
hmm_range_fault() is the only place that might have been a bit special,
but it looks fine to me so:
Reviewed-by: Alistair Popple <[email protected]>
> Hugh