2017-11-27 11:47:48

by guoxuenan

Subject: [PATCH] mm,madvise: bugfix of madvise systemcall infinite loop under special circumstances.

From: chenjie <[email protected]>

The madvise() system call supports a set of "conventional" advice values.
With the MADV_WILLNEED advice it can enter an infinite loop when the mapping
is in direct access (DAX) mode.

Infinite loop situation:
1. Initial state: [ start = vma->vm_start < vma->vm_end < end ].
2. madvise_vma() is called with the MADV_WILLNEED advice:
   madvise_vma() -> madvise_willneed() -> returns 0 without updating [prev].

In SYSCALL_DEFINE3(madvise, ...), when [start = vma->vm_start],
find_vma_prev() sets the pointers vma and prev (prev = vma->vm_prev) before
the program enters the "for" loop. Normally madvise_vma() always advances
prev, but in DAX mode it never updates [prev].

=======================================================================
SYSCALL_DEFINE3(madvise, ...)
{
	[...]
	// start == vma->vm_start, so prev = vma->vm_prev
	vma = find_vma_prev(current->mm, start, &prev);
	[...]
	for (;;) {
		update [ start = vma->vm_start ]

	con0:	if (start >= end)	// always false
			goto out;
		tmp = vma->vm_end;

		// does not update [prev] and always returns 0
		error = madvise_willneed();

	con1:	if (error)		// always false
			goto out;

		// [ vma->vm_start < start = vma->vm_end < end ]
		update [ start = tmp ]

	con2:	if (start >= end)	// always false
			goto out;

		// [prev] did not change, so [vma] stays the same
		update [ vma = prev->vm_next ]
	}
	[...]
}
=======================================================================
After the first iteration, the state always remains
vma->vm_start < start = vma->vm_end < end && vma = prev->vm_next.
Since none of the loop exit conditions (con{0,1,2}) is ever met, the
program is stuck in an infinite loop.

Signed-off-by: chenjie <[email protected]>
Signed-off-by: guoxuenan <[email protected]>
---
mm/madvise.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 375cf32..751e97a 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -276,15 +276,14 @@ static long madvise_willneed(struct vm_area_struct *vma,
 {
 	struct file *file = vma->vm_file;

+	*prev = vma;
 #ifdef CONFIG_SWAP
 	if (!file) {
-		*prev = vma;
 		force_swapin_readahead(vma, start, end);
 		return 0;
 	}

 	if (shmem_mapping(file->f_mapping)) {
-		*prev = vma;
 		force_shm_swapin_readahead(vma, start, end,
 					file->f_mapping);
 		return 0;
@@ -299,7 +298,6 @@ static long madvise_willneed(struct vm_area_struct *vma,
 		return 0;
 	}

-	*prev = vma;
 	start = ((start - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	if (end > vma->vm_end)
 		end = vma->vm_end;
--
2.9.5
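
The loop described in the commit message above can be modelled in user space.
What follows is only a minimal sketch, not kernel code: struct vma, the is_dax
flag, model_willneed(), model_walk() and simulate() are hypothetical names
introduced for illustration, and the prev == NULL path (find_vma()) is left
out. It assumes three adjacent VMAs where the middle one is DAX and the
advised range starts exactly at that VMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct vm_area_struct with only the fields the loop reads. */
struct vma {
	unsigned long vm_start, vm_end;
	struct vma *vm_next;
	bool is_dax;	/* hypothetical flag modelling IS_DAX(file_inode(file)) */
};

/* Model of madvise_willneed(); "fixed" selects the patched behaviour. */
static int model_willneed(struct vma *vma, struct vma **prev, bool fixed)
{
	if (fixed)
		*prev = vma;	/* patched: record the VMA before any early return */
	if (vma->is_dax)
		return 0;	/* advice ignored; unpatched code leaves *prev stale */
	*prev = vma;
	return 0;
}

/*
 * Model of the for (;;) loop in SYSCALL_DEFINE3(madvise, ...).
 * Returns the number of iterations needed to leave the loop,
 * or -1 if the loop is still running after max_iter iterations.
 */
static int model_walk(struct vma *vma, struct vma *prev,
		      unsigned long start, unsigned long end,
		      bool fixed, int max_iter)
{
	assert(max_iter > 0);
	for (int i = 1; i <= max_iter; i++) {
		unsigned long tmp;

		if (!vma)
			return i;	/* the kernel would return -ENOMEM here */
		if (start < vma->vm_start) {
			start = vma->vm_start;
			if (start >= end)
				return i;
		}
		tmp = vma->vm_end < end ? vma->vm_end : end;
		if (model_willneed(vma, &prev, fixed))
			return i;
		start = tmp;
		if (prev && start < prev->vm_end)
			start = prev->vm_end;
		if (start >= end)
			return i;
		/* Simplification: the prev == NULL case is not modelled. */
		vma = prev ? prev->vm_next : NULL;
	}
	return -1;
}

/* Three adjacent VMAs; the middle one is DAX and the range starts at it. */
static int simulate(bool fixed, int max_iter)
{
	struct vma c = { 0x3000, 0x4000, NULL, false };
	struct vma b = { 0x2000, 0x3000, &c, true };
	struct vma a = { 0x1000, 0x2000, &b, false };

	/* find_vma_prev(mm, 0x2000, &prev) would yield vma = &b, prev = &a. */
	return model_walk(&b, &a, 0x2000, 0x4000, fixed, max_iter);
}
```

In this model, simulate(false, n) hits the iteration cap for any n, because
prev keeps pointing at the VMA before the DAX one and prev->vm_next keeps
returning the DAX VMA. simulate(true, n) leaves the loop on the second
iteration, once start reaches end.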




2017-11-27 08:02:07

by Michal Hocko

Subject: Re: [PATCH] mm,madvise: bugfix of madvise systemcall infinite loop under special circumstances.

On Mon 27-11-17 10:54:39, 郭雪楠 wrote:
> Hi Michal, should I update the patch according to your modification and
> resubmit it?

please do
--
Michal Hocko
SUSE Labs


2017-11-27 03:17:42

by guoxuenan

Subject: Re: [PATCH] mm,madvise: bugfix of madvise systemcall infinite loop under special circumstances.

Hi Michal, should I update the patch according to your modification and
resubmit it?

On 2017/11/25 9:52, 郭雪楠 wrote:
> Yes, your modification is much better! Thanks.
>
> On 2017/11/24 21:08, Michal Hocko wrote:
>> On Fri 24-11-17 20:51:29, 郭雪楠 wrote:
>>> Sorry, my earlier explanation was wrong. But I have tested this with
>>> trinity in DAX mode, and I am sure it can trigger a soft lockup. I have
>>> hit the endless loop here.
>>>
>>> I made a small mistake in the description; here is the correction.
>>> Initial state:
>>> [ start = vma->vm_start < vma->vm_end < end ]
>>>
>>> When [start = vma->vm_start], find_vma_prev() sets the pointers vma and
>>> prev (prev = vma->vm_prev) before the program enters the for (;;) loop.
>>> Normally madvise_vma() always advances prev, but in DAX mode it never
>>> updates it.
>> [...]
>>> 	if (prev)	// prev is not NULL here, so this branch is always taken
>>> 		vma = prev->vm_next;
>>> 	else	/* madvise_remove dropped mmap_sem */
>>> 		vma = find_vma(current->mm, start);
>>
>> You are right! My fault, I managed to confuse myself in the code flow.
>> It really looks like this has been broken for more than 10 years since
>> fe77ba6f4f97 ("[PATCH] xip: madvice/fadvice: execute in place").
>>
>> Maybe the following would be more readable and less error prone?
>> ---
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 375cf32087e4..a631c414f915 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -276,30 +276,26 @@ static long madvise_willneed(struct vm_area_struct *vma,
>> {
>> struct file *file = vma->vm_file;
>> + *prev = vma;
>> #ifdef CONFIG_SWAP
>> if (!file) {
>> - *prev = vma;
>> force_swapin_readahead(vma, start, end);
>> return 0;
>> }
>> - if (shmem_mapping(file->f_mapping)) {
>> - *prev = vma;
>> + if (shmem_mapping(file->f_mapping))
>> force_shm_swapin_readahead(vma, start, end,
>> file->f_mapping);
>> return 0;
>> - }
>> #else
>> if (!file)
>> return -EBADF;
>> #endif
>> - if (IS_DAX(file_inode(file))) {
>> + if (IS_DAX(file_inode(file)))
>> /* no bad return value, but ignore advice */
>> return 0;
>> - }
>> - *prev = vma;
>> start = ((start - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
>> if (end > vma->vm_end)
>> end = vma->vm_end;
>>



2017-11-25 01:54:18

by guoxuenan

Subject: Re: [PATCH] mm,madvise: bugfix of madvise systemcall infinite loop under special circumstances.

Yes, your modification is much better! Thanks.

On 2017/11/24 21:08, Michal Hocko wrote:
> On Fri 24-11-17 20:51:29, 郭雪楠 wrote:
>> Sorry, my earlier explanation was wrong. But I have tested this with trinity
>> in DAX mode, and I am sure it can trigger a soft lockup. I have hit the
>> endless loop here.
>>
>> I made a small mistake in the description; here is the correction.
>> Initial state:
>> [ start = vma->vm_start < vma->vm_end < end ]
>>
>> When [start = vma->vm_start], find_vma_prev() sets the pointers vma and
>> prev (prev = vma->vm_prev) before the program enters the for (;;) loop.
>> Normally madvise_vma() always advances prev, but in DAX mode it never
>> updates it.
> [...]
>> 	if (prev)	// prev is not NULL here, so this branch is always taken
>> 		vma = prev->vm_next;
>> 	else	/* madvise_remove dropped mmap_sem */
>> 		vma = find_vma(current->mm, start);
>
> You are right! My fault, I managed to confuse myself in the code flow.
> It really looks like this has been broken for more than 10 years since
> fe77ba6f4f97 ("[PATCH] xip: madvice/fadvice: execute in place").
>
> Maybe the following would be more readable and less error prone?
> ---
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 375cf32087e4..a631c414f915 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -276,30 +276,26 @@ static long madvise_willneed(struct vm_area_struct *vma,
> {
> struct file *file = vma->vm_file;
>
> + *prev = vma;
> #ifdef CONFIG_SWAP
> if (!file) {
> - *prev = vma;
> force_swapin_readahead(vma, start, end);
> return 0;
> }
>
> - if (shmem_mapping(file->f_mapping)) {
> - *prev = vma;
> + if (shmem_mapping(file->f_mapping))
> force_shm_swapin_readahead(vma, start, end,
> file->f_mapping);
> return 0;
> - }
> #else
> if (!file)
> return -EBADF;
> #endif
>
> - if (IS_DAX(file_inode(file))) {
> + if (IS_DAX(file_inode(file)))
> /* no bad return value, but ignore advice */
> return 0;
> - }
>
> - *prev = vma;
> start = ((start - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> if (end > vma->vm_end)
> end = vma->vm_end;
>



2017-11-24 08:06:44

by Michal Hocko

Subject: Re: [PATCH] mm,madvise: bugfix of madvise systemcall infinite loop under special circumstances.

On Fri 24-11-17 10:27:57, guoxuenan wrote:
> From: chenjie <[email protected]>
>
> The madvise() system call supports a set of "conventional" advice values.
> With the MADV_WILLNEED advice it triggers an infinite loop in direct
> access (DAX) mode. In DAX mode, the function madvise_vma() returns
> directly without updating the pointer [prev].
>
> For example:
> Special circumstances:
> 1. Initial state: [ start < vma->vm_start < vma->vm_end < end ]
> 2. madvise_vma() is called with the MADV_WILLNEED advice:
>    madvise_vma() -> madvise_willneed() -> returns 0 without updating [prev]
>
> =======================================================================
> In function SYSCALL_DEFINE3(madvise, ...):
>
> for (;;) {
> 	// [first loop: start = vma->vm_start < vma->vm_end < end]
> 	update [ start = vma->vm_start | end ]
>
> con0:	if (start >= end)	// always false
> 		goto out;
> 	tmp = vma->vm_end;
>
> 	// does not update [prev] and always returns 0
> 	error = madvise_willneed();
>
> con1:	if (error)		// always false
> 		goto out;
>
> 	// [ vma->vm_start < start = vma->vm_end < end ]
> 	update [ start = tmp ]
>
> con2:	if (start >= end)	// always false
> 		goto out;
>
> 	// [prev] did not change, so [vma] stays the same
> 	update [ vma = prev->vm_next ]
> }
>
> =======================================================================
> After the first iteration, the state always remains
> [ vma->vm_start < start = vma->vm_end < end ].
> Since none of the loop exit conditions (con{0,1,2}) is ever met, the
> program is stuck in an infinite loop.

Are you sure? Have you tested this? I might be missing something because
the madvise code is a bit of a mess, but AFAICS the prev pointer (updated
or not) will still allow the loop to advance:
	if (prev)
		vma = prev->vm_next;
	else	/* madvise_remove dropped mmap_sem */
		vma = find_vma(current->mm, start);
Note that start is vma->vm_end and find_vma() will find a vma for which
vm_end > addr.

So either I am missing something or this code has actually never worked
for DAX or XIP, which I find rather suspicious.
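
The two lookups in the snippet above can be compared with a small user-space
model. This is only a sketch under stated assumptions: struct vma,
model_find_vma(), next_vma() and demo_lookup() are hypothetical stand-ins,
with model_find_vma() mimicking the find_vma() rule of returning the first
VMA whose vm_end is greater than addr.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct vm_area_struct: a sorted, singly linked list of ranges. */
struct vma {
	unsigned long vm_start, vm_end;
	struct vma *vm_next;
};

/* Model of find_vma(): first VMA with addr < vm_end, or NULL if none. */
static struct vma *model_find_vma(struct vma *head, unsigned long addr)
{
	for (struct vma *v = head; v; v = v->vm_next)
		if (addr < v->vm_end)
			return v;
	return NULL;
}

/* Model of the "pick the next VMA" step at the end of the madvise loop. */
static struct vma *next_vma(struct vma *head, struct vma *prev,
			    unsigned long start)
{
	return prev ? prev->vm_next : model_find_vma(head, start);
}

/* Three adjacent VMAs; b has just been processed, so start == b.vm_end. */
static void demo_lookup(void)
{
	struct vma c = { 0x3000, 0x4000, NULL };
	struct vma b = { 0x2000, 0x3000, &c };
	struct vma a = { 0x1000, 0x2000, &b };

	assert(next_vma(&a, &b, 0x3000) == &c);   /* prev was updated to b */
	assert(next_vma(&a, NULL, 0x3000) == &c); /* find_vma() path */
	assert(next_vma(&a, &a, 0x3000) == &b);   /* stale prev: b again */
}
```

In this model only the stale-prev case hands back the VMA that was just
processed; an updated prev and the find_vma() path both move on to the
next VMA.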

> Signed-off-by: chenjie <[email protected]>
> Signed-off-by: guoxuenan <[email protected]>
> ---
> mm/madvise.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 21261ff..c355fee 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -294,6 +294,7 @@ static long madvise_willneed(struct vm_area_struct *vma,
> #endif
>
> if (IS_DAX(file_inode(file))) {
> + *prev = vma;
> /* no bad return value, but ignore advice */
> return 0;
> }
> --
> 2.9.5
>

--
Michal Hocko
SUSE Labs
