When we select CFQ as the scheduler, in function pick_next_task_fair
it will pass NULL as the 2nd argument to pick_next_entity:
pick_next_entity(cfs_rq, NULL);
And once __pick_first_entity() is called, it could return NULL as well.
So in function pick_next_entity(), the local variable 'left' and 'curr'
could both be NULL, then this will cause NULL pointer issue.
In order to fix this issue, we just need return NULL under the condition
that both 'left' and 'curr' are NULL, meaning that no entity available.
Signed-off-by: Yafang Shao <[email protected]>
---
kernel/sched/fair.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c95880e..e64c359 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3903,6 +3903,8 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
struct sched_entity *left = __pick_first_entity(cfs_rq);
struct sched_entity *se;
+ if (!left && !curr)
+ return NULL;
/*
* If curr is set we have to see if its left of the leftmost entity
* still in the tree, provided there was anything in the tree at all.
--
1.8.3.1
On Tue, Aug 01, 2017 at 06:01:56PM +0800, Yafang Shao wrote:
> When we select CFQ as the scheduler, in function pick_next_task_fair
CFS, CFQ is a block IO scheduler.
> it will pass NULL as the 2nd argument to pick_next_entity:
> pick_next_entity(cfs_rq, NULL);
>
> And once __pick_first_entity() is called, it could return NULL as well.
>
> So in function pick_next_entity(), the local variable 'left' and 'curr'
> could both be NULL, then this will cause NULL pointer issue.
>
> In order to fix this issue, we just need return NULL under the condition
> that both 'left' and 'curr' are NULL, meaning that no entity available.
And how would that happen? We only call pick_next_entity(.curr=NULL)
when we _know_ cfs_rq->nr_running.
> Signed-off-by: Yafang Shao <[email protected]>
> ---
> kernel/sched/fair.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c95880e..e64c359 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3903,6 +3903,8 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
You diff is broken, this is very much not the clear_buddies() function.
> struct sched_entity *left = __pick_first_entity(cfs_rq);
> struct sched_entity *se;
>
> + if (!left && !curr)
> + return NULL;
> /*
> * If curr is set we have to see if its left of the leftmost entity
> * still in the tree, provided there was anything in the tree at all.
> --
> 1.8.3.1
>
Hi Peter,
Thanks for your response!
> CFS, CFQ is a block IO scheduler.
Sorry for the mistake. It should be CFS.
> And how would that happen? We only call pick_next_entity(.curr=NULL)
> when we _know_ cfs_rq->nr_running.
It crashed my machine when I did hadoop test, and after I made this
change it works now.
On SMP system, cfs_rq->nr_running isn't protected well, although we
_know_ cfs_rq->nr_running,
but it is modified by other thread running on other CPU and the
sched_entity is set NULL as well.
Then this thread broken here as accessed the NULL pointer here.
> You diff is broken, this is very much not the clear_buddies() function.
You're right. it is in pick_next_entity().
2017-08-01 16:38 GMT+08:00 Peter Zijlstra <[email protected]>:
> On Tue, Aug 01, 2017 at 06:01:56PM +0800, Yafang Shao wrote:
>> When we select CFQ as the scheduler, in function pick_next_task_fair
>
> CFS, CFQ is a block IO scheduler.
>
>> it will pass NULL as the 2nd argument to pick_next_entity:
>> pick_next_entity(cfs_rq, NULL);
>>
>> And once __pick_first_entity() is called, it could return NULL as well.
>>
>> So in function pick_next_entity(), the local variable 'left' and 'curr'
>> could both be NULL, then this will cause NULL pointer issue.
>>
>> In order to fix this issue, we just need return NULL under the condition
>> that both 'left' and 'curr' are NULL, meaning that no entity available.
>
> And how would that happen? We only call pick_next_entity(.curr=NULL)
> when we _know_ cfs_rq->nr_running.
>
>> Signed-off-by: Yafang Shao <[email protected]>
>> ---
>> kernel/sched/fair.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index c95880e..e64c359 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -3903,6 +3903,8 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
>
> You diff is broken, this is very much not the clear_buddies() function.
>
>> struct sched_entity *left = __pick_first_entity(cfs_rq);
>> struct sched_entity *se;
>>
>> + if (!left && !curr)
>> + return NULL;
>> /*
>> * If curr is set we have to see if its left of the leftmost entity
>> * still in the tree, provided there was anything in the tree at all.
>> --
>> 1.8.3.1
>>
On Tue, Aug 01, 2017 at 04:57:43PM +0800, Yafang Shao wrote:
> > And how would that happen? We only call pick_next_entity(.curr=NULL)
> > when we _know_ cfs_rq->nr_running.
>
> It crashed my machine when I did hadoop test, and after I made this change
> it works now.
> On SMP system, cfs_rq->nr_running isn't protected well, although we _know_
> cfs_rq->nr_running,
> but it is modified by other thread running on other CPU and the
> sched_entity is set NULL as well.
> Then this thread broken here as accessed the NULL pointer here.
cfs_rq->nr_running should be protected by the rq->lock. If it is not,
something else is buggered.
Hi Peter,
2017-08-01 17:12 GMT+08:00 Peter Zijlstra <[email protected]>:
> On Tue, Aug 01, 2017 at 04:57:43PM +0800, Yafang Shao wrote:
>> > And how would that happen? We only call pick_next_entity(.curr=NULL)
>> > when we _know_ cfs_rq->nr_running.
>>
>> It crashed my machine when I did hadoop test, and after I made this change
>> it works now.
>> On SMP system, cfs_rq->nr_running isn't protected well, although we _know_
>> cfs_rq->nr_running,
>> but it is modified by other thread running on other CPU and the
>> sched_entity is set NULL as well.
>> Then this thread broken here as accessed the NULL pointer here.
>
> cfs_rq->nr_running should be protected by the rq->lock. If it is not,
> something else is buggered.
Yes, I admit that something else is buggered, but unfortunately I
haven't understood the
scheduler deeply so can't find the root cause.
But from my understanding, it is obviously a bug here as we can find
that it may occurs that both 'se'
and 'curr' are NULL. That's why I submit this patch to fix it.
On Tue, Aug 01, 2017 at 05:20:11PM +0800, Yafang Shao wrote:
> Hi Peter,
>
> 2017-08-01 17:12 GMT+08:00 Peter Zijlstra <[email protected]>:
> > On Tue, Aug 01, 2017 at 04:57:43PM +0800, Yafang Shao wrote:
> >> > And how would that happen? We only call pick_next_entity(.curr=NULL)
> >> > when we _know_ cfs_rq->nr_running.
> >>
> >> It crashed my machine when I did hadoop test, and after I made this change
> >> it works now.
> >> On SMP system, cfs_rq->nr_running isn't protected well, although we _know_
> >> cfs_rq->nr_running,
> >> but it is modified by other thread running on other CPU and the
> >> sched_entity is set NULL as well.
> >> Then this thread broken here as accessed the NULL pointer here.
> >
> > cfs_rq->nr_running should be protected by the rq->lock. If it is not,
> > something else is buggered.
>
> Yes, I admit that something else is buggered, but unfortunately I
> haven't understood the
> scheduler deeply so can't find the root cause.
>
> But from my understanding, it is obviously a bug here as we can find
> that it may occurs that both 'se'
> and 'curr' are NULL. That's why I submit this patch to fix it.
Thing is, it doesn't fix a bug, it hides one. The real bug is nr_running
changing while you own the rq->lock. That should _never_ happen.