2014-02-15 23:28:30

by Sasha Levin

Subject: sched: fair: NULL ptr deref in check_preempt_wakeup

Hi folks,

While fuzzing with trinity inside a KVM tools guest running latest -next kernel, I've
stumbled on the following:

[ 522.645288] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
[ 522.646271] IP: [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
[ 522.646976] PGD b0a79067 PUD ae9cf067 PMD 0
[ 522.647494] Oops: 0000 [#1] PREEMPT SMP
[ 522.648000] Dumping ftrace buffer:
[ 522.648380] (ftrace buffer empty)
[ 522.648775] Modules linked in:
[ 522.649125] CPU: 0 PID: 11735 Comm: trinity-c50 Not tainted 3.14.0-rc2-next-20140214-sasha-00008-g95d9d16-dirty #85
[ 522.650021] task: ffff8800c00bb000 ti: ffff88007fdb8000 task.ti: ffff88007fdb8000
[ 522.650021] RIP: 0010:[<ffffffff81186c6f>] [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
[ 522.650021] RSP: 0018:ffff880226e03ba8 EFLAGS: 00010046
[ 522.650021] RAX: 0000000000000000 RBX: ffff880226fd79c0 RCX: 0000000000000008
[ 522.650021] RDX: 0000000000000000 RSI: ffff880211313000 RDI: 000000000000000c
[ 522.650021] RBP: ffff880226e03be8 R08: 0000000000000000 R09: 000000000000b4bb
[ 522.650021] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 522.650021] R13: ffff880211313068 R14: ffff8800c00bb000 R15: 0000000000000000
[ 522.650021] FS: 00007f435269f700(0000) GS:ffff880226e00000(0000) knlGS:0000000000000000
[ 522.650021] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 522.650021] CR2: 0000000000000150 CR3: 00000000abd2c000 CR4: 00000000000006f0
[ 522.650021] DR0: 0000000000995750 DR1: 0000000000000000 DR2: 0000000000000000
[ 522.650021] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000600
[ 522.650021] Stack:
[ 522.650021] ffff880211313000 01ff880226fd79c0 ffff880211313000 ffff880226fd79c0
[ 522.650021] ffff880226fd79c0 ffff880211313000 0000000000000000 ffff880226e00000
[ 522.650021] ffff880226e03c08 ffffffff8117361d ffff880226fd79c0 ffff880226fd79c0
[ 522.650021] Call Trace:
[ 522.650021] <IRQ>
[ 522.650021] [<ffffffff8117361d>] check_preempt_curr+0x3d/0xb0
[ 522.650021] [<ffffffff81175d88>] ttwu_do_wakeup+0x18/0x130
[ 522.650021] [<ffffffff81175ee4>] T.2248+0x44/0x50
[ 522.650021] [<ffffffff81175f9e>] ttwu_queue+0xae/0xd0
[ 522.650021] [<ffffffff81180224>] ? try_to_wake_up+0x34/0x2a0
[ 522.650021] [<ffffffff81180454>] try_to_wake_up+0x264/0x2a0
[ 522.650021] [<ffffffff811a1672>] ? __lock_acquired+0x2a2/0x2e0
[ 522.650021] [<ffffffff8118049d>] default_wake_function+0xd/0x10
[ 522.650021] [<ffffffff811952f8>] autoremove_wake_function+0x18/0x40
[ 522.650021] [<ffffffff811951b2>] __wake_up_common+0x52/0x90
[ 522.650021] [<ffffffff8119550d>] ? __wake_up+0x2d/0x70
[ 522.650021] [<ffffffff81195523>] __wake_up+0x43/0x70
[ 522.650021] [<ffffffff843119a3>] p9_client_cb+0x43/0x70
[ 522.650021] [<ffffffff84319d05>] req_done+0x105/0x110
[ 522.650021] [<ffffffff81cafca6>] vring_interrupt+0x86/0xa0
[ 522.650021] [<ffffffff811b9a28>] ? handle_irq_event+0x38/0x70
[ 522.650021] [<ffffffff811b9779>] handle_irq_event_percpu+0x129/0x3a0
[ 522.650021] [<ffffffff811b9a33>] handle_irq_event+0x43/0x70
[ 522.650021] [<ffffffff811bd1e8>] handle_edge_irq+0xe8/0x120
[ 522.650021] [<ffffffff81070a34>] handle_irq+0x164/0x180
[ 522.650021] [<ffffffff811833c9>] ? vtime_account_system+0x79/0x90
[ 522.650021] [<ffffffff81183435>] ? vtime_common_account_irq_enter+0x55/0x60
[ 522.650021] [<ffffffff8106f629>] do_IRQ+0x59/0x100
[ 522.650021] [<ffffffff84395e72>] common_interrupt+0x72/0x72
[ 522.650021] <EOI>
[ 522.650021] [<ffffffff812510d5>] ? context_tracking_user_exit+0x1a5/0x1c0
[ 522.650021] [<ffffffff8107cfdd>] syscall_trace_enter+0x2d/0x280
[ 522.650021] [<ffffffff8439f081>] tracesys+0x7e/0xe2
[ 522.650021] Code: 0f 1f 40 00 ff c8 4d 8b ad 48 01 00 00 39 d0 7f f3 eb 18 66 0f 1f 84 00 00 00 00 00 4d 8b a4 24 48 01 00 00 4d 8b ad 48 01 00 00 <49> 8b bc 24 50 01 00 00 49 3b bd 50 01 00 00 75 e0 48 85 ff 74
[ 522.650021] RIP [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
[ 522.650021] RSP <ffff880226e03ba8>
[ 522.650021] CR2: 0000000000000150
[ 522.650021] ---[ end trace adce75aec8b1b32f ]---

Since it's pretty inlined, the code points to:

check_preempt_wakeup()
find_matching_se()
find_matching_se()
check_preempt_wakeup()


static inline struct cfs_rq *
is_same_group(struct sched_entity *se, struct sched_entity *pse)
{
	if (se->cfs_rq == pse->cfs_rq) <=== HERE
		return se->cfs_rq;

	return NULL;
}


Thanks,
Sasha


2014-02-15 23:33:05

by Sasha Levin

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/15/2014 06:27 PM, Sasha Levin wrote:
> Hi folks,
>
> While fuzzing with trinity inside a KVM tools guest running latest -next kernel, I've
> stumbled on the following:

As soon as I finished writing that mail I hit it again, with a different (but similar) stack
trace.

[ 770.993016] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
[ 770.993865] IP: [<ffffffff8118ef99>] pick_next_task_fair+0x109/0x290
[ 770.994531] PGD 1addee067 PUD 1addef067 PMD 0
[ 770.995018] Oops: 0000 [#1] PREEMPT SMP
[ 770.995573] Dumping ftrace buffer:
[ 770.995928] (ftrace buffer empty)
[ 770.996304] Modules linked in:
[ 770.996661] CPU: 0 PID: 13754 Comm: trinity-c155 Not tainted 3.14.0-rc2-next-20140214
[ 770.997646] task: ffff88021151b000 ti: ffff88016b9f4000 task.ti: ffff88016b9f4000
[ 770.998384] RIP: 0010:[<ffffffff8118ef99>] [<ffffffff8118ef99>] pick_next_task_fair+
[ 770.999254] RSP: 0018:ffff88016b9f5bc8 EFLAGS: 00010097
[ 770.999787] RAX: 000000004caed01b RBX: ffff880226fd79c0 RCX: 000000000004ccca
[ 771.000035] RDX: 0000000000a7076b RSI: ffff880060ff8028 RDI: ffff88008b998078
[ 771.000035] RBP: ffff88016b9f5c08 R08: 0000000000000000 R09: 0000000000000001
[ 771.000035] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88008b998000
[ 771.000035] R13: 0000000000000000 R14: ffff880226fd7a88 R15: ffff880060ffb7c8
[ 771.000035] FS: 00007f6e01002700(0000) GS:ffff880226e00000(0000) knlGS:0000000000000
[ 771.000035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 771.000035] CR2: 0000000000000150 CR3: 00000001feeef000 CR4: 00000000000006f0
[ 771.000035] DR0: 00007f6e009b2000 DR1: 0000000000000000 DR2: 0000000000000000
[ 771.000035] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000600
[ 771.000035] Stack:
[ 771.000035] ffff880100000001 ffffffff00000001 ffff88016b9f5c08 ffff880226fd79c0
[ 771.000035] 0000000000000000 ffff88021151b990 0000000000000282 00000000ffffffff
[ 771.000035] ffff88016b9f5c88 ffffffff8438ef35 ffff88016b9f5c78 ffffffff811a19a5
[ 771.000035] Call Trace:
[ 771.000035] [<ffffffff8438ef35>] __schedule+0x2a5/0x840
[ 771.000035] [<ffffffff811a19a5>] ? __lock_contended+0x205/0x240
[ 771.000035] [<ffffffff8438f795>] schedule+0x65/0x70
[ 771.000035] [<ffffffff8438fb73>] schedule_preempt_disabled+0x13/0x20
[ 771.000035] [<ffffffff8439105d>] mutex_lock_nested+0x2ad/0x510
[ 771.000035] [<ffffffff812ed326>] ? lookup_slow+0x46/0xd0
[ 771.000035] [<ffffffff812ed70d>] ? unlazy_walk+0x16d/0x1e0
[ 771.000035] [<ffffffff812ed326>] ? lookup_slow+0x46/0xd0
[ 771.000035] [<ffffffff812ed326>] lookup_slow+0x46/0xd0
[ 771.000035] [<ffffffff812efbe5>] path_lookupat+0xe5/0x660
[ 771.000035] [<ffffffff812b97ea>] ? kmem_cache_alloc+0x1fa/0x300
[ 771.000035] [<ffffffff812eb497>] ? getname_flags+0x57/0x1c0
[ 771.000035] [<ffffffff812f018f>] filename_lookup+0x2f/0xd0
[ 771.000035] [<ffffffff812f155c>] user_path_at_empty+0x6c/0xb0
[ 771.000035] [<ffffffff812510b5>] ? context_tracking_user_exit+0x185/0x1c0
[ 771.000035] [<ffffffff811a3ccd>] ? trace_hardirqs_on+0xd/0x10
[ 771.000035] [<ffffffff812f15ac>] user_path_at+0xc/0x10
[ 771.000035] [<ffffffff812de913>] do_sys_truncate+0x43/0xc0
[ 771.000035] [<ffffffff812de9a9>] SyS_truncate+0x9/0x10
[ 771.000035] [<ffffffff8439f0e0>] tracesys+0xdd/0xe2
[ 771.000035] Code: 4d 8b ad 48 01 00 00 39 c2 7c 19 4d 8b b7 50 01 00 00 4c 89 fe 4c 89 f7 e8 55 98 ff ff 4d 8b bf 48 01 00 00 4d 8b b7 50 01 00 00 <49> 8b bd 50 01 00 00 49 39 fe 75 a3 4d 85 f6 74 9e 4c 89 ee 4c
[ 771.000035] RIP [<ffffffff8118ef99>] pick_next_task_fair+0x109/0x290
[ 771.000035] RSP <ffff88016b9f5bc8>
[ 771.000035] CR2: 0000000000000150
[ 771.000035] ---[ end trace 408e14968ec7dd7a ]---


Thanks,
Sasha

2014-02-16 19:19:17

by Peter Zijlstra

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On Sat, Feb 15, 2014 at 06:27:52PM -0500, Sasha Levin wrote:
> Hi folks,
>
> While fuzzing with trinity inside a KVM tools guest running latest -next kernel, I've
> stumbled on the following:
>
> [ 522.645288] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
> [ 522.646271] IP: [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
>
> Since it's pretty inlined, the code points to:
>
> check_preempt_wakeup()
> find_matching_se()
> find_matching_se()
> check_preempt_wakeup()
>
>
> static inline struct cfs_rq *
> is_same_group(struct sched_entity *se, struct sched_entity *pse)
> {
> if (se->cfs_rq == pse->cfs_rq) <=== HERE
> return se->cfs_rq;
>
> return NULL;
> }

Hrm.. that means we got se->depth wrong. I'll have a poke tomorrow.
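
For readers following along, the link between se->depth and the fault can be sketched in userspace. In the first oops, the faulting bytes `<49> 8b bc 24 50 01 00 00` decode to `mov 0x150(%r12),%rdi` with R12 = 0, which matches CR2 = 0x150 and suggests a NULL sched_entity being read at the cfs_rq offset. The code below is a simplified model and not the kernel source: se_model, cfs_rq_model, find_matching_se_model() and demo_walk() are made-up names, and the NULL checks exist only so the model reports the bad walk instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct cfs_rq and struct sched_entity. */
struct cfs_rq_model { int id; };

struct se_model {
	int depth;                    /* cached distance from the root */
	struct se_model *parent;      /* NULL for a top-level entity */
	struct cfs_rq_model *cfs_rq;  /* queue this entity is (to be) queued on */
};

/*
 * Model of the depth-based walk: lift the deeper entity first, then climb
 * both in lockstep until they share a queue. Returns 0 on a match and -1
 * where a walk that trusts the cached depth would dereference NULL.
 */
static int find_matching_se_model(struct se_model **se, struct se_model **pse)
{
	assert(se && pse && *se && *pse);

	int se_depth = (*se)->depth;
	int pse_depth = (*pse)->depth;

	while (se_depth > pse_depth) {
		se_depth--;
		*se = (*se)->parent;
		if (!*se)
			return -1;
	}
	while (pse_depth > se_depth) {
		pse_depth--;
		*pse = (*pse)->parent;
		if (!*pse)
			return -1;
	}
	while ((*se)->cfs_rq != (*pse)->cfs_rq) {
		*se = (*se)->parent;
		*pse = (*pse)->parent;
		if (!*se || !*pse)
			return -1; /* the model's analogue of the is_same_group() fault */
	}
	return 0;
}

/* Task A sits one level below the root; task B sits at the root. */
int demo_walk(int cached_depth_of_a)
{
	struct cfs_rq_model root_rq = { 0 }, group_rq = { 1 };
	struct se_model group = { 0, NULL, &root_rq };
	struct se_model task_a = { cached_depth_of_a, &group, &group_rq };
	struct se_model task_b = { 0, NULL, &root_rq };
	struct se_model *se = &task_a, *pse = &task_b;

	return find_matching_se_model(&se, &pse);
}
```

With the correct cached depth, demo_walk(1) finds the common queue. With a stale depth of 0, the lockstep loop runs task B past the root and the model returns -1, which is where the real code would touch a NULL entity.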

2014-02-17 08:11:23

by Michael wang

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

Hi, Sasha

On 02/16/2014 07:27 AM, Sasha Levin wrote:
> Hi folks,
>
> While fuzzing with trinity inside a KVM tools guest running latest -next
> kernel, I've
> stumbled on the following:

I've reproduced the same issue with tip/master. The patch below fixed the
problem on my box, and some rcu stall info disappeared along with it.
Would you like to give it a try?

BTW, I reproduced it with these steps:
1. change current to RT
2. move it to a cpu-cgroup at a different depth
3. change it back to FAIR

It seems to be caused by RT having no task_move_group() implementation to
maintain depth, which leads to a wrong depth after the task is switched
back to FAIR...

Regards,
Michael Wang



diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 235cfa7..4445e56 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
  */
 static void switched_to_fair(struct rq *rq, struct task_struct *p)
 {
-	if (!p->se.on_rq)
+	struct sched_entity *se = &p->se;
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	se->depth = se->parent ? se->parent->depth + 1 : 0;
+#endif
+	if (!se->on_rq)
 		return;

 	/*
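
The reproduction steps and the effect of the patch can be sketched as a small userspace model. This is only an illustration under stated assumptions: task_model, se_model, move_group_model(), switched_to_fair_model() and depth_after_repro() are hypothetical names, and the model keeps just the two fields the thread talks about (parent and depth). It assumes, as described above, that a cgroup move always re-parents the entity but that only the fair class refreshes the cached depth.

```c
#include <assert.h>
#include <stddef.h>

enum class_model { CLASS_FAIR, CLASS_RT };

/* Hypothetical, reduced stand-ins for sched_entity and task_struct. */
struct se_model {
	int depth;
	struct se_model *parent;
};

struct task_model {
	enum class_model sched_class;
	struct se_model se;
};

/*
 * A cgroup move always updates the parent pointer. In this model only the
 * fair class also refreshes depth, mirroring the gap described for RT.
 */
static void move_group_model(struct task_model *p, struct se_model *new_parent)
{
	assert(p != NULL);
	p->se.parent = new_parent;
	if (p->sched_class == CLASS_FAIR)
		p->se.depth = new_parent ? new_parent->depth + 1 : 0;
}

/* with_fix != 0 applies the same depth recomputation as the patch. */
static void switched_to_fair_model(struct task_model *p, int with_fix)
{
	assert(p != NULL);
	p->sched_class = CLASS_FAIR;
	if (with_fix)
		p->se.depth = p->se.parent ? p->se.parent->depth + 1 : 0;
}

/* Runs the three reproduction steps and returns the task's cached depth. */
int depth_after_repro(int with_fix)
{
	struct se_model group = { 0, NULL };  /* a top-level group entity */
	struct task_model p = { CLASS_FAIR, { 0, NULL } };

	p.sched_class = CLASS_RT;             /* 1. change current to RT */
	move_group_model(&p, &group);         /* 2. move to a deeper cgroup */
	switched_to_fair_model(&p, with_fix); /* 3. change it back to FAIR */
	return p.se.depth;
}
```

Without the recomputation the task comes back to FAIR with depth 0 although its parent is a depth-0 group, so the correct value would be 1. With it, the cached depth matches the hierarchy again.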


> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2014-02-17 09:21:14

by Peter Zijlstra

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On Mon, Feb 17, 2014 at 04:11:09PM +0800, Michael wang wrote:
> BTW, I reproduced it with these steps:
> 1. change current to RT
> 2. move it to a cpu-cgroup at a different depth
> 3. change it back to FAIR
>
> It seems to be caused by RT having no task_move_group() implementation to
> maintain depth, which leads to a wrong depth after the task is switched
> back to FAIR...


> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 235cfa7..4445e56 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
> */
> static void switched_to_fair(struct rq *rq, struct task_struct *p)
> {
> - if (!p->se.on_rq)
> + struct sched_entity *se = &p->se;
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> + se->depth = se->parent ? se->parent->depth + 1 : 0;
> +#endif
> + if (!se->on_rq)
> return;
>
> /*

Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
absolutely sure we catch all; but if this is sufficient it's better.

Thanks!
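
The set_task_rq() alternative mentioned here amounts to recomputing depth at the single place where the parent pointer changes, so the two can never drift apart regardless of scheduling class. The following is a hypothetical userspace sketch of that idea, not kernel code: se_model, set_parent_model() and depth_with_central_update() are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct sched_entity. */
struct se_model {
	int depth;
	struct se_model *parent;
};

/*
 * Single choke point: whoever changes the parent also refreshes depth,
 * so no scheduling class can leave a stale value behind.
 */
static void set_parent_model(struct se_model *se, struct se_model *parent)
{
	assert(se != NULL);
	se->parent = parent;
	se->depth = parent ? parent->depth + 1 : 0;
}

/* Builds root group -> child group -> task and returns the task's depth. */
int depth_with_central_update(void)
{
	struct se_model root_group = { 0, NULL };
	struct se_model child_group = { 0, NULL };
	struct se_model task = { 0, NULL };

	set_parent_model(&child_group, &root_group); /* depth becomes 1 */
	set_parent_model(&task, &child_group);       /* depth becomes 2 */
	return task.depth;
}
```

The trade-off discussed in the thread is that this runs on every re-parenting, while the posted patch only touches the switch back to FAIR.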

2014-02-17 21:08:16

by Sasha Levin

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/17/2014 03:11 AM, Michael wang wrote:
> Hi, Sasha
>
> On 02/16/2014 07:27 AM, Sasha Levin wrote:
>> Hi folks,
>>
>> While fuzzing with trinity inside a KVM tools guest running latest -next
>> kernel, I've
>> stumbled on the following:
>
> I've reproduced the same issue with tip/master. The patch below fixed the
> problem on my box, and some rcu stall info disappeared along with it.
> Would you like to give it a try?
>
> BTW, I reproduced it with these steps:
> 1. change current to RT
> 2. move it to a cpu-cgroup at a different depth
> 3. change it back to FAIR
>
> It seems to be caused by RT having no task_move_group() implementation to
> maintain depth, which leads to a wrong depth after the task is switched
> back to FAIR...

I *think* it works. There seems to be another sched issue that causes lockups,
so I can't say for certain that this one doesn't occur anymore.

I'm still working on collecting data for the other issue, I'll mail about it soon.


Thanks,
Sasha

2014-02-18 02:27:09

by Michael wang

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
[snip]
>> static void switched_to_fair(struct rq *rq, struct task_struct *p)
>> {
>> - if (!p->se.on_rq)
>> + struct sched_entity *se = &p->se;
>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>> + se->depth = se->parent ? se->parent->depth + 1 : 0;
>> +#endif
>> + if (!se->on_rq)
>> return;
>>
>> /*
>
> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
> absolutely sure we catch all; but if this is sufficient it's better.

Agree, let's wait for Sasha's testing result then :)

Regards,
Michael Wang

>
> Thanks!

2014-02-18 02:28:22

by Michael wang

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/18/2014 05:07 AM, Sasha Levin wrote:
[snip]
>
> I *think* it works. There seems to be another sched issue that causes
> lockups,
> so I can't say for certain that this one doesn't occur anymore.
>
> I'm still working on collecting data for the other issue, I'll mail
> about it soon.

Thanks for that, looking forward to the results :)

Regards,
Michael Wang

>
>
> Thanks,
> Sasha
>

2014-02-19 16:17:08

by Peter Zijlstra

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On Mon, Feb 17, 2014 at 04:11:09PM +0800, Michael wang wrote:
> > While fuzzing with trinity inside a KVM tools guest running latest -next
> > kernel, I've
> > stumbled on the following:
>
> I've reproduced the same issue with tip/master. The patch below fixed the
> problem on my box, and some rcu stall info disappeared along with it.
> Would you like to give it a try?
>
> BTW, I reproduced it with these steps:
> 1. change current to RT
> 2. move it to a cpu-cgroup at a different depth
> 3. change it back to FAIR
>
> It seems to be caused by RT having no task_move_group() implementation to
> maintain depth, which leads to a wrong depth after the task is switched
> back to FAIR...
>
> Regards,
> Michael Wang
>
>
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 235cfa7..4445e56 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
> */
> static void switched_to_fair(struct rq *rq, struct task_struct *p)
> {
> - if (!p->se.on_rq)
> + struct sched_entity *se = &p->se;
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> + se->depth = se->parent ? se->parent->depth + 1 : 0;
> +#endif
> + if (!se->on_rq)
> return;
>
> /*


Michael, do you think you can send a proper patch for this?

2014-02-19 18:10:58

by Sasha Levin

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/17/2014 09:26 PM, Michael wang wrote:
> On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
> [snip]
>>> static void switched_to_fair(struct rq *rq, struct task_struct *p)
>>> {
>>> - if (!p->se.on_rq)
>>> + struct sched_entity *se = &p->se;
>>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>>> + se->depth = se->parent ? se->parent->depth + 1 : 0;
>>> +#endif
>>> + if (!se->on_rq)
>>> return;
>>>
>>> /*
>>
>> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
>> absolutely sure we catch all; but if this is sufficient it's better.
> Agree, let's wait for Sasha's testing result then :)

I took my time with testing, since it seems I'm hitting new issues with both sched and mm, and I
wanted to confirm I don't see this one any more.

It does seem like this patch fixes the problem for me, so:

Tested-by: Sasha Levin <[email protected]>


Thanks,
Sasha

2014-02-19 18:37:48

by Peter Zijlstra

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On Wed, Feb 19, 2014 at 01:10:22PM -0500, Sasha Levin wrote:
> On 02/17/2014 09:26 PM, Michael wang wrote:
> >On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
> >[snip]
> >>> static void switched_to_fair(struct rq *rq, struct task_struct *p)
> >>> {
> >>> - if (!p->se.on_rq)
> >>> + struct sched_entity *se = &p->se;
> >>> +#ifdef CONFIG_FAIR_GROUP_SCHED
> >>> + se->depth = se->parent ? se->parent->depth + 1 : 0;
> >>> +#endif
> >>> + if (!se->on_rq)
> >>> return;
> >>>
> >>> /*
> >>
> >> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
> >> absolutely sure we catch all; but if this is sufficient it's better.
> > Agree, let's wait for Sasha's testing result then :)
>
> I took my time with testing, since it seems I'm hitting new issues with both
> sched and mm, and I wanted to confirm I don't see this one any more.
>
> It does seem like this patch fixes the problem for me, so:
>
> Tested-by: Sasha Levin <[email protected]>
>

Thanks!

2014-02-20 02:19:10

by Michael wang

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/20/2014 12:16 AM, Peter Zijlstra wrote:
[snip]
>>
>>
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 235cfa7..4445e56 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
>> */
>> static void switched_to_fair(struct rq *rq, struct task_struct *p)
>> {
>> - if (!p->se.on_rq)
>> + struct sched_entity *se = &p->se;
>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>> + se->depth = se->parent ? se->parent->depth + 1 : 0;
>> +#endif
>> + if (!se->on_rq)
>> return;
>>
>> /*
>
>
> Michael, do you think you can send a proper patch for this?

My pleasure :) will post it later.

Regards,
Michael Wang


2014-02-20 02:22:31

by Michael wang

Subject: Re: sched: fair: NULL ptr deref in check_preempt_wakeup

On 02/20/2014 02:10 AM, Sasha Levin wrote:
> On 02/17/2014 09:26 PM, Michael wang wrote:
>> On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
>> [snip]
>>>> static void switched_to_fair(struct rq *rq, struct task_struct *p)
>>>> {
>>>> - if (!p->se.on_rq)
>>>> + struct sched_entity *se = &p->se;
>>>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>>>> + se->depth = se->parent ? se->parent->depth + 1 : 0;
>>>> +#endif
>>>> + if (!se->on_rq)
>>>> return;
>>>>
>>>> /*
>>>
>>> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
>>> absolutely sure we catch all; but if this is sufficient it's better.
>> Agree, let's wait for Sasha's testing result then :)
>
> I took my time with testing, since it seems I'm hitting new issues with both
> sched and mm, and I wanted to confirm I don't see this one any more.
>
> It does seem like this patch fixes the problem for me, so:
>
> Tested-by: Sasha Levin <[email protected]>

Thanks for the testing :) will post the patch later.

Regards,
Michael Wang

>
>
> Thanks,
> Sasha
>