2018-05-12 10:03:11

by Heiko Carstens

Subject: [bisected] 051f3ca02e46 "Introduce NUMA identity node sched domain" breaks fake NUMA on s390

Hello,

Andre Wild reported that fake NUMA doesn't work on s390 anymore. Doesn't
work means it crashed for Andre, or it is in an endless loop within
init_sched_groups_capacity() for me (sg != sd->groups is always true).
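
The loop condition quoted above suggests a walk over a circular list of groups that only ends once it is back at the first group. The following is a minimal user-space sketch of that failure mode, not the kernel code: `struct group`, `walk_groups()` and the `limit` parameter are hypothetical names chosen for illustration, and the step limit exists only so that a broken ring is reported instead of looping forever.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler group; NOT the kernel's struct sched_group. */
struct group {
	int id;
	struct group *next;
};

/*
 * Walk the ring starting at head with a do/while loop that stops when the
 * walk is back at head. Returns the number of groups visited, or -1 if the
 * walk hits a NULL pointer or does not return to head within limit steps
 * (the case where "sg != head" would otherwise stay true forever).
 */
static int walk_groups(struct group *head, int limit)
{
	struct group *sg = head;
	int visited = 0;

	assert(head != NULL);
	do {
		if (visited == limit || sg == NULL)
			return -1;
		visited++;
		sg = sg->next;
	} while (sg != head);

	return visited;
}
```

With two groups linked into a proper ring the walk visits both and stops. If the last group does not link back to the first one, for example because it points to itself, the termination condition is never met and only the step limit ends the walk.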

I could reproduce this with a very simple setup with only two nodes, where
each node has only one CPU. This allowed me to bisect it down to commit
051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain").

With that commit reverted the system comes up again and the scheduling
domains look like this:

[ 0.148592] smp: Bringing up secondary CPUs ...
[ 0.148984] smp: Brought up 2 nodes, 2 CPUs
[ 0.149097] CPU0 attaching sched-domain(s):
[ 0.149099] domain-0: span=0-1 level=NUMA
[ 0.149101] groups: 0:{ span=0 }, 1:{ span=1 }
[ 0.149106] CPU1 attaching sched-domain(s):
[ 0.149107] domain-0: span=0-1 level=NUMA
[ 0.149108] groups: 1:{ span=1 }, 0:{ span=0 }
[ 0.149111] span: 0-1 (max cpu_capacity = 1024)

Any idea what's going wrong?

Config file is attached.

Thanks,
Heiko


Attachments:
(No filename) (1.05 kB)
config (78.27 kB)

2018-05-14 09:39:45

by Peter Zijlstra

Subject: Re: [bisected] 051f3ca02e46 "Introduce NUMA identity node sched domain" breaks fake NUMA on s390

On Sat, May 12, 2018 at 12:02:33PM +0200, Heiko Carstens wrote:
> Hello,
>
> Andre Wild reported that fake NUMA doesn't work on s390 anymore. Doesn't
> work means it crashed for Andre, or it is in an endless loop within
> init_sched_groups_capacity() for me (sg != sd->groups is always true).
>
> I could reproduce this with a very simple setup with only two nodes, where
> each node has only one CPU. This allowed me to bisect it down to commit
> 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain").
>
> With that commit reverted the system comes up again and the scheduling
> domains look like this:
>
> [ 0.148592] smp: Bringing up secondary CPUs ...
> [ 0.148984] smp: Brought up 2 nodes, 2 CPUs
> [ 0.149097] CPU0 attaching sched-domain(s):
> [ 0.149099] domain-0: span=0-1 level=NUMA
> [ 0.149101] groups: 0:{ span=0 }, 1:{ span=1 }
> [ 0.149106] CPU1 attaching sched-domain(s):
> [ 0.149107] domain-0: span=0-1 level=NUMA
> [ 0.149108] groups: 1:{ span=1 }, 0:{ span=0 }
> [ 0.149111] span: 0-1 (max cpu_capacity = 1024)
>
> Any idea what's going wrong?

Not yet; still trying to decipher your fake NUMA implementation.

But meanwhile; could you provide me with:

$ cat /sys/devices/system/node/node*/distance
$ cat /sys/devices/system/node/node*/cpulist



2018-05-14 10:31:42

by Heiko Carstens

Subject: Re: [bisected] 051f3ca02e46 "Introduce NUMA identity node sched domain" breaks fake NUMA on s390

On Mon, May 14, 2018 at 11:39:09AM +0200, Peter Zijlstra wrote:
> On Sat, May 12, 2018 at 12:02:33PM +0200, Heiko Carstens wrote:
> > Hello,
> >
> > Andre Wild reported that fake NUMA doesn't work on s390 anymore. Doesn't
> > work means it crashed for Andre, or it is in an endless loop within
> > init_sched_groups_capacity() for me (sg != sd->groups is always true).
> >
> > I could reproduce this with a very simple setup with only two nodes, where
> > each node has only one CPU. This allowed me to bisect it down to commit
> > 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain").
> >
> > With that commit reverted the system comes up again and the scheduling
> > domains look like this:
> >
> > [ 0.148592] smp: Bringing up secondary CPUs ...
> > [ 0.148984] smp: Brought up 2 nodes, 2 CPUs
> > [ 0.149097] CPU0 attaching sched-domain(s):
> > [ 0.149099] domain-0: span=0-1 level=NUMA
> > [ 0.149101] groups: 0:{ span=0 }, 1:{ span=1 }
> > [ 0.149106] CPU1 attaching sched-domain(s):
> > [ 0.149107] domain-0: span=0-1 level=NUMA
> > [ 0.149108] groups: 1:{ span=1 }, 0:{ span=0 }
> > [ 0.149111] span: 0-1 (max cpu_capacity = 1024)
> >
> > Any idea what's going wrong?
>
> > Not yet; still trying to decipher your fake NUMA implementation.
>
> But meanwhile; could you provide me with:
>
> $ cat /sys/devices/system/node/node*/distance
> $ cat /sys/devices/system/node/node*/cpulist

Yes, of course:

$ cat /sys/devices/system/node/node0/distance
0 10
$ cat /sys/devices/system/node/node1/distance
10 0

$ cat /sys/devices/system/node/node0/cpulist
0
$ cat /sys/devices/system/node/node1/cpulist
1
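
For reference, a small sketch of how the distance rows above could be checked programmatically. It assumes the generic kernel default of `LOCAL_DISTANCE` being 10 (as defined in include/linux/topology.h) and compares each node's distance to itself against that value. The helper names `parse_distance_row()` and `self_distance_is_local()` are hypothetical, and the check is not a diagnosis of the reported hang; it only makes the reported values easy to compare.

```c
#include <stdlib.h>

/* Assumption: mirrors the generic kernel default from include/linux/topology.h. */
#define LOCAL_DISTANCE 10

/*
 * Parse one sysfs distance row such as "0 10" into out[].
 * Returns the number of values parsed (at most max).
 */
static int parse_distance_row(const char *row, int *out, int max)
{
	int n = 0;
	char *end;

	while (n < max) {
		long v = strtol(row, &end, 10);

		if (end == row)		/* no further number in the row */
			break;
		out[n++] = (int)v;
		row = end;
	}
	return n;
}

/* Returns 1 if the node's distance to itself equals LOCAL_DISTANCE, else 0. */
static int self_distance_is_local(const char *row, int node)
{
	int d[64];
	int n = parse_distance_row(row, d, 64);

	return node >= 0 && node < n && d[node] == LOCAL_DISTANCE;
}
```

Applied to the rows above, `self_distance_is_local("0 10", 0)` and `self_distance_is_local("10 0", 1)` both return 0, because each node reports a distance of 0 to itself rather than 10.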