As of now, asym_cpu_capacity_level tries to locate the lowest
topology level where the highest available CPU capacity is
visible to all CPUs. This works perfectly fine for most existing
asymmetric designs out there. For some possible and completely
valid setups that combine different CPU microarchitectures within
clusters, though, this might not be the best approach: it can end up
pointing at a level at which some of the domains do not see any
asymmetry at all. This could be problematic for misfit migration and/or
energy aware placement, and for affected platforms it might result
in custom changes to wake-up and CPU selection paths.
As mentioned in the previous version, based on the available sources out there,
one of the platforms potentially affected by the original approach might be
Exynos 9820/990 with its 'sliced' LLC (SLC) divided between the two custom (big)
cores and the remaining A75/A55 cores, which seems to be reflected in the
DT entries made available for those platforms.
The following patches rework how the asymmetry detection is
carried out, pinning the asymmetric topology level to the lowest one
where the full range of CPU capacities is visible to all CPUs within a given
sched domain. The asym_cpu_capacity_level will also keep track of those
levels where any scope of asymmetry is observed, to mark the
corresponding sched domains with the SD_ASYM_CPUCAPACITY flag
and to enable misfit migration for those.
In order to distinguish the sched domains with partial vs full range
of CPU capacity asymmetry, a new sched domain flag has been introduced:
SD_ASYM_CPUCAPACITY_FULL.
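
To make the partial vs full distinction concrete, here is a minimal
user-space sketch. It is not the code from the series: the demo_* names
and the flag bit values are made up for illustration, and it assumes
that a domain seeing the full set of capacities carries both flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values; the real ones are generated from sd_flags.h. */
#define DEMO_SD_ASYM_CPUCAPACITY      0x1u
#define DEMO_SD_ASYM_CPUCAPACITY_FULL 0x2u

/* Count distinct capacity values in caps[0..n). O(n^2) is fine for a demo. */
static size_t demo_count_unique(const unsigned long *caps, size_t n)
{
	size_t unique = 0;

	for (size_t i = 0; i < n; i++) {
		bool seen = false;

		for (size_t j = 0; j < i; j++) {
			if (caps[j] == caps[i]) {
				seen = true;
				break;
			}
		}
		if (!seen)
			unique++;
	}
	return unique;
}

/*
 * Classify one domain, given the capacities of all CPUs in the system and
 * of the CPUs the domain spans (the domain is assumed to be a subset).
 */
unsigned int demo_classify_domain(const unsigned long *sys_caps, size_t sys_n,
				  const unsigned long *dom_caps, size_t dom_n)
{
	size_t sys_unique, dom_unique;

	assert(sys_n > 0 && dom_n > 0 && dom_n <= sys_n);

	sys_unique = demo_count_unique(sys_caps, sys_n);
	dom_unique = demo_count_unique(dom_caps, dom_n);

	if (dom_unique < 2)
		return 0;			/* no asymmetry visible */
	if (dom_unique == sys_unique)
		return DEMO_SD_ASYM_CPUCAPACITY | DEMO_SD_ASYM_CPUCAPACITY_FULL;
	return DEMO_SD_ASYM_CPUCAPACITY;	/* partial asymmetry only */
}
```

With capacities 512/871/1024 in the system, a domain spanning only the
512 and 871 CPUs would get the partial flag alone, while a domain
spanning only 1024 CPUs would get neither.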
The overall idea of changing the asymmetry detection has been suggested
by Valentin Schneider <[email protected]>
Verified on (mostly):
- QEMU (version 4.2.1) with variants of possible asymmetric topologies
- machine: virt
- modifying the device-tree 'cpus' node for virt machine:
    qemu-system-aarch64 -kernel $KERNEL_IMG \
      -drive format=qcow2,file=$IMAGE \
      -append 'root=/dev/vda earlycon console=ttyAMA0 sched_debug sched_verbose loglevel=15 kmemleak=on' \
      -m 2G --nographic \
      -cpu cortex-a57 -machine virt -smp cores=8 \
      -machine dumpdtb=$CUSTOM_DTB.dtb
    $KERNEL_PATH/scripts/dtc/dtc -I dtb -O dts $CUSTOM_DTB.dtb > \
      $CUSTOM_DTB.dts
    (modify the dts)
    $KERNEL_PATH/scripts/dtc/dtc -I dts -O dtb $CUSTOM_DTB.dts > \
      $CUSTOM_DTB.dtb
    qemu-system-aarch64 -kernel $KERNEL_IMG \
      -drive format=qcow2,file=$IMAGE \
      -append 'root=/dev/vda earlycon console=ttyAMA0 sched_debug sched_verbose loglevel=15 kmemleak=on' \
      -m 2G --nographic \
      -cpu cortex-a57 -machine virt -smp cores=8 \
      -machine dtb=$CUSTOM_DTB.dtb
v4:
- Based on Peter's idea, reworking asym detection to use per-cpu
capacity list to serve as base for determining the asym scope
v3:
- Additional style/doc fixes
v2:
- Fixed style issues
- Reworked accessing the cached topology data as suggested by Valentin
Beata Michalska (3):
sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag
sched/topology: Rework CPU capacity asymmetry detection
sched/doc: Update the CPU capacity asymmetry bits
Documentation/scheduler/sched-capacity.rst | 6 +-
Documentation/scheduler/sched-energy.rst | 2 +-
include/linux/sched/sd_flags.h | 10 +++
kernel/sched/topology.c | 131 ++++++++++++++++++-----------
4 files changed, 96 insertions(+), 53 deletions(-)
--
2.7.4
Introduce a new sched_domain topology flag, complementary to
SD_ASYM_CPUCAPACITY, to distinguish between sched_domains where any CPU
capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
a full range of CPU capacities is visible to all domain members
(SD_ASYM_CPUCAPACITY_FULL).
With the distinction between full and partial CPU capacity asymmetry
brought in by the newly introduced flag, the scope of the original
SD_ASYM_CPUCAPACITY flag shifts. It still maintains the existing
behaviour when asymmetry is detected on a given sched domain, allowing
misfit migrations within sched domains that do not observe the full range
of CPU capacities but still have members with different capacity
values. It does, though, lose its meaning when it comes to the lowest CPU
asymmetry sched_domain level per-cpu pointer, which is now to be
denoted by the SD_ASYM_CPUCAPACITY_FULL flag.
Signed-off-by: Beata Michalska <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
---
include/linux/sched/sd_flags.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
index 34b21e9..57bde66 100644
--- a/include/linux/sched/sd_flags.h
+++ b/include/linux/sched/sd_flags.h
@@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
/*
+ * Domain members have different CPU capacities spanning all unique CPU
+ * capacity values.
+ *
+ * SHARED_PARENT: Set from the topmost domain down to the first domain where
+ * all available CPU capacities are visible
+ * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
+ */
+SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
+
+/*
* Domain members share CPU capacity (i.e. SMT)
*
* SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
--
2.7.4
Update the documentation bits referring to capacity aware scheduling
with regard to the newly introduced SD_ASYM_CPUCAPACITY_FULL sched_domain
flag.
Signed-off-by: Beata Michalska <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
---
Documentation/scheduler/sched-capacity.rst | 6 ++++--
Documentation/scheduler/sched-energy.rst | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/Documentation/scheduler/sched-capacity.rst b/Documentation/scheduler/sched-capacity.rst
index 9b7cbe4..805f85f 100644
--- a/Documentation/scheduler/sched-capacity.rst
+++ b/Documentation/scheduler/sched-capacity.rst
@@ -284,8 +284,10 @@ whether the system exhibits asymmetric CPU capacities. Should that be the
case:
- The sched_asym_cpucapacity static key will be enabled.
-- The SD_ASYM_CPUCAPACITY flag will be set at the lowest sched_domain level that
- spans all unique CPU capacity values.
+- The SD_ASYM_CPUCAPACITY_FULL flag will be set at the lowest sched_domain
+ level that spans all unique CPU capacity values.
+- The SD_ASYM_CPUCAPACITY flag will be set for any sched_domain that spans
+ CPUs with any range of asymmetry.
The sched_asym_cpucapacity static key is intended to guard sections of code that
cater to asymmetric CPU capacity systems. Do note however that said key is
diff --git a/Documentation/scheduler/sched-energy.rst b/Documentation/scheduler/sched-energy.rst
index afe02d3..8fbce5e 100644
--- a/Documentation/scheduler/sched-energy.rst
+++ b/Documentation/scheduler/sched-energy.rst
@@ -328,7 +328,7 @@ section lists these dependencies and provides hints as to how they can be met.
As mentioned in the introduction, EAS is only supported on platforms with
asymmetric CPU topologies for now. This requirement is checked at run-time by
-looking for the presence of the SD_ASYM_CPUCAPACITY flag when the scheduling
+looking for the presence of the SD_ASYM_CPUCAPACITY_FULL flag when the scheduling
domains are built.
See Documentation/scheduler/sched-capacity.rst for requirements to be met for this
--
2.7.4
On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
>
> Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> topology flag, to distinguish between shed_domains where any CPU
> capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> a full range of CPU capacities is visible to all domain members
> (SD_ASYM_CPUCAPACITY_FULL).
I'm not sure what you want to detect:
Is it a sched_domain level with a full range of CPU capacity, i.e.
with at least 1 CPU of min capacity and 1 of max capacity?
Or do you want to get at least 1 CPU of each capacity?
>
> With the distinction between full and partial CPU capacity asymmetry,
> brought in by the newly introduced flag, the scope of the original
> SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> behaviour when one is detected on a given sched domain, allowing
> misfit migrations within sched domains that do not observe full range
> of CPU capacities but still do have members with different capacity
> values. It loses though it's meaning when it comes to the lowest CPU
> asymmetry sched_domain level per-cpu pointer, which is to be now
> denoted by SD_ASYM_CPUCAPACITY_FULL flag.
>
> Signed-off-by: Beata Michalska <[email protected]>
> Reviewed-by: Valentin Schneider <[email protected]>
> ---
> include/linux/sched/sd_flags.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> index 34b21e9..57bde66 100644
> --- a/include/linux/sched/sd_flags.h
> +++ b/include/linux/sched/sd_flags.h
> @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
>
> /*
> + * Domain members have different CPU capacities spanning all unique CPU
> + * capacity values.
> + *
> + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> + * all available CPU capacities are visible
> + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> + */
> +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> +
> +/*
> * Domain members share CPU capacity (i.e. SMT)
> *
> * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> --
> 2.7.4
>
On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> >
> > Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> > topology flag, to distinguish between shed_domains where any CPU
> > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > a full range of CPU capacities is visible to all domain members
> > (SD_ASYM_CPUCAPACITY_FULL).
>
> I'm not sure about what you want to detect:
>
> Is it a sched_domain level with a full range of cpu capacity, i.e.
> with at least 1 min capacity and 1 max capacity ?
> or do you want to get at least 1 cpu of each capacity ?
That would be at least one CPU of each available capacity within a given domain,
so the full -set- of available capacities within a domain.
---
BR
B.
>
>
> >
> > With the distinction between full and partial CPU capacity asymmetry,
> > brought in by the newly introduced flag, the scope of the original
> > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > behaviour when one is detected on a given sched domain, allowing
> > misfit migrations within sched domains that do not observe full range
> > of CPU capacities but still do have members with different capacity
> > values. It loses though it's meaning when it comes to the lowest CPU
> > asymmetry sched_domain level per-cpu pointer, which is to be now
> > denoted by SD_ASYM_CPUCAPACITY_FULL flag.
> >
> > Signed-off-by: Beata Michalska <[email protected]>
> > Reviewed-by: Valentin Schneider <[email protected]>
> > ---
> > include/linux/sched/sd_flags.h | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > index 34b21e9..57bde66 100644
> > --- a/include/linux/sched/sd_flags.h
> > +++ b/include/linux/sched/sd_flags.h
> > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> >
> > /*
> > + * Domain members have different CPU capacities spanning all unique CPU
> > + * capacity values.
> > + *
> > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > + * all available CPU capacities are visible
> > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > + */
> > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > +
> > +/*
> > * Domain members share CPU capacity (i.e. SMT)
> > *
> > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > --
> > 2.7.4
> >
On Tue, 18 May 2021 at 16:27, Beata Michalska <[email protected]> wrote:
>
> On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> > On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> > >
> > > Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> > > topology flag, to distinguish between shed_domains where any CPU
> > > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > > a full range of CPU capacities is visible to all domain members
> > > (SD_ASYM_CPUCAPACITY_FULL).
> >
> > I'm not sure about what you want to detect:
> >
> > Is it a sched_domain level with a full range of cpu capacity, i.e.
> > with at least 1 min capacity and 1 max capacity ?
> > or do you want to get at least 1 cpu of each capacity ?
> That would be at least one CPU of each available capacity within given domain,
> so full -set- of available capacities within a domain.
It would be good to add that clarification.
Although I'm not sure that's the best policy compared to only
getting the range, which would be far simpler to implement.
Do you have a topology example?
>
> ---
> BR
> B.
> >
> >
> > >
> > > With the distinction between full and partial CPU capacity asymmetry,
> > > brought in by the newly introduced flag, the scope of the original
> > > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > > behaviour when one is detected on a given sched domain, allowing
> > > misfit migrations within sched domains that do not observe full range
> > > of CPU capacities but still do have members with different capacity
> > > values. It loses though it's meaning when it comes to the lowest CPU
> > > asymmetry sched_domain level per-cpu pointer, which is to be now
> > > denoted by SD_ASYM_CPUCAPACITY_FULL flag.
> > >
> > > Signed-off-by: Beata Michalska <[email protected]>
> > > Reviewed-by: Valentin Schneider <[email protected]>
> > > ---
> > > include/linux/sched/sd_flags.h | 10 ++++++++++
> > > 1 file changed, 10 insertions(+)
> > >
> > > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > > index 34b21e9..57bde66 100644
> > > --- a/include/linux/sched/sd_flags.h
> > > +++ b/include/linux/sched/sd_flags.h
> > > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > >
> > > /*
> > > + * Domain members have different CPU capacities spanning all unique CPU
> > > + * capacity values.
> > > + *
> > > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > > + * all available CPU capacities are visible
> > > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > > + */
> > > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > +
> > > +/*
> > > * Domain members share CPU capacity (i.e. SMT)
> > > *
> > > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > > --
> > > 2.7.4
> > >
On Tue, 18 May 2021 at 17:09, Beata Michalska <[email protected]> wrote:
>
> On Tue, May 18, 2021 at 04:53:09PM +0200, Vincent Guittot wrote:
> > On Tue, 18 May 2021 at 16:27, Beata Michalska <[email protected]> wrote:
> > >
> > > On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> > > > On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> > > > >
> > > > > Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> > > > > topology flag, to distinguish between shed_domains where any CPU
> > > > > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > > > > a full range of CPU capacities is visible to all domain members
> > > > > (SD_ASYM_CPUCAPACITY_FULL).
> > > >
> > > > I'm not sure about what you want to detect:
> > > >
> > > > Is it a sched_domain level with a full range of cpu capacity, i.e.
> > > > with at least 1 min capacity and 1 max capacity ?
> > > > or do you want to get at least 1 cpu of each capacity ?
> > > That would be at least one CPU of each available capacity within given domain,
> > > so full -set- of available capacities within a domain.
> >
> > Would be good to add the precision.
> Will do.
> >
> > Although I'm not sure if that's the best policy compared to only
> > getting the range which would be far simpler to implement.
> > Do you have some topology example ?
>
> An example from second patch from the series:
>
> DIE [ ]
> MC [ ][ ]
>
> CPU [0] [1] [2] [3] [4] [5] [6] [7]
> Capacity |.....| |.....| |.....| |.....|
> L M B B
The one above, which is described in your patchset, works with the range policy.
>
> Where:
> arch_scale_cpu_capacity(L) = 512
> arch_scale_cpu_capacity(M) = 871
> arch_scale_cpu_capacity(B) = 1024
>
> which could also look like:
>
> DIE [ ]
> MC [ ][ ]
>
> CPU [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
> Capacity |.....| |.....| |.....| |.....| |.....|
> L M B L B
I know that HW guys can come up with crazy ideas, but they would
probably add M instead of L with B in the 2nd cluster, as a boost of
performance at the cost of powering up another "cluster", in which case
the range policy works as well.
>
> Considering only range would mean loosing the 2 (M) CPUs out of sight
> for feec in some cases.
Is it realistic? Considering all the code and complexity added by
patch 2, will we really use it in the end?
Regards,
Vincent
>
> ---
> BR.
> B
> >
> >
> >
> >
> >
> >
> > >
> > > ---
> > > BR
> > > B.
> > > >
> > > >
> > > > >
> > > > > With the distinction between full and partial CPU capacity asymmetry,
> > > > > brought in by the newly introduced flag, the scope of the original
> > > > > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > > > > behaviour when one is detected on a given sched domain, allowing
> > > > > misfit migrations within sched domains that do not observe full range
> > > > > of CPU capacities but still do have members with different capacity
> > > > > values. It loses though it's meaning when it comes to the lowest CPU
> > > > > asymmetry sched_domain level per-cpu pointer, which is to be now
> > > > > denoted by SD_ASYM_CPUCAPACITY_FULL flag.
> > > > >
> > > > > Signed-off-by: Beata Michalska <[email protected]>
> > > > > Reviewed-by: Valentin Schneider <[email protected]>
> > > > > ---
> > > > > include/linux/sched/sd_flags.h | 10 ++++++++++
> > > > > 1 file changed, 10 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > > > > index 34b21e9..57bde66 100644
> > > > > --- a/include/linux/sched/sd_flags.h
> > > > > +++ b/include/linux/sched/sd_flags.h
> > > > > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > > > > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > >
> > > > > /*
> > > > > + * Domain members have different CPU capacities spanning all unique CPU
> > > > > + * capacity values.
> > > > > + *
> > > > > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > > > > + * all available CPU capacities are visible
> > > > > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > > > > + */
> > > > > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > +
> > > > > +/*
> > > > > * Domain members share CPU capacity (i.e. SMT)
> > > > > *
> > > > > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > > > > --
> > > > > 2.7.4
> > > > >
On Tue, May 18, 2021 at 04:53:09PM +0200, Vincent Guittot wrote:
> On Tue, 18 May 2021 at 16:27, Beata Michalska <[email protected]> wrote:
> >
> > On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> > > On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> > > >
> > > > Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> > > > topology flag, to distinguish between shed_domains where any CPU
> > > > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > > > a full range of CPU capacities is visible to all domain members
> > > > (SD_ASYM_CPUCAPACITY_FULL).
> > >
> > > I'm not sure about what you want to detect:
> > >
> > > Is it a sched_domain level with a full range of cpu capacity, i.e.
> > > with at least 1 min capacity and 1 max capacity ?
> > > or do you want to get at least 1 cpu of each capacity ?
> > That would be at least one CPU of each available capacity within given domain,
> > so full -set- of available capacities within a domain.
>
> Would be good to add the precision.
Will do.
>
> Although I'm not sure if that's the best policy compared to only
> getting the range which would be far simpler to implement.
> Do you have some topology example ?
An example from the second patch of the series:

 DIE      [                               ]
 MC       [                       ][      ]

 CPU       [0] [1] [2] [3] [4] [5] [6] [7]
 Capacity  |.....| |.....| |.....| |.....|
              L       M       B       B

Where:
 arch_scale_cpu_capacity(L) = 512
 arch_scale_cpu_capacity(M) = 871
 arch_scale_cpu_capacity(B) = 1024

which could also look like:

 DIE      [                                       ]
 MC       [                       ][              ]

 CPU       [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
 Capacity  |.....| |.....| |.....| |.....| |.....|
              L       M       B       L       B

Considering only the range would mean losing the 2 (M) CPUs out of sight
for feec (find_energy_efficient_cpu()) in some cases.
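
As a rough user-space illustration of the difference (not the code from
patch 2; the demo_* helpers are hypothetical, and the second MC domain
is assumed to span CPUs 6-9, i.e. L and B only):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if value v appears among the n capacities in caps. */
static bool demo_contains(const unsigned long *caps, size_t n, unsigned long v)
{
	for (size_t i = 0; i < n; i++)
		if (caps[i] == v)
			return true;
	return false;
}

/* Range policy: the domain holds a CPU of the system-wide min and max capacity. */
bool demo_full_range(const unsigned long *sys, size_t sys_n,
		     const unsigned long *dom, size_t dom_n)
{
	unsigned long min, max;

	assert(sys_n > 0);
	min = max = sys[0];
	for (size_t i = 1; i < sys_n; i++) {
		if (sys[i] < min)
			min = sys[i];
		if (sys[i] > max)
			max = sys[i];
	}
	return demo_contains(dom, dom_n, min) && demo_contains(dom, dom_n, max);
}

/* Set policy: every capacity value present in the system appears in the domain. */
bool demo_full_set(const unsigned long *sys, size_t sys_n,
		   const unsigned long *dom, size_t dom_n)
{
	for (size_t i = 0; i < sys_n; i++)
		if (!demo_contains(dom, dom_n, sys[i]))
			return false;
	return true;
}
```

For the 10-CPU layout, a domain holding only {512, 512, 1024, 1024}
passes the range check but fails the set check, since 871 is missing.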
---
BR.
B
>
>
>
>
>
>
> >
> > ---
> > BR
> > B.
> > >
> > >
> > > >
> > > > With the distinction between full and partial CPU capacity asymmetry,
> > > > brought in by the newly introduced flag, the scope of the original
> > > > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > > > behaviour when one is detected on a given sched domain, allowing
> > > > misfit migrations within sched domains that do not observe full range
> > > > of CPU capacities but still do have members with different capacity
> > > > values. It loses though it's meaning when it comes to the lowest CPU
> > > > asymmetry sched_domain level per-cpu pointer, which is to be now
> > > > denoted by SD_ASYM_CPUCAPACITY_FULL flag.
> > > >
> > > > Signed-off-by: Beata Michalska <[email protected]>
> > > > Reviewed-by: Valentin Schneider <[email protected]>
> > > > ---
> > > > include/linux/sched/sd_flags.h | 10 ++++++++++
> > > > 1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > > > index 34b21e9..57bde66 100644
> > > > --- a/include/linux/sched/sd_flags.h
> > > > +++ b/include/linux/sched/sd_flags.h
> > > > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > > > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > >
> > > > /*
> > > > + * Domain members have different CPU capacities spanning all unique CPU
> > > > + * capacity values.
> > > > + *
> > > > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > > > + * all available CPU capacities are visible
> > > > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > > > + */
> > > > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > +
> > > > +/*
> > > > * Domain members share CPU capacity (i.e. SMT)
> > > > *
> > > > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > > > --
> > > > 2.7.4
> > > >
On Tue, May 18, 2021 at 05:28:11PM +0200, Vincent Guittot wrote:
> On Tue, 18 May 2021 at 17:09, Beata Michalska <[email protected]> wrote:
> >
> > On Tue, May 18, 2021 at 04:53:09PM +0200, Vincent Guittot wrote:
> > > On Tue, 18 May 2021 at 16:27, Beata Michalska <[email protected]> wrote:
> > > >
> > > > On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> > > > > On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> > > > > >
> > > > > > Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> > > > > > topology flag, to distinguish between shed_domains where any CPU
> > > > > > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > > > > > a full range of CPU capacities is visible to all domain members
> > > > > > (SD_ASYM_CPUCAPACITY_FULL).
> > > > >
> > > > > I'm not sure about what you want to detect:
> > > > >
> > > > > Is it a sched_domain level with a full range of cpu capacity, i.e.
> > > > > with at least 1 min capacity and 1 max capacity ?
> > > > > or do you want to get at least 1 cpu of each capacity ?
> > > > That would be at least one CPU of each available capacity within given domain,
> > > > so full -set- of available capacities within a domain.
> > >
> > > Would be good to add the precision.
> > Will do.
> > >
> > > Although I'm not sure if that's the best policy compared to only
> > > getting the range which would be far simpler to implement.
> > > Do you have some topology example ?
> >
> > An example from second patch from the series:
> >
> > DIE [ ]
> > MC [ ][ ]
> >
> > CPU [0] [1] [2] [3] [4] [5] [6] [7]
> > Capacity |.....| |.....| |.....| |.....|
> > L M B B
>
> The one above , which is described in your patchset, works with the range policy
Yep, but that is just one variation among all the possibilities...
>
> >
> > Where:
> > arch_scale_cpu_capacity(L) = 512
> > arch_scale_cpu_capacity(M) = 871
> > arch_scale_cpu_capacity(B) = 1024
> >
> > which could also look like:
> >
> > DIE [ ]
> > MC [ ][ ]
> >
> > CPU [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
> > Capacity |.....| |.....| |.....| |.....| |.....|
> > L M B L B
>
> I know that that HW guys can come with crazy idea but they would
> probably add M instead of L with B in the 2nd cluster as a boost of
> performance at the cost of powering up another "cluster" in which case
> the range policy works as well
>
> >
> > Considering only range would mean loosing the 2 (M) CPUs out of sight
> > for feec in some cases.
>
> Is it realistic ? Considering all the code and complexity added by
> patch 2, will we really use it at the end ?
>
I do completely agree that the first approach was slightly... blown out of
proportion, but with Peter's idea, the complexity has dropped significantly.
With the range being considered we are back to per-domain tracking of available
capacities (min/max), plus additional cycles on comparing capacities.
Unless I fail to see the simplicity of that approach?
---
BR
B.
> Regards,
> Vincent
> >
> > ---
> > BR.
> > B
> > >
> > >
> > >
> > >
> > >
> > >
> > > >
> > > > ---
> > > > BR
> > > > B.
> > > > >
> > > > >
> > > > > >
> > > > > > With the distinction between full and partial CPU capacity asymmetry,
> > > > > > brought in by the newly introduced flag, the scope of the original
> > > > > > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > > > > > behaviour when one is detected on a given sched domain, allowing
> > > > > > misfit migrations within sched domains that do not observe full range
> > > > > > of CPU capacities but still do have members with different capacity
> > > > > > values. It loses though it's meaning when it comes to the lowest CPU
> > > > > > asymmetry sched_domain level per-cpu pointer, which is to be now
> > > > > > denoted by SD_ASYM_CPUCAPACITY_FULL flag.
> > > > > >
> > > > > > Signed-off-by: Beata Michalska <[email protected]>
> > > > > > Reviewed-by: Valentin Schneider <[email protected]>
> > > > > > ---
> > > > > > include/linux/sched/sd_flags.h | 10 ++++++++++
> > > > > > 1 file changed, 10 insertions(+)
> > > > > >
> > > > > > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > > > > > index 34b21e9..57bde66 100644
> > > > > > --- a/include/linux/sched/sd_flags.h
> > > > > > +++ b/include/linux/sched/sd_flags.h
> > > > > > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > > > > > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > >
> > > > > > /*
> > > > > > + * Domain members have different CPU capacities spanning all unique CPU
> > > > > > + * capacity values.
> > > > > > + *
> > > > > > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > > > > > + * all available CPU capacities are visible
> > > > > > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > > > > > + */
> > > > > > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > > +
> > > > > > +/*
> > > > > > * Domain members share CPU capacity (i.e. SMT)
> > > > > > *
> > > > > > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > > > > > --
> > > > > > 2.7.4
> > > > > >
On Tue, 18 May 2021 at 17:48, Beata Michalska <[email protected]> wrote:
>
> On Tue, May 18, 2021 at 05:28:11PM +0200, Vincent Guittot wrote:
> > On Tue, 18 May 2021 at 17:09, Beata Michalska <[email protected]> wrote:
> > >
> > > On Tue, May 18, 2021 at 04:53:09PM +0200, Vincent Guittot wrote:
> > > > On Tue, 18 May 2021 at 16:27, Beata Michalska <[email protected]> wrote:
> > > > >
> > > > > On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> > > > > > On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> > > > > > >
> > > > > > > Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain
> > > > > > > topology flag, to distinguish between shed_domains where any CPU
> > > > > > > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > > > > > > a full range of CPU capacities is visible to all domain members
> > > > > > > (SD_ASYM_CPUCAPACITY_FULL).
> > > > > >
> > > > > > I'm not sure about what you want to detect:
> > > > > >
> > > > > > Is it a sched_domain level with a full range of cpu capacity, i.e.
> > > > > > with at least 1 min capacity and 1 max capacity ?
> > > > > > or do you want to get at least 1 cpu of each capacity ?
> > > > > That would be at least one CPU of each available capacity within given domain,
> > > > > so full -set- of available capacities within a domain.
> > > >
> > > > Would be good to add the precision.
> > > Will do.
> > > >
> > > > Although I'm not sure if that's the best policy compared to only
> > > > getting the range which would be far simpler to implement.
> > > > Do you have some topology example ?
> > >
> > > An example from second patch from the series:
> > >
> > > DIE [ ]
> > > MC [ ][ ]
> > >
> > > CPU [0] [1] [2] [3] [4] [5] [6] [7]
> > > Capacity |.....| |.....| |.....| |.....|
> > > L M B B
> >
> > The one above , which is described in your patchset, works with the range policy
> Yeap, but that is just a variation of all the possibilities....
> >
> > >
> > > Where:
> > > arch_scale_cpu_capacity(L) = 512
> > > arch_scale_cpu_capacity(M) = 871
> > > arch_scale_cpu_capacity(B) = 1024
> > >
> > > which could also look like:
> > >
> > > DIE [ ]
> > > MC [ ][ ]
> > >
> > > CPU [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
> > > Capacity |.....| |.....| |.....| |.....| |.....|
> > > L M B L B
> >
> > I know that that HW guys can come with crazy idea but they would
> > probably add M instead of L with B in the 2nd cluster as a boost of
> > performance at the cost of powering up another "cluster" in which case
> > the range policy works as well
> >
> > >
> > > Considering only range would mean loosing the 2 (M) CPUs out of sight
> > > for feec in some cases.
> >
> > Is it realistic ? Considering all the code and complexity added by
> > patch 2, will we really use it at the end ?
> >
> I do completely agree that the first approach was slightly .... blown out of
> proportions, but with Peter's idea, the complexity has dropped significantly.
> With the range being considered we are back to per domain tracking of available
> capacities (min/max), plus additional cycles on comparing capacities.
> Unless I fail to see the simplicity of that approach ?
With the range, you just have to keep track of one cpumask for the min
capacity and one for the max capacity (considering that the absolute max
capacity, 1024, might not be in the cpumap) instead of tracking all
capacities and manipulating/updating a dynamic linked list. Then, as soon
as a domain spans at least 1 CPU from each of the two masks, you are done.
At first glance this seems simpler to do.
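
To make the range idea concrete, here is a minimal userspace sketch of it
(plain C, not kernel code). It assumes at most 64 CPUs so a uint64_t can stand
in for a cpumask; struct cap_range_masks, build_range_masks() and
span_has_full_range() are hypothetical names made up for illustration, and the
capacity table is the 10-CPU example from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 10-CPU example from the thread: L L M M B B | L L B B */
static const unsigned int example_cap[10] = {
	512, 512, 871, 871, 1024, 1024, 512, 512, 1024, 1024
};

/* Hypothetical stand-in for two cpumasks: one bit per CPU. */
struct cap_range_masks {
	uint64_t min_mask;	/* CPUs with the lowest capacity  */
	uint64_t max_mask;	/* CPUs with the highest capacity */
};

/* Two passes: find system-wide min/max, then mark the CPUs holding them. */
static struct cap_range_masks
build_range_masks(const unsigned int *cap, size_t nr_cpus)
{
	struct cap_range_masks m = { 0, 0 };
	unsigned int min, max;
	size_t cpu;

	assert(nr_cpus >= 1 && nr_cpus <= 64);

	min = max = cap[0];
	for (cpu = 1; cpu < nr_cpus; cpu++) {
		if (cap[cpu] < min)
			min = cap[cpu];
		if (cap[cpu] > max)
			max = cap[cpu];
	}

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cap[cpu] == min)
			m.min_mask |= 1ULL << cpu;
		if (cap[cpu] == max)
			m.max_mask |= 1ULL << cpu;
	}

	return m;
}

/* A span covers the full range once it intersects both masks. */
static int span_has_full_range(const struct cap_range_masks *m, uint64_t span)
{
	return (span & m->min_mask) != 0 && (span & m->max_mask) != 0;
}
```

With this table, a span covering CPUs 6-9 (L and B only) is reported as having
the full range, which is exactly the case being debated here.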
>
> ---
> BR
> B.
> > Regards,
> > Vincent
> > >
> > > ---
> > > BR.
> > > B
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > >
> > > > > ---
> > > > > BR
> > > > > B.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > With the distinction between full and partial CPU capacity asymmetry,
> > > > > > > brought in by the newly introduced flag, the scope of the original
> > > > > > > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > > > > > > behaviour when one is detected on a given sched domain, allowing
> > > > > > > misfit migrations within sched domains that do not observe full range
> > > > > > > of CPU capacities but still do have members with different capacity
> > > > > > > > values. It loses its meaning, though, when it comes to the lowest CPU
> > > > > > > > asymmetry sched_domain level per-cpu pointer, which is now to be
> > > > > > > > denoted by the SD_ASYM_CPUCAPACITY_FULL flag.
> > > > > > >
> > > > > > > Signed-off-by: Beata Michalska <[email protected]>
> > > > > > > Reviewed-by: Valentin Schneider <[email protected]>
> > > > > > > ---
> > > > > > > include/linux/sched/sd_flags.h | 10 ++++++++++
> > > > > > > 1 file changed, 10 insertions(+)
> > > > > > >
> > > > > > > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > > > > > > index 34b21e9..57bde66 100644
> > > > > > > --- a/include/linux/sched/sd_flags.h
> > > > > > > +++ b/include/linux/sched/sd_flags.h
> > > > > > > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > > > > > > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > > >
> > > > > > > /*
> > > > > > > + * Domain members have different CPU capacities spanning all unique CPU
> > > > > > > + * capacity values.
> > > > > > > + *
> > > > > > > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > > > > > > + * all available CPU capacities are visible
> > > > > > > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > > > > > > + */
> > > > > > > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > > > +
> > > > > > > +/*
> > > > > > > * Domain members share CPU capacity (i.e. SMT)
> > > > > > > *
> > > > > > > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > > > > > > --
> > > > > > > 2.7.4
> > > > > > >
On Tue, May 18, 2021 at 05:56:20PM +0200, Vincent Guittot wrote:
> On Tue, 18 May 2021 at 17:48, Beata Michalska <[email protected]> wrote:
> >
> > On Tue, May 18, 2021 at 05:28:11PM +0200, Vincent Guittot wrote:
> > > On Tue, 18 May 2021 at 17:09, Beata Michalska <[email protected]> wrote:
> > > >
> > > > On Tue, May 18, 2021 at 04:53:09PM +0200, Vincent Guittot wrote:
> > > > > On Tue, 18 May 2021 at 16:27, Beata Michalska <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, May 18, 2021 at 03:39:27PM +0200, Vincent Guittot wrote:
> > > > > > > On Mon, 17 May 2021 at 10:24, Beata Michalska <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Introducing a new sched_domain topology flag, complementary to
> > > > > > > > > SD_ASYM_CPUCAPACITY, to distinguish between sched_domains where any CPU
> > > > > > > > > capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where
> > > > > > > > > a full range of CPU capacities is visible to all domain members
> > > > > > > > > (SD_ASYM_CPUCAPACITY_FULL).
> > > > > > >
> > > > > > > I'm not sure about what you want to detect:
> > > > > > >
> > > > > > > Is it a sched_domain level with a full range of cpu capacity, i.e.
> > > > > > > with at least 1 min capacity and 1 max capacity ?
> > > > > > > or do you want to get at least 1 cpu of each capacity ?
> > > > > > > That would be at least one CPU of each available capacity within a given domain,
> > > > > > > so the full -set- of available capacities within a domain.
> > > > >
> > > > > Would be good to add the precision.
> > > > Will do.
> > > > >
> > > > > Although I'm not sure if that's the best policy compared to only
> > > > > getting the range which would be far simpler to implement.
> > > > > Do you have some topology example ?
> > > >
> > > > An example from second patch from the series:
> > > >
> > > > DIE      [                                ]
> > > > MC       [                       ][       ]
> > > >
> > > > CPU       [0] [1] [2] [3] [4] [5]  [6] [7]
> > > > Capacity  |.....| |.....| |.....|  |.....|
> > > >              L       M       B        B
> > >
> > > The one above, which is described in your patchset, works with the range policy.
> > Yeap, but that is just a variation of all the possibilities....
> > >
> > > >
> > > > Where:
> > > > arch_scale_cpu_capacity(L) = 512
> > > > arch_scale_cpu_capacity(M) = 871
> > > > arch_scale_cpu_capacity(B) = 1024
> > > >
> > > > which could also look like:
> > > >
> > > > DIE      [                                        ]
> > > > MC       [                       ][               ]
> > > >
> > > > CPU       [0] [1] [2] [3] [4] [5]  [6] [7] [8] [9]
> > > > Capacity  |.....| |.....| |.....|  |.....| |.....|
> > > >              L       M       B        L       B
> > >
> > > I know that HW guys can come up with crazy ideas, but they would
> > > probably add M instead of L alongside B in the 2nd cluster, as a
> > > performance boost at the cost of powering up another "cluster", in
> > > which case the range policy works as well.
> > >
> > > >
> > > > Considering only the range would mean losing the 2 (M) CPUs out of sight
> > > > for feec in some cases.
> > >
> > > Is it realistic ? Considering all the code and complexity added by
> > > patch 2, will we really use it at the end ?
> > >
> > I do completely agree that the first approach was slightly ... blown out of
> > proportion, but with Peter's idea, the complexity has dropped significantly.
> > With the range being considered, we are back to per-domain tracking of available
> > capacities (min/max), plus additional cycles spent on comparing capacities.
> > Unless I fail to see the simplicity of that approach?
>
> With the range, you just have to keep track of one cpumask for the min
> capacity and one for the max capacity (considering that the absolute max
> capacity, 1024, might not be in the cpumap) instead of tracking all
> capacities and manipulating/updating a dynamic linked list. Then, as soon
> as a domain spans at least 1 CPU from each of the two masks, you are done.
> At first glance this seems simpler to do.
>
You would still have to go through all the capacities to find min/max,
so it is either going through all available CPUs twice, or tracking capacities
during a single pass. Those masks would also have to be updated to
cover hotplug events, when one of the two might become obsolete.
There is an option being considered to drop updating the list upon every
rebuild of sched domains, and that would simplify things even further.
I do not see any big gain in changing the approach, especially as the current
one covers all of the cases.
The idea is a good one though, so thank you for that.
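
For reference, a minimal userspace sketch (plain C, not the code from the
series) of how the full-set check classifies the 10-CPU example above. It
assumes at most 64 CPUs, with a uint64_t bitmask standing in for a cpumask;
enum asym_class, count_unique_caps() and classify_span() are hypothetical
names used only for illustration. ASYM_PARTIAL is meant to correspond to a
domain that would carry SD_ASYM_CPUCAPACITY only, and ASYM_FULL to one that
would also carry SD_ASYM_CPUCAPACITY_FULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 64

/* 10-CPU example from the thread: L L M M B B | L L B B */
static const unsigned int example_cap[10] = {
	512, 512, 871, 871, 1024, 1024, 512, 512, 1024, 1024
};

/* Hypothetical classification, not a kernel type. */
enum asym_class { ASYM_NONE, ASYM_PARTIAL, ASYM_FULL };

/* Count the distinct capacity values among the CPUs set in 'span'. */
static size_t count_unique_caps(const unsigned int *cap, size_t nr_cpus,
				uint64_t span)
{
	unsigned int seen[MAX_CPUS];
	size_t nr_seen = 0, cpu, i;

	assert(nr_cpus >= 1 && nr_cpus <= MAX_CPUS);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (!(span & (1ULL << cpu)))
			continue;
		for (i = 0; i < nr_seen; i++)
			if (seen[i] == cap[cpu])
				break;
		if (i == nr_seen)
			seen[nr_seen++] = cap[cpu];
	}

	return nr_seen;
}

/* Full: span sees every capacity value present in the system. */
static enum asym_class classify_span(const unsigned int *cap, size_t nr_cpus,
				     uint64_t span)
{
	uint64_t all = nr_cpus == 64 ? ~0ULL : (1ULL << nr_cpus) - 1;
	size_t sys = count_unique_caps(cap, nr_cpus, all);
	size_t local = count_unique_caps(cap, nr_cpus, span);

	if (local <= 1)
		return ASYM_NONE;

	return local == sys ? ASYM_FULL : ASYM_PARTIAL;
}
```

Here the second MC span (CPUs 6-9, L and B) comes out as partial because the
M capacity is not visible from it, whereas a pure min/max range check would
treat that span as complete.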
---
BR
B.
> >
> > ---
> > BR
> > B.
> > > Regards,
> > > Vincent
> > > >
> > > > ---
> > > > BR.
> > > > B
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > ---
> > > > > > BR
> > > > > > B.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > With the distinction between full and partial CPU capacity asymmetry,
> > > > > > > > brought in by the newly introduced flag, the scope of the original
> > > > > > > > SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing
> > > > > > > > behaviour when one is detected on a given sched domain, allowing
> > > > > > > > misfit migrations within sched domains that do not observe full range
> > > > > > > > of CPU capacities but still do have members with different capacity
> > > > > > > > > values. It loses its meaning, though, when it comes to the lowest CPU
> > > > > > > > > asymmetry sched_domain level per-cpu pointer, which is now to be
> > > > > > > > > denoted by the SD_ASYM_CPUCAPACITY_FULL flag.
> > > > > > > >
> > > > > > > > Signed-off-by: Beata Michalska <[email protected]>
> > > > > > > > Reviewed-by: Valentin Schneider <[email protected]>
> > > > > > > > ---
> > > > > > > > include/linux/sched/sd_flags.h | 10 ++++++++++
> > > > > > > > 1 file changed, 10 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> > > > > > > > index 34b21e9..57bde66 100644
> > > > > > > > --- a/include/linux/sched/sd_flags.h
> > > > > > > > +++ b/include/linux/sched/sd_flags.h
> > > > > > > > @@ -91,6 +91,16 @@ SD_FLAG(SD_WAKE_AFFINE, SDF_SHARED_CHILD)
> > > > > > > > SD_FLAG(SD_ASYM_CPUCAPACITY, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > > > >
> > > > > > > > /*
> > > > > > > > + * Domain members have different CPU capacities spanning all unique CPU
> > > > > > > > + * capacity values.
> > > > > > > > + *
> > > > > > > > + * SHARED_PARENT: Set from the topmost domain down to the first domain where
> > > > > > > > + * all available CPU capacities are visible
> > > > > > > > + * NEEDS_GROUPS: Per-CPU capacity is asymmetric between groups.
> > > > > > > > + */
> > > > > > > > +SD_FLAG(SD_ASYM_CPUCAPACITY_FULL, SDF_SHARED_PARENT | SDF_NEEDS_GROUPS)
> > > > > > > > +
> > > > > > > > +/*
> > > > > > > > * Domain members share CPU capacity (i.e. SMT)
> > > > > > > > *
> > > > > > > > * SHARED_CHILD: Set from the base domain up until spanned CPUs no longer share
> > > > > > > > --
> > > > > > > > 2.7.4
> > > > > > > >