by Ning, Hongyu

[permalink] [raw]

Subject: Re: [RFT for v9] (Was Re: [PATCH v8 -tip 00/26] Core scheduling)

On 2020/11/7 4:55, Joel Fernandes wrote:
> All,
>
> I am getting ready to send the next v9 series based on tip/master
> branch. Could you please give the below tree a try and report any results in
> your testing?
> git tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (branch coresched)
> git log:
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/log/?h=coresched
>
> The major changes in this series are the improvements:
> (1)
> "sched: Make snapshotting of min_vruntime more CGroup-friendly"
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=9a20a6652b3c50fd51faa829f7947004239a04eb
>
> (2)
> "sched: Simplify the core pick loop for optimized case"
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=0370117b4fd418cdaaa6b1489bfc14f305691152
>
> And a bug fix:
> (1)
> "sched: Enqueue task into core queue only after vruntime is updated"
> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=401dad5536e7e05d1299d0864e6fc5072029f492
>
> There are also 2 more bug fixes that I squashed-in related to kernel
> protection and a crash seen on the tip/master branch.
>
> Hoping to send the series next week out to the list.
>
> Have a great weekend, and Thanks!
>
> - Joel
>
>
> On Mon, Oct 19, 2020 at 09:43:10PM -0400, Joel Fernandes (Google) wrote:

Adding 4 workloads test results for core scheduling v9 candidate:

- kernel under test:
-- coresched community v9 candidate from https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (branch coresched)
-- latest commit: 2e8591a330ff (HEAD -> coresched, origin/coresched) NEW: sched: Add a coresched command line option
-- coresched=on kernel parameter applied
- workloads:
-- A. sysbench cpu (192 threads) + sysbench cpu (192 threads)
-- B. sysbench cpu (192 threads) + sysbench mysql (192 threads, mysqld forced into the same cgroup)
-- C. uperf netperf.xml (192 threads over TCP or UDP protocol separately)
-- D. will-it-scale context_switch via pipe (192 threads)
- test machine setup:
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 2
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 4
- test results, no obvious performance drop compared to community v8 build:
-- workload A:
+----------------------+------+----------------------+------------------------+
| | ** | sysbench cpu * 192 | sysbench cpu * 192 |
+======================+======+======================+========================+
| cgroup | ** | cg_sysbench_cpu_0 | cg_sysbench_cpu_1 |
+----------------------+------+----------------------+------------------------+
| record_item | ** | Tput_avg (events/s) | Tput_avg (events/s) |
+----------------------+------+----------------------+------------------------+
| coresched_normalized | ** | 0.98 | 1.01 |
+----------------------+------+----------------------+------------------------+
| default_normalized | ** | 1 | 1 |
+----------------------+------+----------------------+------------------------+
| smtoff_normalized | ** | 0.59 | 0.6 |
+----------------------+------+----------------------+------------------------+

-- workload B:
+----------------------+------+----------------------+------------------------+
| | ** | sysbench cpu * 192 | sysbench mysql * 192 |
+======================+======+======================+========================+
| cgroup | ** | cg_sysbench_cpu_0 | cg_sysbench_mysql_0 |
+----------------------+------+----------------------+------------------------+
| record_item | ** | Tput_avg (events/s) | Tput_avg (events/s) |
+----------------------+------+----------------------+------------------------+
| coresched_normalized | ** | 1.02 | 0.78 |
+----------------------+------+----------------------+------------------------+
| default_normalized | ** | 1 | 1 |
+----------------------+------+----------------------+------------------------+
| smtoff_normalized | ** | 0.59 | 0.75 |
+----------------------+------+----------------------+------------------------+

-- workload C:
+----------------------+------+---------------------------+---------------------------+
| | ** | uperf netperf TCP * 192 | uperf netperf UDP * 192 |
+======================+======+===========================+===========================+
| cgroup | ** | cg_uperf | cg_uperf |
+----------------------+------+---------------------------+---------------------------+
| record_item | ** | Tput_avg (Gb/s) | Tput_avg (Gb/s) |
+----------------------+------+---------------------------+---------------------------+
| coresched_normalized | ** | 0.65 | 0.67 |
+----------------------+------+---------------------------+---------------------------+
| default_normalized | ** | 1 | 1 |
+----------------------+------+---------------------------+---------------------------+
| smtoff_normalized | ** | 0.83 | 0.91 |
+----------------------+------+---------------------------+---------------------------+

-- workload D:
+----------------------+------+-------------------------------+
| | ** | will-it-scale * 192 |
| | | (pipe based context_switch) |
+======================+======+===============================+
| cgroup | ** | cg_will-it-scale |
+----------------------+------+-------------------------------+
| record_item | ** | threads_avg |
+----------------------+------+-------------------------------+
| coresched_normalized | ** | 0.29 |
+----------------------+------+-------------------------------+
| default_normalized | ** | 1.00 |
+----------------------+------+-------------------------------+
| smtoff_normalized | ** | 0.87 |
+----------------------+------+-------------------------------+

- notes on test results record_item:
* coresched_normalized: smton, cs enabled, test result normalized by default value
* default_normalized: smton, cs disabled, test result normalized by default value
* smtoff_normalized: smtoff, test result normalized by default value

Hongyu

2020-11-13 10:05:50

by Ning, Hongyu

[permalink] [raw]

Subject: Re: [RFT for v9] (Was Re: [PATCH v8 -tip 00/26] Core scheduling)

On 2020/11/13 17:22, Ning, Hongyu wrote:
> On 2020/11/7 4:55, Joel Fernandes wrote:
>> All,
>>
>> I am getting ready to send the next v9 series based on tip/master
>> branch. Could you please give the below tree a try and report any results in
>> your testing?
>> git tree:
>> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (branch coresched)
>> git log:
>> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/log/?h=coresched
>>
>> The major changes in this series are the improvements:
>> (1)
>> "sched: Make snapshotting of min_vruntime more CGroup-friendly"
>> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=9a20a6652b3c50fd51faa829f7947004239a04eb
>>
>> (2)
>> "sched: Simplify the core pick loop for optimized case"
>> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=0370117b4fd418cdaaa6b1489bfc14f305691152
>>
>> And a bug fix:
>> (1)
>> "sched: Enqueue task into core queue only after vruntime is updated"
>> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=coresched-v9-for-test&id=401dad5536e7e05d1299d0864e6fc5072029f492
>>
>> There are also 2 more bug fixes that I squashed-in related to kernel
>> protection and a crash seen on the tip/master branch.
>>
>> Hoping to send the series next week out to the list.
>>
>> Have a great weekend, and Thanks!
>>
>> - Joel
>>
>>
>> On Mon, Oct 19, 2020 at 09:43:10PM -0400, Joel Fernandes (Google) wrote:
>
> Adding 4 workloads test results for core scheduling v9 candidate:
>
> - kernel under test:
> -- coresched community v9 candidate from https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (branch coresched)
> -- latest commit: 2e8591a330ff (HEAD -> coresched, origin/coresched) NEW: sched: Add a coresched command line option
> -- coresched=on kernel parameter applied
> - workloads:
> -- A. sysbench cpu (192 threads) + sysbench cpu (192 threads)
> -- B. sysbench cpu (192 threads) + sysbench mysql (192 threads, mysqld forced into the same cgroup)
> -- C. uperf netperf.xml (192 threads over TCP or UDP protocol separately)
> -- D. will-it-scale context_switch via pipe (192 threads)
> - test machine setup:
> CPU(s): 192
> On-line CPU(s) list: 0-191
> Thread(s) per core: 2
> Core(s) per socket: 48
> Socket(s): 2
> NUMA node(s): 4
> - test results, no obvious performance drop compared to community v8 build:
> -- workload A:
> +----------------------+------+----------------------+------------------------+
> | | ** | sysbench cpu * 192 | sysbench cpu * 192 |
> +======================+======+======================+========================+
> | cgroup | ** | cg_sysbench_cpu_0 | cg_sysbench_cpu_1 |
> +----------------------+------+----------------------+------------------------+
> | record_item | ** | Tput_avg (events/s) | Tput_avg (events/s) |
> +----------------------+------+----------------------+------------------------+
> | coresched_normalized | ** | 0.98 | 1.01 |
> +----------------------+------+----------------------+------------------------+
> | default_normalized | ** | 1 | 1 |
> +----------------------+------+----------------------+------------------------+
> | smtoff_normalized | ** | 0.59 | 0.6 |
> +----------------------+------+----------------------+------------------------+
>
> -- workload B:
> +----------------------+------+----------------------+------------------------+
> | | ** | sysbench cpu * 192 | sysbench mysql * 192 |
> +======================+======+======================+========================+
> | cgroup | ** | cg_sysbench_cpu_0 | cg_sysbench_mysql_0 |
> +----------------------+------+----------------------+------------------------+
> | record_item | ** | Tput_avg (events/s) | Tput_avg (events/s) |
> +----------------------+------+----------------------+------------------------+
> | coresched_normalized | ** | 1.02 | 0.78 |
> +----------------------+------+----------------------+------------------------+
> | default_normalized | ** | 1 | 1 |
> +----------------------+------+----------------------+------------------------+
> | smtoff_normalized | ** | 0.59 | 0.75 |
> +----------------------+------+----------------------+------------------------+
>
> -- workload C:
> +----------------------+------+---------------------------+---------------------------+
> | | ** | uperf netperf TCP * 192 | uperf netperf UDP * 192 |
> +======================+======+===========================+===========================+
> | cgroup | ** | cg_uperf | cg_uperf |
> +----------------------+------+---------------------------+---------------------------+
> | record_item | ** | Tput_avg (Gb/s) | Tput_avg (Gb/s) |
> +----------------------+------+---------------------------+---------------------------+
> | coresched_normalized | ** | 0.65 | 0.67 |
> +----------------------+------+---------------------------+---------------------------+
> | default_normalized | ** | 1 | 1 |
> +----------------------+------+---------------------------+---------------------------+
> | smtoff_normalized | ** | 0.83 | 0.91 |
> +----------------------+------+---------------------------+---------------------------+
>
> -- workload D:
> +----------------------+------+-------------------------------+
> | | ** | will-it-scale * 192 |
> | | | (pipe based context_switch) |
> +======================+======+===============================+
> | cgroup | ** | cg_will-it-scale |
> +----------------------+------+-------------------------------+
> | record_item | ** | threads_avg |
> +----------------------+------+-------------------------------+
> | coresched_normalized | ** | 0.29 |
> +----------------------+------+-------------------------------+
> | default_normalized | ** | 1.00 |
> +----------------------+------+-------------------------------+
> | smtoff_normalized | ** | 0.87 |
> +----------------------+------+-------------------------------+
>
> - notes on test results record_item:
> * coresched_normalized: smton, cs enabled, test result normalized by default value
> * default_normalized: smton, cs disabled, test result normalized by default value
> * smtoff_normalized: smtoff, test result normalized by default value
>
>
> Hongyu
>

Add 2 more negative test case:

- continuously toggle cpu.core_tag, during workload running with cs_on
- continuously toggle smt setting via /sys/devices/system/cpu/smt/control, during workload running with cs_on

no kernel panic or platform hang observed.