2011-02-07 10:00:44

by Daniel Tiron

[permalink] [raw]
Subject: Does the scheduler know about the cache topology?

Hi all.

I did some performance tests on a Core 2 Quad machine [1] with QEMU.
A QEMU instance creates one main thread and one thread for each virtual
CPU. There were two VMs with one CPU each, which makes four threads.

I tried different combinations where I pinned each thread to a physical
core with taskset and measured the network performance between the VMs
with iperf [2]. The best result was achieved with each VM (main and CPU
thread) assigned to one cache group (cores 0 & 1 and 2 & 3).
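
For reference, the same pinning can be done programmatically; taskset is
just a front end for the affinity syscalls. A minimal Python sketch
(Linux-only; for simplicity it pins the calling process itself rather
than a QEMU thread):

```python
import os

# Pin the calling process (pid 0 = self) to core 0, as
# `taskset -pc 0 <pid>` would do from the shell. On the machine
# above, cores 0 & 1 share one L2 cache.
os.sched_setaffinity(0, {0})

# Read the mask back to confirm the pinning took effect.
print(sorted(os.sched_getaffinity(0)))  # [0]
```

To pin a specific QEMU thread instead, you would pass its thread ID
(as shown by `ps -eLf`) in place of 0.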

But it also turns out that letting the scheduler handle the assignment
works well, too: the results without pinning were only slightly below
the best. So I was wondering, is the Linux scheduler aware of the CPU's
cache topology?

I'm curious to hear your opinion.

Thanks,
Daniel

[1] Core 0 and 1 share one L2 cache and so do 2 and 3
[2] The topic of my research is networking performance. My interest in
cache awareness is only a side effect.


2011-02-07 10:11:47

by Vaidyanathan Srinivasan

[permalink] [raw]
Subject: Re: Does the scheduler know about the cache topology?

* Daniel Tiron <[email protected]> [2011-02-07 10:51:42]:

> Hi all.
>
> I did some performance tests on a Core 2 Quad machine [1] with QEMU.
> A QEMU instance creates one main thread and one thread for each virtual
> CPU. There were two VMs with one CPU each, which makes four threads.
>
> I tried different combinations where I pinned each thread to a physical
> core with taskset and measured the network performance between the VMs
> with iperf [2]. The best result was achieved with each VM (main and CPU
> thread) assigned to one cache group (cores 0 & 1 and 2 & 3).
>
> But it also turns out that letting the scheduler handle the assignment
> works well, too: the results without pinning were only slightly below
> the best. So I was wondering, is the Linux scheduler aware of the CPU's
> cache topology?

Yes, the sched domains are created based on the socket or L2 cache
boundaries. The scheduler will try to keep a task on the same CPU, or
move it close by if it does have to migrate the task.

The CPU topology and cache domains in an SMP system are captured in the
form of a sched domain tree within the scheduler, and this structure is
consulted during task scheduling and migration.
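
To make the idea concrete, here is a toy Python model (not kernel code,
just an illustration) of such a sched domain tree for the Core 2 Quad
above, where cores 0 & 1 share an L2 cache and so do 2 & 3:

```python
# Illustrative model of a two-level sched domain tree. The balancer
# prefers the smallest domain containing the task's current CPU before
# escalating to a larger one, so cache-sharing siblings come first.

DOMAINS = [
    [{0, 1}, {2, 3}],   # level 0: L2 cache domains
    [{0, 1, 2, 3}],     # level 1: the whole socket
]

def migration_candidates(cpu):
    """Yield CPU sets in the order the balancer would consider them:
    cache-sharing siblings first, then the rest of the socket."""
    for level in DOMAINS:
        for group in level:
            if cpu in group:
                yield group - {cpu}
                break

print(list(migration_candidates(0)))  # [{1}, {1, 2, 3}]
```

So a task on core 0 would, under this model, be moved to core 1 in
preference to cores 2 or 3, keeping its L2 working set warm.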

When running VMs there is an interesting side effect: the host
scheduler knows the cache domains, but the guest scheduler does not.
If the guest scheduler keeps moving tasks between the vCPUs, the cache
affinity and its benefits can be lost.

> I'm curious to hear your opinion.
>
> Thanks,
> Daniel
>
> [1] Core 0 and 1 share one L2 cache and so do 2 and 3
> [2] The topic of my research is networking performance. My interest in
> cache awareness is only a side effect.

Interrupt delivery and routing may also affect network performance.

--Vaidy

2011-02-07 12:20:31

by Chulmin Kim

[permalink] [raw]
Subject: RE: Does the scheduler know about the cache topology?

As far as I know, the Linux scheduler manages the physical cores using
sched_domains.

Simply speaking, a sched_domain is a group of CPUs, and load balancing
is usually done within the group.
(The groups form a hierarchical tree; you can find more detailed
information about this structure on the web.)

This grouping can be done by SMT, NUMA, and cache topology, through
ACPI information obtained when the system is booted.
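
The resulting cache grouping is also exported to userspace via sysfs,
in files such as /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list
(index2 is typically the L2 cache), using the kernel's "cpulist" format.
A small sketch of parsing that format:

```python
def parse_cpulist(s):
    """Parse the kernel's cpulist format ("0-1", "0,2-3", ...) into a
    set of CPU numbers, as found in sysfs files such as
    /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list."""
    cpus = set()
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

# On the Core 2 Quad discussed above, cpu0's L2 entry would read "0-1":
print(parse_cpulist("0-1"))    # {0, 1}
print(parse_cpulist("0,2-3"))  # {0, 2, 3}
```

Reading those files for each CPU lets you reconstruct the same cache
groups the scheduler builds its domains from.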

Hope this is useful for you.

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Daniel Tiron
Sent: Monday, February 07, 2011 6:52 PM
To: LKML
Subject: Does the scheduler know about the cache topology?

Hi all.

I did some performance tests on a Core 2 Quad machine [1] with QEMU.
A QEMU instance creates one main thread and one thread for each virtual CPU.
There were two VMs with one CPU each, which makes four threads.

I tried different combinations where I pinned each thread to a physical
core with taskset and measured the network performance between the VMs with
iperf [2]. The best result was achieved with each VM (main and CPU
thread) assigned to one cache group (cores 0 & 1 and 2 & 3).

But it also turns out that letting the scheduler handle the assignment works
well, too: the results without pinning were only slightly below the best.
So I was wondering, is the Linux scheduler aware of the CPU's cache
topology?

I'm curious to hear your opinion.

Thanks,
Daniel

[1] Core 0 and 1 share one L2 cache and so do 2 and 3
[2] The topic of my research is networking performance. My interest in
cache awareness is only a side effect.