2013-03-25 10:10:09

by Christian P. Schmidt

Subject: kworkers for dm-crypt locked to CPU core 0?

Hi everyone,

I am trying to troubleshoot some strange performance issues I am seeing
on a machine of mine. The machine has 10 drives, each mapped via its own
dm-crypt instance. The aggregate (read) throughput hovers around
120-130 MB/s (according to iostat -x -d) when running one instance of
dd if=/dev/dm-<instance> of=/dev/null bs=1M per mapping. However, "top"
shows that one core is loaded with 99% system time, two to three cores
are spinning in iowait, and each of the kworkers gets roughly 10% CPU.
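
For reference, the test looks roughly like this (the dm-N numbering is
just an example):

    for i in $(seq 0 9); do
        dd if=/dev/dm-$i of=/dev/null bs=1M &
    done
    iostat -x -d 5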

"taskset" shows that all of the kworkers have an affinity mask of 1, so
they can run on core 0 only, and trying to change it results in an error
of "Invalid argument".

Is there a way I can make the scheduler put those on multiple cores?

Please cc: me since I am not subscribed.

Regards,
Christian


2013-03-25 10:32:13

by Andi Kleen

Subject: Re: kworkers for dm-crypt locked to CPU core 0?

Christian Schmidt <[email protected]> writes:
>
> Is there a way I can make the scheduler put those on multiple cores?

Submit the IO from multiple cores. Don't use dd.
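
For example something along these lines with fio (untested sketch; the
device node is a placeholder):

    fio --name=cryptread --filename=/dev/dm-0 --rw=read --bs=1M \
        --direct=1 --ioengine=libaio --iodepth=8 --numjobs=4 \
        --group_reporting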

-Andi
--
[email protected] -- Speaking for myself only

2013-03-25 10:56:29

by Christian P. Schmidt

Subject: Re: kworkers for dm-crypt locked to CPU core 0?

Hi Andi,

On 25/03/13 11:32, Andi Kleen wrote:
> Christian Schmidt <[email protected]> writes:
>>
>> Is there a way I can make the scheduler put those on multiple cores?
>
> Submit the IO from multiple cores. Don't use dd.

The dd processes run on multiple cores. I understand you mean "submit
multiple requests to the same device from multiple cores", but I only
see kworkers on core 0 active, even with the dd processes locked to
separate cores (via numactl; roughly as sketched below). Having just
skimmed pcrypt.c I think I see the answer - do I understand correctly
that the kworker runs on the same core as the requestor?
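
What I mean by locked to separate cores (core and device numbers are
just examples):

    numactl --physcpubind=2 dd if=/dev/dm-3 of=/dev/null bs=1M &
    numactl --physcpubind=5 dd if=/dev/dm-7 of=/dev/null bs=1M &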

Basically, that means I cannot raise the aggregate throughput across all
devices above that of a single device by changing the stacking (e.g.
dm-crypt on top of md-raid, below md-raid, or below zfs/btrfs vdevs),
because unless the file system itself submits I/O from multiple threads,
all user-space requests will appear to the crypto layer to originate
from core 0, irrespective of the core the application runs on? I just
tested this with ZFS, and all kworkers run on core 0 during file I/O or
while resilvering a raidz. I don't have a btrfs file system managing its
own devices at hand to compare, though.
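
(For the record, I watched the kworker placement with something like

    watch -n1 'ps -eLo pid,psr,pcpu,comm | grep kworker'

where "psr" is the CPU each thread last ran on.)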

Thanks for your input,
Christian