2007-10-11 02:30:45

by poison

Subject: linux-2.6.23 - acting funny

Hi :)
I have two hard disks with encfs on top of reiserfs. Before the upgrade from
2.6.22 to 2.6.23 I could copy data between them at ~22MB/s.
After the upgrade the transfer rate is stuck at ~14MB/s, and changing nice
values did not help.

And now the funny part:
I noticed the transfer rate go up to ~20MB/s when I started compiling stuff.
I just need to run a CPU hog like:
while true; do echo test > /dev/null; done
and the transfer rate jumps from ~14MB/s to ~20MB/s.

top shows the two encfs processes with ~30%(read) and 50%(write) CPU usage no
matter if I run the CPU hog or not.
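
A minimal sketch of how the comparison above could be repeated in a controlled way, assuming a POSIX-style shell. The helper name with_hog is hypothetical and not part of the original report; it starts the same busy loop in the background, runs the given command, and stops the loop again afterwards.

```shell
# with_hog: hypothetical helper, not from the original report.
# Runs "$@" while a background busy loop keeps one CPU core occupied.
with_hog() {
    # The same busy loop as above, started in a background subshell.
    ( while true; do echo test > /dev/null; done ) &
    hog_pid=$!

    # Run the command under test and remember its exit status.
    status=0
    "$@" || status=$?

    # Stop the busy loop and reap it so it does not linger.
    kill "$hog_pid" 2>/dev/null || true
    wait "$hog_pid" 2>/dev/null || true
    return "$status"
}
```

In bash, something like `time with_hog cp SRC DST` could then be compared against a plain `time cp SRC DST`, where SRC and DST are placeholders for files on the two encfs mounts.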

Could this possibly be due to the new scheduler? Do I need to tune anything?

System: E6600, 4GB RAM, Slackware 12, config attached.
Do you need anything else?

PS: please CC me, I'm not subscribed


Attachments:
config.gz (12.40 kB)

2007-10-11 12:36:00

by Helmut Toplizer

Subject: Re: linux-2.6.23 - acting funny

Hi!

I have seen similar behavior in kernel releases for as long as I can remember.
(You may find some reports about it at
http://marc.info/?a=113508574400006&r=1&w=2)

Maybe your problem is similar.

Here's what has been found out:
Plugging in EHCI devices triggers some strange DMA behavior,
which causes delays because of the CPU HLT instruction.
(DMA is handled with delays during HLT.)

Possible fixes:
1) Kernel parameter: idle=poll
This disables HLT, at the cost of more heat and noise from the CPU.

2) Don't insert EHCI-USB devices

3) Patch: attached, try it at your own risk.
You need to add the kernel boot parameter "disableviahlt".
(You've got a VIA chipset, right?)

Please report back to me if 1/2 works or to linux-ide if the patch works.
Thanks

Helmut


Attachments:
02_hlt_dma_patch.diff (1.85 kB)

2007-10-11 22:53:40

by poison

Subject: Re: linux-2.6.23 - acting funny

Hi =)

On Thursday 11 October 2007, Helmut Toplizer wrote:
> Hi!
>
> I have seen similar behavior in kernel releases for as long as I can remember.
It didn't happen before 2.6.23.

> (You may find some reports about it at
> http://marc.info/?a=113508574400006&r=1&w=2)
>
> Maybe your problem is similar.
>
> Here's what has been found out:
> Plugging in EHCI devices triggers some strange DMA behavior,
> which causes delays because of the CPU HLT instruction.
> (DMA is handled with delays during HLT.)
>
> Possible fixes:
> 1) Kernel parameter: idle=poll
> This disables HLT, at the cost of more heat and noise from the CPU.
$ cat /proc/cmdline
root=/dev/sdb1 ro vga=794 idle=poll

Same story.
If I start:
$ while true; do echo test > /dev/null; done
... the transfer rate goes up.
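
To confirm that a boot parameter such as idle=poll really took effect, the running kernel's command line can be checked as in the transcript above. A minimal sketch, assuming a POSIX-style shell; the function name has_boot_param and its optional file argument are hypothetical additions for illustration.

```shell
# has_boot_param: hypothetical helper, not from the original thread.
# Succeeds if PARAM appears as a whole space-separated word in the
# kernel command line. The optional second argument defaults to
# /proc/cmdline so the check can also be run against a saved copy.
has_boot_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}

    # The command line is a single line of text.
    read -r line < "$cmdline_file" || true

    # Pad with spaces so only complete words match.
    case " $line " in
        *" $param "*) return 0 ;;
    esac
    return 1
}
```

For example, `has_boot_param idle=poll && echo "polling idle loop requested"`.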

>
> 2) Don't insert EHCI-USB devices
Still reproducible with all USB devices except keyboard+mouse removed, ehci
deselected in kernel config and booting with idle=poll.

>
> 3) Patch: attached, try it at your own risk.
> You need to add the kernel boot parameter "disableviahlt".
> (You've got a VIA chipset, right?)
>
> Please report back to me if 1/2 works or to linux-ide if the patch works.
No VIA chip in sight. Mainboard is an Intel 975XBX2 and hard disks are
connected to the ICH7 SATA Controller, so I don't think the patch will help
me ^^
lspci attached.

Also the transfer rate didn't degrade too much for copying directly from
reiserfs to reiserfs and not using encfs:

dd if=/mnt/.backup/2CpGkrxvz6wgA0b0xloz8PavzMLrMymOgi9 of=/mnt/.tdata/test
1033+0 records in
1033+0 records out
1083179008 bytes (1.1 GB) copied, 16.9303 s, 64.0 MB/s

Plus the transfer rate doesn't increase if I start a CPU hog while copying
between reiserfs.
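
The 64.0 MB/s figure follows directly from the byte count and elapsed time that dd prints. A small sketch of that arithmetic, assuming a POSIX-style shell with awk; the function name mb_per_s is hypothetical. GNU dd reports decimal megabytes (10^6 bytes), so the sketch does the same.

```shell
# mb_per_s: hypothetical helper, not from the original thread.
# Prints throughput in decimal MB/s (10^6 bytes per second).
mb_per_s() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Figures taken from the dd output above.
mb_per_s 1083179008 16.9303
```

As an aside, 1083179008 bytes is exactly 1033 x 1048576, so the 1033 records are consistent with a 1 MiB block size, even though no bs= option appears in the command as posted.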

So it looks more to me like there's a bad interaction between the new
scheduler, fuse and encfs ...


> Thanks
>
> Helmut

Thanks for your reply ^^


Attachments:
lspci (18.35 kB)

2007-10-12 06:05:39

by Ingo Molnar

Subject: Re: linux-2.6.23 - acting funny


* poison wrote:

> Also the transfer rate didn't degrade too much for copying directly
> from reiserfs to reiserfs and not using encfs:
>
> dd if=/mnt/.backup/2CpGkrxvz6wgA0b0xloz8PavzMLrMymOgi9 of=/mnt/.tdata/test
> 1033+0 records in
> 1033+0 records out
> 1083179008 bytes (1.1 GB) copied, 16.9303 s, 64.0 MB/s
>
> Plus the transfer rate doesn't increase if I start a CPU hog while
> copying between reiserfs.
>
> So it looks more to me like there's a bad interaction between the new
> scheduler, fuse and encfs ...

i have no quick ideas - the behavior you are seeing is quite unexpected.
Could you try the current sched-devel code:

http://redhat.com/~mingo/cfs-scheduler/devel/sched-devel-combo-v2.6.23.patch

since this version of CFS does various things differently than the one
in v2.6.23, let's see whether perturbing it makes any difference to your
throughput.

you could also try the scheduler backport to v2.6.22.10, at:

http://redhat.com/~mingo/cfs-scheduler/

that would establish whether it's the changes in scheduling that cause
this or something else. Plus please enable CONFIG_SCHED_DEBUG and
CONFIG_SCHEDSTATS and run this debug script while such a transfer is
going on:

http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

and send me the resulting file.
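
Before running the debug script it may be worth confirming that both options made it into the kernel that is actually booted. A minimal sketch, assuming a POSIX-style shell and an uncompressed kernel config file; the function name config_enabled is hypothetical. On kernels built with CONFIG_IKCONFIG_PROC the running configuration is also available, compressed, as /proc/config.gz.

```shell
# config_enabled: hypothetical helper, not from the original thread.
# Succeeds if OPTION is set to "y" in the given uncompressed config file.
# Lines such as "# CONFIG_FOO is not set" or "CONFIG_FOO=m" do not match.
config_enabled() {
    config_file=$1
    option=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF -- "${option}=y" "$config_file"
}
```

For example, run in the kernel source tree: `config_enabled .config CONFIG_SCHED_DEBUG && config_enabled .config CONFIG_SCHEDSTATS`.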

Ingo

2007-10-14 20:55:30

by poison

Subject: Re: linux-2.6.23 - acting funny

Hi and thanks for your reply :)

On Friday 12 October 2007, you wrote:
> i have no quick ideas - the behavior you are seeing is quite unexpected.
> Could you try the current sched-devel code:
>
>
> http://redhat.com/~mingo/cfs-scheduler/devel/sched-devel-combo-v2.6.23.patch
Maybe I messed something up. I first applied that patch and tested. Then
reversed the patch with patch -R ... but in both cases the output of the
cfs-debug-info script contained:
Sched Debug Version: v0.05-v20

>
> since this version of CFS does various things differently than the one
> in v2.6.23, let's see whether perturbing it makes any difference to your
> throughput.
Without Hog: ~15MB/s
With Hog: ~19MB/s


>
> you could also try the scheduler backport to v2.6.22.10, at:
>
> http://redhat.com/~mingo/cfs-scheduler/
with sched-cfs-v2.6.22.9-v22.patch applied to 2.6.22.10 (I didn't spot one for
2.6.22.10):
Without Hog: ~14MB/s
With Hog: ~19MB/s
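
To compare the kernels on equal footing, the hog's effect can be expressed as a relative gain. A small sketch of the arithmetic, assuming a POSIX-style shell with awk and using the approximate figures reported in this thread; the function name gain_percent is hypothetical.

```shell
# gain_percent: hypothetical helper, not from the original thread.
# Prints how much faster the copy ran with the hog, in percent of the
# rate measured without it.
gain_percent() {
    LC_ALL=C awk -v base="$1" -v hog="$2" \
        'BEGIN { printf "%.1f\n", (hog - base) / base * 100 }'
}

gain_percent 14 20   # stock 2.6.23 (first report)
gain_percent 15 19   # 2.6.23 after the sched-devel patch attempt
gain_percent 14 19   # 2.6.22.10 with the CFS v22 backport
```

Since the inputs are rough readings, only the approximate size of the effect is meaningful.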

>
> that would establish whether it's the changes in scheduling that cause
> this or something else. Plus please enable CONFIG_SCHED_DEBUG and
> CONFIG_SCHEDSTATS and run this debug script while such a transfer is
> going on:
>
> http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh
>
> and send me the resulting file.
I created one for each tested kernel with
$ while true; do echo test>/dev/null; done
running and without.

I'll send you the files privately.
Thanks for your time.

Regards,
Andreas