2002-09-19 00:17:34

by Duncan Sands

Subject: fsync 50 times slower after 2.5.27

I noticed a performance degradation in recent kernels:
fsync takes around 50 times longer in kernels 2.5.28 to
2.5.34 when the system is under heavy load, as compared
to kernels <= 2.5.27. I noticed this because it makes kmail
unusable. 2.5.34 is the most recent kernel I tested.

"Heavy load" is a kernel compile (-j 4) at the same time as
another heavy compile. The following fsyncs come from
strace -T -p <kmail pid>. kmail does several fsyncs grouped
together, followed by other stuff, followed by more fsyncs.

Here are some typical fsync groups. The time for each fsync
is the last number (in seconds).

2.5.27 (similar results for 2.5.26 and 2.4.19):

fsync(17) = 0 <0.078718>
fsync(18) = 0 <0.099900>
fsync(25) = 0 <0.004719>
fsync(11) = 0 <0.001747>
fsync(12) = 0 <0.001726>
fsync(13) = 0 <0.014935>
fsync(14) = 0 <0.011553>
fsync(15) = 0 <0.002506>
fsync(16) = 0 <0.002854>

2.5.28 (similar results for kernels up to 2.5.34):

fsync(17) = 0 <0.682749>
fsync(18) = 0 <2.142922>
fsync(22) = 0 <2.269918>
fsync(24) = 0 <1.114331>
fsync(11) = 0 <4.092790>
fsync(12) = 0 <2.309529>
fsync(13) = 0 <0.441093>
fsync(14) = 0 <1.730422>
fsync(15) = 0 <5.444556>
fsync(16) = 0 <1.844690>
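
As a rough cross-check of the "around 50 times" figure, the two groups quoted above can be averaged and compared. The sketch below only restates the strace numbers; the helper names `mean` and `slowdown_ratio` are made up for illustration. On these particular samples the ratio of means comes out at roughly 90, which is the same order of magnitude as the reported slowdown. With only nine and ten samples it should not be read as a precise measurement.

```c
#include <assert.h>
#include <stddef.h>

/* fsync durations in seconds, copied from the strace output above. */
const double fsync_2_5_27[] = {
    0.078718, 0.099900, 0.004719, 0.001747, 0.001726,
    0.014935, 0.011553, 0.002506, 0.002854,
};
const double fsync_2_5_28[] = {
    0.682749, 2.142922, 2.269918, 1.114331, 4.092790,
    2.309529, 0.441093, 1.730422, 5.444556, 1.844690,
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Arithmetic mean of n samples; n must be non-zero. */
double mean(const double *samples, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Ratio of the mean 2.5.28 fsync time to the mean 2.5.27 fsync time. */
double slowdown_ratio(void)
{
    return mean(fsync_2_5_28, COUNT(fsync_2_5_28))
         / mean(fsync_2_5_27, COUNT(fsync_2_5_27));
}
```

The mean is dominated by the slowest calls in each group, so a median or per-descriptor comparison could give a rather different figure.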

The filesystem is ext3 but the same occurs with ext2.
This is a UP x86 (400MHz), no preempt.

Any ideas? I have looked through the changes between
2.5.27 and 2.5.28 but don't see any obvious culprits...

All the best,

Duncan.
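
For anyone wanting to reproduce the measurement without kmail, a small user-space timer is one option. The following is a minimal sketch and not what was used in the report (that was strace -T): the function names, the temporary-file path under /tmp and the filler data are all assumptions chosen for illustration. It writes some data to a scratch file and times only the fsync() call that follows.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Seconds elapsed between two clock readings. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/*
 * Write `len` bytes of filler to a fresh temporary file and time the
 * fsync() that follows. Returns the fsync duration in seconds, or -1.0
 * on error. The /tmp location is an assumption; point it at the
 * filesystem you actually want to test.
 */
double time_one_fsync(size_t len)
{
    char path[] = "/tmp/fsync-timing-XXXXXX";
    struct timespec start, end;
    double result = -1.0;
    char *buf;
    int fd;

    assert(len > 0);

    fd = mkstemp(path);
    if (fd < 0)
        return -1.0;
    unlink(path); /* the file goes away once the descriptor is closed */

    buf = malloc(len);
    if (buf != NULL) {
        memset(buf, 'x', len);
        if (write(fd, buf, len) == (ssize_t)len &&
            clock_gettime(CLOCK_MONOTONIC, &start) == 0 &&
            fsync(fd) == 0 &&
            clock_gettime(CLOCK_MONOTONIC, &end) == 0)
            result = elapsed_seconds(start, end);
        free(buf);
    }
    close(fd);
    return result;
}
```

Calling `time_one_fsync(4096)` repeatedly while the two compiles are running, and printing each result, should give numbers comparable in spirit to the bracketed column of strace -T. strace times the system call from outside the process, so the figures will not match exactly.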


2002-09-19 00:30:34

by Andrew Morton

Subject: Re: fsync 50 times slower after 2.5.27

Duncan Sands wrote:
>
> I noticed a performance degradation in recent kernels:
> fsync takes around 50 times longer in kernels 2.5.28 to
> 2.5.34 when the system is under heavy load, as compared
> to kernels <= 2.5.27. I noticed this because it makes kmail
> unusable. 2.5.34 is the most recent kernel I tested.
>

Please try replacing the yield() in fs/jbd/transaction.c
with

set_current_state(TASK_RUNNING);
schedule();

2002-09-19 00:55:54

by Alexander Hoogerhuis

Subject: 2.4.20-pre7-ac2 compile and IrDA

2.4.20-pre7-ac2 has a problem compiling the irtty module:

make[3]: Entering directory `/home/alexh/src/linux/linux-2.4-ac-test/drivers/net/irda'
gcc -D__KERNEL__ -I/home/alexh/src/linux/linux-2.4-ac-test/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -DMODULE -DMODVERSIONS -include /home/alexh/src/linux/linux-2.4-ac-test/include/linux/modversions.h -nostdinc -iwithprefix include -DKBUILD_BASENAME=irtty -c -o irtty.o irtty.c
irtty.c: In function `irtty_set_dtr_rts':
irtty.c:761: `TIOCM_MODEM_BITS' undeclared (first use in this function)
irtty.c:761: (Each undeclared identifier is reported only once
irtty.c:761: for each function it appears in.)
make[3]: *** [irtty.o] Error 1
make[3]: Leaving directory `/home/alexh/src/linux/linux-2.4-ac-test/drivers/net/irda'
make[2]: *** [_modsubdir_irda] Error 2
make[2]: Leaving directory `/home/alexh/src/linux/linux-2.4-ac-test/drivers/net'
make[1]: *** [_modsubdir_net] Error 2
make[1]: Leaving directory `/home/alexh/src/linux/linux-2.4-ac-test/drivers'
make: *** [_mod_drivers] Error 2

lapper:~/src/linux/linux-2.4-ac-test$ grep IRDA .config
CONFIG_IRDA=m
CONFIG_IRDA_ULTRA=y
CONFIG_IRDA_CACHE_LAST_LSAP=y
CONFIG_IRDA_FAST_RR=y
# CONFIG_IRDA_DEBUG is not set
# CONFIG_USB_IRDA is not set
lapper:~/src/linux/linux-2.4-ac-test$

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2002-09-19 01:16:49

by Alan

Subject: Re: 2.4.20-pre7-ac2 compile and IrDA

On Thu, 2002-09-19 at 02:00, Alexander Hoogerhuis wrote:
> 2.4.20-pre7-ac2 has a problem compiling the irtty module:
>
> make[3]: Entering directory `/home/alexh/src/linux/linux-2.4-ac-test/drivers/net/irda'
> gcc -D__KERNEL__ -I/home/alexh/src/linux/linux-2.4-ac-test/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -DMODULE -DMODVERSIONS -include /home/alexh/src/linux/linux-2.4-ac-test/include/linux/modversions.h -nostdinc -iwithprefix include -DKBUILD_BASENAME=irtty -c -o irtty.o irtty.c
> irtty.c: In function `irtty_set_dtr_rts':
> irtty.c:761: `TIOCM_MODEM_BITS' undeclared (first use in this function)
> irtty.c:761: (Each undeclared identifier is reported only once
> irtty.c:761: for each function it appears in.)
>

What architecture - it's defined for x86 definitely

2002-09-19 01:38:16

by David Miller

Subject: Re: 2.4.20-pre7-ac2 compile and IrDA

   From: Alan Cox <[email protected]>
   Date: 19 Sep 2002 02:25:56 +0100

   On Thu, 2002-09-19 at 02:00, Alexander Hoogerhuis wrote:
   > irtty.c:761: `TIOCM_MODEM_BITS' undeclared (first use in this function)

   What architecture - it's defined for x86 definitely

Really? Maybe in the -ac tree, but not in what marcelo has.

? pwd
/home/davem/src/BK/marcelo-2.4/include/asm-i386
? egrep TIOCM_MODEM_BITS *.h
? cd ../../drivers/net/irda
? egrep TIOCM_MODEM_BITS *.c
irtty.c: int arg = TIOCM_MODEM_BITS;
?

2002-09-19 04:38:40

by Willy Tarreau

Subject: Re: 2.4.20-pre7-ac2 compile and IrDA

On Thu, Sep 19, 2002 at 02:25:56AM +0100, Alan Cox wrote:
> > gcc -D__KERNEL__ -I/home/alexh/src/linux/linux-2.4-ac-test/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686
^^^^^^^^^^^
> What architecture - it's defined for x86 definitely

=> he seems to be on x86 too.

Cheers,
Willy

2002-09-19 06:29:56

by Alexander Hoogerhuis

Subject: Re: 2.4.20-pre7-ac2 compile and IrDA

Alan Cox <[email protected]> writes:

> On Thu, 2002-09-19 at 02:00, Alexander Hoogerhuis wrote:
> > 2.4.20-pre7-ac2 has a problem compiling the irtty module:
> >
> > make[3]: Entering directory `/home/alexh/src/linux/linux-2.4-ac-test/drivers/net/irda'
> > gcc -D__KERNEL__ -I/home/alexh/src/linux/linux-2.4-ac-test/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-boundary=2 -march=i686 -DMODULE -DMODVERSIONS -include /home/alexh/src/linux/linux-2.4-ac-test/include/linux/modversions.h -nostdinc -iwithprefix include -DKBUILD_BASENAME=irtty -c -o irtty.o irtty.c
> > irtty.c: In function `irtty_set_dtr_rts':
> > irtty.c:761: `TIOCM_MODEM_BITS' undeclared (first use in this function)
> > irtty.c:761: (Each undeclared identifier is reported only once
> > irtty.c:761: for each function it appears in.)
> >
>
> What architecture - it's defined for x86 definitely

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 Mobile CPU 1.70GHz
stepping : 4
cpu MHz : 1196.146
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 2385.51

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2002-09-19 10:36:58

by Alan

Subject: Re: 2.4.20-pre7-ac2 compile and IrDA

On Thu, 2002-09-19 at 02:33, David S. Miller wrote:
> /home/davem/src/BK/marcelo-2.4/include/asm-i386
> ? egrep TIOCM_MODEM_BITS *.h
> ? cd ../../drivers/net/irda
> ? egrep TIOCM_MODEM_BITS *.c
> irtty.c: int arg = TIOCM_MODEM_BITS;

He said pre7-ac2. I know pre7 is broken, I broke it 8(

2002-09-19 12:23:05

by Alexander Hoogerhuis

Subject: Re: 2.4.20-pre7-ac2 compile and IrDA

Alan Cox <[email protected]> writes:

> On Thu, 2002-09-19 at 02:33, David S. Miller wrote:
> > /home/davem/src/BK/marcelo-2.4/include/asm-i386
> > ? egrep TIOCM_MODEM_BITS *.h
> > ? cd ../../drivers/net/irda
> > ? egrep TIOCM_MODEM_BITS *.c
> > irtty.c: int arg = TIOCM_MODEM_BITS;
>
> He said pre7-ac2. I know pre7 is broken, I broke it 8(
>

Then we all agree, -pre7 is broken and -pre7-ac2 works. :)

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2002-09-19 21:17:37

by Andrew Morton

Subject: Re: fsync 50 times slower after 2.5.27

Duncan Sands wrote:
>
> On Thursday 19 September 2002 02:35, you wrote:
> > Duncan Sands wrote:
> > > I noticed a performance degradation in recent kernels:
> > > fsync takes around 50 times longer in kernels 2.5.28 to
> > > 2.5.34 when the system is under heavy load, as compared
> > > to kernels <= 2.5.27. I noticed this because it makes kmail
> > > unusable. 2.5.34 is the most recent kernel I tested.
> >
> > Please try replacing the yield() in fs/jbd/transaction.c
> > with
> >
> > set_current_state(TASK_RUNNING);
> > schedule();
>
> OK! This seems to fix the problem for 2.5.36. I will also
> test it for 2.5.34 since I didn't test 2.5.36 as rigorously
> as 2.5.34 for the presence of the problem without the patch.

(I dragged you back onto the mailing list)

Thanks for testing. The semantics of sched_yield() have changed
significantly in 2.5. Probably correctly, but it is breaking a
few things which were tuned for the old semantics. Amongst those
things are OpenOffice and, it seems, ext3 transaction batching.

The transaction batching does good things under some situations,
and we want it to keep working. I'll sit tight for the while, see
where sched_yield() behaviour ends up. If we still have a problem
then probably a schedule_timeout(1) in there would suffice.

> I will also test using ext2 (does ext2 use transaction.c?).

No. ext2 will not exhibit this problem.
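
One way to picture the batching discussed above: a thread that wants a synchronous commit pauses briefly so that other threads can join the same transaction, and keeps pausing for as long as new ones keep arriving. The sketch below is a user-space model only, not the code in fs/jbd/transaction.c. `struct txn`, `wait_for_joiners` and the `pause_*` helpers are hypothetical names, and the 1 ms sleep is an arbitrary stand-in for something like schedule_timeout(1). The point it illustrates is that the loop's cost is whatever the pause costs: a yield that returns quickly keeps fsync cheap, while a yield that leaves the caller waiting behind every other runnable task makes each fsync slow.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a running transaction. */
struct txn {
    int handle_count;    /* handles that have joined so far */
    int pending_joiners; /* simulation only: joiners still to arrive */
};

/* A pause strategy; it receives the transaction so a simulation can touch it. */
typedef void (*pause_fn)(struct txn *t);

/* Pause by yielding the CPU; how long this takes is up to the scheduler. */
void pause_yield(struct txn *t)
{
    (void)t;
    sched_yield();
}

/* Pause for a bounded time instead (1 ms, an arbitrary choice). */
void pause_sleep(struct txn *t)
{
    struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000L };

    (void)t;
    nanosleep(&one_ms, NULL);
}

/* Deterministic pause for experiments: one pending joiner arrives per call. */
void pause_simulated(struct txn *t)
{
    if (t->pending_joiners > 0) {
        t->pending_joiners--;
        t->handle_count++;
    }
}

/*
 * Keep pausing while other threads are still joining the transaction.
 * Returns the number of pauses taken before the count stopped changing.
 */
int wait_for_joiners(struct txn *t, pause_fn pause)
{
    int pauses = 0;
    int seen;

    assert(t != NULL && pause != NULL);
    do {
        seen = t->handle_count;
        pause(t);
        pauses++;
    } while (seen != t->handle_count);
    return pauses;
}
```

With `pause_simulated` and three pending joiners the loop takes four pauses: three that each see a new handle, plus a final quiet one. Swapping in `pause_yield` or `pause_sleep` leaves the loop unchanged and alters only how long each pause lasts.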

2002-09-19 21:39:42

by Stephen C. Tweedie

Subject: Re: [Ext2-devel] Re: fsync 50 times slower after 2.5.27

Hi,

On Thu, Sep 19, 2002 at 02:22:30PM -0700, Andrew Morton wrote:

> Thanks for testing. The semantics of sched_yield() have changed
> significantly in 2.5. Probably correctly, but it is breaking a
> few things which were tuned for the old semantics. Amongst those
> things are OpenOffice and, it seems, ext3 transaction batching.
>
> The transaction batching does good things under some situations,
> and we want it to keep working. I'll sit tight for the while, see
> where sched_yield() behaviour ends up. If we still have a problem
> then probably a schedule_timeout(1) in there would suffice.

Actually, with a proper yield() implementation, we can achieve the
same effect by making the commit thread do the yield itself before
locking down the transaction. Having _every_ sync thread do a yield
itself before calling for a commit is probably overkill.

--Stephen

2002-09-20 12:10:13

by Duncan Sands

Subject: Re: fsync 50 times slower after 2.5.27

On Thursday 19 September 2002 23:22, Andrew Morton wrote:
> > OK! This seems to fix the problem for 2.5.36. I will also
> > test it for 2.5.34 since I didn't test 2.5.36 as rigorously
> > as 2.5.34 for the presence of the problem without the patch.
>
> (I dragged you back onto the mailing list)

No problem.

> Thanks for testing. The semantics of sched_yield() have changed
> significantly in 2.5. Probably correctly, but it is breaking a
> few things which were tuned for the old semantics. Amongst those
> things are OpenOffice and, it seems, ext3 transaction batching.

Thanks for solving! By the way, what does
set_current_state(TASK_RUNNING);
schedule();
actually do? I guess it lets higher priority tasks have a go, while the
original yield() let equal priority tasks go first? My knowledge of
sched_yield is out of date...

> The transaction batching does good things under some situations,
> and we want it to keep working. I'll sit tight for the while, see
> where sched_yield() behaviour ends up. If we still have a problem
> then probably a schedule_timeout(1) in there would suffice.
>
> > I will also test using ext2 (does ext2 use transaction.c?).
>
> No. ext2 will not exhibit this problem.

You are right. I was confused because I thought I had observed
this problem once with ext2, but in fact I didn't test this case properly.
That's what you get for wanting to get some sleep at night...

Thanks again,

Duncan.