2006-09-24 16:57:44

by Tilman Schmidt

[permalink] [raw]
Subject: [2.6.18-rc7-mm1] slow boot

FYI: On my Dell OptiPlex GX110 (Intel Pentium III, 933 MHz, 512 MB
RAM, i810 chipset), kernel 2.6.18-rc7-mm1 takes drastically longer
to boot than 2.6.18 mainline release. Some data points from the
respective dmesg outputs:

<<<<<<< 2.6.18
<4>[ 26.332472] PID hash table entries: 2048 (order: 11, 8192 bytes)
<4>[ 26.334415] Console: colour VGA+ 80x25
>>>>>>> 2.6.18-rc7-mm1
<4>[ 32.102004] PID hash table entries: 2048 (order: 11, 8192 bytes)
<4>[ 32.174041] Console: colour VGA+ 80x25

Strange delay in mm wrt mainline. (There are several of these.)

<<<<<<< 2.6.18
<4>[ 26.336075] ------------------------
<4>[ 26.336130] | Locking API testsuite:
<4>[ 26.336187] ----------------------------------------------------------------------------
<4>[ 26.336270] | spin |wlock |rlock |mutex | wsem | rsem |
<4>[ 26.336352] --------------------------------------------------------------------------
<4>[ 26.336451] A-A deadlock: ok | ok | ok | ok | ok | ok |
<4>[ 26.338380] A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
<4>[ 26.340249] A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
...
<4>[ 26.408007] Good, all 218 testcases passed! |
<4>[ 26.408065] ---------------------------------
>>>>>>> 2.6.18-rc7-mm1
<4>[ 32.183079] ------------------------
<4>[ 32.183134] | Locking API testsuite:
<4>[ 32.183190] ----------------------------------------------------------------------------
<4>[ 32.183273] | spin |wlock |rlock |mutex | wsem | rsem |
<4>[ 32.183356] --------------------------------------------------------------------------
<4>[ 32.183452] A-A deadlock: ok | ok | ok | ok | ok | ok |
<4>[ 32.273613] A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
<4>[ 32.497062] A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
...
<4>[ 39.746662] Good, all 218 testcases passed! |
<4>[ 39.746720] ---------------------------------

All locking API tests are extremely slow (tenths of seconds in mm,
milliseconds in mainline.) By the end of the testsuite, the mm
kernel has already lost 13 seconds on the mainline kernel.

<<<<<<< 2.6.18
<4>[ 26.596570] evxfevnt-0089 [02] enable : Transition to ACPI mode successful
<6>[ 26.708752] NET: Registered protocol family 16
>>>>>>> 2.6.18-rc7-mm1
<4>[ 40.787142] evxfevnt-0089 [02] enable : Transition to ACPI mode successful
<6>[ 41.329553] NET: Registered protocol family 16

<<<<<<< 2.6.18
<6>[ 26.749162] ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
<7>[ 26.757435] Boot video device is 0000:00:01.0
>>>>>>> 2.6.18-rc7-mm1
<6>[ 42.025638] ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
<7>[ 42.435641] Boot video device is 0000:00:01.0

More inexplicable delays.

<<<<<<< 2.6.18
<6>[ 30.233858] PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
<6>[ 30.242857] serio: i8042 AUX port at 0x60,0x64 irq 12
<6>[ 30.249613] serio: i8042 KBD port at 0x60,0x64 irq 1
<6>[ 30.257604] mice: PS/2 mouse device common for all mice
<6>[ 30.265941] input: PC Speaker as /class/input/input0
<6>[ 30.304365] input: AT Translated Set 2 keyboard as /class/input/input1
>>>>>>> 2.6.18-rc7-mm1
<6>[ 56.409925] PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
<6>[ 56.660486] serio: i8042 AUX port at 0x60,0x64 irq 12
<6>[ 57.028188] serio: i8042 KBD port at 0x60,0x64 irq 1
<6>[ 57.036853] mice: PS/2 mouse device common for all mice
<6>[ 58.532944] input: PC Speaker as /devices/platform/pcspkr/input0
<6>[ 59.224917] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio1/input1
<6>[ 61.408168] input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio0/input2

Keyboard and mouse detection take 100x as long as normal.

<<<<<<< 2.6.18
<6>[ 37.446763] md: ... autorun DONE.
<6>[ 38.578832] NTFS driver 2.1.27 [Flags: R/W MODULE].
<6>[ 38.771306] NTFS volume version 3.1.
>>>>>>> 2.6.18-rc7-mm1
<6>[ 73.384371] md: ... autorun DONE.
<6>[ 73.565190] warning: process `showconsole' used the removed sysctl system call
<6>[ 75.436369] NTFS driver 2.1.27 [Flags: R/W MODULE].
<6>[ 75.637003] NTFS volume version 3.1.

In the end, the mm kernel has taken twice as much time to get up
and running as the mainline kernel.

Please let me know if you need more information.

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-24 19:22:45

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On Sun, 24 Sep 2006 18:59:18 +0200, Tilman Schmidt said:

> FYI: On my Dell OptiPlex GX110 (Intel Pentium III, 933 MHz, 512 MB
> RAM, i810 chipset), kernel 2.6.18-rc7-mm1 takes drastically longer
> to boot than 2.6.18 mainline release. Some data points from the
> respective dmesg outputs:

> <<<<<<< 2.6.18
> <4>[ 26.336075] ------------------------
> <4>[ 26.336130] | Locking API testsuite:

Try building your -rc7-mm1 kernel without CONFIG_DEBUG_LOCKING_API_SELFTESTS
(and you've probably got a few other high-impact DEBUG options turned
on - CONFIG_PROVE_LOCKING is another possible culprit).


Attachments:
(No filename) (226.00 B)

2006-09-24 21:54:19

by Andrew Morton

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On Sun, 24 Sep 2006 18:59:18 +0200
Tilman Schmidt <[email protected]> wrote:

> In the end, the mm kernel has taken twice as much time to get up
> and running as the mainline kernel.

Don't know, sorry.

make-ext3-mount-default-to-barrier=1.patch takes my laptop's bootup time
from 53 seconds to 68, which is rather painful. In fact I'm inclined to
drop the patch because of this, and I'd also be quite concerned about the
similar reiserfs patch, make-reiserfs-default-to-barrier=flush.patch.

I've *never* seen any reports of any problems being caused by disk
writeback caching. Yes, it's a theoretical problem but for some reason it
just doesn't seem to be a problem in practice. Hence I'm really reluctant
to go and slow everyone's machines down so much in this manner.

But apart from that problem I see no differences in bootup time between
2.6.18 and 2.6.18-mm1.

Do you have the time to go through the
http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
process?

2006-09-24 21:56:22

by Tilman Schmidt

[permalink] [raw]
Subject: [2.6.18-mm1] slow boot (was: [2.6.18-rc7-mm1] slow boot)

On 24.09.2006 18:59, /me wrote:
> FYI: On my Dell OptiPlex GX110 (Intel Pentium III, 933 MHz, 512 MB
> RAM, i810 chipset), kernel 2.6.18-rc7-mm1 takes drastically longer
> to boot than 2.6.18 mainline release. [...]
> In the end, the mm kernel has taken twice as much time to get up
> and running as the mainline kernel.

Just re-tested with release 2.6.18-mm1 and it exhibits the same
behaviour as -rc7-mm1.

HTH
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-24 22:00:46

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On 24.09.2006 21:22, [email protected] wrote:
> On Sun, 24 Sep 2006 18:59:18 +0200, Tilman Schmidt said:
>
>> FYI: On my Dell OptiPlex GX110 (Intel Pentium III, 933 MHz, 512 MB
>> RAM, i810 chipset), kernel 2.6.18-rc7-mm1 takes drastically longer
>> to boot than 2.6.18 mainline release. Some data points from the
>> respective dmesg outputs:
>
>> <<<<<<< 2.6.18
>> <4>[ 26.336075] ------------------------
>> <4>[ 26.336130] | Locking API testsuite:
>
> Try building your -rc7-mm1 kernel without CONFIG_DEBUG_LOCKING_API_SELFTESTS
> (and you've probably got a few other high-impact DEBUG options turned
> on - CONFIG_PROVE_LOCKING is another possible culprit).

Thanks, but what I am actually wondering, and what I think
may be worth investigating (provided I am not the only one
affected, of course) is why -mm1 is much slower than mainline
*built with the same DEBUG options*.

Regards
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-24 22:13:34

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On 24.09.2006 23:53, Andrew Morton wrote:
> make-ext3-mount-default-to-barrier=1.patch takes my laptop's bootup time
> from 53 seconds to 68, which is rather painful. In fact I'm inclined to
> drop the patch because of this, and I'd also be quite concerned about the
> similar reiserfs patch, make-reiserfs-default-to-barrier=flush.patch.

ReiserFS would be the relevant one for me. I'll go and see what
happens if I revert that one, then.

> But apart from that problem I see no differences in bootup time between
> 2.6.18 and 2.6.18-mm1.
>
> Do you have the time to go through the
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> process?

I can't promise anything in the short term, but I'll try.

Thanks
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-24 22:13:13

by Alan

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

Ar Sul, 2006-09-24 am 14:53 -0700, ysgrifennodd Andrew Morton:
> I've *never* seen any reports of any problems being caused by disk
> writeback caching. Yes, it's a theoretical problem but for some reason it
> just doesn't seem to be a problem in practice. Hence I'm really reluctant
> to go and slow everyone's machines down so much in this manner.

It happens in some cases, the usual one is sudden loss of power. In the
crashed box cases the disk still gets to write data back and in the
Linux power off sanely cases we explicitly cache flush. Its the sudden
loss of power case that is nasty.

We are also helped of course by the fact the cache is never more than
can be flushed in about 7 seconds because of other-os features.

2006-09-24 22:42:04

by Andrew Morton

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On Sun, 24 Sep 2006 23:36:41 +0100
Alan Cox <[email protected]> wrote:

> Ar Sul, 2006-09-24 am 14:53 -0700, ysgrifennodd Andrew Morton:
> > I've *never* seen any reports of any problems being caused by disk
> > writeback caching. Yes, it's a theoretical problem but for some reason it
> > just doesn't seem to be a problem in practice. Hence I'm really reluctant
> > to go and slow everyone's machines down so much in this manner.
>
> It happens in some cases, the usual one is sudden loss of power. In the
> crashed box cases the disk still gets to write data back and in the
> Linux power off sanely cases we explicitly cache flush. Its the sudden
> loss of power case that is nasty.

I don't know about reiserfs, but for ext3 writeback caching delays aren't a
problem per-se. It's write *reordering* which matters.

And given that the jounal tends to be a single contiguous hunk of disk, the
probability that a journal block at LBA #N gets written before the commit
block at LBA #N+20 is probably fairly low. There's block remapping of
course, but software journal wrapping might be a more likely cause of write
reordering.

And of course the time window is small - a few milliseconds per five
seconds, and not every five seconds at that.

Hand wavy, I know. But I wouldn't pay 15% throughput for it..

> We are also helped of course by the fact the cache is never more than
> can be flushed in about 7 seconds because of other-os features.

Well, as I say, the absolute value of any delay doesn't really matter,
except you'd lose an additional seven seconds worth of work. It's
write reordering which can damage the fs.

2006-09-25 12:10:08

by Alexander Gran

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

Hi,

simillar behavior here, but at different parts. 2.6.18-rc7-mm1 takes way
longer than 2.6.16-mm1 (the last one working here to boot.
I have delays (seconds!) between:
intel8x0_measure_ac97_clock: measured 50450 usecs
intel8x0: clocking to 48000
and
IBM TrackPoint firmware: 0x0e, buttons: 3/3
input: TPPS/2 IBM TrackPoint as /devices/platform/i8042/serio0/serio2/input3

I'll try to find out more, however I'm quite busy (and ill...) atm. If Tilman
hadn't mailed, I wouldn't report either, but it looks I'm not the only one
with these problem ;-O
BTW: No real debuging enabled here, just sysrq and dmesg buffer size.

regards
Alex

--
Encrypted Mails welcome.
PGP-Key at http://www.grans.eu/misc/pgpkey.asc | Key-ID: 0x6D7DD291
More info at http://www.grans.eu/Alexander_Gran.html


Attachments:
(No filename) (791.00 B)
(No filename) (189.00 B)
Download all attachments

2006-09-26 23:46:44

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On 24.09.2006 23:53, Andrew Morton wrote:
> make-ext3-mount-default-to-barrier=1.patch takes my laptop's bootup time
> from 53 seconds to 68, which is rather painful. In fact I'm inclined to
> drop the patch because of this, and I'd also be quite concerned about the
> similar reiserfs patch, make-reiserfs-default-to-barrier=flush.patch.
[...]
> Do you have the time to go through the
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> process?

Ok, so far I've narrowed it down to the section between
#X64_64-START
and
#X64_64-END
which I guess lets make-{ext3-mount,reiserfs}-default-to-barrier=1.patch
off the hook for now.

Trying to bisect further into that section now, but perhaps that'll
already trigger some thoughts?

Thanks
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-27 00:21:47

by Andrew Morton

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On Wed, 27 Sep 2006 01:48:12 +0200
Tilman Schmidt <[email protected]> wrote:

> On 24.09.2006 23:53, Andrew Morton wrote:
> > make-ext3-mount-default-to-barrier=1.patch takes my laptop's bootup time
> > from 53 seconds to 68, which is rather painful. In fact I'm inclined to
> > drop the patch because of this, and I'd also be quite concerned about the
> > similar reiserfs patch, make-reiserfs-default-to-barrier=flush.patch.
> [...]
> > Do you have the time to go through the
> > http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> > process?
>
> Ok, so far I've narrowed it down to the section between
> #X64_64-START
> and
> #X64_64-END

argh.

> which I guess lets make-{ext3-mount,reiserfs}-default-to-barrier=1.patch
> off the hook for now.
>
> Trying to bisect further into that section now,

Thanks. You may find that none of it compiles, and you'll need to take the
four or five patches immediately after #X64_64-END (ie: fixes against the
x86_64 tree) and place them at the appropriate places immediately after the
x86_64-mm-<whatever>.patch which they fix.

Specifically, put fix-x86_64-mm-i386-pda-smp-processorid.patch immediately
after x86_64-mm-i386-pda-smp-processorid.patch and put
fix-x86_64-mm-spinlock-cleanup.patch immediately after
x86_64-mm-spinlock-cleanup.patch.

> but perhaps that'll
> already trigger some thoughts?

Nope, there's a huge amount of stuff in there. And it's pretty much all in
mainline as of a couple of hours ago, so bisecting the tree which you have
there is increasingly valuable.

Thanks for persisting with this.

2006-09-28 16:58:48

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On 24.09.2006 23:53, Andrew Morton wrote:
> Do you have the time to go through the
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> process?

Phew, it's done. And the winner is:

x86_64-mm-i386-stacktrace-unwinder.patch
--------8<--------8<--------8<--------8<--------8<--------8<--------
i386: Do stacktracer conversion too

Following x86-64 patches. Reuses code from them in fact.

Convert the standard backtracer to do all output using
callbacks. Use the x86-64 stack tracer implementation
that uses these callbacks to implement the stacktrace interface.

This allows to use the new dwarf2 unwinder for stacktrace
and get better backtraces.

Cc: [email protected]

Signed-off-by: Andi Kleen <[email protected]>
-------->8-------->8-------->8-------->8-------->8-------->8--------

Backing out just this patch from 2.6.18-mm1 (and resolving conflicts
manually the obvious way) gets the boot time back to normal (ie. as
fast as 2.6.18 mainline) on my
Linux gx110 2.6.18-mm1-noinitrd #2 PREEMPT Thu Sep 28 18:48:32 CEST 2006 i686 i686 i386 GNU/Linux
machine.

Thanks
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-28 17:12:08

by Andi Kleen

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On Thursday 28 September 2006 19:00, Tilman Schmidt wrote:

missing context here, but ...

> On 24.09.2006 23:53, Andrew Morton wrote:
> > Do you have the time to go through the
> > http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> > process?
>
> Phew, it's done. And the winner is:
>
> x86_64-mm-i386-stacktrace-unwinder.patch
> --------8<--------8<--------8<--------8<--------8<--------8<--------
> i386: Do stacktracer conversion too
>
> Following x86-64 patches. Reuses code from them in fact.
>
> Convert the standard backtracer to do all output using
> callbacks. Use the x86-64 stack tracer implementation
> that uses these callbacks to implement the stacktrace interface.
>
> This allows to use the new dwarf2 unwinder for stacktrace
> and get better backtraces.
>
> Cc: [email protected]
>
> Signed-off-by: Andi Kleen <[email protected]>
> -------->8-------->8-------->8-------->8-------->8-------->8--------
>
> Backing out just this patch from 2.6.18-mm1 (and resolving conflicts
> manually the obvious way) gets the boot time back to normal (ie. as
> fast as 2.6.18 mainline) on my
> Linux gx110 2.6.18-mm1-noinitrd #2 PREEMPT Thu Sep 28 18:48:32 CEST 2006 i686 i686 i386 GNU/Linux
> machine.


Hmm, i assume you have lockdep on. The new backtracer is of course slower
than the old one and it will slow down lockdep which takes a lot of backtraces.
But it shouldn't be a significant slowdown.

Can you perhaps boot with profile=1 and then send readprofile output after
boot?

-Andi

2006-09-28 23:18:27

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

Am 28.09.2006 19:12 schrieb Andi Kleen:
> On Thursday 28 September 2006 19:00, Tilman Schmidt wrote:
>
> missing context here, but ...

Forwarded by separate mail.

>> x86_64-mm-i386-stacktrace-unwinder.patch
[...]
>> Backing out just this patch from 2.6.18-mm1 (and resolving conflicts
>> manually the obvious way) gets the boot time back to normal (ie. as
>> fast as 2.6.18 mainline) on my
>> Linux gx110 2.6.18-mm1-noinitrd #2 PREEMPT Thu Sep 28 18:48:32 CEST 2006 i686 i686 i386 GNU/Linux
>> machine.
>
> Hmm, i assume you have lockdep on.

Indeed.

> The new backtracer is of course slower
> than the old one and it will slow down lockdep which takes a lot of backtraces.
> But it shouldn't be a significant slowdown.

Unfortunately, it is. Boot time roughly doubles from 39 to 76 secs.

> Can you perhaps boot with profile=1 and then send readprofile output after
> boot?

I'm afraid I'll need instructions for that. I assume "profile=1"
is to be appended to the kernel command line; but how do I
retrieve that readprofile output you are asking for?

Thanks
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature

2006-09-28 23:29:30

by Randy Dunlap

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

On Fri, 29 Sep 2006 01:20:12 +0200 Tilman Schmidt wrote:

> Am 28.09.2006 19:12 schrieb Andi Kleen:
> > On Thursday 28 September 2006 19:00, Tilman Schmidt wrote:
> >
> > missing context here, but ...
>
> Forwarded by separate mail.
>
> >> x86_64-mm-i386-stacktrace-unwinder.patch
> [...]
> >> Backing out just this patch from 2.6.18-mm1 (and resolving conflicts
> >> manually the obvious way) gets the boot time back to normal (ie. as
> >> fast as 2.6.18 mainline) on my
> >> Linux gx110 2.6.18-mm1-noinitrd #2 PREEMPT Thu Sep 28 18:48:32 CEST 2006 i686 i686 i386 GNU/Linux
> >> machine.
> >
> > Hmm, i assume you have lockdep on.
>
> Indeed.
>
> > The new backtracer is of course slower
> > than the old one and it will slow down lockdep which takes a lot of backtraces.
> > But it shouldn't be a significant slowdown.
>
> Unfortunately, it is. Boot time roughly doubles from 39 to 76 secs.
>
> > Can you perhaps boot with profile=1 and then send readprofile output after
> > boot?
>
> I'm afraid I'll need instructions for that. I assume "profile=1"
> is to be appended to the kernel command line; but how do I
> retrieve that readprofile output you are asking for?

Use 'readprofile'. Usage is described in
Documentation/basic_profiling.txt in the kernel source tree.

---
~Randy

2006-09-29 00:13:56

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

4 calibrate_delay 0.0020
59946 default_idle 525.8421
4 cpu_idle 0.0351
9 sysenter_past_esp 0.0638
1 error_code 0.0156
3 sched_clock 0.0273
1 page_is_ram 0.0088
1 pgd_alloc 0.0500
5 do_page_fault 0.0043
1 __wake_up 0.0175
2 add_preempt_count 0.0213
24 sub_preempt_count 0.2243
11 __might_sleep 0.0738
2 requeue_task 0.0455
5 task_rq_lock 0.1667
4 try_to_wake_up 0.0153
1 wake_up_process 0.0588
1 free_task 0.0278
8 __put_task_struct 0.0428
2 do_fork 0.0054
2 acquire_console_sem 0.0156
1 release_console_sem 0.0021
28 vprintk 0.0367
2 profile_hit 0.0364
2 delayed_put_task_struct 0.0645
3 do_wait 0.0012
1 sys_gettimeofday 0.0109
1 current_fs_time 0.0106
71 tasklet_action 0.6017
2 __local_bh_disable 0.0163
6212 __do_softirq 35.7011
1 local_bh_enable_ip 0.0047
89 local_bh_enable 0.3309
2 __tasklet_schedule 0.0187
1 register_proc_table 0.0042
2 lock_timer_base 0.0392
74 run_timer_softirq 0.2202
2 __mod_timer 0.0132
2 mod_timer 0.0465
4 del_timer 0.0548
3 do_gettimeofday 0.0124
1 getnstimeofday 0.0047
10 free_uid 0.0877
1 sigprocmask 0.0047
2 run_workqueue 0.0103
1 flush_cpu_workqueue 0.0071
1 worker_thread 0.0036
1 schedule_work 0.0588
6 put_pid 0.1714
1 delayed_put_pid 0.0769
1 alloc_pid 0.0019
29 rcu_start_batch 0.6444
2 rcu_init_percpu_data 0.0238
95 __rcu_process_callbacks 0.2707
17 rcu_process_callbacks 0.4857
1 call_rcu_bh 0.0094
1 sys_clock_gettime 0.0076
1 kthread_should_stop 0.0385
2 __wake_up_bit 0.0408
1 wake_bit_function 0.0149
1 init_waitqueue_head 0.0303
2 remove_wait_queue 0.0351
2 add_wait_queue 0.0377
179 hrtimer_run_queues 0.6047
1 hrtimer_cancel 0.0417
2 sys_nanosleep 0.0235
1 down_write_trylock 0.0200
1 down_read 0.0169
1 debug_mutex_free_waiter 0.0152
1 lockdep_init_map 0.0061
8 debug_check_no_locks_freed 0.0282
23 lock_release 0.0821
996 lock_acquire 9.1376
1 drop_key_refs 0.0227
1 get_futex_key 0.0045
3 do_futex 0.0009
1 sys_futex 0.0041
12 lookup_symbol 0.2353
1 page_waitqueue 0.0222
1 find_get_page 0.0161
1 sys_fadvise64_64 0.0023
2 __free_pages_ok 0.0069
22 get_page_from_freelist 0.0263
1 __alloc_pages 0.0016
3 free_hot_cold_page 0.0098
2 wb_kupdate 0.0096
1 test_clear_page_writeback 0.0056
1 test_set_page_writeback 0.0045
1 generic_writepages 0.0015
1 force_page_cache_readahead 0.0090
1 release_pages 0.0030
1 __pagevec_lru_add 0.0053
1 lru_add_drain 0.0115
1 mark_page_accessed 0.0200
1 put_page 0.0182
1 vma_prio_tree_insert 0.0227
1 vm_normal_page 0.0115
3 unmap_vmas 0.0027
4 do_wp_page 0.0049
1 copy_page_range 0.0014
1 __handle_mm_fault 0.0006
3 find_vma 0.0353
1 __vm_enough_memory 0.0046
1 vma_adjust 0.0013
1 do_munmap 0.0023
2 do_mmap_pgoff 0.0013
1 page_remove_rmap 0.0143
4 __remove_vm_area 0.0769
4 check_irq_on 0.2000
7 kmem_flagcheck 0.1795
1 kmem_cache_name 0.1250
1 dbg_redzone1 0.0357
18 poison_obj 0.4091
401 check_poison_obj 1.1392
28 drain_array 0.1667
1 cache_flusharray 0.0044
74 kfree 0.3834
458 kmem_cache_free 2.5730
2 kmem_rcu_free 0.0500
4 drain_freelist 0.0263
53 cache_reap 0.2409
7 cache_alloc_debugcheck_after 0.0219
44 kmem_cache_alloc 0.2716
35 cache_alloc_refill 0.0233
1 __kmalloc_track_caller 0.0046
3 kmem_cache_zalloc 0.0152
15 __kmalloc 0.0685
4 sys_faccessat 0.0131
2 sys_access 0.1111
1 __dentry_open 0.0019
1 do_sys_open 0.0055
1 sys_write 0.0104
10 file_free_rcu 0.5882
2 get_max_files 0.2000
11 fget_light 0.0663
3 sync_supers 0.0152
1 sys_readlinkat 0.0068
3 generic_fillattr 0.0164
1 vfs_getattr 0.0059
1 vfs_lstat 0.0526
2 pipe_readv 0.0025
4 getname 0.0217
1 permission 0.0051
2 __follow_mount 0.0192
3 do_lookup 0.0097
6 __link_path_walk 0.0018
2 link_path_walk 0.0101
2 do_path_lookup 0.0032
5 __user_walk_fd 0.0769
4 poll_freewait 0.0444
1 __pollwait 0.0055
43 do_select 0.0341
22 core_sys_select 0.0298
10 sys_select 0.0306
11 do_sys_poll 0.0109
3 sys_poll 0.0435
7 d_callback 0.1795
1 _d_rehash 0.0156
1 __d_path 0.0029
5 __d_lookup 0.0187
1 alloc_inode 0.0024
1 file_update_time 0.0076
1 is_bad_inode 0.0455
5 free_fdtable_rcu 0.0198
1 sync_sb_inodes 0.0016
1 unlock_buffer 0.0526
2 alloc_buffer_head 0.0260
4 __find_get_block_slow 0.0145
1 ll_rw_block 0.0055
6 __find_get_block 0.0134
2 __getblk 0.0037
2 __block_write_full_page 0.0026
1 block_write_full_page 0.0048
1 bio_init 0.0094
1 init_once 0.0084
2 do_mpage_readpage 0.0013
1 inotify_d_instantiate 0.0098
1 proc_readfd 0.0019
1 sysfs_get_name 0.0149
1 sysfs_release 0.0066
2 sysfs_lookup 0.0045
1 sysfs_create_link 0.0032
1 reiserfs_discard_all_prealloc 0.0167
1 search_by_entry_key 0.0017
1 reiserfs_lookup 0.0039
2 _get_block_create_0 0.0013
1 reiserfs_read_locked_inode 0.0008
1 reiserfs_commit_page 0.0023
10 reiserfs_readdir 0.0088
2 load_bitmap_info_data 0.0145
2 comp_items 0.0317
1 decrement_counters_in_path 0.0147
25 search_by_key 0.0064
1 flush_commit_list 0.0007
4 do_journal_end 0.0014
2 do_journal_begin_r 0.0031
1 journal_begin 0.0043
1 journal_end_sync 0.0093
1 reiserfs_commit_for_inode 0.0026
1 direntry_create_vi 0.0030
2 reiserfs_permission 0.0556
1 dummy_inode_alloc_security 0.1429
1 dummy_task_free_security 0.2000
2 dummy_vm_enough_memory 0.0625
1 init_request_from_bio 0.0076
3 _atomic_dec_and_lock 0.0625
1 prio_tree_replace 0.0110
2 prio_tree_insert 0.0040
1 radix_tree_preload 0.0071
6 memcmp 0.1429
3 number 0.0059
106 delay_tsc 4.6087
15 read_current_timer 0.5357
1 __delay 0.0909
3 memcpy 0.0698
7 strncpy_from_user 0.0875
1 __copy_from_user_ll 0.0045
3 copy_from_user 0.0278
18 __copy_to_user_ll 0.0811
4 copy_to_user 0.0488
1 irqsafe2A_spin_12 0.0066
1 irqsafe2A_wlock_12 0.0066
1 irqsafe2A_rlock_12 0.0066
1 irqsafe2A_spin_21 0.0066
1 irqsafe2A_wlock_21 0.0066
1 irqsafe2A_rlock_21 0.0066
19 dotest 0.0206
1 irqsafe1_hard_spin_12 0.0067
1 irqsafe1_hard_wlock_12 0.0067
1 irqsafe1_hard_rlock_12 0.0067
1 irqsafe1_hard_wlock_21 0.0067
1 irqsafe1_hard_rlock_21 0.0067
2 irqsafe2B_hard_spin_12 0.0123
1 irqsafe2B_hard_wlock_12 0.0062
1 irqsafe2B_hard_rlock_12 0.0062
1 irqsafe2B_hard_spin_21 0.0062
2 irqsafe2B_hard_wlock_21 0.0123
2 irqsafe2B_hard_rlock_21 0.0123
1 irqsafe3_hard_spin_123 0.0053
1 irqsafe3_hard_wlock_123 0.0053
1 irqsafe3_hard_rlock_123 0.0053
2 irqsafe3_hard_spin_132 0.0105
1 irqsafe3_hard_wlock_132 0.0053
1 irqsafe3_hard_rlock_132 0.0053
1 irqsafe3_hard_wlock_213 0.0053
1 irqsafe3_hard_rlock_213 0.0053
2 irqsafe3_hard_spin_231 0.0105
1 irqsafe3_hard_wlock_231 0.0053
1 irqsafe3_hard_rlock_231 0.0053
1 irqsafe3_hard_spin_312 0.0053
1 irqsafe3_hard_wlock_312 0.0053
1 irqsafe3_hard_rlock_312 0.0053
1 irqsafe3_hard_spin_321 0.0053
1 irqsafe3_hard_wlock_321 0.0053
1 irqsafe3_hard_rlock_321 0.0053
1 irqsafe4_hard_spin_123 0.0050
2 irqsafe4_hard_wlock_123 0.0099
2 irqsafe4_hard_rlock_123 0.0099
2 irqsafe4_hard_spin_132 0.0099
1 irqsafe4_hard_wlock_132 0.0050
1 irqsafe4_hard_rlock_132 0.0050
1 irqsafe4_hard_wlock_213 0.0050
1 irqsafe4_hard_rlock_213 0.0050
1 irqsafe4_hard_spin_231 0.0050
1 irqsafe4_hard_wlock_231 0.0050
1 irqsafe4_hard_rlock_231 0.0050
2 irqsafe4_hard_spin_312 0.0099
1 irqsafe4_hard_wlock_312 0.0050
1 irqsafe4_hard_rlock_312 0.0050
1 irqsafe4_hard_spin_321 0.0050
1 irqsafe4_hard_wlock_321 0.0050
1 irqsafe4_hard_rlock_321 0.0050
1 irq_inversion_hard_spin_123 0.0050
2 irq_inversion_hard_wlock_123 0.0099
2 irq_inversion_hard_rlock_123 0.0099
2 irq_inversion_hard_spin_132 0.0099
1 irq_inversion_hard_wlock_132 0.0050
1 irq_inversion_hard_rlock_132 0.0050
1 irq_inversion_hard_wlock_213 0.0050
1 irq_inversion_hard_rlock_213 0.0050
1 irq_inversion_hard_spin_231 0.0050
1 irq_inversion_hard_wlock_231 0.0050
1 irq_inversion_hard_rlock_231 0.0050
2 irq_inversion_hard_spin_312 0.0099
1 irq_inversion_hard_wlock_312 0.0050
1 irq_inversion_hard_rlock_312 0.0050
1 irq_inversion_hard_spin_321 0.0050
1 irq_inversion_hard_wlock_321 0.0050
1 irq_inversion_hard_rlock_321 0.0050
1 irq_read_recursion_hard_123 0.0050
2 irq_read_recursion_hard_132 0.0099
1 irq_read_recursion_hard_231 0.0050
2 irq_read_recursion_hard_312 0.0099
1 irq_read_recursion_hard_321 0.0050
2 __spin_lock_init 0.0200
5 _raw_spin_unlock 0.0455
1 _raw_read_lock 0.0417
4 _raw_spin_lock 0.0175
3 fb_get_color_depth 0.0517
1 vgacon_save_screen 0.0105
5 cursor_timer_handler 0.1163
14 fb_flashcursor 0.0220
1 fbcon_cursor 0.0016
13 bit_cursor 0.0117
7 bit_putcs 0.0063
20 bitfill_aligned 0.1010
9 cfb_copyarea 0.0042
56 cfb_imageblit 0.0399
1 acpi_os_signal_semaphore 0.0099
1 acpi_ns_lookup 0.0007
2 acpi_tb_sum_table 0.0417
2 acpi_tb_validate_rsdp 0.0189
1 scrup 0.0053
1 con_chars_in_buffer 0.1429
2 do_con_write 0.0004
2 intel_i810_configure 0.0074
12 i810_write_dac 0.2553
62 i810_read_dac 1.3478
26 i810fb_getcolreg 0.0956
3 i810_screen_off 0.0349
4 i810_enable_cursor 0.1333
983 i810fb_cursor 1.4413
4 vortex_timer 0.0049
14 ide_inb 1.2727
2 ide_outb 0.3333
2 ide_driveid_update 0.0051
2 try_to_identify 0.0019
1 probe_hwif 0.0006
1 usb_hcd_poll_rh_status 0.0029
1 rh_timer_func 0.1000
2 usb_hcd_submit_urb 0.0010
13 i8042_interrupt 0.0260
4 i8042_timer_func 0.2500
1 sock_poll 0.0476
1 sock_wfree 0.0172
1 datagram_poll 0.0045
1 udp_poll 0.0055
5 unix_poll 0.0278
1 cache_clean 0.0025
46 schedule 0.0338
3 preempt_schedule 0.0361
2 cond_resched 0.0408
3 schedule_timeout 0.0197
2 __mutex_unlock_slowpath 0.0080
1 mutex_unlock 0.1000
9 __mutex_lock_interruptible_slowpath 0.0143
148 __mutex_lock_slowpath 0.2814
11 mutex_lock_nested 0.0202
6 _spin_lock 0.1200
26 _write_unlock_irq 0.3881
5 _spin_lock_irq 0.0893
2 _spin_lock_irqsave 0.0333
6 _spin_unlock 0.0984
605 _spin_unlock_irqrestore 6.7978
700 _spin_unlock_irq 10.4478
2 _read_unlock_irqrestore 0.0225
19 _read_unlock_irq 0.2836
12 _write_unlock_irqrestore 0.1348
1 unlock_kernel 0.0179
1 __release_kernel_lock 0.0417
46 *unknown*
72643 total 0.0308


Attachments:
readprofile.out (20.11 kB)
signature.asc (253.00 B)
OpenPGP digital signature
Download all attachments

2006-09-29 07:09:05

by Jan Beulich

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

There's nothing stack trace/unwind related among the functions listed at all afaics.
I don't know much about how profiling works, is it perhaps just missing something?
Jan

>>> Tilman Schmidt <[email protected]> 29.09.06 02:15 >>>
Am 29.09.2006 01:30 schrieb Randy Dunlap:
> On Fri, 29 Sep 2006 01:20:12 +0200 Tilman Schmidt wrote:
>> Am 28.09.2006 19:12 schrieb Andi Kleen:
>>>
>>> Can you perhaps boot with profile=1 and then send readprofile output after
>>> boot?
>> I'm afraid I'll need instructions for that. I assume "profile=1"
>> is to be appended to the kernel command line; but how do I
>> retrieve that readprofile output you are asking for?
>
> Use 'readprofile'. Usage is described in
> Documentation/basic_profiling.txt in the kernel source tree.

Thanks. Result attached.

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)

2006-09-29 08:51:27

by Andi Kleen

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot


> I'm afraid I'll need instructions for that. I assume "profile=1"
> is to be appended to the kernel command line; but how do I
> retrieve that readprofile output you are asking for?

With the readprofile command.

But never mind. The problem has been already diagnosed. No fix yet though.

-Andi

2006-09-29 16:13:11

by Andi Kleen

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

"Jan Beulich" <[email protected]> writes:

> There's nothing stack trace/unwind related among the functions listed at all afaics.
> I don't know much about how profiling works, is it perhaps just missing something?

Perhaps lockdep calls them with interrupts off?
The old profiler doesn't support profiling with interrupts off.
oprofile does, but it cannot be used at early boot.

-Andi

2006-09-29 18:48:16

by Ingo Molnar

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot


* Andi Kleen <[email protected]> wrote:

> "Jan Beulich" <[email protected]> writes:
>
> > There's nothing stack trace/unwind related among the functions listed at all afaics.
> > I don't know much about how profiling works, is it perhaps just missing something?
>
> Perhaps lockdep calls them with interrupts off? The old profiler
> doesn't support profiling with interrupts off. oprofile does, but it
> cannot be used at early boot.

Yes, lockdep does everything that changes the dependency graph(s) with
irqs off. Jan, i bounced you the mail with the function traces included,
that should show you the overhead points.

Ingo

2006-10-04 07:04:46

by Jan Beulich

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

>>> Ingo Molnar <[email protected]> 29.09.06 20:39 >>>
>
>* Andi Kleen <[email protected]> wrote:
>
>> "Jan Beulich" <[email protected]> writes:
>>
>> > There's nothing stack trace/unwind related among the functions listed at all afaics.
>> > I don't know much about how profiling works, is it perhaps just missing something?
>>
>> Perhaps lockdep calls them with interrupts off? The old profiler
>> doesn't support profiling with interrupts off. oprofile does, but it
>> cannot be used at early boot.
>
>Yes, lockdep does everything that changes the dependency graph(s) with
>irqs off. Jan, i bounced you the mail with the function traces included,
>that should show you the overhead points.

Okay, makes sense then. I'll get to addressing the (already identified) cause
as soon as I can.

Jan

2006-10-25 18:12:26

by Tilman Schmidt

[permalink] [raw]
Subject: Re: [2.6.18-rc7-mm1] slow boot

Problem's gone in 2.6.19-rc3.

Thanks
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeoeffnet mindestens haltbar bis: (siehe Rueckseite)


Attachments:
signature.asc (253.00 B)
OpenPGP digital signature