2006-05-31 19:32:45

by Martin Bligh

Subject: 2.6.17-rc5-mm1

Another panic. This time on x440.

M.

http://test.kernel.org/abat/33803/debug/console.log

BUG: unable to handle kernel paging request at virtual address 22222232
printing eip:
c012b6eb
*pde = 15621001
*pte = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /class/vc/vcs1/dev
CPU: 1
EIP: 0060:[<c012b6eb>] Not tainted VLI
EFLAGS: 00010002 (2.6.17-rc5-mm1-autokern1 #1)
EIP is at check_deadlock+0x15/0xe0
eax: 22222222 ebx: 00000001 ecx: d4996000 edx: 00000001
esi: d686f550 edi: 22222222 ebp: 22222222 esp: d5bdfec8
ds: 007b es: 007b ss: 0068
Process mkdir09 (pid: 18867, threadinfo=d5bdf000 task=d5c0e000)
Stack: 00000000 d686f550 d3960568 22222222 c012b77b d3960568 d5bdf000 d5bdff00
       d5c0e000 c012b922 d5bdff48 d3960568 00000246 c02d50de d5bdff00 d5bdff00
       11111111 11111111 d5bdff00 ffffff9c d5bdff48 00000000 d5bdff48 ffffffef
Call Trace:
 <c012b77b> check_deadlock+0xa5/0xe0
 <c012b922> debug_mutex_add_waiter+0x46/0x55
 <c02d50de> __mutex_lock_slowpath+0x9e/0x1c0
 <c0160061> lookup_create+0x19/0x5b
 <c016043a> sys_mkdirat+0x4c/0xc3
 <c01604c0> sys_mkdir+0xf/0x13
 <c02d6217> syscall_call+0x7/0xb


2006-05-31 21:05:57

by Andrew Morton

Subject: Re: 2.6.17-rc5-mm1

Martin Bligh wrote:
>
> Another panic. This time on x440.
>
> M.
>
> http://test.kernel.org/abat/33803/debug/console.log
>
> BUG: unable to handle kernel paging request at virtual address 22222232
> printing eip:
> c012b6eb
> *pde = 15621001
> *pte = 00000000
> Oops: 0000 [#1]
> SMP
> last sysfs file: /class/vc/vcs1/dev
> CPU: 1
> EIP: 0060:[<c012b6eb>] Not tainted VLI
> EFLAGS: 00010002 (2.6.17-rc5-mm1-autokern1 #1)
> EIP is at check_deadlock+0x15/0xe0
> eax: 22222222 ebx: 00000001 ecx: d4996000 edx: 00000001
> esi: d686f550 edi: 22222222 ebp: 22222222 esp: d5bdfec8
> ds: 007b es: 007b ss: 0068
> Process mkdir09 (pid: 18867, threadinfo=d5bdf000 task=d5c0e000)
> Stack: 00000000 d686f550 d3960568 22222222 c012b77b d3960568 d5bdf000 d5bdff00
>        d5c0e000 c012b922 d5bdff48 d3960568 00000246 c02d50de d5bdff00 d5bdff00
>        11111111 11111111 d5bdff00 ffffff9c d5bdff48 00000000 d5bdff48 ffffffef
> Call Trace:
>  <c012b77b> check_deadlock+0xa5/0xe0
>  <c012b922> debug_mutex_add_waiter+0x46/0x55
>  <c02d50de> __mutex_lock_slowpath+0x9e/0x1c0
>  <c0160061> lookup_create+0x19/0x5b
>  <c016043a> sys_mkdirat+0x4c/0xc3
>  <c01604c0> sys_mkdir+0xf/0x13
>  <c02d6217> syscall_call+0x7/0xb

Looks like the lock validator came unstuck. But there's so much other crap
happening in there it's hard to tell. Did you try it without all the
lockdep stuff enabled?

2006-05-31 21:15:15

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Andrew Morton wrote:

> > EIP is at check_deadlock+0x15/0xe0

> >  <c012b77b> check_deadlock+0xa5/0xe0
> >  <c012b922> debug_mutex_add_waiter+0x46/0x55
> >  <c02d50de> __mutex_lock_slowpath+0x9e/0x1c0
> >  <c0160061> lookup_create+0x19/0x5b
> >  <c016043a> sys_mkdirat+0x4c/0xc3
> >  <c01604c0> sys_mkdir+0xf/0x13
> >  <c02d6217> syscall_call+0x7/0xb
>
> Looks like the lock validator came unstuck. But there's so much other
> crap happening in there it's hard to tell. Did you try it without all
> the lockdep stuff enabled?

AFAICS this isnt the lock validator but the normal mutex debugging code
(CONFIG_DEBUG_MUTEXES). The log does not indicate that lockdep was
enabled.

Ingo

2006-05-31 21:27:40

by Martin Bligh

Subject: Re: 2.6.17-rc5-mm1

Ingo Molnar wrote:
> * Andrew Morton wrote:
>
>
>>>EIP is at check_deadlock+0x15/0xe0
>
>
>>> <c012b77b> check_deadlock+0xa5/0xe0
>>> <c012b922> debug_mutex_add_waiter+0x46/0x55
>>> <c02d50de> __mutex_lock_slowpath+0x9e/0x1c0
>>> <c0160061> lookup_create+0x19/0x5b
>>> <c016043a> sys_mkdirat+0x4c/0xc3
>>> <c01604c0> sys_mkdir+0xf/0x13
>>> <c02d6217> syscall_call+0x7/0xb
>>
>>Looks like the lock validator came unstuck. But there's so much other
>>crap happening in there it's hard to tell. Did you try it without all
>>the lockdep stuff enabled?
>
>
> AFAICS this isnt the lock validator but the normal mutex debugging code
> (CONFIG_DEBUG_MUTEXES). The log does not indicate that lockdep was
> enabled.

Buggered if I know how that got turned on. I thought we turned it off
by default now? That's what screwed up all the perf results before.

http://test.kernel.org/abat/33803/build/dotconfig
That's the build config it ran with.

CONFIG_DEBUG_MUTEXES=y

Grrr. Humpf. I can't see the option being turned on for lockdep ...
what was the config option, and is it enabled by default?

M.


2006-05-31 21:33:24

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Martin J. Bligh wrote:

> >AFAICS this isnt the lock validator but the normal mutex debugging code
> >(CONFIG_DEBUG_MUTEXES). The log does not indicate that lockdep was
> >enabled.
>
> Buggered if I know how that got turned on. I thought we turned it off
> by default now? That's what screwed up all the perf results before.
>
> http://test.kernel.org/abat/33803/build/dotconfig
> That's the build config it ran with.
>
> CONFIG_DEBUG_MUTEXES=y

still ... it shouldnt have crashed on us. I did change it in -mm1 so
i'll take a look tomorrow.

> Grrr. Humpf. I can't see the option being turned on for lockdep ...
> what was the config option, and is it enabled by default?

these are the lock validator options in question:

# CONFIG_PROVE_SPIN_LOCKING is not set
# CONFIG_PROVE_RW_LOCKING is not set
# CONFIG_PROVE_MUTEX_LOCKING is not set
# CONFIG_PROVE_RWSEM_LOCKING is not set

and they are off by default.

Ingo

2006-05-31 21:43:13

by Martin Bligh

Subject: Re: 2.6.17-rc5-mm1

Ingo Molnar wrote:
> * Martin J. Bligh wrote:
>
>
>>>AFAICS this isnt the lock validator but the normal mutex debugging code
>>>(CONFIG_DEBUG_MUTEXES). The log does not indicate that lockdep was
>>>enabled.
>>
>>Buggered if I know how that got turned on. I thought we turned it off
>>by default now? That's what screwed up all the perf results before.
>>
>>http://test.kernel.org/abat/33803/build/dotconfig
>>That's the build config it ran with.
>>
>>CONFIG_DEBUG_MUTEXES=y
>
>
> still ... it shouldnt have crashed on us. I did change it in -mm1 so
> i'll take a look tomorrow.
>
>
>>Grrr. Humpf. I can't see the option being turned on for lockdep ...
>>what was the config option, and is it enabled by default?

In the -mm1 patch:

config DEBUG_MUTEXES
- bool "Mutex debugging, deadlock detection"
- default n
+ bool "Mutex debugging, basic checks"
+ default y

Please don't do that as a default.
It fucks up all the performance checking ;-(


M.

2006-05-31 21:52:59

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Martin J. Bligh wrote:

> >>Grrr. Humpf. I can't see the option being turned on for lockdep ...
> >>what was the config option, and is it enabled by default?
>
> In the -mm1 patch:
>
> config DEBUG_MUTEXES
> - bool "Mutex debugging, deadlock detection"
> - default n
> + bool "Mutex debugging, basic checks"
> + default y
>
> Please don't do that as a default.

but ... i fixed the performance problem that caused the previous
DEBUG_MUTEXES scalability problems. (there's no global mutex list
anymore) We also default to e.g. DEBUG_SLAB which is a lot more costly.

> It fucks up all the performance checking ;-(

i'm wondering, why doesnt your config have DEBUG_MUTEXES disabled? Then
'make oldconfig' would pick it up automatically.
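
To illustrate the 'make oldconfig' point, here is a minimal Python sketch (a toy model, not the real Kconfig implementation). It assumes a symbol counts as "answered" only if the .config carries either a `CONFIG_X=...` line or a `# CONFIG_X is not set` line; any symbol with neither falls back to its Kconfig default. The helper names are made up for illustration.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")

def parse_dotconfig(text):
    """Return {symbol: value}; explicitly disabled symbols map to 'n'."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            state[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            state[m.group(1)] = "n"
    return state

def effective_value(state, symbol, kconfig_default):
    """Recorded answer if the old config has one, else the Kconfig default."""
    return state.get(symbol, kconfig_default)

# An old config that predates DEBUG_MUTEXES but explicitly disables DEBUG_SLAB.
old_config = """\
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SLAB is not set
"""
state = parse_dotconfig(old_config)
slab = effective_value(state, "CONFIG_DEBUG_SLAB", "y")
mutexes = effective_value(state, "CONFIG_DEBUG_MUTEXES", "y")
print(slab, mutexes)
```

In this model DEBUG_SLAB stays off because the old config says so, while DEBUG_MUTEXES has no recorded answer and therefore takes the new "default y".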

Ingo

2006-05-31 21:59:26

by Martin Bligh

Subject: Re: 2.6.17-rc5-mm1


> but ... i fixed the performance problem that caused the previous
> DEBUG_MUTEXES scalability problems. (there's no global mutex list
> anymore) We also default to e.g. DEBUG_SLAB which is a lot more costly.

OK. So what's the perf impact of the new version on a 32 cpu machine?
;-) Maybe it's fine, maybe it's not.

> i'm wondering, why doesnt your config have DEBUG_MUTEXES disabled? Then
> 'make oldconfig' would pick it up automatically.

Because it builds off the same config file all the time. It was created
before CONFIG_DEBUG_MUTEXES existed ... creating a situation where we have to
explicitly disable new options all the time becomes a maintenance
nightmare ;-(

If we don't want to do performance regression checking on -mm, that's
fine, but I thought it was useful (has caught several things already).
If we want debug options explicitly enabled, we can do a separate debug
run, I'd think, but it makes it very difficult to do automated testing
if we add random new debug options all the time on by default ...

If we really think the debug options we're turning on by default have
zero perf impact, that's fine ... but it has not been my previous
experience. People obviously haven't checked that carefully in the past,
perhaps they are now and the world fixed itself, but I'm not that
optimistic ...

M.

2006-05-31 22:12:27

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Martin J. Bligh wrote:

> >but ... i fixed the performance problem that caused the previous
> >DEBUG_MUTEXES scalability problems. (there's no global mutex list
> >anymore) We also default to e.g. DEBUG_SLAB which is a lot more costly.
>
> OK. So what's the perf impact of the new version on a 32 cpu machine?
> ;-) Maybe it's fine, maybe it's not.

no idea, but it shouldnt be nearly as bad as say SLAB_DEBUG.

> >i'm wondering, why doesnt your config have DEBUG_MUTEXES disabled? Then
> >'make oldconfig' would pick it up automatically.
>
> Because it builds off the same config file all the time. It was
> created before CONFIG_DEBUG_MUTEXES existed ... creating a situation where
> we have to explicitly disable new options all the time becomes a
> maintenance nightmare ;-(

hm, why? Dont you disable DEBUG_SLAB? [that's a default y option too,
and in your config it's disabled]

a oneliner script:

sed -i 's/^CONFIG_DEBUG_MUTEXES=y/# CONFIG_DEBUG_MUTEXES is not set/' .config

ought to do it, unless i'm missing something.
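
For a harness that would rather not shell out to sed, the same idea can be sketched in Python. This is a hedged illustration, not anything the test harness is known to use: `force_off` is a made-up helper, and it uses the CONFIG_DEBUG_MUTEXES name from the build config quoted earlier in the thread. Unlike a plain substitution, it also records the option as disabled when the old config never mentioned it, which is the case where a new "default y" would otherwise kick in.

```python
import re

def force_off(config_text, symbol):
    """Return config_text with `symbol` recorded as explicitly disabled."""
    unset_line = f"# {symbol} is not set"
    sym = re.escape(symbol)
    # Match either "SYMBOL=<anything>" or an existing "# SYMBOL is not set".
    pattern = re.compile(rf"^(?:{sym}=.*|# {sym} is not set)$")
    out, seen = [], False
    for line in config_text.splitlines():
        if pattern.match(line):
            if not seen:
                out.append(unset_line)
                seen = True
            continue  # drop duplicate entries for the same symbol
        out.append(line)
    if not seen:
        # Symbol absent from the old config: record the answer anyway.
        out.append(unset_line)
    return "\n".join(out) + "\n"

before = "CONFIG_DEBUG_KERNEL=y\nCONFIG_DEBUG_MUTEXES=y\n"
after = force_off(before, "CONFIG_DEBUG_MUTEXES")
print(after, end="")
```

The result would still need a 'make oldconfig' pass afterwards so that dependent options get re-evaluated.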

Really, there's an unfortunate friction of interests here:

on one side, the -mm kernel is about showcasing new code and finding
bugs in them as fast as possible. Having new debugging options enabled
by default is an important part of the testing effort. Users will care
more about having no crashes than about having 0.5% more performance in
select benchmarks.

on the other side, you obviously dont want a 0.5% overhead for select
benchmarks, as that would mess up the history! A very fair and valid
position too.

but one side has to give, we cant have both.

> If we don't want to do performance regression checking on -mm, that's
> fine, but I thought it was useful (has caught several things already).

please dont misunderstand my position as being against your efforts - to
the contrary, your performance regression testing has proven to be
valuable numerous times! But you are a single intelligent person whom i
can possibly talk into adding some scripting to ensure that certain
options stay off in the .config - but i cannot cat-herd the many -mm
testers on the other hand to all enable the debug options ;-) So i'm
kind of forced trying to convince you - i cannot convince the basic
human testing nature of keeping the defaults ;-)

Ingo

2006-05-31 22:22:08

by Martin Bligh

Subject: Re: 2.6.17-rc5-mm1

>>OK. So what's the perf impact of the new version on a 32 cpu machine?
>>;-) Maybe it's fine, maybe it's not.
>
>
> no idea, but it shouldnt be nearly as bad as say SLAB_DEBUG.

The "no idea" is hardly reassuring ;-)
The latter point is definitely valid though, it's not an isolated issue.

>>>i'm wondering, why doesnt your config have DEBUG_MUTEXES disabled? Then
>>>'make oldconfig' would pick it up automatically.
>>
>>Because it builds off the same config file all the time. It was
>>created before CONFIG_DEBUG_MUTEXES existed ... creating a situation where
>>we have to explicitly disable new options all the time becomes a
>>maintenance nightmare ;-(
>
>
> hm, why? Dont you disable DEBUG_SLAB? [that's a default y option too,
> and in your config it's disabled]
>
> a oneliner script:
>
> sed -i 's/^CONFIG_DEBUG_MUTEXES=y/# CONFIG_DEBUG_MUTEXES is not set/' .config
>
> ought to do it, unless i'm missing something.

because it's not maintainable in a fully automated system over time.

> Really, there's an unfortunate friction of interests here:
>
> on one side, the -mm kernel is about showcasing new code and finding
> bugs in them as fast as possible. Having new debugging options enabled
> by default is an important part of the testing effort. Users will care
> more about having no crashes than about having 0.5% more performance in
> select benchmarks.
>
> on the other side, you obviously dont want a 0.5% overhead for select
> benchmarks, as that would mess up the history! A very fair and valid
> position too.
>
> but one side has to give, we cant have both.

Above is a good description of the problem - I really want to get as
much debugging stuff done automatically as possible. However, we can
have both. We just need to do both runs, with a single option that turns
whatever the random debug options of the day are on or off. Something
that won't change over time. CONFIG_DEBUG itself would seem ideal,
except it kills lots of standard things like CONFIG_DEBUG_INFO,
alt+sysrq support, and probably KALLSYMS and a few other things I
forget. Need CONFIG_DEBUG_AFFECTS_PERFORMANCE or something.

>>If we don't want to do performance regression checking on -mm, that's
>>fine, but I thought it was useful (has caught several things already).
>
> please dont misunderstand my position as being against your efforts - to
> the contrary, your performance regression testing has proven to be
> valuable numerous times! But you are a single intelligent person whom i
> can possibly talk into adding some scripting to ensure that certain
> options stay off in the .config - but i cannot cat-herd the many -mm
> testers on the other hand to all enable the debug options ;-) So i'm
> kind of forced trying to convince you - i cannot convince the basic
> human testing nature of keeping the defaults ;-)

Is all good, is just frustrating on occasion, sorry ;-) The problem is
that especially in a remote, distributed system, it's impossible to
maintain a continually changing set of config options for this over
time. If we can get one option to flick off all the perf invasive
stuff, that's a single change. If we have to add CONFIG_DEBUG_SLAB last week,
CONFIG_DEBUG_MUTEXES this week, and CONFIG_LOCK_DEBUG_STUFF next week,
that gets much, much harder to maintain.

Hopefully that makes some sort of sense. Sorry, I realise the
requirements are a little strange, but I really think we have to
get testing automated or it just does not happen. The last round
affected Intel, etc as well doing benchmarking.

Adding new runs is easy. Changing the harness is hard ;-)

M.

2006-05-31 22:32:34

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Martin Bligh wrote:

> >>OK. So what's the perf impact of the new version on a 32 cpu machine?
> >>;-) Maybe it's fine, maybe it's not.
> >
> >
> >no idea, but it shouldnt be nearly as bad as say SLAB_DEBUG.
>
> The "no idea" is hardly reassuring ;-)
> The latter point is definitely valid though, it's not an isolated issue.

> Adding new runs is easy. Changing the harness is hard ;-)

ok. How about a CONFIG_DEBUG_NO_OVERHEAD option, that would default to
disabled but which you could set to y. Then we could make all the more
expensive debug options:

default y if !DEBUG_NO_OVERHEAD

this would still mean you'd have to turn on CONFIG_DEBUG_NO_OVERHEAD,
but it would be automatically maintainable for you after that initial
effort, and we'd be careful to always flag new debugging options with
this flag, if they are expensive. And initially i'd define "expensive"
as "anything that adds runtime overhead".

would this be acceptable to you?
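
A toy Python model of the proposal, to show why it only needs a one-time change on the harness side. It is a sketch under stated assumptions, not real Kconfig logic: configs are plain dicts, `resolve` is a made-up helper, and every new debug option is assumed to carry the conditional default described above. CONFIG_LOCK_DEBUG_STUFF is the placeholder name used earlier in the thread, not a real option.

```python
GATE = "CONFIG_DEBUG_NO_OVERHEAD"

def default_for_new_option(config, gate=GATE):
    """Model `default y if !DEBUG_NO_OVERHEAD`: 'n' when the gate is set."""
    return "n" if config.get(gate) == "y" else "y"

def resolve(config, new_options, gate=GATE):
    """Fill in defaults for options the old config has no answer for."""
    resolved = dict(config)
    for opt in new_options:
        # Existing answers win; only unanswered options take the default.
        resolved.setdefault(opt, default_for_new_option(config, gate))
    return resolved

new_options = ["CONFIG_DEBUG_MUTEXES", "CONFIG_LOCK_DEBUG_STUFF"]

# Performance harness: sets the gate once, never touches the config again.
perf = resolve({"CONFIG_DEBUG_KERNEL": "y", GATE: "y"}, new_options)

# Ordinary -mm tester: keeps the defaults, so the gate is unset.
tester = resolve({"CONFIG_DEBUG_KERNEL": "y"}, new_options)

print(perf["CONFIG_DEBUG_MUTEXES"], tester["CONFIG_DEBUG_MUTEXES"])
```

With the gate set, options added in later releases resolve to off without any further edits to the harness config, while testers who keep the defaults get them enabled.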

Ingo

2006-05-31 22:37:19

by Martin Bligh

Subject: Re: 2.6.17-rc5-mm1

Ingo Molnar wrote:
> * Martin Bligh wrote:
>
>
>>>>OK. So what's the perf impact of the new version on a 32 cpu machine?
>>>>;-) Maybe it's fine, maybe it's not.
>>>
>>>
>>>no idea, but it shouldnt be nearly as bad as say SLAB_DEBUG.
>>
>>The "no idea" is hardly reassuring ;-)
>>The latter point is definitely valid though, it's not an isolated issue.
>
>
>>Adding new runs is easy. Changing the harness is hard ;-)
>
>
> ok. How about a CONFIG_DEBUG_NO_OVERHEAD option, that would default to
> disabled but which you could set to y. Then we could make all the more
> expensive debug options:
>
> default y if !DEBUG_NO_OVERHEAD
>
> this would still mean you'd have to turn on CONFIG_DEBUG_NO_OVERHEAD,
> but it would be automatically maintainable for you after that initial
> effort, and we'd be careful to always flag new debugging options with
> this flag, if they are expensive. And initially i'd define "expensive"
> as "anything that adds runtime overhead".
>
> would this be acceptable to you?

Sure, makes sense. I don't care which way up it is, ie
CONFIG_DEBUG_OVERHEAD vs CONFIG_DEBUG_NO_OVERHEAD, as long as it's
easily separable.

There's probably other debug stuff we can turn on too, if we do that.

M.

2006-05-31 22:43:04

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Ingo Molnar wrote:

> > Adding new runs is easy. Changing the harness is hard ;-)
>
> ok. How about a CONFIG_DEBUG_NO_OVERHEAD option, that would default to
> disabled but which you could set to y. Then we could make all the more
> expensive debug options:
>
> default y if !DEBUG_NO_OVERHEAD
>
> this would still mean you'd have to turn on CONFIG_DEBUG_NO_OVERHEAD,
> but it would be automatically maintainable for you after that initial
> effort, and we'd be careful to always flag new debugging options with
> this flag, if they are expensive. And initially i'd define "expensive"
> as "anything that adds runtime overhead".

the patch below implements this and categorizes all debug options based
on whether they have runtime overhead or not.

(i have left out debug options that are non-transparent - i.e. which
print lots of stuff to the syslog like DEBUG_KOBJECT.)

Ingo

Index: linux/lib/Kconfig.debug
===================================================================
--- linux.orig/lib/Kconfig.debug
+++ linux/lib/Kconfig.debug
@@ -54,6 +54,15 @@ config DEBUG_KERNEL
Say Y here if you are developing drivers or trying to debug and
identify kernel problems.

+config DEBUG_KERNEL_RUNTIME_OVERHEAD
+ bool "Enable new debug options by default"
+ default y
+ help
+ Say Y here if you want to have new debugging options
+ enabled by default even if they cause runtime overhead.
+ (you can still disable/enable them manually, independently
+ of this switch)
+
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)" if DEBUG_KERNEL
range 12 21
@@ -113,7 +122,7 @@ config DEBUG_SLAB
config DEBUG_SLAB_LEAK
bool "Slab memory leak debugging"
depends on DEBUG_SLAB
- default y
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
Enable /proc/slab_allocators - provides detailed information about
which parts of the kernel are using slab objects. May be used for
@@ -122,7 +131,7 @@ config DEBUG_SLAB_LEAK
config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
- default y
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on TRACE_IRQFLAGS_SUPPORT
help
If you say Y here then the kernel will use a debug variant of the
@@ -132,7 +141,7 @@ config DEBUG_PREEMPT

config DEBUG_MUTEXES
bool "Mutex debugging, basic checks"
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on DEBUG_KERNEL
help
This feature allows mutex semantics violations to be detected and
@@ -140,7 +149,7 @@ config DEBUG_MUTEXES

config DEBUG_MUTEX_ALLOC
bool "Detect incorrect freeing of live mutexes"
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on DEBUG_MUTEXES
help
This feature will check whether any held mutex is incorrectly
@@ -150,7 +159,7 @@ config DEBUG_MUTEX_ALLOC

config DEBUG_MUTEX_DEADLOCKS
bool "Detect mutex related deadlocks"
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on DEBUG_MUTEXES
help
This feature will automatically detect and report mutex related
@@ -158,7 +167,7 @@ config DEBUG_MUTEX_DEADLOCKS

config DEBUG_RT_MUTEXES
bool "RT Mutex debugging, deadlock detection"
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on DEBUG_KERNEL && RT_MUTEXES
help
This allows rt mutex semantics violations and rt mutex related
@@ -166,7 +175,7 @@ config DEBUG_RT_MUTEXES

config DEBUG_PI_LIST
bool
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on DEBUG_RT_MUTEXES

config RT_MUTEX_TESTER
@@ -179,6 +188,7 @@ config RT_MUTEX_TESTER
config DEBUG_SPINLOCK
bool "Spinlock debugging"
depends on DEBUG_KERNEL
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
Say Y here and build SMP to catch missing spinlock initialization
and certain other kinds of spinlock errors commonly made. This is
@@ -188,7 +198,7 @@ config DEBUG_SPINLOCK
config PROVE_SPIN_LOCKING
bool "Prove spin-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
This feature enables the kernel to prove that all spinlock
locking that occurs in the kernel runtime is mathematically
@@ -226,7 +236,7 @@ config PROVE_SPIN_LOCKING
config PROVE_RW_LOCKING
bool "Prove rw-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
This feature enables the kernel to prove that all rwlock
locking that occurs in the kernel runtime is mathematically
@@ -264,7 +274,7 @@ config PROVE_RW_LOCKING
config PROVE_MUTEX_LOCKING
bool "Prove mutex-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
This feature enables the kernel to prove that all mutexlock
locking that occurs in the kernel runtime is mathematically
@@ -302,7 +312,7 @@ config PROVE_MUTEX_LOCKING
config PROVE_RWSEM_LOCKING
bool "Prove rwsem-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
This feature enables the kernel to prove that all rwsemlock
locking that occurs in the kernel runtime is mathematically
@@ -348,7 +358,7 @@ config LOCKDEP
config DEBUG_LOCKDEP
bool "Lock dependency engine debugging"
depends on LOCKDEP
- default y
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on TRACE_IRQFLAGS_SUPPORT
help
If you say Y here, the lock dependency engine will do
@@ -363,6 +373,7 @@ config TRACE_IRQFLAGS

config DEBUG_SPINLOCK_SLEEP
bool "Sleep-inside-spinlock checking"
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
depends on DEBUG_KERNEL
help
If you say Y here, various routines which may sleep will become very
@@ -390,6 +401,7 @@ config DEBUG_KOBJECT
config DEBUG_HIGHMEM
bool "Highmem debugging"
depends on DEBUG_KERNEL && HIGHMEM
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
This options enables addition error checking for high memory systems.
Disable for production systems.
@@ -398,7 +410,7 @@ config DEBUG_BUGVERBOSE
bool "Verbose BUG() reporting (adds 70K)" if DEBUG_KERNEL && EMBEDDED
depends on BUG
depends on ARM || ARM26 || M32R || M68K || SPARC32 || SPARC64 || X86_32 || FRV
- default !EMBEDDED
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD && !EMBEDDED
help
Say Y here to make BUG() panics output the file name and line number
of the BUG call as well as the EIP and oops trace. This aids
@@ -437,6 +449,7 @@ config DEBUG_FS
config DEBUG_VM
bool "Debug VM"
depends on DEBUG_KERNEL
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
Enable this to turn on extended checks in the virtual-memory system
that may impact performance.
@@ -446,7 +459,7 @@ config DEBUG_VM
config FRAME_POINTER
bool "Compile the kernel with frame pointers"
depends on DEBUG_KERNEL && (X86 || CRIS || M68K || M68KNOMMU || FRV || UML)
- default y
+ default y if DEBUG_KERNEL_RUNTIME_OVERHEAD
help
If you say Y here the resulting kernel image will be slightly larger
and slower, but it might give very useful debugging information on

2006-05-31 22:46:56

by Roman Zippel

Subject: Re: 2.6.17-rc5-mm1

Hi,

On Thu, 1 Jun 2006, Ingo Molnar wrote:

> on one side, the -mm kernel is about showcasing new code and finding
> bugs in them as fast as possible. Having new debugging options enabled
> by default is an important part of the testing effort. Users will care
> more about having no crashes than about having 0.5% more performance in
> select benchmarks.
>
> on the other side, you obviously dont want a 0.5% overhead for select
> benchmarks, as that would mess up the history! A very fair and valid
> position too.
>
> but one side has to give, we cant have both.

As I mentioned before, please keep these defaults as a -mm-only patch.
Giving them testing in -mm is fine, but defaults are already abused far too
much as it is. The rule should be that an option is enabled explicitly if
it's needed; it should not be auto-enabled just because its author likes
it so much. Using a "default y" should be close to hiding the option via
CONFIG_EMBEDDED or some other option, and the default should not differ
between hidden and visible state, e.g.:

config FOO
bool "foo" if BAR
default y

bye, Roman

2006-05-31 22:49:58

by Ingo Molnar

Subject: Re: 2.6.17-rc5-mm1


* Martin Bligh wrote:

> >>Adding new runs is easy. Changing the harness is hard ;-)
> >
> >
> >ok. How about a CONFIG_DEBUG_NO_OVERHEAD option, that would default to
> >disabled but which you could set to y. Then we could make all the more
> >expensive debug options:
> >
> > default y if !DEBUG_NO_OVERHEAD
> >
> >this would still mean you'd have to turn on CONFIG_DEBUG_NO_OVERHEAD,
> >but it would be automatically maintainable for you after that initial
> >effort, and we'd be careful to always flag new debugging options with
> >this flag, if they are expensive. And initially i'd define "expensive"
> >as "anything that adds runtime overhead".
> >
> >would this be acceptable to you?
>
> Sure, makes sense. I don't care which way up it is, ie
> CONFIG_DEBUG_OVERHEAD vs CONFIG_DEBUG_NO_OVERHEAD, as long as it's
> easily separable.
>
> There's probably other debug stuff we can turn on too, if we do that.

i've attached an updated patch that renames
DEBUG_KERNEL_RUNTIME_OVERHEAD to DEBUG_KERNEL_OVERHEAD :-)

i like the DEBUG_KERNEL_OVERHEAD approach a lot better, because it
relieves you of this constant (and apparently often losing) battle with
Kconfig default values. It also gives us freedom to mark most of the
transparent debugging options as default-enabled.

i think i'll also do a DEBUG_KERNEL_ALLCHECKS flag that if set will
select and enable all the transparent runtime checks. Often people just
want to enable everything that still leaves us with a usable kernel, and
filtering out those debug options can be a challenge.

Ingo

Index: linux/lib/Kconfig.debug
===================================================================
--- linux.orig/lib/Kconfig.debug
+++ linux/lib/Kconfig.debug
@@ -54,6 +54,15 @@ config DEBUG_KERNEL
Say Y here if you are developing drivers or trying to debug and
identify kernel problems.

+config DEBUG_KERNEL_OVERHEAD
+ bool "Enable new debug options by default"
+ default y
+ help
+ Say Y here if you want to have new debugging options
+ enabled by default even if they cause runtime overhead.
+ (you can still disable/enable them manually, independently
+ of this switch)
+
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)" if DEBUG_KERNEL
range 12 21
@@ -113,7 +122,7 @@ config DEBUG_SLAB
config DEBUG_SLAB_LEAK
bool "Slab memory leak debugging"
depends on DEBUG_SLAB
- default y
+ default y if DEBUG_KERNEL_OVERHEAD
help
Enable /proc/slab_allocators - provides detailed information about
which parts of the kernel are using slab objects. May be used for
@@ -122,7 +131,7 @@ config DEBUG_SLAB_LEAK
config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
- default y
+ default y if DEBUG_KERNEL_OVERHEAD
depends on TRACE_IRQFLAGS_SUPPORT
help
If you say Y here then the kernel will use a debug variant of the
@@ -132,7 +141,7 @@ config DEBUG_PREEMPT

config DEBUG_MUTEXES
bool "Mutex debugging, basic checks"
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
depends on DEBUG_KERNEL
help
This feature allows mutex semantics violations to be detected and
@@ -140,7 +149,7 @@ config DEBUG_MUTEXES

config DEBUG_MUTEX_ALLOC
bool "Detect incorrect freeing of live mutexes"
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
depends on DEBUG_MUTEXES
help
This feature will check whether any held mutex is incorrectly
@@ -150,7 +159,7 @@ config DEBUG_MUTEX_ALLOC

config DEBUG_MUTEX_DEADLOCKS
bool "Detect mutex related deadlocks"
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
depends on DEBUG_MUTEXES
help
This feature will automatically detect and report mutex related
@@ -158,7 +167,7 @@ config DEBUG_MUTEX_DEADLOCKS

config DEBUG_RT_MUTEXES
bool "RT Mutex debugging, deadlock detection"
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
depends on DEBUG_KERNEL && RT_MUTEXES
help
This allows rt mutex semantics violations and rt mutex related
@@ -166,7 +175,7 @@ config DEBUG_RT_MUTEXES

config DEBUG_PI_LIST
bool
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
depends on DEBUG_RT_MUTEXES

config RT_MUTEX_TESTER
@@ -179,6 +188,7 @@ config RT_MUTEX_TESTER
config DEBUG_SPINLOCK
bool "Spinlock debugging"
depends on DEBUG_KERNEL
+ default y if DEBUG_KERNEL_OVERHEAD
help
Say Y here and build SMP to catch missing spinlock initialization
and certain other kinds of spinlock errors commonly made. This is
@@ -188,7 +198,7 @@ config DEBUG_SPINLOCK
config PROVE_SPIN_LOCKING
bool "Prove spin-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
help
This feature enables the kernel to prove that all spinlock
locking that occurs in the kernel runtime is mathematically
@@ -226,7 +236,7 @@ config PROVE_SPIN_LOCKING
config PROVE_RW_LOCKING
bool "Prove rw-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
help
This feature enables the kernel to prove that all rwlock
locking that occurs in the kernel runtime is mathematically
@@ -264,7 +274,7 @@ config PROVE_RW_LOCKING
config PROVE_MUTEX_LOCKING
bool "Prove mutex-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
help
This feature enables the kernel to prove that all mutexlock
locking that occurs in the kernel runtime is mathematically
@@ -302,7 +312,7 @@ config PROVE_MUTEX_LOCKING
config PROVE_RWSEM_LOCKING
bool "Prove rwsem-locking correctness"
depends on X86
- default n
+ default y if DEBUG_KERNEL_OVERHEAD
help
This feature enables the kernel to prove that all rwsemlock
locking that occurs in the kernel runtime is mathematically
@@ -348,7 +358,7 @@ config LOCKDEP
config DEBUG_LOCKDEP
bool "Lock dependency engine debugging"
depends on LOCKDEP
- default y
+ default y if DEBUG_KERNEL_OVERHEAD
depends on TRACE_IRQFLAGS_SUPPORT
help
If you say Y here, the lock dependency engine will do
@@ -363,6 +373,7 @@ config TRACE_IRQFLAGS

config DEBUG_SPINLOCK_SLEEP
bool "Sleep-inside-spinlock checking"
+ default y if DEBUG_KERNEL_OVERHEAD
depends on DEBUG_KERNEL
help
If you say Y here, various routines which may sleep will become very
@@ -390,6 +401,7 @@ config DEBUG_KOBJECT
config DEBUG_HIGHMEM
bool "Highmem debugging"
depends on DEBUG_KERNEL && HIGHMEM
+ default y if DEBUG_KERNEL_OVERHEAD
help
This options enables addition error checking for high memory systems.
Disable for production systems.
@@ -398,7 +410,7 @@ config DEBUG_BUGVERBOSE
bool "Verbose BUG() reporting (adds 70K)" if DEBUG_KERNEL && EMBEDDED
depends on BUG
depends on ARM || ARM26 || M32R || M68K || SPARC32 || SPARC64 || X86_32 || FRV
- default !EMBEDDED
+ default y if DEBUG_KERNEL_OVERHEAD && !EMBEDDED
help
Say Y here to make BUG() panics output the file name and line number
of the BUG call as well as the EIP and oops trace. This aids
@@ -437,6 +449,7 @@ config DEBUG_FS
config DEBUG_VM
bool "Debug VM"
depends on DEBUG_KERNEL
+ default y if DEBUG_KERNEL_OVERHEAD
help
Enable this to turn on extended checks in the virtual-memory system
that may impact performance.
@@ -446,7 +459,7 @@ config DEBUG_VM
config FRAME_POINTER
bool "Compile the kernel with frame pointers"
depends on DEBUG_KERNEL && (X86 || CRIS || M68K || M68KNOMMU || FRV || UML)
- default y
+ default y if DEBUG_KERNEL_OVERHEAD
help
If you say Y here the resulting kernel image will be slightly larger
and slower, but it might give very useful debugging information on