2006-10-03 23:53:07

by Tim Chen

[permalink] [raw]
Subject: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Hi Herbet,

The patch "Let WARN_ON/WARN_ON_ONCE return the condition"
http://kernel.org/git/?
p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=684f978347deb42d180373ac4c427f82ef963171

introduced 40% more 2nd level cache miss to tbench workload
being run in a loop back mode on a Core 2 machine. I think the
introduction of the local variables to WARN_ON and WARN_ON_ONCE

typeof(x) __ret_warn_on = (x);
typeof(condition) __ret_warn_once = (condition);

results in the extra cache misses. In our test workload profile, we see
heavily used functions like do_softirq and local_bh_enable
takes a lot longer to execute.

The modification below helps fix the problem. I made a slight
modification to sched.c to get around a gcc bug.

Signed-off-by: Tim Chen <[email protected]>
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index a525089..05ed388 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -17,13 +17,12 @@ #endif

#ifndef HAVE_ARCH_WARN_ON
#define WARN_ON(condition)
({ \
- typeof(condition) __ret_warn_on =
(condition); \
- if (unlikely(__ret_warn_on))
{ \
+ if (unlikely(condition))
{ \
printk("BUG: warning at %s:%d/%s()\n",
__FILE__, \
__LINE__,
__FUNCTION__); \
dump_stack
(); \
} \
- unlikely
(__ret_warn_on); \
+ unlikely
(condition); \
})
#endif

@@ -43,12 +42,16 @@ #endif

#define WARN_ON_ONCE(condition) ({ \
static int __warn_once = 1; \
- typeof(condition) __ret_warn_once = (condition);\
\
- if (likely(__warn_once)) \
- if (WARN_ON(__ret_warn_once)) \
- __warn_once = 0; \
- unlikely(__ret_warn_once); \
+ if (unlikely(condition)){ \
+ if (likely(__warn_once)){ \
+ __warn_once=0; \
+ printk("BUG: warning at %s:%d/%s()\n", __FILE__,
\
+ __LINE__, __FUNCTION__);\
+ dump_stack(); \
+ } \
+ } \
+ unlikely(condition); \
})

#ifdef CONFIG_SMP
diff --git a/kernel/sched.c b/kernel/sched.c
index 5c848fd..8ae972c 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5629,7 +5629,8 @@ static unsigned long domain_distance(int
struct sched_domain *sd;

for_each_domain(cpu1, sd) {
- WARN_ON(!cpu_isset(cpu1, sd->span));
+ if (unlikely(!cpu_isset(cpu1, sd->span)))
+ WARN_ON(1);
if (cpu_isset(cpu2, sd->span))
return distance;
distance++;



2006-10-04 00:06:23

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Tim Chen wrote:
> Hi Herbet,
>
> The patch "Let WARN_ON/WARN_ON_ONCE return the condition"
> http://kernel.org/git/?
> p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=684f978347deb42d180373ac4c427f82ef963171
>
> introduced 40% more 2nd level cache miss to tbench workload
> being run in a loop back mode on a Core 2 machine. I think the
> introduction of the local variables to WARN_ON and WARN_ON_ONCE
>
> typeof(x) __ret_warn_on = (x);
> typeof(condition) __ret_warn_once = (condition);
>
> results in the extra cache misses. In our test workload profile, we see
> heavily used functions like do_softirq and local_bh_enable
> takes a lot longer to execute.
>
> The modification below helps fix the problem. I made a slight
> modification to sched.c to get around a gcc bug.
>

How does the generated code change? Doesn't evaluating the condition
multiple times have the potential to cause problems?

J

2006-10-04 00:08:41

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Sorry my mailer screwed up the line-wrap. Here's a resent of the patch.

Tim

The patch "Let WARN_ON/WARN_ON_ONCE return the condition"
http://kernel.org/git/?
p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=684f978347deb42d180373ac4c427f82ef963171

introduced 40% more 2nd level cache miss to tbench workload
being run in a loop back mode on a Core 2 machine. I think the
introduction of the local variables to WARN_ON and WARN_ON_ONCE

typeof(x) __ret_warn_on = (x);
typeof(condition) __ret_warn_once = (condition);

results in the extra cache misses. In our test workload profile, we see
heavily used functions like do_softirq and local_bh_enable
takes a lot longer to execute.

The modification below helps fix the problem. I made a slight
modification to sched.c to get around a gcc bug.

Signed-off-by: Tim Chen <[email protected]>
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index a525089..05ed388 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -17,13 +17,12 @@ #endif

#ifndef HAVE_ARCH_WARN_ON
#define WARN_ON(condition) ({ \
- typeof(condition) __ret_warn_on = (condition); \
- if (unlikely(__ret_warn_on)) { \
+ if (unlikely(condition)) { \
printk("BUG: warning at %s:%d/%s()\n", __FILE__, \
__LINE__, __FUNCTION__); \
dump_stack(); \
} \
- unlikely(__ret_warn_on); \
+ unlikely(condition); \
})
#endif

@@ -43,12 +42,16 @@ #endif

#define WARN_ON_ONCE(condition) ({ \
static int __warn_once = 1; \
- typeof(condition) __ret_warn_once = (condition);\
\
- if (likely(__warn_once)) \
- if (WARN_ON(__ret_warn_once)) \
- __warn_once = 0; \
- unlikely(__ret_warn_once); \
+ if (unlikely(condition)){ \
+ if (likely(__warn_once)){ \
+ __warn_once=0; \
+ printk("BUG: warning at %s:%d/%s()\n", __FILE__, \
+ __LINE__, __FUNCTION__);\
+ dump_stack(); \
+ } \
+ } \
+ unlikely(condition); \
})

#ifdef CONFIG_SMP
diff --git a/kernel/sched.c b/kernel/sched.c
index 5c848fd..8ae972c 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5629,7 +5629,8 @@ static unsigned long domain_distance(int
struct sched_domain *sd;

for_each_domain(cpu1, sd) {
- WARN_ON(!cpu_isset(cpu1, sd->span));
+ if (unlikely(!cpu_isset(cpu1, sd->span)))
+ WARN_ON(1);
if (cpu_isset(cpu2, sd->span))
return distance;
distance++;

2006-10-04 00:10:37

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 03 Oct 2006 16:04:04 -0700
Tim Chen <[email protected]> wrote:

> Hi Herbet,
>
> The patch "Let WARN_ON/WARN_ON_ONCE return the condition"
> http://kernel.org/git/?
> p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=684f978347deb42d180373ac4c427f82ef963171
>
> introduced 40% more 2nd level cache miss to tbench workload
> being run in a loop back mode on a Core 2 machine. I think the
> introduction of the local variables to WARN_ON and WARN_ON_ONCE
>
> typeof(x) __ret_warn_on = (x);
> typeof(condition) __ret_warn_once = (condition);
>
> results in the extra cache misses.

I don't see why it should.

Perhaps the `static int __warn_once' is getting put in the same cacheline
as some frequently-modified thing. Perhaps try marking that as __read_mostly?

> In our test workload profile, we see
> heavily used functions like do_softirq and local_bh_enable
> takes a lot longer to execute.
>
> The modification below helps fix the problem. I made a slight
> modification to sched.c to get around a gcc bug.
>
> Signed-off-by: Tim Chen <[email protected]>
> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> index a525089..05ed388 100644
> --- a/include/asm-generic/bug.h
> +++ b/include/asm-generic/bug.h
> @@ -17,13 +17,12 @@ #endif
>
> #ifndef HAVE_ARCH_WARN_ON
> #define WARN_ON(condition)
> ({ \
> - typeof(condition) __ret_warn_on =
> (condition); \
> - if (unlikely(__ret_warn_on))
> { \
> + if (unlikely(condition))
> { \
> printk("BUG: warning at %s:%d/%s()\n",
> __FILE__, \

Your patch is extremely wordwrapped.

2006-10-04 00:32:06

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 2006-10-03 at 17:07 -0700, Andrew Morton wrote:

> >
> > introduced 40% more 2nd level cache miss to tbench workload
> > being run in a loop back mode on a Core 2 machine. I think the
> > introduction of the local variables to WARN_ON and WARN_ON_ONCE
> >
> > typeof(x) __ret_warn_on = (x);
> > typeof(condition) __ret_warn_once = (condition);
> >
> > results in the extra cache misses.
>
> I don't see why it should.
>

Before the WARN_ON/WARN_ON_ONCE patch, the condition given to
WARN_ON/WARN_ON_ONCE is evaluated once and that's it. But after the
patch, the condition is stored in a variable and returned later. I
think that accessing this variable causes cache misses.

> Perhaps the `static int __warn_once' is getting put in the same cacheline
> as some frequently-modified thing. Perhaps try marking that as __read_mostly?
>

I'll give that a try to see if it will improve things.

Tim

2006-10-04 00:37:06

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 2006-10-03 at 17:06 -0700, Jeremy Fitzhardinge wrote:

> How does the generated code change? Doesn't evaluating the condition
> multiple times have the potential to cause problems?
>

I think if the condition changes between two evaluations, we do have a
problem with my fix. I don't have a better idea to avoid using a local
variable to store the condition. I think we should at least reverse the
WARN_ON/WARN_ON_ONCE patch if a better way cannot be found.

Tim

2006-10-04 00:58:30

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 2006-10-03 at 17:07 -0700, Andrew Morton wrote:

>
> Perhaps the `static int __warn_once' is getting put in the same cacheline
> as some frequently-modified thing. Perhaps try marking that as __read_mostly?
>

I've tried marking static int __warn_once as __read_mostly. However, it
did not help with reducing the cache miss :(

So I would suggest reversing the "Let WARN_ON/WARN_ON_ONCE return the
condition" patch. It has just been added 3 days ago so reversing it
should not be a problem.

Tim

2006-10-04 01:15:20

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 03 Oct 2006 17:09:29 -0700
Tim Chen <[email protected]> wrote:

> On Tue, 2006-10-03 at 17:07 -0700, Andrew Morton wrote:
>
> >
> > Perhaps the `static int __warn_once' is getting put in the same cacheline
> > as some frequently-modified thing. Perhaps try marking that as __read_mostly?
> >
>
> I've tried marking static int __warn_once as __read_mostly. However, it
> did not help with reducing the cache miss :(
>
> So I would suggest reversing the "Let WARN_ON/WARN_ON_ONCE return the
> condition" patch. It has just been added 3 days ago so reversing it
> should not be a problem.
>

Not yet, please. This is presently a mystery, and we need to work out
what's going on.

First up, is it due to WARN_ON, or WARN_ON_ONCE? Please try reverting each
one separately.

Let's look at WARN_ON.

Before:

#define WARN_ON(condition) do { \
if (unlikely((condition)!=0)) { \
printk("BUG: warning at %s:%d/%s()\n", __FILE__, __LINE__, __FUNCTION__); \
dump_stack(); \
} \
} while (0)

After:

#define WARN_ON(condition) ({ \
typeof(condition) __ret_warn_on = (condition); \
if (unlikely(__ret_warn_on)) { \
printk("BUG: warning at %s:%d/%s()\n", __FILE__, \
__LINE__, __FUNCTION__); \
dump_stack(); \
} \
unlikely(__ret_warn_on); \
})

There's no difference, except we return the temporary.


And WARN_ON_ONCE.

Before:

#define WARN_ON_ONCE(condition) \
({ \
static int __warn_once = 1; \
int __ret = 0; \
\
if (unlikely((condition) && __warn_once)) { \
__warn_once = 0; \
WARN_ON(1); \
__ret = 1; \
} \
__ret; \
})

After:

#define WARN_ON_ONCE(condition) ({ \
static int __warn_once = 1; \
typeof(condition) __ret_warn_once = (condition);\
\
if (likely(__warn_once)) \
if (WARN_ON(__ret_warn_once)) \
__warn_once = 0; \
unlikely(__ret_warn_once); \
})

There are changes here: in the old code we'll avoid reading the static
variable. In the new code we'll read the static variable, but we'll avoid
evaluating the condition.

Why would that make a measurable difference?

Do you know which WARN_ON (or is it WARN_ON_ONCE?) callsite is causing a
problem?

2006-10-04 01:46:57

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Andrew Morton wrote:
> On Tue, 03 Oct 2006 17:09:29 -0700
> Tim Chen <[email protected]> wrote:
>
>
>>On Tue, 2006-10-03 at 17:07 -0700, Andrew Morton wrote:
>>
>>
>>>Perhaps the `static int __warn_once' is getting put in the same cacheline
>>>as some frequently-modified thing. Perhaps try marking that as __read_mostly?
>>>
>>
>>I've tried marking static int __warn_once as __read_mostly. However, it
>>did not help with reducing the cache miss :(
>>
>>So I would suggest reversing the "Let WARN_ON/WARN_ON_ONCE return the
>>condition" patch. It has just been added 3 days ago so reversing it
>>should not be a problem.
>>
>
>
> Not yet, please. This is presently a mystery, and we need to work out
> what's going on.

Still, it seems kind of odd to add this IMO. Especially the WARN_ON_ONCE
makes the if statement less readable.

if (WARN_ON_ONCE(blah)) {
}

What does that mean? Without looking at the implementation, that says
the condition is true at most once, when the warning is printed.

What's wrong with adding WARN and WARN_ONCE, and eating the single extra
line? You're always telling people to do that with assignments (which I
agree with, but are _more_ readable than this WARN_ON_ONCE thing).

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-10-04 03:35:14

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 3 Oct 2006 23:24:27 -0400
Andrew James Wade <[email protected]> wrote:

> On Tuesday 03 October 2006 21:14, Andrew Morton wrote:
> > There are changes here: in the old code we'll avoid reading the static
> > variable. In the new code we'll read the static variable, but we'll avoid
> > evaluating the condition.
>
> Tim Chen's patch goes back to the old behaviour. I suspect the cache
> misses on __warn_once is what he is measuring. If so, the (untested)
> patch below should reduce the cache misses back to those of the old
> code.
>
> signed-off-by: Andrew Wade <[email protected]>
> diff -rupN a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> --- a/include/asm-generic/bug.h 2006-10-03 13:58:40.000000000 -0400
> +++ b/include/asm-generic/bug.h 2006-10-03 23:17:37.000000000 -0400
> @@ -45,9 +45,10 @@
> static int __warn_once = 1; \
> typeof(condition) __ret_warn_once = (condition);\
> \
> - if (likely(__warn_once)) \
> - if (WARN_ON(__ret_warn_once)) \
> + if (unlikely(__ret_warn_once) && __warn_once) { \
> __warn_once = 0; \
> + WARN_ON(1); \
> + }; \
> unlikely(__ret_warn_once); \
> })

It might help, but we still don't know what's going on (I think).

I mean, if cache misses against __warn_once were sufficiently high for it
to affect performance, then __warn_once would be, err, in cache?

2006-10-04 04:06:36

by Andrew James Wade

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tuesday 03 October 2006 21:14, Andrew Morton wrote:
> There are changes here: in the old code we'll avoid reading the static
> variable. In the new code we'll read the static variable, but we'll avoid
> evaluating the condition.

Tim Chen's patch goes back to the old behaviour. I suspect the cache
misses on __warn_once is what he is measuring. If so, the (untested)
patch below should reduce the cache misses back to those of the old
code.

signed-off-by: Andrew Wade <[email protected]>
diff -rupN a/include/asm-generic/bug.h b/include/asm-generic/bug.h
--- a/include/asm-generic/bug.h 2006-10-03 13:58:40.000000000 -0400
+++ b/include/asm-generic/bug.h 2006-10-03 23:17:37.000000000 -0400
@@ -45,9 +45,10 @@
static int __warn_once = 1; \
typeof(condition) __ret_warn_once = (condition);\
\
- if (likely(__warn_once)) \
- if (WARN_ON(__ret_warn_once)) \
+ if (unlikely(__ret_warn_once) && __warn_once) { \
__warn_once = 0; \
+ WARN_ON(1); \
+ }; \
unlikely(__ret_warn_once); \
})

2006-10-04 04:40:00

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Tim Chen wrote:
> I think if the condition changes between two evaluations, we do have a
> problem with my fix.

That's not the problem; one hopes that the WARN_ON predicate has no
side-effects (though I know there are places which have side effects in
BUG_ON). The point is that the vast majority of WARN_ONs *don't* have
their values used, so in the current code the variable reference is dead
code and will be removed. But if gcc can't prove the predicate is
side-effect free (call to an external function, for example), then gcc
will have to generate two calls to it, regardless of whether the second
value is used.

And since the condition variable will - at worst - be stored on the
stack on a hot cache line, I don't see how there could be any extra
cache misses.

> I don't have a better idea to avoid using a local
> variable to store the condition. I think we should at least reverse the
> WARN_ON/WARN_ON_ONCE patch if a better way cannot be found.

I don't think you've proved your case here. Do you *know* there are
extra cache misses (ie, measuring them), or is it just your theory to
explain a performance regression?

The other question is whether WARN_ON should return a value. Where does
it get used? It doesn't seem very valuable.

J

2006-10-04 14:10:40

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 2006-10-03 at 21:39 -0700, Jeremy Fitzhardinge wrote:

>
> I don't think you've proved your case here. Do you *know* there are
> extra cache misses (ie, measuring them), or is it just your theory to
> explain a performance regression?
>

I have measured the cache miss with tool. So it is not just my theory.

Tim

2006-10-04 16:30:36

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 04 Oct 2006 06:21:35 -0700
Tim Chen <[email protected]> wrote:

> On Tue, 2006-10-03 at 21:39 -0700, Jeremy Fitzhardinge wrote:
>
> >
> > I don't think you've proved your case here. Do you *know* there are
> > extra cache misses (ie, measuring them), or is it just your theory to
> > explain a performance regression?
> >
>
> I have measured the cache miss with tool. So it is not just my theory.
>

And what did that tool tell you?

Guys. Please. Help us out here. None of this makes sense, and it's
possible that we have an underlying problem in there which we need to know
about.

Please don't just ignore my questions. *why* are we getting a cache miss
rate on that integer which is causing measurable performance changes? If
we're reading it that frequently then the variable should be in cache(!).

Again: do you know which callsite is causing the problem? I assume one of
the ones in softirq.c? Do you know what the cache miss frequency is? etc.

Because if we don't answer these questions there's an excellent chance that
the problem (whatever it is) will come back and bite us again.

2006-10-04 16:51:40

by Andrew James Wade

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tuesday 03 October 2006 23:32, Andrew Morton wrote:

> It might help, but we still don't know what's going on (I think).
>
> I mean, if cache misses against __warn_once were sufficiently high for it
> to affect performance, then __warn_once would be, err, in cache?

Yes, of course. I'm embarrassed.

I took a look at the generated code, and GCC is having difficulty
optimizing WARN_ON_ONCE. Here is the start of __local_bh_enable:

00000130 <__local_bh_enable>:
130: 83 ec 10 sub $0x10,%esp
133: 8b 15 04 00 00 00 mov 0x4,%edx <-+
139: 89 e0 mov %esp,%eax |
13b: 25 00 e0 ff ff and $0xffffe000,%eax | !!!
140: 8b 40 14 mov 0x14(%eax),%eax |
143: 25 00 00 ff 0f and $0xfff0000,%eax |
148: 85 d2 test %edx,%edx <-+
14a: 74 04 je 150 <__local_bh_enable+0x20>
14c: 85 c0 test %eax,%eax
14e: 75 35 jne 185 <__local_bh_enable+0x55>
150: 89 e0 mov %esp,%eax
152: 8b 0d 00 00 00 00 mov 0x0,%ecx <-+
158: 25 00 e0 ff ff and $0xffffe000,%eax |
15d: 8b 40 14 mov 0x14(%eax),%eax | !!!
160: 25 00 ff 00 00 and $0xff00,%eax |
165: 3d 00 01 00 00 cmp $0x100,%eax |
16a: 0f 94 c0 sete %al |
16d: 85 c9 test %ecx,%ecx <-+
16f: 0f b6 c0 movzbl %al,%eax
172: 74 04 je 178 <__local_bh_enable+0x48>
174: 85 c0 test %eax,%eax
176: 75 42 jne 1ba <__local_bh_enable+0x8a>
178: b8 00 01 00 00 mov $0x100,%eax
17d: 83 c4 10 add $0x10,%esp
180: e9 fc ff ff ff jmp 181 <__local_bh_enable+0x51>
185: c7 44 24 0c 3e 00 00 movl $0x3e,0xc(%esp)
...

WARN_ON is better, but still has problems:

000011a0 <do_exit>:
11a0: 55 push %ebp
11a1: 57 push %edi
11a2: 56 push %esi
11a3: 53 push %ebx
11a4: 83 ec 30 sub $0x30,%esp
11a7: 89 44 24 18 mov %eax,0x18(%esp)
11ab: 89 e0 mov %esp,%eax
11ad: 25 00 e0 ff ff and $0xffffe000,%eax
11b2: 8b 30 mov (%eax),%esi
11b4: 8b 86 58 0a 00 00 mov 0xa58(%esi),%eax
11ba: 89 44 24 2c mov %eax,0x2c(%esp) <-+
11be: 8b 44 24 2c mov 0x2c(%esp),%eax <-+
11c2: 85 c0 test %eax,%eax | !!!
11c4: 0f 85 65 07 00 00 jne 192f <do_exit+0x78f> |
11ca: 8b 44 24 2c mov 0x2c(%esp),%eax <-+
11ce: 89 e0 mov %esp,%eax
...

This is gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5), -O2.

I don't know why this would show up as cache misses, but it does look
like a compiler bug is being tickled.

Andrew Wade

2006-10-04 16:58:10

by Ananiev, Leonid I

[permalink] [raw]
Subject: RE: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

>Guys. Please. Help us out here. None of this makes sense, and it's
> possible that we have an underlying problem in there which we need to
know
> about.
This is explantion:

The static variable __warn_once was "never" read (until there is no bug)
before patch "Let WARN_ON/WARN_ON_ONCE return the condition"
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi
t;h=684f978347deb42d180373ac4c427f82ef963171
in WARN_ON_ONCE's line
- if (unlikely((condition) && __warn_once)) { \
because 'condition' is false. There was no cache miss as a result.

Cache miss for __warn_once is happened in new lines
+ if (likely(__warn_once)) \
+ if (WARN_ON(__ret_warn_once)) \

Proposed patch is tested by tbench.

>From Leonid Ananiev

Cache miss eliminating in WARN_ON_ONCE by testing expression value
before static variable value.

Signed-off-by: Leonid Ananiev <[email protected]>
--- linux-2.6.18-git14/include/asm-generic/bug.h
+++ linux-2.6.18-git14b/include/asm-generic/bug.h
@@ -45,9 +45,12 @@
static int __warn_once = 1; \
typeof(condition) __ret_warn_once = (condition);\
\
- if (likely(__warn_once)) \
- if (WARN_ON(__ret_warn_once)) \
+ /* check 'condition' first to avoid cache miss with __warn_once
*/\
+ if (unlikely(__ret_warn_once)) \
+ if (__warn_once) { \
__warn_once = 0; \
+ WARN_ON(1); \
+ } \
unlikely(__ret_warn_once); \
})
-----Original Message-----
From: Andrew Morton [mailto:[email protected]]
Sent: Wednesday, October 04, 2006 8:30 PM
To: [email protected]
Cc: Jeremy Fitzhardinge; [email protected];
[email protected]; Ananiev, Leonid I
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 04 Oct 2006 06:21:35 -0700
Tim Chen <[email protected]> wrote:

> On Tue, 2006-10-03 at 21:39 -0700, Jeremy Fitzhardinge wrote:
>
> >
> > I don't think you've proved your case here. Do you *know* there are

> > extra cache misses (ie, measuring them), or is it just your theory
to
> > explain a performance regression?
> >
>
> I have measured the cache miss with tool. So it is not just my
theory.
>

And what did that tool tell you?

Guys. Please. Help us out here. None of this makes sense, and it's
possible that we have an underlying problem in there which we need to
know
about.

Please don't just ignore my questions. *why* are we getting a cache
miss
rate on that integer which is causing measurable performance changes?
If
we're reading it that frequently then the variable should be in
cache(!).

Again: do you know which callsite is causing the problem? I assume one
of
the ones in softirq.c? Do you know what the cache miss frequency is?
etc.

Because if we don't answer these questions there's an excellent chance
that
the problem (whatever it is) will come back and bite us again.

2006-10-04 17:11:12

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 2006-10-04 at 09:30 -0700, Andrew Morton wrote:

> > I have measured the cache miss with tool. So it is not just my theory.
> >
>
> And what did that tool tell you?

I am using emon. Measuring a 20 second stretch of tbench run saw the L2
cache miss go from 14 million to 25 million on each of the cpu core.
>
> Please don't just ignore my questions. *why* are we getting a cache miss
> rate on that integer which is causing measurable performance changes? If
> we're reading it that frequently then the variable should be in cache(!).
>

The point is valid, __warn_once should be in cache, unless something
evicts it. What I have found so far is with patch by Andrew and Leonid
that avoid looking up the __warn_once integer, the cache miss rate is
reduced to the level before.

> Again: do you know which callsite is causing the problem? I assume one of
> the ones in softirq.c? Do you know what the cache miss frequency is? etc.
>
Unfortunately emon does not directly give the callsite. Oprofile data
shows a marked increase in time spent in do_softirq and local_bh_enable.
What I could do is to individually turn off WARN_ON_ONCE at these sites
and see if they are responsible for the cache miss. Will let you know
what I found.

Oprofile data --
Before WARN_ON_ONCE patch:

117767 thread_return
106651 local_bh_enable
83767 tcp_v4_rcv
72266 copy_user_generic_unrolled
47136 do_softirq

41100 tcp_recvmsg
39394 tcp_sendmsg
118383 thread_return
88171 copy_user_generic

..
8281 local_bh_enable
6790 do_softirq

After WARN_ON_ONCE patch:

117767 thread_return
106651 local_bh_enable
83767 tcp_v4_rcv
72266 copy_user_generic_unrolled
47136 do_softirq


Tim

2006-10-04 17:28:24

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 4 Oct 2006 20:57:57 +0400
"Ananiev, Leonid I" <[email protected]> wrote:

> >Guys. Please. Help us out here. None of this makes sense, and it's
> > possible that we have an underlying problem in there which we need to
> know
> > about.
> This is explantion:
>
> The static variable __warn_once was "never" read (until there is no bug)
> before patch "Let WARN_ON/WARN_ON_ONCE return the condition"
> http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi
> t;h=684f978347deb42d180373ac4c427f82ef963171
> in WARN_ON_ONCE's line
> - if (unlikely((condition) && __warn_once)) { \
> because 'condition' is false. There was no cache miss as a result.
>
> Cache miss for __warn_once is happened in new lines
> + if (likely(__warn_once)) \
> + if (WARN_ON(__ret_warn_once)) \
>

That's one cache miss. One. For the remainder of the benchmark,
__warn_once is in cache and there are no more misses. That's how caches
work ;)

But it appears this isn't happening. Why?

2006-10-04 17:34:21

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 04 Oct 2006 09:22:09 -0700
Tim Chen <[email protected]> wrote:

> On Wed, 2006-10-04 at 09:30 -0700, Andrew Morton wrote:
>
> > > I have measured the cache miss with tool. So it is not just my theory.
> > >
> >
> > And what did that tool tell you?
>
> I am using emon. Measuring a 20 second stretch of tbench run saw the L2
> cache miss go from 14 million to 25 million on each of the cpu core.
> >
> > Please don't just ignore my questions. *why* are we getting a cache miss
> > rate on that integer which is causing measurable performance changes? If
> > we're reading it that frequently then the variable should be in cache(!).
> >
>
> The point is valid, __warn_once should be in cache, unless something
> evicts it. What I have found so far is with patch by Andrew and Leonid
> that avoid looking up the __warn_once integer, the cache miss rate is
> reduced to the level before.
>
> > Again: do you know which callsite is causing the problem? I assume one of
> > the ones in softirq.c? Do you know what the cache miss frequency is? etc.
> >
> Unfortunately emon does not directly give the callsite. Oprofile data
> shows a marked increase in time spent in do_softirq and local_bh_enable.
> What I could do is to individually turn off WARN_ON_ONCE at these sites
> and see if they are responsible for the cache miss. Will let you know
> what I found.

I see, thanks. How very peculiar.

I wonder if we just got unlucky and that particular benchmark with that
particular kernel build just happens to reach the cache system's
associativity threshold, and this one extra cacheline took it over the
edge. Or something.

But if that was the case, one would expect even a small change to the
kernel (say, a different compiler version, or different config) would
improve things a lot, or would worsen things a lot.

I think the change is right. But, again, I worry that any teeny change we
make in there in the future will bring the problem back.

2006-10-04 21:32:52

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 2006-10-04 at 10:34 -0700, Andrew Morton wrote:

>
> I wonder if we just got unlucky and that particular benchmark with that
> particular kernel build just happens to reach the cache system's
> associativity threshold, and this one extra cacheline took it over the
> edge. Or something.
>

I took a look at the "L2 cache lines evicted" data from emon. It
increases by 17 million per cpu core during the 20 second tbench run
with the WARN_ON_ONCE patch, along with 11 million increase in "L2 cache
miss". So perhaps the __warn_once variable got evicted. No theory at
least point on how it got evicted.

Also did an experiment to locate the callsite causing cache miss with
the original WARN_ON_ONCE patch. I turned off WARN_ON_ONCE one by one in
kernel/softirq.c where almost all call for WARN_ON_ONCE originated. I
found that with each turnoff of WARN_ON_ONCE, the L2 cache miss go down
by 1 to 2 million, L2 cache line eviction goes down by 2 to 3 million.

One particular WARN_ON_ONCE in local_bh_enable contributed the most: 6
million L2 cache miss and 10 million L2 cache line evictions. That
particular WARN_ON_ONCE also has the most effect on tbench throughput
(accounting for 70% of the tbench throughput reduction).


-- Code snippet --
void local_bh_enable(void)
{
#ifdef CONFIG_TRACE_IRQFLAGS
unsigned long flags;

WARN_ON_ONCE(in_irq());
#endif
WARN_ON_ONCE(irqs_disabled()); <------- most degradation here.
CONFIG_TRACE_IRQFLAGS off

#ifdef CONFIG_TRACE_IRQFLAGS
local_irq_save(flags);
#endif
/*

Thanks.

Tim

2006-10-04 21:55:32

by Ananiev, Leonid I

[permalink] [raw]
Subject: RE: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

> That's one cache miss. One. For the remainder of the benchmark,
> __warn_once is in cache and there are no more misses. That's how
caches
> work ;)

Before patch for WARN_ON_ONCE we never had risk to get a cache miss for
__warn_once.
After patch we have.
> __warn_once is in cache and there are no more misses.
There is no insuerence that it is in the cache.

Next:
CPU 1Mb cache cache_alignment : 128 => -16% degradation of tbench
CPU 2Mb cache cache_alignment : 128 => -1% degradation
CPU 4Mb cache cache_alignment : 64 => -70% degradation

An evictees depends from cache size and line size. last is more
essential in considered case.

Leonid
-----Original Message-----
From: Andrew Morton [mailto:[email protected]]
Sent: Wednesday, October 04, 2006 9:28 PM
To: Ananiev, Leonid I
Cc: [email protected]; Jeremy Fitzhardinge;
[email protected]; [email protected]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 4 Oct 2006 20:57:57 +0400
"Ananiev, Leonid I" <[email protected]> wrote:

> >Guys. Please. Help us out here. None of this makes sense, and it's
> > possible that we have an underlying problem in there which we need
to
> know
> > about.
> This is explantion:
>
> The static variable __warn_once was "never" read (until there is no
bug)
> before patch "Let WARN_ON/WARN_ON_ONCE return the condition"
>
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi
> t;h=684f978347deb42d180373ac4c427f82ef963171
> in WARN_ON_ONCE's line
> - if (unlikely((condition) && __warn_once)) { \
> because 'condition' is false. There was no cache miss as a result.
>
> Cache miss for __warn_once is happened in new lines
> + if (likely(__warn_once)) \
> + if (WARN_ON(__ret_warn_once)) \
>

That's one cache miss. One. For the remainder of the benchmark,
__warn_once is in cache and there are no more misses. That's how caches
work ;)

But it appears this isn't happening. Why?

2006-10-04 22:06:41

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 4 Oct 2006 12:47:00 -0400
Andrew James Wade <[email protected]> wrote:

> On Tuesday 03 October 2006 23:32, Andrew Morton wrote:
>
> > It might help, but we still don't know what's going on (I think).
> >
> > I mean, if cache misses against __warn_once were sufficiently high for it
> > to affect performance, then __warn_once would be, err, in cache?
>
> Yes, of course. I'm embarrassed.
>
> I took a look at the generated code, and GCC is having difficulty
> optimizing WARN_ON_ONCE. Here is the start of __local_bh_enable:
>
> 00000130 <__local_bh_enable>:
> 130: 83 ec 10 sub $0x10,%esp
> 133: 8b 15 04 00 00 00 mov 0x4,%edx <-+
> 139: 89 e0 mov %esp,%eax |
> 13b: 25 00 e0 ff ff and $0xffffe000,%eax | !!!
> 140: 8b 40 14 mov 0x14(%eax),%eax |
> 143: 25 00 00 ff 0f and $0xfff0000,%eax |

This is the evaluation of in_irq(): calculate `current', grab
current->thread_info->preempt_count.

Normally gcc does manage to CSE the value of current.

> 148: 85 d2 test %edx,%edx <-+
> 14a: 74 04 je 150 <__local_bh_enable+0x20>
> 14c: 85 c0 test %eax,%eax
> 14e: 75 35 jne 185 <__local_bh_enable+0x55>
> 150: 89 e0 mov %esp,%eax
> 152: 8b 0d 00 00 00 00 mov 0x0,%ecx <-+
> 158: 25 00 e0 ff ff and $0xffffe000,%eax |

but this time it went and reevaluated it.

> 15d: 8b 40 14 mov 0x14(%eax),%eax | !!!
> 160: 25 00 ff 00 00 and $0xff00,%eax |
> 165: 3d 00 01 00 00 cmp $0x100,%eax |
> 16a: 0f 94 c0 sete %al |
> 16d: 85 c9 test %ecx,%ecx <-+
> 16f: 0f b6 c0 movzbl %al,%eax
> 172: 74 04 je 178 <__local_bh_enable+0x48>
> 174: 85 c0 test %eax,%eax
> 176: 75 42 jne 1ba <__local_bh_enable+0x8a>
> 178: b8 00 01 00 00 mov $0x100,%eax
> 17d: 83 c4 10 add $0x10,%esp
> 180: e9 fc ff ff ff jmp 181 <__local_bh_enable+0x51>
> 185: c7 44 24 0c 3e 00 00 movl $0x3e,0xc(%esp)
> ...
>
> WARN_ON is better, but still has problems:
>
> 000011a0 <do_exit>:
> 11a0: 55 push %ebp
> 11a1: 57 push %edi
> 11a2: 56 push %esi
> 11a3: 53 push %ebx
> 11a4: 83 ec 30 sub $0x30,%esp
> 11a7: 89 44 24 18 mov %eax,0x18(%esp)
> 11ab: 89 e0 mov %esp,%eax
> 11ad: 25 00 e0 ff ff and $0xffffe000,%eax
> 11b2: 8b 30 mov (%eax),%esi
> 11b4: 8b 86 58 0a 00 00 mov 0xa58(%esi),%eax
> 11ba: 89 44 24 2c mov %eax,0x2c(%esp) <-+
> 11be: 8b 44 24 2c mov 0x2c(%esp),%eax <-+
> 11c2: 85 c0 test %eax,%eax | !!!
> 11c4: 0f 85 65 07 00 00 jne 192f <do_exit+0x78f> |
> 11ca: 8b 44 24 2c mov 0x2c(%esp),%eax <-+
> 11ce: 89 e0 mov %esp,%eax
> ...

No, that's pretty much the same. In the __local_bh_enable case we have the
evaluation of in_irq() as well as softirq_count(). do_exit() just has a
single WARN_ON().


> This is gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5), -O2.
>
> I don't know why this would show up as cache misses, but it does look
> like a compiler bug is being tickled.

The code could be better, but nope, there are no additional cache misses
there.

Still weird.

2006-10-05 08:17:55

by Andrew James Wade

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wednesday 04 October 2006 18:06, Andrew Morton wrote:
> On Wed, 4 Oct 2006 12:47:00 -0400
> Andrew James Wade <[email protected]> wrote:
>
> > On Tuesday 03 October 2006 23:32, Andrew Morton wrote:
> >
> > > It might help, but we still don't know what's going on (I think).
> > >
> > > I mean, if cache misses against __warn_once were sufficiently high for it
> > > to affect performance, then __warn_once would be, err, in cache?
> >
> > Yes, of course. I'm embarrassed.
> >
> > I took a look at the generated code, and GCC is having difficulty
> > optimizing WARN_ON_ONCE. Here is the start of __local_bh_enable:
> >
> > 00000130 <__local_bh_enable>:
> > 130: 83 ec 10 sub $0x10,%esp
> > 133: 8b 15 04 00 00 00 mov 0x4,%edx <-+
> > 139: 89 e0 mov %esp,%eax |
> > 13b: 25 00 e0 ff ff and $0xffffe000,%eax | !!!
> > 140: 8b 40 14 mov 0x14(%eax),%eax |
> > 143: 25 00 00 ff 0f and $0xfff0000,%eax |
>
> This is the evaluation of in_irq(): calculate `current', grab
> current->thread_info->preempt_count.

Actually I was confusing "mov 0x4,%edx" with "mov $0x4,%edx". That
code's fine (albeit unlinked). There are stupid inefficiencies in some
of the other code, but nothing really stood out at me in
__local_bh_enable, _local_bh_enable, or local_bh_Enable.

(from earlier)
> Perhaps the `static int __warn_once' is getting put in the same cacheline
> as some frequently-modified thing.

hmm:

00000460 l O .data 00000044 task_exit_notifier
000004c0 l O .data 0000002c task_free_notifier
000004ec l O .data 00000004 warnlimit.15904
000004f0 l O .data 00000004 firsttime.15774
000004f4 l O .data 00000004 __warn_once.15180
000004f8 l O .data 00000004 __warn_once.15174
000004fc l O .data 00000004 __warn_once.15213
00000500 l O .data 00000004 __warn_once.15207
00000504 l O .data 00000004 __warn_once.15145
00000508 l O .data 00000004 __warn_once.15309
0000050c l O .data 00000004 __warn_once.15256
00000510 l O .data 00000004 __warn_once.15250
000005a0 l O .data 0000006c proc_iomem_operations
(extracted from objdump -t kernel/built-in.o)

warnlimit and firsttime are fine, and proc_iomem_operations is
presumably fine as well. But I'm not so sure task_free_notifier is
infrequently modified. But that's just my .config and I'm out of my depth.

Andrew Wade

2006-10-05 08:40:04

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Thu, 5 Oct 2006 04:13:07 -0400
Andrew James Wade <[email protected]> wrote:

> (from earlier)
> > Perhaps the `static int __warn_once' is getting put in the same cacheline
> > as some frequently-modified thing.
>
> hmm:
>
> 00000460 l O .data 00000044 task_exit_notifier
> 000004c0 l O .data 0000002c task_free_notifier
> 000004ec l O .data 00000004 warnlimit.15904
> 000004f0 l O .data 00000004 firsttime.15774
> 000004f4 l O .data 00000004 __warn_once.15180
> 000004f8 l O .data 00000004 __warn_once.15174
> 000004fc l O .data 00000004 __warn_once.15213
> 00000500 l O .data 00000004 __warn_once.15207
> 00000504 l O .data 00000004 __warn_once.15145
> 00000508 l O .data 00000004 __warn_once.15309
> 0000050c l O .data 00000004 __warn_once.15256
> 00000510 l O .data 00000004 __warn_once.15250
> 000005a0 l O .data 0000006c proc_iomem_operations
> (extracted from objdump -t kernel/built-in.o)


That all looks OK (by sheer luck).

Well. What's the cache line size on that machine? Every exit() will cause
a down_read() on task_exit_notifier's lock which might affect things. And
I think you snipped the above list a bit short (depending on that line
size).


But still, we know that moving those things into __read_mostly didn't fix
it, yes?

2006-10-05 21:33:07

by Andrew James Wade

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Thursday 05 October 2006 04:36, Andrew Morton wrote:
> On Thu, 5 Oct 2006 04:13:07 -0400
> Andrew James Wade <[email protected]> wrote:
>
> > (...)
>
>
> That all looks OK (by sheer luck).
>
> Well. What's the cache line size on that machine? Every exit() will cause
> a down_read() on task_exit_notifier's lock which might affect things. And
> I think you snipped the above list a bit short (depending on that line
> size).
>
>
> But still, we know that moving those things into __read_mostly didn't fix
> it, yes?

No. To my knowledge Tim Chen hasn't tried __read_mostly, and I have not
attempted to replicate the test case. (I only have a uniprocessor
machine.) Core 2 machines have a cache line size of 64 bytes, but Tim
Chen is likely using a different kernel/.config than I am so my objdump
isn't definitive.

Tim, perhaps you can try the __read_mostly marking as Andrew suggests?

signed-off-by: Andrew Wade <[email protected]>
upN a/include/asm-generic/bug.h b/include/asm-generic/bug.h
--- a/include/asm-generic/bug.h 2006-10-05 16:16:37.000000000 -0400
+++ b/include/asm-generic/bug.h 2006-10-05 16:33:08.000000000 -0400
@@ -42,7 +42,7 @@
#endif

#define WARN_ON_ONCE(condition) ({ \
- static int __warn_once = 1; \
+ static int __warn_once __read_mostly = 1; \
typeof(condition) __ret_warn_once = (condition);\
\
if (likely(__warn_once)) \

2006-10-05 21:38:08

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression


So how's this look?

I worry a bit that someone's hardware might go and prefetch that static
variable even when we didn't ask it to. Can that happen?




Tim and Ananiev report that the recent WARN_ON_ONCE changes cause increased
cache misses with the tbench workload. Apparently due to the access to the
newly-added static variable.

Rearrange the code so that we don't touch that variable unless the warning is
going to trigger.

Also rework the logic so that the static variable starts out at zero, so we
can move it into bss.

It would seem logical to mark the static variable as __read_mostly too. But
it would be wrong, because that would put it back into the vmlinux image, and
the kernel will never read from this variable in normal operation anyway.
Unless the compiler or hardware go and do some prefetching on us?

For some reason this patch shrinks softirq.o text by 40 bytes.

Cc: Tim Chen <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "Ananiev, Leonid I" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

include/asm-generic/bug.h | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)

diff -puN include/asm-generic/bug.h~fix-warn_on--warn_on_once-regression include/asm-generic/bug.h
--- a/include/asm-generic/bug.h~fix-warn_on--warn_on_once-regression
+++ a/include/asm-generic/bug.h
@@ -41,14 +41,14 @@
#endif
#endif

-#define WARN_ON_ONCE(condition) ({ \
- static int __warn_once = 1; \
- typeof(condition) __ret_warn_once = (condition);\
- \
- if (likely(__warn_once)) \
- if (WARN_ON(__ret_warn_once)) \
- __warn_once = 0; \
- unlikely(__ret_warn_once); \
+#define WARN_ON_ONCE(condition) ({ \
+ static int __warned; \
+ typeof(condition) __ret_warn_once = (condition); \
+ \
+ if (unlikely(__ret_warn_once)) \
+ if (WARN_ON(!__warned)) \
+ __warned = 1; \
+ unlikely(__ret_warn_once); \
})

#ifdef CONFIG_SMP
_

2006-10-05 21:43:21

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Andrew Morton wrote:
> So how's this look?
>

Looks fine to me. Other than the general question of why WARN_ON*
returns a value at all, and if so, does the final unlikely() really do
anything.

> I worry a bit that someone's hardware might go and prefetch that static
> variable even when we didn't ask it to. Can that happen?
>

Sure, the CPU has the option, but if it goes around speculatively
polluting its caches with unused stuff, it isn't going to perform very
well in general. So presumably the CPU won't do it unless it really
thinks it will help.

J

2006-10-05 21:50:53

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression



On Thu, 2006-10-05 at 17:31 -0400, Andrew James Wade wrote:
> On Thursday 05 October 2006 04:36, Andrew Morton wrote:
> > On Thu, 5 Oct 2006 04:13:07 -0400
> > Andrew James Wade <[email protected]> wrote:
> >
> > > (...)
> >
> >
> > That all looks OK (by sheer luck).
> >
> > Well. What's the cache line size on that machine? Every exit() will cause
> > a down_read() on task_exit_notifier's lock which might affect things. And
> > I think you snipped the above list a bit short (depending on that line
> > size).
> >
> >
> > But still, we know that moving those things into __read_mostly didn't fix
> > it, yes?
>
> No. To my knowledge Tim Chen hasn't tried __read_mostly, and I have not
> attempted to replicate the test case. (I only have a uniprocessor
> machine.) Core 2 machines have a cache line size of 64 bytes, but Tim
> Chen is likely using a different kernel/.config than I am so my objdump
> isn't definitive.
>
> Tim, perhaps you can try the __read_mostly marking as Andrew suggests?
>

I have run the workload with __read_mostly marking. But it didn't make
a difference. By the way, the cache line size of my machine is 64
bytes.

Thanks.

Tim

2006-10-05 21:52:36

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Thu, 05 Oct 2006 14:43:01 -0700
Jeremy Fitzhardinge <[email protected]> wrote:

> Andrew Morton wrote:
> > So how's this look?
> >
>
> Looks fine to me. Other than the general question of why WARN_ON*
> returns a value at all, and if so, does the final unlikely() really do
> anything.

Herbert had a good-sounding reason for wanting this feature, but afaict he
hasn't proceeded to use it at this stage. And he's hiding from us ;)

2006-10-05 22:04:19

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Thu, Oct 05, 2006 at 02:52:13PM -0700, Andrew Morton wrote:
>
> Herbert had a good-sounding reason for wanting this feature, but afaict he
> hasn't proceeded to use it at this stage. And he's hiding from us ;)

Well you guys had everything under control so I was happy to stay
behind my rock :)

The original reason for the return value is so you can do

if (WARN_ON(impossible_condition)) {
attempt_to_continue;
}

instead of

if (unlikely(impossible_condition)) {
WARN_ON(1);
attempt_to_continue;
}

Oh and yes the unlikely does make a difference in a statement
expression.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2006-10-05 22:40:31

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Thu, 2006-10-05 at 14:37 -0700, Andrew Morton wrote:

>
> Tim and Ananiev report that the recent WARN_ON_ONCE changes cause increased
> cache misses with the tbench workload. Apparently due to the access to the
> newly-added static variable.
>
> Rearrange the code so that we don't touch that variable unless the warning is
> going to trigger.
>
> Also rework the logic so that the static variable starts out at zero, so we
> can move it into bss.
>
> It would seem logical to mark the static variable as __read_mostly too. But
> it would be wrong, because that would put it back into the vmlinux image, and
> the kernel will never read from this variable in normal operation anyway.
> Unless the compiler or hardware go and do some prefetching on us?
>
> For some reason this patch shrinks softirq.o text by 40 bytes.
>

Andrew,

Thanks for the patch. I tested it and it helped fix tbench regression
seen earlier and reduced the cache misses.

Tim

2006-10-05 22:40:10

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Herbert Xu wrote:
> The original reason for the return value is so you can do
>
> if (WARN_ON(impossible_condition)) {
> attempt_to_continue;
> }
>
> instead of
>
> if (unlikely(impossible_condition)) {
> WARN_ON(1);
> attempt_to_continue;
> }
>

(Hm, WARN_ON(1) is pretty ugly; we should probably have a WARN() as well.)

Why is the second one any better than the first? It's a line less code,
but that doesn't seem like a big deal. It's not like passing the actual
condition into WARN_ON is useful, because it doesn't try to print it
out. And "if (WARN_ON_ONCE(cond)) ..." is arguably more useful (since it
encapsulates the printing once logic), but also very unclear (does it
evaluate true once or every time?).

There are certainly lots of places in the kernel which could use
if(WARN_ON(...)), but I haven't found any places which actually do. I
just don't see what benefit you would gain in converting things to using
if(WARN_ON(...)) anyway.

> Oh and yes the unlikely does make a difference in a statement
> expression.
>

I was thinking something like

unlikely(({
...
}))

is a bit more obvious in terms of imagining how it would get expanded
and evaluated.

J

2006-10-06 04:06:44

by Ananiev, Leonid I

[permalink] [raw]
Subject: RE: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

> Apparently due to the access to the
> newly-added static variable.

The cache miss was increased as a result of instructions reordering but
not due to the newly-added static variable. The original patch have
added local but not static variable. Newly-added local variable compiler
has placed in register.
The original patch made reordering of tests-register-and-jump and
test-memory-and-jump instructions. That was a side effect of the patch.
To watch closely ordering of 'if (a && b)' or 'if (b && a)' becomes more
weight than 'unlikely()' using if 'a' is in register and 'b' is in
memory. I would say: unlikely() cost is 10:1; ordering cost is 1000:1.

Leonid

-----Original Message-----
From: Andrew Morton [mailto:[email protected]]
Sent: Friday, October 06, 2006 1:38 AM
To: Ananiev, Leonid I
Cc: [email protected]; Jeremy Fitzhardinge;
[email protected]; [email protected]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression


So how's this look?

I worry a bit that someone's hardware might go and prefetch that static
variable even when we didn't ask it to. Can that happen?




Tim and Ananiev report that the recent WARN_ON_ONCE changes cause
increased
cache misses with the tbench workload. Apparently due to the access to
the
newly-added static variable.

Rearrange the code so that we don't touch that variable unless the
warning is
going to trigger.

Also rework the logic so that the static variable starts out at zero, so
we
can move it into bss.

It would seem logical to mark the static variable as __read_mostly too.
But
it would be wrong, because that would put it back into the vmlinux
image, and
the kernel will never read from this variable in normal operation
anyway.
Unless the compiler or hardware go and do some prefetching on us?

For some reason this patch shrinks softirq.o text by 40 bytes.

Cc: Tim Chen <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "Ananiev, Leonid I" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

include/asm-generic/bug.h | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)

diff -puN include/asm-generic/bug.h~fix-warn_on--warn_on_once-regression
include/asm-generic/bug.h
--- a/include/asm-generic/bug.h~fix-warn_on--warn_on_once-regression
+++ a/include/asm-generic/bug.h
@@ -41,14 +41,14 @@
#endif
#endif

-#define WARN_ON_ONCE(condition) ({ \
- static int __warn_once = 1; \
- typeof(condition) __ret_warn_once = (condition);\
- \
- if (likely(__warn_once)) \
- if (WARN_ON(__ret_warn_once)) \
- __warn_once = 0; \
- unlikely(__ret_warn_once); \
+#define WARN_ON_ONCE(condition) ({
\
+ static int __warned; \
+ typeof(condition) __ret_warn_once = (condition); \
+ \
+ if (unlikely(__ret_warn_once)) \
+ if (WARN_ON(!__warned)) \
+ __warned = 1; \
+ unlikely(__ret_warn_once); \
})

#ifdef CONFIG_SMP
_

2006-10-06 16:11:38

by Andrew James Wade

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Thursday 05 October 2006 17:37, Andrew Morton wrote:
>
> So how's this look?
>
> I worry a bit that someone's hardware might go and prefetch that static
> variable even when we didn't ask it to. Can that happen?

Branch mispredictions might cause spurious fetches. But the
unlikely() should take care of that for the (presumably common)
!__warned case.

> ...
>
> It would seem logical to mark the static variable as __read_mostly too. But
> it would be wrong, because that would put it back into the vmlinux image, and
> the kernel will never read from this variable in normal operation anyway.

And if that's the case they should probably be in amongst write-hot
variables so as to reduce cache-line ping-ponging. __read_mostly should
probably be called __read_hot_write_cold, see
http://lkml.org/lkml/2006/6/26/290 .

> Unless the compiler or hardware go and do some prefetching on us?

The compiler doesn't, at least not GCC 4. I wouldn't expect it to
hoist loads out of non-loop blocks.

> For some reason this patch shrinks softirq.o text by 40 bytes.

The compiler optimizes out __ret_warn_once; it didn't do that before.

The patch looks good to me, though I am still baffled as to why the
cache misses were occurring.

Andrew Wade

2006-10-08 00:29:11

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 2006-10-04 at 10:28 -0700, Andrew Morton wrote:
> On Wed, 4 Oct 2006 20:57:57 +0400
> "Ananiev, Leonid I" <[email protected]> wrote:
>
> > >Guys. Please. Help us out here. None of this makes sense, and it's
> > > possible that we have an underlying problem in there which we need to
> > know
> > > about.
> > This is explantion:
> >
> > The static variable __warn_once was "never" read (until there is no bug)
> > before patch "Let WARN_ON/WARN_ON_ONCE return the condition"
> > http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi
> > t;h=684f978347deb42d180373ac4c427f82ef963171
> > in WARN_ON_ONCE's line
> > - if (unlikely((condition) && __warn_once)) { \
> > because 'condition' is false. There was no cache miss as a result.
> >
> > Cache miss for __warn_once is happened in new lines
> > + if (likely(__warn_once)) \
> > + if (WARN_ON(__ret_warn_once)) \
> >
>
> That's one cache miss. One. For the remainder of the benchmark,
> __warn_once is in cache and there are no more misses. That's how caches
> work ;)
>
> But it appears this isn't happening. Why?

day-ja-vu!

Andrew, this discussion came up back when Ingo and Arjan introduced
WARN_ON_ONCE. Well, not exactly. I'm sorry, but I missed what was wrong
with the current way of doing WARN_ON_ONCE?

Anyway, what's the advantage of testing a variable that is most likely
will be true, and that you will need to test the condition *anyway*.
Even if the __warn_once is in cache, it may be pushing something out of
a register, to read the variable and test it. And after all that we test
the condition too, with no savings.

Is this patch to get rid of the int ret=0? Doesn't the compiler
optimize that out?

Here's my comment when I sent the patch to change the original:

if (unlikely(__warn_once && (condition)))

to

if (unlikely((condition) && __warn_once)))

http://marc.theaimsgroup.com/?l=linux-kernel&m=114935833125957&w=2

That was different, since we were putting a likely condition in an
unlikely(). But I still don't see why we would ever want to test
__warn_once before the condition, since it doesn't save on anything and
just adds extra work. I don't see the savings.

-- Steve

2006-10-08 00:39:15

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Steven Rostedt wrote:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=114935833125957&w=2
>
> That was different, since we were putting a likely condition in an
> unlikely(). But I still don't see why we would ever want to test
> __warn_once before the condition, since it doesn't save on anything and
> just adds extra work. I don't see the savings.
>

Also, in that thread you cite (__warn_once && (condition)) is flat-out
wrong, because condition may have a side-effect. There are plenty of
places in the code which use BUG_ON or WARN_ON as a general error
checking mechanism which expect the condition to be always evaluated
once; WARN_ON_ONCE should be the same.

Personally I think it is poor style, but there you are.

J

2006-10-10 01:58:12

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Wed, 2006-10-04 at 10:34 -0700, Andrew Morton wrote:

> > > Please don't just ignore my questions. *why* are we getting a cache miss
> > > rate on that integer which is causing measurable performance changes? If
> > > we're reading it that frequently then the variable should be in cache(!).
> > >
> >
> > The point is valid, __warn_once should be in cache, unless something
> > evicts it. What I have found so far is with patch by Andrew and Leonid
> > that avoid looking up the __warn_once integer, the cache miss rate is
> > reduced to the level before.

>
> I see, thanks. How very peculiar.
>
> I wonder if we just got unlucky and that particular benchmark with that
> particular kernel build just happens to reach the cache system's
> associativity threshold, and this one extra cacheline took it over the
> edge. Or something.
>


We believe we found the real cause of the performance regression:
it is precisely a self-inflicted cache line conflict on __warn_once
global variable thanks to the older gcc (v3.4.5) backend assembly code
optimizer. The newer version of gcc (v4.1.0) does not suffer the same
problem.

We fall in the trap of thinking the __warn_once variable
is truly read mostly and is written only once in the very very unlikely
case of bug triggering. But the compiler is doing something which
turns the following innocent read only looking code into
write always assembly code. The compiler is doing so to avoid a
conditional jump.


The original "C" code looks very innocent:

if (WARN_ON(__ret_warn_once));
__warn_once = 0;

The equivalent asm code generated by gcc looks like:

temp = 0;
if (!WARN_ON(__ret_warn_once))
temp = __warn_once;
__warn_once = temp;


As a result, a global variable is being written from all CPUs
everywhere and caused excessive cache line bouncing on SMP.
We measured that HITM event increased by 75% and
read-for-ownership event increased by 50%. Adding a
__read_mostly directive to __warn_once didn't help
because gcc still generate assembly code that write to
that global variable.

Thanks.

Tim

2006-10-10 13:05:08

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Tim,

Great work on finding this!

On Mon, 2006-10-09 at 18:09 -0700, Tim Chen wrote:

>
>
> The original "C" code looks very innocent:
>
> if (WARN_ON(__ret_warn_once));
> __warn_once = 0;
>
> The equivalent asm code generated by gcc looks like:
>
> temp = 0;
> if (!WARN_ON(__ret_warn_once))
> temp = __warn_once;
> __warn_once = temp;
>
>
> As a result, a global variable is being written from all CPUs
> everywhere and caused excessive cache line bouncing on SMP.
> We measured that HITM event increased by 75% and
> read-for-ownership event increased by 50%. Adding a
> __read_mostly directive to __warn_once didn't help
> because gcc still generate assembly code that write to
> that global variable.

Holy crap! I wonder where else in the kernel gcc is doing this. (of
course I'm using gcc4 so I don't know). Is there another gcc attribute
to actually tell gcc that a variable is really mostly read only (besides
placing it in a mostly read only elf section)?

What was wrong with the original WARN_ON_ONCE with

if (unlikely(condition) && __warn_once)

This didn't have the cache crash problem too, did it?
I don't have a gcc3 around to test.

-- Steve


2006-10-10 16:31:12

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

On Tue, 2006-10-10 at 09:04 -0400, Steven Rostedt wrote:

>
> Holy crap! I wonder where else in the kernel gcc is doing this. (of
> course I'm using gcc4 so I don't know). Is there another gcc attribute
> to actually tell gcc that a variable is really mostly read only (besides
> placing it in a mostly read only elf section)?
>
> What was wrong with the original WARN_ON_ONCE with
>
> if (unlikely(condition) && __warn_once)
>
> This didn't have the cache crash problem too, did it?
> I don't have a gcc3 around to test.

In the original WARN_ON_ONCE, gcc3 only writes to __warn_once when
(condition) is true. So it does not cause cache bouncing by always
writing to __warn_once.

Tim

2006-10-10 20:03:52

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Steven Rostedt wrote:
> Holy crap! I wonder where else in the kernel gcc is doing this. (of
> course I'm using gcc4 so I don't know). Is there another gcc attribute
> to actually tell gcc that a variable is really mostly read only (besides
> placing it in a mostly read only elf section)?
>

That would be nice, but I don't know of one (apart from "volatile",
which has its own downsides). Once could imagine an annotation which
makes gcc consider writes to the variable relatively expensive, so that
it avoids generating unnecessary/excessive writes.

J

2006-10-10 21:05:21

by Ananiev, Leonid I

[permalink] [raw]
Subject: RE: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression



-----Original Message-----
From: Jeremy Fitzhardinge [mailto:[email protected]]
Sent: Wednesday, October 11, 2006 12:04 AM
To: Steven Rostedt
Cc: [email protected]; Andrew Morton;
[email protected]; [email protected]; Ananiev,
Leonid I
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Steven Rostedt wrote:
> Holy crap! I wonder where else in the kernel gcc is doing this.
Jeremy Fitzhardinge wrote:
> annotation which makes gcc consider writes to the variable relatively
expensive

I should underline that cache miss is a result of invalidating of cache
line with __warn_once in each other CPUs performed by hw for cache
coherence.
__warn_once is a common data. It is costly to test-and-modify it just in
SMP. But it is not costly to write to the variable in memory just after
reading it. As a compiler have understood source code.
A read-and-modify for common variable are performed under lock usually.

Leonid

Steven Rostedt wrote:
> Holy crap! I wonder where else in the kernel gcc is doing this. (of
> course I'm using gcc4 so I don't know). Is there another gcc
attribute
> to actually tell gcc that a variable is really mostly read only
(besides
> placing it in a mostly read only elf section)?
>

That would be nice, but I don't know of one (apart from "volatile",
which has its own downsides). Once could imagine an annotation which
makes gcc consider writes to the variable relatively expensive, so that
it avoids generating unnecessary/excessive writes.

J

2006-10-10 21:18:09

by Steven Rostedt

[permalink] [raw]
Subject: RE: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression



On Wed, 11 Oct 2006, Ananiev, Leonid I wrote:

> Steven Rostedt wrote:
> > Holy crap! I wonder where else in the kernel gcc is doing this.
> Jeremy Fitzhardinge wrote:
> > annotation which makes gcc consider writes to the variable relatively
> expensive
>
> I should underline that cache miss is a result of invalidating of cache
> line with __warn_once in each other CPUs performed by hw for cache
> coherence.
> __warn_once is a common data. It is costly to test-and-modify it just in
> SMP. But it is not costly to write to the variable in memory just after
> reading it. As a compiler have understood source code.
> A read-and-modify for common variable are performed under lock usually.
>

In todays world, SMP is becoming more and more common (still waiting to
get that DualCore cell phone). So that means that writing to a variable
is going to carry more weight than it use to, and gcc needs to take note
of this. So, to avoid a short condition jump by adding a write to
memory, is not going to save anying.

Maybe gcc just needs to have an option to "optimize" for SMP.

-- Steve

2006-10-10 21:41:59

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Steven> In todays world, SMP is becoming more and more common
Steven> (still waiting to get that DualCore cell phone).

A digression -- but most "smart" phones these days actually are dual
core, although not SMP -- typically one core will run the DSP-type
stuff for actually being a phone while the other core will run a more
general-purpose OS for the fancy UI and "smart" apps.

- R.

2006-10-10 22:59:43

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] Fix WARN_ON / WARN_ON_ONCE regression

Steven Rostedt wrote:
> In todays world, SMP is becoming more and more common (still waiting to
> get that DualCore cell phone). So that means that writing to a variable
> is going to carry more weight than it use to, and gcc needs to take note
> of this. So, to avoid a short condition jump by adding a write to
> memory, is not going to save anying.
>

In general shared variables are going to be pretty rare, and its
reasonable for gcc to assume they aren't. But it would be nice to have
a good way to solve cases like this (though it seems like (condition) &&
__warn_once is the right way to go here anyway).

J