2007-08-09 13:51:48

by Chris Snook

[permalink] [raw]
Subject: [PATCH 9/24] make atomic_read() behave consistently on ia64

From: Chris Snook <[email protected]>

Purify volatile use for atomic[64]_t on ia64.

Signed-off-by: Chris Snook <[email protected]>

--- linux-2.6.23-rc2-orig/include/asm-ia64/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc2/include/asm-ia64/atomic.h 2007-08-09 06:53:48.000000000 -0400
@@ -17,18 +17,18 @@
#include <asm/intrinsics.h>
#include <asm/system.h>

-/*
- * On IA-64, counter must always be volatile to ensure that that the
- * memory accesses are ordered.
- */
-typedef struct { volatile __s32 counter; } atomic_t;
-typedef struct { volatile __s64 counter; } atomic64_t;
+typedef struct { __s32 counter; } atomic_t;
+typedef struct { __s64 counter; } atomic64_t;

#define ATOMIC_INIT(i) ((atomic_t) { (i) })
#define ATOMIC64_INIT(i) ((atomic64_t) { (i) })

-#define atomic_read(v) ((v)->counter)
-#define atomic64_read(v) ((v)->counter)
+/*
+ * Casting to volatile here minimizes the need for barriers,
+ * without having to declare the type itself as volatile.
+ */
+#define atomic_read(v) (*(volatile __s32 *)&(v)->counter)
+#define atomic64_read(v) (*(volatile __s64 *)&(v)->counter)

#define atomic_set(v,i) (((v)->counter) = (i))
#define atomic64_set(v,i) (((v)->counter) = (i))


2007-08-09 21:03:49

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

> +#define atomic_read(v) (*(volatile __s32 *)&(v)->counter)
> +#define atomic64_read(v) (*(volatile __s64 *)&(v)->counter)
>
> #define atomic_set(v,i) (((v)->counter) = (i))
> #define atomic64_set(v,i) (((v)->counter) = (i))


Losing the volatile from the "set" variants definitely changes
the code generated. Before the patch gcc would give us:

st4.rel [r37]=r9

after
st4 [r37]=r9

It is unclear whether anyone relies on (or even whether they should
rely on) the release semantics that are provided by the current
version of atomic.h. But making this change would require an
audit of all the uses of atomic_set() to find an answer.

There is a more worrying difference in the generated code (this
from the ancient and venerable gcc 3.4.6 that is on my build
machine). In rwsem_down_failed_common I see this change (after
disassembling vmlinux, I used sed to zap the low 32-bits of addresses
to make the diff manageable ... that's why the addresses all end
in xxxxxxxx):

712868,712873c712913,712921
< a0000001xxxxxxxx: adds r16=-1,r30
< a0000001xxxxxxxx: [MII] ld8.acq r33=[r32]
< a0000001xxxxxxxx: nop.i 0x0;;
< a0000001xxxxxxxx: add r42=r33,r16
< a0000001xxxxxxxx: [MMI] mov.m ar.ccv=r33;;
< a0000001xxxxxxxx: cmpxchg8.acq r34=[r32],r42,ar.ccv
---
> a0000001xxxxxxxx: adds r16=-1,r31
> a0000001xxxxxxxx: [MMI] ld4.acq r14=[r32];;
> a0000001xxxxxxxx: nop.m 0x0
> a0000001xxxxxxxx: sxt4 r34=r14
> a0000001xxxxxxxx: [MMI] nop.m 0x0;;
> a0000001xxxxxxxx: nop.m 0x0
> a0000001xxxxxxxx: add r15=r34,r16
> a0000001xxxxxxxx: [MMI] mov.m ar.ccv=r34;;
> a0000001xxxxxxxx: cmpxchg8.acq r42=[r32],r15,ar.ccv

This code is probably from the rwsem_atomic_update(adjustment, sem) macro
which is cpp'd to atomic64_add_return(). It looks really bad that the new
code reads a 32-bit value and sign extends it rather than reading a 64-bit
value (but I'm perplexed as to why this patch provoked this change in the
generated code).

-Tony

2007-08-10 19:52:11

by Chris Snook

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

Luck, Tony wrote:
>> +#define atomic_read(v) (*(volatile __s32 *)&(v)->counter)
>> +#define atomic64_read(v) (*(volatile __s64 *)&(v)->counter)
>>
>> #define atomic_set(v,i) (((v)->counter) = (i))
>> #define atomic64_set(v,i) (((v)->counter) = (i))
>
>
> Losing the volatile from the "set" variants definitely changes
> the code generated. Before the patch gcc would give us:
>
> st4.rel [r37]=r9
>
> after
> st4 [r37]=r9
>
> It is unclear whether anyone relies on (or even whether they should
> rely on) the release semantics that are provided by the current
> version of atomic.h. But making this change would require an
> audit of all the uses of atomic_set() to find an answer.
>
> There is a more worrying difference in the generated code (this
> from the ancient and venerable gcc 3.4.6 that is on my build
> machine). In rwsem_down_failed_common I see this change (after
> disassembling vmlinux, I used sed to zap the low 32-bits of addresses
> to make the diff manageable ... that's why the addresses all end
> in xxxxxxxx):
>
> 712868,712873c712913,712921
> < a0000001xxxxxxxx: adds r16=-1,r30
> < a0000001xxxxxxxx: [MII] ld8.acq r33=[r32]
> < a0000001xxxxxxxx: nop.i 0x0;;
> < a0000001xxxxxxxx: add r42=r33,r16
> < a0000001xxxxxxxx: [MMI] mov.m ar.ccv=r33;;
> < a0000001xxxxxxxx: cmpxchg8.acq r34=[r32],r42,ar.ccv
> ---
>> a0000001xxxxxxxx: adds r16=-1,r31
>> a0000001xxxxxxxx: [MMI] ld4.acq r14=[r32];;
>> a0000001xxxxxxxx: nop.m 0x0
>> a0000001xxxxxxxx: sxt4 r34=r14
>> a0000001xxxxxxxx: [MMI] nop.m 0x0;;
>> a0000001xxxxxxxx: nop.m 0x0
>> a0000001xxxxxxxx: add r15=r34,r16
>> a0000001xxxxxxxx: [MMI] mov.m ar.ccv=r34;;
>> a0000001xxxxxxxx: cmpxchg8.acq r42=[r32],r15,ar.ccv
>
> This code is probably from the rwsem_atomic_update(adjustment, sem) macro
> which is cpp'd to atomic64_add_return(). It looks really bad that the new
> code reads a 32-bit value and sign extends it rather than reading a 64-bit
> value (but I'm perplexed as to why this patch provoked this change in the
> generated code).
>
> -Tony

That's distressing. I'm about to resubmit with a volatile cast in
atomic_set as well, since people expect that behavior and I've been
shown a legitimate case where it could matter. Does the assembly look
right with that cast in atomic_set() as well?

-- Chris

2007-08-10 21:19:58

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

> That's distressing. I'm about to resubmit with a volatile cast in
> atomic_set as well, since people expect that behavior and I've been
> shown a legitimate case where it could matter. Does the assembly look
> right with that cast in atomic_set() as well?

No. With the casts to volatile in atomic_set and atomic64_set I
still see places where ld8 is changed to ld4 + sign-extend.

-Tony

2007-08-10 21:43:28

by Andreas Schwab

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

"Luck, Tony" <[email protected]> writes:

>> That's distressing. I'm about to resubmit with a volatile cast in
>> atomic_set as well, since people expect that behavior and I've been
>> shown a legitimate case where it could matter. Does the assembly look
>> right with that cast in atomic_set() as well?
>
> No. With the casts to volatile in atomic_set and atomic64_set I
> still see places where ld8 is changed to ld4 + sign-extend.

Use atomic64_read to read an atomic64_t.

Signed-off-by: Andreas Schwab <[email protected]>

diff --git a/include/asm-ia64/atomic.h b/include/asm-ia64/atomic.h
index 1fc3b83..50c2b83 100644
--- a/include/asm-ia64/atomic.h
+++ b/include/asm-ia64/atomic.h
@@ -55,7 +55,7 @@ ia64_atomic64_add (__s64 i, atomic64_t *v)

do {
CMPXCHG_BUGCHECK(v);
- old = atomic_read(v);
+ old = atomic64_read(v);
new = old + i;
} while (ia64_cmpxchg(acq, v, old, new, sizeof(atomic64_t)) != old);
return new;
@@ -83,7 +83,7 @@ ia64_atomic64_sub (__s64 i, atomic64_t *v)

do {
CMPXCHG_BUGCHECK(v);
- old = atomic_read(v);
+ old = atomic64_read(v);
new = old - i;
} while (ia64_cmpxchg(acq, v, old, new, sizeof(atomic64_t)) != old);
return new;

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstra?e 5, 90409 N?rnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2007-08-10 22:33:42

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

> Use atomic64_read to read an atomic64_t.

Thanks Andreas!

Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend
by your change to atomic_read().

With this applied together with shuffling the volatile from the
declaration to the usage (in both atomic_read() and atomic_set()
the generated code *almost* reverts to the original.

There are some differences where ld4 have turned into ld8 though.
Are these bugs in the use of atomic_add() and atomic_sub(). E.g.
the first of these changes is in: ipc/msg.c:freeque() where we have:

atomic_sub(msg->q_cbytes, &msg_bytes);

Now the type of msg->q_cbytes is "unsigned long" ... so it seems a
poor idea to subtract such a large typed object from "msg_bytes"
which is a mere slip of an atomic_t.

Or is there some other type-wrangling that needs to happen in
include/asm-ia64/atomic.h? There are a total of nineteen of
these ld4->ld8 transforms.

-Tony

2007-08-10 22:44:03

by Chris Snook

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

Luck, Tony wrote:
>> Use atomic64_read to read an atomic64_t.
>
> Thanks Andreas!
>
> Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend
> by your change to atomic_read().

I figured as much. Thanks for confirming this.

> With this applied together with shuffling the volatile from the
> declaration to the usage (in both atomic_read() and atomic_set()
> the generated code *almost* reverts to the original.
>
> There are some differences where ld4 have turned into ld8 though.
> Are these bugs in the use of atomic_add() and atomic_sub(). E.g.
> the first of these changes is in: ipc/msg.c:freeque() where we have:
>
> atomic_sub(msg->q_cbytes, &msg_bytes);
>
> Now the type of msg->q_cbytes is "unsigned long" ... so it seems a
> poor idea to subtract such a large typed object from "msg_bytes"
> which is a mere slip of an atomic_t.
>
> Or is there some other type-wrangling that needs to happen in
> include/asm-ia64/atomic.h? There are a total of nineteen of
> these ld4->ld8 transforms.

Possibly. Either that or we've uncovered some latent bugs. Maybe a
combination of the two. Can you list those 19 changes so we can
evaluate them? I'm told there were some *(volatile *) bugs fixed in gcc
recently, so it's also possible your 3.4.6 is showing those. I can test
that on a more recent gcc on ia64 if it's inconvenient for you to do so
on your test box.

-- Chris

2007-08-10 22:59:59

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

> Possibly. Either that or we've uncovered some latent bugs. Maybe a
> combination of the two. Can you list those 19 changes so we can
evaluate them?

Here are the functions in which they occur in the object file. You
may have to chase down some inlining to find the function that
actually uses atomic_*().

freeque
do_msgrcv
sk_free
sock_wfree
sock_rfree
sock_kmalloc
sock_kfree_s
sock_setsockopt
skb_release_data
__sk_stream_mem_reclaim
sk_tream_mem_schedule
sk_stream_rfree
sk_attach_filter
ip_frag_destroy * 2
ip_frag_queue * 2
ip_frag_reasm * 2

-Tony

2007-08-10 23:10:53

by Linus Torvalds

[permalink] [raw]
Subject: RE: [PATCH 9/24] make atomic_read() behave consistently on ia64



On Fri, 10 Aug 2007, Luck, Tony wrote:
>
> Here are the functions in which they occur in the object file. You
> may have to chase down some inlining to find the function that
> actually uses atomic_*().

Could you just make the "atomic_read()" and "atomic_set()" functions be
inline functions instead?

That way you get nice compiler warnings when you pass the wrong kind of
object around. So

static void atomic_set(atomic_t *p, int value)
{
*(volatile int *)&p->value = value;
}

static int atomic_read(atomic_t *p)
{
return *(volatile int *)&p->value;
}

etc...

Linus

2007-08-10 23:16:25

by Chris Snook

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

Linus Torvalds wrote:
>
> On Fri, 10 Aug 2007, Luck, Tony wrote:
>> Here are the functions in which they occur in the object file. You
>> may have to chase down some inlining to find the function that
>> actually uses atomic_*().
>
> Could you just make the "atomic_read()" and "atomic_set()" functions be
> inline functions instead?
>
> That way you get nice compiler warnings when you pass the wrong kind of
> object around. So
>
> static void atomic_set(atomic_t *p, int value)
> {
> *(volatile int *)&p->value = value;
> }
>
> static int atomic_read(atomic_t *p)
> {
> return *(volatile int *)&p->value;
> }
>
> etc...

I'll do this for the whole patchset. Stay tuned for the resubmit.

-- Chris

2007-08-10 23:33:16

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

> Here are the functions in which they occur in the object file. You
> may have to chase down some inlining to find the function that
> actually uses atomic_*().

Ignore this ... Andreas' patch was only two lines so I
thought I'd "save time" by just hand-editing the source over
on my build machine. I managed to goof that by editing the
wrong function for one of the cases. :-(

New result. With Andreas's patch correctly applied, the generated
vmlinux is identical with/without your patch.

-Tony

2007-08-11 00:36:41

by Paul Mackerras

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

Chris Snook writes:

> I'll do this for the whole patchset. Stay tuned for the resubmit.

Could you incorporate Segher's patch to turn atomic_{read,set} into
asm on powerpc? Segher claims that using asm is really the only
reliable way to ensure that gcc does what we want, and he seems to
have a point.

Paul.

2007-08-13 09:09:34

by Chris Snook

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

Paul Mackerras wrote:
> Chris Snook writes:
>
>> I'll do this for the whole patchset. Stay tuned for the resubmit.
>
> Could you incorporate Segher's patch to turn atomic_{read,set} into
> asm on powerpc? Segher claims that using asm is really the only
> reliable way to ensure that gcc does what we want, and he seems to
> have a point.
>
> Paul.

I haven't seen a patch yet. I'm going to resubmit with inline volatile-cast
atomic[64]_[read|set] on all architectures as a reference point, and if anyone
wants to go and implement some of them in assembly, that's between them and the
relevant arch maintainers. I have no problem with (someone else) doing it in
assembly. I just don't think it's necessary and won't let it hold up the effort
to get consistent behavior on all architectures.

-- Chris

2007-08-13 12:51:57

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

On Mon, 13 Aug 2007, Chris Snook wrote:
> Paul Mackerras wrote:
> > Chris Snook writes:
> > > I'll do this for the whole patchset. Stay tuned for the resubmit.
> >
> > Could you incorporate Segher's patch to turn atomic_{read,set} into
> > asm on powerpc? Segher claims that using asm is really the only
> > reliable way to ensure that gcc does what we want, and he seems to
> > have a point.
> >
> > Paul.
>
> I haven't seen a patch yet. I'm going to resubmit with inline volatile-cast
> atomic[64]_[read|set] on all architectures as a reference point, and if anyone
> wants to go and implement some of them in assembly, that's between them and
> the relevant arch maintainers. I have no problem with (someone else) doing it
> in assembly. I just don't think it's necessary and won't let it hold up the
> effort to get consistent behavior on all architectures.

http://lkml.org/lkml/2007/8/10/470

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds