2008-03-13 09:08:17

by Jan Beulich

Subject: [RFC] x86: bitops asm constraint fixes

This (simplified) piece of code didn't behave as expected due to
incorrect constraints in some of the bitops functions, when
X86_FEATURE_xxx is referring to other than the first long:

int test(struct cpuinfo_x86 *c) {
	if (cpu_has(c, X86_FEATURE_xxx))
		clear_cpu_cap(c, X86_FEATURE_xxx);
	return cpu_has(c, X86_FEATURE_xxx);
}

I'd really like to understand, though, what the policy of (not) having a
"memory" clobber in these operations is - currently, this appears to
be totally inconsistent. Also, many comments of the non-atomic
functions say those may also be re-ordered - this contradicts the use
of "asm volatile" in there, which again I'd like to understand.

Besides all of this, using 'int' for the 'nr' parameter and
'void *' for the 'addr' one is in conflict with
Documentation/atomic_ops.txt, especially because bt{,c,r,s} indeed
take the bit index as signed (which hence would really need special
precaution) and access the full 32 bits (64 bits on x86-64, if
'unsigned long' were used properly here) pointed at, so invalid uses
like referencing a 'char' array cannot currently be caught.

Finally, the code with and without this patch relies heavily on the
-fno-strict-aliasing compiler switch and I'm not certain this really
is a good idea.

In the light of all of this I'm sending this as RFC, as fixing the
above might warrant a much bigger patch...

Signed-off-by: Jan Beulich <[email protected]>

---
include/asm-x86/bitops.h | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)

--- linux-2.6.25-rc5/include/asm-x86/bitops.h 2008-03-10 13:24:33.000000000 +0100
+++ 2.6.25-rc5-x86-clear-bit/include/asm-x86/bitops.h 2008-03-13 08:45:40.000000000 +0100
@@ -24,9 +24,12 @@
/* Technically wrong, but this avoids compilation errors on some gcc
versions. */
#define ADDR "=m" (*(volatile long *) addr)
+#define BIT_ADDR "=m" (((volatile int *) addr)[nr >> 5])
#else
#define ADDR "+m" (*(volatile long *) addr)
+#define BIT_ADDR "+m" (((volatile int *) addr)[nr >> 5])
#endif
+#define BASE_ADDR "m" (*(volatile int *) addr)

/**
* set_bit - Atomically set a bit in memory
@@ -79,9 +82,8 @@ static inline void __set_bit(int nr, vol
*/
static inline void clear_bit(int nr, volatile void *addr)
{
- asm volatile(LOCK_PREFIX "btr %1,%0"
- : ADDR
- : "Ir" (nr));
+ asm volatile(LOCK_PREFIX "btr %1,%2"
+ : BIT_ADDR : "Ir" (nr), BASE_ADDR);
}

/*
@@ -100,7 +102,7 @@ static inline void clear_bit_unlock(unsi

static inline void __clear_bit(int nr, volatile void *addr)
{
- asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
+ asm volatile("btr %1,%2" : BIT_ADDR : "Ir" (nr), BASE_ADDR);
}

/*
@@ -135,7 +137,7 @@ static inline void __clear_bit_unlock(un
*/
static inline void __change_bit(int nr, volatile void *addr)
{
- asm volatile("btc %1,%0" : ADDR : "Ir" (nr));
+ asm volatile("btc %1,%2" : BIT_ADDR : "Ir" (nr), BASE_ADDR);
}

/**
@@ -149,8 +151,8 @@ static inline void __change_bit(int nr,
*/
static inline void change_bit(int nr, volatile void *addr)
{
- asm volatile(LOCK_PREFIX "btc %1,%0"
- : ADDR : "Ir" (nr));
+ asm volatile(LOCK_PREFIX "btc %1,%2"
+ : BIT_ADDR : "Ir" (nr), BASE_ADDR);
}

/**
@@ -198,10 +200,10 @@ static inline int __test_and_set_bit(int
{
int oldbit;

- asm("bts %2,%1\n\t"
- "sbb %0,%0"
- : "=r" (oldbit), ADDR
- : "Ir" (nr));
+ asm volatile("bts %2,%3\n\t"
+ "sbb %0,%0"
+ : "=r" (oldbit), BIT_ADDR
+ : "Ir" (nr), BASE_ADDR);
return oldbit;
}

@@ -238,10 +240,10 @@ static inline int __test_and_clear_bit(i
{
int oldbit;

- asm volatile("btr %2,%1\n\t"
+ asm volatile("btr %2,%3\n\t"
"sbb %0,%0"
- : "=r" (oldbit), ADDR
- : "Ir" (nr));
+ : "=r" (oldbit), BIT_ADDR
+ : "Ir" (nr), BASE_ADDR);
return oldbit;
}

@@ -250,10 +252,10 @@ static inline int __test_and_change_bit(
{
int oldbit;

- asm volatile("btc %2,%1\n\t"
+ asm volatile("btc %2,%3\n\t"
"sbb %0,%0"
- : "=r" (oldbit), ADDR
- : "Ir" (nr) : "memory");
+ : "=r" (oldbit), BIT_ADDR
+ : "Ir" (nr), BASE_ADDR);

return oldbit;
}
@@ -288,10 +290,11 @@ static inline int variable_test_bit(int
{
int oldbit;

- asm volatile("bt %2,%1\n\t"
+ asm volatile("bt %2,%3\n\t"
"sbb %0,%0"
: "=r" (oldbit)
- : "m" (*(unsigned long *)addr), "Ir" (nr));
+ : "m" (((volatile const int *)addr)[nr >> 5]),
+ "Ir" (nr), BASE_ADDR);

return oldbit;
}
@@ -310,6 +313,8 @@ static int test_bit(int nr, const volati
constant_test_bit((nr),(addr)) : \
variable_test_bit((nr),(addr)))

+#undef BASE_ADDR
+#undef BIT_ADDR
#undef ADDR

#ifdef CONFIG_X86_32


2008-03-14 07:52:14

by H. Peter Anvin

Subject: Re: [RFC] x86: bitops asm constraint fixes

Jan Beulich wrote:
>
> I'd really like to understand, though, what the policy of (not) having a
> "memory" clobber in these operations is - currently, this appears to
> be totally inconsistent. Also, many comments of the non-atomic
> functions say those may also be re-ordered - this contradicts the use
> of "asm volatile" in there, which again I'd like to understand.
>

In general, proper "m" constraints are better than "memory" clobbers,
since they give gcc more information. Note that the "m" constraint
doesn't actually have to be *manifest* in the assembly string.

-hpa

2008-03-14 08:09:12

by Jan Beulich

Subject: Re: [RFC] x86: bitops asm constraint fixes

>>> "H. Peter Anvin" <[email protected]> 14.03.08 08:51 >>>
>Jan Beulich wrote:
>>
>> I'd really like to understand, though, what the policy of (not) having a
>> "memory" clobber in these operations is - currently, this appears to
>> be totally inconsistent. Also, many comments of the non-atomic
>> functions say those may also be re-ordered - this contradicts the use
>> of "asm volatile" in there, which again I'd like to understand.
>>
>
>In general, proper "m" constraints are better than "memory" clobbers,
>since they give gcc more information. Note that the "m" constraint
>doesn't actually have to be *manifest* in the assembly string.

... which is the case with the patch applied.

So may I take this as 'yes, a proper re-write of these routines is
worthwhile'? You didn't comment on the other issues raised, though,
so before getting to that I'll have to wait and see what the reason
(if any) for the other anomalies is.

Jan

2008-03-14 18:58:20

by Jeremy Fitzhardinge

Subject: Re: [RFC] x86: bitops asm constraint fixes

Jan Beulich wrote:
> This (simplified) piece of code didn't behave as expected due to
> incorrect constraints in some of the bitops functions, when
> X86_FEATURE_xxx is referring to other than the first long:
>
> int test(struct cpuinfo_x86 *c) {
> 	if (cpu_has(c, X86_FEATURE_xxx))
> 		clear_cpu_cap(c, X86_FEATURE_xxx);
> 	return cpu_has(c, X86_FEATURE_xxx);
> }
>
> I'd really like to understand, though, what the policy of (not) having a
> "memory" clobber in these operations is - currently, this appears to
> be totally inconsistent.
I think there's years of history here, much of it involving rites with
chicken entrails.

"memory" clobber is generally needed because the bit operations can
touch memory beyond their apparent arguments. Proper "m" constraints
are the way to go.

> Also, many comments of the non-atomic
> functions say those may also be re-ordered - this contradicts the use
> of "asm volatile" in there, which again I'd like to understand.
>

"asm volatile" has no effect on ordering. It's only necessary to force
an asm with no apparent side-effects to get emitted (ie, an asm with
outputs which don't get used; asms without outputs are implicitly volatile).

All these operations should explicitly list memory modification
as an output. The bit tests have no side-effects, so there's no problem
if gcc decides they don't need to be emitted.

> As much as all of these, using 'int' for the 'nr' parameter and
> 'void *' for the 'addr' one is in conflict with
> Documentation/atomic_ops.txt, especially because bt{,c,r,s} indeed
> take the bit index as signed (which hence would really need special
> precaution) and access the full 32 bits (if 'unsigned long' was used
> properly here, 64 bits for x86-64) pointed at, so invalid uses like
> referencing a 'char' array cannot currently be caught.
>

What's the problem with accessing a char array as bits?

> Finally, the code with and without this patch relies heavily on the
> -fno-strict-aliasing compiler switch and I'm not certain this really
> is a good idea.
>

Doesn't the casting via void * stomp all that?

> In the light of all of this I'm sending this as RFC, as fixing the
> above might warrant a much bigger patch...
>
> Signed-off-by: Jan Beulich <[email protected]>
>
> ---
> include/asm-x86/bitops.h | 43 ++++++++++++++++++++++++-------------------
> 1 file changed, 24 insertions(+), 19 deletions(-)
>
> --- linux-2.6.25-rc5/include/asm-x86/bitops.h 2008-03-10 13:24:33.000000000 +0100
> +++ 2.6.25-rc5-x86-clear-bit/include/asm-x86/bitops.h 2008-03-13 08:45:40.000000000 +0100
> @@ -24,9 +24,12 @@
> /* Technically wrong, but this avoids compilation errors on some gcc
> versions. */
> #define ADDR "=m" (*(volatile long *) addr)
> +#define BIT_ADDR "=m" (((volatile int *) addr)[nr >> 5])
> #else
> #define ADDR "+m" (*(volatile long *) addr)
> +#define BIT_ADDR "+m" (((volatile int *) addr)[nr >> 5])
> #endif
> +#define BASE_ADDR "m" (*(volatile int *) addr)
>

Hm, hardcoding ">> 5" seems like asking for trouble. At least make the
"int" an explicitly sized type. It should also pass "nr" in as an
explicit macro argument rather than just picking it up.

Does plain ADDR still get used?

It's unfortunate that gcc will runtime-compute the address of
addr[nr>>5] even though we only need it for proper compile-time constraints.

J

>
> /**
> * set_bit - Atomically set a bit in memory
> @@ -79,9 +82,8 @@ static inline void __set_bit(int nr, vol
> */
> static inline void clear_bit(int nr, volatile void *addr)
> {
> - asm volatile(LOCK_PREFIX "btr %1,%0"
> - : ADDR
> - : "Ir" (nr));
> + asm volatile(LOCK_PREFIX "btr %1,%2"
> + : BIT_ADDR : "Ir" (nr), BASE_ADDR);
> }
>
> /*
> @@ -100,7 +102,7 @@ static inline void clear_bit_unlock(unsi
>
> static inline void __clear_bit(int nr, volatile void *addr)
> {
> - asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
> + asm volatile("btr %1,%2" : BIT_ADDR : "Ir" (nr), BASE_ADDR);
> }
>
> /*
> @@ -135,7 +137,7 @@ static inline void __clear_bit_unlock(un
> */
> static inline void __change_bit(int nr, volatile void *addr)
> {
> - asm volatile("btc %1,%0" : ADDR : "Ir" (nr));
> + asm volatile("btc %1,%2" : BIT_ADDR : "Ir" (nr), BASE_ADDR);
> }
>
> /**
> @@ -149,8 +151,8 @@ static inline void __change_bit(int nr,
> */
> static inline void change_bit(int nr, volatile void *addr)
> {
> - asm volatile(LOCK_PREFIX "btc %1,%0"
> - : ADDR : "Ir" (nr));
> + asm volatile(LOCK_PREFIX "btc %1,%2"
> + : BIT_ADDR : "Ir" (nr), BASE_ADDR);
> }
>
> /**
> @@ -198,10 +200,10 @@ static inline int __test_and_set_bit(int
> {
> int oldbit;
>
> - asm("bts %2,%1\n\t"
> - "sbb %0,%0"
> - : "=r" (oldbit), ADDR
> - : "Ir" (nr));
> + asm volatile("bts %2,%3\n\t"
> + "sbb %0,%0"
> + : "=r" (oldbit), BIT_ADDR
> + : "Ir" (nr), BASE_ADDR);
> return oldbit;
> }
>
> @@ -238,10 +240,10 @@ static inline int __test_and_clear_bit(i
> {
> int oldbit;
>
> - asm volatile("btr %2,%1\n\t"
> + asm volatile("btr %2,%3\n\t"
> "sbb %0,%0"
> - : "=r" (oldbit), ADDR
> - : "Ir" (nr));
> + : "=r" (oldbit), BIT_ADDR
> + : "Ir" (nr), BASE_ADDR);
> return oldbit;
> }
>
> @@ -250,10 +252,10 @@ static inline int __test_and_change_bit(
> {
> int oldbit;
>
> - asm volatile("btc %2,%1\n\t"
> + asm volatile("btc %2,%3\n\t"
> "sbb %0,%0"
> - : "=r" (oldbit), ADDR
> - : "Ir" (nr) : "memory");
> + : "=r" (oldbit), BIT_ADDR
> + : "Ir" (nr), BASE_ADDR);
>
> return oldbit;
> }
> @@ -288,10 +290,11 @@ static inline int variable_test_bit(int
> {
> int oldbit;
>
> - asm volatile("bt %2,%1\n\t"
> + asm volatile("bt %2,%3\n\t"
> "sbb %0,%0"
> : "=r" (oldbit)
> - : "m" (*(unsigned long *)addr), "Ir" (nr));
> + : "m" (((volatile const int *)addr)[nr >> 5]),
> + "Ir" (nr), BASE_ADDR);
>
> return oldbit;
> }
> @@ -310,6 +313,8 @@ static int test_bit(int nr, const volati
> constant_test_bit((nr),(addr)) : \
> variable_test_bit((nr),(addr)))
>
> +#undef BASE_ADDR
> +#undef BIT_ADDR
> #undef ADDR
>
> #ifdef CONFIG_X86_32
>
>

2008-03-14 21:07:48

by Chuck Ebbert

Subject: Re: [RFC] x86: bitops asm constraint fixes

On 03/13/2008 05:08 AM, Jan Beulich wrote:
> This (simplified) piece of code didn't behave as expected due to
> incorrect constraints in some of the bitops functions, when
> X86_FEATURE_xxx is referring to other than the first long:
>
> int test(struct cpuinfo_x86 *c) {
> 	if (cpu_has(c, X86_FEATURE_xxx))
> 		clear_cpu_cap(c, X86_FEATURE_xxx);
> 	return cpu_has(c, X86_FEATURE_xxx);
> }
>

This is a long-standing bug and your patch appears to fix it.

> --- linux-2.6.25-rc5/include/asm-x86/bitops.h 2008-03-10 13:24:33.000000000 +0100
> +++ 2.6.25-rc5-x86-clear-bit/include/asm-x86/bitops.h 2008-03-13 08:45:40.000000000 +0100
> @@ -24,9 +24,12 @@
> /* Technically wrong, but this avoids compilation errors on some gcc
> versions. */
> #define ADDR "=m" (*(volatile long *) addr)
> +#define BIT_ADDR "=m" (((volatile int *) addr)[nr >> 5])
> #else
> #define ADDR "+m" (*(volatile long *) addr)
> +#define BIT_ADDR "+m" (((volatile int *) addr)[nr >> 5])
> #endif
> +#define BASE_ADDR "m" (*(volatile int *) addr)

Can't you just do everything with unsigned longs, like this?

In include/asm-x86/types.h:

#ifdef CONFIG_X86_32
# define BITS_PER_LONG 32
+# define BITMAP_ORDER 5
#else
# define BITS_PER_LONG 64
+# define BITMAP_ORDER 6
#endif

Then:

> #define ADDR "=m" (*(volatile long *) addr)
> +#define BIT_ADDR "=m" (((volatile long *) addr)[nr >> BITMAP_ORDER])
> #else
> #define ADDR "+m" (*(volatile long *) addr)
> +#define BIT_ADDR "+m" (((volatile long *) addr)[nr >> BITMAP_ORDER])
> #endif

No need for BASE_ADDR that way (or ADDR could be renamed to that.)

2008-03-17 09:08:01

by Jan Beulich

Subject: Re: [RFC] x86: bitops asm constraint fixes

>>> Jeremy Fitzhardinge <[email protected]> 14.03.08 19:56 >>>
>Jan Beulich wrote:
>> This (simplified) piece of code didn't behave as expected due to
>> incorrect constraints in some of the bitops functions, when
>> X86_FEATURE_xxx is referring to other than the first long:
>>
>> int test(struct cpuinfo_x86 *c) {
>> 	if (cpu_has(c, X86_FEATURE_xxx))
>> 		clear_cpu_cap(c, X86_FEATURE_xxx);
>> 	return cpu_has(c, X86_FEATURE_xxx);
>> }
>>
>> I'd really like to understand, though, what the policy of (not) having a
>> "memory" clobber in these operations is - currently, this appears to
>> be totally inconsistent.
>I think there's years of history here, much of it involving rites with
>chicken entrails.
>
>"memory" clobber is generally needed because the bit operations can
>touch memory beyond their apparent arguments. Proper "m" constraints
>are the way to go.

I'm mainly afraid to break some implicit assumptions potentially made
somewhere that the memory clobber is there (as e.g. implicitly
documented for set_bit() by the comment before clear_bit()) ...

>> Also, many comments of the non-atomic
>> functions say those may also be re-ordered - this contradicts the use
>> of "asm volatile" in there, which again I'd like to understand.
>>
>
>"asm volatile" has no effect on ordering. It's only necessary to force
>an asm with no apparent side-effects to get emitted (ie, an asm with
>outputs which don't get used; asms without outputs are implicitly volatile).

Ah, right, it's there just to accompany the memory clobber.

>All these operations should either explicitly list memory modification
>as an output. The bit tests have no side-effects, so there's no problem
>if gcc decides they don't need to be emitted.
>
>> As much as all of these, using 'int' for the 'nr' parameter and
>> 'void *' for the 'addr' one is in conflict with
>> Documentation/atomic_ops.txt, especially because bt{,c,r,s} indeed
>> take the bit index as signed (which hence would really need special
>> precaution) and access the full 32 bits (if 'unsigned long' was used
>> properly here, 64 bits for x86-64) pointed at, so invalid uses like
>> referencing a 'char' array cannot currently be caught.
>>
>
>What's the problem with accessing a char array as bits?

The fact that a char array may be of a size that is not a multiple of the
word size used by bt{,c,r,s}, i.e. you may touch memory outside
of the actual bit field (problematic if what follows is an atomic variable
and the bit operation used is not an atomic one, or if the bit array is
mis-aligned and the access would cross a page boundary).

>> Finally, the code with and without this patch relies heavily on the
>> -fno-strict-aliasing compiler switch and I'm not certain this really
>> is a good idea.
>>
>
>Doesn't the casting via void * stomp all that?

Apparently not, as you can see from the example I gave that gets
fixed with the patch.

>> In the light of all of this I'm sending this as RFC, as fixing the
>> above might warrant a much bigger patch...
>>
>> Signed-off-by: Jan Beulich <[email protected]>
>>
>> ---
>> include/asm-x86/bitops.h | 43 ++++++++++++++++++++++++-------------------
>> 1 file changed, 24 insertions(+), 19 deletions(-)
>>
>> --- linux-2.6.25-rc5/include/asm-x86/bitops.h 2008-03-10 13:24:33.000000000 +0100
>> +++ 2.6.25-rc5-x86-clear-bit/include/asm-x86/bitops.h 2008-03-13 08:45:40.000000000 +0100
>> @@ -24,9 +24,12 @@
>> /* Technically wrong, but this avoids compilation errors on some gcc
>> versions. */
>> #define ADDR "=m" (*(volatile long *) addr)
>> +#define BIT_ADDR "=m" (((volatile int *) addr)[nr >> 5])
>> #else
>> #define ADDR "+m" (*(volatile long *) addr)
>> +#define BIT_ADDR "+m" (((volatile int *) addr)[nr >> 5])
>> #endif
>> +#define BASE_ADDR "m" (*(volatile int *) addr)
>>
>
>Hm, hardcoding ">> 5" seems like asking for trouble. At least make the
>"int" an explicitly sized type. It should also pass "nr" in as an

Are you expecting int to ever be other than 32 bits wide? I'm afraid a
lot of other code makes assumptions here...

>explicit macro argument rather than just picking it up.

Not really, since ADDR doesn't take addr as parameter either...

>Does plain ADDR still get used?

It shouldn't in the end, but as long as the atomic ones that do use
memory clobbers don't use proper "m" operands (finding out whether
those clobbers need to stay was part of the purpose of the original
mail), they'll continue to use ADDR.

>It's unfortunate that gcc will runtime-compute the address of
>addr[nr>>5] even though we only need it for proper compile-time
>constraints.

Yes, but I fixed this meanwhile by forcing a maximum size structure
to be used as operand when the bit index is not constant, as in

struct __bits { int _[0x7ffffff]; };
...
#define BIT_ADDR "+m" (((volatile int *) addr)[nr >> 5])
#define FULL_ADDR "+m" (*(volatile struct __bits *) addr)
...
static inline void __clear_bit(int nr, volatile void *addr)
{
	if (__builtin_constant_p(nr))
		asm volatile("btr %1,%2" : BIT_ADDR : "Ir" (nr), BASE_ADDR);
	else
		asm volatile("btr %1,%0" : FULL_ADDR : "r" (nr));
}

Of course, it would be even better if we knew the size of the object,
but that would require all of these to become macros (so __typeof__(),
sizeof(), and __alignof__() could be applied). I'm not certain that's
worthwhile.

Jan

2008-03-17 09:15:52

by Jan Beulich

Subject: Re: [RFC] x86: bitops asm constraint fixes

>> --- linux-2.6.25-rc5/include/asm-x86/bitops.h 2008-03-10 13:24:33.000000000 +0100
>> +++ 2.6.25-rc5-x86-clear-bit/include/asm-x86/bitops.h 2008-03-13 08:45:40.000000000 +0100
>> @@ -24,9 +24,12 @@
>> /* Technically wrong, but this avoids compilation errors on some gcc
>> versions. */
>> #define ADDR "=m" (*(volatile long *) addr)
>> +#define BIT_ADDR "=m" (((volatile int *) addr)[nr >> 5])
>> #else
>> #define ADDR "+m" (*(volatile long *) addr)
>> +#define BIT_ADDR "+m" (((volatile int *) addr)[nr >> 5])
>> #endif
>> +#define BASE_ADDR "m" (*(volatile int *) addr)
>
>Can't you just do everything with unsigned longs, like this?

That's not very desirable: for one thing, because there are uses of
bitops on arrays of ints, and casting these up isn't fully correct on
x86-64 for the same reason that using the bitops on char arrays isn't
correct (see the other response I sent to Jeremy's reply); but also
because operating on longs requires REX prefixes on x86-64, hence
making the code bigger for no good reason.

>>In include/asm-x86/types.h:
>>
>> ifdef CONFIG_X86_32
>> # define BITS_PER_LONG 32
>>+# define BITMAP_ORDER 5
>> #else
>> # define BITS_PER_LONG 64
>>+# define BITMAP_ORDER 6
>> #endif
>
>Then:
>
>> #define ADDR "=m" (*(volatile long *) addr)
>> +#define BIT_ADDR "=m" (((volatile long *) addr)[nr >> BITMAP_ORDER])
>> #else
>> #define ADDR "+m" (*(volatile long *) addr)
>> +#define BIT_ADDR "+m" (((volatile long *) addr)[nr >> BITMAP_ORDER])
>> #endif
>
>No need for BASE_ADDR that way (or ADDR could be renamed to that.)

Not really, since BASE_ADDR is an input, whereas ADDR is an output.
However, ultimately all uses of ADDR should go (since even if any of
the functions needs the memory clobber to stay, using an input for
specifying the array base address is sufficient - such operations simply
don't need an exact "m" output operand then).

Jan

2008-03-19 19:25:24

by H. Peter Anvin

Subject: Re: [RFC] x86: bitops asm constraint fixes

Jan Beulich wrote:
>
> That's not very desirable: For one part, because there are uses of
> bitops on arrays of ints (and casting these up isn't fully correct on
> x86-64 because of the same reason that using the bitops on char
> arrays isn't correct (see the other response I sent to Jeremy's reply),
> but also because operating on longs requires REX prefixes on x86-64,
> hence making the code bigger for no good reason.
>

It might be worthwhile to find out if 64-bit bitops are faster.

Bitops are normally defined only on longs, but since x86 is a
little-endian architecture we sometimes fudge that in arch-specific code.

-hpa

2008-03-21 13:54:59

by Ingo Molnar

Subject: Re: [RFC] x86: bitops asm constraint fixes


thanks Jan, i've applied your patch.

Ingo

2008-03-27 08:12:39

by Jan Beulich

Subject: Re: [RFC] x86: bitops asm constraint fixes

Please revert it for the time being. I've got a better version (i.e.
without extra dead code being generated) that I intended to
submit once I know whether the other issues pointed out in the
description of the original patch should also be adjusted. Of
course, that could also be done incrementally, but I would think
overhauling the whole file at once wouldn't be a bad thing...

Jan

>>> Ingo Molnar <[email protected]> 03/21/08 2:54 PM >>>

thanks Jan, i've applied your patch.

Ingo

2008-03-27 08:41:51

by Ingo Molnar

Subject: Re: [RFC] x86: bitops asm constraint fixes


* Jan Beulich <[email protected]> wrote:

> Please revert it for the time being, I've got a better version (i.e.
> without extra dead code being generated) that I intended to submit
> once I know whether the other issues pointed out in the description on
> the original patch also should be adjusted. Of course, that could also
> be done incrementally, but I would think overhauling the whole file at
> once wouldn't be a bad thing...

since it appears to cause no problems in x86.git (it passed a lot of
testing already) i'd prefer to keep it (so that we can see any other
side-effects of touching this code) - could you send your improvements
as a delta against x86.git/latest? [or is there any outright bug caused
by your changes that necessitates a revert?]

Ingo

2008-03-28 19:55:47

by Jan Beulich

Subject: Re: [RFC] x86: bitops asm constraint fixes

>>> Ingo Molnar <[email protected]> 03/27/08 9:41 AM >>>
>
>* Jan Beulich <[email protected]> wrote:
>
>> Please revert it for the time being, I've got a better version (i.e.
>> without extra dead code being generated) that I intended to submit
>> once I know whether the other issues pointed out in the description on
>> the original patch also should be adjusted. Of course, that could also
>> be done incrementally, but I would think overhauling the whole file at
>> once wouldn't be a bad thing...
>
>since it appears to cause no problems in x86.git (it passed a lot of
>testing already) i'd prefer to keep it (so that we can see any other
>side-effects of touching this code) - could you send your improvements
>as a delta against x86.git/latest? [or is there any outright bug caused
>by your changes that necessitates a revert?]

That's fine with me (and no, there's no bug in there other than the
mentioned dead code generation).

Jan