2012-05-22 18:11:55

by OGAWA Hirofumi

Subject: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

[forgot to Cc: lkml, resend]

Hi,

Maybe nobody is using the debug patch for atomic_dec_and_test()... Well,
anyway, how about this?



stop_machine_first is only used to see whether the caller is the first
one or not. So there is no reason to use atomic_dec_and_test(), which
makes the value go below 0.

I think this is not desirable, because this usage does nothing but
trigger the atomic_dec_and_test() underflow debug patch. (The patch
tests whether the result of atomic_dec_and_test() is < 0.)

So, this uses test_and_clear_bit() instead.

Signed-off-by: OGAWA Hirofumi <[email protected]>
---

arch/x86/kernel/alternative.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff -puN arch/x86/kernel/alternative.c~stop_machine-use-test_and_set_bit arch/x86/kernel/alternative.c
--- linux/arch/x86/kernel/alternative.c~stop_machine-use-test_and_set_bit 2012-05-23 02:48:01.000000000 +0900
+++ linux-hirofumi/arch/x86/kernel/alternative.c 2012-05-23 02:48:01.000000000 +0900
@@ -650,7 +650,7 @@ void *__kprobes text_poke(void *addr, co
* Cross-modifying kernel text with stop_machine().
* This code originally comes from immediate value.
*/
-static atomic_t stop_machine_first;
+static unsigned long stop_machine_first;
static int wrote_text;

struct text_poke_params {
@@ -664,7 +664,7 @@ static int __kprobes stop_machine_text_p
struct text_poke_param *p;
int i;

- if (atomic_dec_and_test(&stop_machine_first)) {
+ if (test_and_clear_bit(0, &stop_machine_first)) {
for (i = 0; i < tpp->nparams; i++) {
p = &tpp->params[i];
text_poke(p->addr, p->opcode, p->len);
@@ -714,7 +714,7 @@ void *__kprobes text_poke_smp(void *addr
p.len = len;
tpp.params = &p;
tpp.nparams = 1;
- atomic_set(&stop_machine_first, 1);
+ stop_machine_first = 1;
wrote_text = 0;
/* Use __stop_machine() because the caller already got online_cpus. */
__stop_machine(stop_machine_text_poke, (void *)&tpp, cpu_online_mask);
@@ -736,7 +736,7 @@ void __kprobes text_poke_smp_batch(struc
{
struct text_poke_params tpp = {.params = params, .nparams = n};

- atomic_set(&stop_machine_first, 1);
+ stop_machine_first = 1;
wrote_text = 0;
__stop_machine(stop_machine_text_poke, (void *)&tpp, cpu_online_mask);
}
_

--
OGAWA Hirofumi <[email protected]>
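
For readers following the patch above, here is a minimal userspace sketch
(C11 atomics, not kernel code) of the two ways of picking "the first
caller". dec_and_test() and test_and_clear_bit0() are hypothetical
stand-ins for the kernel's atomic_dec_and_test() and
test_and_clear_bit(0, ...), and the CPUs are simulated by a plain loop
rather than real concurrency. It only aims to show where the value ends
up after every caller has had its turn.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* What one simulated run produced. */
struct result {
	long final;	/* value left in the flag/counter afterwards */
	int winners;	/* callers that were told "you are first" */
};

/* Hypothetical stand-in for atomic_dec_and_test():
 * decrement, then report whether the new value is zero. */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1 == 0;
}

/* Hypothetical stand-in for test_and_clear_bit(0, addr):
 * clear bit 0, then report whether it was set before. */
static bool test_and_clear_bit0(atomic_ulong *addr)
{
	return (atomic_fetch_and(addr, ~1UL) & 1UL) != 0;
}

/* Arm the counter with 1, then let ncpus callers try in turn. */
static struct result run_dec_and_test(int ncpus)
{
	struct result r = { 0, 0 };
	atomic_int first;

	assert(ncpus > 0);
	atomic_init(&first, 1);
	for (int i = 0; i < ncpus; i++)
		if (dec_and_test(&first))
			r.winners++;
	r.final = atomic_load(&first);
	return r;
}

/* Same experiment, with a single bit instead of a counter. */
static struct result run_test_and_clear_bit(int ncpus)
{
	struct result r = { 0, 0 };
	atomic_ulong first;

	assert(ncpus > 0);
	atomic_init(&first, 1UL);
	for (int i = 0; i < ncpus; i++)
		if (test_and_clear_bit0(&first))
			r.winners++;
	r.final = (long)atomic_load(&first);
	return r;
}
```

With four simulated CPUs, run_dec_and_test(4) reports one winner and a
final value of -3, while run_test_and_clear_bit(4) reports one winner and
a final value of 0.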


2012-05-22 21:46:37

by Steven Rostedt

Subject: Re: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

On Wed, May 23, 2012 at 03:11:48AM +0900, OGAWA Hirofumi wrote:
> [forgot to Cc: lkml, resend]
>
> Hi,
>
> Maybe nobody is using the debug patch for atomic_dec_and_test()... Well,
> anyway, how about this?

What debug patch?

>
>
>
> stop_machine_first is only used to see whether the caller is the first
> one or not. So there is no reason to use atomic_dec_and_test(), which
> makes the value go below 0.
>
> I think this is not desirable, because this usage does nothing but
> trigger the atomic_dec_and_test() underflow debug patch. (The patch
> tests whether the result of atomic_dec_and_test() is < 0.)

Well it should only underflow if you have a box with more than 2 billion
CPUs.

>
> So, this uses test_and_clear_bit() instead.
>
> Signed-off-by: OGAWA Hirofumi <[email protected]>
> ---
>
> arch/x86/kernel/alternative.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff -puN arch/x86/kernel/alternative.c~stop_machine-use-test_and_set_bit arch/x86/kernel/alternative.c
> --- linux/arch/x86/kernel/alternative.c~stop_machine-use-test_and_set_bit 2012-05-23 02:48:01.000000000 +0900
> +++ linux-hirofumi/arch/x86/kernel/alternative.c 2012-05-23 02:48:01.000000000 +0900
> @@ -650,7 +650,7 @@ void *__kprobes text_poke(void *addr, co
> * Cross-modifying kernel text with stop_machine().
> * This code originally comes from immediate value.
> */
> -static atomic_t stop_machine_first;
> +static unsigned long stop_machine_first;

The downside to this is that it adds 4 more bytes on a 64-bit
machine. (sizeof(unsigned long) == 8 and sizeof(atomic_t) == 4)

You could probably also set it to -1 and do an atomic_inc_and_test().
Would that cause the debug check to trigger too?

-- Steve

> static int wrote_text;
>
> struct text_poke_params {
> @@ -664,7 +664,7 @@ static int __kprobes stop_machine_text_p
> struct text_poke_param *p;
> int i;
>
> - if (atomic_dec_and_test(&stop_machine_first)) {
> + if (test_and_clear_bit(0, &stop_machine_first)) {
> for (i = 0; i < tpp->nparams; i++) {
> p = &tpp->params[i];
> text_poke(p->addr, p->opcode, p->len);
> @@ -714,7 +714,7 @@ void *__kprobes text_poke_smp(void *addr
> p.len = len;
> tpp.params = &p;
> tpp.nparams = 1;
> - atomic_set(&stop_machine_first, 1);
> + stop_machine_first = 1;
> wrote_text = 0;
> /* Use __stop_machine() because the caller already got online_cpus. */
> __stop_machine(stop_machine_text_poke, (void *)&tpp, cpu_online_mask);
> @@ -736,7 +736,7 @@ void __kprobes text_poke_smp_batch(struc
> {
> struct text_poke_params tpp = {.params = params, .nparams = n};
>
> - atomic_set(&stop_machine_first, 1);
> + stop_machine_first = 1;
> wrote_text = 0;
> __stop_machine(stop_machine_text_poke, (void *)&tpp, cpu_online_mask);
> }
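
The atomic_inc_and_test() idea suggested above can be sketched in
userspace in a similar way (C11 atomics, not kernel code). inc_and_test()
is a hypothetical stand-in for the kernel helper, the CPUs are simulated
by a plain loop, and the sketch simply counts how many operations produce
a result below 0 when the counter starts at -1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* What one simulated run produced. */
struct inc_result {
	int final;		/* counter value after all callers ran */
	int winners;		/* callers that were told "you are first" */
	int negative_results;	/* operations whose result was < 0 */
};

/* Hypothetical stand-in for atomic_inc_and_test():
 * increment, then report whether the new value is zero.
 * The new value is also handed back so the caller can inspect it. */
static bool inc_and_test(atomic_int *v, int *newval)
{
	*newval = atomic_fetch_add(v, 1) + 1;
	return *newval == 0;
}

/* Arm the counter with -1, then let ncpus callers try in turn. */
static struct inc_result run_inc_and_test(int ncpus)
{
	struct inc_result r = { 0, 0, 0 };
	atomic_int first;

	assert(ncpus > 0);
	atomic_init(&first, -1);
	for (int i = 0; i < ncpus; i++) {
		int newval;

		if (inc_and_test(&first, &newval))
			r.winners++;
		if (newval < 0)
			r.negative_results++;
	}
	r.final = atomic_load(&first);
	return r;
}
```

With four simulated CPUs the results are 0, 1, 2 and 3, so there is one
winner and no result below 0. Whether a particular debug patch would warn
still depends on which operations it instruments, which this sketch does
not try to answer.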

2012-05-23 02:44:15

by OGAWA Hirofumi

Subject: Re: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

Steven Rostedt <[email protected]> writes:

> On Wed, May 23, 2012 at 03:11:48AM +0900, OGAWA Hirofumi wrote:
>> [forgot to Cc: lkml, resend]
>>
>> Hi,
>>
>> Maybe nobody is using the debug patch for atomic_dec_and_test()... Well,
>> anyway, how about this?
>
> What debug patch?

It is this patch. I got it from -mm (akpm's series); I don't know
whether -mm still carries it, though.

The patch below will detect atomic counter underflows. This has been
test-driven in the -RT patchset for some time. qdisc_destroy() triggered
it sometimes (in a seemingly nonfatal way, during device shutdown) - with
DaveM suggesting that it is most likely a bug in the networking code. So
it would be nice to have this in -mm for some time to validate all atomic
counters on a broader base.

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

Change it to atomic check.

Signed-off-by: OGAWA Hirofumi <[email protected]>
---

arch/x86/include/asm/atomic.h | 13 +++++++++++++
arch/x86/include/asm/atomic64_32.h | 5 ++++-
arch/x86/include/asm/atomic64_64.h | 12 ++++++++++++
3 files changed, 29 insertions(+), 1 deletion(-)

diff -puN arch/x86/include/asm/atomic.h~debug_atomic_t-underflows-atomic arch/x86/include/asm/atomic.h
--- linux/arch/x86/include/asm/atomic.h~debug_atomic_t-underflows-atomic 2012-04-03 17:29:38.000000000 +0900
+++ linux-hirofumi/arch/x86/include/asm/atomic.h 2012-04-03 17:29:38.000000000 +0900
@@ -6,6 +6,18 @@
#include <asm/processor.h>
#include <asm/alternative.h>
#include <asm/cmpxchg.h>
+#include <asm/bug.h>
+
+#define ATOMIC_UNDERFLOW_CHECK(v) do { \
+ unsigned char __sf; \
+ /* if (atomic_read(v) < 0) */ \
+ __asm__ __volatile__("sets %0" \
+ : "=qm" (__sf) \
+ : /* no input */ \
+ : "memory"); \
+ WARN(__sf, KERN_ERR "atomic counter underflow: %d\n", \
+ atomic_read(v)); \
+} while(0)

/*
* Atomic operations that C can't guarantee us. Useful for
@@ -123,6 +135,7 @@ static inline int atomic_dec_and_test(at
asm volatile(LOCK_PREFIX "decl %0; sete %1"
: "+m" (v->counter), "=qm" (c)
: : "memory");
+ ATOMIC_UNDERFLOW_CHECK(v);
return c != 0;
}

diff -puN arch/x86/include/asm/atomic64_32.h~debug_atomic_t-underflows-atomic arch/x86/include/asm/atomic64_32.h
--- linux/arch/x86/include/asm/atomic64_32.h~debug_atomic_t-underflows-atomic 2012-04-03 17:29:38.000000000 +0900
+++ linux-hirofumi/arch/x86/include/asm/atomic64_32.h 2012-04-03 17:29:38.000000000 +0900
@@ -244,7 +244,10 @@ static inline void atomic64_dec(atomic64
*/
static inline int atomic64_dec_and_test(atomic64_t *v)
{
- return atomic64_dec_return(v) == 0;
+ long long ret = atomic64_dec_return(v);
+ WARN(ret < 0, KERN_ERR "atomic counter underflow: %lld\n",
+ atomic64_read(v));
+ return ret == 0;
}

/**
diff -puN arch/x86/include/asm/atomic64_64.h~debug_atomic_t-underflows-atomic arch/x86/include/asm/atomic64_64.h
--- linux/arch/x86/include/asm/atomic64_64.h~debug_atomic_t-underflows-atomic 2012-04-03 17:29:38.000000000 +0900
+++ linux-hirofumi/arch/x86/include/asm/atomic64_64.h 2012-04-03 17:29:38.000000000 +0900
@@ -5,6 +5,17 @@
#include <asm/alternative.h>
#include <asm/cmpxchg.h>

+#define ATOMIC64_UNDERFLOW_CHECK(v) do { \
+ unsigned char __sf; \
+ /* if (atomic64_read(v) < 0) */ \
+ __asm__ __volatile__("sets %0" \
+ : "=qm" (__sf) \
+ : /* no input */ \
+ : "memory"); \
+ WARN(__sf, KERN_ERR "atomic counter underflow: %ld\n", \
+ atomic64_read(v)); \
+} while(0)
+
/* The 64-bit atomic type */

#define ATOMIC64_INIT(i) { (i) }
@@ -121,6 +132,7 @@ static inline int atomic64_dec_and_test(
asm volatile(LOCK_PREFIX "decq %0; sete %1"
: "=m" (v->counter), "=qm" (c)
: "m" (v->counter) : "memory");
+ ATOMIC64_UNDERFLOW_CHECK(v);
return c != 0;
}

_

>> stop_machine_first is only used to see whether the caller is the first
>> one or not. So there is no reason to use atomic_dec_and_test(), which
>> makes the value go below 0.
>>
>> I think this is not desirable, because this usage does nothing but
>> trigger the atomic_dec_and_test() underflow debug patch. (The patch
>> tests whether the result of atomic_dec_and_test() is < 0.)
>
> Well it should only underflow if you have a box with more than 2 billion
> CPUs.

I meant going below 0, not underflowing past INT_MIN.

>> -static atomic_t stop_machine_first;
>> +static unsigned long stop_machine_first;
>
> The downside to this is that it adds 4 more bytes on a 64-bit
> machine. (sizeof(unsigned long) == 8 and sizeof(atomic_t) == 4)

Oh, sure. If nobody is interested, I will unfortunately just keep this
as my local patch...

Thanks.
--
OGAWA Hirofumi <[email protected]>

2012-05-23 04:19:23

by OGAWA Hirofumi

Subject: Re: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

OGAWA Hirofumi <[email protected]> writes:

>> The downside to this is that it adds 4 more bytes on a 64-bit
>> machine. (sizeof(unsigned long) == 8 and sizeof(atomic_t) == 4)

Here is another patch, without the additional 4 bytes. This simply
changes atomic_dec_and_test() to atomic_xchg().

Thanks.
--
OGAWA Hirofumi <[email protected]>


stop_machine_first is only used to see whether the caller is the first
one or not. In this usage, atomic_dec_and_test() makes the value go
below 0.

I think this is not desirable, because it does nothing but trigger the
atomic_dec_and_test() "less than 0" debug patch. (The patch tests
whether the result of atomic_dec_and_test() is < 0.)

So, this uses atomic_xchg() instead.

Signed-off-by: OGAWA Hirofumi <[email protected]>
---

arch/x86/kernel/alternative.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/x86/kernel/alternative.c~stop_machine-use-atomic_xchg arch/x86/kernel/alternative.c
--- linux/arch/x86/kernel/alternative.c~stop_machine-use-atomic_xchg 2012-05-23 13:10:03.000000000 +0900
+++ linux-hirofumi/arch/x86/kernel/alternative.c 2012-05-23 13:10:03.000000000 +0900
@@ -664,7 +664,7 @@ static int __kprobes stop_machine_text_p
struct text_poke_param *p;
int i;

- if (atomic_dec_and_test(&stop_machine_first)) {
+ if (atomic_xchg(&stop_machine_first, 0)) {
for (i = 0; i < tpp->nparams; i++) {
p = &tpp->params[i];
text_poke(p->addr, p->opcode, p->len);
_
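
As a rough userspace illustration of the atomic_xchg() approach in the
patch above (C11 atomics, not kernel code): xchg() is a hypothetical
stand-in for atomic_xchg(), the CPUs are simulated by a plain loop, and
each round re-arms the flag first, in the way text_poke_smp() calls
atomic_set(&stop_machine_first, 1) before __stop_machine().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for atomic_xchg(v, new):
 * store the new value and return the old one. */
static int xchg(atomic_int *v, int new_val)
{
	return atomic_exchange(v, new_val);
}

/* One round: arm the flag, then let ncpus callers try in turn.
 * Returns how many callers saw the flag still set. */
static int run_round(atomic_int *first, int ncpus)
{
	int winners = 0;

	assert(ncpus > 0);
	atomic_store(first, 1);	/* like atomic_set(&stop_machine_first, 1) */
	for (int i = 0; i < ncpus; i++)
		if (xchg(first, 0))
			winners++;
	return winners;
}

/* Several rounds on the same flag. Returns the total number of
 * winners, and checks that the flag is back at 0 after each round. */
static int run_rounds(int rounds, int ncpus)
{
	atomic_int first;
	int total = 0;

	atomic_init(&first, 0);
	for (int r = 0; r < rounds; r++) {
		total += run_round(&first, ncpus);
		assert(atomic_load(&first) == 0);
	}
	return total;
}
```

Three rounds with four simulated CPUs each, run_rounds(3, 4), give three
winners in total, one per round, and the flag only ever holds 0 or 1.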

2012-05-23 08:43:28

by Steven Rostedt

Subject: Re: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

On Wed, 2012-05-23 at 13:19 +0900, OGAWA Hirofumi wrote:
> OGAWA Hirofumi <[email protected]>
>
>
> stop_machine_first is only used to see whether the caller is the first
> one or not. In this usage, atomic_dec_and_test() makes the value go
> below 0.
>
> I think this is not desirable, because it does nothing but trigger the
> atomic_dec_and_test() "less than 0" debug patch. (The patch tests
> whether the result of atomic_dec_and_test() is < 0.)
>
> So, this uses atomic_xchg() instead.

Acked-by: Steven Rostedt <[email protected]>

-- Steve

>
> Signed-off-by: OGAWA Hirofumi <[email protected]>

2012-06-07 12:54:25

by Steven Rostedt

Subject: Re: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

On Wed, 2012-05-23 at 13:19 +0900, OGAWA Hirofumi wrote:
> OGAWA Hirofumi <[email protected]> writes:
>
> >> The downside to this is that it adds 4 more bytes on a 64-bit
> >> machine. (sizeof(unsigned long) == 8 and sizeof(atomic_t) == 4)
>
> Here is another patch, without the additional 4 bytes. This simply
> changes atomic_dec_and_test() to atomic_xchg().
>

If nobody has picked this up, you might want to resend it with my
Acked-by (the second patch, not the first), as patches added to replies
are usually ignored.

-- Steve

2012-06-07 13:18:16

by OGAWA Hirofumi

Subject: Re: [PATCH] Use test_and_clear_bit() instead atomic_dec_and_test() for stop_machine

Steven Rostedt <[email protected]> writes:

> On Wed, 2012-05-23 at 13:19 +0900, OGAWA Hirofumi wrote:
>> OGAWA Hirofumi <[email protected]> writes:
>>
>> >> The downside to this is that it adds 4 more bytes on a 64-bit
>> >> machine. (sizeof(unsigned long) == 8 and sizeof(atomic_t) == 4)
>>
>> Here is another patch, without the additional 4 bytes. This simply
>> changes atomic_dec_and_test() to atomic_xchg().
>>
>
> If nobody has picked this up, you might want to resend it with my
> Acked-by (the second patch, not the first), as patches added to replies
> are usually ignored.

Thanks. I was going to just add it to my personal patchset, but I will
try sending it again.
--
OGAWA Hirofumi <[email protected]>