2014-04-30 02:57:02

by wangxiaoming321

Subject: [PATCH] lib/spinlock_debug: avoid one thread can not obtain the spinlock for a long time.

If loops_per_jiffy is larger than expected, one thread may be unable to
obtain the spinlock for a long time. Use cpu_clock() to time out after
one second, which avoids a hard lockup.

Signed-off-by: Chuansheng Liu <[email protected]>
Signed-off-by: xiaoming wang <[email protected]>
---
kernel/locking/spinlock_debug.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 0374a59..5d3c4f3 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -105,13 +105,19 @@ static inline void debug_spin_unlock(raw_spinlock_t *lock)

 static void __spin_lock_debug(raw_spinlock_t *lock)
 {
-	u64 i;
+	u64 i, t;
 	u64 loops = loops_per_jiffy * HZ;
+	u64 one_second = 1000000000;
+	u32 this_cpu = raw_smp_processor_id();
+
+	t = cpu_clock(this_cpu);

 	for (i = 0; i < loops; i++) {
 		if (arch_spin_trylock(&lock->raw_lock))
 			return;
 		__delay(1);
+		if (cpu_clock(this_cpu) - t > one_second)
+			break;
 	}
 	/* lockup suspected: */
 	spin_dump(lock, "lockup suspected");
--
1.7.1


2014-04-30 06:06:37

by Peter Zijlstra

Subject: Re: [PATCH] lib/spinlock_debug: avoid one thread can not obtain the spinlock for a long time.

On Wed, Apr 30, 2014 at 01:04:31PM -0400, Wang, Xiaoming wrote:
> If loops_per_jiffy is larger than expected, one thread may be unable to
> obtain the spinlock for a long time. Use cpu_clock() to time out after
> one second, which avoids a hard lockup.

This is just not making sense.. one thing is broken so then you tape on
another? Fix the first already.

Also, why do you care?

2014-04-30 06:19:51

by wangxiaoming321

Subject: RE: [PATCH] lib/spinlock_debug: avoid one thread can not obtain the spinlock for a long time.

Dear Peter,
If we wait for the loop to run to completion based on loops_per_jiffy,
it may last more than 130s with local IRQs disabled during that interval,
which may cause a hard lockup. We break out after 1 second and
dump the stack for debugging.

> -----Original Message-----
> From: Peter Zijlstra [mailto:[email protected]]
> Sent: Wednesday, April 30, 2014 2:06 PM
> To: Wang, Xiaoming
> Cc: [email protected]; [email protected]; Liu, Chuansheng
> Subject: Re: [PATCH] lib/spinlock_debug: avoid one thread can not obtain
> the spinlock for a long time.
>
> On Wed, Apr 30, 2014 at 01:04:31PM -0400, Wang, Xiaoming wrote:
> > loops_per_jiffy is larger than expectation that possible causes one
> > thread can not obtain the spin lock for a long time.
> > So use cpu_clock() to reach timeout in one second which can avoid
> > HARD LOCKUP.
>
> This is just not making sense.. one thing is broken so then you tape on
> another? Fix the first already.
>
> Also, why do you care?

2014-04-30 07:43:12

by Peter Zijlstra

Subject: Re: [PATCH] lib/spinlock_debug: avoid one thread can not obtain the spinlock for a long time.

On Wed, Apr 30, 2014 at 06:17:48AM +0000, Wang, Xiaoming wrote:
> Dear Peter,
> If we wait for the loop to run to completion based on loops_per_jiffy,
> it may last more than 130s with local IRQs disabled during that interval,
> which may cause a hard lockup. We break out after 1 second and
> dump the stack for debugging.

Yeah, so? That makes shoddy engineering alright then?

So either fix the loops_per_jiffy thing, it is supposed to wait for 1
second after all, or explain why it's broken and entirely replace it.

What you do not do is make a loop with 2 differently broken timeouts and
hope one works.
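
For illustration only, the "entirely replace it" option might look roughly like the userspace sketch below: a trylock loop with exactly one exit condition, based on elapsed time, instead of an iteration count plus a clock check. This is not kernel code and not anything proposed in this thread. The names demo_lock, demo_trylock, now_ns and spin_with_deadline are hypothetical, the lock is a C11 atomic_flag standing in for arch_spin_trylock(), and timespec_get() stands in for cpu_clock() (it reads wall-clock time, so a real implementation would want a monotonic clock).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a raw spinlock: flag set means "held". */
struct demo_lock {
	atomic_flag held;
};

/* Current time in nanoseconds (wall clock; assumption: good enough for a demo). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Try once to take the lock; returns true on success. */
static bool demo_trylock(struct demo_lock *lock)
{
	return !atomic_flag_test_and_set(&lock->held);
}

/*
 * Spin until the lock is taken or timeout_ns has elapsed.
 * Elapsed time is the only timeout; there is no loop counter.
 * Returns true if the lock was acquired, false on timeout.
 */
static bool spin_with_deadline(struct demo_lock *lock, uint64_t timeout_ns)
{
	uint64_t start = now_ns();

	for (;;) {
		if (demo_trylock(lock))
			return true;
		if (now_ns() - start > timeout_ns)
			return false;
	}
}
```

A caller would treat a false return as "lockup suspected" and dump diagnostics at that point, as the existing code does after its loop ends.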