Date: 11 Oct 2005 19:50:17 -0400
Message-ID: <20051011235017.21719.qmail@science.horizon.com>
From: linux@horizon.com
To: dev@sw.ru
Subject: Re: SMP syncronization on AMD processors (broken?)
Cc: linux-kernel@vger.kernel.org
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2351
Lines: 98

> The whole story started when we wrote the following code:
> 
> void XXX(void)
> {
> 	/* ints disabled */
> restart:
> 	spin_lock(&lock);
> 	do_something();
> 	if (!flag)
> 		need_restart = 1;
> 	spin_unlock(&lock);
> 	if (need_restart)
> 		goto restart;	<<<< LOOPS 4EVER ON AMD!!!
> }
> 
> void YYY(void)
> {
> 	spin_lock(&lock);	<<<< SPINS 4EVER ON AMD!!!
> 	flag = 1;
> 	spin_unlock(&lock);
> }
> 
> function XXX() starts on CPU0 and begins to loop since flag is not set, 
> then CPU1 calls function YYY() and it turns out that it can't take the 
> lock any arbitrary time.

The right thing to do here is to wait for the flag to be set *outside*
the lock, and then re-validate inside the lock:

void XXX(void)
{
	/* ints disabled */
restart:
	spin_lock(&lock);
	do_something();
	if (!flag)
		need_restart = 1;
	spin_unlock(&lock);
	if (need_restart) {
		while (!flag)
			cpu_relax();
		goto restart;
	}
}

This way, XXX() keeps the lock dropped for as long as it takes for
YYY() to notice and grab it.


However, I realize that this is of course a simplified case of some real
code, where even *finding* the flag requires the spin lock.

The generic solution is to have a global "progress" counter, which
records "I made progress toward setting flag", that XXX() can
busy-loop on:

int progress;

void XXX(void)
{
	int old_progress;
	/* ints disabled */
restart:
	spin_lock(&lock);
	do_something();
	if (!flag) {
		old_progress = progress;
		need_restart = 1;
	}
	spin_unlock(&lock);
	if (need_restart) {
		while (progress == old_progress)
			cpu_relax();
		goto restart;
	}
}

void YYY(void)
{
	spin_lock(&lock);
	flag = 1;
	progress++;
	spin_unlock(&lock);
}

It may be that in your data structure, there is one or a series of
fields that already exist that you can use for the purpose.  The goal
is to merely detect *change*, so you can reacquire the lock and test
definitively.  It's okay to read freed memory while doing this, as long as
you can be sure that:
- The memory read won't oops the kernel, and
- You don't end up depending on the value of the freed memory to
  get you out of the stall.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/