2002-07-29 21:01:40

by Van Maren, Kevin

[permalink] [raw]
Subject: RE: [Linux-ia64] Linux kernel deadlock caused by spinlock bug

> On Mon, Jul 29, 2002 at 03:37:17PM -0500, Van Maren, Kevin wrote:
> > I changed the code to allow the writer bit to remain set even if
> > there is a reader. By only allowing a single processor to set
> > the writer bit, I don't have to worry about pending writers starving
> > out readers. The potential writer that was able to set the
> writer bit
> > gains ownership of the lock when the current readers finish. Since
> > the retry for read_lock does not keep trying to increment the reader
> > count, there are no other required changes.
>
> however, this is broken. linux relies on being able to do
>
> read_lock(x);
> func()
> -> func()
> -> func()
> -> read_lock(x);
>
> if a writer comes between those two read locks, you're toast.
>
> i suspect the right answer for the contention you're seeing
> is an improved
> get_timeofday which is lockless.

Recursive read locks certainly make it more difficult to fix the
problem. Placing a band-aid on gettimeofday will fix the symptom
in one location, but will not fix the general problem, which is
writer starvation with heavy read lock load. The only way to fix
that is to make writer locks fair or to eliminate them (make them
_all_ stateless).

Recursive read locks also imply that you can't replace them with
a "normal" spinlock, which would also solve the problem (although
they do _not_ scale under contention -- something like O(N^2)
cache-cache transfers for N processors to acquire once).

There are ways of fixing the writer starvation and allowing recursive
read locks, but that is more work (and heavier-weight than desirable).
How pervasive are recursive reader locks? Should they be a special
type of reader lock?

Kevin


2002-07-29 21:15:28

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [Linux-ia64] Linux kernel deadlock caused by spinlock bug

On Mon, Jul 29, 2002 at 04:05:35PM -0500, Van Maren, Kevin wrote:
> Recursive read locks certainly make it more difficult to fix the
> problem. Placing a band-aid on gettimeofday will fix the symptom
> in one location, but will not fix the general problem, which is
> writer starvation with heavy read lock load. The only way to fix
> that is to make writer locks fair or to eliminate them (make them
> _all_ stateless).

The basic principle is that if you see contention on a spinlock, you
should eliminate the spinlock somehow. making spinlocks `fair' doesn't
help that you're spending lots of time spinning on a lock.

--
Revolutions do not require corporate support.