2001-10-13 14:59:19

by Paul McKenney

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion


> >
> > And the read side is:
> > lock
> > loopup
> > atomic_inc
> > unlock
> >
> > With RCU, read side is:
> > loopup
> > atomic_inc
>
> Yes. With maybe
>
> non_preempt()
> ..
> preempt()
>
> around it for the pre-emption patches.
>
> However, you also need to make your free _free_ be aware of the count.
> Which means that the current RCU patch is really unusable for this. You
> need to have the "count" always in a generic place (put it with the
hash),

???

Ah! Are you thinking of the trick that associates a reference counter
with each pointer, and where one uses a double-compare-and-swap instruction
to do updates? That is definitely -not- what we are proposing here, since
it is important to avoid writes during read-only traversals.

Instead, we use side information to deduce when it is no longer possible
for there to be any references to a given data structure.

It -is- possible to use RCU in Linux -without- reference counts. See
the Maneesh Soni's FD-management patch and description at:

http://lse.sourceforge.net/locking/patches/files_struct_rcu-2.4.10-04.patch
http://lse.sourceforge.net/locking/files_struct_rcu.txt

The reference counts are needed -only- in cases where references to an
RCU-protected data structure may be held across a sleep. Dipankar Sarma's
IPV4 route-cache lookup patch at:

http://lse.sourceforge.net/locking/patches/rt_rcu-2.4.6-02.patch

is an example use of RCU where reference counts are used, since entries
can be queued.

Thanx, Paul


2001-10-13 17:23:39

by Linus Torvalds

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion


On Sat, 13 Oct 2001, Paul McKenney wrote:
>
> The reference counts are needed -only- in cases where references to an
> RCU-protected data structure may be held across a sleep.

My point is that
(a) such uses are not very common.
(b) the whole notiong of "scheduling point" is a lot too fragile to be
acceptable for important data structures. It breaks with the
pre-emption patches on UP, and we've seen many times how hard it is
for developers to notice even when there _is_ an explicit "end
critical region now" kind of thing

Have you seen how many uses of implicitly blocking operations there have
been with interrupts disabled or spinlocks held? Now, if the data
structure doesn't even have a "ok, I'm done with you" thing, then those
kinds of mistakes are going to be not only impossible to find
automatically, but they are going to be even easier to make by mistake.

In short, RCU has several problems in my opinion:

- nobody has shown a case where existing normal locking ends up being
really a huge problem, and where RCU clearly helps.

- RCU obviously has major subtle issues, ranging from memory ordering to
"what is quiescent", ie the scheduling points. And "subtlety" directly
translates into bugs and hard-to-debug seldom-seen problems that end up
being "unexplainable".

In short, RCU seems to be a case of "hey, that's cool", but it's a
solution in search of a problem so severe that it is worth it.

Linus

2001-10-13 17:28:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion


On Sat, 13 Oct 2001, Linus Torvalds wrote:
>
> In short, RCU seems to be a case of "hey, that's cool", but it's a
> solution in search of a problem so severe that it is worth it.

Oh, and before people start telling me that RCU was successfully used in
AIX/projectX/xxxx/etc, you have to realize that I don't give a rats *ss
about the fact that there are OS's out there that are "more scalable".

The last time I looked, Solaris and AIX and all the rest of the "scalable"
systems were absolute pigs on smaller hardware, and the "scalability" in
them often translates into "we scale linearly to many CPU's by being
really bad even on one".

Linus

2001-10-13 18:42:27

by Andi Kleen

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion

In article <[email protected]>,
Linus Torvalds <[email protected]> writes:

> - nobody has shown a case where existing normal locking ends up being
> really a huge problem, and where RCU clearly helps.

The poster child of such a case is module unloading. Keeping reference
counts for every even non sleeping use of a module is very painful.
The current "fix" -- putting module count increases in all possible module
callers to fix the unload races is slow and ugly and far too subtle to
get everything right. Waiting quiescent periods before unloading is a nice
alternative.

-Andi

2001-10-13 19:15:37

by Alexander Viro

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion



On 13 Oct 2001, Andi Kleen wrote:

> In article <[email protected]>,
> Linus Torvalds <[email protected]> writes:
>
> > - nobody has shown a case where existing normal locking ends up being
> > really a huge problem, and where RCU clearly helps.
>
> The poster child of such a case is module unloading. Keeping reference
> counts for every even non sleeping use of a module is very painful.
> The current "fix" -- putting module count increases in all possible module
> callers to fix the unload races is slow and ugly and far too subtle to
> get everything right. Waiting quiescent periods before unloading is a nice
> alternative.

... while quiescent stuff is _not_ subtle and not prone to breakage. Right.
In the same world where Vomit-Making System is elegant, SGI "designs" are
and NT is The Wave Of Future(tm). Pardon me, but I'll stay in our universe
and away from the drugs of such power.

2001-10-13 20:48:53

by Rusty Russell

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion

In message <[email protected]> you write:
> In article <[email protected]>,
> Linus Torvalds <[email protected]> writes:
>
> > - nobody has shown a case where existing normal locking ends up being
> > really a huge problem, and where RCU clearly helps.
>
> The poster child of such a case is module unloading. Keeping reference
> counts for every even non sleeping use of a module is very painful.

Well, module unloading requires only a small fraction of the read copy
update infrastructure (synchronize_kernel()), and can be implemented
without any scheduler changes, as it's not at all speed critical.

If nothing else, this thread has served to make more kernel hackers
aware of the technique, so they can try it themselves as desired.

Cheers,
Rusty.
--
Premature optmztion is rt of all evl. --DK

2001-10-13 21:24:17

by Rusty Russell

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion

In message <[email protected]> you
write:
>
> (b) the whole notiong of "scheduling point" is a lot too fragile to be
> acceptable for important data structures. It breaks with the
> pre-emption patches on UP, and we've seen many times how hard it is
> for developers to notice even when there _is_ an explicit "end
> critical region now" kind of thing

Yeah, if you can't get locking right, you can't get RCU right. I've
shown you that using it in place of standard locking is simple: the
ONLY added issue is being careful not to screw readers while doing the
actual insert/delete.

Where, if anywhere, is this worth it? Good question: 3% on 4-way
dbench doesn't cut it in my book...

Rusty.
--
Premature optmztion is rt of all evl. --DK

2001-10-14 01:37:31

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Re: RFC: patch to allow lock-free traversal of lists with insertion

>Where, if anywhere, is this worth it? Good question: 3% on 4-way
>dbench doesn't cut it in my book...

No argument here, we need to show compelling performance cases for RCU in
a number of situations. We have some (e.g., for FD set management on
8-way reported at OLS), but need more. We will continue to work on this.

If nothing else, this thread seems to have raised awareness of RCU. ;-)

Thanx, Paul

2001-10-14 02:04:05

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Re: RFC: patch to allow lock-free traversal of lists

>On Sat, 13 Oct 2001, Linus Torvalds wrote:
>>
>> In short, RCU seems to be a case of "hey, that's cool", but it's a
>> solution in search of a problem so severe that it is worth it.
>
>Oh, and before people start telling me that RCU was successfully used in
>AIX/projectX/xxxx/etc, you have to realize that I don't give a rats *ss
>about the fact that there are OS's out there that are "more scalable".

I understand and agree: RCU must prove itself -in- -Linux-.

>The last time I looked, Solaris and AIX and all the rest of the "scalable"
>systems were absolute pigs on smaller hardware, and the "scalability" in
>them often translates into "we scale linearly to many CPU's by being
>really bad even on one".

Again understood and agreed. Simplicity and single-CPU performance are
at least as important as scalability. Although I believe that RCU can
be an important part of all three in some situations, this belief is
clearly in need of more proof.

Thanx, Paul

2001-10-14 07:19:11

by Dipankar Sarma

[permalink] [raw]
Subject: Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion

On Sat, Oct 13, 2001 at 10:28:13AM -0700, Linus Torvalds wrote:
>
> On Sat, 13 Oct 2001, Linus Torvalds wrote:
> >
> > In short, RCU seems to be a case of "hey, that's cool", but it's a
> > solution in search of a problem so severe that it is worth it.
>
> Oh, and before people start telling me that RCU was successfully used in
> AIX/projectX/xxxx/etc, you have to realize that I don't give a rats *ss
> about the fact that there are OS's out there that are "more scalable".

Absolutely. Those are different OSes, different environments and mostly
imprtantly different goals. We may draw on that experience, but still
need to prove that the ideas work for *Linux*.

>
> The last time I looked, Solaris and AIX and all the rest of the "scalable"
> systems were absolute pigs on smaller hardware, and the "scalability" in
> them often translates into "we scale linearly to many CPU's by being
> really bad even on one".

No argument here at all. Big iron is only a part of what linux
does and we are very conscious of that fact. In fact, this makes
our work quite an interesting challenge.

Thanks
Dipankar
--
Dipankar Sarma <[email protected]> Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.