Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757921AbYBDWER (ORCPT ); Mon, 4 Feb 2008 17:04:17 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754029AbYBDWEE (ORCPT ); Mon, 4 Feb 2008 17:04:04 -0500 Received: from ms-smtp-05.nyroc.rr.com ([24.24.2.59]:58505 "EHLO ms-smtp-05.nyroc.rr.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753490AbYBDWEB (ORCPT ); Mon, 4 Feb 2008 17:04:01 -0500 Date: Mon, 4 Feb 2008 17:03:47 -0500 (EST) From: Steven Rostedt X-X-Sender: rostedt@gandalf.stny.rr.com To: "Paul E. McKenney" cc: Peter Zijlstra , LKML , Ingo Molnar , Linus Torvalds , Andrew Morton , Christoph Hellwig , Mathieu Desnoyers , Gregory Haskins , Arnaldo Carvalho de Melo , Thomas Gleixner , Tim Bird , Sam Ravnborg , "Frank Ch. Eigler" , Jan Kiszka , John Stultz , Arjan van de Ven , Steven Rostedt Subject: Re: [PATCH 02/22 -v7] Add basic support for gcc profiler instrumentation In-Reply-To: <20080204214023.GC29646@linux.vnet.ibm.com> Message-ID: References: <20080130031521.258552785@goodmis.org> <20080130031840.337019504@goodmis.org> <1201682801.28547.162.camel@lappy> <1201703101.28547.224.camel@lappy> <20080201223413.GB9247@linux.vnet.ibm.com> <20080202214124.GA28612@linux.vnet.ibm.com> <20080204214023.GC29646@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4439 Lines: 145 On Mon, 4 Feb 2008, Paul E. McKenney wrote: > OK, will see what I can do... > > > On Sat, 2 Feb 2008, Paul E. McKenney wrote: > > > > > Yep, you have dependencies, so something like the following: > > > > > > initial state: > > > > > > struct foo { > > > int a; > > > }; > > > struct foo x = { 0 }; > > > struct foo y = { 0 }; > > > struct foo *global_p = &y; > > > /* other variables are appropriately declared auto variables */ > > > > > > /* No kmalloc() or kfree(), hence no RCU grace periods. */ > > > /* In the terminology of http://lwn.net/Articles/262464/, we */ > > > /* are doing only publish-subscribe, nothing else. */ > > > > > > writer: > > > > > > x.a = 1; > > > smp_wmb(); /* or smp_mb() */ > > > global_p = &x; > > > > > > reader: > > > > > > p = global_p; > > > ta = p->a; > > > > > > Both Alpha and aggressive compiler optimizations can result in the reader > > > seeing the new value of the pointer (&x) but the old value of the field > > > (0). Strange but true. The fix is as follows: > > > > > > reader: > > > > > > p = global_p; > > > smp_read_barrier_depends(); /* or use rcu_dereference() */ > > > ta = p->a; > > > > > > So how can this happen? First note that if smp_read_barrier_depends() > > > was unnecessary in this case, it would be unnecessary in all cases. > > > > > > Second, let's start with the compiler. Suppose that a highly optimizing > > > compiler notices that in almost all cases, the reader finds p==global_p. > > > Suppose that this compiler also notices that one of the registers (say > > > r1) almost always contains this expected value of global_p, and that > > > cache pressure ensures that an actual load from global_p almost always > > > generates an expensive cache miss. Such a compiler would be within its > > > rights (as defined by the C standard) to generate code assuming that r1 > > > already had the right value, while also generating code to validate this > > > assumption, perhaps as follows: > > > > > > r2 = global_p; /* high latency, other things complete meanwhile */ > > > ta == r1->a; > > > if (r1 != r2) > > > ta = r2->a; > > > > > > Now consider the following sequence of events on a superscalar CPU: > > > > I think you missed one step here (causing my confusion). I don't want to > > assume so I'll try to put in the missing step: > > > > writer: r1 = p; /* happens to use r1 to store parameter p */ > > You lost me on this one... The writer has only the following three steps: You're right. I meant "writer: r1 = x;" > > writer: > > x.a = 1; > smp_wmb(); /* or smp_mb() */ > global_p = &x; > > Where did the "r1 = p" come from? For that matter, where did "p" come > from? > > > > reader: r2 = global_p; /* issued, has not yet completed. */ > > > reader: ta = r1->a; /* which gives zero. */ > > > writer: x.a = 1; > > > writer: smp_wmb(); > > > writer: global_p = &x; > > > reader: r2 = global_p; /* this instruction now completes */ > > > reader: if (r1 != r2) /* and these are equal, so we keep bad ta! */ > > > > Is that the case? > > Ah! Please note that I am doing something unusual here in that I am > working with global variables, as opposed to the normal RCU practice of > dynamically allocating memory. So "x" is just a global struct, not a > pointer to a struct. > But lets look at a simple version of my original code anyway ;-) Writer: void add_op(struct myops *x) { /* x->next may be garbage here */ x->next = global_p; smp_wmb(); global_p = x; } Reader: void read_op(void) { struct myops *p = global_p; while (p != NULL) { p->func(); p = next; /* if p->next is garbage we crash */ } } Here, we are missing the read_barrier_depends(). Lets look at the Alpha cache issue: reader reads the new version of global_p, and then reads the next pointer. But since the next pointer is on a different cacheline than global_p, it may have somehow had that in it's cache still. So it uses the old next pointer which contains the garbage. Is that correct? But I will have to admit, that I can't see how an aggressive compiler might have screwed this up. Being that x is a parameter, and the function add_op is not in a header file. -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/