2009-04-17 01:03:23

by Mathieu Desnoyers

[permalink] [raw]
Subject: [patch 2/3] RCU move trace defines to rcupdate_types.h

Given tracepoint.h need to include the minimal set of headers, split some
RCU defines from the global rcupdate.h header.

Signed-off-by: Mathieu Desnoyers <[email protected]>
CC: Jeremy Fitzhardinge <[email protected]>
CC: "Paul E. McKenney" <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Christoph Hellwig <[email protected]>
---
include/linux/rcupdate.h | 128 --------------------------------
include/linux/rcupdate_types.h | 163 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 165 insertions(+), 126 deletions(-)

Index: linux.trees.git/include/linux/rcupdate.h
===================================================================
--- linux.trees.git.orig/include/linux/rcupdate.h 2009-04-16 20:34:19.000000000 -0400
+++ linux.trees.git/include/linux/rcupdate.h 2009-04-16 20:35:23.000000000 -0400
@@ -40,6 +40,7 @@
#include <linux/seqlock.h>
#include <linux/lockdep.h>
#include <linux/completion.h>
+#include <linux/rcupdate_types.h>

/**
* struct rcu_head - callback structure for use with RCU
@@ -70,132 +71,7 @@ extern int rcu_scheduler_active;
(ptr)->next = NULL; (ptr)->func = NULL; \
} while (0)

-/**
- * rcu_read_lock - mark the beginning of an RCU read-side critical section.
- *
- * When synchronize_rcu() is invoked on one CPU while other CPUs
- * are within RCU read-side critical sections, then the
- * synchronize_rcu() is guaranteed to block until after all the other
- * CPUs exit their critical sections. Similarly, if call_rcu() is invoked
- * on one CPU while other CPUs are within RCU read-side critical
- * sections, invocation of the corresponding RCU callback is deferred
- * until after the all the other CPUs exit their critical sections.
- *
- * Note, however, that RCU callbacks are permitted to run concurrently
- * with RCU read-side critical sections. One way that this can happen
- * is via the following sequence of events: (1) CPU 0 enters an RCU
- * read-side critical section, (2) CPU 1 invokes call_rcu() to register
- * an RCU callback, (3) CPU 0 exits the RCU read-side critical section,
- * (4) CPU 2 enters a RCU read-side critical section, (5) the RCU
- * callback is invoked. This is legal, because the RCU read-side critical
- * section that was running concurrently with the call_rcu() (and which
- * therefore might be referencing something that the corresponding RCU
- * callback would free up) has completed before the corresponding
- * RCU callback is invoked.
- *
- * RCU read-side critical sections may be nested. Any deferred actions
- * will be deferred until the outermost RCU read-side critical section
- * completes.
- *
- * It is illegal to block while in an RCU read-side critical section.
- */
-#define rcu_read_lock() __rcu_read_lock()
-
-/**
- * rcu_read_unlock - marks the end of an RCU read-side critical section.
- *
- * See rcu_read_lock() for more information.
- */
-
-/*
- * So where is rcu_write_lock()? It does not exist, as there is no
- * way for writers to lock out RCU readers. This is a feature, not
- * a bug -- this property is what provides RCU's performance benefits.
- * Of course, writers must coordinate with each other. The normal
- * spinlock primitives work well for this, but any other technique may be
- * used as well. RCU does not care how the writers keep out of each
- * others' way, as long as they do so.
- */
-#define rcu_read_unlock() __rcu_read_unlock()
-
-/**
- * rcu_read_lock_bh - mark the beginning of a softirq-only RCU critical section
- *
- * This is equivalent of rcu_read_lock(), but to be used when updates
- * are being done using call_rcu_bh(). Since call_rcu_bh() callbacks
- * consider completion of a softirq handler to be a quiescent state,
- * a process in RCU read-side critical section must be protected by
- * disabling softirqs. Read-side critical sections in interrupt context
- * can use just rcu_read_lock().
- *
- */
-#define rcu_read_lock_bh() __rcu_read_lock_bh()
-
-/*
- * rcu_read_unlock_bh - marks the end of a softirq-only RCU critical section
- *
- * See rcu_read_lock_bh() for more information.
- */
-#define rcu_read_unlock_bh() __rcu_read_unlock_bh()
-
-/**
- * rcu_read_lock_sched - mark the beginning of a RCU-classic critical section
- *
- * Should be used with either
- * - synchronize_sched()
- * or
- * - call_rcu_sched() and rcu_barrier_sched()
- * on the write-side to insure proper synchronization.
- */
-#define rcu_read_lock_sched() preempt_disable()
-#define rcu_read_lock_sched_notrace() preempt_disable_notrace()
-
-/*
- * rcu_read_unlock_sched - marks the end of a RCU-classic critical section
- *
- * See rcu_read_lock_sched for more information.
- */
-#define rcu_read_unlock_sched() preempt_enable()
-#define rcu_read_unlock_sched_notrace() preempt_enable_notrace()
-
-
-
-/**
- * rcu_dereference - fetch an RCU-protected pointer in an
- * RCU read-side critical section. This pointer may later
- * be safely dereferenced.
- *
- * Inserts memory barriers on architectures that require them
- * (currently only the Alpha), and, more importantly, documents
- * exactly which pointers are protected by RCU.
- */
-
-#define rcu_dereference(p) ({ \
- typeof(p) _________p1 = ACCESS_ONCE(p); \
- smp_read_barrier_depends(); \
- (_________p1); \
- })
-
-/**
- * rcu_assign_pointer - assign (publicize) a pointer to a newly
- * initialized structure that will be dereferenced by RCU read-side
- * critical sections. Returns the value assigned.
- *
- * Inserts memory barriers on architectures that require them
- * (pretty much all of them other than x86), and also prevents
- * the compiler from reordering the code that initializes the
- * structure after the pointer assignment. More importantly, this
- * call documents which pointers will be dereferenced by RCU read-side
- * code.
- */
-
-#define rcu_assign_pointer(p, v) \
- ({ \
- if (!__builtin_constant_p(v) || \
- ((v) != NULL)) \
- smp_wmb(); \
- (p) = (v); \
- })
+/* See linux/rcupdate_types.h for read-side locking primitives */

/* Infrastructure to implement the synchronize_() primitives. */

Index: linux.trees.git/include/linux/rcupdate_types.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/include/linux/rcupdate_types.h 2009-04-16 20:35:09.000000000 -0400
@@ -0,0 +1,163 @@
+/*
+ * Read-Copy Update mechanism for mutual exclusion - read-side definitions
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright IBM Corporation, 2001
+ *
+ * Author: Dipankar Sarma <[email protected]>
+ *
+ * Based on the original work by Paul McKenney <[email protected]>
+ * and inputs from Rusty Russell, Andrea Arcangeli and Andi Kleen.
+ * Papers:
+ * http://www.rdrop.com/users/paulmck/paper/rclockpdcsproof.pdf
+ * http://lse.sourceforge.net/locking/rclock_OLS.2001.05.01c.sc.pdf (OLS2001)
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * http://lse.sourceforge.net/locking/rcupdate.html
+ *
+ */
+
+#ifndef __LINUX_RCUPDATE_TYPES_H
+#define __LINUX_RCUPDATE_TYPES_H
+
+/**
+ * rcu_read_lock - mark the beginning of an RCU read-side critical section.
+ *
+ * When synchronize_rcu() is invoked on one CPU while other CPUs
+ * are within RCU read-side critical sections, then the
+ * synchronize_rcu() is guaranteed to block until after all the other
+ * CPUs exit their critical sections. Similarly, if call_rcu() is invoked
+ * on one CPU while other CPUs are within RCU read-side critical
+ * sections, invocation of the corresponding RCU callback is deferred
+ * until after the all the other CPUs exit their critical sections.
+ *
+ * Note, however, that RCU callbacks are permitted to run concurrently
+ * with RCU read-side critical sections. One way that this can happen
+ * is via the following sequence of events: (1) CPU 0 enters an RCU
+ * read-side critical section, (2) CPU 1 invokes call_rcu() to register
+ * an RCU callback, (3) CPU 0 exits the RCU read-side critical section,
+ * (4) CPU 2 enters a RCU read-side critical section, (5) the RCU
+ * callback is invoked. This is legal, because the RCU read-side critical
+ * section that was running concurrently with the call_rcu() (and which
+ * therefore might be referencing something that the corresponding RCU
+ * callback would free up) has completed before the corresponding
+ * RCU callback is invoked.
+ *
+ * RCU read-side critical sections may be nested. Any deferred actions
+ * will be deferred until the outermost RCU read-side critical section
+ * completes.
+ *
+ * It is illegal to block while in an RCU read-side critical section.
+ */
+#define rcu_read_lock() __rcu_read_lock()
+
+/**
+ * rcu_read_unlock - marks the end of an RCU read-side critical section.
+ *
+ * See rcu_read_lock() for more information.
+ */
+
+/*
+ * So where is rcu_write_lock()? It does not exist, as there is no
+ * way for writers to lock out RCU readers. This is a feature, not
+ * a bug -- this property is what provides RCU's performance benefits.
+ * Of course, writers must coordinate with each other. The normal
+ * spinlock primitives work well for this, but any other technique may be
+ * used as well. RCU does not care how the writers keep out of each
+ * others' way, as long as they do so.
+ */
+#define rcu_read_unlock() __rcu_read_unlock()
+
+/**
+ * rcu_read_lock_bh - mark the beginning of a softirq-only RCU critical section
+ *
+ * This is equivalent of rcu_read_lock(), but to be used when updates
+ * are being done using call_rcu_bh(). Since call_rcu_bh() callbacks
+ * consider completion of a softirq handler to be a quiescent state,
+ * a process in RCU read-side critical section must be protected by
+ * disabling softirqs. Read-side critical sections in interrupt context
+ * can use just rcu_read_lock().
+ *
+ */
+#define rcu_read_lock_bh() __rcu_read_lock_bh()
+
+/*
+ * rcu_read_unlock_bh - marks the end of a softirq-only RCU critical section
+ *
+ * See rcu_read_lock_bh() for more information.
+ */
+#define rcu_read_unlock_bh() __rcu_read_unlock_bh()
+
+/**
+ * rcu_read_lock_sched - mark the beginning of a RCU-classic critical section
+ *
+ * Should be used with either
+ * - synchronize_sched()
+ * or
+ * - call_rcu_sched() and rcu_barrier_sched()
+ * on the write-side to insure proper synchronization.
+ */
+#define rcu_read_lock_sched() preempt_disable()
+#define rcu_read_lock_sched_notrace() preempt_disable_notrace()
+
+/*
+ * rcu_read_unlock_sched - marks the end of a RCU-classic critical section
+ *
+ * See rcu_read_lock_sched for more information.
+ */
+#define rcu_read_unlock_sched() preempt_enable()
+#define rcu_read_unlock_sched_notrace() preempt_enable_notrace()
+
+
+
+/**
+ * rcu_dereference - fetch an RCU-protected pointer in an
+ * RCU read-side critical section. This pointer may later
+ * be safely dereferenced.
+ *
+ * Inserts memory barriers on architectures that require them
+ * (currently only the Alpha), and, more importantly, documents
+ * exactly which pointers are protected by RCU.
+ */
+
+#define rcu_dereference(p) ({ \
+ typeof(p) _________p1 = ACCESS_ONCE(p); \
+ smp_read_barrier_depends(); \
+ (_________p1); \
+ })
+
+/**
+ * rcu_assign_pointer - assign (publicize) a pointer to a newly
+ * initialized structure that will be dereferenced by RCU read-side
+ * critical sections. Returns the value assigned.
+ *
+ * Inserts memory barriers on architectures that require them
+ * (pretty much all of them other than x86), and also prevents
+ * the compiler from reordering the code that initializes the
+ * structure after the pointer assignment. More importantly, this
+ * call documents which pointers will be dereferenced by RCU read-side
+ * code.
+ */
+
+#define rcu_assign_pointer(p, v) \
+ ({ \
+ if (!__builtin_constant_p(v) || \
+ ((v) != NULL)) \
+ smp_wmb(); \
+ (p) = (v); \
+ })
+
+#endif /* __LINUX_RCUPDATE_TYPES_H */

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


2009-04-17 01:10:37

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

I don't think this helps. rcupdate_types.h uses preempt_disable/enable,
but doesn't include linux/preempt.h for them - but someone's going to
have to, so you've got an implicit dependency on the user to #include
the right headers in advance.

J

2009-04-17 01:23:34

by Steven Rostedt

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h


On Thu, 16 Apr 2009, Jeremy Fitzhardinge wrote:

> I don't think this helps. rcupdate_types.h uses preempt_disable/enable, but
> doesn't include linux/preempt.h for them - but someone's going to have to, so
> you've got an implicit dependency on the user to #include the right headers in
> advance.

Would including linux/preempt.h in rcupdate_types.h be a problem?

-- Steve

2009-04-17 01:42:21

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

* Steven Rostedt ([email protected]) wrote:
>
> On Thu, 16 Apr 2009, Jeremy Fitzhardinge wrote:
>
> > I don't think this helps. rcupdate_types.h uses preempt_disable/enable, but
> > doesn't include linux/preempt.h for them - but someone's going to have to, so
> > you've got an implicit dependency on the user to #include the right headers in
> > advance.
>
> Would including linux/preempt.h in rcupdate_types.h be a problem?
>
> -- Steve
>

I did not include preempt.h in rcupdate_types.h because rcupdate.h did
not include it, so I thought it had a special status such a kernel.h.
However, I notice the rcupdate.h includes spinlock.h, which in turn
includes preempt.h, so we would need to include preempt.h in
rcupdate_types.h.

But I think preempt.h is pretty much only type definitions. I don't
think that would be a problem, but maybe Jeremy knows better.

Mathieu


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 01:47:55

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

* Mathieu Desnoyers ([email protected]) wrote:
> * Steven Rostedt ([email protected]) wrote:
> >
> > On Thu, 16 Apr 2009, Jeremy Fitzhardinge wrote:
> >
> > > I don't think this helps. rcupdate_types.h uses preempt_disable/enable, but
> > > doesn't include linux/preempt.h for them - but someone's going to have to, so
> > > you've got an implicit dependency on the user to #include the right headers in
> > > advance.
> >
> > Would including linux/preempt.h in rcupdate_types.h be a problem?
> >
> > -- Steve
> >
>
> I did not include preempt.h in rcupdate_types.h because rcupdate.h did
> not include it, so I thought it had a special status such a kernel.h.
> However, I notice the rcupdate.h includes spinlock.h, which in turn
> includes preempt.h, so we would need to include preempt.h in
> rcupdate_types.h.
>

Hrm, well, the actual question is :

given rcupdate_types.h only defines macros, and given we won't want to
include all the headers that implement all the content of these macros,
does it make sense to typically require people to either include
rcupdate.h directly if they want to have the full includes required to
expand the macros ?

If we do that, then including preempt.h in rcupdate_types.h is not
necessary. However, tracepoint.h should now include preempt.h, because
it would be cumbersome to require from every tracepoint.h users to
include preempt.h.

One way or another, we will have to include preempt.h under
tracepoint.h, but I don't see it as a roadblock, given that preempt.h is
quite slim.

Mathieu

> But I think preempt.h is pretty much only type definitions. I don't
> think that would be a problem, but maybe Jeremy knows better.
>
> Mathieu
>
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 02:38:18

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h (update)

* Steven Rostedt ([email protected]) wrote:
>
> On Thu, 16 Apr 2009, Jeremy Fitzhardinge wrote:
>
> > I don't think this helps. rcupdate_types.h uses preempt_disable/enable, but
> > doesn't include linux/preempt.h for them - but someone's going to have to, so
> > you've got an implicit dependency on the user to #include the right headers in
> > advance.
>
> Would including linux/preempt.h in rcupdate_types.h be a problem?
>
> -- Steve
>

Here is the updated patch to tracepoint.h that includes preempt.h.


tracepoints : remove rcupdate.h dependency

Use the slimmer rcupdate_types.h instead of the fat rcupdate.h.

Changelog : include preempt.h from tracepoint.h.

Signed-off-by: Mathieu Desnoyers <[email protected]>
CC: Jeremy Fitzhardinge <[email protected]>
CC: "Paul E. McKenney" <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Christoph Hellwig <[email protected]>
---
include/linux/tracepoint.h | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux.trees.git/include/linux/tracepoint.h
===================================================================
--- linux.trees.git.orig/include/linux/tracepoint.h 2009-04-16 20:32:27.000000000 -0400
+++ linux.trees.git/include/linux/tracepoint.h 2009-04-16 22:13:53.000000000 -0400
@@ -15,7 +15,8 @@
*/

#include <linux/types.h>
-#include <linux/rcupdate.h>
+#include <linux/rcupdate_types.h>
+#include <linux/preempt.h>

struct module;
struct tracepoint;
@@ -150,11 +151,11 @@ extern int tracepoint_get_iter_range(str
* tracepoint_synchronize_unregister must be called between the last tracepoint
* probe unregistration and the end of module exit to make sure there is no
* caller executing a probe when it is freed.
+ * Using a define rather than a static inline to make sure tracepoint.h does not
+ * depend on rcupdate.h. rcupdate.h must be included whenever
+ * tracepoint_synchronize_unregister() is used.
*/
-static inline void tracepoint_synchronize_unregister(void)
-{
- synchronize_sched();
-}
+#define tracepoint_synchronize_unregister() synchronize_sched()

#define PARAMS(args...) args


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 05:58:14

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

Mathieu Desnoyers wrote:
> * Steven Rostedt ([email protected]) wrote:
>
>> On Thu, 16 Apr 2009, Jeremy Fitzhardinge wrote:
>>
>>
>>> I don't think this helps. rcupdate_types.h uses preempt_disable/enable, but
>>> doesn't include linux/preempt.h for them - but someone's going to have to, so
>>> you've got an implicit dependency on the user to #include the right headers in
>>> advance.
>>>
>> Would including linux/preempt.h in rcupdate_types.h be a problem?
>>
>> -- Steve
>>
>>
>
> I did not include preempt.h in rcupdate_types.h because rcupdate.h did
> not include it, so I thought it had a special status such a kernel.h.
> However, I notice the rcupdate.h includes spinlock.h, which in turn
> includes preempt.h, so we would need to include preempt.h in
> rcupdate_types.h.
>
> But I think preempt.h is pretty much only type definitions. I don't
> think that would be a problem, but maybe Jeremy knows better.

No, preempt.h has fairly complex #include requirements:
#include <linux/thread_info.h>
#include <linux/linkage.h>
#include <linux/list.h>

which in turn include:
/* linux/thread_info */
#include <linux/bitops.h>
#include <asm/thread_info.h>

/* list.h */
#include <linux/stddef.h>
#include <linux/poison.h>
#include <linux/prefetch.h>
#include <asm/system.h>

And from there more complex still:
#include <asm/asm.h>
#include <asm/segment.h>
#include <asm/cpufeature.h>
#include <asm/cmpxchg.h>
#include <asm/nops.h>
#include <linux/kernel.h>
#include <linux/irqflags.h>
#include <asm/page.h>
#include <asm/processor.h>
#include <asm/ftrace.h>
#include <asm/atomic.h>
[...and much more...]

Given that paravirt.h is included in some of those headers, it
eventually gets cyclic.

J

2009-04-17 06:23:45

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

Mathieu Desnoyers wrote:
> Hrm, well, the actual question is :
>
> given rcupdate_types.h only defines macros, and given we won't want to
> include all the headers that implement all the content of these macros,
> does it make sense to typically require people to either include
> rcupdate.h directly if they want to have the full includes required to
> expand the macros ?
>

Well, any user of tracepoint who wants to directly or indirectly use
DECLARE_TRACE() or DEFINE_TRACE() will need to include it, so it ends up
getting included everywhere.

> If we do that, then including preempt.h in rcupdate_types.h is not
> necessary. However, tracepoint.h should now include preempt.h, because
> it would be cumbersome to require from every tracepoint.h users to
> include preempt.h.
>
> One way or another, we will have to include preempt.h under
> tracepoint.h, but I don't see it as a roadblock, given that preempt.h is
> quite slim.
>

Our headers are a tangled mass, and including anything non-trivial tends
to bring in everything. Despite the fact that preempt.h doesn't include
much explicitly, it does include non-trivial headers, and so will pull
in a broad swath of headers.

For example, here's the list of 64 headers included by cpp from a .c
file containing a single line: #include <linux/preempt.h>

"/home/jeremy/git/linux/arch/x86/include/asm/alternative.h"
"/home/jeremy/git/linux/arch/x86/include/asm/atomic_64.h"
"/home/jeremy/git/linux/arch/x86/include/asm/bitops.h"
"/home/jeremy/git/linux/arch/x86/include/asm/bug.h"
"/home/jeremy/git/linux/arch/x86/include/asm/cmpxchg_64.h"
"/home/jeremy/git/linux/arch/x86/include/asm/cpufeature.h"
"/home/jeremy/git/linux/arch/x86/include/asm/desc_defs.h"
"/home/jeremy/git/linux/arch/x86/include/asm/ds.h"
"/home/jeremy/git/linux/arch/x86/include/asm/ftrace.h"
"/home/jeremy/git/linux/arch/x86/include/asm/irqflags.h"
"/home/jeremy/git/linux/arch/x86/include/asm/kmap_types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/math_emu.h"
"/home/jeremy/git/linux/arch/x86/include/asm/msr.h"
"/home/jeremy/git/linux/arch/x86/include/asm/page.h"
"/home/jeremy/git/linux/arch/x86/include/asm/page_64_types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/page_types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/paravirt.h"
"/home/jeremy/git/linux/arch/x86/include/asm/paravirt_types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/percpu.h"
"/home/jeremy/git/linux/arch/x86/include/asm/pgtable_64_types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/pgtable_types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/posix_types_64.h"
"/home/jeremy/git/linux/arch/x86/include/asm/processor.h"
"/home/jeremy/git/linux/arch/x86/include/asm/ptrace-abi.h"
"/home/jeremy/git/linux/arch/x86/include/asm/ptrace.h"
"/home/jeremy/git/linux/arch/x86/include/asm/segment.h"
"/home/jeremy/git/linux/arch/x86/include/asm/sigcontext.h"
"/home/jeremy/git/linux/arch/x86/include/asm/string_64.h"
"/home/jeremy/git/linux/arch/x86/include/asm/swab.h"
"/home/jeremy/git/linux/arch/x86/include/asm/system.h"
"/home/jeremy/git/linux/arch/x86/include/asm/thread_info.h"
"/home/jeremy/git/linux/arch/x86/include/asm/types.h"
"/home/jeremy/git/linux/arch/x86/include/asm/vm86.h"
"/home/jeremy/git/linux/include/asm-generic/atomic.h"
"/home/jeremy/git/linux/include/asm-generic/bitops/fls64.h"
"/home/jeremy/git/linux/include/asm-generic/bitops/sched.h"
"/home/jeremy/git/linux/include/asm-generic/bug.h"
"/home/jeremy/git/linux/include/asm-generic/int-ll64.h"
"/home/jeremy/git/linux/include/asm-generic/page.h"
"/home/jeremy/git/linux/include/asm-generic/percpu.h"
"/home/jeremy/git/linux/include/linux/bitmap.h"
"/home/jeremy/git/linux/include/linux/bitops.h"
"/home/jeremy/git/linux/include/linux/byteorder/generic.h"
"/home/jeremy/git/linux/include/linux/byteorder/little_endian.h"
"/home/jeremy/git/linux/include/linux/compiler-gcc.h"
"/home/jeremy/git/linux/include/linux/compiler.h"
"/home/jeremy/git/linux/include/linux/cpumask.h"
"/home/jeremy/git/linux/include/linux/dynamic_debug.h"
"/home/jeremy/git/linux/include/linux/err.h"
"/home/jeremy/git/linux/include/linux/init.h"
"/home/jeremy/git/linux/include/linux/irqflags.h"
"/home/jeremy/git/linux/include/linux/kernel.h"
"/home/jeremy/git/linux/include/linux/list.h"
"/home/jeremy/git/linux/include/linux/log2.h"
"/home/jeremy/git/linux/include/linux/personality.h"
"/home/jeremy/git/linux/include/linux/posix_types.h"
"/home/jeremy/git/linux/include/linux/preempt.h"
"/home/jeremy/git/linux/include/linux/prefetch.h"
"/home/jeremy/git/linux/include/linux/stddef.h"
"/home/jeremy/git/linux/include/linux/string.h"
"/home/jeremy/git/linux/include/linux/swab.h"
"/home/jeremy/git/linux/include/linux/thread_info.h"
"/home/jeremy/git/linux/include/linux/tracepoint.h"
"/home/jeremy/git/linux/include/linux/types.h"
"/home/jeremy/git/linux/include/trace/events/pvops.h"


Using your patch to split out rcupdate_types.h, if I leave my pvops.h
header as-is, I get:

CC arch/x86/kernel/asm-offsets.s
In file included from /home/jeremy/git/linux/arch/x86/include/asm/paravirt.h:18,
from /home/jeremy/git/linux/arch/x86/include/asm/irqflags.h:55,
from /home/jeremy/git/linux/include/linux/irqflags.h:57,
from /home/jeremy/git/linux/arch/x86/include/asm/system.h:11,
from /home/jeremy/git/linux/arch/x86/include/asm/processor.h:17,
from /home/jeremy/git/linux/include/linux/prefetch.h:14,
from /home/jeremy/git/linux/include/linux/list.h:6,
from /home/jeremy/git/linux/include/linux/module.h:9,
from /home/jeremy/git/linux/include/linux/crypto.h:21,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:7,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/include/trace/events/pvops.h: In function 'trace_load_sp0':
/home/jeremy/git/linux/include/trace/events/pvops.h:31: error: implicit declaration of function 'preempt_disable_notrace'
/home/jeremy/git/linux/include/trace/events/pvops.h:31: error: implicit declaration of function 'smp_read_barrier_depends'
/home/jeremy/git/linux/include/trace/events/pvops.h:31: error: implicit declaration of function 'preempt_enable_notrace'
make[3]: *** [arch/x86/kernel/asm-offsets.s] Error 1


If I try to fix that by adding <linux/preempt.h> just before
<linux/tracepoint.h> in pvops.h, I get:

CC arch/x86/kernel/asm-offsets.s
In file included from /home/jeremy/git/linux/include/linux/thread_info.h:55,
from /home/jeremy/git/linux/include/linux/preempt.h:9,
from /home/jeremy/git/linux/include/trace/events/pvops.h:4,
from /home/jeremy/git/linux/arch/x86/include/asm/paravirt.h:18,
from /home/jeremy/git/linux/arch/x86/include/asm/irqflags.h:55,
from /home/jeremy/git/linux/include/linux/irqflags.h:57,
from /home/jeremy/git/linux/arch/x86/include/asm/system.h:11,
from /home/jeremy/git/linux/arch/x86/include/asm/processor.h:17,
from /home/jeremy/git/linux/include/linux/prefetch.h:14,
from /home/jeremy/git/linux/include/linux/list.h:6,
from /home/jeremy/git/linux/include/linux/module.h:9,
from /home/jeremy/git/linux/include/linux/crypto.h:21,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:7,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/arch/x86/include/asm/thread_info.h:34: error: expected specifier-qualifier-list before 'mm_segment_t'
In file included from /home/jeremy/git/linux/include/trace/events/pvops.h:4,
from /home/jeremy/git/linux/arch/x86/include/asm/paravirt.h:18,
from /home/jeremy/git/linux/arch/x86/include/asm/irqflags.h:55,
from /home/jeremy/git/linux/include/linux/irqflags.h:57,
from /home/jeremy/git/linux/arch/x86/include/asm/system.h:11,
from /home/jeremy/git/linux/arch/x86/include/asm/processor.h:17,
from /home/jeremy/git/linux/include/linux/prefetch.h:14,
from /home/jeremy/git/linux/include/linux/list.h:6,
from /home/jeremy/git/linux/include/linux/module.h:9,
from /home/jeremy/git/linux/include/linux/crypto.h:21,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:7,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/include/linux/preempt.h:123: error: field 'link' has incomplete type
/home/jeremy/git/linux/include/linux/preempt.h: In function 'preempt_notifier_init':
/home/jeremy/git/linux/include/linux/preempt.h:133: error: implicit declaration of function 'INIT_HLIST_NODE'
In file included from /home/jeremy/git/linux/arch/x86/include/asm/paravirt.h:18,
from /home/jeremy/git/linux/arch/x86/include/asm/irqflags.h:55,
from /home/jeremy/git/linux/include/linux/irqflags.h:57,
from /home/jeremy/git/linux/arch/x86/include/asm/system.h:11,
from /home/jeremy/git/linux/arch/x86/include/asm/processor.h:17,
from /home/jeremy/git/linux/include/linux/prefetch.h:14,
from /home/jeremy/git/linux/include/linux/list.h:6,
from /home/jeremy/git/linux/include/linux/module.h:9,
from /home/jeremy/git/linux/include/linux/crypto.h:21,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:7,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/include/trace/events/pvops.h: In function 'trace_load_sp0':
/home/jeremy/git/linux/include/trace/events/pvops.h:31: error: implicit declaration of function 'smp_read_barrier_depends'
In file included from /home/jeremy/git/linux/include/linux/module.h:9,
from /home/jeremy/git/linux/include/linux/crypto.h:21,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:7,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/include/linux/list.h: At top level:
/home/jeremy/git/linux/include/linux/list.h:551: warning: conflicting types for 'INIT_HLIST_NODE'
/home/jeremy/git/linux/include/linux/list.h:551: error: static declaration of 'INIT_HLIST_NODE' follows non-static declaration
/home/jeremy/git/linux/include/linux/preempt.h:133: error: previous implicit declaration of 'INIT_HLIST_NODE' was here
In file included from /home/jeremy/git/linux/arch/x86/include/asm/i387.h:15,
from /home/jeremy/git/linux/arch/x86/include/asm/suspend_64.h:10,
from /home/jeremy/git/linux/arch/x86/include/asm/suspend.h:4,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:19,
from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/include/linux/regset.h: In function 'copy_regset_to_user':
/home/jeremy/git/linux/include/linux/regset.h:338: error: 'struct thread_info' has no member named 'addr_limit'
/home/jeremy/git/linux/include/linux/regset.h: In function 'copy_regset_from_user':
/home/jeremy/git/linux/include/linux/regset.h:361: error: 'struct thread_info' has no member named 'addr_limit'
In file included from /home/jeremy/git/linux/arch/x86/kernel/asm-offsets.c:4:
/home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c: In function 'main':
/home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:43: error: 'struct thread_info' has no member named 'addr_limit'
/home/jeremy/git/linux/arch/x86/kernel/asm-offsets_64.c:47: error: 'struct thread_info' has no member named 'sysenter_return'
make[3]: *** [arch/x86/kernel/asm-offsets.s] Error 1


J

2009-04-17 15:17:04

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>> * Steven Rostedt ([email protected]) wrote:
>>
>>> On Thu, 16 Apr 2009, Jeremy Fitzhardinge wrote:
>>>
>>>
>>>> I don't think this helps. rcupdate_types.h uses preempt_disable/enable, but
>>>> doesn't include linux/preempt.h for them - but someone's going to have to, so
>>>> you've got an implicit dependency on the user to #include the right headers in
>>>> advance.
>>>>
>>> Would including linux/preempt.h in rcupdate_types.h be a problem?
>>>
>>> -- Steve
>>>
>>>
>>
>> I did not include preempt.h in rcupdate_types.h because rcupdate.h did
>> not include it, so I thought it had a special status such a kernel.h.
>> However, I notice the rcupdate.h includes spinlock.h, which in turn
>> includes preempt.h, so we would need to include preempt.h in
>> rcupdate_types.h.
>>
>> But I think preempt.h is pretty much only type definitions. I don't
>> think that would be a problem, but maybe Jeremy knows better.
>
> No, preempt.h has fairly complex #include requirements:
> #include <linux/thread_info.h>
> #include <linux/linkage.h>
> #include <linux/list.h>
>
> which in turn include:
> /* linux/thread_info */
> #include <linux/bitops.h>
> #include <asm/thread_info.h>
>
> /* list.h */
> #include <linux/stddef.h>
> #include <linux/poison.h>
> #include <linux/prefetch.h>
> #include <asm/system.h>
>
> And from there more complex still:
> #include <asm/asm.h>
> #include <asm/segment.h>
> #include <asm/cpufeature.h>
> #include <asm/cmpxchg.h>
> #include <asm/nops.h>
> #include <linux/kernel.h>
> #include <linux/irqflags.h>
> #include <asm/page.h>
> #include <asm/processor.h>
> #include <asm/ftrace.h>
> #include <asm/atomic.h>
> [...and much more...]
>
> Given that paravirt.h is included in some of those headers, it
> eventually gets cyclic.
>

OK.

Given the simplicity of the preempt_disable/enable_notrace found in
preempt.h, we could move them to

include/preempt_types.h too, and that would solve all problems, wouldn't
it ?

Mathieu

> J

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 15:27:17

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

Mathieu Desnoyers wrote:
> Given the simplicity of the preempt_disable/enable_notrace found in
> preempt.h, we could move them to
>
> include/preempt_types.h too, and that would solve all problems, wouldn't
> it ?
>

No, it still needs linux/thread_info.h -> asm/thread_info.h, which in
turn gets quite a lot of things on x86 (and would need to be audited in
each architecture).

J

2009-04-17 15:42:39

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>> Given the simplicity of the preempt_disable/enable_notrace found in
>> preempt.h, we could move them to
>>
>> include/preempt_types.h too, and that would solve all problems, wouldn't
>> it ?
>>
>
> No, it still needs linux/thread_info.h -> asm/thread_info.h, which in
> turn gets quite a lot of things on x86 (and would need to be audited in
> each architecture).
>
> J

Well, I think it's a good time to do some cleanup then. Why on earth
would thread_info.h be anything else than a "_types"-like header ?

If headers has become in such a state in the kernel, then IMHO the
solution is not to shove more out-of-line functions under the carpet,
but rather to do the cleanup.

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 16:14:39

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

Mathieu Desnoyers wrote:
> * Jeremy Fitzhardinge ([email protected]) wrote:
>
>> Mathieu Desnoyers wrote:
>>
>>> Given the simplicity of the preempt_disable/enable_notrace found in
>>> preempt.h, we could move them to
>>>
>>> include/preempt_types.h too, and that would solve all problems, wouldn't
>>> it ?
>>>
>>>
>> No, it still needs linux/thread_info.h -> asm/thread_info.h, which in
>> turn gets quite a lot of things on x86 (and would need to be audited in
>> each architecture).
>>
>> J
>>
>
> Well, I think it's a good time to do some cleanup then. Why on earth
> would thread_info.h be anything else than a "_types"-like header ?
>

Why indeed? Because it includes a number of other headers to get the
definitions it needs, and defines various functions needed to operate on
the thread_info structure (including the all-important
current_thread_info()).

Yes, it can be refactored into thread_info.h and thread_info_types.h,
and all the headers it includes can be similarly refactored, and
linux/thread_info.h can also be split, and all the asm/*/thread_info.hs
can be split too, and it can be made to work for all arches under all
configs...

But that's going to take a long time, and if its a pre-requisite for
getting tracing going, then we're not going to see it merged this year.

> If headers has become in such a state in the kernel, then IMHO the
> solution is not to shove more out-of-line functions under the carpet,
> but rather to do the cleanup.
>

Besides, I'm still not convinced that putting the code inline is a good
idea. Direct call/return are not inherently expensive, and they're
something that CPU vendors have a lot of motivation to optimise for. In
particular, the call itself is no more expensive than a jmp other than
the return-address push, and the ret is also cheap because it will use
the return address cache rather than having to be a full indirect jmp.

And it would be much easier to justify leaving tracing compile-time
enabled all the time if each tracepoint really does have a minimal
icache profile when not enabled.

J

2009-04-17 16:38:57

by Steven Rostedt

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h


[ added Arjan ]

On Fri, 17 Apr 2009, Jeremy Fitzhardinge wrote:

> Mathieu Desnoyers wrote:
> > * Jeremy Fitzhardinge ([email protected]) wrote:
> >
> > > Mathieu Desnoyers wrote:
> > >
> > > > Given the simplicity of the preempt_disable/enable_notrace found in
> > > > preempt.h, we could move them to
> > > > include/preempt_types.h too, and that would solve all problems, wouldn't
> > > > it ?
> > > >
> > > No, it still needs linux/thread_info.h -> asm/thread_info.h, which in
> > > turn gets quite a lot of things on x86 (and would need to be audited in
> > > each architecture).
> > >
> > > J
> > >
> >
> > Well, I think it's a good time to do some cleanup then. Why on earth
> > would thread_info.h be anything else than a "_types"-like header ?
> >
>
> Why indeed? Because it includes a number of other headers to get the
> definitions it needs, and defines various functions needed to operate on the
> thread_info structure (including the all-important current_thread_info()).
>
> Yes, it can be refactored into thread_info.h and thread_info_types.h, and all
> the headers it includes can be similarly refactored, and linux/thread_info.h
> can also be split, and all the asm/*/thread_info.hs can be split too, and it
> can be made to work for all arches under all configs...
> But that's going to take a long time, and if its a pre-requisite for getting
> tracing going, then we're not going to see it merged this year.
>
> > If headers has become in such a state in the kernel, then IMHO the
> > solution is not to shove more out-of-line functions under the carpet,
> > but rather to do the cleanup.
> >
>
> Besides, I'm still not convinced that putting the code inline is a good idea.
> Direct call/return are not inherently expensive, and they're something that
> CPU vendors have a lot of motivation to optimise for. In particular, the call
> itself is no more expensive than a jmp other than the return-address push, and
> the ret is also cheap because it will use the return address cache rather than
> having to be a full indirect jmp.
>
> And it would be much easier to justify leaving tracing compile-time enabled
> all the time if each tracepoint really does have a minimal icache profile when
> not enabled.

I was talking with Arjan about this in San Francisco. The expense of doing
function calls. He told me (and he can correct me if I'm wrong here) that
function calls are like branch predictions. The branch part is the fact
that a retq is a jmp that can go to different locations. There's logic in
the CPU to match calls with retqs to speed this up.

He also told me that the "mcount" retq that I do is actually more
expensive. The logic does not expect a function to return immediately.
(for stubs, I'm not sure that was a good design).

Hence,

call mcount

[...]

mcount:
retq


is expensive, compared to a call to a function that actually does
something.

Again, Arjan can correct me here, since I'm just trying to paraphrase what
he told me.

-- Steve

2009-04-17 17:09:59

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [patch 2/3] RCU move trace defines to rcupdate_types.h

Steven Rostedt wrote:
> I was talking with Arjan about this in San Francisco. The expense of doing
> function calls. He told me (and he can correct me if I'm wrong here) that
> function calls are like branch predictions. The branch part is the fact
> that a retq is a jmp that can go to different locations. There's logic in
> the CPU to match calls with retqs to speed this up.
>

Right. The call is to a fixed address, so there's no prediction needed
at all; the CPU can immediately start fetching instructions at the call
target without missing a beat. When it hits the ret in the function,
assuming nobody has been playing games with the stack pointer or
modifying the return address on the stack, it can just look up the
return address from its cache and start fetching from there, again with
no bubbles. It should be very close to a pair of jumps, aside from one
extra memory write (for the return address on stack) - and that
shouldn't be too bad, because the chances are the cache is hot for the
stack.

> He also told me that the "mcount" retq that I do is actually more
> expensive. The logic does not expect a function to return immediately.
> (for stubs, I'm not sure that was a good design).
>
> Hence,
>
> call mcount
>
> [...]
>
> mcount:
> retq
>
>
> is expensive, compared to a call to a function that actually does
> something.
>
> Again, Arjan can correct me here, since I'm just trying to paraphrase what
> he told me.
>

Sounds reasonable; it takes a little while for the CPU to work out what
the return address will be, even though its cached, so doing an
immediate ret will cause a bubble while it sorts itself out. But that
shouldn't be an issue for the calls I'm talking about.

J