Subject: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

__get_cpu_var() is used for multiple purposes in the kernel source. One of them is
address calculation via the form &__get_cpu_var(x). This calculates the address for
the instance of the percpu variable of the current processor based on an offset.

Other use cases are for storing and retrieving data from the current processor's percpu area.
__get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store and retrieve operations
could use a segment prefix (or a global register on other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
optimized assembly code to read and write per cpu variables.
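
The difference between the two forms can be illustrated with a small user-space
model. This is only a sketch and not kernel code: the model_* macros, areas[],
current_cpu and pcpu_y are hypothetical names made up for the illustration, and
__typeof__ assumes GCC or Clang. In the model both forms end up doing the same
address arithmetic; the point is only that the read/write form takes the
variable itself rather than a computed pointer, which is what would let an
architecture substitute a segment-prefixed or register-relative instruction.

```c
#include <stddef.h>

#define NR_CPUS 4

/* Model: one per-cpu area per CPU; each per-cpu variable is a field in it. */
struct percpu_area {
	int y;
};

static struct percpu_area areas[NR_CPUS];

/* Hypothetical stand-in for "the processor this code is running on". */
static int current_cpu;

/* The per-cpu "symbol" names CPU 0's copy; other copies sit at a fixed offset. */
#define pcpu_y (areas[0].y)

/* Address calculation: symbol address plus the current CPU's offset. */
#define model_this_cpu_ptr(ptr) \
	((__typeof__(*(ptr)) *)((char *)(ptr) + \
		(size_t)current_cpu * sizeof(struct percpu_area)))

/* Same shape as the definition quoted above: calculate, then dereference. */
#define model_get_cpu_var(var) (*model_this_cpu_ptr(&(var)))

/*
 * Model of the read/write operations. They are handed the variable, not an
 * address, so the implementation is free to choose how the access is done.
 * Here they simply fall back to the pointer arithmetic.
 */
#define model_this_cpu_read(var)       (*model_this_cpu_ptr(&(var)))
#define model_this_cpu_write(var, val) \
	((void)(*model_this_cpu_ptr(&(var)) = (val)))

/* Old style: the macro result is used as an lvalue or an rvalue. */
void store_old_style(int val) { model_get_cpu_var(pcpu_y) = val; }
int load_old_style(void) { return model_get_cpu_var(pcpu_y); }

/* New style: explicit write and read operations. */
void store_new_style(int val) { model_this_cpu_write(pcpu_y, val); }
int load_new_style(void) { return model_this_cpu_read(pcpu_y); }
```

Setting current_cpu to 2 and calling store_old_style(7) changes only
areas[2].y, and load_new_style() then returns the same value.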


This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
or into a use of this_cpu operations that use the offset. Address calculations are thereby avoided
and fewer registers are used in the generated code.

At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.

The patchset includes passes over all arches as well. Once these operations are used throughout,
specialized macros can be defined in non-x86 arches as well in order to optimize per cpu access,
e.g. by using a global register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);

Converts to

int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);

Converts to

int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu variable.

DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y);

Converts to

int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);

Converts to

memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

DEFINE_PER_CPU(int, y);
__get_cpu_var(y) = x;

Converts to

this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++;

Converts to

this_cpu_inc(y);
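
The six transformations can be tried out side by side in a user-space model.
This is a sketch under stated assumptions, not kernel code: the model_* macros,
areas[], current_cpu and the pcpu_* names are hypothetical, the per-cpu areas
are a plain array, and __typeof__ assumes GCC or Clang. Transformation 4 is
written as a copy from the current CPU's instance into the local variable.

```c
#include <stddef.h>
#include <string.h>

#define NR_CPUS 4

struct mystruct {
	int a;
	int b;
};

/* Model: every per-cpu variable is a field of a per-CPU area. */
struct percpu_area {
	int y;
	int arr[20];
	struct mystruct s;
};

static struct percpu_area areas[NR_CPUS];
static int current_cpu;	/* hypothetical stand-in for the running CPU */

/* Per-cpu "symbols": they name CPU 0's copy, other copies are offset. */
#define pcpu_y   (areas[0].y)
#define pcpu_arr (areas[0].arr)
#define pcpu_s   (areas[0].s)

#define model_this_cpu_ptr(ptr) \
	((__typeof__(*(ptr)) *)((char *)(ptr) + \
		(size_t)current_cpu * sizeof(struct percpu_area)))
#define model_this_cpu_read(var)       (*model_this_cpu_ptr(&(var)))
#define model_this_cpu_write(var, val) \
	((void)(*model_this_cpu_ptr(&(var)) = (val)))
#define model_this_cpu_inc(var)        ((void)(++*model_this_cpu_ptr(&(var))))

/* 1. Address of the current CPU's instance of a scalar. */
int *addr_of_scalar(void) { return model_this_cpu_ptr(&pcpu_y); }

/* 2. Same for an array: the array name already decays to a pointer. */
int *addr_of_array(void) { return model_this_cpu_ptr(pcpu_arr); }

/* 3. Read the current CPU's instance. */
int read_scalar(void) { return model_this_cpu_read(pcpu_y); }

/* 4. Copy a whole struct out of the current CPU's instance. */
struct mystruct copy_struct(void)
{
	struct mystruct x;

	memcpy(&x, model_this_cpu_ptr(&pcpu_s), sizeof(x));
	return x;
}

/* 5. Assign to the current CPU's instance. */
void write_scalar(int x) { model_this_cpu_write(pcpu_y, x); }

/* 6. Increment the current CPU's instance. */
void inc_scalar(void) { model_this_cpu_inc(pcpu_y); }
```

Each helper touches only the area selected by current_cpu, so the copies that
belong to other CPUs stay unchanged.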


Signed-off-by: Christoph Lameter <[email protected]>

Index: linux/include/linux/kernel_stat.h
===================================================================
--- linux.orig/include/linux/kernel_stat.h 2013-08-26 14:21:39.000000000 -0500
+++ linux/include/linux/kernel_stat.h 2013-08-26 14:23:08.883120293 -0500
@@ -47,8 +47,8 @@ DECLARE_PER_CPU(struct kernel_stat, ksta
DECLARE_PER_CPU(struct kernel_cpustat, kernel_cpustat);

/* Must have preemption disabled for this to be meaningful. */
-#define kstat_this_cpu (&__get_cpu_var(kstat))
-#define kcpustat_this_cpu (&__get_cpu_var(kernel_cpustat))
+#define kstat_this_cpu this_cpu_ptr(&kstat)
+#define kcpustat_this_cpu this_cpu_ptr(&kernel_cpustat)
#define kstat_cpu(cpu) per_cpu(kstat, cpu)
#define kcpustat_cpu(cpu) per_cpu(kernel_cpustat, cpu)

Index: linux/kernel/events/callchain.c
===================================================================
--- linux.orig/kernel/events/callchain.c 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/events/callchain.c 2013-08-26 14:23:08.875120379 -0500
@@ -134,7 +134,7 @@ static struct perf_callchain_entry *get_
int cpu;
struct callchain_cpus_entries *entries;

- *rctx = get_recursion_context(__get_cpu_var(callchain_recursion));
+ *rctx = get_recursion_context(this_cpu_ptr(callchain_recursion));
if (*rctx == -1)
return NULL;

@@ -150,7 +150,7 @@ static struct perf_callchain_entry *get_
static void
put_callchain_entry(int rctx)
{
- put_recursion_context(__get_cpu_var(callchain_recursion), rctx);
+ put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
}

struct perf_callchain_entry *
Index: linux/kernel/events/core.c
===================================================================
--- linux.orig/kernel/events/core.c 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/events/core.c 2013-08-26 14:23:08.875120379 -0500
@@ -238,10 +238,10 @@ void perf_sample_event_took(u64 sample_l
return;

/* decay the counter by 1 average sample */
- local_samples_len = __get_cpu_var(running_sample_length);
+ local_samples_len = __this_cpu_read(running_sample_length);
local_samples_len -= local_samples_len/NR_ACCUMULATED_SAMPLES;
local_samples_len += sample_len_ns;
- __get_cpu_var(running_sample_length) = local_samples_len;
+ __this_cpu_write(running_sample_length, local_samples_len);

/*
* note: this will be biased artifically low until we have
@@ -865,7 +865,7 @@ static DEFINE_PER_CPU(struct list_head,
static void perf_pmu_rotate_start(struct pmu *pmu)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
- struct list_head *head = &__get_cpu_var(rotation_list);
+ struct list_head *head = this_cpu_ptr(&rotation_list);

WARN_ON(!irqs_disabled());

@@ -2321,7 +2321,7 @@ void __perf_event_task_sched_out(struct
* to check if we have to switch out PMU state.
* cgroup event are system-wide mode only
*/
- if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
+ if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
perf_cgroup_sched_out(task, next);
}

@@ -2566,11 +2566,11 @@ void __perf_event_task_sched_in(struct t
* to check if we have to switch in PMU state.
* cgroup event are system-wide mode only
*/
- if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
+ if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
perf_cgroup_sched_in(prev, task);

/* check for system-wide branch_stack events */
- if (atomic_read(&__get_cpu_var(perf_branch_stack_events)))
+ if (atomic_read(this_cpu_ptr(&perf_branch_stack_events)))
perf_branch_stack_sched_in(prev, task);
}

@@ -2811,7 +2811,7 @@ done:
#ifdef CONFIG_NO_HZ_FULL
bool perf_event_can_stop_tick(void)
{
- if (list_empty(&__get_cpu_var(rotation_list)))
+ if (list_empty(this_cpu_ptr(&rotation_list)))
return true;
else
return false;
@@ -2820,7 +2820,7 @@ bool perf_event_can_stop_tick(void)

void perf_event_task_tick(void)
{
- struct list_head *head = &__get_cpu_var(rotation_list);
+ struct list_head *head = this_cpu_ptr(&rotation_list);
struct perf_cpu_context *cpuctx, *tmp;
struct perf_event_context *ctx;
int throttled;
@@ -5414,7 +5414,7 @@ static void do_perf_sw_event(enum perf_t
struct perf_sample_data *data,
struct pt_regs *regs)
{
- struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+ struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
struct perf_event *event;
struct hlist_head *head;

@@ -5433,7 +5433,7 @@ end:

int perf_swevent_get_recursion_context(void)
{
- struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+ struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);

return get_recursion_context(swhash->recursion);
}
@@ -5441,7 +5441,7 @@ EXPORT_SYMBOL_GPL(perf_swevent_get_recur

inline void perf_swevent_put_recursion_context(int rctx)
{
- struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+ struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);

put_recursion_context(swhash->recursion, rctx);
}
@@ -5470,7 +5470,7 @@ static void perf_swevent_read(struct per

static int perf_swevent_add(struct perf_event *event, int flags)
{
- struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+ struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
struct hw_perf_event *hwc = &event->hw;
struct hlist_head *head;

Index: linux/kernel/sched/cputime.c
===================================================================
--- linux.orig/kernel/sched/cputime.c 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/sched/cputime.c 2013-08-26 14:23:08.879120335 -0500
@@ -121,7 +121,7 @@ static inline void task_group_account_fi
* is the only cgroup, then nothing else should be necessary.
*
*/
- __get_cpu_var(kernel_cpustat).cpustat[index] += tmp;
+ __this_cpu_add(kernel_cpustat.cpustat[index], tmp);

cpuacct_account_field(p, index, tmp);
}
Index: linux/kernel/sched/fair.c
===================================================================
--- linux.orig/kernel/sched/fair.c 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/sched/fair.c 2013-08-26 14:23:08.879120335 -0500
@@ -5057,7 +5057,7 @@ static int load_balance(int this_cpu, st
struct sched_group *group;
struct rq *busiest;
unsigned long flags;
- struct cpumask *cpus = __get_cpu_var(load_balance_mask);
+ struct cpumask *cpus = this_cpu_ptr(load_balance_mask);

struct lb_env env = {
.sd = sd,
Index: linux/kernel/sched/rt.c
===================================================================
--- linux.orig/kernel/sched/rt.c 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/sched/rt.c 2013-08-26 14:23:08.883120293 -0500
@@ -1389,7 +1389,7 @@ static DEFINE_PER_CPU(cpumask_var_t, loc
static int find_lowest_rq(struct task_struct *task)
{
struct sched_domain *sd;
- struct cpumask *lowest_mask = __get_cpu_var(local_cpu_mask);
+ struct cpumask *lowest_mask = this_cpu_ptr(local_cpu_mask);
int this_cpu = smp_processor_id();
int cpu = task_cpu(task);

Index: linux/kernel/sched/sched.h
===================================================================
--- linux.orig/kernel/sched/sched.h 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/sched/sched.h 2013-08-26 14:23:08.883120293 -0500
@@ -538,10 +538,10 @@ static inline int cpu_of(struct rq *rq)
DECLARE_PER_CPU(struct rq, runqueues);

#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
-#define this_rq() (&__get_cpu_var(runqueues))
+#define this_rq() this_cpu_ptr(&runqueues)
#define task_rq(p) cpu_rq(task_cpu(p))
#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
-#define raw_rq() (&__raw_get_cpu_var(runqueues))
+#define raw_rq() __this_cpu_ptr(&runqueues)

static inline u64 rq_clock(struct rq *rq)
{
Index: linux/kernel/user-return-notifier.c
===================================================================
--- linux.orig/kernel/user-return-notifier.c 2013-08-26 14:21:39.000000000 -0500
+++ linux/kernel/user-return-notifier.c 2013-08-26 14:23:08.883120293 -0500
@@ -14,7 +14,7 @@ static DEFINE_PER_CPU(struct hlist_head,
void user_return_notifier_register(struct user_return_notifier *urn)
{
set_tsk_thread_flag(current, TIF_USER_RETURN_NOTIFY);
- hlist_add_head(&urn->link, &__get_cpu_var(return_notifier_list));
+ hlist_add_head(&urn->link, this_cpu_ptr(&return_notifier_list));
}
EXPORT_SYMBOL_GPL(user_return_notifier_register);

@@ -25,7 +25,7 @@ EXPORT_SYMBOL_GPL(user_return_notifier_r
void user_return_notifier_unregister(struct user_return_notifier *urn)
{
hlist_del(&urn->link);
- if (hlist_empty(&__get_cpu_var(return_notifier_list)))
+ if (hlist_empty(this_cpu_ptr(&return_notifier_list)))
clear_tsk_thread_flag(current, TIF_USER_RETURN_NOTIFY);
}
EXPORT_SYMBOL_GPL(user_return_notifier_unregister);


2013-08-29 07:58:29

by Peter Zijlstra

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Wed, Aug 28, 2013 at 07:48:14PM +0000, Christoph Lameter wrote:
> Transformations done to __get_cpu_var()
>
>
> 1. Determine the address of the percpu instance of the current processor.
>
> DEFINE_PER_CPU(int, y);
> int *x = &__get_cpu_var(y);
>
> Converts to
>
> int *x = this_cpu_ptr(&y);
>
>
> 2. Same as #1 but this time an array structure is involved.
>
> DEFINE_PER_CPU(int, y[20]);
> int *x = __get_cpu_var(y);
>
> Converts to
>
> int *x = this_cpu_ptr(y);
>
>
> 3. Retrieve the content of the current processor's instance of a per cpu variable.
>
> DEFINE_PER_CPU(int, y);
> int x = __get_cpu_var(y);
>
> Converts to
>
> int x = __this_cpu_read(y);
>

This loses a preemption debug check, so NAK

> 4. Retrieve the content of a percpu struct
>
> DEFINE_PER_CPU(struct mystruct, y);
> struct mystruct x = __get_cpu_var(y);
>
> Converts to
>
> memcpy(&x, this_cpu_ptr(&y), sizeof(x));
>
> 5. Assignment to a per cpu variable
>
> DEFINE_PER_CPU(int, y)
> __get_cpu_var(y) = x;
>
> Converts to
>
> this_cpu_write(y, x);
>

This too loses a preemption debug check, NAK

> 6. Increment/Decrement etc of a per cpu variable
>
> DEFINE_PER_CPU(int, y);
> __get_cpu_var(y)++
>
> Converts to
>
> this_cpu_inc(y)
>

Lo and behold.. no preemption checks again.


Seriously first fix the debug and validation bits of the *this_cpu*
stuff.

2013-08-29 10:01:50

by Ingo Molnar

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses


* Peter Zijlstra <[email protected]> wrote:

> On Wed, Aug 28, 2013 at 07:48:14PM +0000, Christoph Lameter wrote:
> > Transformations done to __get_cpu_var()
> >
> >
> > 1. Determine the address of the percpu instance of the current processor.
> >
> > DEFINE_PER_CPU(int, y);
> > int *x = &__get_cpu_var(y);
> >
> > Converts to
> >
> > int *x = this_cpu_ptr(&y);
> >
> >
> > 2. Same as #1 but this time an array structure is involved.
> >
> > DEFINE_PER_CPU(int, y[20]);
> > int *x = __get_cpu_var(y);
> >
> > Converts to
> >
> > int *x = this_cpu_ptr(y);
> >
> >
> > 3. Retrieve the content of the current processor's instance of a per cpu variable.
> >
> > DEFINE_PER_CPU(int, y);
> > int x = __get_cpu_var(y);
> >
> > Converts to
> >
> > int x = __this_cpu_read(y);
> >
>
> This loses a preemption debug check, so NAK
>
> > 4. Retrieve the content of a percpu struct
> >
> > DEFINE_PER_CPU(struct mystruct, y);
> > struct mystruct x = __get_cpu_var(y);
> >
> > Converts to
> >
> > memcpy(&x, this_cpu_ptr(&y), sizeof(x));
> >
> > 5. Assignment to a per cpu variable
> >
> > DEFINE_PER_CPU(int, y)
> > __get_cpu_var(y) = x;
> >
> > Converts to
> >
> > this_cpu_write(y, x);
> >
>
> This too loses a preemption debug check, NAK
>
> > 6. Increment/Decrement etc of a per cpu variable
> >
> > DEFINE_PER_CPU(int, y);
> > __get_cpu_var(y)++
> >
> > Converts to
> >
> > this_cpu_inc(y)
> >
>
> Lo and behold.. no preemption checks again.
>
>
> Seriously first fix the debug and validation bits of the *this_cpu*
> stuff.

Note that most of the other 'gcv' patches have these problems as well, so
it's a NAK from me for most of the other patches too ...

Thanks,

Ingo

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Thu, 29 Aug 2013, Ingo Molnar wrote:

>
> >
> >
> > Seriously first fix the debug and validation bits of the *this_cpu*
> > stuff.
>
> Note that most of the other 'gcv' patches have these problems as well, so
> it's a NAK from me for most of the other patches too ...

Note that this only affects __this_cpu_read and __this_cpu_write not the
this_cpu_ptr() operation.

The objection against having other variants of this_cpu operations before
was that there were too many. If we want to reintroduce the preemption
checks in the __ operations then we would need another variant for those
places that do not need it.

Right now we only have the regular ops which are interrupt safe and the
unsafe variant that can be used anyplace.

We could add a ____this_cpu variant that would be used in the cases we do
not want preemption checks? There should not be too many but it will
mean a whole lot of new definitions in percpu.h.

2013-08-29 17:33:02

by Steven Rostedt

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Thu, Aug 29, 2013 at 04:57:43PM +0000, Christoph Lameter wrote:
>
> We could add a ____this_cpu variant that would be used in the cases we do
> not want preemption checks? There should not be too many but it will
> mean a whole lot of new definitions in percpu.h.

Let's get away from underscores as they are meaningless.

A this_cpu_atomic() or other descriptive name would be much more
appropriate.

-- Steve

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Thu, 29 Aug 2013, Steven Rostedt wrote:

> On Thu, Aug 29, 2013 at 04:57:43PM +0000, Christoph Lameter wrote:
> >
> > We could add a ____this_cpu variant that would be used in the cases we do
> > not want preemption checks? There should not be too many but it will
> > mean a whole lot of new definitions in percpu.h.
>
> Let's get away from underscores as they are meaningless.
>
> A this_cpu_atomic() or other descriptive name would be much more
> appropriate.

It's not really an atomic operation in the classic sense.

this_cpu_no_preempt_check_read ?

The problem that I have is also that a kernel with preemption is not
something that I see anywhere these days. Looks more like an academic
exercise? Does this really matter? All the distros I see use
PREEMPT_VOLUNTARY. Performance degradation is significant if massive
amounts of checks and preempt disable/enable points are added to the
kernel.

Do we agree that it is necessary and useful to add another variant of
this_cpu ops for this? The concern of having too many variants is no
longer there? Adding another variant is not that difficult just code
intensive.

2013-08-29 18:30:57

by Steven Rostedt

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Thu, 29 Aug 2013 18:15:43 +0000
Christoph Lameter <[email protected]> wrote:

> On Thu, 29 Aug 2013, Steven Rostedt wrote:
>
> It's not really an atomic operation in the classic sense.

It doesn't need to be atomic, it could mean it is used within atomic
locations. Basically, "can't be interrupted here". I just said
"something like", it didn't even need to be that.

>
> this_cpu_no_preempt_check_read ?

I would make it much shorter. You could use "raw_this_cpu_read()",
which usually means "no checks here". Or, "this_cpu_read_nopreempt()".

>
> The problem that I have is also that a kernel with preemption is not
> something that I see anywhere these days. Looks more like an academic
> exercise? Does this really matter? All the distros I see use

Um, my paycheck depends on PREEMPT_RT working. And there's a lot of
interest in real PREEMPT by audio folks. It's no more an
academic exercise than people wanting really low kernel latency.

> PREEMPT_VOLUNTARY. Performance degradation is significant if massive
> amounts of checks and preempt disable/enable points are added to the
> kernel.

They are usually disabled for production systems. But we run a bunch of
tests with the debug checks enabled, which catch bugs before we ship a
kernel for a production system.

>
> Do we agree that it is necessary and useful to add another variant of
> this_cpu ops for this? The concern of having too many variants is no
> longer there? Adding another variant is not that difficult just code
> intensive.

How many places use the this_cpu_*() without preemption disabled? I
wouldn't think there's many. I never complained about another variant,
so you need to ask those that have. The tough question for me is what
that variant name should be ;-)

-- Steve

2013-08-30 06:54:24

by Ingo Molnar

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses


* Christoph Lameter <[email protected]> wrote:

> On Thu, 29 Aug 2013, Steven Rostedt wrote:
>
> > On Thu, Aug 29, 2013 at 04:57:43PM +0000, Christoph Lameter wrote:
> > >
> > > We could add a ____this_cpu variant that would be used in the cases we do
> > > not want preemption checks? There should not be too many but it will
> > > mean a whole lot of new definitions in percpu.h.
> >
> > Let's get away from underscores as they are meaningless.
> >
> > A this_cpu_atomic() or other descriptive name would be much more
> > appropriate.
>
> It's not really an atomic operation in the classic sense.
>
> this_cpu_no_preempt_check_read ?
>
> The problem that I have is also that a kernel with preemption is not
> something that I see anywhere these days. Looks more like an academic
> exercise? Does this really matter? All the distros I see use
> PREEMPT_VOLUNTARY. Performance degradation is significant if massive
> amounts of checks and preempt disable/enable points are added to the
> kernel.
>
> Do we agree that it is necessary and useful to add another variant of
> this_cpu ops for this? The concern of having too many variants is no
> longer there? Adding another variant is not that difficult just code
> intensive.

Just stop the lame excuses and fix it already. This has come up in the
past and you know it: you were told to fix the this_cpu debug checks by
Linus as well, yet you didn't ... Don't send crap you know is broken.

Thanks,

Ingo

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Thu, 29 Aug 2013, Steven Rostedt wrote:

> How many places use the this_cpu_*() without preemption disabled? I
> wouldn't think there's many. I never complained about another variant,
> so you need to ask those that have. The tough question for me is what
> that variant name should be ;-)

Tried to add preemption checks but the basic issue is that many of the
checks themselves use this_cpu_ops. percpu.h is very basic to the
operation of fundamental primitives for preempt etc. Use of a BUG_ON needs
a series of includes in percpu.h that cause more trouble.

If I switch __this_cpu ops to check for preemption then the logic for
preemption etc must use the raw_this_cpu ops.

2013-09-03 14:45:48

by Frederic Weisbecker

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

2013/9/3 Christoph Lameter <[email protected]>:
> On Thu, 29 Aug 2013, Steven Rostedt wrote:
>
>> How many places use the this_cpu_*() without preemption disabled? I
>> wouldn't think there's many. I never complained about another variant,
>> so you need to ask those that have. The tough question for me is what
>> that variant name should be ;-)
>
> Tried to add preemption checks but the basic issue is that many of the
> checks themselves use this_cpu_ops. percpu.h is very basic to the
> operation of fundamental primitives for preempt etc. Use of a BUG_ON needs
> a series of includes in percpu.h that cause more trouble.
>
> If I switch __this_cpu ops to check for preemption then the logic for
> preemption etc must use the raw_this_cpu ops.

IIUC the issue is that preempt debug checks themselves use per cpu
operations that can result in preempt debug checks? Hence a recursion.
Do you have an example of that?

Also in this case this must be fixed anyway given the checks that
already exist in smp_processor_id(), __get_cpu_var(), ...

2013-09-03 15:45:00

by Steven Rostedt

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Tue, 3 Sep 2013 16:45:45 +0200
Frederic Weisbecker <[email protected]> wrote:

> 2013/9/3 Christoph Lameter <[email protected]>:
> > On Thu, 29 Aug 2013, Steven Rostedt wrote:
> >
> >> How many places use the this_cpu_*() without preemption disabled? I
> >> wouldn't think there's many. I never complained about another variant,
> >> so you need to ask those that have. The tough question for me is what
> >> that variant name should be ;-)
> >
> > Tried to add preemption checks but the basic issue is that many of the
> > checks themselves use this_cpu_ops. percpu.h is very basic to the
> > operation of fundamental primitives for preempt etc. Use of a BUG_ON needs
> > a series of includes in percpu.h that cause more trouble.
> >
> > If I switch __this_cpu ops to check for preemption then the logic for
> > preemption etc must use the raw_this_cpu ops.
>
> IIUC the issue is that preempt debug checks themselves use per cpu
> operations that can result in preempt debug checks? Hence a recursion.
> Do you have an example of that?
>
> Also in this case this must be fixed anyway given the checks that
> already exist in smp_processor_id(), __get_cpu_var(), ...

Right, that's why there's a raw_smp_processor_id() and
__raw_get_cpu_var(). Those two are the ones without checks, and they
are called by the non "raw" versions after the check is done.

Really, what's so damn hard about this?

-- Steve
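
The layering described in the message above (a checked accessor that runs a
debug check and then calls the raw one) can be sketched in a user-space model.
This is an illustration under stated assumptions and not the kernel's
implementation: the model_* and raw_model_* names, areas[], current_cpu and
debug_warnings are hypothetical, a failed check only bumps a counter instead of
warning, and __typeof__ assumes GCC or Clang. The check reads its own per-cpu
state through the raw accessor, which is how the recursion raised earlier in
the thread is avoided in the model.

```c
#include <stddef.h>

#define NR_CPUS 4

struct percpu_area {
	int preempt_count;	/* > 0: preemption is disabled on this CPU */
	int counter;		/* an ordinary per-cpu variable */
};

static struct percpu_area areas[NR_CPUS];
static int current_cpu;		/* hypothetical stand-in for the running CPU */
static int debug_warnings;	/* number of failed debug checks so far */

#define pcpu_preempt_count (areas[0].preempt_count)
#define pcpu_counter       (areas[0].counter)

#define model_this_cpu_ptr(ptr) \
	((__typeof__(*(ptr)) *)((char *)(ptr) + \
		(size_t)current_cpu * sizeof(struct percpu_area)))

/* Raw variant: no checks at all, usable from any context. */
#define raw_model_cpu_read(var) (*model_this_cpu_ptr(&(var)))

/*
 * Debug check. It must use the raw variant for its own per-cpu state,
 * otherwise the checked read below would call back into the check.
 */
static void check_preemption_disabled(void)
{
	if (raw_model_cpu_read(pcpu_preempt_count) == 0)
		debug_warnings++;
}

/* Checked variant: run the debug check first, then do the raw access. */
#define model_cpu_read(var) \
	(check_preemption_disabled(), raw_model_cpu_read(var))

int read_counter_checked(void) { return model_cpu_read(pcpu_counter); }
int read_counter_raw(void) { return raw_model_cpu_read(pcpu_counter); }
```

In the model, a checked read with preempt_count at zero increments
debug_warnings once, while the raw read never does.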

Subject: Re: [gcv v3 06/35] scheduler: Replace __get_cpu_var uses

On Tue, 3 Sep 2013, Steven Rostedt wrote:

> Right, that's why there's a raw_smp_processor_id() and
> __raw_get_cpu_var(). Those two are the ones without checks, and they
> are called by the non "raw" versions after the check is done.
>
> Really, what's so damn hard about this?

Well you tried it before as far as I can recall. Just came back from Labor
Day. Should have something soon.