2013-05-12 08:17:55

by Mike Galbraith

Subject: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

Greetings,

Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
accrue zero utime/stime. However, the same exact kernel on E5620 box
works fine, so it would appear there's a CPU dependency somewhere.

Is core2 expected to go dysfunctional with context tracking enabled?
CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
boxen only, same exact kernel continues to work just fine on E5620
(Westmere) box.

-Mike

marge:/usr/local/src/kernel/linux-3.9 # egrep 'NO_HR|CPU_ACCOUNTING|RCU|CONTEXT' .config
CONFIG_VIRT_CPU_ACCOUNTING=y
# CONFIG_TICK_CPU_ACCOUNTING is not set
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
# RCU Subsystem
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_CONTEXT_TRACKING=y
# CONFIG_RCU_USER_QS is not set
CONFIG_CONTEXT_TRACKING_FORCE=y
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_RCU_FAST_NO_HZ is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_NOCB_CPU is not set
CONFIG_HAVE_CONTEXT_TRACKING=y
# RCU Debugging
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_CPU_STALL_INFO is not set
# CONFIG_RCU_TRACE is not set
CONFIG_CONTEXT_SWITCH_TRACER=y


2013-05-14 00:57:51

by Frederic Weisbecker

Subject: Re: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

On Sun, May 12, 2013 at 10:17:49AM +0200, Mike Galbraith wrote:
> Greetings,
>
> Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
> accrue zero utime/stime. However, the same exact kernel on E5620 box
> works fine, so it would appear there's a CPU dependency somewhere.

Ah indeed, I just managed to reproduce the same issue.

>
> Is core2 expected to go dysfunctional with context tracking enabled?
> CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
> CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
> boxen only, same exact kernel continues to work just fine on E5620
> (Westmere) box.

There was no known issue with core2. The box where I'm seeing it
is a Phenom quad core that had NR_CPUS=2. Maybe the issue is more
likely to happen with this low number. I don't know.

I'm investigating further.

Thanks.

2013-05-14 07:37:31

by Mike Galbraith

Subject: Re: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

On Tue, 2013-05-14 at 02:57 +0200, Frederic Weisbecker wrote:
> On Sun, May 12, 2013 at 10:17:49AM +0200, Mike Galbraith wrote:
> > Greetings,
> >
> > Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
> > accrue zero utime/stime. However, the same exact kernel on E5620 box
> > works fine, so it would appear there's a CPU dependency somewhere.
>
> Ah indeed, I just managed to reproduce the same issue.
>
> >
> > Is core2 expected to go dysfunctional with context tracking enabled?
> > CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
> > CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
> > boxen only, same exact kernel continues to work just fine on E5620
> > (Westmere) box.
>
> There was no known issue with core2. The box where I'm seeing it
> is a Phenom quad core that had NR_CPUS=2. Maybe the issue is more
> likely to happen with this low number. I don't know.
>
> I'm investigating further.

Me too.

bash-6023 [001] d... 290.494214: vtime_delta: clock: 289702961236 vtime_snap: 290493017701

Always: clock trails vtime_snap, so vtime_delta() returns 0. Not good.

I see..

current->vtime_snap = sched_clock();

and..

clock = local_clock();

Things that make ya go hmm. The below "fixes" it (not).

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index cc2dc3ee..3133665 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -634,14 +634,17 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
 #endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */

 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-static unsigned long long vtime_delta(struct task_struct *tsk)
+static noinline unsigned long long vtime_delta(struct task_struct *tsk)
 {
 	unsigned long long clock;

-	clock = local_clock();
+//	clock = local_clock();
+	clock = sched_clock();
+	trace_printk("clock: %Lu vtime_snap: %Lu\n", clock, tsk->vtime_snap);
 	if (clock < tsk->vtime_snap)
 		return 0;

+	trace_printk("clock: %Lu vtime_snap: %Lu returns :%Lu\n", clock, tsk->vtime_snap, clock - tsk->vtime_snap);
 	return clock - tsk->vtime_snap;
 }
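
For reference, the numbers in the trace line above can be checked in user space. The sketch below is not kernel code: vtime_delta_model() and snap_lead_ns() are hypothetical names that only mirror the `clock < vtime_snap` guard shown in the diff, and the constants in print_trace_check() are copied from the trace line.

```c
#include <stdio.h>

/*
 * Hypothetical user-space model of the guard in vtime_delta().
 * It returns 0 whenever the clock reading is behind the snapshot.
 */
unsigned long long vtime_delta_model(unsigned long long clock,
				     unsigned long long vtime_snap)
{
	if (clock < vtime_snap)
		return 0;
	return clock - vtime_snap;
}

/* How far the snapshot leads the clock reading, or 0 if it does not lead. */
unsigned long long snap_lead_ns(unsigned long long clock,
				unsigned long long vtime_snap)
{
	return vtime_snap > clock ? vtime_snap - clock : 0;
}

/* Prints both figures for the values quoted in the trace line. */
void print_trace_check(void)
{
	unsigned long long clock = 289702961236ULL;
	unsigned long long vtime_snap = 290493017701ULL;

	printf("snapshot leads clock by %llu ns\n",
	       snap_lead_ns(clock, vtime_snap));
	printf("modelled delta: %llu ns\n",
	       vtime_delta_model(clock, vtime_snap));
}
```

Calling print_trace_check() from a main() shows the snapshot leading the clock by roughly 0.79 s, which is why the guard keeps returning 0 and no utime/stime is accounted.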



2013-05-14 14:07:27

by Mike Galbraith

Subject: Re: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

On Tue, 2013-05-14 at 02:57 +0200, Frederic Weisbecker wrote:
> On Sun, May 12, 2013 at 10:17:49AM +0200, Mike Galbraith wrote:
> > Greetings,
> >
> > Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
> > accrue zero utime/stime. However, the same exact kernel on E5620 box
> > works fine, so it would appear there's a CPU dependency somewhere.
>
> Ah indeed, I just managed to reproduce the same issue.
>
> >
> > Is core2 expected to go dysfunctional with context tracking enabled?
> > CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
> > CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
> > boxen only, same exact kernel continues to work just fine on E5620
> > (Westmere) box.
>
> There was no known issue with core2. The box where I'm seeing it
> is a Phenom quad core that had NR_CPUS=2. Maybe the issue is more
> likely to happen with this low number. I don't know.
>
> I'm investigating further.

So with CONFIG_HAVE_UNSTABLE_SCHED_CLOCK, you can't mix sched_clock()
(pure tsc) with local_clock()/sched_clock_cpu(cpu). The former is
always quite a bit ahead of the latter, so mixing clocks is a no-go on
crusty old (but beloved) core2 box.

-Mike

2013-05-15 00:26:57

by Frederic Weisbecker

Subject: Re: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

On Tue, May 14, 2013 at 04:07:20PM +0200, Mike Galbraith wrote:
> On Tue, 2013-05-14 at 02:57 +0200, Frederic Weisbecker wrote:
> > On Sun, May 12, 2013 at 10:17:49AM +0200, Mike Galbraith wrote:
> > > Greetings,
> > >
> > > Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
> > > accrue zero utime/stime. However, the same exact kernel on E5620 box
> > > works fine, so it would appear there's a CPU dependency somewhere.
> >
> > Ah indeed, I just managed to reproduce the same issue.
> >
> > >
> > > Is core2 expected to go dysfunctional with context tracking enabled?
> > > CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
> > > CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
> > > boxen only, same exact kernel continues to work just fine on E5620
> > > (Westmere) box.
> >
> > There was no known issue with core2. The box where I'm seeing it
> > is a Phenom quad core that had NR_CPUS=2. Maybe the issue is more
> > likely to happen with this low number. I don't know.
> >
> > I'm investigating further.
>
> So with CONFIG_HAVE_UNSTABLE_SCHED_CLOCK, you can't mix sched_clock()
> (pure tsc) with local_clock()/sched_clock_cpu(cpu). The former is
> always quite a bit ahead of the latter, so mixing clocks is a no-go on
> crusty old (but beloved) core2 box.

Right, I have the same issue. So let's use local_clock() everywhere here;
it takes care of unstable tsc.

Does the following fix the issue for you?

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index cc2dc3e..1ce322f 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -747,7 +748,7 @@ void arch_vtime_task_switch(struct task_struct *prev)

 	write_seqlock(&current->vtime_seqlock);
 	current->vtime_snap_whence = VTIME_SYS;
-	current->vtime_snap = sched_clock();
+	current->vtime_snap = local_clock();
 	write_sequnlock(&current->vtime_seqlock);
 }

@@ -757,7 +758,7 @@ void vtime_init_idle(struct task_struct *t)

 	write_seqlock_irqsave(&t->vtime_seqlock, flags);
 	t->vtime_snap_whence = VTIME_SYS;
-	t->vtime_snap = sched_clock();
+	t->vtime_snap = local_clock();
 	write_sequnlock_irqrestore(&t->vtime_seqlock, flags);
 }
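
To illustrate why taking the snapshot and the later reading from the same clock matters, here is a rough user-space model. Everything in it is an assumption made for illustration: struct clock_model, model_sched_clock(), model_local_clock() and accumulate() are hypothetical names, and the per-CPU clock is modelled as trailing the raw clock by a constant lag, which is a simplification of what the thread describes.

```c
#include <stdio.h>

/* Hypothetical model: a raw clock plus a per-CPU clock that trails it. */
struct clock_model {
	unsigned long long raw_ns;	/* stands in for sched_clock() */
	unsigned long long lag_ns;	/* assumed constant lag of the per-CPU clock */
};

unsigned long long model_sched_clock(const struct clock_model *m)
{
	return m->raw_ns;
}

unsigned long long model_local_clock(const struct clock_model *m)
{
	return m->raw_ns - m->lag_ns;
}

/*
 * Accumulate accounted time over n intervals of step_ns each.
 * mixed != 0: snapshot from the raw clock, reading from the per-CPU clock.
 * mixed == 0: both snapshot and reading from the per-CPU clock.
 */
unsigned long long accumulate(int mixed, unsigned long long lag_ns,
			      unsigned long long step_ns, int n)
{
	struct clock_model m = { .raw_ns = 1000000000ULL, .lag_ns = lag_ns };
	unsigned long long total = 0;
	int i;

	for (i = 0; i < n; i++) {
		unsigned long long snap, now;

		snap = mixed ? model_sched_clock(&m) : model_local_clock(&m);
		m.raw_ns += step_ns;		/* time passes */
		now = model_local_clock(&m);

		/* Same guard as vtime_delta(): never go negative. */
		total += now < snap ? 0 : now - snap;
	}
	return total;
}
```

In this model, accumulate(1, 790056465ULL, 1000000ULL, 10) yields 0 because each 1 ms interval is smaller than the lag, while accumulate(0, 790056465ULL, 1000000ULL, 10) yields the full 10 ms. When the interval exceeds the lag, the mixed variant under-accounts rather than reporting nothing.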

2013-05-15 04:09:20

by Mike Galbraith

Subject: Re: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

On Wed, 2013-05-15 at 02:26 +0200, Frederic Weisbecker wrote:
> On Tue, May 14, 2013 at 04:07:20PM +0200, Mike Galbraith wrote:
> > On Tue, 2013-05-14 at 02:57 +0200, Frederic Weisbecker wrote:
> > > On Sun, May 12, 2013 at 10:17:49AM +0200, Mike Galbraith wrote:
> > > > Greetings,
> > > >
> > > > Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
> > > > accrue zero utime/stime. However, the same exact kernel on E5620 box
> > > > works fine, so it would appear there's a CPU dependency somewhere.
> > >
> > > Ah indeed, I just managed to reproduce the same issue.
> > >
> > > >
> > > > Is core2 expected to go dysfunctional with context tracking enabled?
> > > > CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
> > > > CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
> > > > boxen only, same exact kernel continues to work just fine on E5620
> > > > (Westmere) box.
> > >
> > > There was no known issue with core2. The box where I'm seeing it
> > > is a Phenom quad core that had NR_CPUS=2. Maybe the issue is more
> > > likely to happen with this low number. I don't know.
> > >
> > > I'm investigating further.
> >
> > So with CONFIG_HAVE_UNSTABLE_SCHED_CLOCK, you can't mix sched_clock()
> > (pure tsc) with local_clock()/sched_clock_cpu(cpu). The former is
> > always quite a bit ahead of the latter, so mixing clocks is a no-go on
> > crusty old (but beloved) core2 box.
>
> Right, I have the same issue. So let's use local_clock() everywhere here;
> it takes care of unstable tsc.
>
> Does the following fix the issue for you?

Yeah, both can use sched_clock_cpu() instead though.

> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index cc2dc3e..1ce322f 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -747,7 +748,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
>
>  	write_seqlock(&current->vtime_seqlock);
>  	current->vtime_snap_whence = VTIME_SYS;
> -	current->vtime_snap = sched_clock();
> +	current->vtime_snap = local_clock();
>  	write_sequnlock(&current->vtime_seqlock);
>  }
>
> @@ -757,7 +758,7 @@ void vtime_init_idle(struct task_struct *t)
>
>  	write_seqlock_irqsave(&t->vtime_seqlock, flags);
>  	t->vtime_snap_whence = VTIME_SYS;
> -	t->vtime_snap = sched_clock();
> +	t->vtime_snap = local_clock();
>  	write_sequnlock_irqrestore(&t->vtime_seqlock, flags);
>  }
>

2013-05-15 16:05:58

by Frederic Weisbecker

Subject: Re: dynticks: CONFIG_VIRT_CPU_ACCOUNTING + CONFIG_CONTEXT_TRACKING breaks accounting on core2 CPUs only

On Wed, May 15, 2013 at 06:09:15AM +0200, Mike Galbraith wrote:
> On Wed, 2013-05-15 at 02:26 +0200, Frederic Weisbecker wrote:
> > On Tue, May 14, 2013 at 04:07:20PM +0200, Mike Galbraith wrote:
> > > On Tue, 2013-05-14 at 02:57 +0200, Frederic Weisbecker wrote:
> > > > On Sun, May 12, 2013 at 10:17:49AM +0200, Mike Galbraith wrote:
> > > > > Greetings,
> > > > >
> > > > > Turning on new NO_HZ feature on my Q6600 box in master, I see that tasks
> > > > > accrue zero utime/stime. However, the same exact kernel on E5620 box
> > > > > works fine, so it would appear there's a CPU dependency somewhere.
> > > >
> > > > Ah indeed, I just managed to reproduce the same issue.
> > > >
> > > > >
> > > > > Is core2 expected to go dysfunctional with context tracking enabled?
> > > > > CONFIG_VIRT_CPU_ACCOUNTING alone works fine in 3.9-stable, turn on
> > > > > CONFIG_CONTEXT_TRACKING_FORCE, and CPU accounting stops working on core2
> > > > > boxen only, same exact kernel continues to work just fine on E5620
> > > > > (Westmere) box.
> > > >
> > > > There was no known issue with core2. The box where I'm seeing it
> > > > is a Phenom quad core that had NR_CPUS=2. Maybe the issue is more
> > > > likely to happen with this low number. I don't know.
> > > >
> > > > I'm investigating further.
> > >
> > > So with CONFIG_HAVE_UNSTABLE_SCHED_CLOCK, you can't mix sched_clock()
> > > (pure tsc) with local_clock()/sched_clock_cpu(cpu). The former is
> > > always quite a bit ahead of the latter, so mixing clocks is a no-go on
> > > crusty old (but beloved) core2 box.
> >
> > Right, I have the same issue. So let's use local_clock() everywhere here;
> > it takes care of unstable tsc.
> >
> > Does the following fix the issue for you?
>
> Yeah, both can use sched_clock_cpu() instead though.

Right, given that irqs are already disabled. I'm preparing the patch.

Thanks!