2013-05-08 20:30:49

by Dave Jones

[permalink] [raw]
Subject: rcu_preempt running flat out on idle desktop.

On what should be an idle desktop, I'm seeing really strange things.

The XFCE CPU meter shows both cores are running with no idle time.

here's the top of the top(1) output..

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop

There are a lot of processes allegedly using "200%" of CPU time, a handful
in the "196%" range, and then a bunch at 0.

rcu_preempt is staying at the top of the list while everything else bubbles
up the list.

And what's up with those TIME entries? Box has only been up a few hours,
so those numbers look really crazy.

RCU options for this build are..

CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_USER_QS=y
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_RCU_FAST_NO_HZ is not set
CONFIG_TREE_RCU_TRACE=y
CONFIG_RCU_BOOST=y
CONFIG_RCU_BOOST_PRIO=1
CONFIG_RCU_BOOST_DELAY=500
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
CONFIG_PROVE_RCU=y
# CONFIG_PROVE_RCU_REPEATEDLY is not set
CONFIG_PROVE_RCU_DELAY=y
CONFIG_SPARSE_RCU_POINTER=y
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
CONFIG_RCU_CPU_STALL_VERBOSE=y
# CONFIG_RCU_CPU_STALL_INFO is not set
CONFIG_RCU_TRACE=y

Dave


2013-05-08 20:53:19

by Paul E. McKenney

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:
> On what should be an idle desktop, I'm seeing really strange things.
>
> The XFCE CPU meter shows both cores are running with no idle time.
>
> here's the top of the top(1) output..
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
>
> There are a lot of processes allegedly using "200%" of CPU time, a handful
> in the "196%" range, and then a bunch at 0.

Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
CPU time to accumulate in a few short hours.

> rcu_preempt is staying at the top of the list while everything else bubbles
> up the list.
>
> And what's up with those TIME entries? Box has only been up a few hours,
> so those numbers look really crazy.
>
> RCU options for this build are..
>
> CONFIG_TREE_PREEMPT_RCU=y
> CONFIG_PREEMPT_RCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_USER_QS=y
> CONFIG_RCU_FANOUT=64
> CONFIG_RCU_FANOUT_LEAF=16
> # CONFIG_RCU_FANOUT_EXACT is not set
> # CONFIG_RCU_FAST_NO_HZ is not set
> CONFIG_TREE_RCU_TRACE=y
> CONFIG_RCU_BOOST=y
> CONFIG_RCU_BOOST_PRIO=1
> CONFIG_RCU_BOOST_DELAY=500
> CONFIG_RCU_NOCB_CPU=y
> CONFIG_RCU_NOCB_CPU_ALL=y
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> CONFIG_PROVE_RCU=y
> # CONFIG_PROVE_RCU_REPEATEDLY is not set
> CONFIG_PROVE_RCU_DELAY=y
> CONFIG_SPARSE_RCU_POINTER=y
> # CONFIG_RCU_TORTURE_TEST is not set
> CONFIG_RCU_CPU_STALL_TIMEOUT=60
> CONFIG_RCU_CPU_STALL_VERBOSE=y
> # CONFIG_RCU_CPU_STALL_INFO is not set
> CONFIG_RCU_TRACE=y

This is 3.9, or Linus's current tree? I am guessing the latter, but
figured I should ask.

Thanx, Paul

2013-05-08 20:56:40

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Wed, May 08, 2013 at 01:52:58PM -0700, Paul E. McKenney wrote:
> On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:

> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> > 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> > 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> > 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> > 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> > 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> > 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
> >
> > There are a lot of processes allegedly using "200%" of CPU time, a handful
> > in the "196%" range, and then a bunch at 0.
>
> Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
> CPU time to accumulate in a few short hours.

perf top doesn't show anything out of the ordinary. Just as an idle desktop
should behave, it's spending a bunch of time in delay_tsc.

> This is 3.9, or Linus's current tree? I am guessing the latter, but
> figured I should ask.

Yeah, the latter. (v3.9-11572-g5af43c2)

Dave

2013-05-08 21:16:50

by Paul E. McKenney

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Wed, May 08, 2013 at 04:56:33PM -0400, Dave Jones wrote:
> On Wed, May 08, 2013 at 01:52:58PM -0700, Paul E. McKenney wrote:
> > On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:
>
> > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > > 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> > > 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> > > 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> > > 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> > > 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> > > 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> > > 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
> > >
> > > There are a lot of processes allegedly using "200%" of CPU time, a handful
> > > in the "196%" range, and then a bunch at 0.
> >
> > Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
> > CPU time to accumulate in a few short hours.
>
> perf top doesn't show anything out of the ordinary. Just as an idle desktop
> should behave, it's spending a bunch of time in delay_tsc.
>
> > This is 3.9, or Linus's current tree? I am guessing the latter, but
> > figured I should ask.
>
> Yeah, the latter. (v3.9-11572-g5af43c2)

Do you have CONFIG_NO_HZ_FULL=y? We just might have a bug in tickless
CPU-time accounting...

Thanx, Paul

2013-05-08 21:18:14

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Wed, May 08, 2013 at 02:16:35PM -0700, Paul E. McKenney wrote:
> On Wed, May 08, 2013 at 04:56:33PM -0400, Dave Jones wrote:
> > On Wed, May 08, 2013 at 01:52:58PM -0700, Paul E. McKenney wrote:
> > > On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:
> >
> > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > > > 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> > > > 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> > > > 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> > > > 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> > > > 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> > > > 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> > > > 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
> > > >
> > > > There are a lot of processes allegedly using "200%" of CPU time, a handful
> > > > in the "196%" range, and then a bunch at 0.
> > >
> > > Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
> > > CPU time to accumulate in a few short hours.
> >
> > perf top doesn't show anything out of the ordinary. Just as an idle desktop
> > should behave, it's spending a bunch of time in delay_tsc.
> >
> > > This is 3.9, or Linus's current tree? I am guessing the latter, but
> > > figured I should ask.
> >
> > Yeah, the latter. (v3.9-11572-g5af43c2)
>
> Do you have CONFIG_NO_HZ_FULL=y? We just might have a bug in tickless
> CPU-time accounting...

I never saw a new config option I didn't like.

CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=y

Dave

2013-05-08 22:20:35

by Paul E. McKenney

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Wed, May 08, 2013 at 05:17:59PM -0400, Dave Jones wrote:
> On Wed, May 08, 2013 at 02:16:35PM -0700, Paul E. McKenney wrote:
> > On Wed, May 08, 2013 at 04:56:33PM -0400, Dave Jones wrote:
> > > On Wed, May 08, 2013 at 01:52:58PM -0700, Paul E. McKenney wrote:
> > > > On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:
> > >
> > > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > > > > 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> > > > > 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> > > > > 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> > > > > 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> > > > > 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> > > > > 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> > > > > 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
> > > > >
> > > > > There are a lot of processes allegedly using "200%" of CPU time, a handful
> > > > > in the "196%" range, and then a bunch at 0.
> > > >
> > > > Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
> > > > CPU time to accumulate in a few short hours.
> > >
> > > perf top doesn't show anything out of the ordinary. Just as an idle desktop
> > > should behave, it's spending a bunch of time in delay_tsc.
> > >
> > > > This is 3.9, or Linus's current tree? I am guessing the latter, but
> > > > figured I should ask.
> > >
> > > Yeah, the latter. (v3.9-11572-g5af43c2)
> >
> > Do you have CONFIG_NO_HZ_FULL=y? We just might have a bug in tickless
> > CPU-time accounting...
>
> I never saw a new config option I didn't like.

;-) ;-) ;-)

> CONFIG_NO_HZ_FULL=y
> CONFIG_NO_HZ_FULL_ALL=y

Adding Frederic on CC...

Thanx, Paul

2013-05-09 09:04:17

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

2013/5/8 Dave Jones <[email protected]>:
> On Wed, May 08, 2013 at 01:52:58PM -0700, Paul E. McKenney wrote:
> > On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:
>
> > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > > 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> > > 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> > > 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> > > 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> > > 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> > > 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> > > 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
> > >
> > > There are a lot of processes allegedly using "200%" of CPU time, a handful
> > > in the "196%" range, and then a bunch at 0.
> >
> > Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
> > CPU time to accumulate in a few short hours.

Right, Ingo has reported something about strange high numbers in top
too. I need to investigate that.

Thanks.

2013-05-09 09:20:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.


* Frederic Weisbecker <[email protected]> wrote:

> 2013/5/8 Dave Jones <[email protected]>:
> > On Wed, May 08, 2013 at 01:52:58PM -0700, Paul E. McKenney wrote:
> > > On Wed, May 08, 2013 at 04:30:42PM -0400, Dave Jones wrote:
> >
> > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > > > 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> > > > 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> > > > 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> > > > 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> > > > 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> > > > 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> > > > 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
> > > >
> > > > There are a lot of processes allegedly using "200%" of CPU time, a handful
> > > > in the "196%" range, and then a bunch at 0.
> > >
> > > Yow!!! 185301 minutes is 128 days, which is a truly impressive amount of
> > > CPU time to accumulate in a few short hours.
>
> Right, Ingo has reported something about strange high numbers in top
> too. I need to investigate that.

For CPU-intense loops I have seen a small anomaly with "slightly higher
CPU usage numbers" - sometimes going to 101% or so but still converging to
around 99.9% in the long run.

The 200% bug pasted by Dave here seems to be very big and seems to affect
all tasks not just true CPU-loopers. So I'd suggest reproducing and fixing
Dave's bug first - that might fix the small anomaly I've seen.

Thanks,

Ingo

2013-05-09 21:02:13

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

2013/5/8 Dave Jones <[email protected]>:
> On what should be an idle desktop, I'm seeing really strange things.
>
> The XFCE CPU meter shows both cores are running with no idle time.
>
> here's the top of the top(1) output..
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 10 root 20 0 0 0 0 S 200.0 0.0 185301:36 rcu_preempt
> 553 root 20 0 268m 76m 6764 S 200.0 2.6 144579:53 Xorg
> 1199 root 20 0 0 0 0 S 200.0 0.0 306:17.85 kworker/1:0
> 501 root 20 0 0 0 0 S 200.0 0.0 4471:03 kworker/0:2
> 12 root 20 0 0 0 0 S 200.0 0.0 67277:16 rcuop/1
> 1237 davej 20 0 535m 15m 8484 S 200.0 0.5 3645:16 Terminal
> 859 davej 20 0 117m 3036 1336 S 200.0 0.1 1986:55 htop
>
> There are a lot of processes allegedly using "200%" of CPU time, a handful
> in the "196%" range, and then a bunch at 0.
>
> rcu_preempt is staying at the top of the list while everything else bubbles
> up the list.
>
> And what's up with those TIME entries? Box has only been up a few hours,
> so those numbers look really crazy.
>
> RCU options for this build are..
>
> CONFIG_TREE_PREEMPT_RCU=y
> CONFIG_PREEMPT_RCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_USER_QS=y
> CONFIG_RCU_FANOUT=64
> CONFIG_RCU_FANOUT_LEAF=16
> # CONFIG_RCU_FANOUT_EXACT is not set
> # CONFIG_RCU_FAST_NO_HZ is not set
> CONFIG_TREE_RCU_TRACE=y
> CONFIG_RCU_BOOST=y
> CONFIG_RCU_BOOST_PRIO=1
> CONFIG_RCU_BOOST_DELAY=500
> CONFIG_RCU_NOCB_CPU=y
> CONFIG_RCU_NOCB_CPU_ALL=y
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> CONFIG_PROVE_RCU=y
> # CONFIG_PROVE_RCU_REPEATEDLY is not set
> CONFIG_PROVE_RCU_DELAY=y
> CONFIG_SPARSE_RCU_POINTER=y
> # CONFIG_RCU_TORTURE_TEST is not set
> CONFIG_RCU_CPU_STALL_TIMEOUT=60
> CONFIG_RCU_CPU_STALL_VERBOSE=y
> # CONFIG_RCU_CPU_STALL_INFO is not set
> CONFIG_RCU_TRACE=y

I can't reproduce that issue even with the above setting. Could you
please send me your whole config?

Thanks.

2013-05-09 21:10:54

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Thu, May 09, 2013 at 11:02:08PM +0200, Frederic Weisbecker wrote:

> > RCU options for this build are..
> >
> I can't reproduce that issue even with the above setting. Could you
> please send me your whole config?

http://paste.fedoraproject.org/11363/36813375/raw/

Dave

2013-05-14 01:21:15

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Thu, May 09, 2013 at 05:10:26PM -0400, Dave Jones wrote:
> On Thu, May 09, 2013 at 11:02:08PM +0200, Frederic Weisbecker wrote:
>
> > > RCU options for this build are..
> > >
> > I can't reproduce that issue even with the above setting. Could you
> > please send me your whole config?
>
> http://paste.fedoraproject.org/11363/36813375/raw/

Sorry it took me some time to get that config booting, I ran into
other unrelated crashes.

So I'm observing the exact opposite than you: the processes appear to
have no load nor to consume cputime. Even if I use some dumb user loop.
Top don't even show it.

I'm digging further and will tell you when I get more info.

Thanks.

2013-06-06 15:32:27

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Tue, May 14, 2013 at 03:21:07AM +0200, Frederic Weisbecker wrote:
> On Thu, May 09, 2013 at 05:10:26PM -0400, Dave Jones wrote:
> > On Thu, May 09, 2013 at 11:02:08PM +0200, Frederic Weisbecker wrote:
> >
> > > > RCU options for this build are..
> > > >
> > > I can't reproduce that issue even with the above setting. Could you
> > > please send me your whole config?
> >
> > http://paste.fedoraproject.org/11363/36813375/raw/
>
> Sorry it took me some time to get that config booting, I ran into
> other unrelated crashes.
>
> So I'm observing the exact opposite than you: the processes appear to
> have no load nor to consume cputime. Even if I use some dumb user loop.
> Top don't even show it.
>
> I'm digging further and will tell you when I get more info.

So I'm still seeing this on rc4.

Here's my 'idle' firewall, just routing packets:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4405 root 20 0 0 0 0 S 400.0 0.0 52336:27 kworker/2:0
20968 davej 20 0 115m 1840 1092 R 400.0 0.1 2297:48 top
964 named 20 0 383m 32m 2896 S 400.0 1.7 81384:54 named
10 root 20 0 0 0 0 S 400.0 0.0 195403:05 rcu_preempt
20830 davej 20 0 125m 2172 1024 S 400.0 0.1 939:36.10 sshd
11 root 20 0 0 0 0 S 400.0 0.0 42644:33 rcuop/0
13 root 20 0 0 0 0 S 400.0 0.0 34729:04 rcuop/2
754 lighttpd 20 0 54640 2692 1732 S 400.0 0.1 35286:11 lighttpd
756 root 20 0 216m 10m 6116 S 400.0 0.5 34143:41 snmpd
35 root 20 0 0 0 0 S 400.0 0.0 12790:56 ksoftirqd/2
19893 root 20 0 0 0 0 S 400.0 0.0 12570:49 kworker/0:1
20059 root 20 0 0 0 0 S 400.0 0.0 4666:30 kworker/1:1
32 root rt 0 0 0 0 S 400.0 0.0 8780:22 watchdog/2
792 root 20 0 92220 2516 632 S 400.0 0.1 7821:28 sendmail
25 root rt 0 0 0 0 S 400.0 0.0 8780:21 watchdog/0
26 root rt 0 0 0 0 S 400.0 0.0 8780:22 watchdog/1
38 root rt 0 0 0 0 S 400.0 0.0 8780:22 watchdog/3
19167 root 20 0 0 0 0 S 400.0 0.0 7804:04 kworker/3:2


Every process is either 400%, or 0%.

Here's my desktop box, just writing this email:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10 root 20 0 0 0 0 S 200.0 0.0 35134:00 rcu_preempt
12 root 20 0 0 0 0 S 200.0 0.0 14059:47 rcuop/1
12528 root 20 0 0 0 0 S 200.0 0.0 1012:56 kworker/0:1
12999 davej 20 0 115m 1700 956 R 200.0 0.1 1:09.10 top
4799 root 20 0 0 0 0 S 200.0 0.0 3510:40 kworker/1:1
1 root 20 0 50760 5092 1648 S 0.0 0.2 3:14.96 systemd
2 root 20 0 0 0 0 S 0.0 0.0 1:40.55 kthreadd

Every process 200% or 0%.

Dave

2013-06-06 15:43:18

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Thu, Jun 06, 2013 at 11:31:36AM -0400, Dave Jones wrote:
> On Tue, May 14, 2013 at 03:21:07AM +0200, Frederic Weisbecker wrote:
> > On Thu, May 09, 2013 at 05:10:26PM -0400, Dave Jones wrote:
> > > On Thu, May 09, 2013 at 11:02:08PM +0200, Frederic Weisbecker wrote:
> > >
> > > > > RCU options for this build are..
> > > > >
> > > > I can't reproduce that issue even with the above setting. Could you
> > > > please send me your whole config?
> > >
> > > http://paste.fedoraproject.org/11363/36813375/raw/
> >
> > Sorry it took me some time to get that config booting, I ran into
> > other unrelated crashes.
> >
> > So I'm observing the exact opposite than you: the processes appear to
> > have no load nor to consume cputime. Even if I use some dumb user loop.
> > Top don't even show it.
> >
> > I'm digging further and will tell you when I get more info.
>
> So I'm still seeing this on rc4.
>
> Here's my 'idle' firewall, just routing packets:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 4405 root 20 0 0 0 0 S 400.0 0.0 52336:27 kworker/2:0
> 20968 davej 20 0 115m 1840 1092 R 400.0 0.1 2297:48 top
> 964 named 20 0 383m 32m 2896 S 400.0 1.7 81384:54 named
> 10 root 20 0 0 0 0 S 400.0 0.0 195403:05 rcu_preempt
> 20830 davej 20 0 125m 2172 1024 S 400.0 0.1 939:36.10 sshd
> 11 root 20 0 0 0 0 S 400.0 0.0 42644:33 rcuop/0
> 13 root 20 0 0 0 0 S 400.0 0.0 34729:04 rcuop/2
> 754 lighttpd 20 0 54640 2692 1732 S 400.0 0.1 35286:11 lighttpd
> 756 root 20 0 216m 10m 6116 S 400.0 0.5 34143:41 snmpd
> 35 root 20 0 0 0 0 S 400.0 0.0 12790:56 ksoftirqd/2
> 19893 root 20 0 0 0 0 S 400.0 0.0 12570:49 kworker/0:1
> 20059 root 20 0 0 0 0 S 400.0 0.0 4666:30 kworker/1:1
> 32 root rt 0 0 0 0 S 400.0 0.0 8780:22 watchdog/2
> 792 root 20 0 92220 2516 632 S 400.0 0.1 7821:28 sendmail
> 25 root rt 0 0 0 0 S 400.0 0.0 8780:21 watchdog/0
> 26 root rt 0 0 0 0 S 400.0 0.0 8780:22 watchdog/1
> 38 root rt 0 0 0 0 S 400.0 0.0 8780:22 watchdog/3
> 19167 root 20 0 0 0 0 S 400.0 0.0 7804:04 kworker/3:2
>
>
> Every process is either 400%, or 0%.
>
> Here's my desktop box, just writing this email:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 10 root 20 0 0 0 0 S 200.0 0.0 35134:00 rcu_preempt
> 12 root 20 0 0 0 0 S 200.0 0.0 14059:47 rcuop/1
> 12528 root 20 0 0 0 0 S 200.0 0.0 1012:56 kworker/0:1
> 12999 davej 20 0 115m 1700 956 R 200.0 0.1 1:09.10 top
> 4799 root 20 0 0 0 0 S 200.0 0.0 3510:40 kworker/1:1
> 1 root 20 0 50760 5092 1648 S 0.0 0.2 3:14.96 systemd
> 2 root 20 0 0 0 0 S 0.0 0.0 1:40.55 kthreadd
>
> Every process 200% or 0%.

I see, would you mind testing this branch?

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
timers/urgent

It might help, I specially think about 45eacc692771bd2b1ea3d384e6345cab3da10861
("vtime: Use consistent clocks among nohz accounting") which fixes an issue
with cputime accounting that resulted in similar symptoms.

Thanks.

2013-06-06 23:10:43

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote:

> > Every process 200% or 0%.
>
> I see, would you mind testing this branch?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> timers/urgent
>
> It might help, I specially think about 45eacc692771bd2b1ea3d384e6345cab3da10861
> ("vtime: Use consistent clocks among nohz accounting") which fixes an issue
> with cputime accounting that resulted in similar symptoms.
>
> Thanks.

5 hours of uptime so far, without anything looking weird.
I'll keep an eye on it, but this is looking a lot better so far.

Dave

2013-06-12 15:34:58

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote:

> > Every process 200% or 0%.
>
> I see, would you mind testing this branch?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> timers/urgent
>
> It might help, I specially think about 45eacc692771bd2b1ea3d384e6345cab3da10861
> ("vtime: Use consistent clocks among nohz accounting") which fixes an issue
> with cputime accounting that resulted in similar symptoms.

So that patch has been in 'urgent' for a month. Can someone move it
to really-urgent or whatever is necessary to get it into Linus' tree ?

DAve

2013-06-20 19:10:46

by Dave Jones

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Wed, Jun 12, 2013 at 11:34:07AM -0400, Dave Jones wrote:
> On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote:
>
> > > Every process 200% or 0%.
> >
> > I see, would you mind testing this branch?
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> > timers/urgent
> >
> > It might help, I specially think about 45eacc692771bd2b1ea3d384e6345cab3da10861
> > ("vtime: Use consistent clocks among nohz accounting") which fixes an issue
> > with cputime accounting that resulted in similar symptoms.
>
> So that patch has been in 'urgent' for a month. Can someone move it
> to really-urgent or whatever is necessary to get it into Linus' tree ?

Another week passes...

This is kind of ridiculous.

Dave

2013-06-23 10:58:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.


* Dave Jones <[email protected]> wrote:

> On Wed, Jun 12, 2013 at 11:34:07AM -0400, Dave Jones wrote:
> > On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote:
> >
> > > > Every process 200% or 0%.
> > >
> > > I see, would you mind testing this branch?
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> > > timers/urgent
> > >
> > > It might help, I specially think about 45eacc692771bd2b1ea3d384e6345cab3da10861
> > > ("vtime: Use consistent clocks among nohz accounting") which fixes an issue
> > > with cputime accounting that resulted in similar symptoms.
> >
> > So that patch has been in 'urgent' for a month. Can someone move it
> > to really-urgent or whatever is necessary to get it into Linus' tree ?
>
> Another week passes...

It was sent to Linus two days before your mail and Linus merged it at
around the time you wrote your mail.

So it should be all in -rc7, unless I'm missing something?

Thanks,

Ingo

2013-07-01 12:11:01

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: rcu_preempt running flat out on idle desktop.

On Sun, Jun 23, 2013 at 12:58:39PM +0200, Ingo Molnar wrote:
>
> * Dave Jones <[email protected]> wrote:
>
> > On Wed, Jun 12, 2013 at 11:34:07AM -0400, Dave Jones wrote:
> > > On Thu, Jun 06, 2013 at 05:43:13PM +0200, Frederic Weisbecker wrote:
> > >
> > > > > Every process 200% or 0%.
> > > >
> > > > I see, would you mind testing this branch?
> > > >
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> > > > timers/urgent
> > > >
> > > > It might help, I specially think about 45eacc692771bd2b1ea3d384e6345cab3da10861
> > > > ("vtime: Use consistent clocks among nohz accounting") which fixes an issue
> > > > with cputime accounting that resulted in similar symptoms.
> > >
> > > So that patch has been in 'urgent' for a month. Can someone move it
> > > to really-urgent or whatever is necessary to get it into Linus' tree ?
> >
> > Another week passes...
>
> It was sent to Linus two days before your mail and Linus merged it at
> around the time you wrote your mail.
>
> So it should be all in -rc7, unless I'm missing something?

Ah sorry I missed that merge. It's all fine indeed.

Thanks!