Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752200AbdHAWPw (ORCPT ); Tue, 1 Aug 2017 18:15:52 -0400 Received: from mail-wr0-f179.google.com ([209.85.128.179]:33620 "EHLO mail-wr0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752012AbdHAWPu (ORCPT ); Tue, 1 Aug 2017 18:15:50 -0400 Subject: Re: RCU stall when using function_graph To: paulmck@linux.vnet.ibm.com, john.stultz@linaro.org Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org References: <20170801220405.GL3730@linux.vnet.ibm.com> From: Daniel Lezcano Message-ID: Date: Wed, 2 Aug 2017 00:15:44 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170801220405.GL3730@linux.vnet.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6259 Lines: 123 On 02/08/2017 00:04, Paul E. McKenney wrote: >> Hi Paul, >> >> I have been trying to set the function_graph tracer for ftrace and each time I >> get a CPU stall. >> >> How to reproduce: >> ----------------- >> >> echo function_graph > /sys/kernel/debug/tracing/current_tracer >> >> This error appears with v4.13-rc3 and v4.12-rc6. >> >> Is it something already reported ? > > I have seen this sort of thing, but only when actually dumping the trace > out, and I though those got fixed. You are seeing this just accumulating > the trace? No, just by changing the tracer. It is the first operation I do after rebooting and it is reproducible each time. That happens on an ARM64 platform. > These RCU CPU stall warnings usually occur when something grabs hold of > a CPU for too long, as in 21 seconds or so. One way that they can happen > is excessive lock contention, another is having the kernel run through > too much data at one shot. > > Adding Steven Rostedt on CC for his thoughts. > > Thanx, Paul > >> [ ... ] >> >> [ 472.934986] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 472.941302] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=3195 >> [ 472.948982] (detected by 3, t=6502 jiffies, g=549, c=548, q=10) >> [ 472.955180] Task dump for CPU 0: >> [ 472.958513] swapper/0 R running task 0 0 0 0x00000002 >> [ 550.954786] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 550.960657] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=12926 >> [ 550.967897] (detected by 3, t=26007 jiffies, g=549, c=548, q=27) >> [ 550.974181] Task dump for CPU 0: >> [ 550.977513] swapper/0 R running task 0 0 0 0x00000002 >> [ 628.974788] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 628.980659] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=22666 >> [ 628.987900] (detected by 3, t=45512 jiffies, g=549, c=548, q=43) >> [ 628.994184] Task dump for CPU 0: >> [ 628.997518] swapper/0 R running task 0 0 0 0x00000002 >> [ 706.994787] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 707.000658] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=31785 >> [ 707.007899] (detected by 3, t=65017 jiffies, g=549, c=548, q=74) >> [ 707.014183] Task dump for CPU 0: >> [ 707.017516] swapper/0 R running task 0 0 0 0x00000002 >> [ 707.024902] rcu_preempt kthread starved for 1261 jiffies! g549 c548 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 >> [ 707.034548] rcu_preempt S 0 8 2 0x00000000 >> [ 726.049583] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds. >> [ 726.061772] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202 >> [ 726.074294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 726.089027] jbd2/mmcblk1p2- D 0 1120 2 0x00000000 >> [ 726.126737] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds. >> [ 726.140484] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202 >> [ 726.154515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 726.170140] rs:main Q:Reg D 0 2272 1 0x00000000 >> [ 726.205626] >> [ 726.205626] Showing all locks held in the system: >> [ 726.272952] >> [ 726.278194] ============================================= >> [ 726.278194] >> [ 785.014779] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 785.020658] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=41387 >> [ 785.027905] (detected by 3, t=84524 jiffies, g=549, c=548, q=94) >> [ 785.034194] Task dump for CPU 0: >> [ 785.037532] swapper/0 R running task 0 0 0 0x00000002 >> [ 846.879753] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds. >> [ 846.892966] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202 >> [ 846.909597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 846.926726] jbd2/mmcblk1p2- D 0 1120 2 0x00000000 >> [ 846.981356] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds. >> [ 847.001593] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202 >> [ 847.017642] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 847.034135] rs:main Q:Reg D 0 2272 1 0x00000000 >> [ 847.077209] >> [ 847.077209] Showing all locks held in the system: >> [ 847.180736] >> [ 847.186753] ============================================= >> [ 847.186753] >> [ 863.034778] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 863.040660] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=50881 >> [ 863.047907] (detected by 4, t=104030 jiffies, g=549, c=548, q=114) >> [ 863.054373] Task dump for CPU 0: >> [ 863.057711] swapper/0 R running task 0 0 0 0x00000002 >> [ 941.054780] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 941.060659] 0-...: (3 GPs behind) idle=d1a/2/0 softirq=2556/2557 fqs=59726 >> [ 941.067903] (detected by 7, t=123535 jiffies, g=549, c=548, q=303) >> [ 941.074370] Task dump for CPU 0: >> [ 941.077708] swapper/0 R running task 0 0 0 0x00000002 >> [ 967.711682] INFO: task jbd2/mmcblk1p2-:1120 blocked for more than 120 seconds. >> [ 967.725731] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202 >> [ 967.743347] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 967.765658] jbd2/mmcblk1p2- D 0 1120 2 0x00000000 >> [ 967.810595] INFO: task rs:main Q:Reg:2272 blocked for more than 120 seconds. >> [ 967.825360] Not tainted 4.13.0-rc3-ktest-linaro-hikey #202 >> [ 967.838595] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 967.854460] rs:main Q:Reg D 0 2272 1 0x00000000 >> [ 967.897804] >> [ 967.897804] Showing all locks held in the system: >> [ 967.997067] >> [ 968.002570] ============================================= >> >> >> [ ... ] >> >> etc ... > -- Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog