Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753432AbdHJVjp (ORCPT ); Thu, 10 Aug 2017 17:39:45 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:39370 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751487AbdHJVjo (ORCPT ); Thu, 10 Aug 2017 17:39:44 -0400 Date: Thu, 10 Aug 2017 14:39:39 -0700 From: "Paul E. McKenney" To: Daniel Lezcano Cc: Pratyush Anand , =?utf-8?B?6rmA64+Z7ZiE?= , john.stultz@linaro.org, Steven Rostedt , linux-kernel@vger.kernel.org Subject: Re: RCU stall when using function_graph Reply-To: paulmck@linux.vnet.ibm.com References: <20170806170220.GQ3730@linux.vnet.ibm.com> <20170809125804.GT3730@linux.vnet.ibm.com> <20170809144033.GU3730@linux.vnet.ibm.com> <208e981d-40ec-54fa-6293-5b8e6fe10a84@linaro.org> <20170809172236.GX3730@linux.vnet.ibm.com> <81dd7e5e-89be-2ff9-525e-7095e934baa5@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <81dd7e5e-89be-2ff9-525e-7095e934baa5@linaro.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 17081021-0052-0000-0000-0000024E0E3E X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007521; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000219; SDB=6.00900466; UDB=6.00450815; IPR=6.00680668; BA=6.00005522; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00016633; XFM=3.00000015; UTC=2017-08-10 21:39:42 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17081021-0053-0000-0000-000051A17E42 Message-Id: <20170810213939.GV3730@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-08-10_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1706020000 definitions=main-1708100341 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7226 Lines: 158 On Thu, Aug 10, 2017 at 11:45:09AM +0200, Daniel Lezcano wrote: > On 09/08/2017 19:22, Paul E. McKenney wrote: > > On Wed, Aug 09, 2017 at 05:51:33PM +0200, Daniel Lezcano wrote: > >> On 09/08/2017 16:40, Paul E. McKenney wrote: > >>> On Wed, Aug 09, 2017 at 03:28:05PM +0200, Daniel Lezcano wrote: > >>>> On 09/08/2017 14:58, Paul E. McKenney wrote: > >>>>> On Wed, Aug 09, 2017 at 02:43:49PM +0530, Pratyush Anand wrote: > >>>>>> > >>>>>> > >>>>>> On Sunday 06 August 2017 10:32 PM, Paul E. McKenney wrote: > >>>>>>> On Sat, Aug 05, 2017 at 02:24:21PM +0900, κΉ€λ™ν˜„ wrote: > >>>>>>>> Dear All > >>>>>>>> > >>>>>>>> As for me, after configuring function_graph as below, crash disappears. > >>>>>>>> "echo 0 > d/tracing/tracing_on" > >>>>>>>> "sleep 1" > >>>>>>>> > >>>>>>>> "echo function_graph > d/tracing/current_tracer" > >>>>>>>> "sleep 1" > >>>>>>>> > >>>>>>>> "echo smp_call_function_single > d/tracing/set_ftrace_filter" > >>>>>> > >>>>>> It will limit trace output to only for the filtered function > >>>>>> (smp_call_function_single). > >>>>>> > >>>>>>>> adb shell "sleep 1" > >>>>>>>> > >>>>>>>> "echo 1 > d/tracing/tracing_on" > >>>>>>>> adb shell "sleep 1" > >>>>>>>> > >>>>>>>> Right after function_graph is enabled, too many logs are traced upon IRQ > >>>>>>>> transaction which many times eventually causes stall. > >>>>>>> > >>>>>>> That would do it! > >>>>>>> > >>>>>>> Hmmm... > >>>>>>> > >>>>>>> Steven, would it be helpful if RCU were to inform tracing (say) halfway > >>>>>>> through the RCU CPU stall interval, allowing the tracer to do something > >>>>>>> like cond_resched_rcu_qs()? I can imagine all sorts of reasons why this > >>>>>>> wouldn't work, for example, if all the tracing was with irqs disabled > >>>>>>> or some such, but figured I should ask. > >>>>>>> > >>>>>>> Does Guillermo's approach work for others? > >>>>>> > >>>>>> Limited output with a couple of filtered function will definitely > >>>>>> not cause RCU schedule stall. But the question is whether we should > >>>>>> expect a full function graph trace working on every platform or not > >>>>>> (specially the one which generates high interrupts)? > >>>>> > >>>>> It might well be that the user must disable RCU CPU stall warnings via > >>>>> the rcu_cpu_stall_suppress sysfs entry (or increase their timeout via th > >>>>> rcu_cpu_stall_timeout sysfs entry) before doing something that greatly > >>>>> increases overhead. Like enabling large quantities of tracing. ;-) > >>>>> > >>>>> It -might- be possible to do this automatically, but reliable > >>>>> automation would require that tracing understand how often each > >>>>> function was called, which sounds to me to be a bit of a stretch. > >>>>> > >>>>> Thoughts? > >>>> > >>>> A random thought: > >>>> > >>>> Is it possible to have a mid-timeout happening and store some > >>>> information like the instruction pointer, so when the timeout happen we > >>>> can compare if there was some progress, if yes, very likely, system > >>>> performance collapsed and we are not fast enough. > >>> > >>> RCU already does take various actions for an impending stall, so something > >>> could be done. But in most slowdowns, the instruction pointer will be > >>> changing rapidly, just not as rapidly as it would normally. So exactly > >>> how would the forward-progress comparison be carried out? > >>> > >>> It would be easy to set up a notifier, so that if any notifier in the > >>> chain returned an error, stall warnings would be suppressed. It would > >>> be harder to figure out when to re-enable them, though I suppose that > >>> they could be suppressed only for the duration of the current grace > >>> period or some such. > >>> > >>> But what exactly would you use such a notifier for? > >>> > >>> Or am I misunderstanding your suggestion? > >> > >> Well, may be the instruction pointer thing is not a good idea. > >> > >> I learnt from this experience, an overloaded kernel with a lot of > >> interrupts can hang the console and issue RCU stall. > >> > >> However, someone else can face the same situation. Even if he reads the > >> RCU/stallwarn.txt documentation, it will be hard to figure out the issue. > >> > >> A message telling the grace period can't be reached because we are too > >> busy processing interrupts would have helped but I understand it is not > >> easy to implement. > >> > >> Perhaps, adding a new bullet in the documentation can help: > >> > >> "If the interrupt processing time is longer than the interval between > >> each interrupt, the CPU will keep processing the interrupts without > >> allowing the RCU's grace period kthread. This situation can happen if > >> there is a highly rated number of interrupts and the function_graph > >> tracer is enabled". > > > > How about this? > > Yes, it is clear. Thanks for reformulating it. > > > Any other debugging options that should be called out? I bet that > > the function_graph tracer isn't the only way to make this happen. > > Nothing coming in mind but may be worth to mention the slowness of the > CPU is the aggravating factor. In particular I was able to reproduce the > issue by setting to the min CPU frequency. With the ondemand governor, > we can have the frequency high (hence enough CPU power) at the moment we > set the function_graph because another CPU is loaded (and both CPUs are > sharing the same clock line). The system became stuck at the moment the > other CPU went idle with the lowest frequency. That introduced > randomness in the issue and made hard to figure out why the RCU stall > was happening. Adding this, then? Thanx, Paul ------------------------------------------------------------------------ commit f7d9ce95064f76be583c775fac32076fa59f1617 Author: Paul E. McKenney Date: Thu Aug 10 14:33:17 2017 -0700 documentation: Slow systems can stall RCU grace periods If a fast system has a worst-case grace-period duration of (say) ten seconds, then running the same workload on a system ten times as slow will get you an RCU CPU stall warning given default stall-warning timeout settings. This commit therefore adds this possibility to stallwarn.txt. Reported-by: Daniel Lezcano Signed-off-by: Paul E. McKenney diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 21b8913acbdf..238acbd94917 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -70,6 +70,12 @@ o A periodic interrupt whose handler takes longer than the time considerably longer than normal, which can in turn result in RCU CPU stall warnings. +o Testing a workload on a fast system, tuning the stall-warning + timeout down to just barely avoid RCU CPU stall warnings, and then + running the same workload with the same stall-warning timeout on a + slow system. Note that thermal throttling and on-demand governors + can cause a single system to be sometimes fast and sometimes slow! + o A hardware or software issue shuts off the scheduler-clock interrupt on a CPU that is not in dyntick-idle mode. This problem really has happened, and seems to be most likely to