Date: Mon, 18 Apr 2022 14:34:04 -0400
From: Steven Rostedt
To: Patrick Wang
Cc: paulmck@kernel.org, frederic@kernel.org, quic_neeraju@quicinc.com, josh@joshtriplett.org, mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com, joel@joelfernandes.org, rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rcu: ftrace: avoid tracing a few functions executed in multi_cpu_stop()
Message-ID: <20220418143404.55c8fcab@gandalf.local.home>
In-Reply-To: <20220418043735.11441-1-patrick.wang.shcn@gmail.com>
References:
 <20220418043735.11441-1-patrick.wang.shcn@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 18 Apr 2022 12:37:35 +0800
Patrick Wang wrote:

> A few functions are in the call chain of rcu_momentary_dyntick_idle()
> which is executed in multi_cpu_stop() and marked notrace. They are running
> in traced when ftrace modify code. This may cause non-ftrace_modify_code
> CPUs stall:

I'm confused by this. How are traced functions causing this exactly? Is this
on RISC-V?

>
> [ 72.686113] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 72.687344] rcu: 1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0
> [ 72.687800] rcu: 3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0
> [ 72.688280] (detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4)
> [ 72.688739] Task dump for CPU 1:
> [ 72.688991] task:migration/1 state:R running task stack: 0 pid: 19 ppid: 2 flags:0x00000000
> [ 72.689594] Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
> [ 72.690242] Call Trace:
> [ 72.690603] Task dump for CPU 3:
> [ 72.690761] task:migration/3 state:R running task stack: 0 pid: 29 ppid: 2 flags:0x00000000
> [ 72.691135] Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
> [ 72.691474] Call Trace:
> [ 72.691733] rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> [ 72.692180] rcu: Possible timer handling issue on cpu=2 timer-softirq=594
> [ 72.692485] rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
> [ 72.692876] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> [ 72.693232] rcu: RCU grace-period kthread stack dump:
> [ 72.693433] task:rcu_preempt state:I stack: 0 pid: 14 ppid: 2 flags:0x00000000
> [ 72.693788] Call Trace:
> [ 72.694018] [] schedule+0x56/0xc2
> [ 72.694306] [] schedule_timeout+0x82/0x184
> [ 72.694539] [] rcu_gp_fqs_loop+0x19a/0x318
> [ 72.694809] [] rcu_gp_kthread+0x11a/0x140
> [ 72.695325] [] kthread+0xee/0x118
> [ 72.695657] [] ret_from_exception+0x0/0x14
> [ 72.696089] rcu: Stack dump where RCU GP kthread last ran:
> [ 72.696383] Task dump for CPU 2:
> [ 72.696562] task:migration/2 state:R running task stack: 0 pid: 24 ppid: 2 flags:0x00000000
> [ 72.697059] Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
> [ 72.697471] Call Trace:
>
> Mark rcu_preempt_deferred_qs(), rcu_preempt_need_deferred_qs() and
> rcu_preempt_deferred_qs_irqrestore() notrace to avoid this.
>

The rcu_momentary_dyntick_idle() was marked notrace because of RISC-V not
being able to call functions from within stop machine. If that's what is
being prevented, then I'm fine with this (although I'm thinking we need
different kinds of "notrace" for different architectures as one arch's
limitation should not be cause for another's).

But before I ack this patch, I want to understand the real issues here.

-- Steve