Received: by 2002:a05:6358:1087:b0:cb:c9d3:cd90 with SMTP id j7csp1097893rwi; Thu, 20 Oct 2022 08:36:26 -0700 (PDT) X-Google-Smtp-Source: AMsMyM59BR0GHwjxTFkNq7SW27CS1BcXCGJZmdgcwUQBlJddLUTjwXpdO9vbScoCmaO4akvPkOu3 X-Received: by 2002:a17:907:3e01:b0:730:a690:a211 with SMTP id hp1-20020a1709073e0100b00730a690a211mr11456027ejc.596.1666280185789; Thu, 20 Oct 2022 08:36:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666280185; cv=none; d=google.com; s=arc-20160816; b=BiRksTK6yvH2WgdY9JXYMJ4BAdsx+rOW6gKzA04dEDWwz5eYklvDW5bgQywz6Se0te ZKECw2mMl/PqsEq/yjKTJBpf/Hgs/7nTAGou+aFbI/Dz2vY5D2TqicQrxLxYloX1WGmp a4LsVCrynX2Yxu+Nck5YA+RLpdqanNqsNC0Utj0PB7QAjSguDFn1hamK+EZrE5AcWvRo //tMA87spFUMJFbYA+86q47lZl3kLyT9JjTMcs3rRnYeGtBPAlt83GZQP4h1LC3xhpvD MtNZ0pa3qjjEWKdILCXRCoo3hkX++lDoRVAzEEjtEyKszaIRM20XCRx9X29+Xt9t5OL3 VUyA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=nJwHLcf/0f+WA6hGh9SbH2k5UqoC7BdA4gDyaTGmHNs=; b=O7gfpFIEPlgO94lnv2EUVYB/yg6yEEyQB4fSdjEGhKCFaLVeRclgM+K6eetVzZNk+X s49yl2REiaKio4rmyLFsJ/oACYhjJYf8NkK6+MBU4zCyBzcd0lgZxzlP30rtH797cBvk t5ubTy756Poh22yLZArLy6Bx+yWOer2Q4RdN4bgYkdgO44mdP5RhLAgkP4hAJC7nWjtO PMxWgIYTU82fhuHkMKzjfyKHe2FL8G+otkWLEQNSWmZKNnoSfobwkW6DJ1lB5o3BE3x3 OX5AxeVn49lLa56RyUh6Jrw0UuCLYCvgqH98BTeiM90g4WasUCwLm6Q/D239/so6q75j hWjg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=ZJRb2+s7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e22-20020a1709067e1600b0073da5c8de1asi16662525ejr.178.2022.10.20.08.36.00; Thu, 20 Oct 2022 08:36:25 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=ZJRb2+s7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229972AbiJTOoJ (ORCPT + 99 others); Thu, 20 Oct 2022 10:44:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37408 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229707AbiJTOoG (ORCPT ); Thu, 20 Oct 2022 10:44:06 -0400 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F7E91F3 for ; Thu, 20 Oct 2022 07:44:05 -0700 (PDT) Received: by mail-io1-xd2e.google.com with SMTP id d142so17343762iof.7 for ; Thu, 20 Oct 2022 07:44:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=nJwHLcf/0f+WA6hGh9SbH2k5UqoC7BdA4gDyaTGmHNs=; b=ZJRb2+s7ZG4O55mws3aWVZBf0D409xIPfEQiGx+SxEZ6eG8TpIeJz39MbCouB5kCbV b6Ducx3J1y2gY9RBLgAX1lAJTdhEisWbtzFSSdlUZAOlWk3me5xXQq13FPoVwurF/vsL YjfBkYRN7an1n1yHnPLWfa1EKfX9E69gdg2/4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=nJwHLcf/0f+WA6hGh9SbH2k5UqoC7BdA4gDyaTGmHNs=; b=F03bcV4lXo38AnKCZ0Yvgf6uduQ27ne0pK0gcxWyCZT/NAm15TvFoKAfAHpDq0IXrV KaMzhf9IUsEX+kDtex4w8G5W6TELXEZiCmuYP2Xcvkhge1JrrgxpnqHbyGmy6CPeONjo caw2P21P5j6onvlPNv/bzV8qpIFI6XeRKQkv4EOGwlFKjhMfncXKvRhVBc401k+i6BZI Hs2fuzzDK3YCghkxmuUVJKzzwziEHzRDHYLAu8jFSNjt+iJbVgN1z8N2uRSVw8rG7ibW gPGT7HERbcAgGAUjnmEbhuf1hyFDeGBWv8ejv8fZqsPrdohI6GUVv7Y67Yr5C+FDngIG qXlQ== X-Gm-Message-State: ACrzQf2A1RgXEZYad+6fi+qz2l57dqFtIsGPHWKFrg9zlFlv4BRkHnCy Z/9Q2BHkdfgFelAxHrgfnAMxHcbAuhcS9heewFhGe8r4KhQrzA== X-Received: by 2002:a05:6602:26d5:b0:68a:db5d:2923 with SMTP id g21-20020a05660226d500b0068adb5d2923mr10269264ioo.175.1666277044260; Thu, 20 Oct 2022 07:44:04 -0700 (PDT) MIME-Version: 1.0 References: <20221019200137.70343645@gandalf.local.home> In-Reply-To: <20221019200137.70343645@gandalf.local.home> From: Joel Fernandes Date: Thu, 20 Oct 2022 10:43:53 -0400 Message-ID: Subject: Re: [PATCH] tracing: Add trace_trigger kernel command line option To: Steven Rostedt Cc: LKML , Linux Trace Kernel , Masami Hiramatsu , Andrew Morton , Mathieu Desnoyers , "Paul E. McKenney" , Tom Zanussi Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Steven, On Wed, Oct 19, 2022 at 8:01 PM Steven Rostedt wrote: > > From: "Steven Rostedt (Google)" > > Allow triggers to be enabled at kernel boot up. For example: > > trace_trigger="sched_switch.stacktrace if prev_state == 2" > > The above will enable the stacktrace trigger on top of the sched_switch > event and only trigger if its prev_state is 2 (TASK_UNINTERRUPTIBLE). Then > at boot up, a stacktrace will trigger and be recorded in the tracing ring > buffer every time the sched_switch happens where the previous state is > TASK_INTERRUPTIBLE. > > As this calls into tracepoint logic during very early boot (before > interrupts are enabled), a check has to be done to see if early boot > interrupts are still disabled, and if so, avoid any call to RCU > synchronization, as that will enable interrupts and cause boot up issues. > > Signed-off-by: Steven Rostedt (Google) > --- rest_init() -> rcu_scheduler_starting() sets RCU_SCHEDULER_INIT start_kernel() disables interrupts and does the dance you mentioned, while setting that early_ flag. start_kernel() is called from arch code. From your crash, the trace trigger is happening before the early_ boot flag is set to true. rcu_blocking_is_gp() should return true at the point your trigger fires because RCU_SCHEDULER_INACTIVE is still set. So I would expect your synchronize_rcu() to not do anything. There's comments in rcu_blocking_is_gp() like so: * During early boot, any blocking grace-period wait automatically * implies a grace period. So it should be returning straight from here. I wonder where/why it's re-enabling interrupts for you. What does rcu_blocking_is_gp() return if you can print that, at the time of the crash? Cheers, - J > .../admin-guide/kernel-parameters.txt | 19 ++++++ > include/linux/tracepoint.h | 4 ++ > kernel/trace/trace.c | 3 +- > kernel/trace/trace_events.c | 63 ++++++++++++++++++- > kernel/tracepoint.c | 6 ++ > 5 files changed, 92 insertions(+), 3 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index a465d5242774..ccf91a4bf113 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -6257,6 +6257,25 @@ > See also Documentation/trace/ftrace.rst "trace options" > section. > > + trace_trigger=[trigger-list] > + [FTRACE] Add a event trigger on specific events. > + Set a trigger on top of a specific event, with an optional > + filter. > + > + The format is is "trace_trigger=.[ if ],..." > + Where more than one trigger may be specified that are comma deliminated. > + > + For example: > + > + trace_trigger="sched_switch.stacktrace if prev_state == 2" > + > + The above will enable the "stacktrace" trigger on the "sched_switch" > + event but only trigger it if the "prev_state" of the "sched_switch" > + event is "2" (TASK_UNINTERUPTIBLE). > + > + See also "Event triggers" in Documentation/trace/events.rst > + > + > traceoff_on_warning > [FTRACE] enable this option to disable tracing when a > warning is hit. This turns off "tracing_on". Tracing can > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > index 4b33b95eb8be..a5c6b5772897 100644 > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -90,6 +90,10 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb) > #ifdef CONFIG_TRACEPOINTS > static inline void tracepoint_synchronize_unregister(void) > { > + /* Early updates do not need synchronization */ > + if (early_boot_irqs_disabled) > + return; > + > synchronize_srcu(&tracepoint_srcu); > synchronize_rcu(); > } > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index 47a44b055a1d..c03fd7037add 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -2749,7 +2749,8 @@ void trace_buffered_event_disable(void) > preempt_enable(); > > /* Wait for all current users to finish */ > - synchronize_rcu(); > + if (!early_boot_irqs_disabled) > + synchronize_rcu(); > > for_each_tracing_cpu(cpu) { > free_page((unsigned long)per_cpu(trace_buffered_event, cpu)); > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c > index 0356cae0cf74..06554939252c 100644 > --- a/kernel/trace/trace_events.c > +++ b/kernel/trace/trace_events.c > @@ -2796,6 +2796,44 @@ trace_create_new_event(struct trace_event_call *call, > return file; > } > > +#ifdef CONFIG_HIST_TRIGGERS > +#define MAX_BOOT_TRIGGERS 32 > + > +static struct boot_triggers { > + const char *event; > + char *trigger; > +} bootup_triggers[MAX_BOOT_TRIGGERS]; > + > +static char bootup_trigger_buf[COMMAND_LINE_SIZE]; > +static int nr_boot_triggers; > + > +static __init int setup_trace_triggers(char *str) > +{ > + char *trigger; > + char *buf; > + int i; > + > + strlcpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE); > + ring_buffer_expanded = true; > + disable_tracing_selftest("running event triggers"); > + > + buf = bootup_trigger_buf; > + for (i = 0; i < MAX_BOOT_TRIGGERS; i++) { > + trigger = strsep(&buf, ","); > + if (!trigger) > + break; > + bootup_triggers[i].event = strsep(&trigger, "."); > + bootup_triggers[i].trigger = strsep(&trigger, "."); > + if (!bootup_triggers[i].trigger) > + break; > + } > + > + nr_boot_triggers = i; > + return 1; > +} > +__setup("trace_trigger=", setup_trace_triggers); > +#endif > + > /* Add an event to a trace directory */ > static int > __trace_add_new_event(struct trace_event_call *call, struct trace_array *tr) > @@ -2822,12 +2860,32 @@ __trace_early_add_new_event(struct trace_event_call *call, > struct trace_array *tr) > { > struct trace_event_file *file; > + int ret; > + int i; > > file = trace_create_new_event(call, tr); > if (!file) > return -ENOMEM; > > - return event_define_fields(call); > + ret = event_define_fields(call); > + if (ret) > + return ret; > + > +#ifdef CONFIG_HIST_TRIGGERS > + for (i = 0; i < nr_boot_triggers; i++) { > + if (strcmp(trace_event_name(call), bootup_triggers[i].event)) > + continue; > + mutex_lock(&event_mutex); > + ret = trigger_process_regex(file, bootup_triggers[i].trigger); > + mutex_unlock(&event_mutex); > + if (ret) > + pr_err("Failed to register trigger '%s' on event %s\n", > + bootup_triggers[i].trigger, > + bootup_triggers[i].event); > + } > +#endif > + > + return 0; > } > > struct ftrace_module_file_ops; > @@ -3726,6 +3784,8 @@ static __init int event_trace_enable(void) > list_add(&call->list, &ftrace_events); > } > > + register_trigger_cmds(); > + > /* > * We need the top trace array to have a working set of trace > * points at early init, before the debug files and directories > @@ -3740,7 +3800,6 @@ static __init int event_trace_enable(void) > > register_event_cmds(); > > - register_trigger_cmds(); > > return 0; > } > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index f23144af5743..f6e4ee1e40b3 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -48,6 +48,9 @@ static void tp_rcu_get_state(enum tp_transition_sync sync) > { > struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync]; > > + if (early_boot_irqs_disabled) > + return; > + > /* Keep the latest get_state snapshot. */ > snapshot->rcu = get_state_synchronize_rcu(); > snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu); > @@ -58,6 +61,9 @@ static void tp_rcu_cond_sync(enum tp_transition_sync sync) > { > struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync]; > > + if (early_boot_irqs_disabled) > + return; > + > if (!snapshot->ongoing) > return; > cond_synchronize_rcu(snapshot->rcu); > -- > 2.35.1 >