From: Andy Lutomirski
Date: Sun, 1 Mar 2020 10:54:23 -0800
Subject: Re: [patch 4/8] x86/entry: Move irq tracing on syscall entry to C-code
To: "Paul E. McKenney"
Cc: Thomas Gleixner, Andy Lutomirski, Steven Rostedt, Peter Zijlstra,
 LKML, X86 ML, Brian Gerst, Juergen Gross, Paolo Bonzini, Arnd Bergmann
In-Reply-To: <20200301182605.GT2935@paulmck-ThinkPad-P72>
References: <87imjofkhx.fsf@nanos.tec.linutronix.de>
 <87d09wf6dw.fsf@nanos.tec.linutronix.de>
 <878skkeygm.fsf@nanos.tec.linutronix.de>
 <20200301182605.GT2935@paulmck-ThinkPad-P72>

On Sun, Mar 1, 2020 at 10:26 AM Paul E. McKenney wrote:
>
> On Sun, Mar 01, 2020 at 07:12:25PM +0100, Thomas Gleixner wrote:
> > Andy Lutomirski writes:
> > > On Sun, Mar 1, 2020 at 7:21 AM Thomas Gleixner wrote:
> > >> Andy Lutomirski writes:
> > >> >> On Mar 1, 2020, at 2:16 AM, Thomas Gleixner wrote:
> > >> >> Ok, but for the time being anything before/after CONTEXT_KERNEL is unsafe
> > >> >> except trace_hardirq_off/on() as those trace functions do not allow to
> > >> >> attach anything AFAICT.
> > >> >
> > >> > Can you point to whatever makes those particular functions special? I
> > >> > failed to follow the macro maze.
> > >>
> > >> Those are not tracepoints and not going through the macro maze.
> > >> See kernel/trace/trace_preemptirq.c
> > >
> > > That has:
> > >
> > > void trace_hardirqs_on(void)
> > > {
> > >         if (this_cpu_read(tracing_irq_cpu)) {
> > >                 if (!in_nmi())
> > >                         trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> > >                 tracer_hardirqs_on(CALLER_ADDR0, CALLER_ADDR1);
> > >                 this_cpu_write(tracing_irq_cpu, 0);
> > >         }
> > >
> > >         lockdep_hardirqs_on(CALLER_ADDR0);
> > > }
> > > EXPORT_SYMBOL(trace_hardirqs_on);
> > > NOKPROBE_SYMBOL(trace_hardirqs_on);
> > >
> > > But this calls trace_irq_enable_rcuidle(), and that's the part of the
> > > macro maze I got lost in. I found:
> > >
> > > #ifdef CONFIG_TRACE_IRQFLAGS
> > > DEFINE_EVENT(preemptirq_template, irq_disable,
> > >              TP_PROTO(unsigned long ip, unsigned long parent_ip),
> > >              TP_ARGS(ip, parent_ip));
> > >
> > > DEFINE_EVENT(preemptirq_template, irq_enable,
> > >              TP_PROTO(unsigned long ip, unsigned long parent_ip),
> > >              TP_ARGS(ip, parent_ip));
> > > #else
> > > #define trace_irq_enable(...)
> > > #define trace_irq_disable(...)
> > > #define trace_irq_enable_rcuidle(...)
> > > #define trace_irq_disable_rcuidle(...)
> > > #endif
> > >
> > > But the DEFINE_EVENT doesn't have the "_rcuidle" part. And that's
> > > where I got lost in the macro maze. I looked at the gcc asm output,
> > > and there is, indeed:
> >
> > DEFINE_EVENT
> >   DECLARE_TRACE
> >     __DECLARE_TRACE
> >       __DECLARE_TRACE_RCU
> >         static inline void trace_##name##_rcuidle(proto)
> >           __DO_TRACE
> >             if (rcuidle)
> >               ....
> >
> > > But I also don't see why this is any different from any other tracepoint.
> >
> > Indeed. I took a wrong turn at some point in the macro jungle :)
> >
> > So tracing itself is fine, but then if you have probes or bpf programs
> > attached to a tracepoint these use rcu_read_lock()/unlock() which is
> > obviously wrong in rcuidle context.
>
> Definitely, any such code needs to use tricks similar to that of the
> tracing code. Or instead use something like SRCU, which is OK with
> readers from idle.
> Or use something like Steve Rostedt's workqueue-based
> approach, though please be very careful with this latter, lest the
> battery-powered embedded guys come after you for waking up idle CPUs
> too often. ;-)
>

Are we okay if we somehow ensure that all the entry code before
enter_from_user_mode() only does rcuidle tracing variants and has
kprobes off? Including for BPF use cases?

It would be *really* nice if we could statically verify this, as has
been mentioned elsewhere in the thread. It would also probably be
good enough if we could do it at runtime. Maybe with lockdep on, we
verify rcu state in tracepoints even if the tracepoint isn't active?
And we could plausibly have some widget that could inject something
into *every* kprobeable function to check rcu state.

--Andy
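
For illustration only, here is a rough user-space model of the runtime
check floated above. It is a sketch, not kernel code: every model_*
name is hypothetical, the "RCU watching" flag is a plain bool standing
in for the real RCU state, and the macro only loosely imitates how the
quoted chain pastes together trace_##name##_rcuidle(). The one idea it
tries to show is that the plain variant checks RCU state even when
nothing is attached, while the _rcuidle variant is allowed to run
before RCU is watching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
static bool model_rcu_watching;   /* false models early entry code */
static int model_rcu_violations;  /* calls made while RCU was not watching */

/* Models a lockdep-style debug check: complain if RCU is not watching. */
static void model_check_rcu(const char *what)
{
	if (!model_rcu_watching) {
		model_rcu_violations++;
		fprintf(stderr, "model: %s called while RCU is not watching\n",
			what);
	}
}

/*
 * Emit both variants of a modelled tracepoint by token pasting.
 * The plain variant runs the RCU check before looking at the enabled
 * flag, so a misuse is reported even for an inactive tracepoint.
 * The _rcuidle variant makes RCU "watch" only for the duration of the
 * hit and then restores the previous state.
 */
#define MODEL_DEFINE_TRACE(name)					\
	static bool model_##name##_enabled;				\
	static int model_##name##_hits;					\
	static void model_trace_##name(void)				\
	{								\
		model_check_rcu("model_trace_" #name);			\
		if (model_##name##_enabled)				\
			model_##name##_hits++;				\
	}								\
	static void model_trace_##name##_rcuidle(void)			\
	{								\
		bool was_watching = model_rcu_watching;			\
		if (!model_##name##_enabled)				\
			return;						\
		model_rcu_watching = true;				\
		model_##name##_hits++;					\
		model_rcu_watching = was_watching;			\
	}

MODEL_DEFINE_TRACE(irq_enable)

/* Walk a modelled syscall entry; return the number of violations seen. */
int model_entry_demo(void)
{
	model_rcu_violations = 0;
	model_irq_enable_enabled = false;  /* nothing attached */

	model_rcu_watching = false;        /* early entry, RCU not watching */
	model_trace_irq_enable_rcuidle();  /* allowed: rcuidle variant */
	model_trace_irq_enable();          /* flagged despite being inactive */

	model_rcu_watching = true;         /* models "after enter_from_user_mode()" */
	model_trace_irq_enable();          /* fine from here on */

	return model_rcu_violations;
}
```

In this model, model_entry_demo() reports exactly one violation: the
plain variant called before RCU is watching. Whether an unconditional
check like this is cheap enough for real tracepoints, and how it would
extend to every kprobeable function, is left open here just as it is
in the mail.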