Date: Wed, 11 Mar 2020 09:18:15 +0900
From: Masami Hiramatsu
To: Mathieu Desnoyers
Cc: rostedt, Thomas Gleixner, Masami Hiramatsu, linux-kernel, Peter Zijlstra, Alexei Starovoitov, paulmck, "Joel Fernandes, Google", Frederic Weisbecker, Jason Wessel
Subject: Re: Instrumentation and RCU
Message-Id: <20200311091815.fce458348bb7641b60f600d9@kernel.org>
In-Reply-To: <1760242532.23694.1583857291763.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Mathieu,

On Tue, 10 Mar 2020 12:21:31 -0400 (EDT)
Mathieu Desnoyers wrote:

> ----- On Mar 10, 2020, at 11:46 AM, rostedt rostedt@goodmis.org wrote:
>
> > On Tue, 10 Mar 2020 11:31:51 -0400 (EDT)
> > Mathieu Desnoyers wrote:
> >
> >> I think there are two distinct problems we are trying to solve here,
> >> and it would be good to spell them out to see which pieces of technical
> >> solution apply to which.
> >>
> >> Problem #1) Tracer invoked from partially initialized kernel context
> >>
> >> - Moving the early/late entry/exit points into sections invisible from
> >>   instrumentation seems to make tons of sense for this.
> >>
> >> Problem #2) Tracer recursion
> >>
> >> - I'm much less convinced that hiding entry points from instrumentation
> >>   works for this. As an example, with the instr_begin/end() approach you
> >>   propose above, as soon as you have a tracer recursing into itself because
> >>   something below do_stuff() has been instrumented, having hidden the entry
> >>   point did not help at all.
> >>
> >> So I would be tempted to use the "hide entry/exit points" with explicit
> >> instr begin/end annotation to solve Problem #1, but I'm still thinking there
> >> is value in the per recursion context "in_tracing" flag to prevent tracer
> >> recursion.
> >
> > The only recursion issue that I've seen discussed is breakpoints. And
> > that's outside of the tracer infrastructure. Basically, if someone added a
> > breakpoint for a kprobe on something that gets called in the int3 code
> > before kprobes is called we have (let's say rcu_nmi_enter()):
> >
> >
> >  rcu_nmi_enter();
> >
> >     do_int3() {
> >        rcu_nmi_enter();
> >
> >          do_int3();
> >          [..]
> >
> > Where would a "in_tracer" flag help here? Perhaps a "in_breakpoint" could?
>
> An approach where the "in_tracer" flag is tested and set by the instrumentation
> (function tracer, kprobes, tracepoints) would work here. Let's say the beginning
> of the int3 ISR is part of the code which is invisible to instrumentation, and
> before we issue rcu_nmi_enter(), we handle the in_tracer flag:
>
> rcu_nmi_enter();
>
>     (recursion_ctx->in_tracer == false)
>     set recursion_ctx->in_tracer = true
>     do_int3() {
>        rcu_nmi_enter();
>
>          if (recursion_ctx->in_tracer == true)
>              iret
>
> We can change "in_tracer" for "in_breakpoint", "in_tracepoint" and
> "in_function_trace" if we ever want to allow different types of instrumentation
> to nest. I'm not sure whether this is useful or not though.

Kprobes already has its own "in_kprobe" flag, and the recursion path is
not so simple. Since the int3 replaces the original instruction, we have
to execute the original instruction with single-step and fixup. This
means do_debug() is involved too. Thus, we cannot do iret directly from
do_int3 as in the example above; if recursion happens, we have no way to
recover the original execution path (and so we call BUG()).

In my previous email, I showed a patch which adds something like
"bust_kprobes()" for the oops path.
That is not safe, but there is no other way to escape from this
recursion hell. (Maybe we can try to call it instead of calling BUG() so
that the kernel can continue to run, but I'm not sure we can safely make
the pagetable read-only again.)

Thank you,

--
Masami Hiramatsu
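
For illustration only, the per-context "in_tracer" test-and-set idea from
the quoted text could be sketched in plain userspace C roughly as below.
This is not kernel code: struct recursion_ctx, tracer_enter(),
tracer_exit() and probe_hit() are hypothetical names made up for this
sketch, and the "handled"/"dropped" counters exist only to make the
behaviour observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-context state; the thread calls this "recursion_ctx". */
struct recursion_ctx {
	bool in_tracer;	/* true while instrumentation runs in this context */
	int handled;	/* probe hits whose handler actually ran */
	int dropped;	/* nested hits rejected by the guard */
};

/* Test and set the flag; returns false if we are already in the tracer. */
static bool tracer_enter(struct recursion_ctx *ctx)
{
	if (ctx->in_tracer)
		return false;
	ctx->in_tracer = true;
	return true;
}

/* Clear the flag; only valid after a successful tracer_enter(). */
static void tracer_exit(struct recursion_ctx *ctx)
{
	assert(ctx->in_tracer);
	ctx->in_tracer = false;
}

/*
 * Stand-in for a probe hit. depth > 0 simulates the handler itself
 * running into another probe (an instrumented callee, for example).
 */
static void probe_hit(struct recursion_ctx *ctx, int depth)
{
	if (!tracer_enter(ctx)) {
		ctx->dropped++;	/* nested hit: bail out instead of recursing */
		return;
	}
	ctx->handled++;
	if (depth > 0)
		probe_hit(ctx, depth - 1);	/* handler trips over a probe */
	tracer_exit(ctx);
}
```

Calling probe_hit() with a non-zero depth on a zero-initialized context
runs the handler once and drops the nested hit. The sketch covers only
the flag logic. As noted above, kprobes cannot simply return from a
nested int3, because the replaced original instruction still has to be
single-stepped.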