From: Thomas Gleixner
To: Mathieu Desnoyers
Cc: linux-kernel, Peter Zijlstra, rostedt, Masami Hiramatsu, Alexei Starovoitov, paulmck, "Joel Fernandes\, Google", Frederic Weisbecker
Subject: Re: Instrumentation and RCU
In-Reply-To: <1489283504.23399.1583852595008.JavaMail.zimbra@efficios.com>
References: <87mu8p797b.fsf@nanos.tec.linutronix.de> <1403546357.21810.1583779060302.JavaMail.zimbra@efficios.com> <871rq171ca.fsf@nanos.tec.linutronix.de> <1489283504.23399.1583852595008.JavaMail.zimbra@efficios.com>
Date: Tue, 10 Mar 2020 17:48:45 +0100
Message-ID: <87imjc5f6a.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Mathieu Desnoyers
writes:
> ----- On Mar 9, 2020, at 3:52 PM, Thomas Gleixner tglx@linutronix.de wrote:
>> In a quick test I did with a invalid syscall number with profiling the
>> trace_hardirqs_off() is pretty prominent and goes down by roughly a
>> factor of 2 when I move it past enter_from_user_mode() and use just the
>> non RCU idle variant.
>
> I think one issue here is that trace_hardirqs_off() is now shared between
> lockdep and tracing. For lockdep, we have the following comment:
>
>         /*
>          * IRQ from user mode.
>          *
>          * We need to tell lockdep that IRQs are off.  We can't do this until
>          * we fix gsbase, and we should do it before enter_from_user_mode
>          * (which can take locks).  Since TRACE_IRQS_OFF is idempotent,
>          * the simplest way to handle it is to just call it twice if
>          * we enter from user mode.  There's no reason to optimize this since
>          * TRACE_IRQS_OFF is a no-op if lockdep is off.
>          */
>         TRACE_IRQS_OFF
>
>         CALL_enter_from_user_mode
>
> 1:
>         ENTER_IRQ_STACK old_rsp=%rdi save_ret=1
>         /* We entered an interrupt context - irqs are off: */
>         TRACE_IRQS_OFF
>
> which seems to imply that lockdep requires TRACE_IRQS_OFF to be performed
> _before_ entering from usermode. I don't expect this to be useful at all for
> other tracers though. I think this should be replaced by a new e.g.
> LOCKDEP_ENTER_FROM_USER_MODE or such which would call into lockdep without
> calling other tracers.

See the entry series I'm working on. Aside from moving all this nonsense
into C code, it splits lockdep and tracing so it looks like this:

        lockdep_hardirqs_off();
        user_exit_irqsoff();
        __trace_hardirqs_off();

The latter uses regular RCU and not the srcu/rcu_irq dance.

>> Right, but that still does the whole rcu_irq dance especially in the
>> entry code just to trace 50 or 100 instructions which are turning on RCU
>> anyway.
>
> Agreed. Would changing this to a lockdep-specific call as I suggest above
> solve this ?

That split has existed for a few weeks now, at least in my patches :)

>>> If a tracer recurses, or if a tracer attempts to trace another tracer, the
>>> instrumentation would break the recursion chain by preventing instrumentation
>>> from firing. If we end up caring about tracers tracing other tracers, we could
>>> have one distinct flag per tracer and let each tracer break the recursion chain.
>>>
>>> Having this flag per kernel stack rather than per CPU or per thread would
>>> allow tracing of nested interrupt handlers (and NMIs), but would break
>>> call chains both within the same stack or going through a trap. I think
>>> it could be a nice complementary safety net to handle mishaps in a non-fatal
>>> way.
>>
>> That works as long as none of this uses breakpoint based patching to
>> dynamically disable/enable stuff.
>
> I'm clearly missing something here. I was expecting the "in_tracing" flag trick
> to be able to fix the breakpoint recursion issue. What is the problem I'm missing
> here ?

How do you "fix" that when you can't reach the tracepoint because you
trip over a breakpoint and then, while trying to fix up that stuff, you hit
another one?

Thanks,

        tglx
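
For readers following the thread, the "in_tracing" flag idea quoted above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: struct stack_ctx, trace_event() and instrumented_helper() are hypothetical names invented for illustration, not kernel APIs, and the "stack" is just a struct. It shows the flag breaking a tracer-calls-instrumented-code recursion. It does not model the breakpoint case raised in the reply, where the trap is hit before any such flag check can run.

```c
#include <stdbool.h>

/* Hypothetical per-stack context; in the proposal the flag would live
 * per kernel stack, here it is just a struct passed around. */
struct stack_ctx {
	bool in_tracing;	/* set while a tracer runs on this stack */
	int events;		/* events the tracer actually recorded */
	int suppressed;		/* nested hits dropped by the guard */
};

static void instrumented_helper(struct stack_ctx *ctx);

/* Hypothetical tracepoint: runs the tracer unless one is already
 * active on this stack, which breaks the recursion chain. */
static void trace_event(struct stack_ctx *ctx)
{
	if (ctx->in_tracing) {
		ctx->suppressed++;
		return;
	}

	ctx->in_tracing = true;
	ctx->events++;
	/* The tracer calls into code that is itself instrumented. */
	instrumented_helper(ctx);
	ctx->in_tracing = false;
}

/* Stand-in for any helper a tracer might call that carries its own
 * instrumentation; without the guard this would recurse forever. */
static void instrumented_helper(struct stack_ctx *ctx)
{
	trace_event(ctx);
}
```

Calling trace_event() once on a zero-initialised struct stack_ctx records one event and suppresses the nested hit. Using one context per stack is what would let a nested interrupt, running on its own stack, still be traced.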