Received: by 10.192.165.148 with SMTP id m20csp2883344imm; Sun, 22 Apr 2018 18:16:51 -0700 (PDT) X-Google-Smtp-Source: AIpwx4+Qx0LRNv3LW67PhZFrhBAWQMme7BltowOrRW46R3SStQhk+RTPzMPBvXrkxBCtTMRElm4f X-Received: by 10.99.47.4 with SMTP id v4mr15899825pgv.42.1524446211817; Sun, 22 Apr 2018 18:16:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1524446211; cv=none; d=google.com; s=arc-20160816; b=c8QTH+2xNKncmO89T2WBxFUiKi1cew6Iy+voDww2xMz1ZwK3NGrgTuXBcoGi2w8HWc EDOcDm2RrDpX7ASCveg8qSUS2h+1B4MR2ZYT+VyQ5xgMmo6l8shYAwMWy/bukhq/USuR BJ8g0Xd/JI23bsQYarJsql5i4lFbI3T7tJQLFi3vn/2odTRdquSeO9mQk57O5q2HzRBe q/qvqXrXXA7w8Kmo291JjaonHQ0hC+jO5zyM4PqPaC2J2PpTvna18BlXoeFihlYsifqX WAZOCkZq078Mt6mp6h8Aaa8XjIJY3IRDXXZ463tJHoWvixEF56HCtvC2t0MoG4Gnlcxe hdQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=VY1EV/m9I9AJ+oVvRAHfKk3+4dhA4wjdCj/nWB4nWYo=; b=WMXc2sufT0TbTqNCuVnzU9jSo7rJUTutcv5veDd2i4yE555VyZ8+s8zBhMFlVrTLex oTCM5sMoKijy5ZVT7HuUBunKpbXyqcKxXqV/fW1QSt6sDsu8EMe8HARD+VzwUIJ3UfVj nRWLIrAC6dySOYvCYgqim+DMybi0ebBFDEyUZ4ohUAr9JAC3uR7WrHygkaMvUd4+TYh+ 7fvRa0IbEfOwIp2xkKjtVNhAFzVQIk9Yux8/SybvlI4a7l6brQCdzzaKUHGjA7/3SgR3 rdlVJ/3d1z041paNbhsIAz78SgpBJ2g/WL/VPx6fByIUbBEJUaqBNVAeVK9kXv0yYIZz TOfg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=OOpCoH+A; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 16si10334964pfh.354.2018.04.22.18.16.37; Sun, 22 Apr 2018 18:16:51 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=OOpCoH+A; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753886AbeDWBOZ (ORCPT + 99 others); Sun, 22 Apr 2018 21:14:25 -0400 Received: from mail-io0-f172.google.com ([209.85.223.172]:33647 "EHLO mail-io0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753755AbeDWBOU (ORCPT ); Sun, 22 Apr 2018 21:14:20 -0400 Received: by mail-io0-f172.google.com with SMTP id e78-v6so3990307iod.0 for ; Sun, 22 Apr 2018 18:14:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=VY1EV/m9I9AJ+oVvRAHfKk3+4dhA4wjdCj/nWB4nWYo=; b=OOpCoH+ATL8wSOSbeZPGpCQgUT29akYPcg9C8IUwMuTGQKBRLhk8s8CAcqCIPGCajF 8/QeTLqj/dEfUp1FYC16nVjJ1rOFUO6mRNyVS4RLgwqTbXKDMHwVOsj5tB2+WPfszr+7 oZlHnHE0nhwWT3Icg75cwo9iVL3V4OkRFmRc8e/pG0j81E5Di8Bhbz1559FilnxaLegP Bp8RjRLcqvWnbdncAJnZTVmBvHQYn340+hyqPoJRU3iGLsH+21rBI0VABIDWUP8eSZLr h5OM6cmvrXa40SfAMz8UctuQF924fZZf/vp4mDesW0R8NYOcC4H36MoYvcwXF/64ps0T 9haA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=VY1EV/m9I9AJ+oVvRAHfKk3+4dhA4wjdCj/nWB4nWYo=; b=XVKolN1jT58Niz937fxmzWmd2LTve+X+aHQhZu/JjcFYuzdLEngv3cK2NGmwQPJjg1 PQBHUrOErebZ37LD0ryVCTFw7Jeuw0Gpd3q1eYL5XEP9XJaO5xFEESEhjTlnwNAKWxAe MwvS87x/Prfyyy2FTPmLgs8L9hyfAILGD49Qs8REic8esXwBKFmjugVRNyfc5YKYEG/B tLDZI2jG3KsKUiGHdBDED8cigsoowqjU43XfOfOzLAuwL0IOqZjXDD0ySv7JkQBYVfUY nJV4EgqVo+UOXz8lftfqfqiOJCiaCbLkFSzQg0Alq1dwQWhsfwAmoCVmxznd7AAZOzUI 5tmA== X-Gm-Message-State: ALQs6tCYrq+s4EU7xyYlWVuh655zx7jNqkuH3Gikf5fjaLwjIenDKuJq 1bC5FaUTmF+a+HS1QLT01yfPpUPyjfFUB53P/tHRPw== X-Received: by 2002:a6b:2251:: with SMTP id i78-v6mr13080890ioi.276.1524446059326; Sun, 22 Apr 2018 18:14:19 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.181.213 with HTTP; Sun, 22 Apr 2018 18:14:18 -0700 (PDT) In-Reply-To: References: <20180417040748.212236-1-joelaf@google.com> <20180417040748.212236-4-joelaf@google.com> <20180418180250.7b6038dddba46b37c94b796c@kernel.org> <20180419054302.GD13370@sejong> From: Joel Fernandes Date: Sun, 22 Apr 2018 18:14:18 -0700 Message-ID: Subject: Re: [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace_ if you can To: Namhyung Kim Cc: Masami Hiramatsu , LKML , linux-rt-users@vger.kernel.org, Steven Rostedt , Peter Zilstra , Ingo Molnar , Mathieu Desnoyers , Tom Zanussi , Thomas Glexiner , Boqun Feng , Paul McKenney , Frederic Weisbecker , Randy Dunlap , Fenguang Wu , Baohong Liu , Vedang Patel , kernel-team@lge.com Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Apr 20, 2018 at 12:07 AM, Joel Fernandes wrote: > Hi, > > Thanks Matsami and Namhyung for the suggestions! > > On Wed, Apr 18, 2018 at 10:43 PM, Namhyung Kim wrote: >> On Wed, Apr 18, 2018 at 06:02:50PM +0900, Masami Hiramatsu wrote: >>> On Mon, 16 Apr 2018 21:07:47 -0700 >>> Joel Fernandes wrote: >>> >>> > With TRACE_IRQFLAGS, we call trace_ API too many times. We don't need >>> > to if local_irq_restore or local_irq_save didn't actually do anything. >>> > >>> > This gives around a 4% improvement in performance when doing the >>> > following command: "time find / > /dev/null" >>> > >>> > Also its best to avoid these calls where possible, since in this series, >>> > the RCU code in tracepoint.h seems to be call these quite a bit and I'd >>> > like to keep this overhead low. >>> >>> Can we assume that the "flags" has only 1 bit irq-disable flag? >>> Since it skips calling raw_local_irq_restore(flags); too, >> >> I don't know how many it impacts on performance but maybe we can have >> an arch-specific config option something like below? > > The flags restoration I am hoping is "cheap" but I haven't measured > specifically the cost of this though. > >> >> >>> if there is any state in the flags on any arch, it may change the >>> result. In that case, we can do it as below (just skipping trace_hardirqs_*) >>> >>> int disabled = irqs_disabled(); >> >> if (disabled == raw_irqs_disabled_flags(flags)) { >> #ifndef CONFIG_ARCH_CAN_SKIP_NESTED_IRQ_RESTORE >> raw_local_irq_restore(flags); >> #endif >> return; >> } > > Hmm, somehow I feel this part should be written generically enough > that it applies to all architectures (as a first step). > >> >>> >>> if (!raw_irqs_disabled_flags(flags) && disabled) >>> trace_hardirqs_on(); >>> >>> raw_local_irq_restore(flags); >>> >>> if (raw_irqs_disabled_flags(flags) && !disabled) >>> trace_hardirqs_off(); > > I like this idea since its a good thing to do the flag restoration > just to be safe and preserve the current behaviors. Also my goal was > to reduce the trace_ calls in this series, so its probably better I > just do as you're suggesting. I will do some experiments and make the > changes for the next series. So about performance of this series.. lockdep hooking into tracepoint code is a bit heavy, compared to without this series. That's because of the design approach of IRQ on/off -> Trace point -> lockdep Versus without this series which does IRQ on/off -> lockdep So we lose performance because of that. This particular patch improves the situation, as such so this particular patch is probably good to merge once we can test performance of Matsami's suggestion as well. However, patch 4/4 which makes lockdep use the tracepoint causes a performance hit of around 8% of mean time when I run: hackbench -g 4 -f 2 -l 30000 I narrowed the performance hit down to the call to rcu_irq_enter_irqson() and rcu_irq_exit_irqson() in __DO_TRACE. Commenting these 2 functions brings the perf level back. I was thinking about RCU usage here, and really we never change this particular performance-sensitive tracepoint's function table 99.9% of the time, so it seems there's quite in a win if we just had another read-mostly synchronization mechanism that doesn't do all the RCU tracking that's currently done here and such a mechanism can be simpler.. If I understand correctly, RCU also adds other complications such as that it can't be used from the idle path, that's why the rcu_irq_enter_* was added in the first place. Would be nice if we can just avoid these RCU calls for the preempt/irq tracepoints... Any thoughts about this or any other ideas to solve this? Meanwhile I'll also do some performance testing with Matsami's idea as well.. thanks, - Joel