Subject: Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods
From: "Li, Aubrey"
To: Andi Kleen, Peter Zijlstra
Cc: Aubrey Li, tglx@linutronix.de, len.brown@intel.com, rjw@rjwysocki.net,
    tim.c.chen@linux.intel.com, arjan@linux.intel.com,
    paulmck@linux.vnet.ibm.com, yang.zhang.wz@gmail.com, x86@kernel.org,
    linux-kernel@vger.kernel.org
Date: Tue, 11 Jul 2017 12:40:06 +0800
In-Reply-To: <20170710172705.GA3441@tassilo.jf.intel.com>
References: <1499650721-5928-1-git-send-email-aubrey.li@intel.com>
    <20170710084647.zs6wkl3fumszd33g@hirez.programming.kicks-ass.net>
    <20170710144609.GD31832@tassilo.jf.intel.com>
    <20170710164206.5aon5kelbisxqyxq@hirez.programming.kicks-ass.net>
    <20170710172705.GA3441@tassilo.jf.intel.com>

On 2017/7/11 1:27, Andi Kleen wrote:
> On Mon, Jul 10, 2017 at 06:42:06PM +0200, Peter Zijlstra wrote:
>> On Mon, Jul 10, 2017 at 07:46:09AM -0700, Andi Kleen wrote:
>>>> So how much of the gain is simply due to skipping NOHZ? Mike used to
>>>> carry a patch that would throttle NOHZ. And that is a _far_ smaller and
>>>> simpler patch to do.
>>>
>>> Have you ever looked at a ftrace or PT trace of the idle entry?
>>>
>>> There's just too much stuff going on there. NOHZ is just the tip
>>> of the iceberg.
>>
>> I have, and last time I did the actual poking at the LAPIC (to make NOHZ
>> happen) was by far the slowest thing happening.
>
> That must have been a long time ago because modern systems use TSC deadline
> for a very long time ...
>
> It's still slow, but not as slow as the LAPIC.
>
>> Data to indicate what hurts how much would be a very good addition to
>> the Changelogs. Clearly you have some, you really should have shared.
>

Here is an article that explains why we need to improve this:

https://cacm.acm.org/magazines/2017/4/215032-attack-of-the-killer-microseconds/fulltext

Given that we have a few new low-latency I/O devices like 3D XPoint
memory, 25/40Gb Ethernet, etc., this proposal also aims to improve the
latency of microsecond (us)-scale events.

Basically, we are looking at how much we can improve (rather than at what
hurts). The data is from v4.8.8.

In the idle loop,

- quiet_vmstat costs 5562ns - 6296ns
- tick_nohz_idle_enter costs 7058ns - 10726ns
- the whole span from arch_cpu_idle_enter entry to arch_cpu_idle_exit
  return costs 9122ns - 15318ns.
  -- Within this span, rcu_idle_enter costs 1985ns - 2262ns and
     rcu_idle_exit costs 1813ns - 3507ns
- tick_nohz_idle_exit costs 8372ns - 20850ns

Benchmark fio on an NVMe disk shows a 3-4% improvement due to skipping
nohz, and an extra 1-2% improvement overall.

Benchmark netperf loopback in TCP Request-Response mode shows a 6-7%
improvement due to skipping nohz, and an extra 2-3% improvement overall.

Note that the data includes measurement overhead and may vary across
platforms, CPU frequencies, and workloads, but the results are consistent
once the test configuration is fixed.

Thanks,
-Aubrey
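
To put the per-function figures above next to a microsecond-scale event, the ranges can be added up. The sketch below is only illustrative arithmetic on the numbers quoted in this message. It assumes the listed intervals do not overlap and that their low and high ends can simply be summed, which the message does not state. The arch_cpu_idle_enter to arch_cpu_idle_exit span is left out because it already contains the two RCU figures and may include more than bookkeeping. The names `COSTS_NS` and `total_range` are made up for this example.

```python
# Cost ranges in nanoseconds, copied from the figures in this message
# (measured on v4.8.8, measurement overhead included).
COSTS_NS = {
    "quiet_vmstat": (5562, 6296),
    "tick_nohz_idle_enter": (7058, 10726),
    "rcu_idle_enter": (1985, 2262),
    "rcu_idle_exit": (1813, 3507),
    "tick_nohz_idle_exit": (8372, 20850),
}


def total_range(names):
    """Sum the (low, high) bounds of the named entries.

    ASSUMPTION: the intervals are disjoint, so their bounds can be added.
    """
    low = sum(COSTS_NS[name][0] for name in names)
    high = sum(COSTS_NS[name][1] for name in names)
    return low, high


# NOHZ entry plus exit only.
nohz_only = total_range(["tick_nohz_idle_enter", "tick_nohz_idle_exit"])

# Every function listed in the table above.
all_listed = total_range(COSTS_NS)

if __name__ == "__main__":
    print("nohz enter+exit: %d - %d ns" % nohz_only)
    print("all listed:      %d - %d ns" % all_listed)
```

Under those assumptions, the NOHZ enter and exit paths alone add up to roughly 15-32us per idle entry, which is the same order of magnitude as the I/O latencies the proposal targets.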