Date: Thu, 19 Sep 2013 11:54:58 -0500
From: Gilad Ben-Yossef
To: Andrew Morton
Cc: Christoph Lameter, Thomas Gleixner, Tejun Heo, John Stultz,
    Mike Frysinger, Minchan Kim, Hakan Akkan, Max Krasnyansky,
    Frederic Weisbecker, "linux-kernel@vger.kernel.org",
    "Paul E. McKenney", Linux-MM
Subject: Re: RFC vmstat: On demand vmstat threads
In-Reply-To: <20130918150659.5091a2c3ca94b99304427ec5@linux-foundation.org>
References: <00000140e9dfd6bd-40db3d4f-c1be-434f-8132-7820f81bb586-000000@email.amazonses.com>
    <0000014109b8e5db-4b0f577e-c3b4-47fe-b7f2-0e5febbcc948-000000@email.amazonses.com>
    <20130918150659.5091a2c3ca94b99304427ec5@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 18, 2013 at 5:06 PM, Andrew Morton wrote:
> On Tue, 10 Sep 2013 21:13:34 +0000 Christoph Lameter wrote:
>
>> With this patch it is possible then to have periods longer than
>> 2 seconds without any OS event on a "cpu" (hardware thread).
>
> It would be useful (actually essential) to have a description of why
> anyone cares about this. A good and detailed description, please.

Let me have a stab at this:

The existing vmstat_update mechanism depends on a deferrable timer
that fires every second by default and queues a work item that runs
on the local CPU. The result is one interrupt and one additional
schedulable task on each CPU approx. every second.
If your workload indeed causes VM activity, or you are running
multiple tasks per CPU, you probably have bigger issues to deal with.

However, many existing workloads dedicate a CPU to a single CPU-bound
task. This is done by high performance computing folks, by high
frequency financial applications folks and by networking folks (Intel
DPDK, EZchip NPS). With the advent of systems with more and more CPUs,
this will(?) become more and more common, since when you have enough
CPUs you care less about efficiently sharing your CPU with other tasks
and more about efficiently monopolizing a CPU per task.

The difference made by having this timer fire and the workqueue kernel
thread scheduled every second can be enormous. In an artificial test I
made, I measured the worst case time to do a simple "i++" in an endless
loop, on a bare metal system and under Linux on an isolated CPU
(cpusets or isolcpus - take your pick) with dynticks, with and without
this patch. With this patch Linux matches the bare metal performance
(~700 cycles); without it Linux loses by a couple of orders of
magnitude (~200k cycles)[*] - and all this for something that just
calculates statistics. For networking applications, for example, this
is the difference between dropping packets and sustaining line rate.

Statistics are important and useful, but it would be great if there
were a way to keep statistics gathering from producing such a huge
performance difference. This is what we are trying to do here.

Does it make sense?

[*] To be honest it required one more patch, but this one or something
like it is needed to get that one working, so...

Thanks,
Gilad

--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru