Date: Wed, 9 Jan 2019 09:27:04 +1100
From: Dave Chinner <david@fromorbit.com>
To: Waiman Long <longman@redhat.com>
Cc: Andrew Morton, Alexey Dobriyan, Luis Chamberlain, Kees Cook,
	Jonathan Corbet, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Davidlohr Bueso, Miklos Szeredi, Daniel Colascione, Randy Dunlap
Subject: Re: [PATCH 0/2] /proc/stat: Reduce irqs counting performance overhead
Message-ID: <20190108222704.GD27534@dastard>
References: <1546873978-27797-1-git-send-email-longman@redhat.com>
	<20190107223214.GZ6311@dastard>
	<9b4208b7-f97b-047c-4dab-15bd3791e7de@redhat.com>
	<20190108020422.GA27534@dastard>
	<56954b42-4258-7268-53b5-ddca28758193@redhat.com>
In-Reply-To: <56954b42-4258-7268-53b5-ddca28758193@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 08, 2019 at 11:58:26AM -0500, Waiman Long wrote:
> On 01/07/2019 09:04 PM, Dave Chinner wrote:
> > On Mon, Jan 07, 2019 at 05:41:39PM -0500, Waiman Long wrote:
> >> On 01/07/2019 05:32 PM, Dave Chinner wrote:
> >>> On Mon, Jan 07, 2019 at 10:12:56AM -0500, Waiman Long wrote:
> >
> > What I was suggesting is that you change the per-cpu counter
> > implementation to the /generic infrastructure/ that solves this
> > problem, and then determine whether the extra update overhead is
> > measurable at all. If you can't measure any difference in update
> > overhead, then slapping complexity onto the existing counter to
> > mitigate the summing overhead is the wrong solution.
> >
> > Indeed, it may be that you need to use a custom batch scaling curve
> > for the generic per-cpu counter infrastructure to mitigate the
> > update overhead, but the fact is we already have generic
> > infrastructure that solves your problem, and so the solution should
> > be "use the generic infrastructure" until it can be proven not to
> > work.
> >
> > i.e. prove the generic infrastructure is not fit for purpose and
> > cannot be improved sufficiently to work for this use case before
> > implementing a complex, one-off snowflake counter implementation...
>
> I see your point. I like the deferred summation approach that I am
> currently using. If I have to modify the current per-cpu counter
> implementation to support that

No! Stop that already. The "deferred counter summation" is exactly the
problem the current algorithm has, and exactly the problem the generic
counters /don't have/.
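For readers unfamiliar with the generic infrastructure being referred to, here is a userspace sketch (plain Python; class and method names are illustrative, not the kernel API, which lives in lib/percpu_counter.c as percpu_counter_add(), percpu_counter_read() and percpu_counter_sum()). Each CPU accumulates a small signed delta and only folds it into the shared count, under a lock, once the delta reaches the batch size, so the common-case update touches only per-CPU state:

```python
import threading

class PercpuCounter:
    """Toy model of the kernel's generic percpu_counter: per-CPU
    deltas are folded into a shared count only on reaching +/- batch."""

    def __init__(self, nr_cpus, batch=32):
        self.count = 0                # shared, lock-protected sum
        self.deltas = [0] * nr_cpus   # per-CPU fast-path state
        self.batch = batch
        self.lock = threading.Lock()

    def add(self, cpu, amount):
        # Fast path: a purely per-CPU update, no shared state touched.
        self.deltas[cpu] += amount
        if abs(self.deltas[cpu]) >= self.batch:
            # Slow path: fold the batched delta into the shared count.
            with self.lock:
                self.count += self.deltas[cpu]
                self.deltas[cpu] = 0

    def read_fast(self):
        # Cheap, approximate read: just the shared count.
        return self.count

    def read_sum(self):
        # Accurate read: walk all the per-CPU deltas too, as
        # percpu_counter_sum() does in the kernel.
        with self.lock:
            return self.count + sum(self.deltas)
```

The batch size is the amortisation knob: a larger batch makes updates cheaper (fewer folds into the shared count) at the cost of a less accurate fast read.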
Changing the generic percpu counter algorithm to match this specific
hand-rolled implementation is not desirable, as it would break
implementations that rely on the bounded maximum summation deviation
of the existing algorithm (e.g. the ext4 and XFS ENOSPC accounting
algorithms).

> and I probably need to add counter
> grouping support to amortize the overhead, that can be a major

The per-cpu counters already have configurable update batching to
amortise the summation update cost across multiple individual per-cpu
updates. You don't need to change the implementation at all; just
tweak the amortisation curve appropriately for the desired balance
between update scalability (high) and read accuracy (low).

Let's face it: on large systems where counters are frequently updated,
the resultant sum can be highly inaccurate by the time a thousand CPU
counters have been summed. The generic counters have a predictable and
bounded "fast sum" maximum deviation (batch size * nr_cpus), so on
large machines they are likely to be much more accurate than "unlocked
summing on demand" algorithms.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
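The deviation bound claimed above (batch size * nr_cpus) can be checked with a quick userspace simulation. This is a self-contained sketch in plain Python, not kernel code; the function name and defaults are illustrative. Because a per-CPU delta is folded into the shared count the moment its magnitude reaches the batch size, every unfolded delta stays strictly below the batch, so the fast read can never diverge from the true sum by batch * nr_cpus or more, no matter how many updates have occurred:

```python
import random

def fast_read_error(nr_cpus=8, batch=64, n_ops=100000, seed=42):
    """Model percpu_counter-style batching under random +/-1 updates;
    return (|true sum - fast read|, the claimed deviation bound)."""
    rng = random.Random(seed)
    count = 0                  # shared count: folded batches only
    deltas = [0] * nr_cpus     # per-CPU unfolded deltas
    for _ in range(n_ops):
        cpu = rng.randrange(nr_cpus)
        deltas[cpu] += rng.choice((-1, 1))
        if abs(deltas[cpu]) >= batch:
            count += deltas[cpu]   # fold on reaching the batch size
            deltas[cpu] = 0
    true_sum = count + sum(deltas)
    # Each unfolded delta satisfies |delta| < batch, so the error
    # |sum(deltas)| is strictly less than batch * nr_cpus.
    return abs(true_sum - count), batch * nr_cpus
```

Note the bound is independent of n_ops: unlike an unsynchronised sum-on-demand, the error does not grow with the update rate, which is what makes the scheme usable for things like ENOSPC accounting.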