Date: Tue, 8 Jan 2019 13:04:22 +1100
From: Dave Chinner <david@fromorbit.com>
To: Waiman Long <longman@redhat.com>
Cc: Andrew Morton, Alexey Dobriyan, Luis Chamberlain, Kees Cook,
        Jonathan Corbet, linux-kernel@vger.kernel.org,
        linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
        Davidlohr Bueso, Miklos Szeredi, Daniel Colascione, Randy Dunlap
Subject: Re: [PATCH 0/2] /proc/stat: Reduce irqs counting performance overhead
Message-ID: <20190108020422.GA27534@dastard>
References: <1546873978-27797-1-git-send-email-longman@redhat.com>
        <20190107223214.GZ6311@dastard>
        <9b4208b7-f97b-047c-4dab-15bd3791e7de@redhat.com>
In-Reply-To: <9b4208b7-f97b-047c-4dab-15bd3791e7de@redhat.com>

On Mon, Jan 07, 2019 at 05:41:39PM -0500, Waiman Long wrote:
> On 01/07/2019 05:32 PM, Dave Chinner wrote:
> > On Mon, Jan 07, 2019 at 10:12:56AM -0500, Waiman Long wrote:
> >> As newer systems have more and more IRQs and CPUs available in
> >> their system, the performance of reading /proc/stat frequently is
> >> getting worse and worse.
> >
> > Because the "roll-your-own" per-cpu counter implementation has been
> > optimised for the lowest possible addition overhead on the premise
> > that summing the counters is rare and isn't a performance issue.
> > This patchset is a direct indication that this "summing is rare and
> > can be slow" premise is now invalid.
> >
> > We have percpu counter infrastructure that trades off a small
> > amount of addition overhead for zero-cost reading of the counter
> > value. i.e. why not just convert this whole mess to percpu_counters
> > and then just use percpu_counter_read_positive()? Then we just
> > don't care how often userspace reads the /proc file because there
> > is no summing involved at all...
> >
> > Cheers,
> >
> > Dave.
>
> Yes, percpu_counter_read_positive() is cheap. However, you still
> need to pay the price somewhere. In the case of percpu_counter, the
> update is more expensive.

Ummm, that's exactly what I just said. It's a percpu counter that
solves the "sum is expensive and frequent" problem, just like the one
you are encountering here. I do not need basic scalability algorithms
explained to me.

> I would say the percentage of applications that will hit this
> problem is small. But for them, this problem has some significant
> performance overhead.

Well, duh!

What I was suggesting is that you change the per-cpu counter
implementation to the /generic infrastructure/ that solves this
problem, and then determine if the extra update overhead is at all
measurable. If you can't measure any difference in update overhead,
then slapping complexity on the existing counter to attempt to
mitigate the summing overhead is the wrong solution.

Indeed, it may be that you need to use a custom batch scaling curve
for the generic per-cpu counter infrastructure to mitigate the update
overhead, but the fact is we already have generic infrastructure that
solves your problem, and so the solution should be "use the generic
infrastructure" until it can be proven not to work.

i.e. prove the generic infrastructure is not fit for purpose and
cannot be improved sufficiently to work for this use case before
implementing a complex, one-off snowflake counter implementation...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
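
For illustration, a minimal sketch of the conversion being suggested
above, using the generic percpu_counter API from
include/linux/percpu_counter.h. The irq_events counter name, the init
hook, and the batch value of 256 are made-up placeholders for this
sketch, not anything proposed in the thread:

    /*
     * Illustrative sketch only.  Replace a hand-rolled per-cpu sum with
     * the generic percpu_counter infrastructure so the read side
     * (/proc/stat) no longer has to walk every CPU and every counter.
     */
    #include <linux/init.h>
    #include <linux/percpu_counter.h>

    static struct percpu_counter irq_events;	/* hypothetical counter */

    static int __init irq_events_counter_init(void)
    {
    	/* Start at zero; GFP_KERNEL is fine in init context. */
    	return percpu_counter_init(&irq_events, 0, GFP_KERNEL);
    }

    /*
     * Hot path: bump the local per-cpu delta.  The delta is only folded
     * into the shared count once it exceeds the batch, so the common
     * case is a cheap per-cpu add with no shared cacheline traffic.
     */
    static inline void irq_events_inc(void)
    {
    	/* 256 is an arbitrary example batch ("custom batch scaling"). */
    	percpu_counter_add_batch(&irq_events, 1, 256);
    }

    /*
     * Read path (e.g. show_stat()): approximately-correct value at O(1)
     * cost, instead of summing per-cpu counters on every read.
     */
    static s64 irq_events_read(void)
    {
    	return percpu_counter_read_positive(&irq_events);
    }

The batch is the knob Dave refers to: each CPU can accumulate up to
batch events locally before folding them into the shared count, so
updates get cheaper as the batch grows, while the value returned by
percpu_counter_read_positive() can lag the true total by roughly
batch * num_online_cpus() in the worst case.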