Date: Mon, 14 May 2018 14:55:20 -0400
From: Johannes Weiner
To: Christopher Lameter
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, cgroups@vger.kernel.org,
	Ingo Molnar, Peter Zijlstra, Andrew Morton, Tejun Heo,
	Balbir Singh, Mike Galbraith, Oliver Yang, Shakeel Butt,
	xxx xxx, Taras Kondratiuk, Daniel Walker, Vinayak Menon,
	Ruslan Ruslichenko, kernel-team@fb.com
Subject: Re: [PATCH 0/7] psi: pressure stall information for CPU, memory, and IO
Message-ID: <20180514185520.GA7398@cmpxchg.org>
References: <20180507210135.1823-1-hannes@cmpxchg.org>
 <010001635f4e8be9-94e7be7a-e75c-438c-bffb-5b56301c4c55-000000@email.amazonses.com>
In-Reply-To: <010001635f4e8be9-94e7be7a-e75c-438c-bffb-5b56301c4c55-000000@email.amazonses.com>

On Mon, May 14, 2018 at 03:39:33PM +0000, Christopher Lameter wrote:
> On Mon, 7 May 2018, Johannes Weiner wrote:
>
> > What to make of this number? If CPU utilization is at 100% and CPU
> > pressure is 0, it means the system is perfectly utilized, with one
> > runnable thread per CPU and nobody waiting. At two or more runnable
> > tasks per CPU, the system is 100% overcommitted and the pressure
> > average will indicate as much. From a utilization perspective this
> > is a great state of course: no CPU cycles are being wasted, even
> > when 50% of the threads were to go idle (and most workloads do
> > vary). From the perspective of the individual job it's not great,
> > however, and they might do better with more resources. Depending on
> > what your priority is, an elevated "some" number may or may not
> > require action.
>
> This looks awfully similar to loadavg.
> Problem is that loadavg gets screwed up by tasks blocked waiting for
> I/O. Isn't there some way to fix loadavg instead?

Counting iowaiting tasks is one thing, but there are a few more things
that make it hard to use for telling the impact of CPU competition:

- It's not normalized to available CPU count. The loadavg in isolation
  doesn't mean anything, and you have to know the number of CPUs and
  any CPU bindings / restrictions in effect, which presents at least
  some difficulty when monitoring a big heterogeneous fleet (a small
  normalization sketch follows at the end of this mail).

- The way it's sampled makes it impossible to use for latencies. You
  could be mostly idle but periodically have herds of tasks competing
  for the CPU for short, low-latency operations. Even if we changed
  this in the implementation, you're still stuck with the interface
  that has...

- ...a short-term load window of 1m. This is generally fairly coarse
  for something that can be loaded and unloaded as abruptly as the CPU.

I'm trying to fix these with a portable way of aggregating multi-cpu
states, as well as tracking the true time spent in a state instead of
sampling it. Plus a smaller short-term window of 10s, but that's
almost irrelevant because I'm exporting the absolute state time clock,
so you can calculate your own averages over any time window you want
(see the second sketch below).

Since I'm using the same model and infrastructure for memory and IO
load as well, IMO it makes more sense to present them in a coherent
interface instead of trying to retrofit and change the loadavg file,
which might not even be possible.
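
For illustration of the normalization point in the first bullet, a
minimal userspace sketch (my own example, not part of the series): it
divides the 1-minute loadavg by the number of online CPUs, and it
still can't see cpuset or affinity restrictions, which is exactly the
difficulty described above.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
	double load1m;
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	FILE *f = fopen("/proc/loadavg", "r");

	if (!f) {
		perror("fopen /proc/loadavg");
		return 1;
	}
	if (fscanf(f, "%lf", &load1m) != 1) {
		fclose(f);
		return 1;
	}
	fclose(f);

	/* 1.00 means one runnable (or iowait) task per CPU on average */
	printf("1m loadavg per online CPU: %.2f\n", load1m / ncpus);
	return 0;
}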
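
And a rough sketch of the "calculate your own averages" point: it
assumes a /proc/pressure/cpu file along the lines of this series, with
a cumulative "total=" stall-time field in microseconds. Both the path
and the field name/units are assumptions for the sake of the example
and may differ from the actual interface. The program samples the
clock twice and turns the delta into a pressure percentage over an
arbitrary window.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Read the cumulative stall-time clock (assumed "total=<usecs>") */
static unsigned long long read_total_usec(void)
{
	char line[256];
	unsigned long long total = 0;
	FILE *f = fopen("/proc/pressure/cpu", "r");

	if (!f) {
		perror("fopen /proc/pressure/cpu");
		exit(1);
	}
	while (fgets(line, sizeof(line), f)) {
		char *p = strstr(line, "total=");
		if (p) {
			total = strtoull(p + strlen("total="), NULL, 10);
			break;
		}
	}
	fclose(f);
	return total;
}

int main(void)
{
	unsigned int window_sec = 30;	/* pick any window you want */
	unsigned long long t0, t1;

	t0 = read_total_usec();
	sleep(window_sec);
	t1 = read_total_usec();

	/* Share of the window during which at least one task was stalled */
	printf("some CPU pressure over %us: %.2f%%\n", window_sec,
	       (double)(t1 - t0) * 100.0 / (window_sec * 1000000.0));
	return 0;
}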