Date: Mon, 17 Feb 2020 11:58:10 +0000
From: Mel Gorman
To: Peter Zijlstra
Cc: Michael Wang, Ingo Molnar, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Michal Koutný, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, "Paul E.
McKenney", Randy Dunlap, Jonathan Corbet
Subject: Re: [PATCH RESEND v8 1/2] sched/numa: introduce per-cgroup NUMA locality info
Message-ID: <20200217115810.GA3420@suse.de>
References: <20200214151048.GL14914@hirez.programming.kicks-ass.net>
In-Reply-To: <20200214151048.GL14914@hirez.programming.kicks-ass.net>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 14, 2020 at 04:10:48PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 07, 2020 at 11:35:30AM +0800, Michael Wang wrote:
> > By monitoring the increments, we will be able to locate the per-cgroup
> > workload which NUMA Balancing can't help with (usually caused by wrong
> > CPU and memory node bindings), then we get a chance to fix that in time.
> >
> > Cc: Mel Gorman
> > Cc: Peter Zijlstra
> > Cc: Michal Koutný
> > Signed-off-by: Michael Wang
>
> So here:
>
>   https://lkml.kernel.org/r/20191127101932.GN28938@suse.de
>
> Mel argues that the information exposed is fairly implementation
> specific and hard to use without understanding how NUMA balancing works.
>
> By exposing it to userspace, we tie ourselves to these particulars. We
> can no longer change these NUMA balancing details if we wanted to, due
> to UAPI concerns.
>
> Mel, I suspect you still feel that way, right?
>

Yes, I still think it would be a struggle to interpret the data
meaningfully without very specific knowledge of the implementation. If
the scan rate were constant, the data would be easier to interpret, but a
constant scan rate would make NUMA balancing worse overall. Similarly, the
stat can become very difficult to interpret when NUMA balancing is failing
because of a load imbalance, because pages are shared and being
interleaved, or because NUMA groups span multiple active nodes.
For example, the series that reconciles the NUMA and CPU balancers may
look worse in these stats even though overall performance is better.

> In the document (patch 2/2) you write:
>
> > +However, there are no hardware counters for per-task local/remote accessing
> > +info, we don't know how many remote page accesses have occurred for a
> > +particular task.
>
> We can of course 'fix' that by adding a tracepoint.
>
> Mel, would you feel better by having a tracepoint in task_numa_fault() ?
>

A bit, although interpreting the data would still be difficult and the
tracepoint would have to include information about the cgroup. While I've
never tried it, this seems like the type of thing that would be suited to
a BPF script that probes task_numa_fault and extracts the information it
needs.

-- 
Mel Gorman
SUSE Labs
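[Editor's note: as a rough illustration of the BPF approach suggested above
(not something posted or tested in this thread), a bpftrace sketch along
these lines could aggregate NUMA hinting faults per cgroup by kprobing
task_numa_fault(). It assumes the v5.x kernel prototype
task_numa_fault(int last_cpupid, int mem_node, int pages, int flags) and
that TNF_FAULT_LOCAL == 0x08; both are kernel-internal details that may
change between releases, which is exactly the UAPI concern raised above.]

```
#!/usr/bin/env bpftrace
/*
 * Sketch only: count pages behind local vs remote NUMA hinting faults,
 * keyed by the current task's cgroup id. Assumes the prototype
 * task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
 * and TNF_FAULT_LOCAL == 0x08 from kernel/sched/sched.h-era headers;
 * verify both against the running kernel before trusting the numbers.
 */
kprobe:task_numa_fault
{
	if (arg3 & 0x08) {		/* TNF_FAULT_LOCAL set */
		@local[cgroup] += arg2;	/* arg2 == pages */
	} else {
		@remote[cgroup] += arg2;
	}
}
```

[A per-cgroup locality ratio could then be computed offline from @local and
@remote, much like the proposed per-cgroup stat, without committing the
kernel to a stable userspace interface.]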