Subject: Re: [PATCH v6 1/2] sched/numa: introduce per-cgroup NUMA locality info
From: 王贇 <yun.wang@linux.alibaba.com>
To: Michal Koutný
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Luis Chamberlain, Kees Cook, Iurii Zaikin,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, "Paul E. McKenney", Randy Dunlap,
    Jonathan Corbet
Date: Sat, 4 Jan 2020 12:51:37 +0800
In-Reply-To: <20200103151449.GA25747@blackbody.suse.cz>

On 2020/1/3 at 11:14 PM, Michal Koutný wrote:
> Hi.
>
> On Fri, Dec 13, 2019 at 09:47:36AM +0800, 王贇 wrote:
>> By monitoring the increments, we will be able to locate the per-cgroup
>> workload which NUMA Balancing can't help with (usually caused by wrong
>> CPU and memory node bindings), and then we get a chance to fix that in
>> time.
> I just wonder, do the data based on increments match those you
> obtained previously?

They have a different meaning: the new data are just the accumulation of
the local/remote page access counters, so we had to increase the sample
period to the maximum NUMA balancing scan period, which is 1 minute on
my system.
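As a rough sketch of what such period-based sampling looks like from
user space (a toy illustration; the struct and function names here are
invented, not the actual cgroup interface):

```c
#include <assert.h>

/* Hypothetical snapshot of the cumulative local/remote counters
 * read at the start and end of one sample period. */
struct locality_sample {
	unsigned long local;
	unsigned long remote;
};

/* Increment over one period: new cumulative value minus the old one. */
static struct locality_sample
locality_delta(struct locality_sample prev, struct locality_sample now)
{
	struct locality_sample d = {
		.local  = now.local  - prev.local,
		.remote = now.remote - prev.remote,
	};
	return d;
}

/* Locality percentage for a period; -1 when no faults were sampled. */
static int locality_pct(struct locality_sample d)
{
	unsigned long total = d.local + d.remote;

	return total ? (int)(d.local * 100 / total) : -1;
}
```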
We still get useful information from the increments, for example:

  local 100  remote 1000  <-- bad locality in the last period
  local 0    remote 0     <-- no scan or NUMA page faults in the last period
  local 100  remote 0     <-- good locality, but not many page faults

So I wouldn't say they match; they tell the story in a different way :-P

>
>> +static inline void
>> +update_task_locality(struct task_struct *p, int pnid, int cnid, int pages)
>> +{
>> +	if (!static_branch_unlikely(&sched_numa_locality))
>> +		return;
>> +
>> +	/*
>> +	 * pnid != cnid --> remote idx 0
>> +	 * pnid == cnid --> local idx 1
>> +	 */
>> +	p->numa_page_access[!!(pnid == cnid)] += pages;
> If the per-task information isn't used anywhere, why not accumulate
> directly into the task's cfs_rq->{local,remote}_page_access?
>

This is to avoid a hierarchy update on each page fault; accumulating the
counters and updating them together should cost less. Besides, since they
are no longer reset, maybe we could expose them too.

>> @@ -4298,6 +4359,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>>  	 */
>>  	update_load_avg(cfs_rq, curr, UPDATE_TG);
>>  	update_cfs_group(curr);
>> +	update_group_locality(cfs_rq);
> With the per-NUMA node time tracked separately, isn't it unnecessary
> to do group updates inside entity_tick?

The hierarchy update can't be avoided, and this is a good place for it:
we are already holding the rq lock here and iterating the cfs_rq
hierarchy for the current task.

Regards,
Michael Wang

>
>
> Regards,
> Michal
>
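P.S. For anyone following along, the idx convention in the quoted
update_task_locality() hunk can be checked with a small user-space toy
(everything below is invented for illustration, not kernel code):

```c
#include <assert.h>

/* Mirrors the indexing in the quoted hunk:
 *   remote access (pnid != cnid) -> index 0
 *   local access  (pnid == cnid) -> index 1
 */
#define NR_LOCALITY_IDX 2

static void account_access(unsigned long counters[NR_LOCALITY_IDX],
			   int pnid, int cnid, int pages)
{
	/* !!(expr) normalizes any truth value to exactly 0 or 1 */
	counters[!!(pnid == cnid)] += pages;
}
```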