Subject: Re: [PATCH 07/10] sched/fair: Provide can_migrate_task_llc
From: Valentin Schneider
To: Steve Sistare, mingo@redhat.com, peterz@infradead.org
Cc: subhra.mazumdar@oracle.com, dhaval.giani@oracle.com, rohit.k.jain@oracle.com,
    daniel.m.jordan@oracle.com, pavel.tatashin@microsoft.com, matt@codeblueprint.co.uk,
    umgwanakikbuti@gmail.com, riel@redhat.com, jbacik@fb.com, juri.lelli@redhat.com,
    linux-kernel@vger.kernel.org
Date: Fri, 26 Oct 2018 19:04:06 +0100

Hi Steve,

On 22/10/2018 15:59, Steve Sistare wrote:
> Define a simpler version of can_migrate_task called can_migrate_task_llc
> which does not require a struct lb_env argument, and judges whether a
> migration from one CPU to another within the same LLC should be allowed.
>
> Signed-off-by: Steve Sistare
> ---
>  kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4acdd8d..6548bed 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7168,6 +7168,34 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>  }
>
>  /*
> + * Return true if task @p can migrate from @rq to @dst_rq in the same LLC.
> + * No need to test for co-locality, and no need to test task_hot(), as sharing
> + * LLC provides cache warmth at that level.

I was thinking that perhaps we could have scenarios where some rqs keep
stealing tasks off of each other and we end up circulating tasks between
CPUs. Now, that would only happen if we had a handful of tasks with a very
tiny period, and I'm not familiar with any (real) hyperactive workload
similar to those generated by hackbench where that could happen.

In short, I wonder if we should have task_hot() in there. Drawing a
parallel with load_balance(), even if load-balancing is happening between
rqs of the same LLC, we do go check task_hot(). Have you already
experimented with adding a task_hot() check in here? (I've appended a
rough sketch of the kind of check I mean at the bottom of this mail.)

I've run some iterations of hackbench (hackbench 2 process 100000) to
investigate this task bouncing, but I didn't really see any of it. That
was just a 4+4 big.LITTLE system though; I'll try to get numbers on a
system with more CPUs.

----->8-----
activations:    # of task activations (task starts running)
cpu_migrations: # of activations where cpu != prev_cpu
% stats are percentiles

- STEAL:

| stat  | cpu_migrations | activations |
|-------+----------------+-------------|
| count |    2005.000000 | 2005.000000 |
| mean  |      16.244888 |  290.608479 |
| std   |      38.963138 |  253.003528 |
| min   |       0.000000 |    3.000000 |
| 50%   |       3.000000 |  239.000000 |
| 75%   |       8.000000 |  436.000000 |
| 90%   |      45.000000 |  626.000000 |
| 99%   |     188.960000 | 1073.000000 |
| max   |     369.000000 | 1417.000000 |

- NO_STEAL:

| stat  | cpu_migrations | activations |
|-------+----------------+-------------|
| count |    2005.000000 | 2005.000000 |
| mean  |      15.260848 |  297.860848 |
| std   |      46.331890 |  253.210813 |
| min   |       0.000000 |    3.000000 |
| 50%   |       3.000000 |  252.000000 |
| 75%   |       7.000000 |  444.000000 |
| 90%   |      32.600000 |  643.600000 |
| 99%   |     214.880000 | 1127.520000 |
| max   |     467.000000 | 1547.000000 |
----->8-----

Otherwise, my only other concern at the moment is that since stealing
doesn't care about load, we could steal a task that would cause a big
imbalance, which wouldn't have happened with a call to load_balance().
I don't think this can be triggered with a symmetrical workload like
hackbench, so I'll go explore something else. (The second sketch below
shows the kind of load check I have in mind.)
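
For reference, here is the kind of task_hot() check I mean. To be clear,
this is an untested sketch written against ~v4.19 kernel/sched/fair.c: the
hunk above only quotes the new comment, so the function body below is my
guess at what can_migrate_task_llc() might look like, not the actual
patch. task_hot() itself takes a struct lb_env, so instead of calling it
the sketch open-codes its exec_start / sysctl_sched_migration_cost part:

----->8-----
/*
 * Guessed reconstruction plus the extra hot check -- not the patch as
 * posted.
 */
static int can_migrate_task_llc(struct task_struct *p, struct rq *rq,
				struct rq *dst_rq)
{
	int dst_cpu = cpu_of(dst_rq);
	s64 delta;

	lockdep_assert_held(&rq->lock);

	/* Same basic eligibility tests as can_migrate_task(). */
	if (throttled_lb_pair(task_group(p), cpu_of(rq), dst_cpu))
		return 0;

	if (!cpumask_test_cpu(dst_cpu, &p->cpus_allowed))
		return 0;

	if (task_running(rq, p))
		return 0;

	/*
	 * The check under discussion: mirror the cheap part of
	 * task_hot() and skip tasks that ran very recently, to avoid
	 * bouncing cache-hot tasks between rqs of the same LLC.
	 */
	delta = rq_clock_task(rq) - p->se.exec_start;
	if (delta < (s64)sysctl_sched_migration_cost)
		return 0;

	return 1;
}
----->8-----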
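
And here is a strawman of the load check from the last paragraph, with the
same caveats as above (untested, ~v4.19 naming, steal_fits() is a made-up
name). Since stealing in this series only triggers on newly idle CPUs, the
destination load is usually ~0 and this would almost always pass there;
it's only meant to show the shape of the guard:

----->8-----
/*
 * Strawman: refuse a steal that would leave dst_rq busier than the
 * source it stole from, i.e. a steal that manufactures the kind of
 * imbalance load_balance() exists to correct.
 */
static bool steal_fits(struct task_struct *p, struct rq *src_rq,
		       struct rq *dst_rq)
{
	unsigned long load = task_h_load(p);
	unsigned long src = weighted_cpuload(src_rq);
	unsigned long dst = weighted_cpuload(dst_rq);

	/* @p is (nearly) all of the source's load; moving it can't hurt. */
	if (src <= load)
		return true;

	return dst + load <= src - load;
}
----->8-----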