Subject: Re: [PATCHv4 12/12] sched/core: Disable SD_PREFER_SIBLING on asymmetric cpu capacity domains
To: Vincent Guittot, Morten Rasmussen
Cc: Peter Zijlstra, Ingo Molnar, Dietmar Eggemann, gaku.inami.xh@renesas.com, linux-kernel
References: <1530699470-29808-1-git-send-email-morten.rasmussen@arm.com> <1530699470-29808-13-git-send-email-morten.rasmussen@arm.com> <20180706143139.GE8596@e105550-lin.cambridge.arm.com>
From: Valentin Schneider
Date: Tue, 31 Jul 2018 13:33:42 +0100
Hi,

On 31/07/18 13:17, Vincent Guittot wrote:
> On Fri, 6 Jul 2018 at 16:31, Morten Rasmussen wrote:
>>
>> On Fri, Jul 06, 2018 at 12:18:17PM +0200, Vincent Guittot wrote:
>>> [...]
>>
>> Scheduling one task per cpu when n_task == n_cpus on asymmetric
>> topologies is generally broken already, and this patch set doesn't fix
>> that problem.
>>
>> SD_PREFER_SIBLING might seem to help in very specific cases:
>> n_little_cpus == n_big_cpus. In that case the little group might be
>> classified as overloaded. It doesn't guarantee that anything gets
>> pulled, as the grp_load/grp_capacity in the imbalance calculation on
>> some systems still says the little cpus are more loaded than the bigs
>> despite one of them being idle. That depends on the little cpu
>> capacities.
>>
>> On systems where n_little_cpus != n_big_cpus, SD_PREFER_SIBLING is
>> broken as it assumes the group_weight to be the same. This is the case
>> on Juno and several other platforms.
>>
>> IMHO, SD_PREFER_SIBLING isn't the solution to this problem. It might
>
> I agree, but this patchset creates a regression in the scheduling
> behaviour.
>
>> help for a limited subset of topologies/capacities, but the right
>> solution is to change the imbalance calculation. As the name says, it is
>
> Yes, that's what the prototype I came up with does.
>
>> meant to spread tasks and does so unconditionally. For asymmetric
>> systems we would like to consider cpu capacity before migrating tasks.
>>
>>> When running the tests of your cover letter, 1 long-running task is
>>> often co-scheduled on a big core whereas short pinned tasks are still
>>> running and a little core is idle, which is not an optimal scheduling
>>> decision.
>>
>> This can easily happen with SD_PREFER_SIBLING enabled too, so I wouldn't
>> say that this patch breaks anything that isn't broken already. In fact
>> we see this happening with and without this patch applied.
>
> At least for the use case above, this doesn't happen when
> SD_PREFER_SIBLING is set.
>

On my HiKey960 I can see co-scheduling on a big CPU while a LITTLE is free
with **and** without SD_PREFER_SIBLING.

Having it set only means that in some cases the imbalance will be
re-classified as group_overloaded instead of group_misfit_task, so we'll
skip the misfit logic when we shouldn't (this happens on Juno, for
instance).

It does nothing for the "1 task per CPU" problem that Morten described
above. When you have this few tasks, load isn't very relevant, but it
skews the load-balancer into thinking the LITTLE CPUs are busier than the
bigs even though there's an idle one in the lot.

>>
>> Morten
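To make the grp_load/grp_capacity point concrete, here is a rough userspace sketch with made-up load and capacity numbers. It is not kernel code: the variable names and the 0..1024 capacity scale merely mirror the kernel's conventions, and the specific values (a ~450-capacity LITTLE, ~60% loads) are assumptions chosen to show the shape of the comparison the load balancer makes.

```python
# Toy model of the load-vs-capacity skew discussed above (NOT kernel code).
# Capacities and loads use the kernel's usual 0..1024 scale, but all the
# concrete numbers here are hypothetical.

BIG_CAP = 1024
LITTLE_CAP = 462   # assumed LITTLE capacity, roughly HiKey960-like

def group_avg_load(per_cpu_loads, per_cpu_caps):
    # Roughly the quantity the balancer compares between groups:
    # total group load scaled by total group capacity.
    grp_load = sum(per_cpu_loads)
    grp_capacity = sum(per_cpu_caps)
    return grp_load * 1024 // grp_capacity

# 4 bigs each running one ~60%-load task; 4 LITTLEs, one of them idle.
big_loads = [614, 614, 614, 614]
little_loads = [410, 410, 410, 0]   # one LITTLE is completely idle

big_avg = group_avg_load(big_loads, [BIG_CAP] * 4)
little_avg = group_avg_load(little_loads, [LITTLE_CAP] * 4)

print(big_avg, little_avg)
# With these numbers the LITTLE group's load-per-capacity comes out
# *higher* than the bigs' despite the idle LITTLE CPU, so the balancer
# sees no reason to pull anything toward it.
```

With these assumed values the LITTLE group scores higher than the big group, which is exactly the "little cpus are more loaded than the bigs despite one of them being idle" situation Morten describes: whether it happens on real hardware depends on the actual LITTLE capacities.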