Received: by 2002:a25:b323:0:0:0:0:0 with SMTP id l35csp1558275ybj; Fri, 20 Sep 2019 12:25:36 -0700 (PDT) X-Google-Smtp-Source: APXvYqyENU0xuBygKo6VQ0iHPQ9hHZhEgEDibVOpBhUg++k3rnvmcKftJP+fTKNqD6QHvb827wsW X-Received: by 2002:a17:906:839a:: with SMTP id p26mr20404134ejx.94.1569007536316; Fri, 20 Sep 2019 12:25:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569007536; cv=none; d=google.com; s=arc-20160816; b=o1DFJ8eCR4n8gk6h2XN3vXWqrE/iesPjEPx/BwFeOOjk3hwSyTDoHEguk+di5Murd7 pNOqM9jYggR4pZZIhtMGnNYuqf4foJuSW9sRwbKUFQkFO8Tcf6OCIeFESa8YA8qKjuX6 6spAVjHYmYJMsp04Z81QRVEurudFbsgy7bWIpmnkqRr6TvruV/SLLX2+Htk3xOsEyrw3 PsdyXKXAfHKt8e/jPOGD2r1sA5W9wCEVCzYDWoYH8yW3XVNX3O7sR2LilybhubIbPdyp EfVhHaNHfwI93ZTkMrvZ4u8UHQJaESE4Vaama8NIpkhJbDDYyQTPQJkT/QSr9DTCGX6W mKWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=9HV6MXFUqRGM0doiGioUGLlc11XcSa+B501GCaOewvc=; b=zaN716MkW7HgIjKaoHPsQHIgNaT3zMUbZ2uTYjT+XCBwmT0b6KagyfOITr+iWdRWiB s8DGrFJO4bjNY8/KifAQvUTluLdZ/MuJD6uzziHIUnn0jYvxClbeAN6o2ehFl8hdG+55 0B5/muxgvJWu81tYWmy7PQg9As5EUx4MpsO5ZOAL8DCPLpy5jwUlPLo1JJ320lFtukak 5fSgCf+Rbw7sp789B5MGS4UmkGhuf8WqDmdYE8i41fwIL96A1ZyogkWrzDFOoN0APjhW SWXJrT0u0FETJ+r3TNjk4XOpO7tQDjAhB1s7+fnv78vsACVjYVPINf5KOWwLWr1KwxWc ToqQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j28si2229467eda.161.2019.09.20.12.25.11; Fri, 20 Sep 2019 12:25:36 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726610AbfITMwu (ORCPT + 99 others); Fri, 20 Sep 2019 08:52:50 -0400 Received: from foss.arm.com ([217.140.110.172]:44314 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726223AbfITMwu (ORCPT ); Fri, 20 Sep 2019 08:52:50 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8D67D1570; Fri, 20 Sep 2019 05:52:49 -0700 (PDT) Received: from [192.168.2.253] (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 19D783F67D; Fri, 20 Sep 2019 05:52:46 -0700 (PDT) Subject: Re: [PATCH] sched: rt: Make RT capacity aware To: Qais Yousef Cc: Steven Rostedt , Ingo Molnar , Peter Zijlstra , Vincent Guittot , Alessio Balsini , linux-kernel@vger.kernel.org References: <20190903103329.24961-1-qais.yousef@arm.com> <20190904072524.09de28aa@oasis.local.home> <20190904154052.ygbhtduzkfj3xs5d@e107158-lin.cambridge.arm.com> <20190918145233.kgntor5nb2gmnczd@e107158-lin.cambridge.arm.com> From: Dietmar Eggemann Message-ID: Date: Fri, 20 Sep 2019 14:52:43 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <20190918145233.kgntor5nb2gmnczd@e107158-lin.cambridge.arm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 9/18/19 4:52 PM, Qais Yousef wrote: > On 09/13/19 14:30, Dietmar Eggemann wrote: >> On 9/4/19 4:40 PM, Qais Yousef wrote: >>> On 09/04/19 07:25, Steven Rostedt wrote: >>>> On Tue, 3 Sep 2019 11:33:29 +0100 >>>> Qais Yousef wrote: [...] >> On a big.LITTLE system (6 CPUs with [446 1024 1024 446 466 466] CPU >> capacity vector) I try to trace the handling of the 3rd big task >> (big2-2, util_min: 800, util_max: 1024) of an rt-app workload running 3 >> of them. >> >> rt_task_fits_capacity() call in: >> >> tag 1: select_task_rq_rt(), 3-7: 1st till 5th in find_lowest_rq() >> >> [ 37.505325] rt_task_fits_capacity: CPU3 tag=1 [big2-2 285] ret=0 >> [ 37.505882] find_lowest_rq: CPU3 [big2-2 285] tag=1 find_lowest_rq >> [ 37.506509] CPU3 [big2-2 285] lowest_mask=0,3-5 >> [ 37.507971] rt_task_fits_capacity: CPU3 tag=3 [big2-2 285] ret=0 >> [ 37.508200] rt_task_fits_capacity: CPU3 tag=4 [big2-2 285] ret=0 >> [ 37.509486] rt_task_fits_capacity: CPU0 tag=5 [big2-2 285] ret=0 >> [ 37.510191] rt_task_fits_capacity: CPU3 tag=5 [big2-2 285] ret=0 >> [ 37.511334] rt_task_fits_capacity: CPU4 tag=5 [big2-2 285] ret=0 >> [ 37.512194] rt_task_fits_capacity: CPU5 tag=5 [big2-2 285] ret=0 >> [ 37.513210] rt_task_fits_capacity: CPU0 tag=6 [big2-2 285] ret=0 >> [ 37.514085] rt_task_fits_capacity: CPU3 tag=7 [big2-2 285] ret=0 >> [ 37.514732] --> select_task_rq_rt: CPU3 [big2-2 285] cpu=0 >> >> Since CPU 0,3-5 can't run big2-2, it takes up to the test that the >> fitness hasn't changed. In case a capacity-aware (with fallback CPU) >> cpupri_find() would have returned a suitable lowest_mask, less work >> would have been needed. > > rt_task_fits_capacity() is inlined and I'd expect all the data it accesses to > be cache hot, so it should be fast. My concern is not so much runtime but maintenance of the new code. We want to integrate a feature 'Asymmetric CPU capacity support' in RT and DL sched class which have the same overall design (DL was copied from RT). IMHO, the new feature 'Asymmetric CPU capacity support' has to be integrated the same way (if possible) to let people understand the related code easily. > I had 2 concerns with using cpupri_find() > > 1. find_lowest_rq() is not the only user But the only one calling it with lowest_mask != NULL (like for DL find_later_rq() is the only one calling cpudl_find() with later_mask != NULL. > 2. The fallback mechanism means we either have to call cpupri_find() > twice once to find filtered lowest_rq and the other to return the > none filtered version. This is what I have in mind. (Only compile tested! ... and the 'if (cpumask_any(lowest_mask) >= nr_cpu_ids)' condition has to be considered as well): @@ -98,8 +103,26 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p, continue; if (lowest_mask) { + int cpu, max_cap_cpu = -1; + unsigned long max_cap = 0; + cpumask_and(lowest_mask, p->cpus_ptr, vec->mask); + for_each_cpu(cpu, lowest_mask) { + unsigned long cap = arch_scale_cpu_capacity(cpu); + + if (!rt_task_fits_capacity(p, cpu)) + cpumask_clear_cpu(cpu, lowest_mask); + + if (cap > max_cap) { + max_cap = cap; + max_cap_cpu = cpu; + } + } + + if (cpumask_empty(lowest_mask) && max_cap) + cpumask_set_cpu(max_cap_cpu, lowest_mask); + [...] >> I can't see the >> >> for_each_domain(cpu, sd) { >> for_each_cpu_and(best_cpu, lowest_mask, sched_domain_span(sd)) { >> ... >> } >> } >> >> other than we want to pick a CPU from the lowest_mask first which is >> closer to task_cpu(task). >> >> Qemu doesn't give me cluster: >> >> # cat /proc/sys/kernel/sched_domain/cpu0/domain*/name >> MC > > I'm not sure I get what you're trying to get at here. Can you please elaborate > more? I was referring to this for each scheduler domain snippet: for_each_domain(cpu, sd) { } this would first look for a best_cpu in MC and then in DIE level in the original code. I remember that Steven asked about the for_each_cpu_and() you introduced here. But in case you can test the idea of introducing the changes in cpupri_find() you won't have to worry about this anymore. [...]