Date: Fri, 31 Jan 2020 15:34:06 +0000
From: Qais Yousef
To: Pavan Kondeti
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] sched: rt: Make RT capacity aware
Message-ID: <20200131153405.2ejp7fggqtg5dodx@e107158-lin.cambridge.arm.com>
References: <20191009104611.15363-1-qais.yousef@arm.com> <20200131100629.GC27398@codeaurora.org>
In-Reply-To: <20200131100629.GC27398@codeaurora.org>

Hi Pavan

On
01/31/20 15:36, Pavan Kondeti wrote:
> Hi Qais,
>
> On Wed, Oct 09, 2019 at 11:46:11AM +0100, Qais Yousef wrote:

[...]

> > For RT we don't have a per task utilization signal and we lack any
> > information in general about what performance requirement the RT task
> > needs. But with the introduction of uclamp, RT tasks can now control
> > that by setting uclamp_min to guarantee a minimum performance point.

[...]

> > ---
> >
> > Changes in v2:
> >     - Use cpupri_find() to check the fitness of the task instead of
> >       sprinkling find_lowest_rq() with several checks of
> >       rt_task_fits_capacity().
> >
> >       The selected implementation opted to pass the fitness function as
> >       an argument rather than call rt_task_fits_capacity() directly,
> >       which is cleaner for keeping the logical separation of the 2
> >       modules; but it means the compiler has less room to optimize
> >       rt_task_fits_capacity() out when it's a constant value.
> >
> > The logic is not perfect. For example if a 'small' task is occupying a
> > big CPU and another big task wakes up; we won't force migrate the
> > small task to clear the big CPU for the big task that woke up.
> >
> > IOW, the logic is best effort and can't give hard guarantees. But it
> > improves the current situation where a task can randomly end up on any
> > CPU regardless of what it needs. I.e.: without this patch an RT task
> > can wake up on a big or small CPU, but with this it will always wake
> > up on a big CPU (assuming the big CPUs aren't overloaded) - hence
> > provide consistent performance.

[...]

> I understand that RT tasks run on BIG cores by default when uclamp is
> enabled. Can you tell what happens when we have more runnable RT tasks
> than the BIG CPUs? Do they get packed on the BIG CPUs or eventually
> silver CPUs pull those tasks? Since rt_task_fits_capacity() is
> considered during wakeup, push and pull, the tasks may get packed on
> BIG forever. Is my understanding correct?
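To make the discussion concrete, the fitness test is conceptually just a capacity comparison. Below is a simplified standalone sketch, not the exact kernel code: the real rt_task_fits_capacity() also bails out early on symmetric-capacity systems (via the sched_asym_cpucapacity static key) and derives the clamps with uclamp_eff_value(); the struct and the capacity numbers here are purely illustrative.

```c
#include <stdbool.h>

/* Kernel capacity scale: 1024 == the biggest CPU in the system. */
#define SCHED_CAPACITY_SCALE 1024u

/* Illustrative stand-in for the task's effective uclamp values. */
struct rt_task_model {
	unsigned int uclamp_min;	/* requested minimum performance */
	unsigned int uclamp_max;	/* requested maximum performance */
};

/*
 * A task "fits" a CPU when the CPU's original capacity covers the
 * task's uclamp_min, itself clamped by uclamp_max.
 */
static bool rt_task_fits_capacity(const struct rt_task_model *p,
				  unsigned int cpu_capacity)
{
	unsigned int min_cap = p->uclamp_min;
	unsigned int max_cap = p->uclamp_max;
	unsigned int limit = min_cap < max_cap ? min_cap : max_cap;

	return cpu_capacity >= limit;
}
```

On Juno the littles have an original capacity of roughly 446 and the bigs 1024 (approximate numbers), which is why a default-boosted RT task (uclamp_min = 1024) only "fits" the two big CPUs, while one with a smaller uclamp_min fits everywhere.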
I left the relevant part of the commit message and my 'cover-letter' above, which should contain answers to your question.

In short, the logic is best effort and isn't a hard guarantee. When the system is overloaded we'll still spread, and a task that needs a big core might end up on a little one. But AFAIU with RT, if you really want guarantees you need to do some planning; otherwise there are no guarantees in general that your task will get what it needs.

But I understand your question is about the general purpose case. I've hacked my notebook to run a few tests for you

https://gist.github.com/qais-yousef/cfe7487e3b43c3c06a152da31ae09101

Look at the diagrams in "Test {1, 2, 3} Results". I spawned 6 tasks which match the 6 cores on the Juno I ran on. This is based on Linus' master from a couple of days ago. Note that on Juno cores 1 and 2 are the big cores. 'b_*' and 'l_*' are the task names, which are remnants from my previous testing where I spawned different numbers of big and small tasks.

I repeated the same tests 3 times to demonstrate the repeatability.

The logic causes 2 tasks to run on a big CPU, but there's spreading. IMO on a general purpose system this is a good behavior. On a real-time system that needs better guarantees there's no alternative to doing proper RT planning.

In the last test I just spawn 2 tasks, which end up on the right CPUs, 1 and 2.

On systems like Android my observation has been that there are very few concurrent RT tasks active at the same time. So if there are some tasks in the system that do want to be on the big CPU, they are most likely to get that guarantee. Without this patch what you get is completely random.

> Also what happens for the case where RT tasks are pinned to silver but
> with default uclamp value i.e p.uclamp.min=1024 ? They may all get
> queued on a single silver and other silvers may not help since the task
> does not fit there. In practice, we may not use this setup. Just wanted
> to know if this behavior is intentional or not.
I'm not sure I understand your question. If the RT tasks are affined to a set of CPUs, then we'll only search in these CPUs. I expect the logic not to change with this patch.

If you have a use case that you think breaks with this patch, can you please share the details so I can reproduce it?

I just ran several tests spawning 4 tasks affined to the little cores and I indeed see them spreading on the littles.

Cheers

--
Qais Yousef