Date: Mon, 24 Feb 2020 17:41:39 +0000
From: Qais Yousef
To: Pavan Kondeti
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Dietmar Eggemann,
    Juri Lelli, Vincent Guittot, Ben Segall, Mel Gorman, LKML
Subject: Re: [PATCH v2 5/6] sched/rt: Better manage pushing unfit tasks on wakeup
Message-ID: <20200224174138.n6pmoeffqg7eqiy2@e107158-lin.cambridge.arm.com>
References: <20200223184001.14248-1-qais.yousef@arm.com>
    <20200223184001.14248-6-qais.yousef@arm.com>
    <20200224061004.GH28029@codeaurora.org>
    <20200224121139.cbz2dt5heiouknif@e107158-lin.cambridge.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/24/20 21:34, Pavan Kondeti wrote:
> Hi Qais,
>
> On Mon, Feb 24, 2020 at 5:42 PM Qais Yousef wrote:
> [...]
> > We could do, temporarily, to get these fixes into 5.6. But I do think
> > select_task_rq_rt() doesn't do a good enough job of pushing unfit tasks to
> > the right CPUs.
> >
> > I don't understand the reasons behind your objection. It seems you think that
> > select_task_rq_rt() should be enough, but not AFAICS. Can you be a bit more
> > detailed please?
> >
> > FWIW, here's a screenshot of what I see
> >
> > https://imgur.com/a/peV27nE
> >
> > After the first activation, select_task_rq_rt() fails to find the right CPU
> > (due to the same move all tasks to the cpumask_first()) - but when the task
> > wakes up on 4, the logic I put causes it to migrate to CPU2, which is the 2nd
> > big core. CPU1 and CPU2 are the big cores on Juno.
> >
> > Now maybe we should fix select_task_rq_rt() to better balance tasks, but not
> > sure how easy that is.
> >
>
> Thanks for the trace. Now things are clear to me. Two RT tasks woke up
> simultaneously and the first task got its previous CPU, i.e. CPU#1. The next
> task went through find_lowest_rq() and got the same CPU#1. Since this task's
> priority is not higher than that of the just-queued task (already queued on
> CPU#1), it is sent to its previous CPU, i.e. CPU#4 in your case.
>
> From the task_woken_rt() path, CPU#4 attempts push_rt_tasks(). CPU#4 is not
> overloaded, but we have the rt_task_fits_capacity() check which forces the
> push. Since the CPU is not overloaded, your has_unfit_tasks() comes to the
> rescue and pushes the task. Since the task has not been scheduled in yet, it
> is eligible for push. You added checks to skip resched_curr() in
> push_rt_tasks(), otherwise the push won't happen.

Nice summary, that's exactly what it is :)

> Finally, I understood your patch. Obviously this was not clear to me
> before.
> I am not sure if this patch is the right approach to solve this race. I will
> think a bit more.

I haven't been staring at this code for as long as you, but since we have logic
at wakeup to do a push, I think we need something here anyway for unfit tasks.

Fixing select_task_rq_rt() to better balance tasks will help a lot in general,
but if that were enough already, why would we need to consider a push at wakeup
at all?

AFAIU, in SMP the whole push-pull mechanism is racy, and we introduce redundancy
by taking the decision at various points to minimize this racy nature of SMP
systems. Anything could have happened between the time we called
select_task_rq_rt() and the wakeup, so we double check again before we finally
go and run. That's how I interpret it.

I am open to hearing about other alternatives first anyway. Your help has been
much appreciated so far.

Thanks

--
Qais Yousef
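
To make the sequence discussed above easier to follow, here is a rough userspace sketch in C. It is a toy model, not kernel code: toy_cpu, toy_task, toy_select_rq(), toy_push_if_unfit() and toy_wakeup() are hypothetical names, the capacity numbers are illustrative only, and the placement rule is a deliberate simplification of what select_task_rq_rt() and the wakeup-time push do. It only shows the "decide at placement, then double check at wakeup" idea.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 6

/* Hypothetical, simplified view of a CPU: capacity and queued RT tasks. */
struct toy_cpu {
    int capacity;
    int nr_rt_queued;
};

/* Hypothetical RT task: the capacity it needs and where it last ran. */
struct toy_task {
    int min_capacity;
    int prev_cpu;
    int cpu; /* -1 while not queued anywhere */
};

/* Assumed layout: CPU1 and CPU2 are big, the rest little (values illustrative). */
struct toy_cpu cpus[TOY_NR_CPUS] = {
    { 446, 0 }, { 1024, 0 }, { 1024, 0 }, { 446, 0 }, { 446, 0 }, { 446, 0 },
};

bool toy_fits(const struct toy_task *t, int cpu)
{
    return cpus[cpu].capacity >= t->min_capacity;
}

/*
 * Placement step (simplified): always consider the first fitting CPU, and
 * fall back to the previous CPU if that one already has an RT task queued.
 */
int toy_select_rq(const struct toy_task *t)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (!toy_fits(t, cpu))
            continue;
        if (cpus[cpu].nr_rt_queued == 0)
            return cpu;
        break; /* first fitting CPU is busy: give up, use prev_cpu */
    }
    return t->prev_cpu;
}

/*
 * Wakeup re-check: if the task does not fit on the CPU it landed on, look
 * for an idle CPU that fits. Returns the CPU the task should end up on.
 */
int toy_push_if_unfit(const struct toy_task *t, int cpu)
{
    if (toy_fits(t, cpu))
        return cpu;
    for (int c = 0; c < TOY_NR_CPUS; c++) {
        if (toy_fits(t, c) && cpus[c].nr_rt_queued == 0)
            return c;
    }
    return cpu; /* nothing better found, stay put */
}

void toy_enqueue(struct toy_task *t, int cpu)
{
    t->cpu = cpu;
    cpus[cpu].nr_rt_queued++;
}

void toy_migrate(struct toy_task *t, int dst)
{
    assert(t->cpu >= 0 && t->cpu < TOY_NR_CPUS);
    cpus[t->cpu].nr_rt_queued--;
    toy_enqueue(t, dst);
}

/* Full wakeup: place the task, then double check the decision and push. */
int toy_wakeup(struct toy_task *t)
{
    toy_enqueue(t, toy_select_rq(t));

    int dst = toy_push_if_unfit(t, t->cpu);
    if (dst != t->cpu)
        toy_migrate(t, dst);
    return t->cpu;
}
```

In this toy model, waking a task A with prev_cpu 1 and then a task B with prev_cpu 4, both needing capacity 1024, leaves A on CPU1, places B on CPU4 first, and then moves B to CPU2 in the re-check step. That matches the shape of the migration in the trace, but the real decision logic in the kernel is considerably more involved.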