Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp3177912ybv; Mon, 24 Feb 2020 19:56:08 -0800 (PST) X-Google-Smtp-Source: APXvYqwNj3bkZJY+CbM0m7gF9kYzat5EVeS2iZ22TTt6KAIVZ3S7B3nTHmLP4pLDA3VtqdY1rCYR X-Received: by 2002:aca:2315:: with SMTP id e21mr1845054oie.147.1582602968307; Mon, 24 Feb 2020 19:56:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582602968; cv=none; d=google.com; s=arc-20160816; b=q0ipc/ETzySzobq77hBF4oRSnTVqGpXW0jkoPjN+h3uE0urStddUG7z8FhHwIJ4OCO sN+BJQ/1+SWS/4JRYW3MOrqIMcG6LR7SC+fC43idWVb3vh5vgRXd13KBilDRJF9f25O7 OVNHRn6Dgp/mZYCN+tUNbLQl9wuNynnV6s6Kqm8b41b2S1j08AiKENJvlOnNzX/A9oIm 2W+UdjVDQb43N1SWI5s3WjoJ62FKdrzEJy4AnuirdU07dkGx5XTAp73sDRrn7QBg9v3P Wy8bKEpQHRHjxT5KIB/LgMo9u8yAFAHdvLea4Z/zVCUomdobvTN+xtrlmfrYMe4czEN7 6c4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dmarc-filter:dkim-signature; bh=s35BPNR6Z45kmkW/uZVQApEVNGV6Tou60aGUAhwOQt4=; b=l8+Dx7s8eS8Ht3Rm0QEQCCHM6TJDFL+fMwhxSc2ixTKtRVALQL8cj5M+8ttA4INdlQ gmfOr5b+YqVngzWwNszfg8/+LxjMxgD4lIZusMkmoXwbGA0UXbwfBmdQT/rloAn8t47W 5r+g/QxlfIabiCSuuRDKB3iy5ZeYnwNz6s29pPAwAcdAsNyvneJUkZtn+gk7c2W0Ba31 h3RH5nyjDnaCq07g6Vgc1sb9rEzzbsjhXb7L5JS8A+NGSNflOnhiSTSBpO6jhNsjevbw HNIj1+36ThsKCD6CruKhnZrVGJvViRT76EUEJPk1rh2pI7dPPukVSNxCU5/FjsdWac7f eq/w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=xUld4MdS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o2si7469390otj.312.2020.02.24.19.55.54; Mon, 24 Feb 2020 19:56:08 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=xUld4MdS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728866AbgBYDzk (ORCPT + 99 others); Mon, 24 Feb 2020 22:55:40 -0500 Received: from mail26.static.mailgun.info ([104.130.122.26]:29341 "EHLO mail26.static.mailgun.info" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726962AbgBYDzk (ORCPT ); Mon, 24 Feb 2020 22:55:40 -0500 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1582602939; h=In-Reply-To: Content-Type: MIME-Version: References: Message-ID: Subject: Cc: To: From: Date: Sender; bh=s35BPNR6Z45kmkW/uZVQApEVNGV6Tou60aGUAhwOQt4=; b=xUld4MdSUBAmhxKRdwbrDLGX3A5pJr0pX755vZhl+G6C0W7vz9gjdp0j6eFFDSQ/zIVZ0Rcw sz2//BeWjD5/msinWx8p9GDvEqjC6MWnuHcTz++JJoSChJUZcywxbif/4Uop0DX3HZgowLmP rDeGU4VaXXTqamu/x8TLAUuHBOM= X-Mailgun-Sending-Ip: 104.130.122.26 X-Mailgun-Sid: WyI0MWYwYSIsICJsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by mxa.mailgun.org with ESMTP id 5e549aa3.7f32c5460a08-smtp-out-n03; Tue, 25 Feb 2020 03:55:15 -0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1001) id 59122C447A2; Tue, 25 Feb 2020 03:55:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-caf-mail-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=2.0 tests=ALL_TRUSTED,SPF_NONE, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from codeaurora.org (blr-c-bdr-fw-01_GlobalNAT_AllZones-Outside.qualcomm.com [103.229.19.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pkondeti) by smtp.codeaurora.org (Postfix) with ESMTPSA id 13E91C43383; Tue, 25 Feb 2020 03:55:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 13E91C43383 Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; spf=none smtp.mailfrom=pkondeti@codeaurora.org Date: Tue, 25 Feb 2020 09:25:05 +0530 From: Pavan Kondeti To: Qais Yousef Cc: Ingo Molnar , Peter Zijlstra , Steven Rostedt , Dietmar Eggemann , Juri Lelli , Vincent Guittot , Ben Segall , Mel Gorman , LKML Subject: Re: [PATCH v2 5/6] sched/rt: Better manage pushing unfit tasks on wakeup Message-ID: <20200225035505.GI28029@codeaurora.org> References: <20200223184001.14248-1-qais.yousef@arm.com> <20200223184001.14248-6-qais.yousef@arm.com> <20200224061004.GH28029@codeaurora.org> <20200224121139.cbz2dt5heiouknif@e107158-lin.cambridge.arm.com> <20200224174138.n6pmoeffqg7eqiy2@e107158-lin.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200224174138.n6pmoeffqg7eqiy2@e107158-lin.cambridge.arm.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Feb 24, 2020 at 05:41:39PM +0000, Qais Yousef wrote: > On 02/24/20 21:34, Pavan Kondeti wrote: > > Hi Qais, > > > > On Mon, Feb 24, 2020 at 5:42 PM Qais Yousef wrote: > > [...] > > > We could do, temporarily, to get these fixes into 5.6. But I do think > > > select_task_rq_rt() doesn't do a good enough job into pushing unfit tasks to > > > the right CPUs. > > > > > > I don't understand the reasons behind your objection. It seems you think that > > > select_task_rq_rt() should be enough, but not AFAICS. Can you be a bit more > > > detailed please? > > > > > > FWIW, here's a screenshot of what I see > > > > > > https://imgur.com/a/peV27nE > > > > > > After the first activation, select_task_rq_rt() fails to find the right CPU > > > (due to the same move all tasks to the cpumask_fist()) - but when the task > > > wakes up on 4, the logic I put causes it to migrate to CPU2, which is the 2nd > > > big core. CPU1 and CPU2 are the big cores on Juno. > > > > > > Now maybe we should fix select_task_rq_rt() to better balance tasks, but not > > > sure how easy is that. > > > > > > > Thanks for the trace. Now things are clear to me. Two RT tasks woke up > > simultaneously and the first task got its previous CPU i.e CPU#1. The next task > > goes through find_lowest_rq() and got the same CPU#1. Since this task priority > > is not more than the just queued task (already queued on CPU#1), it is sent > > to its previous CPU i.e CPU#4 in your case. > > > > From task_woken_rt() path, CPU#4 attempts push_rt_tasks(). CPU#4 is > > not overloaded, > > but we have rt_task_fits_capacity() check which forces the push. Since the CPU > > is not overloaded, your has_unfit_tasks() comes to rescue and push the > > task. Since > > the task has not scheduled in yet, it is eligible for push. You added checks > > to skip resched_curr() in push_rt_tasks() otherwise the push won't happen. > > Nice summary, that's exactly what it is :) > > > Finally, I understood your patch. Obviously this is not clear to me > > before. I am not > > sure if this patch is the right approach to solve this race. I will > > think a bit more. > > I haven't been staring at this code for as long as you, but since we have > logic at wakeup to do a push, I think we need something here anyway for unfit > tasks. > > Fixing select_task_rq_rt() to better balance tasks will help a lot in general, > but if that was enough already then why do we need to consider a push at the > wakeup at all then? > > AFAIU, in SMP the whole push-pull mechanism is racy and we introduce redundancy > at taking the decision on various points to ensure we minimize this racy nature > of SMP systems. Anything could have happened between the time we called > select_task_rq_rt() and the wakeup, so we double check again before we finally > go and run. That's how I interpret it. > > I am open to hear about other alternatives first anyway. Your help has been > much appreciated so far. > The search inside find_lowest_rq() happens without any locks so I believe it is expected to have races like this. In fact there is a comment in the code saying "This test is optimistic, if we get it wrong the load-balancer will have to sort it out" in select_task_rq_rt(). However, the push logic as of today works only for overloaded case. In that sense, your patch fixes this race for b.L systems. At the same time, I feel like tracking nonfit tasks just to fix this race seems to be too much. I will leave this to Steve and others to take a decision. I thought of suggesting to remove the below check from select_task_rq_rt() p->prio < cpu_rq(target)->rt.highest_prio.curr which would then make the target CPU overloaded and the push logic would spread the tasks. That works for a b.L system too. However there seems to be a very good reason for doing this. see https://lore.kernel.org/patchwork/patch/539137/ The fact that a CPU is part of lowest_mask but running a higher prio RT task means there is a race. Should we retry one more time to see if we find another CPU? Thanks, Pavan -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.