Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp614883ybf; Wed, 26 Feb 2020 19:39:23 -0800 (PST) X-Google-Smtp-Source: APXvYqwSqIfRTy8TMCM8yWkCwtXkoDJN9aX12EUTnLx6yDS9O1k2SOOhEYsEBGPXeU3EWLpQaCf3 X-Received: by 2002:a05:6830:4b9:: with SMTP id l25mr1769952otd.266.1582774763724; Wed, 26 Feb 2020 19:39:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582774763; cv=none; d=google.com; s=arc-20160816; b=cS5OnWPbnAaF5CYaUc1hTwLH08OCDY2z+P2llkUHFoHvWTIQqqQ8qPSJqc6pHuNN8V QWPOTo8+Dyv0qihA67vzUVsEbOSghoFkhNSoNvvyKW+MV4z3RtdRXGTQU+NKzGBAXktb KC00UFeBIMkUWhuRffKvChYq1wKhc/5T1sPwbnAQ+CEsEOZjQ2GrlWz0IdygqwNb6GIR YSJ0o6lsvKh9VTtYoZGnwrmLy0A4Ug32GO4VoVbbrm41EZigSGcIHIajXGwgzGe67/HK 0kGksE+dgpjB9bNPz2KYtm8/sM9+LHAc6s+NydA6jaaC+R6jR99xfX3zUWK+BZxF995B +gHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dmarc-filter:dkim-signature; bh=jwMlsAus4argjYMwypaw/3x3lbhGt3TcpI3om3s5BIQ=; b=WknRBMZZpBJv6a13ArhapPx7u2RYian346S9Fd6uwkXPXKQ80srnRCifeEL0UOEc4+ 0vo4Ick+Yhyk2IWNFMiwDy8y2tI+Q5F+e9Ni62I53+DsMwaWAeAIJzjWJfNxIJ3ELS0E TVXm9Ca7EeYMGuTYFBVBAObCqE2ygeUntKlDZnm9pqUPSwDFqefm4tu+xCd61WAAF4UF kF4nYUXluxyYIMpbCO5lPC264/Vc/xU+l5fR0cCxEJDM8Hb0zuinfp4IGxZ37ItNIoCG rxR9V3M16Oo3tmwstexTyEepQ0MhdNoryEBYDdHk142/BPNeKoM51AKRJJ3rPN0KiiWu t1KQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=J2LhFOSa; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q189si607577oic.235.2020.02.26.19.39.11; Wed, 26 Feb 2020 19:39:23 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=J2LhFOSa; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728296AbgB0Dgi (ORCPT + 99 others); Wed, 26 Feb 2020 22:36:38 -0500 Received: from mail26.static.mailgun.info ([104.130.122.26]:63604 "EHLO mail26.static.mailgun.info" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728253AbgB0Dgh (ORCPT ); Wed, 26 Feb 2020 22:36:37 -0500 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1582774597; h=In-Reply-To: Content-Type: MIME-Version: References: Message-ID: Subject: Cc: To: From: Date: Sender; bh=jwMlsAus4argjYMwypaw/3x3lbhGt3TcpI3om3s5BIQ=; b=J2LhFOSaKxOYvmy2nPjO+mlnAPluW659AUhWRkoYXeh8RIAIBO+6nh8tAMu6E5NhuRl70osT Lwhk6TmmRBRuDyev9GRxH4xAF2Srk3cMqSjnG7Otg6Y4D873kjE6rG/gtcbHSy4vJa6rdmR/ Vr2evBwzKUiBy0lTXKuXkXtih+g= X-Mailgun-Sending-Ip: 104.130.122.26 X-Mailgun-Sid: WyI0MWYwYSIsICJsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by mxa.mailgun.org with ESMTP id 5e573931.7fb4868c3d50-smtp-out-n03; Thu, 27 Feb 2020 03:36:17 -0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1001) id 5645BC447A3; Thu, 27 Feb 2020 03:36:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-caf-mail-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=2.0 tests=ALL_TRUSTED,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.0 Received: from codeaurora.org (blr-c-bdr-fw-01_GlobalNAT_AllZones-Outside.qualcomm.com [103.229.19.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pkondeti) by smtp.codeaurora.org (Postfix) with ESMTPSA id 85AE2C43383; Thu, 27 Feb 2020 03:36:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 85AE2C43383 Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; spf=none smtp.mailfrom=pkondeti@codeaurora.org Date: Thu, 27 Feb 2020 09:06:08 +0530 From: Pavan Kondeti To: Qais Yousef Cc: Ingo Molnar , Peter Zijlstra , Steven Rostedt , Dietmar Eggemann , Juri Lelli , Vincent Guittot , Ben Segall , Mel Gorman , LKML Subject: Re: [PATCH v2 5/6] sched/rt: Better manage pushing unfit tasks on wakeup Message-ID: <20200227033608.GN28029@codeaurora.org> References: <20200223184001.14248-1-qais.yousef@arm.com> <20200223184001.14248-6-qais.yousef@arm.com> <20200224061004.GH28029@codeaurora.org> <20200224121139.cbz2dt5heiouknif@e107158-lin.cambridge.arm.com> <20200224174138.n6pmoeffqg7eqiy2@e107158-lin.cambridge.arm.com> <20200225035505.GI28029@codeaurora.org> <20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 26, 2020 at 04:02:48PM +0000, Qais Yousef wrote: > On 02/25/20 09:25, Pavan Kondeti wrote: > > > I haven't been staring at this code for as long as you, but since we have > > > logic at wakeup to do a push, I think we need something here anyway for unfit > > > tasks. > > > > > > Fixing select_task_rq_rt() to better balance tasks will help a lot in general, > > > but if that was enough already then why do we need to consider a push at the > > > wakeup at all then? > > > > > > AFAIU, in SMP the whole push-pull mechanism is racy and we introduce redundancy > > > at taking the decision on various points to ensure we minimize this racy nature > > > of SMP systems. Anything could have happened between the time we called > > > select_task_rq_rt() and the wakeup, so we double check again before we finally > > > go and run. That's how I interpret it. > > > > > > I am open to hear about other alternatives first anyway. Your help has been > > > much appreciated so far. > > > > > > > The search inside find_lowest_rq() happens without any locks so I believe it > > is expected to have races like this. In fact there is a comment in the code > > saying "This test is optimistic, if we get it wrong the load-balancer > > will have to sort it out" in select_task_rq_rt(). However, the push logic > > as of today works only for overloaded case. In that sense, your patch fixes > > this race for b.L systems. At the same time, I feel like tracking nonfit tasks > > just to fix this race seems to be too much. I will leave this to Steve and > > others to take a decision. > > I do think without this tasks can end up on the wrong CPU longer than they > should. Keep in mind that if a task is boosted to run on a big core, it still > have to compete with non-boosted tasks who can run on a any cpu. So this > opportunistic push might be necessary. > > For 5.6 though, I'll send an updated series that removes the fitness check from > task_woken_rt() && switched_to_rt() and carry on with this discussion for 5.7. > > > > > I thought of suggesting to remove the below check from select_task_rq_rt() > > > > p->prio < cpu_rq(target)->rt.highest_prio.curr > > > > which would then make the target CPU overloaded and the push logic would > > spread the tasks. That works for a b.L system too. However there seems to > > be a very good reason for doing this. see > > https://lore.kernel.org/patchwork/patch/539137/ > > > > The fact that a CPU is part of lowest_mask but running a higher prio RT > > task means there is a race. Should we retry one more time to see if we find > > another CPU? > > Isn't this what I did in v1? > > https://lore.kernel.org/lkml/20200214163949.27850-4-qais.yousef@arm.com/ > Yes, that patch allows overloading the CPU When the priorities are same. I think, We should also consider when a low prio task and high prio task are waking at the same time and high prio task winning the race. Thanks, Pavan -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.