Date: Thu, 26 Nov 2020 10:17:15 +1100
From: Balbir Singh
To: Peter Zijlstra
Cc: Vineeth Pillai, "Joel Fernandes (Google)", Nishanth Aravamudan,
 Julien Desfossez, Tim Chen, Aaron Lu, Aubrey Li, tglx@linutronix.de,
 linux-kernel@vger.kernel.org, mingo@kernel.org,
 torvalds@linux-foundation.org, fweisbec@gmail.com, keescook@chromium.org,
 kerrnel@google.com, Phil Auld, Valentin Schneider, Mel Gorman,
 Pawan Gupta, Paolo Bonzini, vineeth@bitbyteword.org, Chen Yu,
 Christian Brauner, Agata Gruza, Antonio Gomez Iglesias, graf@amazon.com,
 konrad.wilk@oracle.com, dfaggioli@suse.com, pjt@google.com,
 rostedt@goodmis.org, derkling@google.com, benbjiang@tencent.com,
 Alexandre Chartre, James.Bottomley@hansenpartnership.com,
 OWeisse@umich.edu, Dhaval Giani, Junaid Shahid, jsbarnes@google.com,
 chris.hyser@oracle.com, Ben Segall, Josh Don, Hao Luo, Tom Lendacky,
 Aubrey Li, "Paul E.
 McKenney", Tim Chen
Subject: Re: [PATCH -tip 09/32] sched/fair: Snapshot the min_vruntime of CPUs on force idle
Message-ID: <20201125231715.GD163610@balbir-desktop>
References: <20201117232003.3580179-1-joel@joelfernandes.org>
 <20201117232003.3580179-10-joel@joelfernandes.org>
 <20201122114442.GD110669@balbir-desktop>
 <2cb42831-5074-e0a9-9e2a-f2a880504026@linux.microsoft.com>
 <20201123233149.GB8893@balbir-desktop>
 <20201124090955.GV3021@hirez.programming.kicks-ass.net>
In-Reply-To: <20201124090955.GV3021@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 24, 2020 at 10:09:55AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 24, 2020 at 10:31:49AM +1100, Balbir Singh wrote:
> > On Mon, Nov 23, 2020 at 07:31:31AM -0500, Vineeth Pillai wrote:
> > > Hi Balbir,
> > >
> > > On 11/22/20 6:44 AM, Balbir Singh wrote:
> > > >
> > > > This seems cumbersome, is there no way to track the min_vruntime via
> > > > rq->core->min_vruntime?
> > > Do you mean to have a core wide min_vruntime? We had a
> > > similar approach from v3 to v7 and it had major issues which
> > > broke the assumptions of cfs. There were some lengthy
> > > discussions and Peter explained in-depth about the issues:
> > >
> > > https://lwn.net/ml/linux-kernel/20200506143506.GH5298@hirez.programming.kicks-ass.net/
> > > https://lwn.net/ml/linux-kernel/20200515103844.GG2978@hirez.programming.kicks-ass.net/
> > >
> >
> > One of the equations in the link is
> >
> > "From which immediately follows that:
> >
> >           T_k + T_l
> >   S_k+l = ---------                                       (13)
> >           W_k + W_l
> >
> > On which we can define a combined lag:
> >
> >   lag_k+l(i) := S_k+l - s_i                               (14)
> >
> > And that gives us the tools to compare tasks across a combined runqueue.
> > "
> >
> > S_k+l reads like rq->core->vruntime, but it sounds like the equivalent
> > of rq->core->vruntime is updated when we enter forced idle as opposed to
> > all the time.
>
> Yes, but actually computing and maintaining it is hella hard. Try it
> with the very first example in that email (the infeasible weight
> scenario) and tell me how it works for you ;-)
>

OK, I was hoping it could be done in the new RBTree's enqueue/dequeue,
but yes, I've not implemented it, and I should go back and take a look at
the first example again.

> Also note that the text below (6) mentions dynamic, then look up the
> EEVDF paper which describes some of the dynamics -- the paper is
> incomplete and contains a bug, I forget if it ever got updated or if
> there's another paper that completes it (the BFQ I/O scheduler started
> from that and fixed it).

I see. I am yet to read the EEVDF paper, but now I am out on a tangent :)

>
> I'm not saying it cannot be done, I'm just saying it is really rather
> involved and probably not worth it.
>

Fair enough.

> The basic observation the current approach relies on is that all that
> faffery basically boils down to the fact that vruntime only means
> something when there is contention. And that only the progression is
> important not the actual value. That is, this is all fundamentally a
> differential equation and our integration constant is meaningless (also
> embodied in (7)).
>

I'll reread (6) and (7). I am trying to understand forced idle and
contention together. From what I understand of the patches, there are
two cases:

1. Two tasks are core scheduled. In that case vruntime works as
   expected on each CPU, but we need to compare their combined vruntime
   against the other tasks on each CPU respectively for them to be
   selected/chosen?
2. One of the selected tasks is part of the core scheduling group
   and the other CPU does not select a core scheduled task. We then need
   to ask ourselves whether that CPU should force idle, and that's where
   this logic comes into play?

> Also, I think the code as proposed here relies on SMT2 and is buggered
> for SMT3+. Now, that second link above describes means of making SMT3+
> work, but we're not there yet.

Thanks,
Balbir Singh.
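
For reference, a minimal sketch of equations (13) and (14) as quoted
above. It assumes that T is the weight-scaled sum of per-task virtual
service (sum of w_i * s_i) on a runqueue and that W is the sum of weights
on that runqueue; those definitions are not spelled out in this thread.
All names (Task, totals, combined_virtual_time, combined_lag) are
hypothetical and this is not kernel code, just the arithmetic.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Task:
    weight: int        # w_i (assumed positive)
    service: Fraction  # s_i, virtual service received so far


def totals(rq):
    """Return (T, W) for one runqueue: T = sum(w_i * s_i), W = sum(w_i)."""
    T = sum((t.weight * t.service for t in rq), Fraction(0))
    W = sum(t.weight for t in rq)
    return T, W


def combined_virtual_time(rq_k, rq_l):
    """Equation (13): S_k+l = (T_k + T_l) / (W_k + W_l)."""
    T_k, W_k = totals(rq_k)
    T_l, W_l = totals(rq_l)
    if W_k + W_l == 0:
        raise ValueError("both runqueues are empty")
    return (T_k + T_l) / (W_k + W_l)


def combined_lag(task, rq_k, rq_l):
    """Equation (14): lag_k+l(i) = S_k+l - s_i."""
    return combined_virtual_time(rq_k, rq_l) - task.service


# Two sibling runqueues with made-up weights and service values.
rq_k = [Task(1, Fraction(10)), Task(1, Fraction(20))]
rq_l = [Task(2, Fraction(40))]

for t in rq_k + rq_l:
    print(t, "lag =", combined_lag(t, rq_k, rq_l))
```

Under these definitions the weighted lags over both runqueues sum to
zero, since sum(w_i * (S - s_i)) = S * W - T. The sketch recomputes T and
W from scratch on every call; it says nothing about how hard it is to
maintain them incrementally as tasks enqueue and dequeue, which is the
difficulty Peter points at.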