Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp2466587ybi; Mon, 1 Jul 2019 12:34:48 -0700 (PDT) X-Google-Smtp-Source: APXvYqzYrjPh+vfdIwgHgeDgQWQyI4M/EMNfuZGawKHKut4rkl+Jsjun2prbPovDhXaZKnTLQQHi X-Received: by 2002:a17:902:aa0a:: with SMTP id be10mr29893301plb.293.1562009688121; Mon, 01 Jul 2019 12:34:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562009688; cv=none; d=google.com; s=arc-20160816; b=avgS0jKhq5cU0awEGreA+xb7eXNPInEEmlLy6KnqYDweRehjdFFNlGW9DbrhZNYpg8 Sforr0X5HieRKKVYavg1xbQhI93BdAoQSpj09j2hcFaPYDcqTnNkkXcO7PPv65MO/QEN hA1sIPqoGdhdFg8Yw1DSVSE1AEgr0tXI5GQhakpO0Zs5uWfRVhbJMF8+eRFi2FIeI+l3 J85+yZyYIaoNwuMouO4Xv0jDhbhIoHNGq8pxqC7iTU36J4QOt1yZQ5jMZMo6g0dy7tYA 0YIwjr7n1DCwlkH1wGsKitWgOICRuY8xOUreBCwM33JJ9S5iWzewYQgvm+9HdvoDgSTX Lu2g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :in-reply-to:date:cc:to:from:subject:message-id; bh=NtvTAs7VWPmJnTlVv7Cwpbz8vrCtuX6ur8pmHUYXdDE=; b=DTl4d4fd08fSynGqgVE0yqY4KNuik8+k5RAvTiLhnIXaLgNCEy9Qgp7nFNnVf7n6yy 6V3ZG/2VbiSO/l+TDFoMjLX7tEeROhPdUstImhSH1ekxmpjg0/ou5PyzLDXQpK1N8JfF WsQcQuVRdfHVpuOqTuBfJLQeJDscl9rT2cNuO97xHBjMZgMdYHhcnxFi10ANZ3wM9YFy Q6tEYPxuk1iQopyVq0mHWi3gjBsJ6Knha5ETHF2wRzWkN5KbDRAoohgBczP+0Wz+Plcl 15GfX86uQGxkjZ7O79v/Dber+X1MLoJEclSC/ihdlb2gltBIzerhwirx34nxphlGqkg/ c6Cg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i9si10601286pgl.452.2019.07.01.12.34.32; Mon, 01 Jul 2019 12:34:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726757AbfGATct (ORCPT + 99 others); Mon, 1 Jul 2019 15:32:49 -0400 Received: from shelob.surriel.com ([96.67.55.147]:45972 "EHLO shelob.surriel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726442AbfGATct (ORCPT ); Mon, 1 Jul 2019 15:32:49 -0400 Received: from imladris.surriel.com ([96.67.55.152]) by shelob.surriel.com with esmtpsa (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.92) (envelope-from ) id 1hi22h-00006O-IZ; Mon, 01 Jul 2019 15:32:47 -0400 Message-ID: <4bf9c6e1eeec9b52aa9c10b9149384cb99399b04.camel@surriel.com> Subject: Re: [PATCH 03/10] sched,fair: redefine runnable_load_avg as the sum of task_h_load From: Rik van Riel To: Josef Bacik Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, pjt@google.com, dietmar.eggemann@arm.com, peterz@infradead.org, mingo@redhat.com, morten.rasmussen@arm.com, tglx@linutronix.de, mgorman@techsingularity.net, vincent.guittot@linaro.org Date: Mon, 01 Jul 2019 15:32:47 -0400 In-Reply-To: <20190701192234.5oxkuysimi437utz@macbook-pro-91.dhcp.thefacebook.com> References: <20190628204913.10287-1-riel@surriel.com> <20190628204913.10287-4-riel@surriel.com> <20190701162949.vhxjndychoamhkbq@MacBook-Pro-91.local.dhcp.thefacebook.com> <757e0af14b714b596417b31c45098fc314ed7c8a.camel@surriel.com> <20190701192234.5oxkuysimi437utz@macbook-pro-91.dhcp.thefacebook.com> Content-Type: multipart/signed; micalg="pgp-sha256"; protocol="application/pgp-signature"; boundary="=-G0E3IlZFVcKl+naxuyrr" User-Agent: Evolution 3.30.5 (3.30.5-1.fc29) MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-G0E3IlZFVcKl+naxuyrr Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, 2019-07-01 at 15:22 -0400, Josef Bacik wrote: > On Mon, Jul 01, 2019 at 12:47:35PM -0400, Rik van Riel wrote: > > On Mon, 2019-07-01 at 12:29 -0400, Josef Bacik wrote: > > >=20 > > > My memory on this stuff is very hazy, but IIRC we had the > > > runnable_sum and the > > > runnable_avg separated out because you could have the avg lag > > > behind > > > the sum. > > > So like you enqueue a bunch of new entities who's avg may have > > > decayed a bunch > > > and so their overall load is not felt on the CPU until they start > > > running, and > > > now you've overloaded that CPU. The sum was there to make sure > > > new > > > things > > > coming onto the CPU added actual load to the queue instead of > > > looking > > > like there > > > was no load. > > >=20 > > > Is this going to be a problem now with this new code? > >=20 > > That is a good question! > >=20 > > On the one hand, you may well be right. > >=20 > > On the other hand, while I see the old code calculating > > runnable_sum, I don't really see it _using_ it to drive > > scheduling decisions. > >=20 > > It would be easy to define the CPU cfs_rq->runnable_load_sum > > as being the sum of task_se_h_weight() of each runnable task > > on the CPU (for example), but what would we use it for? > >=20 > > What am I missing? >=20 > It's suuuuuper sublte, but you're right in that we don't really need > the > runnable_avg per-se, but what you do is you kill calc_group_runnable, > which used > to do this >=20 > load_avg =3D max(cfs_rq->avg.load_avg, > scale_load_down(cfs_rq->load.weight)); >=20 > runnable =3D max(cfs_rq->avg.runnable_load_avg, > scale_load_down(cfs_rq->runnable_weight)); >=20 > so we'd account for this weirdness of adding a previously idle > process to a new > CPU and overloading the CPU because we'd add a bunch of these 0 > weight looking > tasks that suddenly all wake up and are on the same CPU. So we used > the > runnable_weight to account for what was actually happening, and the > max of > load_avg and the weight to figure out what the potential load would > be. >=20 > What you've done here is change the weighting stuff to be completely > based on > load avg, which is problematic for the reasons above. Did you fix > this later on > in your patches? If so then just tell me to keep reading and I'll do > that ;). No, I have not fixed that later in the code. I am not entirely sure how I could do that, without reintroducing walking the hierarchy at task enqueue and dequeue, but maybe you have some idea... --=20 All Rights Reversed. --=-G0E3IlZFVcKl+naxuyrr Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEKR73pCCtJ5Xj3yADznnekoTE3oMFAl0aX98ACgkQznnekoTE 3oOQLwf/V7laRZB80HojKDUBvWPexAXiERPG6bdrZTlssrh8dnwDYtmnSHAibtqd UzoMDXpZRe455omNpUB4F7tyBlrds3jbDJmaoOaVDJnPQTaJRVBGddfIn7QRYdkF MGuQqodBHwjwiKX4m2Lh0gJE/msyNd3tbDYH58yRf4m/N6Sty2dOCBwuP5QGbB3v iCsTMs3LJOQifwIVZlyNEMvHwlGJBHXrB2Uuv0Ya5e9SQHTooqTAj9ZjpQgQ4nQj AImBeflH3cD17Msk6JuCRTMaFiuhsUOMUHGiHFB+VSQnAPZ83Y3e1jLgr3ekLw2n gwSIHBh+vb5zsRsmkCeSIISwupydLg== =7Xe7 -----END PGP SIGNATURE----- --=-G0E3IlZFVcKl+naxuyrr--