From: Boqun Feng
To: Yuyang Du
Cc: mingo@kernel.org, peterz@infradead.org, linux-kernel@vger.kernel.org,
    pjt@google.com, bsegall@google.com, morten.rasmussen@arm.com,
    vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
    len.brown@intel.com, rafael.j.wysocki@intel.com, fengguang.wu@intel.com,
    srikar@linux.vnet.ibm.com
Date: Fri, 19 Jun 2015 20:22:07 +0800
Subject: Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
Message-ID: <20150619122207.GB5331@fixme-laptop.cn.ibm.com>
In-Reply-To: <20150619031116.GA3933@intel.com>

Hi Yuyang,

On Fri, Jun 19, 2015 at 11:11:16AM +0800, Yuyang Du wrote:
> On Fri, Jun 19, 2015 at 03:57:24PM +0800, Boqun Feng wrote:
> > 
> > > This rewrite patch does not NEED to aggregate entity's load to cfs_rq,
> > > but rather directly update the cfs_rq's load (both runnable and blocked),
> > > so there
> > > is NO NEED to iterate all of the cfs_rqs.
> > 
> > Actually, I'm not sure whether we NEED to aggregate or NOT.
> > 
> > > 
> > > So simply updating the top cfs_rq is already equivalent to the stock.
> > > 
> 
> Ok. By aggregate, the rewrite patch does not need it, because the cfs_rq's
> load is calculated at once with all its runnable and blocked tasks counted,
> assuming all the children's weights are up-to-date, of course. Please refer
> to the changelog to get an idea.
> 
> > 
> > The stock does have a bottom-up update, so simply updating the top
> > cfs_rq is not equivalent to it. Simply updating the top cfs_rq is
> > equivalent to the rewrite patch, because the rewrite patch lacks the
> > aggregation.
> 
> It is not that the rewrite patch "lacks" aggregation; it is needless. The
> stock has to do a bottom-up update and aggregate, because 1) it updates
> the load at an entity granularity, and 2) the blocked load is separate.

Yep, you are right, the aggregation is not necessary.

Let me see if I understand you: in the rewrite, when we call
update_cfs_rq_load_avg() we need neither to aggregate the children's
load_avg nor to update cfs_rq->load.weight, because:

1) The load before cfs_rq->last_update_time is already in ->load_avg,
   and decay will do the job.

2) The load from cfs_rq->last_update_time to now is calculated with
   cfs_rq->load.weight, and the weight should be the weight at
   ->last_update_time rather than now.

Right?

> 
> > > It is better if we iterate the cfs_rqs to update the actual weight
> > > (update_cfs_shares), because the weight may have already changed, which
> > > would in turn change the load. But update_cfs_shares() is not cheap.
> > > 
> > > Right?
> > 
> > You get me right for the most part ;-)
> > 
> > My points are:
> > 
> > 1.
> > We *may not* need to aggregate entity's load to cfs_rq in
> > update_blocked_averages(); simply updating the top cfs_rq may be just
> > fine, but I'm not sure, so scheduler experts' insights are needed here.
> 
> Then I don't need to say anything about this.
> 
> > 2. Whether we need to aggregate or not, the update_blocked_averages() in
> > the rewrite patch could be improved. If we need to aggregate, we have to
> > add something like update_cfs_shares(). If we don't need to, we can just
> > replace the loop with one update_cfs_rq_load_avg() on the root cfs_rq.
> 
> If update_cfs_shares() is done here, it is good, but probably not necessary
> though. However, we do need to update_tg_load_avg() here, because if cfs_rq's

We may have another problem even if we update_tg_load_avg(): after the
loop, for each cfs_rq, ->load.weight is not up-to-date, right? So next
time before we update_cfs_rq_load_avg(), we need to guarantee that
cfs_rq->load.weight is already updated, right? And IMO, we don't have
that guarantee yet, do we?

> load changes, the parent tg's load_avg should change too. I will upload a
> next version soon.
> 
> In addition, an update on the stress + dbench test case:
> 
> I have a Core i7, not a Xeon Nehalem, and I have a patch that may not
> impact the result. Then, dbench runs at very low CPU utilization, ~1%.
> Boqun said this may result from cgroup control; the dbench I/O is low.
> 
> Anyway, I can't reproduce the results: CPU0's util is 92+%, and the other
> CPUs have ~100% util.
Thank you for looking into that problem, and I will test with your new
version of the patch ;-)

Thanks,
Boqun

> 
> Thanks,
> Yuyang