Message-ID: <1459797143.6219.22.camel@redhat.com>
Subject: Re: [PATCH] nohz_full: Make sched_should_stop_tick() more conservative
From: Rik van Riel
To: Chris Metcalf, Frederic Weisbecker, Christoph Lameter, Ingo Molnar,
    Luiz Capitulino, Peter Zijlstra, Thomas Gleixner, Viresh Kumar,
    linux-kernel@vger.kernel.org
Date: Mon, 04 Apr 2016 15:12:23 -0400
In-Reply-To: <1459539771-4251-1-git-send-email-cmetcalf@mellanox.com>

On Fri, 2016-04-01 at 15:42 -0400, Chris Metcalf wrote:
> On arm64, when calling enqueue_task_fair() from migration_cpu_stop(),
> we find the nr_running value updated by add_nr_running(), but the
> cfs.nr_running value has not always been updated yet.  Accordingly,
> sched_can_stop_tick() falsely returns true when we are migrating a
> second task onto a core.

I don't get it. Looking at enqueue_task_fair(), I see this:

        for_each_sched_entity(se) {
                cfs_rq = cfs_rq_of(se);
                cfs_rq->h_nr_running++;
                ...
        }

        if (!se)
                add_nr_running(rq, 1);

What is the difference between cfs_rq->h_nr_running
and rq->cfs.nr_running?

Why do we have two?
Are we simply testing against the wrong one in
sched_can_stop_tick()?

> Correct this by using rq->nr_running instead of rq->cfs.nr_running.
> This should always be more conservative, and reverts the test to the
> form it had before commit 76d92ac305f2 ("sched: Migrate sched to use
> new tick dependency mask model").

That would cause us to run the timer tick while running
a single SCHED_RR real time task, with a single
SCHED_OTHER task sitting in the background (which will
not get run until the SCHED_RR task is done).

I don't think that is quite the behaviour we want.

> Signed-off-by: Chris Metcalf
> ---
> I found this bug because I had a program running in nohz_full
> on a core, and from a different core I called sched_setaffinity()
> to force that task onto the nohz_full core, but I did not end up with
> a kick to the nohz_full core, so tick-based scheduling did not start.
> This is probably bad enough that we should fix it for 4.6.
>
> Strangely, for some reason, the existing code worked correctly for me
> for tilegx, but not for arm64.  I see that the enqueue_task_fair()
> code calls enqueue_entity(), which calls account_entity_enqueue() to
> adjust cfs.nr_running.  That seemed to happen on tilegx, but not
> arm64.
> Perhaps there is some difference in how the sched_entity stuff is
> done, but frankly that took me a little deeper into the CFS stuff
> than I was willing to dive into at this moment.
>
> I could also argue that sched/core.c shouldn't have a lot of CFS
> stuff in it anyway, and if we view the FIFO/RR stuff as handling the
> real special cases in sched_can_stop_tick() anyway, then just
> checking the core nr_running feels like the right thing to do
> regardless.
>
>  kernel/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 00649f7ad567..1737d63c65fa 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -599,7 +599,7 @@ bool sched_can_stop_tick(struct rq *rq)
>          }
>  
>          /* Normal multitasking need periodic preemption checks */
> -        if (rq->cfs.nr_running > 1)
> +        if (rq->nr_running > 1)
>                  return false;
>  
>          return true;

--
All Rights Reversed.
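
To make the SCHED_RR objection in the reply concrete, here is a toy user-space model of the two versions of the final test in the diff. It is only a sketch: struct toy_rq, struct toy_cfs_rq, toy_rq_make(), can_stop_tick_cfs(), can_stop_tick_all() and toy_demo() are hypothetical names invented for illustration, not kernel definitions. The earlier checks in sched_can_stop_tick() are not shown in this message and are not modelled, so the real function may behave differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for the kernel run queue structures (hypothetical). */
struct toy_cfs_rq {
	unsigned int nr_running;	/* runnable CFS (SCHED_OTHER) tasks */
};

struct toy_rq {
	unsigned int nr_running;	/* runnable tasks of every class */
	struct toy_cfs_rq cfs;
};

/* Build a run queue holding rt_tasks real-time and cfs_tasks CFS tasks. */
static struct toy_rq toy_rq_make(unsigned int rt_tasks, unsigned int cfs_tasks)
{
	struct toy_rq rq = {
		.nr_running = rt_tasks + cfs_tasks,
		.cfs = { .nr_running = cfs_tasks },
	};
	return rq;
}

/* Final test as on the "-" line of the diff: count CFS tasks only. */
static bool can_stop_tick_cfs(const struct toy_rq *rq)
{
	return !(rq->cfs.nr_running > 1);
}

/* Final test as on the "+" line of the diff: count all runnable tasks. */
static bool can_stop_tick_all(const struct toy_rq *rq)
{
	return !(rq->nr_running > 1);
}

/* One SCHED_RR task plus one SCHED_OTHER task waiting behind it. */
static void toy_demo(void)
{
	struct toy_rq rq = toy_rq_make(1, 1);

	assert(rq.nr_running == 2 && rq.cfs.nr_running == 1);
	printf("cfs-only test: can stop tick = %d\n", can_stop_tick_cfs(&rq));
	printf("all-task test: can stop tick = %d\n", can_stop_tick_all(&rq));
}
```

Calling toy_demo() from a main() shows the disagreement: with one real-time task and one CFS task, the CFS-only test allows the tick to stop, while the all-task test keeps it running. In this toy model the same counter values (nr_running of 2, cfs.nr_running of 1) also describe the migration case from the patch description, where cfs.nr_running is read before it has been updated, so these two counters alone cannot tell the two situations apart.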