Subject: Re: [RFC] The Linux Scheduler: a Decade of Wasted Cores Report
From: Rik van Riel
To: Peter Zijlstra, Brendan Gregg
Cc: Jeff Merkey, LKML, Mike Galbraith, Ingo Molnar
Date: Mon, 25 Apr 2016 13:54:08 -0400
Message-ID: <1461606848.13397.42.camel@redhat.com>
In-Reply-To: <20160425093424.GE12845@twins.programming.kicks-ass.net>

On Mon, 2016-04-25 at 11:34 +0200, Peter Zijlstra wrote:
> On Sat, Apr 23, 2016 at 06:38:25PM -0700, Brendan Gregg wrote:
> >
> > Their proof of concept patches are online[1]. I tested them and saw 0%
> > improvements on the systems I tested, for some simple workloads[2]. I
> > tested 1 and 2 node NUMA, as that is typical for my employer (Netflix,
> > and our tens of thousands of Linux instances in the AWS/EC2 cloud),
> > even though I wasn't expecting any difference on 1 node. I've used
> > synthetic workloads so far.
> So their setup uses a bigger (not fully connected) NUMA topology, and
> I'm not entirely sure how much of their problems are due to that, but at
> least one of them is.
>
> Such boxes are fairly rare.

Their proposed fix, of making sure we build all 8 sched groups with 5
nodes each in them seems a little bit roundabout when compared with a
simpler alternative, though.

When dealing with a NUMA_GLUELESS_MESH topology, we should simply not
build any sched domains with multiple nodes inside them, except for
the top level domain that contains all the nodes.

At that point, we will balance between threads, inside each core, and
between all nodes, without running into those pointless (and
potentially harmful) intermediate sched domains.

-- 
All Rights Reversed.
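
The simpler alternative described in the reply can be illustrated with a small
model. This is only a sketch under stated assumptions, not kernel code: the
function name `numa_domain_spans`, the `flatten_mesh` flag, and the 4-node ring
in `RING_DISTANCE` are all hypothetical and do not describe the 8-node machine
from the report. It assumes NUMA-level domains are derived from a node distance
table, one level per distinct remote distance, and shows what remains when the
intermediate multi-node levels are skipped.

```python
# Hypothetical 4-node ring (not fully connected): each node has two
# direct neighbours (distance 20) and one node two hops away (distance 30).
RING_DISTANCE = [
    [10, 20, 30, 20],
    [20, 10, 20, 30],
    [30, 20, 10, 20],
    [20, 30, 20, 10],
]


def numa_domain_spans(distance, node, flatten_mesh):
    """Return the multi-node domain spans seen from `node`, smallest first.

    distance     -- square table of node-to-node distances (assumed input)
    node         -- index of the node whose domain hierarchy is built
    flatten_mesh -- if True, keep only the top-level span covering all
                    nodes, as the reply suggests for a glueless mesh
    """
    all_nodes = frozenset(range(len(distance)))
    # One candidate level per distinct distance; the smallest value is the
    # local distance, which never spans more than one node, so skip it.
    levels = sorted({d for row in distance for d in row})[1:]

    spans = []
    for d in levels:
        span = frozenset(n for n in all_nodes if distance[node][n] <= d)
        if flatten_mesh and span != all_nodes:
            # Intermediate domain with several, but not all, nodes: skip.
            continue
        if not spans or spans[-1] != span:
            spans.append(span)
    return spans


if __name__ == "__main__":
    print(numa_domain_spans(RING_DISTANCE, 0, flatten_mesh=False))
    print(numa_domain_spans(RING_DISTANCE, 0, flatten_mesh=True))
```

In this toy ring, the unflattened hierarchy for node 0 contains an intermediate
span of nodes 0, 1 and 3 below the all-node span; with `flatten_mesh=True` only
the all-node span is left, so balancing would go straight from within a node to
between all nodes.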