Date: Fri, 26 Apr 2019 10:15:04 -0400
From: Phil Auld
To: Ingo Molnar
Cc: Mel Gorman, Aubrey Li, Julien Desfossez, Vineeth Remanan Pillai,
	Nishanth Aravamudan, Peter Zijlstra, Tim Chen, Thomas Gleixner,
	Paul Turner, Linus Torvalds, Linux List Kernel Mailing,
	Subhra Mazumdar, Frédéric Weisbecker, Kees Cook, Greg Kerr, Aaron Lu,
	Valentin Schneider, Pawan Gupta, Paolo Bonzini, Jiri Kosina
Subject: Re: [RFC PATCH v2 00/17] Core scheduling v2
Message-ID: <20190426141503.GB16477@lorien.usersys.redhat.com>
References: <20190424140013.GA14594@sinkpad> <20190425095508.GA8387@gmail.com>
	<20190425144619.GX18914@techsingularity.net> <20190425185343.GA122353@gmail.com>
In-Reply-To: <20190425185343.GA122353@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 25, 2019 at 08:53:43PM +0200 Ingo Molnar wrote:
> Interesting. This strongly suggests sub-optimal SMT-scheduling in the
> non-saturated HT case, i.e. a scheduler balancing bug.
>
> As long as loads are clearly below the physical cores count (which they
> are in the early phases of your table) the scheduler should spread tasks
> without overlapping two tasks on the same core.
>
> Clearly it doesn't.
>

That's especially true if there are cgroups with different numbers of
tasks in them involved.

Here's an example showing the average number of tasks on each of the 4
numa nodes during a test run. 20 cpus per node. There are 78 threads total,
76 for lu and 2 stress cpu hogs. So fewer than the 80 CPUs on the box. The
GROUP test has the two stresses and lu in distinct cgroups. The NORMAL test
has them all in one. This is from 5.0-rc3+, but the version doesn't matter.
It's reproducible on any kernel. SMT is on, but that also doesn't matter
here.

The first two lines show where the stress jobs ran and the second two show
where the 76 threads of lu ran.

GROUP_1.stress.ps.numa.hist      Average  1.00  1.00
NORMAL_1.stress.ps.numa.hist     Average  0.00  1.10  0.90
lu.C.x_76_GROUP_1.ps.numa.hist   Average  10.97  11.78  26.28  26.97
lu.C.x_76_NORMAL_1.ps.numa.hist  Average  19.70  18.70  17.80  19.80

The NORMAL test is evenly balanced across the 20 cpus per numa node.

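To put a number on the imbalance in the lu rows, here is a rough sketch in
Python. It is not part of the test harness. It assumes the four averages are
listed in node order and that a balanced placement means at most one lu
thread per CPU. The names node_report and lu_threads are made up for this
illustration, and the two stress tasks are ignored.

```python
# Per-node averages of lu threads, copied from the ps.numa.hist rows above.
CPUS_PER_NODE = 20  # from the message: 4 numa nodes, 20 cpus per node

lu_threads = {
    "GROUP": [10.97, 11.78, 26.28, 26.97],
    "NORMAL": [19.70, 18.70, 17.80, 19.80],
}


def node_report(avgs, cpus_per_node=CPUS_PER_NODE):
    """Return (excess threads per node, CPUs without an lu thread per node).

    Assumption: one thread per CPU is the balanced target.
    """
    excess = [max(0.0, a - cpus_per_node) for a in avgs]
    idle = [max(0.0, cpus_per_node - a) for a in avgs]
    return excess, idle


for name, avgs in lu_threads.items():
    excess, idle = node_report(avgs)
    print(f"{name:6s} total={sum(avgs):.2f} "
          f"oversubscribed_by={sum(excess):.2f} idle_cpus={sum(idle):.2f}")
```

Under those assumptions the GROUP run has about 13 lu threads sharing CPUs
on the two crowded nodes while about 17 CPUs on the other two nodes have no
lu thread at all. The NORMAL run has no node above 20.
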
There is between a 4x and 10x performance hit to the lu benchmark between
group and normal in any of these test runs. In this particular case it was
10x:

============76_GROUP========Mop/s===================================
min       q1        median    q3        max
3776.51   3776.51   3776.51   3776.51   3776.51
============76_GROUP========time====================================
min       q1        median    q3        max
539.92    539.92    539.92    539.92    539.92
============76_NORMAL========Mop/s===================================
min       q1        median    q3        max
39386     39386     39386     39386     39386
============76_NORMAL========time====================================
min       q1        median    q3        max
51.77     51.77     51.77     51.77     51.77

This is a bit off topic, but since balancing bugs were mentioned and I've
been trying to track this down for a while (and learning the scheduler code
in the process) I figured I'd just throw it out there :)

Cheers,
Phil