Date: Sat, 27 Apr 2019 11:06:57 +0200
From: Ingo Molnar
To: Mel Gorman
Cc: Aubrey Li, Julien Desfossez, Vineeth Remanan Pillai,
    Nishanth Aravamudan, Peter Zijlstra, Tim Chen, Thomas Gleixner,
    Paul Turner, Linus Torvalds, Linux List Kernel Mailing,
    Subhra Mazumdar, Frédéric Weisbecker, Kees Cook, Greg Kerr,
    Phil Auld, Aaron Lu, Valentin Schneider, Pawan Gupta,
    Paolo Bonzini, Jiri Kosina
Subject: Re: [RFC PATCH v2 00/17] Core scheduling v2
Message-ID: <20190427090657.GB99668@gmail.com>
References: <20190424140013.GA14594@sinkpad>
    <20190425095508.GA8387@gmail.com>
    <20190425144619.GX18914@techsingularity.net>
    <20190425185343.GA122353@gmail.com>
    <20190425213145.GY18914@techsingularity.net>
    <20190426094545.GD126896@gmail.com>
    <20190426101947.GZ18914@techsingularity.net>
In-Reply-To: <20190426101947.GZ18914@techsingularity.net>

* Mel Gorman wrote:

> On Fri, Apr 26, 2019 at 11:45:45AM +0200, Ingo Molnar wrote:
>
> > * Mel Gorman wrote:
> >
> > > > > I can show a comparison with equal levels of parallelisation but with
> > > > > HT off, it is a completely broken configuration and I do not think a
> > > > > comparison like that makes any sense.
> > > >
> > > > I would still be interested in that comparison, because I'd like
> > > > to learn whether there's any true *inherent* performance advantage to
> > > > HyperThreading for that particular workload, for exactly tuned
> > > > parallelism.
> > > >
> > >
> > > It really isn't a fair comparison. MPI seems to behave very differently
> > > when a machine is saturated. It's documented as changing its behaviour
> > > as it tries to avoid the worst consequences of saturation.
> > >
> > > Curiously, the results on the 2-socket machine were not as bad as I
> > > feared when the HT configuration is running with twice the number of
> > > threads as there are CPUs
> > >
> > > Amean     bt       771.15 (   0.00%)     1086.74 * -40.93%*
> > > Amean     cg       445.92 (   0.00%)      543.41 * -21.86%*
> > > Amean     ep        70.01 (   0.00%)       96.29 * -37.53%*
> > > Amean     is        16.75 (   0.00%)       21.19 * -26.51%*
> > > Amean     lu       882.84 (   0.00%)      595.14 *  32.59%*
> > > Amean     mg        84.10 (   0.00%)       80.02 *   4.84%*
> > > Amean     sp      1353.88 (   0.00%)     1384.10 *  -2.23%*
> > >
> > Yeah, so what I wanted to suggest is a parallel numeric throughput test
> > with few inter-process data dependencies, and see whether HT actually
> > improves total throughput versus the no-HT case.
> >
> > No over-saturation - but exactly as many threads as logical CPUs.
> >
> > I.e. with 20 physical cores and 40 logical CPUs the numbers to compare
> > would be a 'nosmt' benchmark running 20 threads, versus a SMT test
> > running 40 threads.
> >
> > I.e. how much does SMT improve total throughput when the workload's
> > parallelism is tuned to utilize 100% of the available CPUs?
> >
> > Does this make sense?
> >
>
> Yes. Here is the comparison.
>
> Amean     bt       678.75 (   0.00%)      789.13 * -16.26%*
> Amean     cg       261.22 (   0.00%)      428.82 * -64.16%*
> Amean     ep        55.36 (   0.00%)       84.41 * -52.48%*
> Amean     is        13.25 (   0.00%)       17.82 * -34.47%*
> Amean     lu      1065.08 (   0.00%)     1090.44 (  -2.38%)
> Amean     mg        89.96 (   0.00%)       84.28 *   6.31%*
> Amean     sp      1579.52 (   0.00%)     1506.16 *   4.64%*
> Amean     ua       611.87 (   0.00%)      663.26 *  -8.40%*
>
> This is the socket machine and with HT On, there are 80 logical CPUs
> versus HT Off with 40 logical CPUs.

That's very interesting - so for most workloads HyperThreading is a
massive loss, and for 'mg' and 'sp' it's a 5-6% win?

I'm wondering how much of say the 'cg' workload's -64% loss could be task
placement inefficiency - or are these all probable effects of 80 threads
trying to use too many cache and memory resources and thus utilizing it
all way too inefficiently?

Are these relatively simple numeric workloads, with not much scheduling
and good overall pinning of tasks, or is it more complex than that?

Also, the takeaway appears to be: by using HT there's a potential
advantage of +6% on the benefit side, but a potential -50%+ performance
hit on the risk side?

I believe these results also *strongly* support a much stricter task
placement policy in up to 50% saturation of SMT systems - it's almost
always going to be a win for workloads that are actually trying to fill
in some useful role.

Thanks,

    Ingo
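
For reference, the percentages in the second table can be recomputed from the two Amean columns. The sketch below is only a sanity check under stated assumptions: the first column is taken to be the HT-off run (40 threads) and the second the HT-on run (80 threads), as the reply above reads it, and Amean is taken to be a time, so lower is better. The names RESULTS, gain_pct and summarize are made up for this illustration and do not come from the benchmark tooling.

```python
# Amean values copied from the quoted table:
# name -> (first column, second column, reported delta in percent).
# Assumption: first column = HT off (40 threads), second = HT on (80 threads),
# and Amean is a time, so a lower value is better.
RESULTS = {
    "bt": (678.75, 789.13, -16.26),
    "cg": (261.22, 428.82, -64.16),
    "ep": (55.36, 84.41, -52.48),
    "is": (13.25, 17.82, -34.47),
    "lu": (1065.08, 1090.44, -2.38),
    "mg": (89.96, 84.28, 6.31),
    "sp": (1579.52, 1506.16, 4.64),
    "ua": (611.87, 663.26, -8.40),
}


def gain_pct(baseline, candidate):
    """Relative improvement of candidate over baseline, lower-is-better metric."""
    return (baseline - candidate) / baseline * 100.0


def summarize(results):
    """Return per-workload deltas plus the names of the best and worst workload."""
    deltas = {name: gain_pct(b, c) for name, (b, c, _) in results.items()}
    best = max(deltas, key=deltas.get)
    worst = min(deltas, key=deltas.get)
    return deltas, best, worst


deltas, best, worst = summarize(RESULTS)
for name, (_, _, reported) in RESULTS.items():
    print(f"{name}: computed {deltas[name]:+7.2f}%  reported {reported:+7.2f}%")
print(f"best: {best} ({deltas[best]:+.2f}%), worst: {worst} ({deltas[worst]:+.2f}%)")
```

Under these assumptions the best case is 'mg' and the worst is 'cg', which is the asymmetry the reply points at. Small differences from the reported figures are expected, since the table's means are already rounded to two decimals.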