Date: Mon, 31 Aug 2020 20:53:00 +0800
From: Feng Tang
To: Mel Gorman
Cc: Borislav Petkov, "Luck, Tony", kernel test robot, LKML, lkp@lists.01.org
Subject: Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression
Message-ID: <20200831125300.GB61340@shbuild999.sh.intel.com>
In-Reply-To: <20200831085516.GE2976@suse.com>
References: <20200821020259.GA90000@shbuild999.sh.intel.com>
 <20200824151425.GF4794@zn.tnic>
 <20200824153300.GA56944@shbuild999.sh.intel.com>
 <20200824161238.GI4794@zn.tnic>
 <20200825062305.GA83850@shbuild999.sh.intel.com>
 <20200828174839.GD19448@zn.tnic>
 <20200831021638.GB65971@shbuild999.sh.intel.com>
 <20200831075611.GA2976@suse.com>
 <20200831082306.GA61340@shbuild999.sh.intel.com>
 <20200831085516.GE2976@suse.com>

On Mon, Aug 31, 2020 at 09:55:17AM +0100, Mel Gorman wrote:
> On Mon, Aug 31, 2020 at 04:23:06PM +0800, Feng Tang wrote:
> > On Mon, Aug 31, 2020 at 08:56:11AM +0100, Mel Gorman wrote:
> > > On Mon, Aug 31, 2020 at 10:16:38AM +0800, Feng Tang wrote:
> > > > > So why don't you define both variables with DEFINE_PER_CPU_ALIGNED and
> > > > > check if all your bad measurements go away this way?
> > > >
> > > > For 'arch_freq_scale', there are other percpu variables in the same
> > > > smpboot.c: 'arch_prev_aperf' and 'arch_prev_mperf', and in the hot path
> > > > arch_scale_freq_tick() all three variables are accessed, so I didn't
> > > > touch it. Or maybe we can align the first of these three variables, so
> > > > that they all sit in one cacheline.
> > > >
> > > > > You'd also need to check whether there's no detrimental effect from
> > > > > this change on other, i.e., !KNL platforms, and I think there won't
> > > > > be because both variables will be in separate cachelines then and all
> > > > > should be good.
> > > >
> > > > Yes, these kinds of changes should be verified on other platforms.
> > > >
> > > > One thing still puzzles me: the two variables are per-CPU things, and
> > > > there is no case of many CPUs contending, so why does the cacheline
> > > > layout matter? I suspect it is due to contention on the same cache
> > > > set, and am trying to find some way to test that.
> > >
> > > Because if you have two structures that are per-cpu and not cache-aligned
> > > then a write in one can bounce the cache line in another due to the
> > > cache coherency protocol. It's generally called "false cache line
> > > sharing". https://en.wikipedia.org/wiki/False_sharing has basic examples
> > > (let's not get into whether wikipedia is a valid citation source; there
> > > are books on the topic if someone really cared).
> >
> > The 'arch_freq_scale' and 'tsc_adjust' percpu variables are only
> > accessed by their own CPU, and usually no other CPU will touch them,
>
> Read "false sharing" again. Two adjacent per-CPU structures can still
> interfere with each other if the structures happen to cross a cache line
> boundary and are not cache aligned.

Sure, will recheck that wiki, thanks.

Some cache info about the test box: it is a Xeon Phi platform with 72
cores, and each core has 4 HT threads, so there are 288 CPUs. The L1
D-cache and I-cache are both 32KB. There is no L3 cache, and every 2
cores share one 1MB L2 cache.

The L1 D-cache is 64 sets, 8 ways, and the L2 cache is 1024 sets, 16
ways (with 64-byte lines that is 64 * 8 * 64B = 32KB and
1024 * 16 * 64B = 1MB, matching the sizes above).
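
To see the false sharing effect itself, here is a minimal user-space
sketch (my illustration, not the kernel code: 'packed', 'padded' and
friends are made-up names, and it assumes 64-byte cache lines). Each
counter is written by only one thread, just like a percpu variable is
only written by its own CPU, yet the packed pair still bounces one
line between two cores:

#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define CACHE_LINE	64		/* assumed line size, as on x86 */
#define LOOPS		(1L << 27)

/* two "per-thread" counters packed next to each other: one shared line */
static volatile long packed[2];

/* the same two counters, each alone on its own cache line */
static volatile long padded[2 * CACHE_LINE / sizeof(long)]
		__attribute__((aligned(CACHE_LINE)));

static void *bump(void *arg)
{
	volatile long *ctr = arg;
	long i;

	for (i = 0; i < LOOPS; i++)
		(*ctr)++;	/* each store can invalidate the peer's copy */
	return NULL;
}

static double run_pair(volatile long *a, volatile long *b)
{
	struct timespec t0, t1;
	pthread_t th1, th2;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	pthread_create(&th1, NULL, bump, (void *)a);
	pthread_create(&th2, NULL, bump, (void *)b);
	pthread_join(th1, NULL);
	pthread_join(th2, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	printf("same line:      %.2fs\n", run_pair(&packed[0], &packed[1]));
	printf("separate lines: %.2fs\n",
	       run_pair(&padded[0], &padded[CACHE_LINE / sizeof(long)]));
	return 0;
}

Built with 'gcc -O2 -pthread', the first pair should run several times
slower than the second on a typical SMP box, because every store has
to pull the line away from the other core.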
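
And to make the alignment ideas above concrete, an untested sketch of
what they could look like (not an actual patch; 'freq_scale_pcpu' is a
made-up name for illustration):

/* arch/x86/kernel/tsc_sync.c: give the per-CPU struct its own line */
static DEFINE_PER_CPU_ALIGNED(struct tsc_adjust, tsc_adjust);

/*
 * arch/x86/kernel/smpboot.c: keep the three variables touched by
 * arch_scale_freq_tick() together. 24 bytes of payload fit in a
 * single 64-byte line, and the alignment keeps unrelated per-CPU
 * data off that line.
 */
struct freq_scale_pcpu {
	u64		prev_aperf;
	u64		prev_mperf;
	unsigned long	freq_scale;
};

static DEFINE_PER_CPU_ALIGNED(struct freq_scale_pcpu, freq_scale_pcpu) = {
	.freq_scale = SCHED_CAPACITY_SCALE,
};

Grouping them in one struct also keeps the three in the same line by
construction, instead of relying on how the linker happens to lay out
three separate per-CPU definitions.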

Thanks,
Feng

> > the
> > hot code path only uses this_cpu_read/write/ptr. And each CPU's static
> > percpu variables are all packed together in one area (256KB for one
> > CPU on this test box),
>
> If the structure is not cache aligned (probably 64 bytes) then there
> is a boundary where cache line bounces can occur.
>
> --
> Mel Gorman
> SUSE Labs