Date: Wed, 6 Apr 2022 09:41:47 +0800
From: Feng Tang
To: Linus Torvalds
Cc: kernel test robot, Yang Shi, Baolin Wang, Johannes Weiner,
    Oscar Salvador, Michal Hocko, Rik van Riel, Mel Gorman,
    Peter Zijlstra, Dave Hansen, Zi Yan, Wei Xu, Shakeel Butt,
    zhongjiang-ali, Randy Dunlap, Andrew Morton, LKML,
    lkp@lists.01.org, "Huang, Ying", Zhengjun Xing, fengwei.yin@intel.com
Subject: Re: [NUMA Balancing] e39bb6be9f: will-it-scale.per_thread_ops 64.4% improvement
Message-ID: <20220406014147.GA64277@shbuild999.sh.intel.com>
References: <20220401094214.GA8368@xsang-OptiPlex-9020> <20220402085005.GC32311@shbuild999.sh.intel.com>
In-Reply-To: <20220402085005.GC32311@shbuild999.sh.intel.com>
List-ID: linux-kernel@vger.kernel.org

On Sat, Apr 02, 2022 at 04:50:05PM +0800, Feng Tang wrote:
> Hi Linus,
>
> On Fri, Apr 01, 2022 at 09:35:24AM -0700, Linus Torvalds wrote:
> > On Fri, Apr 1, 2022 at 2:42 AM kernel test robot
> > wrote:
> > >
> > > FYI, we noticed a 64.4% improvement of will-it-scale.per_thread_ops due to commit:
> > > e39bb6be9f2b ("NUMA Balancing: add page promotion counter")
> >
> > That looks odd and unlikely.
> >
> > That commit only modifies some page counting statistics. Sure, it
> > could be another cache layout thing, and maybe it's due to the subtle
> > change in how NUMA_PAGE_MIGRATE gets counted, but it still looks a bit
> > odd.
>
> We did a quick check on cache effects by disabling HW cache prefetch
> completely (writing 0xf to MSR 0x1a4), and the performance change
> is almost gone:
>
>   ee97347fe058d020  e39bb6be9f2b39a6dbaeff48436
>   ----------------  ---------------------------
>             134793      -1.4%             132867  will-it-scale.per_thread_ops
>
> The test box is a Cascade Lake machine with 4 nodes, and a similar trend
> is seen on a 2-node machine: the commit shows a 55% improvement with
> HW cache prefetch enabled, and less than 1% change with it disabled.
>
> Though we still cannot pin-point the exact place affected.

We did more tests and debugging, and here are some updates:

* For the HW cache prefetcher, we narrowed it down to the 'L2 cache
  prefetcher', not the 'L2 adjacent cache line prefetcher'. We can't
  find any documentation on the details of this prefetcher, which makes
  it hard to analyze how the performance is affected.

* Debugging shows the change is related to the size of 'struct mem_cgroup':
  with commit ee97347fe058d020 its size is 4096 bytes, which grows to
  4160 bytes with commit e39bb6be9f2b.

  - commit e39bb6be9f2b adds one counter 'PGPROMOTE_SUCCESS' plus some
    code changes; if we remove the code changes and keep only the
    counter, the 60% improvement remains.

  - if we revert e39bb6be9f2b and only add 16 bytes of padding inside
    'struct mem_cgroup', the 60% change also remains.
Debug patch is as below:

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a68dce3873fcc..2bd56fb2e5b5f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -303,6 +303,8 @@ struct mem_cgroup {
 	/* memory.stat */
 	struct memcg_vmstats	vmstats;
 
+	unsigned long padding[2];
+
 	/* memory.events */
 	atomic_long_t		memory_events[MEMCG_NR_MEMORY_EVENTS];

Thanks,
Feng

> Also per our experience, patches changing vm statistics can easily
> trigger strange performance bumps for micro-benchmarks like will-it-scale,
> stress-ng, etc.
>
> Thanks,
> Feng
>
> > Linus