From: Ilpo Järvinen
To: linux-kselftest@vger.kernel.org, Reinette Chatre, Shuah Khan,
	Shaopeng Tan, Maciej Wieczór-Retman, Fenghua Yu
Cc: linux-kernel@vger.kernel.org, Ilpo Järvinen
Subject: [PATCH 15/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations
Date: Tue, 24 Oct 2023 12:26:25 +0300
Message-Id: <20231024092634.7122-16-ilpo.jarvinen@linux.intel.com>
In-Reply-To: <20231024092634.7122-1-ilpo.jarvinen@linux.intel.com>
References: <20231024092634.7122-1-ilpo.jarvinen@linux.intel.com>

When reading memory in order, HW prefetching optimizations will
interfere with measuring how caches and memory are being accessed. This
adds noise to the results.

Change the fill_buf reading loop to avoid an obvious in-order access
pattern: advance the index by multiplying it with a prime and taking
the result modulo the buffer size. Using a prime multiplier with modulo
ensures the entire buffer is eventually read. 23 is small enough that
the reads are spread out, yet wrapping does not occur very frequently
(wrapping too often can trigger L2 hits more frequently, which adds
noise to the test because fetching the data from the LLC is then not
required).

It was discovered that not all primes work equally well and some can
cause wildly unstable results (e.g., in an earlier version of this
patch, the reads were done in reverse order and 59 was used as the
prime, resulting in unacceptably high and unstable results in the MBA
and MBM tests on some architectures).

Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/
Signed-off-by: Ilpo Järvinen
---
 tools/testing/selftests/resctrl/fill_buf.c | 38 +++++++++++++++++-----
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 9d0b0bf4b85a..326d530425d0 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -51,16 +51,38 @@ static void mem_flush(unsigned char *buf, size_t buf_size)
 	sb();
 }
 
+/*
+ * Buffer index step advance to workaround HW prefetching interfering with
+ * the measurements.
+ *
+ * Must be a prime to step through all indexes of the buffer.
+ *
+ * Some primes work better than others on some architectures (from MBA/MBM
+ * result stability point of view).
+ */
+#define FILL_IDX_MULT	23
+
 static int fill_one_span_read(unsigned char *buf, size_t buf_size)
 {
-	unsigned char *end_ptr = buf + buf_size;
-	unsigned char sum, *p;
-
-	sum = 0;
-	p = buf;
-	while (p < end_ptr) {
-		sum += *p;
-		p += (CL_SIZE / 2);
+	unsigned int size = buf_size / (CL_SIZE / 2);
+	unsigned int i, idx = 0;
+	unsigned char sum = 0;
+
+	/*
+	 * Read the buffer in an order that is unexpected by HW prefetching
+	 * optimizations to prevent them interfering with the caching pattern.
+	 *
+	 * The read order is (in terms of halves of cachelines):
+	 *	i * FILL_IDX_MULT % size
+	 * The formula is open-coded below to avoid modulo inside the loop
+	 * as it improves MBA/MBM result stability on some architectures.
+	 */
+	for (i = 0; i < size; i++) {
+		sum += buf[idx * (CL_SIZE / 2)];
+
+		idx += FILL_IDX_MULT;
+		while (idx >= size)
+			idx -= size;
 	}
 
 	return sum;
-- 
2.30.2
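
As a quick illustration of why the prime stride ends up touching the
whole buffer, here is a standalone sketch (separate from the patch;
STRIDE and SIZE are made-up names for this example only) that walks an
index the same way the new loop does and verifies every position is
visited exactly once, assuming the index count is not a multiple of the
stride:

/*
 * Standalone sketch: step through SIZE index positions with a prime
 * stride (taken modulo SIZE via the same open-coded subtraction the
 * patch uses) and check that every index is visited exactly once.
 * STRIDE mirrors FILL_IDX_MULT from the patch; SIZE is an arbitrary
 * example index count chosen to be coprime with 23.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define STRIDE	23
#define SIZE	1000

int main(void)
{
	bool *seen = calloc(SIZE, sizeof(*seen));
	unsigned int i, idx = 0, visited = 0;

	if (!seen)
		return 1;

	for (i = 0; i < SIZE; i++) {
		if (!seen[idx]) {
			seen[idx] = true;
			visited++;
		}

		/* Open-coded modulo, matching the style of the patch */
		idx += STRIDE;
		while (idx >= SIZE)
			idx -= SIZE;
	}

	printf("visited %u of %u indexes\n", visited, SIZE);
	free(seen);
	return visited == SIZE ? 0 : 1;
}

Run as-is it should report that all SIZE indexes were visited; the full
coverage depends on the stride and the index count being coprime, which
is why the patch relies on FILL_IDX_MULT being a small prime.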