Date: Mon, 10 Jul 2023 16:35:52 -0700
From: Tony Luck
To: Peter Newman
Cc: James Morse, "Yu, Fenghua", "Chatre, Reinette", Drew Fustini,
    Babu Moger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, "shameerali.kolothum.thodi@huawei.com",
    D Scott Phillips OS, "carl@os.amperecomputing.com",
    "lcherian@marvell.com", "bobo.shaobowang@huawei.com",
    "tan.shaopeng@fujitsu.com", "xingxin.hx@openanolis.org",
    "baolin.wang@linux.alibaba.com", Jamie Iles, Xin Hao,
    "Pitre, Nicolas", Kevin Hilman, "aricciardi@baylibre.com",
    "x86@kernel.org", "linux-kernel@vger.kernel.org",
    "patches@lists.linux.dev", Stephane Eranian
Subject: Re: [RFC PATCH 2/2] resctrl2: Arch x86 modules for most of the
    legacy control/monitor functions
References: <20230620033702.33344-1-tony.luck@intel.com>
    <20230620033702.33344-3-tony.luck@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 06, 2023 at 12:22:03PM +0200, Peter Newman wrote:
> Hi Tony,
>
> On Wed, Jul 5, 2023 at 6:46 AM Luck, Tony wrote:
> > The mbm_poll() code that makes sure that counters don't wrap is
> > doing all the expensive wrmsr(QM_EVTSEL);rdmsr(QM_COUNT)
> > once per second to give you the data you want.
>
> I was doing that in the soft RMID series I posted earlier because it
> simplified things, but then I had some realizations about how much
> error +/- 1 second on the sampling point could result in[1]. We
> usually measure the bandwidth rate with a 5-second window, so a
> reading that's up to one second old would mean a 20% error in the
> bandwidth calculation.

I just pushed the latest version of the resctrl2 patches to the
resctrl2_v65rc1 branch of

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git

As well as locking, bug fixes, and general updates it includes an
experimental feature to provide summary MBM information for each
node. E.g. (both "total" and "local" rates are provided). Note that
you have to load modules rdt_mbm_local_bytes and rdt_mbm_total_bytes
so that the MBM overflow threads are running. I should fix the code
to print "n/a" instead of "0" if they are not.

$ cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_summary
3638 3638 /g2
3087 3087 /g2/m2
3267 3267 /g2/m1
3443 3443 /g1
3629 3629 /g1/m2
3588 3587 /g1/m1
3999 3993 /
3370 3369 /m2
3432 3432 /m1

The rates are produced once per second by the MBM overflow code. They
compute MBytes/sec as "chunks since last poll" divided by (now - then).
I'm using jiffies for the times which may be good enough. "now - then"
is one second (maybe more if the kernel thread doing the MBM polling
is delayed from running).

I should fix the summarization code to work the same as the regular
MBM files (i.e. make the parent control directory report the sum of
all its children).

The code also attempts (but fails) to make these mbm_summary files
poll(2)-able.
With the wakeup dependent on aggregate measured bandwidth compared
against a configurable threshold:

$ cat /sys/fs/resctrl/info/L3_MON/mbm_poll_threshold
10000000

There's something wrong though. poll(2) always says there is data to
be read. I only see one other piece of kernel code implementing poll
on kernfs (in the cgroup code). Perhaps my problem is inability to
write an application that uses poll(2) correctly.

Let me know if this all seems like a useful direction. Maybe the
polling part is overkill and it is sufficient to just have a cheap way
to get all the bandwidths even if the values seen might be up to one
second old.

-Tony
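
For reference on the "poll(2) always says there is data to be read"
symptom above, here is a minimal user-space sketch. It assumes that
mbm_summary follows the usual kernfs/sysfs convention: the file always
reports POLLIN, and a change notification shows up as POLLPRI|POLLERR
only after the previous contents have been read. Whether the resctrl2
code behaves this way is not confirmed by this thread, and the helper
names reread_attr() and wait_for_update() are hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Re-read an attribute file from offset 0 (hypothetical helper).
 * Under the kernfs convention assumed here, reading the current
 * contents is what re-arms the next change notification.
 * Returns the byte count (buf is NUL-terminated) or -1 on error.
 */
static ssize_t reread_attr(int fd, char *buf, size_t len)
{
	ssize_t n;

	if (len == 0 || lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, len - 1);
	if (n >= 0)
		buf[n] = '\0';
	return n;
}

/*
 * Wait for a change notification (hypothetical helper).
 * Only POLLPRI is requested: asking for POLLIN on a file that is
 * always readable makes every poll() call return immediately.
 * POLLERR is reported by the kernel whether requested or not.
 * Returns 1 on notification, 0 on timeout, -1 on error.
 */
static int wait_for_update(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret <= 0)
		return ret;
	return (pfd.revents & (POLLPRI | POLLERR)) ? 1 : 0;
}
```

A caller would open the file once, then loop: reread_attr() to consume
the current contents, wait_for_update() to block until the next
notification. If poll() still returns at once with this pattern, the
problem is more likely on the kernel side than in the application.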