Date: Wed, 4 Apr 2018 11:11:10 +0200 (CEST)
From: Thomas Gleixner
To: Shivappa Vikas
cc: Vikas Shivappa, tony.luck@intel.com, ravi.v.shankar@intel.com, fenghua.yu@intel.com, sai.praneeth.prakhya@intel.com, x86@kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org, ak@linux.intel.com
Subject: Re: [PATCH 1/6] x86/intel_rdt/mba_sc: Add documentation for MBA software controller
References: <1522362376-3505-1-git-send-email-vikas.shivappa@linux.intel.com> <1522362376-3505-2-git-send-email-vikas.shivappa@linux.intel.com>

On Tue, 3 Apr 2018, Shivappa Vikas wrote:
> On Tue, 3 Apr 2018, Thomas Gleixner wrote:
> > On Thu, 29 Mar 2018, Vikas Shivappa wrote:
> > The L2 external bandwidth is higher than the L3 external bandwidth.
> >
> > Is there any information available from CPUID or whatever source which
> > allows us to retrieve the bandwidth ratio or the absolute maximum
> > bandwidth per level?
>
> There is no information in cpuid on the bandwidth available. Also we have seen
> from our experiments that the increase is not perfectly linear (delta
> bandwidth increase from 30% to 40% may not be same as 70% to 80%). So we
> currently dynamically caliberate this delta for the software controller.

I assume you mean: calibrate

Though I don't see anything which looks remotely like calibration.

Calibration means that you determine the exact parameters by observation and
then can use the calibrated values afterwards. But that's not what you are
doing. So please don't claim it's calibration.

You observe behaviour which depends on the workload and other factors. That's
not calibration. If you change the MSR by a granularity value then you
calculate the bandwidth delta vs. the previous MSR value. That only makes
sense and works when the application has the same memory access patterns
across both observation periods.

And of course, this won't necessarily be linear because if you throttle the
application then it gets less work done per CPU time slice and the resulting
stalls will also have side effects on the requested amount of memory and
therefore distort the measurement. Ditto the other way around.

There are too many factors influencing this, so claiming that it's
calibration is window dressing at best. Even worse it suggests that it's
something accurate, which subverts your goal of reducing confusion.

Adaptive control might be an acceptable description, though given the number
of factors which play into that it's still a euphemism for 'heuristic'.

> > What's also missing from your explanation is how that feedback loop behaves
> > under different workloads.
> >
> > Is this assuming that the involved threads/cpus actually try to utilize
> > the bandwidth completely?
>
> No, the feedback loop only guarantees that the usage will not exceed what the
> user specifies as max bandwidth. If it is using below the max value it does
> not matter how much less it is using.
> >
> > What happens if the threads/cpus are only using a small set because they
> > are idle or their computations are mostly cache local and do not need
> > external bandwidth? Looking at the implementation I don't see how that is
> > taken into account.
>
> The feedback only kicks into action if a rdtgroup uses more bandwidth than the
> max specified by the user. I specified that it is always "ensure the "actual
> b/w > 354 < user b/w" " and can add more explanation on these scenarios.

Please finally stop using this horrible 'b/w' thingy. It makes my eyes bleed
every time.

> Also note that we are using the MBM counters for this feedback loop. Now that
> the interface is much more useful because we have the same rdtgroup that is
> being monitored and controlled. (vs. if we had the perf mbm the group of
> threads in resctrl mba and in mbm could be different and would be hard to
> measure what the threads/cpus in the resctrl are using).

Why does that make me smile?

Thanks,

	tglx
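
For readers following the thread, the feedback loop under discussion can be
sketched roughly as below. This is only an illustration of the heuristic as
described in the mail, not the code from the patch: the names GroupState and
feedback_step, the 10% step and the 10..100 range are all assumptions made
for the example, and the rule for relaxing the throttle again is a guess at a
plausible counterpart to the throttling rule.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Per-group bookkeeping (all field names are hypothetical)."""
    throttle_pct: int       # current bandwidth percentage setting
    prev_bw: float          # bandwidth measured in the previous period
    delta_bw: float = 0.0   # bandwidth change observed after the last step
    changed: bool = False   # was the setting changed in the previous period?


def feedback_step(state, measured_bw, user_max_bw,
                  granularity=10, min_pct=10, max_pct=100):
    """Run one observation period of the heuristic and return the new setting."""
    # After a change of the setting, record how much the measured bandwidth
    # moved. This is an observation, not a calibration: it is only meaningful
    # if the workload behaved the same way in both periods.
    if state.changed:
        state.delta_bw = abs(measured_bw - state.prev_bw)
    state.changed = False

    if measured_bw > user_max_bw and state.throttle_pct > min_pct:
        # Usage exceeds the user-specified maximum: throttle by one step.
        state.throttle_pct = max(min_pct, state.throttle_pct - granularity)
        state.changed = True
    elif (measured_bw + state.delta_bw < user_max_bw
          and state.throttle_pct < max_pct):
        # Assumed relax rule: only step back up if the last observed delta
        # suggests the group would still stay below the maximum.
        state.throttle_pct = min(max_pct, state.throttle_pct + granularity)
        state.changed = True

    state.prev_bw = measured_bw
    return state.throttle_pct
```

With a maximum of 1000 and measurements of 1200, 1050, 900 and 700 the
setting moves 100 -> 90 -> 80 -> 80 -> 90. A group which is idle or mostly
cache local never exceeds the maximum, so its setting is never lowered. The
recorded delta is only as good as the assumption that the memory access
pattern did not change between the two periods.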