Date: Wed, 4 Apr 2018 11:56:16 -0700 (PDT)
From: Shivappa Vikas
To: Thomas Gleixner
cc: Shivappa Vikas, Vikas Shivappa, tony.luck@intel.com,
    ravi.v.shankar@intel.com, fenghua.yu@intel.com,
    sai.praneeth.prakhya@intel.com, x86@kernel.org, hpa@zytor.com,
    linux-kernel@vger.kernel.org, ak@linux.intel.com
Subject: Re: [PATCH 1/6] x86/intel_rdt/mba_sc: Add documentation for MBA software controller
References: <1522362376-3505-1-git-send-email-vikas.shivappa@linux.intel.com>
    <1522362376-3505-2-git-send-email-vikas.shivappa@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 4 Apr 2018, Thomas Gleixner wrote:

> On Tue, 3 Apr 2018, Shivappa Vikas wrote:
> > On Tue, 3 Apr 2018, Thomas Gleixner wrote:
> > > On Thu, 29 Mar 2018, Vikas Shivappa wrote:
> > > The L2 external bandwidth is higher than the L3 external bandwidth.
> > >
> > > Is there any information available from CPUID or whatever source which
> > > allows us to retrieve the bandwidth ratio or the absolute maximum
> > > bandwidth per level?
> >
> > There is no information in cpuid on the bandwidth available. Also we have seen
> > from our experiments that the increase is not perfectly linear (delta
> > bandwidth increase from 30% to 40% may not be same as 70% to 80%). So we
> > currently dynamically caliberate this delta for the software controller.
>
> I assume you mean: calibrate
>
> Though I don't see anything which looks remotely like calibration.
> Calibration means that you determine the exact parameters by observation and
> then can use the calibrated values afterwards. But that's not what you are
> doing. So please don't claim its calibration.
>
> You observe behaviour which depends on the workload and other
> factors. That's not calibration. If you change the MSR by a granularity
> value then you calculate the bandwidth delta vs. the previous MSR
> value. That only makes sense and works when the application is having the
> same memory access patterns accross both observation periods.
>
> And of course, this won't be necessarily linear because if you throttle the
> application then it gets less work done per CPU time slice and the
> resulting stalls will also have side effects on the requested amount of
> memory and therefore distort the measurement. Ditto the other way
> around.
>
> There are too many factors influencing this, so claiming that it's
> calibration is window dressing at best.
> Even worse it suggests that it's
> something accurate, which subverts your goal of reducing confusion.
>
> Adaptive control might be an acceptable description, though given the
> amount of factors which play into that it's still an euphemism for
> 'heuristic'.

Agreed, we do not really calibrate, and the only thing we guarantee is
that the actual bandwidth in bytes < the user-specified bandwidth in
bytes. This is what the hardware guaranteed when we specified the values
in percentage as well; it was just confusing.

>
> > > What's also missing from your explanation is how that feedback loop behaves
> > > under different workloads.
> > >
> > > Is this assuming that the involved threads/cpus actually try to utilize
> > > the bandwidth completely?
> >
> > No, the feedback loop only guarentees that the usage will not exceed what the
> > user specifies as max bandwidth. If it is using below the max value it does
> > not matter how much less it is using.
> >
> > > What happens if the threads/cpus are only using a small set because they
> > > are idle or their computations are mostly cache local and do not need
> > > external bandwidth? Looking at the implementation I don't see how that is
> > > taken into account.
> >
> > The feedback only kicks into action if a rdtgroup uses more bandwidth than the
> > max specified by the user. I specified that it is always "ensure the "actual
> > b/w
> > 354 < user b/w" " and can add more explanation on these scenarios.
>
> Please finally stop to use this horrible 'b/w' thingy. It makes my eyes bleed
> everytime.

Will fix - this text came from the already existing documentation.

>
> > Also note that we are using the MBM counters for this feedback loop. Now that
> > the interface is much more useful because we have the same rdtgroup that is
> > being monitored and controlled. (vs. if we had the perf mbm the group of
> > threads in resctrl mba and in mbm could be different and would be hard to
> > measure what the threads/cpus in the resctrl are using).
>
> Why does that make me smile?

I know why :) Full credit to you, as you had suggested rewriting the
cqm/mbm in resctrl, which is definitely very good in the long term!

Thanks,
Vikas

>
> Thanks,
>
> 	tglx
>
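
For reference, the feedback loop discussed above could be sketched roughly
as follows. This is only an illustration of the behaviour described in this
thread, not the kernel implementation: GroupState, feedback_step, every
field name, the MB/s unit and the 10% step are assumptions made up for the
example. As noted above, the observed delta is a workload-dependent
heuristic and not a calibrated value.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Hypothetical per-rdtgroup state; all names are illustrative."""
    user_limit_mbps: float         # user-specified maximum bandwidth
    throttle_pct: int = 100        # current percentage-style throttle value
    granularity: int = 10          # assumed step size for throttle changes
    min_pct: int = 10              # assumed lowest allowed throttle value
    last_delta_mbps: float = 0.0   # bandwidth change seen after the last step
    prev_bw_mbps: float = 0.0      # bandwidth measured in the previous period
    stepped: bool = False          # did the previous period change the throttle?


def feedback_step(state: GroupState, measured_mbps: float) -> int:
    """Run one observation period and return the new throttle value."""
    # If the throttle changed last period, record how much the measured
    # bandwidth moved. This depends on the workload, so it is a heuristic.
    if state.stepped:
        state.last_delta_mbps = abs(measured_mbps - state.prev_bw_mbps)

    old = state.throttle_pct
    if measured_mbps > state.user_limit_mbps:
        # Over the limit: throttle harder by one granularity step.
        state.throttle_pct = max(state.min_pct, old - state.granularity)
    elif measured_mbps + state.last_delta_mbps < state.user_limit_mbps:
        # Enough headroom that one step up should stay under the limit.
        state.throttle_pct = min(100, old + state.granularity)

    state.stepped = state.throttle_pct != old
    state.prev_bw_mbps = measured_mbps
    return state.throttle_pct
```

With this sketch, a group that stays below its limit (idle or mostly cache
local) is never throttled below 100; the loop only acts once the measured
bandwidth exceeds the user-specified maximum.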