Date: Wed, 5 Jul 2017 19:25:07 +0200 (CEST)
From: Thomas Gleixner
To: Peter Zijlstra
cc: Vikas Shivappa, x86@kernel.org, linux-kernel@vger.kernel.org,
    hpa@zytor.com, ravi.v.shankar@intel.com, vikas.shivappa@intel.com,
    tony.luck@intel.com, fenghua.yu@intel.com, andi.kleen@intel.com
Subject: Re: [PATCH 08/21] x86/intel_rdt/cqm: Add RMID(Resource monitoring ID) management
In-Reply-To: <20170705153439.xudhew5wpq3liivf@hirez.programming.kicks-ass.net>
References: <1498503368-20173-1-git-send-email-vikas.shivappa@linux.intel.com>
    <1498503368-20173-9-git-send-email-vikas.shivappa@linux.intel.com>
    <20170705153439.xudhew5wpq3liivf@hirez.programming.kicks-ass.net>

On Wed, 5 Jul 2017, Peter Zijlstra wrote:
> On Mon, Jul 03, 2017 at 11:55:37AM +0200, Thomas Gleixner wrote:
> >
> >
> >     if (static_branch_likely(&rdt_mon_enable_key)) {
> >             if (unlikely(current->rmid)) {
> >                     newstate.rmid = current->rmid;
> >                     __set_bit(newstate.rmid, this_cpu_ptr(rmid_bitmap));
>
> Non atomic op
>
> >             }
> >     }
> >
> > Now in rmid_free() we can collect that information:
> >
> >     cpumask_clear(&tmpmask);
> >     cpumask_clear(rmid_entry->mask);
> >
> >     cpus_read_lock();
> >     for_each_online_cpu(cpu) {
> >             if (test_and_clear_bit(rmid, per_cpu_ptr(cpu, rmid_bitmap)))
>
> atomic op

Indeed. We need atomic on both sides unfortunately.
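
To illustrate the point, here is a rough user-space sketch using C11 atomics. It is not the kernel bitops API: NR_CPUS, NR_RMIDS and the helper names mark_rmid_used(), test_and_clear_rmid() and collect_rmid_cpus() are made up for this example. If the marking side did a plain read-modify-write on the word, it could write back a stale value and undo a concurrent clear from the free path, so both sides use atomic read-modify-write operations here.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sizes, chosen only for this user-space sketch. */
#define NR_CPUS       4
#define NR_RMIDS      64
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS      ((NR_RMIDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bitmap per CPU, standing in for the per-CPU rmid_bitmap. */
static _Atomic unsigned long rmid_bitmap[NR_CPUS][NR_WORDS];

/* Context switch side: atomically mark rmid as used on this CPU. */
static void mark_rmid_used(int cpu, unsigned int rmid)
{
    unsigned long bit = 1UL << (rmid % BITS_PER_WORD);

    assert(cpu >= 0 && cpu < NR_CPUS && rmid < NR_RMIDS);
    atomic_fetch_or(&rmid_bitmap[cpu][rmid / BITS_PER_WORD], bit);
}

/* Free side: atomically clear the bit and report whether it was set. */
static bool test_and_clear_rmid(int cpu, unsigned int rmid)
{
    unsigned long bit = 1UL << (rmid % BITS_PER_WORD);
    unsigned long old;

    assert(cpu >= 0 && cpu < NR_CPUS && rmid < NR_RMIDS);
    old = atomic_fetch_and(&rmid_bitmap[cpu][rmid / BITS_PER_WORD], ~bit);
    return (old & bit) != 0;
}

/* Build a mask of the CPUs on which rmid was marked since the last call. */
static unsigned long collect_rmid_cpus(unsigned int rmid)
{
    unsigned long mask = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (test_and_clear_rmid(cpu, rmid))
            mask |= 1UL << cpu;
    }
    return mask;
}
```

In this sketch, collect_rmid_cpus() plays the role of the loop in rmid_free(): each call returns the CPUs that marked the RMID since the previous call and leaves their bits cleared.
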

> >                     cpumask_set(cpu, tmpmask);
> >     }
>
> > Another thing which needs some thought is the CPU hotplug code. We need to
> > make sure that pending work which is scheduled on an outgoing CPU is moved
> > in the offline callback to a still online CPU of the same domain and not
> > moved to some random CPU by the workqueue hotplug code.
>
> just flush the workqueue for that CPU? That's what the workqueue core
> _should_ do in any case. And that also covers the case where @cpu is the
> last in the set of CPUs we could run on.

Indeed.

> > There is another subtle issue. Assume a RMID is freed. The limbo stuff is
> > scheduled on all domains which have online CPUs.
> >
> > Now the last CPU of a domain goes offline before the threshold for clearing
> > the domain CPU bit in the rme->mask is reached.
> >
> > So we have two options here:
> >
> >   1) Clear the bit unconditionally when the last CPU of a domain goes
> >      offline.
>
> Arguably this. This is cache level stuff, that means this is the last
> CPU of a cache, so just explicitly kill the _entire_ cache and insta
> mark everything good again; WBINVD ftw.

Right.

> >   2) Arm a timer which clears the bit after a grace period
> >
> > #1 The RMID might become available for reuse right away because all other
> >    domains have not used it or have cleared their bits already.
> >
> >    If one of the CPUs of that domain comes online again and is associated
> >    to that reused RMID again, then the counter content might still contain
> >    leftovers from the previous usage.
>
> Not if we kill the cache on offline -- also, if all CPUs have been
> offline, it's not too weird to expect something like a package idle state
> to have happened and shot down the caches anyway.

Yes, didn't think about that.

Thanks,

	tglx