Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000)
	id 292882025C698; Thu, 6 Sep 2018 16:38:36 +0200 (CEST)
Date: Thu, 6 Sep 2018 16:38:36 +0200
From: Peter Zijlstra
To: Reinette Chatre
Cc: tglx@linutronix.de, fenghua.yu@intel.com, tony.luck@intel.com,
	mingo@redhat.com, acme@kernel.org, vikas.shivappa@linux.intel.com,
	gavin.hindman@intel.com, jithu.joseph@intel.com, dave.hansen@intel.com,
	hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 5/6] x86/intel_rdt: Use perf infrastructure for measurements
Message-ID: <20180906143836.GG24106@hirez.programming.kicks-ass.net>
References: <30b32ebd826023ab88f3ab3122e4c414ea532722.1534450299.git.reinette.chatre@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <30b32ebd826023ab88f3ab3122e4c414ea532722.1534450299.git.reinette.chatre@intel.com>
User-Agent: Mutt/1.10.0 (2018-05-17)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 16, 2018 at 01:16:08PM -0700, Reinette Chatre wrote:
> +	l2_miss_event = perf_event_create_kernel_counter(&perf_miss_attr,
> +							  plr->cpu,
> +							  NULL, NULL, NULL);
> +	if (IS_ERR(l2_miss_event))
> +		goto out;
> +
> +	l2_hit_event = perf_event_create_kernel_counter(&perf_hit_attr,
> +							 plr->cpu,
> +							 NULL, NULL, NULL);
> +	if (IS_ERR(l2_hit_event))
> +		goto out_l2_miss;
> +
> +	local_irq_disable();
> +	/*
> +	 * Check any possible error state of events used by performing
> +	 * one local read.
> +	 */
> +	if (perf_event_read_local(l2_miss_event, &tmp, NULL, NULL)) {
> +		local_irq_enable();
> +		goto out_l2_hit;
> +	}
> +	if (perf_event_read_local(l2_hit_event, &tmp, NULL, NULL)) {
> +		local_irq_enable();
> +		goto out_l2_hit;
> +	}
> +
> +	/*
> +	 * Disable hardware prefetchers.
>  	 *
> +	 * Call wrmsr direcly to avoid the local register variables from
> +	 * being overwritten due to reordering of their assignment with
> +	 * the wrmsr calls.
> +	 */
> +	__wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);

So what about virt?

> +
> +	/* Initialize rest of local variables */
> +	/*
> +	 * Performance event has been validated right before this with
> +	 * interrupts disabled - it is thus safe to read the counter index.
> +	 */
> +	l2_miss_pmcnum = x86_perf_rdpmc_ctr_get(l2_miss_event);
> +	l2_hit_pmcnum = x86_perf_rdpmc_ctr_get(l2_hit_event);
> +	line_size = plr->line_size;
> +	mem_r = plr->kmem;
> +	size = plr->size;

You probably want READ_ONCE() on that, the volatile cast in there
disallows the compiler from re-loading the values later.

> +
> +	/*
> +	 * Read counter variables twice - first to load the instructions
> +	 * used in L1 cache, second to capture accurate value that does not
> +	 * include cache misses incurred because of instruction loads.
> +	 */
> +	rdpmcl(l2_hit_pmcnum, l2_hits_before);

And this again does do virt.

> +	rdpmcl(l2_miss_pmcnum, l2_miss_before);
> +	/*
> +	 * From SDM: Performing back-to-back fast reads are not guaranteed
> +	 * to be monotonic. To guarantee monotonicity on back-toback reads,
> +	 * a serializing instruction must be placed between the two
> +	 * RDPMC instructions
> +	 */
> +	rmb();

You're copying the horrid horrid (did I say truly horrid?) use of
'serializing' from the SDM. Please don't do that. LFENCE is not a
serializing instruction.

But given the (new) definition LFENCE does ensure all prior instructions
are retired before it proceeds.

> +	rdpmcl(l2_hit_pmcnum, l2_hits_before);
> +	rdpmcl(l2_miss_pmcnum, l2_miss_before);
> +	/*
> +	 * rdpmc is not a serializing instruction. Add barrier to prevent
> +	 * instructions that follow to begin executing before reading the
> +	 * counter value.
> +	 */
> +	rmb();
> +	for (i = 0; i < size; i += line_size) {
> +		/*
> +		 * Add a barrier to prevent speculative execution of this
> +		 * loop reading beyond the end of the buffer.
> +		 */
> +		rmb();
> +		asm volatile("mov (%0,%1,1), %%eax\n\t"
> +			     :
> +			     : "r" (mem_r), "r" (i)
> +			     : "%eax", "memory");

Why does that need to be asm?

> +	}

I think you want another LFENCE here, to ensure the RDPMCs don't overlap
with the last LOAD in the loop above.

> +	rdpmcl(l2_hit_pmcnum, l2_hits_after);
> +	rdpmcl(l2_miss_pmcnum, l2_miss_after);
> +	/*
> +	 * rdpmc is not a serializing instruction. Add barrier to ensure
> +	 * events measured have completed and prevent instructions that
> +	 * follow to begin executing before reading the counter value.
> +	 */
> +	rmb();
> +	/* Re-enable hardware prefetchers */
> +	wrmsr(MSR_MISC_FEATURE_CONTROL, 0x0, 0x0);

So what I do in userspace is:

	mmap_read_pinned(ctx); /* prime */

	for (many-times) {
		cnt = mmap_read_pinned(evt);
		barrier();
		cnt = mmap_read_pinned(evt) - cnt;
		update_stats(&empty, cnt);

		cnt = mmap_read_pinned(evt);
		barrier();
		/* the thing */
		barrier();
		cnt = mmap_read_pinned(evt) - cnt;
		update_stats(&stat, cnt);
	}

	sub_stats(&stat, &empty);

Maybe I should've used asm("lfence" ::: "memory") instead of barrier(),
but the results were good enough.
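
[Editor's sketch, not part of the original mail.] The userspace loop above
can be made concrete roughly as follows. Everything beyond the loop shape
is an assumption: the mail does not show mmap_read_pinned(),
update_stats() or sub_stats(), so read_counter() is a hypothetical
stand-in that fakes a counter with a fixed cost of 3 events per read,
workload() is a hypothetical "the thing", and the running-mean bookkeeping
is just one plausible way to do the stats. Subtracting the two means plays
the role of sub_stats(). barrier() here is a compiler-only barrier
(GCC/Clang syntax) and emits no fence instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier: stops the compiler reordering memory accesses
 * across it, but emits no fence instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for mmap_read_pinned(): a fake counter where
 * every read itself costs 3 events. Not a real perf interface. */
static uint64_t fake_counter;

static uint64_t read_counter(void)
{
	fake_counter += 3;
	return fake_counter;
}

/* Hypothetical "the thing": pretends to generate 'events' events. */
static void workload(uint64_t events)
{
	fake_counter += events;
}

/* Assumed stats bookkeeping: a running mean over all samples. */
struct stats {
	uint64_t n;
	double mean;
};

static void update_stats(struct stats *s, uint64_t val)
{
	s->n++;
	s->mean += ((double)val - s->mean) / (double)s->n;
}

static double stats_mean(const struct stats *s)
{
	assert(s->n > 0);	/* a mean needs at least one sample */
	return s->mean;
}

/* Measure the workload 'iterations' times (must be >= 1) and return the
 * mean event count with the mean cost of an empty section subtracted. */
double measure(unsigned int iterations, uint64_t events)
{
	struct stats empty = { 0, 0.0 }, stat = { 0, 0.0 };
	uint64_t cnt;
	unsigned int i;

	read_counter();		/* prime */

	for (i = 0; i < iterations; i++) {
		/* Empty section: cost of the measurement itself. */
		cnt = read_counter();
		barrier();
		cnt = read_counter() - cnt;
		update_stats(&empty, cnt);

		/* Same bracket, now around the workload. */
		cnt = read_counter();
		barrier();
		workload(events);
		barrier();
		cnt = read_counter() - cnt;
		update_stats(&stat, cnt);
	}

	/* Stand-in for sub_stats(&stat, &empty). */
	return stats_mean(&stat) - stats_mean(&empty);
}
```

With the fake counter, the empty section always measures 3 and the
workload section measures events + 3, so the difference recovers exactly
the events attributed to the workload. On real hardware both samples are
noisy, which is why the loop runs many times and compares statistics
rather than single reads.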