Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752845AbaFEOp2 (ORCPT ); Thu, 5 Jun 2014 10:45:28 -0400
Received: from mail-yk0-f177.google.com ([209.85.160.177]:41947 "EHLO
        mail-yk0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752587AbaFEOpY (ORCPT );
        Thu, 5 Jun 2014 10:45:24 -0400
MIME-Version: 1.0
In-Reply-To: <20140605140316.GB16811@pd.tnic>
References: <1401917658-26065-1-git-send-email-eranian@google.com>
        <1401917658-26065-10-git-send-email-eranian@google.com>
        <20140605132750.GA16811@pd.tnic>
        <20140605140316.GB16811@pd.tnic>
Date: Thu, 5 Jun 2014 17:45:24 +0300
Message-ID:
Subject: Re: [PATCH 9/9] perf/x86: add syfs entry to disable HT bug workaround
From: Maria Dimakopoulou
To: Borislav Petkov
Cc: Stephane Eranian, Matt Fleming, LKML, Peter Zijlstra,
        "mingo@elte.hu", "ak@linux.intel.com", Jiri Olsa, "Yan, Zheng"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 5, 2014 at 5:03 PM, Borislav Petkov wrote:
> On Thu, Jun 05, 2014 at 03:42:14PM +0200, Stephane Eranian wrote:
>> I gave a test case earlier:
>>
>> # echo 0 >/proc/sys/kernel/nmi_watchdog
>> # run_my_uniform_workload_on_all_cpus &
>> # perf stat -a -e r81d0,r01d1,r08d0,r20d1 sleep 5
>>
>> That run gives the correct answer.
>>
>> If I just look at CPU0 CPU4 siblings:
>>
>> CPU0, counter0 leaks N counts to CPU4, counter 0
>>
>> but at the same time:
>>
>> CPU4, counter0 leaks N counts to CPU0, counter 0
>>
>> This is because we have the same event in the same
>> counter AND the workload is uniform, meaning the
>> event (here loads retired) occurs at the same rate
>> on both siblings.
>>
>> You can test this by measuring only on one HT.
>> # perf stat -a -C0 -e r81d0,r01d1,r08d0,r20d1 sleep 5
>>
>> Note that some events, leak more than they count.
>
> Ok, so AFAIU, this particular workload counts correctly just because
> counters leak the same amount. If so, what happens if you run this exact
> same workload with the workaround enabled? I read something about a bit
> more counter multiplexing... or is there a more serious issue?

The issue is that the outgoing leaked counts are not compensated by
the incoming leaked counts of the sibling thread.

With the workaround, corrupting events are always scheduled with an
empty sibling counter. This means that their leaked counts are lost,
so lower counts are expected with the workaround.

Note that this is not a side effect of the workaround: in general,
leaked counts are lost whenever nothing is measured on the sibling
counter.

In a second series we intend to re-integrate the counts for
counting-mode events. The workaround makes this easier because it
guarantees that the sibling counter is unused, so its counts are
purely leaked counts and can be safely re-integrated.

>
> Thanks.
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
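
For illustration only, the arithmetic described in the reply above can be
sketched as a toy model. This is not kernel code and does not model the
hardware; the function name, the counts (1000 events, 100 leaked), and the
idea that a counter reads "own events minus leaked-out plus leaked-in" are
assumptions made to show why a uniform workload appears correct, why
counts drop with the workaround, and how re-integration could restore them.

```python
def observed(true_count, leaked_out, leaked_in):
    """Toy model (assumption): value read from one counter.

    Own events, minus the counts leaked to the sibling counter,
    plus the counts leaked in from the sibling counter.
    """
    return true_count - leaked_out + leaked_in


# Hypothetical numbers, chosen only for the example.
TRUE_COUNT = 1000   # events that really occurred on each hyper-thread
LEAK = 100          # counts leaked to the sibling's counter

# Case 1: uniform workload, same event in the same counter on both
# siblings. Outgoing and incoming leaks cancel, so the result looks right.
cpu0_uniform = observed(TRUE_COUNT, LEAK, LEAK)

# Case 2: workaround enabled. The sibling counter is kept empty, so
# nothing leaks in and the outgoing leak is not compensated.
cpu0_workaround = observed(TRUE_COUNT, LEAK, 0)

# Case 3: re-integration. Because the sibling counter is unused, whatever
# it holds is assumed to be purely leaked counts and can be added back.
sibling_counter_residue = LEAK
cpu0_reintegrated = cpu0_workaround + sibling_counter_residue

print(cpu0_uniform, cpu0_workaround, cpu0_reintegrated)
```

In this model the workaround case reads low by exactly the leaked amount,
which is the quantity the planned second series would add back for
counting-mode events.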