Date: Sun, 9 Jun 2019 20:53:03 -0700
From: Fenghua Yu
To: Andy Lutomirski
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin, Ashok Raj, Tony Luck, Ravi V Shankar, linux-kernel, x86
Subject: Re: [PATCH v4 3/5] x86/umwait: Add sysfs interface to control umwait C0.2 state
Message-ID: <20190610035302.GA162238@romley-ivt3.sc.intel.com>
References: <1559944837-149589-1-git-send-email-fenghua.yu@intel.com> <1559944837-149589-4-git-send-email-fenghua.yu@intel.com>

On Sat, Jun 08, 2019 at 03:50:32PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 7, 2019 at 3:10 PM Fenghua Yu wrote:
> >
> > C0.2 state in umwait and tpause instructions can be enabled or disabled
> > on a processor through IA32_UMWAIT_CONTROL MSR register.
> >
> > By default, C0.2 is enabled and the user wait instructions result in
> > lower power consumption with slower wakeup time.
> >
> > But in real time systems which require faster wakeup time although power
> > savings could be smaller, the administrator needs to disable C0.2 and all
> > C0.2 requests from user applications revert to C0.1.
> >
> > A sysfs interface "/sys/devices/system/cpu/umwait_control/enable_c02" is
> > created to allow the administrator to control C0.2 state during run time.
>
> This looks better than the previous version. I think the locking is
> still rather confused. You have a mutex that you hold while changing
> the value, which is entirely reasonable. But, of the code paths that
> write the MSR, only one takes the mutex.
>
> I think you should consider making a function that just does:
>
> wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0);
>
> and using it in all the places that update the MSR. The only thing
> that should need the lock is the sysfs code to avoid accidentally
> corrupting the value, but that code should also use WRITE_ONCE to do
> its update.

Based on the comment, the illustrative CPU online and enable_c02 store
functions would be:

umwait_cpu_online()
{
	wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0);
	return 0;
}

enable_c02_store()
{
	mutex_lock(&umwait_lock);
	umwait_control_c02 = (u32)!c02_enabled;
	WRITE_ONCE(umwait_control_cached,
		   umwait_control_c02 | get_umwait_control_max_time());
	on_each_cpu(umwait_control_msr_update, NULL, 1);
	mutex_unlock(&umwait_lock);
}

Now suppose umwait_control_cached = 100000 initially and only CPU0 is
running. The admin changes bit 0 in the MSR from 0 to 1 to disable C0.2
and onlines CPU1 at the same time:

1. On CPU1, read umwait_control_cached into eax as 100000 in
   umwait_cpu_online()
2. On CPU0, write 100001 to umwait_control_cached in enable_c02_store()
3. On CPU1, wrmsr with eax=100000 in umwait_cpu_online()
4. On CPU0, wrmsr with 100001 in enable_c02_store()

The result is that CPU0 and CPU1 end up with different MSR values.

The problem is that nothing serializes the wrmsr in umwait_cpu_online()
against the one in enable_c02_store(). WRITE_ONCE() and READ_ONCE() only
serialize access to umwait_control_cached. We need to serialize wrmsr()
as well to guarantee that all CPUs have the same MSR value.

So does it make sense to keep the mutex and locking as the current patch
does?

Thanks.

-Fenghua