Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp3227265ybi; Mon, 10 Jun 2019 06:46:02 -0700 (PDT) X-Google-Smtp-Source: APXvYqypKRExO8kL0rGogaU2GsnU6zREic51wEOfs6KA0ntetu22bQupOMbA3zNxD3AGRQTizIMu X-Received: by 2002:a17:902:968b:: with SMTP id n11mr34040117plp.120.1560174362571; Mon, 10 Jun 2019 06:46:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560174362; cv=none; d=google.com; s=arc-20160816; b=v+6eMef80P7Fh/79byz/jFHN8NuCMrI+6UiXmN7rUU1cJVNVw6UvsirEJD72Yxph+P j9IxvNDU7j4FaC4IQjmyW9yGcgRJYnDuesid0g/0xBxWlc9FTRfEj/TCMb879LOPMcC1 +oXPbTFbEMQiJxjxUR/NSFAYVoTQVHgFGAZvuCghtDsCtJv/BcmttTU8T/96wpeB4SlC 3c2U8CxN5b6Hmdwe3M4Wmcs0Vf6O/Q8qzgDsQjFuOwbmvL9GxdXJZnDl+1WevxtnvJY7 Stclp0B7LH/SIMaiMmsl5QfUapt8wnLLjiG/hUUH4Vrhm+3GH5NiR0raWd+Qc6ujuhR6 49mQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:references:message-id :content-transfer-encoding:cc:date:in-reply-to:from:subject :mime-version:dkim-signature; bh=ErGbE0m83drl7dTry1xy777Y6e0hhPDAZgr47/W8iV4=; b=ZDOitKBr6GSjMYzCSdKjjl/gFtkwFP0BAk/tCoBxb8UE8mYhkD1faFME0u3aHzesFe 9/Y9OYJzgFK/QB772sCEMdhRBG3quXxf1Ig+ySRRRv81lJ9tlGWOuI9/2h9hg82mrVB3 1FuUxvhjhnXaFO5TqarIIjU/UAAicr1UnDxotuyd+nRQkeLrjJS5RIHIampJU2/Gw0uX KYhnl7G6ETDs6yxDL8cN4ShS8kqLzGIPm02ZRAxsohf0L/JAxA9i8j0vVVJjfzzW20p1 tTy1U5KJ/5okHqKVQYYHGMwZpyhEEv++sZRBM8NW/jo4tOQh4snLfykzwu3DTp7xMtpy WA2w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amacapital-net.20150623.gappssmtp.com header.s=20150623 header.b=Uty0cYCc; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u12si3846442pgk.387.2019.06.10.06.45.46; Mon, 10 Jun 2019 06:46:02 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@amacapital-net.20150623.gappssmtp.com header.s=20150623 header.b=Uty0cYCc; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390414AbfFJNle (ORCPT + 99 others); Mon, 10 Jun 2019 09:41:34 -0400 Received: from mail-pf1-f193.google.com ([209.85.210.193]:42668 "EHLO mail-pf1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390156AbfFJNle (ORCPT ); Mon, 10 Jun 2019 09:41:34 -0400 Received: by mail-pf1-f193.google.com with SMTP id q10so5326264pff.9 for ; Mon, 10 Jun 2019 06:41:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amacapital-net.20150623.gappssmtp.com; s=20150623; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=ErGbE0m83drl7dTry1xy777Y6e0hhPDAZgr47/W8iV4=; b=Uty0cYCcUyTPhGivuo38Ob2kXf+Xu8/9JLqYha/HPU0lFdVNvXgIrI5qKDzyoArMC2 Vc2oLjOqZ1cu0c46mcrYCP5JRGpkQSGkji0rpT3UEQQO4bmFKKcDlY981G6xAPXDiGaL L8XAZo4FW4ndKyGqSFL2IUBPuH3g54z6ze3awEbnHv+Dt1eq9mvqlHCgD2ZHmeZrZY4w tpD+VKWLApK4Vg36ZsHLXyzJ5398zSnxUbdpEKl+vXzxEUAlhHP8L1rbicREAq0D52nn p5wVO7MAhcMIVLCvxAfdLdHFbC1NMIiXzx+GznqPILCgnHPi9jj6EU0k4sF49zPhlUsh 99+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=ErGbE0m83drl7dTry1xy777Y6e0hhPDAZgr47/W8iV4=; b=LxPmCT7BE06dH9lgHXz2Ham7cF6kmqRW+vQOPgFBExLdB/5HrYZ3FOoeD7OCHbYGAj v+WGVClfel5S9RfpgW7pVEoX5mInHi3x2ESBA4nUxRTsppe1Wfnh++v27ldZ8JWRdJoT XY3Xapgt3zl1ndVD+4qDCfyXo5x/iuD8dtz3GuvRY8EEB0KwJzCNiRL5jpW2SXlMclU2 rLtLRVMbxtFv6ia4JjcieEFMmMT32bC16eE5SH8nSdrreg3NWYhl4IpZ5hEuJSDrxpfN 1MX0D9jifJWJ+xPzX07eJFS5pMu64JBPPoj7E1HRH9EbWCVlXvQ5pJQuxGUdqHNNVUhk udzw== X-Gm-Message-State: APjAAAVMa2bsc5Zkw5qGsorEujPBhKETxAL80EG62jUU7ddypxHuHCmn GpYwYcgZ/vm0oVNxvuSpqM3wyw== X-Received: by 2002:aa7:8b12:: with SMTP id f18mr73733166pfd.178.1560174093464; Mon, 10 Jun 2019 06:41:33 -0700 (PDT) Received: from ?IPv6:2601:646:c200:1ef2:40ea:1a58:6516:d6f2? ([2601:646:c200:1ef2:40ea:1a58:6516:d6f2]) by smtp.gmail.com with ESMTPSA id m20sm10718928pjn.16.2019.06.10.06.41.32 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 10 Jun 2019 06:41:32 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (1.0) Subject: Re: [PATCH v4 3/5] x86/umwait: Add sysfs interface to control umwait C0.2 state From: Andy Lutomirski X-Mailer: iPhone Mail (16F203) In-Reply-To: <20190610060234.GD162238@romley-ivt3.sc.intel.com> Date: Mon, 10 Jun 2019 06:41:31 -0700 Cc: Andy Lutomirski , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Ashok Raj , Tony Luck , Ravi V Shankar , linux-kernel , x86 Content-Transfer-Encoding: quoted-printable Message-Id: References: <1559944837-149589-1-git-send-email-fenghua.yu@intel.com> <1559944837-149589-4-git-send-email-fenghua.yu@intel.com> <20190610035302.GA162238@romley-ivt3.sc.intel.com> <20190610060234.GD162238@romley-ivt3.sc.intel.com> To: Fenghua Yu Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > On Jun 9, 2019, at 11:02 PM, Fenghua Yu wrote: >=20 >> On Sun, Jun 09, 2019 at 09:24:18PM -0700, Andy Lutomirski wrote: >>> On Sun, Jun 9, 2019 at 9:02 PM Fenghua Yu wrote: >>>=20 >>>> On Sat, Jun 08, 2019 at 03:50:32PM -0700, Andy Lutomirski wrote: >>>>> On Fri, Jun 7, 2019 at 3:10 PM Fenghua Yu wrote= : >>>>>=20 >>>>> C0.2 state in umwait and tpause instructions can be enabled or disable= d >>>>> on a processor through IA32_UMWAIT_CONTROL MSR register. >>>>>=20 >>>>> By default, C0.2 is enabled and the user wait instructions result in >>>>> lower power consumption with slower wakeup time. >>>>>=20 >>>>> But in real time systems which require faster wakeup time although pow= er >>>>> savings could be smaller, the administrator needs to disable C0.2 and a= ll >>>>> C0.2 requests from user applications revert to C0.1. >>>>>=20 >>>>> A sysfs interface "/sys/devices/system/cpu/umwait_control/enable_c02" i= s >>>>> created to allow the administrator to control C0.2 state during run ti= me. >>>>=20 >>>> This looks better than the previous version. I think the locking is >>>> still rather confused. You have a mutex that you hold while changing >>>> the value, which is entirely reasonable. But, of the code paths that >>>> write the MSR, only one takes the mutex. >>>>=20 >>>> I think you should consider making a function that just does: >>>>=20 >>>> wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0); >>>>=20 >>>> and using it in all the places that update the MSR. The only thing >>>> that should need the lock is the sysfs code to avoid accidentally >>>> corrupting the value, but that code should also use WRITE_ONCE to do >>>> its update. >>>=20 >>> Based on the comment, the illustrative CPU online and enable_c02 store >>> functions would be: >>>=20 >>> umwait_cpu_online() >>> { >>> wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0= ); >>> return 0; >>> } >>>=20 >>> enable_c02_store() >>> { >>> mutex_lock(&umwait_lock); >>> umwait_control_c02 =3D (u32)!c02_enabled; >>> WRITE_ONCE(umwait_control_cached, 2 | get_umwait_control_max_time(= )); >>> on_each_cpu(umwait_control_msr_update, NULL, 1); >>> mutex_unlock(&umwait_lock); >>> } >>>=20 >>> Then suppose umwait_control_cached =3D 100000 initially and only CPU0 is= >>> running. Admin change bit 0 in MSR from 0 to 1 to disable C0.2 and is >>> onlining CPU1 in the same time: >>>=20 >>> 1. On CPU1, read umwait_control_cached to eax as 100000 in >>> umwait_cpu_online() >>> 2. On CPU0, write 100001 to umwait_control_cached in enable_c02_store() >>> 3. On CPU1, wrmsr with eax=3D100000 in umwaint_cpu_online() >>> 4. On CPU0, wrmsr with 100001 in enabled_c02_store() >>>=20 >>> The result is CPU0 and CPU1 have different MSR values. >>=20 >> Yes, but only transiently, because you didn't finish your example. >>=20 >> Step 5: enable_c02_store() does on_each_cpu(), and CPU 1 gets updated. >=20 > There is no sync on wrmsr on CPU0 and CPU1. What do you mean by sync? > So a better sequence to > describe the problem is changing the order of wrmsr: >=20 > 1. On CPU1, read umwait_control_cached to eax as 100000 in > umwait_cpu_online() > 2. On CPU0, write 100001 to umwait_control_cached in enable_c02_store() > 3. On CPU0, wrmsr with 100001 in on_each_cpu() in enabled_c02_store() > 4. On CPU1, wrmsr with eax=3D100000 in umwaint_cpu_online() >=20 > So CPU1 and CPU0 have different MSR values. This won't be transient. You are still ignoring the wrmsr on CPU1 due to on_each_cpu().