Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp4345130rdb; Thu, 14 Sep 2023 21:55:10 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG1ufSTe/2d+ms5or4pvonxj6lQeQ1qtihJLdiPjO7OSr3A4wKt5AgnZNlpz813jfJYJQrU X-Received: by 2002:a05:6358:591e:b0:140:f2fd:e9d7 with SMTP id g30-20020a056358591e00b00140f2fde9d7mr1048124rwf.4.1694753710078; Thu, 14 Sep 2023 21:55:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694753710; cv=none; d=google.com; s=arc-20160816; b=mcNxW6MQI9GZdYw6np2cA0t57U1ncIn261WixVsJiSMenCGZX/WeTyN3RdywMgzVtU XoXy1YHA42QH+IEwumsLZiWLrNpzJjcNYYdh81z0KEeiOT2Nsp5DYs/o3GLI+NdFO3Sh tD+QYJ/6OeFA3IChoTz4Cj6CbQQCLP5D2SSKfwbjwXMogY2lflTc6fzY4FJu7qVlUE/M T+2XOGVhEtbTjGojuxkI/q/tWG3mB36FZ5qhsEN4RPqbGMj86+OVMDrT1yHbuq24Mzqe kQFU+vfrAFuG7ECoReDZyIdF++M8noNrJIp6S8JmMwV+uS+WRJLX2iOdncjD6NDeOXMm K5mA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:references :cc:to:content-language:subject:from:user-agent:mime-version:date :message-id:dkim-signature; bh=jEsOE3VahVBRrZQB+F/BuNrsfqTbbzggGN9bcwX+LFw=; fh=pBu2wn3bXdvXrzepYNQWZzIpz+stB6aXbliiNy6TwXw=; b=jlq+GjkLTEC+6yVvEp1quaenbBuxyWiv2KUr+qwB01cVTmxC2FzYpyikIwKq7JtPD3 ew7sR9bNaX85THYqyXQXuR1U7iHLeTgoZOmdLKVpztTkbLhEuYiNMLaq6jJigU9Rm7Jp 6YnVtNaFRZaTA3Nb9ewD87afgQMcLeTgXCVBMfJwbl1aZfaHRxk5PG4y/6d8gi+vy5So LzFm14tsCvFpig4J04/D0Rls0ENrmqL5H/YDFnIqr+SFmrE3VDaNgijJ+Kof5fF1l6MG gnzmkKvlmA9gc7Pp3+1MORIh85PHBvwbwmrRJAo2Qamc4AdX5M53/ZSdANmCarFbnOyT B6EQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@citrix.com header.s=google header.b=WBmURdAI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=citrix.com Return-Path: Received: from groat.vger.email (groat.vger.email. [23.128.96.35]) by mx.google.com with ESMTPS id m1-20020a656a01000000b0056532fbe28csi2789949pgu.459.2023.09.14.21.55.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 21:55:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35; Authentication-Results: mx.google.com; dkim=pass header.i=@citrix.com header.s=google header.b=WBmURdAI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=citrix.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id 3DD97832A00C; Thu, 14 Sep 2023 16:47:02 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230429AbjINXqj (ORCPT + 99 others); Thu, 14 Sep 2023 19:46:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230306AbjINXqh (ORCPT ); Thu, 14 Sep 2023 19:46:37 -0400 Received: from mail-ej1-x62e.google.com (mail-ej1-x62e.google.com [IPv6:2a00:1450:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A4D12707 for ; Thu, 14 Sep 2023 16:46:33 -0700 (PDT) Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-9a9d6b98845so588666266b.0 for ; Thu, 14 Sep 2023 16:46:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1694735192; x=1695339992; darn=vger.kernel.org; h=content-transfer-encoding:in-reply-to:references:cc:to :content-language:subject:from:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=jEsOE3VahVBRrZQB+F/BuNrsfqTbbzggGN9bcwX+LFw=; b=WBmURdAIViqySOYd4mwoL5+CGavhBHlAcQZJOFnePa/0bAXu0bgybRI3OcGj4HTG2j c4WP2JsR5jpOWTsFRjLuua//y/QQpf3bQLoZvnXtVvdNZZ0OV5Ll75KL7oUytg7KbJMU ZOjCMe/ghixf3/EUuPcV+lYqQ1FwmK9/STj1Q= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694735192; x=1695339992; h=content-transfer-encoding:in-reply-to:references:cc:to :content-language:subject:from:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=jEsOE3VahVBRrZQB+F/BuNrsfqTbbzggGN9bcwX+LFw=; b=gSfrBGMu0QaocrI9XcZtHOWHuK9+NLlS/i8dYpfu0PZ+1MkEQ69ZPDx6VkUucW2xVP mLUy+9gfAR+AbpnHYodEM13HaFAzwrKSNJaGeGh2HRO2jpO19XaNVOFWYrnxJB9a8+3q xzzcZJ0M4NhWJz8kyOxhIeZWCEQoQo5UApDA4Uy2cArqtGZ/SWX+7fPZ3eSOguttaJaG aXqM3DSxVay+8AJHIDkAhOwtIrTE95EHWxzXG6HTl0c1SS5mJMXLFTlET6vVuqXVRmH8 CTsJgdSLCQ582vJBVKSSnksGJtrDMRugAQE58FVUv6I0pj1whYNPgefSwxOB1TluhZ9U 5PvA== X-Gm-Message-State: AOJu0Yw9Sgrj2jFFgMMKDJdM8zhxpAbmDMOSnGSCeBPEwoGtyj7Ezavv JWgS/ZmBLSxh/0SDyX1jwEJ44A== X-Received: by 2002:a17:906:51d0:b0:9a1:fda6:2e2a with SMTP id v16-20020a17090651d000b009a1fda62e2amr179416ejk.9.1694735191861; Thu, 14 Sep 2023 16:46:31 -0700 (PDT) Received: from [192.168.1.10] (host-92-12-44-130.as13285.net. [92.12.44.130]) by smtp.gmail.com with ESMTPSA id m21-20020a1709060d9500b009937dbabbd5sm1623158eji.220.2023.09.14.16.46.31 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Sep 2023 16:46:31 -0700 (PDT) Message-ID: <7ba4ae3e-f75d-66a8-7669-b6eb17c1aa1c@citrix.com> Date: Fri, 15 Sep 2023 00:46:30 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 From: andrew.cooper3@citrix.com Subject: Re: [PATCH v10 03/38] x86/msr: Add the WRMSRNS instruction support Content-Language: en-GB To: Thomas Gleixner , Xin Li , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org, kvm@vger.kernel.org, xen-devel@lists.xenproject.org Cc: mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, luto@kernel.org, pbonzini@redhat.com, seanjc@google.com, peterz@infradead.org, jgross@suse.com, ravi.v.shankar@intel.com, mhiramat@kernel.org, jiangshanlai@gmail.com References: <20230914044805.301390-1-xin3.li@intel.com> <20230914044805.301390-4-xin3.li@intel.com> <6f5678ff-f8b1-9ada-c8c7-f32cfb77263a@citrix.com> <87y1h81ht4.ffs@tglx> In-Reply-To: <87y1h81ht4.ffs@tglx> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Thu, 14 Sep 2023 16:47:02 -0700 (PDT) X-Spam-Status: No, score=-2.3 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email On 15/09/2023 12:00 am, Thomas Gleixner wrote: >> and despite what Juergen said, I'm going to recommend that you do wire >> this through the paravirt infrastructure, for the benefit of regular >> users having a nice API, not because XenPV is expecting to do something >> wildly different here. > I fundamentaly hate adding this to the PV infrastructure. We don't want > more PV ops, quite the contrary. What I meant was "there should be the two top-level APIs, and under the covers they DTRT".  Part of doing the right thing is to wire up paravirt for configs where that is specified. Anything else is going to force people to write logic of the form:     if (WRMSRNS && !XENPV)         wrmsr_ns(...)     else         wrmsr(...) which is going to be worse overall.  And there really is one example of this antipattern already in the series. > For the initial use case at hand, there is an explicit FRED dependency > and the code in question really wants to use WRMSRNS directly and not > through a PV function call. > > I agree with your reasoning for the more generic use case where we can > gain performance independent of FRED by using WRMSRNS for cases where > the write has no serialization requirements. > > But this made me look into PV ops some more. For actual performance > relevant code the current PV ops mechanics are a horrorshow when the op > defaults to the native instruction. > > Let's look at wrmsrl(): > > wrmsrl(msr, val > wrmsr(msr, (u32)val, (u32)val >> 32)) > paravirt_write_msr(msr, low, high) > PVOP_VCALL3(cpu.write_msr, msr, low, high) > > Which results in > > mov $msr, %edi > mov $val, %rdx > mov %edx, %esi > shr $0x20, %rdx > call native_write_msr > > and native_write_msr() does at minimum: > > mov %edi,%ecx > mov %esi,%eax > wrmsr > ret Yeah, this is daft.  But it can also be fixed irrespective of WRMSRNS. WRMSR has one complexity that most other PV-ops don't, and that's the exception table reference for the instruction itself. In a theoretical future ought to look like:     mov    $msr, %ecx     mov    $lo, %eax     mov    $hi, %edx     1: {call paravirt_blah(%rip) | cs...cs wrmsr | cs...cs wrmsrns }     _ASM_EXTABLE(1b, ...) In paravirt builds, the CALL needs to be the emitted form, because it needs to function in very early boot. But once the paravirt-ness has been chosen and alternatives run, the as-native paths are fully inline. The alternative which processes this site wants to conclude that, in the case it does not alter from the CALL, to clobber the EXTABLE reference.  CALL instructions can #GP, and you don't want to end up thinking you're handling a WRMSR #GP when in fact it was a non-canonical function pointer. > In the worst case 'ret' is going through the return thunk. Not to talk > about function prologues and whatever. > > This becomes even more silly for trivial instructions like STI/CLI or in > the worst case paravirt_nop(). STI/CLI are already magic.  Are they not inlined? > The call makes only sense, when the native default is an actual > function, but for the trivial cases it's a blatant engineering > trainwreck. > > I wouldn't care at all if CONFIG_PARAVIRT_XXL would be the esoteric use > case, but AFAICT it's default enabled on all major distros. > > So no. I'm fundamentally disagreeing with your recommendation. The way > forward is: > > 1) Provide the native variant for wrmsrns(), i.e. rename the proposed > wrmsrns() to native_wrmsr_ns() and have the X86_FEATURE_WRMSRNS > safety net as you pointed out. > > That function can be used in code which is guaranteed to be not > affected by the PV_XXL madness. > > 2) Come up with a sensible solution for the PV_XXL horrorshow > > 3) Implement a sane general variant of wrmsr_ns() which handles > both X86_FEATURE_WRMSRNS and X86_MISFEATURE_PV_XXL > > 4) Convert other code which benefits from the non-serializing variant > to wrmsr_ns() Well - point 1 is mostly work that needs reverting as part of completing point 3, and point 2 clearly needs doing irrespective of anything else. Thanks, ~Andrew