Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp464421pxa; Fri, 31 Jul 2020 17:43:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw0kGUJNZZpAgdJwO68EAaUjnXWVESOx5XIfr7zYM+Quu77pPtvVdG4INohtjSzybZdhCr5 X-Received: by 2002:a17:906:fb04:: with SMTP id lz4mr6774672ejb.394.1596242625354; Fri, 31 Jul 2020 17:43:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596242625; cv=none; d=google.com; s=arc-20160816; b=RUklF1XX4fzIgNPOrQTuwqj3XDvHPUil2A8l49CMJi/OjtGH+SXeu8Gw4NwkZiD94R MrseP3cQ+zOsPPYuI6dlfEbq5idbXWSH15WKlg053OH9goosirP34ro8LId6htXf6XrO ol2hcRf+lohduUjhLVtNA4kZce5j2USEvqspas3yE4WXdeJNXwP90KQrIu5GbVTDp9yB XXNE33TdNjPGx8ALWnjbTsnAbdV23OcBeL/ndaDw4WhJKcTrF1NEdVqj7yVWF85mHI0m 7cANi1H/y5mHt6X7Ow3aXHWAlSKH6olIBwdE92EjfgvMrcxBWZq+m+Fkb7FPk7LGHxhM 3rnw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :ironport-sdr:ironport-sdr; bh=ipUWPXbMFirVE9NENl7BROJMZJB01xXgz+UEXR+PRY8=; b=XwLnS7HVy/IQZqujpr/Gel+3dceejZZqR1XwNgtZ4Hx/T1opZt2YZZxTLb1WDqNN/5 MdIUJQgoSjVNBRicS26VGTcg9cMQKdgBI2VkmfvIGS3wSa5+HN2H0mBmDGEdNY+TGNnt OMA6KCgGIriN5vK6SizCDZvaPnD8Go5euKnOSjCJKgS38Ys5Sra+HR0LFmgIDdIYUVwp lJ47u66DJI+KD6PXad9VRTDhQkHg63vJC4oGl1juhYKqdqaPn0C9UGnHW+z1hnnI6QU2 dy2r1zrCgTWGlvR4jtXcUmly5R6ncn7W+kLO+/6YxMxMVl3TFnVHvRXfqdEGXJ37Igwc 73xA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h1si5315916ejb.4.2020.07.31.17.43.21; Fri, 31 Jul 2020 17:43:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727816AbgHAAmL (ORCPT + 99 others); Fri, 31 Jul 2020 20:42:11 -0400 Received: from mga03.intel.com ([134.134.136.65]:62565 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726099AbgHAAmK (ORCPT ); Fri, 31 Jul 2020 20:42:10 -0400 IronPort-SDR: 4Hn59rEN1boew/7WW2S5t5EkMs5R2R7z4kt1D+PD5fXxoYjkC2N4jJx4nf/ujP/J/dJb39fi2i eBzN3789wZ7A== X-IronPort-AV: E=McAfee;i="6000,8403,9699"; a="151859753" X-IronPort-AV: E=Sophos;i="5.75,420,1589266800"; d="scan'208";a="151859753" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Jul 2020 17:42:09 -0700 IronPort-SDR: P6H3jEK43GdS1/8t4/7dSlDlGHhnsqkKpcsT+VzyPTLx2laDWwmcvrMJ+8mkdacBSts+EB4NII 1VV+17pJzvng== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,420,1589266800"; d="scan'208";a="329450720" Received: from otcwcpicx6.sc.intel.com ([172.25.55.29]) by FMSMGA003.fm.intel.com with ESMTP; 31 Jul 2020 17:42:08 -0700 Date: Sat, 1 Aug 2020 00:42:08 +0000 From: Fenghua Yu To: Andy Lutomirski Cc: Fenghua Yu , Thomas Gleixner , Joerg Roedel , Ingo Molnar , Borislav Petkov , Peter Zijlstra , H Peter Anvin , David Woodhouse , Lu Baolu , Felix Kuehling , Dave Hansen , Tony Luck , Jean-Philippe Brucker , Christoph Hellwig , Ashok Raj , Jacob Jun Pan , Dave Jiang , Sohil Mehta , Ravi V Shankar , linux-kernel , x86 , iommu , amd-gfx Subject: Re: [PATCH v6 12/12] x86/traps: Fix up invalid PASID Message-ID: <20200801004208.GA324365@otcwcpicx6.sc.intel.com> References: <1594684087-61184-1-git-send-email-fenghua.yu@intel.com> <1594684087-61184-13-git-send-email-fenghua.yu@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, Andy, On Fri, Jul 31, 2020 at 04:34:11PM -0700, Andy Lutomirski wrote: > On Mon, Jul 13, 2020 at 4:48 PM Fenghua Yu wrote: > > > > A #GP fault is generated when ENQCMD instruction is executed without > > a valid PASID value programmed in the current thread's PASID MSR. The > > #GP fault handler will initialize the MSR if a PASID has been allocated > > for this process. > > > > Decoding the user instruction is ugly and sets a bad architecture > > precedent. It may not function if the faulting instruction is modified > > after #GP. > > > > Thomas suggested to provide a reason for the #GP caused by executing ENQCMD > > without a valid PASID value programmed. #GP error codes are 16 bits and all > > 16 bits are taken. Refer to SDM Vol 3, Chapter 16.13 for details. The other > > choice was to reflect the error code in an MSR. ENQCMD can also cause #GP > > when loading from the source operand, so its not fully comprehending all > > the reasons. Rather than special case the ENQCMD, in future Intel may > > choose a different fault mechanism for such cases if recovery is needed on > > #GP. > > Decoding the user instruction is ugly and sets a bad architecture > precedent, but we already do it in #GP for UMIP. So I'm unconvinced. Maybe just remove the "Decoding the user instruction ... bad architecture precedent" sentence? The sentence is vague. As described in the following "It may not function ..." sentence, the real issue of parsing the instruction is the instruction may be modified by another processor before it's parsed in the #GP handler. If just keep the "It may not function ..." sentence, is that good enough to explain why we don't parse the faulting instruction? > > Memo to Intel, though: you REALLY need to start thinking about what > the heck an OS is supposed to do with all these new faults you're > coming up with. The new #NM for TILE is utterly nonsensical. Sure, > it works for an OS that does not use CR0.TS and as long as no one > tries to extend the same mechanism for some new optional piece of > state, but as soon as Intel tries to use the same mechanism for > anything else, it falls apart. > > Please do better. Internally we did discuss the error code in #GP for PASID with HW architects. But due to some uarch reason, it's not simple to report the error code for PASID:( Please see previous discussion on the error code for PASID: https://lore.kernel.org/lkml/20200427224646.GA103955@otc-nc-03/ It's painful for our SW guys to check exception reasons if hardware doesn't explicitly tell us. Hopefully the heuristics (fixup the PASID MSR if the process already has a valid PASID but the MSR doesn't have one yet) in this patch is acceptable. > > > + > > +/* > > + * Write the current task's PASID MSR/state. This is called only when PASID > > + * is enabled. > > + */ > > +static void fpu__pasid_write(u32 pasid) > > +{ > > + u64 msr_val = pasid | MSR_IA32_PASID_VALID; > > + > > + fpregs_lock(); > > + > > + /* > > + * If the MSR is active and owned by the current task's FPU, it can > > + * be directly written. > > + * > > + * Otherwise, write the fpstate. > > + */ > > + if (!test_thread_flag(TIF_NEED_FPU_LOAD)) { > > + wrmsrl(MSR_IA32_PASID, msr_val); > > + } else { > > + struct ia32_pasid_state *ppasid_state; > > + > > + ppasid_state = get_xsave_addr(¤t->thread.fpu.state.xsave, > > + XFEATURE_PASID); > > + /* > > + * ppasid_state shouldn't be NULL because XFEATURE_PASID > > + * is enabled. > > + */ > > + WARN_ON_ONCE(!ppasid_state); > > + ppasid_state->pasid = msr_val; > > WARN instead of BUG is nice, but you'll immediate oops if this fails. > How about: > > if (!WARN_ON_ONCE(!ppasid_state)) > ppasid_state->pasid = msr_val; OK. I will fix this issue. Thank you very much for your review! -Fenghua