From: Fenghua Yu
To: "Thomas Gleixner", "Ingo Molnar", "Borislav Petkov", "H Peter Anvin", "David Woodhouse", "Lu Baolu", "Frederic Barrat", "Andrew Donnellan", "Felix Kuehling", "Joerg Roedel", "Dave Hansen", "Tony Luck", "Ashok Raj", "Jacob Jun Pan", "Dave Jiang", "Yu-cheng Yu", "Sohil Mehta", "Ravi V Shankar"
Cc: "linux-kernel", "x86", iommu@lists.linux-foundation.org, "amd-gfx", "linuxppc-dev", Fenghua Yu
Subject: [PATCH v2 12/12] x86/traps: Fix up
invalid PASID
Date: Fri, 12 Jun 2020 17:41:33 -0700
Message-Id: <1592008893-9388-13-git-send-email-fenghua.yu@intel.com>
X-Mailer: git-send-email 2.5.0
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
References: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

A #GP fault is generated when an ENQCMD instruction is executed without
a valid PASID value programmed in the current thread's PASID MSR. The
#GP fault handler will initialize the MSR if a PASID has been allocated
for this process.

Decoding the user instruction is ugly and sets a bad architecture
precedent. It may not function if the faulting instruction is modified
after #GP.

Thomas suggested providing a reason for the #GP caused by executing
ENQCMD without a valid PASID value programmed. #GP error codes are 16
bits and all 16 bits are taken. Refer to SDM Vol 3, Chapter 16.13 for
details. The other choice was to reflect the error code in an MSR.
ENQCMD can also cause #GP when loading from the source operand, so it
would not fully cover all the reasons. Rather than special-casing
ENQCMD, Intel may choose a different fault mechanism in the future for
such cases if recovery is needed on #GP.

The following heuristic is used to avoid decoding the user instructions
to determine the precise reason for the #GP fault:
1) If the mm for the process has not been allocated a PASID, this #GP
   cannot be fixed.
2) If the PASID MSR is already initialized, then the #GP was for some
   other reason.
3) Try initializing the PASID MSR and returning. If the #GP was from
   an ENQCMD this will fix it. If not, the #GP fault will be repeated
   and will hit case "2".
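The three cases of the heuristic can be sketched as a small user-space
model. This is only an illustration, not the kernel code in this patch:
struct sim_thread, SIM_PASID_INVALID, SIM_MSR_PASID_VALID and
sim_fixup_pasid_exception() are hypothetical names, the "no PASID" marker
value of 0 and the valid-bit position are assumptions, and the MSR is
modeled as a plain struct field instead of being accessed with
rdmsrl()/wrmsrl().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel names. */
#define SIM_PASID_INVALID   0u            /* assumed "no PASID allocated" marker */
#define SIM_MSR_PASID_VALID (1ull << 31)  /* assumed position of the valid bit */

struct sim_thread {
	unsigned int mm_pasid;  /* PASID allocated to the process's mm, if any */
	uint64_t pasid_msr;     /* this thread's simulated PASID MSR */
};

/*
 * Model of the fixup decision. Returns true when the MSR was initialized
 * and the faulting instruction should simply be re-executed; returns false
 * when the #GP has to go through normal #GP handling.
 */
static bool sim_fixup_pasid_exception(struct sim_thread *t)
{
	/* Case 1: the mm has no PASID, so this #GP cannot be fixed. */
	if (t->mm_pasid == SIM_PASID_INVALID)
		return false;

	/* Case 2: the MSR is already valid, so the #GP had another cause. */
	if (t->pasid_msr & SIM_MSR_PASID_VALID)
		return false;

	/* Case 3: initialize the MSR and let the instruction run again. */
	t->pasid_msr = t->mm_pasid | SIM_MSR_PASID_VALID;
	return true;
}
```

Calling the function twice on the same thread shows the retry property:
the first call takes case 3 and returns true, and a second #GP on that
thread takes case 2 and returns false, so a wrong guess costs at most one
extra fault.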
Suggested-by: Thomas Gleixner
Signed-off-by: Fenghua Yu
Reviewed-by: Tony Luck
---
v2:
- Update the first paragraph of the commit message (Thomas)
- Add reasons why don't decode the user instruction and don't use
  #GP error code (Thomas)
- Change get_task_mm() to current->mm (Thomas)
- Add comments on why IRQ is disabled during PASID fixup (Thomas)
- Add comment in fixup() that the function is called when #GP is from
  user (so mm is not NULL) (Dave Hansen)

 arch/x86/include/asm/iommu.h |  1 +
 arch/x86/kernel/traps.c      | 23 +++++++++++++++++++++
 drivers/iommu/intel/svm.c    | 39 ++++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+)

diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index ed41259fe7ac..e9365a5d6f7d 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -27,5 +27,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
 }
 
 void __free_pasid(struct mm_struct *mm);
+bool __fixup_pasid_exception(void);
 
 #endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4cc541051994..0f78d5cdddfe 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_X86_64
 #include
@@ -436,6 +437,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 	return GP_CANONICAL;
 }
 
+static bool fixup_pasid_exception(void)
+{
+	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+		return false;
+	if (!static_cpu_has(X86_FEATURE_ENQCMD))
+		return false;
+
+	return __fixup_pasid_exception();
+}
+
 #define GPFSTR "general protection fault"
 
 dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
@@ -447,6 +458,18 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
 	int ret;
 
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+	/*
+	 * Perform the check for a user mode PASID exception before enable
+	 * interrupts. Doing this here ensures that the PASID MSR can be simply
+	 * accessed because the contents are known to be still associated
+	 * with the current process.
+	 */
+	if (user_mode(regs) && fixup_pasid_exception()) {
+		cond_local_irq_enable(regs);
+		return;
+	}
+
 	cond_local_irq_enable(regs);
 
 	if (static_cpu_has(X86_FEATURE_UMIP)) {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 27dc866b8461..81fd2380c0f9 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -1078,3 +1078,42 @@ void __free_pasid(struct mm_struct *mm)
 	 */
 	ioasid_free(pasid);
 }
+
+/*
+ * Apply some heuristics to see if the #GP fault was caused by a thread
+ * that hasn't had the IA32_PASID MSR initialized. If it looks like that
+ * is the problem, try initializing the IA32_PASID MSR. If the heuristic
+ * guesses incorrectly, take one more #GP fault.
+ */
+bool __fixup_pasid_exception(void)
+{
+	u64 pasid_msr;
+	unsigned int pasid;
+
+	/*
+	 * This function is called only when this #GP was triggered from user
+	 * space. So the mm cannot be NULL.
+	 */
+	pasid = current->mm->pasid;
+	/* If the mm doesn't have a valid PASID, then can't help. */
+	if (invalid_pasid(pasid))
+		return false;
+
+	/*
+	 * Since IRQ is disabled now, the current task still owns the FPU on
+	 * this CPU and the PASID MSR can be directly accessed.
+	 *
+	 * If the MSR has a valid PASID, the #GP must be for some other reason.
+	 *
+	 * If rdmsr() is really a performance issue, a TIF_ flag may be
+	 * added to check if the thread has a valid PASID instead of rdmsr().
+	 */
+	rdmsrl(MSR_IA32_PASID, pasid_msr);
+	if (pasid_msr & MSR_IA32_PASID_VALID)
+		return false;
+
+	/* Fix up the MSR if the MSR doesn't have a valid PASID. */
+	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
+
+	return true;
+}
-- 
2.19.1