Date: Mon, 19 Aug 2019 18:41:17 -0700
From: Sean Christopherson
To: Andy Lutomirski
Cc: Paolo Bonzini, Radim Krčmář, Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML, Jarkko Sakkinen, Joerg Roedel, "H.
Peter Anvin", kvm list, LKML, linux-sgx@vger.kernel.org
Subject: Re: [RFC PATCH 08/21] KVM: x86: Add kvm_x86_ops hook to short circuit emulation
Message-ID: <20190820014117.GJ1916@linux.intel.com>

On Mon, Aug 19, 2019 at 06:34:07PM -0700, Andy Lutomirski wrote:
> On Mon, Aug 19, 2019 at 3:01 PM Sean Christopherson
> wrote:
> >
> > On Thu, Aug 15, 2019 at 05:47:12PM -0700, Andy Lutomirski wrote:
> > >
> > >
> > > >> On Jul 29, 2019, at 7:49 PM, Sean Christopherson wrote:
> > > >>
> > > >> On Sat, Jul 27, 2019 at 10:38:03AM -0700, Andy Lutomirski wrote:
> > > >> On Fri, Jul 26, 2019 at 10:52 PM Sean Christopherson
> > > >> wrote:
> > > >>>
> > > >>> Similar to the existing AMD #NPF case where emulation of the current
> > > >>> instruction is not possible due to lack of information, virtualization
> > > >>> of Intel SGX will introduce a scenario where emulation is not possible
> > > >>> due to the VMExit occurring in an SGX enclave. And again similar to
> > > >>> the AMD case, emulation can be initiated by kvm_mmu_page_fault(), i.e.
> > > >>> outside of the control of the vendor-specific code.
> > > >>>
> > > >>> While the cause and architecturally visible behavior of the two cases
> > > >>> is different, e.g. Intel SGX will inject a #UD whereas AMD #NPF is a
> > > >>> clean resume or complete shutdown, the impact on the common emulation
> > > >>> code is identical: KVM must stop emulation immediately and resume the
> > > >>> guest.
> > > >>>
> > > >>> Replace the existing need_emulation_on_page_fault() with a more generic
> > > >>> is_emulatable() kvm_x86_ops callback, which is called unconditionally
> > > >>> by x86_emulate_instruction().
> > > >>
> > > >> Having recently noticed that emulate_ud() is broken when the guest's
> > > >> TF is set, I suppose I should ask: does your new code function
> > > >> sensibly when TF is set?
> > > >
> > > > Barring a VMX fault injection interaction I'm not thinking of, yes. The
> > > > SGX reaction to the #UD VM-Exit is to inject a #UD and resume the guest,
> > > > pending breakpoints shouldn't be affected in any way (unless some other
> > > > part of KVM mucks with them, e.g. when guest single-stepping is enabled).
> > >
> > > What I mean is: does the code actually do what you think it does if TF is
> > > set? Right now, as I understand it, the KVM emulation code has a bug in
> > > which some emulated faults also inject #DB despite the fact that the
> > > instruction faulted, and the #DB seems to take precedence over the original
> > > fault. This confuses the guest.
> >
> > Yes. The proposed change is to inject the #UD instead of calling into the
> > emulator, and by inspection I've verified that all code that injects a #DB
> > is either contained within the emulator or is mutually exclusive with an
> > intercepted #UD. It's a qualified yes because I don't have an actual
> > testcase to verify my literacy. I'll look into adding a test, either to
> > the selftest/x86/sgx or to kvm-unit-tests.
>
> I wrote one, and it fails:
>
> # ./tools/testing/selftests/x86/syscall_arg_fault_32
> [RUN] SYSENTER with invalid state
> [OK] Seems okay
> [RUN] SYSCALL with invalid state
> [SKIP] Illegal instruction
> [RUN] SYSENTER with TF and invalid state
> [OK] Seems okay
> [RUN] SYSCALL with TF and invalid state
> [WARN] Got stuck single-stepping -- you probably have a KVM bug
>
> emulate_ud() is buggy.

Heh, yeah, I meant for the SGX case, e.g.
SYSCALL from inside an enclave with RFLAGS.TF=1.
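
For readers following the thread, here is a rough user-space sketch of the hook pattern the quoted patch description talks about: common emulation code asks a vendor callback whether emulation is possible and, if not, skips the emulator and resumes the guest. This is only an illustration, not the patch itself. Every type and function name below (struct vcpu, struct x86_ops, sgx_like_is_emulatable, emulate_instruction) is a hypothetical stand-in, and the "inject #UD" step is modeled as a simple counter. The NULL check on the callback is a choice made for this sketch, not something the thread specifies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real KVM types. */
enum emul_result {
	EMUL_DONE,          /* the instruction was handed to the emulator */
	EMUL_RESUME_GUEST,  /* emulation was skipped, guest is resumed */
};

struct vcpu {
	bool in_enclave;     /* models "the VM-Exit occurred inside an SGX enclave" */
	int  emulated_insns; /* how many instructions the emulator handled */
	int  injected_ud;    /* how many #UD injections were queued */
};

struct x86_ops {
	/* Returns false when vendor code knows emulation is not possible. */
	bool (*is_emulatable)(struct vcpu *vcpu);
};

/* Assumed vendor-side policy: inside an enclave, queue a #UD instead. */
bool sgx_like_is_emulatable(struct vcpu *vcpu)
{
	if (vcpu->in_enclave) {
		vcpu->injected_ud++;
		return false;
	}
	return true;
}

/* Common code: consult the hook before touching the emulator at all. */
enum emul_result emulate_instruction(const struct x86_ops *ops,
				     struct vcpu *vcpu)
{
	if (ops->is_emulatable && !ops->is_emulatable(vcpu))
		return EMUL_RESUME_GUEST;

	vcpu->emulated_insns++; /* placeholder for the actual emulation work */
	return EMUL_DONE;
}
```

In this model, a vcpu with in_enclave set never reaches the emulator path, so whatever the emulator does with a pending single-step is never exercised for that exit. That mirrors the argument made above by inspection; it does not substitute for the real test case.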