Subject: Re: [PATCH] kvm: x86: disable KVM_FAST_MMIO_BUS
From: Paolo Bonzini
To: "Michael S. Tsirkin"
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Radim Krčmář, stable@vger.kernel.org
Date: Wed, 16 Aug 2017 15:30:31 +0200
References: <20170816112249.28939-1-pbonzini@redhat.com> <20170816155132-mutt-send-email-mst@kernel.org> <9de5ebf5-457d-2a34-0314-c6c612ddb2e9@redhat.com> <20170816161301-mutt-send-email-mst@kernel.org>
In-Reply-To: <20170816161301-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 16/08/2017 15:16, Michael S. Tsirkin wrote:
> On Wed, Aug 16, 2017 at 03:05:51PM +0200, Paolo Bonzini wrote:
>> On 16/08/2017 14:58, Michael S. Tsirkin wrote:
>>> On Wed, Aug 16, 2017 at 01:22:49PM +0200, Paolo Bonzini wrote:
>>>> Microsoft pointed out privately to me that KVM's handling of
>>>> KVM_FAST_MMIO_BUS is invalid.
>>>> Using skip_emulation_instruction is invalid
>>>> in EPT misconfiguration vmexit handlers, because neither EPT violations
>>>> nor misconfigurations are listed in the manual among the VM exits that
>>>> set the VM-exit instruction length field.
>>>>
>>>> While physical processors seem to set the field, this is not architectural
>>>> and is just a side effect of the implementation. I couldn't convince
>>>> myself of any condition on the exit qualification where VM-exit
>>>> instruction length "has" to be defined; there are no trap-like VM-exits
>>>> that can be repurposed; and fault-like VM-exits such as descriptor-table
>>>> exits provide no decoding information. So I don't really see any elegant
>>>> way to fix it except by disabling KVM_FAST_MMIO_BUS, which means virtio
>>>> 1 will go slower.
>>>
>>> How about I will try asking Intel about it? If they can commit to length
>>> being there in the future, we are all set.
>>
>> Nope, "I couldn't convince myself of any condition on the exit
>> qualification where VM-exit instruction length "has" to be defined". So
>> assuming Intel can do it, it would only apply to future processors (2
>> years+ for server SKUs).
>
> Well maybe there's a reason it's actually working. Let's see what can
> be done.

Sure there is. It just happens that the actual condition for VM-exit
instruction length being set correctly is "the fault was taken after the
accessing instruction has been decoded". But there's no way according
to the spec to detect whether that has happened. While you can filter
out instruction fetches, that's not enough. A data read could happen
because someone pointed the IDT to MMIO area, and who knows what the
VM-exit instruction length points to in that case.

>> Plus of course it wouldn't be guaranteed to work on nested.
>
> Not sure I got this one.

Not all nested hypervisors are setting the VM-exit instruction length
field on EPT violations, since it's documented not to be set.
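
To make the distinction concrete, here is a rough sketch of the rule being
argued for: only advance RIP by the VM-exit instruction length when the exit
reason is one for which the manual documents that field, and fall back to full
instruction emulation otherwise. This is not KVM's code. The helper names
(insn_len_is_architectural, pick_mmio_action, advance_rip) are made up for
illustration, and the list of exit reasons is a small illustrative subset, not
the complete set from the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Basic exit reason numbers (illustrative subset). */
enum exit_reason {
    EXIT_REASON_CPUID          = 10,
    EXIT_REASON_VMCALL         = 18,
    EXIT_REASON_IO_INSTRUCTION = 30,
    EXIT_REASON_MSR_WRITE      = 32,
    EXIT_REASON_EPT_VIOLATION  = 48,
    EXIT_REASON_EPT_MISCONFIG  = 49,
};

/*
 * Hypothetical helper: true only for exits caused by executing an
 * instruction, where the instruction length field is documented as set.
 * EPT violations and misconfigurations deliberately fall through to false.
 */
static bool insn_len_is_architectural(uint32_t reason)
{
    switch (reason) {
    case EXIT_REASON_CPUID:
    case EXIT_REASON_VMCALL:
    case EXIT_REASON_IO_INSTRUCTION:
    case EXIT_REASON_MSR_WRITE:
        return true;
    default:
        return false;
    }
}

enum mmio_action {
    MMIO_SKIP_BY_LENGTH,   /* trust the length field and bump RIP */
    MMIO_FULL_EMULATION,   /* decode the instruction in software */
};

/* Hypothetical policy: never skip by length on an undocumented field. */
static enum mmio_action pick_mmio_action(uint32_t reason)
{
    return insn_len_is_architectural(reason) ? MMIO_SKIP_BY_LENGTH
                                             : MMIO_FULL_EMULATION;
}

/* Advance RIP past one instruction; x86 instructions are 1 to 15 bytes. */
static uint64_t advance_rip(uint64_t rip, uint32_t insn_len)
{
    assert(insn_len >= 1 && insn_len <= 15);
    return rip + insn_len;
}
```

Under this policy an EPT misconfiguration exit always takes the slow path,
which is the cost the patch accepts by disabling KVM_FAST_MMIO_BUS.
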
>>>> Adding a hypercall or MSR write that does a fast MMIO write to a physical
>>>> address would do it, but it adds hypervisor knowledge in virtio, including
>>>> CPUID handling.
>>>
>>> Another issue is that it will break DPDK on virtio.
>>
>> Not break, just make it slower.
>
> I thought hypercalls can only be triggered from ring 0, userspace can't call them.
> Did I get it wrong?

That's just a limitation that KVM makes on currently-defined hypercalls.
VMCALL causes a vmexit if executed from ring 3.

Paolo
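
As an illustration of the last point, the sketch below shows how a privilege
check could be a per-hypercall policy decision rather than a property of
VMCALL itself: by the time the handler runs, the vmexit has already happened
regardless of the guest's ring. Everything here is hypothetical, including
the hypercall number HC_FAST_MMIO_WRITE, the struct vcpu_sketch type, and the
error constants; it is not an existing KVM interface, and a real design would
also have to validate which addresses a guest may target.

```c
#include <stdint.h>

#define SKETCH_EPERM        1       /* same value as EPERM on Linux */
#define SKETCH_ENOSYS       1000    /* stand-in for "unknown hypercall" */
#define HC_FAST_MMIO_WRITE  0x4d4d  /* hypothetical hypercall number */

/* Hypothetical, minimal view of the vCPU state the handler needs. */
struct vcpu_sketch {
    int      cpl;       /* guest privilege level at the time of VMCALL */
    uint64_t last_gpa;  /* records the most recent fast MMIO target */
    uint32_t last_val;  /* records the most recent value written */
};

/*
 * Called after the VMCALL vmexit has already been taken. The ring check
 * is applied only to hypercalls that are defined as ring 0 only.
 */
static long handle_vmcall(struct vcpu_sketch *v, unsigned long nr,
                          uint64_t gpa, uint32_t val)
{
    /* Assumed policy: the fast MMIO hypercall is allowed from any ring. */
    if (nr == HC_FAST_MMIO_WRITE) {
        v->last_gpa = gpa;
        v->last_val = val;
        return 0;
    }

    /* All other hypercalls keep the ring 0 restriction. */
    if (v->cpl != 0)
        return -SKETCH_EPERM;

    /* No other hypercalls are defined in this sketch. */
    return -SKETCH_ENOSYS;
}
```
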