Subject: Re: [PATCH] kvm: x86: disable KVM_FAST_MMIO_BUS
To: "Michael S. Tsirkin", Radim Krčmář
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, stable@vger.kernel.org
From: Paolo Bonzini
Date: Wed, 16 Aug 2017 23:25:35 +0200

On 16/08/2017 21:59, Michael S. Tsirkin wrote:
> On Wed, Aug 16, 2017 at 09:03:17PM +0200, Radim Krčmář wrote:
>> 2017-08-16 19:19+0200, Paolo Bonzini:
>>> On 16/08/2017 18:50, Michael S. Tsirkin wrote:
>>>> On Wed, Aug 16, 2017 at 03:30:31PM +0200, Paolo Bonzini wrote:
>>>>> While you can filter out instruction fetches, that's not enough. A data
>>>>> read could happen because someone pointed the IDT to MMIO area, and who
>>>>> knows what the VM-exit instruction length points to in that case.
>>>>
>>>> Thinking more about it, I don't really see how anything
>>>> legal guest might be doing with virtio would trigger anything
>>>> but a fault after decoding the instruction. How does
>>>> skipping instruction even make sense in the example you give?
>>>
>>> There's no such thing as a legal guest. Anything that the hypervisor
>>> does, that differs from real hardware, is a possible escalation path.
>>>
>>> This in fact makes me doubt the EMULTYPE_SKIP patch too.
>>
>> The main hack is that we expect EPT misconfig within a given range to be
>> a MMIO NULL write. I think it is fine -- EMULTYPE_SKIP is a common path
>> that should have well tested error paths and, IIUC, virtio doesn't allow
>> any other access, so it is a problem of the guest if a buggy/malicious
>> application can access virtio memory.

Yes, I agree. EMULTYPE_SKIP is fine because failed decoding still
causes an exception to be injected. Maybe it's better to gate the
EMULTYPE_SKIP emulation on the exit qualification saying this is a write
and also not a page table walk---just in case.

>>>> how about we blacklist nested virt for this optimization?
>>
>> Not every hypervisor can be easily detected ...
>
> Hypervisors that don't set a hypervisor bit in CPUID are violating the
> spec themselves, aren't they? Anyway, we can add a management option
> for use in a nested scenario.

No, the hypervisor bit only says that CPUID leaf 0x40000000 is defined.
See for example
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458:
"Intel and AMD have also reserved CPUID leaves 0x40000000 - 0x400000FF
for software use. Hypervisors can use these leaves to provide an
interface to pass information from the hypervisor to the guest operating
system running inside a virtual machine. The hypervisor bit indicates
the presence of a hypervisor and that it is safe to test these
additional software leaves".

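To make the CPUID point concrete, here is a rough sketch of what a guest
can and cannot conclude from the hypervisor bit. It is illustrative only
and not taken from KVM. The helper names are hypothetical, the bit
position (CPUID.1:ECX bit 31) comes from the CPUID documentation rather
than from this thread, and reading a 12-byte signature from EBX, ECX and
EDX of leaf 0x40000000 is the usual convention, assumed here. The
helpers take register values as arguments so that they do not depend on
any particular CPUID intrinsic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* CPUID.1:ECX bit 31: the "hypervisor present" bit. */
#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31)

/* First of the leaves reserved for software use (0x40000000 - 0x400000FF). */
#define CPUID_HV_BASE_LEAF UINT32_C(0x40000000)

/*
 * Hypothetical helper: given ECX from CPUID leaf 1, report whether it is
 * safe to test the software leaves. It says nothing about which
 * hypervisor is present.
 */
static bool hv_leaves_testable(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Hypothetical helper: assemble the 12-byte signature from EBX, ECX, EDX
 * of leaf CPUID_HV_BASE_LEAF (assumed convention), least significant
 * byte first, and NUL-terminate it.
 */
static void hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
                         char out[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };

    assert(out != NULL);
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xff);
    out[12] = '\0';
}
```

A caller would check hv_leaves_testable() first and only then compare
the hv_signature() result against the signatures it knows about. An
unrecognized signature is still possible, which is why the bit alone
cannot be used to blacklist nested setups.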

>> KVM uses standard features and SDM clearly says that the
>> instruction length field is undefined.
>
> True. Let's see whether intel can commit to a stronger definition.
> I don't think there's any rush to make this change.

I disagree. Relying on undefined processor features is a bad idea.

> It's just that this has been there for 3 years and people have built a
> product around this.

Around 700 clock cycles?

Paolo