Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752523AbdHPTrk (ORCPT ); Wed, 16 Aug 2017 15:47:40 -0400
Received: from mx1.redhat.com ([209.132.183.28]:52630 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752105AbdHPTri (ORCPT ); Wed, 16 Aug 2017 15:47:38 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 60D7E7ACCF
Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=mst@redhat.com
Date: Wed, 16 Aug 2017 22:47:37 +0300
From: "Michael S. Tsirkin"
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Radim Krčmář, stable@vger.kernel.org
Subject: Re: [PATCH] kvm: x86: disable KVM_FAST_MMIO_BUS
Message-ID: <20170816224500-mutt-send-email-mst@kernel.org>
References: <20170816112249.28939-1-pbonzini@redhat.com> <20170816155132-mutt-send-email-mst@kernel.org> <9de5ebf5-457d-2a34-0314-c6c612ddb2e9@redhat.com> <20170816161301-mutt-send-email-mst@kernel.org> <20170816194342-mutt-send-email-mst@kernel.org> <81dabc78-edfd-32fc-024c-c57330386a51@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <81dabc78-edfd-32fc-024c-c57330386a51@redhat.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Wed, 16 Aug 2017 19:47:38 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2511
Lines: 68

On Wed, Aug 16, 2017 at 07:19:28PM +0200, Paolo Bonzini wrote:
> On 16/08/2017 18:50, Michael S. Tsirkin wrote:
> > On Wed, Aug 16, 2017 at 03:30:31PM +0200, Paolo Bonzini wrote:
> >> While you can filter out instruction fetches, that's not enough.
> >> A data read could happen because someone pointed the IDT to MMIO area, and who
> >> knows what the VM-exit instruction length points to in that case.
> >
> > Thinking more about it, I don't really see how anything
> > legal guest might be doing with virtio would trigger anything
> > but a fault after decoding the instruction. How does
> > skipping instruction even make sense in the example you give?
>
> There's no such thing as a legal guest. Anything that the hypervisor
> does, that differs from real hardware, is a possible escalation path.

Fast MMIO bus devices don't appear out of thin air. They appear because
the guest enabled a virtio device. So it is a PV guest, and if it doesn't
behave according to the virtio spec, it is going to crash.

> > This in fact makes me doubt the EMULTYPE_SKIP patch too.
> >
> >>>> Plus of course it wouldn't be guaranteed to work on nested.
> >>>
> >>> Not sure I got this one.
> >>
> >> Not all nested hypervisors are setting the VM-exit instruction length
> >> field on EPT violations, since it's documented not to be set.
> >
> > So that's probably the real issue - nested virt which has to do it
> > in software at extra cost. We already limit this to intel processors,
> > how about we blacklist nested virt for this optimization?
> >
> > I agree it's skating it a bit close to the dangerous edge,
> > but so are other tricks we play with PTEs to speed up MMIO.
>
> Not at all. Everything else we do is perfectly fine according to the
> spec, this one isn't.
>
> Paolo

Virtio MMIO is kind of special in many ways. What happens if I map and
try to execute an MMIO BAR? I don't think it will work, will it?

> >>>>>> Adding a hypercall or MSR write that does a fast MMIO write to a physical
> >>>>>> address would do it, but it adds hypervisor knowledge in virtio, including
> >>>>>> CPUID handling.
> >>>>>
> >>>>> Another issue is that it will break DPDK on virtio.
> >>>>
> >>>> Not break, just make it slower.
> >>>
> >>> I thought hypercalls can only be triggered from ring 0, userspace can't call them.
> >>> Did I get it wrong?
> >>
> >> That's just a limitation that KVM makes on currently-defined hypercalls.
> >>
> >> VMCALL causes a vmexit if executed from ring 3.
> >>
> >> Paolo
> >