Date: Thu, 9 Jul 2020 14:12:53 -0700
From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Paolo Bonzini
Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Xiong Zhang,
    Wayne Boyer, Zhenyu Wang, Jun Nakajima
Subject: Re: [PATCH] KVM: x86/mmu: Add capability to zap only sptes for
    the affected memslot
Message-ID: <20200709211253.GW24919@linux.intel.com>
References: <20200703025047.13987-1-sean.j.christopherson@intel.com>
    <51637a13-f23b-8b76-c93a-76346b4cc982@redhat.com>
In-Reply-To: <51637a13-f23b-8b76-c93a-76346b4cc982@redhat.com>

On Wed, Jul 08, 2020 at 06:08:24PM +0200, Paolo Bonzini wrote:
> On 03/07/20 04:50, Sean Christopherson wrote:
> > Introduce a new capability, KVM_CAP_MEMSLOT_ZAP_CONTROL, to allow
> > userspace to control the memslot zapping behavior on a per-VM basis.
> > x86's default behavior is to zap all SPTEs, including the root shadow
> > page, across all memslots.  While effective, the nuke-and-pave
> > approach isn't exactly performant, especially for large VMs and/or
> > VMs that heavily utilize RO memslots for MMIO devices, e.g. option
> > ROMs.
> >
> > On a vanilla VM with 6GB of RAM, the targeted zap reduces the number
> > of EPT violations during boot by ~14% with THP enabled in the host,
> > and by ~7% with THP disabled in the host.  On a much more custom VM
> > with 32GB and a significant amount of memslot zapping, this can
> > reduce the number of EPT violations by 50% during guest boot, and
> > improve boot time by as much as 25%.
> >
> > Keep the current x86 memslot zapping behavior as the default, as
> > there's an unresolved bug that pops up when zapping only the affected
> > memslot, and the exact conditions that trigger the bug are not fully
> > known.  See https://patchwork.kernel.org/patch/10798453 for details.
> >
> > Implement the capability as a set of flags so that other
> > architectures might be able to use the capability without having to
> > conform to x86's semantics.
>
> It's bad that we have no clue what's causing the bad behavior, but I
> don't think it's wise to have a bug that is known to happen when you
> enable the capability. :/

I don't necessarily disagree, but at the same time it's entirely possible
it's a QEMU bug.  If the bad behavior doesn't occur with other VMMs, those
other VMMs shouldn't be penalized because we can't figure out what QEMU is
getting wrong.

Even if this is a kernel bug, I'm fairly confident at this point that it's
not a KVM bug.  Or rather, if it's a KVM "bug", then there's a fundamental
dependency in memslot management that needs to be rooted out and
documented.

And we're kind of in a catch-22: it'll be extremely difficult to narrow
down exactly who is breaking what without being able to easily test the
optimized zapping with other VMMs and/or setups.
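
For reference, opting in from userspace would presumably look something
like the sketch below, assuming the capability is enabled per-VM through
the standard KVM_ENABLE_CAP ioctl with the desired zap behavior passed in
args[0].  The capability number and flag name here are placeholders for
illustration, not what the patch actually defines:

  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Placeholder number; the real value would come from the patch. */
  #ifndef KVM_CAP_MEMSLOT_ZAP_CONTROL
  #define KVM_CAP_MEMSLOT_ZAP_CONTROL 186
  #endif

  /* Hypothetical flag: zap only the SPTEs of the affected memslot. */
  #define KVM_MEMSLOT_ZAP_AFFECTED_ONLY (1ULL << 0)

  static int enable_targeted_zap(int vm_fd)
  {
          struct kvm_enable_cap cap;

          /* Probe first; returns 0 on kernels without the capability. */
          if (ioctl(vm_fd, KVM_CHECK_EXTENSION,
                    KVM_CAP_MEMSLOT_ZAP_CONTROL) <= 0)
                  return -1;

          memset(&cap, 0, sizeof(cap));
          cap.cap = KVM_CAP_MEMSLOT_ZAP_CONTROL;
          cap.args[0] = KVM_MEMSLOT_ZAP_AFFECTED_ONLY;

          return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
  }

A VMM that can't rule out the unresolved bug would simply skip the
KVM_ENABLE_CAP call and keep the default zap-everything behavior, same as
on an unpatched kernel.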