Subject: Re: [PATCH RFC] KVM: arm64: Sidestep stage2_unmap_vm() on vcpu reset when S2FWB is supported
From: Alexandru Elisei
To: Zenghui Yu, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Marc Zyngier
Date: Mon, 20 Apr 2020 17:10:53 +0100
In-Reply-To: <20200415072835.1164-1-yuzenghui@huawei.com>

Hi,

On 4/15/20 8:28 AM, Zenghui Yu wrote:
> stage2_unmap_vm() was introduced to unmap user RAM region in the
> stage2 page table to make the caches coherent. E.g., a guest reboot
> with stage1 MMU disabled will access memory using non-cacheable
> attributes. If the RAM and caches are not coherent at this stage, some
> evicted dirty cache line may go and corrupt guest data in RAM.
>
> Since ARMv8.4, S2FWB feature is mandatory and KVM will take advantage
> of it to configure the stage2 page table and the attributes of memory
> access. So we ensure that guests always access memory using cacheable
> attributes and thus, the caches will always be coherent.
>
> So on CPUs that support S2FWB, we can safely reset the vcpu without a
> heavy stage2 unmapping.
>
> Cc: Marc Zyngier
> Cc: Christoffer Dall
> Cc: James Morse
> Cc: Julien Thierry
> Cc: Suzuki K Poulose
> Signed-off-by: Zenghui Yu
> ---
>
> If this is correct, there should be a great performance improvement on
> a guest reboot (or reset) on systems that support S2FWB. But I'm afraid
> that I've missed some points here, so please comment!
>
> The commit 957db105c997 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
> was merged about six years ago and I failed to track its history and
> intention. Instead of a whole stage2 unmapping, something like
> stage2_flush_vm() looks enough to me. But again, I'm unsure...
>
> Thanks for having a look!

I had a chat with Christoffer about stage2_unmap_vm(), and as I understood it,
the purpose was to make sure that any changes made by userspace were seen by
the guest while the MMU is off. When a stage 2 fault happens, we do clean+inval
on the dcache, or inval on the icache if it was an exec fault. This means that
whatever the host userspace writes while the guest is shut down, even if it is
still sitting in the cache, the guest will be able to read/execute. This can be
relevant if the guest relocates the kernel and overwrites the original image
location, and userspace copies the original kernel image back in before
restarting the VM.
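
To make that sequence concrete, here is a toy model in plain C. It is only a
sketch and not KVM code: struct toy_page and every toy_page_*() helper are
hypothetical names invented for illustration, and the model assumes a single
page, a fault handler that always cleans the dcache and additionally
invalidates the icache on an exec fault, and a guest that bypasses the dcache
because its MMU is off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not KVM code: a single guest page. */
struct toy_page {
	int ram;            /* value in memory, what an MMU-off guest reads */
	int dcache;         /* value sitting in the data cache */
	bool dcache_dirty;  /* dcache holds data not yet written to RAM */
	int icache;         /* instruction the guest last fetched */
	bool icache_valid;
	bool mapped;        /* page is present in the stage 2 tables */
};

/* Host userspace writes through a cacheable mapping: only the dcache changes. */
static void toy_page_userspace_write(struct toy_page *p, int val)
{
	assert(p);
	p->dcache = val;
	p->dcache_dirty = true;
}

/* Stand-in for stage2_unmap_vm(): the next guest access will fault. */
static void toy_page_unmap(struct toy_page *p)
{
	p->mapped = false;
}

/* Stage 2 fault: clean the dcache to RAM, drop the icache on an exec fault. */
static void toy_page_stage2_fault(struct toy_page *p, bool exec)
{
	if (p->dcache_dirty) {
		p->ram = p->dcache;
		p->dcache_dirty = false;
	}
	if (exec)
		p->icache_valid = false;
	p->mapped = true;
}

/* Guest data read with the MMU off: non-cacheable, so it sees RAM only. */
static int toy_page_read(struct toy_page *p)
{
	if (!p->mapped)
		toy_page_stage2_fault(p, false);
	return p->ram;
}

/* Guest instruction fetch: served from the icache when it is still valid. */
static int toy_page_fetch(struct toy_page *p)
{
	if (!p->mapped)
		toy_page_stage2_fault(p, true);
	if (!p->icache_valid) {
		p->icache = p->ram;
		p->icache_valid = true;
	}
	return p->icache;
}
```

In this model, a userspace write followed by a guest read returns the old RAM
value as long as the page stays mapped. Only after toy_page_unmap() does the
fault path run and make the new data (and, for a fetch, the new instruction)
visible.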
>
>  virt/kvm/arm/arm.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 48d0ec44ad77..e6378162cdef 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -983,8 +983,11 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>  	/*
>  	 * Ensure a rebooted VM will fault in RAM pages and detect if the
>  	 * guest MMU is turned off and flush the caches as needed.
> +	 *
> +	 * S2FWB enforces all memory accesses to RAM being cacheable, we
> +	 * ensure that the cache is always coherent.
>  	 */
> -	if (vcpu->arch.has_run_once)
> +	if (vcpu->arch.has_run_once && !cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))

I think userspace does not invalidate the icache when loading a new kernel
image, and if the guest patched instructions, they could potentially still be
in the icache. Should the icache be invalidated if FWB is present?

Thanks,
Alex

>  		stage2_unmap_vm(vcpu->kvm);
>
>  	vcpu_reset_hcr(vcpu);
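
On the icache question above, a second toy model in C may help frame the
scenario. Again this is only a sketch and not KVM code: struct toy_vm and the
toy_vm_*() helpers are hypothetical, the model assumes FWB keeps data accesses
coherent (so RAM and dcache collapse into one "mem" value), and it assumes a
patched instruction can survive in the icache across a reset, which is exactly
the open question rather than an established fact.

```c
#include <stdbool.h>

/* Hypothetical toy model, not KVM code: one instruction slot in a VM. */
struct toy_vm {
	int mem;            /* coherent data view of the image (RAM + dcache) */
	int icache;         /* instruction the guest last fetched */
	bool icache_valid;
};

/* Guest patches an instruction and runs it, so the patch lands in the icache. */
static void toy_vm_guest_patch_and_run(struct toy_vm *vm, int insn)
{
	vm->mem = insn;
	vm->icache = insn;
	vm->icache_valid = true;
}

/* Userspace copies the original image back; it does no icache maintenance. */
static void toy_vm_userspace_reload(struct toy_vm *vm, int insn)
{
	vm->mem = insn;
}

/*
 * vcpu reset on an FWB system where the stage 2 unmap is skipped.
 * inval_icache models the extra invalidation being asked about.
 */
static void toy_vm_reset_fwb(struct toy_vm *vm, bool inval_icache)
{
	if (inval_icache)
		vm->icache_valid = false;
}

/* Guest instruction fetch: served from the icache when it is still valid. */
static int toy_vm_fetch(struct toy_vm *vm)
{
	if (!vm->icache_valid) {
		vm->icache = vm->mem;
		vm->icache_valid = true;
	}
	return vm->icache;
}
```

With inval_icache set to false, the first fetch after the reload still returns
the patched instruction in this model. With it set to true, the fetch returns
the reloaded original.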