Received: by 10.223.176.5 with SMTP id f5csp1771889wra; Thu, 8 Feb 2018 03:15:17 -0800 (PST) X-Google-Smtp-Source: AH8x227cMWEYnT9/PjrhQwL2yKicZhqoBw/TUr0Fjg4FWWd3bCMLFqkBEpkiQrczX5gLdjvUQlZg X-Received: by 2002:a17:902:46a:: with SMTP id 97-v6mr331515ple.96.1518088516939; Thu, 08 Feb 2018 03:15:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518088516; cv=none; d=google.com; s=arc-20160816; b=nzUR75VwkvgFqmpvjuQ7sQS6VrwDvOhsiL89+z6qqEUNsDbhHysgEjcOwP3omBsh2+ F2UEFzJReHWmyt9tpvMn1SgH/TpAOqM7eU2u1bZ3f7ws634Rb0cWNOE3VQ6P41/Hph+N 4eawj3qkEK+IuqrKqVIrtg0nS5EUAkYILB0nRTSkgg7ZgJXeME0QC+EjKvCRkyJQl/wo kYXIplpXpe1wrIRRw+3m8fj2O7h8Aguo3PtAVP4iOYqaYJTu3bX75CGq4TEsiQViF2PN dNKadqU++JHkdApdc9mO64+z/cAS4upLpTOg4L/rF8hHFal5V2qYklnOuYZYFIGsVRdU SpMQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature:arc-authentication-results; bh=V5Q0UcQoMkz5XPnYKZAsASneKj6upr9yAKfrh24/lXs=; b=wa5RnBM/LBVbDaSEtd1SSBX5sVrpRf5PPL3cSTPThYTV1K+snvDrHsl3Hvxnt6QKcD 97weztTWQBWjHDCYzuQhGFpz6uXuDSLt8+IPD/5p/KpJk12kig+5t2ZH2OsPMr/MGrgL N8rSTQ7MdLUCjUXujrQ9Ve5+lgi7yhX/M/pQhZkOaoWnCMr0cZhy9FL9mJ793MyhgrE1 juAZvhOqoXvcGFuq3L6cuuTp9R2F+M4Kf1x1SR+C3YrHS8x8vNJHdGkEJ+4KEp96gPds Oiu8SjjwufKV2lDip3lKKIHdzeEnBxupKeKZdUWvyuGIvCuubUQH8JSkQyZ4HXEKOtBt NlIw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Ss/kcm2l; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g1-v6si2704876plk.422.2018.02.08.03.15.01; Thu, 08 Feb 2018 03:15:16 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Ss/kcm2l; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752167AbeBHLOT (ORCPT + 99 others); Thu, 8 Feb 2018 06:14:19 -0500 Received: from mail-wm0-f68.google.com ([74.125.82.68]:33236 "EHLO mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752021AbeBHLOR (ORCPT ); Thu, 8 Feb 2018 06:14:17 -0500 Received: by mail-wm0-f68.google.com with SMTP id x4-v6so26778334wmc.0 for ; Thu, 08 Feb 2018 03:14:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=V5Q0UcQoMkz5XPnYKZAsASneKj6upr9yAKfrh24/lXs=; b=Ss/kcm2lBVAhAhlK9lj4sNQ7wCXlx6zjWS0tL3ih3SlvCAlezTw2v4h5V8LfPAOft5 uSCg/+kVq0pIUay0Kg0Y3qmaFDD6srbW41CzRhKWJ7dx4g6vBVza4lkET5KzpHEJfEDD smi4+GMpKOXaU1g43+HhT/evXCUz1YbGemejQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=V5Q0UcQoMkz5XPnYKZAsASneKj6upr9yAKfrh24/lXs=; b=dC5ueWU4VFqYH0LsJ8tcxbDweA1F4jsRXPl4pok+nSODZi2ZG0CCV5/mYKg6ABw4Vk ADYbZAF3d5i17V6iXVxulQAoO7smygvqbeZrrCsh6Hl8oohaWnnP9e0e7qRxgZ+kjqEX exoB9aYxJFP9tA0z4K9/ubDj1CwERlgkU2I9Fc8rIYT1WgIyVDq/rXX1lFKFnO2cjFO+ 7En8IXKAD+z758EUCl86HxWF+ulQ5OSLH5hTAX/EZPC4pRH1hmsivZ6ZPQOWo0BmxLRz UmGcUSNW89/5zDN6RmnWz5bVjikAT7hccSFZqIfWQJjRK+TByiBRAUJ4XrMV6mkyNmg+ n0kw== X-Gm-Message-State: APf1xPCf36TowPBohVvAGwFIOa9G5tF/kLf59Nu+EVziPI1XkMnIyDni tVCzj4lpVXV5a/HGfHP8bcg9qA== X-Received: by 10.80.135.69 with SMTP id 5mr998122edv.226.1518088456177; Thu, 08 Feb 2018 03:14:16 -0800 (PST) Received: from localhost (x50d2404e.cust.hiper.dk. [80.210.64.78]) by smtp.gmail.com with ESMTPSA id g30sm2600556edb.48.2018.02.08.03.14.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 08 Feb 2018 03:14:15 -0800 (PST) Date: Thu, 8 Feb 2018 12:14:14 +0100 From: Christoffer Dall To: Suzuki K Poulose Cc: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, marc.zyngier@arm.com, linux-kernel@vger.kernel.org, kristina.martsenko@arm.com, peter.maydell@linaro.org, pbonzini@redhat.com, rkrcmar@redhat.com, will.deacon@arm.com, ard.biesheuvel@linaro.org, mark.rutland@arm.com, catalin.marinas@arm.com, Christoffer Dall Subject: Re: [PATCH v1 15/16] kvm: arm64: Allow configuring physical address space size Message-ID: <20180208111414.GM29286@cbox> References: <20180109190414.4017-1-suzuki.poulose@arm.com> <20180109190414.4017-16-suzuki.poulose@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180109190414.4017-16-suzuki.poulose@arm.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 09, 2018 at 07:04:10PM +0000, Suzuki K Poulose wrote: > Allow the guests to choose a larger physical address space size. > The default and minimum size is 40bits. A guest can change this > right after the VM creation, but before the stage2 entry page > tables are allocated (i.e, before it registers a memory range > or maps a device address). The size is restricted to the maximum > supported by the host. Also, the guest can only increase the PA size, > from the existing value, as reducing it could break the devices which > may have verified their physical address for validity and may do a > lazy mapping(e.g, VGIC). > > Cc: Marc Zyngier > Cc: Christoffer Dall > Cc: Peter Maydell > Signed-off-by: Suzuki K Poulose > --- > Documentation/virtual/kvm/api.txt | 27 ++++++++++++++++++++++++++ > arch/arm/include/asm/kvm_host.h | 7 +++++++ > arch/arm64/include/asm/kvm_host.h | 1 + > arch/arm64/include/asm/kvm_mmu.h | 41 ++++++++++++++++++++++++++++++--------- > arch/arm64/kvm/reset.c | 28 ++++++++++++++++++++++++++ > include/uapi/linux/kvm.h | 4 ++++ > virt/kvm/arm/arm.c | 2 +- > 7 files changed, 100 insertions(+), 10 deletions(-) > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > index 57d3ee9e4bde..a203faf768c4 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -3403,6 +3403,33 @@ invalid, if invalid pages are written to (e.g. after the end of memory) > or if no page table is present for the addresses (e.g. when using > hugepages). > > +4.109 KVM_ARM_GET_PHYS_SHIFT > + > +Capability: KVM_CAP_ARM_CONFIG_PHYS_SHIFT > +Architectures: arm64 > +Type: vm ioctl > +Parameters: __u32 (out) > +Returns: 0 on success, a negative value on error > + > +This ioctl is used to get the current maximum physical address size for > +the VM. The value is Log2(Maximum_Physical_Address). This is neither the > + amount of physical memory assigned to the VM nor the maximum physical address > +that a real CPU on the host can handle. Rather, this is the upper limit of the > +guest physical address that can be used by the VM. What is the point of this? Presumably if userspace has set the size, it already knows the size? > + > +4.109 KVM_ARM_SET_PHYS_SHIFT > + > +Capability: KVM_CAP_ARM_CONFIG_PHYS_SHIFT > +Architectures: arm64 > +Type: vm ioctl > +Parameters: __u32 (in) > +Returns: 0 on success, a negative value on error > + > +This ioctl is used to set the maximum physical address size for > +the VM. The value is Log2(Maximum_Physical_Address). The value can only > +be increased from the existing setting. The value cannot be changed > +after the stage-2 page tables are allocated and will return an error. > + Is there a way for userspace to discover what the underlying hardware can actually support, beyond trial-and-error on this ioctl? > 5. The kvm_run structure > ------------------------ > > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h > index a9f7d3f47134..fa8e68a4f692 100644 > --- a/arch/arm/include/asm/kvm_host.h > +++ b/arch/arm/include/asm/kvm_host.h > @@ -268,6 +268,13 @@ static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) > return 0; > } > > +static inline long kvm_arch_dev_vm_ioctl(struct kvm *kvm, > + unsigned int ioctl, > + unsigned long arg) > +{ > + return -EINVAL; > +} > + > int kvm_perf_init(void); > int kvm_perf_teardown(void); > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > index 1e66e5ab3dde..2895c2cda8fc 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -50,6 +50,7 @@ > int __attribute_const__ kvm_target_cpu(void); > int kvm_reset_vcpu(struct kvm_vcpu *vcpu); > int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext); > +long kvm_arch_dev_vm_ioctl(struct kvm *kvm, unsigned int ioctl, unsigned long arg); > void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start); > > struct kvm_arch { > diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h > index ab6a8b905065..ab7f50f20bcd 100644 > --- a/arch/arm64/include/asm/kvm_mmu.h > +++ b/arch/arm64/include/asm/kvm_mmu.h > @@ -347,21 +347,44 @@ static inline u64 kvm_vttbr_baddr_mask(struct kvm *kvm) > return GENMASK_ULL(PHYS_MASK_SHIFT - 1, x); > } > > +static inline int kvm_reconfig_stage2(struct kvm *kvm, u32 phys_shift) > +{ > + int rc = 0; > + unsigned int pa_max, parange; > + > + parange = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1) & 7; > + pa_max = id_aa64mmfr0_parange_to_phys_shift(parange); > + /* Raise it to 40bits for backward compatibility */ > + pa_max = (pa_max < 40) ? 40 : pa_max; > + /* Make sure the size is supported/available */ > + if (phys_shift > PHYS_MASK_SHIFT || phys_shift > pa_max) > + return -EINVAL; > + /* > + * The stage2 PGD is dependent on the settings we initialise here > + * and should be allocated only after this step. We cannot allow > + * down sizing the IPA size as there could be devices or memory > + * regions, that depend on the previous size. > + */ > + mutex_lock(&kvm->slots_lock); > + if (kvm->arch.pgd || phys_shift < kvm->arch.phys_shift) { > + rc = -EPERM; > + } else if (phys_shift > kvm->arch.phys_shift) { > + kvm->arch.phys_shift = phys_shift; > + kvm->arch.s2_levels = stage2_pt_levels(kvm->arch.phys_shift); > + kvm->arch.vtcr_private = VTCR_EL2_SL0(kvm->arch.s2_levels) | > + TCR_T0SZ(kvm->arch.phys_shift); > + } I think you can rework the above to make it more obvious what's going on in this way: rc = -EPERM; if (kvm->arch.pgd || phys_shift < kvm->arch.phys_shift) goto out_unlock; rc = 0; if (phys_shift == kvm->arch.phys_shift) goto out_unlock; kvm->arch.phys_shift = phys_shift; kvm->arch.s2_levels = stage2_pt_levels(kvm->arch.phys_shift); kvm->arch.vtcr_private = VTCR_EL2_SL0(kvm->arch.s2_levels) | TCR_T0SZ(kvm->arch.phys_shift); out_unlock: > + mutex_unlock(&kvm->slots_lock); > + return rc; > +} > + > /* > * kvm_init_stage2_config: Initialise the VM specific stage2 page table > * details to default IPA size. > */ > static inline void kvm_init_stage2_config(struct kvm *kvm) > { > - /* > - * The stage2 PGD is dependent on the settings we initialise here > - * and should be allocated only after this step. > - */ > - VM_BUG_ON(kvm->arch.pgd != NULL); > - kvm->arch.phys_shift = KVM_PHYS_SHIFT_DEFAULT; > - kvm->arch.s2_levels = stage2_pt_levels(kvm->arch.phys_shift); > - kvm->arch.vtcr_private = VTCR_EL2_SL0(kvm->arch.s2_levels) | > - TCR_T0SZ(kvm->arch.phys_shift); > + kvm_reconfig_stage2(kvm, KVM_PHYS_SHIFT_DEFAULT); > } > > #endif /* __ASSEMBLY__ */ > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c > index 3256b9228e75..90ceca823aca 100644 > --- a/arch/arm64/kvm/reset.c > +++ b/arch/arm64/kvm/reset.c > @@ -23,6 +23,7 @@ > #include > #include > #include > +#include > > #include > > @@ -81,6 +82,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) > case KVM_CAP_VCPU_ATTRIBUTES: > r = 1; > break; > + case KVM_CAP_ARM_CONFIG_PHYS_SHIFT: > + r = 1; > + break; > default: > r = 0; > } > @@ -88,6 +92,30 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) > return r; > } > > +long kvm_arch_dev_vm_ioctl(struct kvm *kvm, > + unsigned int ioctl, unsigned long arg) > +{ > + void __user *argp = (void __user *)arg; > + u32 phys_shift; > + long r = -EFAULT; > + > + switch (ioctl) { > + case KVM_ARM_GET_PHYS_SHIFT: > + phys_shift = kvm_phys_shift(kvm); > + if (!put_user(phys_shift, (u32 __user *)argp)) > + r = 0; > + break; > + case KVM_ARM_SET_PHYS_SHIFT: > + if (!get_user(phys_shift, (u32 __user*)argp)) > + r = kvm_reconfig_stage2(kvm, phys_shift); > + break; > + default: > + r = -EINVAL; > + } > + return r; > +} > + > + > /** > * kvm_reset_vcpu - sets core registers and sys_regs to reset value > * @vcpu: The VCPU pointer > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index 496e59a2738b..66bfbe19b434 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -932,6 +932,7 @@ struct kvm_ppc_resize_hpt { > #define KVM_CAP_HYPERV_SYNIC2 148 > #define KVM_CAP_HYPERV_VP_INDEX 149 > #define KVM_CAP_S390_AIS_MIGRATION 150 > +#define KVM_CAP_ARM_CONFIG_PHYS_SHIFT 151 > > #ifdef KVM_CAP_IRQ_ROUTING > > @@ -1261,6 +1262,9 @@ struct kvm_s390_ucas_mapping { > #define KVM_PPC_CONFIGURE_V3_MMU _IOW(KVMIO, 0xaf, struct kvm_ppc_mmuv3_cfg) > /* Available with KVM_CAP_PPC_RADIX_MMU */ > #define KVM_PPC_GET_RMMU_INFO _IOW(KVMIO, 0xb0, struct kvm_ppc_rmmu_info) > +/* Available with KVM_CAP_ARM_CONFIG_PHYS_SHIFT */ > +#define KVM_ARM_GET_PHYS_SHIFT _IOR(KVMIO, 0xb1, __u32) > +#define KVM_ARM_SET_PHYS_SHIFT _IOW(KVMIO, 0xb2, __u32) > > /* ioctl for vm fd */ > #define KVM_CREATE_DEVICE _IOWR(KVMIO, 0xe0, struct kvm_create_device) > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c > index e0bf8d19fcfe..05fc49304722 100644 > --- a/virt/kvm/arm/arm.c > +++ b/virt/kvm/arm/arm.c > @@ -1136,7 +1136,7 @@ long kvm_arch_vm_ioctl(struct file *filp, > return 0; > } > default: > - return -EINVAL; > + return kvm_arch_dev_vm_ioctl(kvm, ioctl, arg); > } > } > > -- > 2.13.6 > Have you considered making this capability a generic capability and encoding this in the 'type' argument to KVM_CREATE_VM? This would significantly simplify all the above and would allow you to drop patch 8 and 9 I think. Thanks, -Christoffer