Subject: Re: [PATCH v3 15/20] kvm: arm/arm64: Allow tuning the physical address size for VM
To: Marc Zyngier, Will Deacon
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, james.morse@arm.com,
    cdall@kernel.org, eric.auger@redhat.com, julien.grall@arm.com,
    catalin.marinas@arm.com, punit.agrawal@arm.com, qemu-devel@nongnu.org,
    Peter Maydel, Paolo Bonzini, Radim Krčmář
References: <1530270944-11351-1-git-send-email-suzuki.poulose@arm.com>
    <1530270944-11351-16-git-send-email-suzuki.poulose@arm.com>
    <20180704155104.GN4828@arm.com>
    <12d1832a-1a13-7dd4-662b-addf58400789@arm.com>
    <9f1af26e-2913-2b0b-1352-63160096f78f@arm.com>
From: Suzuki K Poulose
Date: Fri, 6 Jul 2018 17:39:00 +0100
In-Reply-To: <9f1af26e-2913-2b0b-1352-63160096f78f@arm.com>

On 07/06/2018 04:09 PM, Marc Zyngier wrote:
> On 06/07/18 14:49, Suzuki K Poulose wrote:
>> On 04/07/18 23:03, Suzuki K Poulose wrote:
>>> On 07/04/2018 04:51 PM, Will Deacon wrote:
>>>> Hi Suzuki,
>>>>
>>>> On Fri, Jun 29, 2018 at 12:15:35PM +0100, Suzuki K Poulose wrote:
>>>>> Allow specifying the physical address size for a new VM via
>>>>> the kvm_type argument for KVM_CREATE_VM ioctl. This allows
>>>>> us to finalise the stage2 page table format as early as possible
>>>>> and hence perform the right checks on the memory slots without
>>>>> complication. The size is encoded as Log2(PA_Size) in the bits[7:0]
>>>>> of the type field and can encode more information in the future if
>>>>> required. The IPA size is still capped at 40bits.
>>>>>
>>>>> Cc: Marc Zyngier
>>>>> Cc: Christoffer Dall
>>>>> Cc: Peter Maydel
>>>>> Cc: Paolo Bonzini
>>>>> Cc: Radim Krčmář
>>>>> Signed-off-by: Suzuki K Poulose
>>>>> ---
>>>>>   arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>>>>   arch/arm64/include/asm/kvm_arm.h | 10 +++-------
>>>>>   arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>>>>   include/uapi/linux/kvm.h         | 10 ++++++++++
>>>>>   virt/kvm/arm/arm.c               | 24 ++++++++++++++++++++++--
>>>>>   5 files changed, 39 insertions(+), 9 deletions(-)
>>>>
>>>> [...]
>>>>
>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>> index 4df9bb6..fa4cab0 100644
>>>>> --- a/include/uapi/linux/kvm.h
>>>>> +++ b/include/uapi/linux/kvm.h
>>>>> @@ -751,6 +751,16 @@ struct kvm_ppc_resize_hpt {
>>>>>   #define KVM_S390_SIE_PAGE_OFFSET 1
>>>>>   /*
>>>>> + * On arm/arm64, machine type can be used to request the physical
>>>>> + * address size for the VM. Bits [7-0] have been reserved for the
>>>>> + * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
>>>>> + * value 0 implies the default IPA size, which is 40bits.
>>>>> + */
>>>>> +#define KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK    0xff
>>>>> +#define KVM_VM_TYPE_ARM_PHYS_SHIFT(x)        \
>>>>> +    ((x) & KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)
>>>>
>>>> This seems like you're allocating quite a lot of bits in a non-extensible
>>>> interface to a fairly esoteric parameter. Would it be better to add another
>>>> ioctl, or condense the number of sizes you support instead?
>>>
>>> As I explained in the other thread, we need the size as soon as the VM
>>> is created. The major challenge is keeping the backward compatibility by
>>> mapping 0 to 40bits. I will give it a thought.
>>
>> Here is one option. We could re-use the {V}TCR_ELx.{I}PS field format, which
>> occupies 3 bits and has the following definitions. (ID_AA64MMFR0_EL1:PARange
>> also has the field definitions, except that the field is 4bits wide, but
>> only 3bits are used)
>>
>>   000   32 bits, 4GB.
>>   001   36 bits, 64GB.
>>   010   40 bits, 1TB.
>>   011   42 bits, 4TB.
>>   100   44 bits, 16TB.
>>   101   48 bits, 256TB.
>>   110   52 bits, 4PB
>>
>> But we need to map 0 => 40bits IPA to make our ABI backward compatible. So
>> we could use the additional one bit to indicate that IPA size is requested
>> in the 3 bits.
>>
>> i.e,
>>
>> machine_type:
>>
>> Bit [2:0] - Requested IPA size. Values follow VTCR_EL2.PS format.
>>
>> Bit [3]   - 1 => IPA Size bits (Bits[2:0]) requested.
>>             0 => Not requested
>>
>> The only minor down side is restricting to the predefined values above,
>> which is not a real issue for a VM.
>>
>> Thoughts ?
>
> I'd be very wary of using that 4th bit to do something that is not in
> the architecture. We have only a single value left to be used (0b111),
> and then your scheme clashes with the architecture definition.

I agree. However, if PARange ever grows beyond 3 bits, the {V}TCR
counterpart would have the same problem. But let's not take that chance.

>
> I'd rather encode things in a way that is independent from the
> architecture, and be done with it. You can map 0 to 40bits, and we have
> the ability to express all values the architecture has (just in a
> different order).

The other option I can think of is encoding a signed number which is the
difference between the IPA size and 40. But that would need 5 bits if we
were to encode it as is. If we want to squeeze it into 4 bits, we could
store half the difference (restricting the IPA size to even numbers).

i.e., IPA = 40 + 2 * sign_extend(bits[3:0]);

Suzuki
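
For illustration, a minimal C sketch of the half-difference encoding proposed above. All names here (IPA_DEFAULT_BITS, IPA_FIELD_MASK, sign_extend4, ipa_bits_from_type, type_from_ipa_bits) are hypothetical and not part of the posted patch, and the sketch assumes the field sits in bits[3:0] of the type argument.

```c
#include <assert.h>
#include <stdint.h>

#define IPA_DEFAULT_BITS 40    /* value the legacy type 0 must map to */
#define IPA_FIELD_MASK   0xfu  /* assumed: field occupies bits[3:0] */

/* Sign-extend a 4-bit two's-complement field to a plain int (-8..7). */
static int sign_extend4(uint32_t field)
{
    field &= IPA_FIELD_MASK;
    return (int)(field ^ 0x8u) - 8;
}

/* Decode: IPA = 40 + 2 * sign_extend(bits[3:0]). */
static int ipa_bits_from_type(unsigned long type)
{
    int bits = IPA_DEFAULT_BITS + 2 * sign_extend4((uint32_t)type);

    /* A 4-bit half-difference can only express 24..54. */
    assert(bits >= 24 && bits <= 54);
    return bits;
}

/* Encode: returns the 4-bit field, or -1 if ipa_bits is not expressible. */
static int type_from_ipa_bits(int ipa_bits)
{
    int diff = ipa_bits - IPA_DEFAULT_BITS;
    int half = diff / 2;

    if (diff % 2 != 0 || half < -8 || half > 7)
        return -1;  /* odd size, or outside 24..54 */
    return half & 0xf;
}
```

With this layout a field value of 0 still decodes to 40 bits, so existing callers are unaffected. The sizes listed in the PS table (32, 36, 40, 42, 44, 48, 52) are all even offsets from 40 and each fits in the 4-bit field; the expressible range is 24 to 54 bits in steps of 2.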