Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18;
MIME-Version: 1.0
References: <20200803211423.29398-1-graf@amazon.com> <20200803211423.29398-3-graf@amazon.com>
In-Reply-To: <20200803211423.29398-3-graf@amazon.com>
From:   Jim Mattson <jmattson@google.com>
Date:   Wed, 19 Aug 2020 15:49:32 -0700
Message-ID: <CALMp9eS3Y845mPMD6H+5nmYDMvhPcDcFCWUXpLiscxo_9--EYQ@mail.gmail.com>
Subject: Re: [PATCH v4 2/3] KVM: x86: Introduce allow list for MSR emulation
To:     Alexander Graf <graf@amazon.com>
Cc:     Paolo Bonzini <pbonzini@redhat.com>,
        Jonathan Corbet <corbet@lwn.net>,
        Sean Christopherson <sean.j.christopherson@intel.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Joerg Roedel <joro@8bytes.org>,
        KarimAllah Raslan <karahmed@amazon.de>,
        Aaron Lewis <aaronlewis@google.com>,
        kvm list <kvm@vger.kernel.org>, linux-doc@vger.kernel.org,
        LKML <linux-kernel@vger.kernel.org>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk

On Mon, Aug 3, 2020 at 2:14 PM Alexander Graf <graf@amazon.com> wrote:

> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -901,6 +901,13 @@ struct kvm_hv {
>         struct kvm_hv_syndbg hv_syndbg;
>  };
>
> +struct msr_bitmap_range {
> +       u32 flags;
> +       u32 nmsrs;
> +       u32 base;
> +       unsigned long *bitmap;
> +};
> +
>  enum kvm_irqchip_mode {
>         KVM_IRQCHIP_NONE,
>         KVM_IRQCHIP_KERNEL,       /* created with KVM_CREATE_IRQCHIP */
> @@ -1005,6 +1012,9 @@ struct kvm_arch {
>         /* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
>         bool user_space_msr_enabled;
>
> +       struct msr_bitmap_range msr_allowlist_ranges[10];

Why 10? I think this is the only use of this constant, but a macro
would still be nice, especially since the number appears to be
arbitrary.
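
Something along these lines, perhaps. This is only a userspace sketch:
KVM_MSR_ALLOWLIST_MAX_RANGES, struct kvm_arch_sketch and
allowlist_has_room() are names I made up for illustration, and none of
them appear in the patch.

```c
#include <stdint.h>

/* Hypothetical macro name; the patch uses a bare 10 here. */
#define KVM_MSR_ALLOWLIST_MAX_RANGES 10

/* Userspace stand-in for the struct added by the patch. */
struct msr_bitmap_range {
	uint32_t flags;
	uint32_t nmsrs;
	uint32_t base;
	unsigned long *bitmap;
};

/* Stand-in for the relevant part of struct kvm_arch. */
struct kvm_arch_sketch {
	struct msr_bitmap_range msr_allowlist_ranges[KVM_MSR_ALLOWLIST_MAX_RANGES];
	uint32_t msr_allowlist_ranges_count;
};

/* Any bounds check can then refer to the same constant. */
static int allowlist_has_room(const struct kvm_arch_sketch *arch)
{
	return arch->msr_allowlist_ranges_count < KVM_MSR_ALLOWLIST_MAX_RANGES;
}
```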

> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 0780f97c1850..c33fb1d72d52 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -192,6 +192,21 @@ struct kvm_msr_list {
>         __u32 indices[0];
>  };
>
> +#define KVM_MSR_ALLOW_READ  (1 << 0)
> +#define KVM_MSR_ALLOW_WRITE (1 << 1)
> +
> +/* Maximum size of the of the bitmap in bytes */
> +#define KVM_MSR_ALLOWLIST_MAX_LEN 0x600

Wouldn't 0x400 be a more natural size, since both Intel and AMD MSR
permission bitmaps cover ranges of 8192 MSRs?
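
To spell out the arithmetic, assuming one bit per MSR and 8-bit bytes:
8192 MSRs need exactly 0x400 bytes, whereas 0x600 bytes would describe
12288 MSRs. The macro and helper names below are mine, not from the
patch.

```c
#include <limits.h>

/* Illustrative names, not from the patch. */
#define HW_MSRS_PER_RANGE	8192	/* MSRs per hardware bitmap range */
#define BITMAP_BYTES(nmsrs)	(((nmsrs) + CHAR_BIT - 1) / CHAR_BIT)

/* Number of MSRs a bitmap of the given byte length can describe. */
static unsigned int msrs_covered(unsigned int bytes)
{
	return bytes * CHAR_BIT;
}

/* One bit per MSR: 8192 MSRs fit in exactly 0x400 bytes. */
_Static_assert(BITMAP_BYTES(HW_MSRS_PER_RANGE) == 0x400,
	       "8192 MSRs at one bit each need 0x400 bytes");
```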

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e1139124350f..25e58ceb19de 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1472,6 +1472,38 @@ void kvm_enable_efer_bits(u64 mask)
>  }
>  EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
>
> +static bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type)

In another thread, when I suggested that a function should return
bool, you said, "I'm not a big fan of bool returning APIs unless they
have an 'is' in their name." This function doesn't have "is" in its
name. :-)

> +{
> +       struct kvm *kvm = vcpu->kvm;
> +       struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
> +       u32 count = kvm->arch.msr_allowlist_ranges_count;

Shouldn't the read of kvm->arch.msr_allowlist_ranges_count be guarded
by the mutex, below?
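
Roughly what I have in mind, as a userspace sketch only: a pthread
mutex stands in for kvm->lock, struct allowlist_sketch and
msr_in_allowlist() are hypothetical names, and the flags and bitmap
tests are left out to keep it short. The point is just that the count
is sampled after the lock is taken.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_RANGES 10

/* Userspace stand-in; the pthread mutex plays the role of kvm->lock. */
struct allowlist_sketch {
	pthread_mutex_t lock;
	uint32_t count;
	uint32_t base[SKETCH_MAX_RANGES];
	uint32_t nmsrs[SKETCH_MAX_RANGES];
};

static bool msr_in_allowlist(struct allowlist_sketch *al, uint32_t index)
{
	bool r = false;
	uint32_t i, count;

	pthread_mutex_lock(&al->lock);

	/* Sample the count only once the lock is held. */
	count = al->count;

	/* Allowlist not set up, allow everything. */
	if (!count) {
		r = true;
		goto out;
	}

	for (i = 0; i < count; i++) {
		/* Simplified: range membership only, no flags or bitmap test. */
		if (index >= al->base[i] && index - al->base[i] < al->nmsrs[i]) {
			r = true;
			break;
		}
	}
out:
	pthread_mutex_unlock(&al->lock);
	return r;
}
```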

> +       u32 i;
> +       bool r = false;
> +
> +       /* MSR allowlist not set up, allow everything */
> +       if (!count)
> +               return true;
> +
> +       /* Prevent collision with clear_msr_allowlist */
> +       mutex_lock(&kvm->lock);
> +
> +       for (i = 0; i < count; i++) {
> +               u32 start = ranges[i].base;
> +               u32 end = start + ranges[i].nmsrs;
> +               u32 flags = ranges[i].flags;
> +               unsigned long *bitmap = ranges[i].bitmap;
> +
> +               if ((index >= start) && (index < end) && (flags & type)) {
> +                       r = !!test_bit(index - start, bitmap);

The !! seems gratuitous, since r is of type bool.
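
A small userspace illustration of why: conversion to bool already
normalizes any nonzero value to true. raw_test_bit() below is a
hypothetical stand-in that deliberately returns the raw masked word
rather than 0 or 1. It is not the kernel's test_bit().

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for test_bit() that returns the masked word,
 * i.e. possibly a nonzero value other than 1 when the bit is set.
 */
static unsigned long raw_test_bit(unsigned int nr, const unsigned long *addr)
{
	const unsigned int bits = sizeof(unsigned long) * CHAR_BIT;

	return addr[nr / bits] & (1UL << (nr % bits));
}

static bool bit_as_bool(unsigned int nr, const unsigned long *addr)
{
	/* Assignment to bool converts nonzero to true; no !! needed. */
	bool r = raw_test_bit(nr, addr);

	return r;
}
```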

> @@ -1483,6 +1515,9 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
>  {
>         struct msr_data msr;
>
> +       if (!host_initiated && !kvm_msr_allowed(vcpu, index, KVM_MSR_ALLOW_WRITE))
> +               return -ENOENT;

Perhaps -EPERM is more appropriate here?

>         switch (index) {
>         case MSR_FS_BASE:
>         case MSR_GS_BASE:
> @@ -1528,6 +1563,9 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
>         struct msr_data msr;
>         int ret;
>
> +       if (!host_initiated && !kvm_msr_allowed(vcpu, index, KVM_MSR_ALLOW_READ))
> +               return -ENOENT;

...and here?

> +static bool msr_range_overlaps(struct kvm *kvm, struct msr_bitmap_range *range)

Another bool function with no "is"? :-)

> +{
> +       struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
> +       u32 i, count = kvm->arch.msr_allowlist_ranges_count;
> +       bool r = false;
> +
> +       for (i = 0; i < count; i++) {
> +               u32 start = max(range->base, ranges[i].base);
> +               u32 end = min(range->base + range->nmsrs,
> +                             ranges[i].base + ranges[i].nmsrs);
> +
> +               if ((start < end) && (range->flags & ranges[i].flags)) {
> +                       r = true;
> +                       break;
> +               }
> +       }
> +
> +       return r;
> +}

This seems like an awkward constraint. Would it be possible to allow
overlapping ranges as long as the access types don't clash? So, for
example, could I specify an allow list for READ of MSRs 0-0x1ffff and
an allow list for WRITE of MSRs 0-0x1ffff? Actually, I don't see why
you have to prohibit overlapping ranges at all.
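
For instance, a lookup could simply skip ranges whose flags don't cover
the access type and keep scanning. The userspace sketch below makes one
assumption of mine that the patch does not state: an access is allowed
if any matching range allows it. struct msr_range_sketch, bitmap_test()
and msr_is_allowed() are illustrative names, and the "empty allowlist
allows everything" case is omitted.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define KVM_MSR_ALLOW_READ  (1 << 0)
#define KVM_MSR_ALLOW_WRITE (1 << 1)

/* Userspace stand-in for struct msr_bitmap_range. */
struct msr_range_sketch {
	uint32_t flags;
	uint32_t nmsrs;
	uint32_t base;
	unsigned long *bitmap;
};

static bool bitmap_test(const unsigned long *bitmap, uint32_t nr)
{
	const uint32_t bits = sizeof(unsigned long) * CHAR_BIT;

	return bitmap[nr / bits] & (1UL << (nr % bits));
}

/* Assumed semantics: allowed if any range matching the type allows it. */
static bool msr_is_allowed(const struct msr_range_sketch *ranges,
			   uint32_t count, uint32_t index, uint32_t type)
{
	uint32_t i;

	for (i = 0; i < count; i++) {
		const struct msr_range_sketch *r = &ranges[i];

		/* A range for a different access type never clashes. */
		if (!(r->flags & type))
			continue;
		if (index < r->base || index - r->base >= r->nmsrs)
			continue;
		if (bitmap_test(r->bitmap, index - r->base))
			return true;
	}

	return false;
}
```

With that, a READ range and a WRITE range over the same MSRs can
coexist, each consulted only for its own access type.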


> +static int kvm_vm_ioctl_clear_msr_allowlist(struct kvm *kvm)
> +{
> +       int i;

Nit: In earlier code, you use u32 for this index. (I'm actually a fan
of int, myself.)


> @@ -10086,6 +10235,8 @@ void kvm_arch_pre_destroy_vm(struct kvm *kvm)
>
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> +       int i;

It's 50/50 now, u32 vs. int. :-)