From: Anup Patel
Date: Thu, 22 Aug 2019 18:08:32 +0530
Subject: Re: [PATCH v5 13/20] RISC-V: KVM: Implement stage2 page table programming
To: Alexander Graf
Cc: Anup Patel, Palmer Dabbelt, Paul Walmsley, Paolo Bonzini, Radim K,
 Daniel Lezcano, Thomas Gleixner, Atish Patra, Alistair Francis,
 Damien Le Moal, Christoph Hellwig, kvm@vger.kernel.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
References: <20190822084131.114764-1-anup.patel@wdc.com>
 <20190822084131.114764-14-anup.patel@wdc.com>
 <77b9ff3c-292f-ee17-ddbb-134c0666fde7@amazon.com>
In-Reply-To: <77b9ff3c-292f-ee17-ddbb-134c0666fde7@amazon.com>
Content-Type: text/plain; charset="UTF-8"
List-ID: linux-kernel.vger.kernel.org

On Thu, Aug 22, 2019 at 5:58 PM Alexander Graf wrote:
>
> On 22.08.19 10:45, Anup Patel wrote:
> > This patch implements all required functions for programming
> > the stage2 page table for each Guest/VM.
> >
> > At a high level, the flow of the stage2-related functions is similar
> > to the KVM ARM/ARM64 implementation, but the stage2 page table
> > format is quite different for KVM RISC-V.
> >
> > Signed-off-by: Anup Patel
> > Acked-by: Paolo Bonzini
> > Reviewed-by: Paolo Bonzini
> > ---
> >  arch/riscv/include/asm/kvm_host.h     |  10 +
> >  arch/riscv/include/asm/pgtable-bits.h |   1 +
> >  arch/riscv/kvm/mmu.c                  | 637 +++++++++++++++++++++++++-
> >  3 files changed, 638 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > index 3b09158f80f2..a37775c92586 100644
> > --- a/arch/riscv/include/asm/kvm_host.h
> > +++ b/arch/riscv/include/asm/kvm_host.h
> > @@ -72,6 +72,13 @@ struct kvm_mmio_decode {
> >  	int shift;
> >  };
> >
> > +#define KVM_MMU_PAGE_CACHE_NR_OBJS	32
> > +
> > +struct kvm_mmu_page_cache {
> > +	int nobjs;
> > +	void *objects[KVM_MMU_PAGE_CACHE_NR_OBJS];
> > +};
> > +
> >  struct kvm_cpu_context {
> >  	unsigned long zero;
> >  	unsigned long ra;
> > @@ -163,6 +170,9 @@ struct kvm_vcpu_arch {
> >  	/* MMIO instruction details */
> >  	struct kvm_mmio_decode mmio_decode;
> >
> > +	/* Cache pages needed to program page tables with spinlock held */
> > +	struct kvm_mmu_page_cache mmu_page_cache;
> > +
> >  	/* VCPU power-off state */
> >  	bool power_off;
> >
> > diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
> > index bbaeb5d35842..be49d62fcc2b 100644
> > --- a/arch/riscv/include/asm/pgtable-bits.h
> > +++ b/arch/riscv/include/asm/pgtable-bits.h
> > @@ -26,6 +26,7 @@
> >
> >  #define _PAGE_SPECIAL   _PAGE_SOFT
> >  #define _PAGE_TABLE     _PAGE_PRESENT
> > +#define _PAGE_LEAF      (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)
> >
> >  /*
> >   * _PAGE_PROT_NONE is set on not-present pages (and ignored by the hardware) to
> > diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> > index 2b965f9aac07..9e95ab6769f6 100644
> > --- a/arch/riscv/kvm/mmu.c
> > +++ b/arch/riscv/kvm/mmu.c
> > @@ -18,6 +18,432 @@
> >  #include
> >  #include
> >
> > +#ifdef CONFIG_64BIT
> > +#define stage2_have_pmd		true
> > +#define stage2_gpa_size		((phys_addr_t)(1ULL << 39))
> > +#define stage2_cache_min_pages	2
> > +#else
> > +#define pmd_index(x)		0
> > +#define pfn_pmd(x, y)		({ pmd_t __x = { 0 }; __x; })
> > +#define stage2_have_pmd		false
> > +#define stage2_gpa_size		((phys_addr_t)(1ULL << 32))
> > +#define stage2_cache_min_pages	1
> > +#endif
> > +
> > +static int stage2_cache_topup(struct kvm_mmu_page_cache *pcache,
> > +			      int min, int max)
> > +{
> > +	void *page;
> > +
> > +	BUG_ON(max > KVM_MMU_PAGE_CACHE_NR_OBJS);
> > +	if (pcache->nobjs >= min)
> > +		return 0;
> > +	while (pcache->nobjs < max) {
> > +		page = (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
> > +		if (!page)
> > +			return -ENOMEM;
> > +		pcache->objects[pcache->nobjs++] = page;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void stage2_cache_flush(struct kvm_mmu_page_cache *pcache)
> > +{
> > +	while (pcache && pcache->nobjs)
> > +		free_page((unsigned long)pcache->objects[--pcache->nobjs]);
> > +}
> > +
> > +static void *stage2_cache_alloc(struct kvm_mmu_page_cache *pcache)
> > +{
> > +	void *p;
> > +
> > +	if (!pcache)
> > +		return NULL;
> > +
> > +	BUG_ON(!pcache->nobjs);
> > +	p = pcache->objects[--pcache->nobjs];
> > +
> > +	return p;
> > +}
> > +
> > +struct local_guest_tlb_info {
> > +	struct kvm_vmid *vmid;
> > +	gpa_t addr;
> > +};
> > +
> > +static void local_guest_tlb_flush_vmid_gpa(void *info)
> > +{
> > +	struct local_guest_tlb_info *infop = info;
> > +
> > +	__kvm_riscv_hfence_gvma_vmid_gpa(READ_ONCE(infop->vmid->vmid_version),
> > +					 infop->addr);
> > +}
> > +
> > +static void stage2_remote_tlb_flush(struct kvm *kvm, gpa_t addr)
> > +{
> > +	struct local_guest_tlb_info info;
> > +	struct kvm_vmid *vmid = &kvm->arch.vmid;
> > +
> > +	/* TODO: This should be SBI call */
> > +	info.vmid = vmid;
> > +	info.addr = addr;
> > +	preempt_disable();
> > +	smp_call_function_many(cpu_all_mask, local_guest_tlb_flush_vmid_gpa,
> > +			       &info, true);
>
> This is all nice and dandy on the toy 4 core systems we have today, but
> it will become a bottleneck further down the road.
>
> How many VMIDs do you have? Could you just allocate a new one every time
> you switch host CPUs? Then you know exactly which CPUs to flush by
> looking at all your vcpu structs and a local field that tells you which
> pCPU they're on at this moment.
>
> Either way, it's nothing that should block inclusion.

For today, we're fine. We are not happy about this either.

The other two options we have are:
1. Add SBI calls for remote HFENCEs
2. Propose a RISC-V ISA extension for remote FENCEs

Option 1 mostly means extending the SBI spec and implementing it in the
runtime firmware. Option 2 is the ideal solution, but it requires
consensus among a wider audience in the RISC-V foundation.

At this point, we are fine with a simple solution.

Regards,
Anup

>
>
> Alex
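[Editor's note: the bookkeeping Alex suggests — remember the last pCPU
each vCPU ran on, and allocate a fresh VMID whenever a vCPU migrates so
stale entries under the old VMID are never matched — could be sketched
roughly as below. This is a hypothetical userspace sketch with made-up
names, not the actual KVM RISC-V implementation.]

```c
#include <stdint.h>

#define MAX_VCPUS 8

/* Last physical CPU each vCPU ran on; -1 means it has never run. */
static int vcpu_last_pcpu[MAX_VCPUS] = { -1, -1, -1, -1, -1, -1, -1, -1 };

/* Scheduler-in hook: record where this vCPU is running now.
 * (Assumes a new VMID is allocated on migration, so only the
 * current pCPU can hold live TLB entries for this vCPU.) */
static void vcpu_record_pcpu(int vcpu_id, int pcpu)
{
	vcpu_last_pcpu[vcpu_id] = pcpu;
}

/* Build a bitmask of pCPUs that may hold stage2 TLB entries for the
 * current VMID; a real implementation would turn this into a cpumask
 * for smp_call_function_many() instead of using cpu_all_mask. */
static uint64_t stage2_flush_mask(void)
{
	uint64_t mask = 0;

	for (int i = 0; i < MAX_VCPUS; i++)
		if (vcpu_last_pcpu[i] >= 0)
			mask |= 1ULL << vcpu_last_pcpu[i];
	return mask;
}
```

With 8 vCPUs spread over a few pCPUs, the flush would then target only
the handful of CPUs in the mask rather than every online CPU.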