Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp1255149ybl; Fri, 23 Aug 2019 16:17:32 -0700 (PDT) X-Google-Smtp-Source: APXvYqx+nhX2mEYITjbdlXubz0FoXxy4POy5lCXs74ytg9S2+X/OjKjeBA+Y5rVJiAYRBem3DW2L X-Received: by 2002:a63:3148:: with SMTP id x69mr5624123pgx.300.1566602252338; Fri, 23 Aug 2019 16:17:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1566602252; cv=none; d=google.com; s=arc-20160816; b=EcBqWdyf1tWoroak2keGCnYt0itwZcfQ9zuP+KKFEyLGEiNx7RInCZvHv61IGPvTDl wPOs7rRw/HhU8urvIQjuBqMKyJLx3HpWzSt8yOrlDEqPC75u0a0Pgx4M5ZRUixeq4PTi m1It0+kYnB+TXYo7aBxTK0Z3tN70+6BfbmfkF0Sw84Sx1nmvV6KnY+YrfVRQoHWWNeKx omwdpcRUP5cmRR43uvJ9vfp+4p3tiuG/CMNcd4N5lCPf2sx9oIHLyDRXrcsRK6mhumpI 2UrLhdctHi0vAGo4s9+Ext/NO5ws6vuGMhRDSa2BJoNrwrWgY6wTUp7KpdX8c0WRorHu Gq2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=v3SECXYZMgxemCryQWtmgixga1wgTZLwdqnQ2dtsTnI=; b=odPATBGwnKcV/TG0OoZHAhftUyqnDVUm8nFoW46r5ajGvw4s7+SLOAaUBRqVXLOm4W 8H2f5jvbxBVahqmfubuayuj0mH70Nf8HeC9PAId6UuUTakAGRXDJtCWiLeK5rFXSd63p LmdDa78fKDoucugi8k7gEBcvuaWtNGDlrIyajgBNGqZnaEHd1UwPmNpNx6Fc2ERTvov3 VoXqfDcKKNGvuV1fJy0+IgE0NAZAL/7NRRVWN8w3GLVxB8RTdiXcHtz8Btiyx3z/BiIa 84SmxEXyZ5C1nPnBOfZR6crCpn+kJgRT05PKHIlj/zMB6NkmloMntaYkXKUXnmjwAfxR PG3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@brainfault-org.20150623.gappssmtp.com header.s=20150623 header.b=COnwfN2D; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 1si3630583ply.122.2019.08.23.16.17.17; Fri, 23 Aug 2019 16:17:32 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@brainfault-org.20150623.gappssmtp.com header.s=20150623 header.b=COnwfN2D; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394741AbfHWM3P (ORCPT + 99 others); Fri, 23 Aug 2019 08:29:15 -0400 Received: from mail-wm1-f66.google.com ([209.85.128.66]:51701 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732402AbfHWM3N (ORCPT ); Fri, 23 Aug 2019 08:29:13 -0400 Received: by mail-wm1-f66.google.com with SMTP id k1so8763437wmi.1 for ; Fri, 23 Aug 2019 05:29:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brainfault-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=v3SECXYZMgxemCryQWtmgixga1wgTZLwdqnQ2dtsTnI=; b=COnwfN2DAxkwdACO0pfIBW80uOhqEtE8aoeu479J8M0lSdcDzkZ6KC99XiJQI7Y6/c ktUZvkeRi2IlTM4WJ/UBdQEq74jn/MFGr+M49KtitptIYIzdWpZaJVJ/vQ3ybOuY27lI 40aJ1InPnTbY4xiM+8N9uFl2P2Pp2qa4b4iOXsf3kUDA5nb3BnuJiolZOjlVRhDvSOMi 0aFnFZW++EKpE04Q3LT9n1peugPvdWOk2nI9h8QszmzyUosOy20KJkCyxXaZVBumAzgE sRKiCJMkp7pAGoaC57VyngFaPktL3UeMUr/kC1aWXKIsXZZ86AKBzHXo20xfBQqJL4kY R+gw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=v3SECXYZMgxemCryQWtmgixga1wgTZLwdqnQ2dtsTnI=; b=BDcLAtXGxkbw7cINwMdogxOlOub673GOPV1I2dnDfYw76zhmdp8tJ4wYsJguvqa9/J 9waSLD7DRNmibigrB5WSCO3bMvp9+eP9EMSjji6EV/KHBuWhZ6HuMvyqz2JaOya24Gpc xeWrqaAF2qa0qiVVb9e8cP5lZrsL9ONSRcLZ3kdi6E5o4BzG5i4eBfIL3rVqTN/k4Qx+ qGRV39lDK2r6+OIovcwxF2c289FE4iF+37NFeT0DzFTOCZ8g1nBCex6ObZ4kcEiYLoLH 8s4VekTEUWnwSH10SMQeq+afR5TQBYmbWkQwC/7eH6sDxXOvVsSzO/dGMWgln+2QGqjU tNrQ== X-Gm-Message-State: APjAAAWJnAGNLXUgfpgz3BsfQwULLJ3eDn+U9Y22c+6CPm/VVV9a60Za NIlQcdtOVB+Pj8qGbc+px1XhRz5KE55DfxN8bHT2KA== X-Received: by 2002:a7b:c933:: with SMTP id h19mr4734454wml.177.1566563349005; Fri, 23 Aug 2019 05:29:09 -0700 (PDT) MIME-Version: 1.0 References: <20190822084131.114764-1-anup.patel@wdc.com> <20190822084131.114764-19-anup.patel@wdc.com> <40911e08-e0ce-a2b8-24d4-9cf357432850@amazon.com> In-Reply-To: From: Anup Patel Date: Fri, 23 Aug 2019 17:58:57 +0530 Message-ID: Subject: Re: [PATCH v5 18/20] RISC-V: KVM: Add SBI v0.1 support To: Alexander Graf Cc: Anup Patel , Palmer Dabbelt , Paul Walmsley , Paolo Bonzini , Radim K , Damien Le Moal , "kvm@vger.kernel.org" , Daniel Lezcano , "linux-kernel@vger.kernel.org" , Christoph Hellwig , Atish Patra , Alistair Francis , Thomas Gleixner , "linux-riscv@lists.infradead.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 23, 2019 at 5:50 PM Alexander Graf wrote: > > > > On 23.08.19 14:00, Anup Patel wrote: > > On Fri, Aug 23, 2019 at 5:09 PM Graf (AWS), Alexander = wrote: > >> > >> > >> > >>> Am 23.08.2019 um 13:18 schrieb Anup Patel : > >>> > >>>> On Fri, Aug 23, 2019 at 1:34 PM Alexander Graf wro= te: > >>>> > >>>>> On 22.08.19 10:46, Anup Patel wrote: > >>>>> From: Atish Patra > >>>>> > >>>>> The KVM host kernel running in HS-mode needs to handle SBI calls co= ming > >>>>> from guest kernel running in VS-mode. > >>>>> > >>>>> This patch adds SBI v0.1 support in KVM RISC-V. All the SBI calls a= re > >>>>> implemented correctly except remote tlb flushes. For remote TLB flu= shes, > >>>>> we are doing full TLB flush and this will be optimized in future. > >>>>> > >>>>> Signed-off-by: Atish Patra > >>>>> Signed-off-by: Anup Patel > >>>>> Acked-by: Paolo Bonzini > >>>>> Reviewed-by: Paolo Bonzini > >>>>> --- > >>>>> arch/riscv/include/asm/kvm_host.h | 2 + > >>>>> arch/riscv/kvm/Makefile | 2 +- > >>>>> arch/riscv/kvm/vcpu_exit.c | 3 + > >>>>> arch/riscv/kvm/vcpu_sbi.c | 119 +++++++++++++++++++++++++= +++++ > >>>>> 4 files changed, 125 insertions(+), 1 deletion(-) > >>>>> create mode 100644 arch/riscv/kvm/vcpu_sbi.c > >>>>> > >>>>> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include= /asm/kvm_host.h > >>>>> index 2af3a179c08e..0b1eceaef59f 100644 > >>>>> --- a/arch/riscv/include/asm/kvm_host.h > >>>>> +++ b/arch/riscv/include/asm/kvm_host.h > >>>>> @@ -241,4 +241,6 @@ bool kvm_riscv_vcpu_has_interrupt(struct kvm_vc= pu *vcpu); > >>>>> void kvm_riscv_vcpu_power_off(struct kvm_vcpu *vcpu); > >>>>> void kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); > >>>>> > >>>>> +int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu); > >>>>> + > >>>>> #endif /* __RISCV_KVM_HOST_H__ */ > >>>>> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile > >>>>> index 3e0c7558320d..b56dc1650d2c 100644 > >>>>> --- a/arch/riscv/kvm/Makefile > >>>>> +++ b/arch/riscv/kvm/Makefile > >>>>> @@ -9,6 +9,6 @@ ccflags-y :=3D -Ivirt/kvm -Iarch/riscv/kvm > >>>>> kvm-objs :=3D $(common-objs-y) > >>>>> > >>>>> kvm-objs +=3D main.o vm.o vmid.o tlb.o mmu.o > >>>>> -kvm-objs +=3D vcpu.o vcpu_exit.o vcpu_switch.o vcpu_timer.o > >>>>> +kvm-objs +=3D vcpu.o vcpu_exit.o vcpu_switch.o vcpu_timer.o vcpu_s= bi.o > >>>>> > >>>>> obj-$(CONFIG_KVM) +=3D kvm.o > >>>>> diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.= c > >>>>> index fbc04fe335ad..87b83fcf9a14 100644 > >>>>> --- a/arch/riscv/kvm/vcpu_exit.c > >>>>> +++ b/arch/riscv/kvm/vcpu_exit.c > >>>>> @@ -534,6 +534,9 @@ int kvm_riscv_vcpu_exit(struct kvm_vcpu *vcpu, = struct kvm_run *run, > >>>>> (vcpu->arch.guest_context.hstatus & HSTATUS_STL)) > >>>>> ret =3D stage2_page_fault(vcpu, run, scause, = stval); > >>>>> break; > >>>>> + case EXC_SUPERVISOR_SYSCALL: > >>>>> + if (vcpu->arch.guest_context.hstatus & HSTATUS_SPV) > >>>>> + ret =3D kvm_riscv_vcpu_sbi_ecall(vcpu); > >>>>> default: > >>>>> break; > >>>>> }; > >>>>> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c > >>>>> new file mode 100644 > >>>>> index 000000000000..5793202eb514 > >>>>> --- /dev/null > >>>>> +++ b/arch/riscv/kvm/vcpu_sbi.c > >>>>> @@ -0,0 +1,119 @@ > >>>>> +// SPDX-License-Identifier: GPL-2.0 > >>>>> +/** > >>>>> + * Copyright (c) 2019 Western Digital Corporation or its affiliate= s. > >>>>> + * > >>>>> + * Authors: > >>>>> + * Atish Patra > >>>>> + */ > >>>>> + > >>>>> +#include > >>>>> +#include > >>>>> +#include > >>>>> +#include > >>>>> +#include > >>>>> + > >>>>> +#define SBI_VERSION_MAJOR 0 > >>>>> +#define SBI_VERSION_MINOR 1 > >>>>> + > >>>>> +/* TODO: Handle traps due to unpriv load and redirect it back to V= S-mode */ > >>>> > >>>> Ugh, another one of those? Can't you just figure out a way to recove= r > >>>> from the page fault? Also, you want to combine this with the instruc= tion > >>>> load logic, so that we have a single place that guest address space > >>>> reads go through. > >>> > >>> Walking Guest page table would be more expensive compared to implemen= ting > >>> a trap handling mechanism. > >>> > >>> We will be adding trap handling mechanism for reading instruction and= reading > >>> load. > >>> > >>> Both these operations are different in following ways: > >>> 1. RISC-V instructions are variable length. We get to know exact inst= ruction > >>> length only after reading first 16bits > >>> 2. We need to set VSSTATUS.MXR bit when reading instruction for > >>> execute-only Guest pages. > >> > >> Yup, sounds like you could solve that with a trivial if() based on "re= ad instruction" or not, no? If you want to, feel free to provide short vers= ions that do only read ins/data, but I would really like to see the whole "= data reads become guest reads" magic to be funneled through a single functi= on (in C, can be inline unrolled in asm of course) > >> > >>> > >>>> > >>>>> +static unsigned long kvm_sbi_unpriv_load(const unsigned long *addr= , > >>>>> + struct kvm_vcpu *vcpu) > >>>>> +{ > >>>>> + unsigned long flags, val; > >>>>> + unsigned long __hstatus, __sstatus; > >>>>> + > >>>>> + local_irq_save(flags); > >>>>> + __hstatus =3D csr_read(CSR_HSTATUS); > >>>>> + __sstatus =3D csr_read(CSR_SSTATUS); > >>>>> + csr_write(CSR_HSTATUS, vcpu->arch.guest_context.hstatus | HST= ATUS_SPRV); > >>>>> + csr_write(CSR_SSTATUS, vcpu->arch.guest_context.sstatus); > >>>>> + val =3D *addr; > >>>>> + csr_write(CSR_HSTATUS, __hstatus); > >>>>> + csr_write(CSR_SSTATUS, __sstatus); > >>>>> + local_irq_restore(flags); > >>>>> + > >>>>> + return val; > >>>>> +} > >>>>> + > >>>>> +static void kvm_sbi_system_shutdown(struct kvm_vcpu *vcpu, u32 typ= e) > >>>>> +{ > >>>>> + int i; > >>>>> + struct kvm_vcpu *tmp; > >>>>> + > >>>>> + kvm_for_each_vcpu(i, tmp, vcpu->kvm) > >>>>> + tmp->arch.power_off =3D true; > >>>>> + kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP); > >>>>> + > >>>>> + memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_= event)); > >>>>> + vcpu->run->system_event.type =3D type; > >>>>> + vcpu->run->exit_reason =3D KVM_EXIT_SYSTEM_EVENT; > >>>>> +} > >>>>> + > >>>>> +int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu) > >>>>> +{ > >>>>> + int ret =3D 1; > >>>>> + u64 next_cycle; > >>>>> + int vcpuid; > >>>>> + struct kvm_vcpu *remote_vcpu; > >>>>> + ulong dhart_mask; > >>>>> + struct kvm_cpu_context *cp =3D &vcpu->arch.guest_context; > >>>>> + > >>>>> + if (!cp) > >>>>> + return -EINVAL; > >>>>> + switch (cp->a7) { > >>>>> + case SBI_SET_TIMER: > >>>>> +#if __riscv_xlen =3D=3D 32 > >>>>> + next_cycle =3D ((u64)cp->a1 << 32) | (u64)cp->a0; > >>>>> +#else > >>>>> + next_cycle =3D (u64)cp->a0; > >>>>> +#endif > >>>>> + kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle); > >>>> > >>>> Ah, this is where the timer set happens. I still don't understand ho= w > >>>> this takes the frequency bit into account? > >>> > >>> Explained it in PATCH17 comments. > >>> > >>>> > >>>>> + break; > >>>>> + case SBI_CONSOLE_PUTCHAR: > >>>>> + /* Not implemented */ > >>>>> + cp->a0 =3D -ENOTSUPP; > >>>>> + break; > >>>>> + case SBI_CONSOLE_GETCHAR: > >>>>> + /* Not implemented */ > >>>>> + cp->a0 =3D -ENOTSUPP; > >>>>> + break; > >>>> > >>>> These two should be covered by the default case. > >>> > >>> Sure, I will update. > >>> > >>>> > >>>>> + case SBI_CLEAR_IPI: > >>>>> + kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_S_SOFT); > >>>>> + break; > >>>>> + case SBI_SEND_IPI: > >>>>> + dhart_mask =3D kvm_sbi_unpriv_load((unsigned long *)c= p->a0, vcpu); > >>>>> + for_each_set_bit(vcpuid, &dhart_mask, BITS_PER_LONG) = { > >>>>> + remote_vcpu =3D kvm_get_vcpu_by_id(vcpu->kvm,= vcpuid); > >>>>> + kvm_riscv_vcpu_set_interrupt(remote_vcpu, IRQ= _S_SOFT); > >>>>> + } > >>>>> + break; > >>>>> + case SBI_SHUTDOWN: > >>>>> + kvm_sbi_system_shutdown(vcpu, KVM_SYSTEM_EVENT_SHUTDO= WN); > >>>>> + ret =3D 0; > >>>>> + break; > >>>>> + case SBI_REMOTE_FENCE_I: > >>>>> + sbi_remote_fence_i(NULL); > >>>>> + break; > >>>>> + /* > >>>>> + * TODO: There should be a way to call remote hfence.bvma. > >>>>> + * Preferred method is now a SBI call. Until then, just flush > >>>>> + * all tlbs. > >>>>> + */ > >>>>> + case SBI_REMOTE_SFENCE_VMA: > >>>>> + /*TODO: Parse vma range.*/ > >>>>> + sbi_remote_sfence_vma(NULL, 0, 0); > >>>>> + break; > >>>>> + case SBI_REMOTE_SFENCE_VMA_ASID: > >>>>> + /*TODO: Parse vma range for given ASID */ > >>>>> + sbi_remote_sfence_vma(NULL, 0, 0); > >>>>> + break; > >>>>> + default: > >>>>> + cp->a0 =3D ENOTSUPP; > >>>>> + break; > >>>> > >>>> Please just send unsupported SBI events into user space. > >>> > >>> For unsupported SBI calls, we should be returning error to the > >>> Guest Linux so that do something about it. This is in accordance > >>> with the SBI spec. > >> > >> That's up to user space (QEMU / kvmtool) to decide. If user space want= s to implement the console functions (like we do on s390), it should have = the chance to do so. > > > > The SBI_CONSOLE_PUTCHAR and SBI_CONSOLE_GETCHAR are > > for debugging only. These calls are deprecated in SBI v0.2 onwards > > because we now have earlycon for early prints in Linux RISC-V. > > > > The RISC-V Guest will generally have it's own MMIO based UART > > which will be the default console. > > > > Due to these reasons, we have not implemented these SBI calls. > > I'm not saying we should implement them. I'm saying we should leave a > policy decision like that up to user space. By terminating the SBI in > kernel space, you can not quickly debug something going wrong. > > > If we still want user-space to implement this then we will require > > separate exit reasons and we are trying to avoid adding RISC-V > > specific exit reasons/ioctls in KVM user-space ABI. > > Why? > > I had so many occasions where I would have loved to have user space > exits for MSR access, SPR access, hypercalls, etc etc. It really makes > life so much easier when you can quickly hack something up in user space > rather than modify the kernel. > > > The absence of SBI_CONSOLE_PUTCHAR/GETCHAR certainly > > does not block anyone in debugging Guest Linux because we have > > earlycon support in Linux RISC-V. > > I'm not hung on on the console. What I'm trying to express is a general > sentiment that terminating extensible hypervisor <-> guest interfaces in > kvm is not a great idea. Some times we can't get around it (like on page > tables), but some times we do. And this is a case where we could. > > At the end of the day this is your call though :). I am not sure about user-space CSRs but having ability to route unsupported SBI calls to user-space can be very useful. For this series, we will continue to return error to Guest Linux for unsupported SBI calls. We will add unsupported SBI routing to user-space in next series. Regards, Anup > > > Alex