Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp687784ybi; Fri, 2 Aug 2019 02:38:00 -0700 (PDT) X-Google-Smtp-Source: APXvYqwWoWhjvClJ9KTt5YzipOVnBPGMy6Q3XBSFAt+gaAuU/Iu/I8Ff02Msp7XuXkGWaLJOdr/c X-Received: by 2002:a63:4612:: with SMTP id t18mr114992478pga.85.1564738680703; Fri, 02 Aug 2019 02:38:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564738680; cv=none; d=google.com; s=arc-20160816; b=eyPMLZurbH1tvlD1ALl5AXtXyBMERvxhiZulsOgAaVbnIJ7TEUWJyZ9Y0c/fZM4jH8 UPOHe388UBrmCquqbitnTvRMMazYy/LvoCoQuKH6Uq7QVzAqoJWxPyvTk9aEUGglSSsz iQ7BWYBKVvApi/TiW3rfW7OZlujtmhWr3BLXaQ2P+q5dyO1OyS48JVvLEI0BI/Tawzy7 q+IGs4JnmOJHt3er7NERcPzlyE3e0OJr2QDKyQus2x3kwYoI57o48ffAld+BH3rb7BP7 YFXj5kVm3rGmYD5jWtK6Uz98QobF1tKHhMt6ye0hX4PmUAvmgTJq7HPdqsRTRdX8NLuH YXvQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:openpgp:from:references:cc:to:subject; bh=QOtgP+1RY1QtN6WT4QHkjPhGkfEsydxRR6ux+horNPY=; b=Imqh8NOrX+xfb5u+9VqzV8cklfaC/Tbvqy5cpYqH5okEm3UdmQZDYbgXkJ7fZ/WP/F SsKqKXs/uIhi8vlfgfcUgtxgzwhcvQb+QSBlqdspbkVXFsEH8aDLbOVwD+CgrlwHbNRM GUMeiJ5StgP2UYqBRXFmdcTldjk0zpCNrc863604A2vq1y6LyLE0Lyeq6R22FVeCtYHk fLdYnfq9lozgOnrFcGJBleZ1DWBeJuDCFsPvQR4/B4z3JqhdOpKTMQSs5w3kLVw7rulH jMDopihLufkzAJ2jr3DSq9Yg1cjqGkyq6XNlBXdot2e3Y5RrNYDjdr4BSygaQyrS7HF6 /NIw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b15si30791510pgj.141.2019.08.02.02.37.45; Fri, 02 Aug 2019 02:38:00 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404553AbfHBJBv (ORCPT + 99 others); Fri, 2 Aug 2019 05:01:51 -0400 Received: from mail-wm1-f67.google.com ([209.85.128.67]:52027 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732534AbfHBJBv (ORCPT ); Fri, 2 Aug 2019 05:01:51 -0400 Received: by mail-wm1-f67.google.com with SMTP id 207so67189747wma.1 for ; Fri, 02 Aug 2019 02:01:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=QOtgP+1RY1QtN6WT4QHkjPhGkfEsydxRR6ux+horNPY=; b=aL3k1ct5wOfyYjThzJJmYBCj9F8ZYzuYzFvWeip1zqUTmo5Mlh+iuuIoREshNfCAA4 nvjLWc0R+uWdRYQITzqeANgoAjbLfxgtChvHYJw0F2TqJcGUqkuBr4YkQh7zXSyHHlxR HAPNEzjrXmxhnJLhJzbOkGEGp9ozFYHYsy8uasV+NPRvyb8124Y8neBRleJ5Ure0YMYT 7tLkePbz/QSJ+82LS9jTKy0HmCyNANeRjQjaYTa+1CiDpLMupI8Bif/bDfstwYDDXFSq aT7s6aiMjt32RO4N0lnR0t1cwsM9sfzpd7fGGw+iFfDoD7resA93z7Gk7AKJ4yi+2MRl Vcxg== X-Gm-Message-State: APjAAAXKmAnhqFwA5qwjMeM7uCJJAB37rWGxrQLTWdO6kUjPaq/i5DSI SQKj3TMxfHLGY9J8LVRbDiQ82ehOFOI= X-Received: by 2002:a05:600c:2388:: with SMTP id m8mr3344920wma.23.1564736507947; Fri, 02 Aug 2019 02:01:47 -0700 (PDT) Received: from ?IPv6:2001:b07:6468:f312:4013:e920:9388:c3ff? ([2001:b07:6468:f312:4013:e920:9388:c3ff]) by smtp.gmail.com with ESMTPSA id s10sm90218243wmf.8.2019.08.02.02.01.46 (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128); Fri, 02 Aug 2019 02:01:47 -0700 (PDT) Subject: Re: [RFC PATCH v2 04/19] RISC-V: Add initial skeletal KVM support To: Anup Patel , Palmer Dabbelt , Paul Walmsley , Radim K Cc: Daniel Lezcano , Thomas Gleixner , Atish Patra , Alistair Francis , Damien Le Moal , Christoph Hellwig , Anup Patel , "kvm@vger.kernel.org" , "linux-riscv@lists.infradead.org" , "linux-kernel@vger.kernel.org" References: <20190802074620.115029-1-anup.patel@wdc.com> <20190802074620.115029-5-anup.patel@wdc.com> From: Paolo Bonzini Openpgp: preference=signencrypt Message-ID: <9f30d2b6-fa2c-22ff-e597-b9fbd1c700ff@redhat.com> Date: Fri, 2 Aug 2019 11:01:47 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <20190802074620.115029-5-anup.patel@wdc.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/08/19 09:47, Anup Patel wrote: > +static void kvm_riscv_check_vcpu_requests(struct kvm_vcpu *vcpu) > +{ > + if (kvm_request_pending(vcpu)) { > + /* TODO: */ > + > + /* > + * Clear IRQ_PENDING requests that were made to guarantee > + * that a VCPU sees new virtual interrupts. > + */ > + kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu); > + } > +} This kvm_check_request can go away (as it does in patch 6). > +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) > +{ > + int ret; > + unsigned long scause, stval; You need to wrap this with srcu_read_lock/srcu_read_unlock, otherwise stage2_page_fault can access freed memslot arrays. (ARM doesn't have this issue because it does not have to decode instructions on MMIO faults). That is, vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); > + /* Process MMIO value returned from user-space */ > + if (run->exit_reason == KVM_EXIT_MMIO) { > + ret = kvm_riscv_vcpu_mmio_return(vcpu, vcpu->run); > + if (ret) > + return ret; > + } > + > + if (run->immediate_exit) > + return -EINTR; > + > + vcpu_load(vcpu); > + > + kvm_sigset_activate(vcpu); > + > + ret = 1; > + run->exit_reason = KVM_EXIT_UNKNOWN; > + while (ret > 0) { > + /* Check conditions before entering the guest */ > + cond_resched(); > + > + kvm_riscv_check_vcpu_requests(vcpu); > + > + preempt_disable(); > + > + local_irq_disable(); > + > + /* > + * Exit if we have a signal pending so that we can deliver > + * the signal to user space. > + */ > + if (signal_pending(current)) { > + ret = -EINTR; > + run->exit_reason = KVM_EXIT_INTR; > + } Add an srcu_read_unlock here (and then the smp_store_mb can become smp_mb__after_srcu_read_unlock + WRITE_ONCE). > + /* > + * Ensure we set mode to IN_GUEST_MODE after we disable > + * interrupts and before the final VCPU requests check. > + * See the comment in kvm_vcpu_exiting_guest_mode() and > + * Documentation/virtual/kvm/vcpu-requests.rst > + */ > + smp_store_mb(vcpu->mode, IN_GUEST_MODE); > + > + if (ret <= 0 || > + kvm_request_pending(vcpu)) { > + vcpu->mode = OUTSIDE_GUEST_MODE; > + local_irq_enable(); > + preempt_enable(); > + continue; > + } > + > + guest_enter_irqoff(); > + > + __kvm_riscv_switch_to(&vcpu->arch); > + > + vcpu->mode = OUTSIDE_GUEST_MODE; > + vcpu->stat.exits++; > + > + /* Save SCAUSE and STVAL because we might get an interrupt > + * between __kvm_riscv_switch_to() and local_irq_enable() > + * which can potentially overwrite SCAUSE and STVAL. > + */ > + scause = csr_read(CSR_SCAUSE); > + stval = csr_read(CSR_STVAL); > + > + /* > + * We may have taken a host interrupt in VS/VU-mode (i.e. > + * while executing the guest). This interrupt is still > + * pending, as we haven't serviced it yet! > + * > + * We're now back in HS-mode with interrupts disabled > + * so enabling the interrupts now will have the effect > + * of taking the interrupt again, in HS-mode this time. > + */ > + local_irq_enable(); > + > + /* > + * We do local_irq_enable() before calling guest_exit() so > + * that if a timer interrupt hits while running the guest > + * we account that tick as being spent in the guest. We > + * enable preemption after calling guest_exit() so that if > + * we get preempted we make sure ticks after that is not > + * counted as guest time. > + */ > + guest_exit(); > + > + preempt_enable(); And another srcu_read_lock here. Using vcpu->srcu_idx instead of a local variable also allows system_opcode_insn to wrap kvm_vcpu_block with a srcu_read_unlock/srcu_read_lock pair. > + ret = kvm_riscv_vcpu_exit(vcpu, run, scause, stval); > + } > + > + kvm_sigset_deactivate(vcpu); And finally srcu_read_unlock here. Paolo > + vcpu_put(vcpu); > + return ret; > +} > diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.c > new file mode 100644 > index 000000000000..e4d7c8f0807a > --- /dev/null > +++ b/arch/riscv/kvm/vcpu_exit.c > @@ -0,0 +1,35 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2019 Western Digital Corporation or its affiliates. > + * > + * Authors: > + * Anup Patel > + */ > + > +#include > +#include > +#include > + > +/** > + * kvm_riscv_vcpu_mmio_return -- Handle MMIO loads after user space emulation > + * or in-kernel IO emulation > + * > + * @vcpu: The VCPU pointer > + * @run: The VCPU run struct containing the mmio data > + */ > +int kvm_riscv_vcpu_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run) > +{ > + /* TODO: */ > + return 0; > +} > + > +/* > + * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on > + * proper exit to userspace. > + */ > +int kvm_riscv_vcpu_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, > + unsigned long scause, unsigned long stval) > +{ > + /* TODO: */ > + return 0; > +} > diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c > new file mode 100644 > index 000000000000..ac0211820521 > --- /dev/null > +++ b/arch/riscv/kvm/vm.c > @@ -0,0 +1,79 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2019 Western Digital Corporation or its affiliates. > + * > + * Authors: > + * Anup Patel > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log) > +{ > + /* TODO: To be added later. */ > + return -ENOTSUPP; > +} > + > +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) > +{ > + int r; > + > + r = kvm_riscv_stage2_alloc_pgd(kvm); > + if (r) > + return r; > + > + return 0; > +} > + > +void kvm_arch_destroy_vm(struct kvm *kvm) > +{ > + int i; > + > + for (i = 0; i < KVM_MAX_VCPUS; ++i) { > + if (kvm->vcpus[i]) { > + kvm_arch_vcpu_destroy(kvm->vcpus[i]); > + kvm->vcpus[i] = NULL; > + } > + } > +} > + > +int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > +{ > + int r; > + > + switch (ext) { > + case KVM_CAP_DEVICE_CTRL: > + case KVM_CAP_USER_MEMORY: > + case KVM_CAP_DESTROY_MEMORY_REGION_WORKS: > + case KVM_CAP_ONE_REG: > + case KVM_CAP_READONLY_MEM: > + case KVM_CAP_MP_STATE: > + case KVM_CAP_IMMEDIATE_EXIT: > + r = 1; > + break; > + case KVM_CAP_NR_VCPUS: > + r = num_online_cpus(); > + break; > + case KVM_CAP_MAX_VCPUS: > + r = KVM_MAX_VCPUS; > + break; > + case KVM_CAP_NR_MEMSLOTS: > + r = KVM_USER_MEM_SLOTS; > + break; > + default: > + r = 0; > + break; > + } > + > + return r; > +} > + > +long kvm_arch_vm_ioctl(struct file *filp, > + unsigned int ioctl, unsigned long arg) > +{ > + return -EINVAL; > +} >