Subject: Re: [PATCH 1/1] KVM: inject data abort if instruction cannot be decoded
From: Marc Zyngier
To: Christoffer Dall, Peter Maydell
Cc: Daniel P. Berrangé, Heinrich Schuchardt, lkml - Kernel Mailing List,
    Stefan Hajnoczi, kvmarm@lists.cs.columbia.edu, arm-mail-list
Date: Thu, 5 Sep 2019 14:09:18 +0100
Message-ID: <4b6662bd-56e4-3c10-3b65-7c90828a22f9@kernel.org>
In-Reply-To: <20190905092223.GC4320@e113682-lin.lund.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/09/2019 10:22, Christoffer Dall wrote:
> On Thu, Sep 05, 2019 at 09:56:44AM +0100, Peter Maydell wrote:
>> On Thu, 5 Sep 2019 at 09:52, Marc Zyngier wrote:
>>>
>>> On Thu, 05 Sep 2019 09:16:54 +0100,
>>> Peter Maydell wrote:
>>>> This is true, but the problem is that barfing out to userspace
>>>> makes it harder to debug the guest because it means that
>>>> the VM is immediately destroyed, whereas AIUI if we
>>>> inject some kind of exception then (assuming you're set up
>>>> to do kernel-debug via gdbstub) you can actually examine
>>>> the offending guest code with a debugger because at least
>>>> your VM is still around to inspect...
>>>
>>> To Christoffer's point, I find the benefit a bit dubious. Yes, you get
>>> an exception, but the instruction that caused it may be completely
>>> legal (store with post-increment, for example), leading to an even
>>> more puzzled developer (that exception should never have been
>>> delivered the first place).
>>
>> Right, but the combination of "host kernel prints a message
>> about an unsupported load/store insn" and "within-guest debug
>> dump/stack trace/etc" is much more useful than just having
>> "host kernel prints message" and "QEMU exits"; and it requires
>> about 3 lines of code change...
>>
>>> I'm far more in favour of dumping the state of the access in the run
>>> structure (much like we do for a MMIO access) and let userspace do
>>> something about it (such as dumping information on the console or
>>> breaking). It could even inject an exception *if* the user has asked
>>> for it.
>>
>> ...whereas this requires agreement on a kernel-userspace API,
>> larger changes in the kernel, somebody to implement the userspace
>> side of things, and the user to update both the kernel and QEMU.
>> It's hard for me to see that the benefit here over the 3-line
>> approach really outweighs the extra effort needed. In practice
>> saying "we should do this" is saying "we're going to do nothing",
>> based on the historical record.
>>
>
> How about something like the following (completely untested, liable for
> ABI discussions etc. etc., but for illustration purposes).
>
> I think it raises the question (and likely many other) of whether we can
> break the existing 'ABI' and change behavior for missing ISV
> retrospectively for legacy user space when the issue has occurred?
>
> Someone might have written code that reacts to the -ENOSYS, so I've
> taken the conservative approach for this for the time being.
>
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 8a37c8e89777..19a92c49039c 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -76,6 +76,14 @@ struct kvm_arch {
>
>  	/* Mandated version of PSCI */
>  	u32 psci_version;
> +
> +	/*
> +	 * If we encounter a data abort without valid instruction syndrome
> +	 * information, report this to user space. User space can (and
> +	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
> +	 * supported.
> +	 */
> +	bool return_nisv_io_abort_to_user;
>  };
>
>  #define KVM_NR_MEM_OBJS 40
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f656169db8c3..019bc560edc1 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -83,6 +83,14 @@ struct kvm_arch {
>
>  	/* Mandated version of PSCI */
>  	u32 psci_version;
> +
> +	/*
> +	 * If we encounter a data abort without valid instruction syndrome
> +	 * information, report this to user space. User space can (and
> +	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
> +	 * supported.
> +	 */
> +	bool return_nisv_io_abort_to_user;
>  };
>
>  #define KVM_NR_MEM_OBJS 40
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 5e3f12d5359e..a4dd004d0db9 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -235,6 +235,7 @@ struct kvm_hyperv_exit {
>  #define KVM_EXIT_S390_STSI 25
>  #define KVM_EXIT_IOAPIC_EOI 26
>  #define KVM_EXIT_HYPERV 27
> +#define KVM_EXIT_ARM_NISV 28
>
>  /* For KVM_EXIT_INTERNAL_ERROR */
>  /* Emulate instruction failed. */
> @@ -996,6 +997,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
>  #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
>  #define KVM_CAP_PMU_EVENT_FILTER 173
> +#define KVM_CAP_ARM_NISV_TO_USER 174
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 35a069815baf..2ce94bd9d4a9 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -98,6 +98,26 @@ int kvm_arch_check_processor_compat(void)
>  	return 0;
>  }
>
> +int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> +			    struct kvm_enable_cap *cap)
> +{
> +	int r;
> +
> +	if (cap->flags)
> +		return -EINVAL;
> +
> +	switch (cap->cap) {
> +	case KVM_CAP_ARM_NISV_TO_USER:
> +		r = 0;
> +		kvm->arch.return_nisv_io_abort_to_user = true;
> +		break;
> +	default:
> +		r = -EINVAL;
> +		break;
> +	}
> +
> +	return r;
> +}
>
>  /**
>   * kvm_arch_init_vm - initializes a VM data structure
> @@ -196,6 +216,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_MP_STATE:
>  	case KVM_CAP_IMMEDIATE_EXIT:
>  	case KVM_CAP_VCPU_EVENTS:
> +	case KVM_CAP_ARM_NISV_TO_USER:
>  		r = 1;
>  		break;
>  	case KVM_CAP_ARM_SET_DEVICE_ADDR:
> @@ -673,6 +694,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		ret = kvm_handle_mmio_return(vcpu, vcpu->run);
>  		if (ret)
>  			return ret;
> +	} else if (run->exit_reason == KVM_EXIT_ARM_NISV) {
> +		kvm_inject_undefined(vcpu);

Just to make sure I understand: Is the expectation here that userspace
could clear the exit reason if it managed to handle the exit? And
otherwise we'd inject an UNDEF on reentry?

>  	}
>
>  	if (run->immediate_exit)
> diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
> index 6af5c91337f2..62e6ef47a6de 100644
> --- a/virt/kvm/arm/mmio.c
> +++ b/virt/kvm/arm/mmio.c
> @@ -167,8 +167,15 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		if (ret)
>  			return ret;
>  	} else {
> -		kvm_err("load/store instruction decoding not implemented\n");
> -		return -ENOSYS;
> +		if (vcpu->kvm->arch.return_nisv_io_abort_to_user) {
> +			run->exit_reason = KVM_EXIT_ARM_NISV;
> +			run->mmio.phys_addr = fault_ipa;

We could also record whether that's a read or a write (WnR should still
be valid). Actually, we could store a sanitized version of the ESR.

> +			vcpu->stat.mmio_exit_user++;
> +			return 0;
> +		} else {
> +			kvm_info("encountered data abort without syndrome info\n");

My only issue with this is that the previous message has been sort of
documented...

Thanks,

	M.
-- 
Jazz is not dead, it just smells funny...
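
For illustration only, the opt-in behaviour in the mmio.c hunk, combined with the idea of handing userspace a sanitized ESR, could be modelled roughly as below. This is a standalone plain-C sketch, not kernel code: every `mock_*` name, the `struct mock_run` layout, and the choice of which ISS bits survive sanitization are assumptions made for the example. Only the ISV, WnR and DFSC bit positions come from the architectural data-abort syndrome encoding, and 28 is the exit-reason value proposed in the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions in the data-abort ISS field of ESR_ELx. */
#define ISS_ISV   (UINT32_C(1) << 24)  /* instruction syndrome valid */
#define ISS_WNR   (UINT32_C(1) << 6)   /* write-not-read */
#define ISS_DFSC  UINT32_C(0x3f)       /* data fault status code */

/* Hypothetical stand-in; 28 is the value proposed in the quoted patch. */
#define MOCK_EXIT_ARM_NISV 28

/* Hypothetical, heavily reduced model of the run structure. */
struct mock_run {
	int      exit_reason;
	uint64_t fault_ipa;
	uint32_t esr_iss;
};

/*
 * Assumption: "sanitized" means keeping only WnR and the fault status
 * code, so no stale syndrome bits leak to userspace.
 */
static uint32_t mock_sanitize_iss(uint32_t esr)
{
	return esr & (ISS_WNR | ISS_DFSC);
}

/*
 * Models the no-ISV branch: without opt-in, keep the legacy -ENOSYS;
 * with opt-in, fill in the run structure and return 0 to signal an
 * exit to userspace.
 */
static int mock_handle_nisv_abort(bool opt_in, uint32_t esr,
				  uint64_t fault_ipa, struct mock_run *run)
{
	assert(run != NULL);
	assert(!(esr & ISS_ISV));	/* only the no-ISV case is modelled */

	if (!opt_in)
		return -ENOSYS;

	run->exit_reason = MOCK_EXIT_ARM_NISV;
	run->fault_ipa = fault_ipa;
	run->esr_iss = mock_sanitize_iss(esr);
	return 0;
}

/* Userspace side: was the faulting access a write? */
static bool mock_is_write(const struct mock_run *run)
{
	return (run->esr_iss & ISS_WNR) != 0;
}
```

With something along these lines, userspace that has opted in gets the faulting IPA plus the read/write direction, while legacy userspace keeps seeing -ENOSYS, which matches the conservative approach taken in the patch.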