Date: Thu, 5 Sep 2019 14:16:38 +0200
From: Christoffer Dall
To: Heinrich Schuchardt
Cc: Stefan Hajnoczi, Daniel P. Berrangé, Marc Zyngier,
    linux-kernel@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
    linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 1/1] KVM: inject data abort if instruction cannot be decoded
Message-ID: <20190905121638.GD4320@e113682-lin.lund.arm.com>
References: <20190904180736.29009-1-xypron.glpk@gmx.de>
    <20190905092039.GG32415@stefanha-x1.localdomain>
    <561eae08-c5f1-9543-275c-0da0a85cd7df@gmx.de>
In-Reply-To: <561eae08-c5f1-9543-275c-0da0a85cd7df@gmx.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Heinrich,

On Thu, Sep 05, 2019 at 02:01:36PM +0200, Heinrich Schuchardt wrote:
> On 9/5/19 11:20 AM, Stefan Hajnoczi wrote:
> > On Wed, Sep 04, 2019 at 08:07:36PM +0200, Heinrich Schuchardt wrote:
> > > If an application tries to access memory that is not mapped, an error
> > > ENOSYS, "load/store instruction decoding not implemented" may occur.
> > > QEMU will hang with a register dump.
> > >
> > > Instead create a data abort that can be handled gracefully by the
> > > application running in the virtual environment.
> > >
> > > Now the virtual machine can react to the event in the most appropriate
> > > way - by recovering, by writing an informative log, or by rebooting.
> > >
> > > Signed-off-by: Heinrich Schuchardt
> > > ---
> > >  virt/kvm/arm/mmio.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
> > > index a8a6a0c883f1..0cbed7d6a0f4 100644
> > > --- a/virt/kvm/arm/mmio.c
> > > +++ b/virt/kvm/arm/mmio.c
> > > @@ -161,8 +161,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > >  		if (ret)
> > >  			return ret;
> > >  	} else {
> > > -		kvm_err("load/store instruction decoding not implemented\n");
> > > -		return -ENOSYS;
> > > +		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> > > +		return 1;
> >
> > I see this more as a temporary debugging hack than something to merge.
> >
> > It sounds like in your case the guest environment provided good
> > debugging information and you preferred it over debugging this from the
> > host side. That's fine, but allowing the guest to continue running in
> > the general case makes it much harder to track down the root cause of a
> > problem because many guest CPU instructions may be executed after the
> > original problem occurs. Other guest software may fail silently in
> > weird ways. IMO it's best to fail early.
> >
> > Stefan
> >
>
> As virtual machine are ubiquitous, expect also mission critical system
> to run on them. At development time halting a machine may be a good
> idea. In production this is often the worst solution. Rebooting may be
> essential for survival.
>
> For an anecdotal example see:
> https://www.hq.nasa.gov/alsj/a11/a11.1201-pa.html
>
> I am convinced that leaving it to the guest to decide how to react is
> the best choice.
>

Maintaining strong adherence to the architecture is equally important,
and I'm sure we can find anecdotes to support how not doing the
expected, can also lead to disastrous outcomes.

Have you had a look at the suggested patch I sent? The idea is that we
can preserve existing legacy ABI, allow for a better debugging
experience, allow userspace to do emulation if it so wishes, and provide
a better error message if userspace doesn't handle this properly.

One thing we could change from my proposed patch would be to have KVM
inject the access as an external abort if the target address also
doesn't hit an MMIO device, which is by far the common scenario reported
here on the list.

Hopefully, a mission critical deployment based on KVM/Arm (scary as that
sounds), would use a recent and patched VMM (QEMU) that either causes
the external abort, or reboots the VM, as per the configuration of the
particular system in question.

Thanks,

    Christoffer
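
The userspace-side policy described in the last two paragraphs of the reply
can be sketched in C. This is only an illustration under stated assumptions:
every type and function name below (fault_policy, fault_action, mmio_region,
hits_mmio_device, handle_undecoded_access) is hypothetical and is not part of
KVM or QEMU, and the region table merely stands in for whatever device model a
real VMM uses. The sketch shows the decision only, not how an abort is
actually injected or how a VM is reset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names in this sketch are hypothetical; none is a KVM or QEMU API. */

/* What the VMM is configured to do when a faulting access hits no device. */
enum fault_policy { POLICY_INJECT_EXT_ABORT, POLICY_REBOOT_VM };

/* The outcome chosen for one undecodable guest access. */
enum fault_action {
	ACTION_EMULATE_MMIO,     /* address belongs to an emulated device */
	ACTION_INJECT_EXT_ABORT, /* report the fault to the guest */
	ACTION_REBOOT_VM,        /* reset the guest */
};

/* One emulated MMIO window in guest physical address space. */
struct mmio_region {
	uint64_t base;
	uint64_t size;
};

/* Return true if addr falls inside one of the n emulated MMIO regions. */
static bool hits_mmio_device(const struct mmio_region *regions, size_t n,
			     uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= regions[i].base &&
		    addr - regions[i].base < regions[i].size)
			return true;
	}
	return false;
}

/*
 * Decide how to react when the kernel hands userspace a guest access it
 * could not decode: emulate it if it targets a device, otherwise follow
 * the configured policy.
 */
static enum fault_action
handle_undecoded_access(const struct mmio_region *regions, size_t n,
			uint64_t addr, enum fault_policy policy)
{
	if (hits_mmio_device(regions, n, addr))
		return ACTION_EMULATE_MMIO;

	return policy == POLICY_REBOOT_VM ? ACTION_REBOOT_VM
					  : ACTION_INJECT_EXT_ABORT;
}
```

Keeping the policy a per-VM configuration value, rather than hard-coding one
reaction, mirrors the point made above: a development setup and a production
deployment may want different behaviour for the same fault.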