From: Peter Maydell
Date: Thu, 10 Jan 2019 13:25:50 +0000
Subject: Re: [RFC RESEND PATCH] kvm: arm64: export memory error recovery capability to user space
To: gengdongjiu
Cc: James Morse, Radim Krčmář, Jonathan Corbet, Christoffer Dall,
    Marc Zyngier, Catalin Marinas, Will Deacon, kvm-devel,
    "open list:DOCUMENTATION", lkml - Kernel Mailing List, arm-mail-list
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 10 Jan 2019 at 12:09, gengdongjiu wrote:
> Peter, I summarize James's main idea, James think QEMU does not needs
> to check *something* if Qemu support firmware-first.
> What do we do for your comments?

Unless I'm missing something, the code in your most recent patchset
attempts to update an ACPI table when it gets the SIGBUS from the
host kernel without doing anything to check whether it has ever
created the ACPI table (and set up the QEMU global variable that
tells the code where it is in the guest memory) in the first place.
I don't see how that can work.

> >> I think one question here which it would be good to answer is:
> >> if we are modelling a guest and we haven't specifically provided
> >> it an ACPI table to tell it about memory errors, what do we do
> >> when we get a sigbus from the host? We have basically two choices:
> >> (1) send the guest an SError (aka asynchronous external abort)
> >> anyway (with no further info about what the memory error is)
> >
> > For an AR signal an external abort is valid. Its up to the implementation
> > whether these are synchronous or asynchronous. Qemu can only take a signal for
> > something that was synchronous, so you can choose between the two.
> > Synchronous external abort is marginally better as an unaware OS knows its
> > affects this thread, and may be able to kill it.
> > SError with an imp-def ESR is indistinguishable from 'part of the soc fell out',
> > and should always result in a panic().
> >
> >
> >> (2) just stop QEMU (as we would for a memory error in QEMU's
> >> own memory)
> >
> > This is also valid. A machine may take external-abort to EL3 and then
> > reboot/crash/burn.

We should decide which of these we want to do, and have a comment
explaining what we're doing. If I'm reading your current patchset
correctly, it does neither -- if it can't record the fault in the
ACPI table it just ignores it without either stopping QEMU or
delivering an SError.

I think I favour option (2) here.

thanks
-- PMM