Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp284608imm; Fri, 1 Jun 2018 00:22:23 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKj5W0t0Z55mZNKxuOoSvSpNpbnSFdhYrDA0D08PmjdwXE4bSoJkFPJqTKbadXBC8qQpfY8 X-Received: by 2002:a63:9d8e:: with SMTP id i136-v6mr7775941pgd.288.1527837743557; Fri, 01 Jun 2018 00:22:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527837743; cv=none; d=google.com; s=arc-20160816; b=Mw6lnHnmZdPZeVBkWZkC5hOwc/OteZJVnfKfXwVX6fdSb5BNVrAEN0eIvgpC5Li9kq FDzGv2JffIEQuJetOAsngQ4OhKtQ9H2rdPz36Tkxu94jlCvts725zoIRHP6qxjPJePkL IzK3HkCS2pEkrJZyDD5O6uNLpHJTFQ6TdCcAFv5HDSW2/leS7OmYTH2cnq8YvCWVNZCU vMkPgWFxz8OhxiHMyOXMtSiS++AW5thC3fBS+ALJlvKwDLzBOp5pVE8yQquDos4taJEW MKkFc83h0OvUwlqoNoPkErpr8kJ187VrePIkylwuvPIlytg2HgK80ztFfHZ821tFqik/ 9hvQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=puK8u8uMBFEgYOaoEP3NWSESDRAHnD9Bbp8K+fovE3M=; b=JcQ6r5h/JHeYO0/25d9jMbIGJhTX/5ZQOWbpO8s0iaeSJRmyCF7ibMZu4+35kzHiN3 6IZpuNXUSQ8+FHWiAiqonO05TWjZlpdyVZcc457N7alcG7Ng0v/W7tNIFH3E3TmCpmsA tTB6TL4nidcEAYNTWl+KnrWct1SgPB9d69RElcJJIxshQTwZL4Hub1yG4MyYlDCBbByL 6xG8bfVIJYweSmEWIdvjGEcGhRRxH8+8TeotZil6IgZGhCSi6qMNyNJTbNTfF3lf9LkS S/tT1mM8h87lhnWWpvu9DCl53UuZ0P6ItEj0tyxakXsOGRyYl5/u9ViB4gVIRWnj0tJH G3TQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y1-v6si38392583plt.316.2018.06.01.00.22.07; Fri, 01 Jun 2018 00:22:23 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750968AbeFAHVk (ORCPT + 99 others); Fri, 1 Jun 2018 03:21:40 -0400 Received: from szxga06-in.huawei.com ([45.249.212.32]:36075 "EHLO huawei.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1750796AbeFAHVi (ORCPT ); Fri, 1 Jun 2018 03:21:38 -0400 Received: from DGGEMS406-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id 93F50DF7030E3; Fri, 1 Jun 2018 15:21:33 +0800 (CST) Received: from [127.0.0.1] (10.142.68.147) by DGGEMS406-HUB.china.huawei.com (10.3.19.206) with Microsoft SMTP Server id 14.3.382.0; Fri, 1 Jun 2018 15:21:33 +0800 Subject: Re: [PATCH v1 2/2] arm64: handle NOTIFY_SEI notification by the APEI driver To: James Morse , Mark Rutland CC: , , , , , , , , , , , , References: <1527770506-8076-1-git-send-email-gengdongjiu@huawei.com> <1527770506-8076-3-git-send-email-gengdongjiu@huawei.com> <20180531110115.uglmicy3nzwfoyx3@lakrids.cambridge.arm.com> <71afa669-e3c5-979e-da5b-1d9cb7056fd6@arm.com> From: gengdongjiu Message-ID: Date: Fri, 1 Jun 2018 15:21:11 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <71afa669-e3c5-979e-da5b-1d9cb7056fd6@arm.com> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.142.68.147] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Mark, James, Thanks the review. On 2018/6/1 0:51, James Morse wrote: > Hi Mark, Dongjiu Geng, > > On 31/05/18 12:01, Mark Rutland wrote: >> In do_serror() we already handle nmi_{enter,exit}(), so there's no need >> for that here. > > Even better: nmi_enter() has a BUG_ON(in_nmi()). There are two places call the arm64_is_fatal_ras_serror(): 1. do_serror() 2. kvm_handle_guest_serror() Yes, the do_serror() already handle nmi_{enter,exit}(), so the arm64_is_fatal_ras_serror() does not need to handle it again. For the kvm_handle_guest_serror(), it does not handle the nmi_{enter,exit}(), so should we add nmi_{enter,exit}() in kvm_handle_guest_serror() before calling arm64_is_fatal_ras_serror()? For the NOTIFY_SEA, I do not know why handle_guest_sea() does not handle the nmi_{enter,exit}(). James, does we miss it in handle_guest_sea()? > > >> TBH, I don't understand why do_sea() does that conditionally today. >> Unless there's some constraint I'm missing, > > APEI uses a different fixmap entry and locks when in_nmi(). This was because we > may interrupt the irq-masked region in APEI that was using the regular memory. > (e.g. the 'polled' notification, or something backed by an interrupt.) But, > Borislav has spotted other things in here that are broken[0]. I'm working on > rolling all that into 'v5' of the in_nmi() rework stuff. > > We currently get away with this on arm because 'SEA' is the only NMI-like thing, > and it occurs synchronously. The problem cases are all also cases where the > kernel text is corrupt, which we can't possibly hope to handle. > > For NOTIFY_SDEI and NOTIFY_SEI this is the wrong pattern as these are > asynchronous. do_serror() has already done most of the work for NOTIFY_SEI, but > we need to use the estatus queue in APEI, which is currently x86 only. I think this patch can based on you 'v5' of the in_nmi() rework stuff > > >> I think it would make more >> sense to do that regardless of whether the interrupted context had >> interrupts enabled. James -- does that make sense to you? >> >> If you update the prior patch with a stub for !CONFIG_ACPI_APEI_SEI, you >> can simplify all of the above additions down to: >> >> if (!ghes_notify_sei()) >> return; >> >> ... which would look a lot nicer. > > The code that calls ghes_notify_sei() may need calling by KVM too, but its > default action to an 'unclaimed' SError will be different. > Because of the race between memory_failure() and return-to-userspace, we may > need to kick the irq work queue (if we can), as we return from do_serror(). [1] > and [2] provide an example for NOTIFY_SEA. SDEI does this by returning to the > kernel through the IRQ handler, (which handles the KVM case too). I can kick the IRQ work queue as you do for the NOTIFY_SEA and NOTIFY_SDEI. > > > I think this series is unsafe until we can use the estatus queue in APEI. Its > also missing the handling for an SError interrupting a KVM guest. how about this series is based on your patches that uses the estatus queue in APEI to make it safe? when an SError interrupting a KVM guest, it will trap to hypervisor, hypervisor will call below software stack to handle it: kvm_handle_guest_serror()->arm64_is_fatal_ras_serror()->ghes_notify_sei() so it already handles the case that an SError interrupting a KVM guest. > > > Thanks, > > James > > [0] https://www.spinics.net/lists/arm-kernel/msg653332.html > [1] https://www.spinics.net/lists/arm-kernel/msg649237.html > [2] https://www.spinics.net/lists/arm-kernel/msg649239.html > > . >