Subject: Re: Queries on ARM SDEI Linux kernel code
To: James Morse
Cc: linux-arm-kernel@lists.infradead.org, lkml
From: Neeraj Upadhyay
Date: Wed, 21 Oct 2020 23:01:41 +0530
References: <1dcda05c-5235-fd0d-087e-a32772e05f97@arm.com>
In-Reply-To: <1dcda05c-5235-fd0d-087e-a32772e05f97@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi James,

Sorry for the late reply. Thanks for your comments!

On 10/16/2020 9:57 PM, James Morse wrote:
> Hi Neeraj,
>
> On 15/10/2020 07:07, Neeraj Upadhyay wrote:
>> 1. Looks like interrupt bind interface (SDEI_1_0_FN_SDEI_INTERRUPT_BIND) is not available
>> for clients to use; can you please share information on
>> why it is not provided?
>
> There is no compelling use-case for it, and its very complex to support as the driver can
> no longer hide things like hibernate.
>
> Last time I looked, it looked like the SDEI driver would need to ask the irqchip to
> prevent modification while firmware re-configures the irq. I couldn't work out how this
> would work if the irq is in-progress on another CPU.
>

Got it. I will think in this direction, on how to achieve this.

> The reasons to use bound-interrupts can equally be supported with an event provided by
> firmware.
>
>

Ok, I will explore in that direction.

>> While trying to dig information on this, I saw that [1] says:
>>   Now the hotplug callbacks save nothing, and restore the OS-view of registered/enabled.
>> This makes bound-interrupts harder to work with.
>
>> Based on this comment, the changes from v4 [2], which I could understand is, cpu down path
>> does not save the current event enable status, and we rely on the enable status
>> `event->reenable', which is set, when register/unregister, enable/disable calls are made;
>> this enable status is used during cpu up path, to decide whether to reenable the interrupt.
>
>> Does this make, bound-interrupts harder to work with? how? Can you please explain? Or
>> above save/restore is not the reason and you meant something else?
>
> If you bind a level-triggered interrupt, how does firmware know how to clear the interrupt
> from whatever is generating it?
>
> What happens if the OS can't do this either, as it needs to allocate memory, or take a
> lock, which it can't do in nmi context?
>
>

Ok, makes sense.

> The people that wrote the SDEI spec's answer to this was that the handler can disable the
> event from inside the handler... and firmware will do, something, to stop the interrupt
> screaming.
>
> So now an event can become disabled anytime its registered, which makes it more
> complicated to save/restore.
>
>
>> Also, does shared bound interrupts
>
> Shared-interrupts as an NMI made me jump. But I think you mean a bound interrupt as a
> shared event. i.e. and SPI not a PPI.
>
>

Sorry, I should have worded that properly; yes, I meant an SPI as a shared
event.

>> also have the same problem, as save/restore behavior
>> was only for private events?
>
> See above, the problem is the event disabling itself.
>

This makes sense now.

> Additionally those changes to unregister the private-event mean the code can't tell the
> difference between cpuhp and hibernate... only hibernate additionally loses the state in
> firmware.
>
>

Got it!

>> 2. SDEI_EVENT_SIGNAL api is not provided? What is the reason for it? Its handling has the
>> same problems, which are there for bound interrupts?
>
> Its not supported as no-one showed up with a use-case.
> While firmware is expected to back it with a PPI, its doesn't have the same problems as
> bound-interrupts as its not an interrupt the OS ever knows about.
>
>
>> Also, if it is provided, clients need to register event 0 ? Vendor events or other event
>> nums are not supported, as per spec.
>
> Ideally the driver would register the event, and provide a call_on_cpu() helper to trigger
> it. This should fit in with however the GIC's PMR based NMI does its PPI based
> crash/stacktrace call so that the caller doesn't need to know if its back by IRQ, pNMI or
> SDEI.
>
>

Ok; I will explore how PMR based NMIs work. I thought it was SGI based, but
I will recheck.

>> 3. Can kernel panic() be triggered from sdei event handler?
>
> Yes,
>
>
>> Is it a safe operation?
>
> panic() wipes out the machine... did you expect it to keep running?

I wanted to check the case where panic() triggers the kexec/kdump path into
the capture kernel.

> What does safe mean here?
>

I think I didn't put it correctly; I meant to ask what scenarios can occur
in this case, and you explained one below, thanks!

> You should probably call nmi_panic() if there is the risk that the event occurred during
> panic() on the same CPU, as it would otherwise just block.
>
>
>> The spec says, synchronous exceptions should not be triggered; I think panic
>> won't do it; but anything which triggers a WARN
>> or other sync exception in that path can cause undefined behavior. Can you share your
>> thoughts on this?
>
> What do you mean by undefined behaviour?
>

I was thinking: if an SDEI event preempts EL1 at the point where EL1 has
just entered an exception and hasn't yet captured registers such as
spsr_el1 and elr_el1, what will the behavior be?

> SDEI was originally to report external abort to the OS in regions where the OS can't take
> an exception because the exception-registers are live, just after and exception and just
> before eret.
>
> If you take another exception from the NMI handler, chances are you're going to go back
> round the loop again, only this time firmware can't inject the SDEI event, so it has to
> reboot.
>

Got it.

> If you know it might cause an exception, you shouldn't do it in NMI context.
>
>

Ok, I understand now.

>> "The handler code should not enable asynchronous exceptions by clearing any of the
>> PSTATE.DAIF bits, and should not cause synchronous exceptions to the client Exception level."
>
>
> What are you using this thing for?
>
>

The use case is a watchdog SPI interrupt, which we want to bind to an SDEI
event. Below is the flow:

wdog expiry -> SDEI event -> HLOS panic -> trigger kexec/kdump

Thanks
Neeraj

> Thanks,
>
> James
>

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation