Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp947931ybl; Wed, 29 Jan 2020 12:30:06 -0800 (PST) X-Google-Smtp-Source: APXvYqxeIV4m2jiqoil/V5QQUg9UvN47axbk+BJeu52yLSGr/PJ333Sak/Xx/GYq0Yn3r3whz0MD X-Received: by 2002:aca:c782:: with SMTP id x124mr559150oif.88.1580329806308; Wed, 29 Jan 2020 12:30:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580329806; cv=none; d=google.com; s=arc-20160816; b=PI5+h0CnTZRyqdWTKJ7FAni6cpYthSXbcktdssR+Utme33Gz1vU2BWBe+JvVlJ7bCK Rg+uBf0qxyUIZhuPxLCcUO2nPfGLYRQdTwJSO4ELRdgse0GzOC0tXWkLK1I/At/FhDYp xKfu/rRbM8ODBKtFYoSJRULC4+TgTX7Y4I11IEkqoWC9h+peeiXgxNkgU6TIC4k1iHZE n7g4jEeiNyLclp7638UWdhSmrV3K9264jjAUGjuLRaB+rQmwvpktMOV/iQifNpIvoA8j Xo6dbWDK5bG9LmJZ1jKXUmeq3KCb7UQ7MeQRDuiA5hucMrWS1/I74rUPYJKJFYsu2gag nyAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=xUYJpOBIHFT2Q+fbwYwCce8J1hD/zrDr2WkofLApH0M=; b=FZDtG/OWah3JtjyhGYtAGRE+Peiajt6SaS0oc4hXyYVUuh1aYw1dCSQ2LCVyS1nR+3 QpqNXmklH7Nu031RpXLzdQndH/dMjJj6HB5dd+A3/7ahwSnPOP9oC3NKjXNZerE1abcG gACt92ixYILTvt9VsiW+V0BET7PSgw6UAOFJNrudwHxImpHGqPnEXgPiT0aUtlNVdmtc EGOy8K23WokCz33oCDwcEmLJ+vF38dfeZ6+SbycDxSQkCjNhlN4eAmqf6ARhXOMHucAt c1v23w2E2bNSZDWBfr/Le1GIL2s39xapdGqNNfiz2lFWos5pKIh1S6fvO2vSc49+bRKR V6qg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=qagYbN+L; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j9si1575208otq.317.2020.01.29.12.29.50; Wed, 29 Jan 2020 12:30:06 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=qagYbN+L; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727097AbgA2UPu (ORCPT + 99 others); Wed, 29 Jan 2020 15:15:50 -0500 Received: from mail-pg1-f195.google.com ([209.85.215.195]:41622 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727025AbgA2UPu (ORCPT ); Wed, 29 Jan 2020 15:15:50 -0500 Received: by mail-pg1-f195.google.com with SMTP id x8so363300pgk.8 for ; Wed, 29 Jan 2020 12:15:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=xUYJpOBIHFT2Q+fbwYwCce8J1hD/zrDr2WkofLApH0M=; b=qagYbN+Lmcm1+NIrT2EoYkbq6Law+pIXujyl5Q4o/Ek5a/F4ymxJZn6re8z2qHr1B6 pol5R9hEoDKdtU3BpIVFgvarKP3eAagbF2dcWuwCr1Z5N4uCcRRiGnp0pilC0phNDc3Z Pg1rUUbhcBZ+J538vgIoDhFsQo/IsiYBfhGl2ZfeFl9Y/gdZJtBx2cbw6JZ0YEOoKRyQ iDs+/KJ5FaeM/gkUwNR/wUb50fBJEYfO9R8ncccFI04gk8Qmst9cMP1v8bS+JkYVbp8T GNMTOCx5Oba595Otov+HLqHYjfzbzM0wAKVMdsHgld4wxNPfXlFre6P+EIE349bx4Dm2 Z8bQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=xUYJpOBIHFT2Q+fbwYwCce8J1hD/zrDr2WkofLApH0M=; b=N4d92EawJPNAUriKSw26S7TpYTAFHpa7d54HsqEdv4A9sQNvjdIMdM1xHYphr15meh dEYqwZy50I/Gzxv7+vo1yu/F+CQqRBdrH1c8TFUKhTL3NdiUxEwEphldc8C9EVfc8dfh +KplaxWMFavGIoPdFzUx+VangxFVSvdn+pU0655U9Zh77IF1BuWrHLp5ehILujewoou0 3dXZjE2lC8HB3zaqY7K3CJpYKnX01LZw4RRogQk+1c6OrsQ1v7xAddly26qn1GjP4s/I 6n5wkvXb7r3BeX8MdWQ6YcBv6D08xgTwdVLd4kcCnYihMKzliNS5Gcq5DEfz8wSWks/i 8/6A== X-Gm-Message-State: APjAAAUb0bPxM+8s1g6kq3LWTDFqBXmec0SIs2D4QE+rNvvGnkQCPwJq EFA8QyzZm+6VQeLCXXonaVsiEQ== X-Received: by 2002:a63:454a:: with SMTP id u10mr884844pgk.248.1580328949182; Wed, 29 Jan 2020 12:15:49 -0800 (PST) Received: from xps15 (S0106002369de4dac.cg.shawcable.net. [68.147.8.254]) by smtp.gmail.com with ESMTPSA id u13sm3474995pjn.29.2020.01.29.12.15.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 12:15:48 -0800 (PST) Date: Wed, 29 Jan 2020 13:15:46 -0700 From: Mathieu Poirier To: Arnaud POULIQUEN Cc: Bjorn Andersson , Rob Herring , Mark Rutland , Ohad Ben-Cohen , linux-arm-msm , devicetree@vger.kernel.org, Linux Kernel Mailing List , linux-remoteproc , Sibi Sankar , Rishabh Bhatnagar Subject: Re: [PATCH v2 7/8] remoteproc: qcom: q6v5: Add common panic handler Message-ID: <20200129201546.GA31696@xps15> References: <20191227053215.423811-1-bjorn.andersson@linaro.org> <20191227053215.423811-8-bjorn.andersson@linaro.org> <20200110212806.GD11555@xps15> <20200122193936.GB3261042@ripper> <20200123171524.GV1511@yoga> <8d92c4b5-4238-23d2-50fc-1a5bdfc2c67b@st.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jan 27, 2020 at 10:46:05AM +0100, Arnaud POULIQUEN wrote: > > > On 1/24/20 7:44 PM, Mathieu Poirier wrote: > > On Thu, 23 Jan 2020 at 10:49, Arnaud POULIQUEN wrote: > >> > >> Hi Bjorn, Mathieu > >> > >> On 1/23/20 6:15 PM, Bjorn Andersson wrote: > >>> On Thu 23 Jan 09:01 PST 2020, Mathieu Poirier wrote: > >>> > >>>> On Wed, 22 Jan 2020 at 12:40, Bjorn Andersson > >>>> wrote: > >>>>> > >>>>> On Fri 10 Jan 13:28 PST 2020, Mathieu Poirier wrote: > >>>>> > >>>>>> On Thu, Dec 26, 2019 at 09:32:14PM -0800, Bjorn Andersson wrote: > >>>>>>> Add a common panic handler that invokes a stop request and sleep enough > >>>>>>> to let the remoteproc flush it's caches etc in order to aid post mortem > >>>>>>> debugging. > >>>>>>> > >>>>>>> Signed-off-by: Bjorn Andersson > >>>>>>> --- > >>>>>>> > >>>>>>> Changes since v1: > >>>>>>> - None > >>>>>>> > >>>>>>> drivers/remoteproc/qcom_q6v5.c | 19 +++++++++++++++++++ > >>>>>>> drivers/remoteproc/qcom_q6v5.h | 1 + > >>>>>>> 2 files changed, 20 insertions(+) > >>>>>>> > >>>>>>> diff --git a/drivers/remoteproc/qcom_q6v5.c b/drivers/remoteproc/qcom_q6v5.c > >>>>>>> index cb0f4a0be032..17167c980e02 100644 > >>>>>>> --- a/drivers/remoteproc/qcom_q6v5.c > >>>>>>> +++ b/drivers/remoteproc/qcom_q6v5.c > >>>>>>> @@ -6,6 +6,7 @@ > >>>>>>> * Copyright (C) 2014 Sony Mobile Communications AB > >>>>>>> * Copyright (c) 2012-2013, The Linux Foundation. All rights reserved. > >>>>>>> */ > >>>>>>> +#include > >>>>>>> #include > >>>>>>> #include > >>>>>>> #include > >>>>>>> @@ -15,6 +16,8 @@ > >>>>>>> #include > >>>>>>> #include "qcom_q6v5.h" > >>>>>>> > >>>>>>> +#define Q6V5_PANIC_DELAY_MS 200 > >>>>>>> + > >>>>>>> /** > >>>>>>> * qcom_q6v5_prepare() - reinitialize the qcom_q6v5 context before start > >>>>>>> * @q6v5: reference to qcom_q6v5 context to be reinitialized > >>>>>>> @@ -162,6 +165,22 @@ int qcom_q6v5_request_stop(struct qcom_q6v5 *q6v5) > >>>>>>> } > >>>>>>> EXPORT_SYMBOL_GPL(qcom_q6v5_request_stop); > >>>>>>> > >>>>>>> +/** > >>>>>>> + * qcom_q6v5_panic() - panic handler to invoke a stop on the remote > >>>>>>> + * @q6v5: reference to qcom_q6v5 context > >>>>>>> + * > >>>>>>> + * Set the stop bit and sleep in order to allow the remote processor to flush > >>>>>>> + * its caches etc for post mortem debugging. > >>>>>>> + */ > >>>>>>> +void qcom_q6v5_panic(struct qcom_q6v5 *q6v5) > >>>>>>> +{ > >>>>>>> + qcom_smem_state_update_bits(q6v5->state, > >>>>>>> + BIT(q6v5->stop_bit), BIT(q6v5->stop_bit)); > >>>>>>> + > >>>>>>> + mdelay(Q6V5_PANIC_DELAY_MS); > >>>>>> > >>>>>> I really wonder if the delay should be part of the remoteproc core and > >>>>>> configurable via device tree. Wanting the remote processor to flush its caches > >>>>>> is likely something other vendors will want when dealing with a kernel panic. > >>>>>> It would be nice to see if other people have an opinion on this topic. If not > >>>>>> then we can keep the delay here and move it to the core if need be. > >>>>>> > >>>>> > >>>>> I gave this some more thought and what we're trying to achieve is to > >>>>> signal the remote processors about the panic and then give them time to > >>>>> react, but per the proposal (and Qualcomm downstream iirc) we will do > >>>>> this for each remote processor, one by one. > >>>>> > >>>>> So in the typical case of a Qualcomm platform with 4-5 remoteprocs we'll > >>>>> end up giving the first one a whole second to react and the last one > >>>>> "only" 200ms. > >>>>> > >>>>> Moving the delay to the core by iterating over rproc_list calling > >>>>> panic() and then delaying would be cleaner imo. > >>>> > >>>> I agree. > >>>> > >>>>> > >>>>> It might be nice to make this configurable in DT, but I agree that it > >>>>> would be nice to hear from others if this would be useful. > >>>> > >>>> I think the delay has to be configurable via DT if we move this to the > >>>> core. The binding can be optional and default to 200ms if not > >>>> present. > >>>> > >>> > >>> How about I make the panic() return the required delay and then we let > >>> the core sleep for MAX() of the returned durations? > > > > I like it. > > > >> That way the default > >>> is still a property of the remoteproc drivers - and 200ms seems rather > >>> arbitrary to put in the core, even as a default. > >> > >> I agree with Bjorn, the delay should be provided by the platform. > >> But in this case i wonder if it is simpler to just let the platform take care it? > > > > If I understand you correctly, that is what Bjorn's original > > implementation was doing and it had drawbacks. > Yes, > Please tell me if i missed something, the only drawback seems mentioned is the accumulative delay. Yes, that is correct. > Could you elaborate how to implement the delay in remote proc core for multi rproc instance. > Here is my view: > To optimize the delay it would probably be necessary to compute: > - the delay based on an initial date, > - the delay requested by each rproc instance, > - the delay elapsed in each rproc panic ops. > Feasible but not straight forward... > So I suppose that you are thinking about a solution based on the store of the max delay that would be applied after last panic() return? Yes > anyway, how do you determine the last rproc instance? seems that a prerequisite would be that the panic ops is mandatory... Each ->panic() should return the amount of time to way or 0 if no delay is required. If an rpoc doesn't implement ->panic() then it is treated as 0. From there wait for the maximum time that was collected. It would be possible to do something more complicated like taking timestamps everytime a ->panic() returns and optimize the time to wait for but that may be for a future set. The first implementation could go with an simple heuristic as detailed above. > > I'm not familiar with panic mechanism, but how panic ops are scheduled in SMP? Does panics ops would be treated in parallel (using msleep instead of mdelay)? > In this case delays could not be cumulative... The processor that triggered the panic sequentially runs the notifier registered with the panic_notifier_list. Other processors are instructed to take themselves offline. As such there won't be multiple ->panic() running concurrently. > > > > >> For instance for stm32mp1 the stop corresponds to the reset on the remote processor core. To inform the coprocessor about an imminent shutdown we use a signal relying on a mailbox (cf. stm32_rproc_stop). > >> In this case we would need a delay between the signal and the reset, but not after (no cache management). > > > > Here I believe you are referring to the upper limit of 500ms that is > > needed for the mbox_send_message() in stm32_rproc_stop() to complete. > > Since that is a blocking call I think it would fit with Bjorn's > > proposal above if a value of '0' is returned by rproc->ops->panic(). > > That would mean no further delays are needed (because the blocking > > mbox_send_message() would have done the job already). Let me know if > > I'm in the weeds. > Yes you are :), this is what i thought, if delay implemented in core. Not sure I understand your last reply but I _think_ we are saying the same thing. > > Regards, > Arnaud > > > > >> > >> Regards, > >> Arnaud > >>> > >>> Regards, > >>> Bjorn > >>>