Subject: Re: [PATCH v3 09/14] remoteproc: Deal with synchronisation when crashing
From: Arnaud POULIQUEN
To: Mathieu Poirier
Date: Wed, 29 Apr 2020 09:44:02 +0200
References: <20200424200135.28825-1-mathieu.poirier@linaro.org> <20200424200135.28825-10-mathieu.poirier@linaro.org>
In-Reply-To: <20200424200135.28825-10-mathieu.poirier@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Mathieu,

On 4/24/20 10:01 PM, Mathieu Poirier wrote:
> Refactor function rproc_trigger_recovery() in order to avoid
> reloading the firmware image when synchronising with a remote
> processor rather than booting it.  Also part of the process,
> properly set the synchronisation flag in order to properly
> recover the system.
>
> Signed-off-by: Mathieu Poirier
> ---
>  drivers/remoteproc/remoteproc_core.c     | 23 ++++++++++++++------
>  drivers/remoteproc/remoteproc_internal.h | 27 ++++++++++++++++++++++++
>  2 files changed, 43 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index ef88d3e84bfb..3a84a38ba37b 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1697,7 +1697,7 @@ static void rproc_coredump(struct rproc *rproc)
>   */
>  int rproc_trigger_recovery(struct rproc *rproc)
>  {
> -	const struct firmware *firmware_p;
> +	const struct firmware *firmware_p = NULL;
>  	struct device *dev = &rproc->dev;
>  	int ret;
>
> @@ -1718,14 +1718,16 @@ int rproc_trigger_recovery(struct rproc *rproc)
>  	/* generate coredump */
>  	rproc_coredump(rproc);
>
> -	/* load firmware */
> -	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> -	if (ret < 0) {
> -		dev_err(dev, "request_firmware failed: %d\n", ret);
> -		goto unlock_mutex;
> +	/* load firmware if need be */
> +	if (!rproc_needs_syncing(rproc)) {
> +		ret = request_firmware(&firmware_p, rproc->firmware, dev);
> +		if (ret < 0) {
> +			dev_err(dev, "request_firmware failed: %d\n", ret);
> +			goto unlock_mutex;
> +		}

If we started in syncing mode then rproc->firmware is NULL, and
rproc_set_sync_flag(rproc, RPROC_SYNC_STATE_CRASHED) can make
rproc_needs_syncing(rproc) return false. In that case the recovery fails
here and the remote processor is left in the RPROC_STOP state.
As you proposed in Loic's RFC [1], what about adding a more explicit
message to inform that the recovery failed?

[1] https://lkml.org/lkml/2020/3/11/402

Regards,
Arnaud

>  	}
>
> -	/* boot the remote processor up again */
> +	/* boot up or synchronise with the remote processor again */
>  	ret = rproc_start(rproc, firmware_p);
>
>  	release_firmware(firmware_p);
> @@ -1761,6 +1763,13 @@ static void rproc_crash_handler_work(struct work_struct *work)
>  	dev_err(dev, "handling crash #%u in %s\n", ++rproc->crash_cnt,
>  		rproc->name);
>
> +	/*
> +	 * The remote processor has crashed - tell the core what operation
> +	 * to use from hereon, i.e whether an external entity will reboot
> +	 * the MCU or it is now the remoteproc core's responsability.
> +	 */
> +	rproc_set_sync_flag(rproc, RPROC_SYNC_STATE_CRASHED);
> +
>  	mutex_unlock(&rproc->lock);
>
>  	if (!rproc->recovery_disabled)
> diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h
> index 3985c084b184..61500981155c 100644
> --- a/drivers/remoteproc/remoteproc_internal.h
> +++ b/drivers/remoteproc/remoteproc_internal.h
> @@ -24,6 +24,33 @@ struct rproc_debug_trace {
>  	struct rproc_mem_entry trace_mem;
>  };
>
> +/*
> + * enum rproc_sync_states - remote processsor sync states
> + *
> + * @RPROC_SYNC_STATE_CRASHED	state to use after the remote processor
> + *				has crashed but has not been recovered by
> + *				the remoteproc core yet.
> + *
> + * Keeping these separate from the enum rproc_state in order to avoid
> + * introducing coupling between the state of the MCU and the synchronisation
> + * operation to use.
> + */
> +enum rproc_sync_states {
> +	RPROC_SYNC_STATE_CRASHED,
> +};
> +
> +static inline void rproc_set_sync_flag(struct rproc *rproc,
> +				       enum rproc_sync_states state)
> +{
> +	switch (state) {
> +	case RPROC_SYNC_STATE_CRASHED:
> +		rproc->sync_with_rproc = rproc->sync_flags.after_crash;
> +		break;
> +	default:
> +		break;
> +	}
> +}
> +
>  /* from remoteproc_core.c */
>  void rproc_release(struct kref *kref);
>  irqreturn_t rproc_vq_interrupt(struct rproc *rproc, int vq_id);
>
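
To make the scenario from the inline comment above concrete, here is a
minimal userspace sketch of it in C. It is not kernel code: the structures
are simplified stand-ins, the body of rproc_needs_syncing() is an assumption
(that helper is not part of this patch), and rproc_check_recovery(), its
-EINVAL return value and its message are hypothetical names chosen only for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel structures, not the real definitions. */
struct rproc_sync_flags {
	bool after_crash;	/* keep synchronising after a crash? */
};

struct rproc {
	const char *firmware;	/* NULL if the core never loaded an image */
	bool sync_with_rproc;
	struct rproc_sync_flags sync_flags;
};

/* Assumption: the real helper lives elsewhere in the series. */
static bool rproc_needs_syncing(const struct rproc *rproc)
{
	return rproc->sync_with_rproc;
}

/* Mirrors the RPROC_SYNC_STATE_CRASHED case of rproc_set_sync_flag(). */
static void rproc_set_sync_flag_crashed(struct rproc *rproc)
{
	rproc->sync_with_rproc = rproc->sync_flags.after_crash;
}

/*
 * Hypothetical pre-check: returns 0 when recovery can proceed, or -EINVAL
 * with an explicit message when the core would have to reload a firmware
 * image but has no firmware name to request.
 */
static int rproc_check_recovery(const struct rproc *rproc)
{
	assert(rproc != NULL);

	if (!rproc_needs_syncing(rproc) && !rproc->firmware) {
		fprintf(stderr, "recovery failed: no firmware image to reload\n");
		return -EINVAL;
	}

	return 0;
}
```

With firmware == NULL and sync_flags.after_crash == false, calling
rproc_set_sync_flag_crashed() clears sync_with_rproc, so the check returns
-EINVAL; that is the path where a more explicit message would help. With
after_crash == true the core keeps synchronising and no image is needed.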