Date: Tue, 8 Jun 2021 12:17:08 +0100
From: Sudeep Holla
To: Cristian Marussi
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 james.quinlan@broadcom.com, Jonathan.Cameron@Huawei.com,
 f.fainelli@gmail.com, etienne.carriere@linaro.org, Sudeep Holla,
 vincent.guittot@linaro.org, souvik.chakravarty@arm.com
Subject: Re: [RFC PATCH 01/10] firmware: arm_scmi: Reset properly xfer SCMI status
Message-ID: <20210608111708.lxgjkszrvq4au6bm@bogus>
References: <20210606221232.33768-1-cristian.marussi@arm.com>
 <20210606221232.33768-2-cristian.marussi@arm.com>
 <20210607173809.et6fzayvubsosvso@bogus>
 <20210607180137.GB40811@e120937-lin>
 <20210607182754.3wsmhc2t5mh36ycm@bogus>
 <20210608101048.GD40811@e120937-lin>
In-Reply-To: <20210608101048.GD40811@e120937-lin>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 08, 2021 at 11:10:48AM +0100, Cristian Marussi wrote:
> Hi Sudeep,
>
> On Mon, Jun 07, 2021 at 07:27:54PM +0100, Sudeep Holla wrote:
> > On Mon, Jun 07, 2021 at 07:01:37PM +0100, Cristian Marussi wrote:
> > > On Mon, Jun 07, 2021 at 06:38:09PM +0100, Sudeep Holla wrote:
> > > > On Sun, Jun 06, 2021 at 11:12:23PM +0100, Cristian Marussi wrote:
> > > > > When an SCMI command transfer fails due to some protocol issue an SCMI
> > > > > error code is reported inside the SCMI message payload itself and it is
> > > > > then retrieved and transcribed by the specific transport layer into the
> > > > > xfer.hdr.status field by transport specific .fetch_response().
> > > > >
> > > > > The core SCMI transport layer never explicitly reset xfer.hdr.status,
> > > > > so when an xfer is reused, if a transport misbehaved in handling such
> > > > > status field, we risk to see an invalid ghost error code.
> > > > >
> > > > > Reset xfer.hdr.status to SCMI_SUCCESS right before each transfer is
> > > > > started.
> > > > >
> > > >
> > > > Any particular reason why it can't be part of xfer_get_init which has other
> > > > initialisations ? If none, please move it there.
> > > >
> > >
> > > Well it was there initially then I moved it here.
> > >
> > > The reason is mostly the same as the reason for the other patch in this
> > > series that adds a reinit_completion() in this same point: the core does
> > > not forbid to reuse an xfer multiple times, once obtained with xfer_get()
> > > or xfer_get_init(), and indeed some protocols do such a thing: they
> > > implement such do_xfer looping and bail out on error.
> > >
> >
> > Makes sense. But it is okay to retain xfer->transfer_id for every transfer
> > in such a loop ?
> >
> No you are right and indeed I saw that anomaly, but I have not addressed
> it since, even if wrong, it is harmless and transfer_id is really used
> only for debugging/profiling, while the missing reinit_completion is
> potentially broken.
>

No agreed, just wanted to make it clear that if do_xfer is used in loops
the transfer_id remains same. I am fine with that.

> > > In the way that it is implemented now in protocols poses no problem
> > > indeed because the do_xfer loop bails out on error and the xfer is put,
> > > but as soon as some protocol is implemented that violates this common
> > > practice and it just keeps on reusing an xfer after an error for other
> > > do_xfers() this breaks...so it seemed more defensive to just reinit the
> > > completion and the status before each send.
> >
> > Fair enough. But they use it to send same message I guess, may be if it
> > gave error or something ? I would like to really know such a sequence
> > instead of assisting that.
> >
> So the current real 'looping do_xfer' behavior is safe and so this missing
> reinit is only potentially broken in the future, and we cannot really
> know now in advance about some future protocol needs, but it seems as of now
> wrong that you'll want to keep going on and reuse an xfer for the same command
> after an error in your loop.
>

Fair enough.

> On the other side we allow such behaviour, so I thought was good to
> provide a safe net if it is misused.
>

Agreed.

> But, beside these patches, that, as said, are more defensive than strictly
> needed as of now, I think now it's worth mentioning that this same 'issue'
> affects also, as an example, the new mechanism I introduced later in this
> same series to always use monotonically increasing sequence number for
> outgoing messages.
>

OK, I haven't seen that yet.
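
To make the reuse scenario discussed above concrete, here is a minimal sketch in plain C. It is only a toy model under stated assumptions: every `toy_*` name and the `TOY_ERROR` value are hypothetical and are not the kernel's SCMI definitions. It models a transport that always overwrites the status from the reply, a misbehaving one that only writes it on error, and an optional reset of status and completion before each send.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status values for illustration only. */
enum { TOY_SUCCESS = 0, TOY_ERROR = -1 };

/* Toy stand-in for an xfer: only the fields discussed in this thread. */
struct toy_xfer {
	int status;	/* models xfer.hdr.status */
	bool completed;	/* models what reinit_completion() would reset */
};

/* Well-behaved transport: always overwrites status from the reply. */
static void toy_fetch_response(struct toy_xfer *x, int reply_status)
{
	x->status = reply_status;
	x->completed = true;
}

/* Misbehaving transport (assumption): writes status only on error. */
static void toy_fetch_response_buggy(struct toy_xfer *x, int reply_status)
{
	if (reply_status != TOY_SUCCESS)
		x->status = reply_status;
	x->completed = true;
}

/* One send; optionally reset the per-transfer state right before it. */
static int toy_do_xfer(struct toy_xfer *x, int reply_status,
		       bool reset_first, bool buggy)
{
	if (reset_first) {
		x->status = TOY_SUCCESS;
		x->completed = false;
	}

	if (buggy)
		toy_fetch_response_buggy(x, reply_status);
	else
		toy_fetch_response(x, reply_status);

	return x->status;
}

/*
 * Reuse one xfer for two sends: the first reply fails, the second one
 * succeeds. Returns the status observed after the second send.
 */
static int toy_reuse_after_error(bool reset_first, bool buggy)
{
	struct toy_xfer x = { .status = TOY_SUCCESS, .completed = false };

	(void)toy_do_xfer(&x, TOY_ERROR, reset_first, buggy);
	return toy_do_xfer(&x, TOY_SUCCESS, reset_first, buggy);
}
```

In this model the stale error only survives when the transport misbehaves and nothing resets the status before the second send. With a transport that always overwrites the status, the reset makes no observable difference.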
> In that case I stick to the current behavior and I assign such monotonically
> increasing sequence numbers to message during xfer_get, but the potential
> issue is the same: if a do_xfer loop is used you end up reusing the same
> seq_num for multiple do_xfers (so defeating really the mechanism itself
> that aims not to reuse immediately the most recently used seq_num).
>

I assumed the do_xfer loop is to avoid those overheads with compromise of
reusing seq_num.

> In that case I did this to keep it simple and to avoid placing more burden
> on tx path by picking and assigning a seq_num upon each transfer...but, again,
> also this behavior of picking a seq_num only at xfer_get is NOT really broken
> as of now even for do_xfer loops since we bail out on error and you won't
> really reuse that xfer.
>

OK.

> It's just that in this seq_num selection case seems to add a lot of burden
> and complexity if moved to the do_xfer phase, while status/reinit seemed
> to me cheaper to move it in the do_xfer so I tried to play defensive.
>

I assumed the same as mentioned above.

> At the end, in general I would say that all of these ops
> (status/reinit/seq_nums/transfer_id) DO really belong logically to the
> do_xfer phase more than to the xfer_get/xfer_get_init, but in reality we
> can cope with having them @xfer_get/get_init and this keeps things simple
> and reduces burden, especially in the monotonic seq_nums case: so I am not
> so sure anymore if it is fine to move reinit/status to the do_xfer, as
> proposed here, while keeping seq_nums (for good reasons) to the xfer_get
> phase, because we'd use 2 different strategies to address similar issues.
>

I almost agreed with the change just to read here you think otherwise now.

> I would say: just keep reinit and status in the xfer_get phase instead and
> maybe warn somehow if a failed xfer is detected being reused. (but this
> would anyway need a check in every tx transaction to see if status != SUCCESS
> so is it worth ?)

I have started thinking why do we need to reset the status. Since it is
always read from the shmem, do we really have to ?

--
Regards,
Sudeep
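
The seq_num trade-off quoted above can be sketched in the same toy style. This is only an illustration under assumptions: the `toy_*` names are hypothetical, the counter ignores wraparound and in-flight tracking, and it is not the mechanism from the series. It compares picking a sequence number once at get time with picking one on every send, when a single xfer is reused in a loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical monotonic counter standing in for seq_num allocation. */
struct toy_seq_pool {
	unsigned int next_seq;
};

/* Toy xfer carrying only the sequence number. */
struct toy_seq_xfer {
	unsigned int seq;
};

static unsigned int toy_next_seq(struct toy_seq_pool *p)
{
	return p->next_seq++;
}

/* Get-time assignment: the xfer receives its seq_num exactly once. */
static void toy_xfer_get(struct toy_seq_pool *p, struct toy_seq_xfer *x)
{
	x->seq = toy_next_seq(p);
}

/*
 * Send the same xfer n times and record the seq_num used by each send.
 * With assign_per_send set, a fresh seq_num is picked on every send,
 * at the cost of extra work on the tx path.
 */
static void toy_send_loop(struct toy_seq_pool *p, struct toy_seq_xfer *x,
			  bool assign_per_send, unsigned int *used, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (assign_per_send)
			x->seq = toy_next_seq(p);
		used[i] = x->seq;
	}
}

/* Difference between the last and first seq_num seen over three sends. */
static unsigned int toy_seq_span(bool assign_per_send)
{
	struct toy_seq_pool pool = { .next_seq = 0 };
	struct toy_seq_xfer x;
	unsigned int used[3];

	toy_xfer_get(&pool, &x);
	toy_send_loop(&pool, &x, assign_per_send, used, 3);

	return used[2] - used[0];
}
```

A span of zero means every send in the loop reused the same seq_num, which is the get-time behaviour described in the thread. A non-zero span shows the per-send alternative, which avoids the reuse but adds work to every transfer.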