Date: Wed, 28 Jul 2021 17:54:30 +0100
From: Cristian Marussi <cristian.marussi@arm.com>
To: Sudeep Holla
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	virtualization@lists.linux-foundation.org,
	virtio-dev@lists.oasis-open.org, james.quinlan@broadcom.com,
	Jonathan.Cameron@Huawei.com, f.fainelli@gmail.com,
	etienne.carriere@linaro.org, vincent.guittot@linaro.org,
	souvik.chakravarty@arm.com, igor.skalkin@opensynergy.com,
	peter.hilber@opensynergy.com, alex.bennee@linaro.org,
	jean-philippe@linaro.org, mikhail.golubev@opensynergy.com,
	anton.yakovlev@opensynergy.com, Vasyl.Vavrychuk@opensynergy.com,
	Andriy.Tryshnivskyy@opensynergy.com
Subject: Re: [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens
Message-ID: <20210728165430.GJ6592@e120937-lin>
References: <20210712141833.6628-1-cristian.marussi@arm.com>
 <20210712141833.6628-7-cristian.marussi@arm.com>
 <20210728141746.chqwhspnwviz67xn@bogus>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20210728141746.chqwhspnwviz67xn@bogus>
User-Agent: Mutt/1.9.4 (2018-02-28)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 28, 2021 at 03:17:46PM +0100, Sudeep Holla wrote:
> On Mon, Jul 12, 2021 at 03:18:22PM +0100, Cristian Marussi wrote:
> > Tokens are sequence numbers embedded in each SCMI message header: they
> > are used to correlate commands with responses (and delayed responses),
> > but their usage and selection policy is entirely up to the caller
> > (usually the OSPM agent), while they are completely opaque to the
> > callee (the SCMI server platform), which merely copies them back from
> > the command into the response message header.
> > This also means that the platform does not, cannot and should not
> > enforce any kind of policy on received messages based on the contained
> > sequence number: the platform can perfectly well handle concurrent
> > requests carrying the same identifying token, should that happen.
> >
> > Moreover, the platform is not required to produce in-order responses
> > to agent requests; the only constraint in this regard is that, for an
> > asynchronous message, the delayed response must be sent after the
> > immediate response for the synchronous part of the command
> > transaction.
> >
> > Currently the SCMI stack of the OSPM agent selects a token for each
> > egressing command by picking the lowest number not already in use by
> > an existing in-flight transaction; in other words, any token is reused
> > immediately after its transaction has completed or timed out. This
> > policy does indeed simplify the management and lookup of tokens and
> > associated xfers.
> >
> > Under the above assumptions and constraints, since there is really no
> > state shared between the agent and the platform to let the platform
> > know when a token and its associated message have timed out, the
> > current policy of early token reuse can easily lead to a situation in
> > which a spurious or late-received response (or delayed response),
> > related to an old, stale and timed-out transaction, is wrongly
> > associated with a newer valid in-flight xfer that just happens to have
> > reused the same token.
> >
> > This misbehaviour on such ghost responses is more easily exposed on
> > those transports that naturally have a higher level of parallelism in
> > processing multiple concurrent in-flight messages.
>
> Hi,
>
> The term ghost is used here without any reference to what it means.
> That could make it difficult to follow if someone unaware of it is
> trying to understand.
>

Right, I'll reword this.

> > This commit introduces a new token selection policy for the OSPM
> > agent: each new command transfer now gets the next available,
> > monotonically increasing token, until tokens are exhausted and the
> > counter rolls over.
> >
> > Such a new policy mitigates the above issues with ghost responses
> > since tokens are now reused as late as possible (ideally only when
> > they roll over), which makes it much easier to identify such ghost
> > responses to stale, timed-out transactions: this also helps to
> > simplify the specific transport implementations, since stale
> > transport messages can be easily identified and discarded early in
> > the rx path, without the need to cross-check their actual state with
> > the core transport layer.
> > This mitigation is even more effective when, as is usually the case,
> > the maximum number of pending messages is capped by the platform to
> > a much lower number than the whole possible range of token values
> > (2^10).
> >
> > This internal policy change in the core SCMI transport layer is
> > fully transparent to the specific transports, so it has not and
> > should not have any impact on the transport implementations.
> >
> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > ---
> > v4 --> V5
> > - removed empirical profiling info from commit msg
> > - do NOT use monotonic tokens and pending HT for notifications (not needed)
> > - release xfer_lock later in scmi_xfer_get
> > ---
> >  drivers/firmware/arm_scmi/common.h |  25 +++
> >  drivers/firmware/arm_scmi/driver.c | 260 +++++++++++++++++++++++++----
> >  2 files changed, 249 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
> > index 6bb734e0e3ac..2233d0a188fc 100644
> > --- a/drivers/firmware/arm_scmi/common.h
> > +++ b/drivers/firmware/arm_scmi/common.h
> > @@ -14,7 +14,10 @@
> >  #include <linux/bitfield.h>
> >  #include <linux/completion.h>
> >  #include <linux/device.h>
> > +#include <linux/hashtable.h>
> > +#include <linux/list.h>
> >  #include <linux/module.h>
> > +#include <linux/refcount.h>
> >  #include <linux/scmi_protocol.h>
> >  #include <linux/types.h>
> >
> > @@ -138,6 +141,10 @@ struct scmi_msg {
> >   *	  buffer for the rx path as we use for the tx path.
> >   * @done: command message transmit completion event
> >   * @async_done: pointer to delayed response message received event completion
> > + * @users: A refcount to track the active users for this xfer
> > + * @pending: True for xfers added to @pending_xfers hashtable
> > + * @node: An hlist_node reference used to store this xfer, alternatively, on
> > + *	   the free list @free_xfers or in the @pending_xfers hashtable
> >   */
> >  struct scmi_xfer {
> >  	int transfer_id;
> > @@ -146,8 +153,26 @@
> >  	struct scmi_msg rx;
> >  	struct completion done;
> >  	struct completion *async_done;
> > +	refcount_t users;
> > +	bool pending;
> > +	struct hlist_node node;
> >  };
> >
> > +/*
> > + * A helper macro to lookup an xfer from the @pending_xfers hashtable
> > + * using the message sequence number token as a key.
> > + */
> > +#define XFER_FIND(__ht, __k)					\
> > +({								\
> > +	typeof(__k) k_ = __k;					\
> > +	struct scmi_xfer *xfer_ = NULL;				\
> > +								\
> > +	hash_for_each_possible((__ht), xfer_, node, k_)		\
> > +		if (xfer_->hdr.seq == k_)			\
> > +			break;					\
> > +	xfer_;							\
> > +})
> > +
> >  struct scmi_xfer_ops;
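As a side note, since the contract of XFER_FIND is not obvious at a
glance: it is meant to be evaluated with xfer_lock held, and it
evaluates to NULL when no in-flight xfer carries the given token. A
minimal usage sketch (hypothetical caller, names as in driver.c below):

	unsigned long flags;
	struct scmi_xfer *xfer;

	spin_lock_irqsave(&minfo->xfer_lock, flags);
	/* NULL here means no in-flight xfer carries token xfer_id */
	xfer = XFER_FIND(minfo->pending_xfers, xfer_id);
	spin_unlock_irqrestore(&minfo->xfer_lock, flags);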
> >
> >  /**
> > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> > index 4c77ee13b1ad..245ede223302 100644
> > --- a/drivers/firmware/arm_scmi/driver.c
> > +++ b/drivers/firmware/arm_scmi/driver.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/io.h>
> >  #include <linux/kernel.h>
> >  #include <linux/ktime.h>
> > +#include <linux/hashtable.h>
> >  #include <linux/list.h>
> >  #include <linux/module.h>
> >  #include <linux/of_address.h>
> > @@ -65,19 +66,29 @@ struct scmi_requested_dev {
> >  	struct list_head node;
> >  };
> >
> > +#define SCMI_PENDING_XFERS_HT_ORDER_SZ		9
> > +
>
> Is there any particular reason to choose half the token size as hash
> bucket size ? IOW why not 1/3 or 1/4th of it ? I would appreciate a
> comment here. I see it is mentioned in the commit log. Also is it not
> better to associate or keep it close to MSG_TOKEN_ID_MASK and
> associated macros.
>

I'll move this to the proper place where the associated macros are
defined.

The reason for the size choice is tricky (and I am still not sure about
its value, which is why I have not commented it yet :D): the ideal size
of this hashtable would be desc->max_msg, i.e. equal to the maximum
number of in-flight messages allowed on the system, in order to
minimize (probably to zero) collisions on the hashtable. Unfortunately
max_msg is only finally available at runtime, while the kernel
hashtable is statically sized by design.

I tried to play some tricks to size it dynamically, but everything
falls apart since a lot of stuff in linux/hashtable.h is based on
ARRAY_SIZE() and friends (to speed it all up, I suppose). Another
non-fit (in my opinion) alternative would be the relativistic hashtable
(linux/rhashtable.h), but that is definitely overkill in our case:
those are hashtables that can be completely resized at runtime while
populated O_o (with even more overhead).

In the end, the size that fits all possible in-flight messages while
minimizing collisions in any possible case, and that I can set at
compile time, would be order 10, which really means 2^10 = 1024 HT
entries (equal to MSG_TOKEN_MAX), each of which is a struct list_head
(*prev, *next: 16 bytes), i.e. a 16 KB HT. Peter pointed out that this
would be a lot of wasted space on normal systems, where the number of
max in-flight messages is far less than 1024, AND it would not even fit
in one 4 KB page, so I reduced it to 512 entries; but the best would be
256 (order 8) if we want to fit in one regular 4 KB page. The drawback
would be a bit of HT collisions on systems with more than 256 possible
and effective in-flight messages.

So, best of all would be runtime sizing of the HT, which is not
possible with the current statically sized kernel hashtable.h (even
though I started looking into that, I have to admit :D), or sizing to
order 8 (256 entries, 4 KB) to fit in a page, or order 10 (1024
entries, 16 KB) to minimize any possible collision, if we do not care
about wasted memory and/or page boundaries.
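To put rough numbers on the trade-off (back of the envelope, using the
16-bytes-per-bucket figure from above):

	order 10: 1024 buckets * 16 B = 16 KB  (covers the whole 2^10
						token space, zero
						collisions, > 1 page)
	order  9:  512 buckets * 16 B =  8 KB  (current compromise)
	order  8:  256 buckets * 16 B =  4 KB  (fits one 4 KB page, but
						collides once more than
						256 msgs are in flight)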
> >  /**
> >   * struct scmi_xfers_info - Structure to manage transfer information
> >   *
> > - * @xfer_block: Preallocated Message array
> >   * @xfer_alloc_table: Bitmap table for allocated messages.
> >   *	Index of this bitmap table is also used for message
> >   *	sequence identifier.
> >   * @xfer_lock: Protection for message allocation
> > + * @last_token: A counter to use as a base to generate monotonically
> > + *		increasing tokens.
> > + * @free_xfers: A free list for available to use xfers. It is initialized with
> > + *		a number of xfers equal to the maximum allowed in-flight
> > + *		messages.
> > + * @pending_xfers: An hashtable, indexed by msg_hdr.seq, used to keep all the
> > + *		   currently in-flight messages.
> >   */
> >  struct scmi_xfers_info {
> > -	struct scmi_xfer *xfer_block;
> >  	unsigned long *xfer_alloc_table;
> >  	spinlock_t xfer_lock;
> > +	atomic_t last_token;
>
> Can we merge this and transfer_last_id ? Let this be free running like
> transfer_last_id and just use [0-9] from this ? I don't see any point
> having 2 different monotonically increasing tokens/id.
>

Mmm, I was tempted to do that, but the reason I did not is that, in
some rare limit conditions, as you can see in the ASCII art below (:O),
I can find a hole in the next available token ids, so I have to skip it
and update last_token itself; I am not sure whether seeing transfer_ids
with holes during tracing could cause confusion if I unify them.

> > +	struct hlist_head free_xfers;
> > +	DECLARE_HASHTABLE(pending_xfers, SCMI_PENDING_XFERS_HT_ORDER_SZ);
> >  };
> >
> >  /**
> > @@ -190,45 +201,177 @@ void *scmi_notification_instance_data_get(const struct scmi_handle *handle)
> >  	return info->notify_priv;
> >  }
> >
> > +/**
> > + * scmi_xfer_token_set - Reserve and set new token for the xfer at hand
> > + *
> > + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @xfer: The xfer to act upon
> > + *
> > + * Pick the next unused monotonically increasing token and set it into
> > + * xfer->hdr.seq: picking a monotonically increasing value avoids immediate
> > + * reuse of freshly completed or timed-out xfers, thus mitigating the risk
> > + * of incorrect association of a late and expired xfer with a live in-flight
> > + * transaction, both happening to re-use the same token identifier.
> > + *
> > + * Since the platform is NOT required to answer our request in-order, we
> > + * should account for a few rare but possible scenarios:
> > + *
> > + *  - exactly 'next_token' may be NOT available so pick xfer_id >= next_token
> > + *    using find_next_zero_bit() starting from candidate next_token bit
> > + *
> > + *  - all tokens ahead up to (MSG_TOKEN_ID_MASK - 1) are used in-flight but
> > + *    there are plenty of free tokens at start, so try a second pass using
> > + *    find_next_zero_bit() and starting from 0.
> > + *
> > + *   X = used in-flight
> > + *
> > + * Normal
> > + * ------
> > + *
> > + *		|- xfer_id picked
> > + *   -----------+----------------------------------------------------------
> > + *   | | |X|X|X| | | | | | ... ... ... ... ... ... ... ... ... ... ...|X|X|
> > + *   ----------------------------------------------------------------------
> > + *		^
> > + *		|- next_token
> > + *
> > + * Out-of-order pending at start
> > + * -----------------------------
> > + *
> > + *	  |- xfer_id picked, last_token fixed
> > + *   -----+----------------------------------------------------------------
> > + *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... ... ...|X| |
> > + *   ----------------------------------------------------------------------
> > + *    ^
> > + *    |- next_token
> > + *
> > + *
> > + * Out-of-order pending at end
> > + * ---------------------------
> > + *
> > + *	  |- xfer_id picked, last_token fixed
> > + *   -----+----------------------------------------------------------------
> > + *   |X|X| | | | |X|X| ... ... ... ... ... ... ... ... ... ... |X|X|X||X|X|
> > + *   ----------------------------------------------------------------------
> > + *						^
> > + *						|- next_token
> > + *
> > + * Context: Assumes to be called with @xfer_lock already acquired.
> > + *
> > + * Return: 0 on Success or error
> > + */
> > +static int scmi_xfer_token_set(struct scmi_xfers_info *minfo,
> > +			       struct scmi_xfer *xfer)
> > +{
> > +	unsigned long xfer_id, next_token;
> > +
> > +	/* Pick a candidate monotonic token in range [0, MSG_TOKEN_MAX - 1] */
> > +	next_token = (atomic_inc_return(&minfo->last_token) &
> > +		      (MSG_TOKEN_MAX - 1));
> > +
> > +	/* Pick the next available xfer_id >= next_token */
> > +	xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
> > +				     MSG_TOKEN_MAX, next_token);
> > +	if (xfer_id == MSG_TOKEN_MAX) {
> > +		/*
> > +		 * After heavily out-of-order responses, there are no free
> > +		 * tokens ahead, but only at start of xfer_alloc_table so
> > +		 * try again from the beginning.
> > +		 */
> > +		xfer_id = find_next_zero_bit(minfo->xfer_alloc_table,
> > +					     MSG_TOKEN_MAX, 0);
> > +		/*
> > +		 * Something is wrong if we got here since there can be a
> > +		 * maximum number of (MSG_TOKEN_MAX - 1) in-flight messages
> > +		 * but we have not found any free token [0, MSG_TOKEN_MAX - 1].
> > +		 */
> > +		if (WARN_ON_ONCE(xfer_id == MSG_TOKEN_MAX))
> > +			return -ENOMEM;
> > +	}
> > +
> > +	/* Update +/- last_token accordingly if we skipped some hole */
> > +	if (xfer_id != next_token)
> > +		atomic_add((int)(xfer_id - next_token), &minfo->last_token);
> > +
> > +	/* Set in-flight */
> > +	set_bit(xfer_id, minfo->xfer_alloc_table);
> > +	xfer->hdr.seq = (u16)xfer_id;
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * scmi_xfer_token_clear - Release the token
> > + *
> > + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @xfer: The xfer to act upon
> > + */
> > +static inline void scmi_xfer_token_clear(struct scmi_xfers_info *minfo,
> > +					 struct scmi_xfer *xfer)
> > +{
> > +	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
> > +}
> > +
> >  /**
> >   * scmi_xfer_get() - Allocate one message
> >   *
> >   * @handle: Pointer to SCMI entity handle
> >   * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @set_pending: If true a monotonic token is picked and the xfer is added to
> > + *		 the pending hash table.
> >   *
> >   * Helper function which is used by various message functions that are
> >   * exposed to clients of this driver for allocating a message traffic event.
> >   *
> > - * This function can sleep depending on pending requests already in the system
> > - * for the SCMI entity. Further, this also holds a spinlock to maintain
> > - * integrity of internal data structures.
> > + * Picks an xfer from the free list @free_xfers (if any available) and, if
> > + * required, sets a monotonically increasing token and stores the inflight
> > + * xfer into the @pending_xfers hashtable for later retrieval.
> > + *
> > + * The successfully initialized xfer is refcounted.
> > + *
> > + * Context: Holds @xfer_lock while manipulating @xfer_alloc_table and
> > + *	    @free_xfers.
> >   *
> >   * Return: 0 if all went fine, else corresponding error.
> >   */
> >  static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> > -				       struct scmi_xfers_info *minfo)
> > +				       struct scmi_xfers_info *minfo,
> > +				       bool set_pending)
> >  {
> > -	u16 xfer_id;
> > +	int ret;
> > +	unsigned long flags;
> >  	struct scmi_xfer *xfer;
> > -	unsigned long flags, bit_pos;
> > -	struct scmi_info *info = handle_to_scmi_info(handle);
> >
> > -	/* Keep the locked section as small as possible */
> >  	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > -	bit_pos = find_first_zero_bit(minfo->xfer_alloc_table,
> > -				      info->desc->max_msg);
> > -	if (bit_pos == info->desc->max_msg) {
> > +	if (hlist_empty(&minfo->free_xfers)) {
> >  		spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >  		return ERR_PTR(-ENOMEM);
> >  	}
> > -	set_bit(bit_pos, minfo->xfer_alloc_table);
> > -	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >
> > -	xfer_id = bit_pos;
> > +	/* grab an xfer from the free_list */
> > +	xfer = hlist_entry(minfo->free_xfers.first, struct scmi_xfer, node);
> > +	hlist_del_init(&xfer->node);
> > +
> > +	if (set_pending) {
> > +		/* Pick and set monotonic token */
> > +		ret = scmi_xfer_token_set(minfo, xfer);
> > +		if (!ret) {
> > +			hash_add(minfo->pending_xfers, &xfer->node,
> > +				 xfer->hdr.seq);
> > +			xfer->pending = true;
> > +		} else {
> > +			dev_err(handle->dev,
> > +				"Failed to get monotonic token %d\n", ret);
> > +			hlist_add_head(&xfer->node, &minfo->free_xfers);
> > +			xfer = ERR_PTR(ret);
> > +		}
> > +	}
> >
> > -	xfer = &minfo->xfer_block[xfer_id];
> > -	xfer->hdr.seq = xfer_id;
> > -	xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> > +	if (!IS_ERR(xfer)) {
> > +		refcount_set(&xfer->users, 1);
> > +		xfer->transfer_id = atomic_inc_return(&transfer_last_id);
> > +	}
> > +	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >
> >  	return xfer;
> >  }
> > @@ -239,6 +382,9 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle,
> >   * @minfo: Pointer to Tx/Rx Message management info based on channel type
> >   * @xfer: message that was reserved by scmi_xfer_get
> >   *
> > + * After refcount check, possibly release an xfer, clearing the token slot,
> > + * removing xfer from @pending_xfers and putting it back into free_xfers.
> > + *
> >   * This holds a spinlock to maintain integrity of internal data structures.
> >   */
> >  static void
> > @@ -246,16 +392,44 @@ __scmi_xfer_put(struct scmi_xfers_info *minfo, struct scmi_xfer *xfer)
> >  {
> >  	unsigned long flags;
> >
> > -	/*
> > -	 * Keep the locked section as small as possible
> > -	 * NOTE: we might escape with smp_mb and no lock here..
> > -	 * but just be conservative and symmetric.
> > -	 */
> >  	spin_lock_irqsave(&minfo->xfer_lock, flags);
> > -	clear_bit(xfer->hdr.seq, minfo->xfer_alloc_table);
> > +	if (refcount_dec_and_test(&xfer->users)) {
>
> The introduction of users in this patch seems useless. I am assuming
> there are multiple users and this is to prevent some race. I was about
> to ask if we can manage without it, but seeing some additional use of
> it in later patches, I will comment later. It may still make sense to
> move this to a later patch as it doesn't add anything here ?
>

Indeed, I was not sure about introducing it here. The problem it
addresses is as follows: a normal cmd flow is xfer_get/send/rx/xfer_put,
but if a command times out on the TX path BEFORE it is finally
xfer_put() by the invoking protocol, the msg is still valid (as state)
and pending for a while; if something is received on the RX path
concurrently, the transport calls scmi_rx_callback, and the risk would
be that the xfer is acquired just fine and processed by RX (it still
has to be xfer_put, as said) and then suddenly vanishes, due to the
final xfer_put on the TX path, while RX is working on it (or, worse, it
is released while RX is working on it and then picked from the
free_list and reused for a send while RX is still processing it... OK,
a bit of an unlucky case, I have to admit).

So xfer->users is meant to avoid the xfer being vanished by the
xfer_put() in the TX path while rx_callback is still working on it: in
such a case the rx_callback will be allowed to safely process the xfer
(even though it timed out in TX) until the end of its RX processing,
and the result will then be finally discarded, with the final
__scmi_xfer_put inside handle_response releasing the xfer once and for
all. Using a refcount allows me to avoid bigger spinlocked critical
sections (or at least that was the idea :D).
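Roughly, the acquire side I have in mind for the RX path would look
like this (just a sketch; the helper name is hypothetical and it is not
part of this patch):

	static struct scmi_xfer *
	scmi_xfer_acquire(struct scmi_xfers_info *minfo, u16 xfer_id)
	{
		unsigned long flags;
		struct scmi_xfer *xfer;

		spin_lock_irqsave(&minfo->xfer_lock, flags);
		xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id);
		/*
		 * Bump users under xfer_lock: the final __scmi_xfer_put()
		 * from a timed-out TX path also runs under xfer_lock, so
		 * once we hold a reference the xfer cannot be recycled
		 * from under us while RX is still processing it.
		 */
		if (!IS_ERR(xfer))
			refcount_inc(&xfer->users);
		spin_unlock_irqrestore(&minfo->xfer_lock, flags);

		return xfer;
	}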
> > +		if (xfer->pending) {
> > +			scmi_xfer_token_clear(minfo, xfer);
> > +			hash_del(&xfer->node);
> > +			xfer->pending = false;
> > +		}
> > +		hlist_add_head(&xfer->node, &minfo->free_xfers);
> > +	}
> >  	spin_unlock_irqrestore(&minfo->xfer_lock, flags);
> >  }
> >
> > +/**
> > + * scmi_xfer_lookup_unlocked - Helper to lookup an xfer_id
> > + *
> > + * @minfo: Pointer to Tx/Rx Message management info based on channel type
> > + * @xfer_id: Token ID to lookup in @pending_xfers
> > + *
> > + * Refcounting is untouched.
> > + *
> > + * Context: Assumes to be called with @xfer_lock already acquired.
> > + *
> > + * Return: A valid xfer on Success or error otherwise
> > + */
> > +static struct scmi_xfer *
> > +scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id)
> > +{
> > +	struct scmi_xfer *xfer = NULL;
> > +
> > +	if (xfer_id >= MSG_TOKEN_MAX)
> > +		return ERR_PTR(-EINVAL);
> > +
>
> Is this really needed ? I guess we always use MSG_XTRACT_TOKEN(hdr) to
> fetch the xfer_id, no ?
>

Right, indeed it comes from a MSG_XTRACT_TOKEN, so I'll drop it.

Thanks,
Cristian