From: Jassi Brar
Date: Fri, 28 Jul 2017 15:50:01 +0530
Subject: Re: [PATCH v2 6/7] mailbox: bcm-flexrm-mailbox: Set msg_queue_len for each channel
To: Anup Patel
Cc: Rob Herring, Mark Rutland, Catalin Marinas, Will Deacon, Florian Fainelli,
    Scott Branden, Ray Jui, Linux Kernel Mailing List,
    "linux-arm-kernel@lists.infradead.org", Devicetree List, BCM Kernel Feedback
References: <1500620142-910-1-git-send-email-anup.patel@broadcom.com>
    <1500620142-910-7-git-send-email-anup.patel@broadcom.com>

On Fri, Jul 28, 2017 at 3:18 PM, Anup Patel wrote:
> On Fri, Jul 28, 2017 at 2:34 PM, Jassi Brar wrote:
>> On Fri, Jul 28, 2017 at 2:19 PM, Anup Patel wrote:
>>> On Thu, Jul 27, 2017 at 5:23 PM, Jassi Brar wrote:
>>>> On Thu, Jul 27, 2017 at 11:20 AM, Anup Patel wrote:
>>>>> On Thu, Jul 27, 2017 at 10:29 AM, Jassi Brar wrote:
>>>>
>>>>>>>>>>> Sorry for the delayed response...
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 21, 2017 at 9:16 PM, Jassi Brar wrote:
>>>>>>>>>>>> Hi Anup,
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 21, 2017 at 12:25 PM, Anup Patel wrote:
>>>>>>>>>>>>> The Broadcom FlexRM ring (i.e. mailbox channel) can handle a
>>>>>>>>>>>>> larger number of messages queued in one FlexRM ring, hence
>>>>>>>>>>>>> this patch sets msg_queue_len for each mailbox channel to
>>>>>>>>>>>>> be the same as RING_MAX_REQ_COUNT.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Anup Patel
>>>>>>>>>>>>> Reviewed-by: Scott Branden
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  drivers/mailbox/bcm-flexrm-mailbox.c | 5 ++++-
>>>>>>>>>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/mailbox/bcm-flexrm-mailbox.c b/drivers/mailbox/bcm-flexrm-mailbox.c
>>>>>>>>>>>>> index 9873818..20055a0 100644
>>>>>>>>>>>>> --- a/drivers/mailbox/bcm-flexrm-mailbox.c
>>>>>>>>>>>>> +++ b/drivers/mailbox/bcm-flexrm-mailbox.c
>>>>>>>>>>>>> @@ -1683,8 +1683,11 @@ static int flexrm_mbox_probe(struct platform_device *pdev)
>>>>>>>>>>>>>                 ret = -ENOMEM;
>>>>>>>>>>>>>                 goto fail_free_debugfs_root;
>>>>>>>>>>>>>         }
>>>>>>>>>>>>> -       for (index = 0; index < mbox->num_rings; index++)
>>>>>>>>>>>>> +       for (index = 0; index < mbox->num_rings; index++) {
>>>>>>>>>>>>> +               mbox->controller.chans[index].msg_queue_len =
>>>>>>>>>>>>> +                                               RING_MAX_REQ_COUNT;
>>>>>>>>>>>>>                 mbox->controller.chans[index].con_priv = &mbox->rings[index];
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>
>>>>>>>>>>>> While writing mailbox.c I was aware that there is the option to
>>>>>>>>>>>> choose the queue length at runtime. The idea was to keep the code
>>>>>>>>>>>> as simple as possible. I am open to making it a runtime thing, but
>>>>>>>>>>>> first, please help me understand how that is useful here.
>>>>>>>>>>>>
>>>>>>>>>>>> I understand FlexRM has a ring buffer of RING_MAX_REQ_COUNT (1024)
>>>>>>>>>>>> elements. Any message submitted to the mailbox API can be
>>>>>>>>>>>> immediately written onto the ring buffer if there is some space.
>>>>>>>>>>>> Is there any mechanism to report back to a client driver if its
>>>>>>>>>>>> message in the ring buffer failed "to be sent"?
>>>>>>>>>>>> If there isn't any, then I think in flexrm_last_tx_done() you
>>>>>>>>>>>> should simply return true if there is some space left in the
>>>>>>>>>>>> ring buffer, and false otherwise.
>>>>>>>>>>>
>>>>>>>>>>> Yes, we have an error code in "struct brcm_message" to report back
>>>>>>>>>>> errors from send_message. In our mailbox clients, we check the
>>>>>>>>>>> return value of mbox_send_message() and also the error code
>>>>>>>>>>> in "struct brcm_message".
>>>>>>>>>>>
>>>>>>>>>> I meant after the message has been accepted into the ring buffer
>>>>>>>>>> but the remote failed to receive it.
>>>>>>>>>
>>>>>>>>> Yes, even this case is handled.
>>>>>>>>>
>>>>>>>>> In case of IO errors after a message has been put in the ring
>>>>>>>>> buffer, we get a completion message with an error code and the
>>>>>>>>> mailbox client drivers will receive back "struct brcm_message"
>>>>>>>>> with the error set.
>>>>>>>>>
>>>>>>>>> You can refer to flexrm_process_completions() for more details.
>>>>>>>>>
>>>>>> That doesn't seem to be what I suggested. I see two issues in
>>>>>> flexrm_process_completions():
>>>>>> 1) It calls mbox_send_message(), which is a big NO for a controller
>>>>>> driver. Why should you have one more message stored outside of the
>>>>>> ring buffer?
>>>>>
>>>>> The "last_pending_msg" in each FlexRM ring was added to fit FlexRM
>>>>> into the mailbox framework.
>>>>>
>>>>> We don't have any IRQ for TX done, so "txdone_irq" is out of the
>>>>> question for FlexRM. We only have completions, for both success and
>>>>> failure (IO errors).
>>>>>
>>>>> This means we have to use "txdone_poll" for FlexRM. For "txdone_poll",
>>>>> we have to provide the last_tx_done() callback. The last_tx_done()
>>>>> callback is supposed to return true if the last send_data() call
>>>>> succeeded.
>>>>>
>>>>> To implement last_tx_done() in the FlexRM driver, we added
>>>>> "last_pending_msg".
>>>>>
>>>>> When "last_pending_msg" is NULL it means the last call to send_data()
>>>>> succeeded, and when "last_pending_msg" is != NULL it means the last
>>>>> call to send_data() did not go through due to lack of space in the
>>>>> FlexRM ring.
>>>>>
>>>> It could be simpler.
>>>> Since flexrm_send_data() is essentially about putting the message in
>>>> the ring buffer (and not about _transmission_ failures),
>>>> last_tx_done() should simply return true if requests_ida still has
>>>> free IDs, and false otherwise.
>>>
>>> It's not that simple, because we have two cases in which send_data()
>>> will fail:
>>> 1. It runs out of IDs in requests_ida.
>>> 2. There is no room in the BD queue of the FlexRM ring. This is
>>> because each brcm_message can be translated into a variable number of
>>> descriptors. In fact, with the SPU2 crypto client we have one
>>> brcm_message translating into hundreds of descriptors. All in all, a
>>> few messages (< 1024) can also fill up the BD queue of the FlexRM
>>> ring.
>>>
>> OK, let me put it abstractly... return false if "there is no space for
>> another message in the ring buffer", true otherwise.
>
> Let's say at time T there was no space in the BD queue. Now at time
> T+X, when last_tx_done() is called, it is possible that the BD queue
> has space because FlexRM has processed some more descriptors.
>
> I think last_tx_done() for the "txdone_poll" method will require some
> information passed from the send_data() callback to last_tx_done(),
> which is last_pending_msg for the FlexRM driver.
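
(For readers less familiar with the framework, here is a minimal,
illustrative sketch of what a "txdone_poll" controller registers: the
core polls last_tx_done() every txpoll_period ms to decide when the
last submission completed. Names such as my_ring and my_ring_send_data
are hypothetical and not taken from the flexrm driver.)

#include <linux/mailbox_controller.h>

/* Hypothetical per-channel state, kept in chan->con_priv. */
struct my_ring {
        unsigned int queued;    /* messages currently sitting in the HW ring */
        unsigned int depth;     /* total HW ring depth */
};

static int my_ring_send_data(struct mbox_chan *chan, void *data);      /* hypothetical */

/* txdone_poll: the core periodically asks "did the last TX complete?" */
static bool my_ring_last_tx_done(struct mbox_chan *chan)
{
        struct my_ring *ring = chan->con_priv;

        /* report "done" as soon as there is room for another message */
        return ring->queued < ring->depth;
}

static const struct mbox_chan_ops my_ring_ops = {
        .send_data    = my_ring_send_data,
        .last_tx_done = my_ring_last_tx_done,
};

/* in probe(): no TX-done interrupt available, so ask the core to poll */
/*
 *      mbox->controller.ops           = &my_ring_ops;
 *      mbox->controller.txdone_irq    = false;
 *      mbox->controller.txdone_poll   = true;
 *      mbox->controller.txpoll_period = 1;     // in ms
 */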
>
The problem is that flexrm_send_data() accepts single as well as
batched messages, so each send_data() call can require a different
amount of space. If you make flexrm_send_data() accept fixed-size
messages, then you can simply set a flag (say, last_tx_busy) when the
maximum possible number of messages is queued and clear that flag in
flexrm_process_completions().

> Anyway, I plan to try the "txdone_ack" method, so I will remove both
> last_tx_done() and last_pending_msg.
> What do you think?
>
Sounds good.

>>
>>>>>> 2) It calls mbox_chan_received_data(), which is for messages
>>>>>> received from the remote, and not the way to report a failed
>>>>>> _transmission_, for which the API calls back mbox_client.tx_done().
>>>>>> In your client driver please populate mbox_client.tx_done() and see
>>>>>> which message is reported "sent fine", and when.
>>>>>>
>>>>>>
>>>>>>>>>> There seems to be no such provision. IIANW, then you should be
>>>>>>>>>> able to consider every message as "sent successfully" once it is
>>>>>>>>>> in the ring buffer, i.e. immediately after mbox_send_message()
>>>>>>>>>> returns 0.
>>>>>>>>>> In that case I would think you don't need more than a couple of
>>>>>>>>>> entries out of MBOX_TX_QUEUE_LEN?
>>>>>>>>>
>>>>>>>>> What I am trying to suggest is that we can take up to 1024
>>>>>>>>> messages in a FlexRM ring, but MBOX_TX_QUEUE_LEN limits us from
>>>>>>>>> queuing more messages. This issue manifests easily when multiple
>>>>>>>>> CPUs queue to the same FlexRM ring (i.e. the same mailbox channel).
>>>>>>>>>
>>>>>>>> OK then, I guess we have to make the queue length a runtime decision.
>>>>>>>
>>>>>>> Do you agree with the approach taken by PATCH5 and PATCH6 to
>>>>>>> make the queue length a runtime decision?
>>>>>>>
>>>>>> I agree that we may have to get the queue length from the platform,
>>>>>> if MBOX_TX_QUEUE_LEN is limiting performance. That will be easier on
>>>>>> both of us. However, I suspect the right fix for _this_ situation is
>>>>>> in the flexrm driver. See above.
>>>>>
>>>>> The current implementation is trying to model FlexRM using the
>>>>> "txdone_poll" method, and that's why we have a dependency on
>>>>> MBOX_TX_QUEUE_LEN.
>>>>>
>>>>> I think what we really need is a new method for "txdone" to model
>>>>> ring-manager HW (such as FlexRM). Let's call it "txdone_none".
>>>>>
>>>>> "txdone_none" means there is no "txdone" reporting in HW, and
>>>>> mbox_send_message() should simply return the value returned by the
>>>>> send_data() callback. The last_tx_done() callback is not required for
>>>>> "txdone_none", and MBOX_TX_QUEUE_LEN also has no effect on
>>>>> "txdone_none". Both blocking and non-blocking clients are treated the
>>>>> same for "txdone_none".
>>>>>
>>>> That is already supported :)
>>>
>>> If you are referring to "txdone_ack" then this cannot be used here,
>>> because for "txdone_ack" we have to call the mbox_chan_txdone() API
>>> after writing descriptors in the send_data() callback, which will
>>> cause a deadlock in tx_tick() called by mbox_chan_txdone().
>>>
>> Did you read my code snippet below?
>>
>> It's not mbox_chan_txdone(), but mbox_client_txdone(), which is called
>> by the client.
>>
>>>>
>>>> In drivers/dma/bcm-sba-raid.c
>>>>
>>>> sba_send_mbox_request(...)
>>>> {
>>>>         ......
>>>>         req->msg.error = 0;
>>>>         ret = mbox_send_message(sba->mchans[mchans_idx], &req->msg);
>>>>         if (ret < 0) {
>>>>                 dev_err(sba->dev, "send message failed with error %d", ret);
>>>>                 return ret;
>>>>         }
>>>>         ret = req->msg.error;
>>>>         if (ret < 0) {
>>>>                 dev_err(sba->dev, "message error %d", ret);
>>>>                 return ret;
>>>>         }
>>>>         .....
>>>> }
>>>>
>>>> Here you _do_ assume that as soon as mbox_send_message() returns,
>>>> last_tx_done() is true. In other words, this is a case of the client
>>>> 'knows_txdone'.
>>>>
>>>> So ideally you should specify cl->knows_txdone = true during
>>>> mbox_request_channel() and have ...
>>>>
>>>> sba_send_mbox_request(...)
>>>> {
>>>>         ret = mbox_send_message(sba->mchans[mchans_idx], &req->msg);
>>>>         if (ret < 0) {
>>>>                 dev_err(sba->dev, "send message failed with error %d", ret);
>>>>                 return ret;
>>>>         }
>>>>
>>>>         ret = req->msg.error;
>>>>
>>>>         /* Message successfully placed in the ringbuffer, i.e. done */
>>>>         mbox_client_txdone(sba->mchans[mchans_idx], ret);
>>>>
>>>>         if (ret < 0) {
>>>>                 dev_err(sba->dev, "message error %d", ret);
>>>>                 return ret;
>>>>         }
>>>>
>>>>         .....
>>>> }
>>>>
>>> I think we need to improve mailbox.c so that mbox_chan_txdone() can
>>> be called from the send_data() callback.
>>>
>> No, please. Other clients call mbox_send_message() followed by
>> mbox_client_txdone(), and they are right. For example,
>> drivers/firmware/tegra/bpmp.c
>
> OK, so I got confused between mbox_chan_txdone() and
> mbox_client_txdone().
>
> We should call mbox_client_txdone() from the mailbox client when the
> mbox_chan txdone method is ACK.
>
Yes. Thanks.
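
(A minimal sketch of the client-side pattern the thread converges on:
set knows_txdone when requesting the channel, then "tick" the API
yourself after each send. Function and variable names below, such as
my_request_channel and my_send, are illustrative only and not taken
from drivers/dma/bcm-sba-raid.c.)

#include <linux/err.h>
#include <linux/mailbox_client.h>

static struct mbox_chan *chan;

static int my_request_channel(struct device *dev)
{
        /* must outlive the channel, hence static in this sketch */
        static struct mbox_client cl;

        cl.dev          = dev;
        cl.tx_block     = false;
        cl.knows_txdone = true;   /* client, not controller, signals TX done */

        chan = mbox_request_channel(&cl, 0);
        return PTR_ERR_OR_ZERO(chan);
}

static int my_send(void *msg)
{
        int ret = mbox_send_message(chan, msg);

        if (ret < 0)
                return ret;

        /* message is in the ring (i.e. "sent"), so tick the API ourselves */
        mbox_client_txdone(chan, 0);
        return 0;
}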