Subject: Re: [PATCH v4 07/13] firmware: arm_scmi: Add notification dispatch and delivery
To: Lukasz Luba, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, sudeep.holla@arm.com, james.quinlan@broadcom.com, Jonathan.Cameron@Huawei.com
References: <20200304162558.48836-1-cristian.marussi@arm.com> <20200304162558.48836-8-cristian.marussi@arm.com> <45d4aee9-57df-6be9-c176-cf0d03940c21@arm.com> <363cb1ba-76b5-cc1e-af45-454837fae788@arm.com> <484214b4-a71d-9c63-86fc-2e469cb1809b@arm.com> <20200313190224.GA5808@e120937-lin>
From: Cristian Marussi
Date: Mon, 23 Mar 2020 08:28:13 +0000

Hi

On 3/18/20 8:26 AM, Lukasz Luba wrote:
> Hi Cristian,
>
> On 3/16/20 2:46 PM, Cristian Marussi wrote:
>> On Thu, Mar 12, 2020 at 09:43:31PM +0000, Lukasz Luba wrote:
>>>
>>>
>>> On 3/12/20 6:34 PM, Cristian Marussi wrote:
>>>> On 12/03/2020 13:51, Lukasz Luba wrote:
>>>>> Hi Cristian,
>>>>>
>> Hi Lukasz
>>
>>>>> just one comment below...
>> [snip]
>>>>>> +    eh.timestamp = ts;
>>>>>> +    eh.evt_id = evt_id;
>>>>>> +    eh.payld_sz = len;
>>>>>> +    kfifo_in(&r_evt->proto->equeue.kfifo, &eh, sizeof(eh));
>>>>>> +    kfifo_in(&r_evt->proto->equeue.kfifo, buf, len);
>>>>>> +    queue_work(r_evt->proto->equeue.wq,
>>>>>> +           &r_evt->proto->equeue.notify_work);
>>>>>
>>>>> Is it safe to ignore the return value from the queue_work here?
>>>>>
>>>> [snip]
>> On the other side considering the impact of such scenario, I can imagine that
>> it's not simply that we could only have a delayed delivery, but we must consider
>> that if the delayed event is effectively the last one ever it would remain
>> undelivered forever; this is particularly worrying in a scenario in which such
>> last event is particularly important: imagine a system shutdown where a last
>> system-power-off remains undelivered.
>
> Agree, another example could be a thermal notification for some critical
> trip point.
>
>>
>> As a consequence I think this rare racy condition should be addressed somehow.
>>
>> Looking at this scenario, it seems the classic situation in which you want to
>> use some sort of completion to avoid missing out on events delivery, BUT in our
>> usecase:
>>
>> - placing the workers loaned from cmwq into an unbounded wait_for_completion()
>>    once the queue is empty seems not the best to use resources (and probably
>>    frowned upon)....using a few dedicated kernel threads to simply let them idle
>>    waiting most of the time seems equally frowned upon (I could be wrong...))
>> - the needed complete() in the ISR would introduce a spinlock_irqsave into the
>>    interrupt path (there's already one inside queue_work in fact) so it is not
>>    desirable, at least not if used on a regular base (for each event notified)
>>
>> So I was thinking to try to reduce sensibly the above race window, more
>> than eliminate it completely, by adding an early flag to be checked under
>> specific conditions in order to retry the queue_work a few times when the race
>> is hit, something like:
>>
>> ISR (core N)             | WQ (core N+1)
>> -------------------------------------------------------------------------------
>>                          | atomic_set(&exiting, 0);
>>                          |
>>                          | do {
>>                          |     ...
>>                          |     if (queue_is_empty)           - WORK_PENDING   0 events queued
>>                          +         atomic_set(&exiting, 1)   - WORK_PENDING   0 events queued
>> static int cnt=3         |         --> breakout of while     - WORK_PENDING   0 events queued
>> kfifo_in()               |     ....
>>                          | } while (scmi_process_event_payload);
>> kfifo_in()               |
>> exiting = atomic_read()  |     ...cmwq backing out           - WORK_PENDING   1 events queued
>> do {                     |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>     ret = queue_work()   |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>     if (ret || !exiting) |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>         break;           |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>     mdelay(5);           |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>     exiting =            |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>       atomic_read;       |     ...cmwq backing out           - WORK_PENDING   1 events queued
>> } while (--cnt);         |     ...cmwq backing out           - WORK_PENDING   1 events queued
>>                          | ---- WORKER EXIT                  - !WORK_PENDING  0 events queued
>>
>> like down below between the scissors.
>>
>> Not tested or tried....I could be missing something...and the mdelay is horrible (and not
>> the cleanest thing you've ever seen probably :D)...I'll have a chat with Sudeep too.
>
> Indeed it looks more complicated. If you like I can join your offline
> discuss when Sudeep is back.
>

Yes, this is as of now my main remaining issue to address for v6.
I'll wait for Sudeep's general review/feedback and raise this point.

Regards

Cristian

> Regards,
> Lukasz
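
For reference, the bounded-retry idea in the quoted ISR/WQ diagram can be
modelled in user space roughly as below. This is only a sketch under stated
assumptions, not the patch itself: struct evt_queue_model and the model_*
helpers are hypothetical names, C11 atomics stand in for the kernel atomic_t
and the WORK_PENDING bit, and a caller-supplied wait_step() callback stands in
for mdelay(5) so the sequence can be replayed deterministically.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from the patch. */
struct evt_queue_model {
	atomic_bool work_pending; /* stands in for the WORK_PENDING bit */
	atomic_bool exiting;      /* set by the worker once it sees an empty queue */
	atomic_int  nr_events;    /* stands in for the kfifo fill level */
};

/* Like queue_work(): returns false if the work item was already pending. */
static bool model_queue_work(struct evt_queue_model *q)
{
	return !atomic_exchange(&q->work_pending, true);
}

/* Worker side, last step: the point where WORK_PENDING gets cleared on exit. */
static void model_worker_exit(struct evt_queue_model *q)
{
	atomic_store(&q->work_pending, false);
}

/*
 * ISR side: queue one event, then try to schedule the worker. If the work is
 * still pending while the worker has already flagged that it is exiting,
 * wait and retry, at most max_tries attempts in total.
 *
 * Returns true if the work was queued, or if the worker was not exiting and
 * is therefore assumed to pick the event up; false if every attempt hit the
 * race and the event may be left stranded.
 */
static bool model_isr_notify(struct evt_queue_model *q, int max_tries,
			     void (*wait_step)(struct evt_queue_model *))
{
	bool exiting;

	atomic_fetch_add(&q->nr_events, 1);	/* kfifo_in() in the diagram */
	exiting = atomic_load(&q->exiting);

	do {
		if (model_queue_work(q) || !exiting)
			return true;
		if (wait_step)
			wait_step(q);		/* mdelay(5) in the diagram */
		exiting = atomic_load(&q->exiting);
	} while (--max_tries > 0);

	return false;
}
```

With work_pending and exiting both set, calling
model_isr_notify(&q, 3, model_worker_exit) fails the first attempt and succeeds
on the second, once the modelled worker has backed out. If the worker never
backs out within the retry budget the function returns false, which is the
residual window this approach narrows rather than closes.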