References: <20190722215340.3071-1-ilina@codeaurora.org>
 <20190722215340.3071-2-ilina@codeaurora.org>
 <5d3769df.1c69fb81.55d03.aa33@mx.google.com>
 <20190724145251.GB18620@codeaurora.org>
 <5d38b38e.1c69fb81.e8e5d.035b@mx.google.com>
 <20190724203610.GE18620@codeaurora.org>
 <20190725151851.GG18620@codeaurora.org>
In-Reply-To: <20190725151851.GG18620@codeaurora.org>
From: Doug Anderson
Date: Thu, 25 Jul 2019 08:39:30 -0700
Subject: Re: [PATCH V2 2/4] drivers: qcom: rpmh-rsc: avoid locking in the interrupt handler
To: Lina Iyer
Cc: Stephen Boyd, Andy Gross, Bjorn Andersson, linux-arm-msm,
 "open list:ARM/QUALCOMM SUPPORT", Rajendra Nayak, LKML, Linux PM,
 mkshah@codeaurora.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Thu, Jul 25, 2019 at 8:18 AM Lina Iyer wrote:
>
> On Wed, Jul 24 2019 at 17:28 -0600, Doug Anderson wrote:
> >Hi,
> >
> >On Wed, Jul 24, 2019 at 1:36 PM Lina Iyer wrote:
> >>
> >> On Wed, Jul 24 2019 at 13:38 -0600, Stephen Boyd wrote:
> >> >Quoting Lina Iyer (2019-07-24 07:52:51)
> >> >> On Tue, Jul 23 2019 at 14:11 -0600, Stephen Boyd wrote:
> >> >> >Quoting Lina Iyer (2019-07-22 14:53:38)
> >> >> >> Avoid locking in the interrupt context to improve latency. Since we
> >> >> >> don't lock in the interrupt context, it is possible that we now could
> >> >> >> race with the DRV_CONTROL register that writes the enable register and
> >> >> >> cleared by the interrupt handler. For fire-n-forget requests, the
> >> >> >> interrupt may be raised as soon as the TCS is triggered and the IRQ
> >> >> >> handler may clear the enable bit before the DRV_CONTROL is read back.
> >> >> >>
> >> >> >> Use the non-sync variant when enabling the TCS register to avoid reading
> >> >> >> back a value that may have been cleared because the interrupt handler ran
> >> >> >> immediately after triggering the TCS.
> >> >> >>
> >> >> >> Signed-off-by: Lina Iyer
> >> >> >> ---
> >> >> >
> >> >> >I have to read this patch carefully. The commit text isn't convincing me
> >> >> >that it is actually safe to make this change. It mostly talks about the
> >> >> >performance improvements and how we need to fix __tcs_trigger(), which
> >> >> >is good, but I was hoping to be convinced that not grabbing the lock
> >> >> >here is safe.
> >> >> >
> >> >> >How do we ensure that drv->tcs_in_use is cleared before we call
> >> >> >tcs_write() and try to look for a free bit? Isn't it possible that we'll
> >> >> >get into a situation where the bitmap is all used up but the hardware
> >> >> >has just received an interrupt and is going to clear out a bit and then
> >> >> >an rpmh write fails with -EBUSY?
> >> >> >
> >> >> If we have a situation where there are no available free bits, we retry
> >> >> and that is part of the function. Since we have only 2 TCSes available
> >> >> to write to the hardware and there could be multiple requests coming in,
> >> >> it is a very common situation. We try and acquire the drv->lock and if
> >> >> there are free TCS available and if available mark them busy and send
> >> >> our requests. If there are none available, we keep retrying.
> >> >>
> >> >
> >> >Ok. I wonder if we need some sort of barriers here too, like an
> >> >smp_mb__after_atomic()? That way we can make sure that the write to
> >> >clear the bit is seen by another CPU that could be spinning forever
> >> >waiting for that bit to be cleared? Before this change the spinlock
> >> >would be guaranteed to make these barriers for us, but now that doesn't
> >> >seem to be the case. I really hope that this whole thing can be changed
> >> >to be a mutex though, in which case we can use the bit_wait() API, etc.
> >> >to put tasks to sleep while RPMh is processing things.
> >> >
> >> We have drivers that want to send requests in atomic contexts and
> >> therefore mutex locks would not work.
> >
> >Jumping in without reading all the context, but I saw this fly by and
> >it seemed odd. If I'm way off base then please ignore...
> >
> >Can you give more details? Why are these drivers in atomic contexts?
> >If they are in atomic contexts because they are running in the context
> >of an interrupt then your next patch in the series isn't so correct.
> >
> >Also: when people submit requests in atomic context are they always
> >submitting an asynchronous request? In that case we could
> >(presumably) just use a spinlock to protect the queue of async
> >requests and a mutex for everything else?
> Yes, drivers only make async requests in interrupt contexts.

So correct me if I'm off base, but you're saying that drivers make
requests in interrupt contexts even after your whole series and that's
why you're using spinlocks instead of mutexes. ...but then in patch
#3 in your series you say:

> Switch over from using _irqsave/_irqrestore variants since we no longer
> race with a lock from the interrupt handler.

Those seem like contradictions. What happens if someone is holding
the lock, then an interrupt fires, then the interrupt routine wants to
do an async request? Boom, right?

> They cannot
> use the sync variants. The async and sync variants are streamlined into
> the same code path. Hence the use of spinlocks instead of mutexes
> through the critical path.

I will perhaps defer to Stephen who was the one thinking that a mutex
would be a big win here. ...but if a mutex truly is a big win then it
doesn't seem like it'd be that hard to have a linked list (protected
by a spinlock) and then some type of async worker that:

1. Grabs the spinlock, pops one element off the linked list, releases
   the spinlock
2. Grabs the mutex, sends the one element, releases the mutex
3. Goes back to step #1.

This will keep the spinlock held for as little time as possible.

-Doug
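
For illustration only, the worker loop in steps 1-3 above might look
roughly like the userspace sketch below. It is not kernel code and not
taken from the rpmh driver: a busy-wait lock built on C11 atomic_flag
stands in for the kernel spinlock, pthread_mutex_t stands in for the
kernel mutex, and every name in it (struct req, enqueue_async,
send_one, drain_queue, sent_ids) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical request type; the real rpmh request layout is not shown here. */
struct req {
    int id;
    struct req *next;
};

/* Userspace stand-in for a kernel spinlock: busy-wait on an atomic flag. */
static atomic_flag queue_lock = ATOMIC_FLAG_INIT;
/* Userspace stand-in for the mutex that would serialize the slow send path. */
static pthread_mutex_t send_mutex = PTHREAD_MUTEX_INITIALIZER;

static struct req *head, *tail;   /* FIFO of pending async requests */
static int sent_ids[16];          /* record of what send_one() was given */
static int sent_count;

static void queue_spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&queue_lock, memory_order_acquire)) {
        /* spin until the current holder clears the flag */
    }
}

static void queue_spin_unlock(void)
{
    atomic_flag_clear_explicit(&queue_lock, memory_order_release);
}

/*
 * Producer side: only the short queue lock is taken, never the mutex.
 * In the kernel, if this could run from an interrupt handler, every
 * taker of the queue lock would need the _irqsave variants; userspace
 * cannot model that part.
 */
static void enqueue_async(struct req *r)
{
    r->next = NULL;
    queue_spin_lock();
    if (tail)
        tail->next = r;
    else
        head = r;
    tail = r;
    queue_spin_unlock();
}

/* Placeholder for the slow hardware write; here it only records the id. */
static void send_one(struct req *r)
{
    assert(sent_count < 16);
    sent_ids[sent_count++] = r->id;
}

/* Worker body: repeats steps 1-3 until the list is empty; returns the count sent. */
static int drain_queue(void)
{
    int n = 0;

    for (;;) {
        /* Step 1: hold the spinlock only long enough to pop one element. */
        queue_spin_lock();
        struct req *r = head;
        if (r) {
            head = r->next;
            if (!head)
                tail = NULL;
        }
        queue_spin_unlock();

        if (!r)
            break;

        /* Step 2: the send happens under the mutex, outside the spinlock. */
        pthread_mutex_lock(&send_mutex);
        send_one(r);
        pthread_mutex_unlock(&send_mutex);
        n++;
        /* Step 3: go back to step 1. */
    }
    return n;
}
```

In this sketch a caller would queue requests with enqueue_async() and
some worker context would call drain_queue(). The only work done under
the queue lock is the list manipulation, so the time it is held does
not depend on how long a send takes.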