From: Doug Anderson
Date: Tue, 15 May 2018 12:52:02 -0700
Subject: Re: [PATCH v8 09/10] drivers: qcom: rpmh: add support for batch RPMH request
In-Reply-To: <20180515180309.GC28489@codeaurora.org>
References: <20180509170159.29682-1-ilina@codeaurora.org>
 <20180509170159.29682-10-ilina@codeaurora.org>
 <20180514195929.GA22950@codeaurora.org>
 <20180515162326.GA28489@codeaurora.org>
 <20180515180309.GC28489@codeaurora.org>
To: Lina Iyer
Cc: Andy Gross, David Brown, linux-arm-msm@vger.kernel.org,
 "open list:ARM/QUALCOMM SUPPORT", Rajendra Nayak, Bjorn Andersson,
 LKML, Stephen Boyd, Evan Green, Matthias Kaehlcke,
 rplsssn@codeaurora.org

Hi,

On Tue, May 15, 2018 at 11:03 AM, Lina Iyer wrote:
>>> Also, the bus requests have quite a churn during the course of a
>>> use case. They are set up and invalidated often.
>>> It is faster to have them separate and invalidate the whole lot of
>>> the batch_cache instead of intertwining them with requests from
>>> other drivers and then figuring out which ones must be invalidated
>>> and rebuilt when the next batch request comes in. Remember that
>>> requests may come from any driver at any time and therefore will
>>> be mangled if we use the same data structure. So to be faster and
>>> to avoid having mangled requests in the TCS, I have kept the data
>>> structures separate.
>>
>> If you need to find a certain group of requests then can't you just
>> tag them so they're easy to find? I'm still not seeing the need for
>> a whole separate code path and data structure.
>>
> Could be done, but it will be slower and involve a lot more code
> than separate data structures.

When you say "a lot more code", do you mean that you'll have to write
a lot more code, or that each request will execute a lot more code?

I would argue that very little code would need to be added (compared
to your patch) if rpmh_write_batch() was implemented atop
rpmh_write_async(). Even with extra tagging / prioritization it would
be small. I could try to prototype it, I guess.

I think you mean that it would _execute_ a lot more code and thus be
slower. Is that right? Is your main argument here that statically
pre-allocating 20 slots in an array is going to be a lot faster than
kzalloc-ing 20 linked-list nodes and chaining them together? If
you're truly worried about the kzalloc calls being slow, I suppose
you could always allocate them from a kmem_cache; see the sketch
below. ...but we're talking fewer than 20 kzalloc calls, right?
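To be concrete about the kmem_cache idea, here's the rough shape I'm
picturing. Totally untested; "rpmh_req_pool" and the helper names are
made up, and I'm assuming the node type would be the "struct
cache_req" from your patch:

#include <linux/slab.h>

/* One slab cache for request nodes, created once at init/probe. */
static struct kmem_cache *rpmh_req_pool;

static int rpmh_pool_init(void)
{
	rpmh_req_pool = KMEM_CACHE(cache_req, 0);
	return rpmh_req_pool ? 0 : -ENOMEM;
}

static struct cache_req *rpmh_alloc_req(void)
{
	/*
	 * A slab allocation is essentially a freelist pop, so ~20 of
	 * these per batch should be in the noise next to the cost of
	 * actually talking to the hardware.  GFP_ATOMIC on the
	 * assumption that this can run with a spinlock held.
	 */
	return kmem_cache_zalloc(rpmh_req_pool, GFP_ATOMIC);
}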
I'll also make an argument (with no data to back me up) that
maintaining separate functions for handling the batch cases will slow
down your kernel, because your footprint in the icache will be bigger
and you'll be more likely to kick something else out that actually
needs to run fast.

>>>>>>> +       spin_unlock_irqrestore(&ctrlr->lock, flags);
>>>>>>> +
>>>>>>> +       return ret;
>>>>>>> +}
>>>>>>
>>>>>> As part of my overall confusion about why the batch cache is
>>>>>> different than the normal one: for the normal use case you
>>>>>> still call rpmh_rsc_write_ctrl_data() for things you put in
>>>>>> your cache, but you don't for the batch cache. I still haven't
>>>>>> totally figured out what rpmh_rsc_write_ctrl_data() does, but
>>>>>> it seems strange that you don't do it for the batch cache but
>>>>>> you do for the other one.
>>>>>>
>>>>> flush_batch does write to the controller using
>>>>> rpmh_rsc_write_ctrl_data()
>>>>
>>>> My confusion is that they happen at different times. As I
>>>> understand it:
>>>>
>>>> * For the normal case, you immediately call
>>>>   rpmh_rsc_write_ctrl_data() and then later do the rest of the
>>>>   work.
>>>>
>>>> * For the batch case, you call both later.
>>>>
>>>> Is there a good reason for this, or is it just an arbitrary
>>>> difference? If there's a good reason, it should be documented.
>>>>
>>> In both cases, the requests are cached in the rpmh library and are
>>> only sent to the controller when flushed. I am not sure the work
>>> is any different. The rpmh_flush() flushes out batch requests and
>>> then the requests from other drivers.
>>
>> OK then, here are the code paths I see:
>>
>> rpmh_write
>>   => __rpmh_write
>>      => cache_rpm_request()
>>      => (state != RPMH_ACTIVE_ONLY_STATE): rpmh_rsc_send_data()
>>
>> rpmh_write_batch
>>   => (state != RPMH_ACTIVE_ONLY_STATE): cache_batch()
>>      => No call to rpmh_rsc_send_data()
>>
>> Said another way:
>>
>> * if you call rpmh_write() for something you're going to defer, you
>>   will still call cache_rpm_request() _before_ rpmh_write() returns.
>>
>> * if you call rpmh_write_batch() for something you're going to
>>   defer, then you _don't_ call cache_rpm_request() before
>>   rpmh_write_batch() returns.
>>
>> Do you find a fault in my analysis of the current code? If you see
>> a fault then please point it out. If you don't see a fault, please
>> say why the behaviors are different. I certainly understand that
>> eventually you will call cache_rpm_request() for both cases. It's
>> just that in one case the call happens right away and in the other
>> case it is deferred.
>
> True. I see where your confusion is. It is because of an
> optimization and our existential need for saving power at every
> opportunity.
>
> For the rpmh_write path -
> The reason for the disparity is that we can vote for a system low
> power mode (modes where we flush the caches and put the entire
> silicon in a low power state) at any point when all the CPUs are
> idle. If a driver has updated its vote for a system resource, it
> needs to be updated in the cache, and that's what
> cache_rpm_request() does. It's possible that we may enter a system
> low power mode immediately after that. The rpmh_flush() would be
> called by the idle driver and we would flush the cached values and
> enter the idle state. By writing immediately to the TCS, we avoid
> increasing the latency of entering an idle state.
>
> For the rpmh_write_batch() path -
> The bus driver would invalidate the TCS and the batch_cache. The use
> case fills up the batch_cache with values as needed by the use case.
> While the use case runs, the CPUs can go idle and the idle drivers
> would call rpmh_flush(). At that time, we would write the
> batch_cache into the already-invalidated TCSes and then write the
> rest of the cached requests from ->cache. We then enter the low
> power modes.
>
> The optimization of writing the sleep/wake votes in the context of
> the driver making the request helps us avoid writing some extra
> registers in the critical idle path.

I don't think I followed all that, but I'll try to read deeper into
RPMh. ...but one question: could the two paths be changed to be the
same? AKA would it be bad to call rpmh_rsc_send_data() synchronously
in rpmh_write_batch()?

If the two cases really need to behave differently then it should be
documented in the code.

-Doug
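P.S. To make my earlier suggestion concrete, here's very roughly what
I meant by implementing rpmh_write_batch() atop rpmh_write_async().
This is completely untested, the signatures are from my (possibly
faulty) memory of your series, and it hand-waves away the
ACTIVE-state completion / ordering handling you'd still need:

int rpmh_write_batch(struct rpmh_client *rc, enum rpmh_state state,
		     const struct tcs_cmd *cmd, u32 *n)
{
	int ret;

	/* Walk the 0-terminated array of per-TCS chunk sizes. */
	for (; n && *n; cmd += *n, n++) {
		/*
		 * Each chunk takes the normal async path, so it lands
		 * (tagged, if needed) in the one shared cache instead
		 * of a separate batch_cache.
		 */
		ret = rpmh_write_async(rc, state, cmd, *n);
		if (ret)
			return ret;
	}

	return 0;
}

Even with a tag parameter added so that invalidate can find these
requests quickly, I'd expect this to stay within a few dozen lines.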