Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp724774pxb; Tue, 3 Nov 2020 10:45:33 -0800 (PST) X-Google-Smtp-Source: ABdhPJzmadc7G0mN7AimZoas5cuVs3ehfsAmekygoLwF3gMJap679nTGOGZrPSndfuVGs3HlZOzo X-Received: by 2002:aa7:c2d7:: with SMTP id m23mr23294572edp.230.1604429132813; Tue, 03 Nov 2020 10:45:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1604429132; cv=none; d=google.com; s=arc-20160816; b=cOxOqfbXaFcDBkggc+beUFNtDocpQ4guyVNEqDJYq2BqoBglZzLFBJHz7pT2fbabWh EmOyJuv439Koz6rL4dKvbVfeKJVryOP0L7315hf3/GbHvgTvxGLoxMuQDp+dpOekRu6L w52OAdcVZ1HuOjrHod0upqbCLZztYSX8nmoe3KKItsD+D79jMDYUi/KRa3Q6dyMBq09k mLiE7iI4rim46xIxzZwWMs2SGeYZAGuKJ/FkvxlyP5+ivsgqtPrAzha7uGMzkqfExq1a h6VuaPRGIaVbSWPcF/ifRlvtlTH+G8QTNqASzKugKxwkrgFB/e6CwTGtdFwv4TPMOszK 21QA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=ooc/4skm8b41a0c2OVLhlIuGfy2g3W0VRGaKqUSnLp8=; b=a9UB2vQuTmPZ435Y8G5w+0576MeFAlWi9zT8Gf4zAvPd5ZHAD0IsAxe0XSsm3NZteD OQ3Lyv2Hv6l44ZKK0T5J9owj6HNrt1AnVfjsy4ylpelYxXDh/03YahAtxRhdTvHmRpMV uoUVoiahmPBY/TL6Ld7EpzgAYVys+dO0tBX3OzNgjtA5kQJW2OAd7EBaA5Szyd2I1bL6 ODMsP+1TPfFORHL5/vu724708WIGsPZghiYewgpQQGpKn81CT8DDLj112CB0855hJFra PIqiSLFhzBZuk/N8wwXznrVfcM/mPNkoimmJ9ClfR7UR6c17nsQPkLffqQcXdtEDJFwi a3og== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="ctpRq/I4"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id k5si3012471ejk.655.2020.11.03.10.45.08; Tue, 03 Nov 2020 10:45:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="ctpRq/I4"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729377AbgKCSnA (ORCPT + 99 others); Tue, 3 Nov 2020 13:43:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59082 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728351AbgKCSm7 (ORCPT ); Tue, 3 Nov 2020 13:42:59 -0500 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66D6CC0617A6 for ; Tue, 3 Nov 2020 10:42:59 -0800 (PST) Received: by mail-wr1-x442.google.com with SMTP id s9so19567955wro.8 for ; Tue, 03 Nov 2020 10:42:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=ooc/4skm8b41a0c2OVLhlIuGfy2g3W0VRGaKqUSnLp8=; b=ctpRq/I4KR2iZH4NQ6gdEh86ePuqpslZPlOn1mdzkX+79TdYTXNdWmtONeK67ocwGN pnpFAv/UkMXqgSNTbgfO35gWmZtYuIVWAdwrgjXu9RMB5KxH7c7zATXBzZmaOElSiDzK NwvUz6Q7ybj6DW34pIRDPsIQxRJWCajO81s7pGSLsvZdDy4fznW7405Todk5u6FFSd8s KufWeN1WYlltVwSLTeOsO/I3lDFOi50n4uyXpU73x1h9OW13mFZ3neTy+WIlQ4D2ERrR Phr0M4VzKyHdBJRzm+yQYqHyZiwM7sd9CoSEn8FYbmb5jNyA9AzBl0YZPKLHr68DCKPX DbdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=ooc/4skm8b41a0c2OVLhlIuGfy2g3W0VRGaKqUSnLp8=; b=Byly49TJc+Myu5iaRPbjzH/CRa5rXnl3yVxsbwG2HVZ4Zgi9igui2grD3k+cQvWGm5 nup8ZQapBl8fFjRj0IxKGNmr8h+JwVMK6xuYSeLVpK/MbyJ3OfR2Y9h3qVyQZPVgl36c /gSm7pt7RMnBf+Skv8Z98MYAz5cGlOlzZ+FirutdVpJmIENC7iXReIZP+32auF31yFpY 7eHagAXcKqmgm9AL2bjNxWVgz6hVtJIagDAhSOL9otHj0QkTJPcDFNlTzwxz5XVGhXbi aSEioLvrZQmObyW6FFszqm/nf91bf9KXHKE1fnnm8RAnbBAcL64LuZIp9Uq5uPrlmhiQ YmyA== X-Gm-Message-State: AOAM531HJJmQNRSUbwm7+jUWvXYs39J8dQ8rkCYg7punW/zg8ejNQ18i rhqjSves3XV0WiiaH3zDzPPtG5Pxg7GYbA== X-Received: by 2002:a5d:6681:: with SMTP id l1mr28278104wru.356.1604428977874; Tue, 03 Nov 2020 10:42:57 -0800 (PST) Received: from ?IPv6:2a01:e34:ed2f:f020:f853:3b7b:eb7b:1bf? ([2a01:e34:ed2f:f020:f853:3b7b:eb7b:1bf]) by smtp.googlemail.com with ESMTPSA id y20sm3731260wma.15.2020.11.03.10.42.56 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 03 Nov 2020 10:42:57 -0800 (PST) Subject: Re: [PATCH 3/4] powercap/drivers/dtpm: Add API for dynamic thermal power management To: Lukasz Luba Cc: rafael@kernel.org, srinivas.pandruvada@linux.intel.com, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rui.zhang@intel.com, "Rafael J. Wysocki" , Arnd Bergmann , "open list:GENERIC INCLUDE/ASM HEADER FILES" References: <20201006122024.14539-1-daniel.lezcano@linaro.org> <20201006122024.14539-4-daniel.lezcano@linaro.org> <4484e771-9011-0928-e961-cd3a53be55e9@arm.com> From: Daniel Lezcano Message-ID: <18d2393a-2954-f271-817f-f9f9bf651f25@linaro.org> Date: Tue, 3 Nov 2020 19:42:55 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <4484e771-9011-0928-e961-cd3a53be55e9@arm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Lukasz, thanks for the review and the comments. On 23/10/2020 12:29, Lukasz Luba wrote: > Hi Daniel, [ ... ] >> + >> +config DTPM >> +    bool "Power capping for dynamic thermal power management" > > Maybe starting with capital letters: Dynamic Thermal Power Management? Ok, noted. [ ... ] >> +++ b/drivers/powercap/dtpm.c >> @@ -0,0 +1,430 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * Copyright 2020 Linaro Limited >> + * >> + * Author: Daniel Lezcano >> + * >> + * The powercap based Dynamic Thermal Power Management framework >> + * provides to the userspace a consistent API to set the power limit >> + * on some devices. >> + * >> + * DTPM defines the functions to create a tree of constraints. Each >> + * parent node is a virtual description of the aggregation of the >> + * children. It propagates the constraints set at its level to its >> + * children and collect the children power infomation. The leaves of > > s/infomation/information/ Ok, thanks [ ... ] >> +static struct powercap_control_type *pct; >> +static struct dtpm *root; > > I wonder if it safe to have the tree without a global lock for it, like > mutex tree_lock ? > I have put some comments below when the code traverses the tree. The mutex is a heavy lock and the its purpose is to allow the current process to be preempted while the spinlock is very fast without preemption. Putting in place a single lock will simplify the code but I'm not sure it is worth as it could be a contention. It would be simpler to switch to a big lock than the opposite. [ ... ] >> +static void dtpm_rebalance_weight(void) >> +{ >> +    __dtpm_rebalance_weight(root); >> +} >> + >> +static void dtpm_sub_power(struct dtpm *dtpm) >> +{ >> +    struct dtpm *parent = dtpm->parent; >> + >> +    while (parent) { > > I am not sure if it is safe for a corner case when the > nodes are removing from bottom to top. We don't hold a tree > lock, so these two (above line and below) operations might > be split/preempted and 'parent' freed before taking the lock. > Is it possible? (Note: I might missed something like double > locking using this local node spinlock). The parent can not be freed until it has children, the check is done in the release node function. >> +        spin_lock(&parent->lock); >> +        parent->power_min -= dtpm->power_min; >> +        parent->power_max -= dtpm->power_max; >> +        spin_unlock(&parent->lock); >> +        parent = parent->parent; >> +    } >> + >> +    dtpm_rebalance_weight(); >> +} >> + >> +static void dtpm_add_power(struct dtpm *dtpm) >> +{ >> +    struct dtpm *parent = dtpm->parent; >> + >> +    while (parent) { > > Similar here? > >> +        spin_lock(&parent->lock); >> +        parent->power_min += dtpm->power_min; >> +        parent->power_max += dtpm->power_max; >> +        spin_unlock(&parent->lock); >> +        parent = parent->parent; >> +    } >> + >> +    dtpm_rebalance_weight(); >> +} >> + >> +/** >> + * dtpm_update_power - Update the power on the dtpm >> + * @dtpm: a pointer to a dtpm structure to update >> + * @power_min: a u64 representing the new power_min value >> + * @power_max: a u64 representing the new power_max value >> + * >> + * Function to update the power values of the dtpm node specified in >> + * parameter. These new values will be propagated to the tree. >> + * >> + * Return: zero on success, -EINVAL if the values are inconsistent >> + */ >> +int dtpm_update_power(struct dtpm *dtpm, u64 power_min, u64 power_max) >> +{ >> +    if (power_min == dtpm->power_min && power_max == dtpm->power_max) >> +        return 0; >> + >> +    if (power_max < power_min) >> +        return -EINVAL; >> + >> +    dtpm_sub_power(dtpm); >> +    spin_lock(&dtpm->lock); >> +    dtpm->power_min = power_min; >> +    dtpm->power_max = power_max; >> +    spin_unlock(&dtpm->lock); >> +    dtpm_add_power(dtpm); >> + >> +    return 0; >> +} >> + >> +/** >> + * dtpm_release_zone - Cleanup when the node is released >> + * @pcz: a pointer to a powercap_zone structure >> + * >> + * Do some housecleaning and update the weight on the tree. The >> + * release will be denied if the node has children. This function must >> + * be called by the specific release callback of the different >> + * backends. >> + * >> + * Return: 0 on success, -EBUSY if there are children >> + */ >> +int dtpm_release_zone(struct powercap_zone *pcz) >> +{ >> +    struct dtpm *dtpm = to_dtpm(pcz); >> +    struct dtpm *parent = dtpm->parent; >> + > > I would lock the whole tree, just to play safe. > What do you think? I would like to keep the fine grain locking to prevent a potential contention. If it appears we hit a locking incorrectness or a race putting in question the fine grain locking scheme, then we can consider switching to a tree lock. >> +    if (!list_empty(&dtpm->children)) >> +        return -EBUSY; >> + >> +    if (parent) { >> +        spin_lock(&parent->lock); >> +        list_del(&dtpm->sibling); >> +        spin_unlock(&parent->lock); >> +    } >> + >> +    dtpm_sub_power(dtpm); >> + >> +    kfree(dtpm); >> + >> +    return 0; >> +} [ ... ] >> +struct dtpm *dtpm_alloc(void) >> +{ >> +    struct dtpm *dtpm; >> + >> +    dtpm = kzalloc(sizeof(*dtpm), GFP_KERNEL); >> +    if (dtpm) { >> +        INIT_LIST_HEAD(&dtpm->children); >> +        INIT_LIST_HEAD(&dtpm->sibling); >> +        spin_lock_init(&dtpm->lock); > > Why do we use spinlock and not mutex? The mutex will force the calling process to be preempted, that is useful when the critical sections contains blocking calls. Here we are just changing values without blocking calls, so using the spinlock is more adequate as they are faster. [ ... ] >> +static int __init dtpm_init(void) >> +{ >> +    struct dtpm_descr **dtpm_descr; >> + >> +    pct = powercap_register_control_type(NULL, "dtpm", NULL); >> +    if (!pct) { >> +        pr_err("Failed to register control type\n"); >> +        return -EINVAL; >> +    } >> + >> +    for_each_dtpm_table(dtpm_descr) >> +        (*dtpm_descr)->init(*dtpm_descr); > > We don't check the returned value here. It is required that the > devices should already be up and running (like cpufreq). > But if for some reason the init() failed, maybe it's worth to add a > field inside the dtpm_desc or dtpm struct like 'bool ready' ? > It could be retried to init later. It would be make sense to check the init return value if we want to rollback what we have done. Here we don't want to do that. If one subsystem fails to insert itself in the tree, it will log an error but the tree should continue to give access to what have been successfully initialized. The rollback is important in the init() ops, not in dtpm_init(). >> + >> +    return 0; >> +} >> +late_initcall(dtpm_init); > > The framework would start operating at late boot. We don't control > the thermal/power issues in earier stages. > Although, at this late stage all other things like cpufreq should be > ready, so the ->init() on them is likely to success. Right, the dtpm is accessible through sysfs for an userspace thermal daemon doing the smart mitigation. So do the initcall can be really late. [ ... ] Thanks for the review. -- Daniel -- Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog