Subject: Re: [PATCH v3 4/7] thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
To: Dmitry Osipenko, Daniel Lezcano, Viresh Kumar
Cc: Thierry Reding, Jonathan Hunter, Zhang Rui, Amit Kucheria,
 Andreas Westman Dorcsak, Maxim Schwalm, Svyatoslav Ryhel, Ihor Didenko,
 Ion Agorria, Matt Merhar, Peter Geis, devicetree@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-tegra@vger.kernel.org,
 linux-pm@vger.kernel.org
From: Thara Gopinath
Message-ID: <545974aa-bb0f-169b-6f31-6e8c2461343f@linaro.org>
Date: Tue, 15 Jun 2021 22:50:43 -0400
In-Reply-To: <4c7b23c4-cf6a-0942-5250-63515be4a219@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/15/21 3:32 PM, Dmitry Osipenko wrote:
> 15.06.2021 19:18, Daniel Lezcano wrote:
>> On 15/06/2021 15:01, Dmitry Osipenko wrote:
>>> 15.06.2021 13:26, Viresh Kumar wrote:
>>>> On 15-06-21, 12:03, Daniel Lezcano wrote:
>>>>>
>>>>> [Cc Viresh]
>>>>>
>>>>> On 29/05/2021 19:09, Dmitry Osipenko wrote:
>>>>>> All NVIDIA Tegra30 SoCs have a two-channel on-chip sensor unit which
>>>>>> monitors temperature and voltage of the SoC. Sensors control CPU frequency
>>>>>> throttling, which is activated by hardware once preprogrammed temperature
>>>>>> level is breached, they also send signal to Power Management controller to
>>>>>> perform emergency shutdown on a critical overheat of the SoC die. Add
>>>>>> driver for the Tegra30 TSENSOR module, exposing it as a thermal sensor
>>>>>> and a cooling device.
>>>>>
>>>>> IMO it does not make sense to expose the hardware throttling mechanism
>>>>> as a cooling device because it is not usable anywhere from the thermal
>>>>> framework.
>>>>>
>>>>> Moreover, that will collide with the thermal / cpufreq framework
>>>>> mitigation (hardware sets the frequency but the software thinks the freq
>>>>> is different), right ?
>>>
>>> H/w mitigation is additional and should be transparent to the software
>>> mitigation. The software mitigation is much more flexible, but it has
>>> latency. Software also could crash and hang.
>>>
>>> Hardware mitigation doesn't have latency and it will continue to work
>>> regardless of the software state.
>>
>> Yes, I agree. Both solutions have their pros and cons. However, I don't
>> think they can co-exist sanely.
>>
>>> The CCF driver is aware about the h/w cooling status [1], hence software
>>> sees the actual frequency.
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit?id=344d5df34f5abd468267daa98f041abf90b2f660
>>
>> Ah interesting, thanks for the pointer.
>>
>> What I'm worried about is the consistency with cpufreq.
>>
>> Probably cpufreq_update_limits() should be called from the interrupt
>> handler.
>
> IIUC, the cpufreq already should be prepared for the case where firmware
> may override frequency. Viresh, could you please clarify what are the
> possible implications of the frequency overriding?
>
>>>> I am not even sure what the cooling device is doing here:
>>>>
>>>> tegra_tsensor_set_cur_state() is not implemented and it says hardware
>>>> changed it by itself. What is the benefit you are getting out of the
>>>> cooling device here ?
>>>
>>> It allows userspace to check whether hardware cooling is active via the
>>> cooling_device sysfs. Otherwise we don't have ability to check whether
>>> h/w cooling is active, I think it's a useful information. It's also
>>> interesting to see the cooling_device stats, showing how many times h/w
>>> mitigation was active.
>>
>> Actually the stats are for software mitigation. For the driver, create a
>> debugfs entry like what do the other drivers or a module parameter with
>> the stats.
>
> Okay
>
>>>>> The hardware limiter should let know the cpufreq framework about the
>>>>> frequency change.
>>>>>
>>>>> https://lkml.org/lkml/2021/6/8/1792
>>>>>
>>>>> May be post the sensor without the hw limiter for now and address that
>>>>> in a separate series ?
>>>>
>>>
>>> I wasn't aware about existence of the thermal pressure, thank you for
>>> pointing at it. At a quick glance it should be possible to benefit from
>>> the information about the additional pressure.
>>>
>>> Seems the current thermal pressure API assumes that there is only one
>>> user of the API. So it's impossible to aggregate the pressure from
>>> different sources, like software cpufreq pressure + h/w freq pressure.
>>> Correct?
>>> If yes, then please let me know yours thoughts about the best
>>> approach of supporting the aggregation.

Hi,

Thermal pressure tells the scheduler that the maximum capacity a CPU has
available for running tasks is reduced because of a thermal event. So you
cannot have separate h/w and s/w thermal pressures: ultimately only one
cap is applied at the h/w level, and the frequency corresponding to that
cap should be used for thermal pressure.

Ideally you should not have both s/w and h/w trying to throttle at the
same time. Why is this a scenario, and what prevents you from disabling
s/w throttling when h/w throttling is enabled? Now, if there has to be an
aggregation for whatever reason, it should be done at the thermal driver
level and passed to the scheduler.

>>
>> That is a good question. IMO, first step would be to call
>> cpufreq_update_limits().
>
> Right
>
>> [ Cc Thara who implemented the thermal pressure ]
>>
>> May be Thara has an idea about how to aggregate both? There is another
>> series floating around with hardware limiter [1] and the same problematic.
>>
>> [1] https://lkml.org/lkml/2021/6/8/1791
>
> Thanks, it indeed looks similar.
>
> I guess the common thermal pressure update code could be moved out into
> a new special cpufreq thermal QoS handler (policy->thermal_constraints),
> where handler will select the frequency constraint and set up the
> pressure accordingly. So there won't be any races in the code.
>

It was a conscious decision to keep the thermal pressure update out of
the QoS max-frequency update, because there are platforms that don't use
the QoS framework. For example, ACPI uses cpufreq_update_policy(). But
you are right: we now have two platforms applying h/w throttling, plus
cpufreq_cooling applying s/w throttling, so it does make sense to have
one API doing all the computation to update thermal pressure. I am not
sure exactly how or where this would reside.

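To make the "only one capping" point concrete, here is a minimal
userspace sketch, not kernel code, of how a driver-level helper might
pick the single effective cap when several sources request limits. The
function name and the array-of-caps interface are hypothetical; the only
idea taken from the discussion above is that the tightest cap is the one
actually in effect, and that frequency is what thermal pressure should
reflect.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): given the frequency caps
 * currently requested by different sources (for example s/w cooling and
 * a h/w limiter), return the one cap that is actually in effect.
 * Frequencies are in kHz; max_khz is the unthrottled maximum.
 */
static unsigned long effective_cap_khz(const unsigned long *caps_khz,
				       size_t ncaps, unsigned long max_khz)
{
	unsigned long cap = max_khz;
	size_t i;

	/* An empty list is allowed; a NULL list with entries is a caller bug. */
	assert(caps_khz != NULL || ncaps == 0);

	/* The lowest requested cap wins; caps above max_khz are ignored. */
	for (i = 0; i < ncaps; i++) {
		if (caps_khz[i] < cap)
			cap = caps_khz[i];
	}

	return cap;
}
```

With illustrative caps of 1000000 kHz from one source and 600000 kHz
from another, the sketch reports 600000 kHz as the effective cap; with
no active caps it reports max_khz, meaning no pressure.
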
So for starters, I think you should replicate the thermal pressure
update in your h/w driver whenever you know that the h/w is throttling,
or has throttled, the frequency. You can refer to cpufreq_cooling.c to
see how it is done. Moving to a common API can be done as a separate
patch series.

-- 
Warm Regards
Thara (She/Her/Hers)
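
For reference, the arithmetic behind that update can be sketched in
plain C. This is a userspace model under stated assumptions, not the
code in cpufreq_cooling.c: thermal_pressure_for_cap() is a hypothetical
name, and in a driver the capacity would presumably come from
arch_scale_cpu_capacity() and the result would be handed to
arch_set_thermal_pressure().

```c
#include <assert.h>

/*
 * Userspace model of the arithmetic only (hypothetical name).
 * max_capacity: full capacity of the CPU (1024 for the biggest CPU).
 * capped_khz:   frequency the CPU is currently limited to.
 * max_khz:      maximum frequency of the CPU when not throttled.
 * Returns the capacity removed by throttling, i.e. the thermal pressure.
 */
static unsigned long thermal_pressure_for_cap(unsigned long max_capacity,
					      unsigned long capped_khz,
					      unsigned long max_khz)
{
	unsigned long long capacity;

	assert(max_khz > 0);

	/* A cap above the maximum frequency removes no capacity. */
	if (capped_khz > max_khz)
		capped_khz = max_khz;

	/* Capacity still available at the capped frequency. */
	capacity = (unsigned long long)capped_khz * max_capacity / max_khz;

	/* Pressure is whatever capacity the cap takes away. */
	return max_capacity - (unsigned long)capacity;
}
```

With an illustrative capacity of 1024 and a cap at half of the maximum
frequency, the sketch yields a pressure of 512; with no throttling it
yields 0.
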