Received: by 2002:a05:6a10:d5a5:0:0:0:0 with SMTP id gn37csp489498pxb; Wed, 6 Oct 2021 09:08:13 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxWi2slYTAXUObavIAZEu50YZL6dgcinnbfmQmDk1ogAyyV1I398RgRbCZgBGweIJA24lVI X-Received: by 2002:a19:8c0f:: with SMTP id o15mr10483989lfd.464.1633536493551; Wed, 06 Oct 2021 09:08:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1633536493; cv=none; d=google.com; s=arc-20160816; b=HOidEED4SyzDN2+e8KRaKUnPDLchLGpxdSnPStzNH3tTQ6oYpAdD9SKaG6WOfltWhy 7z8wqfMqwvAs4y18MTq3WzueehyiTFV6E4MFcshAqvd4D5rLEPXPabuTeqSzsE6WhK7L 6TgngwsncQQv91AYI17t6TJVkgf7R0RYvAJ26g1OnUVi+3hI2J9kgIKfa2nkPUbQb+2H xg4pLUgD3zcNRgS2c147QiXABF0EHesEPiprC9o2wwvLeInLa9AIWR3pyOqqcTHakP+W 7fou/bIwqyc1gbXH+Y9Kk6vvFqXKpfhqm3ucNezy2A/ZTvwhNjcxH6w4XUH3rWOiGXhd A1QQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=DwG3TKkoHLRsZlvw7J2WuNQwAH3IO2xZ0zZvAFY9XgQ=; b=0mLeH4nGOWpUpfCLATlU9LxwFmSbgW7HKjCjpwUALjonHz2qFKwlhuYN598aPy+Xju ZAbRzmoo2d5IVq9V6nSn0Hm/x5F/WF+nonjQev/3+NmFdn0NSAGyGKH8y+mSI6Tt67k4 4WuOyRYOx/7ZBaKTOE2FRuApY71pRlQkytkacwZ+f93If9HaYPncD2Mww8eZjSnS6QbZ PRduFUF//0mnQIeGnr49SxuNVadT8A2RkSmWCPz7HqjEOSTcucjjLniNYL2ygp1f/6Fe HxTB7f8hjEVhlk9GpoKLLILhT1zSgF3gtjSxyrtg3l8G10jXRyVX3skQHsNA5+ZZ5bQ4 9zaw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=k+RFZe1B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e6si17410072ejl.722.2021.10.06.09.07.47; Wed, 06 Oct 2021 09:08:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=k+RFZe1B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231267AbhJFQIS (ORCPT + 99 others); Wed, 6 Oct 2021 12:08:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60552 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230021AbhJFQIN (ORCPT ); Wed, 6 Oct 2021 12:08:13 -0400 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DCA12C061746 for ; Wed, 6 Oct 2021 09:06:20 -0700 (PDT) Received: by mail-wr1-x436.google.com with SMTP id v25so10332353wra.2 for ; Wed, 06 Oct 2021 09:06:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=DwG3TKkoHLRsZlvw7J2WuNQwAH3IO2xZ0zZvAFY9XgQ=; b=k+RFZe1BpV83dwWJVGfEagHtyB2krN5IIqdbEPpzyfAGxITS00i5d3TdJGJWY6mBph +WX5+O6NaROsL26KMQxISL6DUyddqy7X1fLvugUl6lM9Gk7+7epnbFAbC4bMbUqD9BRc vXzJ5KAmWMIP5rAXlonYev+OGRkkURDXsfhwa6i7U80AEGFbaibpl4P6rtSeQq/i3RQp 3wDrspfCTH5o7QzeGMAn8ebDTbdUt1bUkwcml2fXIXwbs2toRLKAoXgvmJnjNyOeaPWq Sj8+cn1I5OVbCGm0PriofljzgYrEFCis2muayfqh/7QvFXxSSrPJOJqqzi6HKtYNdnW3 KT+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=DwG3TKkoHLRsZlvw7J2WuNQwAH3IO2xZ0zZvAFY9XgQ=; b=fJ4WW4KbsMk8o2WQowTYxRSj7uC6+knVnNF+X7wgjmj0Acx798svRKHY6R+gJRCyUQ hueD6rPjhoq2MvhYGxFpMyiIaNY/3qy8Bx9a8CNZhun3WmKDX8oLhOvdyPdyich/JMRm hlx2zjYE+lKYOs92eafcDGbPoVc1IHa7jAzWRxEDJCMXO8k8LLhkbiJnkJI3HEFvsP06 OF5EYO7OGCzoEGLKs+RiUEt/bRAohw0pjkX/QYLz89cdBRX+JGmXmD4TET/hKq71RiFN ltNvS62DGcqJ6m6qmW5YumUjwoDQwko6Ebfv6pa68jChTXxGf5k8yQUOPeiy+2EuPGkQ 0Anw== X-Gm-Message-State: AOAM531D3V13EOkuR03ryv06uD4Ohu6wqKZRrYBvkmrd81TYNnjsN1pR yU4xyLINfHrz1shBFr+dPzJ2/A== X-Received: by 2002:a5d:6da9:: with SMTP id u9mr1878130wrs.84.1633536379085; Wed, 06 Oct 2021 09:06:19 -0700 (PDT) Received: from ?IPv6:2a01:e34:ed2f:f020:278:1f59:2992:87fe? ([2a01:e34:ed2f:f020:278:1f59:2992:87fe]) by smtp.googlemail.com with ESMTPSA id y8sm17477131wrr.21.2021.10.06.09.06.17 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 06 Oct 2021 09:06:18 -0700 (PDT) Subject: Re: [PATCH v2 0/2] Add a generic virtual thermal sensor To: "Rafael J. Wysocki" Cc: Alexandre Bailon , "Zhang, Rui" , Amit Kucheria , Linux PM , Linux Kernel Mailing List , ben.tseng@mediatek.com, Kevin Hilman , Matthias Kaehlcke References: <20210917072732.611140-1-abailon@baylibre.com> <7cddcdb7-4efd-bfdb-3d86-f5862ea0b7fe@baylibre.com> <8a9e5f13-6253-2d0d-35a8-789090af4521@linaro.org> <794e62ea-d867-3827-de5f-24ddc86c3524@linaro.org> <4446577e-c7fa-daeb-e0fe-8a530633ef5d@baylibre.com> From: Daniel Lezcano Message-ID: <03aeb132-bc0c-93f7-c7db-8575a665d2a7@linaro.org> Date: Wed, 6 Oct 2021 18:06:17 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/10/2021 18:45, Rafael J. Wysocki wrote: > On Mon, Oct 4, 2021 at 3:42 PM Daniel Lezcano wrote: >> >> On 04/10/2021 12:24, Alexandre Bailon wrote: >>> >>> On 9/22/21 10:10 AM, Daniel Lezcano wrote: >>>> On 20/09/2021 15:12, Alexandre Bailon wrote: >>>>> On 9/17/21 4:03 PM, Daniel Lezcano wrote: >>>>>> On 17/09/2021 15:33, Alexandre Bailon wrote: >>>>>>> Hi Daniel, >>>>>>> >>>>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote: >>>>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote: >>>>>>>>> This series add a virtual thermal sensor. >>>>>>>>> It could be used to get a temperature using some thermal sensors. >>>>>>>>> Currently, the supported operations are max, min and avg. >>>>>>>>> The virtual sensor could be easily extended to support others >>>>>>>>> operations. >>>>>>>>> >>>>>>>>> Note: >>>>>>>>> Currently, thermal drivers must explicitly register their sensors to >>>>>>>>> make them >>>>>>>>> available to the virtual sensor. >>>>>>>>> This doesn't seem a good solution to me and I think it would be >>>>>>>>> preferable to >>>>>>>>> update the framework to register the list of each available sensors. >>>>>>>> Why must the drivers do that ? >>>>>>> Because there are no central place where thermal sensor are >>>>>>> registered. >>>>>>> The only other way I found was to update thermal_of.c, >>>>>>> to register the thermal sensors and make them available later to the >>>>>>> virtual thermal sensor. >>>>>>> >>>>>>> To work, the virtual thermal need to get the sensor_data the ops from >>>>>>> the thermal sensor. >>>>>>> And as far I know, this is only registered in thermal_of.c, in the >>>>>>> thermal zone data >>>>>>> but I can't access it directly from the virtual thermal sensor. >>>>>>> >>>>>>> How would you do it ? >>>>>> Via the phandles when registering the virtual sensor ? >>>>> As far I know, we can't get the ops or the sensor_data from the phandle >>>>> of a thermal sensor. >>>>> The closest solution I found so far would be to aggregate the thermal >>>>> zones instead of thermal sensors. >>>>> thermal_zone_device has the data needed and a thermal zone could be find >>>>> easily using its name. >>>> Yeah, the concept of the thermal zone and the sensor are very close. >>>> >>>> There is the function in thermal_core.h: >>>> >>>> -> for_each_thermal_zone() >>>> >>>> You should be able for each 'slave' sensor, do a lookup to find the >>>> corresponding thermal_zone_device_ops. >>>> >>>>> But, using a thermal_zone_device, I don't see how to handle module >>>>> unloading. >>>> I think try_module_get() / module_put() are adequate for this situation >>>> as it is done on an external module and we can not rely on the exported >>>> symbols. >>> I don't see how it would be possible to use these functions. >>> The thermal zone doesn't have the data required to use it. >> >> Actually I was able to crash the kernel by doing: >> >> console 1: >> >> while $(true); do insmod && rmmod ; done >> >> console 2: >> >> while $(true); cat /sys/class/thermal/thermal_zone0/temp; done >> >> So there is something wrong already in the thermal framework. > > Hmmm. > >> IMO, the first thing would be to fix this critical issue by getting the >> sensor module refcount when the thermal zone is enabled and dropping it >> when it is disabled. >> >> With that fixed, perhaps it will possible to get the device associated >> with the sensor and then try_module_get(dev->driver->owner) >> >>> Maybe a more easier way is to use the thermal_zone_device mutex. >>> If I get a lock before to use the thermal_zone_device ops, I have the >>> guaranty that module won't be unloaded. > > That would be my approach too. The mutex is private to the thermal core. The virtual sensor should not touch it :/ Perhaps, it can work with a private spin_lock with a try_spinlock() ? >>> When a "thermal of sensor" is unloaded, it calls >>> thermal_zone_of_sensor_unregister which takes a lock before >>> update ops. >> >> I'm not sure to understand. The goal is to have the refcount on the >> modules to be incremented when the virtual sensor is using them. > > IMO the goal is to prevent the code from crashing when modules get > unloaded. I'm not really sure if refcounts alone are sufficient for > that. The problem is in the loop: +static int virtual_thermal_sensor_get_temp(void *data, int *temperature) +{ + struct virtual_thermal_sensor *sensor = data; + int max_temp = INT_MIN; + int temp; + int i; + + for (i = 0; i < sensor->count; i++) { + struct thermal_sensor_data *hw_sensor; + + hw_sensor = &sensor->sensors[i]; + if (!hw_sensor->ops) + return -ENODEV; + + hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp); + max_temp = sensor->aggr_temp(max_temp, temp); + } + + *temperature = max_temp; + + return 0; +} If one of the sensor is unloaded when get_temp is called, hw_sensor->ops->get_temp will crash. So the proposal is virtual_sensor_add_sensor() does try_get_module() and virtual_sensor_remove_sensor() does put_module(). The ref on the 'slave' modules will be release only if the virtual sensor is unregistered. So until, the virtual sensor is unregistered, the 'slaves' modules can not be unloaded. That is what we find with eg. the wifi modules. >> Until the virtual sensor is registered, it will prevent the other >> modules to be unloaded. > > Unless they are forced to unload that is, AFAICS. > > IMO it would be better to make the code survive unloading of a module. -- Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog