From: "Pandruvada, Srinivas"
To: "Zhang, Rui", "linux-kernel@vger.kernel.org", "edubezval@gmail.com", "srikars@nvidia.com", "linux-pm@vger.kernel.org", "linux-tegra@vger.kernel.org", "mlongnecker@nvidia.com"
Subject: Re: [PATCH v2] thermal: add sysfs_notify on some attributes
Date: Tue, 15 Mar 2016 23:08:00 +0000
Message-ID: <1458079520.4486.39.camel@intel.com>
References: <1457979156-10972-1-git-send-email-srikars@nvidia.com>
In-Reply-To: <1457979156-10972-1-git-send-email-srikars@nvidia.com>

On Mon, 2016-03-14 at 11:12 -0700, Srikar Srimath Tirumala wrote:
> Add a sysfs_notify on thermal_zone*/temp and cooling_device*/
> cur_state whenever any trip is triggered or cur state is changed.
>
> This change allows usermode apps to register themselves to get
> notified, when certain thermal conditions occur and reduce their
> workload.
> This workload throttling allows usermode to react before
> hardware clocks are throttled and keep some critical apps running
> reliably longer.

I think we need a combination of the proposal in
https://patchwork.kernel.org/patch/7876351/ and this one.

For example, this patch notifies that some trip is violated, but that
is not enough for a user space application to take any action. User
space may not care about some trip violations, as they may be
transient. The patch from Eduardo addresses that by providing the trip,
temperature and last temperature information, but that patch only
addresses hot trips. I understand why Eduardo doesn't want to be
notified for passive trips, as there will be too many.

So IMO we need some mechanism to turn off notifications and to decide
which events result in user space notifications. On some x86 systems we
have 10+ passive/active trips, so this will result in too many
notifications. We may be in a thermally sensitive zone, where more code
execution means more heat. We could have a mask of trips that result in
notifications: by default no notifications, unless some user space
requests them.

During the last LPC we discussed using IIO for temperature threshold
notifications, and I submitted multiple changes for that. It looks like
we also care about trip point changes, so I think we need a more
comprehensive mechanism to address this. Maybe we should have a thermal
mini-summit during LPC again and decide on a comprehensive plan to
address all asynchronous thermal notifications.

Thanks,
Srinivas

>
> Signed-off-by: Srikar Srimath Tirumala
> ---
>
> Changes from v1:
>  - Calling sysfs_notify for thermal_zone*/temp only when there is a
>    trip violated on the thermal zone.
>  - Modified commit message.
>
>  drivers/thermal/thermal_core.c | 34 ++++++++++++++++++++++++++--------
>  1 file changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a0a8fd1..f54519e 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -419,14 +419,23 @@ static void monitor_thermal_zone(struct thermal_zone_device *tz)
>  	mutex_unlock(&tz->lock);
>  }
>  
> -static void handle_non_critical_trips(struct thermal_zone_device *tz,
> +static int handle_non_critical_trips(struct thermal_zone_device *tz,
>  			int trip, enum thermal_trip_type trip_type)
>  {
> +	int trip_temp;
> +	int ret = 0;
> +
>  	tz->governor ? tz->governor->throttle(tz, trip) :
>  		       def_governor->throttle(tz, trip);
> +
> +	tz->ops->get_trip_temp(tz, trip, &trip_temp);
> +	if (tz->temperature >= trip_temp)
> +		ret = 1;
> +
> +	return ret;
>  }
>  
> -static void handle_critical_trips(struct thermal_zone_device *tz,
> +static int handle_critical_trips(struct thermal_zone_device *tz,
>  				int trip, enum thermal_trip_type trip_type)
>  {
>  	int trip_temp;
> @@ -435,7 +444,7 @@ static void handle_critical_trips(struct thermal_zone_device *tz,
>  
>  	/* If we have not crossed the trip_temp, we do not care. */
>  	if (trip_temp <= 0 || tz->temperature < trip_temp)
> -		return;
> +		return 0;
>  
>  	trace_thermal_zone_trip(tz, trip, trip_type);
>  
> @@ -448,23 +457,28 @@ static void handle_critical_trips(struct thermal_zone_device *tz,
>  			  tz->temperature / 1000);
>  		orderly_poweroff(true);
>  	}
> +
> +	return 1;
>  }
>  
> -static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
> +static int handle_thermal_trip(struct thermal_zone_device *tz, int trip)
>  {
> +	int ret = 0;
>  	enum thermal_trip_type type;
>  
>  	tz->ops->get_trip_type(tz, trip, &type);
>  
>  	if (type == THERMAL_TRIP_CRITICAL || type == THERMAL_TRIP_HOT)
> -		handle_critical_trips(tz, trip, type);
> +		ret = handle_critical_trips(tz, trip, type);
>  	else
> -		handle_non_critical_trips(tz, trip, type);
> +		ret = handle_non_critical_trips(tz, trip, type);
>  	/*
>  	 * Alright, we handled this trip successfully.
>  	 * So, start monitoring again.
>  	 */
>  	monitor_thermal_zone(tz);
> +
> +	return ret;
>  }
>  
>  /**
> @@ -556,7 +570,7 @@ static void thermal_zone_device_reset(struct thermal_zone_device *tz)
>  void thermal_zone_device_update(struct thermal_zone_device *tz)
>  {
>  	int count;
> -
> +	int trips = 0;
>  	if (atomic_read(&in_suspend))
>  		return;
>  
> @@ -566,7 +580,10 @@ void thermal_zone_device_update(struct thermal_zone_device *tz)
>  	update_temperature(tz);
>  
>  	for (count = 0; count < tz->trips; count++)
> -		handle_thermal_trip(tz, count);
> +		trips += handle_thermal_trip(tz, count);
> +
> +	if (trips)
> +		sysfs_notify(&tz->device.kobj, NULL, "temp");
>  }
>  EXPORT_SYMBOL_GPL(thermal_zone_device_update);
>  
> @@ -1638,6 +1655,7 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
>  	cdev->updated = true;
>  	trace_cdev_update(cdev, target);
>  	dev_dbg(&cdev->device, "set to state %lu\n", target);
> +	sysfs_notify(&cdev->device.kobj, NULL, "cur_state");
>  }
>  EXPORT_SYMBOL(thermal_cdev_update);
>
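
For reference, here is a minimal sketch of how a user space consumer
might wait on the "temp" notification the quoted patch adds. It is not
part of the patch. The helper names (read_attr, wait_for_notify,
read_temp_mc) are hypothetical, and it assumes the usual sysfs
convention that a sysfs_notify() wakes a poll() waiting for
POLLPRI | POLLERR, after which the attribute is re-read from offset 0.

```c
/* Sketch only: user space side of a sysfs_notify() on a thermal attribute. */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Re-read an attribute from offset 0; returns bytes read or -1 on error. */
static ssize_t read_attr(int fd, char *buf, size_t len)
{
	ssize_t n;

	if (len == 0 || lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, len - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}

/*
 * Wait until the attribute is notified or timeout_ms expires
 * (-1 waits forever). Returns 1 on notification, 0 on timeout,
 * -1 on error. The attribute must have been read once beforehand.
 */
static int wait_for_notify(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	return (ret > 0 && (pfd.revents & (POLLPRI | POLLERR))) ? 1 : 0;
}

/* Parse the attribute as millidegrees Celsius; returns 0 on success. */
static int read_temp_mc(int fd, long *temp)
{
	char buf[32];
	char *end;

	if (read_attr(fd, buf, sizeof(buf)) <= 0)
		return -1;
	*temp = strtol(buf, &end, 10);
	return end == buf ? -1 : 0;
}
```

A caller would open something like
/sys/class/thermal/thermal_zone0/temp (the zone index is an assumption)
with O_RDONLY, call read_temp_mc() once, then loop on
wait_for_notify(fd, -1) followed by read_temp_mc(). Note that the
wakeup itself carries no trip information, so the application still has
to decide from the temperature alone whether it cares.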