Date: Thu, 4 Apr 2019 07:43:56 +0800
From: Wu Hao
To: Moritz Fischer
Cc: atull@kernel.org, linux-fpga@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-api@vger.kernel.org, Luwei Kang, Russ Weight, Xu Yilun
Subject: Re: [PATCH 14/17] fpga: dfl: fme: add thermal management support
Message-ID: <20190403234356.GA11098@hao-dev>
References: <1553483264-5379-1-git-send-email-hao.wu@intel.com>
 <1553483264-5379-15-git-send-email-hao.wu@intel.com>
 <20190402145925.GA15773@archbook>
 <20190403163147.GA28570@hao-dev>
 <20190403180909.GD5752@archbook>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190403180909.GD5752@archbook>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 03, 2019 at 11:09:09AM -0700, Moritz Fischer wrote:
> Hi Hao,
>
> On Thu, Apr 04, 2019 at 12:31:47AM +0800, Wu Hao wrote:
> > On Tue, Apr 02, 2019 at 07:59:25AM -0700, Moritz Fischer wrote:
> > > Hi Wu,
> > >
> > > On Mon, Mar 25, 2019 at 11:07:41AM +0800, Wu Hao wrote:
> > > > This patch adds support to thermal management private feature for DFL
> > > > FPGA Management Engine (FME). As thermal throttling is handled by
> > > > hardware automatically per pre-defined thresholds, this private
> > > > feature driver only provides read-only sysfs interfaces for user
> > > > to read temperature, thresholds, threshold policy and other info.
> > > >
> > > > Signed-off-by: Luwei Kang
> > > > Signed-off-by: Russ Weight
> > > > Signed-off-by: Xu Yilun
> > > > Signed-off-by: Wu Hao
> > > > ---
> > > >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  56 +++++++
> > > >  drivers/fpga/dfl-fme-main.c                      | 202 +++++++++++++++++++++++
> > > >  2 files changed, 258 insertions(+)
> > > >
> > > > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > > index b8327e9..d3aeb88 100644
> > > > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > > @@ -44,3 +44,59 @@ Description: Read-only. It returns socket_id to indicate which socket
> > > >                  this FPGA belongs to, only valid for integrated solution.
> > > >                  User only needs this information, in case standard numa node
> > > >                  can't provide correct information.
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/temperature
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. It returns temperature (in Celsius) of this FPGA
> > > > +                device.
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. Read this file to get the temperature threshold1
> > > > +                (in Celsius).
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. Read this file to get the temperature threshold2
> > > > +                (in Celsius).
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/trip_threshold
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. It returns trip threshold (in Celsius), once FPGA
> > > > +                temperature reaches trip threshold, it triggers a fatal event
> > > > +                to board management controller (BMC) to shutdown FPGA.
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_status
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. It returns 1 if temperature reaches threshold1,
> > > > +                otherwise 0. Once temperature reaches threshold1, hardware
> > > > +                will automatically enter throttling state (AP1 - 50%
> > > > +                or AP2 - 90% throttling, see 'threshold1_policy').
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2_status
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. It returns 1 if temperature reaches threshold2,
> > > > +                otherwise 0. Once temperature reaches threshold2, hardware
> > > > +                will automatically enter the deepest throttling state (AP6
> > > > +                - 100% throttling).
> > > > +
> > > > +What:           /sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_policy
> > > > +Date:           March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:        Wu Hao
> > > > +Description:    Read-only. Read this file to get the policy of temperature
> > > > +                threshold1. It only supports two value (policy):
> > > > +                    0 - AP2 state (90% throttling)
> > > > +                    1 - AP1 state (50% throttling)
> > >
> > > These look like they could directly map to the linux thermal framework,
> > > any reason you can't use the thermal framework?
> > >
> > > The trip stuff literally maps 1:1 to what a thermal driver does, I think
> > > that's something you'd wanna consider.
> >
> >
> > Hi Moritz,
> >
> > Thanks a lot for the suggestion, actually I feel that the trip points in thermal
> > zone are used to indicate cooling actions required for thermal software either
> > in kernel or userspace. But in this case, such FPGA hardware handles cooling
> > automatically (yes, driver only expose Read-only sysfs for information), so
> > software doesn't need to take care of this at all. For this purpose, it seems
> > that we don't have to put these thresholds as trip points. And per my
> > understanding, if people use such FPGA device, then they may need to know
> > what's the current hardware throttling behavior, e.g. 50% vs 90%. These
> > information can't be provided by standard thermal zone sysfs, so anyway user
> > needs these sysfs interfaces to know it. But it seems that we still could
> > create a thermal zone without trip points, it could help if user wants to
> > connect some external cooling devices via userspace thermal daemon, they can
> > define whatever trip points they like to activate the external cooling
> > device. I will consider this further more and come up with a new patch in
> > v2 patchset.
>
> Generally speaking extending an existing framework with the
> functionality you want is preferable over rolling 100% your own.
>
> So please look into this.

Yes, agree, will look into this and try to fix this in next version.

Thanks for the comments.

Hao

>
> Thanks,
> Moritz
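
For readers who want to see how userspace might consume the read-only
attributes proposed in this patch, here is a minimal Python sketch. The
attribute names and the threshold1_policy encoding are taken from the quoted
ABI text. Everything else is an assumption made for illustration: that each
file holds a single decimal integer, the helper names read_thermal_mgmt and
describe, and the choice to let threshold2 take precedence over threshold1
when both status flags are set. The interface itself may change in v2 if the
driver moves to the thermal framework.

```python
from pathlib import Path

# Attribute names as listed in the ABI document quoted in this thread.
ATTRS = (
    "temperature",
    "threshold1",
    "threshold2",
    "trip_threshold",
    "threshold1_status",
    "threshold2_status",
    "threshold1_policy",
)

# Encoding documented for threshold1_policy in the quoted ABI text.
POLICY = {0: "AP2 (90% throttling)", 1: "AP1 (50% throttling)"}


def read_thermal_mgmt(base):
    """Read every thermal_mgmt attribute under `base` as an integer.

    Assumption: each file holds one decimal integer plus a newline.
    """
    base = Path(base)
    return {name: int((base / name).read_text().strip()) for name in ATTRS}


def describe(values):
    """Summarise the throttling state implied by the attribute values.

    Assumption: threshold2 (AP6) wins when both status flags are set.
    """
    if values["threshold2_status"]:
        state = "AP6 (100% throttling)"
    elif values["threshold1_status"]:
        state = POLICY.get(values["threshold1_policy"], "unknown policy")
    else:
        state = "not throttling"
    return f"{values['temperature']} C, {state}"
```

On a system that exposes the v1 interface, calling
describe(read_thermal_mgmt("/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt"))
would return a one-line summary. The path is the one given in the ABI
document above.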