Subject: Re: [PATCH] thermal: core: Add a back up thermal shutdown mechanism
To: Eduardo Valentin <edubezval@gmail.com>,
        Zhang Rui <rui.zhang@intel.com>
References: <1490941820-13511-1-git-send-email-j-keerthy@ti.com>
 <20170411172918.GA5193@localhost.localdomain>
 <f64632d5-c780-e5fe-cff7-8ed1459348a0@ti.com>
 <1491967248.2357.25.camel@intel.com>
 <492e72af-ff33-d193-071e-5bc00df9a8b0@ti.com>
 <20170412040542.GA11305@localhost.localdomain>
 <abf93eec-890f-4c3e-68fa-58c10678dde9@ti.com>
 <1491985580.2357.39.camel@intel.com>
 <db07d448-0fa9-f582-a323-5edcb9a0b509@ti.com>
 <1491986744.2357.42.camel@intel.com>
 <20170412154358.GA12881@localhost.localdomain>
CC: Keerthy <j-keerthy@ti.com>, <linux-pm@vger.kernel.org>,
        <linux-kernel@vger.kernel.org>, <linux-omap@vger.kernel.org>,
        <nm@ti.com>, <t-kristo@ti.com>
From: Grygorii Strashko <grygorii.strashko@ti.com>
Message-ID: <b565f2c9-fdd7-7525-da91-695f113e631b@ti.com>
Date: Wed, 12 Apr 2017 11:31:18 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <20170412154358.GA12881@localhost.localdomain>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3126
Lines: 99


On 04/12/2017 10:44 AM, Eduardo Valentin wrote:
> Hello,
> 
...

> 
> I agree. But there is nothing that says it is not reentrant. If you
> saw something along these lines, can you please share?
> 
>>>> will you generate a patch to do this?
>>> Sure. I will generate a patch to take care of 1) To make sure that
>>> orderly_poweroff is called only once right away. I have already
>>> tested.
>>>
>>> for 2) Cancel all the scheduled work queues to monitor the
>>> temperature.
>>> I will take some more time to make it and test.
>>>
>>> Is that okay? Or you want me to send both together?
>>>
>> I think you can send patch for step 1 first.
> 
> I am happy to see that Keerthy found the problem with his setup and a
> possible solution. But I have a few concerns here.
> 
> 1. If the regular shutdown process takes 10 seconds, that is a ballpark
> that thermal should never wait. orderly_poweroff() calls run_cmd() with
> the wait flag set. That means, if the regular userland shutdown takes
> 10s, we are waiting for it. Obviously this is not acceptable, especially
> if you set up the critical trip to be 125C. Now, if you properly size
> the critical trip to fire before the hotspot really reaches 125C, for
> 10s (or the time it takes to shut down), then fine. But based on what
> was described in this thread, his system is waiting 10s on regular
> shutdown, and his silicon is at out-of-spec temperature for 10s, which
> is wrong.
> 
> 2. The above scenario is not acceptable in the long run, especially from
> a reliability perspective. If orderly_poweroff() has a possibility to
> simply never return (or take too long), I would say the thermal
> subsystem is using the wrong API.
> 


Hm, I do not see that orderly_poweroff() will wait for anything now:
void orderly_poweroff(bool force)
{
	if (force) /* do not override the pending "true" */
		poweroff_force = true;
	schedule_work(&poweroff_work); 
^^^^^^^ async call. Even here there can be a pretty big delay if the system is under pressure
}


static int __orderly_poweroff(bool force)
{
	int ret;

	ret = run_cmd(poweroff_cmd);
^^^^ does not wait for the process to finish, only for the exec: flags == UMH_WAIT_EXEC

	if (ret && force) {
		pr_warn("Failed to start orderly shutdown: forcing the issue\n");

		/*
		 * I guess this should try to kick off some daemon to sync and
		 * poweroff asap.  Or not even bother syncing if we're doing an
		 * emergency shutdown?
		 */
		emergency_sync();
		kernel_power_off();
^^^ forced power off, but only if run_cmd() failed, for example if /sbin/poweroff doesn't exist
	}

	return ret;
}

static bool poweroff_force;

static void poweroff_work_func(struct work_struct *work)
{
	__orderly_poweroff(poweroff_force);
}

As a result, thermal has no control over the power off any more after
calling orderly_poweroff(), and it cannot get the result of the userspace
poweroff binary execution.
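
The flow quoted above can be modeled in a few lines of userspace C. This
is only a toy sketch, not kernel code: every toy_* name is made up for
illustration, and the "worker" is run by hand instead of by a workqueue
thread. It shows two things: the caller returns before the handler runs,
and the handler's return value is visible only to the worker, never to
the code that requested the poweroff.

```c
#include <stdbool.h>

/* Toy userspace stand-ins, NOT the kernel API; all names are hypothetical. */
struct toy_work {
	bool pending;		/* queued but not yet executed */
	int (*fn)(void);	/* handler, run later by the "worker" */
	int last_ret;		/* handler result; only the worker sees it */
};

/* Pretend the exec of the poweroff binary was started successfully. */
static int toy_run_cmd(void)
{
	return 0;
}

static struct toy_work toy_poweroff_work = { .fn = toy_run_cmd };

/* Like schedule_work(): only queues the item, never runs the handler. */
static bool toy_schedule_work(struct toy_work *w)
{
	if (w->pending)
		return false;	/* already queued, the request is coalesced */
	w->pending = true;
	return true;
}

/* Same shape as orderly_poweroff(): void, so the caller learns nothing. */
static void toy_orderly_poweroff(void)
{
	toy_schedule_work(&toy_poweroff_work);
}

/* Stands in for the workqueue thread picking the item up some time later. */
static void toy_worker_run(struct toy_work *w)
{
	if (!w->pending)
		return;
	w->pending = false;
	w->last_ret = w->fn();
}
```

After toy_orderly_poweroff() returns, the work is merely pending; a second
call before the worker runs changes nothing, and once the worker has run
the item can be queued again.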

> 
> If you are going to implement the above two patches, keep in mind:
> i. At least within the thermal subsystem, you need to take care of all
> zones that could trigger a shutdown.
> ii. serializing the calls to orderly_poweroff() seems to be more
> concerning than cancelling all monitoring.
> 
> 
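For point ii, one possible shape of a "call it only once" guard is
sketched below in portable C11. It is a userspace illustration under
stated assumptions, not a proposed patch: sketch_orderly_poweroff() is a
hypothetical stand-in that only counts calls, and atomic_flag plays the
role that a mutex or an atomic bit operation would presumably play in the
thermal core. One shared flag covers every zone, so whichever zone hits
its critical trip first wins and all later callers back off.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One flag shared by all zones; set once, never cleared in this sketch. */
static atomic_flag sketch_shutdown_requested = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for orderly_poweroff(); it only counts calls. */
static int sketch_poweroff_calls;

static void sketch_orderly_poweroff(bool force)
{
	(void)force;
	sketch_poweroff_calls++;
}

/*
 * Called from any zone that crosses its critical trip.
 * Returns true only for the single caller that triggered the shutdown.
 */
static bool sketch_thermal_emergency_poweroff(void)
{
	/* test_and_set returns the previous value: true means "already set" */
	if (atomic_flag_test_and_set(&sketch_shutdown_requested))
		return false;

	sketch_orderly_poweroff(true);
	return true;
}
```

Calling sketch_thermal_emergency_poweroff() twice leaves
sketch_poweroff_calls at 1; the second call returns false without
touching the poweroff path.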

-- 
regards,
-grygorii