Message-ID: <882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com>
Subject: Re: [PATCH v4 3/7] regulator: IRQ based event/error notification helpers
From: Matti Vaittinen
Reply-To: matti.vaittinen@fi.rohmeurope.com
To: Kees Cook, Andy Shevchenko, Zhang Rui, Guenter Roeck
Cc: "agross@kernel.org", "broonie@kernel.org",
 "devicetree@vger.kernel.org", linux-power,
 "linux-kernel@vger.kernel.org", "linux-renesas-soc@vger.kernel.org",
 "linux-arm-msm@vger.kernel.org", "bjorn.andersson@linaro.org",
 "lgirdwood@gmail.com", "robh+dt@kernel.org"
References: <2b87b4637fde2225006cc122bc855efca0dcd7f1.1617692184.git.matti.vaittinen@fi.rohmeurope.com>
 <55397166b1c4107efc2a013635f63af142d9b187.camel@fi.rohmeurope.com>
 <42210c909c55f7672e4a4a9bfd34553a6f4c8146.camel@fi.rohmeurope.com>
 <202104082015.4DADF9DC48@keescook>
Date: Mon, 12 Apr 2021 15:24:16 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2021-04-09 at 10:08 +0300, Matti Vaittinen wrote:
> On Thu, 2021-04-08 at 20:20 -0700, Kees Cook wrote:
> > On Wed, Apr 07, 2021 at 03:50:15PM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 7, 2021 at 12:49 PM Vaittinen, Matti wrote:
> > > > On Wed, 2021-04-07 at 12:10 +0300, Andy Shevchenko wrote:
> > > > > On Wed, Apr 7, 2021 at 8:02 AM Matti Vaittinen wrote:
> > > > > > On Wed, 2021-04-07 at 01:44 +0300, Andy Shevchenko wrote:
> > > > > > > On Tuesday, April 6, 2021, Matti Vaittinen
> > > > > > > <matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > > > +	BUG();
> > > > > > > > +}
> >
> > This, though, are you sure you want to use BUG()?
> > Linus gets upset about such things:
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> >
>
> I see. I am unsure of what would be the best action in the regulator
> case we are handling here. To give the context, we assume here a
> situation where power has gone out of regulation and the hardware is
> probably failing. First countermeasure to protect what is left of HW
> is to shut-down the failing regulator. BUG() was called here as a
> last resort if shutting the power via regulator interface was not
> implemented or working.
>
> Eg, we try to take what ever last measure we can to minimize the HW
> damage - and BUG() was used for this in the qcom driver where I stole
> the idea. Judging the comment related to BUG() in asm-generic/bug.h
>
> /*
>  * Don't use BUG() or BUG_ON() unless there's really no way out; one
>  * example might be detecting data structure corruption in the middle
>  * of an operation that can't be backed out of. If the (sub)system
>  * can somehow continue operating, perhaps with reduced functionality,
>  * it's probably not BUG-worthy.
>  *
>  * If you're tempted to BUG(), think again: is completely giving up
>  * really the *only* solution? There are usually better options, where
>  * users don't need to reboot ASAP and can mostly shut down cleanly.
>  */
> https://elixir.bootlin.com/linux/v5.12-rc6/source/include/asm-generic/bug.h#L55
>
> this really might be valid use-case.
>
> To me the real question is what happens after the BUG() - and if
> there is any generic handling or if it is platform/board specific?
> Does it actually have any chance to save the HW?
>
> Mark already pointed that we might need to figure a way to punt a
> "failing event" to the user-space to initiate better "safety
> shutdown". Such event does not currently exist so I think the main
> use-case here is to do logging and potentially prevent enabling any
> further actions in the failing HW.
>
> So - any better suggestions?
>

Maybe we should take the same approach as is taken in thermal_core?
Quoting the thermal documentation:

"On an event of critical trip temperature crossing. Thermal framework
allows the system to shutdown gracefully by calling orderly_poweroff().
In the event of a failure of orderly_poweroff() to shut down the system
we are in danger of keeping the system alive at undesirably high
temperatures. To mitigate this high risk scenario we program a work
queue to fire after a pre-determined number of seconds to start an
emergency shutdown of the device using the kernel_power_off() function.
In case kernel_power_off() fails then finally emergency_restart() is
called in the worst case."

Maybe this 'hardware protection, in-kernel, emergency HW saving
shutdown' logic should be pulled out of thermal_core.c (or at least
exported) for other parts, like the regulators, to use?

I don't like the idea of relying on the user-space being in a shape
where it can handle the situation. I may be mistaken but I think a
quick action might be required. Hence the in-kernel handling does not
sound so bad to me.

I am open to all education and suggestions. Meanwhile I am planning to
just convert the BUG() to WARN(). I don't claim I know how BUG() is
implemented on each platform - but my understanding is that it does not
guarantee that any power is cut but just halts the calling process(?).
I guess this does not guarantee what happens next - maybe it even keeps
the power enabled and ends up just deadlocking the system because of
locks that are still held?

I think the thermal guys have been pondering this scenario for severe
temperature protection shutdown so I would like to hear your opinions.

Best Regards
	Matti Vaittinen
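
For illustration only, the escalation order in the quoted thermal
documentation (orderly_poweroff(), then kernel_power_off(), then
emergency_restart()) could be sketched as a small userspace simulation.
This is not kernel code and not how thermal_core.c implements it: every
sim_* name below is hypothetical, each step simply reports whether it
"worked", and the delayed work queue that the documentation describes
between the first and second step is left out.

```c
/* Userspace simulation only. All sim_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Knobs for the simulation: which steps succeed, and how many ran. */
struct sim_state {
	bool orderly_works;
	bool power_off_works;
	int calls;
};

/* A step returns true if it managed to bring the system down. */
struct sim_step {
	const char *name;
	bool (*run)(void *ctx);
};

/* Stand-in for the graceful path (orderly_poweroff() in the quote). */
bool sim_orderly_poweroff(void *ctx)
{
	struct sim_state *s = ctx;

	s->calls++;
	return s->orderly_works;
}

/* Stand-in for the forced power-off (kernel_power_off() in the quote). */
bool sim_kernel_power_off(void *ctx)
{
	struct sim_state *s = ctx;

	s->calls++;
	return s->power_off_works;
}

/* Stand-in for the last resort; assumed to always succeed here. */
bool sim_emergency_restart(void *ctx)
{
	struct sim_state *s = ctx;

	s->calls++;
	return true;
}

/* The ladder, mildest action first. */
const struct sim_step sim_ladder[] = {
	{ "orderly_poweroff", sim_orderly_poweroff },
	{ "kernel_power_off", sim_kernel_power_off },
	{ "emergency_restart", sim_emergency_restart },
};

/*
 * Run the steps in order and stop at the first one that succeeds.
 * Returns the index of that step, or -1 if every step failed.
 */
int sim_escalate(const struct sim_step *steps, size_t n, void *ctx)
{
	assert(steps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (steps[i].run(ctx))
			return (int)i;
	}
	return -1;
}
```

Calling sim_escalate(sim_ladder, 3, &state) with orderly_works set to
false and power_off_works set to true stops at index 1, which is the
"graceful shutdown failed, force the power off" case. A shared in-kernel
helper along these lines is what pulling the logic out of thermal_core.c
would give the regulators, with the timeout added back between the
steps.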