Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751282AbbEQX46 (ORCPT ); Sun, 17 May 2015 19:56:58 -0400
Received: from v094114.home.net.pl ([79.96.170.134]:63609 "HELO v094114.home.net.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1750837AbbEQX4s (ORCPT ); Sun, 17 May 2015 19:56:48 -0400
From: "Rafael J. Wysocki"
To: Geert Uytterhoeven
Cc: Dmitry Torokhov, "Grygorii.Strashko@linaro.org", Geert Uytterhoeven, Kevin Hilman, Santosh Shilimkar, Linux PM list, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] PM / clock_ops: Fix clock error check in __pm_clk_add()
Date: Mon, 18 May 2015 02:22:05 +0200
Message-ID: <6908510.UPNVrX8Gin@vostro.rjw.lan>
User-Agent: KMail/4.11.5 (Linux/4.0.0+; KDE/4.11.5; x86_64; ; )
In-Reply-To:
References: <1431074863-19124-1-git-send-email-geert+renesas@glider.be> <9611184.kabx71SGcD@vostro.rjw.lan>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="utf-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 7223
Lines: 145

On Saturday, May 16, 2015 11:37:01 PM Geert Uytterhoeven wrote:
> On Thu, May 14, 2015 at 12:45 AM, Rafael J. Wysocki wrote:
> > On Tuesday, May 12, 2015 05:32:29 PM Dmitry Torokhov wrote:
> >> On Wed, May 13, 2015 at 02:22:50AM +0200, Rafael J.
> >> Wysocki wrote:
> >> > On Tuesday, May 12, 2015 11:07:33 AM Dmitry Torokhov wrote:
> >> > > On Tue, May 12, 2015 at 08:59:03PM +0300, Grygorii.Strashko@linaro.org wrote:
> >> > > > On 05/12/2015 07:42 PM, Dmitry Torokhov wrote:
> >> > > > > On Tue, May 12, 2015 at 04:55:39PM +0300, Grygorii.Strashko@linaro.org wrote:
> >> > > > >> On 05/09/2015 12:05 AM, Dmitry Torokhov wrote:
> >> > > > >>> On Fri, May 08, 2015 at 10:59:04PM +0200, Geert Uytterhoeven wrote:
> >> > > > >>>> On Fri, May 8, 2015 at 7:19 PM, Dmitry Torokhov wrote:
> >> > > > >>>>> On Fri, May 08, 2015 at 10:47:43AM +0200, Geert Uytterhoeven wrote:
> >> > > > >>>>>> In the final iteration of commit 245bd6f6af8a62a2 ("PM / clock_ops: Add
> >> > > > >>>>>> pm_clk_add_clk()"), a refcount increment was added by Grygorii Strashko.
> >> > > > >>>>>> However, the accompanying IS_ERR() check operates on the wrong clock
> >> > > > >>>>>> pointer, which is always zero at this point, i.e. not an error.
> >> > > > >>>>>> This may lead to a NULL pointer dereference later, when __clk_get()
> >> > > > >>>>>> tries to dereference an error pointer.
> >> > > > >>>>>>
> >> > > > >>>>>> Check the passed clock pointer instead to fix this.
> >> > > > >>>>>
> >> > > > >>>>> Frankly I would remove the check altogether. Why do we only check for
> >> > > > >>>>> IS_ERR and not NULL or otherwise validate the pointer? The clk is passed
> >> > > > >>>>
> >> > > > >>>> __clk_get() does the NULL check.
> >> > > > >>>
> >> > > > >>> No, not really. It _handles_ clk being NULL and returns "everything is
> >> > > > >>> fine". In any case it is __clk_get's decision what to do.
> >> > > > >>>
> >> > > > >>> I dislike gratuitous checks of arguments passed in. Instead of relying
> >> > > > >>> on APIs refusing garbage we better not pass garbage to these APIs in the
> >> > > > >>> first place.
> >> > > > >>> So I'd change it to trust that we are given a usable
> >> > > > >>> pointer and simply do:
> >> > > > >>>
> >> > > > >>>         if (!__clk_get(clk)) {
> >> > > > >>>                 kfree(ce);
> >> > > > >>>                 return -ENOENT;
> >> > > > >>>         }
> >> > > > >>
> >> > > > >> Not sure this is the right thing to do, because this API initially
> >> > > > >> was intended to be used as below [1]:
> >> > > > >>         clk = of_clk_get(dev->of_node, i);
> >> > > > >>         ret = pm_clk_add_clk(dev, clk);
> >> > > > >>         clk_put(clk);
> >> > > > >>
> >> > > > >> and of_clk_get may return ERR_PTR().
> >> > > > >
> >> > > > > Jeez, that sequence was not meant to be taken literally, it does miss
> >> > > > > error handling completely. If you notice the majority of users of this
> >> > > > > API do something like below:
> >
> > What's the majority of zero users? ;-)
> >
> >> > > > >
> >> > > > >         i = 0;
> >> > > > >         while ((clk = of_clk_get(dev->of_node, i++)) && !IS_ERR(clk)) {
> >> > > > >                 dev_dbg(dev, "adding clock '%s' to list of PM clocks\n",
> >> > > > >                         __clk_get_name(clk));
> >> > > > >                 error = pm_clk_add_clk(dev, clk);
> >> > > > >                 clk_put(clk);
> >> > > > >                 if (error) {
> >> > > > >                         dev_err(dev, "pm_clk_add_clk failed %d\n", error);
> >> > > > >                         pm_clk_destroy(dev);
> >> > > > >                         return error;
> >> > > > >                 }
> >> > > > >         }
> >> > > > >
> >> > > > > i.e. it already validates clk pointer before passing it on since it
> >> > > > > needs to know when to stop iterating.
> >> > > >
> >> > > > np. It's just my opinion - if you agree that code will just crash
> >> > > > in case of passing invalid @clk argument (in worst case:)
> >> > > >
> >> > > > int __clk_get(struct clk *clk)
> >> > > > {
> >> > > >         struct clk_core *core = !clk ? NULL : clk->core;
> >> > > >                                               ^^^ here
> >> > >
> >> > > Yes, it will crash if you pass invalid pointer here, be it
> >> > > ERR_PTR-encoded value, or, for example, 0x1, or maybe (void
> >> > > *)random_32().
> >> > > The latter will probably not crash right away, but cause
> >> > > some random damage that will manifest later.
> >> >
> >> > Oh well. Shouldn't we actually do:
> >> >
> >> > int __clk_get(struct clk *clk)
> >> > {
> >> >         struct clk_core *core = IS_ERR_OR_NULL(clk) ? NULL : clk->core;
> >> >
> >> > and remove the check from __pm_clk_add() at the same time?
> >> >
> >> > Knowingly crashing on an error encoded as a pointer is kind of disgusting to me
> >> > and the difference between that and a random invalid pointer is that people who
> >> > pass error values encoded as pointers up the stack usually expect them to be
> >> > handled cleanly.
> >>
> >> I think the operative word here is "up". Returning ERR_PTR-encoded
> >> pointer is fine, checking it is fine as well, blindly passing it *down*
> >> into a random API is not fine and we should not try to accommodate this.
> >
> > You're basically saying "Passing an error-encoding pointer down to an API is
> > not valid" which I agree with, but I don't agree that it's OK to crash the
> > kernel when that happens. It's never OK to crash the kernel when we can
> > easily avoid that, because it may lead to user data loss.
> >
> > However, you seem to be arguing against fixing up things *silently* which may
> > hide serious bugs. That's a good point, so what about adding a WARN_ON_ONCE()
> > around the IS_ERR() check in Geert's patch?
>
> Most (all?) clock API calls allow passing in error pointers as returned by
> clk_get(). This allows for calling clk_get() and clk_prepare_enable() in a row,
> without any checking by the user (in many drivers, clocks are optional).
>
> __clk_get() is more of an internal function, that's why it doesn't
> have the check.
>
> So Grygorii's answer "the API is to be used like this", is not that insane,
> following other clock API calls.
>
> Now, pm_clk_add_clk() returns -ENOENT if the clock is not valid.
> This is a visible difference from pm_clk_add(), which (ignoring -ENOMEM) always
> returns zero, whether the clock for the con_id can be found or not (i.e. whether
> pm_clk_acquire() succeeds or not).
>
> I guess we want to be consistent here:
>   1. Either always return zero,
>   2. Or always propagate failures.
>
> Then, clocks can be optional, especially when considering clock domains.
> Hence existing code calling pm_clk_add() from the generic_pm_domain.attach_dev()
> callback may start to break when pm_clk_add() starts returning errors for
> non-existent clocks.

OK, I'll apply the patch as is, then.

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
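
For readers following the thread, the error-pointer convention being argued about can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented here along the lines of include/linux/err.h, while struct clk, fake_clk_get(), add_clk_checked() and demo() are hypothetical stand-ins invented for the example. It shows the shape of the check the patch ends up with, i.e. testing the pointer that was passed in before it reaches the refcount helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace re-implementation of the kernel's error-pointer helpers
 * (modelled on include/linux/err.h, simplified for illustration). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct clk; only a refcount is modelled. */
struct clk {
	int refcount;
};

/* Hypothetical stand-in for __clk_get(): it tolerates NULL, but would
 * dereference an ERR_PTR-encoded value if one were passed in. */
static int fake_clk_get(struct clk *clk)
{
	if (clk)
		clk->refcount++;
	return 1;
}

/* Sketch of the fixed check: validate the pointer that was passed in,
 * so an encoded error never reaches fake_clk_get(). */
int add_clk_checked(struct clk *clk)
{
	if (IS_ERR(clk) || !fake_clk_get(clk))
		return -ENOENT;
	return 0;
}

/* Usage example: a valid clock, a NULL clock and an encoded error. */
void demo(void)
{
	struct clk c = { 0 };

	assert(add_clk_checked(&c) == 0);
	assert(c.refcount == 1);
	assert(add_clk_checked(NULL) == 0);
	assert(add_clk_checked(ERR_PTR(-ENOENT)) == -ENOENT);
}
```

Wrapping the IS_ERR() test in a warning, as floated earlier in the thread, would change only the first branch of add_clk_checked(); the sketch leaves it silent, matching the patch being applied as is.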