From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>,
        "Grygorii.Strashko@linaro.org" <grygorii.strashko@linaro.org>,
        Geert Uytterhoeven <geert+renesas@glider.be>,
        Kevin Hilman <khilman@linaro.org>,
        Santosh Shilimkar <santosh.shilimkar@ti.com>,
        Linux PM list <linux-pm@vger.kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] PM / clock_ops: Fix clock error check in __pm_clk_add()
Date: Mon, 18 May 2015 02:22:05 +0200
Message-ID: <6908510.UPNVrX8Gin@vostro.rjw.lan>
User-Agent: KMail/4.11.5 (Linux/4.0.0+; KDE/4.11.5; x86_64; ; )
In-Reply-To: <CAMuHMdWySBrfoyE=5809rHPBVWURT_54yQiZF+XACpwFCCdBTg@mail.gmail.com>
References: <1431074863-19124-1-git-send-email-geert+renesas@glider.be> <9611184.kabx71SGcD@vostro.rjw.lan> <CAMuHMdWySBrfoyE=5809rHPBVWURT_54yQiZF+XACpwFCCdBTg@mail.gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="utf-8"
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 7223
Lines: 145

On Saturday, May 16, 2015 11:37:01 PM Geert Uytterhoeven wrote:
> On Thu, May 14, 2015 at 12:45 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, May 12, 2015 05:32:29 PM Dmitry Torokhov wrote:
> >> On Wed, May 13, 2015 at 02:22:50AM +0200, Rafael J. Wysocki wrote:
> >> > On Tuesday, May 12, 2015 11:07:33 AM Dmitry Torokhov wrote:
> >> > > On Tue, May 12, 2015 at 08:59:03PM +0300, Grygorii.Strashko@linaro.org wrote:
> >> > > > On 05/12/2015 07:42 PM, Dmitry Torokhov wrote:
> >> > > > > On Tue, May 12, 2015 at 04:55:39PM +0300, Grygorii.Strashko@linaro.org wrote:
> >> > > > >> On 05/09/2015 12:05 AM, Dmitry Torokhov wrote:
> >> > > > >>> On Fri, May 08, 2015 at 10:59:04PM +0200, Geert Uytterhoeven wrote:
> >> > > > >>>> On Fri, May 8, 2015 at 7:19 PM, Dmitry Torokhov
> >> > > > >>>> <dmitry.torokhov@gmail.com> wrote:
> >> > > > >>>>> On Fri, May 08, 2015 at 10:47:43AM +0200, Geert Uytterhoeven wrote:
> >> > > > >>>>>> In the final iteration of commit 245bd6f6af8a62a2 ("PM / clock_ops: Add
> >> > > > >>>>>> pm_clk_add_clk()"), a refcount increment was added by Grygorii Strashko.
> >> > > > >>>>>> However, the accompanying IS_ERR() check operates on the wrong clock
> >> > > > >>>>>> pointer, which is always zero at this point, i.e. not an error.
> >> > > > >>>>>> This may lead to a NULL pointer dereference later, when __clk_get()
> >> > > > >>>>>> tries to dereference an error pointer.
> >> > > > >>>>>>
> >> > > > >>>>>> Check the passed clock pointer instead to fix this.
> >> > > > >>>>>
> >> > > > >>>>> Frankly I would remove the check altogether. Why do we only check for
> >> > > > >>>>> IS_ERR and not NULL or otherwise validate the pointer? The clk is passed
> >> > > > >>>>
> >> > > > >>>> __clk_get() does the NULL check.
> >> > > > >>>
> >> > > > >>> No, not really. It _handles_ clk being NULL and returns "everything is
> >> > > > >>> fine". In any case it is __clk_get's decision what to do.
> >> > > > >>>
> >> > > > >>> I dislike gratuitous checks of arguments passed in. Instead of relying
> >> > > > >>> on APIs refusing garbage we better not pass garbage to these APIs in the
> >> > > > >>> first place. So I'd change it to trust that we are given a usable
> >> > > > >>> pointer and simply do:
> >> > > > >>>
> >> > > > >>>     if (!__clk_get(clk)) {
> >> > > > >>>             kfree(ce);
> >> > > > >>>             return -ENOENT;
> >> > > > >>>     }
> >> > > > >>
> >> > > > >> Not sure this is right thing to do, because this API initially
> >> > > > >> was intended to be used as below [1]:
> >> > > > >>      clk = of_clk_get(dev->of_node, i);
> >> > > > >>      ret = pm_clk_add_clk(dev, clk);
> >> > > > >>      clk_put(clk);
> >> > > > >>
> >> > > > >> and of_clk_get may return ERR_PTR().
> >> > > > >
> >> > > > > Jeez, that sequence was not meant to be taken literally, it does miss
> >> > > > > error handling completely. If you notice the majority of users of this
> >> > > > > API do something like below:
> 
> What's the majority of zero users? ;-)
> 
> >> > > > >
> >> > > > >       i = 0;
> >> > > > >       while ((clk = of_clk_get(dev->of_node, i++)) && !IS_ERR(clk)) {
> >> > > > >               dev_dbg(dev, "adding clock '%s' to list of PM clocks\n",
> >> > > > >                       __clk_get_name(clk));
> >> > > > >               error = pm_clk_add_clk(dev, clk);
> >> > > > >               clk_put(clk);
> >> > > > >               if (error) {
> >> > > > >                       dev_err(dev, "pm_clk_add_clk failed %d\n", error);
> >> > > > >                       pm_clk_destroy(dev);
> >> > > > >                       return error;
> >> > > > >               }
> >> > > > >       }
> >> > > > >
> >> > > > > i.e. it already validates clk pointer before passing it on since it
> >> > > > > needs to know when to stop iterating.
> >> > > >
> >> > > > np. It's just my opinion - if you agree that code will just crash
> >> > > > in case of passing invalid @clk argument (in worst case:)
> >> > > >
> >> > > > int __clk_get(struct clk *clk)
> >> > > > {
> >> > > >         struct clk_core *core = !clk ? NULL : clk->core;
> >> > > >                                                 ^^^ here
> >> > >
> >> > > Yes, it will crash if you pass invalid pointer here, be it
> >> > > ERR_PTR-encoded value, or, for example, 0x1, or maybe (void
> >> > > *)random_32(). The latter will probably not crash right away, but cause
> >> > > some random damage that will manifest later.
> >> >
> >> > Oh well.  Shouldn't we actually do:
> >> >
> >> > int __clk_get(struct clk *clk)
> >> > {
> >> >     struct clk_core *core = IS_ERR_OR_NULL(clk) ? NULL : clk->core;
> >> >
> >> > and remove the check from __pm_clk_add() at the same time?
> >> >
> >> > Knowingly crashing on an error encoded as a pointer is kind of disgusting to me
> >> > and the difference between that and a random invalid pointer is that people who
> >> > pass error values encoded as pointers up the stack usually expect them to be
> >> > handled cleanly.
> >>
> >> I think the operative word here is "up". Returning ERR_PTR-encoded
> >> pointer is fine, checking it is fine as well, blindly passing it *down*
> >> into a random API is not fine and we should not try to accommodate this.
> >
> > You're basically saying "Passing an error-encoding pointer down to an API is
> > not valid" which I agree with, but I don't agree that it's OK to crash the
> > kernel when that happens.  It's never OK to crash the kernel when we can
> > easily avoid that, because it may lead to user data loss.
> >
> > However, you seem to be arguing against fixing up things *silently* which may
> > hide serious bugs.  That's a good point, so what about adding a WARN_ON_ONCE()
> > around the IS_ERR() check in Geert's patch?
> 
> Most (all?) clock API calls allow passing in error pointers as returned by
> clk_get(). This allows for calling clk_get() and clk_prepare_enable() in a row,
> without any checking by the user (in many drivers, clocks are optional).
> 
> __clk_get() is more of an internal function, that's why it doesn't
> have the check.
> 
> So Grygorii's answer "the API is to be used like this", is not that insane,
> following other clock API calls.
> 
> Now, pm_clk_add_clk() returns -ENOENT if the clock is not valid.
> This is a visible difference from pm_clk_add(), which (ignoring -ENOMEM) always
> returns zero, whether the clock for the con_id can be found or not (i.e. whether
> pm_clk_acquire() succeeds or not).
> 
> I guess we want to be consistent here:
>   1. Either always return zero,
>   2. Or always propagate failures.
> 
> Then, clocks can be optional, especially when considering clock domains.
> Hence existing code calling pm_clk_add() from the generic_pm_domain.attach_dev()
> callback may start to break when pm_clk_add() starts returning errors for
> non-existent clocks.

OK, I'll apply the patch as is, then.  Thanks!
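
For reference, a minimal user-space sketch of what the fix amounts to, as I
read the changelog: the IS_ERR() check has to look at the clock pointer that
was passed in, not at the still-zeroed entry field.  This is not the kernel
code.  ERR_PTR()/IS_ERR() are re-implemented here for illustration, and
struct clk, struct pm_clock_entry, model_clk_get() and
model_pm_clk_add_clk() are simplified, hypothetical stand-ins.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space model of the kernel's error-pointer convention (assumption:
 * the top 4095 address values are reserved for encoded errno values). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct clk {
	int refcount;
};

struct pm_clock_entry {
	struct clk *clk;
};

/* Mimics the behaviour discussed in the thread: NULL is tolerated and
 * reported as success, a valid pointer gets its refcount bumped. */
static int model_clk_get(struct clk *clk)
{
	if (clk)
		clk->refcount++;
	return 1;
}

/* Returns 0 on success, -ENOMEM or -ENOENT on failure. */
static int model_pm_clk_add_clk(struct clk *clk, struct pm_clock_entry **out)
{
	struct pm_clock_entry *ce = calloc(1, sizeof(*ce));

	if (!ce)
		return -ENOMEM;

	/* Check the *passed* pointer; ce->clk is still NULL at this point,
	 * so testing it would never catch an error pointer. */
	if (IS_ERR(clk) || !model_clk_get(clk)) {
		free(ce);
		return -ENOENT;
	}

	ce->clk = clk;
	*out = ce;
	return 0;
}
```

With this model, passing ERR_PTR(-ENOENT) yields -ENOENT without touching the
pointer, while a valid clock gets its reference count incremented once.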


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.