Hello,
I write a static analysis tool (DSAC), and it finds that printk can sleep.
According to this finding, there is an example bug in drivers/pci/pci.c
in Linux-4.16.7.
Here is the call path for this bug.
Please look at it *from the bottom up*.
========== BUG ==========
[FUNC] __might_sleep
kernel/locking/mutex.c: 747: __might_sleep in __mutex_lock_common
kernel/locking/mutex.c: 893: __mutex_lock_common in __mutex_lock
kernel/locking/mutex.c: 908: __mutex_lock in mutex_lock_nested
drivers/clk/clk.c, 123: mutex_lock_nested in clk_prepare_lock
drivers/clk/clk.c, 1369: clk_prepare_lock in clk_core_get_rate
drivers/clk/clk.c, 1393: clk_core_get_rate in clk_get_rate
lib/vsprintf.c, 1450: clk_get_rate in clock
lib/vsprintf.c, 1944: clock in pointer
lib/vsprintf.c, 2286: pointer in vsnprintf
lib/vsprintf.c, 2385: vsnprintf in vscnprintf
kernel/printk/printk.c, 1853: vscnprintf in vprintk_emit
kernel/printk/printk.c, 1947: vprintk_emit in vprintk_default
kernel/printk/printk_safe.c, 379: vprintk_default in vprintk_func
kernel/printk/printk.c, 1980: vprintk_func in printk
drivers/pci/pci.c, 5364: printk in pci_specified_resource_alignment
drivers/pci/pci.c, 5313: spin_lock in pci_specified_resource_alignment
In fact, I suspect that my report is false, because I always have an
impression that printk() cannot sleep.
But according to the call path, I cannot find where I make the mistake...
So could someone please help me to point the mistake?
Best wishes,
Jia-Ju Bai
Cc-ing CLK people.
On (05/31/18 17:08), Jia-Ju Bai wrote:
> [FUNC] __might_sleep
> kernel/locking/mutex.c: 747: __might_sleep in __mutex_lock_common
> kernel/locking/mutex.c: 893: __mutex_lock_common in __mutex_lock
> kernel/locking/mutex.c: 908: __mutex_lock in mutex_lock_nested
> drivers/clk/clk.c, 123: mutex_lock_nested in clk_prepare_lock
> drivers/clk/clk.c, 1369: clk_prepare_lock in clk_core_get_rate
> drivers/clk/clk.c, 1393: clk_core_get_rate in clk_get_rate
> lib/vsprintf.c, 1450: clk_get_rate in clock
> lib/vsprintf.c, 1944: clock in pointer
> lib/vsprintf.c, 2286: pointer in vsnprintf
> lib/vsprintf.c, 2385: vsnprintf in vscnprintf
> kernel/printk/printk.c, 1853: vscnprintf in vprintk_emit
> kernel/printk/printk.c, 1947: vprintk_emit in vprintk_default
> kernel/printk/printk_safe.c, 379: vprintk_default in vprintk_func
> kernel/printk/printk.c, 1980: vprintk_func in printk
> drivers/pci/pci.c, 5364: printk in pci_specified_resource_alignment
> drivers/pci/pci.c, 5313: spin_lock in pci_specified_resource_alignment
>
> In fact, I suspect that my report is false, because I always have an
> impression that printk() cannot sleep.
> But according to the call path, I cannot find where I make the mistake...
>
> So could someone please help me to point the mistake?
I think you are right.
number(buf, end, clk_get_rate(clk), spec) indeed locks the `prepare_lock'
mutex from atomic context.
-ss
Adding Geert and timer people into Cc.
On Thu 2018-05-31 17:08:49, Jia-Ju Bai wrote:
> Hello,
>
> I write a static analysis tool (DSAC), and it finds that printk can sleep.
> According to this finding, there is an example bug in drivers/pci/pci.c in
> Linux-4.16.7.
>
> Here is the call path for this bug.
> Please look at it *from the bottom up*.
>
> ========== BUG ==========
> [FUNC] __might_sleep
> kernel/locking/mutex.c: 747: __might_sleep in __mutex_lock_common
> kernel/locking/mutex.c: 893: __mutex_lock_common in __mutex_lock
> kernel/locking/mutex.c: 908: __mutex_lock in mutex_lock_nested
> drivers/clk/clk.c, 123: mutex_lock_nested in clk_prepare_lock
> drivers/clk/clk.c, 1369: clk_prepare_lock in clk_core_get_rate
> drivers/clk/clk.c, 1393: clk_core_get_rate in clk_get_rate
> lib/vsprintf.c, 1450: clk_get_rate in clock
The problem is in %pCr print format implementation. It calls
clk_get_rate() that might sleep. I wonder if we could use some
lock-less alternative instead.
Anyway, we need to fix or remove this format. vsprintf-like functions
are called in any context and nobody expect that they might sleep.
In each case, printk() must call it under logbuf_lock spinlock.
Otherwise, it would need to allocate a buffer for the formatted
string and it is not acceptable.
Best Regards,
Petr
> lib/vsprintf.c, 1944: clock in pointer
> lib/vsprintf.c, 2286: pointer in vsnprintf
> lib/vsprintf.c, 2385: vsnprintf in vscnprintf
> kernel/printk/printk.c, 1853: vscnprintf in vprintk_emit
> kernel/printk/printk.c, 1947: vprintk_emit in vprintk_default
> kernel/printk/printk_safe.c, 379: vprintk_default in vprintk_func
> kernel/printk/printk.c, 1980: vprintk_func in printk
> drivers/pci/pci.c, 5364: printk in pci_specified_resource_alignment
> drivers/pci/pci.c, 5313: spin_lock in pci_specified_resource_alignment
>
> In fact, I suspect that my report is false, because I always have an
> impression that printk() cannot sleep.
> But according to the call path, I cannot find where I make the mistake...
>
> So could someone please help me to point the mistake?
>
>
> Best wishes,
> Jia-Ju Bai
On Thu, May 31, 2018 at 5:05 AM Petr Mladek <[email protected]> wrote:
>
> Anyway, we need to fix or remove this format. vsprintf-like functions
> are called in any context and nobody expect that they might sleep.
Ack. I guess the argument is that "%pCr" is rare, and none of *those*
users may care, but I do think that doing things wrong as-is.
It's too subtle to have to know you're in a particular locking context
when you use a particular %p modifier.
Linus
Quoting Linus Torvalds (2018-05-31 07:32:10)
> On Thu, May 31, 2018 at 5:05 AM Petr Mladek <[email protected]> wrote:
> >
> > Anyway, we need to fix or remove this format. vsprintf-like functions
> > are called in any context and nobody expect that they might sleep.
>
> Ack. I guess the argument is that "%pCr" is rare, and none of *those*
> users may care, but I do think that doing things wrong as-is.
>
> It's too subtle to have to know you're in a particular locking context
> when you use a particular %p modifier.
>
Agreed. Removing the format seems to be the best approach. It looks like
only Geert has used it in the last few years and it hasn't been used
much otherwise.
Hi Stephen,
On Thu, May 31, 2018 at 5:19 PM, Stephen Boyd <[email protected]> wrote:
> Quoting Linus Torvalds (2018-05-31 07:32:10)
>> On Thu, May 31, 2018 at 5:05 AM Petr Mladek <[email protected]> wrote:
>> > Anyway, we need to fix or remove this format. vsprintf-like functions
>> > are called in any context and nobody expect that they might sleep.
>>
>> Ack. I guess the argument is that "%pCr" is rare, and none of *those*
>> users may care, but I do think that doing things wrong as-is.
>>
>> It's too subtle to have to know you're in a particular locking context
>> when you use a particular %p modifier.
>
> Agreed. Removing the format seems to be the best approach. It looks like
> only Geert has used it in the last few years and it hasn't been used
> much otherwise.
Indeed, just 3 users (the broadcom one isn't mine):
drivers/clk/renesas/renesas-cpg-mssr.c
drivers/thermal/broadcom/bcm2835_thermal.c
drivers/tty/serial/sh-sci.c
Alternatively, can we have a special version __clk_get_rate() that just
returns clk->core->rate?
Or would that be too inaccurate in the presence of CLK_GET_RATE_NOCACHE?
The function could still return 0 in case the flag is set.
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Thu, 31 May 2018 18:42:48 +0200
Geert Uytterhoeven <[email protected]> wrote:
> Hi Stephen,
>
> On Thu, May 31, 2018 at 5:19 PM, Stephen Boyd <[email protected]> wrote:
> > Quoting Linus Torvalds (2018-05-31 07:32:10)
> >> On Thu, May 31, 2018 at 5:05 AM Petr Mladek <[email protected]> wrote:
> >> > Anyway, we need to fix or remove this format. vsprintf-like functions
> >> > are called in any context and nobody expect that they might sleep.
> >>
> >> Ack. I guess the argument is that "%pCr" is rare, and none of *those*
> >> users may care, but I do think that doing things wrong as-is.
> >>
> >> It's too subtle to have to know you're in a particular locking context
> >> when you use a particular %p modifier.
> >
> > Agreed. Removing the format seems to be the best approach. It looks like
> > only Geert has used it in the last few years and it hasn't been used
> > much otherwise.
>
> Indeed, just 3 users (the broadcom one isn't mine):
> drivers/clk/renesas/renesas-cpg-mssr.c
> drivers/thermal/broadcom/bcm2835_thermal.c
> drivers/tty/serial/sh-sci.c
>
> Alternatively, can we have a special version __clk_get_rate() that just
> returns clk->core->rate?
> Or would that be too inaccurate in the presence of CLK_GET_RATE_NOCACHE?
> The function could still return 0 in case the flag is set.
If it's only used in three locations, I think it would be better to
simply remove it from vsprintf() and have the three callers call
clk_get_rate() directly.
-- Steve
On 2018/6/1 5:13, Steven Rostedt wrote:
> On Thu, 31 May 2018 18:42:48 +0200
> Geert Uytterhoeven <[email protected]> wrote:
>
>> Hi Stephen,
>>
>> On Thu, May 31, 2018 at 5:19 PM, Stephen Boyd <[email protected]> wrote:
>>> Quoting Linus Torvalds (2018-05-31 07:32:10)
>>>> On Thu, May 31, 2018 at 5:05 AM Petr Mladek <[email protected]> wrote:
>>>>> Anyway, we need to fix or remove this format. vsprintf-like functions
>>>>> are called in any context and nobody expect that they might sleep.
>>>> Ack. I guess the argument is that "%pCr" is rare, and none of *those*
>>>> users may care, but I do think that doing things wrong as-is.
>>>>
>>>> It's too subtle to have to know you're in a particular locking context
>>>> when you use a particular %p modifier.
>>> Agreed. Removing the format seems to be the best approach. It looks like
>>> only Geert has used it in the last few years and it hasn't been used
>>> much otherwise.
>> Indeed, just 3 users (the broadcom one isn't mine):
>> drivers/clk/renesas/renesas-cpg-mssr.c
>> drivers/thermal/broadcom/bcm2835_thermal.c
>> drivers/tty/serial/sh-sci.c
>>
>> Alternatively, can we have a special version __clk_get_rate() that just
>> returns clk->core->rate?
>> Or would that be too inaccurate in the presence of CLK_GET_RATE_NOCACHE?
>> The function could still return 0 in case the flag is set.
> If it's only used in three locations, I think it would be better to
> simply remove it from vsprintf() and have the three callers call
> clk_get_rate() directly.
Agreed.
Best wishes,
Jia-Ju Bai
Hi Jia-Ju,
On Fri, Jun 1, 2018 at 2:59 AM, Jia-Ju Bai <[email protected]> wrote:
> On 2018/6/1 5:13, Steven Rostedt wrote:
>> On Thu, 31 May 2018 18:42:48 +0200
>> Geert Uytterhoeven <[email protected]> wrote:
>>> On Thu, May 31, 2018 at 5:19 PM, Stephen Boyd <[email protected]> wrote:
>>>> Quoting Linus Torvalds (2018-05-31 07:32:10)
>>>>> On Thu, May 31, 2018 at 5:05 AM Petr Mladek <[email protected]> wrote:
>>>>>>
>>>>>> Anyway, we need to fix or remove this format. vsprintf-like functions
>>>>>> are called in any context and nobody expect that they might sleep.
>>>>>
>>>>> Ack. I guess the argument is that "%pCr" is rare, and none of *those*
>>>>> users may care, but I do think that doing things wrong as-is.
>>>>>
>>>>> It's too subtle to have to know you're in a particular locking context
>>>>> when you use a particular %p modifier.
>>>>
>>>> Agreed. Removing the format seems to be the best approach. It looks like
>>>> only Geert has used it in the last few years and it hasn't been used
>>>> much otherwise.
>>>
>>> Indeed, just 3 users (the broadcom one isn't mine):
>>> drivers/clk/renesas/renesas-cpg-mssr.c
>>> drivers/thermal/broadcom/bcm2835_thermal.c
>>> drivers/tty/serial/sh-sci.c
>>>
>>> Alternatively, can we have a special version __clk_get_rate() that just
>>> returns clk->core->rate?
>>> Or would that be too inaccurate in the presence of CLK_GET_RATE_NOCACHE?
>>> The function could still return 0 in case the flag is set.
>>
>> If it's only used in three locations, I think it would be better to
>> simply remove it from vsprintf() and have the three callers call
>> clk_get_rate() directly.
OK, patches sent.
Thanks for reporting!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds