2015-12-03 20:51:31

by Joe Perches

Subject: Re: use-after-free in sctp_do_sm

(adding lkml as this is likely better discussed there)

On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
> On 12/03/2015 03:24 PM, Joe Perches wrote:
> > On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
> > > On 12/03/2015 03:03 PM, Joe Perches wrote:
> > > > On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
> > > > > On 12/03/2015 01:52 PM, Aaron Conole wrote:
> > > > > > I think that as a minimum, the following patch should be evaluated,
> > > > > > but am unsure to whom I should submit it (after I test):
> > > > []
> > > > > Agreed - the intention here is certainly to have no side effects. It
> > > > > looks like 'no_printk()' is used in quite a few other places that would
> > > > > benefit from this change. So we probably want a generic
> > > > > 'really_no_printk()' macro.
> > > >
> > > > https://lkml.org/lkml/2012/6/17/231
> > >
> > > I don't see this in the tree.
> >
> > It never got applied.
> >
> > > Also maybe we should just convert
> > > no_printk() to do what your 'eliminated_printk()'.
> >
> > Some of them at least.
> >
> > > So we can convert all users with this change?
> >
> > I don't think so, I think there are some
> > function evaluation/side effects that are
> > required. I believe some do hardware I/O.
> >
> > It'd be good to at least isolate them.
> >
> > I'm not sure how to find them via some
> > automated tool/mechanism though.
> >
> > I asked Julia Lawall about it once in this
> > thread: https://lkml.org/lkml/2014/12/3/696
> >
>
> Seems rather fragile to have side effects that we rely
> upon hidden in a printk().

Yup.

> Just convert them and see what breaks :)

I appreciate your optimism. It's very 1995.
Try it and see what happens.


2015-12-04 10:40:26

by Dmitry Vyukov

Subject: Re: use-after-free in sctp_do_sm

On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <[email protected]> wrote:
> (adding lkml as this is likely better discussed there)
>
> On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
>> On 12/03/2015 03:24 PM, Joe Perches wrote:
>> > On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
>> > > On 12/03/2015 03:03 PM, Joe Perches wrote:
>> > > > On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
>> > > > > On 12/03/2015 01:52 PM, Aaron Conole wrote:
>> > > > > > I think that as a minimum, the following patch should be evaluated,
>> > > > > > but am unsure to whom I should submit it (after I test):
>> > > > []
>> > > > > Agreed - the intention here is certainly to have no side effects. It
>> > > > > looks like 'no_printk()' is used in quite a few other places that would
>> > > > > benefit from this change. So we probably want a generic
>> > > > > 'really_no_printk()' macro.
>> > > >
>> > > > https://lkml.org/lkml/2012/6/17/231
>> > >
>> > > I don't see this in the tree.
>> >
>> > It never got applied.
>> >
>> > > Also maybe we should just convert
>> > > no_printk() to do what your 'eliminated_printk()'.
>> >
>> > Some of them at least.
>> >
>> > > So we can convert all users with this change?
>> >
>> > I don't think so, I think there are some
>> > function evaluation/side effects that are
>> > required. I believe some do hardware I/O.
>> >
>> > It'd be good to at least isolate them.
>> >
>> > I'm not sure how to find them via some
>> > automated tool/mechanism though.
>> >
>> > I asked Julia Lawall about it once in this
>> > thread: https://lkml.org/lkml/2014/12/3/696
>> >
>>
>> Seems rather fragile to have side effects that we rely
>> upon hidden in a printk().
>
> Yup.
>
>> Just convert them and see what breaks :)
>
> I appreciate your optimism. It's very 1995.
> Try it and see what happens.


Whatever is the resolution for pr_debug, we still need to fix this
particular use-after-free. It affects stability of debug builds, gives
invalid debug output, prevents us from finding more bugs in SCTP. And
maybe somebody uses CONFIG_DYNAMIC_DEBUG in production.

2015-12-04 12:55:52

by Marcelo Ricardo Leitner

Subject: Re: use-after-free in sctp_do_sm

On Fri, Dec 04, 2015 at 11:40:02AM +0100, Dmitry Vyukov wrote:
> On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <[email protected]> wrote:
> > (adding lkml as this is likely better discussed there)
> >
> > On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
> >> On 12/03/2015 03:24 PM, Joe Perches wrote:
> >> > On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
> >> > > On 12/03/2015 03:03 PM, Joe Perches wrote:
> >> > > > On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
> >> > > > > On 12/03/2015 01:52 PM, Aaron Conole wrote:
> >> > > > > > I think that as a minimum, the following patch should be evaluated,
> >> > > > > > but am unsure to whom I should submit it (after I test):
> >> > > > []
> >> > > > > Agreed - the intention here is certainly to have no side effects. It
> >> > > > > looks like 'no_printk()' is used in quite a few other places that would
> >> > > > > benefit from this change. So we probably want a generic
> >> > > > > 'really_no_printk()' macro.
> >> > > >
> >> > > > https://lkml.org/lkml/2012/6/17/231
> >> > >
> >> > > I don't see this in the tree.
> >> >
> >> > It never got applied.
> >> >
> >> > > Also maybe we should just convert
> >> > > no_printk() to do what your 'eliminated_printk()'.
> >> >
> >> > Some of them at least.
> >> >
> >> > > So we can convert all users with this change?
> >> >
> >> > I don't think so, I think there are some
> >> > function evaluation/side effects that are
> >> > required. I believe some do hardware I/O.
> >> >
> >> > It'd be good to at least isolate them.
> >> >
> >> > I'm not sure how to find them via some
> >> > automated tool/mechanism though.
> >> >
> >> > I asked Julia Lawall about it once in this
> >> > thread: https://lkml.org/lkml/2014/12/3/696
> >> >
> >>
> >> Seems rather fragile to have side effects that we rely
> >> upon hidden in a printk().
> >
> > Yup.
> >
> >> Just convert them and see what breaks :)
> >
> > I appreciate your optimism. It's very 1995.
> > Try it and see what happens.
>
>
> Whatever is the resolution for pr_debug, we still need to fix this
> particular use-after-free. It affects stability of debug builds, gives
> invalid debug output, prevents us from finding more bugs in SCTP. And
> maybe somebody uses CONFIG_DYNAMIC_DEBUG in production.

Agreed. I'm already working on a fix for this particular use-after-free.

Another interesting thing about this is that sctp_do_sm() is called for
nearly every movement that happens on an SCTP socket. That said, the
always-running IDR search hidden in that debug statement does have some
nasty performance impact, especially because it's serialized on a
spinlock. This wouldn't be happening if it were fully elided, and it would
be OK if that pr_debug() were really being printed, but not as it is.
Thanks to this report, I was able to notice this. I'm trying to fix this on
the SCTP side as well.

Marcelo

2015-12-04 15:37:33

by Vlad Yasevich

Subject: Re: use-after-free in sctp_do_sm

On 12/04/2015 07:55 AM, Marcelo Ricardo Leitner wrote:
> On Fri, Dec 04, 2015 at 11:40:02AM +0100, Dmitry Vyukov wrote:
>> On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <[email protected]> wrote:
>>> (adding lkml as this is likely better discussed there)
>>>
>>> On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
>>>> On 12/03/2015 03:24 PM, Joe Perches wrote:
>>>>> On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
>>>>>> On 12/03/2015 03:03 PM, Joe Perches wrote:
>>>>>>> On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
>>>>>>>> On 12/03/2015 01:52 PM, Aaron Conole wrote:
>>>>>>>>> I think that as a minimum, the following patch should be evaluated,
>>>>>>>>> but am unsure to whom I should submit it (after I test):
>>>>>>> []
>>>>>>>> Agreed - the intention here is certainly to have no side effects. It
>>>>>>>> looks like 'no_printk()' is used in quite a few other places that would
>>>>>>>> benefit from this change. So we probably want a generic
>>>>>>>> 'really_no_printk()' macro.
>>>>>>>
>>>>>>> https://lkml.org/lkml/2012/6/17/231
>>>>>>
>>>>>> I don't see this in the tree.
>>>>>
>>>>> It never got applied.
>>>>>
>>>>>> Also maybe we should just convert
>>>>>> no_printk() to do what your 'eliminated_printk()'.
>>>>>
>>>>> Some of them at least.
>>>>>
>>>>>> So we can convert all users with this change?
>>>>>
>>>>> I don't think so, I think there are some
>>>>> function evaluation/side effects that are
>>>>> required. I believe some do hardware I/O.
>>>>>
>>>>> It'd be good to at least isolate them.
>>>>>
>>>>> I'm not sure how to find them via some
>>>>> automated tool/mechanism though.
>>>>>
>>>>> I asked Julia Lawall about it once in this
>>>>> thread: https://lkml.org/lkml/2014/12/3/696
>>>>>
>>>>
>>>> Seems rather fragile to have side effects that we rely
>>>> upon hidden in a printk().
>>>
>>> Yup.
>>>
>>>> Just convert them and see what breaks :)
>>>
>>> I appreciate your optimism. It's very 1995.
>>> Try it and see what happens.
>>
>>
>> Whatever is the resolution for pr_debug, we still need to fix this
>> particular use-after-free. It affects stability of debug builds, gives
>> invalid debug output, prevents us from finding more bugs in SCTP. And
>> maybe somebody uses CONFIG_DYNAMIC_DEBUG in production.
>
> Agreed. I'm already working on a fix for this particular use-after-free.
>
> Another interesting thing about this is that sctp_do_sm() is called for
> nearly every movement that happens on an SCTP socket. That said, the
> always-running IDR search hidden in that debug statement does have some
> nasty performance impact, especially because it's serialized on a
> spinlock.

YUCK! I didn't really pay much attention to those debug macros before, but
debug_post_sfx() is truly awful.

This wasn't such a bad thing when these macros depended on CONFIG_SCTP_DEBUG,
but now that they are always built, we need to fix them.

-vlad



> This wouldn't be happening if it were fully elided, and it would
> be OK if that pr_debug() were really being printed, but not as it is.
> Thanks to this report, I was able to notice this. I'm trying to fix this on
> the SCTP side as well.
>
> Marcelo
>

2015-12-04 15:51:42

by Aaron Conole

Subject: Re: use-after-free in sctp_do_sm

Vlad Yasevich <[email protected]> writes:
> On 12/04/2015 07:55 AM, Marcelo Ricardo Leitner wrote:
>> On Fri, Dec 04, 2015 at 11:40:02AM +0100, Dmitry Vyukov wrote:
>>> On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <[email protected]> wrote:
>>>> (adding lkml as this is likely better discussed there)
>>>>
>>>> On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
>>>>> On 12/03/2015 03:24 PM, Joe Perches wrote:
>>>>>> On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
>>>>>>> On 12/03/2015 03:03 PM, Joe Perches wrote:
>>>>>>>> On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
>>>>>>>>> On 12/03/2015 01:52 PM, Aaron Conole wrote:
>>>>>>>>>> I think that as a minimum, the following patch should be evaluated,
>>>>>>>>>> but am unsure to whom I should submit it (after I test):
>>>>>>>> []
>>>>>>>>> Agreed - the intention here is certainly to have no side effects. It
>>>>>>>>> looks like 'no_printk()' is used in quite a few other places that would
>>>>>>>>> benefit from this change. So we probably want a generic
>>>>>>>>> 'really_no_printk()' macro.
>>>>>>>>
>>>>>>>> https://lkml.org/lkml/2012/6/17/231
>>>>>>>
>>>>>>> I don't see this in the tree.
>>>>>>
>>>>>> It never got applied.
>>>>>>
>>>>>>> Also maybe we should just convert
>>>>>>> no_printk() to do what your 'eliminated_printk()'.
>>>>>>
>>>>>> Some of them at least.
>>>>>>
>>>>>>> So we can convert all users with this change?
>>>>>>
>>>>>> I don't think so, I think there are some
>>>>>> function evaluation/side effects that are
>>>>>> required. I believe some do hardware I/O.
>>>>>>
>>>>>> It'd be good to at least isolate them.
>>>>>>
>>>>>> I'm not sure how to find them via some
>>>>>> automated tool/mechanism though.
>>>>>>
>>>>>> I asked Julia Lawall about it once in this
>>>>>> thread: https://lkml.org/lkml/2014/12/3/696
>>>>>>
>>>>>
>>>>> Seems rather fragile to have side effects that we rely
>>>>> upon hidden in a printk().
>>>>
>>>> Yup.
>>>>
>>>>> Just convert them and see what breaks :)
>>>>
>>>> I appreciate your optimism. It's very 1995.
>>>> Try it and see what happens.
>>>
>>>
>>> Whatever is the resolution for pr_debug, we still need to fix this
>>> particular use-after-free. It affects stability of debug builds, gives
>>> invalid debug output, prevents us from finding more bugs in SCTP. And
>>> maybe somebody uses CONFIG_DYNAMIC_DEBUG in production.
>>
>> Agreed. I'm already working on a fix for this particular use-after-free.
>>
>> Another interesting thing about this is that sctp_do_sm() is called for
>> nearly every movement that happens on an SCTP socket. That said, the
>> always-running IDR search hidden in that debug statement does have some
>> nasty performance impact, especially because it's serialized on a
>> spinlock.
>
> YUCK! I didn't really pay much attention to those debug macros before, but
> debug_post_sfx() is truly awful.
>
> This wasn't such a bad thing when these macros depended on CONFIG_SCTP_DEBUG,
> but now that they are always built, we need to fix them.

I've proposed a patch to linux-kernel to fix them, but I don't think
it's really as bad as folks imagine. Ubuntu, RHEL, and Fedora all use
the DYNAMIC_DEBUG configuration option, which means that the code is getting
emitted anyway (correctly, I'll add) and is shunted out by a dynamic
debug flag. So for the average user, it's not even really a blip.

That does mean there's a cool side-effect of the entire print-macro setup
which implies we execute less code when running with DYNAMIC_DEBUG=y in
the "normal" case. "Turn on the dynamic debugging config and watch
everything get better" isn't the worst mantra, is it? :)

> -vlad
>
>
>
>> This wouldn't be happening if it were fully elided, and it would
>> be OK if that pr_debug() were really being printed, but not as it is.
>> Thanks to this report, I was able to notice this. I'm trying to fix this on
>> the SCTP side as well.
>>
>> Marcelo
>>

2015-12-04 16:12:35

by Dmitry Vyukov

Subject: Re: use-after-free in sctp_do_sm

On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <[email protected]> wrote:
> (adding lkml as this is likely better discussed there)
>
> On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
>> On 12/03/2015 03:24 PM, Joe Perches wrote:
>> > On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
>> > > On 12/03/2015 03:03 PM, Joe Perches wrote:
>> > > > On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
>> > > > > On 12/03/2015 01:52 PM, Aaron Conole wrote:
>> > > > > > I think that as a minimum, the following patch should be evaluated,
>> > > > > > but am unsure to whom I should submit it (after I test):
>> > > > []
>> > > > > Agreed - the intention here is certainly to have no side effects. It
>> > > > > looks like 'no_printk()' is used in quite a few other places that would
>> > > > > benefit from this change. So we probably want a generic
>> > > > > 'really_no_printk()' macro.
>> > > >
>> > > > https://lkml.org/lkml/2012/6/17/231
>> > >
>> > > I don't see this in the tree.
>> >
>> > It never got applied.
>> >
>> > > Also maybe we should just convert
>> > > no_printk() to do what your 'eliminated_printk()'.
>> >
>> > Some of them at least.
>> >
>> > > So we can convert all users with this change?
>> >
>> > I don't think so, I think there are some
>> > function evaluation/side effects that are
>> > required. I believe some do hardware I/O.
>> >
>> > It'd be good to at least isolate them.
>> >
>> > I'm not sure how to find them via some
>> > automated tool/mechanism though.
>> >
>> > I asked Julia Lawall about it once in this
>> > thread: https://lkml.org/lkml/2014/12/3/696
>> >
>>
>> Seems rather fragile to have side effects that we rely
>> upon hidden in a printk().
>
> Yup.
>
>> Just convert them and see what breaks :)
>
> I appreciate your optimism. It's very 1995.
> Try it and see what happens.


But Aaron says that DYNAMIC_DEBUG is enabled in most major
distributions, and all these side-effects don't happen with
DYNAMIC_DEBUG. This suggests that we can make these side-effects not
happen without DYNAMIC_DEBUG as well.
Or am I missing something here?

2015-12-04 16:47:55

by Jason Baron

Subject: Re: use-after-free in sctp_do_sm

On 12/04/2015 11:12 AM, Dmitry Vyukov wrote:
> On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <[email protected]> wrote:
>> (adding lkml as this is likely better discussed there)
>>
>> On Thu, 2015-12-03 at 15:42 -0500, Jason Baron wrote:
>>> On 12/03/2015 03:24 PM, Joe Perches wrote:
>>>> On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
>>>>> On 12/03/2015 03:03 PM, Joe Perches wrote:
>>>>>> On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
>>>>>>> On 12/03/2015 01:52 PM, Aaron Conole wrote:
>>>>>>>> I think that as a minimum, the following patch should be evaluated,
>>>>>>>> but am unsure to whom I should submit it (after I test):
>>>>>> []
>>>>>>> Agreed - the intention here is certainly to have no side effects. It
>>>>>>> looks like 'no_printk()' is used in quite a few other places that would
>>>>>>> benefit from this change. So we probably want a generic
>>>>>>> 'really_no_printk()' macro.
>>>>>>
>>>>>> https://lkml.org/lkml/2012/6/17/231
>>>>>
>>>>> I don't see this in the tree.
>>>>
>>>> It never got applied.
>>>>
>>>>> Also maybe we should just convert
>>>>> no_printk() to do what your 'eliminated_printk()'.
>>>>
>>>> Some of them at least.
>>>>
>>>>> So we can convert all users with this change?
>>>>
>>>> I don't think so, I think there are some
>>>> function evaluation/side effects that are
>>>> required. I believe some do hardware I/O.
>>>>
>>>> It'd be good to at least isolate them.
>>>>
>>>> I'm not sure how to find them via some
>>>> automated tool/mechanism though.
>>>>
>>>> I asked Julia Lawall about it once in this
>>>> thread: https://lkml.org/lkml/2014/12/3/696
>>>>
>>>
>>> Seems rather fragile to have side effects that we rely
>>> upon hidden in a printk().
>>
>> Yup.
>>
>>> Just convert them and see what breaks :)
>>
>> I appreciate your optimism. It's very 1995.
>> Try it and see what happens.
>
>
> But Aaron says that DYNAMIC_DEBUG is enabled in most major
> distributions, and all these side-effects don't happen with
> DYNAMIC_DEBUG.

When DYNAMIC_DEBUG is enabled we have this wrapper from
include/linux/dynamic_debug.h:

    if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT))
        <do debug stuff>

So the compiler is not emitting the side-effects in this
case.
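
To make the point concrete, here is a minimal userspace sketch of that
pattern. It is not the kernel implementation: demo_descriptor,
DEMO_FLAGS_PRINT, demo_pr_debug() and costly_lookup() are hypothetical
stand-ins, and printf() takes the place of the real print call. It only
shows that arguments placed inside the branch are compiled in but are
evaluated only when the flag is set.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's _DPRINTK_FLAGS_PRINT bit. */
#define DEMO_FLAGS_PRINT 0x1u

/* Hypothetical stand-in for a per-call-site dynamic debug descriptor. */
struct demo_descriptor {
	unsigned int flags;
};

/* Counts how often the costly argument expression is evaluated. */
static int lookups;

/* Stands in for an expensive or unsafe argument, e.g. an ID lookup. */
static int costly_lookup(void)
{
	return ++lookups;
}

/*
 * The print call and its arguments sit inside the branch, so the
 * arguments are evaluated only when the flag is set at run time.
 */
#define demo_pr_debug(desc, fmt, ...)				\
	do {							\
		if ((desc)->flags & DEMO_FLAGS_PRINT)		\
			printf(fmt, __VA_ARGS__);		\
	} while (0)

/* Runs one debug call site and returns how many lookups it caused. */
static int demo_call_site(unsigned int flags)
{
	struct demo_descriptor desc = { .flags = flags };

	lookups = 0;
	demo_pr_debug(&desc, "id=%d\n", costly_lookup());
	return lookups;
}
```

Calling demo_call_site(0) performs no lookup, while
demo_call_site(DEMO_FLAGS_PRINT) performs exactly one, which is also
when a buggy argument expression would actually run.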

> This suggests that we can make these side-effects not
> happen without DYNAMIC_DEBUG as well.
> Or am I missing something here?
>

When DYNAMIC_DEBUG is disabled we are instead replacing
pr_debug() with the 'no_printk()' function as you've pointed
out. We are changing this to emit no code at all:

http://marc.info/?l=linux-kernel&m=144918276518878&w=2
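
As a rough illustration of the difference (a userspace sketch, not the
patch itself): demo_no_printk() and demo_eliminated_printk() below are
hypothetical names, and the sketch assumes the stub is an empty variadic
function. A function stub still forces the caller to evaluate every
argument, whereas a macro that wraps the call in dead code keeps the
format type-checking but never runs the argument expressions.

```c
#include <stdio.h>

/* Counts how often the argument expression is evaluated. */
static int evaluations;

/* Stands in for an argument with a side effect. */
static int side_effect(void)
{
	return ++evaluations;
}

/*
 * Function-style stub (hypothetical): it ignores its arguments, but
 * the caller must still evaluate them before making the call.
 */
static inline int demo_no_printk(const char *fmt, ...)
{
	(void)fmt;
	return 0;
}

/*
 * Macro-style stub (hypothetical): the compiler still checks the
 * call, but it sits in a never-taken branch, so the arguments are
 * never evaluated.
 */
#define demo_eliminated_printk(fmt, ...)		\
	do {						\
		if (0)					\
			printf(fmt, __VA_ARGS__);	\
	} while (0)

/* Returns the number of evaluations caused by the function stub. */
static int count_with_function_stub(void)
{
	evaluations = 0;
	demo_no_printk("id=%d\n", side_effect());
	return evaluations;
}

/* Returns the number of evaluations caused by the macro stub. */
static int count_with_macro_stub(void)
{
	evaluations = 0;
	demo_eliminated_printk("id=%d\n", side_effect());
	return evaluations;
}
```

Here count_with_function_stub() reports one evaluation and
count_with_macro_stub() reports none.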

Thanks,

-Jason

2015-12-04 17:03:16

by Joe Perches

Subject: Re: use-after-free in sctp_do_sm

On Fri, 2015-12-04 at 11:47 -0500, Jason Baron wrote:
> When DYNAMIC_DEBUG is enabled we have this wrapper from
> include/linux/dynamic_debug.h:
>
>     if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT))
>         <do debug stuff>
>
> So the compiler is not emitting the side-effects in this
> case.

Huh? Do I misunderstand what you are writing?

You are testing a variable that is not generally set
so the call is not being performed in the general case,
but the compiler can not elide the code.

If the variable was enabled via the control file, the
__dynamic_pr_debug would be performed with the
use-after-free.

2015-12-04 17:11:11

by Jason Baron

Subject: Re: use-after-free in sctp_do_sm



On 12/04/2015 12:03 PM, Joe Perches wrote:
> On Fri, 2015-12-04 at 11:47 -0500, Jason Baron wrote:
>> When DYNAMIC_DEBUG is enabled we have this wrapper from
>> include/linux/dynamic_debug.h:
>>
>>     if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT))
>>         <do debug stuff>
>>
>> So the compiler is not emitting the side-effects in this
>> case.
>
> Huh? Do I misunderstand what you are writing?

Yes, I wasn't terribly clear - I was trying to say that the
'side-effects', in this case the debug code and use-after-free, are
hidden behind the branch. They aren't invoked unless we enable the debug
statement.

Thanks,

-Jason

>
> You are testing a variable that is not generally set
> so the call is not being performed in the general case,
> but the compiler can not elide the code.
>
> If the variable was enabled via the control file, the
> __dynamic_pr_debug would be performed with the
> use-after-free.
>