LinuxLists.cc - KASAN: vmalloc-out-of-bounds Read in bpf_trace

2020-11-02 11:56:22

Subject: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

Hello,

syzbot found the following issue on:

HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
git tree: bpf
console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000

The issue was bisected to:

commit 9df1c28bb75217b244257152ab7d788bb2a386d0
Author: Matt Mullins <[email protected]>
Date: Fri Apr 26 18:49:47 2019 +0000

bpf: add writable context for raw tracepoints

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=12b6c4da500000
final oops: https://syzkaller.appspot.com/x/report.txt?x=11b6c4da500000
console output: https://syzkaller.appspot.com/x/log.txt?x=16b6c4da500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 9df1c28bb752 ("bpf: add writable context for raw tracepoints")

==================================================================
BUG: KASAN: vmalloc-out-of-bounds in __bpf_trace_run kernel/trace/bpf_trace.c:2045 [inline]
BUG: KASAN: vmalloc-out-of-bounds in bpf_trace_run3+0x3e0/0x3f0 kernel/trace/bpf_trace.c:2083
Read of size 8 at addr ffffc90000e6c030 by task kworker/0:3/3754

CPU: 0 PID: 3754 Comm: kworker/0:3 Not tainted 5.9.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: 0x0 (events)
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0x5/0x4c8 mm/kasan/report.c:385
__kasan_report mm/kasan/report.c:545 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
__bpf_trace_run kernel/trace/bpf_trace.c:2045 [inline]
bpf_trace_run3+0x3e0/0x3f0 kernel/trace/bpf_trace.c:2083
__bpf_trace_sched_switch+0xdc/0x120 include/trace/events/sched.h:138
__traceiter_sched_switch+0x64/0xb0 include/trace/events/sched.h:138
trace_sched_switch include/trace/events/sched.h:138 [inline]
__schedule+0xeb8/0x2130 kernel/sched/core.c:4520
schedule+0xcf/0x270 kernel/sched/core.c:4601
worker_thread+0x14c/0x1120 kernel/workqueue.c:2439
kthread+0x3af/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Memory state around the buggy address:
 ffffc90000e6bf00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 ffffc90000e6bf80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>ffffc90000e6c000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                                     ^
 ffffc90000e6c080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 ffffc90000e6c100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

2020-11-11 15:00:04

by Dmitry Vyukov

Subject: Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

On Mon, Nov 2, 2020 at 12:54 PM syzbot
<[email protected]> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
> git tree: bpf
> console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
> kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
> dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000
>
> The issue was bisected to:
>
> commit 9df1c28bb75217b244257152ab7d788bb2a386d0
> Author: Matt Mullins <[email protected]>
> Date: Fri Apr 26 18:49:47 2019 +0000
>
> bpf: add writable context for raw tracepoints

We have a number of kernel memory corruptions related to bpf_trace_run now:
https://groups.google.com/g/syzkaller-bugs/search?q=kernel%2Ftrace%2Fbpf_trace.c

Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
Or they shouldn't?

Looking at the description of Matt's commit, it seems that corruptions
should not be possible (bounded buffer, checked size, etc). Then it
means it's a real kernel bug?


2020-11-13 05:45:41

by Matt Mullins

Subject: Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

On Wed, Nov 11, 2020 at 03:57:50PM +0100, Dmitry Vyukov wrote:
> On Mon, Nov 2, 2020 at 12:54 PM syzbot
> <[email protected]> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
> > git tree: bpf
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
> > dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
> > compiler: gcc (GCC) 10.1.0-syz 20200507
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000
> >
> > The issue was bisected to:
> >
> > commit 9df1c28bb75217b244257152ab7d788bb2a386d0
> > Author: Matt Mullins <[email protected]>
> > Date: Fri Apr 26 18:49:47 2019 +0000
> >
> > bpf: add writable context for raw tracepoints
>
>
> We have a number of kernel memory corruptions related to bpf_trace_run now:
> https://groups.google.com/g/syzkaller-bugs/search?q=kernel%2Ftrace%2Fbpf_trace.c
>
> Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
> Or they shouldn't?
>
> Looking at the description of Matt's commit, it seems that corruptions
> should not be possible (bounded buffer, checked size, etc). Then it
> means it's a real kernel bug?

This bug doesn't seem to be related to the writability of the
tracepoint; it bisected to that commit simply because it used
BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE for the reproducer and it EINVAL's
before that program type was introduced. The BPF program it loads is
pretty much a no-op.

The problem here is a kmalloc failure injection into
tracepoint_probe_unregister, but the error is ignored -- so the bpf
program is freed even though the tracepoint is never unregistered.

I have a first pass at a patch to pipe through the error code, but it's
pretty ugly. It's also called from the file_operations ->release(), for
which errors are solidly ignored in __fput(), so I'm not sure what the
best way to handle ENOMEM is...
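
The failure mode described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct prog`, `struct tracepoint_model`, `model_unregister` and the two release helpers are hypothetical names, not kernel APIs, and a `freed` flag stands in for the real free so that nothing dangling is ever dereferenced.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a BPF program; `freed` models its lifetime. */
struct prog {
	int freed;
};

/* Hypothetical stand-in for a tracepoint with a single attached probe. */
struct tracepoint_model {
	struct prog *probe;	/* what the tracepoint calls when it fires */
};

/*
 * Models an unregister path that has to allocate memory.
 * fail_alloc != 0 simulates an injected allocation failure.
 */
static int model_unregister(struct tracepoint_model *tp, int fail_alloc)
{
	if (fail_alloc)
		return -ENOMEM;	/* the probe stays attached */
	tp->probe = NULL;
	return 0;
}

/*
 * Buggy pattern: the unregister result is dropped and the program is
 * released anyway. Returns 1 if the tracepoint still points at a
 * released program, 0 otherwise.
 */
static int release_ignoring_error(struct tracepoint_model *tp, int fail_alloc)
{
	struct prog *p = tp->probe;

	(void)model_unregister(tp, fail_alloc);
	p->freed = 1;
	return tp->probe != NULL && tp->probe->freed;
}

/*
 * Error-propagating pattern: the program is released only after the
 * tracepoint has really let go of it.
 */
static int release_checking_error(struct tracepoint_model *tp, int fail_alloc)
{
	struct prog *p = tp->probe;
	int err = model_unregister(tp, fail_alloc);

	if (err)
		return err;
	p->freed = 1;
	return 0;
}
```

With `fail_alloc` set, `release_ignoring_error()` reports a dangling probe, which corresponds to the tracepoint later calling into freed program memory. `release_checking_error()` avoids that in the model, but it leaves the caller holding an error it has to act on, which is exactly the difficulty with `->release()` raised above.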


2020-11-13 16:12:18

by Yonghong Song

Subject: Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

On 11/12/20 9:37 PM, Matt Mullins wrote:
> On Wed, Nov 11, 2020 at 03:57:50PM +0100, Dmitry Vyukov wrote:
>> On Mon, Nov 2, 2020 at 12:54 PM syzbot
>> <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
>>> git tree: bpf
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
>>> compiler: gcc (GCC) 10.1.0-syz 20200507
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000
>>>
>>> The issue was bisected to:
>>>
>>> commit 9df1c28bb75217b244257152ab7d788bb2a386d0
>>> Author: Matt Mullins <[email protected]>
>>> Date: Fri Apr 26 18:49:47 2019 +0000
>>>
>>> bpf: add writable context for raw tracepoints
>>
>>
>> We have a number of kernel memory corruptions related to bpf_trace_run now:
>> https://groups.google.com/g/syzkaller-bugs/search?q=kernel/trace/bpf_trace.c
>>
>> Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
>> Or they shouldn't?
>>
>> Looking at the description of Matt's commit, it seems that corruptions
>> should not be possible (bounded buffer, checked size, etc). Then it
>> means it's a real kernel bug?
>
> This bug doesn't seem to be related to the writability of the
> tracepoint; it bisected to that commit simply because it used
> BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE for the reproducer and it EINVAL's
> before that program type was introduced. The BPF program it loads is
> pretty much a no-op.
>
> The problem here is a kmalloc failure injection into
> tracepoint_probe_unregister, but the error is ignored -- so the bpf
> program is freed even though the tracepoint is never unregistered.
>
> I have a first pass at a patch to pipe through the error code, but it's
> pretty ugly. It's also called from the file_operations ->release(), for

Maybe you can still post the patch, so people can review and make
suggestions which may lead to a *better* solution.

> which errors are solidly ignored in __fput(), so I'm not sure what the
> best way to handle ENOMEM is...
[...]

2021-02-10 18:36:26

by Eric Dumazet

Subject: Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

On 11/13/20 5:08 PM, Yonghong Song wrote:
>
>
> On 11/12/20 9:37 PM, Matt Mullins wrote:
>> On Wed, Nov 11, 2020 at 03:57:50PM +0100, Dmitry Vyukov wrote:
>>> On Mon, Nov 2, 2020 at 12:54 PM syzbot
>>> <[email protected]> wrote:
>>>>
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit:    080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
>>>> git tree:       bpf
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
>>>> compiler:       gcc (GCC) 10.1.0-syz 20200507
>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000
>>>>
>>>> The issue was bisected to:
>>>>
>>>> commit 9df1c28bb75217b244257152ab7d788bb2a386d0
>>>> Author: Matt Mullins <[email protected]>
>>>> Date:   Fri Apr 26 18:49:47 2019 +0000
>>>>
>>>>      bpf: add writable context for raw tracepoints
>>>
>>>
>>> We have a number of kernel memory corruptions related to bpf_trace_run now:
>>> https://groups.google.com/g/syzkaller-bugs/search?q=kernel/trace/bpf_trace.c
>>>
>>> Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
>>> Or they shouldn't?
>>>
>>> Looking at the description of Matt's commit, it seems that corruptions
>>> should not be possible (bounded buffer, checked size, etc). Then it
>>> means it's a real kernel bug?
>>
>> This bug doesn't seem to be related to the writability of the
>> tracepoint; it bisected to that commit simply because it used
>> BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE for the reproducer and it EINVAL's
>> before that program type was introduced. The BPF program it loads is
>> pretty much a no-op.
>>
>> The problem here is a kmalloc failure injection into
>> tracepoint_probe_unregister, but the error is ignored -- so the bpf
>> program is freed even though the tracepoint is never unregistered.
>>
>> I have a first pass at a patch to pipe through the error code, but it's
>> pretty ugly. It's also called from the file_operations ->release(), for
>
> Maybe you can still post the patch, so people can review and make suggestions which may lead to a *better* solution.

ping

This bug is still there.

>
>> which errors are solidly ignored in __fput(), so I'm not sure what the
>> best way to handle ENOMEM is...
> [...]

2021-02-10 19:55:08

by Steven Rostedt

Subject: Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

On Wed, 10 Feb 2021 19:23:38 +0100
Eric Dumazet <[email protected]> wrote:

> >> The problem here is a kmalloc failure injection into
> >> tracepoint_probe_unregister, but the error is ignored -- so the bpf
> >> program is freed even though the tracepoint is never unregistered.
> >>
> >> I have a first pass at a patch to pipe through the error code, but it's
> >> pretty ugly. It's also called from the file_operations ->release(), for
> >
> > Maybe you can still post the patch, so people can review and make suggestions which may lead to a *better* solution.
>
>
> ping
>
> This bug is still there.

Is this a bug via syzkaller?

I have this fix in linux-next:

befe6d946551 ("tracepoint: Do not fail unregistering a probe due to memory failure")

But because it is using undefined behavior (calling a sub return from a
call that has parameters, but Peter Zijlstra says is OK), I'm hesitant to
send it to Linus now or label it as stable.
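
The commit title suggests that unregistration no longer depends on an allocation succeeding. One way to get that property, sketched here purely as an assumption about the general technique and not as the code in befe6d946551, is to overwrite the removed entry in place with a do-nothing stub when no new array can be allocated. All names below (`probe_slot`, `stub_probe`, `stub_out_probe`, `run_probes`, `count_probe`) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical probe signature: one opaque data pointer. */
typedef void (*probe_fn)(void *data);

/* Hypothetical slot in a tracepoint's probe array. */
struct probe_slot {
	probe_fn func;
	void *data;
};

/* Do-nothing replacement for a probe that has been removed. */
static void stub_probe(void *data)
{
	(void)data;
}

/* Example probe used for illustration: counts how often it is called. */
static void count_probe(void *data)
{
	(*(int *)data)++;
}

/*
 * Remove (func, data) without allocating: every matching slot is
 * redirected to the stub. Returns the number of slots replaced.
 */
static size_t stub_out_probe(struct probe_slot *slots, size_t n,
			     probe_fn func, void *data)
{
	size_t replaced = 0;

	for (size_t i = 0; i < n; i++) {
		if (slots[i].func == func && slots[i].data == data) {
			slots[i].func = stub_probe;
			slots[i].data = NULL;
			replaced++;
		}
	}
	return replaced;
}

/* Fire the tracepoint: call every slot in order. */
static void run_probes(const struct probe_slot *slots, size_t n)
{
	for (size_t i = 0; i < n; i++)
		slots[i].func(slots[i].data);
}
```

In this model the stub has the same signature as every probe, so calling it is well defined. The concern about undefined behavior arises when a single stub has to stand in for probes that take different parameter lists.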

Now this can only happen if kmalloc fails from here (called by func_remove).

static inline void *allocate_probes(int count)
{
	struct tp_probes *p = kmalloc(struct_size(p, probes, count),
				      GFP_KERNEL);
	return p == NULL ? NULL : p->probes;
}

As probes and count together is typically much less than a page (unless you
are doing fuzz testing and adding a ton of callbacks to a single
tracepoint), that kmalloc should always succeed.

The failure above can only happen if allocate_probes returns NULL, which is
extremely unlikely.
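
To put rough numbers on "much less than a page", the size computation can be modelled in userspace. This is a sketch only: the field layout of `probes_model` and `probe_entry_model` is an assumption made for illustration rather than a copy of the kernel structures, `malloc()` stands in for `kmalloc()`, and the overflow handling (saturating to `SIZE_MAX`) follows my understanding of what `struct_size()` does.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed shape of one probe entry (illustrative, not the kernel's). */
struct probe_entry_model {
	void *func;
	void *data;
	int prio;
};

/* Assumed shape of the allocation: a small header plus a flexible array. */
struct probes_model {
	void *rcu_placeholder[2];	/* stands in for bookkeeping fields */
	struct probe_entry_model probes[];
};

/* Header plus count entries, saturating to SIZE_MAX on overflow. */
static size_t probes_alloc_size(size_t count)
{
	const size_t head = sizeof(struct probes_model);
	const size_t elem = sizeof(struct probe_entry_model);

	if (count > (SIZE_MAX - head) / elem)
		return SIZE_MAX;
	return head + count * elem;
}

/* Same contract as the function quoted above: NULL on failure, else the array. */
static struct probe_entry_model *allocate_probes_model(size_t count)
{
	struct probes_model *p = malloc(probes_alloc_size(count));

	return p == NULL ? NULL : p->probes;
}

/* Recover the enclosing allocation from the array pointer and free it. */
static void free_probes_model(struct probe_entry_model *probes)
{
	if (probes != NULL)
		free((char *)probes - offsetof(struct probes_model, probes));
}
```

Under these assumptions, a tracepoint with a handful of probes needs only a few hundred bytes, so the allocation would normally be expected to fail only under fault injection or when a very large number of callbacks is attached.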

My question is, how is this triggered? And this should only be triggered by
root doing stupid crap. Is it that critical to have fixed?

-- Steve