2020-09-10 09:32:33

by syzbot

Subject: WARNING in bpf_raw_tp_link_fill_link_info

Hello,

syzbot found the following issue on:

HEAD commit: 7fb5eefd selftests/bpf: Fix test_sysctl_loop{1, 2} failure..
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1424fdb3900000
kernel config: https://syzkaller.appspot.com/x/.config?x=b6856d16f78d8fa9
dashboard link: https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14a1f411900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10929c11900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6854 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6854 at include/linux/thread_info.h:150 copy_to_user include/linux/uaccess.h:167 [inline]
WARNING: CPU: 0 PID: 6854 at include/linux/thread_info.h:150 bpf_raw_tp_link_fill_link_info+0x306/0x350 kernel/bpf/syscall.c:2661
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 6854 Comm: syz-executor574 Not tainted 5.9.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x18f/0x20d lib/dump_stack.c:118
panic+0x2e3/0x75c kernel/panic.c:231
__warn.cold+0x20/0x4a kernel/panic.c:600
report_bug+0x1bd/0x210 lib/bug.c:198
handle_bug+0x38/0x90 arch/x86/kernel/traps.c:234
exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:254
asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536
RIP: 0010:check_copy_size include/linux/thread_info.h:150 [inline]
RIP: 0010:copy_to_user include/linux/uaccess.h:167 [inline]
RIP: 0010:bpf_raw_tp_link_fill_link_info+0x306/0x350 kernel/bpf/syscall.c:2661
Code: 41 bc ea ff ff ff e9 35 ff ff ff 4c 89 ff e8 41 66 33 00 e9 d0 fd ff ff 4c 89 ff e8 a4 66 33 00 e9 06 ff ff ff e8 ca ed f2 ff <0f> 0b eb 94 48 89 ef e8 2e 66 33 00 e9 65 fd ff ff e8 24 66 33 00
RSP: 0018:ffffc900051c7bd0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffc900051c7c60 RCX: ffffffff818179d6
RDX: ffff88808b490000 RSI: ffffffff81817a96 RDI: 0000000000000006
RBP: 0000000000000019 R08: 0000000000000000 R09: ffffc900051c7c7f
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000019
R13: 0000000000001265 R14: ffffffff8986ecc0 R15: ffffc900051c7c78
bpf_link_get_info_by_fd kernel/bpf/syscall.c:3626 [inline]
bpf_obj_get_info_by_fd+0x43a/0xc40 kernel/bpf/syscall.c:3664
__do_sys_bpf+0x1906/0x4b30 kernel/bpf/syscall.c:4237
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4405f9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff47155808 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004405f9
RDX: 0000000000000010 RSI: 00000000200000c0 RDI: 000000000000000f
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000401e00
R13: 0000000000401e90 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2020-09-10 22:04:18

by Andrii Nakryiko

Subject: Re: WARNING in bpf_raw_tp_link_fill_link_info

On Thu, Sep 10, 2020 at 2:31 AM syzbot
<[email protected]> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 7fb5eefd selftests/bpf: Fix test_sysctl_loop{1, 2} failure..
> git tree: bpf-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1424fdb3900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=b6856d16f78d8fa9
> dashboard link: https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14a1f411900000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10929c11900000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]
>
>

#syz fix: b474959d5afd ("bpf: Fix a buffer out-of-bound access when
filling raw_tp link_info")

>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at [email protected].
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this issue, for details see:
> https://goo.gl/tpsmEJ#testing-patches

2020-09-12 11:40:14

by Anant Thazhemadam

Subject: [PATCH] Using a pointer and kzalloc in place of a struct directly

Updated the usage of a struct variable directly, in bpf_link_get_info_by_fd
to using a pointer of the same type instead, which points to a memory
location allocated using kzalloc.

Signed-off-by: Anant Thazhemadam <[email protected]>
---
I saw this bug (https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3)
and tried to come up with a patch for it (before I saw that this had already
been taken care of).
Although I don't think it fundamentally changes how things work, it still
seems to have fixed the error on its own.
I'd like to hear anyone's 2c on this, and to know whether using a pointer to
info (of type bpf_link_info) instead would be a welcome change in general,
even if it was not centered around fixing the bug.
If instead this patch might make something go wrong somewhere, or passing
the syzbot test was a false positive, I would appreciate it if you could shed
some light on that for me as well.
If this patch seems acceptable, then I'll send in a cleaner, more articulate
v2, if required.
I'm just trying to understand how, and sometimes why, things work in and
around the kernel.
Thanks,
Anant


kernel/bpf/syscall.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4108ef3b828b..01b9c203ef65 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3605,30 +3605,31 @@ static int bpf_link_get_info_by_fd(struct file *file,
 				  union bpf_attr __user *uattr)
 {
 	struct bpf_link_info __user *uinfo = u64_to_user_ptr(attr->info.info);
-	struct bpf_link_info info;
+	struct bpf_link_info *info = NULL;
 	u32 info_len = attr->info.info_len;
 	int err;

-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(uinfo, sizeof(struct bpf_link_info), info_len);
+
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);

-	memset(&info, 0, sizeof(info));
-	if (copy_from_user(&info, uinfo, info_len))
+	info = kzalloc(sizeof(struct bpf_link_info), GFP_KERNEL);
+	if (copy_from_user(info, uinfo, info_len))
 		return -EFAULT;

-	info.type = link->type;
-	info.id = link->id;
-	info.prog_id = link->prog->aux->id;
+	info->type = link->type;
+	info->id = link->id;
+	info->prog_id = link->prog->aux->id;

 	if (link->ops->fill_link_info) {
-		err = link->ops->fill_link_info(link, &info);
+		err = link->ops->fill_link_info(link, info);
 		if (err)
 			return err;
 	}

-	if (copy_to_user(uinfo, &info, info_len) ||
+	if (copy_to_user(uinfo, info, info_len) ||
 	    put_user(info_len, &uattr->info.info_len))
 		return -EFAULT;

--
2.25.1

2020-09-12 11:50:17

by Greg Kroah-Hartman

Subject: Re: [PATCH] Using a pointer and kzalloc in place of a struct directly

On Sat, Sep 12, 2020 at 05:08:04PM +0530, Anant Thazhemadam wrote:
> Updated the usage of a struct variable directly, in bpf_link_get_info_by_fd
> to using a pointer of the same type instead, which points to a memory
> location allocated using kzalloc.
>
> Signed-off-by: Anant Thazhemadam <[email protected]>

Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
list as well?

Anyway, comment on your patch below:

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 4108ef3b828b..01b9c203ef65 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3605,30 +3605,31 @@ static int bpf_link_get_info_by_fd(struct file *file,
>  				  union bpf_attr __user *uattr)
>  {
>  	struct bpf_link_info __user *uinfo = u64_to_user_ptr(attr->info.info);
> -	struct bpf_link_info info;
> +	struct bpf_link_info *info = NULL;
>  	u32 info_len = attr->info.info_len;
>  	int err;
>
> -	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
> +	err = bpf_check_uarg_tail_zero(uinfo, sizeof(struct bpf_link_info), info_len);
> +
>  	if (err)
>  		return err;
>  	info_len = min_t(u32, sizeof(info), info_len);
>
> -	memset(&info, 0, sizeof(info));
> -	if (copy_from_user(&info, uinfo, info_len))
> +	info = kzalloc(sizeof(struct bpf_link_info), GFP_KERNEL);
> +	if (copy_from_user(info, uinfo, info_len))
>  		return -EFAULT;

You leaked memory :(

Did you test this patch? Where do you free this memory, I don't see
that happening anywhere in this patch, did I miss it?

And odds are this change will slow things down, right? Why make this
change, what's wrong with the structure being on the stack?

thanks,

greg k-h
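
As an illustration of the leak pointed out above: once the structure lives on the heap, every exit path after the allocation has to release it. The following is only a minimal userspace sketch of that pattern, not kernel code. `struct link_info`, `get_info`, `tracked_zalloc`, `tracked_free` and the `fail` flag are hypothetical names; `calloc`/`free` stand in for `kzalloc`/`kfree`, and `memcpy` stands in for the user-copy helpers.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_link_info; not the kernel layout. */
struct link_info {
    unsigned int type;
    unsigned int id;
    unsigned int prog_id;
};

/* Counts outstanding allocations so that a leak becomes observable. */
static int live_allocs;

static void *tracked_zalloc(size_t n)
{
    void *p = calloc(1, n);

    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p)
        live_allocs--;
    free(p);
}

/*
 * Fill a heap-allocated info struct and copy it to the caller.
 * 'fail' simulates an error in the middle of the function, such as a
 * failing copy from the caller's buffer.
 */
static int get_info(struct link_info *out, size_t out_len, int fail)
{
    struct link_info *info;
    int err = 0;

    info = tracked_zalloc(sizeof(*info));
    if (!info)
        return -ENOMEM;          /* allocation failure must be handled */

    if (fail) {
        err = -EFAULT;
        goto out_free;           /* error path still frees the buffer */
    }

    info->type = 1;
    info->id = 42;
    info->prog_id = 7;

    if (out_len > sizeof(*info))
        out_len = sizeof(*info);
    memcpy(out, info, out_len);

out_free:
    tracked_free(info);          /* single exit point releases the memory */
    return err;
}
```

With a single `out_free` label, the success path and the error path both pass through the free, so `live_allocs` returns to zero after every call. The posted patch has early `return` statements after the allocation and no free at all, which is the leak being described. The sketch also checks the allocation result, which the posted patch does not.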

2020-09-12 12:15:15

by Anant Thazhemadam

Subject: Re: [PATCH] Using a pointer and kzalloc in place of a struct directly


On 12/09/20 5:17 pm, Greg KH wrote:
> Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
> list as well?
Oh, I'm sorry about that. I pulled the emails of all the people to whom
this mail was sent off from the header in lkml mail, and just cc-ed
everyone.

> You leaked memory :(
>
> Did you test this patch? Where do you free this memory, I don't see
> that happening anywhere in this patch, did I miss it?

Yes, I did test this patch, which didn't seem to trigger any issues.
It surprised me so much, that I ended up sending it in, to have
it checked out.

I wasn't sure where exactly the memory allocated here was
supposed to be freed (might be why the current implementation
isn't exactly using kzalloc). I forgot to mention it in the initial mail,
and I was hoping that someone would point me in the right direction
(if this approach was actually going to be considered, that is, which in
retrospect I now feel might not be the best thing)

> And odds are this change will slow things down, right? Why make this
> change, what's wrong with the structure being on the stack?

For more clarity, I'm not exactly pushing for this patch to get accepted,
as much as I'm trying to understand what exactly is going on, and maybe
even understand syzbot's working a little better in the process.

At the time when I did send in this patch, the error seemed to be
present as far as syzbot was concerned. (I had sent in a test request not
too long before I sent this in, which returned a positive).
I just wanted to know, in the off-chance that the commit fix that was
pointed out wasn't merged in the tree yet when syzbot tested it, why
exactly would a patch like this lead to no issues getting triggered?
(I understand that if the fix was in the tree when syzbot ran the next test,
this patch is immediately rendered obsolete, of course)

It felt a bit like an anomaly to me, and I figured it might be
worth investigating, is all; and I'd either infer something about syzbot,
or about whatever just happened there.

Now that I say it out loud, I realize it might sound a little silly, but
then again, I had tested the 'validity' of the bug, not too long before I
sent in the patch for syzbot to test too, and it seemed to be present when I did.

Thanks,
Anant


2020-09-12 15:05:41

by Greg Kroah-Hartman

Subject: Re: [PATCH] Using a pointer and kzalloc in place of a struct directly

On Sat, Sep 12, 2020 at 05:43:38PM +0530, Anant Thazhemadam wrote:
>
> On 12/09/20 5:17 pm, Greg KH wrote:
> > Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
> > list as well?
> Oh, I'm sorry about that. I pulled the emails of all the people to whom
> this mail was sent off from the header in lkml mail, and just cc-ed
> everyone.
>
> > You leaked memory :(
> >
> > Did you test this patch? Where do you free this memory, I don't see
> > that happening anywhere in this patch, did I miss it?
>
> Yes, I did test this patch, which didn't seem to trigger any issues.
> It surprised me so much, that I ended up sending it in, to have
> it checked out.

You might not have noticed the memory leak if you were not looking for
it.

How did you test this?

> I wasn't sure where exactly the memory allocated here was
> supposed to be freed (might be why the current implementation
> isn't exactly using kzalloc). I forgot to mention it in the initial mail,
> and I was hoping that someone would point me in the right direction
> (if this approach was actually going to be considered, that is, which in
> retrospect I now feel might not be the best thing)

It has to be freed somewhere, you wrote the patch :)

But back to the original question here, why do you feel this change is
needed? What does this do better/faster/more correct than the code that
is currently there? Unless you can provide that, the change should not
be needed, right?

thanks,

greg k-h

2020-09-12 20:03:57

by Anant Thazhemadam

Subject: Re: [PATCH] Using a pointer and kzalloc in place of a struct directly


On 12/09/20 8:25 pm, Greg KH wrote:
> On Sat, Sep 12, 2020 at 05:43:38PM +0530, Anant Thazhemadam wrote:
>> On 12/09/20 5:17 pm, Greg KH wrote:
>>> Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
>>> list as well?
>> Oh, I'm sorry about that. I pulled the emails of all the people to whom
>> this mail was sent off from the header in lkml mail, and just cc-ed
>> everyone.
>>
>>> You leaked memory :(
>>>
>>> Did you test this patch? Where do you free this memory, I don't see
>>> that happening anywhere in this patch, did I miss it?
>> Yes, I did test this patch, which didn't seem to trigger any issues.
>> It surprised me so much, that I ended up sending it in, to have
>> it checked out.
> You might not have noticed the memory leak if you were not looking for
> it.
>
> How did you test this?
Ah, that must be it. I tested this using syzbot, which wouldn't have looked
for memory leaks, but only the issue that was reported. My apologies.
>> I wasn't sure where exactly the memory allocated here was
>> supposed to be freed (might be why the current implementation
>> isn't exactly using kzalloc). I forgot to mention it in the initial mail,
>> and I was hoping that someone would point me in the right direction
>> (if this approach was actually going to be considered, that is, which in
>> retrospect I now feel might not be the best thing)
> It has to be freed somewhere, you wrote the patch :)
>
> But back to the original question here, why do you feel this change is
> needed? What does this do better/faster/more correct than the code that
> is currently there? Unless you can provide that, the change should not
> be needed, right?
I was initially trying to see if allocating memory would be an appropriate
heuristic for getting a better sense of the bug and crash report. At that
moment, that was my goal, and I figured that I'd deal with the rest
(such as freeing the memory) later on, if this was something that could work.

I was surprised when the patch (although it caused a memory leak), seemed
to pass the test for the bug, without triggering any issues; since this patch
basically only allocates memory as compared to locally declaring variables.

I wanted some input or explanation, about how is it that doing this no longer
triggers the bug?
It felt (and still feels) extremely unlikely to me that allocating memory also
prevents the issue, which is why I figured it might help to ask someone
whether it does, and I felt that sending in the patch might make the question
sound at least a little less absurd.
Also, if simply allocating memory provides this security (which syzbot seems to
approve, but I still do not fully understand how), wouldn't it be a
welcome change?

Like I said, I'm trying to understand how things work, a little better here,
and I apologize for any confusion that I may have caused.

TL;DR:
I tried allocating memory as a heuristic while trying to understand
the bug and the bpf-next tree a little better.
Surprisingly, the bug didn't seem to get triggered.
I would like to know why the bug didn't get triggered when syzbot
applied this patch to the bpf-next tree.
If the reason and the memory-allocation approach seem sensible enough
(or provide some sort of security that I seem to be oblivious to), I will try to
come up with a way to free the allocated memory, and send in a v2 as well.

(For anyone who might say that there is another commit that fixes this: yes, I
am aware.
However, if you take a look at the bug at
https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3
you can see that a generic test (no patch attached) to check whether the bug was
still valid was issued much later, and it still triggered an issue.)

2020-09-13 11:53:58

by Greg Kroah-Hartman

Subject: Re: [PATCH] Using a pointer and kzalloc in place of a struct directly

On Sun, Sep 13, 2020 at 01:32:43AM +0530, Anant Thazhemadam wrote:
> On 12/09/20 8:25 pm, Greg KH wrote:
> > On Sat, Sep 12, 2020 at 05:43:38PM +0530, Anant Thazhemadam wrote:
> >> On 12/09/20 5:17 pm, Greg KH wrote:
> >>> Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
> >>> list as well?
> >> Oh, I'm sorry about that. I pulled the emails of all the people to whom
> >> this mail was sent off from the header in lkml mail, and just cc-ed
> >> everyone.
> >>
> >>> You leaked memory :(
> >>>
> >>> Did you test this patch? Where do you free this memory, I don't see
> >>> that happening anywhere in this patch, did I miss it?
> >> Yes, I did test this patch, which didn't seem to trigger any issues.
> >> It surprised me so much, that I ended up sending it in, to have
> >> it checked out.
> > You might not have noticed the memory leak if you were not looking for
> > it.
> >
> > How did you test this?
> Ah, that must be it. I tested this using syzbot, which wouldn't have looked
> for memory leaks, but only the issue that was reported. My apologies.
> >> I wasn't sure where exactly the memory allocated here was
> >> supposed to be freed (might be why the current implementation
> >> isn't exactly using kzalloc). I forgot to mention it in the initial mail,
> >> and I was hoping that someone would point me in the right direction
> >> (if this approach was actually going to be considered, that is, which in
> >> retrospect I now feel might not be the best thing)
> > It has to be freed somewhere, you wrote the patch :)
> >
> > But back to the original question here, why do you feel this change is
> > needed? What does this do better/faster/more correct than the code that
> > is currently there? Unless you can provide that, the change should not
> > be needed, right?
> I was initially trying to see if allocating memory would be an appropriate
> heuristic for getting a better sense of the bug and crash report. At that
> moment, that was my goal, and I figured that I'd deal with the rest
> (such as freeing the memory) later on, if this was something that could work.
>
> I was surprised when the patch (although it caused a memory leak), seemed
> to pass the test for the bug, without triggering any issues; since this patch
> basically only allocates memory as compared to locally declaring variables.
>
> I wanted some input or explanation, about how is it that doing this no longer
> triggers the bug?

That really is up to you to work out, sorry.

Look at what the syzbot is testing, and look at the code change to see
the difference, and you should notice what memory is now being cleared
that previously was not.

good luck!

greg k-h
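
One possible way to follow up on the hint above: in the posted diff the line `info_len = min_t(u32, sizeof(info), info_len);` is left unchanged, but `info` has become a pointer, so `sizeof(info)` is now the size of a pointer rather than the size of the structure. The userspace sketch below shows what that does to the clamp and to the bytes copied in from the caller. `struct link_info` is a hypothetical stand-in and not the kernel's layout, the helper names are made up for illustration, and whether this is what actually hid the warning in syzbot's run is an assumption that the thread does not confirm.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_link_info; not the kernel layout. */
struct link_info {
    unsigned int type;
    unsigned int id;
    unsigned int prog_id;
    unsigned long long tp_name;   /* address of a caller-supplied buffer */
    unsigned int tp_name_len;
};

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Clamp as in the original code: 'info' is a struct on the stack. */
static size_t clamp_with_struct(size_t user_len)
{
    struct link_info info;

    return min_size(sizeof(info), user_len);   /* size of the struct */
}

/* Clamp as in the posted patch: 'info' is now a pointer. */
static size_t clamp_with_pointer(size_t user_len)
{
    struct link_info *info = NULL;

    return min_size(sizeof(info), user_len);   /* size of a pointer */
}

/*
 * Copy 'len' bytes of caller input over a zeroed struct and report the
 * tp_name field that the rest of the code would then see.
 */
static unsigned long long tp_name_after_copy(const struct link_info *user,
                                             size_t len)
{
    struct link_info info;

    memset(&info, 0, sizeof(info));
    memcpy(&info, user, len);
    return info.tp_name;
}
```

With the struct-based clamp the caller's `tp_name` value reaches the kernel-side copy. With the pointer-based clamp only the first few bytes are copied, and the later fields keep the zeros from the zeroing allocation, so any code that acts on those fields sees "no buffer supplied" regardless of what the caller passed.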

2020-10-30 10:11:46

by Dmitry Vyukov

Subject: Re: WARNING in bpf_raw_tp_link_fill_link_info

On Fri, Sep 11, 2020 at 12:01 AM Andrii Nakryiko
<[email protected]> wrote:
>
> On Thu, Sep 10, 2020 at 2:31 AM syzbot
> <[email protected]> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 7fb5eefd selftests/bpf: Fix test_sysctl_loop{1, 2} failure..
> > git tree: bpf-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1424fdb3900000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=b6856d16f78d8fa9
> > dashboard link: https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3
> > compiler: gcc (GCC) 10.1.0-syz 20200507
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14a1f411900000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10929c11900000
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: [email protected]
> >
>
> #syz fix: b474959d5afd ("bpf: Fix a buffer out-of-bound access when
> filling raw_tp link_info")

Complete patch title:

#syz fix:
bpf: Fix a buffer out-of-bound access when filling raw_tp link_info
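
The thread does not show the fix itself. As a rough illustration of how a warning from the size check inside `copy_to_user` can be reached, the sketch below assumes two things that are not confirmed anywhere in this thread: that the check rejects copy sizes above `INT_MAX`, and that the copy length is computed as a caller-supplied length minus one. All function names are hypothetical, and this is userspace C rather than the kernel's code.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models a sanity check that refuses absurdly large copy sizes. */
static bool copy_size_ok(size_t bytes)
{
    return bytes <= INT_MAX;
}

/*
 * Bytes to copy when the caller's buffer is too short for the full
 * string: everything but the last byte, which is kept for the NUL.
 * With ulen == 0 the unsigned subtraction wraps to 0xffffffff.
 */
static size_t truncated_copy_len(uint32_t ulen)
{
    return (size_t)(ulen - 1);
}

/*
 * Hypothetical guard: the buffer pointer and its length must either
 * both be set or both be unset.
 */
static bool args_consistent(const void *ubuf, uint32_t ulen)
{
    return !ulen == !ubuf;
}
```

In this model, a non-NULL buffer paired with a length of zero passes a check that only rejects "length set, buffer missing", and then produces a copy size far above `INT_MAX`. A guard like `args_consistent` rejects that combination before any length arithmetic happens.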