2018-03-09 21:01:38

by syzbot

[permalink] [raw]
Subject: KASAN: use-after-free Read in sctp_association_free (2)

Hello,

syzbot hit the following crash on net-next commit
fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
Merge tag 'mlx5-updates-2018-02-28-2' of
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

So far this crash happened 2 times on net-next.
C reproducer is attached.
syzkaller reproducer is attached.
Raw console output is attached.
compiler: gcc (GCC) 7.1.1 20170620
.config is attached.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
==================================================================
BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
net/sctp/associola.c:332
Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202

CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23c/0x360 mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:639
SYSC_sendto+0x361/0x5c0 net/socket.c:1748
SyS_sendto+0x40/0x50 net/socket.c:1716
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x446d09
RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60

Allocated by task 4202:
save_stack+0x43/0xd0 mm/kasan/kasan.c:447
set_track mm/kasan/kasan.c:459 [inline]
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:552
kmem_cache_alloc_trace+0x136/0x740 mm/slab.c:3607
kmalloc include/linux/slab.h:512 [inline]
kzalloc include/linux/slab.h:701 [inline]
sctp_association_new+0x114/0x2130 net/sctp/associola.c:308
sctp_sendmsg_new_asoc+0x2e6/0x1000 net/sctp/socket.c:1705
sctp_sendmsg+0x1450/0x1a80 net/sctp/socket.c:2046
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:639
SYSC_sendto+0x361/0x5c0 net/socket.c:1748
SyS_sendto+0x40/0x50 net/socket.c:1716
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7

Freed by task 4202:
save_stack+0x43/0xd0 mm/kasan/kasan.c:447
set_track mm/kasan/kasan.c:459 [inline]
__kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527
__cache_free mm/slab.c:3485 [inline]
kfree+0xd9/0x260 mm/slab.c:3800
sctp_association_destroy net/sctp/associola.c:434 [inline]
sctp_association_put+0x21c/0x2f0 net/sctp/associola.c:883
sctp_wait_for_sndbuf net/sctp/socket.c:8179 [inline]
sctp_sendmsg_to_asoc+0x1693/0x1e80 net/sctp/socket.c:1904
sctp_sendmsg+0xc3e/0x1a80 net/sctp/socket.c:2073
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:639
SYSC_sendto+0x361/0x5c0 net/socket.c:1748
SyS_sendto+0x40/0x50 net/socket.c:1716
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7

The buggy address belongs to the object at ffff8801d8006ac0
which belongs to the cache kmalloc-4096 of size 4096
The buggy address is located 32 bytes inside of
4096-byte region [ffff8801d8006ac0, ffff8801d8007ac0)
The buggy address belongs to the page:
page:ffffea0007600180 count:1 mapcount:0 mapping:ffff8801d8006ac0 index:0x0
compound_mapcount: 0
flags: 0x2fffc0000008100(slab|head)
raw: 02fffc0000008100 ffff8801d8006ac0 0000000000000000 0000000100000001
raw: ffffea0007600120 ffffea0007606c20 ffff8801dac00dc0 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8801d8006980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8801d8006a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff8801d8006a80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
^
ffff8801d8006b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801d8006b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to [email protected].

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.


Attachments:
raw.log.txt (11.58 kB)
repro.syz.txt (3.43 kB)
repro.c.txt (26.07 kB)
config.txt (134.23 kB)
Download all attachments

2018-03-09 22:10:31

by Neil Horman

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on net-next commit
> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
> Merge tag 'mlx5-updates-2018-02-28-2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
>
> So far this crash happened 2 times on net-next.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> IPVS: ftp: loaded support on port[0] = 21
> IPVS: ftp: loaded support on port[0] = 21
> IPVS: ftp: loaded support on port[0] = 21
> IPVS: ftp: loaded support on port[0] = 21
> ==================================================================
> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
> net/sctp/associola.c:332
> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
>
> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:17 [inline]
> dump_stack+0x194/0x24d lib/dump_stack.c:53
> print_address_description+0x73/0x250 mm/kasan/report.c:256
> kasan_report_error mm/kasan/report.c:354 [inline]
> kasan_report+0x23c/0x360 mm/kasan/report.c:412
> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
> sock_sendmsg_nosec net/socket.c:629 [inline]
> sock_sendmsg+0xca/0x110 net/socket.c:639
> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
> SyS_sendto+0x40/0x50 net/socket.c:1716
> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
> entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x446d09
> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
>
I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
returned, and the code path appears to call sctp_association_put twice, leading
to the use after free situation. I'll write a patch this weekend

Neil


2018-03-10 07:59:58

by Xin Long

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <[email protected]> wrote:
> On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
>> Hello,
>>
>> syzbot hit the following crash on net-next commit
>> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
>> Merge tag 'mlx5-updates-2018-02-28-2' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
>>
>> So far this crash happened 2 times on net-next.
>> C reproducer is attached.
>> syzkaller reproducer is attached.
>> Raw console output is attached.
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached.
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: [email protected]
>> It will help syzbot understand when the bug is fixed. See footer for
>> details.
>> If you forward the report, please keep this part and the footer.
>>
>> IPVS: ftp: loaded support on port[0] = 21
>> IPVS: ftp: loaded support on port[0] = 21
>> IPVS: ftp: loaded support on port[0] = 21
>> IPVS: ftp: loaded support on port[0] = 21
>> ==================================================================
>> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
>> net/sctp/associola.c:332
>> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
>>
>> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:17 [inline]
>> dump_stack+0x194/0x24d lib/dump_stack.c:53
>> print_address_description+0x73/0x250 mm/kasan/report.c:256
>> kasan_report_error mm/kasan/report.c:354 [inline]
>> kasan_report+0x23c/0x360 mm/kasan/report.c:412
>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
>> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
>> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
>> sock_sendmsg_nosec net/socket.c:629 [inline]
>> sock_sendmsg+0xca/0x110 net/socket.c:639
>> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
>> SyS_sendto+0x40/0x50 net/socket.c:1716
>> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>> entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> RIP: 0033:0x446d09
>> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
>> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
>> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
>> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
>> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
>> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
>>
> I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
> here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
> returned, and the code path appears to call sctp_association_put twice, leading
> to the use after free situation. I'll write a patch this weekend
Hi, Neil, you're right.

I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
assoc_id, which can only be set when connecting has started.

But I realized that:
f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg

moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
(snd buffer waiting). It means peeloff can be done on a NEW asoc.
So you may want to move it back.

One good thing is the fix shouldn't touch the conflict on
https://lkml.org/lkml/2018/3/7/1175
We can fix it right now, I think. But pls double check it before
submitting the patch. We just can't grow up that fixup for linus
tree's merge.

Thanks.

2018-03-10 13:16:29

by Neil Horman

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <[email protected]> wrote:
> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
> >> Hello,
> >>
> >> syzbot hit the following crash on net-next commit
> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
> >> Merge tag 'mlx5-updates-2018-02-28-2' of
> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
> >>
> >> So far this crash happened 2 times on net-next.
> >> C reproducer is attached.
> >> syzkaller reproducer is attached.
> >> Raw console output is attached.
> >> compiler: gcc (GCC) 7.1.1 20170620
> >> .config is attached.
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: [email protected]
> >> It will help syzbot understand when the bug is fixed. See footer for
> >> details.
> >> If you forward the report, please keep this part and the footer.
> >>
> >> IPVS: ftp: loaded support on port[0] = 21
> >> IPVS: ftp: loaded support on port[0] = 21
> >> IPVS: ftp: loaded support on port[0] = 21
> >> IPVS: ftp: loaded support on port[0] = 21
> >> ==================================================================
> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
> >> net/sctp/associola.c:332
> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
> >>
> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> Google 01/01/2011
> >> Call Trace:
> >> __dump_stack lib/dump_stack.c:17 [inline]
> >> dump_stack+0x194/0x24d lib/dump_stack.c:53
> >> print_address_description+0x73/0x250 mm/kasan/report.c:256
> >> kasan_report_error mm/kasan/report.c:354 [inline]
> >> kasan_report+0x23c/0x360 mm/kasan/report.c:412
> >> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
> >> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
> >> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
> >> sock_sendmsg_nosec net/socket.c:629 [inline]
> >> sock_sendmsg+0xca/0x110 net/socket.c:639
> >> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
> >> SyS_sendto+0x40/0x50 net/socket.c:1716
> >> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
> >> entry_SYSCALL_64_after_hwframe+0x42/0xb7
> >> RIP: 0033:0x446d09
> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
> >>
> > I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
> > here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
> > returned, and the code path appears to call sctp_association_put twice, leading
> > to the use after free situation. I'll write a patch this weekend
> Hi, Neil, you're right.
>
> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
> assoc_id, which can only be set when connecting has started.
>
> But I realized that:
> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
>
> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
> So you may want to move it back.
>
I agree with the root cause, but I'm not sure I agree with just moving the
wait_for_sndbuf call back above the call to associate. I'm not sure I like
relying on placing a call in a spcific order solely to avoid an error condition
that might legitimately occur. I think would rather check the return code at
the call site for the complete set of conditions for which we should not free
the association. Something like this:

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 7d3476a4860d..a68846d2b0ef 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)

/* Send msg to the asoc */
err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
- if (err < 0 && err != -ESRCH && new)
- sctp_association_free(asoc);
+ if ((err != -ESRCH) && (err != -EPIPE))
+ if (err < 0 && new)
+ sctp_association_free(asoc);

out_unlock:
release_sock(sk);

Which I think also avoids the noted conflict.

Thoughts?

Neil

> One good thing is the fix shouldn't touch the conflict on
> https://lkml.org/lkml/2018/3/7/1175
> We can fix it right now, I think. But pls double check it before
> submitting the patch. We just can't grow up that fixup for linus
> tree's merge.
>
> Thanks.
>

2018-03-10 16:23:46

by Xin Long

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Sat, Mar 10, 2018 at 9:13 PM, Neil Horman <[email protected]> wrote:
> On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
>> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <[email protected]> wrote:
>> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
>> >> Hello,
>> >>
>> >> syzbot hit the following crash on net-next commit
>> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
>> >> Merge tag 'mlx5-updates-2018-02-28-2' of
>> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
>> >>
>> >> So far this crash happened 2 times on net-next.
>> >> C reproducer is attached.
>> >> syzkaller reproducer is attached.
>> >> Raw console output is attached.
>> >> compiler: gcc (GCC) 7.1.1 20170620
>> >> .config is attached.
>> >>
>> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> >> Reported-by: [email protected]
>> >> It will help syzbot understand when the bug is fixed. See footer for
>> >> details.
>> >> If you forward the report, please keep this part and the footer.
>> >>
>> >> IPVS: ftp: loaded support on port[0] = 21
>> >> IPVS: ftp: loaded support on port[0] = 21
>> >> IPVS: ftp: loaded support on port[0] = 21
>> >> IPVS: ftp: loaded support on port[0] = 21
>> >> ==================================================================
>> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
>> >> net/sctp/associola.c:332
>> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
>> >>
>> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
>> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> >> Google 01/01/2011
>> >> Call Trace:
>> >> __dump_stack lib/dump_stack.c:17 [inline]
>> >> dump_stack+0x194/0x24d lib/dump_stack.c:53
>> >> print_address_description+0x73/0x250 mm/kasan/report.c:256
>> >> kasan_report_error mm/kasan/report.c:354 [inline]
>> >> kasan_report+0x23c/0x360 mm/kasan/report.c:412
>> >> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>> >> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
>> >> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
>> >> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
>> >> sock_sendmsg_nosec net/socket.c:629 [inline]
>> >> sock_sendmsg+0xca/0x110 net/socket.c:639
>> >> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
>> >> SyS_sendto+0x40/0x50 net/socket.c:1716
>> >> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>> >> entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> >> RIP: 0033:0x446d09
>> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
>> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
>> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
>> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
>> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
>> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
>> >>
>> > I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
>> > here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
>> > returned, and the code path appears to call sctp_association_put twice, leading
>> > to the use after free situation. I'll write a patch this weekend
>> Hi, Neil, you're right.
>>
>> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
>> assoc_id, which can only be set when connecting has started.
>>
>> But I realized that:
>> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
>>
>> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
>> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
>> So you may want to move it back.
>>
> I agree with the root cause, but I'm not sure I agree with just moving the
> wait_for_sndbuf call back above the call to associate. I'm not sure I like
> relying on placing a call in a spcific order solely to avoid an error condition
> that might legitimately occur. I think would rather check the return code at
> the call site for the complete set of conditions for which we should not free
> the association. Something like this:
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 7d3476a4860d..a68846d2b0ef 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
>
> /* Send msg to the asoc */
> err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
> - if (err < 0 && err != -ESRCH && new)
> - sctp_association_free(asoc);
> + if ((err != -ESRCH) && (err != -EPIPE))
> + if (err < 0 && new)
> + sctp_association_free(asoc);
>
> out_unlock:
> release_sock(sk);
>
> Which I think also avoids the noted conflict.
>
> Thoughts?
If sctp_association_free is called for general asoc, yes, I agree with you.
But it's actually only for NEW asoc, a special case, not worth a extra check.
'err != -ESRCH' is already kind of ugly there (I couldn't find a nicer way :D),
I don't hope there will be more like that.

And also, I think I moved sctp_primitive_ASSOCIATE before
sctp_wait_for_sndbuf by accident in that patch(Sorry), which may
already change some behavior. We have to move it back.

2018-03-10 19:06:42

by Neil Horman

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Sun, Mar 11, 2018 at 12:22:32AM +0800, Xin Long wrote:
> On Sat, Mar 10, 2018 at 9:13 PM, Neil Horman <[email protected]> wrote:
> > On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
> >> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <[email protected]> wrote:
> >> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
> >> >> Hello,
> >> >>
> >> >> syzbot hit the following crash on net-next commit
> >> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
> >> >> Merge tag 'mlx5-updates-2018-02-28-2' of
> >> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
> >> >>
> >> >> So far this crash happened 2 times on net-next.
> >> >> C reproducer is attached.
> >> >> syzkaller reproducer is attached.
> >> >> Raw console output is attached.
> >> >> compiler: gcc (GCC) 7.1.1 20170620
> >> >> .config is attached.
> >> >>
> >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> >> Reported-by: [email protected]
> >> >> It will help syzbot understand when the bug is fixed. See footer for
> >> >> details.
> >> >> If you forward the report, please keep this part and the footer.
> >> >>
> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> ==================================================================
> >> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
> >> >> net/sctp/associola.c:332
> >> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
> >> >>
> >> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> >> Google 01/01/2011
> >> >> Call Trace:
> >> >> __dump_stack lib/dump_stack.c:17 [inline]
> >> >> dump_stack+0x194/0x24d lib/dump_stack.c:53
> >> >> print_address_description+0x73/0x250 mm/kasan/report.c:256
> >> >> kasan_report_error mm/kasan/report.c:354 [inline]
> >> >> kasan_report+0x23c/0x360 mm/kasan/report.c:412
> >> >> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >> >> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
> >> >> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
> >> >> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
> >> >> sock_sendmsg_nosec net/socket.c:629 [inline]
> >> >> sock_sendmsg+0xca/0x110 net/socket.c:639
> >> >> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
> >> >> SyS_sendto+0x40/0x50 net/socket.c:1716
> >> >> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
> >> >> entry_SYSCALL_64_after_hwframe+0x42/0xb7
> >> >> RIP: 0033:0x446d09
> >> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
> >> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
> >> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
> >> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
> >> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
> >> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
> >> >>
> >> > I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
> >> > here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
> >> > returned, and the code path appears to call sctp_association_put twice, leading
> >> > to the use after free situation. I'll write a patch this weekend
> >> Hi, Neil, you're right.
> >>
> >> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
> >> assoc_id, which can only be set when connecting has started.
> >>
> >> But I realized that:
> >> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
> >>
> >> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
> >> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
> >> So you may want to move it back.
> >>
> > I agree with the root cause, but I'm not sure I agree with just moving the
> > wait_for_sndbuf call back above the call to associate. I'm not sure I like
> > relying on placing a call in a spcific order solely to avoid an error condition
> > that might legitimately occur. I think would rather check the return code at
> > the call site for the complete set of conditions for which we should not free
> > the association. Something like this:
> >
> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > index 7d3476a4860d..a68846d2b0ef 100644
> > --- a/net/sctp/socket.c
> > +++ b/net/sctp/socket.c
> > @@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
> >
> > /* Send msg to the asoc */
> > err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
> > - if (err < 0 && err != -ESRCH && new)
> > - sctp_association_free(asoc);
> > + if ((err != -ESRCH) && (err != -EPIPE))
> > + if (err < 0 && new)
> > + sctp_association_free(asoc);
> >
> > out_unlock:
> > release_sock(sk);
> >
> > Which I think also avoids the noted conflict.
> >
> > Thoughts?
> If sctp_association_free is called for general asoc, yes, I agree with you.
> But it's actually only for NEW asoc, a special case, not worth a extra check.
> 'err != -ESRCH' is already kind of ugly there (I couldn't find a nicer way :D),
> I don't hope there will be more like that.
>
I agree with you on the uglyness aspect of the return code check, but I really
don't like the idea of placing the call to wait_for_sndbuf to guarantee a given
error code isn't returned, It just feels rickety to me.

> And also, I think I moved sctp_primitive_ASSOCIATE before
> sctp_wait_for_sndbuf by accident in that patch(Sorry), which may
> already change some behavior. We have to move it back.
>
It seemed to me you did that specifically to enable the check for peeloff
events. That is to say, if you check the socket write space first on a new
association, and then get a peeloff event, the write space might have changed
(though I suppose in fairness, we will always be operating the old socket, so
the write space will only grow larger). Either way there should be nothing
really wrong with checking write space after we preform the association.

But I suppose that moving it back is the safer approach. It would be nice to
eliminate that odd error check as well if we can, but I'll start with just
returning the positioning to the way it was, and look to improve it later. I'll
post the patch once I confirm the build works

Neil


2018-03-12 08:18:59

by Xin Long

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Sun, Mar 11, 2018 at 3:04 AM, Neil Horman <[email protected]> wrote:
> On Sun, Mar 11, 2018 at 12:22:32AM +0800, Xin Long wrote:
>> On Sat, Mar 10, 2018 at 9:13 PM, Neil Horman <[email protected]> wrote:
>> > On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
>> >> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <[email protected]> wrote:
>> >> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
>> >> >> Hello,
>> >> >>
>> >> >> syzbot hit the following crash on net-next commit
>> >> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
>> >> >> Merge tag 'mlx5-updates-2018-02-28-2' of
>> >> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
>> >> >>
>> >> >> So far this crash happened 2 times on net-next.
>> >> >> C reproducer is attached.
>> >> >> syzkaller reproducer is attached.
>> >> >> Raw console output is attached.
>> >> >> compiler: gcc (GCC) 7.1.1 20170620
>> >> >> .config is attached.
>> >> >>
>> >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> >> >> Reported-by: [email protected]
>> >> >> It will help syzbot understand when the bug is fixed. See footer for
>> >> >> details.
>> >> >> If you forward the report, please keep this part and the footer.
>> >> >>
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> ==================================================================
>> >> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
>> >> >> net/sctp/associola.c:332
>> >> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
>> >> >>
>> >> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
>> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> >> >> Google 01/01/2011
>> >> >> Call Trace:
>> >> >> __dump_stack lib/dump_stack.c:17 [inline]
>> >> >> dump_stack+0x194/0x24d lib/dump_stack.c:53
>> >> >> print_address_description+0x73/0x250 mm/kasan/report.c:256
>> >> >> kasan_report_error mm/kasan/report.c:354 [inline]
>> >> >> kasan_report+0x23c/0x360 mm/kasan/report.c:412
>> >> >> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>> >> >> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
>> >> >> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
>> >> >> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
>> >> >> sock_sendmsg_nosec net/socket.c:629 [inline]
>> >> >> sock_sendmsg+0xca/0x110 net/socket.c:639
>> >> >> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
>> >> >> SyS_sendto+0x40/0x50 net/socket.c:1716
>> >> >> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>> >> >> entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> >> >> RIP: 0033:0x446d09
>> >> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
>> >> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
>> >> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
>> >> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
>> >> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
>> >> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
>> >> >>
>> >> > I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
>> >> > here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
>> >> > returned, and the code path appears to call sctp_association_put twice, leading
>> >> > to the use after free situation. I'll write a patch this weekend
>> >> Hi, Neil, you're right.
>> >>
>> >> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
>> >> assoc_id, which can only be set when connecting has started.
>> >>
>> >> But I realized that:
>> >> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
>> >>
>> >> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
>> >> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
>> >> So you may want to move it back.
>> >>
>> > I agree with the root cause, but I'm not sure I agree with just moving the
>> > wait_for_sndbuf call back above the call to associate. I'm not sure I like
>> > relying on placing a call in a spcific order solely to avoid an error condition
>> > that might legitimately occur. I think would rather check the return code at
>> > the call site for the complete set of conditions for which we should not free
>> > the association. Something like this:
>> >
>> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>> > index 7d3476a4860d..a68846d2b0ef 100644
>> > --- a/net/sctp/socket.c
>> > +++ b/net/sctp/socket.c
>> > @@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
>> >
>> > /* Send msg to the asoc */
>> > err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
>> > - if (err < 0 && err != -ESRCH && new)
>> > - sctp_association_free(asoc);
>> > + if ((err != -ESRCH) && (err != -EPIPE))
>> > + if (err < 0 && new)
>> > + sctp_association_free(asoc);
>> >
>> > out_unlock:
>> > release_sock(sk);
>> >
>> > Which I think also avoids the noted conflict.
>> >
>> > Thoughts?
>> If sctp_association_free is called for general asoc, yes, I agree with you.
>> But it's actually only for NEW asoc, a special case, not worth a extra check.
>> 'err != -ESRCH' is already kind of ugly there (I couldn't find a nicer way :D),
>> I don't hope there will be more like that.
>>
> I agree with you on the uglyness aspect of the return code check, but I really
> don't like the idea of placing the call to wait_for_sndbuf to guarantee a given
> error code isn't returned, It just feels rickety to me.
I understand, let's not count this moving back as the official fix
for this, but only for the compatibility :-)

Then we can start a new one for improving it later as you said below,
FYI, we've tried to pass 'new' parameter into sctp_sendmsg_to_asoc
or use &asoc as the parameter instead. But it seems not good, and
also sctp_association_free will have to be done in sctp_sendmsg_to_asoc,
which looks worse.

So yes, pls think about it, you may have a better idea on this.

Thanks.

>
>> And also, I think I moved sctp_primitive_ASSOCIATE before
>> sctp_wait_for_sndbuf by accident in that patch(Sorry), which may
>> already change some behavior. We have to move it back.
>>
> It seemed to me you did that specifically to enable the check for peeloff
> events. That is to say, if you check the socket write space first on a new
> association, and then get a peeloff event, the write space might have changed
> (though I suppose in fairness, we will always be operating the old socket, so
> the write space will only grow larger). Either way there should be nothing
> really wrong with checking write space after we preform the association.
>
> But I suppose that moving it back is the safer approach. It would be nice to
> eliminate that odd error check as well if we can, but I'll start with just
> returning the positioning to the way it was, and look to improve it later. I'll
> post the patch once I confirm the build works
>
> Neil
>

2018-03-12 11:32:14

by Neil Horman

[permalink] [raw]
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)

On Mon, Mar 12, 2018 at 04:16:27PM +0800, Xin Long wrote:
> On Sun, Mar 11, 2018 at 3:04 AM, Neil Horman <[email protected]> wrote:
> > On Sun, Mar 11, 2018 at 12:22:32AM +0800, Xin Long wrote:
> >> On Sat, Mar 10, 2018 at 9:13 PM, Neil Horman <[email protected]> wrote:
> >> > On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
> >> >> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <[email protected]> wrote:
> >> >> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
> >> >> >> Hello,
> >> >> >>
> >> >> >> syzbot hit the following crash on net-next commit
> >> >> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
> >> >> >> Merge tag 'mlx5-updates-2018-02-28-2' of
> >> >> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
> >> >> >>
> >> >> >> So far this crash happened 2 times on net-next.
> >> >> >> C reproducer is attached.
> >> >> >> syzkaller reproducer is attached.
> >> >> >> Raw console output is attached.
> >> >> >> compiler: gcc (GCC) 7.1.1 20170620
> >> >> >> .config is attached.
> >> >> >>
> >> >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> >> >> Reported-by: [email protected]
> >> >> >> It will help syzbot understand when the bug is fixed. See footer for
> >> >> >> details.
> >> >> >> If you forward the report, please keep this part and the footer.
> >> >> >>
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> ==================================================================
> >> >> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
> >> >> >> net/sctp/associola.c:332
> >> >> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
> >> >> >>
> >> >> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
> >> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> >> >> Google 01/01/2011
> >> >> >> Call Trace:
> >> >> >> __dump_stack lib/dump_stack.c:17 [inline]
> >> >> >> dump_stack+0x194/0x24d lib/dump_stack.c:53
> >> >> >> print_address_description+0x73/0x250 mm/kasan/report.c:256
> >> >> >> kasan_report_error mm/kasan/report.c:354 [inline]
> >> >> >> kasan_report+0x23c/0x360 mm/kasan/report.c:412
> >> >> >> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >> >> >> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
> >> >> >> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
> >> >> >> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
> >> >> >> sock_sendmsg_nosec net/socket.c:629 [inline]
> >> >> >> sock_sendmsg+0xca/0x110 net/socket.c:639
> >> >> >> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
> >> >> >> SyS_sendto+0x40/0x50 net/socket.c:1716
> >> >> >> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
> >> >> >> entry_SYSCALL_64_after_hwframe+0x42/0xb7
> >> >> >> RIP: 0033:0x446d09
> >> >> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
> >> >> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
> >> >> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
> >> >> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
> >> >> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
> >> >> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
> >> >> >>
> >> >> > I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
> >> >> > here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
> >> >> > returned, and the code path appears to call sctp_association_put twice, leading
> >> >> > to the use after free situation. I'll write a patch this weekend
> >> >> Hi, Neil, you're right.
> >> >>
> >> >> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
> >> >> assoc_id, which can only be set when connecting has started.
> >> >>
> >> >> But I realized that:
> >> >> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
> >> >>
> >> >> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
> >> >> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
> >> >> So you may want to move it back.
> >> >>
> >> > I agree with the root cause, but I'm not sure I agree with just moving the
> >> > wait_for_sndbuf call back above the call to associate. I'm not sure I like
> >> > relying on placing a call in a spcific order solely to avoid an error condition
> >> > that might legitimately occur. I think would rather check the return code at
> >> > the call site for the complete set of conditions for which we should not free
> >> > the association. Something like this:
> >> >
> >> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> >> > index 7d3476a4860d..a68846d2b0ef 100644
> >> > --- a/net/sctp/socket.c
> >> > +++ b/net/sctp/socket.c
> >> > @@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
> >> >
> >> > /* Send msg to the asoc */
> >> > err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
> >> > - if (err < 0 && err != -ESRCH && new)
> >> > - sctp_association_free(asoc);
> >> > + if ((err != -ESRCH) && (err != -EPIPE))
> >> > + if (err < 0 && new)
> >> > + sctp_association_free(asoc);
> >> >
> >> > out_unlock:
> >> > release_sock(sk);
> >> >
> >> > Which I think also avoids the noted conflict.
> >> >
> >> > Thoughts?
> >> If sctp_association_free is called for general asoc, yes, I agree with you.
> >> But it's actually only for NEW asoc, a special case, not worth a extra check.
> >> 'err != -ESRCH' is already kind of ugly there (I couldn't find a nicer way :D),
> >> I don't hope there will be more like that.
> >>
> > I agree with you on the uglyness aspect of the return code check, but I really
> > don't like the idea of placing the call to wait_for_sndbuf to guarantee a given
> > error code isn't returned, It just feels rickety to me.
> I understand, let's not count this moving back as the official fix
> for this, but only for the compatibility :-)
>
> Then we can start a new one for improving it later as you said below,
> FYI, we've tried to pass 'new' parameter into sctp_sendmsg_to_asoc
> or use &asoc as the parameter instead. But it seems not good, and
> also sctp_association_free will have to be done in sctp_sendmsg_to_asoc,
> which looks worse.
>
Yeah, I thought about that, and don't particularly care for it either. I also
thought about checking the dead flag at the end of send_to_asoc, but that won't
work either, as the caller of wait_for_sndbuf still has a reference. I think,
for now, you were right the first time, we just need to reposition it. I'll be
checking the fix with the provided reproducer today.

Neil