Date: Mon, 12 Mar 2018 07:30:03 -0400
From: Neil Horman
To: Xin Long
Cc: syzbot, davem, LKML, linux-sctp@vger.kernel.org, network dev,
    syzkaller-bugs@googlegroups.com, Vlad Yasevich, Marcelo Ricardo Leitner
Subject: Re: KASAN: use-after-free Read in sctp_association_free (2)
Message-ID: <20180312113002.GA7457@hmswarspite.think-freely.org>
References: <00000000000017dfe905670110cf@google.com>
    <20180309220823.GA24848@neilslaptop.think-freely.org>
    <20180310131355.GA8798@hmswarspite.think-freely.org>
    <20180310190427.GA2484@neilslaptop.think-freely.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 12, 2018 at 04:16:27PM +0800, Xin Long wrote:
> On Sun, Mar 11, 2018 at 3:04 AM, Neil Horman wrote:
> > On Sun, Mar 11, 2018 at 12:22:32AM +0800, Xin Long wrote:
> >> On Sat, Mar 10, 2018 at 9:13 PM, Neil Horman wrote:
> >> > On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
> >> >> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman wrote:
> >> >> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
> >> >> >> Hello,
> >> >> >>
> >> >> >> syzbot hit the following crash on net-next commit
> >> >> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 +0000)
> >> >> >> Merge tag 'mlx5-updates-2018-02-28-2' of
> >> >> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
> >> >> >>
> >> >> >> So far this crash happened 2 times on net-next.
> >> >> >> C reproducer is attached.
> >> >> >> syzkaller reproducer is attached.
> >> >> >> Raw console output is attached.
> >> >> >> compiler: gcc (GCC) 7.1.1 20170620
> >> >> >> .config is attached.
> >> >> >>
> >> >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> >> >> Reported-by: syzbot+a4e4112c3aff00c8cfd8@syzkaller.appspotmail.com
> >> >> >> It will help syzbot understand when the bug is fixed. See footer for
> >> >> >> details.
> >> >> >> If you forward the report, please keep this part and the footer.
> >> >> >>
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> IPVS: ftp: loaded support on port[0] = 21
> >> >> >> ==================================================================
> >> >> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
> >> >> >> net/sctp/associola.c:332
> >> >> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
> >> >> >>
> >> >> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
> >> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> >> >> Google 01/01/2011
> >> >> >> Call Trace:
> >> >> >> __dump_stack lib/dump_stack.c:17 [inline]
> >> >> >> dump_stack+0x194/0x24d lib/dump_stack.c:53
> >> >> >> print_address_description+0x73/0x250 mm/kasan/report.c:256
> >> >> >> kasan_report_error mm/kasan/report.c:354 [inline]
> >> >> >> kasan_report+0x23c/0x360 mm/kasan/report.c:412
> >> >> >> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> >> >> >> sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
> >> >> >> sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
> >> >> >> inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
> >> >> >> sock_sendmsg_nosec net/socket.c:629 [inline]
> >> >> >> sock_sendmsg+0xca/0x110 net/socket.c:639
> >> >> >> SYSC_sendto+0x361/0x5c0 net/socket.c:1748
> >> >> >> SyS_sendto+0x40/0x50 net/socket.c:1716
> >> >> >> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
> >> >> >> entry_SYSCALL_64_after_hwframe+0x42/0xb7
> >> >> >> RIP: 0033:0x446d09
> >> >> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
> >> >> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
> >> >> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
> >> >> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
> >> >> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
> >> >> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
> >> >> >>
> >> >> > I think we have a corner case with a0ff660058b88d12625a783ce9e5c1371c87951f
> >> >> > here. If a peeloff event happens during a wait for sendbuf space, EPIPE will be
> >> >> > returned, and the code path appears to call sctp_association_put twice, leading
> >> >> > to the use after free situation. I'll write a patch this weekend.
> >> >> Hi, Neil, you're right.
> >> >>
> >> >> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
> >> >> assoc_id, which can only be set when connecting has started.
> >> >>
> >> >> But I realized that:
> >> >> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
> >> >>
> >> >> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
> >> >> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
> >> >> So you may want to move it back.
> >> >>
> >> > I agree with the root cause, but I'm not sure I agree with just moving the
> >> > wait_for_sndbuf call back above the call to associate. I'm not sure I like
> >> > relying on placing a call in a specific order solely to avoid an error condition
> >> > that might legitimately occur. I think I would rather check the return code at
> >> > the call site for the complete set of conditions for which we should not free
> >> > the association. Something like this:
> >> >
> >> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> >> > index 7d3476a4860d..a68846d2b0ef 100644
> >> > --- a/net/sctp/socket.c
> >> > +++ b/net/sctp/socket.c
> >> > @@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
> >> >
> >> >  	/* Send msg to the asoc */
> >> >  	err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
> >> > -	if (err < 0 && err != -ESRCH && new)
> >> > -		sctp_association_free(asoc);
> >> > +	if ((err != -ESRCH) && (err != -EPIPE))
> >> > +		if (err < 0 && new)
> >> > +			sctp_association_free(asoc);
> >> >
> >> >  out_unlock:
> >> >  	release_sock(sk);
> >> >
> >> > Which I think also avoids the noted conflict.
> >> >
> >> > Thoughts?
> >> If sctp_association_free is called for general asoc, yes, I agree with you.
> >> But it's actually only for NEW asoc, a special case, not worth an extra check.
> >> 'err != -ESRCH' is already kind of ugly there (I couldn't find a nicer way :D),
> >> I don't hope there will be more like that.
> >>
> > I agree with you on the ugliness aspect of the return code check, but I really
> > don't like the idea of placing the call to wait_for_sndbuf to guarantee a given
> > error code isn't returned. It just feels rickety to me.
> I understand, let's not count this moving back as the official fix
> for this, but only for the compatibility :-)
>
> Then we can start a new one for improving it later as you said below,
> FYI, we've tried to pass 'new' parameter into sctp_sendmsg_to_asoc
> or use &asoc as the parameter instead. But it seems not good, and
> also sctp_association_free will have to be done in sctp_sendmsg_to_asoc,
> which looks worse.
>
Yeah, I thought about that, and don't particularly care for it either. I also
thought about checking the dead flag at the end of send_to_asoc, but that won't
work either, as the caller of wait_for_sndbuf still has a reference.
I think, for now, you were right the first time; we just need to reposition the
wait_for_sndbuf call. I'll be checking the fix with the provided reproducer
today.

Neil
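
For anyone following the thread, the call-site check proposed in the quoted diff can be written out as a standalone predicate. This is only a user-space sketch, not kernel code: the function name caller_should_free_new_asoc and its parameter names are made up for illustration, and the reading that -ESRCH and -EPIPE mean the caller must not free the association is taken from the discussion above rather than verified against the tree.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of net/sctp): mirrors the condition in the
 * quoted diff. The caller frees a newly created association only when the
 * send failed with an error other than -ESRCH or -EPIPE.
 */
bool caller_should_free_new_asoc(int err, bool is_new)
{
	/* Per the thread, these two errors mean the caller must not free. */
	if (err == -ESRCH || err == -EPIPE)
		return false;

	/* Any other failure on a new association is cleaned up by the caller. */
	return err < 0 && is_new;
}
```

With this shape, an -EPIPE return from the send path never leads to a free in the caller, which is the case the syzbot trace appears to hit. It does not address the objection raised in the thread that each added error code makes the check harder to maintain.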