From: Divya Indi
To: linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org
Cc: Jason Gunthorpe, Håkon Bugge, Kaike Wan, Doug Ledford, Gerd Rausch, Srinivas Eeda, Rama Nichanamatlu
Subject: Request for feedback : Possible use-after-free in routing SA query via netlink
Message-ID: <8fbdf10e-3f08-6407-eb0d-a1bf663873c3@oracle.com>
Date: Fri, 24 Apr 2020 08:28:09 -0700

Hi All,

I would like some feedback on a crash caused by a use-after-free in the
ibacm code path (while routing an SA query via netlink).

Commit 3ebd2fd ("IB/sa: Put netlink request into the request list before
sending")

The above commit moved adding the query to the request list ahead of the
call to ib_nl_send_msg, and moved ib_nl_send_msg out from under the
spinlock (request_lock).

However, if there is a delay in sending out the request (for example, a
delay caused by a low-memory situation), the timer that handles request
timeouts might kick in before the request has been sent to ibacm via
netlink. ib_nl_request_timeout may then release the query (through the
call to send_handler) while ib_nl_send_msg is still accessing it.

We get the following stack trace for the crash:

[] ? ib_pack+0x17b/0x240 [ib_core]
[] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
[] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
[] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
[] ? rds_rdma_cm_event_handler_cmn+0x850/0x850 [rds_rdma]
[] rds_rdma_cm_event_handler_cmn+0x22c/0x850 [rds_rdma]
[] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
[] addr_handler+0x9e/0x140 [rdma_cm]
[] process_req+0x134/0x190 [ib_addr]
[] process_one_work+0x169/0x4a0
[] worker_thread+0x5b/0x560
[] ? flush_delayed_work+0x50/0x50
[] kthread+0xcb/0xf0
[] ? __schedule+0x24a/0x810
[] ? __schedule+0x24a/0x810
[] ? kthread_create_on_node+0x180/0x180
[] ret_from_fork+0x47/0x90
[] ? kthread_create_on_node+0x180/0x180
....
RIP [] send_mad+0x33d/0x5d0 [ib_sa]

On analysis of the vmcore, we see that the crash happens at:

ib_sa_path_rec_get
  send_mad
    ib_nl_make_request()
      ib_nl_send_msg

1 static void ib_nl_set_path_rec_attrs(struct sk_buff *skb,
2                                      struct ib_sa_query *query)
3 {
4         struct ib_sa_path_rec *sa_rec = query->mad_buf->context[1];
5         struct ib_sa_mad *mad = query->mad_buf->mad;
6         ib_sa_comp_mask comp_mask = mad->sa_hdr.comp_mask;

The page fault occurs at line 5, while trying to access
query->mad_buf->mad.

If we look at the query, it does not appear to be a valid ib_sa_query.
Instead, it looks like a pid struct for a process, which points to a
use-after-free.

We could simulate the crash by explicitly introducing a delay in
ib_nl_send_msg with a sleep. The timer then kicks in before
ib_nl_send_msg has even sent out the request, and releases the query.
With this change we could reproduce the crash with a similar stack
trace.

To summarize: we have a possible use-after-free when the timer
(ib_nl_request_timeout) kicks in before ib_nl_send_msg has finished
sending the query out to ibacm via netlink. The timeout handler, i.e.
ib_nl_request_timeout, may release the query while ib_nl_send_msg is
still accessing it.

I would appreciate your thoughts on the above issue.

Thanks,
Divya