From: Hangyu Hua
Date: Tue, 12 Jul 2022 16:31:43 +0800
Subject: Re: [PATCH] net: 9p: fix possible refcount leak in p9_read_work() and recv_done()
To: asmadeus@codewreck.org
Cc: ericvh@gmail.com, lucho@ionkov.net, linux_oss@crudebyte.com,
    davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
    pabeni@redhat.com, tomasbortoli@gmail.com,
    v9fs-developer@lists.sourceforge.net, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org
References: <20220711065907.23105-1-hbh25y@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
On 2022/7/12 14:06, asmadeus@codewreck.org wrote:
> Hangyu Hua wrote on Tue, Jul 12, 2022 at 11:24:36AM +0800:
>> That's a little weird. If you are right, the three return paths of
>> this function are inconsistent in their handling of the refcount.
>>
>> static void p9_read_work(struct work_struct *work)
>> {
>> ...
>>     if ((m->rreq) && (m->rc.offset == m->rc.capacity)) {
>>         p9_debug(P9_DEBUG_TRANS, "got new packet\n");
>>         m->rreq->rc.size = m->rc.offset;
>>         spin_lock(&m->client->lock);
>>         if (m->rreq->status == REQ_STATUS_SENT) {
>>             list_del(&m->rreq->req_list);
>>             p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD);  <---- [1]
>>         } else if (m->rreq->status == REQ_STATUS_FLSHD) {
>>             /* Ignore replies associated with a cancelled request. */
>>             p9_debug(P9_DEBUG_TRANS,
>>                      "Ignore replies associated with a cancelled request\n");  <---- [2]
>>         } else {
>>             spin_unlock(&m->client->lock);
>>             p9_debug(P9_DEBUG_ERROR,
>>                      "Request tag %d errored out while we were reading the reply\n",
>>                      m->rc.tag);
>>             err = -EIO;
>>             goto error;  <---- [3]
>>         }
>>         spin_unlock(&m->client->lock);
>>         m->rc.sdata = NULL;
>>         m->rc.offset = 0;
>>         m->rc.capacity = 0;
>>         p9_req_put(m->rreq);  <---- [4]
>>         m->rreq = NULL;
>>     }
>> ...
>> error:
>>     p9_conn_cancel(m, err);  <---- [5]
>>     clear_bit(Rworksched, &m->wsched);
>> }
>>
>> There are three return paths here: [1], [2], and [3].
>>
>> [1]: m->rreq will be put twice, in [1] and [4], and m->rreq will be
>> deleted from m->req_list in [1].
>>
>> [2]: m->rreq will be put in [4], and m->rreq will not be deleted from
>> m->req_list.
>
> When the req status was set to FLSHD, the req was already dropped from
> the list and put in p9_fd_cancel, so we shouldn't put it here.
>
>> [3]: m->rreq will be put in [5], and m->rreq will be deleted from
>> m->req_list in [5].
>
> On this error case I really can't say anything: it depends on how the
> req got into this state in the first place -- more precisely, is it
> still in req_list or not?
>
> But even if it is and we leak it here, we return an error, so the
> connection will be marked as disconnected and won't be usable anymore.
> The memory will be freed when the user umounts after that.
>
> If we took the time to re-init rreq->req_list every time, we could
> check whether it's empty (I don't think we can rely on it being
> poisoned), but I just don't think it's worth it: it's better to
> consume a bit more memory until umount than to risk a UAF.
>
> (Note: while writing this I noticed p9_tag_cleanup() in
> p9_client_destroy() only tracks requests still in the idr, so it
> doesn't work for requests that went through p9_tag_remove().
> We don't need p9_tag_remove() anymore, so I've just gotten rid of it
> and we will catch these now.)
>
>> If p9_tag_lookup kept the refcount for a req on m->req_list, there
>> would be a double put in return path [1] and a potential UAF in
>> return path [2]. It would also mean a req sits in m->req_list without
>> a refcount being taken before p9_tag_lookup.
>
> That is the nominal path; we'd notice immediately if there were too
> many puts there.
> A request is initialized with two refs so that we can have one for the
> transport ((a), for fd: "is the request tracked in a list?") and one
> for the main thread ((b), p9_client_rpc, which will put it at the
> end). Then you get a third ref from p9_tag_lookup that I was
> forgetting about, (c).
>
> Going through [1] removes the req from the list and drops the
> associated ref (a), then goes through p9_client_cb, which drops
> ref (c) and wakes up p9_client_rpc, which puts the last ref (b),
> freeing the request.
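Restating that nominal lifecycle as an annotated sketch, to keep the
gets and puts straight (an illustrative fragment only, not the actual
net/9p source; the helpers named do exist in net/9p, but locking and
error handling are elided here):

    /*
     * A request leaves p9_tag_alloc() with refcount == 2:
     *   (a) the transport's ref -- the req is tracked on m->req_list
     *   (b) the caller's ref    -- held by p9_client_rpc until completion
     * The fd read path then takes a third, temporary ref:
     *   (c) the lookup ref from p9_tag_lookup
     */
    m->rreq = p9_tag_lookup(m->client, m->rc.tag);      /* +1: ref (c) */

    /* ... payload is read into m->rreq->rc ... */

    list_del(&m->rreq->req_list);                       /* req leaves the list */
    p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD);  /* puts one ref and
                                                           wakes p9_client_rpc */
    p9_req_put(m->rreq);                                /* [4]: puts another ref */
    m->rreq = NULL;

    /* p9_client_rpc() wakes and puts the final ref: three gets (two at
     * alloc, one at lookup) balance three puts (cb, [4], rpc), and the
     * req is freed. */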
I think the normal path is right because p9_tag_lookup and [4] keep the
refcount balanced. This just shows that the error path may leak a
refcount, because the error path only puts the refcount once. In
general, either [4] on the normal path is redundant (like you said,
that would be easy to catch), or the error path may leak a refcount.
But you are right, it'll be caught on umount.

> Now you were correct on one of these error paths not described in your
> last mail: we -are- missing a p9_req_put in the "No recv fcall for tag
> %d" error path shortly after p9_tag_lookup, for the ref obtained from
> p9_tag_lookup itself -- feel free to resend a patch with just that one.
> But once again the connection is now unusable and it'll be caught on
> umount so it's not the end of the world...
>
> (I'd appreciate it if you base the new patch on top of
> https://github.com/martinetd/linux/commits/9p-next )

I see. I will make a new patch later.

>> static void p9_write_work(struct work_struct *work)
>> {
>> ...
>>     list_move_tail(&req->req_list, &m->req_list);
>>
>>     m->wbuf = req->tc.sdata;
>>     m->wsize = req->tc.size;
>>     m->wpos = 0;
>>     p9_req_get(req);
>> ...
>> }
>>
>> But if you check out p9_write_work, a refcount is already taken after
>> list_move_tail. We don't need to rely on p9_tag_lookup to keep a
>> list's refcount.
>
> This refcount is because we are keeping a ref in m->wreq, and it is
> dropped when m->wreq is set back to NULL once the packet is done
> writing, a few lines below (but possibly in another call of the
> function).
>
> Refs don't have to come from p9_tag_lookup; they manage a pointer's
> lifecycle: we're making a copy of the pointer, so we should increment
> the refcount so another thread can't free the req under us. In this
> case the p9_req_get() is under the trans fd m->client->lock, where we
> got the req from the list, so the req can't be freed between its
> retrieval from the list and then; once the lock is dropped the req is
> protected by the ref.

My fault. I misunderstood here.

Thanks,
Hangyu

>> What's more, the code comments in p9_tag_alloc also prove that the
>> refcount taken by p9_tag_lookup is a temporary refcount.
>
> Comments don't prove anything, but yes, I forgot p9_tag_alloc takes a
> ref when I commented earlier, sorry.
>
>>> This one isn't as clear cut: I see that they put the client in a
>>> FLUSHING state, but nothing seems to act on it... But if this
>>> happens we're already in the use-after-free realm -- it means
>>> rc.sdata was already set, so the other thread could be calling
>>> p9_client_cb at any time if it hasn't already, and yet another
>>> thread will then do the final ref put and free this.
>>> We shouldn't free this here, as that would also be an overflow. The
>>> best possible thing to do at this point is just to stop using that
>>> pointer.
>>
>> But p9_tag_lookup has a lock inside. Doesn't this mean p9_tag_lookup
>> won't return a freed req? Otherwise we should fix the lock to avoid
>> falling into the use-after-free realm.
>
> Right, that falls under the p9_tag_lookup ref; I had implemented this
> better than I thought I did...
>
> I agree that one is also more correct to add, although I'd really
> want to set up some RDMA and trigger a few errors to test.
>
> --
> Dominique
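For concreteness, the follow-up fix discussed above would presumably
look something like the sketch below -- an illustrative, untested
fragment of the "No recv fcall" error path in p9_read_work()
(net/9p/trans_fd.c); the exact context lines in the 9p-next tree may
differ:

    if (!m->rreq->rc.sdata) {
        p9_debug(P9_DEBUG_ERROR,
                 "No recv fcall for tag %d (req %p), disconnecting!\n",
                 m->rc.tag, m->rreq);
        p9_req_put(m->rreq);    /* drop the ref taken by p9_tag_lookup() */
        m->rreq = NULL;
        err = -EIO;
        goto error;
    }

A matching put for the p9_tag_lookup() ref in the RDMA transport's
recv_done() error path (the other half of the patch subject) would
follow the same pattern, pending the testing Dominique mentions.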