Date: Thu, 11 Jun 2020 16:50:55 +0200
From: Dominique Martinet
To: Wang Hai
Cc: ericvh@gmail.com, lucho@ionkov.net, davem@davemloft.net,
    v9fs-developer@lists.sourceforge.net, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] 9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work
Message-ID: <20200611145055.GA28945@nautica>
References: <20200611014855.60550-1-wanghai38@huawei.com>
In-Reply-To: <20200611014855.60550-1-wanghai38@huawei.com>

Wang Hai wrote on Thu, Jun 11, 2020:
> p9_read_work and p9_fd_cancelled may be called concurrently.

Good catch. I'm sure this fixes some of the old syzbot bugs...
I'll check other transports handle this properly as well.

> Before list_del(&m->rreq->req_list) in p9_read_work is called,
> the req->req_list may have been deleted in p9_fd_cancelled.
> We can fix it by setting req->status to REQ_STATUS_FLSHD after
> list_del(&req->req_list) in p9_fd_cancelled.

Hm, if you do that, won't read_work fail with EIO, so that all further
9p messages are no longer read?
p9_read_work should probably also handle REQ_STATUS_FLSHD as a special
case that just throws the message away without error.

> Before list_del(&req->req_list) in p9_fd_cancelled is called,
> the req->req_list may have been deleted in p9_read_work.
> We should return when req->status = REQ_STATUS_RCVD which means
> we just received a response for oldreq, so we need do nothing
> in p9_fd_cancelled.

I'll need some time to convince myself the refcounting is correct in
this case.
Before refcounting was added this was definitely wrong, but now it
might just work by chance... I'll double-check.

> Fixes: 60ff779c4abb ("9p: client: remove unused code and any reference
> to "cancelled" function")

I don't understand how this commit is related?
At least make it afd8d65411 ("9P: Add cancelled() to the transport
functions."), which adds the op, rather than a commit that removed a
previous version of cancelled even earlier.

> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index f868cf6fba79..a563699629cb 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -718,11 +718,18 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
>  {
>  	p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);
>
> -	/* we haven't received a response for oldreq,
> -	 * remove it from the list.
> +	/* If req->status == REQ_STATUS_RCVD, it means we just received a
> +	 * response for oldreq, we need do nothing here. Else, remove it from
> +	 * the list.

(nitpick) this feels a bit hard to read, and does not give any
information: you're just paraphrasing the C code.
I would suggest moving the comment after the spinlock and saying what
we really do; something as simple as "ignore cancelled request if
message has been received before lock" is enough.

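As a rough illustration of the two points above (the comment placed after the lock, and a read side that drops flushed replies instead of failing), here is a minimal userspace sketch. It is not the kernel code: `struct req`, `fd_cancelled()`, `read_reply()`, the `on_list` flag and the plain `refs` counter are hypothetical stand-ins, a pthread mutex stands in for `client->lock`, and whether the refcounting in the real code is right is exactly the open question raised above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the 9p request states and request struct. */
enum req_status { REQ_SENT, REQ_RCVD, REQ_FLSHD };

struct req {
	enum req_status status;
	int on_list;	/* 1 while linked on the pending list */
	int refs;	/* plain counter standing in for the refcount */
};

/* Stands in for client->lock. */
static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cancel path: returns 1 if it unlinked the request, 0 if the reply won. */
static int fd_cancelled(struct req *r)
{
	pthread_mutex_lock(&client_lock);
	/* Ignore cancelled request if message has been received before lock. */
	if (r->status == REQ_RCVD) {
		pthread_mutex_unlock(&client_lock);
		return 0;
	}
	assert(r->on_list);	/* a second unlink would trip here */
	r->on_list = 0;		/* stands in for list_del(&req->req_list) */
	r->status = REQ_FLSHD;
	pthread_mutex_unlock(&client_lock);
	r->refs--;		/* stands in for p9_req_put(req) */
	return 1;
}

/* Read path: 0 if the reply was delivered or dropped, -EIO otherwise. */
static int read_reply(struct req *r)
{
	int ret = 0;

	pthread_mutex_lock(&client_lock);
	if (r->status == REQ_SENT) {
		r->on_list = 0;	/* unlink, then hand the reply over */
		r->status = REQ_RCVD;
	} else if (r->status == REQ_FLSHD) {
		/* Already unlinked by the cancel path: drop the reply, no error. */
	} else {
		ret = -EIO;	/* unexpected state, as in the existing error path */
	}
	pthread_mutex_unlock(&client_lock);
	return ret;
}
```

Whichever side takes the lock first unlinks the request, and the other side sees the updated status and leaves the list alone. Calling `read_reply()` and then `fd_cancelled()` on a request in `REQ_SENT` makes the cancel a no-op; the reverse order leaves it in `REQ_FLSHD` and the late reply is discarded without `-EIO`.
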
>  	 */
>  	spin_lock(&client->lock);
> +	if (req->status == REQ_STATUS_RCVD) {
> +		spin_unlock(&client->lock);
> +		return 0;
> +	}
> +
>  	list_del(&req->req_list);
> +	req->status = REQ_STATUS_FLSHD;
>  	spin_unlock(&client->lock);
>  	p9_req_put(req);
>

-- 
Dominique