Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp3349225imm; Sun, 29 Jul 2018 16:48:55 -0700 (PDT) X-Google-Smtp-Source: AAOMgpfKWAv/qk+MZcjLTcpkAQMPaSURCoKoYN6RrENrXMpkuHugmM4GmCg3Yl6eK9VLylIJqHDT X-Received: by 2002:a63:b02:: with SMTP id 2-v6mr13767110pgl.301.1532908135647; Sun, 29 Jul 2018 16:48:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1532908135; cv=none; d=google.com; s=arc-20160816; b=UUVSxWny9FUzagmoISuBTT9m0IHUpHbg9th91nZQuptuTAggfSH8QD0tgrMOEo/EMM R297U9a2qfaYFdFXMAkacNRrA4bfdW9lFlGGP1/xCPzhkPkKKFLh/UHlRxgfS5nZwP4N 0C62uA4sAFZmMPl8FykrpoUsc48fBF/1JiW0rqE3flp/LHD0ze/eSMLr0B49xoUbrCG7 93735pB/jrMzHqJnf+OMFjB6mqZHAiqtZgAJoafIPsH+Vdhxx7zTgFBD2mvywD3GQIu0 pzk8JSXXV74LiMoDCTJMXKHiPSFmbJOGLalTflppRQJJ6ZHCrfWwuD8+ZG+bokqumvrv itig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=Ni66HZ86VaAc0Tp9hAGBYpYvIFc1YE6evKPpy2D5WVw=; b=Yqhk9C+ao/btzMwRTa46zb6puxJMu3pHSmRrSXSgTsSu1K9Ds28QXSoSU0MOfeL31v Whc6cg7mlb9HM98H6PvYqbVDrkUUdr13bBGU4lkSDUZzNPqDEq0WX+J4DthA0k2G9Kky nASCubObbsSKoegFl63pxF6CGf30bViSnkA/t5KHMTb3Gzo4+Xp4n8y5nxI6RkxTS1ec 3HblpfAJIqE6PsqsGH7x95m9tIAE62VIDcMfMCu2l0ZOX7/KlQp4m2hQWqsggP1/J+Nw Ij6yvudEld+8s4aesfb6Yvlx503MfOSQe2sO2xJ2stXW/RXI7b7JrW5F+m3cVcxLTZck Sg1Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u21-v6si8776975pgm.230.2018.07.29.16.48.41; Sun, 29 Jul 2018 16:48:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731763AbeG3BGJ (ORCPT + 99 others); Sun, 29 Jul 2018 21:06:09 -0400 Received: from nautica.notk.org ([91.121.71.147]:46979 "EHLO nautica.notk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731549AbeG3BGJ (ORCPT ); Sun, 29 Jul 2018 21:06:09 -0400 Received: by nautica.notk.org (Postfix, from userid 1001) id 73184C009; Mon, 30 Jul 2018 01:33:51 +0200 (CEST) Date: Mon, 30 Jul 2018 01:33:36 +0200 From: Dominique Martinet To: Tomas Bortoli Cc: davem@davemloft.net, v9fs-developer@lists.sourceforge.net, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, syzkaller@googlegroups.com Subject: Re: [PATCH] 9p: fix Use-After-Free in p9_write_work() Message-ID: <20180729233336.GB28684@nautica> References: <20180729130248.29612-1-tomasbortoli@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20180729130248.29612-1-tomasbortoli@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Tomas Bortoli wrote on Sun, Jul 29, 2018: > There is a race condition between p9_free_req() and p9_write_work(). > A request might still need to be processed while p9_free_req() is called. > > To fix it, flush the read/write work before freeing any request. > > Signed-off-by: Tomas Bortoli > Reported-by: syzbot+467050c1ce275af2a5b8@syzkaller.appspotmail.com It looks like I have not received this report, I found it through google in the lkml archives but Dmitry do you have a convenient-ish way of finding the report on the syzkaller website with that reported-by tag? > --- > > To be able to flush the r/w work from client.c we need the p9_conn and > p9_trans_fd definitions. Therefore this commit moves most of the declarations in > trans_fd.c to trans_fd.h and import such file in client.c This cannot work as it is, because you're not just intorudcing the trans_fd types but you're really depending on the transport used being fd. 'conn.wq' won't even be valid memory in other transports so I don't want to know what trying to flush_work on this will do... :) Other transports also have the same issue see discussion in https://lkml.org/lkml/2018/7/19/727 (that is another syzbot report, slightly different but I believe it points to the same issue) Basically, a more global view of the problem is a race between p9_tag_lookup returning a p9_req_t and another thread freeing it. Matthew wrote the problem himself in a comment in p9_tag_lookup in his new version that used to be in linux-next at the time (I took the commit out temporarily until I've had time to benchmark it, but it will come back in, just you're working on thin air right now because the bug was only found thanks to this commit): + /* There's no refcount on the req; a malicious server could cause + * us to dereference a NULL pointer + */ So a more proper solution would be to had a refcount to req, have p9_tag_lookup increment the refcount within rcu_read_lock, and have a deref function free the req when the count hits 0. Now we're here though I'm not sure what to suggest, I had promised to get some performance benchmark out by this past weekend and I obviously failed, so this patch might be delayed to 4.20 and the refcount approach would not work with the current req cache/reuse system we have right now. If you want to finish this anyway you can work on my 9p-test branch (I've kept the commit there), and I'll take that patch in at the same time as the other. > Moreover, a couple of identifiers were altered to avoid name conflicts with the > new import. If we were to stick to this approach, two suggestions: - headers aren't all-or-nothing, I think it's better to only expose what you need (and e.g. keep the Opt_* enum in the .c) - instead of exposing trans_fd specific stuff, it's cleaner to add a new op vector '.flush' to the p9_trans_module struct and call that if it exists. Thanks, -- Dominique