Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp7251153imu; Thu, 31 Jan 2019 07:22:44 -0800 (PST) X-Google-Smtp-Source: ALg8bN5PbUx07aw5k3Y0GsfK2nOTN+T5VxRXlQSIBSPsiGFVP34HXj02XwDQ/+OKVTeR/H4LMUqk X-Received: by 2002:a17:902:7c05:: with SMTP id x5mr34599006pll.273.1548948164151; Thu, 31 Jan 2019 07:22:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548948164; cv=none; d=google.com; s=arc-20160816; b=yCNEAvz5PDvwXkuiSqqLNIexRxn7TzR7ZsTBSA+eTF6H1biZ3FsXrlzt3DPzXfTAa6 MOK5ZBwRCvFShze63ch5X4aEFQAJyEwMA5KI8oFbTtKXHclQpqrQwy7dE1+ZtVVpteml IrwsX2W0hJSWrJRptoZ7L9KR8Hq4nRIjs5lISl0Qdgcg9uLjVfk7AwIAjS6sy2lrf1r0 uqAy4XJtVDOEeErpbXBvAP+RywhCe3NmN1G12Qcx6POuQxdIGIcHgVlHQstKiCzhKa9W 3Ewoe3ubLa/ONcu3gqtI3qS93LlzhEEVAjSyh4IeVviy0J/AxXBMjMY8m6T097uUJBHX 9/Dw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=2F71SOsPq5ysea7mkVOHK06VKcHb1nQ49g/MhmFoBXE=; b=lzHPI/F2udC0xpqNfnMrEtyBm4883nOAEXb3IEYOfBDhII3lKkc8jvaLD4QozzF1O1 kEqsHaSxLLGGa4+Axd/9ii6GQNcMtIWEPRft9O4PPPSs1y6bmY1P78guWMF3bBsZ9Z86 FCQJDvpgnjnaWx3ongmHfyK3gLhWWL3bU7frbz+FTQi56xzfuz2DTOiuOnzFlyst1HP8 jMD+X5JI6+ly4QhG5PcPoBWU+ABe2aWUeRE6/jL9A0TE4Lk4rVDCNjvAlbCJgYEfEiU/ hk5/xUlQGYeZc+Eyg1dqBDchm+g9Ak7eBxQhf2xVZg46TKofH145vyNegNZIs/cv1mus k0+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=CHKo7edC; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o192si4441043pgo.129.2019.01.31.07.22.27; Thu, 31 Jan 2019 07:22:44 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=CHKo7edC; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728066AbfAaPUB (ORCPT + 99 others); Thu, 31 Jan 2019 10:20:01 -0500 Received: from mail.kernel.org ([198.145.29.99]:35826 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726082AbfAaPUB (ORCPT ); Thu, 31 Jan 2019 10:20:01 -0500 Received: from localhost (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 35B0720863; Thu, 31 Jan 2019 15:20:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1548948000; bh=2F71SOsPq5ysea7mkVOHK06VKcHb1nQ49g/MhmFoBXE=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=CHKo7edCQyXIYyf+zjTbG5kokZg1IcQ55IcDFLtb6TPV+xDYJYW737lzLo/Ed0Qwy wX6db5SCMESjtIYy0GhqvOJZliGmswFJZmSYVgZ6hSuCqHvURn5R+WLz+LgoQeg6zS fxMUWO2vJSH78QyhVKReL2XX+/tkXKa+01aTqw6Y= Date: Thu, 31 Jan 2019 10:19:58 -0500 From: Sasha Levin To: Dexuan Cui Cc: kimbrownkd , Michael Kelley , Long Li , Sasha Levin , Stephen Hemminger , KY Srinivasan , Haiyang Zhang , "devel@linuxdriverproject.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH] Drivers: hv: vmbus: Add mutex lock to channel show functions Message-ID: <20190131151958.GT3973@sasha-vm> References: <20190122020759.GA4054@ubu-Virtual-Machine> <20190122064246.GA28613@ubu-Virtual-Machine> <20190128195845.GA3723@ubu-Virtual-Machine> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 29, 2019 at 07:20:28PM +0000, Dexuan Cui wrote: >> From: Kimberly Brown >> > ... >> > But as you pointed, at least for sub-channels, channel->ringbuffer_page >> > can indeed disappear in vmbus_close() -> ... -> vmbus_free_ring(), and >> > the "attribute->show()" could crash when the race happens. >> > Adding channel_mutex here seems to be able to fix the race for >> > sub-channels, as the same channel_mutex is used in >> vmbus_disconnect_ring(). >> > >> > For a primary channel, vmbus_close() -> vmbus_free_ring() can still >> > free channel->ringbuffer_page without notifying the "attribute->show()". >> > We may also need to acquire/release the channel_mutex in vmbus_close()? >> > >> > > Actually, there should be checks that "chan" is not null and that >> > > "chan->state" is CHANNEL_OPENED_STATE within the locked section. I'll >> > > need to fix that. >> > I suppose "chan" can not be NULL here (see the above). >> > >> > Checking "chan->state" may not help to completely fix the race, because >> > there is no locking/synchronization code in >> > vmbus_close_internal() when we test and change "channel->state". >> > >> >> The calls to vmbus_close_internal() for the subchannels and the primary >> channel are protected with channel_mutex in vmbus_disconnect_ring(). >> This prevents "channel->state" from changing while "attribute->show()" is >> running. >Ah, I think you're right. > >> > I guess we may need to check if channel->ringbuffer_page is NULL in >> > the "attribute->show()". >> > >> >> For the primary channel, vmbus_free_ring() is called after the >> return from vmbus_disconnect_ring(). Therefore, the primary channel's >> state is changed before "channel->ringbuffer_page" is set to NULL. >> Checking the channel state should be sufficient to prevent the ring >> buffers from being freed while "attribute->show()" is running. The >> ring buffers can't be freed until the channel's state is changed, and >> the channel state change is protected by the mutex. >I think you're right (I noticed in a previous mail you mentioned you would >improve your patch to check "chan->state" with the mutax held). > >> I think checking that "channel->ringbuffer_page" is not NULL would >> also work, but, as you stated, we would need to aquire/release >> channel_mutex in vmbus_close(). >Then it looks unnecessary to check "channel->ringbuffer_page". > >> > PS, to prove that a race condition does exist and can really cause a panic or >> > something, I usually add some msleep() delays in different paths so that I >> > can reproduce the crash every time I take a special action, e.g. when I read >> > the sysfs files of a NIC, I try to remove hv_netvsc driver. This way, I can prove >> > a patch can indeed help, at least it can fix the crash which would happen >> > without the patch. :-) >> > >> >> Thanks! I was able to free the ring buffers while "attribute->show()" >> was running, which caused a null pointer dereference bug. As expected, >> the mutex lock fixed it. >Awesome! I've queued this one for hyper-fixes, thanks all! -- Thanks, Sasha