Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756478AbbBFOxr (ORCPT ); Fri, 6 Feb 2015 09:53:47 -0500 Received: from mail-by2on0136.outbound.protection.outlook.com ([207.46.100.136]:52192 "EHLO na01-by2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1754740AbbBFOxo convert rfc822-to-8bit (ORCPT ); Fri, 6 Feb 2015 09:53:44 -0500 From: Dexuan Cui To: KY Srinivasan , Vitaly Kuznetsov CC: "devel@linuxdriverproject.org" , "Haiyang Zhang" , "linux-kernel@vger.kernel.org" , Jason Wang Subject: RE: [PATCH 0/4] Drivers: hv: Further protection for the rescind path Thread-Topic: [PATCH 0/4] Drivers: hv: Further protection for the rescind path Thread-Index: AQHQQSvuY4uwgwARLkGlnqLyl2EfhpziAL+AgACd26CAARfF8A== Date: Fri, 6 Feb 2015 14:53:32 +0000 Message-ID: References: <1422982839-3948-1-git-send-email-vkuznets@redhat.com> <87r3u4r7nd.fsf@vitty.brq.redhat.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [141.251.55.68] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-Forefront-Antispam-Report: CIP:206.191.230.4;CTRY:US;IPV:NLI;EFV:NLI;SFV:NSPM;SFS:(10019020)(6009001)(52604005)(13464003)(51704005)(377454003)(41574002)(23726002)(102836002)(2656002)(86146001)(33656002)(87936001)(86362001)(575784001)(97756001)(19580405001)(50466002)(19580395003)(6806004)(2950100001)(76176999)(92566002)(47776003)(104016003)(54356999)(46406003)(50986999)(2900100001)(66066001)(77156002)(62966003)(106116001)(22756005)(55846006)(106466001)(93886004)(1511001)(46102003)(561944003)(22746005)(79686002);DIR:OUT;SFP:1102;SCL:1;SRVR:BL2PR03MB145;H:064-smtp-out.microsoft.com;FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Antispam: UriScan:; X-Microsoft-Antispam: BCL:0;PCL:0;RULEID:;SRVR:BL2PR03MB145; X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004);SRVR:BL2PR03MB145; X-Forefront-PRVS: 047999FF16 X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:;SRVR:BL2PR03MB145; X-OriginatorOrg: microsoft.onmicrosoft.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Feb 2015 14:53:41.0438 (UTC) X-MS-Exchange-CrossTenant-Id: 72f988bf-86f1-41af-91ab-2d7cd011db47 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=72f988bf-86f1-41af-91ab-2d7cd011db47;Ip=[206.191.230.4] X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL2PR03MB145 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 13918 Lines: 384 > -----Original Message----- > From: KY Srinivasan > Sent: Friday, February 6, 2015 6:47 AM > To: Dexuan Cui; Vitaly Kuznetsov > Cc: devel@linuxdriverproject.org; Haiyang Zhang; linux-kernel@vger.kernel.org; > Jason Wang > Subject: RE: [PATCH 0/4] Drivers: hv: Further protection for the rescind path > > -----Original Message----- > > From: Dexuan Cui > > Sent: Thursday, February 5, 2015 4:45 AM > > To: Vitaly Kuznetsov; KY Srinivasan > > Cc: devel@linuxdriverproject.org; Haiyang Zhang; linux- > > kernel@vger.kernel.org; Jason Wang > > Subject: RE: [PATCH 0/4] Drivers: hv: Further protection for the rescind path > > > > > -----Original Message----- > > > From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com] > > > Sent: Thursday, February 5, 2015 18:10 PM > > > To: KY Srinivasan > > > Cc: devel@linuxdriverproject.org; Haiyang Zhang; > > > linux-kernel@vger.kernel.org; Dexuan Cui; Jason Wang > > > Subject: Re: [PATCH 0/4] Drivers: hv: Further protection for the > > > rescind path > > > > > > KY Srinivasan writes: > > > > > > >> -----Original Message----- > > > >> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com] > > > >> Sent: Tuesday, February 3, 2015 9:01 AM > > > >> To: KY Srinivasan; devel@linuxdriverproject.org > > > >> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui; Jason > > > >> Wang > > > >> Subject: [PATCH 0/4] Drivers: hv: Further protection for the > > > >> rescind path > > > >> > > > >> This series is a continuation of the "Drivers: hv: vmbus: serialize > > > >> Offer and Rescind offer". I'm trying to address a number of > > > >> theoretically possible issues with rescind offer handling. All > > > >> these complications come from the fact that a rescind offer results > > > >> in vmbus channel being freed and we must ensure nobody still uses > > > >> it. Instead of introducing new locks I suggest we switch channels usage > > to the get/put workflow. > > > >> > > > >> The main part of the series is [PATCH 1/4] which introduces the > > > >> workflow for vmbus channels, all other patches fix different corner > > > >> cases using this workflow. I'm not sure all such cases are covered > > > >> with this series (probably not), but in case protection is required > > > >> in some other places it should become relatively easy to add one. > > > >> > > > >> I did some sanity testing with CONFIG_DEBUG_LOCKDEP=y and nothing > > > >> popped out, however, additional testing would be much appreciated. > > > >> > > > >> K.Y., Haiyang, I'm not sending this series to netdev@ and > > > >> linux-scsi@ as it is supposed to be applied as a whole, please > > > >> resend these patches with your sign-offs when (and if) we're done with > > reviews. Thanks! > > > > > > > > Vitaly, > > > > > > > > Thanks for looking into this issue. While today, rescind offer > > > > results in the > > > freeing of the channel, I don't think > > > > that is required. By not freeing up the channel in the rescind path, > > > > we can have > > > a safe way to access the channel and > > > > that does not have to involve taking a reference on the channel > > > > every time you > > > access it - the get/put workflow in your > > > > patch set. As part of the network performance improvement work, I > > > > had > > > eliminated all locks in the receive path by setting > > > > up per-cpu data structures for mapping the relid to channel etc. > > > > These set of > > > patches introduces locking/atomic operations > > > > in performance critical code paths to deal with an event that is > > > > truly rare - the channel getting rescinded. > > > > > > It is possible to eliminate all locks/atomic operations from > > > performance critical pyth in my patch series by following Dexuan's > > > suggestion - we'll get the channel in vmbus_open and put it in > > > vmbus_close (and on processing offer/rescind offer) this won't affect > > > performance. I'm in the middle of testing this approach. > > > > > > > > > > > All channel messages are handled in a single work context: > > > > vmbus_on_msg_dpc() -> vmbus_onmessage_work()-> Various channel > > > messages [offer, rescind etc.] > > This is true. > > > > > > > > > > So, the rescind message cannot be processed while we are processing > > > > the > > > offer message and since an offer > > > > cannot be rescinded before it is offered, offer and rescind are > > > > naturally serialized (I think I have patchset in my queue > > IMO this may not be true. > > The cause is: > > (I'm using the latest linux-next repo, which includes Vitaly's patch > > "Drivers: hv: vmbus: serialize Offer and Rescind offer".) > > > > vmbus_onoffer_rescind() runs in vmbus_connection.work_queue, but > > vmbus_process_offer() runs in the per-channel newchannel->controlwq, so > > the two functions are not serialized, at least in theory. > > > > As a result, in vmbus_process_offer(): after the new channel is added into > > vmbus_connection.chn_list, but before the channel is completely initialized > > by us (we need to create a vmbus device and associate the device with the > > channel -- this procedure could fail and we goto err_free_chan and free the > > channel directly!), vmbus_onoffer_rescind() can see the new channel, but > > doesn't know the channel could be freed in another place at the same time. > > > > BTW, when vmbus_process_offer() -> vmbus_device_create() fails, we goto > > err_free_chan without removing the new channel from > > vmbus_connection.chn_list? > > > > As another result : in vmbus_process_offer(), in the case > > vmbus_device_register() fails, we'll run "list_del(&newchannel->listentry) > > and unlock vmbus_connection.channel_lock" -- just after these 2 lines, at > > this time, > > vmbus_onoffer_rescind() -> relid2channel() can return NULL, and we'll miss > > the rescind message, i.e., we fail to send the > > CHANNELMSG_RELID_RELEASED message to the host. > > This is the way the code is; but it does not have to be. Here is a patch that > implements what I have > been trying to describe. This is a rough proposal; there is additional (obvious) > cleanup that I have not done. > This is on top of the patches that have been submitted to Greg. > > Signed-off-by: K. Y. Srinivasan Hi KY, The patch is great! It effectively removes the per-channel workqueue and uses the global vmbus_connection.work_queue to implement natural serialization. It simplifies all the things a lot! Two small suggestions: 1) hv_process_device_removal() -> hv_process_channel_removal(). 2) I think we can revert the patch "Drivers: hv: vmbus: serialize Offer and Rescind offer" first and then this change will be smaller and clearer. :-) -- Dexuan > --- > drivers/hv/channel.c | 8 +++++ > drivers/hv/channel_mgmt.c | 63 +++++++++++++------------------------------- > drivers/hv/hv_util.c | 2 +- > drivers/hv/vmbus_drv.c | 24 ++++++++++++----- > include/linux/hyperv.h | 1 + > 5 files changed, 46 insertions(+), 52 deletions(-) > > diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c > index bf0cf8f..4a21cd8 100644 > --- a/drivers/hv/channel.c > +++ b/drivers/hv/channel.c > @@ -501,6 +501,14 @@ static int vmbus_close_internal(struct vmbus_channel > *channel) > put_cpu(); > } > > +/* > + * If the channel has been rescinded; process device removal. > + */ > +if (channel->rescind) { > +hv_process_device_removal(channel); > +return 0; > +} > + > /* Send a closing message */ > > msg = &channel->close_msg.msg; > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c > index 0ba6b5c..a208256 100644 > --- a/drivers/hv/channel_mgmt.c > +++ b/drivers/hv/channel_mgmt.c > @@ -207,27 +207,11 @@ static void percpu_channel_deq(void *arg) > list_del(&channel->percpu_list); > } > > -/* > - * vmbus_process_rescind_offer - > - * Rescind the offer by initiating a device removal > - */ > -static void vmbus_process_rescind_offer(struct work_struct *work) > +void hv_process_device_removal(struct vmbus_channel *channel) > { > -struct vmbus_channel *channel = container_of(work, > - struct vmbus_channel, > - work); > +struct vmbus_channel_relid_released msg; > unsigned long flags; > struct vmbus_channel *primary_channel; > -struct vmbus_channel_relid_released msg; > -struct device *dev; > - > -if (channel->device_obj) { > -dev = get_device(&channel->device_obj->device); > -if (dev) { > -vmbus_device_unregister(channel->device_obj); > -put_device(dev); > -} > -} > > memset(&msg, 0, sizeof(struct vmbus_channel_relid_released)); > msg.child_relid = channel->offermsg.child_relid; > @@ -256,6 +240,7 @@ static void vmbus_process_rescind_offer(struct > work_struct *work) > free_channel(channel); > } > > + > void vmbus_free_channels(void) > { > struct vmbus_channel *channel; > @@ -270,11 +255,8 @@ void vmbus_free_channels(void) > * vmbus_process_offer - Process the offer by creating a channel/device > * associated with this offer > */ > -static void vmbus_process_offer(struct work_struct *work) > +static void vmbus_process_offer(struct vmbus_channel *newchannel) > { > -struct vmbus_channel *newchannel = container_of(work, > -struct vmbus_channel, > -work); > struct vmbus_channel *channel; > bool fnew = true; > bool enq = false; > @@ -340,7 +322,7 @@ static void vmbus_process_offer(struct work_struct > *work) > if (channel->sc_creation_callback != NULL) > channel->sc_creation_callback(newchannel); > > -goto done_init_rescind; > +goto done; > } > > goto err_free_chan; > @@ -381,15 +363,10 @@ static void vmbus_process_offer(struct work_struct > *work) > kfree(newchannel->device_obj); > goto err_free_chan; > } > -done_init_rescind: > -spin_lock_irqsave(&newchannel->lock, flags); > -/* The next possible work is rescind handling */ > -INIT_WORK(&newchannel->work, vmbus_process_rescind_offer); > -/* Check if rescind offer was already received */ > -if (newchannel->rescind) > -queue_work(newchannel->controlwq, &newchannel->work); > -spin_unlock_irqrestore(&newchannel->lock, flags); > + > +done: > return; > + > err_free_chan: > free_channel(newchannel); > } > @@ -515,8 +492,7 @@ static void vmbus_onoffer(struct > vmbus_channel_message_header *hdr) > newchannel->monitor_grp = (u8)offer->monitorid / 32; > newchannel->monitor_bit = (u8)offer->monitorid % 32; > > -INIT_WORK(&newchannel->work, vmbus_process_offer); > -queue_work(newchannel->controlwq, &newchannel->work); > +vmbus_process_offer(newchannel); > } > > /* > @@ -529,6 +505,7 @@ static void vmbus_onoffer_rescind(struct > vmbus_channel_message_header *hdr) > struct vmbus_channel_rescind_offer *rescind; > struct vmbus_channel *channel; > unsigned long flags; > +struct device *dev; > > rescind = (struct vmbus_channel_rescind_offer *)hdr; > channel = relid2channel(rescind->child_relid); > @@ -539,18 +516,16 @@ static void vmbus_onoffer_rescind(struct > vmbus_channel_message_header *hdr) > > spin_lock_irqsave(&channel->lock, flags); > channel->rescind = true; > -/* > - * channel->work.func != vmbus_process_rescind_offer means we are still > - * processing offer request and the rescind offer processing should be > - * postponed. It will be done at the very end of vmbus_process_offer() > - * as rescind flag is being checked there. > - */ > -if (channel->work.func == vmbus_process_rescind_offer) > -/* work is initialized for vmbus_process_rescind_offer() from > - * vmbus_process_offer() where the channel got created */ > -queue_work(channel->controlwq, &channel->work); > - > spin_unlock_irqrestore(&channel->lock, flags); > + > +if (channel->device_obj) { > +dev = get_device(&channel->device_obj->device); > +if (dev) { > +vmbus_device_unregister(channel->device_obj); > +put_device(dev); > +} > +} > + > } > > /* > diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c > index c5be773..7994ec2 100644 > --- a/drivers/hv/hv_util.c > +++ b/drivers/hv/hv_util.c > @@ -380,9 +380,9 @@ static int util_remove(struct hv_device *dev) > { > struct hv_util_service *srv = hv_get_drvdata(dev); > > -vmbus_close(dev->channel); > if (srv->util_deinit) > srv->util_deinit(); > +vmbus_close(dev->channel); > kfree(srv->recv_buffer); > > return 0; > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c > index 3cd44ae..64627c4 100644 > --- a/drivers/hv/vmbus_drv.c > +++ b/drivers/hv/vmbus_drv.c > @@ -505,15 +505,25 @@ static int vmbus_probe(struct device *child_device) > */ > static int vmbus_remove(struct device *child_device) > { > -struct hv_driver *drv = drv_to_hv_drv(child_device->driver); > +struct hv_driver *drv; > struct hv_device *dev = device_to_hv_device(child_device); > > -if (drv->remove) > -drv->remove(dev); > -else > -pr_err("remove not set for driver %s\n", > -dev_name(child_device)); > - > +if (child_device->driver) { > +drv = drv_to_hv_drv(child_device->driver); > +if (drv->remove) { > +drv->remove(dev); > +} else { > +pr_err("remove not set for driver %s\n", > +dev_name(child_device)); > +hv_process_device_removal(dev->channel); > +} > +} else { > +/* > + * We don't have a driver for this device; deal with the > + * rescind message. > + */ > +hv_process_device_removal(dev->channel); > +} > return 0; > } > > diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h > index b079abf..9658bfe 100644 > --- a/include/linux/hyperv.h > +++ b/include/linux/hyperv.h > @@ -1226,6 +1226,7 @@ void hv_kvp_onchannelcallback(void *); > int hv_vss_init(struct hv_util_service *); > void hv_vss_deinit(void); > void hv_vss_onchannelcallback(void *); > +void hv_process_device_removal(struct vmbus_channel *channel); > > extern struct resource *hyperv_mmio; > > -- > 1.7.4.1 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/