Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755488AbcDDVq4 (ORCPT ); Mon, 4 Apr 2016 17:46:56 -0400 Received: from mail1.bemta12.messagelabs.com ([216.82.251.6]:11511 "EHLO mail1.bemta12.messagelabs.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751902AbcDDVqy convert rfc822-to-8bit (ORCPT ); Mon, 4 Apr 2016 17:46:54 -0400 X-Greylist: delayed 386 seconds by postgrey-1.27 at vger.kernel.org; Mon, 04 Apr 2016 17:46:54 EDT X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrAKsWRWlGSWpSXmKPExsVywNY2Q9f+PlO 4wZs1shZ7zvxit2hevJ7N4sOUu6wWl3fNYbNo/TeH3YHV496+wywe++euYfdYd+k9o8f7fVfZ PD5vkgtgjWLNzEvKr0hgzbh9ZC1LwTeziq2b+lgaGJ9rdzFycQgJ7GKU6H3ZyQLhHGCUOLtlE 5SznlHix7UHbF2MnBxsAoYS15ZtA7NFBFQkLvzbyw5SxCzwnEmi5/RKZpCEsICtxLfLHVBFdh JLL31khrCNJP5PP8oEYrMANV/aOBXI5uDgFfCR6P3JD7FsD6PEzaZFYPWcAg4S22e9YgWxGQX EJL6fWgPWyywgLnHryXwwW0JAQGLJnvPMELaoxMvH/1ghbAOJrUv3sUDYShI/v29hgejVkViw +xMbhK0tsWzha7BeXgFBiZMzn4DVCAHddnL3P7YJjOKzkKybhaR9FpL2WUjaFzCyrGLUKE4tK kst0jUy1ksqykzPKMlNzMzRNTQ00stNLS5OTE/NSUwq1kvOz93ECIzZegYGxh2Mq6d7HWKU5G BSEuW9dZIpXIgvKT+lMiOxOCO+qDQntfgQowwHh5IEL889oJxgUWp6akVaZg4wecCkJTh4lER 49UDSvMUFibnFmekQqVOMilLivB/vAiUEQBIZpXlwbbCEdYlRVkqYl5GBgUGIpyC1KDezBFX+ FaM4B6OSMC83yHiezLwSuOmvgBYzAS2uFwZbXJKIkJJqYJxzt3F/+qslTHbqeYvDa8/uXSD+S ewDC2P4+mlC35L9l65QWPLz/EMNq2/VzzX+nXv/kOvODNs1aQdsDiRvKAowO7guxOfb8xW926 fMvchwkTFk/20NppSokimPujOD9hROdjybY1zurryew+by5Dn3Vf+yCL7eUFWTz3X/lHBz08H Nxjp/nfmVWIozEg21mIuKEwEo6KjgUwMAAA== X-Env-Sender: Timothy.Sell@unisys.com X-Msg-Ref: server-13.tower-200.messagelabs.com!1459806014!434551!1 X-Originating-IP: [192.61.61.104] X-StarScan-Received: X-StarScan-Version: 8.28; banners=-,-,- X-VirusChecked: Checked From: "Sell, Timothy C" To: Neil Horman CC: Iban Rodriguez , "Kershner, David A" , Greg Kroah-Hartman , Benjamin Romer , *S-Par-Maintainer , "devel@driverdev.osuosl.org" , "linux-kernel@vger.kernel.org" Subject: RE: Staging: unisys/verisonic: Correct double unlock Thread-Topic: Staging: unisys/verisonic: Correct double unlock Thread-Index: AQHRjn86aL29emOwhk+SVTi9CwBfJZ96NfFA Date: Mon, 4 Apr 2016 21:40:13 +0000 Message-ID: References: <1459619226-9786-1-git-send-email-iban.rodriguez@ono.com> <20160404143521.GB8807@hmsreliant.think-freely.org> In-Reply-To: <20160404143521.GB8807@hmsreliant.think-freely.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [172.17.125.147] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6976 Lines: 190 > -----Original Message----- > From: Neil Horman [mailto:nhorman@redhat.com] > Sent: Monday, April 04, 2016 10:35 AM > To: Sell, Timothy C > Cc: Iban Rodriguez; Kershner, David A; Greg Kroah-Hartman; Benjamin > Romer; *S-Par-Maintainer; devel@driverdev.osuosl.org; linux- > kernel@vger.kernel.org > Subject: Re: Staging: unisys/verisonic: Correct double unlock > > On Sat, Apr 02, 2016 at 11:20:14PM +0000, Sell, Timothy C wrote: > > > -----Original Message----- > > > From: Iban Rodriguez [mailto:iban.rodriguez@ono.com] > > > Sent: Saturday, April 02, 2016 1:47 PM > > > To: Kershner, David A; Greg Kroah-Hartman; Benjamin Romer; Sell, > Timothy > > > C; Neil Horman > > > Cc: *S-Par-Maintainer; devel@driverdev.osuosl.org; linux- > > > kernel@vger.kernel.org; Iban Rodriguez > > > Subject: Staging: unisys/verisonic: Correct double unlock > > > > > > 'priv_lock' is unlocked twice. The first one is removed and > > > the function 'visornic_serverdown_complete' is now called with > > > 'priv_lock' locked because 'devdata' is modified inside. > > > > > > Signed-off-by: Iban Rodriguez > > > --- > > > drivers/staging/unisys/visornic/visornic_main.c | 1 - > > > 1 file changed, 1 deletion(-) > > > > > > diff --git a/drivers/staging/unisys/visornic/visornic_main.c > > > b/drivers/staging/unisys/visornic/visornic_main.c > > > index be0d057346c3..af03f2938fe9 100644 > > > --- a/drivers/staging/unisys/visornic/visornic_main.c > > > +++ b/drivers/staging/unisys/visornic/visornic_main.c > > > @@ -368,7 +368,6 @@ visornic_serverdown(struct visornic_devdata > > > *devdata, > > > } > > > devdata->server_change_state = true; > > > devdata->server_down_complete_func = complete_func; > > > - spin_unlock_irqrestore(&devdata->priv_lock, flags); > > > visornic_serverdown_complete(devdata); > > > } else if (devdata->server_change_state) { > > > dev_dbg(&devdata->dev->device, "%s changing state\n", > > > > I agree there is a bug here involving priv_lock being unlocked > > twice, but this patch isn't the appropriate fix. Reason is, we can NOT > > call visornic_serverdown_complete() while holding a spinlock > > (which is what this patch would cause to occur) because > > visornic_serverdown_complete() might block when it calls > > rtnl_lock() in this code sequence (rtnl_lock() grabs a mutex): > > > > rtnl_lock(); > > dev_close(netdev); > > rtnl_unlock(); > > > > Blocking with a spinlock held is always a bad idea. :-( > > > > You should just get rid of the priv_lock entirely, its not needed. > > priv_lock is used the following functions: > > visornic_serverdown - only called at the end of a tx_timeout reset > operation, so > you are sure that the rx and tx paths are quiesced (i.e. no data access > happening) > > visornic_disable_with_timeout - move the netif_stop_queue operation to > the top > of this function and you will be guaranteed no concurrent access in the tx > path > > visornic_enable_with_timeout - same as above, make sure that > netif_start_queue > and napi_enable are at the end of the function and you are guarantted no > concurrent access. > > visornic_xmit - The queue lock in the netdev_start_xmit routine guarantees > you > single access here from multiple transmits. > > visornic_xmit_timeout - only called on a tx timeout, when you are > guaranteed not > to have concurrent transmit occuing, by definition. > > visornic_rx - the only tests made here are to devdata members that are > altered > in service_resp_queue, and the visornic_rx is only called from > service_resp_queue, so you are guaranteed a stable data structure, as there > is > only ever one context in service_resp_queue as its called from the napi poll > routine > > service_resp_queue - Same as above, for any given queue, > service_resp_queue only > has one context exectuing at once. > > host_side_disappeared - only called from visornic_remove, when implies > that all > associated devices are closed already, guaranteeing single access. > > visornic_remove > visornic_resume - Both of these function only get called when all network > interfaces are quiesced. > > just remove the lock and make the minor changes needed to guarantee > isolated > access. It makes the code cleaner and faster > > Neil Neil, Although I would also love to get rid of this lock, I think we still need it, and will attempt to explain. There's a thread of execution present in visornic that doesn't exist in traditional network drivers, which involves the visornic_pause() and visornic_resume() functions registered during: visorbus_register_visor_driver(&visornic_driver) visornic_pause() and visornic_resume() are called on a thread managed by visorbus, in response to messages received from our hypervisor back-end. Note that visornic_pause() calls visornic_serverdown(), which is one of the users of priv_lock. (I.e., visornic_serverdown() is called from other places besides the end of a tx_timeout reset operation, which is what you called out in your explanation above). We need priv_lock to do a false --> true transition of devdata->server_change_state in the pause/resume path, so we can prevent this transition from occurring during critical sections in the normal networking path. The comment present on priv_lock's declaration: spinlock_t priv_lock; /* spinlock to access devdata structures */ is indeed inadequate to the point of being misleading. visornic_serverdown() in its present form is hard-to-follow, in addition to having the double-unlock bug. I would prefer if it were corrected and rewritten to look like this (where the main-path falls thru down the left side of the screen): static int visornic_serverdown(struct visornic_devdata *devdata, visorbus_state_complete_func complete_func) { unsigned long flags; int err; spin_lock_irqsave(&devdata->priv_lock, flags); if (devdata->server_change_state) { dev_dbg(&devdata->dev->device, "%s changing state\n", __func__); err = -EINVAL; goto err_unlock; } if (devdata->server_down) { dev_dbg(&devdata->dev->device, "%s already down\n", __func__); err = -EINVAL; goto err_unlock; } if (devdata->going_away) { dev_dbg(&devdata->dev->device, "%s aborting because device removal pending\n", __func__); err = -ENODEV; goto err_unlock; } devdata->server_change_state = true; devdata->server_down_complete_func = complete_func; spin_unlock_irqrestore(&devdata->priv_lock, flags); visornic_serverdown_complete(devdata); return 0; err_unlock: spin_unlock_irqrestore(&devdata->priv_lock, flags); return err; } Tim > > > > -- > > > 1.9.1 > >