Date: Mon, 28 Nov 2022 10:50:38 -0500
From: Alan Stern
To: Vincent MAILHOL
Cc: Andrew Lunn, linux-can@vger.kernel.org, Marc Kleine-Budde,
    linux-kernel@vger.kernel.org, Greg Kroah-Hartman, netdev@vger.kernel.org,
    linux-usb@vger.kernel.org, Saeed Mahameed, Jiri Pirko, Lukas Magel
Subject: Re: [PATCH v4 2/6] can: etas_es58x: add devlink support
References: <20221104073659.414147-1-mailhol.vincent@wanadoo.fr>
    <20221126162211.93322-1-mailhol.vincent@wanadoo.fr>
    <20221126162211.93322-3-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 28, 2022 at 02:32:23PM +0900, Vincent MAILHOL wrote:
> On Mon. 28 Nov. 2022 at 10:34, Vincent MAILHOL wrote:
> > On Mon. 28 Nov. 2022 at 00:41, Alan Stern wrote:
> > > On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote:
> > > > > Should devlink_free() be after usb_set_intfdata()?
> > > >
> > > > A look at
> > > >   $ git grep -W "usb_set_intfdata(.*NULL)"
> > > >
> > > > shows that the two patterns (freeing before or after
> > > > usb_set_intfdata()) coexist.
> > > >
> > > > You are raising an important question here. usb_set_intfdata() does
> > > > not have documentation that freeing before it is risky. And the
> > > > documentation of usb_driver::disconnect says that:
> > > >   "@disconnect: Called when the interface is no longer accessible,
> > > >    usually because its device has been (or is being) disconnected
> > > >    or the driver module is being unloaded."
> > > >   Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130
> > > >
> > > > So the interface no longer being accessible makes me assume that the
> > > > order does not matter. If it indeed matters, then this is a foot gun
> > > > and there is some clean-up work waiting for us on many drivers.
> > > >
> > > > @Greg, any thoughts on whether or not the order of usb_set_intfdata()
> > > > and resource freeing matters or not?
> > >
> > > In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the
> > > USB core does it for them after the ->disconnect() callback returns.
> >
> > Interesting. This fact is widely unknown, cf:
> >   $ git grep "usb_set_intfdata(.*NULL)" | wc -l
> >   215
> >
> > I will do some clean-up later on, at least for the CAN USB drivers.
> >
> > > But if a driver does make the call, it should be careful to ensure that
> > > the call happens _after_ the driver is finished using the interface-data
> > > pointer.
> > > For example, after all outstanding URBs have completed, if the
> > > completion handlers will need to call usb_get_intfdata().
> >
> > ACK. I understand that it should be called *after* the completion of
> > any ongoing task.
> >
> > My question was more on:
> >
> >         devlink_free(priv_to_devlink(es58x_dev));
> >         usb_set_intfdata(intf, NULL);
> >
> > VS.
> >
> >         usb_set_intfdata(intf, NULL);
> >         devlink_free(priv_to_devlink(es58x_dev));
> >
> > From your comments, I understand that both are fine.
>
> Do we agree that the usb-skeleton is doing it wrong?
>   https://elixir.bootlin.com/linux/latest/source/drivers/usb/usb-skeleton.c#L567
>
> usb_set_intfdata(interface, NULL) is called before deregistering the
> interface and terminating the outstanding URBs!

Going through the usb-skeleton.c source code, you will find that
usb_get_intfdata() is called from only a few routines:

        skel_open()
        skel_disconnect()
        skel_suspend()
        skel_pre_reset()
        skel_post_reset()

Of those, all but the first are called only by the USB core and they are
mutually exclusive with disconnect processing (except for
skel_disconnect() itself, of course). So they don't matter.

The first, skel_open(), can be called as a result of actions by the
user, so the driver needs to ensure that this can't happen after it
clears the interface-data pointer. The user can open the device file at
any time before the minor number is given back, so it is not proper to
call usb_set_intfdata(interface, NULL) before usb_deregister_dev() --
but the driver does exactly this!

(Well, it's not quite that bad. skel_open() does check whether the
interface-data pointer value it gets from usb_get_intfdata() is NULL.
But it's still a race.)

So yes, the current code is wrong. And in fact, it will still be wrong
even after the usb_set_intfdata(interface, NULL) line is removed,
because there is no synchronization between skel_open() and
skel_disconnect().
It is possible for skel_disconnect() to run to completion and the USB
core to clear the interface-data pointer all while skel_open() is
running. The driver needs a static private mutex to synchronize opens
with unregistrations. (This is a general phenomenon, true of all
drivers that have a user interface such as a device file.)

The driver _does_ have a per-instance mutex, dev->io_mutex, to
synchronize I/O with disconnects. But that's separate from
synchronizing opens with unregistrations, because at open time the
driver doesn't yet know the address of the private data structure or
even if the structure is still allocated. So obviously it can't use a
mutex that is embedded within the private data structure for this
purpose.

Alan Stern
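
To make the two-mutex idea above concrete, here is a rough userspace
sketch using pthreads. It is only an analogy, not kernel code and not
what usb-skeleton.c does or should look like: minor_table, table_mutex
and every sketch_*() name are made up for illustration, and a plain
array stands in for the minor-number registration. The point it tries
to show is that the open path takes a static mutex, which exists
independently of any device, before it looks at the per-device
structure, while disconnect removes the device from the table under
that same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_MINORS 4

/* Hypothetical per-device state (stand-in for the driver's private data). */
struct skel_dev {
	pthread_mutex_t io_mutex; /* per-instance: I/O vs. disconnect */
	int refcount;             /* protected by table_mutex */
	int disconnected;         /* protected by io_mutex */
};

/*
 * Static mutex: it is not embedded in any device, so open can take it
 * before knowing whether the device structure is still allocated.
 */
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct skel_dev *minor_table[MAX_MINORS]; /* stand-in for minors */

/* Drop one reference; caller must hold table_mutex. */
static void dev_put_locked(struct skel_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		pthread_mutex_destroy(&dev->io_mutex);
		free(dev);
	}
}

/* "probe": allocate the device and publish it under the static mutex. */
int sketch_probe(int minor)
{
	struct skel_dev *dev;
	int ret = -1;

	if (minor < 0 || minor >= MAX_MINORS)
		return -1;
	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return -1;
	pthread_mutex_init(&dev->io_mutex, NULL);
	dev->refcount = 1; /* reference held by the registration */

	pthread_mutex_lock(&table_mutex);
	if (!minor_table[minor]) {
		minor_table[minor] = dev;
		ret = 0;
	} else {
		dev_put_locked(dev); /* slot already taken: undo */
	}
	pthread_mutex_unlock(&table_mutex);
	return ret;
}

/* "open": lookup and reference grab happen atomically w.r.t. disconnect. */
struct skel_dev *sketch_open(int minor)
{
	struct skel_dev *dev;

	if (minor < 0 || minor >= MAX_MINORS)
		return NULL;
	pthread_mutex_lock(&table_mutex);
	dev = minor_table[minor];
	if (dev)
		dev->refcount++;
	pthread_mutex_unlock(&table_mutex);
	return dev;
}

/* "release": drop the reference taken by sketch_open(). */
void sketch_release(struct skel_dev *dev)
{
	pthread_mutex_lock(&table_mutex);
	dev_put_locked(dev);
	pthread_mutex_unlock(&table_mutex);
}

/* "disconnect": unpublish first, then stop I/O, then drop our reference. */
void sketch_disconnect(int minor)
{
	struct skel_dev *dev;

	if (minor < 0 || minor >= MAX_MINORS)
		return;
	pthread_mutex_lock(&table_mutex);
	dev = minor_table[minor];
	minor_table[minor] = NULL; /* no new opens can find the device */
	pthread_mutex_unlock(&table_mutex);
	if (!dev)
		return;

	pthread_mutex_lock(&dev->io_mutex);
	dev->disconnected = 1; /* I/O paths would check this flag */
	pthread_mutex_unlock(&dev->io_mutex);

	sketch_release(dev); /* freed once the last opener releases too */
}
```

With this arrangement an open either completes before the disconnect
unpublishes the device, in which case it holds a reference that keeps
the structure alive, or it runs afterwards and finds nothing. The
per-instance io_mutex is only touched once a valid reference is held.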