Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp1054459rwd; Sat, 27 May 2023 10:22:19 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6xVdScRjI6IZNXQTQLlxHs6Rb7lEXMCOYVeVO/wv97QD/ZrsrQOTlapOR8LNIazwjXSnOY X-Received: by 2002:a17:902:e846:b0:1af:eaac:4b80 with SMTP id t6-20020a170902e84600b001afeaac4b80mr7997274plg.35.1685208139609; Sat, 27 May 2023 10:22:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685208139; cv=none; d=google.com; s=arc-20160816; b=EEIABbrn2W26ecr2KUdlDMf3Ka17V9xY2WuPf624GAi/+fc39ZSeBJBbOupHAqrKhT gV9URU5didef8y/e0L0xOqITWExPGigFhZhITnGojIV6TpXibClPzsawTs0iLFu8+XjF Dih6ceMwwU++d28FGYu5ObGZeK+NjJJOzxYeNgIbK6n3hCxI1bugAgwABd61AT1DoKIW S7zCsK+8toTrGGy7OVoZ0UEDsBjFKHV1zx2fYjOGL7/J67jr4reEeQxbws3fFiKBCEa2 hTn5VCfEu6wv3oz8bqVFxFKe0y6FjhgvpSjUAKNgjppaWVX0E3QQG1OBhDzOZHILO4DS ZDFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=YwUFF90eaQEYKdOgeZRy60m5hqzJ9os9Y+uC/Jf9dMQ=; b=A2D7saA4gPuixje7h51hl74Od4X3ALnFXHXSOnbzZYukSr/c0ffRqXffzDH6jgGlFd OnlfO1KzZ5he9GzFstCrvGaPUKPu2/wsxIpTMxcMkFlfJnnAzQg2YP7Y4hhdlG1O8pC5 xU7qlAz/uI8LXs9XGZrruRgaSq0lVBsHj4we9l/VZBSqFeEhnKbrlam5zVfKGEdbOFzR 9qorIhO2uZ8umolPlYS0zyJyEOQ+zMidJjHzWIn9iE9Mr+q0s/TgVbHdGEKhu2iJ0hHF IPO2QPfGnR/4sdPs0TJRBSF3iRt4kdod8xLfC8BOuZyNCJYa7MXllhSmyXG2Y0WhtJUr 0yKA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id km3-20020a17090327c300b001a6bb7b7a44si6091550plb.307.2023.05.27.10.22.08; Sat, 27 May 2023 10:22:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229647AbjE0QhC (ORCPT + 99 others); Sat, 27 May 2023 12:37:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229512AbjE0QhB (ORCPT ); Sat, 27 May 2023 12:37:01 -0400 Received: from netrider.rowland.org (netrider.rowland.org [192.131.102.5]) by lindbergh.monkeyblade.net (Postfix) with SMTP id 65CB4BB for ; Sat, 27 May 2023 09:36:59 -0700 (PDT) Received: (qmail 327854 invoked by uid 1000); 27 May 2023 12:36:58 -0400 Date: Sat, 27 May 2023 12:36:58 -0400 From: Alan Stern To: Badhri Jagan Sridharan Cc: gregkh@linuxfoundation.org, colin.i.king@gmail.com, xuetao09@huawei.com, quic_eserrao@quicinc.com, water.zhangjiantao@huawei.com, peter.chen@freescale.com, francesco@dolcini.it, alistair@alistair23.me, stephan@gerhold.net, bagasdotme@gmail.com, luca@z3ntu.xyz, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, Francesco Dolcini Subject: Re: [PATCH v2] usb: gadget: udc: core: Offload usb_udc_vbus_handler processing Message-ID: <406371f0-db48-4195-b85d-b75ce83e738b@rowland.harvard.edu> References: <20230519043041.1593578-1-badhri@google.com> <547ecbb2-921d-4714-82b7-066202ccf292@rowland.harvard.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,SPF_HELO_PASS,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 26, 2023 at 07:42:39PM -0700, Badhri Jagan Sridharan wrote: > Thanks again Alan ! > > On Mon, May 22, 2023 at 8:55 AM Alan Stern wrote: > > Getting back to your first point, it looks like we need to assume any > > routine that needs to communicate with the UDC hardware (such as the > > ->pullup callback used in usb_gadget_{dis}connect()) must always be > > called in process context. This means that usb_udc_connect_control() > > always has to run in process context, since it will do either a connect > > or a disconnect. > > > > On the other hand, some routines -- in particular, > > usb_udc_vbus_handler() -- may be called by a UDC driver's interrupt > > handler and therefore may run in interrupt context. (This fact should > > be noted in that routine's kerneldoc, by the way.) > > > > So here's the problem: usb_udc_vbus_handler() running in interrupt > > context calls usb_udc_connect_control(), which has to run in process > > context. And this is not just a simple issue caused by the > > ->disconnect() callback or use of mutexes; it's more fundamental. > > > > I'm led to conclude that you were right to offload part of > > usb_udc_vbus_handler()'s job to a workqueue. It's an awkward thing to > > do, because you have to make sure to cancel the work item at times when > > it's no longer needed. But there doesn't seem to be any other choice. > > > > Here's two related problems for you to think about: > > > > 1. Once gadget_unbind_driver() has called usb_gadget_disconnect(), > > we don't want a VBUS change to cause usb_udc_vbus_handler()'s > > work routine to turn the pullup back on. How can we prevent > > this? > > > > 2. More generally, suppose usb_udc_vbus_handler() gets called at > > exactly the same time that some other pathway (either > > gadget_bind_driver() or soft_connect_store()) tries to do a > > connect or disconnect. What should happen then? > > > I believe I can solve the above races by protecting the flags set by > each of them with connect_lock and not pulling up unless all of them > are true. > > The caller will hold connect_lock, update the respective flag and > invoke the below usb_gadget_pullup_update_locked function(shown > below). Are you certain this can be done without causing any deadlocks? > Code stub: > /* Internal version of usb_gadget_connect needs to be called with > connect_lock held. */ > static int usb_gadget_pullup_update_locked(struct usb_gadget *gadget) > __must_hold(&gadget->udc->connect_lock) > { > int ret = 0; > bool connect = !gadget->deactivated && gadget->udc->started && > gadget->udc->vbus && > gadget->udc->allow_connect; On further thought, I decided "allow_connect" is a dumb name. Let's call it "unbinding" instead, since it gets set only when a gadget driver is about to be unbound (which is when we want to prevent new connections). > if (!gadget->ops->pullup) { > ret = -EOPNOTSUPP; > goto out; > } > > if (connect != gadget->connected) { You need to be more careful here. It's possible to have gadget->connected set at the same time as gadget->deactivated -- it means that when the gadget gets re-activated, it will immediately try to connect again. In fact, this logic doesn't look right at all. For example, suppose the gadget driver wants to disconnect. This routine will compute connect = true and will see that gadget->connected is set, so it won't do anything! I think it would be better just to merge the new material into usb_gadget_connect() and usb_gadget_disconnect(). > ret = gadget->ops->pullup(gadget, connect); > if (!ret) > gadget->connected = connect; > if (!connect) { > mutex_lock(&udc_lock); > if (gadget->udc->driver) > gadget->udc->driver->disconnect(gadget); > mutex_unlock(&udc_lock); > } > > out: > trace_usb_gadget_connect(gadget, ret); > > return ret; > } > > However, while auditing the code again, I noticed another potential > race as well: > Looks like usb_del_gadget() can potentially race against > usb_udc_vbus_handler() and call device_unregister. > This implies usb_udc can be freed while usb_udc_vbus_handler() or the > work item is executing. > > void usb_del_gadget(struct usb_gadget *gadget) > { > struct usb_udc *udc = gadget->udc; > > .. > ... > device_unregister(&udc->dev); > } > EXPORT_SYMBOL_GPL(usb_del_gadget); > > Does this look like a valid concern to you or am I misunderstanding this ? You're missing an important point. Before doing device_unregister(), this routine calls device_del(&gadget->dev). That will unbind the gadget driver, which (among other things) will stop the UDC, preventing it from calling usb_udc_vbus_handler(). However, you're right that the work item will need to be cancelled at some point before the usb_udc is unregistered. > If so, I am afraid that the only way to solve this is by synchronizing > usb_udc_vbus_handler() against usb_del_gadget() through a mutex as > device_unregister() can sleep. > So offloading usb_udc_vbus_handler() cannot work either. > > usb_udc_vbus_hander() seems to be invoked from the following drivers: > > 1. drivers/usb/chipidea/udc.c: > usb_udc_vbus_hander() is called from a non-atomic context. > > 2. drivers/usb/gadget/udc/max3420_udc.c > usb_udc_vbus_hander() is called from the interrupt handler. > However, all the events are processed from max3420_thread kthread. > So I am thinking of making them threaded irq handlers instead. > > 3. drivers/usb/gadget/udc/renesas_usbf.c > This one looks more invasive. However, still attempting to move them > to threaded irq handlers. > > As always, I'm looking forward to your feedback ! Moving those things to threaded IRQ handlers is a separate job. Let's get this issue fixed first. Alan Stern