Received: by 2002:a05:6358:16cc:b0:ea:6187:17c9 with SMTP id r12csp10727510rwl; Mon, 2 Jan 2023 07:16:48 -0800 (PST) X-Google-Smtp-Source: AMrXdXsQVNPnYGUbxbOAgiufvs/yKSqCr7u2UDZefgTKh/ncabPNjln2Gf8x1Cc5FEbHJaDgNyZB X-Received: by 2002:a17:903:2312:b0:192:8c7f:2654 with SMTP id d18-20020a170903231200b001928c7f2654mr29487824plh.0.1672672607795; Mon, 02 Jan 2023 07:16:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1672672607; cv=none; d=google.com; s=arc-20160816; b=IFeB9PY4puKbCOpG0Cq6sHk+WqGgA67Rh3b420JLatgPuXpaQI+GiRRG1gLMRVvyau ZgOQhK4tTLHqQK96ee+rhaFva8+I64FGkZTRVBkK9+qtC0XtBYZ+qfTAHRbKkUbxQ5V8 fxr94MB0zvwQc7mU4ACC9iCHWdfvaRp0Fr91o5x9ErVTWY86X5iubMXLQ9xD1onbtvu3 4O4+WSBtRVYeBrPZna7k3a8SeS4fkM9Ybt8TVfnTeJB8SZja6i7vtRDixWOwmcg2I1Ib WBNQ3C20Hk6PYAz6yE38u2vQai00tL+I+4VLACzeiar8OwjDFeKb1b3E2MHCEjMQUhME Hxkg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=IxapWSzgjamitjV/jR1lNJ7sTPtmfx2vDo+BxWXSDM8=; b=UjDrS60Qd2ybAkPJSLSr58B38K+1fzvhzO7ktd8zFIdZqTCbV4eMxK8wCQRxadGtYA GSsiT53ZHTzk15TVtltYJ4M7u4yix/eSOvyLQD4IRGqHEaf61roRCpVzCphV8nRzLzGV Jus5/cVMdnTcRSxx1pBSwzgt2sJsGql3QCYcZ49/bnw0PZx+6wR+aAdTJ1uVogVzp7gV D8/rOI6k8BaRJ7+HwkLQbpZkdAiIURxdaySffPtTCi1VZfNa5fYCYcTRRJ2B5RWjzT16 AsAUvfUM6QGncP49MJiPbwc2xlMfvhb8r/IWrMyAdHTulcryVoAhb9sQMnyZsyo7d2iq gXCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass (test mode) header.i=@ideasonboard.com header.s=mail header.b=XuCfZKz4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id u11-20020a170903124b00b00188d892999esi35746850plh.521.2023.01.02.07.16.38; Mon, 02 Jan 2023 07:16:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass (test mode) header.i=@ideasonboard.com header.s=mail header.b=XuCfZKz4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236157AbjABO2P (ORCPT + 61 others); Mon, 2 Jan 2023 09:28:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232572AbjABO2N (ORCPT ); Mon, 2 Jan 2023 09:28:13 -0500 Received: from perceval.ideasonboard.com (perceval.ideasonboard.com [213.167.242.64]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0AE75FD1; Mon, 2 Jan 2023 06:28:11 -0800 (PST) Received: from pendragon.ideasonboard.com (213-243-189-158.bb.dnainternet.fi [213.243.189.158]) by perceval.ideasonboard.com (Postfix) with ESMTPSA id F37817C5; Mon, 2 Jan 2023 15:28:09 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com; s=mail; t=1672669690; bh=TocTMfPTqr/Rq4Pwi+HVWRSeFQ6SonKxJcmFc78E9yQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=XuCfZKz4diQjvh3yLYUcMCZVI3VGC0r6GbS2l/3E0yXC3uFIWAnciPKPYUr+tYck+ WwkJfXkj7FHui4obVog/Ocnag9rK6CvVxpnnKcb2vkizCHITWWTrtvyY/6XO31isSO +UalSEYPNpkCTMJUbZWpFBEoSBKFdPDA9RGhA+XI= Date: Mon, 2 Jan 2023 16:28:06 +0200 From: Laurent Pinchart To: Ricardo Ribalda Cc: Sergey Senozhatsky , Mauro Carvalho Chehab , Max Staudt , linux-media@vger.kernel.org, stable@vger.kernel.org, linux-kernel@vger.kernel.org, Yunke Cao Subject: Re: [PATCH v5] media: uvcvideo: Fix race condition with usb_kill_urb Message-ID: References: <20221212-uvc-race-v5-0-3db3933d1608@chromium.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20221212-uvc-race-v5-0-3db3933d1608@chromium.org> X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_PASS,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Ricardo, Thank you for the patch. On Mon, Jan 02, 2023 at 01:53:38PM +0100, Ricardo Ribalda wrote: > usb_kill_urb warranties that all the handlers are finished when it > returns, but does not protect against threads that might be handling > asynchronously the urb. > > For UVC, the function uvc_ctrl_status_event_async() takes care of > control changes asynchronously. > > If the code is executed in the following order: > > CPU 0 CPU 1 > ===== ===== > uvc_status_complete() > uvc_status_stop() > uvc_ctrl_status_event_work() > uvc_status_start() -> FAIL > > Then uvc_status_start will keep failing and this error will be shown: > > <4>[ 5.540139] URB 0000000000000000 submitted while active > drivers/usb/core/urb.c:378 usb_submit_urb+0x4c3/0x528 > > Let's improve the current situation, by not re-submiting the urb if > we are stopping the status event. Also process the queued work > (if any) during stop. > > CPU 0 CPU 1 > ===== ===== > uvc_status_complete() > uvc_status_stop() > uvc_status_start() > uvc_ctrl_status_event_work() -> FAIL > > Hopefully, with the usb layer protection this should be enough to cover > all the cases. > > Cc: stable@vger.kernel.org > Fixes: e5225c820c05 ("media: uvcvideo: Send a control event when a Control Change interrupt arrives") > Reviewed-by: Yunke Cao > Signed-off-by: Ricardo Ribalda > --- > uvc: Fix race condition on uvc > > Make sure that all the async work is finished when we stop the status urb. > > To: Yunke Cao > To: Sergey Senozhatsky > To: Max Staudt > To: Laurent Pinchart > To: Mauro Carvalho Chehab > Cc: linux-media@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > --- > Changes in v5: > - atomic_t do not impose barriers, use smp_mb() instead. (Thanks Laurent) > - Add an extra cancel_work_sync(). > - Link to v4: https://lore.kernel.org/r/20221212-uvc-race-v4-0-38d7075b03f5@chromium.org > > Changes in v4: > - Replace bool with atomic_t to avoid compiler reordering. > - First complete the async work and then kill the urb to avoid race (Thanks Laurent!) > - Link to v3: https://lore.kernel.org/r/20221212-uvc-race-v3-0-954efc752c9a@chromium.org > > Changes in v3: > - Remove the patch for dev->status, makes more sense in another series, and makes > the zero day less nervous. > - Update reviewed-by (thanks Yunke!). > - Link to v2: https://lore.kernel.org/r/20221212-uvc-race-v2-0-54496cc3b8ab@chromium.org > > Changes in v2: > - Add a patch for not kalloc dev->status > - Redo the logic mechanism, so it also works with suspend (Thanks Yunke!) > - Link to v1: https://lore.kernel.org/r/20221212-uvc-race-v1-0-c52e1783c31d@chromium.org > --- > drivers/media/usb/uvc/uvc_ctrl.c | 3 +++ > drivers/media/usb/uvc/uvc_status.c | 36 ++++++++++++++++++++++++++++++++++++ > drivers/media/usb/uvc/uvcvideo.h | 1 + > 3 files changed, 40 insertions(+) > > diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c > index c95a2229f4fa..5160facc8e20 100644 > --- a/drivers/media/usb/uvc/uvc_ctrl.c > +++ b/drivers/media/usb/uvc/uvc_ctrl.c > @@ -1442,6 +1442,9 @@ static void uvc_ctrl_status_event_work(struct work_struct *work) > > uvc_ctrl_status_event(w->chain, w->ctrl, w->data); > > + if (dev->flush_status) > + return; > + > /* Resubmit the URB. */ > w->urb->interval = dev->int_ep->desc.bInterval; > ret = usb_submit_urb(w->urb, GFP_KERNEL); > diff --git a/drivers/media/usb/uvc/uvc_status.c b/drivers/media/usb/uvc/uvc_status.c > index 7518ffce22ed..5911e63776e1 100644 > --- a/drivers/media/usb/uvc/uvc_status.c > +++ b/drivers/media/usb/uvc/uvc_status.c > @@ -6,6 +6,7 @@ > * Laurent Pinchart (laurent.pinchart@ideasonboard.com) > */ > > +#include > #include > #include > #include > @@ -309,5 +310,40 @@ int uvc_status_start(struct uvc_device *dev, gfp_t flags) > > void uvc_status_stop(struct uvc_device *dev) > { > + struct uvc_ctrl_work *w = &dev->async_ctrl; > + > + /* From this point, the status urb is not re-queued */ > + dev->flush_status = 1; flush_status is now a bool, set it to true instead of 1. Same below for false instead of 0. > + /* > + * Make sure that the other CPUs are aware of the new value of > + * flush_status. > + */ > + smp_mb(); /* * Prevent to asynchronous control handler from requeing the URB. The * barrier is needed the flush_status change is visible to other CPUs * running the asynchronous handler before usb_kill_urb() is called * below. */ dev->flush_status = true; smp_mb(); > + > + /* If there is any status event on the queue, process it. */ > + if (cancel_work_sync(&w->work)) > + uvc_ctrl_status_event(w->chain, w->ctrl, w->data); > + > + /* Kill the urb. */ > usb_kill_urb(dev->int_urb); > + > + /* > + * If an status event was queued between cancel_work_sync() and > + * usb_kill_urb(), process it. > + */ /* * The URB completion handler may have queued asynchronous work. This * won't resubmit the URB as flush_status is set, but it needs to be * cancelled before returning or it could then race with a future * uvc_status_start() call. */ > + if (cancel_work_sync(&w->work)) > + uvc_ctrl_status_event(w->chain, w->ctrl, w->data); I think I'd still prefer checking flush_status in uvc_ctrl_status_event_async() and drop the event, as it would be simpler here. uvc_status_stop() is called when the last user indicates it's not interested in events anymore (by closing the device at the moment, possibly by unsubscribing from events in the future), so dropping the event isn't a problem as far as I can tell. What do you think ? > + > + /* > + * From this point, there are no events on the queue and the status urb s/urb/URB/ > + * is dead, this is, no events will be queued until uvc_status_start() > + * is called. > + */ > + dev->flush_status = 0; > + /* > + * Write to memory the value of flush_status before uvc_status_start() > + * is called again, s/,/./ > + */ > + smp_mb(); /* * From this point, the status URB is dead, and no asynchronous work is * queued. No event will be processed until uvc_status_start() is * called. Reset flush_status and make it visible to the asynchronous * handler before uvc_status_start() requeues the status URB. */ dev->flush_status = 0; smp_mb(); The barrier here is most likely overkill given that I'd be very surprised if a URB could be queued, followed by a work item being queued, without any memory barrier, but it's good to be explicit I suppose :-) > + Drop the blank line. > } > diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h > index df93db259312..6a9b72d6789e 100644 > --- a/drivers/media/usb/uvc/uvcvideo.h > +++ b/drivers/media/usb/uvc/uvcvideo.h > @@ -560,6 +560,7 @@ struct uvc_device { > struct usb_host_endpoint *int_ep; > struct urb *int_urb; > u8 *status; > + bool flush_status; > struct input_dev *input; > char input_phys[64]; > > > --- > base-commit: 0ec5a38bf8499f403f81cb81a0e3a60887d1993c > change-id: 20221212-uvc-race-09276ea68bf8 -- Regards, Laurent Pinchart