Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp2525925pxa; Mon, 17 Aug 2020 11:50:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxM/H/FhmJXXCJG/FNHO8M9IkhysEMMHdlQLj2HE2YiPQyjDmfTGUr6p4y7LmwDqaAjPKXu X-Received: by 2002:a05:6402:2037:: with SMTP id ay23mr15902752edb.48.1597690214178; Mon, 17 Aug 2020 11:50:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1597690214; cv=none; d=google.com; s=arc-20160816; b=Rr7nR2DARYnrHauRMh9UPVUYyevwdJuHIyDYFO1LnPX1ohdjS04Uu4xzUjeDGDctCo OWTolo5hZ5DJI7XdBA9cnePBPPRyMNoncOgpeJE9MdAgVbvl40jCibKBJoP7IHWL12EX 1qTcfLkgXsPjKe+OPiBVVS13mAisCEApbqaixtG2Elk+btNwdnst4vK83A2ARv2i3PYt qgImSxh5j3qnaF6kZWBWnUTh/aGpKJZ/5QQnRNo8h3xHrfGOeasQhdUDChrTrGP22ZoR zgERr+pUukiqhEfE6fvGwjkHLWy6ccOstETvnyme8glq48+0f7KL+1pscn8WCZgS0iAX e4Zg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=daItpE4clyddA7rFTQDrT/Yb5IPEtymid6jOMJWwLNs=; b=jKZMXm70xxVhy4zycDxabIJ0VtM977JhqJtwVzYTV1FUxwzrTSriRwmm57LKhuGbjr F6xJnBOVU1NRAOGnOEvu5yydd5OR0iUn4xoiCI6aqY/bomMgNe1QesHI26UhIljJexsa XlcFFw2mnltZLOKWNCqWcQBjIUdB2pim/2mloH9/dlkYtGyjMCcSDfgBc4k31GF4BiYN kvWlsyTmfbkVksc/PwnUH6iDYnEeQXRQ1G1ik9THDaEwlDl2BJiulTXpDmtoCDTp9NA3 oAimn4eVOrzQ3OjDQzEW5Fqg60+GBTb/FSnhIExPVl430gOzKvVh5mlUR18NHY2D/A8A 2GpQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id f16si11272168ejx.464.2020.08.17.11.49.50; Mon, 17 Aug 2020 11:50:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388013AbgHQSr5 (ORCPT + 99 others); Mon, 17 Aug 2020 14:47:57 -0400 Received: from netrider.rowland.org ([192.131.102.5]:54675 "HELO netrider.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S2391397AbgHQSrz (ORCPT ); Mon, 17 Aug 2020 14:47:55 -0400 Received: (qmail 120920 invoked by uid 1000); 17 Aug 2020 14:47:53 -0400 Date: Mon, 17 Aug 2020 14:47:53 -0400 From: Alan Stern To: Jim Baxter Cc: Greg KH , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-usb@vger.kernel.org, "Resch Carsten \(CM/ESO6\)" , "Rosca, Eugeniu \(ADITG/ESB\)" Subject: Re: PROBLEM: Long Workqueue delays. Message-ID: <20200817184753.GA120209@rowland.harvard.edu> References: <71aafe68-7fe0-6b77-ea8e-83edd3f16c8d@mentor.com> <20200817115744.GA3985908@kroah.com> <57a7841d-86e3-b6df-1488-a252a68a9ee0@mentor.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57a7841d-86e3-b6df-1488-a252a68a9ee0@mentor.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 17, 2020 at 07:24:01PM +0100, Jim Baxter wrote: > On 17/08/2020 12:57, Greg KH wrote: > > On Mon, Aug 17, 2020 at 12:40:03PM +0100, Jim Baxter wrote: > >> We have issues with the workqueue of the kernel overloading the CPU 0 > >> when we we disconnect a USB stick. > >> > >> This results in other items on the shared workqueue being delayed by > >> around 6.5 seconds with a default kernel configuration and 2.3 seconds > >> on a config tailored for our RCar embedded platform. > >> > > > > Is this data really flushed out to the device? In the example, data was copied _from_ the device, not to it. > I am unsure if the delay is due to a single system or a combination of memory, > usb or filesystem operations, the delay also occurs if the device is mounted > as ro and using the sync option. > > Using umount stops the issue occurring but is unfortunately not guaranteed in > our particular system. Unplugging a R/W USB drive without unmounting it first is a great way to corrupt the data. > >> - Disconnect the USB stick: > >> [ 1551.796792] usb 2-1: USB disconnect, device number 2 > >> [ 1558.625517] DELAY: 6782 > >> > >> > >> The Delay output 6782 is in milliseconds. > > > > What USB workqueue is taking so long?> > > The one trying to deal with the filesystem flushing out the data that it > > can't do now that the device is removed? :) > > > From my analysis it is the hub_event workqueue shown to be using most of the CPU, > the kworker/0:1+usb thread uses around 98% of the CPU. > > I have traced the workqueue:workqueue_queue_work function while unplugging the USB > but not found a particular workqueue function being called a lot. > > Using perf Iidentified the hub_events workqueue was spending a lot of time in > invalidate_partition(), I have included a cut down the captured data from perf in > [2] which shows the additional functions where the kworker spends most of its time. invalidate_partition() is part of the block layer, not part of USB. It gets called whenever a drive is removed from the system, no matter what type of drive it is. You should ask the people involved in that subsystem why it takes so long. > I realise that not unmounting the USB stick is not ideal, though I wonder what > additional work is done when unplugging the USB stick compared to unmounting it. Unmounting a drive flushes all the dirty buffers from memory back to the drive. Obviously that can't be done if the drive is unplugged first. As far as the USB subsystem is concerned, exactly the same amount of work is done during disconnect regardless of whether or not the drive is mounted. (In fact, the USB subsystem doesn't even know whether a drive is mounted; that concept is part of the block and filesystem layers.) > I guess it may be waiting for a time-out during the operation without the unmount. That seems very unlikely. When a USB device gets unplugged the system realizes it. Any I/O meant for that device is immediately cancelled; there are no timeouts. (Okay, not strictly true; there is a fraction-of-a-second timeout during which the system waits to see whether the disconnect was permanent or just a temporary glitch. But you're talking about 6-second long delays.) Alan Stern