Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp459331pxa; Wed, 19 Aug 2020 06:22:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyc2YuClq3XrEznhpt9btjNXwrkUjIOrdAWMxSreyVEjvNN6rAEZKTZh9PAgOoSk6SN5JPj X-Received: by 2002:aa7:c1ca:: with SMTP id d10mr25492344edp.261.1597843356419; Wed, 19 Aug 2020 06:22:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1597843356; cv=none; d=google.com; s=arc-20160816; b=MIsXxWMDLbFm9c20IdXqBAcov3kA27SPoZSWJq/SQVqTtIvJDyFxrzJOPdrcFhlJYU sYOz39FEWXb8NB4YnoSNVIX+4MSZEQ1839BB5bSxnCyAEHF4B20qRKKFC+Rsvev/mepw LWCkutcxUqsw6o3tjp2R0HpEnnnOijdr84MPMimF2N6gloqPJJIJyPkQ/5a8CLPMGP7F KW6JBiNX55AL0/bDnfWGceRiJYmQgQq3FrODMhyx34hK8ZyWm2/YFdBC2PJ2BViBvivt ajlTfbyf7MHn4tvFwYVJU8XFnQsMZA4d8sfJE3GOzcOEyNuuqOyfC00pRd6NN2wHWjnf RCIA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:references:cc:to:from:subject:ironport-sdr:ironport-sdr; bh=ate8tfVy+AeCXj6MA7RO+S+rqEUeDk7LZD17ai1UygI=; b=S7Z8lFRIVHf8wXPMmNZJcov1ZOM8+IUJhQE8B0FWjwc7k6eNcdmUpt2v5KyREwTDs5 uhEnkxibRvqE0x1nFMmjrqpgUVCCNmhzAz62p5FIYjH417G72C18VPP2a1DCZrnymQp0 C1CZu7448G0wjT5Cz5wdxTRWNQeE+YEryzTRV36yVgt9uvdCFTIf4LBrB+NTfXc65qTQ i41ijP6r2eqjzJEBDWwLWu5RJfF7c/4y2/2HK/GYLa61J7okZf1v/bdXdfGBinHx8Qhm 3CqKyI9iRlNguCQv/tgM5fvVdcgA/hFsw4Um8b+3dYEuxxwqvIcX3KeWtyzVbXYSi3UF oAXQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p20si15677912ejl.1.2020.08.19.06.22.12; Wed, 19 Aug 2020 06:22:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727932AbgHSNQU (ORCPT + 99 others); Wed, 19 Aug 2020 09:16:20 -0400 Received: from esa1.mentor.iphmx.com ([68.232.129.153]:2433 "EHLO esa1.mentor.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728467AbgHSNMk (ORCPT ); Wed, 19 Aug 2020 09:12:40 -0400 IronPort-SDR: sgnuXoSGnZA8kGpvGIJRHMLbPKb87irF5jERhhiuQ4E53DLWD/wc/7AuVYiuwkD4awUdrM06vS +K28wDIOL2VRQ1TT/dxMWXIftGcIVFH9NP9YlzPxgppY2gXfeU2f4kLntSkm3iff3QNp/a+mBP bK3rWLvzwSRhI6qrmngYIJhCIfzWDlB9IhpGiroWr28f4bOMLYh9J6U52NNYtmiGw72DT38yVh g2pp33Mst8231BvuWs8jAM+HL9rqyDRfSUFH0OnVXkrzZ285gsEnyZE9TCRjf1SdT6ft3prO4K XEs= X-IronPort-AV: E=Sophos;i="5.76,331,1592899200"; d="scan'208";a="54241395" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa1.mentor.iphmx.com with ESMTP; 19 Aug 2020 05:12:30 -0800 IronPort-SDR: ESPCdr3PB5nacUlfmLmN0RuCzcFWoXVmFTtIbiTTLRwe7ondCoy5uD3yMmdCR5+pGO962D9+rh g/NAgyTcsAK+gOnQy/2cyTDWJAPCaBAtAQmXHnk56xNzuSvZEMLMvxzQ4Vkx4VTdxYOj3Q9S65 bTq8Lz9NBfWOXVxjnoyT5pyICOxgKsfiOYfFb9+7jeAknZHsMS2WEATQ2qLd+A3tvB7OEOYT+F 0N3bL90Mf3vp/LrPQ030ufzyA5F55orpIzuA8SwsZeymuFofeWyPtQAkoa7omoXaWhmiGXfnsK ud0= Subject: Re: PROBLEM: Long Workqueue delays V2 From: Jim Baxter To: , , , CC: "Resch Carsten (CM/ESO6)" , "Rosca, Eugeniu (ADITG/ESB)" References: <625615f2-3a6b-3136-35f9-2f2fb3c110cf@mentor.com> Message-ID: <066753ec-eddc-d7f6-5cc8-fe282baba6ec@mentor.com> Date: Wed, 19 Aug 2020 14:12:24 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <625615f2-3a6b-3136-35f9-2f2fb3c110cf@mentor.com> Content-Type: text/plain; charset="utf-8" Content-Language: en-GB Content-Transfer-Encoding: 7bit X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: SVR-IES-MBX-08.mgc.mentorg.com (139.181.222.8) To SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Added linux-block List which may also be relevant to this issue. -------- Original Message -------- Subject: PROBLEM: Long Workqueue delays V2 From: Jim Baxter To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org CC: "Resch Carsten (CM/ESO6)" , "Rosca, Eugeniu (ADITG/ESB)" Date: Tue, 18 Aug 2020 12:58:13 +0100 > I am asking this question again to include the fs-devel list. > > > We have issues with the workqueue of the kernel overloading the CPU 0 > when we we disconnect a USB stick. > > This results in other items on the shared workqueue being delayed by > around 6.5 seconds with a default kernel configuration and 2.3 seconds > on a config tailored for our RCar embedded platform. > > > > We first noticed this issue on custom hardware and we have recreated it > on an RCar Starter Kit using a test module [1] to replicate the > behaviour, the test module outputs any delays of greater then 9ms. > > To run the test we have a 4GB random file on a USB stick and perform > the following test. > The stick is mounted as R/O and we are copying data from the stick: > > - Mount the stick. > mount -o ro,remount /dev/sda1 > > - Load the Module: > # taskset -c 0 modprobe latency-mon > > - Copy large amount of data from the stick: > # dd if=/run/media/sda1/sample.txt of=/dev/zero > [ 1437.517603] DELAY: 10 > 8388607+1 records in > 8388607+1 records out > > > - Disconnect the USB stick: > [ 1551.796792] usb 2-1: USB disconnect, device number 2 > [ 1558.625517] DELAY: 6782 > > > The Delay output 6782 is in milliseconds. > > > > Using umount stops the issue occurring but is unfortunately not guaranteed > in our particular system. > > > From my analysis the hub_event workqueue kworker/0:1+usb thread uses around > 98% of the CPU. > > I have traced the workqueue:workqueue_queue_work function while unplugging the USB > and there is no particular workqueue function being executed a lot more then the > others for the kworker/0:1+usb thread. > > > Using perf I identified the hub_events workqueue was spending a lot of time in > invalidate_partition(), I have included a cut down the captured data from perf in > [2] which shows the additional functions where the kworker spends most of its time. > > > I am aware there will be delays on the shared workqueue, are the delays > we are seeing considered normal? > > > Is there any way to mitigate or identify where the delay is? > I am unsure if this is a memory or filesystem subsystem issue. > > > Thank you for you help. > > Thanks, > Jim Baxter > > [1] Test Module: > // SPDX-License-Identifier: GPL-2.0 > /* > * Simple WQ latency monitoring > * > * Copyright (C) 2020 Advanced Driver Information Technology. > */ > > #include > #include > #include > > #define PERIOD_MS 100 > > static struct delayed_work wq; > static u64 us_save; > > static void wq_cb(struct work_struct *work) > { > u64 us = ktime_to_us(ktime_get()); > u64 us_diff = us - us_save; > u64 us_print = 0; > > if (!us_save) > goto skip_print; > > > us_print = us_diff / 1000 - PERIOD_MS; > if (us_print > 9) > pr_crit("DELAY: %lld\n", us_print); > > skip_print: > us_save = us; > schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS)); > } > > static int latency_mon_init(void) > { > us_save = 0; > INIT_DELAYED_WORK(&wq, wq_cb); > schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS)); > > return 0; > } > > static void latency_mon_exit(void) > { > cancel_delayed_work_sync(&wq); > pr_info("%s\n", __func__); > } > > module_init(latency_mon_init); > module_exit(latency_mon_exit); > MODULE_AUTHOR("Eugeniu Rosca "); > MODULE_LICENSE("GPL"); > > > [2] perf trace: > 95.22% 0.00% kworker/0:2-eve [kernel.kallsyms] > | > ---ret_from_fork > kthread > worker_thread > | > --95.15%--process_one_work > | > --94.99%--hub_event > | > --94.99%--usb_disconnect > > | > --94.90%--invalidate_partition > __invalidate_device > | > |--64.55%--invalidate_bdev > | | > | --64.13%--invalidate_mapping_pages > | | > | |--24.09%--invalidate_inode_page > | | | > | | --23.44%--remove_mapping > | | | > | | --23.20%--__remove_mapping > | | | > | | --21.90%--arch_local_irq_restore > | | > | |--22.44%--arch_local_irq_enable > | > --30.35%--shrink_dcache_sb > > | > --30.17%--truncate_inode_pages_range >