Date: Thu, 18 Aug 2022 11:24:41 +0800
From: Ming Lei
To: Chris Murphy
Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik
Subject: Re: stalling IO regression since linux 5.12, through 5.18
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 17, 2022 at 10:30:39PM -0400, Chris Murphy wrote:
>
>
> On Wed, Aug 17, 2022, at 9:03 PM, Ming Lei wrote:
> > On Wed, Aug 17, 2022 at 12:34:42PM -0400, Chris Murphy wrote:
> >>
> >>
> >> On Wed, Aug 17, 2022, at 11:34 AM, Ming Lei wrote:
> >>
> >> > From the 2nd log of blockdebugfs-all.txt, still not see any in-flight IO on
> >> > request based block devices, but sda is _not_ included in this log, and
> >> > only sdi, sdg and
> >> > sdf are collected, is that expected?
> >>
> >> While the problem was happening I did
> >>
> >> cd /sys/kernel/debug/block
> >> find . -type f -exec grep -aH . {} \;
> >>
> >> The file has the nodes out of order, but I don't know enough about the interface to see if there are things that are missing, or what it means.
> >>
> >>
> >> > BTW, all request based block devices should be observed in blk-mq debugfs.
> >>
> >> /sys/kernel/debug/block contains
> >>
> >> drwxr-xr-x.  2 root root 0 Aug 17 15:20 md0
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sda
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdb
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdc
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdd
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sde
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdf
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdg
> >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdh
> >> drwxr-xr-x.  4 root root 0 Aug 17 15:20 sdi
> >> drwxr-xr-x.  2 root root 0 Aug 17 15:20 zram0
> >
> > OK, so lots of devices are missed in your log, and the following command
> > is supposed to work for collecting log from all block device's debugfs:
> >
> > (cd /sys/kernel/debug/block/ && find . -type f -exec grep -aH . {} \;)
>
> OK here it is:
>
> https://drive.google.com/file/d/18nEOx2Ghsqx8uII6nzWpCFuYENHuQd-f/view?usp=sharing

The above log shows that the IO stall happens on sdd, where:

1) 616 requests are pending in the scheduler queue

grep "busy=" blockdebugfs-all2.txt | grep sdd | grep sched | awk -F "=" '{s+=$2} END {print s}'
616

2) 11 requests have been pending in ./sdd/hctx2/dispatch for more than 300 seconds

Recently we have seldom observed IO hangs from the dispatch list, except
for the following two:

https://lore.kernel.org/linux-block/20220803023355.3687360-1-yuyufen@huaweicloud.com/
https://lore.kernel.org/linux-block/20220726122224.1790882-1-yukuai1@huaweicloud.com/

BTW, what is the output of the following command?

(cd /sys/block/sdd/device && find . -type f -exec grep -aH . {} \;)

Also, the above log shows that host_tagset_enable support is still
crippled on v5.12. I guess the issue may not be triggered (or may be
pretty hard to trigger) after you update to d97e594c5166 ("blk-mq: Use
request queue-wide tags for tagset-wide sbitmap"), or v5.14.

thanks,
Ming
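
For anyone who wants to repeat the count in 1) without the grep/awk chain, here is a rough Python sketch of the same filter. It assumes the dump has the `path:key=value` line shape that `grep -aH` produces. The function name `sum_sched_busy` and the sample lines are made up for illustration and are not taken from the real log.

```python
import re
from typing import Iterable

# awk turns "$2" into a number by reading its leading digits; mimic that here.
_LEADING_INT = re.compile(r"\s*(\d+)")


def sum_sched_busy(lines: Iterable[str], device: str = "sdd") -> int:
    """Sum the busy= values on lines that mention both `device` and "sched"."""
    total = 0
    for line in lines:
        # Same three substring filters as: grep "busy=" | grep sdd | grep sched
        if "busy=" not in line or device not in line or "sched" not in line:
            continue
        # Same as awk -F "=" with $2: the text after the first "=".
        second_field = line.split("=")[1]
        match = _LEADING_INT.match(second_field)
        if match:
            total += int(match.group(1))
    return total


# Hypothetical sample lines, only to show which entries are counted.
SAMPLE = [
    "./sdd/hctx0/sched_tags:busy=3",
    "./sdd/hctx1/sched_tags:busy=4",
    "./sdd/hctx0/tags:busy=9",           # no "sched": skipped
    "./sda/hctx0/sched_tags:busy=5",     # other device: skipped for sdd
    "./sdd/hctx0/sched_tags:depth=256",  # no "busy=": skipped
]

print(sum_sched_busy(SAMPLE))
```

To run it against the real dump, pass an open file instead of the sample, e.g. `sum_sched_busy(open("blockdebugfs-all2.txt", errors="replace"))`. Like the grep chain, it matches plain substrings, so a device name that is a prefix of another would be counted too.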