Date: Wed, 16 Aug 2023 15:06:01 +0000
From: "Dr. David Alan Gilbert"
To: Jens Axboe
Cc: Theodore Ts'o, hch@lst.de, adilger.kernel@dilger.ca, song@kernel.org,
    linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-ext4@vger.kernel.org
Subject: Re: 6.5.0rc5 fs hang - ext4? raid?
References: <20230815125146.GA1508930@mit.edu> <324fc71c-dead-4418-af81-6817e1f41c39@kernel.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

* Dr. David Alan Gilbert (dave@treblig.org) wrote:
> * Jens Axboe (axboe@kernel.dk) wrote:
> > On 8/15/23 7:31 PM, Dr. David Alan Gilbert wrote:
> > > (Copying in Christoph and Jens)
> > >
> > > * Dr. David Alan Gilbert (dave@treblig.org) wrote:
> > >> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
> > >>> * Theodore Ts'o (tytso@mit.edu) wrote:
> > >>>> On Mon, Aug 14, 2023 at 09:02:53PM +0000, Dr. David Alan Gilbert wrote:
> > >>>>> dg 29594 29592 0 18:40 pts/0 00:00:00 /usr/bin/ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/13/liblto_plugin.so -csrDT src/intel/perf/libintel_perf.a src/intel/perf/libintel_perf.a.p/meson-generated_.._intel_perf_metrics.c.o src/intel/perf/libintel_perf.a.p/intel_perf.c.o src/intel/perf/libintel_perf.a.p/intel_perf_query.c.o src/intel/perf/libintel_perf.a.p/intel_perf_mdapi.c.o
> > >>>>>
> > >>>>> [root@dalek dg]# cat /proc/29594/stack
> > >>>>> [<0>] md_super_wait+0xa2/0xe0
> > >>>>> [<0>] md_bitmap_unplug+0xd2/0x120
> > >>>>> [<0>] flush_bio_list+0xf3/0x100 [raid1]
> > >>>>> [<0>] raid1_unplug+0x3b/0xb0 [raid1]
> > >>>>> [<0>] __blk_flush_plug+0xd7/0x150
> > >>>>> [<0>] blk_finish_plug+0x29/0x40
> > >>>>> [<0>] ext4_do_writepages+0x401/0xc90
> > >>>>> [<0>] ext4_writepages+0xad/0x180
> > >>>>
> > >>>> If you wait a few seconds and try grabbing cat /proc/29594/stack
> > >>>> again, does the stack trace stay consistent as above?
> > >>>
> > >>> I'll get back to that and retry it.
> > >>
> > >> Yeh, the stack is consistent; this time around it's an 'ar' in a kernel
> > >> build:
> > >>
> > >> [root@dalek dg]# cat /proc/17970/stack
> > >> [<0>] md_super_wait+0xa2/0xe0
> > >> [<0>] md_bitmap_unplug+0xad/0x120
> > >> [<0>] flush_bio_list+0xf3/0x100 [raid1]
> > >> [<0>] raid1_unplug+0x3b/0xb0 [raid1]
> > >> [<0>] __blk_flush_plug+0xd7/0x150
> > >> [<0>] blk_finish_plug+0x29/0x40
> > >> [<0>] ext4_do_writepages+0x401/0xc90
> > >> [<0>] ext4_writepages+0xad/0x180
> > >> [<0>] do_writepages+0xd2/0x1e0
> > >> [<0>] filemap_fdatawrite_wbc+0x63/0x90
> > >> [<0>] __filemap_fdatawrite_range+0x5c/0x80
> > >> [<0>] ext4_release_file+0x74/0xb0
> > >> [<0>] __fput+0xf5/0x2a0
> > >> [<0>] task_work_run+0x5d/0x90
> > >> [<0>] exit_to_user_mode_prepare+0x1e6/0x1f0
> > >> [<0>] syscall_exit_to_user_mode+0x1b/0x40
> > >> [<0>] do_syscall_64+0x6c/0x90
> > >> [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > >> [root@dalek dg]# cat /proc/17970/stack
> > >> [<0>] md_super_wait+0xa2/0xe0
> > >> [<0>] md_bitmap_unplug+0xad/0x120
> > >> [<0>] flush_bio_list+0xf3/0x100 [raid1]
> > >> [<0>] raid1_unplug+0x3b/0xb0 [raid1]
> > >> [<0>] __blk_flush_plug+0xd7/0x150
> > >> [<0>] blk_finish_plug+0x29/0x40
> > >> [<0>] ext4_do_writepages+0x401/0xc90
> > >> [<0>] ext4_writepages+0xad/0x180
> > >> [<0>] do_writepages+0xd2/0x1e0
> > >> [<0>] filemap_fdatawrite_wbc+0x63/0x90
> > >> [<0>] __filemap_fdatawrite_range+0x5c/0x80
> > >> [<0>] ext4_release_file+0x74/0xb0
> > >> [<0>] __fput+0xf5/0x2a0
> > >> [<0>] task_work_run+0x5d/0x90
> > >> [<0>] exit_to_user_mode_prepare+0x1e6/0x1f0
> > >> [<0>] syscall_exit_to_user_mode+0x1b/0x40
> > >> [<0>] do_syscall_64+0x6c/0x90
> > >> [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > >>
> > >>>> Also, if you have iostat installed (usually part of the sysstat
> > >>>> package), does "iostat 1" show any I/O activity on the md device?
> > >>
> > >> iostat is showing something odd, most devices are at 0,
> > >> except for 3 of the dm's that are stuck at 100% utilisation with
> > >> apparently nothing going on:
> > >>
> > >> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> > >>            0.06    0.00    0.03   53.06    0.00   46.84
> > >>
> > >> Device   r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz   %util
> > >> ...
> > >> dm-16   0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00  100.00
> > >> dm-17   0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00  100.00
> > >> dm-18   0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00    0.00
> > >> dm-19   0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00    0.00
> > >> dm-2    0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00    0.00
> > >> dm-20   0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00  100.00
> > >> ....
> > >>
> > >> dm-20 is the /dev/mapper/main-more which is the RAID on which the
> > >> fs runs, 16 and 17 are main-more_rmeta_0 and main-more_rimage_0
> > >> so something screwy is going on there.
> > >
> > > I've just finished a bisect of this hang, and got to:
> > >
> > > 615939a2ae734e3e68c816d6749d1f5f79c62ab7 is the first bad commit
> > > commit 615939a2ae734e3e68c816d6749d1f5f79c62ab7
> > > Author: Christoph Hellwig
> > > Date:   Fri May 19 06:40:48 2023 +0200
> > >
> > >     blk-mq: defer to the normal submission path for post-flush requests
> > >
> > >     Requests with the FUA bit on hardware without FUA support need a post
> > >     flush before returning to the caller, but they can still be sent using
> > >     the normal I/O path after initializing the flush-related fields and
> > >     end I/O handler.
> > >
> > >     Signed-off-by: Christoph Hellwig
> > >     Reviewed-by: Bart Van Assche
> > >     Link: https://lore.kernel.org/r/20230519044050.107790-6-hch@lst.de
> > >     Signed-off-by: Jens Axboe
> >
> > Can you try and pull in:
> >
> > https://git.kernel.dk/cgit/linux/commit/?h=block-6.5&id=5ff3213a5387e076af2b87f796f94b36965e8c3a
> >
> > and see if that helps?
>
> Yes it seems to fix it - thanks!

Dave

> Dave
>
> > --
> > Jens Axboe
> >
> --
>  -----Open up your eyes, open up your mind, open up your code -------
> / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
> \        dave @ treblig.org |                               | In Hex /
>  \ _________________________|_____ http://www.treblig.org   |_______/
-- 
 -----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
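
For anyone repeating the two checks used in this thread (is the /proc/<pid>/stack trace the same on a second capture, and which devices sit at 100% utilisation with no I/O), here is a minimal sketch in Python. It is not from the thread: the function names stack_frames, same_stack and stuck_devices are made up for illustration, the 99% threshold is an arbitrary choice, and the parser assumes extended iostat output with the column names shown above (r/s, w/s, d/s, f/s, %util), which vary between sysstat versions.

```python
import re


def stack_frames(stack_text):
    """Extract 'symbol+offset/size' entries from /proc/<pid>/stack text.

    Module suffixes such as '[raid1]' are ignored. re.search is used so
    that lines pasted with mail quote prefixes still parse.
    """
    frames = []
    for line in stack_text.splitlines():
        match = re.search(r"\[<[0-9a-fA-Fx]+>\]\s+(\S+)", line)
        if match:
            frames.append(match.group(1))
    return frames


def same_stack(first, second):
    """True if both captures contain the same non-empty list of frames."""
    a, b = stack_frames(first), stack_frames(second)
    return bool(a) and a == b


def stuck_devices(iostat_text, threshold=99.0):
    """Return devices at or above `threshold` %util with no request rate.

    Assumption: a 'Device' header row names the columns, as in the
    output quoted in this thread. Rows that do not match the header
    width (avg-cpu lines, '...') are skipped.
    """
    header = None
    stuck = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("Device", "Device:"):
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue
        try:
            row = dict(zip(header[1:], (float(v) for v in fields[1:])))
        except ValueError:
            continue
        idle = all(row.get(k, 0.0) == 0.0 for k in ("r/s", "w/s", "d/s", "f/s"))
        if idle and row.get("%util", 0.0) >= threshold:
            stuck.append(fields[0])
    return stuck
```

Comparing symbol plus offset rather than just the symbol name is a deliberate choice here: a task that is making slow progress may show the same functions at different offsets, while a hung one should not move at all.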