Date: Fri, 28 Apr 2023 09:33:20 +1000
From: Dave Chinner
To: Ming Lei
Cc: Theodore Ts'o, linux-ext4@vger.kernel.org, Andreas Dilger,
 linux-block@vger.kernel.org, Andrew Morton,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Eric Sandeen,
 Christoph Hellwig, Zhang Yi
Subject: Re: [ext4 io hang] buffered write io hang in balance_dirty_pages
X-Mailing-List: linux-ext4@vger.kernel.org

On Thu, Apr 27, 2023 at 10:20:28AM +0800, Ming Lei wrote:
> Hello Guys,
>
> I got one report in which buffered write IO hangs in balance_dirty_pages,
> after one nvme block device is unplugged physically, then umount can't
> succeed.

The bug here is that the device unplug code has not told the
filesystem that it's gone away permanently.
This is the same problem we've been having for the past 15 years -
when a block device goes away permanently it leaves the filesystem
and everything else dependent on the block device completely unaware
that they are unable to function anymore. IOWs, the block device
remove path is relying on -unreliable side effects- of filesystem IO
error handling to produce what we'd call "correct behaviour".

The block device needs to be shutting down the filesystem when it
has some sort of fatal, unrecoverable error like this (e.g. hot
unplug). We have the XFS_IOC_GOINGDOWN ioctl for telling the
filesystem it can't function anymore. This ioctl
(_IOR('X',125,__u32)) has also been replicated into ext4, f2fs and
CIFS and it gets exercised heavily by fstests. Hence this isn't XFS
specific functionality, nor is it untested functionality.

The ioctl should be lifted to the VFS as FS_IOC_SHUTDOWN and a
super_operations method added to trigger a filesystem shutdown.
That way the block device removal code could simply call
sb->s_ops->shutdown(sb, REASON) if it exists rather than
sync_filesystem(sb) if there's a superblock associated with the
block device. Then all these

This way we won't have to spend another two decades of people
complaining about how applications and filesystems hang when they
pull the storage device out from under them and the filesystem
didn't do something that made it notice before the system hung....

> So far only observed on ext4 FS, not see it on XFS.

Pure dumb luck - a journal IO failed on XFS (probably during the
sync_filesystem() call) and that shut the filesystem down.

> I guess it isn't
> related with disk type, and not tried such test on other type of disks yet,
> but will do.

It can happen on any block device based storage that gets pulled
from under any filesystem without warning.

> Seems like dirty pages aren't cleaned after ext4 bio is failed in this
> situation?

Yes, because the filesystem wasn't shut down on device removal to
tell it that it's allowed to toss away dirty pages as they cannot be
cleaned via the IO path....

-Dave.
-- 
Dave Chinner
dchinner@redhat.com
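
For readers unfamiliar with the shutdown ioctl discussed in this
mail, here is a minimal userspace sketch of how it can be issued
against a mounted filesystem. It is an illustration under stated
assumptions, not code from the thread: the DEMO_* names and
demo_fs_shutdown() are hypothetical, the request number is the
_IOR('X',125,__u32) encoding quoted above, and the flag values are
the ones XFS and ext4 define for this ioctl. The caller needs
CAP_SYS_ADMIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical local name; same encoding as XFS_IOC_GOINGDOWN. */
#define DEMO_FS_IOC_SHUTDOWN _IOR('X', 125, uint32_t)

/* Flag values as defined for this ioctl by XFS and ext4. */
#define DEMO_GOING_FLAGS_DEFAULT    0x0u /* flush data and log, then shut down */
#define DEMO_GOING_FLAGS_LOGFLUSH   0x1u /* flush the log but not data */
#define DEMO_GOING_FLAGS_NOLOGFLUSH 0x2u /* flush neither log nor data */

/*
 * Ask the filesystem that contains `path` to shut itself down.
 * Returns 0 on success, -1 on failure with errno set.
 */
int demo_fs_shutdown(const char *path, uint32_t flags)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Despite the _IOR encoding, the kernel reads the flags word. */
    int ret = ioctl(fd, DEMO_FS_IOC_SHUTDOWN, &flags);

    /* Preserve the ioctl's errno across close(). */
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Calling demo_fs_shutdown("/mnt/test", DEMO_GOING_FLAGS_NOLOGFLUSH)
on a scratch mount (the path is only an example) approximates what a
device-removal path would want: the filesystem stops issuing IO and
fails further operations until it is unmounted and mounted again.
Only try this on a filesystem you can afford to lose.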