Date: Sun, 11 Apr 2021 00:25:03 -0400
From: "Theodore Ts'o"
To: Wen Yang
Cc: riteshh, adilger@dilger.ca, linux-ext4@vger.kernel.org, jack@suse.cz,
	linux-kernel@vger.kernel.org, baoyou.xie@alibaba-inc.com
Subject: Re: [PATCH] ext4: add a configurable parameter to prevent endless loop in ext4_mb_discard_group_p
References: <20210409054733.avv3ofqpka4m6xe5@riteshh-domain>
	<0e0c2283-5eb9-e121-35b2-61dbccc8203b@linux.alibaba.com>
In-Reply-To: <0e0c2283-5eb9-e121-35b2-61dbccc8203b@linux.alibaba.com>
X-Mailing-List: linux-ext4@vger.kernel.org
On Sun, Apr 11, 2021 at 03:45:01AM +0800, Wen Yang wrote:
> At this time, some logs are lost. It is suspected that the hard disk
> itself is faulty.

If you have a kernel crash dump, that means you can extract the dmesg
buffer, correct?  Are there any I/O error messages in the kernel log?

What is the basis for the suspicion that the hard drive is faulty?
Kernel dmesg output?  Error reporting from smartctl?

> There are many hard disks on our server. Maybe we should not occupy
> 100% CPU for a long time just because one hard disk fails.

It depends on the nature of the hard drive failure.  How is it
failing?

One thing we do need to be careful about is that, in focusing on how
to prevent a failure caused by some particular (potentially extreme)
scenario, we don't cause problems in more common scenarios --- for
example, a heavily loaded server, and/or a file system that is almost
full, where multiple files are "fighting" over a small number of free
blocks.

In general, my attitude is that the best way to protect against hard
drive failures is to have processes which monitor the health of the
system.  If there is evidence of a failed drive, we immediately kill
all jobs which rely on that drive (which we call "draining" the
drive); and if a sufficiently large percentage of the drives have
failed, or the machine can no longer do its job, we automatically
move all of its jobs to other servers (i.e., "drain" the server) and
then send the machine to some kind of data center repair service,
where the failed hard drives can be replaced.

I'm skeptical of attempts to make the file system somehow continue to
"work" in the face of hard drive failures, since failures can be
highly atypical, and what works well in one failure scenario might be
catastrophic in another.  It's especially problematic if the HDD is
not explicitly signalling an error condition, but rather is being
slow (because it's doing a huge number of retries), or is returning
data which is simply different from what was previously written.  The
best we can do in that case is to detect that something is wrong
(this is where metadata checksums would be very helpful), and then
either remount the file system r/o, panic the machine, and/or signal
to userspace that the affected file system should be drained.

Cheers,

						- Ted
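
A minimal userspace sketch of the "monitor and drain" approach
discussed above (not from the original thread).  It assumes the
errors_count attribute that ext4 exports under /sys/fs/ext4/<dev>/
(a per-filesystem count of logged errors); the program name and the
drain decision are placeholders, since the real policy --- killing
jobs, migrating them, scheduling the machine for repair --- is
site-specific:

/*
 * ext4-health.c -- sketch of a health monitor: poll the per-filesystem
 * error count that ext4 exports in sysfs, and flag any device that has
 * logged errors as a candidate for draining.
 * Build: cc -o ext4-health ext4-health.c
 */
#include <stdio.h>

static long ext4_errors_count(const char *dev)
{
	char path[256];
	long count;
	FILE *f;

	/* <dev> is the name as it appears under /sys/fs/ext4/, e.g. sda1 */
	snprintf(path, sizeof(path), "/sys/fs/ext4/%s/errors_count", dev);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* not mounted as ext4, or attribute missing */
	if (fscanf(f, "%ld", &count) != 1)
		count = -1;
	fclose(f);
	return count;
}

int main(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++) {
		long n = ext4_errors_count(argv[i]);

		if (n > 0)
			printf("%s: %ld error(s) logged -- drain candidate\n",
			       argv[i], n);
		else if (n == 0)
			printf("%s: no errors recorded\n", argv[i]);
		else
			fprintf(stderr, "%s: errors_count not readable\n",
				argv[i]);
	}
	return 0;
}

Run it against the device names under /sys/fs/ext4/, e.g.
"./ext4-health sda1 dm-0".  A real monitoring agent would poll this
periodically and trigger the job-draining described above, rather
than just printing; the reaction to a confirmed failure (remount r/o,
panic, or drain) is governed separately by ext4's errors= mount
option and fleet policy.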