Date: Fri, 27 Jul 2018 20:18:23 -0400
From: "Theodore Y. Ts'o"
To: Sodagudi Prasad
Cc: adilger.kernel@dilger.ca, wen.xu@gatech.edu, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Remounting filesystem read-only
Message-ID: <20180728001823.GA28432@thunk.org>

On Fri, Jul 27, 2018 at 01:34:31PM -0700, Sodagudi Prasad wrote:
> > The error should be pretty clear: "Inode table for bg 0 marked as
> > needing zeroing".  That should never happen.
>
> Can you provide any debug patch to detect when this corruption is happening?
> Source of this corruption and how this is partition getting corrupted?
> Or which file system operation lead to this corruption?

Do you have a reliable repro?  If it's a one-off, it can be caused by
*anything*.  Crappy hardware, a bug in some proprietary, binary-only
GPU driver dereferencing some wild pointer that corrupts kernel
memory, etc.

Asking for a debug patch is like asking for "can you create
technology that can detect when a cockroach enters my house?"

So if you have a reliable repro, then we know what operations might be
triggering the corruption.  Then you work on creating a minimal repro,
and only *then*, when we have a restricted set of possibilities that
might be the cause, does a debug patch make sense.  (For example, if
removing a GPU call makes the problem go away, then the patch would
need to be in the proprietary GPU driver....)

> I am digging code a bit around this warning to understand more.

The warning means that a flag in block group descriptor #0 is set that
should never be set.  How did the flag get set?  There are any number
of things that could cause that.

You might want to look at the block group descriptor via dumpe2fs or
debugfs, to see if it's just a single bit getting flipped, or if the
entire block group descriptor is garbage.

Note that under normal code paths, the flag *never* gets set by ext4
kernel code.  The flag will get set on non-block group 0 block group
descriptors by ext4, and the ext4 kernel code will only clear the
flag.

Of course, if there is a bug in some driver that dereferences a
pointer wildly, all bets are off.

					- Ted
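
As an illustration of the single-bit-versus-garbage check suggested above, here is a minimal Python sketch. It assumes you have already read the 16-bit bg_flags value for the descriptor (for example from debugfs or dumpe2fs output) and that you have a known-good value to compare against. The flag constants are the ones from the ext4 on-disk format; the helper names and the way results are classified are hypothetical and are not part of e2fsprogs or the kernel.

```python
# Flag bits of the bg_flags field in an ext4 block group descriptor.
EXT4_BG_INODE_UNINIT = 0x0001  # inode table and bitmap are not initialized
EXT4_BG_BLOCK_UNINIT = 0x0002  # block bitmap is not initialized
EXT4_BG_INODE_ZEROED = 0x0004  # inode table has been zeroed

KNOWN_FLAGS = {
    EXT4_BG_INODE_UNINIT: "INODE_UNINIT",
    EXT4_BG_BLOCK_UNINIT: "BLOCK_UNINIT",
    EXT4_BG_INODE_ZEROED: "INODE_ZEROED",
}
KNOWN_MASK = sum(KNOWN_FLAGS)  # 0x0007


def decode_flags(bg_flags):
    """Return (names of known flags that are set, bits outside the known set)."""
    names = [name for bit, name in KNOWN_FLAGS.items() if bg_flags & bit]
    unknown = bg_flags & 0xFFFF & ~KNOWN_MASK
    return names, unknown


def differing_bits(observed, expected):
    """Count how many of the 16 flag bits differ between two values."""
    return bin((observed ^ expected) & 0xFFFF).count("1")


def classify(observed, expected):
    """Hypothetical heuristic: one differing bit looks like a bit flip,
    several differing bits suggest the field was overwritten."""
    n = differing_bits(observed, expected)
    if n == 0:
        return "matches expected value"
    if n == 1:
        return "single bit differs"
    return "multiple bits differ"


if __name__ == "__main__":
    # 0x0004 as the expected value is an assumption for illustration only;
    # use whatever a known-good copy of the descriptor shows.
    for observed in (0x0004, 0x0000, 0xBEEF):
        print(hex(observed), decode_flags(observed), classify(observed, 0x0004))
```

For instance, classify(0x0000, 0x0004) reports a single differing bit, while a value like 0xBEEF against the same expected value differs in many bits and also has bits set outside the known flags, which points at the whole descriptor having been overwritten rather than one bit flipping. The same comparison is worth repeating on the other fields of the descriptor before drawing conclusions.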