Message-ID: <0c2de951-cd14-f1c7-fd9b-697563ad8092@huawei.com>
Date: Mon, 23 Oct 2023 21:40:44 +0800
Subject: Re: [GIT PULL] ext2, quota, and udf fixes for 6.6-rc1
From: Baokun Li
To: Andy Shevchenko
CC: Linus Torvalds, Josh Poimboeuf, Jan Kara, Nathan Chancellor, Nick Desaulniers, Kees Cook, Ferry Toth, yangerkun, Baokun Li
References: <20231019164240.lhg5jotsh6vfuy67@treble> <826dbab6-f6e0-fc02-e5d3-141c00a2a141@huawei.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Hello!

On 2023/10/23 20:19, Andy Shevchenko wrote:
> On Sat, Oct 21, 2023 at 09:48:38AM +0800, Baokun Li wrote:
>> On 2023/10/20 23:06, Andy Shevchenko wrote:
>>> On Fri, Oct 20, 2023 at 05:51:59PM +0300, Andy Shevchenko wrote:
>>>> On Thu, Oct 19, 2023 at 11:43:47AM -0700, Linus Torvalds wrote:
> ...
>
>>>> I even rebuilt again with just rebased on top of e64db1c50eb5 and it doesn't
>>>> boot, so we found the culprit that triggers this issue.
>> This patch does not seem to cause this problem. Just like linus said, this patch
>> has only two slight differences from the previous:
>> 1) Change "if (err)" to "if (err < 0)"
>>     In all the implementations of dq_op->write_dquot(), the returned value of err
>>     is not greater than 0. Therefore, this does not cause behavior inconsistency.
>> 2) Adding quota_error()
>>     quota_error() does not seem to cause a boot failure.
>>
>> Also, you mentioned that the root file system is initramfs. If no other file system
>> that supports quota is automatically mounted during startup, it seems that quota
>> does not cause this problem logically.
>>
>> In my opinion, as Josh mentioned, replace the CONFIG_DEBUG_LIST related
>> BUG()/BUG_ON() with WARN_ON(), and then check whether the system can be
>> started normally. If yes, it indicates that the panic is caused by the list
>> corruption, then, check for the items that may damage the list. If WARN_ON()
>> is recorded in the dmesg log of the machine after the startup, it is easier
>> to locate the problem.
> I mentioned that I have checked that, but okay, lemme double check it.
> I took the test-mrfld-jr branch and applied that patch on top.
> And as expected no luck.

By "okay", do you mean that after replacing BUG()/BUG_ON() with WARN_ON() the
machine boots normally but you don't see any WARN_ON() output, or that the
replacement has no effect and it still fails to boot?

> fstab I have, btw is this
>
> $ cat output/target/etc/fstab
> #
> /dev/root / ext2 rw,noauto 0 1
> proc /proc proc defaults 0 0
> devpts /dev/pts devpts defaults,gid=5,mode=620,ptmxmode=0666 0 0
> tmpfs /dev/shm tmpfs mode=0777 0 0
> tmpfs /tmp tmpfs mode=1777 0 0
> tmpfs /run tmpfs mode=0755,nosuid,nodev 0 0
> sysfs /sys sysfs defaults 0 0
>
> Not sure if /dev/root affects this all, it's Buildroot + Busybox in initramfs
> at the end.
>
> On the booted machine
> (clang build of my main branch, based on the latest -rcX):
>
> Welcome to Buildroot
> buildroot login: root
>
> # uname -a
> Linux buildroot 6.6.0-rc7-00142-g9266a02ba229 #28 SMP PREEMPT_DYNAMIC Mon Oct 23 15:00:17 EEST 2023 x86_64 GNU/Linux
>
> # mount
> rootfs on / type rootfs (rw,size=453412k,nr_inodes=113353)
> devtmpfs on /dev type devtmpfs (rw,relatime,size=453412k,nr_inodes=113353,mode=755)
> proc on /proc type proc (rw,relatime)
> devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=666)
> tmpfs on /dev/shm type tmpfs (rw,relatime,mode=777)
> tmpfs on /tmp type tmpfs (rw,relatime)
> tmpfs on /run type tmpfs (rw,nosuid,nodev,relatime,mode=755)
> sysfs on /sys type sysfs (rw,relatime)
>
> What is fishy here is the size of rootfs, it's only 30M compressed side,
> I can't be ~450M decompressed. I just noticed this, dunno if it's related.
>

Of the filesystems mounted above, only ext2 (aka rootfs) supports quota, but it
doesn't appear to have quota enabled. If the quota change is causing ext2 to
fail to load the root directory, you can now do the following checks:

1) Compare the ext2 binary generated before and after the quota patch.
2) Run `dumpe2fs -h /dev/root` to see whether any useful error messages are
   saved in the on-disk superblock.

Thanks!
-- 
With Best Regards,
Baokun Li
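
For check 1), one possible way to compare the two builds is sketched below. It is only an illustration, not a prescribed procedure: the script name and the object file names in the usage comment are hypothetical, and a plain byte comparison only tells you whether and where the files diverge. Running cmp(1) on the two files, or diffing their `objdump -d` output, would serve the same purpose and gives more readable results for code changes.

```python
import sys
from pathlib import Path


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def compare_files(before: str, after: str) -> str:
    """Summarise how two build artifacts differ."""
    a = Path(before).read_bytes()
    b = Path(after).read_bytes()
    offset = first_difference(a, b)
    if offset is None:
        return "identical"
    return f"first difference at offset {offset:#x} (sizes {len(a)} vs {len(b)})"


if __name__ == "__main__":
    # Hypothetical usage: python3 cmpobj.py ext2-before.o ext2-after.o
    if len(sys.argv) == 3:
        print(compare_files(sys.argv[1], sys.argv[2]))
```

If the two objects turn out to be identical apart from the expected quota call sites, that would point away from the ext2 code itself as the source of the boot failure.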