Message-ID: <328c6e2c-e055-3391-3499-4963e351b0be@sandeen.net>
Date: Thu, 22 Jun 2023 22:09:55 -0500
Subject: Re: [syzbot] [xfs?] WARNING: Reset corrupted AGFL on AG NUM. NUM blocks leaked. Please unmount and run xfs_repair.
To: Eric Biggers, Dave Chinner
Cc: syzbot, dchinner@redhat.com, djwong@kernel.org, hch@lst.de, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org, syzkaller-bugs@googlegroups.com
References: <000000000000ffcb2e05fe9a445c@google.com> <20230621075421.GA56560@sol.localdomain> <20230623005617.GA1949@sol.localdomain>
From: Eric Sandeen
In-Reply-To: <20230623005617.GA1949@sol.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/22/23 7:56 PM, Eric Biggers wrote:
> On Thu, Jun 22, 2023 at 06:59:56PM +1000, Dave Chinner wrote:
>> On Wed, Jun 21, 2023 at 12:54:21AM -0700, Eric Biggers wrote:
>>> On Wed, Jun 21, 2023 at 05:07:15PM +1000, 'Dave Chinner' via syzkaller-bugs wrote:
>>>> On Tue, Jun 20, 2023 at 07:10:19PM -0700, syzbot wrote:
>>>> So exactly what is syzbot complaining about here? There's no kernel
>>>> issue here at all.
>>>>
>>>> Also, I cannot tell syzbot "don't ever report this as a bug again",
>>>> so the syzbot developers are going to have to triage and fix this
>>>> syzbot problem themselves so it doesn't keep getting reported to
>>>> us...
>>>
>>> I think the problem here was that XFS logged a message beginning with
>>> "WARNING:", followed by a stack trace. In the log that looks like a warning
>>> generated by the WARN_ON() macro, which is meant for reporting recoverable
>>> kernel bugs. It's difficult for any program to understand the log in cases like
>>> this. This is why include/asm-generic/bug.h contains the following comment:
>>>
>>> * Do not include "BUG"/"WARNING" in format strings manually to make these
>>> * conditions distinguishable from kernel issues.
>>
>> Nice.
>>
>> Syzbot author doesn't like log messages using certain key words
>> because it's hard for syzbot to work out what went wrong.
>>
>> Gets new rule added to kernel in a comment in some header file that
>> almost nobody doing kernel development work ever looks at.
>>
>> Nothing was added to the coding style rules or checkpatch so nobody
>> is likely to accidentally trip over this new rule that nobody has
>> been told about.
>>
>> Syzbot maintainer also fails to do an audit of the kernel to remove
>> all existing "WARNING" keywords from existing log messages so leaves
>> landmines for subsystems to have to handle at some time in the
>> future.
>>
>> Five years later, syzbot trips over a log message containing WARNING
>> in it that was in code introduced before the rule was "introduced".
>> Subsystem maintainers are blamed for not knowing the rule existed.
>>
>> Result: *yet again* we are being told that our only option is
>> to *change code that is not broken* just to *shut up some fucking
>> bot* we have no control over and could happily live without.
>>
>>> If you have a constructive suggestion of how all programs that
>>> parse the kernel log can identify real warnings reliably without
>>> getting confused by cases like this, I'm sure that would be
>>> appreciated. It would need to be documented and then the guidance
>>> in bug.h could then be removed. But until then, the above is the
>>> current guidance.
>>
>> That is so not the problem here, Eric.
>>
>
> Grepping for "WARNING:" is how other kernel testing systems find WARN_ON's in
> the log too. For example, see _check_dmesg() in common/rc in xfstests.
> xfstests fails tests if "WARNING:" is logged. You might be aware of this, as
> you reviewed and applied xfstests commit 47e5d7d2bb17 which added the code.
>
> I understand it's frustrating that Dmitry's attempt to do something about this
> problem was incomplete. I don't think it is helpful to then send a reflexive,
> adversarial response that shifts the blame for this longstanding problem with
> the kernel logs entirely onto syzbot and even Dmitry personally. That just
> causes confusion about the problem that needs to be solved.
>
> Anyway, either everything that parses the kernel logs needs to be smarter about
> identifying real WARN_ON's, or all instances of "WARNING:" need to be eliminated
> from the log (with existing code, coding style guidelines, and checkpatch
> updated as you mentioned). I think I'm leaning towards the position that fake
> "WARNING:"s should be eliminated. It does seem like a hack, but it makes the
> "obvious" log pattern matching that everyone tends to write work as expected...
>
> If you don't want to help, fine, but at least please try not to be obstructive.

I didn't read Dave's reply as "obstructive." There's been a trend lately
of ever-growing hordes of people (with machines behind them) generating
ever-more work for a very small and fixed number of developers who are
burning out.

It's not sustainable. The work-generators need to help make things
better, or the whole system is going to break.

Dave being frustrated that he has to deal with "bug reports" about a
printk phrase is valid, IMHO. There are many straws breaking the camel's
back these days.

You had asked for a constructive suggestion.

My specific suggestion is that the people who decided that
printk("WARNING") merits must-fix syzbot reports should submit patches
to any subsystem they plan to test, to replace printk("WARNING") with
something that will not trigger syzbot reports. Don't spread that pain
onto every subsystem developer who already has to deal with legitimate
and pressing work.

Or, work out some other reliable way to discern WARN_ON from WARNING.

And add it to checkpatch etc, as Dave suggested.

This falls into the "help us help you" category. Early on, syzbot
filesystem reports presented filesystems only as a giant array of hex in
a C file, leaving it to the poor developer to work out how to use
standard filesystem tools to analyze the input. Now we get standard
images. That's an improvement, with some effort on the syzbot side that
saves time and effort for every filesystem developer forever more.

Find more ways to make these reports more relevant, more accurate, and
more efficient to triage.

That's my constructive suggestion.

-Eric
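
As an illustration of the "other reliable way to discern WARN_ON from
WARNING" option above, here is a minimal sketch of a stricter log
matcher. It is not what syzbot or xfstests actually do. It assumes that
a genuine WARN()/WARN_ON() splat carries a header of the form
"WARNING: CPU: <n> PID: <n> at ...", while a plain printk that merely
contains the word "WARNING" does not. The function name classify() and
the sample lines are hypothetical.

```python
import re

# Assumption: a real WARN()/WARN_ON() splat header looks like
#   "WARNING: CPU: <n> PID: <n> at <location> ..."
# A printk that only contains the word "WARNING" lacks the
# "CPU: ... PID: ... at" part.
WARN_SPLAT = re.compile(r"WARNING: CPU: \d+ PID: \d+ at \S+")


def classify(line: str) -> str:
    """Classify one kernel log line as 'warn_on', 'printk', or 'other'."""
    if WARN_SPLAT.search(line):
        return "warn_on"   # matches the assumed splat header
    if "WARNING" in line:
        return "printk"    # the keyword alone, e.g. a subsystem message
    return "other"


# Hypothetical sample lines, for illustration only.
SPLAT_LINE = "WARNING: CPU: 0 PID: 42 at fs/foo.c:10 foo+0x1/0x2"
XFS_LINE = ("XFS (loop0): WARNING: Reset corrupted AGFL on AG NUM. "
            "NUM blocks leaked. Please unmount and run xfs_repair.")

if __name__ == "__main__":
    for sample in (SPLAT_LINE, XFS_LINE):
        print(classify(sample), "<-", sample)
```

A matcher like this would still flag the XFS message as containing the
keyword, but it would not report it as a WARN_ON. Whether the header
format is stable enough to rely on is exactly the kind of thing that
would need to be documented, as discussed in the thread.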