Date: Thu, 24 Feb 2022 12:48:42 +1100
From: Dave Chinner
To: Theodore Ts'o
Cc: Greg Kroah-Hartman, John Hubbard, Lee Jones,
	linux-ext4@vger.kernel.org, Christoph Hellwig, Dave Chinner,
	Goldwyn Rodrigues, "Darrick J. Wong", Bob Peterson,
	Damien Le Moal, Andreas Gruenbacher, Ritesh Harjani,
	Johannes Thumshirn, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, cluster-devel@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [REPORT] kernel BUG at fs/ext4/inode.c:2620 - page_buffers()
Message-ID: <20220224014842.GM59715@dread.disaster.area>
References: <82d0f4e4-c911-a245-4701-4712453592d9@nvidia.com>

On Wed, Feb 23, 2022 at 06:35:54PM -0500, Theodore Ts'o wrote:
> On Fri, Feb 18, 2022 at 08:51:54AM +0100, Greg Kroah-Hartman wrote:
> > > The challenge is that fixing this "the right away" is probably not
> > > something we can backport into an LTS kernel, whether it's 5.15 or
> > > 5.10... or 4.19.
> >
> > Don't worry about stable backports to start with.  Do it the "right way"
> > first and then we can consider if it needs to be backported or not.
>
> Fair enough; on the other hand, we could also view this as making ext4
> more robust against buggy code in other subsystems, and while other
> file systems may be losing user data if they are actually trying to do
> remote memory access to file-backed memory, apparently other file
> systems aren't noticing and so they're not crashing.

Oh, we've noticed them, no question about that.
We've got bug reports going back years for systems being crashed,
triggering BUGs and/or corrupting data on both XFS and ext4
filesystems due to users trying to run RDMA applications with
file-backed pages.

Most of the people doing this now know that we won't support such
applications until the RDMA stack/hardware can trigger on-demand
write page faults the same way CPUs do when they first write to a
clean page. They don't have this, so mostly these people don't
bother reporting this class of problems to us anymore. The gup/RDMA
infrastructure to make this all work is slowly moving forwards, but
it's not here yet.

> Issuing a
> warning and then not crashing is arguably a better way for ext4 to
> react, especially if there are other parts of the kernel that are
> randomly calling set_page_dirty() on file-backed memory without
> properly first informing the file system in a context where it can
> block and potentially do I/O to do things like allocate blocks.

I'm not sure that replacing the BUG() with a warning is good enough
- it's still indicative of an application doing something dangerous
that could result in silent data corruption and/or other problems.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
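
The difference described above, between a CPU write that faults on a clean page and a device write that just marks the page dirty, can be sketched with a toy model. This is an illustration only and not code from the thread or the kernel: the names Page, ToyFilesystem, on_write_fault, cpu_write and dma_write are all hypothetical, and the model assumes that whatever the filesystem sets up on a write fault lasts only until the page is next written back.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy stand-in for a file-backed page (hypothetical, not a kernel struct)."""
    dirty: bool = False
    fs_prepared: bool = False  # set only when the filesystem saw a write fault


@dataclass
class ToyFilesystem:
    warnings: list = field(default_factory=list)

    def on_write_fault(self, page: Page) -> None:
        # Models the write-fault path: a context where the filesystem can
        # block and do I/O (for example, allocate blocks) before the write.
        page.fs_prepared = True

    def writeback(self, page: Page) -> bool:
        """Clean the page; return False if it was dirtied behind our back."""
        if not page.dirty:
            return True
        ok = page.fs_prepared
        if not ok:
            # The thread debates BUG() versus a warning at this point. In
            # either case the filesystem was never told about the write.
            self.warnings.append("dirty page the filesystem never prepared")
        # Assumption of this model: preparation is dropped once the page is clean.
        page.dirty = False
        page.fs_prepared = False
        return ok


def cpu_write(fs: ToyFilesystem, page: Page) -> None:
    # The first CPU write to a clean page faults and notifies the filesystem.
    if not page.dirty:
        fs.on_write_fault(page)
    page.dirty = True


def dma_write(page: Page) -> None:
    # A device write with no fault: the page is simply marked dirty.
    page.dirty = True


if __name__ == "__main__":
    fs, page = ToyFilesystem(), Page()
    cpu_write(fs, page)
    print("after CPU write:", fs.writeback(page))
    dma_write(page)
    print("after DMA write:", fs.writeback(page))
```

In this model the second writeback fails whether the filesystem crashes or only warns, which is the reason given above for doubting that a warning alone is enough.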