Date: Thu, 25 May 2023 19:20:46 -0400
From: Kent Overstreet
To: Andreas Grünbacher
Cc: Christoph Hellwig, Jan Kara, cluster-devel@redhat.com,
    "Darrick J. Wong", linux-kernel@vger.kernel.org, dhowells@redhat.com,
    linux-bcachefs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    Kent Overstreet
Subject: Re: [Cluster-devel] [PATCH 06/32] sched: Add task_struct->faults_disabled_mapping

On Fri, May 26, 2023 at 12:25:31AM +0200, Andreas Grünbacher wrote:
> On Tue, May 23, 2023 at 18:28, Christoph Hellwig wrote:
> > On Tue, May 23, 2023 at 03:34:31PM +0200, Jan Kara wrote:
> > > I've checked the code and AFAICT it is all indeed handled. BTW, I've now
> > > remembered that GFS2 has dealt with the same deadlocks - b01b2d72da25
> > > ("gfs2: Fix mmap + page fault deadlocks for direct I/O") - in a different
> > > way (by prefaulting pages from the iter before grabbing the problematic
> > > lock and then disabling page faults for the iomap_dio_rw() call). I guess
> > > we should somehow unify these schemes so that we don't have two mechanisms
> > > for avoiding exactly the same deadlock. Adding GFS2 guys to CC.
> > >
> > > Also good that you've written an fstest for this, that is definitely a useful
> > > addition, although I suspect GFS2 guys added a test for this not so long
> > > ago when testing their stuff. Maybe they have a pointer handy?
> >
> > generic/708 is the btrfs version of this.
> >
> > But I think all of the file systems that have this deadlock are actually
> > fundamentally broken because they have a messed-up locking hierarchy
> > where page faults take the same lock that is held over the direct I/O
> > operation.  And the right thing is to fix this.  I have work in progress
> > for btrfs, and something similar should apply to gfs2, with the added
> > complication that it probably means a revision to their network
> > protocol.
>
> We do disable page faults, and there can be deadlocks in page fault
> handlers while no page faults are allowed.
>
> I'm roughly aware of the locking hierarchy that other filesystems use,
> and that's something we want to avoid for two reasons: (1) it
> would be an incompatible change, and (2) we want to avoid cluster-wide
> locking operations as much as possible because they are very slow.
>
> These kinds of locking conflicts are so rare in practice that the
> theoretical inefficiency of having to retry the operation doesn't
> matter.

Would you be willing to expand on that? I'm wondering if this would
simplify things for gfs2, but you mention the locking hierarchy being an
incompatible change - how does that work?

>
> > I'm absolutely not in favour of adding workarounds for these kinds of locking
> > problems to the core kernel.  I already feel bad for allowing the
> > small workaround in iomap for btrfs, as just fixing the locking back
> > then would have avoided massive ratholing.
>
> Please let me know when those btrfs changes are in a presentable shape ...

I would also be curious to know what btrfs needs and what the approach
is there.
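
For anyone following along, the gfs2 scheme quoted above (prefault the
pages, take the lock, copy with page faults disabled, and retry if the
copy comes up short) can be modelled in user space roughly as below.
This is only a toy sketch and not kernel or gfs2 code: ToyBuffer,
copy_nofault, direct_io, evict_once and the window parameter are all
made-up names for illustration, and a plain threading.Lock stands in
for the problematic filesystem lock.

```python
import threading


class ToyBuffer:
    """Hypothetical stand-in for a user buffer whose pages may not be resident."""

    def __init__(self, npages):
        self.resident = [False] * npages

    def fault_in(self, start, count):
        # In a real kernel this step may sleep and take locks, which is
        # why it has to happen before the problematic lock is held.
        for i in range(start, min(start + count, len(self.resident))):
            self.resident[i] = True


def copy_nofault(buf, start):
    """Copy with "page faults disabled": stop at the first non-resident page.

    Returns the number of pages copied, which may be zero.
    """
    copied = 0
    for i in range(start, len(buf.resident)):
        if not buf.resident[i]:
            break
        copied += 1
    return copied


def direct_io(buf, lock, window=2, evict_hook=None):
    """Prefault outside the lock, copy under it, retry on a short copy.

    Returns (pages_copied, retries), where retries counts the rounds
    that made no progress and had to be repeated.
    """
    pos = 0
    retries = 0
    while pos < len(buf.resident):
        buf.fault_in(pos, window)      # prefault before taking the lock
        if evict_hook is not None:
            evict_hook(buf, pos)       # simulate reclaim racing with us
        with lock:                     # lock held only while faults are "disabled"
            copied = copy_nofault(buf, pos)
        if copied == 0:
            retries += 1               # the page went away again; try once more
        pos += copied
    return pos, retries


def evict_once():
    """Build a hook that evicts the current page exactly once (test helper)."""
    state = {"done": False}

    def hook(buf, pos):
        if not state["done"]:
            buf.resident[pos] = False
            state["done"] = True

    return hook


if __name__ == "__main__":
    print(direct_io(ToyBuffer(5), threading.Lock()))
    print(direct_io(ToyBuffer(5), threading.Lock(), evict_hook=evict_once()))
```

The point of the model is that the lock is never held while a fault is
being serviced, at the cost of an occasional retry when a prefaulted
page is evicted before the copy reaches it.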