From: Andreas Grünbacher
Date: Fri, 26 May 2023 00:04:18 +0200
Subject: Re: [PATCH 06/32] sched: Add task_struct->faults_disabled_mapping
To: Jan Kara
Cc: Kent Overstreet, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-bcachefs@vger.kernel.org,
    "Darrick J. Wong", dhowells@redhat.com, Andreas Gruenbacher,
    cluster-devel@redhat.com, Bob Peterson
References: <20230509165657.1735798-1-kent.overstreet@linux.dev>
    <20230509165657.1735798-7-kent.overstreet@linux.dev>
    <20230510010737.heniyuxazlprrbd6@quack3>
    <20230523133431.wwrkjtptu6vqqh5e@quack3>
In-Reply-To: <20230523133431.wwrkjtptu6vqqh5e@quack3>

On Tue, 23 May 2023 at 15:37, Jan Kara wrote:
> On Wed 10-05-23 02:18:45, Kent Overstreet wrote:
> > On Wed, May 10, 2023 at 03:07:37AM +0200, Jan Kara wrote:
> > > On Tue 09-05-23 12:56:31, Kent Overstreet wrote:
> > > > From: Kent Overstreet
> > > >
> > > > This is used by bcachefs to fix a page cache coherency issue with
> > > > O_DIRECT writes.
> > > >
> > > > Also relevant: mapping->invalidate_lock, see below.
> > > >
> > > > O_DIRECT writes (and other filesystem operations that modify file data
> > > > while bypassing the page cache) need to shoot down ranges of the page
> > > > cache - and additionally, need locking to prevent those pages from
> > > > being pulled back in.
> > > >
> > > > But O_DIRECT writes invoke the page fault handler (via get_user_pages),
> > > > and the page fault handler will need to take that same lock - this is a
> > > > classic recursive deadlock if userspace has mmaped the file they're DIO
> > > > writing to and uses those pages for the buffer to write from, and it's a
> > > > lock ordering deadlock in general.
> > > >
> > > > Thus we need a way to signal from the dio code to the page fault handler
> > > > when we already are holding the pagecache add lock on an address space -
> > > > this patch just adds a member to task_struct for this purpose.
> > > > For now only bcachefs is implementing this locking, though it may be
> > > > moved out of bcachefs and made available to other filesystems in the
> > > > future.
> > >
> > > It would be nice to have at least a link to the code that's actually using
> > > the field you are adding.
> >
> > Bit of a trick to link to a _later_ patch in the series from a commit
> > message, but...
> >
> > https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs-io.c#n975
> > https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs-io.c#n2454
>
> Thanks and I'm sorry for the delay.
>
> > > Also I think we were already through this discussion [1] and we ended up
> > > agreeing that your scheme actually solves only the AA deadlock but a
> > > malicious userspace can easily create AB BA deadlock by running direct IO
> > > to file A using mapped file B as a buffer *and* direct IO to file B using
> > > mapped file A as a buffer.
> >
> > No, that's definitely handled (and you can see it in the code I linked),
> > and I wrote a torture test for fstests as well.
>
> I've checked the code and AFAICT it is all indeed handled. BTW, I've now
> remembered that GFS2 has dealt with the same deadlocks - b01b2d72da25
> ("gfs2: Fix mmap + page fault deadlocks for direct I/O") - in a different
> way (by prefaulting pages from the iter before grabbing the problematic
> lock and then disabling page faults for the iomap_dio_rw() call). I guess
> we should somehow unify these schemes so that we don't have two mechanisms
> for avoiding exactly the same deadlock. Adding GFS2 guys to CC.
>
> Also good that you've written a fstest for this, that is definitely a useful
> addition, although I suspect GFS2 guys added a test for this not so long
> ago when testing their stuff. Maybe they have a pointer handy?

Ah yes, that's xfstests commit d3cbdabf ("generic: Test page faults during
read and write").

Thanks,
Andreas

> Honza
> --
> Jan Kara
> SUSE Labs, CR
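
For readers following along, here is a rough single-threaded toy model of the
AA deadlock described in the quoted commit message and of the prefault scheme
Jan attributes to GFS2. It is only a sketch under loud assumptions: Mapping,
UserBuffer, dio_write_naive and dio_write_prefault are hypothetical names
invented for illustration, not kernel, GFS2 or bcachefs code, and a lock that
would block forever is modelled as raising an exception.

```python
class Mapping:
    """Toy stand-in for a file's address space and its non-recursive lock."""

    def __init__(self, name):
        self.name = name
        self.locked = False

    def lock(self):
        # A real lock would block forever here; the model raises instead.
        if self.locked:
            raise RuntimeError(f"deadlock: {self.name} lock is already held")
        self.locked = True

    def unlock(self):
        self.locked = False


class UserBuffer:
    """User memory that is an mmap of `backing`; pages start non-resident."""

    def __init__(self, backing, npages):
        self.backing = backing
        self.resident = [False] * npages

    def fault_in(self, first):
        # Page fault path: needs the backing file's lock to add pages.
        self.backing.lock()
        try:
            for i in range(first, len(self.resident)):
                self.resident[i] = True
        finally:
            self.backing.unlock()

    def pin(self, first, faults_disabled):
        """Pin pages starting at `first`; return how many were pinned."""
        pinned = 0
        for i in range(first, len(self.resident)):
            if not self.resident[i]:
                if faults_disabled:
                    break  # short result instead of taking a fault
                self.fault_in(i)
            pinned += 1
        return pinned


def dio_write_naive(file, buf):
    """Take the file lock, then let pinning fault while it is held."""
    file.lock()
    try:
        return buf.pin(0, faults_disabled=False)
    finally:
        file.unlock()


def dio_write_prefault(file, buf):
    """Prefault with no lock held, then pin with faults disabled; retry if short."""
    done = 0
    while done < len(buf.resident):
        buf.fault_in(done)
        file.lock()
        try:
            done += buf.pin(done, faults_disabled=True)
        finally:
            file.unlock()
    return done


if __name__ == "__main__":
    f = Mapping("A")
    try:
        dio_write_naive(f, UserBuffer(f, 4))
    except RuntimeError as err:
        print("naive:", err)
    print("prefault pinned", dio_write_prefault(f, UserBuffer(f, 4)), "pages")
```

The retry loop matters because, in a real system, pages could be evicted again
between the prefault and the locked section; the model never evicts, so it
always finishes in one pass. Being single-threaded, it also cannot reproduce
the AB-BA case, which needs two tasks each holding one lock.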