Date: Sat, 12 Jun 2021 21:05:40 +0000
From: Al Viro
To: Andreas Gruenbacher
Cc: Linus Torvalds, cluster-devel, Linux Kernel Mailing List, Jan Kara, Matthew Wilcox
Subject: Re: [RFC 4/9] gfs2: Fix mmap + page fault deadlocks (part 1)
References: <20210531170123.243771-1-agruenba@redhat.com> <20210531170123.243771-5-agruenba@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 11, 2021 at 04:25:10PM +0000, Al Viro wrote:
> On Wed, Jun 02, 2021 at 01:16:32PM +0200, Andreas Gruenbacher wrote:
>
> > Well, iomap_file_buffered_write() does that by using
> > iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic() as
> > in iomap_write_actor(),
> > but the read and direct I/O side doesn't seem
> > to have equivalents. I suspect we can't just wrap
> > generic_file_read_iter() and iomap_dio_rw() calls in
> > pagefault_disable().
>
> And it will have zero effect on O_DIRECT case, so you get the same
> deadlocks right back.  Because there you hit
>     iomap_dio_bio_actor()
>         bio_iov_iter_get_pages()
>             ....
>                 get_user_pages_fast()
>                     ....
>                         faultin_page()
>                             handle_mm_fault()
> and at no point had CPU hit an exception, so disable_pagefault() will
> have no effect whatsoever.  You can bloody well hit gfs2 readpage/mkwrite
> if the destination is in mmapped area of some GFS2 file.  Do that
> while holding GFS2 locks and you are fucked.
>
> No amount of prefaulting will protect you, BTW - it might make the
> deadlock harder to reproduce, but that's it.

AFAICS, what we have is
    * handle_mm_fault() can hit gfs2_fault(), which grabs per-inode lock
      shared
    * handle_mm_fault() for write can hit gfs2_page_mkwrite(), which grabs
      per-inode lock exclusive
    * pagefault_disable() prevents that for real page faults, but not for
      get_user_pages_fast()
    * normal write:
        with inode_lock(inode)
            in a loop
                with per-inode lock exclusive
                    __gfs2_iomap_get
                    possibly gfs2_iomap_begin_write
                    in a loop
                        fault-in [read faults]
                        iomap_write_begin
                        copy_page_from_iter_atomic() [pf disabled]
                        iomap_write_end
                    gfs2_iomap_end
    * O_DIRECT write:
        with inode_lock(inode) and per-inode lock deferred (?)
            in a loop
                __gfs2_iomap_get
                possibly gfs2_iomap_begin_write
                bio_iov_iter_get_pages(), map and submit [gup]
                gfs2_iomap_end
    * normal read:
        in a loop
            filemap_get_pages (grab pages and readpage them if needed)
            copy_page_to_iter() for each [write faults]
    * O_DIRECT read:
        with per-inode lock deferred
            in a loop
                __gfs2_iomap_get
                either iov_iter_zero() (on hole) [write faults]
                or bio_iov_iter_get_pages(), map and submit [gup]
                gfs2_iomap_end

... with some amount of waiting on buffered IO in case of O_DIRECT writes

Is the above an accurate description of the mainline situation there?
In particular, normal read doesn't seem to bother with locks at all. What exactly are those cluster locks for in O_DIRECT read?
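
FWIW, here is a toy userland model of the pagefault_disable() vs. gup point,
in case it helps to see it spelled out.  It is only a sketch: none of these
names exist in the kernel, struct task_model and the model_*() helpers are
made up for illustration, and the shared/exclusive/deferred lock modes are
collapsed into a single "held" flag.  The only thing it is meant to show is
that the exception path checks the pagefault-disabled count, while the gup
path goes straight to the fault handler and never looks at it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not kernel API. */
struct task_model {
    bool inode_lock_held;    /* stands in for the per-inode cluster lock */
    int  pagefault_disabled; /* stands in for the pagefault_disable() count */
    bool page_present;       /* is the user page behind the buffer resident? */
};

enum outcome { COPIED, SHORT_COPY, DEADLOCK };

/* Stands in for handle_mm_fault() reaching a handler that wants the lock. */
static enum outcome model_handle_fault(struct task_model *t)
{
    assert(t);
    if (t->inode_lock_held)
        return DEADLOCK;     /* handler would wait on a lock we already hold */
    t->page_present = true;  /* fault serviced, page is now resident */
    return COPIED;
}

/* Copy that reaches a missing page through a CPU exception. */
static enum outcome model_copy_via_exception(struct task_model *t)
{
    if (t->page_present)
        return COPIED;
    if (t->pagefault_disabled > 0)
        return SHORT_COPY;   /* fault refused; caller sees a short copy */
    return model_handle_fault(t);
}

/* gup-style pin: calls the fault handler directly, no exception involved. */
static enum outcome model_pin_via_gup(struct task_model *t)
{
    if (t->page_present)
        return COPIED;
    return model_handle_fault(t); /* pagefault_disabled is never consulted */
}
```

With the lock held, faults disabled and the page not resident, the
exception-based copy in this model backs off with a short copy, while the
gup-style pin walks into the fault handler anyway.  Drop the lock first and
the same pin succeeds.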