Date: Fri, 1 Oct 2021 20:42:33 +0100
From: Matthew Wilcox
To: Trond Myklebust
Cc: "dhowells@redhat.com", "linux-cachefs@redhat.com", "linux-mm@kvack.org",
	"linux-nfs@vger.kernel.org", "anna.schumaker@netapp.com",
	"dwysocha@redhat.com"
Subject: Re: Can the GFP flags to releasepage() be trusted? -- was Re: [PATCH v2 3/8] nfs: Move to using the alternate fallback fscache I/O API
References: <97eb17f51c8fd9a89f10d9dd0bf35f1075f6b236.camel@hammerspace.com>
	<163189104510.2509237.10805032055807259087.stgit@warthog.procyon.org.uk>
	<163189108292.2509237.12615909591150927232.stgit@warthog.procyon.org.uk>
	<81120.1633099916@warthog.procyon.org.uk>
	<23033528036059af4633f60b8325e48eab95ac36.camel@hammerspace.com>
In-Reply-To: <23033528036059af4633f60b8325e48eab95ac36.camel@hammerspace.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On Fri, Oct 01, 2021 at 03:04:08PM +0000, Trond Myklebust wrote:
> On Fri, 2021-10-01 at 15:51 +0100, David Howells wrote:
> > Trond Myklebust wrote:
> >
> > > > > @@ -432,7 +432,12 @@ static int nfs_release_page(struct page *page, gfp_t gfp)
> > > > >  	/* If PagePrivate() is set, then the page is not freeable */
> > > > >  	if (PagePrivate(page))
> > > > >  		return 0;
> > > > > -	return nfs_fscache_release_page(page, gfp);
> > > > > +	if (PageFsCache(page)) {
> > > > > +		if (!(gfp & __GFP_DIRECT_RECLAIM) || !(gfp & __GFP_FS))
> > > > > +			return false;
> > > > > +		wait_on_page_fscache(page);
> > > > > +	}
> > > > > +	return true;
> > > > >  }
> > >
> > > I've found this generally not to be safe. The VM calls ->release_page()
> > > from a variety of contexts, and often fails to report it correctly in
> > > the gfp flags. That's particularly true of the stuff in mm/vmscan.c.
> > > This is why we have the check above that vetos page removal upon
> > > PagePrivate() being set.
> >
> > [Adding Willy and the mm crew to the cc list]
> >
> > I wonder if that matters in this case.  In the worst case, we'll wait
> > for the page to cease being DMA'd - but we won't return true if it is.
> >
> > But if vmscan is generating the wrong VM flags, we should look at
> > fixing that.
> >
>
> To elaborate a bit: we used to have code here that would check whether
> the page had been cleaned but was unstable, and if an argument of
> GFP_KERNEL or above was set, we'd try to call COMMIT to ensure the page
> was synched to disk on the server (and we'd wait for that call to
> complete).
>
> That code would end up deadlocking in all sorts of horrible ways, so we
> ended up having to pull it.

Based on having read zero code at all in this area ...

Is it possible that you can wait for an existing operation to finish,
but starting a new operation will take a lock that is already being
held somewhere in your call chain?  So it's not that the gfp flags are
being set incorrectly, it's just that you're not in a context where
you can start a new operation.
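
To make the question concrete, here is a minimal userspace sketch of
the distinction being asked about. It is not NFS or mm code. The
SKETCH_* bits, struct sketch_ctx and both helpers are hypothetical
names invented for illustration, and the bit values are not the
kernel's. The only assumption modelled is the one in the question:
waiting needs a context that may sleep and re-enter the filesystem,
while starting a new operation additionally needs a lock the call
chain may already hold.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; NOT the kernel's real values. */
#define SKETCH_DIRECT_RECLAIM (1u << 0) /* caller may sleep in reclaim */
#define SKETCH_FS             (1u << 1) /* caller may re-enter the fs */

/* Hypothetical model of the context a release hook is called from. */
struct sketch_ctx {
	unsigned int gfp;   /* flags as passed down by the caller */
	bool holds_op_lock; /* call chain already holds the lock that
			     * starting a new operation would take */
};

/*
 * Waiting for an operation somebody else already started only needs a
 * context that may sleep and re-enter the filesystem. It takes no
 * additional lock.
 */
bool sketch_may_wait(const struct sketch_ctx *ctx)
{
	return (ctx->gfp & SKETCH_DIRECT_RECLAIM) && (ctx->gfp & SKETCH_FS);
}

/*
 * Starting a new operation needs the same flags, plus a lock. If the
 * call chain already holds that lock, acquiring it again would
 * self-deadlock even though the flags were reported correctly.
 */
bool sketch_may_start_op(const struct sketch_ctx *ctx)
{
	return sketch_may_wait(ctx) && !ctx->holds_op_lock;
}
```

In this model a caller with both bits set and holds_op_lock true may
wait but may not start anything new, which is the case the question
describes: the flags are accurate, yet the context is still wrong for
issuing a fresh operation.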