Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp2628383ybc; Mon, 25 Nov 2019 01:19:36 -0800 (PST) X-Google-Smtp-Source: APXvYqzV3b0+fFwWSrB7Cu9wctD3JT+YNNJqpxjhFCwrfa9Q0HzklVZ/emb+GqHrsXanX3oZluvY X-Received: by 2002:aa7:d60e:: with SMTP id c14mr17349563edr.174.1574673576408; Mon, 25 Nov 2019 01:19:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1574673576; cv=none; d=google.com; s=arc-20160816; b=ExUxzUW7nLTHQ13MuylZHAjES49nVfgvc/Mbt3hTw3eN77ozPN7wgvlQlUaYwcl6Ey DxT9IjizIh6NRxCHSv4ZcfmvgjELPPHBnenx5OCI6X0nek2DmHW6yoKMXLGF8j5+ht9A wFnKHBwAKKX5Jz0RkaYMat1a8+Um8OtP7fmEYmei8NwItEu7s85/s9/xbKbUvpcV6Npq qCQmC25eSDFE0rcqRScPlO8zmOQdpRYb5IjZMj2N7slEwqujyQsWrD4IIuNFGiufHJ6m Zo0E6vHvqc3PkyT6au7UaUiMD0ZusEJL+CkcBxcONVmHj4dlj+yO00LC0ECJsJqVM5be fSbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=ZHZjuct0CLXN3AYcQ+YfYJz+luTtYA/oliVypF8e6hc=; b=iuSaVvVAIWyyT3CUx7JxaLUKBvO55smNz5q2hJnALhh2+Zko62emrtUuexQXLfrca8 f9qk7nJp1sQB1Zq1kKTrZWbuYXnM2KEZl1ZXmMhgoDy/T02v1zZSLR+S1TR5NuSKlytK GQYQKzZlNvu0C3va9JxTXlZZcHhpuAGQ5BRbDtVuVUW8rF/d2EpM/fevShlbDYHJjQQC TndfB/s853cUaHPQV8DWkuVENswI7d/iit/COo5kz6VqvzHpoNYAZFHQM1JWmSE7Kx1H 6e4sJsJNkWCdyTEhJS2xCae0otlVRWhvblZm9w8m8XIP67gK4nqRg2creeebtbtfhu7p QOBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@shutemov-name.20150623.gappssmtp.com header.s=20150623 header.b=Kv+a6C+O; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o19si4389646edq.238.2019.11.25.01.19.11; Mon, 25 Nov 2019 01:19:36 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@shutemov-name.20150623.gappssmtp.com header.s=20150623 header.b=Kv+a6C+O; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727051AbfKYJRg (ORCPT + 99 others); Mon, 25 Nov 2019 04:17:36 -0500 Received: from mail-lj1-f196.google.com ([209.85.208.196]:42636 "EHLO mail-lj1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726947AbfKYJRg (ORCPT ); Mon, 25 Nov 2019 04:17:36 -0500 Received: by mail-lj1-f196.google.com with SMTP id n5so14887510ljc.9 for ; Mon, 25 Nov 2019 01:17:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov-name.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=ZHZjuct0CLXN3AYcQ+YfYJz+luTtYA/oliVypF8e6hc=; b=Kv+a6C+O8YeU1QaBfhf0j35CEfH/LR23KqQcSQNZ8dEsGWE2Q23YPJVaSt1OOuaLbd 5jBPpWKpaHfKNAEUsyutTc6kHaSoIg23DbBYqQy5weekxhj35gYoDoaifr46Yq7lW5RQ mO1FmglcYhLebSjuvQ6EwuP5ZB4NPus1UBDqlh3NXwcinyYbUY4541wS9uNzZXRax1lz lBxJCiPU1GZ2UM5E7rweB19wEagyR1dGyqpqvWo++g9AjoWy4VNJfxkbhzaFzDXPcaPv FL9wu6KlgjOU68rQLaWeTMflE/baiE/V+KVXwhMxvwy8Us681Uhnj9vpSVntW6kQ82FD c8TQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=ZHZjuct0CLXN3AYcQ+YfYJz+luTtYA/oliVypF8e6hc=; b=Nco1nyeScVhIdFuKqxgr45J1aHYbQL0a7Oj0te8xT/eB9rqIjsFwjDPD5ERcNxotYP vicqheSIfWilo8PaMIGMSAsXgwojE7WhYQk7KCd75g2U+7K2xvB6hEfzuxl2A5dt1LdJ yVfYch7SLfOLYvtm8XhRcKaRYLI3G019nJ9ZGIovEpC3hB6uxXdwylqUuynHk8WOpKix m7GsEXJlgH9vHvCYpKALMbEqhvsiu+maLJUpoHabKpFj2LBw7M5bN4/p7kYwjEIO9itw 75rlZIVAFVtkKRLnkssW0coXeihpI28B1zUGcXi3OXs+zE8sIIqTAYltDqb+4DE8CaRo sxqg== X-Gm-Message-State: APjAAAVuveew+LWSrFzbLHWKm+vYv/3bsZPpcyiJ6dKQVC+Rw8IHSY3H qVwO8GRY+HhipWtRC+J5DBjNKQ== X-Received: by 2002:a2e:760d:: with SMTP id r13mr20970401ljc.15.1574673453729; Mon, 25 Nov 2019 01:17:33 -0800 (PST) Received: from box.localdomain ([86.57.175.117]) by smtp.gmail.com with ESMTPSA id b3sm3238431lfq.10.2019.11.25.01.17.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Nov 2019 01:17:33 -0800 (PST) Received: by box.localdomain (Postfix, from userid 1000) id E895C1032C4; Mon, 25 Nov 2019 12:17:41 +0300 (+03) Date: Mon, 25 Nov 2019 12:17:41 +0300 From: "Kirill A. Shutemov" To: Andreas Gruenbacher Cc: Linus Torvalds , Steven Whitehouse , Konstantin Khlebnikov , linux-mm@kvack.org, Andrew Morton , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Alexander Viro , Johannes Weiner , cluster-devel@redhat.com, Ronnie Sahlberg , Steve French , Bob Peterson Subject: Re: [RFC PATCH 0/3] Rework the gfs2 read and page fault locking Message-ID: <20191125091741.firh7stqcpniwvga@box> References: <20191122235324.17245-1-agruenba@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191122235324.17245-1-agruenba@redhat.com> User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Nov 23, 2019 at 12:53:21AM +0100, Andreas Gruenbacher wrote: > Hello, > > this patch series moves the glock lock taking in gfs2 from the > ->readpage and ->readpages inode operations to the ->read_iter file and > ->fault vm operations. To achieve that, we add flags to the > generic_file_read_iter and filemap_fault generic helpers. > > This proposal was triggered by the following discussion: > > https://lore.kernel.org/linux-fsdevel/157225677483.3442.4227193290486305330.stgit@buzz/ > > In that thread, Linus argued that filesystems should make sure the inode > size is sufficiently up-to-date before calling the generic helpers, and > that filesystems can do it themselves if they want more than that. > That's surely doable. However, implementing those operations properly > at the filesystem level quickly becomes complicated when it gets to > things like readahead. In addition, those slightly modified copies of > those helpers would surely diverge from their originals over time, and > maintaining them properly would become hard. So I hope the relatively > small changes to make the original helpers slightly more flexible will > be acceptable instead. > > With the IOCB_CACHED flag added by one of the patches in this series, > the code that Konstantin's initial patch adds to > generic_file_buffered_read could be made conditional on the IOCB_CACHED > flag being cleared. That way, it won't misfire on filesystems that > allow a stale inode size. (I'm not sure if any filesystems other than > gfs2 are actually affected.) > > Some additional explanation: > > The cache consistency model of filesystems like gfs2 is such that if > pages are found in an inode's address space, those pages as well as the > inode size are up to date and can be used without taking any filesystem > locks. If a page is not cached, filesystem locks must be taken before > the page can be read; this will also bring the inode size up to date. > > Thus far, gfs2 has taken the filesystem locks inside the ->readpage and > ->readpages address space operations. A better approach seems to be to > take those locks earlier, in the ->read_iter file and ->fault vm > operations. This would also avoid a lock inversion in ->readpages. > > We obviously want to avoid taking the filesystem locks unnecessarily > when the pages we are looking for are already cached; otherwise, we > would cripple performance. So we need to check if those pages are > present first. That's actually exactly what the generic_file_read_iter > and filemap_fault helpers do already anyway, except that they will call > into ->readpage and ->readpages when they find pages missing. Instead > of that, we'd like those helpers to return with an error code that > allows us to retry the operation after taking the filesystem locks. Do you see IOCB_CACHED/FAULT_FLAG_CACHED semantics being usable for anyting beyond gfs2? -- Kirill A. Shutemov