From: Linus Torvalds
Date: Sat, 12 Sep 2020 19:39:31 -0700
Subject: Re: Kernel Benchmarking
To: Dave Chinner
Cc: Amir Goldstein, Hugh Dickins, Michael Larabel, "Ted Ts'o", Andreas Dilger, Ext4 Developers List, Jan Kara, linux-fsdevel
X-Mailing-List: linux-ext4@vger.kernel.org

On Sat, Sep 12, 2020 at 5:41 PM Dave Chinner wrote:
>
> Hmmmm. So this is a typically a truncate race check, but this isn't
> sufficient to protect the fault against all page invalidation races
> as the page can be re-inserted into the same mapping at a different
> page->index now within EOF.

Using some "move" ioctl or similar and using a "invalidate page
mapping, then move it to a different point" model?

Yeah. I think that ends up being basically an extended special case of
the truncate thing (for the invalidate), and would require the
filesystem to serialize externally to the page anyway.

Which they would presumably already do with the MMAPLOCK or similar,
so I guess that's not a huge deal.

The real worry with (d) is that we are using the page lock for other
things too, not *just* the truncate check. Things where the inode lock
wouldn't be helping, like locking against throwing pages out of the
page cache entirely, or the hugepage splitting/merging etc. It's not
being truncated, it's just the VM shrinking the cache or modifying
things in other ways.

So I do worry a bit about trying to make things per-inode (or even
some per-range thing with a smarter lock) for those reasons. We use
the page lock not just for synchronizing with filesystem operations,
but for other page state synchronization too.

In many ways I think keeping it as a page-lock, and making the
filesystem operations just act on the range of pages would be safer.

But the page locking code does have some extreme downsides, exactly
because there are so _many_ pages and we end up having to play some
extreme size games due to that (ie the whole external hashing, but
also just not being able to use any debug locks etc, because we just
don't have the resources to do debugging locks at that kind of
granularity).

That's somewhat more longer-term. I'll try to do another version of
the "hybrid fairness" page lock (and/or just try some limited
optimistic spinning) to see if I can at least avoid the nasty
regression. Admittedly it really probably only happens for these kinds
of microbenchmarks that just hammer on one page over and over again,
but it's a big enough regression for a "real enough" load that I
really don't like it.

              Linus
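
[Editor's illustration, not part of the original message.] The "external
hashing" mentioned above refers to the idea that a page is too small and
too numerous to carry its own wait queue, so waiters are parked on a
small shared table indexed by a hash of the page's address. The
following is a minimal user-space sketch of that idea only. Every name
in it (demo_page, demo_waitqueue, demo_wait_table, demo_hash_ptr,
demo_page_waitqueue), the table size, and the hash constant are
assumptions chosen for illustration, and it is not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed table size for this sketch: 2^8 = 256 shared wait queues. */
#define DEMO_WAIT_TABLE_BITS 8
#define DEMO_WAIT_TABLE_SIZE (1u << DEMO_WAIT_TABLE_BITS)

/* Hypothetical stand-in for a page descriptor: note that it embeds no
 * wait queue and no lock debugging state, only a flags word. */
struct demo_page {
    unsigned long flags;
};

/* Hypothetical stand-in for a wait queue head. */
struct demo_waitqueue {
    int nr_waiters;
};

/* One small table shared by every page in the system. */
static struct demo_waitqueue demo_wait_table[DEMO_WAIT_TABLE_SIZE];

/* Multiplicative hash of a pointer down to DEMO_WAIT_TABLE_BITS bits.
 * The odd 64-bit constant is an assumption for this sketch. */
static size_t demo_hash_ptr(const void *p)
{
    uint64_t v = (uint64_t)(uintptr_t)p;

    v *= 0x9E3779B97F4A7C15ull;
    return (size_t)(v >> (64 - DEMO_WAIT_TABLE_BITS));
}

/* Map a page to the shared wait queue its waiters would sleep on. */
static struct demo_waitqueue *demo_page_waitqueue(const struct demo_page *page)
{
    return &demo_wait_table[demo_hash_ptr(page)];
}
```

With far more pages than table entries, unrelated pages necessarily
share a queue, so a real implementation has to have each waiter check
that a wakeup is actually for its page. That sharing is the price paid
for keeping the per-page footprint small.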