Date: Fri, 22 Oct 2021 19:06:45 +0100
From: Catalin Marinas
To: Linus Torvalds
Cc: Andreas Gruenbacher, Paul Mackerras, Alexander Viro,
    Christoph Hellwig, "Darrick J. Wong", Jan Kara, Matthew Wilcox,
    cluster-devel, linux-fsdevel, Linux Kernel Mailing List,
    ocfs2-devel@oss.oracle.com, kvm-ppc@vger.kernel.org, linux-btrfs
Subject: Re: [PATCH v8 00/17] gfs2: Fix mmap + page fault deadlocks
References: <20211019134204.3382645-1-agruenba@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 20, 2021 at 08:19:40PM -1000, Linus Torvalds wrote:
> On Wed, Oct 20, 2021 at 12:44 PM Catalin Marinas wrote:
> >
> > However, with MTE doing both get_user() every 16 bytes and
> > gup can get pretty expensive.
>
> So I really think that anything that is performance-critical had
> better only do the "fault_in_write()" code path in the cold error path
> where you took a page fault.
[...]
> So I wouldn't worry too much about the performance concerns. It simply
> shouldn't be a common or hot path.
>
> And yes, I've seen code that does that "fault_in_xyz()" before the
> critical operation that cannot take page faults, and does it
> unconditionally.
>
> But then it isn't the "fault_in_xyz()" that should be blamed if it is
> slow, but the caller that does things the wrong way around.

Some more thinking out loud. I did some unscientific benchmarks on a
Raspberry Pi 4 with the filesystem in a RAM block device and a
"dd if=/dev/zero of=/mnt/test" writing 512MB in 1MB blocks. I changed
fault_in_readable() in linux-next to probe every 16 bytes:

- ext4 drops from around 261MB/s to 246MB/s: 5.7% penalty
- btrfs drops from around 360MB/s to 337MB/s: 6.4% penalty

For generic_perform_write() Dave Hansen attempted to move the fault-in
after the uaccess in commit 998ef75ddb57 ("fs: do not prefault
sys_write() user buffer pages"). This was reverted as it was exposing an
ext4 bug. I don't know whether it was fixed, but re-applying Dave's
commit avoids the performance drop.

btrfs_buffered_write() has a comment about faulting pages in before
locking them in prepare_pages(). I suspect it's a similar problem and
the fault_in() could be moved, though I can't say I understand this code
well enough.

Probing only the first byte(s) in fault_in() would be ideal: there would
be no need to go through all filesystems and try to change the
uaccess/probing order.

-- 
Catalin