From: Andreas Gruenbacher
Date: Mon, 25 Oct 2021 20:24:26 +0200
Subject: Re: [PATCH v8 00/17] gfs2: Fix mmap + page fault deadlocks
To: Catalin Marinas, Dave Hansen, "Ted Ts'o"
Cc: Linus Torvalds, Paul Mackerras, Alexander Viro, Christoph Hellwig,
    "Darrick J. Wong", Jan Kara, Matthew Wilcox, cluster-devel,
    linux-fsdevel, Linux Kernel Mailing List, ocfs2-devel@oss.oracle.com,
    kvm-ppc@vger.kernel.org, linux-btrfs
References: <20211019134204.3382645-1-agruenba@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 22, 2021 at 8:07 PM Catalin Marinas wrote:
> On Wed, Oct 20, 2021 at 08:19:40PM -1000, Linus Torvalds wrote:
> > On Wed, Oct 20, 2021 at 12:44 PM Catalin Marinas wrote:
> > >
> > > However, with MTE doing both get_user() every 16 bytes and
> > > gup can get pretty expensive.
> >
> > So I really think that anything that is performance-critical had
> > better only do the "fault_in_write()" code path in the cold error path
> > where you took a page fault.
> [...]
> > So I wouldn't worry too much about the performance concerns. It simply
> > shouldn't be a common or hot path.
> >
> > And yes, I've seen code that does that "fault_in_xyz()" before the
> > critical operation that cannot take page faults, and does it
> > unconditionally.
> >
> > But then it isn't the "fault_in_xyz()" that should be blamed if it is
> > slow, but the caller that does things the wrong way around.
>
> Some more thinking out loud. I did some unscientific benchmarks on a
> Raspberry Pi 4 with the filesystem in a RAM block device and a
> "dd if=/dev/zero of=/mnt/test" writing 512MB in 1MB blocks. I changed
> fault_in_readable() in linux-next to probe every 16 bytes:
>
> - ext4 drops from around 261MB/s to 246MB/s: 5.7% penalty
>
> - btrfs drops from around 360MB/s to 337MB/s: 6.4% penalty
>
> For generic_perform_write() Dave Hansen attempted to move the fault-in
> after the uaccess in commit 998ef75ddb57 ("fs: do not prefault
> sys_write() user buffer pages"). This was reverted as it was exposing an
> ext4 bug. I don't [know] whether it was fixed but re-applying Dave's commit
> avoids the performance drop.

Interesting. The revert of commit 998ef75ddb57 is in commit
00a3d660cbac. Maybe Dave and Ted can tell us more about what went
wrong in ext4 and whether it's still an issue.

Commit 998ef75ddb57 looks mostly good except that it should loop
around whenever the fault-in succeeds even partially, so it needs the
semantic change of patch 4 [*] of this series. A copy of the same code
now lives in iomap_write_iter(), so the same fix needs to be applied
there.
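
To illustrate the "loop around on partial fault-in" behavior, here is a
rough userspace sketch, not kernel code. It assumes the fault-in helper
reports the number of bytes it could NOT make accessible (my reading of
the semantic change in patch 4), and it only faults in on the cold path
after a short copy. All model_* names and the buffer model are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a user buffer: bytes below `resident` can be
 * copied without faulting; bytes at or above `mappable` can never be
 * faulted in. */
struct model_buf {
	const char *data;
	size_t len;
	size_t resident;
	size_t mappable;
};

/* Stand-in for a copy with page faults disabled: copies only the
 * resident part of [off, off + n) and returns the bytes copied. */
static size_t model_copy_nofault(char *dst, const struct model_buf *src,
				 size_t off, size_t n)
{
	size_t avail, copied;

	assert(off <= src->len);
	avail = src->resident > off ? src->resident - off : 0;
	copied = n < avail ? n : avail;
	memcpy(dst, src->data + off, copied);
	return copied;
}

/* Stand-in for the fault-in helper (assumed convention: returns the
 * number of bytes that could not be faulted in, 0 on full success). */
static size_t model_fault_in(struct model_buf *src, size_t off, size_t n)
{
	size_t end = off + n < src->mappable ? off + n : src->mappable;

	if (end <= off)
		return n;		/* nothing could be faulted in */
	if (end > src->resident)
		src->resident = end;
	return off + n - end;		/* bytes still inaccessible */
}

/* Copy first, fault in only after a short copy, and retry as long as
 * the fault-in made any progress at all. */
static long model_write(char *dst, struct model_buf *src)
{
	size_t pos = 0;

	while (pos < src->len) {
		size_t want = src->len - pos;
		size_t copied = model_copy_nofault(dst + pos, src, pos, want);

		pos += copied;
		if (copied == want)
			break;
		/* Cold path: give up only if no byte could be faulted in. */
		if (model_fault_in(src, pos, want - copied) == want - copied)
			break;
	}
	return pos ? (long)pos : -EFAULT;
}
```

In this model a buffer that can only be partially faulted in yields a
short write rather than an error, and -EFAULT is returned only when no
progress is possible at all.
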

Finally, it may be worthwhile to check for pagefault_disabled() in
generic_perform_write() and iomap_write_iter() before trying the
fault-in; this would help gfs2, which will always call into
iomap_write_iter() with page faults disabled, and additional callers
like that could emerge relatively soon.

[*] https://lore.kernel.org/lkml/20211019134204.3382645-5-agruenba@redhat.com/

> btrfs_buffered_write() has a comment about faulting pages in before
> locking them in prepare_pages(). I suspect it's a similar problem and
> the fault_in() could be moved, though I can't say I understand this code
> well enough.
>
> Probing only the first byte(s) in fault_in() would be ideal, no need to
> go through all filesystems and try to change the uaccess/probing order.

Thanks,
Andreas