Date: Mon, 22 Mar 2021 14:58:23 -0700
Message-Id: <20210322215823.962758-1-cfijalkovich@google.com>
X-Mailer: git-send-email 2.31.0.rc2.261.g7f71774620-goog
Subject: [PATCH] mm, thp: Relax the VM_DENYWRITE constraint on file-backed THPs
From: Collin Fijalkovich
Cc: songliubraving@fb.com, surenb@google.com, hridya@google.com,
    kaleshsingh@google.com, hughd@google.com, timmurray@google.com,
    Collin Fijalkovich, Alexander Viro, Andrew Morton,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
X-Mailing-List: linux-kernel@vger.kernel.org

Transparent huge pages are supported for read-only non-shmem filesystems,
but are only used for vmas with VM_DENYWRITE. This condition ensures that
file THPs are protected from writes while an application is running
(ETXTBSY). Any existing file THPs are then dropped from the page cache
when a file is opened for write in do_dentry_open(). Since sys_mmap
ignores MAP_DENYWRITE, this constrains the use of file THPs to vmas
produced by execve().

Systems that make heavy use of shared libraries (e.g. Android) are unable
to apply VM_DENYWRITE through the dynamic linker, preventing them from
benefiting from the resultant reduced contention on the TLB.

This patch relaxes the constraint on file THPs, allowing their use with
any executable mapping from a file not opened for write (see
inode_is_open_for_write()). It also introduces additional conditions to
ensure that files opened for write will never be backed by file THPs.

Restricting the use of THPs to executable mappings eliminates the risk that
a read-only file later opened for write would encounter significant
latencies due to page cache truncation.

The ld linker flag '-z max-page-size=(hugepage size)' can be used to
produce executables with the necessary layout. The dynamic linker must
map these files' segments at a hugepage-size-aligned vma for the mapping
to be backed with THPs.

Signed-off-by: Collin Fijalkovich
---
 fs/open.c       | 13 +++++++++++--
 mm/khugepaged.c | 16 +++++++++++++++-
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index e53af13b5835..f76e960d10ea 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -852,8 +852,17 @@ static int do_dentry_open(struct file *f,
 	 * XXX: Huge page cache doesn't support writing yet. Drop all page
 	 * cache for this file before processing writes.
 	 */
-	if ((f->f_mode & FMODE_WRITE) && filemap_nr_thps(inode->i_mapping))
-		truncate_pagecache(inode, 0);
+	if (f->f_mode & FMODE_WRITE) {
+		/*
+		 * Paired with smp_mb() in collapse_file() to ensure nr_thps
+		 * is up to date and the update to i_writecount by
+		 * get_write_access() is visible. Ensures subsequent insertion
+		 * of THPs into the page cache will fail.
+		 */
+		smp_mb();
+		if (filemap_nr_thps(inode->i_mapping))
+			truncate_pagecache(inode, 0);
+	}
 
 	return 0;
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a7d6cb912b05..4c7cc877d5e3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -459,7 +459,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 
 	/* Read-only file mappings need to be aligned for THP to work. */
 	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
-	    (vm_flags & VM_DENYWRITE)) {
+	    !inode_is_open_for_write(vma->vm_file->f_inode) &&
+	    (vm_flags & VM_EXEC)) {
 		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
 				HPAGE_PMD_NR);
 	}
@@ -1872,6 +1873,19 @@ static void collapse_file(struct mm_struct *mm,
 	else {
 		__mod_lruvec_page_state(new_page, NR_FILE_THPS, nr);
 		filemap_nr_thps_inc(mapping);
+		/*
+		 * Paired with smp_mb() in do_dentry_open() to ensure
+		 * i_writecount is up to date and the update to nr_thps is
+		 * visible. Ensures the page cache will be truncated if the
+		 * file is opened writable.
+		 */
+		smp_mb();
+		if (inode_is_open_for_write(mapping->host)) {
+			result = SCAN_FAIL;
+			__mod_lruvec_page_state(new_page, NR_FILE_THPS, -nr);
+			filemap_nr_thps_dec(mapping);
+			goto xa_locked;
+		}
 	}
 
 	if (nr_none) {
-- 
2.31.0.rc2.261.g7f71774620-goog
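
For readers who want to experiment with the relaxed condition outside the
kernel, the following is a minimal userspace sketch of the check that
hugepage_vma_check() performs after this patch. It is not kernel code. The
names model_vma, model_open_for_write() and model_file_thp_eligible() are
hypothetical, and the constants assume 4 KiB base pages with 2 MiB PMD-sized
huge pages. The sketch also simplifies one point: when the file-THP branch
does not apply, the real function falls through to its remaining checks,
whereas the model simply returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: 4 KiB base pages, 2 MiB PMD-sized huge pages. */
#define MODEL_PAGE_SHIFT   12u
#define MODEL_HPAGE_PMD_NR 512u	/* 2 MiB / 4 KiB */

/* Hypothetical stand-in for the fields the kernel check reads. */
struct model_vma {
	uint64_t vm_start;	/* start address of the mapping, in bytes */
	uint64_t vm_pgoff;	/* file offset of the mapping, in pages */
	bool exec;		/* models (vm_flags & VM_EXEC) */
	int i_writecount;	/* models inode->i_writecount */
};

/* Models inode_is_open_for_write(): true while a writer holds the file. */
static bool model_open_for_write(const struct model_vma *vma)
{
	return vma->i_writecount > 0;
}

/*
 * Models the relaxed condition: the mapping must be executable, the file
 * must not be open for write, and the virtual page number minus the file
 * page offset must be a multiple of the huge page size in base pages.
 */
static bool model_file_thp_eligible(const struct model_vma *vma)
{
	uint64_t delta;

	/* The mask test below is only valid for a power-of-two page count. */
	assert((MODEL_HPAGE_PMD_NR & (MODEL_HPAGE_PMD_NR - 1)) == 0);

	if (!vma->exec || model_open_for_write(vma))
		return false;

	delta = (vma->vm_start >> MODEL_PAGE_SHIFT) - vma->vm_pgoff;
	return (delta & (MODEL_HPAGE_PMD_NR - 1)) == 0;
}
```

In this model, an executable mapping at 0x200000 with vm_pgoff 0 qualifies,
while the same mapping shifted to 0x201000 does not unless vm_pgoff is
shifted by one page as well. This is why the dynamic linker has to place
segments at addresses congruent to their file offset modulo the huge page
size.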