Date: Mon, 5 Apr 2021 17:09:30 -0700
Message-Id: <20210406000930.3455850-1-cfijalkovich@google.com>
X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog
Subject: [PATCH v2] mm, thp: Relax the VM_DENYWRITE constraint on file-backed THPs
From: Collin Fijalkovich
Cc: songliubraving@fb.com, surenb@google.com, hridya@google.com,
    kaleshsingh@google.com, hughd@google.com, timmurray@google.com,
    william.kucharski@oracle.com, akpm@linux-foundation.org,
    willy@infradead.org, Collin Fijalkovich, Alexander Viro,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
X-Mailing-List: linux-kernel@vger.kernel.org

Transparent huge pages are supported for read-only non-shmem files,
but are only used for vmas with VM_DENYWRITE. This condition ensures
that file THPs are protected from writes while an application is
running (ETXTBSY). Any existing file THPs are then dropped from the
page cache when a file is opened for write in do_dentry_open(). Since
sys_mmap ignores MAP_DENYWRITE, this constrains the use of file THPs
to vmas produced by execve().

Systems that make heavy use of shared libraries (e.g. Android) are
unable to apply VM_DENYWRITE through the dynamic linker, preventing
them from benefiting from the resultant reduced contention on the TLB.

This patch relaxes the constraint on file THPs, allowing their use
with any executable mapping of a file not opened for write (see
inode_is_open_for_write()). It also introduces additional conditions
to ensure that files opened for write will never be backed by file
THPs.

Restricting the use of THPs to executable mappings eliminates the risk
that a read-only file later opened for write would encounter
significant latencies due to page cache truncation.

The ld linker flag '-z max-page-size=(hugepage size)' can be used to
produce executables with the necessary layout. The dynamic linker must
map these files' segments at a hugepage-size-aligned vma for the
mapping to be backed with THPs.

Comparison of the performance characteristics of 4KB and 2MB-backed
libraries follows; the Android dex2oat tool was used to AOT compile an
example application on a single ARM core.

4KB Pages:
==========

count              event_name            # count / runtime
598,995,035,942    cpu-cycles            # 1.800861 GHz
 81,195,620,851    raw-stall-frontend    # 244.112 M/sec
347,754,466,597    iTLB-loads            # 1.046 G/sec
  2,970,248,900    iTLB-load-misses      # 0.854122% miss rate

Total test time: 332.854998 seconds.

2MB Pages:
==========

count              event_name            # count / runtime
592,872,663,047    cpu-cycles            # 1.800358 GHz
 76,485,624,143    raw-stall-frontend    # 232.261 M/sec
350,478,413,710    iTLB-loads            # 1.064 G/sec
    803,233,322    iTLB-load-misses      # 0.229182% miss rate

Total test time: 329.826087 seconds

A check of /proc/$(pidof dex2oat64)/smaps shows THPs in use:

/apex/com.android.art/lib64/libart.so
FilePmdMapped:      4096 kB

/apex/com.android.art/lib64/libart-compiler.so
FilePmdMapped:      2048 kB

Signed-off-by: Collin Fijalkovich
---
Changes v1 -> v2:
* commit message 'non-shmem filesystems' -> 'non-shmem files'
* Add performance testing data to commit message

 fs/open.c       | 13 +++++++++++--
 mm/khugepaged.c | 16 +++++++++++++++-
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index e53af13b5835..f76e960d10ea 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -852,8 +852,17 @@ static int do_dentry_open(struct file *f,
 	 * XXX: Huge page cache doesn't support writing yet. Drop all page
 	 * cache for this file before processing writes.
 	 */
-	if ((f->f_mode & FMODE_WRITE) && filemap_nr_thps(inode->i_mapping))
-		truncate_pagecache(inode, 0);
+	if (f->f_mode & FMODE_WRITE) {
+		/*
+		 * Paired with smp_mb() in collapse_file() to ensure nr_thps
+		 * is up to date and the update to i_writecount by
+		 * get_write_access() is visible. Ensures subsequent insertion
+		 * of THPs into the page cache will fail.
+		 */
+		smp_mb();
+		if (filemap_nr_thps(inode->i_mapping))
+			truncate_pagecache(inode, 0);
+	}
 
 	return 0;
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a7d6cb912b05..4c7cc877d5e3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -459,7 +459,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 
 	/* Read-only file mappings need to be aligned for THP to work. */
 	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
-	    (vm_flags & VM_DENYWRITE)) {
+	    !inode_is_open_for_write(vma->vm_file->f_inode) &&
+	    (vm_flags & VM_EXEC)) {
 		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
 				HPAGE_PMD_NR);
 	}
@@ -1872,6 +1873,19 @@ static void collapse_file(struct mm_struct *mm,
 	else {
 		__mod_lruvec_page_state(new_page, NR_FILE_THPS, nr);
 		filemap_nr_thps_inc(mapping);
+		/*
+		 * Paired with smp_mb() in do_dentry_open() to ensure
+		 * i_writecount is up to date and the update to nr_thps is
+		 * visible. Ensures the page cache will be truncated if the
+		 * file is opened writable.
+		 */
+		smp_mb();
+		if (inode_is_open_for_write(mapping->host)) {
+			result = SCAN_FAIL;
+			__mod_lruvec_page_state(new_page, NR_FILE_THPS, -nr);
+			filemap_nr_thps_dec(mapping);
+			goto xa_locked;
+		}
 	}
 
 	if (nr_none) {
-- 
2.31.0.208.g409f899ff0-goog
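
Illustration (not part of the patch): the alignment test in
hugepage_vma_check() can be modelled in user space as below. This is a
minimal sketch that assumes 4 KiB base pages and 2 MiB PMD-sized huge
pages; thp_aligned() is a hypothetical helper name, and the two macros
are redefined locally rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB base pages, 2 MiB PMD huge pages. */
#define PAGE_SHIFT   12
#define HPAGE_PMD_NR 512UL /* base pages per huge page: 2 MiB / 4 KiB */

/* The mask trick below is only valid for a power-of-two page count. */
static_assert((HPAGE_PMD_NR & (HPAGE_PMD_NR - 1)) == 0,
	      "HPAGE_PMD_NR must be a power of two");

/*
 * Hypothetical user-space model of
 *   IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff, HPAGE_PMD_NR)
 * The start page of the mapping and its file offset (in pages) must be
 * congruent modulo the number of base pages per huge page.
 */
static bool thp_aligned(uint64_t vm_start, uint64_t vm_pgoff)
{
	uint64_t delta = (vm_start >> PAGE_SHIFT) - vm_pgoff;

	return (delta & (HPAGE_PMD_NR - 1)) == 0;
}
```

A mapping at a 2 MiB-aligned address with file offset 0 passes the
check. Shifting the address by one base page fails it unless the file
offset shifts by one page as well, which is why both the link-time
layout and the address chosen by the dynamic linker matter.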
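
Illustration (not part of the patch): the paired smp_mb() calls follow
the usual "publish my counter, full barrier, read the other counter"
pattern. The sketch below models it with C11 atomics, using a
sequentially consistent fence in place of smp_mb(). struct model_inode
and all function names are hypothetical stand-ins, not kernel APIs, and
the real code paths do considerably more work than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for inode->i_writecount and mapping->nr_thps. */
struct model_inode {
	atomic_int writecount;
	atomic_int nr_thps;
};

static void model_inode_init(struct model_inode *ino)
{
	atomic_init(&ino->writecount, 0);
	atomic_init(&ino->nr_thps, 0);
}

/*
 * Opener side: models get_write_access() followed by the check in
 * do_dentry_open(). Returns true if the page cache must be truncated.
 */
static bool model_open_for_write(struct model_inode *ino)
{
	atomic_fetch_add_explicit(&ino->writecount, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
	return atomic_load_explicit(&ino->nr_thps, memory_order_relaxed) > 0;
}

/*
 * Collapser side: models the tail of collapse_file(). Returns true if
 * the huge page stays in the page cache, false if it is backed out.
 */
static bool model_collapse(struct model_inode *ino)
{
	atomic_fetch_add_explicit(&ino->nr_thps, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
	if (atomic_load_explicit(&ino->writecount, memory_order_relaxed) > 0) {
		/* A writer exists: undo the count, as the patch does. */
		atomic_fetch_sub_explicit(&ino->nr_thps, 1,
					  memory_order_relaxed);
		return false;
	}
	return true;
}
```

With a full fence between the increment and the load on both sides, the
two threads cannot both read a stale zero: either the opener sees the
new nr_thps and truncates, or the collapser sees the writer and backs
out.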