From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Nadav Amit
Subject: [RESEND PATCH v3 0/5] mm/mprotect: avoid unnecessary TLB flushes
Date: Fri, 11 Mar 2022 11:07:44 -0800
Message-Id: <20220311190749.338281-1-namit@vmware.com>
X-Mailer: git-send-email 2.25.1

From: Nadav Amit

This patch-set is intended to remove unnecessary TLB flushes during
mprotect() syscalls. Once this patch-set makes it through, similar and
further optimizations for MADV_COLD and userfaultfd would be possible.

Sorry for the time it took me to get to v3.

Basically, there are 3 optimizations in this patch-set:

1. Use the TLB batching infrastructure to batch flushes across VMAs and
   do better/fewer flushes. This would also be handy for later
   userfaultfd enhancements.
2. Avoid TLB flushes on permission promotion. This optimization is the
   one that provides most of the performance benefits. Note that the
   previous batching infrastructure changes are needed for that to
   happen.
3. Avoid TLB flushes in change_huge_pmd() that are only needed to
   prevent the A/D bits from changing.

Andrew asked for some benchmark numbers. I do not have a simple,
deterministic macrobenchmark in which the benefit is easy to show. I
therefore ran a microbenchmark: a loop that does the following on
anonymous memory, just as a sanity check to see that time is saved by
avoiding TLB flushes. The loop goes:

	mprotect(p, PAGE_SIZE, PROT_READ)
	mprotect(p, PAGE_SIZE, PROT_READ|PROT_WRITE)
	*p = 0; // make the page writable

The test was run in a KVM guest with 1 or 2 threads (the second thread
was busy-looping). I measured the time (cycles) of each operation:

		     1 thread			2 threads
		mmots	+patch		mmots	+patch
PROT_READ	3494	2725 (-22%)	8630	7788 (-10%)
PROT_READ|WRITE	3952	2724 (-31%)	9075	2865 (-68%)

[ mmots = v5.17-rc6-mmots-2022-03-06-20-38 ]

The exact numbers are really meaningless, but the benefit is clear.
There are 2 interesting results though.

(1) PROT_READ is cheaper, while one might expect it not to be affected.
    This is presumably due to a TLB miss that is saved.

(2) Without memory access (*p = 0), the speedup of the patch is even
    greater. In that scenario mprotect(PROT_READ) also avoids the TLB
    flush. As a result both operations on the patched kernel take
    roughly 1500 cycles (with either 1 or 2 threads), whereas on mmots
    their cost is as high as presented in the table.

--

v2 -> v3:
* Fix order of patches (the previous order could lead to breakage)
* Better comments
* Clearer KNL detection [Dave]
* Assertion on PF error-code [Dave]
* Comments, code, function names improvements [PeterZ]
* Flush on access-bit clearing on PMD changes to follow the way
  flushing on x86 is done today in the kernel.

v1 -> v2:
* Wrong detection of permission demotion [Andrea]
* Better comments [Andrea]
* Handle THP [Andrea]
* Batching across VMAs [Peter Xu]
* Avoid open-coding PTE analysis
* Fix wrong use of the mmu_gather()

Nadav Amit (5):
  x86: Detection of Knights Landing A/D leak
  x86/mm: check exec permissions on fault
  mm/mprotect: use mmu_gather
  mm/mprotect: do not flush on permission promotion
  mm: avoid unnecessary flush on change_huge_pmd()

 arch/x86/include/asm/cpufeatures.h   |  1 +
 arch/x86/include/asm/pgtable.h       |  5 ++
 arch/x86/include/asm/pgtable_types.h |  2 +
 arch/x86/include/asm/tlbflush.h      | 82 ++++++++++++++++++++++++
 arch/x86/kernel/cpu/intel.c          |  5 ++
 arch/x86/mm/fault.c                  | 22 ++++++-
 arch/x86/mm/pgtable.c                | 10 +++
 fs/exec.c                            |  6 +-
 include/asm-generic/tlb.h            | 14 +++++
 include/linux/huge_mm.h              |  5 +-
 include/linux/mm.h                   |  5 +-
 include/linux/pgtable.h              | 20 ++++++
 mm/huge_memory.c                     | 19 ++++--
 mm/mprotect.c                        | 94 +++++++++++++++-------------
 mm/pgtable-generic.c                 |  8 +++
 mm/userfaultfd.c                     |  6 +-
 16 files changed, 248 insertions(+), 56 deletions(-)

-- 
2.25.1
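
For readers who want to try something similar, the microbenchmark loop
described above could be sketched in user space roughly as follows. This
is only an illustration under assumptions, not the program that produced
the numbers in the table: the names run_mprotect_bench(), struct
bench_result and now_ns() are hypothetical, it measures nanoseconds with
clock_gettime() rather than cycles, and it runs single-threaded (the
2-thread case would need an additional busy-looping thread).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Accumulated time, in nanoseconds, spent in each mprotect() flavor. */
struct bench_result {
	uint64_t ro_ns;	/* mprotect(PROT_READ) */
	uint64_t rw_ns;	/* mprotect(PROT_READ|PROT_WRITE) */
};

/* Monotonic clock in nanoseconds (assumption: the original used cycles). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: repeat the demote/promote/write sequence 'iters'
 * times on a single anonymous page. Returns 0 on success and -1 if
 * mmap() or mprotect() fails.
 */
int run_mprotect_bench(unsigned long iters, struct bench_result *res)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile char *p;
	unsigned long i;
	int ret = 0;

	assert(res != NULL);
	res->ro_ns = 0;
	res->rw_ns = 0;

	p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	*p = 0;	/* fault the page in before measuring */

	for (i = 0; i < iters; i++) {
		uint64_t t0, t1, t2;

		t0 = now_ns();
		if (mprotect((void *)p, page_size, PROT_READ)) {
			ret = -1;
			break;
		}
		t1 = now_ns();
		if (mprotect((void *)p, page_size, PROT_READ | PROT_WRITE)) {
			ret = -1;
			break;
		}
		t2 = now_ns();

		*p = 0;	/* write to the page, as in the loop above */

		res->ro_ns += t1 - t0;
		res->rw_ns += t2 - t1;
	}

	munmap((void *)p, page_size);
	return ret;
}
```

Calling run_mprotect_bench(iters, &res) from main() and dividing each
total by iters gives a per-call average. Dropping the "*p = 0" line
inside the loop corresponds to the "without memory access" scenario in
(2). Absolute values depend heavily on the machine and on whether the
test runs in a guest, so only the relative difference between kernels is
meaningful.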