References: <20220524234531.1949-1-peterx@redhat.com>
In-Reply-To: <20220524234531.1949-1-peterx@redhat.com>
From: Max Filippov
Date: Thu, 26 May 2022 22:39:15 -0700
Subject: Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types
To: Peter Xu
Cc: LKML, Linux Memory Management List, Richard Henderson,
    David Hildenbrand, Matt Turner, Albert Ou, Michal Simek, Russell King,
    Ivan Kokshaysky, linux-riscv, Alexander Gordeev, Dave Hansen,
    Jonas Bonn, Will Deacon, "James E . J . Bottomley", "H . Peter Anvin",
    Andrea Arcangeli, openrisc@lists.librecores.org, linux-s390,
    Ingo Molnar, "open list:M68K ARCHITECTURE", Palmer Dabbelt,
    Heiko Carstens, Chris Zankel, Peter Zijlstra, Alistair Popple,
    linux-csky@vger.kernel.org, "open list:QUALCOMM HEXAGON...",
    Vlastimil Babka, Thomas Gleixner, "open list:SPARC + UltraSPAR...",
    Christian Borntraeger, Stafford Horne, Michael Ellerman,
    "maintainer:X86 ARCHITECTURE...", Thomas Bogendoerfer, Paul Mackerras,
    linux-arm-kernel@lists.infradead.org, Sven Schnelle,
    Benjamin Herrenschmidt, "open list:TENSILICA XTENSA PORT (xtensa)",
    Nicholas Piggin, "open list:SUPERH", Vasily Gorbik, Borislav Petkov,
    linux-mips@vger.kernel.org, Helge Deller, Vineet Gupta, Al Viro,
    Paul Walmsley, Johannes Weiner, Anton Ivanov, Catalin Marinas,
    linux-um@lists.infradead.org, "open list:ALPHA PORT", Johannes Berg,
    "open list:IA64 (Itanium) PL...", Geert Uytterhoeven, Dinh Nguyen,
    Guo Ren, linux-snps-arc@lists.infradead.org, Hugh Dickins, Rich Felker,
    Andy Lutomirski, Richard Weinberger, linuxppc-dev@lists.ozlabs.org,
    Brian Cain, Yoshinori Sato, Andrew Morton, Stefan Kristiansson,
    "open list:PARISC ARCHITECTURE", "David S . Miller"
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 24, 2022 at 4:45 PM Peter Xu wrote:
>
> I observed that for each of the shared file-backed page faults, we're very
> likely to retry one more time for the 1st write fault upon no page. It's
> because we'll need to release the mmap lock for dirty rate limit purpose
> with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
>
> Then after that throttling we return VM_FAULT_RETRY.
>
> We did that probably because VM_FAULT_RETRY is the only way we can return
> to the fault handler at that time telling it we've released the mmap lock.
>
> However that's not ideal because it's very likely the fault does not need
> to be retried at all since the pgtable was well installed before the
> throttling, so the next continuous fault (including taking mmap read lock,
> walk the pgtable, etc.) could be in most cases unnecessary.
>
> It's not only slowing down page faults for shared file-backed, but also add
> more mmap lock contention which is in most cases not needed at all.
>
> To observe this, one could try to write to some shmem page and look at
> "pgfault" value in /proc/vmstat, then we should expect 2 counts for each
> shmem write simply because we retried, and vm event "pgfault" will capture
> that.
>
> To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
> show that we've completed the whole fault and released the lock. It's also
> a hint that we should very possibly not need another fault immediately on
> this page because we've just completed it.
>
> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying 400MB shmem file being mmap()ed and these are
> the time it needs:
>
>   Before: 650.980 ms (+-1.94%)
>   After: 569.396 ms (+-1.38%)
>
> I believe it could help more than that.
>
> We need some special care on GUP and the s390 pgfault handler (for gmap
> code before returning from pgfault), the rest changes in the page fault
> handlers should be relatively straightforward.
>
> Another thing to mention is that mm_account_fault() does take this new
> fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
>
> I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
> not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
> them as-is.
>
> Signed-off-by: Peter Xu
> ---
>
> v3:
> - Rebase to akpm/mm-unstable
> - Copy arch maintainers
> ---

For xtensa:
Acked-by: Max Filippov

-- 
Thanks.
-- Max
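
For anyone who wants to reproduce the observation in the quoted description
(write to shmem pages and watch "pgfault" in /proc/vmstat), here is a minimal
userspace sketch. It is not the test program from the patch. The helper names
read_vmstat() and dirty_shared_pages() are made up for illustration, and a
MAP_SHARED | MAP_ANONYMOUS mapping (which the kernel backs with shmem) stands
in for the mmap()ed shmem file mentioned above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the value of one counter from /proc/vmstat,
 * or -1 if the file or the counter is not available.
 */
static long long read_vmstat(const char *name)
{
	FILE *f = fopen("/proc/vmstat", "r");
	char key[64];
	long long val;
	long long found = -1;

	if (!f)
		return -1;

	/* Each line has the form "<name> <value>". */
	while (fscanf(f, "%63s %lld", key, &val) == 2) {
		if (strcmp(key, name) == 0) {
			found = val;
			break;
		}
	}
	fclose(f);
	return found;
}

/*
 * Hypothetical helper: map npages of shared anonymous memory (assumed
 * here as a stand-in for a shmem file) and write one byte to each page,
 * so that every page takes its first write fault.
 * Returns the number of pages dirtied, or -1 if mmap() failed.
 */
static long dirty_shared_pages(long npages)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	size_t len;
	void *map;
	long i;

	assert(page_size > 0 && npages > 0);
	len = (size_t)npages * (size_t)page_size;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	p = map;
	for (i = 0; i < npages; i++)
		p[(size_t)i * (size_t)page_size] = 1;	/* first write to the page */

	munmap(map, len);
	return npages;
}
```

Reading read_vmstat("pgfault") before and after dirty_shared_pages(n) and
dividing the difference by n gives a rough faults-per-page figure. The counter
is system-wide, so other activity adds noise; a large n on an otherwise idle
machine makes the ratio easier to read.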