From: Barry Song <21cnbao@gmail.com>
Date: Fri, 17 Jun 2022 10:33:40 +1200
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
To: Yu Zhao
Cc: Linus Torvalds, Will Deacon, Andrew Morton, Linux-MM, Andi Kleen,
 Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Peter Zijlstra, Tejun Heo,
 Vlastimil Babka, LAK, Linux Doc Mailing List, LKML, x86,
 Kernel Page Reclaim v2, Brian Geffon, Jan Alexander Steffens,
 Oleksandr Natalenko, Steven Barrett, Suleiman Souhlal, Daniel Byrne,
 Donald Carr, Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai,
 Sofia Trinh, Vaibhav Jain, huzhanyuan@oppo.com
References: <20220518014632.922072-1-yuzhao@google.com>
 <20220518014632.922072-8-yuzhao@google.com>
 <20220607102135.GA32448@willie-the-truck>
 <20220607104358.GA32583@willie-the-truck>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 17, 2022 at 9:56 AM Yu Zhao wrote:
>
> On Wed, Jun 8, 2022 at 4:46 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Thu, Jun 9, 2022 at 3:52 AM Linus Torvalds wrote:
> > >
> > > On Tue, Jun 7, 2022 at 5:43 PM Barry Song <21cnbao@gmail.com> wrote:
> > > >
> > > > Given we used to have a flush for clear pte young in LRU, right now we are
> > > > moving to nop in almost all cases for the flush unless the address becomes
> > > > young exactly after look_around and before ptep_clear_flush_young_notify.
> > > > It means we are actually dropping the flush. So the question is, were we
> > > > overcautious? Do we actually not need the flush at all, even without mglru?
> > >
> > > We stopped flushing the TLB on A bit clears on x86 back in 2014.
> > >
> > > See commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case
> > > clear the accessed bit instead of flushing the TLB").
> >
> > This is true for x86, RISC-V, powerpc and s390, but it is not true for
> > most platforms.
> >
> > There was an attempt to do the same thing in arm64:
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793830.html
> > but arm64 still sent a nosync tlbi and depended on a deferred dsb:
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1794484.html
>
> Barry, you've already answered your own question.
>
> Without commit 07509e10dcc7 arm64: pgtable: Fix pte_accessible():
>  #define pte_accessible(mm, pte)	\
> -	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid_young(pte))
> +	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
>
> You missed all TLB flushes for PTEs that have gone through
> ptep_test_and_clear_young() on the reclaim path.
> But most of the time, you got away with it, only occasional app crashes:
> https://lore.kernel.org/r/CAGsJ_4w6JjuG4rn2P=d974wBOUtXUUnaZKnx+-G6a8_mSROa+Q@mail.gmail.com/
>
> Why?

Yes. On arm64, ptep_test_and_clear_young() without a flush can cause
random apps to crash, while ptep_test_and_clear_young() + flush does not.
But after applying commit 07509e10dcc7 ("arm64: pgtable: Fix
pte_accessible()"), ptep_test_and_clear_young() without a flush no longer
causes apps to crash on arm64:

ptep_test_and_clear_young(), with flush, without commit 07509e10dcc7: OK
ptep_test_and_clear_young(), without flush, with commit 07509e10dcc7: OK
ptep_test_and_clear_young(), without flush, without commit 07509e10dcc7: CRASH

So is it possible that other platforms have problems similar to arm64's
when they lack an equivalent of commit 07509e10dcc7? Anyway, we will have
to depend on the platforms that can actually use MGLRU to expose this kind
of potential bug.

BTW, do you think it is safe to remove the flush code entirely, even for
the original LRU? I don't see a fundamental difference between MGLRU and
the original LRU as far as this flush is concerned. If MGLRU doesn't need
the flush, why does the LRU need it? The flush is very expensive; if we
agree it is unnecessary, should we remove it for the original LRU as well?

BTW, look_around is a great idea. Our experiments also show a large
decrease in the CPU consumption of page_referenced().

Thanks
Barry
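
For reference, the three cases listed above can be reproduced with a small userspace model. This is only a sketch, not kernel code: struct model_pte, the function names and the tlb_cached flag are hypothetical stand-ins, and it assumes that the later unmap flushes the TLB only when pte_accessible() returns true, which is how I read the "you missed all TLB flushes" remark. The two accessible_* helpers mirror the two sides of the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified PTE: only the bits this thread is about. */
struct model_pte {
	bool present;	/* stand-in for pte_present() */
	bool valid;	/* stand-in for pte_valid() */
	bool young;	/* accessed flag */
};

/* Old definition; flush_pending stands in for mm_tlb_flush_pending(mm). */
static bool accessible_before_fix(bool flush_pending, struct model_pte pte)
{
	return flush_pending ? pte.present : (pte.valid && pte.young);
}

/* Definition after commit 07509e10dcc7: the young bit no longer matters. */
static bool accessible_after_fix(bool flush_pending, struct model_pte pte)
{
	return flush_pending ? pte.present : pte.valid;
}

/* Returns true if a stale TLB entry could survive the later unmap. */
static bool stale_tlb_possible(bool flush_on_clear, bool has_fix)
{
	struct model_pte pte = { .present = true, .valid = true, .young = true };
	bool tlb_cached = true;	/* the CPU has already used the mapping */
	bool accessible;

	/* Reclaim path: clear the young bit, optionally with a flush. */
	pte.young = false;
	if (flush_on_clear)
		tlb_cached = false;

	/* Later unmap (model assumption): flush only if "accessible". */
	accessible = has_fix ? accessible_after_fix(false, pte)
			     : accessible_before_fix(false, pte);
	if (accessible)
		tlb_cached = false;

	return tlb_cached;
}

/* The three rows from the table above, in the same order. */
static void model_self_check(void)
{
	assert(!stale_tlb_possible(true, false));	/* OK */
	assert(!stale_tlb_possible(false, true));	/* OK */
	assert(stale_tlb_possible(false, false));	/* CRASH row */
}
```

In this model only the "without flush, without commit" combination leaves tlb_cached set, which matches the CRASH row. It says nothing about whether other architectures make the same assumption in their own pte_accessible().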