References: <20220518014632.922072-1-yuzhao@google.com> <20220518014632.922072-8-yuzhao@google.com> <20220607102135.GA32448@willie-the-truck> <20220607104358.GA32583@willie-the-truck>
In-Reply-To: <20220607104358.GA32583@willie-the-truck>
From: Yu Zhao
Date: Tue, 7 Jun 2022 15:06:57 -0600
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
To: Will Deacon
Cc: Barry Song <21cnbao@gmail.com>, Andrew Morton, Linux-MM, Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet, Linus Torvalds, Matthew Wilcox, Mel Gorman, Michael Larabel, Michal Hocko, Mike Rapoport, Peter Zijlstra, Tejun Heo, Vlastimil Babka, LAK, Linux Doc Mailing List, LKML, x86, Kernel Page Reclaim v2, Brian Geffon, Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett, Suleiman Souhlal, Daniel
    Byrne, Donald Carr, Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain, huzhanyuan@oppo.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 7, 2022 at 4:44 AM Will Deacon wrote:
>
> On Tue, Jun 07, 2022 at 10:37:46AM +1200, Barry Song wrote:
> > On Tue, Jun 7, 2022 at 10:21 PM Will Deacon wrote:
> > > On Tue, Jun 07, 2022 at 07:37:10PM +1200, Barry Song wrote:
> > > > I can't really explain why we are getting a random app/java vm crash in monkey
> > > > test by using ptep_test_and_clear_young() only in lru_gen_look_around() on an
> > > > armv8-a machine without hardware PTE young support.
> > > >
> > > > Moving to ptep_clear_flush_young() in look_around can make the random
> > > > hang disappear according to zhanyuan(Cc-ed).
> > > >
> > > > On x86, ptep_clear_flush_young() is exactly ptep_test_and_clear_young() after
> > > > 'commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear
> > > > the accessed bit instead of flushing the TLB")'
> > > >
> > > > But on arm64, they are different. according to Will's comments in this
> > > > thread which tried to make arm64 same with x86,
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793881.html
> > > >
> > > > "
> > > > This is blindly copied from x86 and isn't true for us: we don't invalidate
> > > > the TLB on context switch. That means our window for keeping the stale
> > > > entries around is potentially much bigger and might not be a great idea.
> > > >
> > > > If we roll a TLB invalidation routine without the trailing DSB, what sort of
> > > > performance does that get you?
> > > > "
> > > > We shouldn't think ptep_clear_flush_young() is safe enough in LRU to
> > > > clear PTE young? Any comments from Will?
> > >
> > > Given that this issue is specific to the multi-gen LRU work, I think Yu is
> > > the best person to comment. However, looking quickly at your analysis above,
> > > I wonder if the code is relying on this sequence:
> > >
> > >
> > >         ptep_test_and_clear_young(vma, address, ptep);
> > >         ptep_clear_flush_young(vma, address, ptep);
> > >
> > >
> > > to invalidate the TLB. On arm64, that won't be the case, as the invalidation
> > > in ptep_clear_flush_young() is predicated on the pte being young (and this
> > > patches the generic implementation in mm/pgtable-generic.c. In fact, that
> > > second function call is always going to be a no-op unless the pte became
> > > young again in the middle.
> >
> > thanks for your reply, sorry for failing to let you understand my question.
> > my question is actually as below,
> > right now lru_gen_look_around() is using ptep_test_and_clear_young()
> > only without flush to clear pte for a couple of pages including the specific
> > address:
> > void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
> > {
> >        ...
> >
> >        for (i = 0, addr = start; addr != end; i++, addr += PAGE_SIZE) {
> >                ...
> >
> >                if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i))
> >                        continue;
> >
> >                ...
> > }
> >
> > I wonder if it is safe to arm64. Do we need to move to ptep_clear_flush_young()
> > in the loop?
>
> I don't know what this code is doing, so Yu is the best person to answer
> that. There's nothing inherently dangerous about eliding the TLB
> maintenance; it really depends on the guarantees needed by the caller.

Ack.
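
To make the quoted point about the two-call sequence concrete, here is a toy userspace model. It is only a sketch, not kernel code: `struct toy_pte`, `toy_tlb_flushes` and both `toy_*` helpers are hypothetical names invented for illustration. The only behaviour assumed is the one described above, namely that the flushing variant invalidates only when it finds the pte young.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page table entry: only the young (accessed) flag. */
struct toy_pte {
	bool young;
};

/* Counts modelled TLB invalidations (hypothetical, for illustration). */
static int toy_tlb_flushes;

/* Models ptep_test_and_clear_young(): clear the flag, no TLB maintenance. */
static bool toy_test_and_clear_young(struct toy_pte *pte)
{
	bool was_young = pte->young;

	pte->young = false;
	return was_young;
}

/*
 * Models ptep_clear_flush_young() as described in the thread:
 * the invalidation is predicated on the pte having been young.
 */
static bool toy_clear_flush_young(struct toy_pte *pte)
{
	bool was_young = toy_test_and_clear_young(pte);

	if (was_young)
		toy_tlb_flushes++;
	return was_young;
}
```

In this model, calling toy_test_and_clear_young() and then toy_clear_flush_young() on the same young pte leaves toy_tlb_flushes unchanged, because the second call finds the flag already cleared. The counter only moves if the flag is set again in between.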
> However, the snippet you posted from folio_referenced_one():
>
> |       if (pvmw.pte) {
> |  +            if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
> |  +                !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
> |  +                    lru_gen_look_around(&pvmw);
> |  +                    referenced++;
> |  +            }
> |  +
> |               if (ptep_clear_flush_young_notify(vma, address,
>
>
> Does seem to call lru_gen_look_around() *and*
> ptep_clear_flush_young_notify(), which is what prompted my question as it
> looks pretty suspicious to me.

The _notify variant reaches into the MMU notifier -- lru_gen_look_around()
doesn't do that because GPA space generally has no locality. I hope this
explains why both are called.

As to why the code is organized this way -- it depends on the point of
view. Mine is that lru_gen_look_around() is an add-on, since its logic is
independent/separable from ptep_clear_flush_young_notify(). We can make
lru_gen_look_around() include ptep_clear_flush_young_notify(), but that
would make the two functionally intertwined, which is not to my taste.
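
The "add-on" structure can be sketched as a toy userspace model. This is an illustration under stated assumptions, not the patch itself: every `toy_*` name is hypothetical, the VM_SEQ_READ/VM_RAND_READ check is left out, and the MMU notifier is reduced to a call counter that never reports a young page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a page table entry: only the young (accessed) flag. */
struct toy_pte {
	bool young;
};

/* Counts modelled MMU notifier calls (hypothetical, for illustration). */
static int toy_notifier_calls;

/* Models the look-around: clears young on a window of ptes, no notifier. */
static int toy_look_around(struct toy_pte *ptes, size_t n)
{
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		if (!ptes[i].young)
			continue;
		ptes[i].young = false;
		found++;
	}
	return found;
}

/* Models the _notify variant: one pte, plus a call into the notifier. */
static bool toy_clear_flush_young_notify(struct toy_pte *pte)
{
	bool was_young = pte->young;

	pte->young = false;
	toy_notifier_calls++;	/* the toy notifier never reports young */
	return was_young;
}

/* Models the quoted hunk: look-around first, then the unchanged call. */
static int toy_folio_referenced_one(struct toy_pte *ptes, size_t n,
				    size_t idx, bool gen_enabled)
{
	int referenced = 0;

	if (gen_enabled && ptes[idx].young) {
		toy_look_around(ptes, n);
		referenced++;
	}
	if (toy_clear_flush_young_notify(&ptes[idx]))
		referenced++;
	return referenced;
}
```

In this model the notifier is consulted exactly once per call whether or not the look-around ran, which is the separation described above: the look-around only touches the primary page table entries around the address, and the existing _notify call is left as it was.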