From: Linus Torvalds
Date: Fri, 4 Nov 2022 10:35:04 -0700
Subject: Re: mm: delay rmap removal until after TLB flush
To: Alexander Gordeev
Cc: Peter Zijlstra, Will Deacon, Aneesh Kumar, Nick Piggin,
    Heiko Carstens, Vasily Gorbik, Christian Borntraeger, Sven Schnelle,
    Nadav Amit, Jann Horn, John Hubbard, X86 ML, Matthew Wilcox,
    Andrew Morton, kernel list, Linux-MM, Andrea Arcangeli,
    "Kirill A . Shutemov", Joerg Roedel, Uros Bizjak, Alistair Popple,
    linux-arch
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 3, 2022 at 11:33 PM Alexander Gordeev wrote:
>
> I rather have a question to the generic part (had to master the code quoting).

Sure.

Although now I think the series is in Andrew's -mm tree, or just about
to get moved in there, so I'm not going to touch my actual branch any
more.

> > static void clean_and_free_pages_and_swap_cache(struct encoded_page **pages, unsigned int nr)
> > {
> > 	for (unsigned int i = 0; i < nr; i++) {
> > 		struct encoded_page *encoded = pages[i];
> > 		unsigned int flags = encoded_page_flags(encoded);
> > 		if (flags) {
> > 			/* Clean the flagged pointer in-place */
> > 			struct page *page = encoded_page_ptr(encoded);
> > 			pages[i] = encode_page(page, 0);
> >
> > 			/* The flag bit being set means that we should zap the rmap */
>
> Why TLB_ZAP_RMAP bit is not checked explicitly here, like in s390 version?
> (I assume, when/if ENCODE_PAGE_BITS is not TLB_ZAP_RMAP only, calling
> page_zap_pte_rmap() without such a check would be a bug).

No major reason. This is basically the same issue as the naming, which
I touched on in

    https://lore.kernel.org/all/CAHk-=wiDg_1up8K4PhK4+kzPN7xJG297=nw+tvgrGn7aVgZdqw@mail.gmail.com/

and the follow-up note about how I hope the "encoded page pointer with
flags" thing gets used by the mlock code some day too.

IOW, there's kind of a generic "I have extra flags associated with the
pointer", and then the specific "this case uses this flag", and
depending on which mindset you have at the time, you might do one or
the other.
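
For readers who have not seen the helpers quoted above, the generic
"pointer with extra flags" idea can be sketched in userspace C. This is
only an illustration, not the kernel's definitions: the stand-in
struct page, the value chosen for TLB_ZAP_RMAP, and the two-bit
ENCODE_PAGE_BITS mask are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's struct page; only its alignment matters here. */
struct page { long payload; };

/* Opaque type, so an encoded pointer cannot be dereferenced by accident. */
struct encoded_page;

/* Assumed values for this sketch: two low bits reserved, one of them used. */
#define ENCODE_PAGE_BITS 3ul
#define TLB_ZAP_RMAP     1ul

/* Pack the flags into the low bits that pointer alignment leaves unused. */
static struct encoded_page *encode_page(struct page *page, unsigned long flags)
{
	assert(((uintptr_t)page & ENCODE_PAGE_BITS) == 0);
	assert((flags & ~ENCODE_PAGE_BITS) == 0);
	return (struct encoded_page *)((uintptr_t)page | flags);
}

/* Generic view: "which extra bits travel with this pointer?" */
static unsigned long encoded_page_flags(struct encoded_page *encoded)
{
	return (uintptr_t)encoded & ENCODE_PAGE_BITS;
}

/* Strip the flag bits to recover the plain pointer. */
static struct page *encoded_page_ptr(struct encoded_page *encoded)
{
	return (struct page *)((uintptr_t)encoded & ~(uintptr_t)ENCODE_PAGE_BITS);
}
```

Encoding a suitably aligned page with TLB_ZAP_RMAP and decoding it
again yields the original pointer and that one flag; encoding with 0
yields a value whose flags read back as 0, which is what "cleaning the
pointer in-place" relies on.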

So in that clean_and_free_pages_and_swap_cache(), the code basically
knows "I have a pointer with extra flags", and it's written that way.
And that's partly historical, because it actually started with the
original code tracking the dirty bit as the extra piece of
information, and then transformed into this "no, the information is
TLB_ZAP_RMAP".

So "unsigned int flags" at one point was "bool dirty" instead, but
then became more of a "I think this whole page pointer with flags is
general", and the naming changed, and I had both cases in mind, and
then the code is perhaps not so specifically named.

I'm not sure the zap_page_range() case will ever use more than one
flag, but the mlock case already has two different flags. So the
"encode_page" thing is very much written to be about more than just
the zap_page_range() case.

But yes, that code could (and maybe should) use "if (flags &
TLB_ZAP_RMAP)" to make it clear that in this case, the single flags
bit is that one bit.

But the "if ()" there actually does two conceptually *separate*
things: it needs to clean the pointer in-place (which is regardless of
"which" flag bit is set), and then it does that page_zap_pte_rmap(),
which is just for the TLB_ZAP_RMAP bit.

So to be really explicit about it, you'd have two different tests: one
for "do I have flags that need to be cleaned up" and then an inner
test for each flag. And since there is only one flag in this
particular use case, it's essentially that inner test that I dropped
as pointless.

In contrast, in the s390 version, that bit was never encoded as a
general "flags associated with a page pointer" in the first place, so
there was never any such duality. There is only TLB_ZAP_RMAP.

Hope that explains the thinking.

              Linus
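
The "two different tests" structure described above can be sketched as
follows. This is a userspace illustration under stated assumptions, not
the code in the series: the struct page stand-in, the flag values, the
function name clean_encoded_pages(), and zap_rmap_stub() (a
hypothetical placeholder for page_zap_pte_rmap()) are all invented for
the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in page that merely counts how often its rmap was "zapped". */
struct page { long rmap_zapped; };
struct encoded_page;

/* Assumed values for this sketch. */
#define ENCODE_PAGE_BITS 3ul
#define TLB_ZAP_RMAP     1ul

static struct encoded_page *encode_page(struct page *page, unsigned long flags)
{
	assert(((uintptr_t)page & ENCODE_PAGE_BITS) == 0);
	assert((flags & ~ENCODE_PAGE_BITS) == 0);
	return (struct encoded_page *)((uintptr_t)page | flags);
}

static unsigned long encoded_page_flags(struct encoded_page *encoded)
{
	return (uintptr_t)encoded & ENCODE_PAGE_BITS;
}

static struct page *encoded_page_ptr(struct encoded_page *encoded)
{
	return (struct page *)((uintptr_t)encoded & ~(uintptr_t)ENCODE_PAGE_BITS);
}

/* Hypothetical placeholder for page_zap_pte_rmap(): records the call. */
static void zap_rmap_stub(struct page *page)
{
	page->rmap_zapped++;
}

static void clean_encoded_pages(struct encoded_page **pages, unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++) {
		struct encoded_page *encoded = pages[i];
		unsigned long flags = encoded_page_flags(encoded);

		/* Outer test: any flag at all means the slot must be cleaned. */
		if (flags) {
			struct page *page = encoded_page_ptr(encoded);

			pages[i] = encode_page(page, 0);

			/* Inner test: act only on the one specific flag. */
			if (flags & TLB_ZAP_RMAP)
				zap_rmap_stub(page);
		}
	}
}
```

With only one flag in use the inner test is always true whenever the
outer one is, which is why it can be dropped; it starts to matter only
if a second bit within ENCODE_PAGE_BITS is ever set on these pointers.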