Received: by 2002:a05:6358:c692:b0:131:369:b2a3 with SMTP id fe18csp365018rwb; Thu, 27 Jul 2023 14:06:42 -0700 (PDT) X-Google-Smtp-Source: APBJJlF5ywDS6yFgV9sJT4kGsQThrXwLruSQorTu+lrlJ7XwgqiG4rKg2TK2iHEfc4Uk7SK8DUkU X-Received: by 2002:a17:90a:8048:b0:268:f2e:b478 with SMTP id e8-20020a17090a804800b002680f2eb478mr371969pjw.3.1690492002504; Thu, 27 Jul 2023 14:06:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690492002; cv=none; d=google.com; s=arc-20160816; b=MK383jrwh6OPB9bpM/CVPePmuNd73V41P0TAuzkNMh0XhnQpo61TacVneVtvEqj5vD Et+GTFeuJGTuvXe1zB40ES+4rvdzYjJANYPDM1q0vHDFWke4QtzuhBvm7WTUWg9xUV/u aa0Os/oU20/6HTiAVdhrzwzYTW9td3eotiDlG/ktetzVu7IP+fD4mq1Je57RZasiS8+u AlU7rP1cSXopoftbPznSWSyZd8OjrE0YlbZqHpS55ozQtEvMnJliapY2vZhHA532kyAu yH5ONmzQSed+nH2NDX/BUWcHZ0GU9uLQA3v0QCMUPZRvfkOKD5CpOsSDkIbLwqZnyZhE kOTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=AuxACaLiBeZZ2plyxBG2ZphRqj/+K7EVaUkCF5E72Uk=; fh=YKgoKmAuGYA/Su4S9gY5rfCtDTdqq178If4u9r6PrhY=; b=C/wvRFXYI6EIA/Ay9VfgHt4eJg1rE6V/CF44JpAp2jc4yxYUbmMHWTQ0paD5B1L21y Uu60348fe8yvSI2/MyZh2goHw0M+kuGgRbUXXXHg+XEwvcC4HpGRCuR1Bgfhhz5ge0/8 NpHMrrmefbS1UgQfP2WGPMR6+xr+SzGKqQtFvBT4YTE3ibHfo2Sz7Jcx+ucvLMVpQ146 nt7+bCFEu09W/OJ3bRUuf9qN6HRuuKr5ryF4g1c/MuPtkZjcCcnasul+0LVMcXKD8xfV ULetLVIYtVoUcZj6czIe4VXN9i/JATT3ixAsF+CeRA0ihpZmyevqUxchBtJGAxe4b423 5Lgg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TXdqxj55; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p4-20020a17090a4f0400b002683fd38fd4si1906231pjh.31.2023.07.27.14.06.29; Thu, 27 Jul 2023 14:06:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=TXdqxj55; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231656AbjG0UbW (ORCPT + 99 others); Thu, 27 Jul 2023 16:31:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231295AbjG0UbR (ORCPT ); Thu, 27 Jul 2023 16:31:17 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D0B4C2688 for ; Thu, 27 Jul 2023 13:30:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1690489830; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=AuxACaLiBeZZ2plyxBG2ZphRqj/+K7EVaUkCF5E72Uk=; b=TXdqxj556hIQReAX8Io5G5dtm83wst1ubQym8sjxCYZxNE6s8Al8TZ5cJDd16d+xDPiqyb mT4a3S6irjkEgzDHJnoejxaD3CUBE4CiF9+AiYp5iH/nKlJR+g2SiFx5sIJjRWsFXvBHqr ug6VkUhDNI7jPuIs1aCtq+dKkFckXBk= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-682-3b9fZzLaPvSZSZ9daSXJow-1; Thu, 27 Jul 2023 16:30:28 -0400 X-MC-Unique: 3b9fZzLaPvSZSZ9daSXJow-1 Received: by mail-qv1-f69.google.com with SMTP id 6a1803df08f44-63cc3a44aedso3417726d6.0 for ; Thu, 27 Jul 2023 13:30:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690489828; x=1691094628; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=AuxACaLiBeZZ2plyxBG2ZphRqj/+K7EVaUkCF5E72Uk=; b=j1ziU0RmTIlr3yOE5mtd/2fMYL93b2CAztIps5DhRQwFlEVJ8Sn/J0jtLyFIMw3i+s 9jCj5Kn2Hwd8vK3R3BwCpyn5xoGN4dX7lQcEaxiTfPhR9Cax8ykLa8LBzhUtfPPxGexZ Zl3QjHCrDhl+VK6jSEjZ5bYQsuGEfXvXAaoRZcPg/woPdC/nZ0YEAdTT08l2F+So51Im fqavfq9nkpSTaxWI0g8lcod5nwqyoLpof0kkyBudMZItVdbzOevpyy/wbSsJR/Afg41o /lQs79SmeFw27qzUN92mAeJF++8IUvmhQ8NBYNlF/BqP7puXdegaSqB5zMs1p7AMwG7a RGQA== X-Gm-Message-State: ABy/qLYn5eTaCO4arM4G/CvwGfma7X/MNVyzqFjoDJeNjllNQWmczZ0a QHIknGDrUBfnkFM/ZF0tzgDzmCn2Bbp7VVMP6pcBlsgmNQ7hLfRZRj70WOJsMhIuVy+KnKIivHa Ai53dzeqGXGkIPzA1qpLLHKDv X-Received: by 2002:a05:6214:5199:b0:621:65de:f60c with SMTP id kl25-20020a056214519900b0062165def60cmr528116qvb.3.1690489828023; Thu, 27 Jul 2023 13:30:28 -0700 (PDT) X-Received: by 2002:a05:6214:5199:b0:621:65de:f60c with SMTP id kl25-20020a056214519900b0062165def60cmr528101qvb.3.1690489827673; Thu, 27 Jul 2023 13:30:27 -0700 (PDT) Received: from x1n (cpe5c7695f3aee0-cm5c7695f3aede.cpe.net.cable.rogers.com. [99.254.144.39]) by smtp.gmail.com with ESMTPSA id h13-20020a0cab0d000000b00635eeb8a4fcsm668775qvb.114.2023.07.27.13.30.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Jul 2023 13:30:27 -0700 (PDT) Date: Thu, 27 Jul 2023 16:30:25 -0400 From: Peter Xu To: David Hildenbrand Cc: liubo , akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, hughd@google.com, willy@infradead.org Subject: Re: [PATCH] smaps: Fix the abnormal memory statistics obtained through /proc/pid/smaps Message-ID: References: <20230726073409.631838-1-liubo254@huawei.com> <5a2c9ae4-50f5-3301-3b50-f57026e1f8e8@redhat.com> <30e58727-0a6a-4461-e9b1-f64d6eea026c@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <30e58727-0a6a-4461-e9b1-f64d6eea026c@redhat.com> X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jul 27, 2023 at 09:17:45PM +0200, David Hildenbrand wrote: > On 27.07.23 20:59, Peter Xu wrote: > > On Thu, Jul 27, 2023 at 07:27:02PM +0200, David Hildenbrand wrote: > > > > > > > > > > This was wrong from the very start. If we're not in GUP, we shouldn't call > > > > > GUP functions. > > > > > > > > My understanding is !GET && !PIN is also called gup.. otherwise we don't > > > > need GET and it can just be always implied. > > > > > > That's not the point. The point is that _arbitrary_ code shouldn't call into > > > GUP internal helper functions, where they bypass, for example, any sanity > > > checks. > > > > What's the sanity checks that you're referring to? > > > > For example in follow_page() > > if (vma_is_secretmem(vma)) > return NULL; > > if (WARN_ON_ONCE(foll_flags & FOLL_PIN)) > return NULL; > > > Maybe you can elaborate why you think we should *not* be using > vm_normal_page_pmd() and instead some arbitrary GUP internal helper? I don't > get it. Because the old code was written like that? You're proposing to change it here. Again, I'm fine with the change, but please don't ask me to justify why the original code is fine.. because I simply don't see anything majorly wrong with either, it's the change that needs justification, not keeping it as-is (since Kirill wrote it in 2014). Well.. I feel like this becomes less helpful to discuss. let's try to move on. > > > > > > > > > > > > The other proof is try_grab_page() doesn't fail hard on !GET && !PIN. So I > > > > don't know whether that's "wrong" to be used.. > > > > > > > > > > To me, that is arbitrary code using a GUP internal helper and, therefore, > > > wrong. > > > > > > > Back to the topic: I'd say either of the patches look good to solve the > > > > problem. If p2pdma pages are mapped as PFNMAP/MIXEDMAP (?), I guess > > > > vm_normal_page_pmd() proposed here will also work on it, so nothing I see > > > > wrong on 2nd one yet. > > > > > > > > It looks nicer indeed to not have FOLL_FORCE here, but it also makes me > > > > just wonder whether we should document NUMA behavior for FOLL_* somewhere, > > > > because we have an implication right now on !FOLL_FORCE over NUMA, which is > > > > not obvious to me.. > > > > > > Yes, we probably should. For get_use_pages() and friends that behavior was > > > always like that and it makes sense: usually it represent application > > > behavior. > > > > > > > > > > > And to look more over that aspect, see follow_page(): previously we can > > > > follow a page for protnone (as it never applies FOLL_NUMA) but now it won't > > > > (it never applies FOLL_FORCE, either, so it seems "accidentally" implies > > > > FOLL_NUMA now). Not sure whether it's intended, though.. > > > > > > That was certainly an oversight, thanks for spotting that. That patch was > > > not supposed to change semantics: > > > > > > diff --git a/mm/gup.c b/mm/gup.c > > > index 76d222ccc3ff..ac926e19ff72 100644 > > > --- a/mm/gup.c > > > +++ b/mm/gup.c > > > @@ -851,6 +851,13 @@ struct page *follow_page(struct vm_area_struct *vma, > > > unsigned long address, > > > if (WARN_ON_ONCE(foll_flags & FOLL_PIN)) > > > return NULL; > > > > > > + /* > > > + * In contrast to get_user_pages() and friends, we don't want to > > > + * fail if the PTE is PROT_NONE: see gup_can_follow_protnone(). > > > + */ > > > + if (!(foll_flags & FOLL_WRITE)) > > > + foll_flags |= FOLL_FORCE; > > > + > > > page = follow_page_mask(vma, address, foll_flags, &ctx); > > > if (ctx.pgmap) > > > put_dev_pagemap(ctx.pgmap); > > > > This seems to be slightly against your other solution though for smaps, > > where we want to avoid abusing FOLL_FORCE.. isn't it.. > > This is GUP internal, not some arbitrary code, so to me a *completely* > different discussion. > > > > > Why read only? That'll always attach FOLL_FORCE to all follow page call > > sites indeed for now, but just curious - logically "I want to fetch the > > page even if protnone" is orthogonal to do with write permission here to > > me. > > Historical these were not the semantics, so I won't change them. > > FOLL_FORCE | FOLL_WRITE always had a special taste to it (COW ...). > > > > > I still worry about further abuse of FOLL_FORCE, I believe you also worry > > that so you proposed the other way for the smaps issue. > > > > Do you think we can just revive FOLL_NUMA? That'll be very clear to me > > from that aspect that we do still have valid use cases for it. > > FOLL_NUMA naming was nowadays wrong to begin with (not to mention, confusing > a we learned). There are other reasons why we have PROT_NONE -- mprotect(), > for example. It doesn't really violate with the name, IMHO - protnone can be either numa hint or PROT_NONE for real. As long as we return NULL for a FOLL_NUMA request we're achieving the goal we want - we guarantee a NUMA balancing to trigger with when FOLL_NUMA provided. It doesn't need to guarantee anything else, afaiu. The final check relies in vma_is_accessible() in the fault paths anyway. So I don't blame the old name that much. > > We could have a flag that goes the other way around: FOLL_IGNORE_PROTNONE > ... which surprisingly then ends up being exactly what FOLL_FORCE means > without FOLL_WRITE, and what this patch does. > > Does that make sense to you? > > > > > > The very least is if with above we should really document FOLL_FORCE - we > > should mention NUMA effects. But that's ... really confusing. Thinking > > about that I personally prefer a revival of FOLL_NUMA, then smaps issue all > > go away. > > smaps needs to be changed in any case IMHO. And I'm absolutely not in favor > of revicing FOLL_NUMA. As stated above, to me FOLL_NUMA is all fine and clear. If you think having a flag back for protnone is worthwhile no matter as-is (FOLL_NUMA) or with reverted meaning, then that sounds all fine to me. Maybe the old name at least makes old developers know what's that. I don't have a strong opinion on names though; mostly never had. Thanks, -- Peter Xu