Subject: Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
From: Yang Shi
To: Michal Hocko
Cc: "Kirill A. Shutemov", vbabka@suse.cz, rientjes@google.com, kirill@shutemov.name, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Hugh Dickins
References: <1556037781-57869-1-git-send-email-yang.shi@linux.alibaba.com> <20190423175252.GP25106@dhcp22.suse.cz> <5a571d64-bfce-aa04-312a-8e3547e0459a@linux.alibaba.com> <859fec1f-4b66-8c2c-98ee-2aee9358a81a@linux.alibaba.com> <20190507104709.GP31017@dhcp22.suse.cz>
Message-ID: <217fc290-5800-31de-7d46-aa5c0f7b1c75@linux.alibaba.com>
Date: Thu, 6 Jun 2019 11:59:21 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/7/19 10:10 AM, Yang Shi wrote:
>
>
> On 5/7/19 3:47 AM, Michal Hocko wrote:
>> [Hmm, I thought, Hugh was CCed]
>>
>> On Mon 06-05-19 16:37:42, Yang Shi wrote:
>>>
>>> On 4/28/19 12:13 PM, Yang Shi wrote:
>>>>
>>>> On 4/23/19 10:52 AM, Michal Hocko wrote:
>>>>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>>>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
>>>>>> for each vma") introduced THPeligible bit for processes' smaps.
>>>>>> But, when checking the eligibility for shmem vma,
>>>>>> __transparent_hugepage_enabled() is called to override the result
>>>>>> from shmem_huge_enabled().  It may result in the anonymous vma's
>>>>>> THP flag override shmem's.  For example, running a simple test
>>>>>> which create THP for shmem, but with anonymous THP disabled, when
>>>>>> reading the process's smaps, it may show:
>>>>>>
>>>>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>>>>> Size:               4096 kB
>>>>>> ...
>>>>>> [snip]
>>>>>> ...
>>>>>> ShmemPmdMapped:     4096 kB
>>>>>> ...
>>>>>> [snip]
>>>>>> ...
>>>>>> THPeligible:    0
>>>>>>
>>>>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>>>>
>>>>>> ShmemHugePages:     4096 kB
>>>>>> ShmemPmdMapped:     4096 kB
>>>>>>
>>>>>> This doesn't make too much sense.  The anonymous THP flag should not
>>>>>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>>>>>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>>>>>> dax vma check since we already checked if the vma is shmem already.
>>>>> Kirill, can we get a confirmation that this is really intended
>>>>> behavior rather than an omission please? Is this documented? What is
>>>>> a global knob to simply disable THP system wide?
>>>> Hi Kirill,
>>>>
>>>> Ping. Any comment?
>>> Talked with Kirill at LSFMM, it sounds this is kind of intended
>>> behavior according to him. But, we all agree it looks inconsistent.
>>>
>>> So, we may have two options:
>>>      - Just fix the false negative issue as what the patch does
>>>      - Change the behavior to make it more consistent
>>>
>>> I'm not sure whether anyone relies on the behavior explicitly or
>>> implicitly or not.
>> Well, I would be certainly more happy with a more consistent behavior.
>> Talked to Hugh at LSFMM about this and he finds treating shmem objects
>> separately from the anonymous memory. And that is already the case
>> partially when each mount point might have its own setup. So the primary
>> question is whether we need a one global knob to control all THP
>> allocations. One argument to have that is that it might be helpful
>> for an admin to simply disable source of THP at a single place rather
>> than crawling over all shmem mount points and remount them. Especially
>> in environments where shmem points are mounted in a container by a
>> non-root. Why would somebody want something like that?
>> One example would be to temporarily work around high order allocation
>> issues which we have seen a non-trivial amount of in the past and we
>> are likely not at the end of the tunnel.
>
> Shmem has a global control for such use. Setting shmem_enabled to
> "force" or "deny" would enable or disable THP for shmem globally,
> including non-fs objects, i.e. memfd, SYS V shmem, etc.
>
>>
>> That being said I would be in favor of treating the global sysfs knob to
>> be global for all THP allocations. I will not push back on that if there
>> is a general consensus that shmem and fs in general are a different
>> class of objects and a single global control is not desirable for
>> whatever reasons.
>
> OK, we need more inputs from Kirill, Hugh and other folks.

[Forgot cc to mailing lists]

Hi guys,

How should we move forward for this one? Make the sysfs knob
(/sys/kernel/mm/transparent_hugepage/enabled) to be global for both
anonymous and tmpfs? Or just treat shmem objects separately from anon
memory then fix the false-negative of THP eligibility by this patch?

>
>>
>> Kirill, Hugh other folks?
>
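
For anyone who wants to check whether a process shows the false negative described in the original patch description, here is a minimal sketch in Python. It is only an illustration, not part of the patch: it scans smaps-style text and reports mappings that have PMD-mapped shmem pages (ShmemPmdMapped > 0) while THPeligible reads 0. The function names (parse_smaps, find_false_negatives) and the SAMPLE text are made up for this example; the sample is trimmed from the smaps output quoted in the thread, and the regular expressions assume the usual "Name:   value kB" field layout.

```python
import re

# Assumed layout of a mapping header line in smaps, e.g.
# "7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test"
HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")
# Assumed layout of a numeric field line, e.g. "ShmemPmdMapped:     4096 kB"
FIELD_RE = re.compile(r"^(\w+):\s+(\d+)")


def parse_smaps(text):
    """Split smaps-style text into mappings with their numeric fields."""
    mappings = []
    for raw in text.splitlines():
        line = raw.strip()
        if HEADER_RE.match(line):
            # A new mapping starts; remember its header line.
            mappings.append({"header": line, "fields": {}})
        elif mappings:
            m = FIELD_RE.match(line)
            if m:
                mappings[-1]["fields"][m.group(1)] = int(m.group(2))
    return mappings


def find_false_negatives(text):
    """Return headers of mappings that are PMD mapped but report THPeligible 0."""
    return [
        m["header"]
        for m in parse_smaps(text)
        if m["fields"].get("ShmemPmdMapped", 0) > 0
        and m["fields"].get("THPeligible") == 0
    ]


# Hypothetical sample, trimmed from the smaps output quoted in the thread.
SAMPLE = """\
7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
Size:               4096 kB
ShmemPmdMapped:     4096 kB
THPeligible:    0
"""

if __name__ == "__main__":
    for header in find_false_negatives(SAMPLE):
        print("inconsistent mapping:", header)
```

On a live system the same function could be fed the contents of /proc/<pid>/smaps instead of SAMPLE. A mapping flagged this way only shows the reporting inconsistency discussed above; it says nothing about which of the two options in the thread is the right fix.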