Subject: Re: [PATCH] mm: thp: clear PageDoubleMap flag when the last PMD map gone
To: "Kirill A. Shutemov"
Shutemov" Cc: hughd@google.com, kirill.shutemov@linux.intel.com, aarcange@redhat.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <1571938066-29031-1-git-send-email-yang.shi@linux.alibaba.com> <20191025153618.ajcecye3bjm5abax@box> <74becfc0-3c34-bdd2-02cd-25b763c92f3b@linux.alibaba.com> <20191025163233.myl7kcgz25qsbnwm@box> <20191025163955.qsvkqic2hrorvdzj@box> From: Yang Shi Message-ID: <2171f0a9-d01a-e863-2009-3f1bfa249d6c@linux.alibaba.com> Date: Fri, 25 Oct 2019 11:49:26 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20191025163955.qsvkqic2hrorvdzj@box> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/25/19 9:39 AM, Kirill A. Shutemov wrote: > On Fri, Oct 25, 2019 at 07:32:33PM +0300, Kirill A. Shutemov wrote: >> On Fri, Oct 25, 2019 at 08:58:22AM -0700, Yang Shi wrote: >>> >>> On 10/25/19 8:36 AM, Kirill A. Shutemov wrote: >>>> On Fri, Oct 25, 2019 at 01:27:46AM +0800, Yang Shi wrote: >>>>> File THP sets PageDoubleMap flag when the first it gets PTE mapped, but >>>>> the flag is never cleared until the THP is freed. This result in >>>>> unbalanced state although it is not a big deal. >>>>> >>>>> Clear the flag when the last compound_mapcount is gone. It should be >>>>> cleared when all the PTE maps are gone (become PMD mapped only) as well, >>>>> but this needs check all subpage's _mapcount every time any subpage's >>>>> rmap is removed, the overhead may be not worth. The anonymous THP also >>>>> just clears PageDoubleMap flag when the last PMD map is gone. >>>> NAK, sorry. >>>> >>>> The key difference with anon THP that file THP can be mapped again with >>>> PMD after all PMD (or all) mappings are gone. >>>> >>>> Your patch breaks the case when you map the page with PMD again while the >>>> page is still mapped with PTEs. Who would set PageDoubleMap() in this >>>> case? >>> Aha, yes, you are right. I missed that point. However, I'm wondering we >>> might move this up a little bit like this: >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> index d17cbf3..ac046fd 100644 >>> --- a/mm/rmap.c >>> +++ b/mm/rmap.c >>> @@ -1230,15 +1230,17 @@ static void page_remove_file_rmap(struct page *page, >>> bool compound) >>>                         if (atomic_add_negative(-1, &page[i]._mapcount)) >>>                                 nr++; >>>                 } >>> + >>> +               /* No PTE map anymore */ >>> +               if (nr == HPAGE_PMD_NR) >>> +                       ClearPageDoubleMap(compound_head(page)); >>> + >>>                 if (!atomic_add_negative(-1, compound_mapcount_ptr(page))) >>>                         goto out; >>>                 if (PageSwapBacked(page)) >>>                         __dec_node_page_state(page, NR_SHMEM_PMDMAPPED); >>>                 else >>>                         __dec_node_page_state(page, NR_FILE_PMDMAPPED); >>> - >>> -               /* The last PMD map is gone */ >>> -               ClearPageDoubleMap(compound_head(page)); >>>         } else { >>>                 if (!atomic_add_negative(-1, &page->_mapcount)) >>>                         goto out; >>> >>> >>> This should guarantee no PTE map anymore, it should be safe to clear the >>> flag. >> At first glance looks safe, but let me think more about it. 
>> I didn't expect it to be that easy :P
> How do you protect against races? What prevents another thread/process from
> mapping the page with a PTE after you've calculated 'nr'?
>
> I don't remember the code that well, but I believe we don't require
> PageLock for all cases... Or do we?

No, the page lock is required when adding a PTE rmap, but it is not required
when removing an rmap, i.e. on huge PMD split. It looks like we can't prevent
the race between processes; threads are protected by the ptl.
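
To make the window concrete, below is a rough userspace toy model of that
race (not kernel code; mapcount[], double_map, remove_pmd_map() and
add_pte_map() are made-up stand-ins for the real _mapcount, PageDoubleMap and
rmap paths): one thread runs the proposed compound-unmap logic and clears the
flag when it sees nr == HPAGE_PMD_NR, while another thread adds a PTE map
concurrently, so the page can end up PTE mapped with the flag already cleared.

/*
 * Toy model only: mapcount[] stands in for the per-subpage _mapcount
 * (-1 == unmapped, the PMD mapping adds one to each), double_map for
 * PageDoubleMap. No page lock or ptl is modelled.
 * Build with: cc -pthread toy.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define HPAGE_PMD_NR 512

static atomic_int mapcount[HPAGE_PMD_NR];	/* 0 == mapped by the PMD only */
static atomic_int compound_mapcount = 1;	/* one PMD mapping to start with */
static atomic_bool double_map = true;		/* leftover from an earlier PTE map */

/* Models the proposed page_remove_file_rmap() change for a compound unmap. */
static void remove_pmd_map(void)
{
	int nr = 0, i;

	for (i = 0; i < HPAGE_PMD_NR; i++)
		if (atomic_fetch_sub(&mapcount[i], 1) - 1 < 0)	/* went negative? */
			nr++;

	/*
	 * The window: a racing PTE fault can bump mapcount[0] after the
	 * loop has already visited it, so nr still reads HPAGE_PMD_NR and
	 * the flag is cleared although a PTE map now exists.
	 */
	if (nr == HPAGE_PMD_NR)
		atomic_store(&double_map, false);

	atomic_fetch_sub(&compound_mapcount, 1);
}

/*
 * Models a racing PTE fault on subpage 0; the real fault path would hold
 * the page lock and SetPageDoubleMap(), which is already set here.
 */
static void *add_pte_map(void *arg)
{
	(void)arg;
	atomic_fetch_add(&mapcount[0], 1);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, add_pte_map, NULL);
	remove_pmd_map();
	pthread_join(&t, NULL);

	/* Bad final state: mapcount[0] >= 0 (PTE mapped) but double_map == 0. */
	printf("mapcount[0]=%d double_map=%d\n",
	       atomic_load(&mapcount[0]), (int)atomic_load(&double_map));
	return 0;
}

It is only meant to show the ordering; it does not model the page lock or the
ptl at all.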