Date: Tue, 30 Jun 2020 06:58:42 +0800
From: Wei Yang
To: Dan Williams
Cc: Wei Yang, David Hildenbrand, Michal Hocko, Andrew Morton,
    Oscar Salvador, Linux MM, Baoquan He, Linux Kernel Mailing List
Subject: Re: [PATCH] mm/spase: never partially remove memmap for early section
Message-ID: <20200629225842.GA38617@L-31X9LVDL-1304.local>
References: <4D73CD59-BFD5-401A-A001-41F7BF5641BA@redhat.com>
    <20200629083411.GA38188@L-31X9LVDL-1304.local>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon,
Jun 29, 2020 at 03:13:25PM -0700, Dan Williams wrote:
>On Mon, Jun 29, 2020 at 1:34 AM Wei Yang wrote:
>>
>> On Thu, Jun 25, 2020 at 12:46:43PM -0700, Dan Williams wrote:
>> >On Wed, Jun 24, 2020 at 10:53 PM David Hildenbrand wrote:
>> >>
>> >> > On 25.06.2020 at 01:47, Dan Williams wrote:
>> >> >
>> >> > On Wed, Jun 24, 2020 at 3:44 PM Wei Yang wrote:
>> >> > [..]
>> >> >>> So, you are right that there is a mismatch here, but I think the
>> >> >>> comprehensive fix is to allow early sections to be partially
>> >> >>> depopulated/repopulated rather than have section_activate() and
>> >> >>> section_deactivate() special case early sections. The special casing
>> >> >>> is problematic in retrospect as section_deactivate() can't be
>> >> >>> maintained without understanding the special rules in
>> >> >>> section_activate().
>> >> >>
>> >> >> Hmm... This means we need to adjust pfn_valid() too, which always
>> >> >> returns true for early sections.
>> >> >
>> >> > Right, rather than carry workarounds in 3 locations, and the bug that
>> >> > has resulted from them getting out of sync, just teach early section
>> >> > mapping to allow for the subsection populate/depopulate.
>> >> >
>> >>
>> >> I prefer the easy fix first - IOW what we have here. Especially,
>> >> pfn_to_online_page() will need changes as well.
>> >
>> >Agree, yes, let's do the simple fix first for 5.8 and the special-case
>> >elimination work later.
>>
>> Dan,
>>
>> A quick test shows this is not a simple task.
>
>Thanks for taking a look...
>
>> First, early sections don't set the subsection bitmap, which is necessary
>> for hot-add/remove.
>>
>> To properly set the subsection bitmap, we need to know how many subsections
>> are in an early section. But the current code doesn't have an alignment
>> requirement for the last early section; we mark the whole last early
>> section as present.
>
>I was thinking that the subsection map does not need to be accurate on
>initial setup, it only needs to be accurate after the first removal.
>However, that would result in new special casing that somewhat defeats
>the purpose. The hardest part is potentially breaking up a PMD mapping
>of the page array into a series of PTE mappings without disturbing
>in-flight pfn_to_page() users.
>
>> I don't find a way to enable this.
>
>While I don't like that this bug crept into the mismatched special
>casing of early sections, I'm now coming around to the same opinion.
>I.e. that making the memmap for early sections permanent is a simpler
>mechanism to maintain.

I think so ...

-- 
Wei Yang
Help you, Help me