Date: Thu, 25 Jun 2020 06:44:10 +0800
From: Wei Yang
To: Dan Williams
Cc: Wei Yang, Michal Hocko, Andrew Morton, Oscar Salvador, Linux MM, Baoquan He, Linux Kernel Mailing List, David Hildenbrand
Subject: Re: [PATCH] mm/spase: never partially remove memmap for early section
Message-ID: <20200624224410.GD15016@L-31X9LVDL-1304.local>
References: <20200623094258.6705-1-richard.weiyang@linux.alibaba.com> <20200623151828.GA31426@dhcp22.suse.cz> <20200624061340.GA11552@L-31X9LVDL-1304.local> <20200624220552.GA15016@L-31X9LVDL-1304.local>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 24, 2020 at 03:20:59PM -0700, Dan Williams wrote:
>On Wed, Jun 24, 2020 at 3:06 PM Wei Yang wrote:
>>
>> On Wed, Jun 24, 2020 at 09:10:09AM -0700, Dan Williams wrote:
>> >On Tue, Jun 23, 2020 at 11:14 PM Wei Yang wrote:
>> >>
>> >> On Tue, Jun 23, 2020 at 05:18:28PM +0200, Michal Hocko wrote:
>> >> >On Tue 23-06-20 17:42:58, Wei Yang wrote:
>> >> >> For early sections, we assume the memmap will never be partially
>> >> >> removed. But the current behavior breaks this.
>> >> >>
>> >> >> Let's correct it.
>> >> >>
>> >> >> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>> >> >> Signed-off-by: Wei Yang
>> >> >
>> >> >Can a user trigger this or is this a theoretical bug?
>> >>
>> >> Let me rewrite the changelog a little. I look forward to any comments.
>> >>
>> >> For early sections, the memmap is handled specially even when sub-section
>> >> support is enabled. The memmap can only be populated as a whole.
>> >>
>> >> Quoted from the comment of section_activate():
>> >>
>> >>  * The early init code does not consider partially populated
>> >>  * initial sections, it simply assumes that memory will never be
>> >>  * referenced. If we hot-add memory into such a section then we
>> >>  * do not need to populate the memmap and can simply reuse what
>> >>  * is already there.
>> >>
>> >> However, the current section_deactivate() breaks this rule. When a
>> >> sub-section is hot-removed, section_deactivate() depopulates its memmap.
>> >> As a consequence, if we hot-add this sub-section again, its memmap never
>> >> gets properly populated.
>> >
>> >Ok, forgive the latency as I re-fetched this logic into my mental cache.
>> >So what I was remembering was the initial state of the code that
>> >special cased early sections, and that still seems to be the case in
>> >pfn_valid(). IIRC early_sections / bootmem are blocked from being
>> >removed entirely. Partial / subsection removals are ok.
>>
>> Would you mind elaborating? If partial subsection removal is ok, is there
>> no need to fix this?
>
>Early sections establish a memmap for the full section. There's
>conceptually nothing wrong with unplugging the non-system-RAM portion
>of the memmap, but it would need to be careful, at least on x86, to
>map the partial section with PTEs instead of PMDs.
>
>So, you are right that there is a mismatch here, but I think the
>comprehensive fix is to allow early sections to be partially
>depopulated/repopulated rather than have section_activate() and
>section_deactivate() special case early sections. The special casing
>is problematic in retrospect as section_deactivate() can't be
>maintained without understanding the special rules in section_activate().

Hmm... This means we need to adjust pfn_valid() too, which always returns
true for early sections.

-- 
Wei Yang
Help you, Help me