Date: Wed, 25 Nov 2020 14:01:56 -0500
From: Andrea Arcangeli
To: Vlastimil Babka
Cc: David Hildenbrand, Mel Gorman, Andrew Morton, linux-mm@kvack.org,
    Qian Cai, Michal Hocko, linux-kernel@vger.kernel.org, Mike Rapoport,
    Baoquan He
Subject: Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set
    pageblock_skip on reserved pages

On Wed, Nov 25, 2020 at 01:08:54PM +0100, Vlastimil Babka wrote:
> Yeah I guess it would be simpler if zoneid/nid was correct for
> pfn_valid() pfns within a zone's range, even if they are reserved due
> not not being really usable memory.
>
> I don't think we want to introduce CONFIG_HOLES_IN_ZONE to x86. If the
> chosen solution is to make this to a real hole, the hole should be
> extended to MAX_ORDER_NR_PAGES aligned boundaries.

The way pfn_valid works, it's not possible to render all non-RAM pfns
as !pfn_valid; CONFIG_HOLES_IN_ZONE would not achieve it 100% either.

So I don't think we can rely on that to eliminate all non-RAM reserved
pages from the mem_map and avoid having to initialize them in the
first place. Some could remain, as in this case, since in the same
pageblock there's non-RAM followed by RAM and all pfns are valid.

> In any case, compaction code can't fix this with better range checks.

David's correct that it can, by adding enough PageReserved checks (I'm
running all systems reproducing this with plenty of PageReserved
checks in all places to work around it until we do a proper fix).

My problem with that is that 1) it's simply not enforceable at runtime
that no PageReserved check is missing and 2) what benefit would it
provide to leave a wrong zoneid in reserved pages and have to add
extra PageReserved checks?

A struct page has a deterministic zoneid/nid: if it's pointed to by a
valid pfn (as in pfn_valid()), the simplest is that the zoneid/nid in
the page remains correct no matter if it was reserved at boot or it
was marked reserved by a driver that swaps the page somewhere else
with the GART or EFI or something else. All reserved pages should work
the same, RAM and non-RAM, since the non-RAM status can basically
change at runtime if a driver assigns the page to hw somehow.

NOTE: on the compaction side, we still need to add the
pageblock_pfn_to_page to validate the "highest" pfn, because the
pfn_valid() check is missing on the first pfn of the pageblock, and so
is the check for a pageblock that spans two different zones.

Thanks,
Andrea
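
A minimal userspace sketch of the kind of check the NOTE above refers
to, not kernel code. Every toy_* name, the 8-page pageblock size and
the memory layout are assumptions made up for illustration; the real
pageblock_pfn_to_page() works on struct page and struct zone. The
sketch only shows the idea: a pageblock is usable when its first and
last pfn are both valid and both carry the expected zoneid, which also
holds for reserved pages as long as their zoneid is correct.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: all toy_* names are hypothetical. */
#define TOY_PAGEBLOCK_NR_PAGES 8UL
#define TOY_NR_PFNS            32UL

struct toy_page {
	bool valid;    /* stands in for pfn_valid() */
	bool reserved; /* stands in for PageReserved() */
	int zone_id;   /* stands in for the zoneid stored in the page */
};

struct toy_page toy_mem_map[TOY_NR_PFNS];

bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PFNS && toy_mem_map[pfn].valid;
}

/*
 * Return the first page of [start_pfn, end_pfn) only if the first and
 * the last pfn are valid and both belong to zone_id, NULL otherwise.
 */
struct toy_page *toy_pageblock_pfn_to_page(unsigned long start_pfn,
					   unsigned long end_pfn,
					   int zone_id)
{
	if (end_pfn <= start_pfn)
		return NULL;

	end_pfn--; /* end_pfn is one past the range being checked */

	if (!toy_pfn_valid(start_pfn) || !toy_pfn_valid(end_pfn))
		return NULL;

	if (toy_mem_map[start_pfn].zone_id != zone_id)
		return NULL;

	if (toy_mem_map[end_pfn].zone_id != zone_id)
		return NULL;

	return &toy_mem_map[start_pfn];
}

/*
 * Made-up layout with four pageblocks of 8 pfns each:
 *   pfn  0..11 -> zone 0, pfn 12..31 -> zone 1 (pageblock 1 spans both)
 *   pfn  0..3  -> reserved non-RAM, still valid, with a correct zoneid
 *   pfn 16     -> not valid (first pfn of pageblock 2)
 */
void toy_init_layout(void)
{
	unsigned long pfn;

	for (pfn = 0; pfn < TOY_NR_PFNS; pfn++) {
		toy_mem_map[pfn].valid = true;
		toy_mem_map[pfn].reserved = false;
		toy_mem_map[pfn].zone_id = pfn < 12 ? 0 : 1;
	}

	for (pfn = 0; pfn < 4; pfn++)
		toy_mem_map[pfn].reserved = true;

	toy_mem_map[16].valid = false;
}
```

With this layout, the pageblock starting at pfn 0 passes for zone 0
without any reserved-page special casing, because its reserved pages
carry the right zoneid. The pageblock at pfn 8 is rejected because it
spans two zones, and the one at pfn 16 is rejected because its first
pfn is not valid.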