Date: Thu, 6 Sep 2018 07:38:07 +0200
From: Michal Hocko
To: Alexander Duyck
Cc: linux-mm, LKML, "Duyck, Alexander H", pavel.tatashin@microsoft.com, Andrew Morton, Ingo Molnar, "Kirill A. Shutemov"
Subject: Re: [PATCH 1/2] mm: Move page struct poisoning from CONFIG_DEBUG_VM to CONFIG_DEBUG_VM_PGFLAGS
Message-ID: <20180906053807.GH14951@dhcp22.suse.cz>
References: <20180904181550.4416.50701.stgit@localhost.localdomain> <20180904183339.4416.44582.stgit@localhost.localdomain> <20180905061044.GT14951@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 05-09-18 08:32:05, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 11:10 PM Michal Hocko wrote:
> >
> > On Tue 04-09-18 11:33:39, Alexander Duyck wrote:
> > > From: Alexander Duyck
> > >
> > > On systems with a large amount of memory it can take a significant amount
> > > of time to initialize all of the page structs with the PAGE_POISON_PATTERN
> > > value. I have seen it take over 2 minutes to initialize a system with
> > > over 12GB of RAM.
> > >
> > > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then
> > > the boot time returned to something much more reasonable as the
> > > arch_add_memory call completed in milliseconds versus seconds. However in
> > > doing that I had to disable all of the other VM debugging on the system.
> >
> > I agree that CONFIG_DEBUG_VM is a big hammer but the primary point of
> > this check is to catch uninitialized struct pages after the early mem
> > init rework so the intention was to make it enabled on as many systems
> > with debugging enabled as possible. DEBUG_VM is not free already so it
> > sounded like a good idea to sneak it there.
> >
> > > I did a bit of research and it seems like the only function that checks
> > > for this poison value is the PagePoisoned function, and it is only called
> > > in two spots.
> > > One is the PF_POISONED_CHECK macro that is only in use when
> > > CONFIG_DEBUG_VM_PGFLAGS is defined, and the other is as a part of the
> > > __dump_page function which is using the check to prevent a recursive
> > > failure in the event of discovering a poisoned page.
> >
> > Hmm, I have missed the dependency on CONFIG_DEBUG_VM_PGFLAGS when
> > reviewing the patch. My debugging kernel config doesn't have it enabled
> > for example. I know that Fedora configs have CONFIG_DEBUG_VM enabled
> > but I cannot find their config right now to double check for the
> > CONFIG_DEBUG_VM_PGFLAGS right now.
> >
> > I am not really sure this dependency was intentional but I strongly
> > suspect Pavel really wanted to have it DEBUG_VM scoped.
>
> So I think the idea as per the earlier discussion with Pavel is that
> by preloading it with all 1's anything that is expecting all 0's will
> blow up one way or another. We just aren't explicitly checking for the
> value, but it is still possibly going to be discovered via something
> like a GPF when we try to access an invalid pointer or counter.
>
> What I think I can do to address some of the concern is make this
> something that depends on CONFIG_DEBUG_VM and defaults to Y. That way
> for systems that are defaulting their config they should maintain the
> same behavior, however for those systems that are running a large
> amount of memory they can optionally turn off
> CONFIG_DEBUG_VM_PAGE_INIT_POISON instead of having to switch off all
> the virtual memory debugging via CONFIG_DEBUG_VM. I guess it would
> become more of a peer to CONFIG_DEBUG_VM_PGFLAGS as the poison check
> wouldn't really apply after init anyway.

So the most obvious question is, why don't you simply disable DEBUG_VM?
It is not aimed at production workloads because it adds asserts in many
places and it is quite likely to come with a performance penalty
already.

Besides that, initializing memory to all ones is not much different from
initializing it to all zeroes, which we had been doing until recently,
when Pavel removed that. So why do we need to add yet another debugging
config option? We have way too many config options already.
-- 
Michal Hocko
SUSE Labs
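
As an illustration of the mechanism discussed in this thread, here is a
minimal userspace sketch in C of preloading an array of descriptors with all
ones and later spotting entries that were never initialized. It is not kernel
code: struct fake_page, FAKE_POISON_PATTERN and all of the helper functions
are hypothetical names invented for this example, and the real struct page
and PagePoisoned differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; the real kernel layout differs. */
struct fake_page {
	unsigned long flags;
	long refcount;
	void *mapping;
};

/* All-ones pattern, mirroring the "all 1's" preload described in the thread. */
#define FAKE_POISON_PATTERN (~0UL)

/* Fill every byte with 0xff so that each field reads back as all ones. */
static void poison_pages(struct fake_page *pages, size_t n)
{
	assert(pages != NULL);
	memset(pages, 0xff, n * sizeof(*pages));
}

/* An entry counts as poisoned while its flags word still holds the pattern. */
static int fake_page_poisoned(const struct fake_page *page)
{
	return page->flags == FAKE_POISON_PATTERN;
}

/* Stand-in for real initialization: overwrite the poison with valid values. */
static void init_page(struct fake_page *page)
{
	page->flags = 0;
	page->refcount = 1;
	page->mapping = NULL;
}

/* Count entries that were poisoned but never passed through init_page(). */
static size_t count_poisoned(const struct fake_page *pages, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (fake_page_poisoned(&pages[i]))
			count++;
	return count;
}
```

In this sketch the poisoning pass is a single memset over the whole array, so
it touches the same memory as zeroing would; only the fill byte differs.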