Message-ID: <93a7cfdf-02e6-6880-c563-76b01c9f41f5@intel.com>
Date: Fri, 8 Apr 2022 12:11:58 -0700
To: "Kirill A. Shutemov", Borislav Petkov, Andy Lutomirski,
	Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes,
	Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
	Paolo Bonzini, Ingo Molnar, Varad Gautam, Dario Faggioli,
	Brijesh Singh, Mike Rapoport, David Hildenbrand, x86@kernel.org,
	linux-mm@kvack.org, linux-coco@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
	Mike Rapoport
References: <20220405234343.74045-1-kirill.shutemov@linux.intel.com>
	<20220405234343.74045-2-kirill.shutemov@linux.intel.com>
From: Dave Hansen
Subject: Re: [PATCHv4 1/8] mm: Add support for unaccepted memory
In-Reply-To: <20220405234343.74045-2-kirill.shutemov@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/5/22 16:43, Kirill A. Shutemov wrote:
> Kernel only needs to accept memory once after boot, so during the boot
> and warm up phase there will be a lot of memory acceptance. After things
> are settled down the only price of the feature if couple of checks for
> PageUnaccepted() in allocate and free paths. The check refers a hot
> variable (that also encodes PageBuddy()), so it is cheap and not visible
> on profiles.

Let's also not sugar-coat this.  Page acceptance is hideously slow.
It's agonizingly slow.  To boot, it's done holding a global spinlock
with interrupts disabled (see patch 6/8).  At the very, very least, each
acceptance operation involves a couple of what are effectively ring
transitions, a 2MB memset(), and a bunch of cache flushing.

The system is going to be downright unusable during this time, right?

Sure, it's *temporary* and only happens once at boot.  But, it's going
to suck.

Am I over-stating this in any way?

The ACCEPT_MEMORY vmstat is good to have around.  Thanks for adding it.
But, I think we should also write down some guidance like:

	If your TDX system seems as slow as a snail after boot, look at
	the "accept_memory" counter in /proc/vmstat.  If it is
	incrementing, then TDX memory acceptance is likely to blame.

Do we need anything more discrete to tell users when acceptance is over?
For instance, maybe they run something and it goes really slow, they
watch "accept_memory" until it stops.  They rejoice at their good
fortune!  Then, memory allocation starts falling over to a new node and
the agony begins anew.

I can think of dealing with this in two ways:

	cat /sys/.../unaccepted_pages_left

which just walks the bitmap and counts the number of pages remaining.
Or something like:

	echo 1 > /sys/devices/system/node/node0/make_the_pain_stop

Which will, well, make the pain stop on node0.
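
As a rough illustration of the "watch the counter" guidance above, here is a minimal userspace sketch in Python. It assumes the counter shows up in /proc/vmstat as a plain "accept_memory <value>" line, like the other vmstat entries. The function names are made up for this example and are not part of the patch set.

```python
import time


def read_vmstat_counter(text, name="accept_memory"):
    """Return the integer value of `name` in /proc/vmstat-style text, or None."""
    for line in text.splitlines():
        fields = line.split()
        # Each vmstat line is expected to look like "<name> <value>".
        if len(fields) == 2 and fields[0] == name:
            return int(fields[1])
    return None


def still_accepting(before, after):
    """True if the counter increased between two samples."""
    return before is not None and after is not None and after > before


def watch(path="/proc/vmstat", interval=1.0):
    """Sample the counter twice, `interval` seconds apart, and report."""
    with open(path) as f:
        before = read_vmstat_counter(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = read_vmstat_counter(f.read())

    if before is None or after is None:
        print("no accept_memory counter found in", path)
    elif still_accepting(before, after):
        print(f"accept_memory went {before} -> {after}: acceptance in progress")
    else:
        print(f"accept_memory steady at {after} over the last {interval}s")
```

Calling watch() inside the guest samples the counter once per call. A steady counter only says that nothing was accepted during that interval. It cannot say that acceptance is finished, which is exactly the gap that something like unaccepted_pages_left would close.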