Date: Thu, 12 Aug 2021 23:49:24 +0300
From: "Kirill A. Shutemov"
To: Dave Hansen
Cc: Joerg Roedel, Andi Kleen, Borislav Petkov, Andy Lutomirski,
	Sean Christopherson, Andrew Morton, Kuppuswamy Sathyanarayanan,
	David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner,
	Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Varad Gautam,
	Dario Faggioli, x86@kernel.org, linux-mm@kvack.org,
	linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov"
Subject: Re: [PATCH 1/5] mm: Add support for unaccepted memory
Message-ID: <20210812204924.haneuxapkmluli6t@box.shutemov.name>
References: <20210810062626.1012-1-kirill.shutemov@linux.intel.com>
	<20210810062626.1012-2-kirill.shutemov@linux.intel.com>
	<9748c07c-4e59-89d0-f425-c57f778d1b42@linux.intel.com>
	<17b6a3a3-bd7d-f57e-8762-96258b16247a@intel.com>
	<796a4b20-7fa3-3086-efa0-2f728f35ae06@linux.intel.com>
	<3caf5e73-c104-0057-680c-7851476e67ac@linux.intel.com>
	<25312492-5d67-e5b0-1a51-b6880f45a550@intel.com>
In-Reply-To: <25312492-5d67-e5b0-1a51-b6880f45a550@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 12, 2021 at 07:14:20AM -0700, Dave Hansen wrote:
> On 8/12/21 1:19 AM, Joerg Roedel wrote:
> > On Tue, Aug 10, 2021 at 02:20:08PM -0700, Andi Kleen wrote:
> >> Also I agree with your suggestion that we should get the slow path out of
> >> the zone locks/interrupt disable region. That should be easy enough and is
> >> an obvious improvement.
> >
> > I also agree that the slow-path needs to be outside of the memory
> > allocator locks. But I think this conflicts with the concept of
> > accepting memory in 2MB chunks even if allocation size is smaller.
> >
> > Given some kernel code allocated 2 pages and the allocator path starts
> > to validate the whole 2MB page the memory is on, then there are
> > potential races to take into account.
>
> Yeah, the PageOffline()+PageBuddy() trick breaks down as soon as
> PageBuddy() gets cleared.
>
> I'm not 100% sure we need a page flag, though. Imagine if we just did a
> static key check in prep_new_page():
>
> 	if (static_key_whatever(tdx_accept_ongoing))
> 		maybe_accept_page(page, order);
>
> maybe_accept_page() could just check the acceptance bitmap and see if
> the 2MB page has been accepted. If so, just return.
> If not, take the bitmap lock, accept the 2MB page, then mark the
> bitmap.
>
> maybe_accept_page()
> {
> 	unsigned long huge_pfn = page_to_phys(page) / PMD_SIZE;
>
> 	/* Test the bit before taking any locks: */
> 	if (test_bit(huge_pfn, &accepted_bitmap))
> 		return;
>
> 	spin_lock_irq();
> 	/* Retest inside the lock: */
> 	if (test_bit(huge_pfn, &accepted_bitmap))
> 		return;
> 	tdx_accept_page(page, PMD_SIZE);
> 	set_bit(huge_pfn, &accepted_bitmap));
> 	spin_unlock_irq();
> }
>
> That's still not great. It's still a global lock and the lock is still
> held for quite a while because that accept isn't going to be lightning
> fast. But, at least it's not holding any *other* locks. It also
> doesn't take any locks in the fast path where the 2MB page was already
> accepted.

I expect the cache line holding the bitmap to bounce around during warm
up. One cache line covers a gig of RAM.

It's also not clear at all at what point the static key has to be
switched. We don't have any obvious point where we can say we are done
with accepting (unless you advocate for proactive accepting).

I like PageOffline() because we only need to consult the cache line the
page allocator already has in hand before looking into the bitmap.

> The locking could be more fine-grained, for sure. The bitmap could, for
> instance, have a lock bit too. Or we could just have an array of locks
> and hash the huge_pfn to find a lock given a huge_pfn. But, for now, I
> think it's fine to just keep the global lock.
>
> > Either some other code path allocates memory from that page and returns
> > it before validation is finished or we end up with double validation.
> > Returning unvalidated memory is a guest-problem and double validation
> > will cause security issues for SNP guests.
>
> Yeah, I think the *canonical* source of information for accepts is the
> bitmap. The page flags and any static keys or whatever are
> less-canonical sources that tell you when you _might_ need to consult
> the bitmap.

Right.

-- 
 Kirill A. Shutemov
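
For reference, the check/lock/recheck scheme discussed in this thread can be
sketched in userspace C roughly as below. This is only an illustration under
stated assumptions, not kernel code: a pthread mutex stands in for the global
spinlock, and fake_accept_2mb(), accept_lock, nr_accept_calls, NR_HUGE_PAGES
and the 64-byte CACHE_LINE are hypothetical names and values chosen for the
example. Unlike the quoted pseudocode, the retest path here drops the lock
before returning. The static_assert spells out the "one cache line covers a
gig of RAM" arithmetic: 64 bytes * 8 bits * 2MB = 1GB.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PMD_SHIFT      21                    /* 2MB huge page */
#define PMD_SIZE       (1ULL << PMD_SHIFT)
#define CACHE_LINE     64                    /* bytes; assumed line size */
#define BITS_PER_WORD  64
#define NR_HUGE_PAGES  1024                  /* hypothetical: 2GB of guest RAM */

/* One bit per 2MB page, so one cache line of bitmap describes 1GB. */
static_assert(CACHE_LINE * 8 * PMD_SIZE == (1ULL << 30),
              "one bitmap cache line covers 1GB");

static _Atomic uint64_t accepted_bitmap[NR_HUGE_PAGES / BITS_PER_WORD];
static pthread_mutex_t accept_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long nr_accept_calls;        /* slow-path counter, illustration only */

static bool test_accepted(unsigned long huge_pfn)
{
	uint64_t word = atomic_load_explicit(
		&accepted_bitmap[huge_pfn / BITS_PER_WORD], memory_order_acquire);

	return (word & (1ULL << (huge_pfn % BITS_PER_WORD))) != 0;
}

static void set_accepted(unsigned long huge_pfn)
{
	atomic_fetch_or_explicit(&accepted_bitmap[huge_pfn / BITS_PER_WORD],
				 1ULL << (huge_pfn % BITS_PER_WORD),
				 memory_order_release);
}

/* Hypothetical stand-in for the slow hardware accept of one 2MB range. */
static void fake_accept_2mb(uint64_t phys_2mb_aligned)
{
	(void)phys_2mb_aligned;
	nr_accept_calls++;
}

void maybe_accept_page(uint64_t phys)
{
	unsigned long huge_pfn = phys / PMD_SIZE;

	/* Fast path: test the bit before taking any locks. */
	if (test_accepted(huge_pfn))
		return;

	pthread_mutex_lock(&accept_lock);
	/* Retest under the lock so the 2MB range is never accepted twice. */
	if (!test_accepted(huge_pfn)) {
		fake_accept_2mb(phys & ~(PMD_SIZE - 1));
		set_accepted(huge_pfn);
	}
	pthread_mutex_unlock(&accept_lock);
}
```

Calling maybe_accept_page() for two addresses inside the same 2MB range
should hit the slow path only once; the second call returns from the
lockless test.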