Date: Tue, 3 Nov 2020 11:52:47 +0200
From: Mike Rapoport
To: David Hildenbrand
Cc: Andrew Morton, Alexander Viro, Andy Lutomirski, Arnd Bergmann,
    Borislav Petkov, Catalin Marinas, Christopher Lameter, Dan Williams,
    Dave Hansen, Elena Reshetova, "H. Peter Anvin", Idan Yaniv,
    Ingo Molnar, James Bottomley, "Kirill A. Shutemov", Matthew Wilcox,
    Mark Rutland, Mike Rapoport, Michael Kerrisk, Palmer Dabbelt,
    Paul Walmsley, Peter Zijlstra, Thomas Gleixner, Shuah Khan,
    Tycho Andersen, Will Deacon, linux-api@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org,
    x86@kernel.org
Subject: Re: [PATCH v6 0/6] mm: introduce memfd_secret system call to create
    "secret" memory areas
Message-ID: <20201103095247.GH4879@kernel.org>
References: <20200924132904.1391-1-rppt@kernel.org>
    <9c38ac3b-c677-6a87-ce82-ec53b69eaf71@redhat.com>
    <20201102174308.GF4879@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 02, 2020 at 06:51:09PM +0100, David Hildenbrand wrote:
> > > Assume you have a system with quite some ZONE_MOVABLE memory (esp. in
> > > virtualized environments), eating up a significant amount of !ZONE_MOVABLE
> > > memory dynamically at runtime can lead to non-obvious issues. It looks like
> > > you have plenty of free memory, but the kernel might still OOM when trying
> > > to do kernel allocations e.g., for pagetables. With CMA we at least know
> > > what we're dealing with - it behaves like ZONE_MOVABLE except for the owner
> > > that can place unmovable pages there. We can use it to compute statically
> > > the amount of ZONE_MOVABLE memory we can have in the system without doing
> > > harm to the system.
> >
> > Why would you say that secretmem allocates from !ZONE_MOVABLE?
> > If we put boot time reservations aside, the memory allocation for
> > secretmem follows the same rules as the memory allocations for any file
> > descriptor. That means we allocate memory with GFP_HIGHUSER_MOVABLE.
>
> Oh, okay - I missed that! I had the impression that pages are unmovable and
> allocating from ZONE_MOVABLE would be a violation of that?
>
> > After the allocation the memory indeed becomes unmovable but it's not
> > like we are eating memory from other zones here.
>
> ... and here you have your problem. That's a no-no. We only allow it in very
> special cases where it can't be avoided - e.g., vfio having to pin guest
> memory when passing through memory to VMs.
>
> Hotplug memory, online it to ZONE_MOVABLE. Allocate secretmem. Try to unplug
> the memory again -> endless loop in offline_pages().
>
> Or have a CMA area that gets used with GFP_HIGHUSER_MOVABLE. Allocate
> secretmem. The owner of the area tries to allocate memory - always fails.
> Purpose of CMA destroyed.
>
> >
> > > Ideally, we would want to support page migration/compaction and allow for
> > > allocation from ZONE_MOVABLE as well. Would involve temporarily mapping,
> > > copying, unmapping. Sounds feasible, but not sure which roadblocks we would
> > > find on the way.
> >
> > We can support migration/compaction with temporary mapping. The first
> > roadblock I've hit there was that migration allocates 4K destination
> > page and if we use it in secret map we are back to scrambling the direct
> > map into 4K pieces. It still sounds feasible but not as trivial :)
>
> That sounds like the proper way for me to do it then.

Although migration of secretmem pages sounds feasible now, there may be
other issues I didn't see because I'm not very familiar with the
migration/compaction code.

I've looked again at CMA and I'm inclined to agree with you that using
CMA for secretmem allocations could be the right thing.

-- 
Sincerely yours,
Mike.