Date: Thu, 11 Feb 2021 13:20:08 +0200
From: Mike Rapoport
To: Michal Hocko
Cc: Mike Rapoport, Andrew Morton, Alexander Viro, Andy Lutomirski,
    Arnd Bergmann, Borislav Petkov, Catalin Marinas, Christopher Lameter,
    Dan Williams, Dave Hansen, David Hildenbrand, Elena Reshetova,
    "H. Peter Anvin", Ingo Molnar, James Bottomley, "Kirill A. Shutemov",
    Matthew Wilcox, Mark Rutland, Michael Kerrisk, Palmer Dabbelt,
    Paul Walmsley, Peter Zijlstra, Rick Edgecombe, Roman Gushchin,
    Shakeel Butt, Shuah Khan, Thomas Gleixner, Tycho Andersen, Will Deacon,
    linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-nvdimm@lists.01.org,
    linux-riscv@lists.infradead.org, x86@kernel.org, Hagen Paul Pfeifer,
    Palmer Dabbelt
Subject: Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to
    create "secret" memory areas
Message-ID: <20210211112008.GH242749@kernel.org>
References: <20210208084920.2884-1-rppt@kernel.org>
    <20210208084920.2884-8-rppt@kernel.org>
    <20210208212605.GX242749@kernel.org>
    <20210209090938.GP299309@linux.ibm.com>
    <20210211071319.GF242749@kernel.org>

On Thu, Feb 11, 2021 at 09:39:38AM +0100, Michal Hocko wrote:
> On Thu 11-02-21 09:13:19, Mike Rapoport wrote:
> > On Tue, Feb 09, 2021 at 02:17:11PM +0100, Michal Hocko wrote:
> > > On Tue 09-02-21 11:09:38, Mike Rapoport wrote:
> [...]
> > > > Citing my older email:
> > > >
> > > > I've hesitated whether to continue to use new flags to memfd_create() or to
> > > > add a new system call and I've decided to use a new system call after I've
> > > > started to look into man pages update. There would have been two completely
> > > > independent descriptions and I think it would have been very confusing.
> > >
> > > Could you elaborate? Unmapping from the kernel address space can work
> > > both for sealed or hugetlb memfds, no? Those features are completely
> > > orthogonal AFAICS. With a dedicated syscall you will need to introduce
> > > this functionality on top if that is required. Have you considered that?
> > > I mean hugetlb pages are used to back guest memory very often. Is this
> > > something that will be a secret memory usecase?
> > >
> > > Please be really specific when giving arguments to back a new syscall
> > > decision.
> >
> > Isn't "syscalls have completely independent description" specific enough?
>
> No, it's not as you can see from questions I've had above. More on that
> below.
>
> > We are talking about API here, not the implementation details whether
> > secretmem supports large pages or not.
> >
> > The purpose of memfd_create() is to create a file-like access to memory.
> > The purpose of memfd_secret() is to create a way to access memory hidden
> > from the kernel.
> >
> > I don't think overloading memfd_create() with the secretmem flags because
> > they happen to return a file descriptor will be better for users, but
> > rather will be more confusing.
>
> This is quite a subjective conclusion. I could very well argue that it
> would be much better to have a single syscall to get a fd backed memory
> with specific requirements (sealing, unmapping from the kernel address
> space).
> Neither of us would be clearly right or wrong.

100% agree :)

> A more important point is a future extensibility and usability, though.
> So let's just think of few usecases I have outlined above. Is it
> unrealistic to expect that secret memory should be sealable? What about
> hugetlb? Because if the answer is no then a new API is a clear win as the
> combination of flags would never work and then we would just suffer from
> the syscall multiplexing without much gain. On the other hand if
> combination of the functionality is to be expected then you will have to
> jam it into memfd_create and copy the interface likely causing more
> confusion. See what I mean?

I see your point, but I think that overloading memfd_create() definitely
gets us into syscall multiplexing from day one, and support for seals and
huge pages in the secretmem will not make it less of a multiplexer.

Sealing is anyway controlled via fcntl() and I don't think
MFD_ALLOW_SEALING makes much sense for the secretmem because it is there to
prevent rogue file sealing in tmpfs/hugetlbfs.

As for the huge pages, I'm not sure at all that supporting huge pages in
secretmem will involve hugetlbfs. And even if it does, adding a
SECRETMEM_HUGE flag seems to me less confusing than saying "from kernel x.y
you can use MFD_CREATE | MFD_SECRET | MFD_HUGE" etc. for all possible
combinations.

> I by no means do not insist one way or the other but from what I have
> seen so far I have a feeling that the interface hasn't been thought
> through enough.

It has been, but we have different thoughts about it ;-)

-- 
Sincerely yours,
Mike.
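
For readers following the sealing argument, here is a minimal sketch (not
part of the patch series) of how seals are applied to an ordinary memfd
through fcntl(). The helper name try_seal() and the "seal-demo" file name
are made up for illustration. It assumes Linux with glibc 2.27 or later,
and relies on the behaviour described in memfd_create(2): a memfd created
without MFD_ALLOW_SEALING starts with F_SEAL_SEAL set, so later
F_ADD_SEALS requests are refused.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a memfd with the given MFD_* flags and try
 * to add F_SEAL_SHRINK via fcntl().  Returns 0 if the seal was accepted,
 * or a negative errno value if memfd_create() or fcntl() failed.
 */
static int try_seal(unsigned int mfd_flags)
{
	int fd, ret;

	fd = memfd_create("seal-demo", mfd_flags);
	if (fd < 0)
		return -errno;

	/* Seals are added with fcntl(), not with a memfd_create() flag. */
	ret = fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK);
	if (ret < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

On a default configuration, try_seal(MFD_ALLOW_SEALING) should return 0,
while try_seal(0) would be expected to fail with -EPERM. In other words,
the memfd_create() flag only opts the file in to sealing; the seals
themselves always go through fcntl().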