Date: Fri, 5 Feb 2021 11:50:05 +0100
From: Michal Hocko
To: Christian König
Cc: Hugh Dickins, LKML
Subject: Re: Possible deny of service with memfd_create()
References: <762ad377-ac21-6d8d-d792-492ba7f6c000@amd.com>
In-Reply-To: <762ad377-ac21-6d8d-d792-492ba7f6c000@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 05-02-21 08:54:31, Christian König wrote:
> On 05.02.21 at 01:32, Hugh Dickins wrote:
> > On Thu, 4 Feb 2021, Michal Hocko wrote:
> > > On Thu 04-02-21 17:32:20, Christian Koenig wrote:
> > > > Hi Michal,
> > > >
> > > > as requested in the other mail thread the following sample code gets my test
> > > > system down within seconds.
> > > >
> > > > The issue is that the memory allocated for the file descriptor is not
> > > > accounted to the process allocating it, so the OOM killer picks whatever
> > > > process it thinks is good but never my small test program.
> > > >
> > > > Since memfd_create() doesn't need any special permission this is a rather
> > > > nice denial of service and as far as I can see also works with a standard
> > > > Ubuntu 5.4.0-65-generic kernel.
> > > Thanks for following up. This is really nasty but now that I am looking
> > > at it more closely, this is not really different from tmpfs in general.
> > > You are free to create files and eat the memory without being accounted
> > > for that memory because that is not seen as your memory from the system
> > > POV. You would have to map that memory to be part of your rss.
>
> I mostly agree. The big difference is that tmpfs is only available when
> mounted.
>
> And tmpfs can be restricted in size per mount point as well as per user
> quotas IIRC. Looking at my desktop system those restrictions are actually
> exactly what I see there.

I cannot find anything about per-user quotas for tmpfs in the tmpfs man
page. Or maybe I am looking at the wrong layer and there is generic
handling somewhere in the vfs core?

> But memfd_create() is just free for all, you don't have any size limit nor
> access restriction as far as I can see.

Yes, this is unfortunate and a design decision that should have been
considered when the syscall was introduced. But that boat sailed looong
ago; it cannot be changed now without risking userspace breakage.

> > > The only existing protection right now is to use memory cgroup
> > > controller because the tmpfs memory is accounted to the process which
> > > faults the memory in (or write to the file).
>
> Agreed, but having to rely on cgroup is not really satisfying when you have
> to maintain a hardened server.

Yes, I do recognize the pain. The only other way to mitigate the risk is
to disallow the syscall for untrusted users in a hardened environment.
You should be very strict about tmpfs usage there already.
-- 
Michal Hocko
SUSE Labs
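
For reference, a minimal sketch of the last mitigation mentioned above
(denying memfd_create() to untrusted code), assuming a raw seccomp-BPF
filter installed by whatever launches the untrusted workload. The function
names deny_memfd_create() and try_memfd_create() are made up for this
illustration and are not from the thread. The filter deliberately omits the
seccomp_data.arch check, so it is not sufficient as-is on systems that
allow more than one syscall ABI.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Install a seccomp filter that makes memfd_create() fail with EPERM and
 * allows every other syscall. Returns 0 on success, -1 with errno set on
 * failure. Hypothetical helper; a hardened filter must also validate
 * seccomp_data.arch, which this sketch skips.
 */
int deny_memfd_create(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number into the accumulator. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* memfd_create: fall through to ERRNO; otherwise skip to ALLOW. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_memfd_create, 0, 1),
		BPF_STMT(BPF_RET | BPF_K,
			 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
		.filter = filter,
	};

	/* Needed so that an unprivileged process may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
		return -1;
	return 0;
}

/*
 * Probe the syscall directly. Returns 0 if memfd_create() succeeded,
 * otherwise the errno value it failed with.
 */
int try_memfd_create(void)
{
	long fd = syscall(__NR_memfd_create, "probe", 0U);

	if (fd < 0)
		return errno;
	close((int)fd);
	return 0;
}
```

A launcher would call deny_memfd_create() before exec'ing the untrusted
program; seccomp filters are preserved across fork() and execve(), so the
restriction follows the child. After the call, try_memfd_create() is
expected to return EPERM.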