From: Jann Horn
Date: Mon, 11 Nov 2019 15:55:35 +0100
Subject: Re: For review: documentation of clone3() system call
To: "Michael Kerrisk (man-pages)"
Cc: Christian Brauner, Florian Weimer, Christian Brauner, lkml, linux-man, Kees Cook, Oleg Nesterov, Arnd Bergmann, David Howells, Pavel Emelyanov, Andrew Morton, Adrian Reber, Andrei Vagin, Linux API, Ingo Molnar
References: <20191107151941.dw4gtul5lrtax4se@wittgenstein> <2eb2ab4c-b177-29aa-cdc4-420b24cfd7b3@gmail.com>
In-Reply-To: <2eb2ab4c-b177-29aa-cdc4-420b24cfd7b3@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Nov 9, 2019 at 9:10 AM Michael Kerrisk (man-pages) wrote:
[...]
> On 11/7/19 4:19 PM, Christian Brauner wrote:
> > On Fri, Oct 25, 2019 at 06:59:31PM +0200, Michael Kerrisk (man-pages) wrote:
[...]
> >>        The stack argument specifies the location of the stack used by the
> >>        child process.  Since the child and calling process may share
> >>        memory, it is not possible for the child process to execute in the
> >>        same stack as the calling process.  The calling process must
> >>        therefore set up memory space for the child stack and pass a
> >>        pointer to this space to clone().
> >>        Stacks grow downward on all
> >
> > It might be a good idea to advise people to use mmap() to create a
> > stack. The "canonical" way of doing this would usually be something like
> >
> > #define DEFAULT_STACK_SIZE (4 * 1024 * 1024) /* 8 MB usually on Linux */
> > void *stack = mmap(NULL, DEFAULT_STACK_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
> >
> > (Yes, the MAP_STACK is usually a noop but people should always include it
> > in case some arch will have weird alignment requirement in which case
> > this flag can be changed to actually do something...)
>
> So, I'm getting a little bit of an education here, and maybe you are
> going to further educate me. Long ago, I added the documentation of
> MAP_STACK to mmap(2), but I never quite connected the dots.
>
> However, you say MAP_STACK is *usually* a noop. As far as I can see,
> in current kernels it is *always* a noop. And AFAICS, since it was first
> added in 2.6.27 (2008), it has always been a noop.
>
> I wonder if it will always be a noop.
[...]
> So, my understanding from the above is that MAP_STACK was added to
> allow a possible fix on some old architectures, should anyone decide it
> was worth doing the work of implementing it. But so far, after 12 years,
> no one did. It kind of looks like no one ever will (since those old
> architectures become less and less relevant).
>
> So, AFAICT, while it's not wrong to tell people to use mmap(MAP_STACK),
> it doesn't provide any benefit (and perhaps never will), and it is
> more clumsy than plain old malloc().
>
> But, it could well be that there's something I still don't know here,
> and I'd be interested to get further education.

Not on Linux, but on OpenBSD, they do use MAP_STACK now AFAIK; this was
announced here: . Basically they periodically check whether the userspace
stack pointer points into a MAP_STACK region, and if not, they kill the
process.
So even if it's a no-op on Linux, it might make sense to advise people
to use the flag to improve portability? I'm not sure if that's something
that belongs in Linux manpages.

Another reason against malloc() is that when setting up thread stacks
in proper, reliable software, you'll probably want to place a guard
page (in other words, a 4K PROT_NONE VMA) at the bottom of the stack
to reliably catch stack overflows; and you probably don't want to do
that with malloc, in particular with non-page-aligned allocations.
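
To illustrate the guard-page point, here is a rough sketch of how it
could look with mmap() and mprotect(). The helper names
alloc_guarded_stack() and free_guarded_stack() are made up for this
example and are not an existing API. The sketch assumes a
downward-growing stack and queries the page size with sysconf() rather
than hardcoding 4K.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_STACK
#define MAP_STACK 0 /* fallback for systems that do not define the flag */
#endif

/*
 * Hypothetical helper: map stack_size usable bytes plus one PROT_NONE
 * guard page directly below them. stack_size must be a non-zero
 * multiple of the page size. Returns the lowest usable address, or
 * NULL on failure. The initial stack pointer for a downward-growing
 * stack is the returned address plus stack_size.
 */
static void *alloc_guarded_stack(size_t stack_size)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || stack_size == 0 || stack_size % (size_t)page != 0)
        return NULL;
    if (stack_size > SIZE_MAX - (size_t)page)
        return NULL;

    size_t total = stack_size + (size_t)page;
    char *map = mmap(NULL, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (map == MAP_FAILED)
        return NULL;

    /* Make the lowest page inaccessible so an overflow faults
       instead of silently corrupting adjacent memory. */
    if (mprotect(map, (size_t)page, PROT_NONE) != 0) {
        munmap(map, total);
        return NULL;
    }
    return map + page;
}

/* Hypothetical counterpart: unmap the stack together with its guard page.
   Returns 0 on success, -1 on failure (as munmap() does). */
static int free_guarded_stack(void *base, size_t stack_size)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(page > 0);
    return munmap((char *)base - page, stack_size + (size_t)page);
}
```

With clone(), the caller would pass the returned address plus
stack_size as the stack argument; with clone3(), the returned address
goes into cl_args.stack and the size into cl_args.stack_size. A single
guard page only catches overflows that touch it, so a large stack
frame can still jump past it.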