From: Florian Weimer
To: Linus Torvalds
Cc: Jann Horn, Kevin Easton, Andy Lutomirski, Christian Brauner,
 Aleksa Sarai, "Enrico Weigelt, metux IT consult", Al Viro,
 David Howells, Linux API, LKML, "Serge E. Hallyn", Arnd Bergmann,
 "Eric W. Biederman", Kees Cook, Thomas Gleixner, Michael Kerrisk,
 Andrew Morton, Oleg Nesterov, Joel Fernandes, Daniel Colascione
Subject: Re: RFC: on adding new CLONE_* flags [WAS Re: [PATCH 0/4] clone: add CLONE_PIDFD]
References: <20190414201436.19502-1-christian@brauner.io>
 <20190415195911.z7b7miwsj67ha54y@yavin>
 <20190420071406.GA22257@ip-172-31-15-78>
Date: Tue, 30 Apr 2019 10:21:20 +0200
In-Reply-To: (Linus Torvalds's message of "Mon, 29 Apr 2019 19:16:11 -0700")
Message-ID: <87r29jaoov.fsf@oldenburg2.str.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Linus Torvalds:

> Note that vfork() is "exciting" for the compiler in much the same way
> "setjmp/longjmp()" is, because of the shared stack use in the child
> and the parent. It is *very* easy to get this wrong and cause massive
> and subtle memory corruption issues because the parent returns to
> something that has been messed up by the child.

Just using a wrapper around vfork is enough for that, if the return
address is saved on the stack. It's surprisingly hard to write a test
case for that, but the corruption is definitely there.

> (In fact, if I recall correctly, the _reason_ we have an explicit
> 'vfork()' entry point rather than using clone() with magic parameters
> was that the lack of arguments meant that you didn't have to
> save/restore any registers in user space, which made the whole stack
> issue simpler. But it's been two decades, so my memory is bitrotting).

That's an interesting point. Using a callback-style interface avoids
that because you never need to restore the registers in the new
subprocess.
It's still appropriate to use an assembler implementation, I think,
because it will be more obviously correct.

> Also, particularly if you have a big address space, vfork()+execve()
> can be quite a bit faster than fork()+execve(). Linux fork() is pretty
> efficient, but if you have gigabytes of VM space to copy, it's going
> to take time even if you do it fairly well.

vfork is also more benign from a memory accounting perspective. In some
environments, it's not possible to call fork from a large process
because the accounting assumes (conservatively) that the new process
will dirty a lot of its private memory.

Thanks,
Florian
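
For readers following the thread, the vfork()+execve() pattern discussed
above can be sketched in C roughly as follows. This is an illustration
only, not code from this thread: spawn_and_wait is a hypothetical helper
name, and the exit code 127 for a failed execve() is just the usual shell
convention. The sketch calls vfork() directly in the function that goes on
to call execve(), rather than through a wrapper that returns in the child,
and the child does nothing except execve() or _exit().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run the program at `path` with `argv` and return
 * its exit status, or -1 on error or abnormal termination.
 */
int spawn_and_wait(const char *path, char *const argv[])
{
    /*
     * vfork() is called here, in the same stack frame that later calls
     * execve(). The child borrows the parent's stack until it execs or
     * exits, so it must not return from this function.
     */
    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only execve() or _exit() are safe at this point. */
        execve(path, argv, environ);
        _exit(127); /* reached only if execve() failed */
    }

    /* Parent: resumes after the child has exec'ed or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_wait("/bin/sh", argv) with argv set to
{ "sh", "-c", "exit 3", NULL } should return 3. Where it is available,
posix_spawn() is usually the easier way to get the same effect without
handling vfork() by hand.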