From: Geert Uytterhoeven
Date: Tue, 12 Sep 2017 13:30:51 +0200
Subject: Re: execve(NULL, argv, envp) for nommu?
To: Rob Landley
Cc: Oleg Nesterov, Alan Cox, Linux Embedded, Rich Felker, "linux-kernel@vger.kernel.org"
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Rob,

On Tue, Sep 12, 2017 at 12:48 PM, Rob Landley wrote:
> A nommu system doesn't have a memory management unit, so all addresses
> are physical addresses. This means two processes can't see different
> things at the same address: either they see the same thing or one of
> them can't see that address (due to a range register making it).
>
> Conventional fork() creates copy on write mappings of all the existing
> writable memory of the parent process. So when the new PID dirties a
> page, the old page gets copied by the fault handler. The problem isn't
> the copies (that's just slow), the problem is two processes seeing
> different things at the same address. That requires an MMU with a TLB
> loaded from page tables.
>
> If you create _new_ mappings and copy the data over, they'll have
> different addresses. But any pointers you copied will point to the _old_
> addresses.
> Finding and adjusting all those pointers to point to the new
> addresses instead is basically the same problem as doing garbage
> collection in C.
>
> Your stack has pointers. Your heap has pointers. Your data and bss (once
> initialized) can have pointers. These pointers can be in the middle of
> malloc()'ed structures so no ELF table anywhere knows anything about
> them. A long variable containing a value that _could_ point into one of
> these ranges isn't guaranteed to _be_ a pointer, in which case adjusting
> it is breakage. Tracking them all down and fixing up just the right ones
> without missing any or changing data you shouldn't is REALLY HARD.

Hence (make the compiler) never store pointers, only offsets relative to a
base register. So after making copies of stack, data/bss, and heap, all you
need to do is adjust these base registers for the child process. Nothing in
main memory needs to be modified.

Text accesses can be PC-relative => nothing to adjust.
Local variable accesses are stack-relative => nothing to adjust.
Data/bss accesses can be relative to a reserved register that stores the
data base address => only adjust the base register, nothing in RAM to adjust.
Heap accesses can be relative to a reserved register that stores the
heap base address => only adjust the base register, nothing in RAM to adjust.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
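
As a rough illustration of the offsets-only idea above, here is a minimal
user-space sketch in C. It is not from this thread and it is not kernel or
compiler code: every name in it (struct arena, node_at, push, arena_clone,
demo) is hypothetical, and the "base register" is modelled as an ordinary
pointer field. It assumes a small fixed-size heap in which list links are
stored as byte offsets from the base, so that copying the heap to a new
address needs no fixups in the copied memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; user-space illustration only. */
#define ARENA_SIZE 256
#define NIL ((uint32_t)0)        /* offset 0 is reserved as the "null" link */

/* A list node that stores the offset of the next node, never its address. */
struct node {
    int value;
    uint32_t next;               /* byte offset from the arena base, or NIL */
};

struct arena {
    unsigned char *base;         /* stands in for the reserved base register */
    uint32_t used;               /* bump-allocator cursor, also an offset */
};

/* Turn an offset into a usable pointer relative to this arena's base. */
struct node *node_at(const struct arena *a, uint32_t off)
{
    return off == NIL ? NULL : (struct node *)(a->base + off);
}

struct arena arena_new(void)
{
    /* Start allocating past offset 0 so that NIL never names a real node. */
    struct arena a = { calloc(1, ARENA_SIZE), sizeof(struct node) };
    assert(a.base != NULL);
    return a;
}

/* Allocate a node, link it in front of head, and return its offset. */
uint32_t push(struct arena *a, uint32_t head, int value)
{
    uint32_t off = a->used;
    assert(off + sizeof(struct node) <= ARENA_SIZE);
    a->used += sizeof(struct node);
    struct node *n = node_at(a, off);
    n->value = value;
    n->next = head;
    return off;
}

int sum(const struct arena *a, uint32_t head)
{
    int total = 0;
    for (const struct node *n = node_at(a, head); n; n = node_at(a, n->next))
        total += n->value;
    return total;
}

/* "Fork" the arena: copy the bytes elsewhere and change only the base. */
struct arena arena_clone(const struct arena *parent)
{
    struct arena child = { malloc(ARENA_SIZE), parent->used };
    assert(child.base != NULL);
    memcpy(child.base, parent->base, ARENA_SIZE);
    return child;
}

/* Returns 1 if the copy is usable as-is and independent of the original. */
int demo(void)
{
    struct arena parent = arena_new();
    uint32_t head = NIL;
    for (int i = 1; i <= 3; i++)
        head = push(&parent, head, i);

    struct arena child = arena_clone(&parent);
    node_at(&child, head)->value = 10;   /* modify the child's copy only */

    int ok = child.base != parent.base
          && sum(&parent, head) == 6     /* 3 + 2 + 1, unchanged */
          && sum(&child, head) == 13;    /* 10 + 2 + 1 */
    free(parent.base);
    free(child.base);
    return ok;
}
```

Calling demo() from a test harness or a small main() returns 1. The same
head offset is valid in both copies because nothing stored in the arena is
an absolute address. The sketch covers only one hand-written data structure;
the proposal above would need the compiler to apply the same discipline to
every stored pointer.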