Subject: Re: [PATCH RFC 0/5] mm: process_vm_mmap() -- syscall for duplication a process mapping
To: Michal Hocko
Cc: akpm@linux-foundation.org, dan.j.williams@intel.com,
 keith.busch@intel.com, kirill.shutemov@linux.intel.com,
 pasha.tatashin@oracle.com, alexander.h.duyck@linux.intel.com,
 ira.weiny@intel.com, andreyknvl@google.com, arunks@codeaurora.org,
 vbabka@suse.cz, cl@linux.com, riel@surriel.com, keescook@chromium.org,
 hannes@cmpxchg.org, npiggin@gmail.com, mathieu.desnoyers@efficios.com,
 shakeelb@google.com, guro@fb.com, aarcange@redhat.com, hughd@google.com,
 jglisse@redhat.com, mgorman@techsingularity.net,
 daniel.m.jordan@oracle.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-api@vger.kernel.org
References: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
 <20190516133034.GT16651@dhcp22.suse.cz>
 <20190516135259.GU16651@dhcp22.suse.cz>
From: Kirill Tkhai
Message-ID:
 <85562807-2a13-9aa2-e67d-15513c766eae@virtuozzo.com>
Date: Thu, 16 May 2019 17:22:23 +0300
In-Reply-To: <20190516135259.GU16651@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 16.05.2019 16:52, Michal Hocko wrote:
> On Thu 16-05-19 15:30:34, Michal Hocko wrote:
>> [You are defining a new user visible API, please always add linux-api
>> mailing list - now done]
>>
>> On Wed 15-05-19 18:11:15, Kirill Tkhai wrote:
> [...]
>>> The proposed syscall aims to introduce an interface, which
>>> supplements currently existing process_vm_writev() and
>>> process_vm_readv(), and allows to solve the problem with
>>> anonymous memory transfer. The above example may be rewritten as:
>>>
>>>     void *buf;
>>>
>>>     buf = mmap(NULL, n * PAGE_SIZE, PROT_READ|PROT_WRITE,
>>>                MAP_PRIVATE|MAP_ANONYMOUS, ...);
>>>     recv(sock, buf, n * PAGE_SIZE, 0);
>>>
>>>     /* Sign of @pid is direction: "from @pid task to current" or vice versa. */
>>>     process_vm_mmap(-pid, buf, n * PAGE_SIZE, remote_addr, PVMMAP_FIXED);
>>>     munmap(buf, n * PAGE_SIZE);
>
> AFAIU this means that you actually want to do an mmap of an anonymous
> memory with a COW semantic to the remote process right?

Yes.

> How does the remote process find out where and what has been mmaped?

Any way. Isn't this a trivial task? :) You may use a socket or any other
appropriate Linux feature to communicate between them.

> What if the range collides? This sounds quite scary to me TBH.

If the range collides, the overlapping part of the old VMA becomes
unmapped, the same way we behave on an ordinary mmap(). You may intersect
a range that another thread has mapped, so you need synchronization
between them. There is no difference in principle.
Also, I'm going to add a flag to prevent unmapping, as Kees suggested.
Please see his message.

> Why cannot you simply use shared memory for that?

Because the remote task may want a specific type of VMA. It may not want
to share a VMA with its children. Speaking of online migration, a task
wants its anonymous private VMAs to remain the same after the migration.
Otherwise, imagine the situation where a task's stack becomes a shared
VMA after the migration. Also, a task wants an anonymous mapping to
remain anonymous.

In general, if shared memory were enough for everything, we would never
have had the process_vm_writev() and process_vm_readv() syscalls.

Kirill
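
The difference between private and shared anonymous VMAs that the answer
above relies on can be observed with existing syscalls alone. Below is a
minimal sketch, assuming Linux with glibc; child_write_visible() is a
hypothetical helper name chosen for illustration and is not part of the
patch set. It maps one anonymous page, forks, lets the child write to the
page, and reports whether the parent sees that write.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * sharing_flag is MAP_PRIVATE or MAP_SHARED. Returns 1 if a write done
 * by the child after fork() is visible to the parent, 0 if it is not,
 * and -1 on error.
 */
int child_write_visible(int sharing_flag)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    void *m = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    volatile char *p = m;
    p[0] = 'A'; /* value written before the fork */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(m, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: with MAP_PRIVATE this write hits a COW copy. */
        p[0] = 'B';
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(m, len);
        return -1;
    }

    /* Parent: check which value it observes after the child exited. */
    int visible = (p[0] == 'B');
    munmap(m, len);
    return visible;
}
```

Calling child_write_visible(MAP_SHARED) is expected to return 1, while
child_write_visible(MAP_PRIVATE) is expected to return 0, because the
child's write goes to its own copy-on-write page. This is the property a
private anonymous VMA such as a stack would lose if it were recreated as
a shared mapping.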