Date: Mon, 19 Nov 2018 11:53:33 -0700
From: Jason Gunthorpe
To: Jerome Glisse
Cc: Leon Romanovsky, Kenneth Lee, Tim Sell, linux-doc@vger.kernel.org,
    Alexander Shishkin, Zaibo Xu, zhangfei.gao@foxmail.com,
    linuxarm@huawei.com, haojian.zhuang@linaro.org, Christoph Lameter,
    Hao Fang, Gavin Schenk, RDMA mailing list, Zhou Wang, Doug Ledford,
    Uwe Kleine-König, David Kershner, Kenneth Lee, Johan Hovold,
    Cyrille Pitchen, Sagar Dharia, Jens Axboe, guodong.xu@linaro.org,
    linux-netdev, Randy Dunlap, linux-kernel@vger.kernel.org, Vinod Koul,
    linux-crypto@vger.kernel.org, Philippe Ombredanne, Sanyog Kale,
    "David S. Miller", linux-accelerators@lists.ozlabs.org
Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
Message-ID: <20181119185333.GC4890@ziepe.ca>
References: <95310df4-b32c-42f0-c750-3ad5eb89b3dd@gmail.com>
    <20181114160017.GI3759@mtr-leonro.mtl.com>
    <20181115085109.GD157308@Turing-Arch-b>
    <20181115145455.GN3759@mtr-leonro.mtl.com>
    <20181119091405.GE157308@Turing-Arch-b>
    <20181119091910.GF157308@Turing-Arch-b>
    <20181119104801.GF8268@mtr-leonro.mtl.com>
    <20181119164853.GA4593@redhat.com>
    <20181119182752.GA4890@ziepe.ca>
    <20181119184215.GB4593@redhat.com>
In-Reply-To: <20181119184215.GB4593@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 19, 2018 at 01:42:16PM -0500, Jerome Glisse wrote:
> On Mon, Nov 19, 2018 at 11:27:52AM -0700, Jason Gunthorpe wrote:
> > On Mon, Nov 19, 2018 at 11:48:54AM -0500, Jerome Glisse wrote:
> > >
> > > Just to comment on this, any infiniband driver which use umem and do
> > > not have ODP (here ODP for me means listening to mmu notifier so all
> > > infiniband driver except mlx5) will be affected by same issue AFAICT.
> > >
> > > AFAICT there is no special thing happening after fork() inside any of
> > > those driver. So if parent create a umem mr before fork() and program
> > > hardware with it then after fork() the parent might start using new
> > > page for the umem range while the old memory is use by the child. The
> > > reverse is also true (parent using old memory and child new memory)
> > > bottom line you can not predict which memory the child or the parent
> > > will use for the range after fork().
> > >
> > > So no matter what you consider the child or the parent, what the hw
> > > will use for the mr is unlikely to match what the CPU use for the
> > > same virtual address.
> > > In other words:
> > >
> > > Before fork:
> > >       CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
> > >       HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
> > >
> > > Case 1:
> > >       CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
> > >       CPU child:  virtual addr ptr1 -> physical address = 0xDEAD
> > >       HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
> > >
> > > Case 2:
> > >       CPU parent: virtual addr ptr1 -> physical address = 0xBEEF
> > >       CPU child:  virtual addr ptr1 -> physical address = 0xCAFE
> > >       HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
> >
> > IIRC this is solved in IB by automatically calling
> > madvise(MADV_DONTFORK) before creating the MR.
> >
> > MADV_DONTFORK
> >   .. This is useful to prevent copy-on-write semantics from changing the
> >   physical location of a page if the parent writes to it after a
> >   fork(2) ..
>
> This would work around the issue but this is not transparent ie
> range marked with DONTFORK no longer behave as expected from the
> application point of view.

Do you know what the difference is? The man page really gives no
hint. Does it sometimes unmap the pages during fork?

I actually wonder if the kernel is a bit broken here. We have the same
problem with O_DIRECT and other stuff, right?

Really, if I have a get_user_pages FOLL_WRITE on a page and we fork,
then shouldn't the COW immediately be broken during the fork?

The kernel can't guarantee that an ongoing DMA will not write to those
pages, and it breaks the fork semantic to write to both processes.

> Also it relies on userspace doing the right thing (which is not
> something i usually trust :)).

Well, if they do it wrong they get to keep all the broken bits :)

Jason
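
For anyone wanting to poke at the MADV_DONTFORK question from userspace,
here is a minimal sketch, assuming Linux with glibc. It is not taken from
any IB driver or library; child_sees_range() is a hypothetical helper
written for this illustration. It maps one anonymous page, optionally
marks it MADV_DONTFORK, forks, and lets the child ask mincore(2) whether
the range is still part of its address space.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread or any library).
 * Returns 1 if a forked child still has the buffer mapped,
 * 0 if the range is absent from the child, -1 on any other failure.
 * use_dontfork selects whether madvise(MADV_DONTFORK) is applied first.
 */
int child_sees_range(int use_dontfork)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    buf[0] = 0x42; /* touch the page so it is actually populated */

    if (use_dontfork && madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: mincore() fails with ENOMEM on an unmapped range. */
        unsigned char vec;
        int rc = mincore(buf, len, &vec);
        if (rc == 0)
            _exit(1);               /* range is mapped in the child */
        _exit(errno == ENOMEM ? 0 : 2);
    }

    /* Parent: translate the child's exit code into the return value. */
    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        int code = WEXITSTATUS(status);
        if (code == 0 || code == 1)
            result = code;
    }

    munmap(buf, len);
    return result;
}
```

On Linux, child_sees_range(0) is expected to return 1 and
child_sees_range(1) is expected to return 0, i.e. the marked range is
simply not present in the child. The sketch only probes the mapping; it
says nothing about which physical page the parent or a device ends up
using, since that is not visible without privileged interfaces.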