Date: Tue, 22 Sep 2020 12:56:04 -0300
From: Jason Gunthorpe
To: Peter Xu
Cc: Jann Horn, Linux-MM, kernel list, Andrew Morton, Jan Kara,
    Michal Hocko, Kirill Tkhai, Kirill Shutemov, Hugh Dickins,
    Christoph Hellwig, Andrea Arcangeli, John Hubbard, Oleg Nesterov,
    Leon Romanovsky, Linus Torvalds
Subject: Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned
Message-ID: <20200922155604.GA731578@ziepe.ca>
References: <20200921211744.24758-1-peterx@redhat.com>
    <20200921211744.24758-2-peterx@redhat.com>
    <20200921223004.GB19098@xz-x1>
    <20200922115436.GG8409@ziepe.ca>
    <20200922142802.GC19098@xz-x1>
In-Reply-To: <20200922142802.GC19098@xz-x1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 22, 2020 at 10:28:02AM -0400, Peter Xu wrote:
> On Tue, Sep 22, 2020 at 08:54:36AM -0300, Jason Gunthorpe wrote:
> > On Tue, Sep 22, 2020 at 12:47:11AM +0200, Jann Horn wrote:
> > > On Tue, Sep 22, 2020 at 12:30 AM Peter Xu wrote:
> > > > On Mon, Sep 21, 2020 at 11:43:38PM +0200, Jann Horn wrote:
> > > > > On Mon, Sep 21, 2020 at 11:17 PM Peter Xu wrote:
> > > > > > (Commit message collected from Jason Gunthorpe)
> > > > > >
> > > > > > Reduce the chance of false positive from page_maybe_dma_pinned() by keeping
> > > > > > track if the mm_struct has ever been used with pin_user_pages(). mm_structs
> > > > > > that have never been passed to pin_user_pages() cannot have a positive
> > > > > > page_maybe_dma_pinned() by definition.
> > > > >
> > > > > There are some caveats here, right? E.g. this isn't necessarily true
> > > > > for pagecache pages, I think?
> > > >
> > > > Sorry I didn't follow here. Could you help explain with some details?
> > >
> > > The commit message says "mm_structs that have never been passed to
> > > pin_user_pages() cannot have a positive page_maybe_dma_pinned() by
> > > definition"; but that is not true for pages which may also be mapped
> > > in a second mm and may have been passed to pin_user_pages() through
> > > that second mm (meaning they must be writable over there and not
> > > shared with us via CoW).
> >
> > The message does need a few more words to explain this trick can only
> > be used with COW'able pages.
> >
> > > Process A:
> > >
> > > fd_a = open("/foo/bar", O_RDWR);
> > > mapping_a = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, fd_a, 0);
> > > pin_user_pages(mapping_a, 1, ...);
> > >
> > > Process B:
> > >
> > > fd_b = open("/foo/bar", O_RDONLY);
> > > mapping_b = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd_b, 0);
> > > *(volatile char *)mapping_b;
> > >
> > > At this point, process B has never called pin_user_pages(), but
> > > page_maybe_dma_pinned() on the page at mapping_b would return true.
> >
> > My expectation is the pin_user_pages() should have already broken the
> > COW for the MAP_PRIVATE, so process B should not have a
> > page_maybe_dma_pinned()
>
> When process B maps with PROT_READ only (w/o PROT_WRITE) then it seems the same
> page will be mapped.

I thought MAP_PRIVATE without PROT_WRITE was nonsensical; it only has
meaning for writes initiated by the mapping. MAP_SHARED/PROT_READ is
the same behavior on Linux, IIRC.

But, yes, you certainly can end up with B having
page_maybe_dma_pinned() pages in a shared VMA, just not in COW'able
mappings.

> I think I get the point from Jann now. Maybe it's easier I just remove the
> whole "mm_structs that have never been passed to pin_user_pages() cannot have a
> positive page_maybe_dma_pinned() by definition" sentence if that's misleading,
> because the rest seem to be clear enough on what this new field is used for.

"for COW" I think is still the important detail here, see for instance
my remark on the PUD/PMD splitting where it is necessary to test for
COW before using this.

Perhaps we should call it "has_pinned_for_cow" to place emphasis on
this detail? Due to the shared pages issue it really doesn't have any
broader utility, e.g. for file-backed pages or otherwise.

Jason
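
The page sharing discussed above can be observed from userspace. What
follows is a minimal sketch, not code from this thread: the function
name private_mapping_shares_page() and the temporary file path are
hypothetical, both mappings live in one process instead of two, and
pin_user_pages() is a kernel-internal API, so no pinning is involved.
It only shows that, on Linux, a MAP_PRIVATE mapping keeps referencing
the same page cache page as a MAP_SHARED mapping of the file until the
private side is written, which is why only COW'able mappings can rely
on a per-mm flag.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread). Maps one page of a temporary
 * file twice, MAP_SHARED and MAP_PRIVATE, and reports what the private
 * mapping observes. Returns a bitmask, or -1 on setup failure:
 *   bit 0: a store through the shared mapping was visible through the
 *          private mapping before the private mapping was written;
 *   bit 1: after a store through the private mapping (COW break), a later
 *          store through the shared mapping was not visible through it.
 */
int private_mapping_shares_page(void)
{
	char path[] = "/tmp/cow-sketch-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	int fd = mkstemp(path);
	int result = 0;

	if (fd < 0)
		return -1;
	unlink(path);	/* the open fd keeps the file alive */
	if (page <= 0 || ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}

	void *s = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	void *p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (s == MAP_FAILED || p == MAP_FAILED) {
		if (s != MAP_FAILED)
			munmap(s, page);
		if (p != MAP_FAILED)
			munmap(p, page);
		close(fd);
		return -1;
	}

	volatile char *shared = s;
	volatile char *priv = p;

	(void)priv[0];		/* read fault, as in Jann's process B */
	shared[0] = 'A';	/* store via the shared mapping */
	if (priv[0] == 'A')
		result |= 1;	/* same page cache page seen by both */

	priv[0] = 'B';		/* write fault: private mapping gets its own copy */
	shared[0] = 'C';
	if (priv[0] == 'B' && shared[0] == 'C')
		result |= 2;	/* the mappings have diverged */

	munmap(s, page);
	munmap(p, page);
	close(fd);
	return result;
}
```

On Linux I would expect this to return 3. POSIX leaves the visibility
of later file modifications through a MAP_PRIVATE mapping unspecified,
so bit 0 is a Linux behavior and not a portable guarantee.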