Date: Fri, 1 Oct 2021 10:48:56 -0300
From: Jason Gunthorpe
To: Logan Gunthorpe, Alistair Popple, Felix Kuehling, Christoph Hellwig,
 Dan Williams
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-block@vger.kernel.org, linux-pci@vger.kernel.org,
 linux-mm@kvack.org, iommu@lists.linux-foundation.org, Stephen Bates,
 Christian König, John Hubbard, Don Dutile, Matthew Wilcox,
 Daniel Vetter, Jakowski Andrzej, Minturn Dave B, Jason Ekstrand,
 Dave Hansen, Xiong Jianxin, Bjorn Helgaas, Ira Weiny, Robin Murphy,
 Martin Oliveira, Chaitanya Kulkarni
Subject: Re: [PATCH v3 19/20] PCI/P2PDMA: introduce pci_mmap_p2pmem()
Message-ID: <20211001134856.GN3544071@ziepe.ca>
References: <20210916234100.122368-1-logang@deltatee.com>
 <20210916234100.122368-20-logang@deltatee.com>
 <20210928195518.GV3544071@ziepe.ca>
 <8d386273-c721-c919-9749-fc0a7dc1ed8b@deltatee.com>
 <20210929230543.GB3544071@ziepe.ca>
 <32ce26d7-86e9-f8d5-f0cf-40497946efe9@deltatee.com>
 <20210929233540.GF3544071@ziepe.ca>
 <20210930003652.GH3544071@ziepe.ca>
In-Reply-To: <20210930003652.GH3544071@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 29, 2021 at 09:36:52PM -0300, Jason Gunthorpe wrote:

> Why would DAX want to do this in the first place?? This means the
> address space zap is much more important than just speeding up
> destruction, it is essential for correctness since the PTEs are not
> holding refcounts naturally...

It is not really for this series to fix, but I think the whole thing
is probably racy once you start allowing pte_special pages to be
accessed by GUP.

If we look at unmapping the PTE relative to GUP fast, the important
part of the sequence is that the TLB flushing doesn't decrement the
page refcount until after it knows any concurrent GUP fast has
completed. This is arch specific, e.g. it could be done async through
a call_rcu handler.

This ensures that pages can't cross back into the free pool and be
reallocated until we know for certain that nobody is walking the PTEs
and could potentially take an additional reference on them. The scheme
cannot rely on the page refcount being 0, because once a page goes
into the free pool it could be immediately reallocated back to a
non-zero refcount.

A DAX user that simply does an address space invalidation doesn't
sequence itself with any of this mechanism. So we can race with one
thread doing GUP fast and another thread recycling the page into
another use - creating a leakage of the page from one security context
to another.

This seems to be made worse for the pgmap stuff due to the wonky
refcount usage - at least if the refcount had dropped to zero GUP fast
would be blocked for a time, but even that doesn't happen.

In short, I think using pte special for anything that can be returned
by GUP fast (and maybe even GUP!) is racy/wrong. We must have the
normal refcount mechanism work for correctness of the recycling flow.

I don't know why DAX did this. I think we should be talking about
undoing all of it: not just the wonky refcounting Alistair and Felix
are working on, but also the use of MIXEDMAP and pte special for
struct page backed memory.

Jason