Subject: [PATCH 5/6] pci/p2pdma: Track pgmap references per resource, not globally
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Logan Gunthorpe, Bjorn Helgaas, Christoph Hellwig,
 linux-mm@kvack.org, linux-pci@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Fri, 29 Mar 2019 08:27:50 -0700
Message-ID: <155387327020.2443841.6446837127378298192.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155387324370.2443841.574715745262628837.stgit@dwillia2-desk3.amr.corp.intel.com>
References:
 <155387324370.2443841.574715745262628837.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

In preparation for fixing a race between devm_memremap_pages_release()
and the final put of a page from the device-page-map, allocate a
percpu-ref per p2pdma resource mapping.

Cc: Logan Gunthorpe
Cc: Bjorn Helgaas
Cc: Christoph Hellwig
Signed-off-by: Dan Williams
---
 drivers/pci/p2pdma.c | 114 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 73 insertions(+), 41 deletions(-)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 595a534bd749..1b96c1688715 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -20,12 +20,16 @@
 #include

 struct pci_p2pdma {
-	struct percpu_ref devmap_ref;
-	struct completion devmap_ref_done;
 	struct gen_pool *pool;
 	bool p2pmem_published;
 };

+struct p2pdma_pagemap {
+	struct dev_pagemap pgmap;
+	struct percpu_ref ref;
+	struct completion ref_done;
+};
+
 static ssize_t size_show(struct device *dev, struct device_attribute *attr,
 			 char *buf)
 {
@@ -74,28 +78,31 @@ static const struct attribute_group p2pmem_group = {
 	.name = "p2pmem",
 };

+static struct p2pdma_pagemap *to_p2p_pgmap(struct percpu_ref *ref)
+{
+	return container_of(ref, struct p2pdma_pagemap, ref);
+}
+
 static void pci_p2pdma_percpu_release(struct percpu_ref *ref)
 {
-	struct pci_p2pdma *p2p =
-		container_of(ref, struct pci_p2pdma, devmap_ref);
+	struct p2pdma_pagemap *p2p_pgmap = to_p2p_pgmap(ref);

-	complete_all(&p2p->devmap_ref_done);
+	complete(&p2p_pgmap->ref_done);
 }

 static void pci_p2pdma_percpu_kill(struct percpu_ref *ref)
 {
-	/*
-	 * pci_p2pdma_add_resource() may be called multiple times
-	 * by a driver and may register the percpu_kill devm action multiple
-	 * times. We only want the first action to actually kill the
-	 * percpu_ref.
-	 */
-	if (percpu_ref_is_dying(ref))
-		return;
-
 	percpu_ref_kill(ref);
 }

+static void pci_p2pdma_percpu_cleanup(void *ref)
+{
+	struct p2pdma_pagemap *p2p_pgmap = to_p2p_pgmap(ref);
+
+	wait_for_completion(&p2p_pgmap->ref_done);
+	percpu_ref_exit(&p2p_pgmap->ref);
+}
+
 static void pci_p2pdma_release(void *data)
 {
 	struct pci_dev *pdev = data;
@@ -103,12 +110,12 @@ static void pci_p2pdma_release(void *data)
 	if (!pdev->p2pdma)
 		return;

-	wait_for_completion(&pdev->p2pdma->devmap_ref_done);
-	percpu_ref_exit(&pdev->p2pdma->devmap_ref);
+	/* Flush and disable pci_alloc_p2p_mem() */
+	pdev->p2pdma = NULL;
+	synchronize_rcu();

 	gen_pool_destroy(pdev->p2pdma->pool);
 	sysfs_remove_group(&pdev->dev.kobj, &p2pmem_group);
-	pdev->p2pdma = NULL;
 }

 static int pci_p2pdma_setup(struct pci_dev *pdev)
@@ -124,12 +131,6 @@ static int pci_p2pdma_setup(struct pci_dev *pdev)
 	if (!p2p->pool)
 		goto out;

-	init_completion(&p2p->devmap_ref_done);
-	error = percpu_ref_init(&p2p->devmap_ref,
-			pci_p2pdma_percpu_release, 0, GFP_KERNEL);
-	if (error)
-		goto out_pool_destroy;
-
 	error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_release, pdev);
 	if (error)
 		goto out_pool_destroy;
@@ -163,6 +164,7 @@ static int pci_p2pdma_setup(struct pci_dev *pdev)
 int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 			    u64 offset)
 {
+	struct p2pdma_pagemap *p2p_pgmap;
 	struct dev_pagemap *pgmap;
 	void *addr;
 	int error;
@@ -185,14 +187,32 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 		return error;
 	}

-	pgmap = devm_kzalloc(&pdev->dev, sizeof(*pgmap), GFP_KERNEL);
-	if (!pgmap)
+	p2p_pgmap = devm_kzalloc(&pdev->dev, sizeof(*p2p_pgmap), GFP_KERNEL);
+	if (!p2p_pgmap)
 		return -ENOMEM;

+	init_completion(&p2p_pgmap->ref_done);
+	error = percpu_ref_init(&p2p_pgmap->ref,
+			pci_p2pdma_percpu_release, 0, GFP_KERNEL);
+	if (error)
+		goto pgmap_free;
+
+	/*
+	 * FIXME: the percpu_ref_exit needs to be coordinated internal
+	 * to devm_memremap_pages_release(). Duplicate the same ordering
+	 * as other devm_memremap_pages() users for now.
+	 */
+	error = devm_add_action(&pdev->dev, pci_p2pdma_percpu_cleanup,
+			&p2p_pgmap->ref);
+	if (error)
+		goto ref_cleanup;
+
+	pgmap = &p2p_pgmap->pgmap;
+
 	pgmap->res.start = pci_resource_start(pdev, bar) + offset;
 	pgmap->res.end = pgmap->res.start + size - 1;
 	pgmap->res.flags = pci_resource_flags(pdev, bar);
-	pgmap->ref = &pdev->p2pdma->devmap_ref;
+	pgmap->ref = &p2p_pgmap->ref;
 	pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
 	pgmap->pci_p2pdma_bus_offset = pci_bus_address(pdev, bar) -
 		pci_resource_start(pdev, bar);
@@ -201,12 +221,13 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 	addr = devm_memremap_pages(&pdev->dev, pgmap);
 	if (IS_ERR(addr)) {
 		error = PTR_ERR(addr);
-		goto pgmap_free;
+		goto ref_exit;
 	}

-	error = gen_pool_add_virt(pdev->p2pdma->pool, (unsigned long)addr,
+	error = gen_pool_add_owner(pdev->p2pdma->pool, (unsigned long)addr,
 			pci_bus_address(pdev, bar) + offset,
-			resource_size(&pgmap->res), dev_to_node(&pdev->dev));
+			resource_size(&pgmap->res), dev_to_node(&pdev->dev),
+			&p2p_pgmap->ref);
 	if (error)
 		goto pages_free;

@@ -217,8 +238,10 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,

 pages_free:
 	devm_memunmap_pages(&pdev->dev, pgmap);
+ref_cleanup:
+	percpu_ref_exit(&p2p_pgmap->ref);
 pgmap_free:
-	devm_kfree(&pdev->dev, pgmap);
+	devm_kfree(&pdev->dev, p2p_pgmap);
 	return error;
 }
 EXPORT_SYMBOL_GPL(pci_p2pdma_add_resource);
@@ -555,19 +578,25 @@ EXPORT_SYMBOL_GPL(pci_p2pmem_find_many);
  */
 void *pci_alloc_p2pmem(struct pci_dev *pdev, size_t size)
 {
-	void *ret;
+	void *ret = NULL;
+	struct percpu_ref *ref;

+	rcu_read_lock();
 	if (unlikely(!pdev->p2pdma))
-		return NULL;
-
-	if (unlikely(!percpu_ref_tryget_live(&pdev->p2pdma->devmap_ref)))
-		return NULL;
-
-	ret = (void *)gen_pool_alloc(pdev->p2pdma->pool, size);
+		goto out;

-	if (unlikely(!ret))
-		percpu_ref_put(&pdev->p2pdma->devmap_ref);
+	ret = (void *)gen_pool_alloc_owner(pdev->p2pdma->pool, size,
+			(void **) &ref);
+	if (!ret)
+		goto out;

+	if (unlikely(!percpu_ref_tryget_live(ref))) {
+		gen_pool_free(pdev->p2pdma->pool, (unsigned long) ret, size);
+		ret = NULL;
+		goto out;
+	}
+out:
+	rcu_read_unlock();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(pci_alloc_p2pmem);
@@ -580,8 +609,11 @@ EXPORT_SYMBOL_GPL(pci_alloc_p2pmem);
  */
 void pci_free_p2pmem(struct pci_dev *pdev, void *addr, size_t size)
 {
-	gen_pool_free(pdev->p2pdma->pool, (uintptr_t)addr, size);
-	percpu_ref_put(&pdev->p2pdma->devmap_ref);
+	struct percpu_ref *ref;
+
+	gen_pool_free_owner(pdev->p2pdma->pool, (uintptr_t)addr, size,
+			(void **) &ref);
+	percpu_ref_put(ref);
 }
 EXPORT_SYMBOL_GPL(pci_free_p2pmem);

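
For readers following the refcounting change, here is a minimal userspace
sketch of the per-mapping reference pattern described in the changelog. It is
not kernel code and is not part of the patch. All toy_* names are hypothetical:
toy_ref is a single-threaded simplification of struct percpu_ref, and the
ref_done flag stands in for struct completion. It only illustrates that each
mapping embeds its own reference, that the release callback recovers the
owning mapping via container_of(), and that killing one mapping's reference
does not affect another's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, single-threaded simplification of struct percpu_ref. */
struct toy_ref {
	long count;
	bool dying;
	void (*release)(struct toy_ref *ref);
};

/* Hypothetical analog of struct p2pdma_pagemap: one ref per mapping. */
struct toy_pagemap {
	int id;
	struct toy_ref ref;
	bool ref_done;	/* stands in for struct completion */
};

static struct toy_pagemap *to_toy_pagemap(struct toy_ref *ref)
{
	return container_of(ref, struct toy_pagemap, ref);
}

/* Runs when the last reference is dropped; marks only this mapping done. */
static void toy_release(struct toy_ref *ref)
{
	to_toy_pagemap(ref)->ref_done = true;
}

static void toy_pagemap_init(struct toy_pagemap *pm, int id)
{
	pm->id = id;
	pm->ref.count = 1;	/* initial reference, dropped by toy_ref_kill() */
	pm->ref.dying = false;
	pm->ref.release = toy_release;
	pm->ref_done = false;
}

/* Take a reference unless the ref has already been killed. */
static bool toy_ref_tryget_live(struct toy_ref *ref)
{
	if (ref->dying)
		return false;
	ref->count++;
	return true;
}

static void toy_ref_put(struct toy_ref *ref)
{
	assert(ref->count > 0);
	if (--ref->count == 0)
		ref->release(ref);
}

/* Refuse new references and drop the initial one. */
static void toy_ref_kill(struct toy_ref *ref)
{
	ref->dying = true;
	toy_ref_put(ref);
}
```

With two mappings initialized through toy_pagemap_init(), killing the first
one's reference while an allocation is still outstanding leaves its ref_done
flag clear until the final toy_ref_put(), and the second mapping keeps
accepting toy_ref_tryget_live() throughout.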