Received: by 10.223.164.221 with SMTP id h29csp3722848wrb; Wed, 4 Oct 2017 06:53:48 -0700 (PDT) X-Received: by 10.84.173.228 with SMTP id p91mr20478356plb.264.1507125228108; Wed, 04 Oct 2017 06:53:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1507125228; cv=none; d=google.com; s=arc-20160816; b=wl+3pPRYxkJWc9jiYOCi9Qx3nxz+o5JL33NVVCYzy+mutIc9Q6jnS2CH9Gx4zuxeTF A5pptW9irP/2iKoWW3mO0Q0vlIHJkmi1RfS4cgF+2TABtaZFp8+qP0knP5hM+gSvMGLe AvKlE6NkTQXkns5OV4rcrrLfSpc9yXjtgrwthsGMO8KK1i5XWOU7qjx7egp2rF/jW8ua tWm9epRVEbf9Rd+Ki4A50jw3kUGdcGBsSLbCUWcJShO4E/ECuOq8D0SzEmMqY/fcMwkf vVfQ3AQ1H9eGfjD9CmQtfrsB8a30EDbpOOdFVfGystdjYSnAVx6K1IngacUjeYvIv4vE cq4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=+Do70G+1DSt/jq/aAAwKpvEQkQfvb8HRrsRy5ZLUZpA=; b=070+RZtQmnqH2rWf/vRczZVkXNX4O+9QNAjzzG2iJUaEomkdX21VaHA98BsBQUF6Xv H6xDuP/OscrvQXH3ZR6NT8RRD213GYsfK04OTV3UHXFxZB+V+/DDsfEFLpV0tn6kO6JO yDgjyPF0DhwIm+uV0IM5OrVh4ajEgSkRu7ucPB5fASvK/fQF6N1A9LYtohHgmFgdtiYL cjc2gYKPiBf5BCPsuqM45MejIDDUsPId76HCPCrXhcrQ1EE8ALkJSaFyCGkxtUCib21Y swZqeKXE2liiqQmgUj0Lj5JhTzDS/PkLYSCtQRd3G+9FZUawOtQbYQAv3KgM1LXt+RqF Y2Hg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=hB5Lqjnu; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m5si5586992pgs.223.2017.10.04.06.53.33; Wed, 04 Oct 2017 06:53:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=hB5Lqjnu; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752197AbdJDNxH (ORCPT + 99 others); Wed, 4 Oct 2017 09:53:07 -0400 Received: from mail-lf0-f68.google.com ([209.85.215.68]:43530 "EHLO mail-lf0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751176AbdJDNxF (ORCPT ); Wed, 4 Oct 2017 09:53:05 -0400 Received: by mail-lf0-f68.google.com with SMTP id o125so8899233lfe.0 for ; Wed, 04 Oct 2017 06:53:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=+Do70G+1DSt/jq/aAAwKpvEQkQfvb8HRrsRy5ZLUZpA=; b=hB5LqjnuBKUdACVlxzt7ON5iukvFCUKZuI+qrO2GYNZ1XUO1UTUQVq0VAB0spRxiCa 5V7wNgigotW3hOmRHkKWaM5Wjx1/M4xkPapYc8suZWeSfdqIyRI2e4y173npb0wGy9va ba0lVtrevSrn9HXfjPGPNG82BZebsDBOi1qJHfJtkJQhmG/o9CJXvQ9HBgeBUATxO8gn 8tWP48VMBsnrcC6stcA+QOW6+vR9X6TyUWBbcr2HBFt3P1vYws8VZMVRJupA2kGeU32E PyCGaFDvJbg4bXOgISG5vXViUiZIGyQdPATgv409v5WqfLvqnj/xkScMnE63nWkgCUmR S8pg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=+Do70G+1DSt/jq/aAAwKpvEQkQfvb8HRrsRy5ZLUZpA=; b=E/rW5TizuB93oZXkuPWOZhao1Ek4F6nR3gEAfh84R8ewBlIQNvUy9r0hTLc+asHDjp uNhjvSJ/EMp8ihtYfFteCAqRsL0T+4XcUtRv9sonUVu9py2SKTWO8oLJxWKj7XMD2YaF S4ZyJsSpwtDM7dD9aDbK9ZeNoGuxY8Dk2hNg+067ro2D/G3qIIJiHobpOZdQ5jbnOw89 PgtlC6kJv8EWX/cskO0ZEWuj2n5vCwhPntTWVOqAa/+XQLryZLsxPCHteuABC4gMZ+Nj ZRfexl0/XJ9iTx7rO72iw1orsuOtcwn68TA8xixMLUpJsfmtXdGFhZ2cyV76/ColxFQZ MNfA== X-Gm-Message-State: AMCzsaUky+6RRSusN3q+e4af/Wg968DZxF4mNr5XYa2XJ9+TO9n3NWAF VSIL+lCR4RRnITZOh6dwdP0ZX3Ersvq4kg8Rmcw= X-Google-Smtp-Source: AOwi7QAEbm+PCsxwTSGtGGtHNkF0It0c6PQ6Y+JuQyVt22YYqEZ8/cr2AtU3r2HFUG6nV2/tAwpMUcTg0dkpfL8I460= X-Received: by 10.25.99.65 with SMTP id x62mr4183957lfb.129.1507125184273; Wed, 04 Oct 2017 06:53:04 -0700 (PDT) MIME-Version: 1.0 Received: by 10.25.147.207 with HTTP; Wed, 4 Oct 2017 06:53:03 -0700 (PDT) In-Reply-To: References: <20170921085922.11659-1-ganapatrao.kulkarni@cavium.com> <20170921085922.11659-4-ganapatrao.kulkarni@cavium.com> From: Ganapatrao Kulkarni Date: Wed, 4 Oct 2017 19:23:03 +0530 Message-ID: Subject: Re: [PATCH 3/4] iommu/arm-smmu-v3: Use NUMA memory allocations for stream tables and comamnd queues To: Robin Murphy Cc: Ganapatrao Kulkarni , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, iommu@lists.linux-foundation.org, linux-mm@kvack.org, Christoph Hellwig , Marek Szyprowski , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Joerg Roedel , vbabka@suse.cz, akpm@linux-foundation.org, mhocko@suse.com, Tomasz.Nowicki@cavium.com, Robert.Richter@cavium.com, jnair@caviumnetworks.com Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Robin, On Thu, Sep 21, 2017 at 5:28 PM, Robin Murphy wrote: > [+Christoph and Marek] > > On 21/09/17 09:59, Ganapatrao Kulkarni wrote: >> Introduce smmu_alloc_coherent and smmu_free_coherent functions to >> allocate/free dma coherent memory from NUMA node associated with SMMU. >> Replace all calls of dmam_alloc_coherent with smmu_alloc_coherent >> for SMMU stream tables and command queues. > > This doesn't work - not only do you lose the 'managed' aspect and risk > leaking various tables on probe failure or device removal, but more > importantly, unless you add DMA syncs around all the CPU accesses to the > tables, you lose the critical 'coherent' aspect, and that's a horribly > invasive change that I really don't want to make. this implementation is similar to function used to allocate memory for translation tables. why do you see it affects to stream tables and not to page tables. at runtime, both tables are accessed by SMMU only. As said in cover letter, having stream table from respective NUMA node is yielding around 30% performance! please suggest, if there is any better way to address this issue? > > Christoph, Marek; how reasonable do you think it is to expect > dma_alloc_coherent() to be inherently NUMA-aware on NUMA-capable > systems? SWIOTLB looks fairly straightforward to fix up (for the simple > allocation case; I'm not sure it's even worth it for bounce-buffering), > but the likes of CMA might be a little trickier... > > Robin. > >> Signed-off-by: Ganapatrao Kulkarni >> --- >> drivers/iommu/arm-smmu-v3.c | 57 ++++++++++++++++++++++++++++++++++++++++----- >> 1 file changed, 51 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c >> index e67ba6c..bc4ba1f 100644 >> --- a/drivers/iommu/arm-smmu-v3.c >> +++ b/drivers/iommu/arm-smmu-v3.c >> @@ -1158,6 +1158,50 @@ static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent) >> } >> } >> >> +static void *smmu_alloc_coherent(struct arm_smmu_device *smmu, size_t size, >> + dma_addr_t *dma_handle, gfp_t gfp) >> +{ >> + struct device *dev = smmu->dev; >> + void *pages; >> + dma_addr_t dma; >> + int numa_node = dev_to_node(dev); >> + >> + pages = alloc_pages_exact_nid(numa_node, size, gfp | __GFP_ZERO); >> + if (!pages) >> + return NULL; >> + >> + if (!(smmu->features & ARM_SMMU_FEAT_COHERENCY)) { >> + dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE); >> + if (dma_mapping_error(dev, dma)) >> + goto out_free; >> + /* >> + * We depend on the SMMU being able to work with any physical >> + * address directly, so if the DMA layer suggests otherwise by >> + * translating or truncating them, that bodes very badly... >> + */ >> + if (dma != virt_to_phys(pages)) >> + goto out_unmap; >> + } >> + >> + *dma_handle = (dma_addr_t)virt_to_phys(pages); >> + return pages; >> + >> +out_unmap: >> + dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n"); >> + dma_unmap_single(dev, dma, size, DMA_TO_DEVICE); >> +out_free: >> + free_pages_exact(pages, size); >> + return NULL; >> +} >> + >> +static void smmu_free_coherent(struct arm_smmu_device *smmu, size_t size, >> + void *pages, dma_addr_t dma_handle) >> +{ >> + if (!(smmu->features & ARM_SMMU_FEAT_COHERENCY)) >> + dma_unmap_single(smmu->dev, dma_handle, size, DMA_TO_DEVICE); >> + free_pages_exact(pages, size); >> +} >> + >> static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) >> { >> size_t size; >> @@ -1172,7 +1216,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) >> strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS]; >> >> desc->span = STRTAB_SPLIT + 1; >> - desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &desc->l2ptr_dma, >> + desc->l2ptr = smmu_alloc_coherent(smmu, size, &desc->l2ptr_dma, >> GFP_KERNEL | __GFP_ZERO); >> if (!desc->l2ptr) { >> dev_err(smmu->dev, >> @@ -1487,7 +1531,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) >> struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; >> >> if (cfg->cdptr) { >> - dmam_free_coherent(smmu_domain->smmu->dev, >> + smmu_free_coherent(smmu, >> CTXDESC_CD_DWORDS << 3, >> cfg->cdptr, >> cfg->cdptr_dma); >> @@ -1515,7 +1559,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, >> if (asid < 0) >> return asid; >> >> - cfg->cdptr = dmam_alloc_coherent(smmu->dev, CTXDESC_CD_DWORDS << 3, >> + cfg->cdptr = smmu_alloc_coherent(smmu, CTXDESC_CD_DWORDS << 3, >> &cfg->cdptr_dma, >> GFP_KERNEL | __GFP_ZERO); >> if (!cfg->cdptr) { >> @@ -1984,7 +2028,7 @@ static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu, >> { >> size_t qsz = ((1 << q->max_n_shift) * dwords) << 3; >> >> - q->base = dmam_alloc_coherent(smmu->dev, qsz, &q->base_dma, GFP_KERNEL); >> + q->base = smmu_alloc_coherent(smmu, qsz, &q->base_dma, GFP_KERNEL); >> if (!q->base) { >> dev_err(smmu->dev, "failed to allocate queue (0x%zx bytes)\n", >> qsz); >> @@ -2069,7 +2113,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu) >> size, smmu->sid_bits); >> >> l1size = cfg->num_l1_ents * (STRTAB_L1_DESC_DWORDS << 3); >> - strtab = dmam_alloc_coherent(smmu->dev, l1size, &cfg->strtab_dma, >> + strtab = smmu_alloc_coherent(smmu, l1size, &cfg->strtab_dma, >> GFP_KERNEL | __GFP_ZERO); >> if (!strtab) { >> dev_err(smmu->dev, >> @@ -2097,8 +2141,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu) >> u32 size; >> struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; >> >> + >> size = (1 << smmu->sid_bits) * (STRTAB_STE_DWORDS << 3); >> - strtab = dmam_alloc_coherent(smmu->dev, size, &cfg->strtab_dma, >> + strtab = smmu_alloc_coherent(smmu, size, &cfg->strtab_dma, >> GFP_KERNEL | __GFP_ZERO); >> if (!strtab) { >> dev_err(smmu->dev, >> > thanks Ganapat From 1579876117202504021@xxx Fri Sep 29 12:14:36 +0000 2017 X-GM-THRID: 1579139169210447923 X-Gmail-Labels: Inbox,Category Forums