Date: Fri, 03 Oct 2014 12:33:56 -0400
From: Konrad Wilk
Organization: Oracle Corporation
To: Akinobu Mita
Cc: Peter Hurley, Thomas Gleixner, LKML, Andrew Morton, Marek Szyprowski,
    David Woodhouse, Don Dutile, Ingo Molnar, "H. Peter Anvin", Andi Kleen,
    x86@kernel.org, iommu@lists.linux-foundation.org, Greg KH
Subject: Re: [PATCH v3 0/5] enhance DMA CMA on x86

On 10/3/2014 12:06 PM, Akinobu Mita wrote:
> 2014-10-03 23:27 GMT+09:00 Peter Hurley:
>> On 10/02/2014 07:08 PM, Akinobu Mita wrote:
>>> 2014-10-03 7:03 GMT+09:00 Peter Hurley:
>>>> On 10/02/2014 12:41 PM, Konrad Rzeszutek Wilk wrote:
>>>>> On Tue, Sep 30, 2014 at 09:49:54PM -0400, Peter Hurley wrote:
>>>>>> On 09/30/2014 07:45 PM, Thomas Gleixner wrote:
>>>
>>>>>> Which is different than if the plan is to ship production units for x86;
>>>>>> then a general-purpose solution will be required.
>>>>>>
>>>>>> As to the good design of a general-purpose solution for allocating and
>>>>>> mapping huge order pages, you are certainly more qualified to help Akinobu
>>>>>> than I am.
>>>>
>>>> What Akinobu's patches intend to support is:
>>>>
>>>>     phys_addr = dma_alloc_coherent(dev, 64 * 1024 * 1024, &bus_addr, GFP_KERNEL);
>>>>
>>>> which raises three issues:
>>>>
>>>> 1. Where do coherent blocks of this size come from?
>>>> 2. How to prevent fragmentation of these reserved blocks over time by
>>>>    existing DMA users?
>>>> 3. Is this support generically required across all iommu implementations
>>>>    on x86?
>>>>
>>>> Questions 1 and 2 are non-trivial in the general case; otherwise the page
>>>> allocator would already do this. Simply dropping in the contiguous memory
>>>> allocator doesn't work because CMA does not have the same policy and
>>>> performance as the page allocator, and is already causing performance
>>>> regressions even in the absence of huge page allocations.
>>>
>>> Could you take a look at the patches I sent? Can they fix these issues?
>>> https://lkml.org/lkml/2014/9/28/110
>>>
>>> With these patches, normal alloc_pages() is used for allocation first
>>> and dma_alloc_from_contiguous() is used as a fallback.
>>
>> Sure, I can test these patches this weekend.
>> Where are the unit tests?
>
> Thanks a lot. I would like to know whether the performance regression
> you see disappears with these patches, as it does when CONFIG_DMA_CMA
> is disabled.
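(As a side note, the allocation ordering Akinobu describes above would look
roughly like the sketch below. This is only a simplified illustration of the
idea, not code from the actual patch series; the helper name is made up.)

#include <linux/device.h>
#include <linux/dma-contiguous.h>
#include <linux/gfp.h>

/*
 * Try the normal page allocator first; fall back to the reserved CMA
 * area only when that fails, so the common case keeps the buddy
 * allocator's policy and performance.
 */
static struct page *dma_alloc_pages_then_cma(struct device *dev,
					     size_t size, gfp_t gfp)
{
	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	unsigned int order = get_order(size);
	struct page *page;

	page = alloc_pages_node(dev_to_node(dev), gfp, order);
	if (page)
		return page;

	/* CMA allocations may sleep, so only fall back when that is allowed. */
	if (gfp & __GFP_WAIT)
		page = dma_alloc_from_contiguous(dev, count, order);

	return page;
}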
>
>>>> So that's why I raised question 3: is making the necessary compromises to
>>>> support 64MB coherent DMA allocations across all x86 iommu implementations
>>>> actually required?
>>>>
>>>> Prior to Akinobu's patches, the use of CMA by x86 iommu configurations was
>>>> designed to be limited to testing configurations, as the introductory
>>>> commit states:
>>>>
>>>> commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6
>>>> Author: Marek Szyprowski
>>>> Date:   Thu Dec 29 13:09:51 2011 +0100
>>>>
>>>>     X86: integrate CMA with DMA-mapping subsystem
>>>>
>>>>     This patch adds support for CMA to dma-mapping subsystem for x86
>>>>     architecture that uses common pci-dma/pci-nommu implementation. This
>>>>     allows to test CMA on KVM/QEMU and a lot of common x86 boxes.
>>>>
>>>>     Signed-off-by: Marek Szyprowski
>>>>     Signed-off-by: Kyungmin Park
>>>>     CC: Michal Nazarewicz
>>>>     Acked-by: Arnd Bergmann
>>>>
>>>> Which brings me to my suggestion: if support for huge coherent DMA is
>>>> required only for a special test platform, then could not this support
>>>> be specific to a new iommu configuration, namely iommu=cma, which would
>>>> get initialized much the same way that iommu=calgary is now?
>>>>
>>>> The code for such an iommu configuration would mostly duplicate
>>>> arch/x86/kernel/pci-swiotlb.c, and the CMA support would get removed from
>>>> the other x86 iommu implementations.
>>>
>>> I'm not sure I read correctly, though. Would the boot option 'cma=0' also
>>> help avoid using CMA from the IOMMU implementations?
>>
>> Maybe, but that's not an appropriate solution for distro kernels.
>>
>> Nor does this address configurations that want a really large CMA so that
>> 1GB huge pages can be allocated (not for DMA, though).
>
> Now I see the point of the iommu=cma you suggested. But what should we do
> when CONFIG_SWIOTLB is disabled, especially for x86_32?
> Should we just introduce yet another flag that tells the IOMMU code not to
> use DMA_CMA, instead of adding a new swiotlb-like iommu implementation?

If you implement a DMA API producer - aka dma_ops (which is what Peter is
thinking, I believe) - it won't matter which IOMMUs / DMA producers are
selected, right?

Or are you saying that CMA needs SWIOTLB to handle certain types of pages
as a fallback mechanism, and hence there needs to be a tight relationship?
In that case I would look at making SWIOTLB more library-like - Xen-SWIOTLB
already does that by using certain parts of the SWIOTLB code which are
exposed to the rest of the kernel.
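To make the dma_ops point concrete, such a producer could look roughly like
the sketch below, modelled on arch/x86/kernel/pci-swiotlb.c. This is only an
illustration, not a proposed patch: cma_dma_alloc, cma_dma_free, cma_dma_ops
and pci_cma_init are made-up names, and a real implementation would also fill
in the map_page/map_sg callbacks and respect the device's DMA mask.

#include <linux/dma-contiguous.h>
#include <linux/dma-mapping.h>
#include <linux/io.h>
#include <linux/mm.h>

/* Hypothetical "iommu=cma" DMA API producer - illustration only. */
static void *cma_dma_alloc(struct device *dev, size_t size,
			   dma_addr_t *dma_handle, gfp_t gfp,
			   struct dma_attrs *attrs)
{
	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	struct page *page;

	page = dma_alloc_from_contiguous(dev, count, get_order(size));
	if (!page)
		return NULL;

	/* Assumes a 1:1 phys/bus mapping and skips the DMA mask check. */
	*dma_handle = page_to_phys(page);
	return page_address(page);
}

static void cma_dma_free(struct device *dev, size_t size, void *vaddr,
			 dma_addr_t dma_handle, struct dma_attrs *attrs)
{
	dma_release_from_contiguous(dev, virt_to_page(vaddr),
				    PAGE_ALIGN(size) >> PAGE_SHIFT);
}

static struct dma_map_ops cma_dma_ops = {
	.alloc	= cma_dma_alloc,
	.free	= cma_dma_free,
	/* .map_page, .map_sg, ... could reuse the nommu/swiotlb ones */
};

/* Would be selected from pci_iommu_alloc() when "iommu=cma" is given. */
void __init pci_cma_init(void)
{
	dma_ops = &cma_dma_ops;
}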