Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754159AbaJCQjY (ORCPT ); Fri, 3 Oct 2014 12:39:24 -0400
Received: from mailout32.mail01.mtsvc.net ([216.70.64.70]:56068 "EHLO
        n23.mail01.mtsvc.net" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
        with ESMTP id S1752711AbaJCQjW (ORCPT );
        Fri, 3 Oct 2014 12:39:22 -0400
Message-ID: <542ED130.2090501@hurleysoftware.com>
Date: Fri, 03 Oct 2014 12:39:12 -0400
From: Peter Hurley
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.2
MIME-Version: 1.0
To: Akinobu Mita
CC: Konrad Rzeszutek Wilk, Thomas Gleixner, LKML, Andrew Morton,
        Marek Szyprowski, David Woodhouse, Don Dutile, Ingo Molnar,
        "H. Peter Anvin", Andi Kleen, x86@kernel.org,
        iommu@lists.linux-foundation.org, Greg KH
Subject: Re: [PATCH v3 0/5] enhance DMA CMA on x86
References: <1397567329-3771-1-git-send-email-akinobu.mita@gmail.com>
        <5426CA0A.7000806@hurleysoftware.com>
        <54294C0B.1060705@hurleysoftware.com>
        <542ABF77.1020402@hurleysoftware.com>
        <542B5DC2.8020806@hurleysoftware.com>
        <20141002164121.GF1715@laptop.dumpdata.com>
        <542DCB9C.4020703@hurleysoftware.com>
        <542EB242.4090102@hurleysoftware.com>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Authenticated-User: 990527 peter@hurleysoftware.com
X-MT-ID: 8FA290C2A27252AACF65DBC4A42F3CE3735FB2A4
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/03/2014 12:06 PM, Akinobu Mita wrote:
> 2014-10-03 23:27 GMT+09:00 Peter Hurley:
>> On 10/02/2014 07:08 PM, Akinobu Mita wrote:
>>> 2014-10-03 7:03 GMT+09:00 Peter Hurley:
>>>> On 10/02/2014 12:41 PM, Konrad Rzeszutek Wilk wrote:
>>>>> On Tue, Sep 30, 2014 at 09:49:54PM -0400, Peter Hurley wrote:
>>>>>> On 09/30/2014 07:45 PM, Thomas Gleixner wrote:
>>>
>>>>>> Which is different than if the plan is to ship production units for x86;
>>>>>> then a general purpose solution will be required.
>>>>>>
>>>>>> As to the good design of a general purpose solution for allocating and
>>>>>> mapping huge order pages, you are certainly more qualified to help Akinobu
>>>>>> than I am.
>>>>
>>>> What Akinobu's patches intend to support is:
>>>>
>>>> phys_addr = dma_alloc_coherent(dev, 64 * 1024 * 1024, &bus_addr, GFP_KERNEL);
>>>>
>>>> which raises three issues:
>>>>
>>>> 1. Where do coherent blocks of this size come from?
>>>> 2. How to prevent fragmentation of these reserved blocks over time by
>>>>    existing DMA users?
>>>> 3. Is this support generically required across all iommu implementations on x86?
>>>>
>>>> Questions 1 and 2 are non-trivial, in the general case, otherwise the page
>>>> allocator would already do this. Simply dropping in the contiguous memory
>>>> allocator doesn't work because CMA does not have the same policy and performance
>>>> as the page allocator, and is already causing performance regressions even
>>>> in the absence of huge page allocations.
>>>
>>> Could you take a look at the patches I sent? Can they fix these issues?
>>> https://lkml.org/lkml/2014/9/28/110
>>>
>>> With these patches, normal alloc_pages() is used for allocation first
>>> and dma_alloc_from_contiguous() is used as a fallback.
>>
>> Sure, I can test these patches this weekend.
>> Where are the unit tests?
>
> Thanks a lot. I would like to know whether the performance regression
> you see will disappear or not with these patches as if CONFIG_DMA_CMA is
> disabled.

I think something may have gotten lost in translation.

My "test" consists of doing my daily work (email, emacs, kernel builds,
web breaks, etc).

I don't have a testsuite that validates a page allocator or records any
performance metrics (for TTM allocations under load, as an example).

Without a unit test and performance metrics, my "test" is not really
positive affirmation of a correct implementation.
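
For illustration only, the allocation order described in the quoted patch
summary (alloc_pages() first, dma_alloc_from_contiguous() as a fallback)
can be modelled in userspace C. This is a sketch, not code from the
patches: every model_* name, the struct, and the enum are hypothetical,
and the model only tracks sizes, not real memory. A 4 MiB ceiling for the
page allocator is assumed by the caller (the usual x86 limit with 4 KiB
pages and MAX_ORDER 11).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum alloc_source { SRC_NONE, SRC_PAGE_ALLOCATOR, SRC_CMA };

struct model {
    size_t buddy_max_bytes; /* largest block the modelled page allocator can supply */
    size_t cma_free_bytes;  /* bytes still free in the modelled CMA reserved area */
};

/* Stands in for alloc_pages(): succeeds only up to the buddy limit. */
static bool model_alloc_pages(const struct model *m, size_t size)
{
    return size <= m->buddy_max_bytes;
}

/* Stands in for dma_alloc_from_contiguous(): consumes the reserved area. */
static bool model_alloc_from_cma(struct model *m, size_t size)
{
    if (size > m->cma_free_bytes)
        return false;
    m->cma_free_bytes -= size;
    return true;
}

/* Page allocator first, CMA only as a fallback; reports which one served. */
static enum alloc_source model_alloc_coherent(struct model *m, size_t size)
{
    if (size == 0)
        return SRC_NONE;
    if (model_alloc_pages(m, size))
        return SRC_PAGE_ALLOCATOR;
    if (model_alloc_from_cma(m, size))
        return SRC_CMA;
    return SRC_NONE;
}
```

In this model a 4 KiB request never touches the reserved area, while a
64 MiB request can only be served from it. That is the property a real
unit test would need to check, together with timing under load, which
this sketch does not attempt to measure.
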
>>>> So that's why I raised question 3; is making the necessary compromises to support
>>>> 64MB coherent DMA allocations across all x86 iommu implementations actually
>>>> required?
>>>>
>>>> Prior to Akinobu's patches, the use of CMA by x86 iommu configurations was
>>>> designed to be limited to testing configurations, as the introductory
>>>> commit states:
>>>>
>>>> commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6
>>>> Author: Marek Szyprowski
>>>> Date: Thu Dec 29 13:09:51 2011 +0100
>>>>
>>>>     X86: integrate CMA with DMA-mapping subsystem
>>>>
>>>>     This patch adds support for CMA to dma-mapping subsystem for x86
>>>>     architecture that uses common pci-dma/pci-nommu implementation. This
>>>>     allows to test CMA on KVM/QEMU and a lot of common x86 boxes.
>>>>
>>>>     Signed-off-by: Marek Szyprowski
>>>>     Signed-off-by: Kyungmin Park
>>>>     CC: Michal Nazarewicz
>>>>     Acked-by: Arnd Bergmann
>>>>
>>>>
>>>> Which brings me to my suggestion: if support for huge coherent DMA is
>>>> required only for a special test platform, then could not this support
>>>> be specific to a new iommu configuration, namely iommu=cma, which would
>>>> get initialized much the same way that iommu=calgary is now.
>>>>
>>>> The code for such a iommu configuration would mostly duplicate
>>>> arch/x86/kernel/pci-swiotlb.c and the CMA support would get removed from
>>>> the other x86 iommu implementations.
>>>
>>> I'm not sure I read correctly, though. Can boot option 'cma=0' also
>>> help avoiding CMA from IOMMU implementation?
>>
>> Maybe, but that's not an appropriate solution for distro kernels.
>>
>> Nor does this address configurations that want a really large CMA so
>> 1GB huge pages can be allocated (not for DMA though).
>
> Now I see the point of iommu=cma you suggested. But what should we do
> when CONFIG_SWIOTLB is disabled, especially for x86_32?
> Should we just introduce yet another flag to tell not using DMA_CMA
> instead of adding new swiotlb-like iommu implementation?

Again, since I don't know what you're using this for and there are no
existing mainline users, I can't really design this for you.

I'm just trying to do my best to come up with alternative solutions
that limit the impact to existing x86 configurations, while still
achieving your goals (without really knowing what those design
constraints are).

Regards,
Peter Hurley