Date: Tue, 11 Apr 2023 05:51:31 +0200
From: Christoph Hellwig
To: Petr Tesařík
Cc: Christoph Hellwig, Petr Tesarik, Jonathan Corbet, Marek Szyprowski,
    Robin Murphy, Borislav Petkov, "Paul E. McKenney", Andrew Morton,
    Randy Dunlap, Damien Le Moal, Kim Phillips,
    "Steven Rostedt (Google)", "open list:DOCUMENTATION", open list,
    "open list:DMA MAPPING HELPERS", Roberto Sassu, Alexander Graf
Subject: Re: [RFC v1 3/4] swiotlb: Allow dynamic allocation of bounce buffers
Message-ID: <20230411035131.GA15795@lst.de>
In-Reply-To: <20230407124627.74528415@meshulam.tesarici.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 07, 2023 at 12:46:27PM +0200, Petr Tesařík wrote:
> > b) find a way to migrate a buffer into other memory, similar to
> >    how page migration works for page cache
>
> Let me express the idea in my own words to make sure I get it right.
> When a DMA buffer is imported, but before it is ultimately pinned in
> memory, the importing device driver checks whether the buffer meets its
> DMA constraints. If not, it calls a function provided by the exporting
> device driver to migrate the buffer.

Yes.

> This makes sense, but:
>
> 1) The operation must be implemented in the exporting driver; this
>    will take some time.
>
> 2) In theory, there may be no overlap between the exporting device
>    and the importing device. OTOH I'm not aware of any real-world
>    example, so we can probably return a suitable error code, and
>    that's it.

Indeed.
And if there is no overlap, which as you said is very unlikely but
in theory possible, we could still keep migrating back and forth.

One important thing that we should do is to consolidate more of the
dma-buf implementation code.  Right now the implementations just seem
to be a wild mess of copied-and-pasted boilerplate code, unfortunately.

> Anyway, I have already written in another reply that my original use
> case is moot, because a more recent distribution can do the job without
> using dma-buf, so it has been fixed in user space, be it in GNOME,
> pipewire, or Mesa (I don't really have to know).
>
> At this point I would go with the assumption that large buffers
> allocated by media subsystems will not hit swiotlb. Consequently, I
> don't plan to spend more time on this branch of the story.

Sounds fine to me, and thanks for the effort so far.

> > > BTW my testing also suggests that the streaming DMA API is quite
> > > inefficient, because UAS performance _improved_ with swiotlb=force.
> > > Sure, this should probably be addressed in the UAS and/or xHCI driver,
> > > but what I mean is that moving away from swiotlb may even cause
> > > performance regressions, which is counter-intuitive. At least I would
> > > _not_ have expected it.
> >
> > That is indeed very odd. Are you running with a very slow iommu
> > driver there? Or what is the actual use case there in general?
>
> This was on a Raspberry Pi 4, which does not have any IOMMU. IOW it
> looks like copying data around can be faster than sending it straight
> to the device. When I have some more time, I must investigate what is
> really happening there, because it does not make any sense to me.

If you're not using an IOMMU, that doesn't actually make any sense to
me.  swiotlb calls into exactly the same routines as dma-direct does
for the DMA setup on each I/O, just after copying the data.  So if you
do have some spare cycles to investigate what is going on here, I'd be
really curious about the results.
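
To illustrate the last point, here is a minimal userspace sketch of why
the bounce path should never be cheaper than the direct path when there
is no IOMMU.  It is only a model: every name in it (model_dev,
model_dma_setup, model_map_direct, model_map_bounce) is hypothetical and
none of them is a kernel API.  The identity address translation and the
single static bounce buffer are simplifying assumptions as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_BOUNCE_SIZE 4096

struct model_dev {
	uint64_t dma_mask;          /* highest bus address the device can reach */
	unsigned long setups;       /* how often the common setup step ran */
	unsigned long bytes_copied; /* bytes copied into the bounce buffer */
};

static unsigned char model_bounce[MODEL_BOUNCE_SIZE];

/*
 * Common per-I/O setup shared by both paths: check the range against
 * the device mask and translate it (identity here, i.e. no IOMMU).
 */
static int model_dma_setup(struct model_dev *dev, uint64_t phys,
			   size_t len, uint64_t *dma_addr)
{
	dev->setups++;
	if (len == 0 || phys > dev->dma_mask ||
	    len - 1 > dev->dma_mask - phys)
		return -1;
	*dma_addr = phys;
	return 0;
}

/* Direct path: hand the original buffer to the device. */
static int model_map_direct(struct model_dev *dev, uint64_t buf_phys,
			    size_t len, uint64_t *dma_addr)
{
	return model_dma_setup(dev, buf_phys, len, dma_addr);
}

/* Bounce path: copy first, then run exactly the same setup on the copy. */
static int model_map_bounce(struct model_dev *dev, const void *buf,
			    size_t len, uint64_t bounce_phys,
			    uint64_t *dma_addr)
{
	assert(len <= MODEL_BOUNCE_SIZE);
	memcpy(model_bounce, buf, len);
	dev->bytes_copied += len;
	return model_dma_setup(dev, bounce_phys, len, dma_addr);
}
```

In this model, one direct mapping and one bounce mapping each run the
common setup once, and only the bounce mapping adds a memcpy on top.
If the real code behaves like that, a speedup from swiotlb=force would
have to come from somewhere else, for example from where the buffer
ends up in memory.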