Date: Tue, 28 Mar 2023 06:07:24 +0200
From: Christoph Hellwig
To: Petr Tesarik
Cc: Jonathan Corbet, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	Borislav Petkov, "Paul E. McKenney", Andrew Morton, Randy Dunlap,
	Damien Le Moal, Kim Phillips, "Steven Rostedt (Google)",
	"open list:DOCUMENTATION", open list,
	"open list:DMA MAPPING HELPERS", Roberto Sassu, petr@tesarici.cz,
	Alexander Graf
Subject: Re: [RFC v1 3/4] swiotlb: Allow dynamic allocation of bounce buffers
Message-ID: <20230328040724.GB25506@lst.de>
In-Reply-To: <0334a54332ab75312c9de825548b616439dcc9f5.1679309810.git.petr.tesarik.ext@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

[adding Alex as he has been interested in this in the past]

On Mon, Mar 20, 2023 at 01:28:15PM +0100, Petr Tesarik wrote:
> Second, on the Raspberry Pi 4, swiotlb is used by dma-buf for pages
> moved from the rendering GPU (v3d driver), which can access all
> memory, to the display output (vc4 driver), which is connected to a
> bus with an address limit of 1 GiB and no IOMMU. These buffers can
> be large (several megabytes) and cannot be handled by SWIOTLB,
> because they exceed maximum segment size of 256 KiB. Such mapping
> failures can be easily reproduced on a Raspberry Pi4: Starting
> GNOME remote desktop results in a flood of kernel messages like
> these:

Shouldn't we make sure dma-buf allocates the buffers for the most
restricted devices, and more importantly does something like a dma
coherent allocation instead of a dynamic mapping of random memory?

While a larger swiotlb works around this, I don't think this fixes the
root cause.

> 1. The value is limited to ULONG_MAX, which is too little both for
>    physical addresses (e.g. x86 PAE or 32-bit ARM LPAE) and DMA
>    addresses (e.g. Xen guests on 32-bit ARM).
>
> 2. Since buffers are currently allocated with page granularity, a
>    PFN can be used instead. However, some values are reserved by
>    the maple tree implementation. Liam suggests to use
>    xa_mk_value() in that case, but that reduces the usable range by
>    half. Luckily, 31 bits are still enough to hold a PFN on all
>    32-bit platforms.
>
> 3. Software IO TLB is used from interrupt context. The maple tree
>    implementation is not IRQ-safe (MT_FLAGS_LOCK_IRQ does nothing
>    AFAICS). Instead, I use an external lock, spin_lock_irqsave() and
>    spin_unlock_irqrestore().
>
> Note that bounce buffers are never allocated dynamically if the
> software IO TLB is in fact a DMA restricted pool, which is intended
> to stay in its designated location in physical memory.

I'm a little worried about all that because it causes quite a bit of
overhead even for callers that don't end up going into the dynamic
range or do not use swiotlb at all. I don't really have a good answer
here except for the usual "avoid bounce buffering whenever you can",
which might not always be easy to do.

> +	gfp = (attrs & DMA_ATTR_MAY_SLEEP) ? GFP_KERNEL : GFP_NOWAIT;
> +	slot = kmalloc(sizeof(*slot), gfp | __GFP_NOWARN);
> +	if (!slot)
> +		goto err;
> +
> +	slot->orig_addr = orig_addr;
> +	slot->alloc_size = alloc_size;
> +	slot->page = dma_direct_alloc_pages(dev, PAGE_ALIGN(alloc_size),
> +					    &slot->dma_addr, dir,
> +					    gfp | __GFP_NOWARN);
> +	if (!slot->page)
> +		goto err_free_slot;

Without GFP_NOIO allocations this will deadlock eventually.
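
One possible reading of the last remark, as a userspace sketch rather
than kernel code: GFP_KERNEL allows memory reclaim to start I/O, and
that I/O may itself need a bounce buffer, so the sleepable path would
presumably use GFP_NOIO instead. The X_* bits and the helper names
below are hypothetical stand-ins that only mirror how the real flags
are composed; they are not the kernel's definitions or values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in bits; the real flags are defined by the kernel. */
#define X_DIRECT_RECLAIM 0x1u	/* caller may enter direct reclaim (may sleep) */
#define X_KSWAPD_RECLAIM 0x2u	/* allocation may wake kswapd */
#define X_IO             0x4u	/* reclaim may start physical I/O */
#define X_FS             0x8u	/* reclaim may call into filesystems */

/* Composition mirrors GFP_NOWAIT, GFP_NOIO and GFP_KERNEL. */
#define X_GFP_NOWAIT (X_KSWAPD_RECLAIM)
#define X_GFP_NOIO   (X_DIRECT_RECLAIM | X_KSWAPD_RECLAIM)
#define X_GFP_KERNEL (X_GFP_NOIO | X_IO | X_FS)

/* The property the review comment relies on: NOIO never starts I/O. */
static_assert((X_GFP_NOIO & X_IO) == 0, "NOIO must not permit I/O");

/* Flag selection as in the quoted hunk. */
static inline unsigned int gfp_as_posted(bool may_sleep)
{
	return may_sleep ? X_GFP_KERNEL : X_GFP_NOWAIT;
}

/* Assumed alternative: still sleepable, but reclaim cannot issue I/O. */
static inline unsigned int gfp_without_io(bool may_sleep)
{
	return may_sleep ? X_GFP_NOIO : X_GFP_NOWAIT;
}

/* True if an allocation with these flags may recurse into the I/O path. */
static inline bool may_start_io(unsigned int gfp)
{
	return (gfp & X_IO) != 0;
}
```

With this model, gfp_as_posted(true) may start I/O while
gfp_without_io(true) may not, yet both still allow direct reclaim.
Whether the actual fix should take this form or scope the allocation
differently is left open by the review.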