Date: Wed, 17 May 2023 12:08:55 +0100
From: Catalin Marinas
To: Petr Tesařík
Cc: Christoph Hellwig, "Michael Kelley (LINUX)", Petr Tesarik,
    Jonathan Corbet, Greg Kroah-Hartman,
Wysocki" , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Marek Szyprowski , Robin Murphy , "Paul E. McKenney" , Borislav Petkov , Randy Dunlap , Damien Le Moal , Kim Phillips , "Steven Rostedt (Google)" , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Thomas Gleixner , "open list:DOCUMENTATION" , open list , "open list:DRM DRIVERS" , "open list:DMA MAPPING HELPERS" , Roberto Sassu , Kefeng Wang Subject: Re: [PATCH v2 RESEND 4/7] swiotlb: Dynamically allocated bounce buffers Message-ID: References: <346abecdb13b565820c414ecf3267275577dbbf3.1683623618.git.petr.tesarik.ext@huawei.com> <20230516061309.GA7219@lst.de> <20230516083942.0303b5fb@meshulam.tesarici.cz> <20230517083510.0cd7fa1a@meshulam.tesarici.cz> <20230517065653.GA25016@lst.de> <20230517115821.4bf63bf5@meshulam.tesarici.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20230517115821.4bf63bf5@meshulam.tesarici.cz> X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 17, 2023 at 11:58:21AM +0200, Petr Tesařík wrote: > On Wed, 17 May 2023 10:41:19 +0100 > Catalin Marinas wrote: > > On Wed, May 17, 2023 at 08:56:53AM +0200, Christoph Hellwig wrote: > > > Just thinking out loud: > > > > > > - what if we always way overallocate the swiotlb buffer > > > - and then mark the second half / two thirds / > > of the thin air> slots as used, and make that region available > > > through a special CMA mechanism as ZONE_MOVABLE (but not allowing > > > other CMA allocations to dip into it). > > > > > > This allows us to have a single slot management for the entire > > > area, but allow reclaiming from it. We'd probably also need to make > > > this CMA variant irq safe. > > > > I think this could work. It doesn't need to be ZONE_MOVABLE (and we > > actually need this buffer in ZONE_DMA). But we can introduce a new > > migrate type, MIGRATE_SWIOTLB, and movable page allocations can use this > > range. The CMA allocations go to free_list[MIGRATE_CMA], so they won't > > overlap. > > > > One of the downsides is that migrating movable pages still needs a > > sleep-able context. > > Pages can be migrated by a separate worker thread when the number of > free slots reaches a low watermark. Indeed, you still need such worker thread. > > Another potential confusion is is_swiotlb_buffer() for pages in this > > range allocated through the normal page allocator. We may need to check > > the slots as well rather than just the buffer boundaries. > > Ah, yes, I forgot about this part; thanks for the reminder. > > Indeed, movable pages can be used for the page cache, and drivers do > DMA to/from buffers in the page cache. > > Let me recap: > > - Allocated chunks must still be tracked with this approach. > - The pool of available slots cannot be grown from interrupt context. > > So, what exactly is the advantage compared to allocating additional > swiotlb chunks from CMA? This would work as well but it depends on how many other drivers allocate from the CMA range. Maybe it's simpler to this initially (I haven't got to your other emails yet). 
> > > This could still be combined with more aggressive use of
> > > per-device swiotlb areas, which is probably a good idea based on
> > > some hints. E.g. a device could hint an amount of in-flight DMA
> > > to the DMA layer, and if there are addressing limitations and the
> > > amount is large enough, that could cause the allocation of a
> > > per-device swiotlb area.
> >
> > If we go for one large-ish per-device buffer for specific cases,
> > maybe something similar to rmem_swiotlb_setup() but which can be
> > dynamically allocated at run-time and may live alongside the
> > default swiotlb. The advantage is that it uses a similar slot
> > tracking to the default swiotlb, no need to invent another one.
> > This per-device buffer could also be allocated from the
> > MIGRATE_SWIOTLB range if we make it large enough at boot. It would
> > be seen just as a local accelerator for devices that use bouncing
> > frequently or from irq context.
>
> A per-device pool could also be used for small buffers. IIRC somebody
> was interested in that.

That was me ;) but TBH, I don't care how large the bounce buffer is,
only that it can bounce small structures. If there's some critical
path, people can change the kmalloc() allocation for those structures
to make them cacheline-aligned.
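For the small structures, the kind of change I mean -- purely
illustrative, with a made-up driver structure and helper; the point is
just to round the allocation up to a whole number of cache lines so
the object never shares a line with unrelated data:

/* Hypothetical descriptor that a driver DMAs to/from. */
struct foo_cmd {
        __le32 opcode;
        __le32 flags;
        __le64 addr;
};

/*
 * Instead of kmalloc(sizeof(*cmd), gfp), pad the size to the DMA
 * cache alignment. The returned object then has its cache line(s) to
 * itself, so cache maintenance for non-coherent DMA cannot corrupt
 * neighbouring data and no bounce buffer is needed.
 */
static struct foo_cmd *foo_alloc_cmd(gfp_t gfp)
{
        return kmalloc(ALIGN(sizeof(struct foo_cmd),
                             dma_get_cache_alignment()), gfp);
}

-- 
Catalin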