Date: Wed, 17 May 2023 09:32:26 +0200
From: Petr Tesařík
To: Christoph Hellwig
Cc: Catalin Marinas, "Michael Kelley (LINUX)", Petr Tesarik,
    Jonathan Corbet, Greg Kroah-Hartman, "Rafael J. Wysocki",
    Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
    Daniel Vetter, Marek Szyprowski, Robin Murphy, "Paul E. McKenney",
    Borislav Petkov, Randy Dunlap, Damien Le Moal, Kim Phillips,
    "Steven Rostedt (Google)", Andy Shevchenko, Hans de Goede,
    Jason Gunthorpe, Kees Cook, Thomas Gleixner,
    "open list:DOCUMENTATION", open list, "open list:DRM DRIVERS",
    "open list:DMA MAPPING HELPERS", Roberto Sassu, Kefeng Wang
Subject: Re: [PATCH v2 RESEND 4/7] swiotlb: Dynamically allocated bounce buffers
Message-ID: <20230517093226.77ab1d2a@meshulam.tesarici.cz>
In-Reply-To: <20230517065653.GA25016@lst.de>
References: <346abecdb13b565820c414ecf3267275577dbbf3.1683623618.git.petr.tesarik.ext@huawei.com>
    <20230516061309.GA7219@lst.de>
    <20230516083942.0303b5fb@meshulam.tesarici.cz>
    <20230517083510.0cd7fa1a@meshulam.tesarici.cz>
    <20230517065653.GA25016@lst.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Christoph,

On Wed, 17 May 2023 08:56:53 +0200
Christoph Hellwig wrote:

> Just thinking out loud:
>
>  - what if we always way overallocate the swiotlb buffer
>  - and then mark the second half / two thirds / <... of the thin air>
>    slots as used, and make that region available
>    through a special CMA mechanism as ZONE_MOVABLE (but not allowing
>    other CMA allocations to dip into it).

This approach has also been considered internally at Huawei, and it
looked like a viable option, just more complex. We decided to send the
simple approach first to get some feedback and find out who else might
be interested in the dynamic sizing of swiotlb (if anyone).

> This allows us to have a single slot management for the entire
> area, but allow reclaiming from it.  We'd probably also need to make
> this CMA variant irq safe.

Let me recap my internal analysis.

On the pro side:

- no performance penalty for devices that do not use swiotlb
- all alignment and boundary constraints can be met
- efficient use of memory for buffers smaller than 1 page

On the con side:

- ZONE_MOVABLE cannot be used for most kernel allocations
- competition with CMA over precious physical address space
  (How much should be reserved for CMA and how much for SWIOTLB?)

To quote from the memory hotplug documentation:

  Usually, MOVABLE:KERNEL ratios of up to 3:1 or even 4:1 are fine. [...]
  Actual safe zone ratios depend on the workload. Extreme cases, like
  excessive long-term pinning of pages, might not be able to deal with
  ZONE_MOVABLE at all.

This should be no big issue on bare metal (where the motivation is
addressing limitations), but the size of SWIOTLB in CoCo VMs probably
needs some consideration.

> This could still be combined with more aggressive use of per-device
> swiotlb area, which is probably a good idea based on some hints.
> E.g. device could hint an amount of inflight DMA to the DMA layer,
> and if there are addressing limitations and the amount is large enough
> that could cause the allocation of a per-device swiotlb area.

I would not rely on device hints, because the amount of in-flight DMA
probably depends on the workload rather than on the type of device. I'd
rather implement some logic based on the actual runtime usage pattern.
I have some ideas already.

Petr T
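
To put rough numbers on the MOVABLE:KERNEL ratio concern quoted above,
here is a minimal sketch. It is purely illustrative arithmetic, not
kernel code and not part of the patch series; the function names and
the 4 GiB VM size are assumptions made up for the example. It computes
how much memory could sit in ZONE_MOVABLE before a given ratio is
exceeded:

```python
from fractions import Fraction


def movable_to_kernel_ratio(total_mib: int, movable_mib: int) -> Fraction:
    """Return the MOVABLE:KERNEL ratio for a given split of memory."""
    kernel_mib = total_mib - movable_mib
    if movable_mib < 0 or kernel_mib <= 0:
        raise ValueError("movable size must leave some kernel memory")
    return Fraction(movable_mib, kernel_mib)


def max_movable_mib(total_mib: int, max_ratio: Fraction) -> int:
    """Largest movable size (MiB) whose ratio stays at or below max_ratio."""
    # movable / (total - movable) <= r  =>  movable <= total * r / (1 + r)
    return int(total_mib * max_ratio / (1 + max_ratio))


if __name__ == "__main__":
    total = 4096  # hypothetical 4 GiB VM, chosen only for illustration
    for ratio in (Fraction(3), Fraction(4)):  # the 3:1 and 4:1 ratios
        print(f"{ratio}:1 allows up to {max_movable_mib(total, ratio)} MiB movable")
```

Any over-allocated swiotlb region exposed as ZONE_MOVABLE would have to
fit inside that budget together with everything else that is movable,
which is why a small CoCo VM has less headroom than a large bare-metal
machine.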