Date: Fri, 7 Jul 2023 10:29:00 +0100
From: Greg Kroah-Hartman
To: "Michael Kelley (LINUX)"
Cc: Petr Tesarik, Stefano Stabellini, Thomas Bogendoerfer, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, Dave Hansen,
    "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", "H. Peter Anvin",
    "Rafael J. Wysocki", Juergen Gross, Oleksandr Tyshchenko,
    Christoph Hellwig, Marek Szyprowski, Robin Murphy, Andy Shevchenko,
    Hans de Goede, Jason Gunthorpe, Kees Cook, Saravana Kannan,
    "moderated list:XEN HYPERVISOR ARM", "moderated list:ARM PORT",
    open list, "open list:MIPS", "open list:XEN SWIOTLB SUBSYSTEM",
    Roberto Sassu, Kefeng Wang, "petr@tesarici.cz"
Subject: Re: [PATCH v3 4/7] swiotlb: if swiotlb is full, fall back to a transient memory pool
Message-ID: <2023070706-humbling-starfish-c68f@gregkh>
References: <34c2a1ba721a7bc496128aac5e20724e4077f1ab.1687859323.git.petr.tesarik.ext@huawei.com>
    <2023070626-boxcar-bubbly-471d@gregkh>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 06, 2023 at 02:22:50PM +0000, Michael Kelley (LINUX) wrote:
> From: Greg Kroah-Hartman Sent: Thursday, July 6, 2023 1:07 AM
> >
> > On Thu, Jul 06, 2023 at 03:50:55AM +0000, Michael Kelley (LINUX) wrote:
> > > From: Petr Tesarik Sent: Tuesday, June 27, 2023 2:54 AM
> > > >
> > > > Try to allocate a transient memory pool if no suitable slots can be found,
> > > > except when allocating from a restricted pool.
> > > > The transient pool is just
> > > > big enough for this one bounce buffer. It is inserted into a per-device
> > > > list of transient memory pools, and it is freed again when the bounce
> > > > buffer is unmapped.
> > > >
> > > > Transient memory pools are kept in an RCU list. A memory barrier is
> > > > required after adding a new entry, because any address within a transient
> > > > buffer must be immediately recognized as belonging to the SWIOTLB, even if
> > > > it is passed to another CPU.
> > > >
> > > > Deletion does not require any synchronization beyond RCU ordering
> > > > guarantees. After a buffer is unmapped, its physical addresses may no
> > > > longer be passed to the DMA API, so the memory range of the corresponding
> > > > stale entry in the RCU list never matches. If the memory range gets
> > > > allocated again, then it happens only after an RCU quiescent state.
> > > >
> > > > Since bounce buffers can now be allocated from different pools, add a
> > > > parameter to swiotlb_alloc_pool() to let the caller know which memory pool
> > > > is used. Add swiotlb_find_pool() to find the memory pool corresponding to
> > > > an address. This function is now also used by is_swiotlb_buffer(), because
> > > > a simple boundary check is no longer sufficient.
> > > >
> > > > The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(),
> > > > simplified and enhanced to use coherent memory pools if needed.
> > > >
> > > > Note that this is not the most efficient way to provide a bounce buffer,
> > > > but when a DMA buffer can't be mapped, something may (and will) actually
> > > > break. At that point it is better to make an allocation, even if it may be
> > > > an expensive operation.
> > >
> > > I continue to think about swiotlb memory management from the standpoint
> > > of CoCo VMs that may be quite large with high network and storage loads.
> > > These VMs are often running mission-critical workloads that can't tolerate
> > > a bounce buffer allocation failure. To prevent such failures, the swiotlb
> > > memory size must be overly large, which wastes memory.
> >
> > If "mission critical workloads" are in a vm that allows overcommit and
> > no control over other vms in that same system, then you have worse
> > problems, sorry.
> >
> > Just don't do that.
> >
>
> No, the cases I'm concerned about don't involve memory overcommit.
>
> CoCo VMs must use swiotlb bounce buffers to do DMA I/O. Current swiotlb
> code in the Linux guest allocates a configurable, but fixed, amount of guest
> memory at boot time for this purpose. But it's hard to know how much
> swiotlb bounce buffer memory will be needed to handle peak I/O loads.
> This patch set does dynamic allocation of swiotlb bounce buffer memory,
> which can help avoid needing to configure an overly large fixed size at boot.

But, as you point out, memory allocation can fail at runtime, so how can
you "guarantee" that this will work properly anymore if you are going to
make it dynamic?

confused,

greg k-h