Date: Fri, 7 Apr 2023 12:15:55 +0200
From: Petr Tesařík
To: Christoph Hellwig
Cc: Petr Tesarik, Jonathan Corbet, Marek Szyprowski, Robin Murphy, Borislav Petkov, "Paul E. McKenney", Andrew Morton, Randy Dunlap, Damien Le Moal, Kim Phillips, "Steven Rostedt (Google)", "open list:DOCUMENTATION", open list, "open list:DMA MAPPING HELPERS", Roberto Sassu, Alexander Graf
Subject: Re: [RFC v1 3/4] swiotlb: Allow dynamic allocation of bounce buffers
Message-ID: <20230407121555.4290a011@meshulam.tesarici.cz>
In-Reply-To: <20230407055704.GD6803@lst.de>

On Fri, 7 Apr 2023 07:57:04 +0200
Christoph Hellwig wrote:

> On Tue, Mar 28, 2023 at 02:43:03PM +0200, Petr Tesarik wrote:
> > Oh, wait! I can do at least something for devices which do not use
> > swiotlb at all.
> >
> > If a device does not use bounce buffers, it cannot pass an address
> > that belongs to the swiotlb. Consequently, the potentially
> > expensive check can be skipped. This avoids the dynamic lookup
> > penalty for devices which do not need the swiotlb.
> >
> > Note that the counter always remains zero if dma_io_tlb_mem is
> > NULL, so the NULL check is not required.
>
> Hmm, that's yet another atomic for each map/unmap, and bloats
> struct device.

I'm not sure how bad it is to bloat struct device. It is already quite
large, e.g. in my x86 build it is 768 bytes (exact size depends on
config options), and nobody seems to be concerned...

Regarding the atomic operations, I am currently testing a slightly
different approach, which merely sets a flag if there are any
dynamically allocated bounce buffers. The atomic check changes to
smp_load_acquire(), and the atomic inc/dec to smp_store_release() only
if the flag changes. That said, if I hammer this path with heavy
parallel I/O, I can still see some performance cost for devices that
use swiotlb, but at least devices that do not need such bounce buffers
seem to be unaffected then.

> (Btw, in case anyone is interested, we really need to get started
> on moving the dma fields out of struct device into a sub-struct
> only allocated for DMA capable busses)

I like this idea. In fact, my WIP topic branch now moves the swiotlb
fields into a separate struct, but I can surely go further and move all
DMA-related fields. I doubt it is worth allocating it separately,
though. We are talking about replacing some 100 bytes (in the worst
case) with a pointer to a dynamically allocated struct, but the dynamic
allocator adds some overhead. I believe it pays off only if the vast
majority of struct device instances do not need these DMA fields, but
is that really the case?

Petr T
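
For illustration, the flag-based approach described above could be
sketched in userspace C11 roughly as follows. This is not the code from
the WIP branch: all demo_* names are hypothetical, the C11
acquire/release atomics merely stand in for the kernel's
smp_load_acquire() and smp_store_release(), and the table of dynamic
buffers is assumed to be protected by a lock that is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_DYN 4

/* Hypothetical stand-in for the per-device swiotlb state. */
struct demo_dev {
	atomic_bool has_dyn;              /* any dynamic bounce buffers? */
	const void *dyn[DEMO_MAX_DYN];    /* assumed protected by a lock */
	size_t nr_dyn;
	unsigned long slow_lookups;       /* counts expensive lookups */
};

static void demo_init(struct demo_dev *dev)
{
	atomic_init(&dev->has_dyn, false);
	dev->nr_dyn = 0;
	dev->slow_lookups = 0;
}

/* Register a dynamic buffer; publish the flag only on 0 -> 1. */
static bool demo_add_dyn(struct demo_dev *dev, const void *addr)
{
	if (dev->nr_dyn == DEMO_MAX_DYN)
		return false;
	dev->dyn[dev->nr_dyn++] = addr;
	if (dev->nr_dyn == 1)
		atomic_store_explicit(&dev->has_dyn, true,
				      memory_order_release);
	return true;
}

/* Unregister a dynamic buffer; clear the flag only on 1 -> 0. */
static void demo_del_dyn(struct demo_dev *dev, const void *addr)
{
	for (size_t i = 0; i < dev->nr_dyn; i++) {
		if (dev->dyn[i] != addr)
			continue;
		dev->dyn[i] = dev->dyn[--dev->nr_dyn];
		if (dev->nr_dyn == 0)
			atomic_store_explicit(&dev->has_dyn, false,
					      memory_order_release);
		return;
	}
	assert(!"address was never registered");
}

/* Fast path: devices without dynamic buffers skip the lookup. */
static bool demo_is_dyn_buffer(struct demo_dev *dev, const void *addr)
{
	if (!atomic_load_explicit(&dev->has_dyn, memory_order_acquire))
		return false;

	dev->slow_lookups++;    /* placeholder for the expensive lookup */
	for (size_t i = 0; i < dev->nr_dyn; i++)
		if (dev->dyn[i] == addr)
			return true;
	return false;
}
```

In this sketch, a device that never calls demo_add_dyn() pays only one
acquire load per check, while the release store happens only when the
table goes from empty to non-empty or back.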