Date: Fri, 15 Sep 2023 18:09:28 +0100
From: Catalin Marinas
To: Petr Tesařík
Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
 "open list:DMA MAPPING HELPERS", open list,
 Roberto Sassu, Jonathan Corbet
Subject: Re: [PATCH] swiotlb: fix the check whether a device has used software IO TLB
References: <20230913114016.17752-1-petr@tesarici.cz>
 <20230913121403.GB4544@lst.de>
 <20230913142656.29e135d6@meshulam.tesarici.cz>
 <20230915111343.01496320@meshulam.tesarici.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
In-Reply-To: <20230915111343.01496320@meshulam.tesarici.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 15, 2023 at 11:13:43AM +0200, Petr Tesařík wrote:
> On Thu, 14 Sep 2023 19:28:01 +0100
> Catalin Marinas wrote:
> > What do the smp_wmb() barriers in swiotlb_find_slots() and
> > swiotlb_dyn_alloc() order? The latter is even more unclear as it's
> > at the end of the function and the "pairing" comment doesn't help.
>
> By the time swiotlb_find_slots() returns a valid slot index, the new
> value of dev->dma_uses_io_tlb must be visible by all CPUs in
> is_swiotlb_buffer(). The index is used to calculate the bounce buffer
> address returned to device drivers. This address may be passed to
> another CPU and used as an argument to is_swiotlb_buffer().

Ah, I remember now. So the smp_wmb() ensures that dma_uses_io_tlb is
seen by other CPUs before the slot address (presumably passed via some
other memory write). It may be worth updating the comment in the code
(I'm sure I'll forget it in a month's time). The smp_rmb() before
READ_ONCE() in this patch is needed for the same reason: it orders the
flag read after the read of the address passed to is_swiotlb_buffer().

BTW, you may want to use WRITE_ONCE() when setting dma_uses_io_tlb (it
also matches the READ_ONCE() in is_swiotlb_buffer()). Or you can use
smp_store_mb() (but check its semantics first).
> I am not sure that the smp_wmb() in swiotlb_dyn_alloc() is needed,
> but let me explain my thinking. Even if the dev->dma_uses_io_tlb flag
> is set, is_swiotlb_buffer() returns false unless the buffer address
> is found in the RCU list of swiotlb pools, which is walked without
> taking any lock. In some iterations of the patch series, there was a
> direct call to swiotlb_dyn_alloc(), and a smp_wmb() was necessary to
> make sure that the list of swiotlb pools cannot be observed as empty
> by another CPU. You have already explained to me that taking a spin
> lock in add_mem_pool() is not sufficient, because it does not
> invalidate a value that might have been speculatively read by another
> CPU before entering the critical section. OTOH swiotlb_dyn_alloc() is
> always called through schedule_work() in the current version. If that
> implicitly provides the necessary barriers, this smp_wmb() can be
> removed.

Normally you'd need a barrier between the pool update and the flag
update:

	list_add_rcu(&pool->node, &mem->pools);
	smp_wmb();
	WRITE_ONCE(dev->dma_uses_io_tlb, true);

The lock around the mem->pools update doesn't help since it only has
release semantics on the unlock path (which doesn't prevent the
dma_uses_io_tlb write from being reordered before it). On the reader
side, you need to make sure that if dma_uses_io_tlb is true, the
mem->pools access was not speculated and read earlier as empty, so
another dma_rmb():

	if (READ_ONCE(dev->dma_uses_io_tlb)) {
		dma_rmb();
		swiotlb_find_pool(...);
	}

That's missing in the code currently (and rcu_read_lock() only has
acquire semantics). However, what confuses me is that mem->pools is
populated asynchronously via schedule_work(). I'm not sure how the
barriers help, since the work can be scheduled on any CPU at any point
afterwards, potentially after dma_uses_io_tlb has been updated.
On the transient dev->dma_io_tlb_pools list updated in
swiotlb_find_slots(), you'd need the barriers as I mentioned above
(unless I misunderstood how this pool searching works; I'm not entirely
sure why swiotlb_find_pool() searches both lists).

> FTR list_add_rcu() alone is not sufficient. It adds the new element
> using rcu_assign_pointer(), which modifies the forward link with
> smp_store_release(). However, list_for_each_entry_rcu() reads the
> head with list_entry_rcu(), which is a simple READ_ONCE(); there are
> no ACQUIRE semantics.
>
> Actually, I wonder when a newly added RCU list element is guaranteed
> to be visible by all CPUs. Existing documentation deals mainly with
> element removal, explaining that both the old state and the new state
> of an RCU list are valid after addition. Is it assumed that the old
> state of an RCU list after addition is valid indefinitely?

Architectures don't define precisely when a write becomes visible to
other CPUs (only in vague terms such as "eventually"). That's why we
can't talk about barriers in relation to a single write: a barrier
doesn't make a write visible, it only makes it visible _before_ a
subsequent write becomes visible (whenever that may be).

Anyway, as above, I think the current code is still missing some
barriers between the dma_uses_io_tlb update or read and the actual
pool writes. But I haven't figured out how this works with the
asynchronous swiotlb_dyn_alloc().

-- 
Catalin