Date: Tue, 16 May 2023 12:22:02 +0100
From: Catalin Marinas
To: Petr Tesařík
Wysocki" , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Christoph Hellwig , Marek Szyprowski , Robin Murphy , "Paul E. McKenney" , Borislav Petkov , Randy Dunlap , Damien Le Moal , Kim Phillips , "Steven Rostedt (Google)" , Andy Shevchenko , Hans de Goede , Jason Gunthorpe , Kees Cook , Thomas Gleixner , "open list:DOCUMENTATION" , open list , "open list:DRM DRIVERS" , "open list:DMA MAPPING HELPERS" , Roberto Sassu , Kefeng Wang Subject: Re: [PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers Message-ID: References: <69f9e058bb1ad95905a62a4fc8461b064872af97.1683623618.git.petr.tesarik.ext@huawei.com> <20230515104847.6dfdf31b@meshulam.tesarici.cz> <20230515120054.0115a4eb@meshulam.tesarici.cz> <20230516095512.3c99c35e@meshulam.tesarici.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20230516095512.3c99c35e@meshulam.tesarici.cz> X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 16, 2023 at 09:55:12AM +0200, Petr Tesařík wrote: > On Mon, 15 May 2023 17:28:38 +0100 > Catalin Marinas wrote: > > There is another scenario to take into account on the list_del() side. > > Let's assume that there are other elements on the list, so > > list_empty() == false: > > > > P0: > > list_del(paddr); > > /* the memory gets freed, added to some slab or page free list */ > > WRITE_ONCE(slab_free_list, __va(paddr)); > > > > P1: > > paddr = __pa(READ_ONCE(slab_free_list));/* re-allocating paddr freed on P0 */ > > if (!list_empty()) { /* assuming other elements on the list */ > > /* searching the list */ > > list_for_each() { > > if (pos->paddr) == __pa(vaddr)) > > /* match */ > > } > > } > > > > On P0, you want the list update to be visible before the memory is freed > > (and potentially reallocated on P1). An smp_wmb() on P0 would do. For > > P1, we don't care about list_empty() as there can be other elements > > already. But we do want any list elements reading during the search to > > be ordered after the slab_free_list reading. The smp_rmb() you'd add for > > the case above would suffice. > > Yes, but to protect against concurrent insertions/deletions, a spinlock > is held while searching the list. The spin lock provides the necessary > memory barriers implicitly. Well, mostly. The spinlock acquire/release semantics ensure that accesses within the locked region are not observed outside the lock/unlock. But it doesn't guarantee anything about accesses outside such region in relation to the accesses within the region. For example: P0: spin_lock_irqsave(&swiotlb_dyn_lock); list_del(paddr); spin_unlock_irqrestore(&swiotlb_dyn_lock); /* the blah write below can be observed before list_del() above */ WRITE_ONCE(blah, paddr); /* that's somewhat tricker but slab_free_list update can also be * seen before list_del() above on certain architectures */ spin_lock_irqsave(&slab_lock); WRITE_ONCE(slab_free_list, __va(paddr)); spin_unlock_irqrestore(&slab_lock); On most architectures, the writing of the pointer to a slab structure (assuming some spinlocks) would be ordered against the list_del() from the swiotlb code. 
The exception to that lock-based ordering is powerpc, where the
spin_unlock() is not necessarily ordered against the subsequent
spin_lock(). The architecture selects ARCH_WEAK_RELEASE_ACQUIRE, which
in turn makes smp_mb__after_unlock_lock() an smp_mb() (rather than the
no-op it is on all the other architectures). On arm64 we have
smp_mb__after_spinlock(), which ensures that memory accesses prior to
spin_lock() are not observed after accesses within the locked region.
I don't think this matters for your case, but I thought I'd mention
it; a sketch of where each barrier sits follows below.
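Purely for illustration, reusing the lock names from the example above
(note that smp_mb__after_unlock_lock() is currently an RCU-internal
helper, shown here only to illustrate its semantics, not as something
this patch should use):

	spin_lock_irqsave(&swiotlb_dyn_lock, flags);
	list_del(paddr);
	spin_unlock_irqrestore(&swiotlb_dyn_lock, flags);

	spin_lock_irqsave(&slab_lock, flags);
	/*
	 * Make the unlock above plus the lock here act as a full
	 * barrier: smp_mb() on powerpc (ARCH_WEAK_RELEASE_ACQUIRE),
	 * a no-op everywhere else.
	 */
	smp_mb__after_unlock_lock();
	WRITE_ONCE(slab_free_list, __va(paddr));
	spin_unlock_irqrestore(&slab_lock, flags);

and the arm64 primitive:

	spin_lock_irqsave(&slab_lock, flags);
	/* order accesses before the spin_lock() against the region below */
	smp_mb__after_spinlock();

-- 
Catalin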