Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp4492551pxb; Sun, 27 Mar 2022 20:22:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzROJEmo/30YVMqb0RKQcqNfMzFLRrsp1WQSg8wpRUD8ZymuFdu+EwOWVkqGR0TXOZGqagN X-Received: by 2002:a17:903:230c:b0:156:e47:387e with SMTP id d12-20020a170903230c00b001560e47387emr3668203plh.119.1648437765040; Sun, 27 Mar 2022 20:22:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1648437765; cv=none; d=google.com; s=arc-20160816; b=Jdtn4NjMt0S4/nt3DYeLz+bs+VrbMtc4FMeOBLYh3f90mSaESpB363q/nBXaIWHlK7 bopC8dI1MbNSTEk42cENmQg45F54cNxao/LwxeHyJqMKOQuh7+xtAIjQtW0tZE2Z5D14 eJ2JKuMNBIbqmGye+pH1WnmV9BOpZJjXPBF7cni6fdVIs4dehb8TamnZVFEsIx5hSlCu gaC8vkcugY86WkadG44VoziYg9V2T9KGTDENpqIvXk2uETXR+BkAmWi178Iz/5pCt5d+ VT9juadKRWFfvuS2EmKSIDNNAf6j4p3pOoCPKREFUuuDjIj5v+iXDDS+z/BNfg/vbPeN bUfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=81R8ObTvU5bcOjgPkhjrVdYMQgLpHwwmsUh10DuWTXw=; b=BSWxs7oEJSsrpG1/mg3DDm835U+aKPX8a7qRWYuoXgHuDGIkOolwYSqAD7eFWuYBH4 ejZN4RI6iXO0aOSfWsBmx8RJR02UqZdEMly7lLXUoIlz+gpbUA57V2XH61KsvHpmm5TF GQ/8yESGCECQ0c1rOlbCdAf61No7ZJRGNBzZ88O7+TWZhZcavGXRcxO5K7KSzIfZeg9l ben5UrBJFtj/zQ3O6bQHhLe3o34IV5GQITLJTFIWoHdFp4zr2w31Pw8z3PdIH1mGEvbV fuCpP74HzNQ9ytaLEHlhAxDMOxKhDBq60ULX/gEPHezvtgCiUG3iSPghWs7aQITNXSez sa3w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=QVswRSXv; spf=pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lj7-20020a17090b344700b001bf37e05b24si11030268pjb.130.2022.03.27.20.22.32; Sun, 27 Mar 2022 20:22:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=QVswRSXv; spf=pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236512AbiC0TZo (ORCPT + 68 others); Sun, 27 Mar 2022 15:25:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231214AbiC0TZj (ORCPT ); Sun, 27 Mar 2022 15:25:39 -0400 Received: from mail-lj1-x231.google.com (mail-lj1-x231.google.com [IPv6:2a00:1450:4864:20::231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8740AAE57 for ; Sun, 27 Mar 2022 12:23:56 -0700 (PDT) Received: by mail-lj1-x231.google.com with SMTP id v12so3837814ljd.3 for ; Sun, 27 Mar 2022 12:23:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=81R8ObTvU5bcOjgPkhjrVdYMQgLpHwwmsUh10DuWTXw=; b=QVswRSXvuEFhZlnuaBAnMB1TOMg78uyn8mRLwSOZtWi/86mVxTdWeJmg8H09RYWcp4 O6lvIOKP0C2OxXUCyV3ffNy61gxSdcLR6Kl9jGUx2xgYt16l8oVTj7BEMInCe8aUUdRF U28WrmitywInVLFpJqv3NSZNOfcB3kKShK/cw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=81R8ObTvU5bcOjgPkhjrVdYMQgLpHwwmsUh10DuWTXw=; b=7w3z21mk1ZlxYyzRZ7xQxoywqf4+7ZLkz/NGWWqEX2gPZ73MN9Vjd0E2M8RG2RPBBU 6VhPaQgf5YxxZ9xgKp3NVvs3XNaUE4axWBTLYzHI+TrnObFQ6UILarmtTgSw8i0IrIlz swKtlxImzZJ59hj3VzfKfCDnargjsGOyEC9YLPwwnDzfFVUJwhi11YLojPtod4Lqed7u 2q5/vxXYkn507c5XiqRNv8ezECVLgKml16CuJc3rxbBirBVq+vzBtuTySCN8akjndfvl dY3pgxjuR698YSuO02HTRUhznfxyjGGfOeTYqBtvRwBE4oORJOcokFgWvbZS6xdlCogj N1sA== X-Gm-Message-State: AOAM533L0QfV6c2ZUiYwAGMV/xvgG0Nf9hkmtlHqFRXB9TybmdvBwcV8 fgQr3QYjG13ORo9Z0M2dc10VcgkQRuleDBiroB0= X-Received: by 2002:a2e:818f:0:b0:24a:7c17:7226 with SMTP id e15-20020a2e818f000000b0024a7c177226mr9675644ljg.472.1648409034484; Sun, 27 Mar 2022 12:23:54 -0700 (PDT) Received: from mail-lj1-f171.google.com (mail-lj1-f171.google.com. [209.85.208.171]) by smtp.gmail.com with ESMTPSA id j11-20020a056512108b00b0044a23c1e679sm1437521lfg.18.2022.03.27.12.23.52 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 27 Mar 2022 12:23:52 -0700 (PDT) Received: by mail-lj1-f171.google.com with SMTP id 17so16574948lji.1 for ; Sun, 27 Mar 2022 12:23:52 -0700 (PDT) X-Received: by 2002:a05:651c:1213:b0:247:e2d9:cdda with SMTP id i19-20020a05651c121300b00247e2d9cddamr17499555lja.443.1648409031649; Sun, 27 Mar 2022 12:23:51 -0700 (PDT) MIME-Version: 1.0 References: <1812355.tdWV9SEqCh@natalenko.name> <20220324055732.GB12078@lst.de> <4386660.LvFx2qVVIh@natalenko.name> <81ffc753-72aa-6327-b87b-3f11915f2549@arm.com> <878rsza0ih.fsf@toke.dk> <4be26f5d8725cdb016c6fdd9d05cfeb69cdd9e09.camel@freebox.fr> <20220324163132.GB26098@lst.de> <871qyr9t4e.fsf@toke.dk> <20220327054848.1a545b12.pasic@linux.ibm.com> <0745b44456d44d1e9fc364e5a3780d9a@AcuMS.aculab.com> In-Reply-To: <0745b44456d44d1e9fc364e5a3780d9a@AcuMS.aculab.com> From: Linus Torvalds Date: Sun, 27 Mar 2022 12:23:34 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP To: David Laight Cc: Halil Pasic , =?UTF-8?B?VG9rZSBIw7hpbGFuZC1Kw7hyZ2Vuc2Vu?= , Robin Murphy , Christoph Hellwig , Maxime Bizon , Oleksandr Natalenko , Marek Szyprowski , Kalle Valo , "David S. Miller" , Jakub Kicinski , Paolo Abeni , Olha Cherevyk , iommu , linux-wireless , Netdev , Linux Kernel Mailing List , Greg Kroah-Hartman , stable Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org On Sun, Mar 27, 2022 at 8:24 AM David Laight wrote: > > Aren't bounce buffers just a more extreme case on non-coherent > memory accesses? No. In fact, this whoe change came about exactly because bounce buffers are different. The difference is that bounce buffers have that (wait for it) bounce buffer, which can have stale contents. > They just need explicit memory copies rather than just cache > writeback and invalidate operations. That's the thing - the memory copies introduce entirely new issues. I really think that instead of making up abstract rules ("ownership" or "bounce buffers are just extreme cases of non-coherency") we should make the rules very much practical and down to earth, and write out exactly what they *do*. The whole "sync DMA" is odd and abstract enough as a concept on its own, we shouldn't then make the rules for it odd and abstract. We should make them very very practical. So I will propose that we really make it very much about practical concerns, and we document things as (a) the "sync" operation has by definition a "whose _future_ access do we sync for" notion. So "dma_sync_single_for_cpu()" says that the future accesses to this dma area is for the CPU. Note how it does *NOT* say that the "CPU owns the are". That's bullsh*t, and we now know it's BS. (b) but the sync operation also has a "who wrote the data we're syncing" Note that this is *not* "who accessed or owned it last", because that's nonsensical: if we're syncing for the CPU, then the only reason to do so is because we expect that the last access was by the device, so specifying that separately would just be redundant and stupid. But specifying who *wrote* to the area is meaningful and matters. It matters for the non-coherent cache case (because of writeback issues), but it also matters for the bounce buffer case (becasue it determines which way we should copy). Note how this makes sense: a "sync" operation is clearly about taking some past state, and making it palatable for a future use. The past state is pretty much defined by who wrote the data, and then we can use that and the "the next thing to access it" to determine what we need to do about the sync. It is *NOT* about "ownership". So let's go through the cases, and I'm going to ignore the "CPU caches are coherent with device DMA" case because that's always going to be a no-op wrt data movement (but it will still generally need a memory barrier, which I will mention here and then ignore going forward). Syncing for *CPU* accesses (ie dma_sync_single_for_cpu()) has four choices I can see: - nobody wrote the data at all (aka DMA_NONE). This is nonsensical and should warn. If nobody wrote to it, why would the CPU ever validly access it? Maybe you should have written "memset(buffer, 0, size)" instead? - the CPU wrote the data in the first place (aka DMA_TO_DEVICE) This is a no-op (possibly a memory barrier), because even stale CPU caches are fine, and even if it was in a bounce buffer, the original CPU-side data is fine. - the device wrote the data (aka DMA_FROM_DEVICE) This is just the regular case of a device having written something, and the CPU wants to see it. It obviously needs real work, but it's simple and straightforward. For non-coherent caches, it needs a cache invalidate. For a bounce buffer, it needs a copy from the bounce buffer to the "real" buffer. - it's not clear who write the data (aka DMA_BIDIRECTIONAL) This is not really doable for a bounce buffer - we just don't know which buffer contents are valid. I think it's very very questionable for non-coherent caches too, but "writeback and invalidate" probably can't hurt. So probably warn about it, and do whatever we used to do historically. Syncing for device accesses (ie dma_sync_single_for_device()) also has the same four choices I can see, but obviously does different things: - nobody wrote the data at all (aka DMA_NONE) This sounds as nonsensical as the CPU case, but maybe isn't. We may not have any previous explicit writes, but we *do* have that "insecure and possibly stale buffer contents" bounce buffer thing on the device side. So with a bounce buffer, it's actually somewhat sane to say "initialize the bounce buffer to a known state". Because bounce buffers *are* special. Unlike even the "noncoherent caches" issue, they have that entirely *other* hidden state in the form of the bounce buffer itself. Discuss. - the CPU wrote the data in the first place (aka DMA_TO_DEVICE). This is the regular and common case of "we have data on the CPU side that is written to the device". Again, needs work, but is simple and straightforward. For non-coherent caches, we need a writeback on the CPU. For a bounce buffer, we need to copy from the regular buffer to the bounce buffer. - the device wrote the data in the first place (aka DMA_FROM_DEVICE) This is the case that we hit on ath9k. It's *not* obvious, but when we write this out this way, I really think the semantics are pretty clear. For non-coherent caches, we may need an "invalidate". For a bounce buffer, it's a no-op (because the bounce buffer already contains the data) - it's not clear who write the data (aka DMA_BIDIRECTIONAL) This is again not really doable for a bounce buffer. We don't know which buffer contains the right data, we should warn about it and do whatever we used to do historically. Again, it's very questionable for non-coherent caches too, but "writeback and invalidate" probably at least can't hurt. So hey, that's my thinking. The whole "ownership" model is, I think, obviously untenable. But just going through and listing the different cases and making them explicit I think explains exactly what the different situations are, and that then makes it fairly clear what the different combinations should do. No? Linus