Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp2171467pxb; Fri, 25 Mar 2022 12:22:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxsfMk3ySQ4KVB2ucSI1ir4o2nRlAjPh8WEWjNFfGK1pU+tX6JFLGx2GnK0nvtSs8qrAhZ/ X-Received: by 2002:a17:90b:3b8c:b0:1c6:eb72:24b4 with SMTP id pc12-20020a17090b3b8c00b001c6eb7224b4mr14427488pjb.171.1648236128889; Fri, 25 Mar 2022 12:22:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1648236128; cv=none; d=google.com; s=arc-20160816; b=SeQheJ0bWPe1HC4hO4+A3KCEGcylE+wUbw66cdaBosWQXAAvDUp2c63jdu/wMMcbJT 5Jwmg7Irz9yltYlj6lw0uwf+8oSxiuaoZH3dnaXzKM9aZVUXzmDEbIrDWXOFR0DbdeVS JF7fyzzJv2doyPJ4UTAGWndQmDIm6266/ovEXlMSfszn3E5KWkIi4ptcuD/zs7RcXfFE s0lKbnxwaCNA7E4zELxyxwLYTAkHI3hMj/S58s7yTuwkLJ3+3fmTyjzHAGRh9LwjdqZd SnzQePfg4an4cEiPKaP5tpV7chtirn2z8A3OdL0rQ2xNLU5e+QDvdXhmJO8Bnlpdf85j /PpQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id; bh=nSXfT1RoGXKcmDLTqZ4YEO9ohqxwx8CZtJXPBWuSK1Y=; b=ZJfp0W5eB9mUZRikMcvk7vTG55pdnu1T9fGUBGa3TyfuJ78nscPjXKdfHSC86qiDM8 31/HPD+8n90b+VNu0TtLavWzCYvAyBCwQkZIL5uMZB8s0y6m7WPeiCgvhou3jfSWbagO dVWmx0KyPR8gjrL/M8K3Xe0sVA+Z/3+ma1+eElvUC16GGczcMTrjo0sTe1E2bSYVlfZn eFDwDVk6g9JWAHihTnJhftxp9z95YBwlkXvlVuBrxqgTlF8gUe57sLbWYLDvAXpgM2gZ Q1ZzudqUa3FB+KIEQVH72hBXDzweZ0QWX8XZZOLqQAiQpXVLWZJL3RcUGfUjVRbeQbfN yBUQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id e10-20020a63744a000000b00382692aaf58si3609981pgn.110.2022.03.25.12.22.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 25 Mar 2022 12:22:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id AA0E617A2C8; Fri, 25 Mar 2022 11:27:50 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1376897AbiCYQrW (ORCPT + 70 others); Fri, 25 Mar 2022 12:47:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53942 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245408AbiCYQrV (ORCPT ); Fri, 25 Mar 2022 12:47:21 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 63BDDB9191; Fri, 25 Mar 2022 09:45:46 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 06F6C12FC; Fri, 25 Mar 2022 09:45:46 -0700 (PDT) Received: from [10.57.41.19] (unknown [10.57.41.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 495323F73D; Fri, 25 Mar 2022 09:45:43 -0700 (PDT) Message-ID: Date: Fri, 25 Mar 2022 16:45:39 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP Content-Language: en-GB To: =?UTF-8?Q?Toke_H=c3=b8iland-J=c3=b8rgensen?= , mbizon@freebox.fr, Linus Torvalds Cc: Christoph Hellwig , Oleksandr Natalenko , Halil Pasic , Marek Szyprowski , Kalle Valo , "David S. Miller" , Jakub Kicinski , Paolo Abeni , Olha Cherevyk , iommu , linux-wireless , Netdev , Linux Kernel Mailing List , Greg Kroah-Hartman , stable References: <1812355.tdWV9SEqCh@natalenko.name> <20220324055732.GB12078@lst.de> <4386660.LvFx2qVVIh@natalenko.name> <81ffc753-72aa-6327-b87b-3f11915f2549@arm.com> <878rsza0ih.fsf@toke.dk> <4be26f5d8725cdb016c6fdd9d05cfeb69cdd9e09.camel@freebox.fr> <20220324163132.GB26098@lst.de> <871qyr9t4e.fsf@toke.dk> <31434708dcad126a8334c99ee056dcce93e507f1.camel@freebox.fr> <87a6de80em.fsf@toke.dk> From: Robin Murphy In-Reply-To: <87a6de80em.fsf@toke.dk> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A, RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org On 2022-03-25 16:25, Toke Høiland-Jørgensen wrote: > Maxime Bizon writes: > >> On Thu, 2022-03-24 at 12:26 -0700, Linus Torvalds wrote: >> >>> >>> It's actually very natural in that situation to flush the caches from >>> the CPU side again. And so dma_sync_single_for_device() is a fairly >>> reasonable thing to do in that situation. >>> >> >> In the non-cache-coherent scenario, and assuming dma_map() did an >> initial cache invalidation, you can write this: >> >> rx_buffer_complete_1(buf) >> { >> invalidate_cache(buf, size) >> if (!is_ready(buf)) >> return; >> >> } >> >> or >> >> rx_buffer_complete_2(buf) >> { >> if (!is_ready(buf)) { >> invalidate_cache(buf, size) >> return; >> } >> >> } >> >> The latter is preferred for performance because dma_map() did the >> initial invalidate. >> >> Of course you could write: >> >> rx_buffer_complete_3(buf) >> { >> invalidate_cache(buf, size) >> if >> (!is_ready(buf)) { >> invalidate_cache(buf, size) >> return; >> } >> >> >> } >> >> >> but it's a waste of CPU cycles >> >> So I'd be very cautious assuming sync_for_cpu() and sync_for_device() >> are both doing invalidation in existing implementation of arch DMA ops, >> implementers may have taken some liberty around DMA-API to avoid >> unnecessary cache operation (not to blame them). > > I sense an implicit "and the driver can't (or shouldn't) influence > this" here, right? Right, drivers don't get a choice of how a given DMA API implementation works. >> For example looking at arch/arm/mm/dma-mapping.c, for DMA_FROM_DEVICE >> >> sync_single_for_device() >> => __dma_page_cpu_to_dev() >> => dma_cache_maint_page(op=dmac_map_area) >> => cpu_cache.dma_map_area() >> >> sync_single_for_cpu() >> => __dma_page_dev_to_cpu() >> => >> __dma_page_cpu_to_dev(op=dmac_unmap_area) >> => >> cpu_cache.dma_unmap_area() >> >> dma_map_area() always does cache invalidate. >> >> But for a couple of CPU variant, dma_unmap_area() is a noop, so >> sync_for_cpu() does nothing. >> >> Toke's patch will break ath9k on those platforms (mostly silent >> breakage, rx corruption leading to bad performance) > > Okay, so that would be bad obviously. So if I'm reading you correctly > (cf my question above), we can't fix this properly from the driver side, > and we should go with the partial SWIOTLB revert instead? Do you have any other way of telling if DMA is idle, or temporarily pausing it before the sync_for_cpu, such that you could honour the notion of ownership transfer properly? As mentioned elsewhere I suspect the only "real" fix if you really do need to allow concurrent access is to use the coherent DMA API for buffers rather than streaming mappings, but that's obviously some far more significant surgery. Robin.