Date: Wed, 9 Dec 2020 15:09:56 +0100
From: Andrew Lunn
To: Sven Van Asbroeck
Cc: Florian Fainelli, Jakub Kicinski, Bryan Whitehead,
 Microchip Linux Driver Support, David S Miller, netdev,
 Linux Kernel Mailing List
Subject: Re: [PATCH net v1 2/2] lan743x: boost performance: limit PCIe bandwidth requirement
Message-ID: <20201209140956.GC2611606@lunn.ch>
References: <20201206034408.31492-1-TheSven73@gmail.com>
 <20201206034408.31492-2-TheSven73@gmail.com>
 <20201208114314.743ee6ec@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
 <20201208225125.GA2602479@lunn.ch>
 <3aed88da-8e82-3bd0-6822-d30f1bd5ec9e@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 08, 2020 at 10:49:16PM -0500, Sven Van Asbroeck wrote:
> On Tue, Dec 8, 2020 at 6:36 PM Florian Fainelli wrote:
> >
> > dma_sync_single_for_{cpu,device} is what you would need in order to
> > make a partial cache line invalidation.
> > You would still need to unmap the
> > same address+length pair that was used for the initial mapping,
> > otherwise the DMA-API debugging will rightfully complain.
>
> I tried replacing
> dma_unmap_single(9K, DMA_FROM_DEVICE);
> with
> dma_sync_single_for_cpu(received_size=1500 bytes, DMA_FROM_DEVICE);
> dma_unmap_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
>
> and that works! But the bandwidth is still pretty bad, because the cpu
> now spends most of its time doing
> dma_map_single(9K, DMA_FROM_DEVICE);
> which spends a lot of time doing __dma_page_cpu_to_dev.

9K is not a nice number, since for each allocation it probably has to
find 4 contiguous pages. See what the performance difference is with
2K, 4K and 8K. If there is a big difference, you might want to special
case when the MTU is set for jumbo packets, or check if the hardware
can do scatter/gather.

You also need to be careful with caches and speculation. As you have
seen, bad things can happen. And it can be a lot more subtle. If some
code is accessing the page before the buffer and gets towards the end
of the page, the CPU might speculatively bring in the next page, i.e.
the start of the buffer. If that happens before the DMA operation, and
you don't invalidate the cache correctly, you get hard to find
corruption.

	Andrew
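For anyone following along, the replacement Sven describes can be sketched
as a receive-completion fragment. Only the DMA-API calls are real; the
function name and the rx_dev/rx_dma/rx_buf_len/frame_len parameters are
hypothetical names for illustration, not the actual lan743x code:

```c
/*
 * Sketch of the partial-sync receive completion discussed above.
 * The device wrote frame_len bytes into a buffer that was mapped
 * rx_buf_len (9K) long with dma_map_single().
 */
static void rx_buffer_complete(struct device *rx_dev, dma_addr_t rx_dma,
			       unsigned int rx_buf_len, /* mapped: 9K */
			       unsigned int frame_len)  /* received: e.g. 1500 */
{
	/* CPU cache maintenance for just the bytes the device wrote. */
	dma_sync_single_for_cpu(rx_dev, rx_dma, frame_len, DMA_FROM_DEVICE);

	/*
	 * Unmap with the same address+length pair used at map time, so
	 * DMA-API debugging stays happy, but skip the full-length cache
	 * sync that a plain dma_unmap_single() would repeat.
	 */
	dma_unmap_single_attrs(rx_dev, rx_dma, rx_buf_len, DMA_FROM_DEVICE,
			       DMA_ATTR_SKIP_CPU_SYNC);
}
```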