Subject: Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations
To: Joerg Roedel, Robin Murphy
References: <20170726110807.GN15833@8bytes.org>
From: "Leizhen (ThunderTown)"
Message-ID: <59787A48.6060200@huawei.com>
Date: Wed, 26 Jul 2017 19:17:28 +0800
In-Reply-To: <20170726110807.GN15833@8bytes.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/7/26 19:08, Joerg Roedel wrote:
> Hi Robin.
>
> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
>> Hi all,
>>
>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>> workloads (e.g. storage I/O with large scatterlists) probably remain quite
>> heavily influenced by IOVA allocation performance. Separately, Ard also
>> reported massive performance drops for a graphical desktop on AMD Seattle
>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>> the overhead of the rbtree slow path.
>> Whilst we could go around trying to
>> close up all the little gaps that lead to hitting the slowest case, it
>> seems a much better idea to simply make said slowest case a lot less slow.
>
> Do you have some numbers here? How big was the impact before these
> patches and how is it with the patches?

Here are some numbers:

(before)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

(after)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec

>
>
> 	Joerg
>
>
> .
>

-- 
Thanks!
BestRegards