Subject: Re: [PATCH net] xsk: remove cheap_dma optimization
To: Christoph Hellwig, Daniel Borkmann
Cc: Björn Töpel, netdev@vger.kernel.org, davem@davemloft.net,
 konrad.wilk@oracle.com, iommu@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, bpf@vger.kernel.org, maximmi@mellanox.com,
 magnus.karlsson@intel.com, jonathan.lemon@gmail.com
References: <20200626134358.90122-1-bjorn.topel@gmail.com>
 <20200627070406.GB11854@lst.de>
From: Björn Töpel
Message-ID: <88d27e1b-dbda-301c-64ba-2391092e3236@intel.com>
Date: Sun, 28 Jun 2020 19:16:33 +0200
In-Reply-To: <20200627070406.GB11854@lst.de>

On 2020-06-27 09:04, Christoph Hellwig wrote:
> On Sat, Jun 27, 2020 at 01:00:19AM +0200, Daniel Borkmann wrote:
>> Given there is roughly a ~5 weeks window at max where this removal could
>> still be applied in the worst case, could we come up with a fix / proposal
>> first that moves this into the DMA mapping core? If there is something that
>> can be agreed upon by all parties, then we could avoid re-adding the 9%
>> slowdown. :/
>
> I'd rather turn it upside down - this abuse of the internals blocks work
> that has basically just missed the previous window and I'm not going
> to wait weeks to sort out the API misuse. But we can add optimizations
> back later if we find a sane way.
>

I'm not super excited about the performance loss, but I do get
Christoph's frustration about gutting the DMA API making it harder for
DMA people to get work done. Let's try to solve this properly using
proper DMA APIs.

> That being said I really can't see how this would make so much of a
> difference. What architecture and what dma_ops are you using for
> those measurements? What is the workload?
>

The 9% is for an AF_XDP (fast raw Ethernet socket; think AF_PACKET, but
faster) benchmark: receive the packet from the NIC, and drop it. The DMA
syncs stand out in the perf top:

  28.63%  [kernel]                   [k] i40e_clean_rx_irq_zc
  17.12%  [kernel]                   [k] xp_alloc
   8.80%  [kernel]                   [k] __xsk_rcv_zc
   7.69%  [kernel]                   [k] xdp_do_redirect
   5.35%  bpf_prog_992d9ddc835e5629  [k] bpf_prog_992d9ddc835e5629
   4.77%  [kernel]                   [k] xsk_rcv.part.0
   4.07%  [kernel]                   [k] __xsk_map_redirect
   3.80%  [kernel]                   [k] dma_direct_sync_single_for_cpu
   3.03%  [kernel]                   [k] dma_direct_sync_single_for_device
   2.76%  [kernel]                   [k] i40e_alloc_rx_buffers_zc
   1.83%  [kernel]                   [k] xsk_flush
  ...

For this benchmark the dma_ops are NULL (dma_is_direct() == true), and
the main issue is that SWIOTLB is now unconditionally enabled [1] for
x86, so for each sync we have to call is_swiotlb_buffer(), which
involves some costly indirection.

That was pretty much what my hack avoided. Instead we did all the checks
upfront, since AF_XDP has long-term DMA mappings, and just set a flag
for that. Avoiding the whole "is this address swiotlb" check in
dma_direct_sync_single_for_{cpu,device}() per packet would help a lot.
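To make that concrete, here is a rough, hypothetical sketch of the idea
(the struct and function names are made up, not the actual removed code):
do the SWIOTLB check once when the long-term mapping is set up, cache the
answer in a flag, and skip the syncs in the per-packet path when they are
not needed.

  /*
   * Hypothetical sketch only: cheap_dma is computed once at setup time
   * for the long-lived AF_XDP mapping, so the hot path never has to ask
   * "is this address a SWIOTLB bounce buffer?" per packet.
   */
  #include <linux/dma-direct.h>   /* dma_to_phys() */
  #include <linux/dma-mapping.h>
  #include <linux/swiotlb.h>      /* is_swiotlb_buffer() */

  struct xsk_dma_map {            /* illustrative, not the real pool struct */
          struct device *dev;
          dma_addr_t *dma_pages;
          u32 dma_pages_cnt;
          bool cheap_dma;         /* true: per-packet syncs can be skipped */
  };

  static void xsk_check_cheap_dma(struct xsk_dma_map *map)
  {
          u32 i;

          map->cheap_dma = true;
          for (i = 0; i < map->dma_pages_cnt; i++) {
                  /* is_swiotlb_buffer() is a stub returning false when
                   * CONFIG_SWIOTLB is disabled, so this stays cheap there.
                   */
                  if (is_swiotlb_buffer(dma_to_phys(map->dev,
                                                    map->dma_pages[i]))) {
                          map->cheap_dma = false;
                          break;
                  }
          }
  }

  /* Hot path: called per received descriptor. */
  static void xsk_sync_rx_for_cpu(struct xsk_dma_map *map, dma_addr_t addr,
                                  size_t size)
  {
          if (map->cheap_dma)
                  return;
          dma_sync_single_for_cpu(map->dev, addr, size, DMA_FROM_DEVICE);
  }

Of course this pokes at dma_to_phys()/is_swiotlb_buffer() from outside
the DMA core, which is exactly the misuse being objected to, so the
upfront "do I need syncs at all?" answer would have to come from a proper
DMA API instead of AF_XDP reimplementing it.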
Somewhat related to the DMA API: it would have performance benefits for
AF_XDP if the DMA range of the mapped memory was linear, e.g. by
utilizing the IOMMU. I've started hacking a thing a little bit, but it
would be nice if such an API was part of the mapping core.

  Input:  array of pages
  Output: array of dma addrs (and obviously dev, flags and such)

  For non-IOMMU:       len(array of pages) == len(array of dma addrs)
  For best-case IOMMU: len(array of dma addrs) == 1 (large linear space)

But that's for later. :-) (A very rough sketch of the kind of interface
I have in mind is appended below, after the footnote.)


Björn

[1] commit 09230cbc1bab ("swiotlb: move the SWIOTLB config symbol to
    lib/Kconfig")
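Appendix: a purely hypothetical sketch of such a batch-mapping interface.
The struct, the prototype and all names are made up; nothing like this
exists in the DMA API today.

  #include <linux/dma-mapping.h>

  /*
   * Hypothetical: map a long-lived array of pages in one call; the core
   * returns one or more contiguous DMA ranges. With dma-direct this would
   * typically be npages single-page ranges; with an IOMMU it could
   * collapse into a single large linear range.
   */
  struct dma_map_batch_range {
          dma_addr_t addr;        /* start of a contiguous DMA range */
          size_t len;             /* length of the range in bytes */
  };

  /*
   * Returns the number of entries filled in 'ranges' (at most npages),
   * or a negative errno on failure.
   */
  int dma_map_pages_batch(struct device *dev, struct page **pages,
                          unsigned int npages, enum dma_data_direction dir,
                          unsigned long attrs,
                          struct dma_map_batch_range *ranges);

In the best case (one linear range) AF_XDP could then translate a umem
offset to a DMA address with plain arithmetic instead of a per-page
lookup table.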