Date: Wed, 13 Apr 2022 09:02:02 +0800
From: Chao Gao
To: Robin Murphy
Cc: linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
    m.szyprowski@samsung.com, hch@lst.de, Wang Zhaoyang1, Gao Liang,
    Kevin Tian
Subject: Re: [PATCH] dma-direct: avoid redundant memory sync for swiotlb
Message-ID: <20220413010157.GA10502@gao-cwp>
References: <20220412113805.3210-1-chao.gao@intel.com>

On Tue, Apr 12, 2022 at 02:33:05PM +0100, Robin Murphy wrote:
>On 12/04/2022 12:38 pm, Chao Gao wrote:
>> When we looked into FIO performance with swiotlb enabled in VM, we found
>> swiotlb_bounce() is always called one more time than expected for each DMA
>> read request.
>>
>> It turns out that the bounce buffer is copied to original DMA buffer twice
>> after the completion of a DMA request (one is done by in
>> dma_direct_sync_single_for_cpu(), the other by swiotlb_tbl_unmap_single()).
>> But the content in bounce buffer actually doesn't change between the two
>> rounds of copy. So, one round of copy is redundant.
>>
>> Pass DMA_ATTR_SKIP_CPU_SYNC flag to swiotlb_tbl_unmap_single() to
>> skip the memory copy in it.
>
>It's still a little suboptimal and non-obvious to call into SWIOTLB twice
>though - even better might be for SWIOTLB to call arch_sync_dma_for_cpu() at
>the appropriate place internally,

Hi Robin,

dma_direct_sync_single_for_cpu() also calls arch_sync_dma_for_cpu_all()
and arch_dma_mark_clean() in some cases. If SWIOTLB does the sync
internally, should these two functions be called by SWIOTLB as well?

Personally, I think it would be better if swiotlb could just focus on
bounce buffer alloc/free. Adding more DMA coherence logic to swiotlb
would make it a little complicated.

How about an open-coded version of dma_direct_sync_single_for_cpu() in
dma_direct_unmap_page(), with swiotlb_sync_single_for_cpu() replaced by
swiotlb_tbl_unmap_single()?
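
To make the idea concrete, here is a rough userspace sketch of the two
unmap paths. It is not kernel code: every name ending in _model is a
hypothetical stand-in with a simplified signature, and what each stub
does is an assumption taken from the description in this thread. It only
counts how often the bounce buffer would be copied back for one DMA read.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only. Every name ending in _model is a hypothetical
 * stand-in with a simplified signature, not the kernel's real code. */
enum dir_model { TO_DEVICE_MODEL, FROM_DEVICE_MODEL };
#define SKIP_CPU_SYNC_MODEL 0x1UL /* plays the role of DMA_ATTR_SKIP_CPU_SYNC */

struct dev_model {
    bool coherent;    /* false: CPU caches need explicit maintenance */
    bool uses_bounce; /* true: the mapping went through a bounce buffer */
};

struct counts_model {
    int bounce;     /* copies from the bounce buffer to the original buffer */
    int arch_sync;  /* arch cache-sync calls */
    int mark_clean; /* arch_dma_mark_clean()-style calls */
};

static struct counts_model counts;

/* Assumed behaviour of swiotlb_sync_single_for_cpu(): copy back on reads. */
static void swiotlb_sync_model(enum dir_model dir)
{
    if (dir == FROM_DEVICE_MODEL)
        counts.bounce++;
}

/* Assumed behaviour of swiotlb_tbl_unmap_single(): copy back unless the
 * caller asked to skip the CPU sync. Releasing the slot is not modelled. */
static void swiotlb_unmap_model(enum dir_model dir, unsigned long attrs)
{
    if (!(attrs & SKIP_CPU_SYNC_MODEL) && dir == FROM_DEVICE_MODEL)
        counts.bounce++;
}

/* Assumed shape of dma_direct_sync_single_for_cpu(). */
static void sync_for_cpu_model(const struct dev_model *dev, enum dir_model dir)
{
    if (!dev->coherent)
        counts.arch_sync++; /* arch_sync_dma_for_cpu() and _all() merged */
    if (dev->uses_bounce)
        swiotlb_sync_model(dir);
    if (dir == FROM_DEVICE_MODEL)
        counts.mark_clean++;
}

/* Unmap path as described in the report: full sync, then swiotlb unmap. */
static void unmap_current_model(const struct dev_model *dev,
                                enum dir_model dir, unsigned long attrs)
{
    if (!(attrs & SKIP_CPU_SYNC_MODEL))
        sync_for_cpu_model(dev, dir);
    if (dev->uses_bounce)
        swiotlb_unmap_model(dir, attrs);
}

/* Open-coded variant: same steps, but the swiotlb sync is replaced by the
 * swiotlb unmap, so the copy-back happens only once. */
static void unmap_open_coded_model(const struct dev_model *dev,
                                   enum dir_model dir, unsigned long attrs)
{
    bool sync = !(attrs & SKIP_CPU_SYNC_MODEL);

    if (sync && !dev->coherent)
        counts.arch_sync++;
    if (dev->uses_bounce)
        swiotlb_unmap_model(dir, attrs);
    if (sync && dir == FROM_DEVICE_MODEL)
        counts.mark_clean++;
}

typedef void (*unmap_fn)(const struct dev_model *, enum dir_model,
                         unsigned long);

/* Run one DMA read unmap through the given path and return the counts. */
struct counts_model run_read_unmap(unmap_fn unmap)
{
    const struct dev_model dev = { .coherent = false, .uses_bounce = true };

    counts = (struct counts_model){ 0 };
    unmap(&dev, FROM_DEVICE_MODEL, 0UL);
    return counts;
}

/* Sanity check of the model: two copy-backs today, one when open-coded. */
void check_model(void)
{
    assert(run_read_unmap(unmap_current_model).bounce == 2);
    assert(run_read_unmap(unmap_open_coded_model).bounce == 1);
}
```

In this model the open-coded path keeps the arch sync and mark-clean
steps in dma-direct and leaves swiotlb with only the copy-back and slot
release, which is the split I would prefer.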